I have the following working NASM code:
global _start
section .text
_start:
mov eax, 0x4
mov ebx, 0x1
mov ecx, message
mov edx, 0xF
int 0x80
mov eax, 0x1
mov ebx, 0x0
int 0x80
section .data
message: db "Hello, World!", 0dh, 0ah
which prints "Hello, World!\n" to the screen. I also have the following C wrapper which contains the previous NASM object code:
char code[] =
"\xb8\x04\x00\x00\x00"
"\xbb\x01\x00\x00\x00"
"\xb9\x00\x00\x00\x00"
"\xba\x0f\x00\x00\x00"
"\xcd\x80\xb8\x01\x00"
"\x00\x00\xbb\x00\x00"
"\x00\x00\xcd\x80";
int main(void)
{
(*(void(*)())code)();
}
However when I run the code, it seems like the assembler code isn't executed, but the program exits fine. Any ideas?
Thanks
When you inject this shellcode, you don't know what is at message:
mov ecx, message
In the injected process that address can hold anything, but it will not hold "Hello, World!\r\n": the string lives in the .data section, and you are dumping only the .text section. You can see that your shellcode doesn't contain "Hello, World!\r\n":
"\xb8\x04\x00\x00\x00"
"\xbb\x01\x00\x00\x00"
"\xb9\x00\x00\x00\x00"
"\xba\x0f\x00\x00\x00"
"\xcd\x80\xb8\x01\x00"
"\x00\x00\xbb\x00\x00"
"\x00\x00\xcd\x80";
This is a common problem in shellcode development. The usual workaround is the jmp/call/pop trick:
global _start
section .text
_start:
jmp MESSAGE ; 1) let's jump to MESSAGE
GOBACK:
mov eax, 0x4
mov ebx, 0x1
pop ecx ; 3) we pop the return address into `ecx`, which now holds the
; address of "Hello, World!\r\n"
mov edx, 0xF
int 0x80
mov eax, 0x1
mov ebx, 0x0
int 0x80
MESSAGE:
call GOBACK ; 2) we are going back, since we used `call`, that means
; the return address, which is in this case the address
; of "Hello, World!\r\n", is pushed into the stack.
db "Hello, World!", 0dh, 0ah
section .data
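The rel32 operands the assembler emits for the jmp and the call can be worked out by hand from the listing above. Here is a minimal C sketch; `encode_rel32` is a hypothetical helper name, and the offsets used in the note (jmp at 0, GOBACK at 5, call at 0x23) assume the 5-byte `e9`/`e8` encodings.

```c
#include <stdint.h>

/* Hypothetical helper: compute the little-endian rel32 operand of a 5-byte
 * jmp (e9) or call (e8) that sits at byte offset insn_off and should land
 * on byte offset target_off. The displacement is measured from the END of
 * the instruction, i.e. from insn_off + 5. */
void encode_rel32(uint32_t insn_off, uint32_t target_off, unsigned char out[4])
{
    /* Unsigned wraparound gives the two's-complement value for
     * backward transfers. */
    uint32_t disp = target_off - (insn_off + 5u);

    out[0] = (unsigned char)(disp & 0xffu);
    out[1] = (unsigned char)((disp >> 8) & 0xffu);
    out[2] = (unsigned char)((disp >> 16) & 0xffu);
    out[3] = (unsigned char)((disp >> 24) & 0xffu);
}
```

With the jmp at offset 0 targeting 0x23 this yields `1e 00 00 00`, and with the call at 0x23 targeting 5 it yields `dd ff ff ff`, which are the operands objdump reports for this code.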
Now dump the text section:
$ nasm -f elf shellcode.asm
$ ld shellcode.o -o shellcode
$ ./shellcode
Hello, World!
$ objdump -d shellcode
shellcode: file format elf32-i386
Disassembly of section .text:
08048060 <_start>:
8048060: e9 1e 00 00 00 jmp 8048083 <MESSAGE>
08048065 <GOBACK>:
8048065: b8 04 00 00 00 mov $0x4,%eax
804806a: bb 01 00 00 00 mov $0x1,%ebx
804806f: 59 pop %ecx
8048070: ba 0f 00 00 00 mov $0xf,%edx
8048075: cd 80 int $0x80
8048077: b8 01 00 00 00 mov $0x1,%eax
804807c: bb 00 00 00 00 mov $0x0,%ebx
8048081: cd 80 int $0x80
08048083 <MESSAGE>:
8048083: e8 dd ff ff ff call 8048065 <GOBACK>
8048088: 48 dec %eax <-+
8048089: 65 gs |
804808a: 6c insb (%dx),%es:(%edi) |
804808b: 6c insb (%dx),%es:(%edi) |
804808c: 6f outsl %ds:(%esi),(%dx) |
804808d: 2c 20 sub $0x20,%al |
804808f: 57 push %edi |
8048090: 6f outsl %ds:(%esi),(%dx) |
8048091: 72 6c jb 80480ff <MESSAGE+0x7c> |
8048093: 64 fs |
8048094: 21 .byte 0x21 |
8048095: 0d .byte 0xd |
8048096: 0a .byte 0xa <-+
$
The lines I marked are our "Hello, World!\r\n" string:
$ printf "\x48\x65\x6c\x6c\x6f\x2c\x20\x57\x6f\x72\x6c\x64\x21\x0d\x0a"
Hello, World!
$
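If you want to check mechanically whether a dumped byte sequence carries its own string, a small search is enough. This is only a sketch; `contains_bytes`, `original_code` and `fixed_tail` are names made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if needle[0..needle_len) occurs anywhere
 * inside code[0..code_len), 0 otherwise. */
int contains_bytes(const unsigned char *code, size_t code_len,
                   const char *needle, size_t needle_len)
{
    if (needle_len == 0 || needle_len > code_len)
        return 0;
    for (size_t i = 0; i + needle_len <= code_len; i++) {
        if (memcmp(code + i, needle, needle_len) == 0)
            return 1;
    }
    return 0;
}

/* The bytes from the question: the .text section only. */
const unsigned char original_code[] =
    "\xb8\x04\x00\x00\x00"
    "\xbb\x01\x00\x00\x00"
    "\xb9\x00\x00\x00\x00"
    "\xba\x0f\x00\x00\x00"
    "\xcd\x80\xb8\x01\x00"
    "\x00\x00\xbb\x00\x00"
    "\x00\x00\xcd\x80";

/* The tail of the reworked code: the call followed by the string bytes. */
const unsigned char fixed_tail[] =
    "\xe8\xdd\xff\xff\xff"
    "Hello, World!\r\n";
```

Calling `contains_bytes(original_code, sizeof original_code - 1, "Hello", 5)` returns 0, while the same search over `fixed_tail` finds the string.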
So our C wrapper will be:
char code[] =
"\xe9\x1e\x00\x00\x00" // jmp (relative) <MESSAGE>
"\xb8\x04\x00\x00\x00" // mov $0x4,%eax
"\xbb\x01\x00\x00\x00" // mov $0x1,%ebx
"\x59" // pop %ecx
"\xba\x0f\x00\x00\x00" // mov $0xf,%edx
"\xcd\x80" // int $0x80
"\xb8\x01\x00\x00\x00" // mov $0x1,%eax
"\xbb\x00\x00\x00\x00" // mov $0x0,%ebx
"\xcd\x80" // int $0x80
"\xe8\xdd\xff\xff\xff" // call (relative) <GOBACK>
"Hello, World!\r\n"; // OR "\x48\x65\x6c\x6c\x6f\x2c\x20\x57"
// "\x6f\x72\x6c\x64\x21\x0d\x0a"
int main(int argc, char **argv)
{
(*(void(*)())code)();
return 0;
}
Let's test it, using -z execstack to enable read-implies-exec (process-wide, despite "stack" in the name) so we can execute code in the .data or .rodata sections:
$ gcc -m32 test.c -z execstack -o test
$ ./test
Hello, World!
It works. (-m32 is necessary, too, on 64-bit systems. The int $0x80 32-bit ABI doesn't work with 64-bit addresses like .rodata in a PIE executable. Also, the machine code was assembled for 32-bit. It happens that the same sequence of bytes would decode to equivalent instructions in 64-bit mode but that's not always the case.)
Modern GNU ld puts .rodata in a separate segment from .text, so it can be non-executable. It used to be sufficient to use const char code[] to put executable code in a page of read-only data. At least for shellcode that doesn't want to modify itself.
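Instead of relying on how the linker lays out .data or .rodata, a test harness can ask the kernel for an executable page explicitly. The following is a sketch of that alternative (not what the wrapper above does); `map_executable_copy` is a hypothetical name, and it assumes Linux-style `mmap` with `MAP_ANONYMOUS`.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: copy len bytes of machine code into a fresh
 * anonymous mapping, then flip the mapping from read+write to read+exec.
 * Returns the mapping, or NULL on failure. The caller releases it with
 * munmap(ptr, len). */
void *map_executable_copy(const void *code, size_t len)
{
    void *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return NULL;

    memcpy(page, code, len);

    /* Never writable and executable at the same time. */
    if (mprotect(page, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(page, len);
        return NULL;
    }
    return page;
}
```

The returned pointer can be cast to a function pointer the same way `code` is cast in the wrapper. Self-modifying shellcode would need the page to stay writable, which this sketch deliberately does not allow.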
As BSH mentioned, your shellcode does not contain the message bytes. Jumping to the MESSAGE label and calling the GOBACK routine just before the message bytes is a good move: the call pushes the address of the message onto the stack as its return address, and that address can then be popped into ecx.
But both your code and BSH's have a slight limitation.
They contain NULL bytes (\x00). That is harmless when the array is called directly through a function pointer, as in the test wrapper, but a NULL byte is treated as the end of the string as soon as the shellcode is copied by a C string function.
There is a smart way around this. The values you store into eax, ebx and edx are small enough to be written into the low byte of the respective registers in one go by accessing al, bl and dl.
The upper 24 bits may contain junk, so zero the whole register with xor first.
b8 04 00 00 00 ------ mov $0x4,%eax
becomes
31 c0 ------ xor %eax,%eax
b0 04 ------ mov $0x4,%al
Unlike the original instruction, the new instruction sequence does not contain any NULL byte.
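To see what a NULL byte costs, assume the shellcode is delivered through a C string copy such as strcpy (an assumption for this sketch; the test wrapper calls the array directly). Only the bytes before the first 0x00 would arrive. `bytes_before_first_nul` and the two array names are made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: how many bytes of code[0..len) a C string copy
 * would carry over, i.e. the number of bytes before the first 0x00. */
size_t bytes_before_first_nul(const unsigned char *code, size_t len)
{
    const unsigned char *nul = memchr(code, 0x00, len);
    return nul ? (size_t)(nul - code) : len;
}

/* mov $0x4,%eax : the imm32 form drags three zero bytes along. */
const unsigned char mov_eax_4[] = { 0xb8, 0x04, 0x00, 0x00, 0x00 };

/* xor %eax,%eax ; mov $0x4,%al : same effect, no zero bytes. */
const unsigned char xor_then_mov_al[] = { 0x31, 0xc0, 0xb0, 0x04 };
```

For `mov_eax_4` the helper returns 2, so the copy stops in the middle of the instruction; for `xor_then_mov_al` it returns 4, the full length.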
So, the final program looks like this :
global _start
section .text
_start:
jmp message
proc:
xor eax, eax
mov al, 0x04
xor ebx, ebx
mov bl, 0x01
pop ecx
xor edx, edx
mov dl, 0x12 ; 18 bytes, the length of msg
int 0x80
xor eax, eax
mov al, 0x01
xor ebx, ebx
mov bl, 0x01 ; return 1
int 0x80
message:
call proc
msg db " y0u sp34k 1337 ? "
section .data
Assembling and linking :
$ nasm -f elf hello.asm -o hello.o
$ ld -s -m elf_i386 hello.o -o hello
$ ./hello
y0u sp34k 1337 ? $
Now extract the shellcode from the hello binary :
$ for i in `objdump -d hello | tr '\t' ' ' | tr ' ' '\n' | egrep '^[0-9a-f]{2}$' ` ; do echo -n "\\x$i" ; done
output:
\xeb\x19\x31\xc0\xb0\x04\x31\xdb\xb3\x01\x59\x31\xd2\xb2\x12\xcd\x80\x31\xc0\xb0\x01\x31\xdb\xb3\x01\xcd\x80\xe8\xe2\xff\xff\xff\x20\x79\x30\x75\x20\x73\x70\x33\x34\x6b\x20\x31\x33\x33\x37\x20\x3f\x20
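The shell loop above is one way to produce the escaped form. If the raw bytes are already in memory, the same formatting can be done in C. This is a sketch with a made-up function name, `format_escapes`.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "\xNN" (lowercase hex) for each of the n
 * input bytes into out. Returns the number of characters written, not
 * counting the terminating NUL, or -1 if out_size is too small. */
int format_escapes(const unsigned char *bytes, size_t n,
                   char *out, size_t out_size)
{
    if (out_size < n * 4 + 1)
        return -1;

    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        /* 4 visible characters plus the NUL that the next round overwrites */
        snprintf(out + i * 4, 5, "\\x%02x", bytes[i]);
    }
    return (int)(n * 4);
}
```

Feeding it the bytes `eb 19 31` produces the text `\xeb\x19\x31`, ready to paste into a C string literal.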
Now we can have our driver program to launch the shellcode.
#include <stdio.h>
char shellcode[] = "\xeb\x19\x31\xc0\xb0\x04\x31\xdb"
"\xb3\x01\x59\x31\xd2\xb2\x12\xcd"
"\x80\x31\xc0\xb0\x01\x31\xdb\xb3"
"\x01\xcd\x80\xe8\xe2\xff\xff\xff"
"\x20\x79\x30\x75\x20\x73\x70\x33"
"\x34\x6b\x20\x31\x33\x33\x37\x20"
"\x3f\x20";
int main(int argc, char **argv) {
(*(void(*)())shellcode)();
return 0;
}
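Modern systems use NX protection, which prevents execution of code in the data segment or on the stack, and the toolchain marks those regions non-executable by default. So we have to tell the compiler and linker explicitly to disable this. (On a 64-bit system you would also need -m32, since this is 32-bit code that uses int 0x80.)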
$ gcc -g -Wall -fno-stack-protector -z execstack launcher.c -o launcher
Now the launcher can be invoked to launch the shellcode.
$ ./launcher
y0u sp34k 1337 ? $
For more complex shellcode there is another hurdle: modern Linux kernels have ASLR (Address Space Layout Randomization).
You may need to disable it before you inject the shellcode, especially when the injection goes through a buffer overflow.
root@localhost:~# echo 0 > /proc/sys/kernel/randomize_va_space
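Before changing the setting it can be handy to read the current value from the same procfs file. A minimal sketch; `read_aslr_setting` is a hypothetical name, and it simply reports -1 when the file is missing or unreadable.

```c
#include <stdio.h>

/* Hypothetical helper: return the integer stored in
 * /proc/sys/kernel/randomize_va_space (0 means ASLR is off),
 * or -1 if the file cannot be opened or parsed. */
int read_aslr_setting(void)
{
    FILE *f = fopen("/proc/sys/kernel/randomize_va_space", "r");
    int value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

Reading does not need root; only writing the file does.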
Related
I am trying to develop a shellcode that would execute the following command /bin/ls / (list root directory).
So, I first try to develop a script in assembly to do the job for me, below is the script:
SECTION .data
buf: db "./", 0
SECTION .text
global _start
_start:
xor eax, eax
xor edx, edx
push eax
push long 0x736c2f2f ; "sl//"
push long 0x6e69622f ; "nib/"
mov ebx, esp
push eax
push byte 0x2f
mov esi, esp
push eax
push esi
push ebx
mov ecx, esp
mov eax, 0x0b
int 0x80
mov eax, 1
int 0x80
I then assemble and link with the following: nasm -f elf -g shell.asm && ld -s -o shell shell.o -m elf_i386. This works perfectly: when executed, it lists the root directory.
Then, I disassemble the binary using objdump to get the op-codes:
#objdump -d ./shell
./shell: file format elf32-i386
Disassembly of section .text:
08049000 <.text>:
8049000: 31 c0 xor %eax,%eax
8049002: 31 d2 xor %edx,%edx
8049004: 50 push %eax
8049005: 68 2f 2f 6c 73 push $0x736c2f2f
804900a: 68 2f 62 69 6e push $0x6e69622f
804900f: 89 e3 mov %esp,%ebx
8049011: 50 push %eax
8049012: 6a 2f push $0x2f
8049014: 89 e6 mov %esp,%esi
8049016: 50 push %eax
8049017: 56 push %esi
8049018: 53 push %ebx
8049019: 89 e1 mov %esp,%ecx
804901b: b8 0b 00 00 00 mov $0xb,%eax
8049020: cd 80 int $0x80
8049022: b8 01 00 00 00 mov $0x1,%eax
8049027: cd 80 int $0x80
To make life easy, I did the following: objdump -d ./shell | awk -F " " '{print $1}' |awk -F ":" '{print $2}' | tr -d " " | tr -d "\n" | tr -d "\t" | xclip -selection c. Basically it copies the opcodes to the clipboard. 31c031d250682f2f6c73682f62696e89e3506a2f89e650565389e1b80b000000cd80b801000000cd80.
Then, to test my shellcode, I put it into a C program as below:
#include<stdio.h>
#include<string.h>
unsigned char shellcode[] = "\x31\xc0\x31\xd2\x50\x68\x2f\x2f\x6c\x73\x68\x2f\x62\x69\x6e\x89\xe3\x50\x6a\x2f\x89\xe6\x50\x56\x53\x89\xe1\xb8\x0b\x00\x00\x00\xcd\x80\xb8\x01\x00\x00\x00\xcd\x80";
int main(int argc, char **argv) {
int *ret;
ret = (int *)&ret + 2;
(*ret) = (int)shellcode;
}
# gcc -g -o foo foo.c -z execstack -fno-stack-protector
foo.c: In function ‘main’:
foo.c:9:11: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
9 | (*ret) = (int)shellcode;
|
With that done, if I execute my foo binary, the program runs but does not list the root directory.
I am using nasm with the following arc: Linux kali 5.2.0-kali2-amd64 #1 SMP Debian 5.2.9-2kali1 (2019-08-22) x86_64 GNU/Linux
I wrote a program in NASM (x64) that should execute /bin/bash, and it works fine. I then ran objdump -D on it and wrote down the machine code like this: \xbb\x68\x53\x48\xbb\x2f\x62\x69\x6e\x2f\x62\x61\x73\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05. Then I ran it with ./shell $(python -c 'print "\xbb\x68\x53\x48\xbb\x2f\x62\x69\x6e\x2f\x62\x61\x73\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05"') and got an illegal instruction. But the assembled program worked fine! Can someone help?
shell.c:
int main(int argc, char **argv) {
int (*func)();
func = (int (*)()) argv[1];
(int)(*func)();
}
bash.asm:
section .text
global start
start:
mov rbx, 0x68
push rbx
mov rbx, 0x7361622f6e69622f
push rbx
mov rdi, rsp
push rax
push rdi
mov rsi, rsp
mov al, 59
syscall
objdump:
./bash: file format elf64-x86-64
Disassembly of section .text:
0000000000401000 <start>:
401000: bb 68 00 00 00 mov $0x68,%ebx
401005: 53 push %rbx
401006: 48 bb 2f 62 69 6e 2f movabs $0x7361622f6e69622f,%rbx
40100d: 62 61 73
401010: 53 push %rbx
401011: 48 89 e7 mov %rsp,%rdi
401014: 50 push %rax
401015: 57 push %rdi
401016: 48 89 e6 mov %rsp,%rsi
401019: b0 3b mov $0x3b,%al
40101b: 0f 05 syscall
You are omitting the zero bytes here:
\xbb\x68\x53\x48\xbb\x2f\x62\x69\x6e\x2f\x62\x61\x73\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05
as opposed to
401000: bb 68 00 00 00 mov $0x68,%ebx
The zero bytes are part of the instructions and cannot be skipped. So you have to include them.
The problem, however, is that zero bytes would terminate the argument string, so they have to be avoided. It is your job as the shellcode designer to construct the code so that it contains no byte values that cannot survive the injection path. In many cases this means no zero bytes, because the shellcode is injected as a C string, but other values may be problematic in other situations, too.
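A generic check for forbidden byte values could look like the sketch below. `first_bad_byte` is a hypothetical name, and the forbidden set is whatever your injection path cannot carry. For an unquoted `$(...)` argument, for example, I would assume 0x00 plus the shell's default word-splitting characters (space, tab, newline).

```c
#include <stddef.h>

/* Hypothetical helper: return the index of the first byte in
 * code[0..len) that appears in the forbidden set bad[0..bad_count),
 * or -1 if the code is clean. */
long first_bad_byte(const unsigned char *code, size_t len,
                    const unsigned char *bad, size_t bad_count)
{
    for (size_t i = 0; i < len; i++) {
        for (size_t j = 0; j < bad_count; j++) {
            if (code[i] == bad[j])
                return (long)i;
        }
    }
    return -1;
}

/* Assumed forbidden set for an unquoted shell argument:
 * NUL, tab, newline, space. */
const unsigned char argv_bad_bytes[] = { 0x00, 0x09, 0x0a, 0x20 };
```

Run over the real encoding `bb 68 00 00 00`, it reports index 2, the first of the zero bytes that were dropped when the machine code was written down.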
BITS 64
section .text
global _start
_start:
jmp short two
one:
pop rbx
xor al,al
xor cx,cx
mov al,8
mov cx,0755
int 0x80
xor al,al
inc al
xor bl,bl
int 0x80
two:
call one
db 'H'
This is my assembly code.
Then I used two commands: "nasm -f elf64 newdir.s -o newdir.o" and "ld newdir.o -o newdir". I ran ./newdir and it worked fine, but when I extracted the opcodes and tried to test this shellcode with the following C program, it did not work (no segmentation fault). I compiled it using the command gcc newdir -z execstack
#include <stdio.h>
char sh[]="\xeb\x16\x5b\x30\xc0\x66\x31\xc9\xb0\x08\x66\xb9\xf3\x02\xcd\x80\x30\xc0\xfe\xc0\x30\xdb\xcd\x80\xe8\xe5\xff\xff\xff\x48";
void main(int argc, char **argv)
{
int (*func)();
func = (int (*)()) sh;
(int)(*func)();
}
objdump -d newdir
newdir: file format elf64-x86-64
Disassembly of section .text:
0000000000400080 <_start>:
400080: eb 16 jmp 400098 <two>
0000000000400082 <one>:
400082: 5b pop %rbx
400083: 30 c0 xor %al,%al
400085: 66 31 c9 xor %cx,%cx
400088: b0 08 mov $0x8,%al
40008a: 66 b9 f3 02 mov $0x2f3,%cx
40008e: cd 80 int $0x80
400090: 30 c0 xor %al,%al
400092: fe c0 inc %al
400094: 30 db xor %bl,%bl
400096: cd 80 int $0x80
0000000000400098 <two>:
400098: e8 e5 ff ff ff callq 400082 <one>
40009d: 48 rex.W
When I run ./a.out, I get something like what is shown in the photo. I am attaching a photo because I can't explain what is happening.
P.S. My problem is resolved, but I wanted to know where things were going wrong, so I used a debugger. The result is below:
(gdb) list
1 char shellcode[] = "\xeb\x16\x5b\x30\xc0\x66\x31\xc9\xb0\x08\x66\xb9\xf3\x02\xcd\x80\x30\xc0\xfe\xc0\x30\xdb\xcd\x80\xe8\xe5\xff\xff\xff\x48";
2 int main (int argc, char **argv)
3 {
4 int (*ret)();
5 ret = (int(*)())shellcode;
6
7 (int)(*ret)();
8 }
(gdb) disassemble main
Dump of assembler code for function main:
0x00000000000005fa <+0>: push %rbp
0x00000000000005fb <+1>: mov %rsp,%rbp
0x00000000000005fe <+4>: sub $0x20,%rsp
0x0000000000000602 <+8>: mov %edi,-0x14(%rbp)
0x0000000000000605 <+11>: mov %rsi,-0x20(%rbp)
0x0000000000000609 <+15>: lea 0x200a20(%rip),%rax # 0x201030 <shellcode>
0x0000000000000610 <+22>: mov %rax,-0x8(%rbp)
0x0000000000000614 <+26>: mov -0x8(%rbp),%rdx
0x0000000000000618 <+30>: mov $0x0,%eax
0x000000000000061d <+35>: callq *%rdx
0x000000000000061f <+37>: mov $0x0,%eax
0x0000000000000624 <+42>: leaveq
0x0000000000000625 <+43>: retq
End of assembler dump.
(gdb) b 7
Breakpoint 1 at 0x614: file test.c, line 7.
(gdb) run
Starting program: /root/Desktop/Progs/shell/a.out
Breakpoint 1, main (argc=1, argv=0x7fffffffe2b8) at test.c:7
7 (int)(*ret)();
(gdb) info registers rip
rip 0x555555554614 0x555555554614 <main+26>
(gdb) x/5i $rip
=> 0x555555554614 <main+26>: mov -0x8(%rbp),%rdx
0x555555554618 <main+30>: mov $0x0,%eax
0x55555555461d <main+35>: callq *%rdx
0x55555555461f <main+37>: mov $0x0,%eax
0x555555554624 <main+42>: leaveq
(gdb) s
(Control got stuck here, so I pressed Ctrl+C)
^C
Program received signal SIGINT, Interrupt.
0x0000555555755048 in shellcode ()
(gdb) x/5i 0x0000555555755048
=> 0x555555755048 <shellcode+24>: callq 0x555555755032 <shellcode+2>
0x55555575504d <shellcode+29>: rex.W add %al,(%rax)
0x555555755050: add %al,(%rax)
0x555555755052: add %al,(%rax)
0x555555755054: add %al,(%rax)
Here is the debugging information. I am not able to find where control goes wrong. If you need more info, please ask.
Below is a working example using x86-64; which could be further optimized for size. That last 0x00 null is ok for the purpose of executing the shellcode.
assemble & link:
$ nasm -felf64 -g -F dwarf pushpam_001.s -o pushpam_001.o && ld pushpam_001.o -o pushpam_001
Code:
BITS 64
section .text
global _start
_start:
jmp short two
one:
pop rdi ; pathname
xor rax, rax
add al, 85 ; creat syscall 64-bit Linux
xor rsi, rsi
add si, 0755 ; mode: NASM reads 0755 as decimal 755 (0x2f3); octal would be 0o755
syscall
xor rax, rax
add ax, 60
xor rdi, rdi
syscall
two:
call one
db 'H',0
objdump:
pushpam_001: file format elf64-x86-64
0000000000400080 <_start>:
400080: eb 1c jmp 40009e <two>
0000000000400082 <one>:
400082: 5f pop rdi
400083: 48 31 c0 xor rax,rax
400086: 04 55 add al,0x55
400088: 48 31 f6 xor rsi,rsi
40008b: 66 81 c6 f3 02 add si,0x2f3
400090: 0f 05 syscall
400092: 48 31 c0 xor rax,rax
400095: 66 83 c0 3c add ax,0x3c
400099: 48 31 ff xor rdi,rdi
40009c: 0f 05 syscall
000000000040009e <two>:
40009e: e8 df ff ff ff 48 00
.....H.
encoding extraction: There are many other ways to do this.
$ for i in `objdump -d pushpam_001 | grep "^ " | cut -f2`; do echo -n '\x'$i; done; echo
\xeb\x1c\x5f\x48\x31\xc0\x04\x55\x48\x31\xf6\x66\x81\xc6\xf3\x02\x0f\x05\x48\x31\xc0\x66\x83\xc0\x3c\x48\x31\xff\x0f\x05\xe8\xdf\xff\xff\xff\x48\x00\x.....H.
C shellcode.c - partial
...
unsigned char code[] = \
"\xeb\x1c\x5f\x48\x31\xc0\x04\x55\x48\x31\xf6\x66\x81\xc6\xf3\x02\x0f\x05\x48\x31\xc0\x66\x83\xc0\x3c\x48\x31\xff\x0f\x05\xe8\xdf\xff\xff\xff\x48\x00";
...
final:
./shellcode
--wxrw---t 1 david david 0 Jan 31 12:25 H
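The odd permissions in that listing are consistent with the `f3 02` bytes in the dump: NASM does not treat a leading 0 as an octal prefix, so 0755 is decimal 755, which is octal 1363. The sketch below renders mode bits the way `ls -l` does for a regular file. `mode_to_string` is a made-up name, and the umask of 002 is my guess, since it is not stated anywhere above.

```c
/* Hypothetical helper: render the low 12 mode bits as `ls -l` would for
 * a regular file. out must have room for 11 bytes. */
void mode_to_string(unsigned mode, char out[11])
{
    const char rwx[] = "rwxrwxrwx";

    out[0] = '-'; /* regular file */
    for (int i = 0; i < 9; i++)
        out[1 + i] = (mode & (0400u >> i)) ? rwx[i] : '-';

    /* setuid, setgid and sticky replace the matching execute slot */
    if (mode & 04000u) out[3] = (mode & 0100u) ? 's' : 'S';
    if (mode & 02000u) out[6] = (mode & 0010u) ? 's' : 'S';
    if (mode & 01000u) out[9] = (mode & 0001u) ? 't' : 'T';
    out[10] = '\0';
}
```

`mode_to_string(755 & ~002u, buf)` gives `--wxrw---t`, matching the listing, while the true octal value `0755` gives the usual `-rwxr-xr-x`.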
If int 0x80 in 64-bit code was the only problem, building your C test with gcc -fno-pie -no-pie would have worked, because then char sh[] would be in the low 32 bits of virtual address space, so system calls that truncate pointers to 32 bits would still work.
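The truncation itself is easy to model. In this sketch `survives_int80` is a hypothetical name; the PIE address is the one gdb shows for the array in the question, and the low address is just an illustrative value for a non-PIE build.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if a pointer keeps its value when only
 * its low 32 bits are passed on, as happens with the int 0x80 ABI. */
int survives_int80(uint64_t addr)
{
    return (uint64_t)(uint32_t)addr == addr;
}

/* Address of the array in the PIE build from the gdb session. */
const uint64_t pie_address = UINT64_C(0x555555755030);

/* Illustrative (assumed) address in a non-PIE executable. */
const uint64_t non_pie_address = UINT64_C(0x601040);
```

`survives_int80(pie_address)` is 0, which is why the kernel sees a bogus pointer, and `survives_int80(non_pie_address)` is 1.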
Run your program under strace to see what system calls it actually makes. (Except that strace decodes int 0x80 syscalls incorrectly in 64-bit code, decoding as if you'd used the 64-bit syscall ABI. The call numbers and arg registers are different.) But at least you can see the system-call return values (which will be -EFAULT for 32-bit creat with a truncated 64-bit pointer.)
You can also just use gdb to single-step and check the system call return values. Having strace decode the system-call inputs is really nice, though, so I'd recommend porting your code to use the 64-bit ABI, and then it would just work.
Also, it would actually be able to exploit 64-bit processes where the buffer overflow is in memory at an address outside the low 32 bits. (e.g. like the stack). So yes, you should really stop using int 0x80 or stick to 32-bit code.
You're also depending on registers being zeroed before your code runs, like they are on process startup, but not when called from anywhere else.
xor al,al before mov al,8 is completely pointless, because xor-zeroing al doesn't clear upper bytes. Writing 32-bit registers clears the upper 32, but not writing 8 or 16 bit registers. And if it did, you wouldn't need the xor-zeroing before using mov which is also write-only.
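The partial-register rules can be modeled with plain integer arithmetic. This is a simulation sketch of x86-64 write semantics, not real register access; the function names are made up.

```c
#include <stdint.h>

/* Writing an 8-bit register (e.g. al) leaves bits 8..63 untouched. */
uint64_t write_r8(uint64_t reg, uint8_t value)
{
    return (reg & ~UINT64_C(0xff)) | value;
}

/* Writing a 16-bit register (e.g. cx) leaves bits 16..63 untouched. */
uint64_t write_r16(uint64_t reg, uint16_t value)
{
    return (reg & ~UINT64_C(0xffff)) | value;
}

/* Writing a 32-bit register (e.g. eax) zero-extends into the full
 * 64-bit register, so the old contents are gone. */
uint64_t write_r32(uint64_t reg, uint32_t value)
{
    (void)reg; /* previous contents do not matter */
    return value;
}
```

Starting from junk such as 0x1122334455667788, `xor al,al` followed by `mov al,8` is two calls to `write_r8` and ends at 0x1122334455667708, whereas `xor eax,eax` followed by `mov al,8` ends at 8.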
If you want to set RAX=8 without any zero bytes in the machine code, you can
push 8 / pop rax (3 bytes)
xor eax,eax / mov al,8 (4 bytes)
Or given a zeroed rcx register, lea eax, [rcx+8] (3 bytes)
Setting CX to 0755 isn't so simple, because the constant doesn't fit in an imm8. Your 16-bit mov is a good choice (or would have been if you'd zeroed rcx first).
xor ecx,ecx
lea eax, [rcx+8] ; SYS_creat = 8 from unistd_32.h
mov cx, 0755 ; mode (NASM parses this as decimal 755; write 0o755 for octal)
int 0x80 ; invoke 32-bit ABI
xor ebx,ebx
lea eax, [rbx+1] ; SYS_exit = 1
int 0x80
I am writing a bootloader as follows:
bits 16
[org 0x7c00]
KERN_OFFSET equ 0x1000
mov [BOOTDISK], dl
mov dl, 0x0 ;0 is for floppy-disk
mov ah, 0x2 ;Read function for the interrupt
mov al, 0x15 ;Read 0x15 (21) sectors containing the kernel
mov ch, 0x0 ;Use cylinder 0
mov cl, 0x2 ;Start from the second sector which contains kernel
mov dh, 0x0 ;Read head 0
mov bx, KERN_OFFSET
int 0x13
jc disk_error
cmp al, 0x15
jne disk_error
jmp KERN_OFFSET:0x0
jmp $
disk_error:
jmp $
BOOTDISK: db 0
times 510-($-$$) db 0
dw 0xaa55
The kernel is a simple C program which prints "e" on the VGA display (seen on QEmu):
void main()
{
extern void put_in_mem();
char c = 'e';
put_in_mem(c, 0xA0);
}
This code runs in 16-bit real mode in QEMU, so I compile it with bcc:
bcc -ansi -c -o kernel.o kernel.c
I have the following questions:
1. When I try to disassemble this code, using
objdump -D -b binary -mi386 kernel.o
I get an output like this (only initial portion of output):
kernel.o: file format binary
Disassembly of section .data:
00000000 <.data>:
0: a3 86 01 00 2a mov %eax,0x2a000186
5: 3e 00 00 add %al,%ds:(%eax)
8: 00 22 add %ah,(%edx)
a: 00 00 add %al,(%eax)
c: 00 19 add %bl,(%ecx)
e: 00 00 add %al,(%eax)
10: 00 55 55 add %dl,0x55(%ebp)
13: 55 push %ebp
14: 55 push %ebp
15: 00 00 add %al,(%eax)
17: 00 02 add %al,(%edx)
19: 22 00 and (%eax),%al
This output does not seem to correspond to the kernel.c file I wrote. For example, I could not see where 'e' is stored as ASCII 0x65 or where the call to put_in_mem is made. Is something wrong with the way I am disassembling the code?
To make the object file of the kernel for QEmu I used the following command:
ld86 -o kernel -d kernel.o put_in_mem.o
Here put_in_mem.o is the object file created after assembling the put_in_mem.asm file which contains the definition of the function put_in_mem() used in kernel.c.
Then floppy image for QEmu is made using:
cat boot.o kernel > floppy_img
But when I look at address 0x10000 (using GDB), where the kernel is supposed to be after boot.asm loads it, the kernel is not there.
Why is this happening?
Further, in ld command we used -Ttext option to specify the load address of the binary, should we use some similar option here with ld86?
Your kernel.o is in an object file format not understood by objdump so it tries to disassemble everything in it, including headers and whatnot. Try to disassemble the linked output kernel instead. Also objdump might not understand 16 bit code. Better try objdump86 if you have that available.
As to why it's not present: you are looking at the wrong place. You are loading it to offset 0x1000 (3 zeroes) but you are looking at 0x10000 (4 zeroes). Also note that you don't set up ES which is bad practice. Maybe you intended to set ES to 0x1000 and BX to 0x0000 and then you would find your kernel at 0x10000 physical address.
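The difference between the two addresses comes straight from real-mode address arithmetic: physical = segment * 16 + offset. A tiny sketch with a made-up helper name, `real_mode_phys`:

```c
#include <stdint.h>

/* Hypothetical helper: physical address of segment:offset in real mode. */
uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

Assuming ES happens to be 0 when the BIOS call is made, `real_mode_phys(0x0000, 0x1000)` is 0x1000, which is where the sectors land, while `jmp KERN_OFFSET:0x0` goes to `real_mode_phys(0x1000, 0x0000)`, which is 0x10000.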
The -Ttext doesn't influence loading, it only specifies where the code expects to find itself.
I am trying to learn how to write shellcode. After searching around, I wrote my own shellcode for hello world. I think the logic is correct, but somehow when I compile the wrapper with the shellcode, it always gives me "illegal instruction".
Could anybody help me to check what is wrong with this code:
Shellcode
.section .data
.section .text
.globl _start
jmp dummy
_start:
# write(1, message, 13)
mov $4, %al # system call 4 is write
mov $1, %bl # file handle 1 is stdout
popl %ecx
mov $12, %dl # number of bytes to write
int $0x80 # invoke operating system code
# exit(0)
xor %eax, %eax
mov $1, %al # system call 1 is exit
xor %ebx, %ebx # we want return code 0
int $0x80 # invoke operating system code
dummy:
call _start
.string "Hello, World"
After running objdump:
file format elf32-i386
Disassembly of section .text:
00000000 <_start-0x2>:
0: eb 11 jmp 13 <dummy>
00000002 <_start>:
2: b0 04 mov $0x4,%al
4: b3 01 mov $0x1,%bl
6: 59 pop %ecx
7: b2 0c mov $0xc,%dl
9: cd 80 int $0x80
b: 31 c0 xor %eax,%eax
d: b0 01 mov $0x1,%al
f: 31 db xor %ebx,%ebx
11: cd 80 int $0x80
00000013 <dummy>:
13: e8 fc ff ff ff call 14 <dummy+0x1>
18: 48 dec %eax
19: 65 gs
1a: 6c insb (%dx),%es:(%edi)
1b: 6c insb (%dx),%es:(%edi)
1c: 6f outsl %ds:(%esi),(%dx)
1d: 2c 20 sub $0x20,%al
1f: 57 push %edi
20: 6f outsl %ds:(%esi),(%dx)
21: 72 6c jb 8f <dummy+0x7c>
23: 64 fs
...
The C Wrapper I used
char code[] = "\xeb\x11"
"\xb0\x04"
"\xb3\x01"
"\x59"
"\xb2\x0c"
"\xcd\x80"
"\x31\xc0"
"\xb0\x01"
"\x31\xdb"
"\xcd\x80"
"\xe8\xfc\xff\xff\xff"
"\x48\x65\x6c\x6c\x6f\x2c\x20\x57\x6f\x72\x6c\x64";
void main() {
int (*func)();
func = (int(*)()) code;
(int) (*func)();
}
There are several problems with your code.
First and foremost, you're only setting the low byte of all of the parameter registers (namely al, bl, and dl). You need to set the full 32 bits. When you execute the way it is now, whatever is left in the remaining 24 bits gets passed to the kernel.
Also, in your C code, the call is not correct:
"\xe8\xfc\xff\xff\xff"
That's essentially call $+1 which is the second byte of the call instruction, which is why you're getting the illegal instruction.
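To double-check where a call lands, decode its rel32 operand and add it to the address of the next instruction. A minimal sketch; `call_target` is a hypothetical name, and offsets are byte offsets from the start of the code.

```c
#include <stdint.h>

/* Hypothetical helper: given a 5-byte e8/e9 instruction located at
 * insn_off, return the offset it transfers control to. */
uint32_t call_target(uint32_t insn_off, const unsigned char insn[5])
{
    uint32_t disp = (uint32_t)insn[1]
                  | ((uint32_t)insn[2] << 8)
                  | ((uint32_t)insn[3] << 16)
                  | ((uint32_t)insn[4] << 24);

    /* Relative to the end of the instruction; unsigned wraparound
     * takes care of negative displacements. */
    return insn_off + 5u + disp;
}
```

For `e8 fc ff ff ff` at offset 0x13 the result is 0x14, one byte into the call itself. A call at offset 25 encoded as `e8 e4 ff ff ff` resolves to offset 2.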
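I'm not sure how you arrived at the bytes in your code variable (the dump starts at address 0, so it looks like an unlinked object file in which the call's displacement is still an unresolved placeholder), but you need to re-assemble.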
Tested with gcc 4.7.2 on Fedora 17, with gcc -m32. (Sorry, I only use Intel syntax)
char code[] __attribute__((section(".text"))) =
"\xeb\x17" // jmp $+25
"\xB8\x04\x00\x00\x00" // mov eax, 4 ; (sys_write)
"\x31\xDB" // xor ebx, ebx
"\x43" // inc ebx
"\x59" // pop ecx ; (addr of string pushed by call below)
"\x31\xD2" // xor edx, edx
"\xb2\x0c" // mov dl, 0Ch ; (length of string)
"\xcd\x80" // int 80h
"\x31\xc0" // xor eax, eax
"\xb0\x01" // mov al, 1 ; (sys_exit)
"\x31\xdb" // xor ebx, ebx
"\xcd\x80" // int 80h
"\xe8\xe4\xff\xff\xff" // call $-23
"\x48\x65\x6c\x6c\x6f\x2c\x20\x57\x6f\x72\x6c\x64\x00"; // "Hello, World"
void main() {
int (*func)();
func = (int(*)()) code;
(int) (*func)();
}
Note that there are certainly ways to make the code smaller, but that is of course left as an exercise for the reader.
If you're going to play around with hand-tweaked assembly like this, be prepared to debug, debug, debug. Learn how to use GDB now, or you will be forever helpless. Set a breakpoint on the beginning of the assembly (b code) and step through it. You'll quickly see what went wrong.