Why it is showing frame pointer in out of 3GB address space? - c

I get the frame at 0xffffd3d0 and saveed eip = 0xf7e04e7e in stack level 0 while doing gdb debug.
(gdb) info frame
Stack level 0, frame at 0xffffd3d0:
eip = 0x8048452 in main (test.c:13); saved eip = 0xf7e04e7e
source language c.
Arglist at 0xffffd3b8, args:
Locals at 0xffffd3b8, Previous frame's sp is 0xffffd3d0
Saved registers:
ebp at 0xffffd3b8, eip at 0xffffd3cc
(gdb)
Here is my question about 3GB address space of userspace. Why it is showing frame pointer in out of 3GB address space ?
Normally, the address space of the user space is 0 to 0xc000000 in a 3: 1 virtual address distribution.

3GB limit does not apply to 64-bit processes

Related

Get the current size of the stack in bytes with GDB

Is it possible to get the current size of the stack in bytes with GDB (at a breakpoint)?
I didn't find anything regarding this on the Internet.
It's unclear whether you are asking "how much stack have my thread consumed so far", or "what is the maximum stack size this thread may consume in the future".
The first question can be trivially answered by using:
# go to the innermost frame
(gdb) down 100000
(gdb) set var $stack = $sp
# go to the outermost frame
(gdb) up 100000
(gdb) print $sp - $stack
To answer the second question, you would need libpthread that is built with debug symbols. If using GLIBC, you can do this:
# Go to frame which is `start_thread`
(gdb) frame 2
#2 0x00007ffff7d7eeae in start_thread (arg=0x7ffff7a4c640) at pthread_create.c:463
463 in pthread_create.c
(gdb) p pd.stackblock
$1 = (void *) 0x7ffff724c000 # low end of stack block
(gdb) p pd.stackblock_size
$1 = 8392704
Here you can see that the entire stack spans [0x7ffff724c000, 0x7ffff7a4d000] region. You can also confirm that $sp is in that region, near the high address end of the stack (which grows from high to low addresses on this system):
(gdb) p $sp
$9 = (void *) 0x7ffff7a4be60

how to get line numbers same as lldb using atos/addr2line/llvm-symbolizer/lldb image lookup --address

I want to programmatically convert backtrace stack addresses (eg obtained from backtrace_symbols/libunwind) to file:line:column. I'm on OSX but doubt this makes a difference.
All of these give wrong line number (line 11) for the call to fun1():
atos
addr2line
llvm-symbolizer
lldb image lookup --address using lldb's pc addresses in bt
lldb bt itself gives correct file:line:column, (line 7) as shown below.
How do I programmatically get the correct stack address such that, when using atos/addr2line/llvm-symbolizer/image lookup --address, it would resolve to the correct line number? lldb bt is doing it correctly, so there must be a way to do it. Note that if I use backtrace_symbols or libunwind (subtracted from info.dli_saddr after calling dladdr), I'd end up with the same address 0x0000000100000f74 as shown in lldb bt that points to the wrong line number 11
Note: in .lldbinit, if I add settings set frame-format frame start-addr:${line.start-addr}\n it will show the correct address (ie resolve to 0x0000000100000f6f instead of 0x0000000100000f74, which will resolve to the correct line 7). However, how do I programmatically generate start-addr from a c program without calling spawning a call to lldb -p $pid (calling lldb has other issues, eg overhead compared to llvm-symbolizer, and in fact can hang forever even with -batch).
clang -g -o /tmp/z04 test_D20191123T162239.c
test_D20191123T162239.c:
void fun1(){
}
void fun1_aux(){
int a = 0;
fun1(); // line 7
mylabel:
if(1){
a++; // line 11
}
}
int main(int argc, char *argv[]) {
fun1_aux();
return 0;
}
lldb /tmp/z04
(lldb) target create "/tmp/z04"
Current executable set to '/tmp/z04' (x86_64).
(lldb) b fun1
Breakpoint 1: where = z04`fun1 + 4 at test_D20191123T162239.c:2:1, address = 0x0000000100000f54
(lldb) r
Process 7258 launched: '/tmp/z04' (x86_64)
Process 7258 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x0000000100000f54 z04 fun1 + 4 at test_D20191123T162239.c:2:1
1 void fun1(){
-> 2 }
3
4 void fun1_aux(){
5 int a = 0;
6
7 fun1();
Target 0: (z04) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
* frame #0: 0x0000000100000f54 z04 fun1 + 4 at test_D20191123T162239.c:2:1
frame #1: 0x0000000100000f74 z04 fun1_aux + 20 at test_D20191123T162239.c:7:3
frame #2: 0x0000000100000fab z04 main(argc=1, argv=0x00007ffeefbfb748) + 27 at test_D20191123T162239.c:16:3
frame #3: 0x00007fff71c182e5 libdyld.dylib start + 1
frame #4: 0x00007fff71c182e5 libdyld.dylib start + 1
(lldb)
(lldb) image lookup --address 0x0000000100000f74
Address: z04[0x0000000100000f74] (z04.__TEXT.__text + 36)
Summary: z04`fun1_aux + 20 at test_D20191123T162239.c:11:8
echo 0x0000000100000f74 | llvm-symbolizer -obj=/tmp/z04
fun1_aux
test_D20191123T162239.c:11:8
atos -o /tmp/z04 0x0000000100000f74
fun1_aux (in z04) (test_D20191123T162239.c:11)
likewise with addr2line
It's easier to understand if you look at the disassembly for fun1_aux -- you'll see a CALLQ instruction to fun1, followed by something like a mov %rax, $rbp-16 or something like that, the first instruction of your a++ line. When you have called fun1, the return address is the instruction that will be executed when fun1 exits, the mov %rax, $rbp-16 or whatever.
This isn't intuitively how most people think of the computer working -- they expect to look at frame 1, fun1_aux, and see the "current pc value" be the CALLQ, because the call is executing. But of course, that's not correct, the call instruction has completed, and the saved pc is going to point to the next instruction.
In cases like this, the next instruction is part of the next source line, so it's a little extra confusing. Even better is if you have a function that calls a "noreturn" function like abort() -- the final instruction in the function will be a CALLQ, and if you look at the return address instruction, it may point to the next function.
So when lldb is symbolicating stack frames above frame #0, it knows to do a symbol lookup with saved_pc - 1 to move the address back into the CALLQ instruction. That's not a valid address, so it should never show you saved_pc - 1, but it should do symbol / file & line lookups based on it.
You can get the same effect for your manual symbolication by doing the same thing. The one caveat is if you have an asynchronous interrupt (_sigtramp on macOS), the frame above _sigtramp should not have its saved pc value decremented. You could be executing the first instruction of a function when the signal is received, and decrementing it would put you in the previous function which would be very confusing.

stack frames and gdb

I'm new to reverse engeneering. I wrote the following C code to help me understand a bit more about stack frames.
#include <stdio.h>
int sum(int a, int b,int c)
{
return(a+b+c);
}
int media(int a, int b,int c)
{
int total;
total = a + b + c;
return (total/3);
}
int main ()
{
int num1,num2,num3;
char keypress[1];
num1 = 5;
num2 = 10;
num3 = 15;
printf ("\nCalling sum function\n");
sum(num1,num2,num3);
printf ("\nWaiting a keypress to call media function\n");
scanf ("%c",keypress);
media(num1,num2,num3);
printf ("\nWaiting a keypress to end\n");
scanf ("%c",keypress);
return(0);
}
As far as I know every time you call a function
a stack frame is created (see: ftp.gnu.org/old-gnu/Manuals/gdb/html_node/gdb_41.html). So, my goal with the above C code is to see, at least, three stack-frames.
1) main function - stack frame
2) sum function - stack frame
3) media function - stack frame
BTW: Those printfs are just to help me 'follow' the program in gdb. =)
So I guess if I compare the output of info frame after the program started with the output of info frame just after sum function is called I would get different output right? I did not got it as you can see:
Temporary breakpoint 1, main () at parastack.c:27
warning: Source file is more recent than executable.
27 num1 = 5;
(gdb) nexti
28 num2 = 10;
(gdb) info frame
Stack level 0, frame at 0x7fffffffdf00:
rip = 0x400605 in main (parastack.c:28); saved rip = 0x7ffff7a3c790
source language c.
Arglist at 0x7fffffffdef0, args:
Locals at 0x7fffffffdef0, Previous frame's sp is 0x7fffffffdf00
Saved registers:
rbp at 0x7fffffffdef0, rip at 0x7fffffffdef8
(gdb) nexti
29 num3 = 15;
(gdb) nexti
31 printf ("\nCalling sum function\n");
(gdb) nexti
0x0000000000400618 31 printf ("\nCalling sum function\n");
(gdb) nexti
Calling sum function
32 sum(num1,num2,num3);
(gdb) info frame
Stack level 0, frame at 0x7fffffffdf00:
rip = 0x40061d in main (parastack.c:32); saved rip = 0x7ffff7a3c790
source language c.
Arglist at 0x7fffffffdef0, args:
Locals at 0x7fffffffdef0, Previous frame's sp is 0x7fffffffdf00
Saved registers:
rbp at 0x7fffffffdef0, rip at 0x7fffffffdef8
(gdb) nexti
0x0000000000400620 32 sum(num1,num2,num3);
(gdb) info frame
Stack level 0, frame at 0x7fffffffdf00:
rip = 0x400620 in main (parastack.c:32); saved rip = 0x7ffff7a3c790
source language c.
Arglist at 0x7fffffffdef0, args:
Locals at 0x7fffffffdef0, Previous frame's sp is 0x7fffffffdf00
Saved registers:
rbp at 0x7fffffffdef0, rip at 0x7fffffffdef8
just after sum function is called
Your problem is that you never actually stopped inside of the sum function. You stopped after you printed that you are about to call it, and then you stepped a few instructions, but you never actually landed inside (it takes a few instructions to prepare arguments, one more to actually call, and few more inside the function to set up the frame).
You should start by setting breakpoints inside sum and media, and doing info frame when these breakpoints are hit. You'll notice that the breakpoint is set a few instructions after the beginning of the function (i.e. after function prolog). The skipped instructions are exactly the ones that set up the new frame.
After you understand how that works, you should progress to using step and next commands.
And after that you can graduate to using disas, stepi and nexti commands.
Based on my interpretation of your prose, your understanding of stack frames is slightly off. You are correct that when a function is called a stack frame is created, however, what you're missing is that when a function returns, the stack frame is popped. The stack is in the same state is was before the function began executing except that the program counter contains the address of the first instruction of the statement immediately following the function that just finished executing. So, you should not expect to see 3 stack frames after the two functions in main execute. You will only see one since you're only one frame deep into main().
As for the gdb session, as #Employed Russian points out, you never actually step into any function when printing the stack frame information.
Thanks for everyone that helped me. Below are the gdb session with shows that the stack-frame changed.
First I recompiled the C code: gcc -ggdb stack.c -o stack.bin
gdb stack.bin
(gdb) break sum
(gdb) start
(gdb) info frame
Stack level 0, frame at 0x7fffffffe1a0:
rip = 0x400653 in main (stack.c:26); saved rip 0x7ffff7a6fead
source language c.
Arglist at 0x7fffffffe190, args:
Locals at 0x7fffffffe190, Previous frame's sp is 0x7fffffffe1a0
Saved registers:
rbp at 0x7fffffffe190, rip at 0x7fffffffe198
(gdb) continue
Continuing.
Calling sum function
Breakpoint 1, sum (a=5, b=10, c=15) at stack.c:6
6 total = a + b + c;
(gdb) info frame
Stack level 0, frame at 0x7fffffffe180:
rip = 0x4005dd in sum (stack.c:6); saved rip 0x400684
called by frame at 0x7fffffffe1a0
source language c.
Arglist at 0x7fffffffe170, args: a=5, b=10, c=15
Locals at 0x7fffffffe170, Previous frame's sp is 0x7fffffffe180
Saved registers:
rbp at 0x7fffffffe170, rip at 0x7fffffffe178
Now I will search/learn more about the information in the output.

gdb find memory address of line number

Say I have attached gdb to a process and within the its memory layout there is a file and line number which I would like the memory address of. How can I get the memory address of line n in file x? This is on Linux x86.
(gdb) info line test.c:56
Line 56 of "test.c" starts at address 0x4005ae <main+37>
and ends at 0x4005ba <main+49>.
additionally with python you may be able to use the 'last' attribute from
Symbol-Tables-In-Python this currently requires a very recent version of gdb from cvs, but i imagine will have general availability in 7.5
(gdb) py x = gdb.find_pc_line(gdb.decode_line("test.c:56")[1][0].pc); gdb.execute("p/x " + str(x.pc)); gdb.execute("p/x " + str(x.last))
$15 = 0x4005ae
$16 = 0x4005b9

How can one see content of stack with GDB?

I am new to GDB, so I have some questions:
How can I look at content of the stack?
Example: to see content of register, I type info registers. For the stack, what should it be?
How can I see the content of $0x4(%esp)? When I type print /d $0x4(%esp), GDB gives an error.
Platform: Linux and GDB
info frame to show the stack frame info
To read the memory at given addresses you should take a look at x
x/x $esp for hex x/d $esp for signed x/u $esp for unsigned etc. x uses the format syntax, you could also take a look at the current instruction via x/i $eip etc.
Use:
bt - backtrace: show stack functions and args
info frame - show stack start/end/args/locals pointers
x/100x $sp - show stack memory
(gdb) bt
#0 zzz () at zzz.c:96
#1 0xf7d39cba in yyy (arg=arg#entry=0x0) at yyy.c:542
#2 0xf7d3a4f6 in yyyinit () at yyy.c:590
#3 0x0804ac0c in gnninit () at gnn.c:374
#4 main (argc=1, argv=0xffffd5e4) at gnn.c:389
(gdb) info frame
Stack level 0, frame at 0xffeac770:
eip = 0x8049047 in main (goo.c:291); saved eip 0xf7f1fea1
source language c.
Arglist at 0xffeac768, args: argc=1, argv=0xffffd5e4
Locals at 0xffeac768, Previous frame's sp is 0xffeac770
Saved registers:
ebx at 0xffeac75c, ebp at 0xffeac768, esi at 0xffeac760, edi at 0xffeac764, eip at 0xffeac76c
(gdb) x/10x $sp
0xffeac63c: 0xf7d39cba 0xf7d3c0d8 0xf7d3c21b 0x00000001
0xffeac64c: 0xf78d133f 0xffeac6f4 0xf7a14450 0xffeac678
0xffeac65c: 0x00000000 0xf7d3790e
You need to use gdb's memory-display commands. The basic one is x, for examine. There's an example on the linked-to page that uses
gdb> x/4xw $sp
to print "four words (w ) of memory above the stack pointer (here, $sp) in hexadecimal (x)". The quotation is slightly paraphrased.

Resources