Understanding reversing C prog. in assembly - disassembly

I have just started learning reverse engineering (self-study). I know assembly up to a point: I understand what the basic instructions that show up after disassembling C code mean. Since I'm a beginner, these may seem like dumb questions, so if someone could suggest a good e-book on the basics of reversing, I could stop asking newbie questions. My question is:
Here is a simple C code
#include<stdio.h>
int main(void)
{
printf("hello world");
}
and the following is the disassembled code of main:
0x004013b0 <+0>: push %ebp //saves ebp to stack
0x004013b1 <+1>: mov %esp,%ebp //copies esp into ebp
0x004013b3 <+3>: and $0xfffffff0,%esp //aligning the stack
0x004013b6 <+6>: sub $0x10,%esp //reserving 16 bytes on the stack
0x004013b9 <+9>: call 0x401980 <__main> //call to __main
0x004013be <+14>: movl $0x403064,(%esp) //?? what exactly is this doing ??
0x004013c5 <+21>: call 0x401bf0 <printf> //print call
0x004013ca <+26>: leave
0x004013cb <+27>: ret
Here I couldn't understand what this instruction is doing (although it seems like the contents at 0x403064 are copied onto the stack at esp): movl $0x403064,(%esp)
In this assembly code, where is "hello world" stored?
Also, could somebody suggest some good reading for learning reversing from the basics? Thanks in advance.

printf expects its parameters on the stack in this case. The address where your string is stored in memory is 0x403064. So the instruction
movl $0x403064,(%esp)
copies this address onto the stack (note the parentheses around esp: the value is written to the memory that esp points to, not into the register itself).
To be honest, this is not the usual way. But your program is so simple that the compiler does a micro-optimization here, which saves some machine instructions. In general, one would use some combination of lea and push instructions to copy the address onto the stack, and after the call (in the cdecl calling convention, which we have here) an add instruction is typically used to correct esp.
EDIT:
Here is a debugging session with gdb; I'm using the command x/sb 0x403064 to show the string in memory.
GNU gdb (GDB) 7.5
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-mingw32".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from d:\temp\C++11\test.exe...(no debugging symbols found)...done.
(gdb) start
Temporary breakpoint 1 at 0x4013b3
Starting program: d:\temp\C++11\test.exe
[New Thread 5348.0x16f4]
Temporary breakpoint 1, 0x004013b3 in main ()
(gdb) disassemble $eip
Dump of assembler code for function main:
0x004013b0 <+0>: push %ebp
0x004013b1 <+1>: mov %esp,%ebp
=> 0x004013b3 <+3>: and $0xfffffff0,%esp
0x004013b6 <+6>: sub $0x10,%esp
0x004013b9 <+9>: call 0x401990 <__main>
0x004013be <+14>: movl $0x403064,(%esp)
0x004013c5 <+21>: call 0x401c10 <printf>
0x004013ca <+26>: mov $0x0,%eax
0x004013cf <+31>: leave
0x004013d0 <+32>: ret
0x004013d1 <+33>: nop
0x004013d2 <+34>: nop
0x004013d3 <+35>: nop
0x004013d4 <+36>: add %al,(%eax)
0x004013d6 <+38>: add %al,(%eax)
0x004013d8 <+40>: add %al,(%eax)
0x004013da <+42>: add %al,(%eax)
0x004013dc <+44>: add %al,(%eax)
0x004013de <+46>: add %al,(%eax)
End of assembler dump.
(gdb) x/sb 0x403064
0x403064 <_Jv_RegisterClasses+4206692>: "hello world"
(gdb) x/12xb 0x403064
0x403064 <_Jv_RegisterClasses+4206692>: 0x68 0x65 0x6c 0x6c 0x6f 0x20 0x77 0x6f
0x40306c <_Jv_RegisterClasses+4206700>: 0x72 0x6c 0x64 0x00
(gdb)

Related

How can I access particular memory address during a GDB session?

This is the disassembly of a very simple C program (strcpy() a constant string and print it):
No symbol table is loaded. Use the "file" command.
Reading symbols from string...done.
(gdb) break 6
Breakpoint 1 at 0x6b8: file string.c, line 6.
(gdb) break 7
Breakpoint 2 at 0x6f2: file string.c, line 7.
(gdb) r
Starting program: /home/wsllnx/Detached/string
Breakpoint 1, main () at string.c:6
6 strcpy(buf, "Memento Mori\n\tInjected_string");
(gdb) disass main
Dump of assembler code for function main:
0x00005555554006b0 <+0>: push %rbp
0x00005555554006b1 <+1>: mov %rsp,%rbp
0x00005555554006b4 <+4>: sub $0x70,%rsp
0x00005555554006b8 <+8>: lea -0x70(%rbp),%rax
0x00005555554006bc <+12>: movabs $0x206f746e656d654d,%rdx
0x00005555554006c6 <+22>: mov %rdx,(%rax)
0x00005555554006c9 <+25>: movabs $0x6e49090a69726f4d,%rcx
0x00005555554006d3 <+35>: mov %rcx,0x8(%rax)
0x00005555554006d7 <+39>: movabs $0x735f64657463656a,%rsi
0x00005555554006e1 <+49>: mov %rsi,0x10(%rax)
0x00005555554006e5 <+53>: movl $0x6e697274,0x18(%rax)
0x00005555554006ec <+60>: movw $0x67,0x1c(%rax)
0x00005555554006f2 <+66>: lea -0x70(%rbp),%rax
0x00005555554006f6 <+70>: mov %rax,%rdi
0x00005555554006f9 <+73>: mov $0x0,%eax
0x00005555554006fe <+78>: callq 0x555555400560 <printf@plt>
0x0000555555400703 <+83>: mov $0x0,%eax
0x0000555555400708 <+88>: leaveq
0x0000555555400709 <+89>: retq
End of assembler dump.
(gdb)
I am currently learning how to fully use GDB and I was wondering:
How can I access a particular address like '0x206f746e656d654d'? When I try to do so with x/x or x/s, GDB says:
'0x206f746e656d654d: Cannot access memory at address 0x206f746e656d654d'
Same goes for 0x6e49090a69726f4d, 0x735f64657463656a and so on...
Thanks in advance to all the useful answers.
Those aren't actually memory addresses. As an optimization, the compiler represents the string's ASCII bytes as 64-bit immediate constants: instead of actually calling strcpy(), it moves the string data into the buffer through registers.
0x206f746e656d654d holds the ASCII values of the string 'Memento ' (with a trailing space) in x86 little-endian byte order.

gdb addresses: 0x565561f5 instead of 0x41414141

I want to try a buffer overflow on a C program. I compiled it with gcc -fno-stack-protector -m32 buggy_program.c. If I run the program in gdb and overflow the buffer, it should report 0x41414141, because I sent A's. But it reports 0x565561f5. Sorry for my bad English. Can somebody help me?
This is the source code:
#include <stdio.h>
int main(int argc, char **argv)
{
char buffer[64];
printf("Type in something: ");
gets(buffer);
}
Starting program: /root/Downloads/a.out
Type in something: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Program received signal SIGSEGV, Segmentation fault.
0x565561f5 in main ()
I want to see this:
Starting program: /root/Downloads/a.out
Type in something: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Program received signal SIGSEGV, Segmentation fault.
0x41414141 in main ()
Looking at the address at which the process segfaulted shows the relevant line in the disassembled code:
gdb a.out <<EOF
set logging on
r < inp
disassemble main
x/i $eip
p/x $esp
EOF
Produces the following output:
(gdb) Starting program: .../a.out < in
Program received signal SIGSEGV, Segmentation fault.
0x08048482 in main (argc=, argv=) at tmp.c:10
10 }
(gdb) Dump of assembler code for function main:
0x08048436 <+0>: lea 0x4(%esp),%ecx
0x0804843a <+4>: and $0xfffffff0,%esp
0x0804843d <+7>: pushl -0x4(%ecx)
0x08048440 <+10>: push %ebp
0x08048441 <+11>: mov %esp,%ebp
0x08048443 <+13>: push %ebx
0x08048444 <+14>: push %ecx
0x08048445 <+15>: sub $0x40,%esp
0x08048448 <+18>: call 0x8048370 <__x86.get_pc_thunk.bx>
0x0804844d <+23>: add $0x1bb3,%ebx
0x08048453 <+29>: sub $0xc,%esp
0x08048456 <+32>: lea -0x1af0(%ebx),%eax
0x0804845c <+38>: push %eax
0x0804845d <+39>: call 0x8048300
0x08048462 <+44>: add $0x10,%esp
0x08048465 <+47>: sub $0xc,%esp
0x08048468 <+50>: lea -0x48(%ebp),%eax
0x0804846b <+53>: push %eax
0x0804846c <+54>: call 0x8048310
0x08048471 <+59>: add $0x10,%esp
0x08048474 <+62>: mov $0x0,%eax
0x08048479 <+67>: lea -0x8(%ebp),%esp
0x0804847c <+70>: pop %ecx
0x0804847d <+71>: pop %ebx
0x0804847e <+72>: pop %ebp
0x0804847f <+73>: lea -0x4(%ecx),%esp
=> 0x08048482 <+76>: ret
End of assembler dump.
(gdb) => 0x8048482 : ret
(gdb) $1 = 0x4141413d
(gdb) quit
The failing statement is the ret at the end of main. The program fails when ret attempts to load the return address from the top of the stack. The generated executable saves the old value of esp on the stack before aligning the stack pointer. When main finishes, the program attempts to restore esp from the stack and then read the return address. However, the whole top of the stack has been overwritten, which makes the restored stack pointer garbage ($1 = 0x4141413d). When ret executes, it attempts to read a word from address 0x4141413d, which isn't mapped, and that produces a segfault.
Notes
The above disassembly was produced from the code in the question using the following compiler-options:
-m32 -fno-stack-protector -g -O0
So guys, I found a solution:
Just compile it with gcc 3.3.4
gcc -m32 buggy_program.c
Modern operating systems use address space layout randomization (ASLR) to make this stuff not work quite so easily.
I remember the controversy when it was first started. ASLR was kind of a bad idea for 32 bit processes due to the number of other constraints it imposed on the system and dubious security benefit. On the other hand, it works great on 64 bit processes and almost everybody uses it now.
You don't know where the code is. You don't know where the heap is. You don't know where the stack is. Writing exploits is hard now.
Also, you tried to use 32 bit shellcode and documentation on a 64 bit process.
On reading the updated question: your code is compiled with frame pointers (which is the default). This causes the ret instruction itself to fault because esp is trashed. ASLR appears to still be in play, but most likely it doesn't really matter.

Buffer Overflow strcpy()

I would like to know how many bytes we have to overflow in order to run shellcode.
int fun (char data[256]){
int i;
char *tmp;
strcpy(tmp,data);
}
It is known that:
If the string in data is larger than tmp, there will be an overflow.
Otherwise there will be no buffer overflow.
tmp isn't initialized, so you're usually going to get a segmentation fault anyway.
A better example would be to change char *tmp; to something like char tmp[64]; and copy more than 64 bytes from data into tmp. To answer your question from there, you'll need to fire up a debugger like gdb after changing the code and see how far you can write before you overwrite RIP. On my system that's 78 bytes.
marshall@marshall-debian-testbed:~$ cat bof.c
int fun (char data[256]) {
int i;
char tmp[64];
strcpy(tmp,data);
}
int main (int argc, char *argv[]) {
fun(argv[1]);
return(0);
}
marshall@marshall-debian-testbed:~$ gcc bof.c -o bof
bof.c: In function ‘fun’:
bof.c:4:1: warning: implicit declaration of function ‘strcpy’ [-Wimplicit-function-declaration]
strcpy(tmp,data);
^~~~~~
bof.c:4:1: warning: incompatible implicit declaration of built-in function ‘strcpy’
bof.c:4:1: note: include ‘<string.h>’ or provide a declaration of ‘strcpy’
marshall#marshall-debian-testbed:~$ ./bof AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Segmentation fault
marshall@marshall-debian-testbed:~$ gdb ./bof
GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./bof...(no debugging symbols found)...done.
(gdb) disas main
Dump of assembler code for function main:
0x00000000000006d2 <+0>: push %rbp
0x00000000000006d3 <+1>: mov %rsp,%rbp
0x00000000000006d6 <+4>: sub $0x10,%rsp
0x00000000000006da <+8>: mov %edi,-0x4(%rbp)
0x00000000000006dd <+11>: mov %rsi,-0x10(%rbp)
0x00000000000006e1 <+15>: mov -0x10(%rbp),%rax
0x00000000000006e5 <+19>: add $0x8,%rax
0x00000000000006e9 <+23>: mov (%rax),%rax
0x00000000000006ec <+26>: mov %rax,%rdi
0x00000000000006ef <+29>: callq 0x6b0 <fun>
0x00000000000006f4 <+34>: mov $0x0,%eax
0x00000000000006f9 <+39>: leaveq
0x00000000000006fa <+40>: retq
End of assembler dump.
(gdb) disas fun
Dump of assembler code for function fun:
0x00000000000006b0 <+0>: push %rbp
0x00000000000006b1 <+1>: mov %rsp,%rbp
0x00000000000006b4 <+4>: sub $0x50,%rsp
0x00000000000006b8 <+8>: mov %rdi,-0x48(%rbp)
0x00000000000006bc <+12>: mov -0x48(%rbp),%rdx
0x00000000000006c0 <+16>: lea -0x40(%rbp),%rax
0x00000000000006c4 <+20>: mov %rdx,%rsi
0x00000000000006c7 <+23>: mov %rax,%rdi
0x00000000000006ca <+26>: callq 0x560 <strcpy@plt>
0x00000000000006cf <+31>: nop
0x00000000000006d0 <+32>: leaveq
0x00000000000006d1 <+33>: retq
End of assembler dump.
(gdb) r `perl -e 'print "A"x78;'`
Starting program: /home/marshall/bof `perl -e 'print "A"x78;'`
Program received signal SIGSEGV, Segmentation fault.
0x0000414141414141 in ?? ()
(gdb) info registers
rax 0x7fffffffdce0 140737488346336
rbx 0x0 0
rcx 0x4141414141414141 4702111234474983745
rdx 0x414141 4276545
rsi 0x7fffffffe140 140737488347456
rdi 0x7fffffffdd23 140737488346403
rbp 0x4141414141414141 0x4141414141414141
rsp 0x7fffffffdd30 0x7fffffffdd30
r8 0x555555554770 93824992233328
r9 0x7ffff7de99e0 140737351948768
r10 0x5b 91
r11 0x7ffff7b9ab28 140737349528360
r12 0x555555554580 93824992232832
r13 0x7fffffffde20 140737488346656
r14 0x0 0
r15 0x0 0
rip 0x414141414141 0x414141414141
eflags 0x10202 [ IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb)
Please consider this in a general way, independent of the compiler. This is an exam question for a master's degree in computer science. We must explain the two cases:
- when tmp is declared as tmp[200], for example, and
- when tmp is declared as tmp[300], i.e. one case where tmp is larger than data (no overflow) and one where tmp is smaller than data (overflow).
How do we determine the number of bytes that must be overflowed for the code to execute?

Finding the Starting Address of an array

I've been working on the bufbomb lab from CS:APP and I've gotten stuck on one of the phases.
I won't get into the gory details of the project since I just need a nudge in the right direction. I'm having a hard time finding the starting address of the array called "buf" in the given assembly.
We're given a function called getbuf:
#define NORMAL_BUFFER_SIZE 32
int getbuf()
{
char buf[NORMAL_BUFFER_SIZE];
Gets(buf);
return 1;
}
And the assembly dumps:
Dump of assembler code for function getbuf:
0x08048d92 <+0>: sub $0x3c,%esp
0x08048d95 <+3>: lea 0x10(%esp),%eax
0x08048d99 <+7>: mov %eax,(%esp)
0x08048d9c <+10>: call 0x8048c66 <Gets>
0x08048da1 <+15>: mov $0x1,%eax
0x08048da6 <+20>: add $0x3c,%esp
0x08048da9 <+23>: ret
End of assembler dump.
Dump of assembler code for function Gets:
0x08048c66 <+0>: push %ebp
0x08048c67 <+1>: push %edi
0x08048c68 <+2>: push %esi
0x08048c69 <+3>: push %ebx
0x08048c6a <+4>: sub $0x1c,%esp
0x08048c6d <+7>: mov 0x30(%esp),%esi
0x08048c71 <+11>: movl $0x0,0x804e100
0x08048c7b <+21>: mov %esi,%ebx
0x08048c7d <+23>: jmp 0x8048ccf <Gets+105>
0x08048c7f <+25>: mov %eax,%ebp
0x08048c81 <+27>: mov %al,(%ebx)
0x08048c83 <+29>: add $0x1,%ebx
0x08048c86 <+32>: mov 0x804e100,%eax
0x08048c8b <+37>: cmp $0x3ff,%eax
0x08048c90 <+42>: jg 0x8048ccf <Gets+105>
0x08048c92 <+44>: lea (%eax,%eax,2),%edx
0x08048c95 <+47>: mov %ebp,%ecx
0x08048c97 <+49>: sar $0x4,%cl
0x08048c9a <+52>: mov %ecx,%edi
0x08048c9c <+54>: and $0xf,%edi
0x08048c9f <+57>: movzbl 0x804a478(%edi),%edi
0x08048ca6 <+64>: mov %edi,%ecx
0x08048ca8 <+66>: mov %cl,0x804e140(%edx)
0x08048cae <+72>: mov %ebp,%ecx
0x08048cb0 <+74>: and $0xf,%ecx
0x08048cb3 <+77>: movzbl 0x804a478(%ecx),%ecx
0x08048cba <+84>: mov %cl,0x804e141(%edx)
0x08048cc0 <+90>: movb $0x20,0x804e142(%edx)
0x08048cc7 <+97>: add $0x1,%eax
0x08048cca <+100>: mov %eax,0x804e100
0x08048ccf <+105>: mov 0x804e110,%eax
0x08048cd4 <+110>: mov %eax,(%esp)
0x08048cd7 <+113>: call 0x8048820 <_IO_getc@plt>
0x08048cdc <+118>: cmp $0xffffffff,%eax
0x08048cdf <+121>: je 0x8048ce6 <Gets+128>
0x08048ce1 <+123>: cmp $0xa,%eax
0x08048ce4 <+126>: jne 0x8048c7f <Gets+25>
0x08048ce6 <+128>: movb $0x0,(%ebx)
0x08048ce9 <+131>: mov 0x804e100,%eax
0x08048cee <+136>: movb $0x0,0x804e140(%eax,%eax,2)
0x08048cf6 <+144>: mov %esi,%eax
0x08048cf8 <+146>: add $0x1c,%esp
0x08048cfb <+149>: pop %ebx
0x08048cfc <+150>: pop %esi
0x08048cfd <+151>: pop %edi
0x08048cfe <+152>: pop %ebp
0x08048cff <+153>: ret
End of assembler dump.
I'm having a difficult time locating where the starting address of buf is (or where buf is at all in this mess!). If someone could point that out to me, I'd greatly appreciate it.
Attempt at a solution
Reading symbols from /home/user/CS247/buflab/buflab-handout/bufbomb...(no debugging symbols found)...done.
(gdb) break getbuf
Breakpoint 1 at 0x8048d92
(gdb) run -u user < firecracker-exploit.bin
Starting program: /home/user/CS247/buflab/buflab-handout/bufbomb -u user < firecracker-exploit.bin
Userid: ...
Cookie: ...
Breakpoint 1, 0x08048d92 in getbuf ()
(gdb) print buf
No symbol table is loaded. Use the "file" command.
(gdb)
As has been pointed out by some other people, buf is allocated on the stack at run time. See these lines in the getbuf() function:
0x08048d92 <+0>: sub $0x3c,%esp
0x08048d95 <+3>: lea 0x10(%esp),%eax
0x08048d99 <+7>: mov %eax,(%esp)
The first line subtracts 0x3c (60) bytes from the stack pointer, effectively allocating that much space. The extra bytes beyond 32 are probably for the parameters to Gets (it's hard to tell precisely what the calling convention for Gets is, so it's hard to say). The second line computes the address 16 bytes above the stack pointer, which leaves 44 bytes between that address and the top of the allocated space. The third line stores that address at the top of the stack, presumably as the argument for the Gets call (remember that the stack grows down, so the stack pointer points at the last item on the stack). I am not sure why the compiler generated such strange offsets (60 bytes and then 44), but there is probably a good reason. If I figure it out I will update here.
Inside the Gets function we have the following lines:
0x08048c66 <+0>: push %ebp
0x08048c67 <+1>: push %edi
0x08048c68 <+2>: push %esi
0x08048c69 <+3>: push %ebx
0x08048c6a <+4>: sub $0x1c,%esp
0x08048c6d <+7>: mov 0x30(%esp),%esi
Here we see that the function saves some of the registers, which adds up to 16 bytes, and then Gets reserves 28 (0x1c) bytes on the stack. The last line is key: it grabs the value 0x30 bytes up the stack and loads it into %esi. This value is the address of buf that getbuf put on the stack. Why? 4 for the return address plus 16 for the registers plus 28 reserved = 48, and 0x30 = 48, so it is grabbing the last item placed on the stack by getbuf() before calling Gets.
To get the address of buf you have to actually run the program in the debugger, because the address will probably be different every time you run the program, or even every time the function is called. You can set a breakpoint at any of the lines above and either dump the %eax register once it contains the address to be placed on the stack (after the second line of getbuf), or dump the %esi register once the value has been loaded from the stack. This will be the pointer to your buffer.
To be able to see debugging info while using gdb, you must compile with gcc's -g switch (for example -g3). See man gcc for more details on the -g switch.
Only then will gcc add debugging info (the symbol table) to the executable.
0x08048cd4 <+110>: mov %eax,(%esp)
0x08048cd7 <+113>: **call 0x8048820 <_IO_getc@plt>**
0x08048cdc <+118>: cmp $0xffffffff,%eax
0x08048cdf <+121>: je 0x8048ce6 <Gets+128>
0x08048ce1 <+123>: cmp $0xa,%eax
0x08048ce4 <+126>: jne 0x8048c7f <Gets+25>
0x08048ce6 <+128>: movb $0x0,(%ebx)
0x08048ce9 <+131>: mov 0x804e100,%eax
0x08048cee <+136>: movb $0x0,0x804e140(%eax,%eax,2)
0x08048cf6 <+144>: mov %esi,%eax
0x08048cf8 <+146>: add $0x1c,%esp
0x08048cfb <+149>: **pop %ebx**
0x08048cfc <+150>: **pop %esi**
0x08048cfd <+151>: **pop %edi**
0x08048cfe <+152>: **pop %ebp**
0x08048cff <+153>: ret
End of assembler dump.
I don't know your flavour of asm, but there's a call in there which may use the start address.
The end of the function pops various pointers.
That's where I'd start looking.
If you can tweak the asm for these functions, you can insert your own routines to dump data as the function runs, before those pointers get popped.
buf is allocated on the stack. Therefore, you will not be able to spot its address from an assembly listing. In other words, buf is allocated (and its address therefore known) only when you enter the function getbuf() at runtime.
If you must know the address, one option would be to use gdb (but make sure you compile with the -g flag to enable debugging support) and then:
gdb a.out # I'm assuming your binary is a.out
break getbuf # Set a breakpoint where you want gdb to stop
run # Run the program. Supply args if you need to
# WAIT FOR your program to reach getbuf and stop
print buf
If you want to go this route, a good gdb tutorial (example) is essential.
You could also place a printf inside getbuf and debug that way - it depends on what you are trying to do.
One other point leaps out from your code. Upon return from getbuf, the result of Gets will be trashed. This is because Gets is presumably writing its results into the stack-allocated buf. When you return from getbuf, your stack is blown and you cannot reliably access buf.

My overflow code does not work

The code below is from the well-known article Smashing The Stack For Fun And Profit.
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
int *ret;
ret = buffer1 + 12;
(*ret)+=8;
}
void main() {
int x;
x=0;
function(1,2,3);
x=1;
printf("%d\n",x);
}
I think I must explain my target of this code.
The stack layout is shown below. The number under each name is the number of bytes that item occupies on the stack. So, to overwrite RET and skip the statement I want, I calculated the offset from buffer1 to RET as 8+4=12, since the architecture is x86 Linux.
buffer2  buffer1  BSP  RET  a    b    c
(12)     (8)      (4)  (4)  (4)  (4)  (4)
I want to skip the statement x=1; and let printf() output 0 on the screen.
I compile the code with:
gcc stack2.c -g
and run it in gdb:
gdb ./a.out
gdb gives me the result like this:
Program received signal SIGSEGV, Segmentation fault.
main () at stack2.c:17
17 x = 1;
I think Linux uses some mechanism to protect against stack overflows. Maybe Linux stores the RET address in another place and compares it with the RET address on the stack before the function returns.
What are the details of this mechanism? How should I rewrite the code to make the program output 0?
OK, the disassembled code is below. It comes from the output of gdb, since I think that is easier for you to read. Also, can anybody tell me how to paste a long code sequence? Copying and pasting it line by line is tiring...
Dump of assembler code for function main:
0x08048402 <+0>: push %ebp
0x08048403 <+1>: mov %esp,%ebp
0x08048405 <+3>: sub $0x10,%esp
0x08048408 <+6>: movl $0x0,-0x4(%ebp)
0x0804840f <+13>: movl $0x3,0x8(%esp)
0x08048417 <+21>: movl $0x2,0x4(%esp)
0x0804841f <+29>: movl $0x1,(%esp)
0x08048426 <+36>: call 0x80483e4 <function>
0x0804842b <+41>: movl $0x1,-0x4(%ebp)
0x08048432 <+48>: mov $0x8048520,%eax
0x08048437 <+53>: mov -0x4(%ebp),%edx
0x0804843a <+56>: mov %edx,0x4(%esp)
0x0804843e <+60>: mov %eax,(%esp)
0x08048441 <+63>: call 0x804831c <printf@plt>
0x08048446 <+68>: mov $0x0,%eax
0x0804844b <+73>: leave
0x0804844c <+74>: ret
Dump of assembler code for function function:
0x080483e4 <+0>: push %ebp
0x080483e5 <+1>: mov %esp,%ebp
0x080483e7 <+3>: sub $0x14,%esp
0x080483ea <+6>: lea -0x9(%ebp),%eax
0x080483ed <+9>: add $0x3,%eax
0x080483f0 <+12>: mov %eax,-0x4(%ebp)
0x080483f3 <+15>: mov -0x4(%ebp),%eax
0x080483f6 <+18>: mov (%eax),%eax
0x080483f8 <+20>: lea 0x8(%eax),%edx
0x080483fb <+23>: mov -0x4(%ebp),%eax
0x080483fe <+26>: mov %edx,(%eax)
0x08048400 <+28>: leave
0x08048401 <+29>: ret
I checked the assembly code and found a mistake in my program, so I have changed (*ret)+=8 to (*ret)+=7, since 0x08048432 <+48> minus 0x0804842b <+41> is 7.
Because that article is from 1996 and the assumptions are incorrect.
Refer to "Smashing The Modern Stack For Fun And Profit"
http://www.ethicalhacker.net/content/view/122/24/
From the above link:
However, the GNU C Compiler (gcc) has evolved since 1998, and as a result, many people are left wondering why they can't get the examples to work for them, or if they do get the code to work, why they had to make the changes that they did.
The function function overwrites some part of the stack outside of its own frame, which in this case is the stack of main. What it overwrites I don't know, but it causes the segmentation fault you see. It might be some protection employed by the operating system, but it might as well be that the generated code just does something wrong when a wrong value is at that position on the stack.
This is a really good example of what may happen when you write outside of your allocated memory. It might crash directly, it might crash somewhere completely different, or it might not crash at all but instead just do some calculation wrong.
Try ret = buffer1 + 3;
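Explanation: ret is an integer pointer; incrementing an int pointer by 1 adds 4 bytes to the address on 32-bit machines. (This scaling applies only when the addition is performed on an int pointer. buffer1 is a char array, so buffer1 + 3 by itself advances 3 bytes, as the add $0x3,%eax in the listing above shows.)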

Resources