I have a small program, written in C, echo():
/* Read input line and write it back */
void echo() {
char buf[8]; /* Way too small! */
gets(buf);
puts(buf);
}
The corresponding assembly code:
1 echo:
2 pushl %ebp //Save %ebp on stack
3 movl %esp, %ebp
4 pushl %ebx //Save %ebx
5 subl $20, %esp //Allocate 20 bytes on stack
6 leal -12(%ebp), %ebx //Compute buf as %ebp-12
7 movl %ebx, (%esp) //Store buf at top of stack
8 call gets //Call gets
9 movl %ebx, (%esp) //Store buf at top of stack
10 call puts //Call puts
11 addl $20, %esp //Deallocate stack space
12 popl %ebx //Restore %ebx
13 popl %ebp //Restore %ebp
14 ret //Return
I have a few questions.
Why does the %esp allocate 20 bytes? The buf is 8 bytes, why the extra 12?
The return address is right above where we pushed %ebp right? (Assuming we draw the stack upside down, where it grows downward) What is the purpose of the old %ebp (which the current %ebp is pointing at, as a result of line 3)?
If I want to change the return address (by inputting anything more than 12 bytes), it would change where echo() returns to. What is the consequence of changing the old %ebp (i.e. the 4 bytes before the return address)? Is there any possibility of changing the return address, or where echo returns to, by changing only the old %ebp?
What is the purpose of %ebp? I know it's the frame pointer, but what is that?
Is it ever possible for the compiler to put the buffer somewhere that is not right next to where the old %ebp is stored? Like if we declare buf[8] but it stores it at -16(%ebp) instead of -12(%ebp) on line 6?
*C code and assembly copied from Computer Systems: A Programmer's Perspective, 2nd ed.
** Using gets() because doing buffer overflows
The reason 20 bytes are allocated is for the purpose of stack alignment. GCC 4.5+ generates code that ensures that the callee's local stack space is aligned to a 16-byte boundary, in order to ensure that compiled code can do aligned SSE loads and stores on the stack in a well-defined manner. For that reason, the compiler in this case needs to throw away some stack-space in order to ensure that gets/puts get a properly aligned frame.
In essence, this is how the stack will look, where each line is a 4-byte word except for --- lines that denote 16-byte address boundaries:
...
Saved EIP from caller
Saved EBP
---
Saved EBX # This is where echo's frame starts
buf
buf
Unused
---
Unused
Parameter to gets/puts
Saved EIP
Saved EBP
---
... # This is where gets'/puts' frame starts
As you can hopefully see from my fantastic ASCII graphics, if it weren't for the "unused" portions, gets/puts would get an unaligned frame. Note, however, that not all 12 of the extra bytes are unused; 4 of them are reserved for the parameter.
Is it ever possible for the compiler to put the buffer somewhere that is not right next to where the old %ebp is stored? Like if we declare buf[8] but it stores it at -16(%ebp) instead of -12(%ebp) on line 6?
Certainly. The compiler is free to organize the stack however it feels like. In order to do buffer overflows predictably, you have to be looking at a specific compiled binary of a program.
As for what the purpose of EBP is (and thus to answer your questions 2, 3 and 4), please see any introductory text on how the call stack is organized, such as the Wikipedia article.
I am writing a C program that calls an x86 Assembly function which adds two numbers. Below are the contents of my C program (CallAssemblyFromC.c):
#include <stdio.h>
#include <stdlib.h>
int addition(int a, int b);
int main(void) {
int sum = addition(3, 4);
printf("%d", sum);
return EXIT_SUCCESS;
}
Below is the code of the Assembly function (my idea is to code from scratch the stack frame prologue and epilogue, I have added comments to explain the logic of my code) (addition.s):
.text
# Here, we define a function addition
.global addition
addition:
# Prologue:
# Push the current EBP (base pointer) to the stack, so that we
# can reset the EBP to its original state after the function's
# execution
push %ebp
# Move the EBP (base pointer) to the current position of the ESP
# register
movl %esp, %ebp
# Read in the parameters of the addition function
# addition(a, b)
#
# Since we are pushing to the stack, we need to obtain the parameters
# in reverse order:
# EBP (return address) | EBP + 4 (return value) | EBP + 8 (b) | EBP + 4 (a)
#
# Utilize advanced indexing in order to obtain the parameters, and
# store them in the CPU's registers
movzbl 8(%ebp), %ebx
movzbl 12(%ebp), %ecx
# Clear the EAX register to store the sum
xorl %eax, %eax
# Add the values into the section of memory storing the return value
addl %ebx, %eax
addl %ecx, %eax
I am getting a segmentation fault, which seems strange because I think I am following the x86 calling conventions (e.g. reading the function's parameters from the correct memory locations). Furthermore, if any of you have a solution, I would appreciate some advice on how to debug an Assembly function called from C. I have been using GDB, but it simply points to the line of the C program where the segmentation fault happens instead of the line in the Assembly code.
Your function has no epilogue. You need to restore %ebp and pop the stack back to where it was, and then ret. If that's really missing from your code, then that explains your segfault: the CPU will go on executing whatever garbage happens to be after the end of your code in memory.
You clobber (i.e. overwrite) the %ebx register which is supposed to be callee-saved. (You mention following the x86 calling conventions, but you seem to have missed that detail.) That would be the cause of your next segfault, after you fixed the first one. If you use %ebx, you need to save and restore it, e.g. with push %ebx after your prologue and pop %ebx before your epilogue. But in this case it is better to rewrite your code so as not to use it at all; see below.
movzbl loads an 8-bit value from memory and zero-extends it into a 32-bit register. Here the parameters are int so they are already 32 bits, so plain movl is correct. As it stands your function would give incorrect results for any arguments which are negative or larger than 255.
You're using an unnecessary number of registers. You could move the first operand for the addition directly into %eax rather than putting it into %ebx and adding it to zero. And on x86 it is not necessary to get both operands into registers before adding; arithmetic instructions have a mem, reg form where one operand can be loaded directly from memory. With this approach we don't need any registers other than %eax itself, and in particular we don't have to worry about %ebx anymore.
I would write:
.text
# Here, we define a function addition
.global addition
addition:
# Prologue:
push %ebp
movl %esp, %ebp
# load first argument
movl 8(%ebp), %eax
# add second argument
addl 12(%ebp), %eax
# epilogue
movl %ebp, %esp # redundant since we haven't touched esp, but will be needed in more complex functions
pop %ebp
ret
In fact, you don't need a stack frame for this function at all, though I understand if you want to include it for educational value. But if you omit it, the function can be reduced to
.text
.global addition
addition:
movl 4(%esp), %eax
addl 8(%esp), %eax
ret
You are corrupting the stack here:
movb %al, 4(%ebp)
To return the value, simply put it in eax. Also, why do you need to clear eax? That's inefficient, as you can load the first value directly into eax and then add to it.
Also EBX must be saved if you intend to use it, but you don't really need it anyway.
Say we are given a function:
int exchange(int *xp, int y)
{
int x = *xp;
*xp = y;
return x;
}
So, the book I am reading explains that xp and y are stored at offsets 8 and 12 relative to the address in register %ebp. What I don't understand is why they are stored at offsets of 8 and 12 specifically. Furthermore, what is an offset in this context? Finally, how do 8 and 12 fit when the register accepts movement in units of 1, 2 and 4 bytes?
The assembly code :
# xp at %ebp+8, y at %ebp+12
movl 8(%ebp), %edx    # Get xp
movl (%edx), %eax     # Get x at xp; by copying to %eax, x becomes the return value
movl 12(%ebp), %ecx   # Get y
movl %ecx, (%edx)     # Store y at xp
What I think the answer is:
So, when examining registers, it was common to see something like register %rdi holding a value of 0x1004, which is an address, and the memory at address 0x1004 holding the value 0xAA.
Of course, this is a hypothetical example that doesn't line up with the registers listed in the book. Each register is 16-32 bit and the top four can be used to store integers freely. Does offsetting it by 8 make it akin to 0x1000 + 8? Again, I'm not entirely sure what the offset in this scenario is for when we are storing new units into empty space.
Because of how the call stack is laid out under the C calling convention (cdecl).
First the caller will push the 4-byte y, then the 4-byte xp (this order is important so C can support Variadic Functions), then the call to your function will implicitly push the return address which is also 4-byte (this is a 32-bit program).
The first thing your function does is push the state of ebp which it will need to recover later so that the caller can continue working properly, and then copy the current state of esp (stack pointer) to ebp. In sum:
push %ebp
movl %esp, %ebp
This is also known as function prologue.
When all this is done you are finally ready to actually run the code you wrote, at this stage the stack is something like this:
%ebp- ? = address of your local variables (which in this example you don't have)
%ebp+ 0 = address of the saved state of previous ebp
%ebp+ 4 = ret address
%ebp+ 8 = address where the value of xp is stored
%ebp+12 = address where the value of y is stored
%ebp+16 = out of bounds, this memory space belongs to the caller
When your function is done it will wrap it up by setting esp back to ebp, then pop the original ebp and ret.
movl %ebp, %esp
pop %ebp
ret
ret is basically a shortcut to pop a pointer from the stack and jmp to it.
Edit: Fixed order of parameters for AT&T assembly
Look at the normal function entry in assembler:
push ebp
mov ebp, esp
sub esp, <size of local variables>
So ebp+0 holds the previous value of ebp. Above it, at ebp+4, is the return address. Above that are the parameters of the function, which were pushed in reverse order, so the first parameter is at ebp+8 and the second at ebp+12.
I'm trying to understand the underlying assembly for a simple C function.
program1.c
void function() {
char buffer[1];
}
=>
push %ebp
mov %esp, %ebp
sub $0x10, %esp
leave
ret
Not sure how it's arriving at 0x10 here? Isn't a character 1 byte, which is 8 bits, so it should be 0x08?
program2.c
void function() {
char buffer[4];
}
=>
push %ebp
mov %esp, %ebp
sub $0x18, %esp
mov ...
mov ...
[a bunch of random instructions]
Not sure how it's arriving at 0x18 here either? Also, why are there so many additional instructions after the SUB instruction? All I did was change the length of the array from 1 to 4.
gcc uses -mpreferred-stack-boundary=4 by default for x86 32 and 64bit ABIs, so it keeps %esp 16B-aligned.
I was able to reproduce your output with gcc 4.8.2 -O0 -m32 on the Godbolt Compiler Explorer
void f1() { char buffer[1]; }
pushl %ebp
movl %esp, %ebp # make a stack frame (`enter` is super slow, so gcc doesn't use it)
subl $16, %esp
leave # `leave` is not terrible compared to mov/pop
ret
You must be using a version of gcc with -fstack-protector enabled by default. Newer gcc isn't usually configured to do that, so you don't get the same sentinel value and check written to the stack. (Try a newer gcc in that godbolt link)
void f4() { char buffer[4]; }
pushl %ebp #
movl %esp, %ebp # make a stack frame
subl $24, %esp # IDK why it reserves 24, rather than 16 or 32B, but prob. has something to do with aligning the stack for the possible call to __stack_chk_fail
movl %gs:20, %eax # load a value from thread-local storage
movl %eax, -12(%ebp) # store it on the stack
xorl %eax, %eax # tmp59
movl -12(%ebp), %eax # D.1377, tmp60
xorl %gs:20, %eax # check that the sentinel value matches what we stored
je .L3 #,
call __stack_chk_fail #
.L3:
leave
ret
Apparently gcc considers char buffer[4] a "vulnerable object", but not char buffer[1]. Without -fstack-protector, there'd be little to no difference in the asm even at -O0.
Isn't a character 1 byte, which is 8 bits, so it should be 0x08?
These values are not bits, they are bytes.
Not sure how it's arriving at 0x10 here?
These lines:
push %ebp
mov %esp, %ebp
sub $0x10, %esp
allocate space on the stack: 16 bytes of memory are reserved for this function's frame.
Those bytes are used for things like:
The local variables of the function
Data structure alignment
Other stuff I can't remember right now :)
Note that the 4-byte return address that ret jumps to is not part of these 16 bytes; call pushed it before the function started running.
In your example, 16 bytes were allocated. 1 byte is for the char array of size 1; the remaining 15 bytes are padding added by the compiler, most likely to keep the stack aligned.
Not sure how it's arriving at 0x18 here either?
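The second example reserves 0x18 = 24 bytes, 8 more than the first, even though the array grew by only 3 bytes. The growth is not proportional to the array size; the extra space is most likely there for alignment and for whatever the additional instructions store on the stack.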
Also, why are there so many additional instructions after the SUB instruction?
Please update the question with the instructions.
This code is just setting up the stack frame. This is used as scratch space for local variables, and will have some kind of alignment requirement.
You haven't mentioned your platform, so I can't tell you exactly what the requirements are for your system, but obviously both values are at least 8-byte aligned (so the size of your local variables is rounded up so %esp is still a multiple of 8).
Search for "c function prolog epilog" or "c function call stack" to find more resources in this area.
Edit - Peter Cordes' answer explains the discrepancy and the mysterious extra instructions.
And for completeness, although Fábio already answered this part:
Not sure how it's arriving at 0x10 here? Isn't a character 1 byte, which is 8 bits, so it should be 0x08?
On x86, %esp is the stack pointer, and pointers store addresses, and these are addresses of bytes. Sub-byte addressing is rarely used (cf. Peter's comment). If you want to examine individual bits inside a byte, you'd usually use bitwise (&,|,~,^) operations on the value, but not change the address.
(You could equally argue that sub-cache-line addressing is a convenient fiction, but we're rapidly getting off-topic).
Whenever you allocate memory, your operating system almost never actually gives you exactly that amount, unless you use a function like pvalloc, which gives you a page-aligned amount of bytes (usually 4K). Instead, your operating system assumes that you might need more in the future, so goes ahead and gives you a bit more.
To disable this behavior, use a lower-level system call that doesn't do buffering, like sbrk(). These lecture notes are an excellent resource:
http://web.eecs.utk.edu/~plank/plank/classes/cs360/360/notes/Malloc1/lecture.html
I am reading "Smashing The Stack For Fun And Profit" by Aleph one,
and reached this spot:
jmp 0x2a # 2 bytes
popl %esi # 1 byte
movl %esi,0x8(%esi) # 3 bytes
movb $0x0,0x7(%esi) # 4 bytes
movl $0x0,0xc(%esi) # 7 bytes
movl $0xb,%eax # 5 bytes
movl %esi,%ebx # 2 bytes
leal 0x8(%esi),%ecx # 3 bytes
leal 0xc(%esi),%edx # 3 bytes
int $0x80 # 2 bytes
movl $0x1, %eax # 5 bytes
movl $0x0, %ebx # 5 bytes
int $0x80 # 2 bytes
call -0x2f # 5 bytes
.string \"/bin/sh\" # 8 bytes
------------------------------------------------------------------------------
Looks good. To make sure it works correctly we must compile it and run it.
**But there is a problem. Our code modifies itself**, but most operating system
mark code pages read-only.
My question is: where (and how) does this code modify itself? [I don't know assembly that well]
Thanks!
The first instruction jumps to the call at the end of the code which calls back to the second instruction that pops the return address placed on the stack by the call. Thus esi points to the string at the end. As you can see, the next 3 instructions write to memory relative to esi, setting up the argument pointer and zero terminating the string and the argument list. This is what the self modification refers to. It's slightly misleading because it isn't modifying code, just data. During standalone testing that data is part of the .text section which is typically read only, but can be made writable easily. Note that during actual usage this would be in the stack which is writable, but not executable so you'd have a different problem then.
So we have the following code, setting up for a function call with its arguments, its main body omitted (etc etc etc), and then the popping at the end of the function.
pushl %ebp
movl %esp, %ebp
pushl %ebx
movl 8(%ebp), %ebx
movl 12(%ebp), %ecx
etc
etc
etc
//end of function
popl %ebx
popl %ebp
Here's what I (think) I understand.
Suppose we have %esp pointing to memory address 100.
pushl %ebp
So this essentially makes %ebp point to where %esp points (memory address 100) + 4. So now %ebp points to memory address 104. This leaves our current memory state looking like so:
----------
|100|%esp
|104|%ebp
----------
Then we have the next line of code:
movl %esp, %ebp
So from what I understand, ebp now points to memory address 100. I have a little intuition as to why we do this step, but my confusion is the next line:
pushl %ebx
What is the purpose of pushing ebx, which I assume will then point to memory address 104? I have a vague idea of how the space right below ebp (104) is supposed to be a reference to an "old stack pointer," so I can see why the next 2 lines add 8 and 12 to ebp to be the "arguments" of our function, rather than 4 and 8.
But I'm confused as to why we push ebx onto the stack, first.
I also do not understand popping, and why we pop ebx and ebp?
Talking to someone about this before he had to sleep, he mentioned that we have no reference to the fact that our stack pointer was at 100 -- until we pop ebp back. Now, I thought ebp's value was 100, so I don't understand the point he was trying to make.
So to clarify:
Is my understanding thus far correct?
Why do we push ebx onto the stack?
What is this "reference to the old stack pointer" that lives right below ebp? Is that the ebx that we push?
Is there something I'm not understanding, like some sort of difference between the ebx that we push, and the ebx in the line right after (our argument)? Is there a difference between the ebp that gets pushed and the ebp in the line right after?
Why are we popping at the end?
I apologize if this is difficult to understand. I understand similar questions have been asked about this, but I'm trying to intuitively understand and picture what exactly is going on in a function call in a way that makes sense to me.
Note: I edited some important things regarding my understanding of what's going on, particularly with regards to ebp.
As Joachim stated in a comment on your question, pushing a register pushes the contents of the register at that moment onto the stack; it doesn’t push a reference to the register or anything else. I’m not sure if you were saying that’s what was happening, but otherwise this diagram was unclear:
----------
|100|%esp
|104|%ebp
----------
Nevertheless, I’ll try to explain what it does and why.
Say the instruction after the call is at 0x200. When we execute call, we push 0x200 (the return address) and jump to the procedure. Say %esp is 0x100 once that push is done. Our stack is then:
Address Value
%esp --> 0x100 0x200
And %ebp is some value or another; it might point into the stack or it might not. It doesn’t even need to represent an address. So %ebp is meaningless to us at this point.
But though it’s meaningless to us, the caller does expect it to stay the same before and after the call, so we have to preserve it. Let’s say it contained the value 0xDEADBEEF. We push it, so the stack now looks like this:
Address Value
0x100 0x200
%esp --> 0x0fc 0xDEADBEEF
In most situations, we can address everything as an offset from %esp, and that applies here, too. But if the compiler is compiling some C code that deals with variable-length arrays or other features, we often will want to index from the first thing we pushed rather than the last thing we pushed. To do that, we’ll set %ebp to where we are right now. Then things look like this:
Address Value
0x100 0x200
%esp, %ebp --> 0x0fc 0xDEADBEEF
Note that the value at the address pointed to by %ebp is the old value of %ebp, so you can walk the stack, as you mentioned you were aware of before.
Next, we push %ebx, which we’ll say has the value 0xBEEFCAFE. This is the first thing not directly related to a function prologue. Then our stack looks like this:
Address Value
0x100 0x200
%ebp --> 0x0fc 0xDEADBEEF
%esp --> 0x0f8 0xBEEFCAFE
But why do we push %ebx? Well, as it turns out, the x86 C calling convention dictates that, like %ebp, %ebx must stay the same as it was before the call. So because the code you omitted presumably changes %ebx, it has to preserve the initial value so it can restore it for the caller.
After we’ve restored %ebx, we pop %ebp, restoring its value as well, since that, too, must be preserved after the call. And finally we return.
TL;DR: %ebp and %ebx are pushed and popped because they are manipulated during the execution of the body of the function, but the x86 C calling convention dictates that the values must remain the same before and after the call, so the initial values must be preserved so we can restore them.
pushl %ebp
Save the value of ebp on the stack. Any push command affects the value of %esp.
movl %esp, %ebp
Move the current value of esp into ebp. This sets the stack frame, you can now find function arguments above ebp (as the stack grows down).
pushl %ebx
Save the value of ebx (not 100% sure, but most likely because the ABI requires the callee to preserve it).
movl 8(%ebp), %ebx
Move the memory ebp+8 into ebx. As previously stated, since the stack grows down this is one of the function arguments.
movl 12(%ebp), %ecx
Similar to the previous instruction, this moves another function argument into ecx.
popl %ebx
Restore the value of ebx we saved on the stack earlier.
popl %ebp
And restore the value of ebp. At this point, there is a matching pop for every push, so esp is back to what it was on function entry and we can return.