Some time ago I was experimenting with writing assembly
routines and linking them with C programs, and I found that
I can just skip the standard C-call prologue and epilogue
push ebp
mov ebp, esp
(sub esp, 4
...
mov esp, ebp)
pop ebp
just skip it all and address everything through esp, like
mov eax, [esp+4] ;; take argument
mov [esp-4], eax ;; use some local variable storage
It seems to work quite well. Why is ebp used at all? Is
addressing through ebp faster, or what?
There's no requirement to use a stack frame, but there are certainly some advantages:
Firstly, if every function uses this same process, we can use this knowledge to easily determine a sequence of calls (the call stack) by reversing the process. We know that after a call instruction, ESP points to the return address, and that the first thing the called function will do is push the current EBP and then copy ESP into EBP. So, at any point we can look at the data pointed to by EBP, which will be the previous EBP, and at EBP+4, which will be the return address of the last function call. We can therefore print the call stack (assuming 32-bit) using something like (excuse the rusty C++):
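#include <windows.h>
#include <stdio.h>

// Walks the EBP chain and prints each return address with its module (32-bit only).
void LogStack(DWORD ebp)
{
    DWORD prevEBP = *((DWORD*)ebp);       // saved EBP of the caller
    DWORD retAddr = *((DWORD*)(ebp + 4)); // return address into the caller
    if (retAddr == 0) return;
    HMODULE module = NULL;
    char fileName[MAX_PATH] = "(unknown)"; // kept if the lookup fails
    // UNCHANGED_REFCOUNT avoids leaking a module reference on every frame.
    if (GetModuleHandleExA(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS |
                           GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT,
                           (LPCSTR)retAddr, &module))
        GetModuleFileNameA(module, fileName, MAX_PATH);
    printf("0x%08lx: %s\n", (unsigned long)retAddr, fileName);
    if (prevEBP != 0) LogStack(prevEBP);
}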
This will then print out the entire sequence of calls (well, their return addresses) up until that point.
Furthermore, since EBP doesn't change unless you explicitly update it (unlike ESP, which changes when you push/pop), it's usually easier to reference data on the stack relative to EBP, rather than relative to ESP, since with the latter, you have to be aware of any push/pop instructions that might have been called between the start of the function and the reference.
As others have mentioned, you should avoid using stack addresses below ESP as any calls you make to other functions are likely to overwrite the data at these addresses. You should instead reserve space on the stack for use by your function by the usual:
sub esp, [number of bytes to reserve]
After this, the region of the stack between the initial ESP and ESP - [number of bytes reserved] is safe to use.
Before exiting your function you must release the reserved stack space using a matching:
add esp, [number of bytes reserved]
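The difference between writing below ESP and reserving space first can be seen in a small simulation. This is only a sketch: the `SimStack` type, the helper names, and the 4-byte slot model are made up for illustration, and a real call does more than push a return address.

```c
#include <assert.h>
#include <stdint.h>

#define SLOTS 32

/* A toy stack of 4-byte slots; esp is the index of the top slot. */
typedef struct {
    uint32_t mem[SLOTS];
    uint32_t esp;
} SimStack;

static void sim_init(SimStack *s)
{
    for (int i = 0; i < SLOTS; i++) s->mem[i] = 0;
    s->esp = SLOTS;                     /* empty stack, grows downwards */
}

static void sim_push(SimStack *s, uint32_t v)
{
    assert(s->esp > 0);
    s->mem[--s->esp] = v;
}

/* Store a local at [esp-4] without reserving it, then "call" something. */
static uint32_t local_without_reserve(uint32_t value, uint32_t ret_addr)
{
    SimStack s;
    sim_init(&s);
    sim_push(&s, 0x11111111u);          /* something already on the stack */
    uint32_t slot = s.esp - 1;          /* like [esp-4] */
    s.mem[slot] = value;
    sim_push(&s, ret_addr);             /* the call pushes its return address */
    return s.mem[slot];                 /* what the local holds afterwards */
}

/* Same, but with "sub esp, 4" before the store and "add esp, 4" at the end. */
static uint32_t local_with_reserve(uint32_t value, uint32_t ret_addr)
{
    SimStack s;
    sim_init(&s);
    sim_push(&s, 0x11111111u);
    s.esp -= 1;                         /* sub esp, 4 */
    uint32_t slot = s.esp;              /* like [esp] */
    s.mem[slot] = value;
    sim_push(&s, ret_addr);             /* the call pushes its return address */
    s.esp += 1;                         /* ret pops the return address */
    uint32_t got = s.mem[slot];
    s.esp += 1;                         /* add esp, 4 */
    return got;
}
```

In this model `local_without_reserve(42, 0x401000)` comes back as the return address rather than 42, while `local_with_reserve(42, 0x401000)` still returns 42.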
The use of EBP is of great help when debugging code, as it allows debuggers to traverse the stack frames in a call chain.
It creates a singly linked list that links the frame pointers of all the callers of a function. From the EBP of a routine, you can recover the entire call stack for that function.
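The linked-list view can be sketched without touching a real stack. The example below is an assumption-laden model, not debugger code: addresses are slot indices into a made-up `stack_mem` array, each frame stores the caller's EBP in one slot and the return address in the next, and a zero EBP marks the outermost frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated stack memory in 4-byte slots; slot 0 is never a valid frame. */
#define STACK_SLOTS 64
static uint32_t stack_mem[STACK_SLOTS];

/* Build a frame at slot `at`: [at] = caller's EBP, [at+1] = return address.
   This mirrors what a call followed by "push ebp / mov ebp, esp" leaves behind. */
static uint32_t make_frame(uint32_t at, uint32_t prev_ebp, uint32_t ret_addr)
{
    assert(at > 0 && at + 1 < STACK_SLOTS);
    stack_mem[at] = prev_ebp;
    stack_mem[at + 1] = ret_addr;
    return at;
}

/* Follow the saved-EBP links and collect the return addresses, innermost first. */
static size_t walk_frames(uint32_t ebp, uint32_t *out, size_t max)
{
    size_t n = 0;
    while (ebp != 0 && n < max) {
        assert(ebp + 1 < STACK_SLOTS);
        out[n++] = stack_mem[ebp + 1];  /* return address, at EBP+4 */
        ebp = stack_mem[ebp];           /* previous EBP, at [EBP]   */
    }
    return n;
}
```

Building three nested frames with `make_frame` and passing the innermost one to `walk_frames` yields the three return addresses in call-stack order, which is essentially what a debugger does with real memory.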
See http://en.wikibooks.org/wiki/X86_Disassembly/Functions_and_Stack_Frames
And in particular the page it links to which covers your question: http://blogs.msdn.com/b/larryosterman/archive/2007/03/12/fpo.aspx
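It works. However, anything below ESP can be overwritten at any moment: when an interrupt arrives on your stack, the processor pushes the flags and the return address there (and the handler usually saves more registers), overwriting your value. In user space the same can happen when the OS delivers a signal on your stack.
The stack is there for a reason, use it...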
Related
I am having an issue with some inline assembly. I am writing a compiler that compiles to assembly, and for portability I made it emit the main function in C and just use inline assembly. Even the simplest inline assembly is giving me a segfault, though. Thanks for your help.
int main(int argc, char** argv) {
__asm__(
"push $1\n"
);
return 0;
}
TLDR at bottom. Note: everything here is assuming x86_64.
The issue here is that compilers will effectively never use push or pop in a function body (except for prologues/epilogues).
Consider this example.
When the function begins, room is made on the stack in the prologue with:
push rbp
mov rbp, rsp
sub rsp, 32
This creates 32 bytes of room for main. Then notice how throughout the function, instead of pushing items to the stack, they are mov'd to the stack through offsets from rbp:
mov DWORD PTR [rbp-20], edi
mov QWORD PTR [rbp-32], rsi
mov DWORD PTR [rbp-4], 2
mov DWORD PTR [rbp-8], 5
The reason for this is that it allows variables to be stored anywhere at any time, and loaded from anywhere at any time, without requiring a huge number of pushes and pops.
Consider the case where variables are stored using push and pop. Say a variable is stored early on in the function, let's call this foo. 8 variables on the stack later, you need foo, how should you access it?
Well, you can pop everything until foo, and then push everything back, but that's costly.
It also doesn't work when you have conditional statements. Say a variable is only ever stored if foo is some certain value. Now you have a conditional where the stack pointer could be at one of two locations after it!
For this reason, compilers always prefer to use rbp - N to store variables, as at any point in the function, the variable will still live at rbp - N.
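The arithmetic behind this can be written down directly. The sketch below assumes a frame set up with push rbp / mov rbp, rsp / sub rsp, N and 8-byte pushes; the function names are invented for the example.

```c
#include <assert.h>

/* Offset of a local from RSP, for a local at [rbp - local_off] in a frame that
   reserved `reserved` bytes with "sub rsp", after `pushes` extra 8-byte pushes. */
static long offset_from_rsp(long local_off, long reserved, long pushes)
{
    assert(local_off > 0 && local_off <= reserved && pushes >= 0);
    /* rsp = rbp - reserved - 8*pushes, so rbp - local_off = rsp + this value */
    return reserved - local_off + 8 * pushes;
}

/* Offset of the same local from RBP: it does not depend on the pushes at all. */
static long offset_from_rbp(long local_off)
{
    assert(local_off > 0);
    return -local_off;
}
```

For the local at rbp-4 in the 32-byte frame above, `offset_from_rsp(4, 32, 0)` is 28 but becomes 44 after two pushes, while `offset_from_rbp(4)` stays at -4 throughout.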
NB: On different ABIs (such as i386 System V), arguments to functions may be passed on the stack, but this isn't too much of an issue, as ABIs will generally specify how this should be handled. Again, using i386 System V as an example, the calling convention for a function will go something like:
push edi ; 2nd argument to the function.
push eax ; 1st argument to the function.
call my_func
; here, the caller still has to remove the two arguments itself (add esp, 8)
So, why does push actually cause an issue?
Well, I'll add a small asm snippet to the code
At the end of the function, we now have the following:
push 64
mov eax, 0
leave
ret
There are 2 things that can fail now due to pushing to the stack.
The first is the restore of rbp in the epilogue (see this thread).
The epilogue has to pop the value of rbp that was stored at the beginning of the function (notice the only push that the compiler generates is at the start: push rbp). Note that leave is equivalent to mov rsp, rbp followed by pop rbp, so it resets rsp before popping and discards a stray push. The failure described here happens when the epilogue is a plain pop rbp, which is what compilers typically emit when the function reserved no stack space, as with the main in the question.
This restore is there so that the stack frame of the caller is preserved following main. With a plain pop rbp, rbp is now going to be set to 64, since the last value pushed is 64. When the caller of main resumes its execution and tries to access a value at, say, rbp - 8, a crash will occur, as rbp - 8 is 0x38 in hex, which is an invalid address.
But that assumes the caller even gets execution back!
After rbp has been loaded with the invalid value, the next thing on the stack will be the original value of rbp.
The ret instruction will pop a value from the stack, and return to that address...
Notice how this might be slightly problematic?
The CPU is going to try and jump to the value of rbp stored at the start of the function!
On nearly every modern program, the stack is a "no execute" zone (see here), and attempting to execute code from there will immediately cause a crash.
So, TLDR: Pushing to the stack violates assumptions made by the compiler, most importantly about the return address of the function. This violation causes program execution to end up on the stack (generally), which will cause a crash.
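The effect of an unbalanced push can be traced in a toy model. This is a sketch under stated assumptions: the epilogue is a plain pop rbp followed by ret, the `Cpu` struct and `run_main` are hypothetical, and the stack holds bare 64-bit values rather than real addresses.

```c
#include <assert.h>
#include <stdint.h>

#define SLOTS 16

typedef struct {
    uint64_t mem[SLOTS];
    int rsp;               /* slot index of the top of the stack */
    uint64_t rbp;
    uint64_t rip;          /* where "ret" sent execution */
} Cpu;

static void push(Cpu *c, uint64_t v)
{
    assert(c->rsp > 0);
    c->mem[--c->rsp] = v;
}

static uint64_t pop(Cpu *c)
{
    assert(c->rsp < SLOTS);
    return c->mem[c->rsp++];
}

/* Models: call main; push rbp; mov rbp, rsp; [stray push 64]; pop rbp; ret */
static Cpu run_main(uint64_t caller_rbp, uint64_t ret_addr, int stray_push)
{
    Cpu c = { .rsp = SLOTS, .rbp = caller_rbp, .rip = 0 };
    push(&c, ret_addr);            /* call main pushes the return address */
    push(&c, c.rbp);               /* push rbp */
    c.rbp = (uint64_t)c.rsp;       /* mov rbp, rsp (slot index as address) */
    if (stray_push)
        push(&c, 64);              /* the inline asm's unbalanced push */
    c.rbp = pop(&c);               /* pop rbp */
    c.rip = pop(&c);               /* ret */
    return c;
}
```

Without the stray push, rbp and rip come back as the caller's values. With it, rbp ends up as 64 and rip ends up holding the caller's saved rbp, which is the jump onto the stack described above.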
I'm pretty new to assembly, and am trying my best to learn it. I'm taking a course to learn it and they mentioned a very simple Hello World example, which I decompiled.
original c file:
#include <stdio.h>
int main()
{
printf("Hello Students!");
return 0;
}
This was decompiled using the following command:
C:> objdump -d -Mintel HelloStudents.exe > disasm.txt
decompilation (assembly):
push ebp
mov ebp, esp
and esp, 0xfffffff0
sub esp, 0x10
call 401e80 <__main>
mov DWORD PTR [esp], 0x404000
call 4025f8 <_puts>
mov eax, 0x0
leave
ret
I'm having issues mapping this output from the decompilation to the original C file. Can someone help?
Thank you very much!
The technical term for decompiling assembly back into C is "turning hamburger back into cows". The generated assembly will not be a 1-to-1 translation of the source, and depending on the level of optimization may be radically different. You will get something functionally equivalent to the original source, but how closely it resembles that source in structure is heavily variable.
push ebp
mov ebp, esp
and esp, 0xfffffff0
sub esp, 0x10
This is all preamble, setting up the stack frame for the main function. It aligns the stack pointer (ESP) down to a 16-byte boundary, then reserves another 16 bytes of space for outgoing function args.
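The two ESP adjustments are plain integer arithmetic, which can be checked with a few lines of C. This is a sketch only; `prologue_esp` is a made-up helper and the input values are arbitrary example addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Applies the two ESP adjustments from the prologue to a starting ESP value. */
static uint32_t prologue_esp(uint32_t esp_after_mov)
{
    uint32_t esp = esp_after_mov & 0xfffffff0u;  /* and esp, 0xfffffff0 */
    esp -= 0x10;                                 /* sub esp, 0x10 */
    assert(esp % 16 == 0);                       /* still 16-byte aligned */
    return esp;
}
```

Starting from 0x0061ff28, the and clears the low four bits to give 0x0061ff20, and the sub then yields 0x0061ff10. An already aligned 0x0061ff20 ends up at the same place.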
call 401e80 <___main>
This function call to ___main is how MinGW arranges for libc initialization functions to run at the start of the program, making sure stdio buffers are allocated and stuff like that.
That's the end of the pre-amble; the part of the function that implements the C statements in your source starts with:
mov DWORD PTR [esp], 0x404000
This writes the address of the string literal "Hello Students!" onto the stack. Combined with the earlier sub esp, 16, this is like a push instruction. In this 32-bit calling convention, function args are passed on the stack, not registers, so that's where the compiler has to put them before function calls.
call 4025f8 <_puts>
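This calls the puts function. The compiler realized that you weren't doing any format processing in the printf call and replaced it with the simpler puts call (puts appends a newline itself, so compilers normally make this substitution only when the string ends in \n).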
mov eax, 0x0
The return value of main is loaded into the eax register
leave
ret
Restore the previous EBP value, and tear down the stack frame, then exit the function. ret pops a return address off the stack, which can only work when ESP is pointing at the return address.
I am currently trying to understand Writing buffer overflow exploits - a tutorial for beginners.
The C code, compiled with cc -ggdb exploitable.c -o exploitable
#include <stdio.h>
void exploitableFunction (void) {
char small[30];
gets (small);
printf("%s\n", small);
}
main() {
exploitableFunction();
return 0;
}
seems to have the assembly code
0x000000000040063b <+0>: push %rbp
0x000000000040063c <+1>: mov %rsp,%rbp
0x000000000040063f <+4>: callq 0x4005f6 <exploitableFunction>
0x0000000000400644 <+9>: mov $0x0,%eax
0x0000000000400649 <+14>: pop %rbp
0x000000000040064a <+15>: retq
I think it does the following, but I'm really not sure about it and I would like to hear from somebody who is experienced with assembly code if I'm right / what is right.
40063b: Put the address which is currently in the base pointer register into the stack segment (How is this register initialized? Why is that done?)
40063c: Copy the value from the stack pointer register into the base pointer register (why?)
40063f: Call exploitableFunction (What exactly does it mean to "call" a function in assembly? What happens here?)
400644: Copy the value from the address $0x0 to the EAX register
400649: Copy the value from the top of the stack (determined by the value in %rsp) into the base pointer register (seems to be confirmed by Assembler: Push / pop registers?)
40064a: Return (the OS uses what is in %EAX as return code - so I guess the address $0x0 contains the constant 0? Or is that not an address but the constant?)
40063b: Put the address which is currently in the base pointer register into the stack segment (How is this register initialized? Why is that done?)
You want to save the base pointer because it is probably used by the calling function.
40063c: Copy the value from the stack pointer register into the base pointer register (why?)
This gives you a fixed position into the stack, which might contain parameters for the function. It can also be used as a base address for any local variables.
40063f: Call exploitableFunction (What exactly does it mean to "call" a function in assembly? What happens here?)
"call" means pushing the return address (address of the next instruction) onto the stack, and then jumping to the start of the called function.
400644: Copy the value from the address $0x0 to the EAX register
It is actually the value 0 from the return statement.
400649: Copy the value from the top of the stack (determined by the value in %rsp) into the base pointer register (seems to be confirmed by Assembler: Push / pop registers?)
This restores the base pointer we saved at the top. The calling function might assume that we do.
40064a: Return (the OS uses what is in %EAX as return code - so I guess the address $0x0 contains the constant 0? Or is that not an address but the constant?)
It was the constant from return 0. Using EAX for a small return value is a common convention.
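What call and ret do to the stack and the program counter can be modelled in a few lines. The sketch below is illustrative only: the `Machine` struct and helper names are invented, and the 5-byte instruction length is read off the listing (the callq sits at 0x40063f and the next instruction at 0x400644).

```c
#include <assert.h>
#include <stdint.h>

#define DEPTH 8

typedef struct {
    uint64_t stack[DEPTH];
    int sp;                /* index of the top of the stack */
    uint64_t pc;           /* address of the instruction being executed */
} Machine;

/* call: push the address of the next instruction, then jump to the target. */
static void do_call(Machine *m, uint64_t target, uint64_t insn_len)
{
    assert(m->sp > 0);
    m->stack[--m->sp] = m->pc + insn_len;
    m->pc = target;
}

/* ret: pop the saved address back into the program counter. */
static void do_ret(Machine *m)
{
    assert(m->sp < DEPTH);
    m->pc = m->stack[m->sp++];
}
```

With pc at 0x40063f, `do_call(&m, 0x4005f6, 5)` leaves 0x400644 on the stack and moves pc to 0x4005f6. A later `do_ret(&m)` brings pc back to 0x400644, the mov $0x0,%eax.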
I found a link which has code similar to yours, with a full explanation.
40063b: push the old base pointer onto the stack to save it for later. It's pushed because main is not the only function in the program; some other function called it.
40063c: copy the value of the stack pointer to the base pointer. After this, %rbp points to the base of main’s stack frame.
40063f: call the function at address 0x4005f6. This pushes the program counter (the address of the next instruction, 0x400644) onto the stack and loads 0x4005f6 into the program counter. When the function returns, the saved address is popped from the stack back into the program counter, so execution continues at 0x400644.
400644: This instruction copies 0 into %eax. The x86 calling convention dictates that a function’s return value is stored in %eax.
400649: We pop the old base pointer off the stack and store it back in %rbp.
40064a: jump back to the return address, which is also stored in the stack frame. Since this is main, returning hands control back to the startup code that called it, which ends the program.
Also, you didn't include the assembly code for exploitableFunction; this is only the main function.
The function entry saves ebp and moves esp into ebp. All parameters of the function will now be addressed using ebp. This is the standard cdecl convention (in Intel syntax, 32-bit):
; int example(char *s, int i)
push ebp ; save the caller's value of ebp
mov ebp, esp ; set up our base pointer to the stack frame
sub esp, 16 ; room for automatic variables
mov eax, dword ptr [ebp+8] ; eax has s
mov edx, dword ptr [ebp+12] ; edx has i
... ; do your thing
mov eax, dword ptr [result] ; function return value in eax
mov esp, ebp ; release the automatic variables
pop ebp ; restore caller's base pointer
ret
When calling this function, the compiler pushes the parameters onto the stack and then calls the function. Upon return, it cleans up the stack:
; i = example(myString, k);
mov eax, [ebp+16] ; this gets a parameter of the current function
push eax ; this will be parameter i
mov eax, [ebp-16] ; this gets a local variable
push eax ; this is parameter s
call example
add esp, 8 ; remove the pushed parameters from the stack
mov dword ptr [i], eax ; save return value - always in eax
Different compilers can use different conventions about passing parameters in registers, but I think the above is the basics of calls in C (using cdecl).
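Where the +8 and +12 offsets come from can be checked with a small model of the stack at function entry. It is only a sketch: the `Sim` struct, the helper names, and the return address 0x00401234 are made up, and addresses are byte offsets into a 256-byte toy stack.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 256

typedef struct {
    uint32_t mem[STACK_BYTES / 4];
    uint32_t esp;
    uint32_t ebp;
} Sim;

static void push32(Sim *s, uint32_t v)
{
    assert(s->esp >= 4);
    s->esp -= 4;
    s->mem[s->esp / 4] = v;
}

static uint32_t load32(const Sim *s, uint32_t addr)
{
    assert(addr % 4 == 0 && addr < STACK_BYTES);
    return s->mem[addr / 4];
}

/* Caller: push i; push s; call example. Callee: push frame pointer; copy sp. */
static Sim enter_example(uint32_t s_ptr, uint32_t i)
{
    Sim s = { .esp = STACK_BYTES, .ebp = 0 };
    push32(&s, i);              /* last argument is pushed first */
    push32(&s, s_ptr);          /* first argument is pushed last */
    push32(&s, 0x00401234u);    /* call pushes a (hypothetical) return address */
    push32(&s, s.ebp);          /* callee saves the caller's frame pointer */
    s.ebp = s.esp;              /* frame pointer now marks this frame */
    return s;
}
```

After `enter_example(0x1000, 7)`, the slot at frame pointer + 4 holds the return address, + 8 holds s, and + 12 holds i, which matches the loads in the listing.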
My question is related to Stack allocation, padding, and alignment. Consider the following function:
void func(int a,int b)
{
char buffer[5];
}
At assembly level, the function looks like this:
pushl %ebp
movl %esp, %ebp
subl $24, %esp
I want to know how the 24 bytes on the stack are allocated. I understand that 16 bytes are allocated for the char buffer[5]. I don't understand what the extra 8 bytes are for and how they are allocated. The top answer in the above link says that they are for ret and leave. Can someone please expand on that?
I'm thinking that the stack structure looks like this:
[bottom] b , a , return address , frame pointer , buffer1 [top]
But this could be wrong, because I'm writing a simple buffer overflow and trying to change the return address, and for some reason the return address is not changing. Is something else present on the stack?
There are a couple of reasons for extra space. One is for alignment of variables. A second is to introduce padding for checking the stack (typically a debug build rather than a release build use of space). A third is to have additional space for temporary storage of registers or compiler generated temporary variables.
In the C calling sequence, the way it is normally done is there will be a series of push instructions pushing the arguments onto the stack and then a call instruction is used to call the function. The call instruction will push the return address onto the stack.
When the function returns, the calling function will then remove the pushed on arguments. For instance the call to a function (this is Visual Studio 2005 with a C++ program) will look like:
push OFFSET ?pHead@@3VPerson@@A ; pHead
call ?exterminateStartingFrom@@YAXPAVPerson@@@Z ; exterminateStartingFrom
add esp, 4
This is pushing the address of a variable onto the stack, calling the function (the function name is mangled per C++), and then after the called function returns, it readjusts the stack by adding to the stack pointer the number of bytes used for the address.
The following is the entry part of the called function. What this does is to allocate space on the stack for the local variables. Notice that after setting up the entry environment, it then gets the function argument from the stack.
push ebp
mov ebp, esp
sub esp, 232 ; 000000e8H
push ebx
push esi
push edi
lea edi, DWORD PTR [ebp-232]
When the function returns, it basically adjusts the stack back to where it was at the time the function was called. Each function is responsible for cleaning up whatever changes it has made to the stack before it returns.
pop edi
pop esi
pop ebx
add esp, 232 ; 000000e8H
pop ebp
ret 0
You mention that you are trying to change the return address. From these examples what you can see is that the return address is after the last argument that was pushed onto the stack.
Here is a brief writeup on function call conventions. Also take a look at this document on Intel assembler instructions.
Doing some example work with Visual Studio 2005, what I see is that with the following code I can access the return address for this example function.
void MyFunct (unsigned short arg) {
unsigned char *retAddress = (unsigned char *)&arg;
retAddress -=4;
printf ("Return address is 0x%2.2x%2.2x%2.2x%2.2x\n", retAddress[3], retAddress[2], retAddress[1], retAddress[0]);
}
Notice that the return address pushed by the call instruction is stored from low byte to high byte, because 32-bit x86 is little-endian. That is why the bytes are printed in reverse index order.
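The byte order can be reproduced independently of the host machine by writing the bytes out explicitly. A minimal sketch, with made-up helper names and an arbitrary example address:

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit value the way x86 lays it out: least significant byte first. */
static void store_le32(unsigned char out[4], uint32_t v)
{
    assert(out != 0);
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)(v >> (8 * i));
}

/* Reassemble the value, reading the highest-index byte as most significant. */
static uint32_t load_le32(const unsigned char in[4])
{
    assert(in != 0);
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

For the value 0x00401a2b, `store_le32` produces the bytes 2b 1a 40 00 in memory order, so printing indices 3, 2, 1, 0 gives back the address in its usual reading order.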
The extra space is for the stack alignment, which is usually done for better performance.
Whenever I read about program execution in C, it says very little about function execution. I am still trying to find out what happens to a function from the time it is called from another function to the time it returns. How do the function arguments get stored in memory?
That's unspecified; it's up to the implementation. As pointed out by Keith Thompson, it doesn't even have to tell you how it works. :)
Some implementations will put all the arguments on the stack, some will use registers, and many use a mix (the first n arguments passed in registers, any more and they go on the stack).
But the function itself is just code, it's read-only and nothing much "happens" to it during execution.
There is no one correct answer to this question; it depends heavily on what the compiler writer decides is the best model. There are various bits in the standard that describe this process, but most of it is implementation defined. Also, the process depends on the architecture of the system, the OS you're aiming for, the level of optimisation and so forth.
Take the following code:-
int DoProduct (int a, int b, int c)
{
return a * b * c;
}
int result = DoProduct (4, 5, 6);
The MSVC2005 compiler, using standard debug build options created this for the last line of the above code:-
push 6
push 5
push 4
call DoProduct (411186h)
add esp,0Ch
mov dword ptr [ebp-18h],eax
Here, the arguments are pushed onto the stack, starting with the last argument, then the penultimate argument and so on until the first argument is pushed. The function is called, then the arguments are removed from the stack (the add esp,0Ch) and then the return value, which the function leaves in the eax register, is saved.
Here's the code for the function:-
push ebp
mov ebp,esp
sub esp,0C0h
push ebx
push esi
push edi
lea edi,[ebp-0C0h]
mov ecx,30h
mov eax,0CCCCCCCCh
rep stos dword ptr es:[edi]
mov eax,dword ptr [a]
imul eax,dword ptr [b]
imul eax,dword ptr [c]
pop edi
pop esi
pop ebx
mov esp,ebp
pop ebp
ret
The first thing the function does is to create a local stack frame. This involves creating a space on the stack to store local and temporary variables in. In this case, 192 (0xc0) bytes are reserved (the first three instructions). The reason it's so many is to allow the edit-and-continue feature some space to put new variables into.
The next three instructions save the reserved registers as defined by the MS compiler. Then the stack frame space just created is initialised to contain a special debug signature, in this case 0xCC. This means uninitialised memory, and if you ever see a value consisting of just 0xCC bytes in debug mode then you've forgotten to initialise the value (unless 0xCC was the value).
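The fill performed by rep stos can be imitated in portable C to see what an untouched local looks like. This is a sketch, not what the compiler emits: the helper names are invented, and `memset` stands in for the 0x30 dword stores.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FRAME_BYTES 0xC0   /* 192 bytes, as reserved by sub esp,0C0h */

/* Fill the whole frame with the debug signature byte. */
static void fill_debug_frame(unsigned char frame[FRAME_BYTES])
{
    assert(frame != 0);
    memset(frame, 0xCC, FRAME_BYTES);
}

/* Read a 32-bit local at a byte offset inside the frame. */
static uint32_t read_local(const unsigned char frame[FRAME_BYTES], size_t offset)
{
    uint32_t v;
    assert(offset + sizeof v <= FRAME_BYTES);
    memcpy(&v, frame + offset, sizeof v);
    return v;
}

/* A local that still holds the fill pattern was probably never written. */
static int looks_uninitialised(uint32_t v)
{
    return v == 0xCCCCCCCCu;
}
```

After `fill_debug_frame`, every 4-byte local reads as 0xCCCCCCCC until something is stored to it. The count also checks out: 0x30 dwords of 4 bytes each is exactly the 0xC0 bytes reserved.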
Once all that housekeeping has been done, the next three instructions implement the body of the function, the two multiplies. After that, the reserved registers are restored and then the stack frame destroyed and finally the function ends with a ret. Fortunately, the imul puts the result of the multiplication into the eax register so there's no special code to get the result into the right register.
Now, you've probably been thinking that there's a lot there that isn't really necessary. And you're right, but debug is about getting the code right and a lot of the above helps to achieve that. In release, there's a lot that can be got rid of. There's no need for a stack frame, no need, therefore, to initialise it. There's no need to save the reserved registers as they aren't modified. In fact, the compiler creates this:-
mov eax,dword ptr [esp+4]
imul eax,dword ptr [esp+8]
imul eax,dword ptr [esp+0Ch]
ret
which, if I'd let the compiler do it, would have been in-lined into the caller.
There's a lot more stuff that can happen: values passed in registers and so on. Also, I've not got into how floating point values and structures / classes are passed to and from functions. And there's more that I've probably left out.