Explain the strange assembly of empty C `main` function by Visual C++ compiler

Explain the strange assembly of empty C `main` function by Visual C++ compiler - c

I just noticed some strange assembly language code of empty main method.
//filename: main.c
void main()
{
}
disassembly:
push ebp
mov ebp,esp
sub esp,0C0h; why on the earth is it reserving 192 bytes?
push ebx
push esi
push edi ; good compiler. Its saving ebx, esi & edi values.
lea edi,[ebp-0C0h] ; line 1
mov ecx,30h ; line 2
mov eax,0CCCCCCCCh ; line 3
rep stos dword ptr es:[edi] ; line 4
xor eax,eax ; returning value 0. Code following this line is explanatory.
pop edi ; restoring the original states of edi,esi & ebx
pop esi
pop ebx
mov esp,ebp
pop ebp
ret
why on the earth is it reserving 192 bytes for function where there aren't any variables
whats up with the four lines: line 1, line 2, line 3, line 4? what is it trying to do & WHY?

Greg already explained how the compiler generates code to diagnose uninitialized local variables, enabled by the /RTCu compile option. The 0xcccccccc value was chosen to be distinctive and easily recognized in the debugger. And to ensure the program bombs when an uninitialized pointer is dereferenced. And to ensure it terminates the program when it is executed as code. 0xcc is pretty ideal to do all of these jobs well, it is the instruction opcode for INT3.
The mysterious 192 bytes that are allocated in the stack frame are there to support the Edit + Continue feature, /ZI compile option. It allows you to edit the code while a breakpoint is active. And add local variables to a function those 192 bytes are available to provide the space for those added locals. Exceeding that space will make the IDE force you to rebuild your program.
Btw: this can cause a problem if you use recursion in your code. The debug build will bomb with this site's name a lot quicker. Not normally much of an issue, you debug with practical dataset sizes.

The four code lines you've indicated are the debug build clearing out the local variable space with the "clear" special value (0xCCCCCCCC).
I'm not sure why there are 192 bytes of seemingly dead space, but that might be VC++ building some guard space into your local variable area to try to detect stack smashing.
You will probably get a very different output if you switch from Debug to Release build.

Related

map exe decompilation back to C language

Im pretty new to assembly, and am trying my best to learn it. Im taking a course to learn it and they mentioned a very remedial Hello World example, that I decomplied.
original c file:
#include <stdio.h>
int main()
{
printf("Hello Students!");
return 0;
}
This was decompiled using the following command:
C:> objdump -d -Mintel HelloStudents.exe > disasm.txt
decompliation (assembly):
push ebp
mov ebp, esp
and esp, 0xfffffff0
sub esp, 0x10
call 401e80 <__main>
mov DWORD PTR [esp], 0x404000
call 4025f8 <_puts>
mov eax, 0x0
leave
ret
Im having issues mapping this output from the decompliation, to the original C file can someone help?
Thank you very much!

The technical term for decompiling assembly back into C is "turning hamburger back into cows". The generated assembly will not be a 1-to-1 translation of the source, and depending on the level of optimization may be radically different. You will get something functionally equivalent to the original source, but how closely it resembles that source in structure is heavily variable.
push ebp
mov ebp, esp
and esp, 0xfffffff0
sub esp, 0x10
This is all preamble, setting up the stack frame for the main function. It aligns the stack pointer (ESP) by 16 bytes then reserves another 16 bytes of space for outgoing function args.
call 401e80, <___main>
This function call to ___main is how MinGW arranges for libc initialization functions to run at the start of the program, making sure stdio buffers are allocated and stuff like that.
That's the end of the pre-amble; the part of the function that implements the C statements in your source starts with:
mov DWORD PTR [esp], 0x404000
This writes the address of the string literal "Hello Students!" onto the stack. Combined with the earliersub esp, 16, this is like apush` instruction. In this 32-bit calling convention, function args are passed on the stack, not registers, so that's where the compiler has to put them before function calls.
call 4025f8 <_puts>
This calls the puts function. The compiler realized that you weren't doing any format processing in the printf call and replaced it with the simpler puts call.
mov eax, 0x0
The return value of main is loaded into the eax register
leave
ret
Restore the previous EBP value, and tear down the stack frame, then exit the function. ret pops a return address off the stack, which can only work when ESP is pointing at the return address.

Why stack grows by 16 bytes in this disassembly, when I only have one 4 byte local variable?

I'm having trouble understanding why the compiler chose to offset the stack space in the way it did with the code I wrote.
I was toying with Godbolt's Compiler Explorer in order to study the C calling convention, when I came up with a simple code that puzzled me by its choices.
The code is found in this link. I selected GCC 8.2 x86-64, but am targetting x86 processors and this is important. Bellow is the transcription of the C code and the generated assembly reported by the Compiler Explorer.
// C code
int testing(char a, int b, char c) {
return 42;
}
int main() {
int x = testing('0', 0, '7');
return 0;
}
; Generated assembly
testing(char, int, char):
push ebp
mov ebp, esp
sub esp, 8
mov edx, DWORD PTR [ebp+8]
mov eax, DWORD PTR [ebp+16]
mov BYTE PTR [ebp-4], dl
mov BYTE PTR [ebp-8], al
mov eax, 42
leave
ret
main:
push ebp
mov ebp, esp
sub esp, 16
push 55
push 0
push 48
call testing(char, int, char)
add esp, 12
mov DWORD PTR [ebp-4], eax
mov eax, 0
leave
ret
Looking at the assembly column from now on, as I understood, line 15 is responsible for reserving space in the stack for the local variables. The problem is that I have only one local int and the offset was by 16 bytes instead of 4. This feels like wasted space.
Is this somewhat related to word alignment? But even if it is, if the sizes of the general purpose registers are 4 bytes, shouldn't this alignment be with regards to 4 bytes?
One other strange thing I see is with respect to the placement of the local chars of the testing function. They seem to be taking 4 bytes each in the stack, as seen in lines 7-8, but only the lower bytes are manipulated. Why not use only 1 byte each?
These choices are probably well intended, and I would really like to understand their purpose (or whether there is no purpose). Or maybe I'm just confused and didn't quite get it.

So, by the comments, I could figure out that the stack growth issue is due to the i386 SystemV ABI requirements, as stated by #PeterCordes.
The reason why the chars are word aligned may be due to GCC's default behavior to improve speed, as maybe inferenced from #Ped7g's comment. Although not definite, this is a good enough answer for me.

It's common today to acquire stack space in multiples of this size, for several reasons:
cache lines favor this behaviour by maintaining the whole data in the cache.
space for temporaries is preallocated, avoiding push and pop instructions to be used in case some temporary storage is needed out of the cpu.
individual push and pop instructions degrade pipeline execution, by requiring data to be updated before next instruction is executed. This decouples the data dependencies between consecutive instructions and allow them to run faster.
For this reasons, actual compilers specify ABIs to be designed in this way.

What happens to a function called in C from the time being called to the time it returns?

Whenever I read about program execution in C, it speaks very less about the function execution. I am still trying to find out what happens to a function when the program starts executing it from the time it is been called from another function to the time it returns? How do the function arguments get stored in memory?

That's unspecified; it's up to the implementation. As pointed out by Keith Thompson, it doesn't even have to tell you how it works. :)
Some implementations will put all the arguments on the stack, some will use registers, and many use a mix (the first n arguments passed in registers, any more and they go on the stack).
But the function itself is just code, it's read-only and nothing much "happens" to it during execution.

There is no one correct answer to this question, it depends heavily upon how the compiler writer determines is the best model to do this. There are various bits in the standard that describes this process but most of it is implementation defined. Also, the process is dependent on the architecture of the system, the OS you're aiming for, the level of optimisation and so forth.
Take the following code:-
int DoProduct (int a, int b, int c)
{
return a * b * c;
}
int result = DoProduct (4, 5, 6);
The MSVC2005 compiler, using standard debug build options created this for the last line of the above code:-
push 6
push 5
push 4
call DoProduct (411186h)
add esp,0Ch
mov dword ptr [ebp-18h],eax
Here, the arguments are pushed onto the stack, starting with the last argument, then the penultimate argument and so on until the the first argument is pushed onto the stack. The function is called, then the arguments are removed from the stack (the add esp,0ch) and then the return value is saved - the result is stored in the eax register.
Here's the code for the function:-
push ebp
mov ebp,esp
sub esp,0C0h
push ebx
push esi
push edi
lea edi,[ebp-0C0h]
mov ecx,30h
mov eax,0CCCCCCCCh
rep stos dword ptr es:[edi]
mov eax,dword ptr [a]
imul eax,dword ptr [b]
imul eax,dword ptr [c]
pop edi
pop esi
pop ebx
mov esp,ebp
pop ebp
ret
The first thing the function does is to create a local stack frame. This involves creating a space on the stack to store local and temporary variables in. In this case, 192 (0xc0) bytes are reserved (the first three instructions). The reason it's so many is to allow the edit-and-continue feature some space to put new variables into.
The next three instructions save the reserved registers as defined by the MS compiler. Then the stack frame space just created is initialised to contain a special debug signature, in this case 0xCC. This means unitialised memory and if you ever see a value consisting of just 0xCC's in debug mode then you've forgotten to initialise the value (unless 0xCC was the value).
Once all that housekeeping has been done, the next three instructions implement the body of the function, the two multiplies. After that, the reserved registers are restored and then the stack frame destroyed and finally the function ends with a ret. Fortunately, the imul puts the result of the multiplication into the eax register so there's no special code to get the result into the right register.
Now, you've probably been thinking that there's a lot there that isn't really necessary. And you're right, but debug is about getting the code right and a lot of the above helps to achieve that. In release, there's a lot that can be got rid of. There's no need for a stack frame, no need, therefore, to initialise it. There's no need to save the reserved registers as they aren't modified. In fact, the compiler creates this:-
mov eax,dword ptr [esp+4]
imul eax,dword ptr [esp+8]
imul eax,dword ptr [esp+0Ch]
ret
which, if I'd let the compiler do it, would have been in-lined into the caller.
There's a lot more stuff that can happen: values passed in registers and so on. Also, I've not got into how floating point values and structures / classes as passed to and from functions. And there's more that I've probably left out.

Help deciphering simple Assembly Code

I am learning assembly using GDB & Eclipse
Here is a simple C code.
int absdiff(int x, int y)
{
if(x < y)
return y-x;
else
return x-y;
}
int main(void) {
int x = 10;
int y = 15;
absdiff(x,y);
return EXIT_SUCCESS;
}
Here is corresponding assembly instructions for main()
main:
080483bb: push %ebp #push old frame pointer onto the stack
080483bc: mov %esp,%ebp #move the frame pointer down, to the position of stack pointer
080483be: sub $0x18,%esp # ???
25 int x = 10;
080483c1: movl $0xa,-0x4(%ebp) #move the "x(10)" to 4 address below frame pointer (why not push?)
26 int y = 15;
080483c8: movl $0xf,-0x8(%ebp) #move the "y(15)" to 8 address below frame pointer (why not push?)
28 absdiff(x,y);
080483cf: mov -0x8(%ebp),%eax # -0x8(%ebp) == 15 = y, and move it into %eax
080483d2: mov %eax,0x4(%esp) # from this point on, I am confused
080483d6: mov -0x4(%ebp),%eax
080483d9: mov %eax,(%esp)
080483dc: call 0x8048394 <absdiff>
31 return EXIT_SUCCESS;
080483e1: mov $0x0,%eax
32 }
Basically, I am asking to help me to make sense of this assembly code, and why it is doing things in this particular order. Point where I am stuck, is shown in assembly comments. Thanks !

Lines 0x080483cf to 0x080483d9 are copying x and y from the current frame on the stack, and pushing them back onto the stack as arguments for absdiff() (this is typical; see e.g. http://en.wikipedia.org/wiki/X86_calling_conventions#cdecl). If you look at the disassembler for absdiff() (starting at 0x8048394), I bet you'll see it pick these values up from the stack and use them.
This might seem like a waste of cycles in this instance, but that's probably because you've compiled without optimisation, so the compiler does literally what you asked for. If you use e.g. -O2, you'll probably see most of this code disappear.

First it bears saying that this assembly is in the AT&T syntax version of x86_32, and that the order of arguments to operations is reversed from the Intel syntax (used with MASM, YASM, and many other assemblers and debuggers).
080483bb: push %ebp #push old frame pointer onto the stack
080483bc: mov %esp,%ebp #move the frame pointer down, to the position of stack pointer
080483be: sub $0x18,%esp # ???
This enters a stack frame. A frame is an area of memory between the stack pointer (esp) and the base pointer (ebp). This area is intended to be used for local variables that have to live on the stack. NOTE: Stack frames don't have to be implemented in this way, and GCC has the optimization switch -fomit-frame-pointer that does away with it except when alloca or variable sized arrays are used, because they are implemented by changing the stack pointer by arbitrary values. Not using ebp as the frame pointer allows it to be used as an extra general purpose register (more general purpose registers is usually good).
Using the base pointer makes several things simpler to calculate for compilers and debuggers, since where variables are located relative to the base does not change while in the function, but you can also index them relative to the stack pointer and get the same results, though the stack pointer does tend to change around so the same location may require a different index at different times.
In this code 0x18 (or 24) bytes are being reserved on the stack for local use.
This code so far is often called the function prologue (not to be confused with the programming language "prolog").
25 int x = 10;
080483c1: movl $0xa,-0x4(%ebp) #move the "x(10)" to 4 address below frame pointer (why not push?)
This line moves the constant 10 (0xA) to a location within the current stack frame relative to the base pointer. Because the base pointer below the top of the stack and since the stack grows downward in RAM the index is negative rather than positive. If this were indexed relative to the stack pointer a different index would be used, but it would be positive.
You are correct that this value could have been pushed rather than copied like this. I suspect that this is done this way because you have not compiled with optimizations turned on. By default gcc (which I assume you are using based on your use of gdb) does not optimize much, and so this code is probably the default "copy a constant to a location in the stack frame" code. This may not be the case, but it is one possible explanation.
26 int y = 15;
080483c8: movl $0xf,-0x8(%ebp) #move the "y(15)" to 8 address below frame pointer (why not push?)
Similar to the previous line of code. These two lines of code put the 10 and 15 into local variables. They are on the stack (rather than in registers) because this is unoptimized code.
28 absdiff(x,y);
gdb printing this meant that this is the source code line being executed, not that this function is being executed (yet).
080483cf: mov -0x8(%ebp),%eax # -0x8(%ebp) == 15 = y, and move it into %eax
In preparation for calling the function the values that are being passed as arguments need to be retrieved from their storage locations (even though they were just placed at those locations and their values are known because of the no optimization thing)
080483d2: mov %eax,0x4(%esp) # from this point on, I am confused
This is the second part of the move to the stack of one of the local variables' value so that it can be use as an argument to the function. You can't (usually) move from one memory address to another on x86, so you have to move it through a register (eax in this case).
080483d6: mov -0x4(%ebp),%eax
080483d9: mov %eax,(%esp)
These two lines do the same thing except for the other variable. Note that since this variable is being moved to the top of the stack that no offset is being used in the second instruction.
080483dc: call 0x8048394 <absdiff>
This pushed the return address to the top of the stack and jumps to the address of absdiff.
You didn't include code for absdiff, so you probably did not step through that.
31 return EXIT_SUCCESS;
080483e1: mov $0x0,%eax
C programs return 0 upon success, so EXIT_SUCCESS was defined as 0 by someone. Integer return values are put in eax, and some code that called the main function will use that value as the argument when calling the exit function.
32 }
This is the end. The reason that gdb stopped here is that there are things that actually happen to clean up. In C++ it is common to see destructor for local class instances being called here, but in C you will probably just see the function epilogue. This is the compliment to the function prologue, and consists of returning the stack pointer and base pointer to the values that they were originally at. Sometimes this is done with similar math on them, but sometimes it is done with the leave instruction. There is also an enter instruction which can be used for the prologue, but gcc doesn't do this (I don't know why). If you had continued to view the disassembly here you would have seen the epilogue code and a ret instruction.
Something you may be interested in is the ability to tell gcc to produce assembly files. If you do:
gcc -S source_file.c
a file named source_file.s will be produced with assembly code in it.
If you do:
gcc -S -O source_file.c
Then the same thing will happen, but some basic optimizations will be done. This will probably make reading the assembly code easier since the code will not likely have as many odd instructions that seem like they could have been done a better way (like moving constant values to the stack, then to a register, then to another location on the stack and never using the push instruction).
You regular optimization flags for gcc are:
-O0 default -- none
-O1 a few optimizations
-O the same as -O1
-O2 a lot of optimizations
-O3 a bunch more, some of which may take a long time and/or make the code a lot bigger
-Os optimize for size -- similar to -O2, but not quite
If you are actually trying to debug C programs then you will probably want the least optimizations possible since things will happen in the order that they are written in your code and variables won't disappear.
You should have a look at the gcc man page:
man gcc

Remember, if you're running in a debugger or debug mode, the compiler reserves the right to insert whatever debugging code it likes and make other nonsensical code changes.
For example, this is Visual Studio's debug main():
int main(void) {
001F13D0 push ebp
001F13D1 mov ebp,esp
001F13D3 sub esp,0D8h
001F13D9 push ebx
001F13DA push esi
001F13DB push edi
001F13DC lea edi,[ebp-0D8h]
001F13E2 mov ecx,36h
001F13E7 mov eax,0CCCCCCCCh
001F13EC rep stos dword ptr es:[edi]
int x = 10;
001F13EE mov dword ptr [x],0Ah
int y = 15;
001F13F5 mov dword ptr [y],0Fh
absdiff(x,y);
001F13FC mov eax,dword ptr [y]
001F13FF push eax
001F1400 mov ecx,dword ptr [x]
001F1403 push ecx
001F1404 call absdiff (1F10A0h)
001F1409 add esp,8
*(int*)nullptr = 5;
001F140C mov dword ptr ds:[0],5
return 0;
001F1416 xor eax,eax
}
001F1418 pop edi
001F1419 pop esi
001F141A pop ebx
001F141B add esp,0D8h
001F1421 cmp ebp,esp
001F1423 call #ILT+300(__RTC_CheckEsp) (1F1131h)
001F1428 mov esp,ebp
001F142A pop ebp
001F142B ret
It helpfully posts the C++ source next to the corresponding assembly. In this case, you can fairly clearly see that x and y are stored on the stack explicitly, and an explicit copy is pushed on, then absdiff is called. I explicitly de-referenced nullptr to cause the debugger to break in. You may wish to change compiler.

Compile with -fverbose-asm -g -save-temps for additional information with GCC.

Explanation of the disassembly of the simplest program (x86)

The following code
int _main() {return 0;}
Compiled using the command:
gcc -s -nostdlib -nostartfiles 01-simple.c -o01-simple.exe
gcc version 4.4.1 (TDM-1 mingw32)
OllyDbg produced this output:
Can you explain what happens here?
Analysis so far:
// these two seems to be an idiom:
PUSH EBP // places EBP on stack
MOV EBP, ESP // overwrites EBP with ESP
MOV EAX, 0 // EAX = 0
LEAVE // == mov esp, ebp
// pop ebp
// according to
// http://en.wikipedia.org/wiki/X86_instruction_listings
What is the meaning of all this?

This creates a stack frame.
PUSH EBP
MOV EBP, ESP
In the calling convention being used, the return value is sent back via EAX (so the 0 is there because you wrote return 0; - try changing that to return 1; and see how that affects the code).
MOV EAX, 0
And this tells the processor to clean up the stack frame (it's the equivalent of MOV ESP, EBP followed by POP EBP which is the opposite of what was done when creating the stack frame):
LEAVE

The instructions sets up the stack frame upon the runtime loader entering the int _main() function,
PUSH EBP
MOV EBP, ESP
Stack frame is set up and to access the parameters if any were supplied would be offset from EBP + the size of the parameter (WORD, BYTE, LONG, etc).
Usually the EAX register is the normal register to return an exit status from the runtime environment to the operating system loader,
MOV EAX, 0
LEAVE
in other words, to say the program has exited successfully returning a 0 to the Operating system.
Where a return is used, the stack frame is restored upon runtime execution prior to handing control back to the Operating system.
POP EBP
The general consensus is, if an error has occurred, the value would be non-zero, and can be used both from batch files (harking back to the old DOS days) and unix scripts where it checks if the program ran successfully or not and proceed onwards depending on the nature of the batch file or script.

The 'MOV EAX,0' instruction is the body of your function. The header code is for setting up space for your code to run in.
The 'LEAVE' instruction returns your code to where it came. Even though you said no standard libraries, there is plenty of other code in there that the linker puts into place.