I've written a basic C program that defines an integer variable x, sets it to zero and returns the value of that variable:
#include <stdio.h>
int main(int argc, char **argv) {
int x;
x = 0;
return x;
}
When I dump the object code using objdump (compiled on Linux X86-64 with gcc):
0x0000000000400474 <main+0>: push %rbp
0x0000000000400475 <main+1>: mov %rsp,%rbp
0x0000000000400478 <main+4>: mov %edi,-0x14(%rbp)
0x000000000040047b <main+7>: mov %rsi,-0x20(%rbp)
0x000000000040047f <main+11>: movl $0x0,-0x4(%rbp)
0x0000000000400486 <main+18>: mov -0x4(%rbp),%eax
0x0000000000400489 <main+21>: leaveq
0x000000000040048a <main+22>: retq
I can see the function prologue, but before we set x to 0 at address 0x000000000040047f there are two instructions that move %edi and %rsi onto the stack. What are these for?
In addition, unlike where we set x to 0, the mov instruction as shown in GAS syntax does not have a suffix.
If the suffix is not specified, and there are no memory operands for the instruction, GAS infers the operand size from the size of the destination register operand.
In this case, are -0x14(%rsbp) and -0x20(%rbp) both memory operands and what are their sizes? Since %edi is a 32 bit register, are 32 bits moved to -0x14(%rsbp) whereas since %rsi is a 64 bit register, 64 bits are moved to %rsi,-0x20(%rbp)?
In this simple case, why don't you ask your compiler directly? For GCC, clang and ICC there's the -fverbose-asm option.
main:
pushq %rbp #
movq %rsp, %rbp #,
movl %edi, -20(%rbp) # argc, argc
movq %rsi, -32(%rbp) # argv, argv
movl $0, -4(%rbp) #, x
movl -4(%rbp), %eax # x, D.2607
popq %rbp #
ret
So, yes, they save argv and argv onto the stack by using the "old" frame pointer method since new architectures allow subtracting/adding from/to the stack pointer directly, thus omitting the frame pointer (-fomit-frame-pointer).
Purpose of ESI & EDI registers?
Based on this and the context, I'm not an expert, but my guess is these are capturing the main() input parameters. EDI takes a standard width, which would match the int argc, whereas RSI takes a long, which would match the char **argv pointer.
Related
This question already has answers here:
Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?
(1 answer)
How many bytes does the push instruction push onto the stack when I don't specify the operand size?
(2 answers)
Does each PUSH instruction push a multiple of 8 bytes on x64?
(2 answers)
What's the difference between 'push' and 'pushq' in at&t assembly
(1 answer)
How to determine the appropriate MOV instruction suffix based on the operands?
(1 answer)
Closed 2 years ago.
I've written the following basic C program:
int main() {
char a = 1;
char b = 5;
return a + b;
}
And it compiles in godbolt as:
main:
pushq %rbp
movq %rsp, %rbp
movb $1, -1(%rbp)
movb $5, -2(%rbp)
movsbl -1(%rbp), %edx
movsbl -2(%rbp), %eax
addl %edx, %eax
popq %rbp
ret
I have a few questions about the compiled asm:
Is movb used for 1byte (char), movw for 2byte (short), movl for 4byte (int), and movq for 8byte (int) integers? What then is just mov used for, without an extension?
Why is an offset used for movb $1 -1(%rbp), movb $5 -2(%rbp)? Why aren't the two numbers just moved into two different registers? For example, there's an addl %edx, %eax later on...why aren't the two numbers just moved into those two registers?
Why is movsbl used here? Why aren't the numbers just moved directly into the registers?
Is pushq / popq pushing/popping an 8byte pointer onto the stack? If so, what's the point of the movq %rsp, %rbp?
I'm experimenting disassembling clang binaries of simple C programs (compiled with -O0), and I'm confused about a certain instruction that gets generated.
Here are two empty main functions with standard arguments, one of which returns value and other does not:
// return_void.c
void main(int argc, char** argv)
{
}
// return_0.c
int main(int argc, char** argv)
{
return 0;
}
Now, when I disassemble their assemblies, they look reasonably different, but there's one line that I don't understand:
return_void.bin:
(__TEXT,__text) section
_main:
0000000000000000 pushq %rbp
0000000000000001 movq %rsp, %rbp
0000000000000004 movl %edi, -0x4(%rbp)
0000000000000007 movq %rsi, -0x10(%rbp)
000000000000000b popq %rbp
000000000000000c retq
return_0.bin:
(__TEXT,__text) section
_main:
0000000100000f80 pushq %rbp
0000000100000f81 movq %rsp, %rbp
0000000100000f84 xorl %eax, %eax # We return with EAX, so we clean it to return 0
0000000100000f86 movl $0x0, -0x4(%rbp) # What does this mean?
0000000100000f8d movl %edi, -0x8(%rbp)
0000000100000f90 movq %rsi, -0x10(%rbp)
0000000100000f94 popq %rbp
0000000100000f95 retq
It only gets generated when I use the function is not void, so I thought that it might be another way to return 0, but when I changed the returned constant, this line didn't change at all:
// return_1.c
int main(int argc, char** argv)
{
return 1;
}
empty_return_1.bin:
(__TEXT,__text) section
_main:
0000000100000f80 pushq %rbp
0000000100000f81 movq %rsp, %rbp
0000000100000f84 movl $0x1, %eax # Return value modified
0000000100000f89 movl $0x0, -0x4(%rbp) # This value is not modified
0000000100000f90 movl %edi, -0x8(%rbp)
0000000100000f93 movq %rsi, -0x10(%rbp)
0000000100000f97 popq %rbp
0000000100000f98 retq
Why is this line getting generated and what is it's purpose?
The purpose of that area is revealed by the following code
int main(int argc, char** argv)
{
if (rand() == 42)
return 1;
printf("Helo World!\n");
return 0;
}
At the start it does
movl $0, -4(%rbp)
then the early return looks as follows
callq rand
cmpl $42, %eax
jne .LBB0_2
movl $1, -4(%rbp)
jmp .LBB0_3
and then at the end it does
.LBB0_3:
movl -4(%rbp), %eax
addq $32, %rsp
popq %rbp
retq
So, this area is indeed reserved to store the function return value. It doesn't appear to be terribly necessary and it is not used in optimized code, but in -O0 mode that's the way it works.
clang is making space on the stack for the arguments (registers edi and rsi) and puts the value 0 on the stack, too, for some reason. I assume that clang compiles your code to an SSA-representation like this:
int main(int argc, char** argv)
{
int a;
a = 0;
return a;
}
This would explain why a stack slot is allocated. If clang does constant propagation, too, this would explain why eax is zeroed out instead of being loaded from -4(%rbp). In general, don't think too much about dubious constructs in unoptimized assembly. After all, you forbade the compiler from removing useless code.
movl $0x0,-0x4(%rbp)
This instruction stores 0 at %rbp - 4. It seems that clang allocates a hidden local variable for an implicit return value from main.
From the clang mailing list:
Yes. We allocate an implicit local variable to hold the return value;
return statements then just initialize the return slot and jump to the
epilogue, where the slot is loaded and returned. We don't use a phi
because the control flow for getting to the epilogue is not
necessarily as simple as a simple branch, due to cleanups in local
scopes (like C++ destructors).
Implicit return values like main's are handled with an implicit store
in the prologue.
Source: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2012-February/019767.html
According the the standard (for hosted environments), 5.1.2.2.1, main is required to returrn an int result. So do not expect defined behavior if violating this.
Furthermore, main is actually _not required to explicitly return 0; this is implicitly returned if it reaches the end of the function. (Note this is only for main, which also does not have a prototype.
I recently went through an Assembly language book by Richard Blum wherein there was a subject on the C program to assembly conversion.
Consider the following C program:
#include <stdio.h>
int main(){
int a=100;
int b=25;
if (a>b)
printf("The higher value is %d\n", a);
else
printf("The higher value is %d\n", b);
return 0;
}
when I compiled the above program using -S parameter as:
gcc -S abc.c
I got the following result:
.file "abc.c"
.section .rodata
.LC0:
.string "The higher value is %d\n"
.text
.globl main
.type main, #function
main:
.LFB0:
.cfi_startproc
leal 4(%esp), %ecx
.cfi_def_cfa 1, 0
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
.cfi_escape 0x10,0x5,0x2,0x75,0
movl %esp, %ebp
pushl %ecx
.cfi_escape 0xf,0x3,0x75,0x7c,0x6
subl $20, %esp
movl $100, -16(%ebp)
movl $25, -12(%ebp)
movl -16(%ebp), %eax
cmpl -12(%ebp), %eax
jle .L2
subl $8, %esp
pushl -16(%ebp)
pushl $.LC0
call printf
addl $16, %esp
jmp .L3
.L2:
subl $8, %esp
pushl -12(%ebp)
pushl $.LC0
call printf
addl $16, %esp
.L3:
movl $0, %eax
movl -4(%ebp), %ecx
.cfi_def_cfa 1, 0
leave
.cfi_restore 5
leal -4(%ecx), %esp
.cfi_def_cfa 4, 4
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005"
.section .note.GNU-stack,"",#progbits
What I cant understand is this:
Snippet
.LFB0:
.cfi_startproc
leal 4(%esp), %ecx
.cfi_def_cfa 1, 0
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
.cfi_escape 0x10,0x5,0x2,0x75,0
movl %esp, %ebp
pushl %ecx
.cfi_escape 0xf,0x3,0x75,0x7c,0x6
subl $20, %esp
I am unable to predict what is happening with the ESP and EBP register. About EBP, I can understand to an extent that it is used as a local stack and so it's value is saved by pushing onto stack.
Can you please elaborate the above snippet?
This is a special form of function entry-sequence suitable for the main()
function. The compiler knows that main() really is called as main(int argc, char **argv, char **envp), and compiles this function according to that very special behavior. So what's sitting on the stack when this code is reached is four long-size values, in this order: envp, argv, argc, return_address.
So that means that the entry-sequence code is doing something like this
(rewritten to use Intel syntax, which frankly makes a lot more sense
than AT&T syntax):
; Copy esp+4 into ecx. The value at [esp] has the return address,
; so esp+4 is 'argc', or the start of the function's arguments.
lea ecx, [esp+4]
; Round esp down (align esp down) to the nearest 16-byte boundary.
; This ensures that regardless of what esp was before, esp is now
; starting at an address that can store any register this processor
; has, from the one-byte registers all the way up to the 16-byte xmm
; registers
and esp, 0xFFFFFFF0
; Since we copied esp+4 into ecx above, that means that [ecx] is 'argc',
; [ecx+4] is 'argv', and [ecx+8] is 'envp'. For whatever reason, the
; compiler decided to push a duplicate copy of 'argv' onto the function's
; new local frame.
push dword ptr [ecx+4]
; Preserve 'ebp'. The C ABI requires us not to damage 'ebp' across
; function calls, so we save its old value on the stack before we
; change it.
push ebp
; Set 'ebp' to the current stack pointer to set up the function's
; stack frame for real. The "stack frame" is the place on the stack
; where this function will store all its local variables.
mov ebp, esp
; Preserve 'ecx'. Ecx tells us what 'esp' was before we munged 'esp'
; in the 'and'-instruction above, so we'll need it later to restore
; 'esp' before we return.
push ecx
; Finally, allocate space on the stack frame for the local variables,
; 20 bytes worth. 'ebp' points to 'esp' plus 24 by this point, and
; the compiler will use 'ebp-16' and 'ebp-12' to store the values of
; 'a' and 'b', respectively. (So under 'ebp', going down the stack,
; the values will look like this: [ecx, unused, unused, a, b, unused].
; Those unused slots are probably used by the .cfi pseudo-ops for
; something related to exception handling.)
sub esp, 20
At the other end of the function, the inverse operations are used to put
the stack back the way it was before the function was called; it may be
helpful to examine what they're doing as well to understand what's happening
at the beginning:
; Return values are always passed in 'eax' in the x86 C ABI, so set
; 'eax' to the return value of 0.
mov eax, 0
; We pushed 'ecx' onto the stack a while back to save it. This
; instruction pulls 'ecx' back off the stack, but does so without
; popping (which would alter 'esp', which doesn't currently point
; to the right location).
mov ecx, [ebp+4]
; Magic instruction! The 'leave' instruction is designed to shorten
; instruction sequences by "undoing" the stack in a single op.
; So here, 'leave' means specifically to do the following two
; operations, in order: esp = ebp / pop ebp
leave
; 'esp' is now set to what it was before we pushed 'ecx', and 'ebp'
; is back to the value that was used when this function was called.
; But that's still not quite right, so we set 'esp' specifically to
; 'ecx+4', which is the exact opposite of the very first instruction
; in the function.
lea esp, [ecx+4]
; Finally, the stack is back to the way it was when we were called,
; so we can just return.
ret
This question already has answers here:
Why does GCC pad functions with NOPs?
(3 answers)
Closed 7 years ago.
On OSX 64bit, compiling a dummy C program like that:
#include <stdio.h>
void foo1() {
}
void foo2() {
}
int main() {
printf("Helloooo!\n");
foo1();
foo2();
return 0;
}
Produces the following ASM code (obtained disassembling the binary with otool):
(__TEXT,__text) section
_foo1:
0000000100000f10 55 pushq %rbp
0000000100000f11 4889e5 movq %rsp, %rbp
0000000100000f14 897dfc movl %edi, -0x4(%rbp)
0000000100000f17 5d popq %rbp
0000000100000f18 c3 retq
0000000100000f19 0f1f8000000000 nopl (%rax)
_foo2:
0000000100000f20 55 pushq %rbp
0000000100000f21 4889e5 movq %rsp, %rbp
0000000100000f24 5d popq %rbp
0000000100000f25 c3 retq
0000000100000f26 662e0f1f840000000000 nopw %cs:(%rax,%rax)
_main:
0000000100000f30 55 pushq %rbp
0000000100000f31 4889e5 movq %rsp, %rbp
0000000100000f34 4883ec10 subq $0x10, %rsp
0000000100000f38 488d3d4b000000 leaq 0x4b(%rip), %rdi ## literal pool for: "Helloooo!\n"
0000000100000f3f c745fc00000000 movl $0x0, -0x4(%rbp)
0000000100000f46 b000 movb $0x0, %al
0000000100000f48 e81b000000 callq 0x100000f68 ## symbol stub for: _printf
0000000100000f4d bf06000000 movl $0x6, %edi
0000000100000f52 8945f8 movl %eax, -0x8(%rbp)
0000000100000f55 e8b6ffffff callq _foo1
0000000100000f5a e8c1ffffff callq _foo2
0000000100000f5f 31c0 xorl %eax, %eax
0000000100000f61 4883c410 addq $0x10, %rsp
0000000100000f65 5d popq %rbp
0000000100000f66 c3 retq
What are the "nop" instructions found right after the "ret" on functions foo1() and foo2()? They are, of course, never executed since the "ret" instructions return from the function call. Is that any kind of padding or it has a different meaning?
From the Assembly language for x86 processors, Kip R. Irvine
The safest (and the most useless) instruction you can write is called NOP (no operation). It takes up 1 byte of program storage and doesn’t do any work. It is sometimes used by compilers and assemblers to align code to even-address boundaries
00000000 66 8B C3 mov ax,bx
00000003 90 nop ; align next instruction
00000004 8B D1 mov edx,ecx
What are the "nop" instructions found right after the "ret" on functions foo1() and foo2()?
The nop is a no-operation instruction (do nothing), from the linked Wikipedia page (emphasis mine)
A NOP is most commonly used for timing purposes, to force memory alignment, to prevent hazards, to occupy a branch delay slot, to render void an existing instruction such as a jump, or as a place-holder to be replaced by active instructions later on in program development (or to replace removed instructions when refactoring would be problematic or time-consuming).
nop is short for No Operation. The nop instructions in this case are providing execution code alignment. Notice that labels are on 16 byte boundaries. On OSX, the linker (ld) should have a -segalign option that will affect this behavior.
I'm experimenting disassembling clang binaries of simple C programs (compiled with -O0), and I'm confused about a certain instruction that gets generated.
Here are two empty main functions with standard arguments, one of which returns value and other does not:
// return_void.c
void main(int argc, char** argv)
{
}
// return_0.c
int main(int argc, char** argv)
{
return 0;
}
Now, when I disassemble their assemblies, they look reasonably different, but there's one line that I don't understand:
return_void.bin:
(__TEXT,__text) section
_main:
0000000000000000 pushq %rbp
0000000000000001 movq %rsp, %rbp
0000000000000004 movl %edi, -0x4(%rbp)
0000000000000007 movq %rsi, -0x10(%rbp)
000000000000000b popq %rbp
000000000000000c retq
return_0.bin:
(__TEXT,__text) section
_main:
0000000100000f80 pushq %rbp
0000000100000f81 movq %rsp, %rbp
0000000100000f84 xorl %eax, %eax # We return with EAX, so we clean it to return 0
0000000100000f86 movl $0x0, -0x4(%rbp) # What does this mean?
0000000100000f8d movl %edi, -0x8(%rbp)
0000000100000f90 movq %rsi, -0x10(%rbp)
0000000100000f94 popq %rbp
0000000100000f95 retq
It only gets generated when I use the function is not void, so I thought that it might be another way to return 0, but when I changed the returned constant, this line didn't change at all:
// return_1.c
int main(int argc, char** argv)
{
return 1;
}
empty_return_1.bin:
(__TEXT,__text) section
_main:
0000000100000f80 pushq %rbp
0000000100000f81 movq %rsp, %rbp
0000000100000f84 movl $0x1, %eax # Return value modified
0000000100000f89 movl $0x0, -0x4(%rbp) # This value is not modified
0000000100000f90 movl %edi, -0x8(%rbp)
0000000100000f93 movq %rsi, -0x10(%rbp)
0000000100000f97 popq %rbp
0000000100000f98 retq
Why is this line getting generated and what is it's purpose?
The purpose of that area is revealed by the following code
int main(int argc, char** argv)
{
if (rand() == 42)
return 1;
printf("Helo World!\n");
return 0;
}
At the start it does
movl $0, -4(%rbp)
then the early return looks as follows
callq rand
cmpl $42, %eax
jne .LBB0_2
movl $1, -4(%rbp)
jmp .LBB0_3
and then at the end it does
.LBB0_3:
movl -4(%rbp), %eax
addq $32, %rsp
popq %rbp
retq
So, this area is indeed reserved to store the function return value. It doesn't appear to be terribly necessary and it is not used in optimized code, but in -O0 mode that's the way it works.
clang is making space on the stack for the arguments (registers edi and rsi) and puts the value 0 on the stack, too, for some reason. I assume that clang compiles your code to an SSA-representation like this:
int main(int argc, char** argv)
{
int a;
a = 0;
return a;
}
This would explain why a stack slot is allocated. If clang does constant propagation, too, this would explain why eax is zeroed out instead of being loaded from -4(%rbp). In general, don't think too much about dubious constructs in unoptimized assembly. After all, you forbade the compiler from removing useless code.
movl $0x0,-0x4(%rbp)
This instruction stores 0 at %rbp - 4. It seems that clang allocates a hidden local variable for an implicit return value from main.
From the clang mailing list:
Yes. We allocate an implicit local variable to hold the return value;
return statements then just initialize the return slot and jump to the
epilogue, where the slot is loaded and returned. We don't use a phi
because the control flow for getting to the epilogue is not
necessarily as simple as a simple branch, due to cleanups in local
scopes (like C++ destructors).
Implicit return values like main's are handled with an implicit store
in the prologue.
Source: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2012-February/019767.html
According the the standard (for hosted environments), 5.1.2.2.1, main is required to returrn an int result. So do not expect defined behavior if violating this.
Furthermore, main is actually _not required to explicitly return 0; this is implicitly returned if it reaches the end of the function. (Note this is only for main, which also does not have a prototype.