Comparison of godbolt assembly of basic C program [duplicate] - c

This question already has answers here:
Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?
(1 answer)
How many bytes does the push instruction push onto the stack when I don't specify the operand size?
(2 answers)
Does each PUSH instruction push a multiple of 8 bytes on x64?
(2 answers)
What's the difference between 'push' and 'pushq' in at&t assembly
(1 answer)
How to determine the appropriate MOV instruction suffix based on the operands?
(1 answer)
Closed 2 years ago.
I've written the following basic C program:
int main() {
char a = 1;
char b = 5;
return a + b;
}
And it compiles in godbolt as:
main:
pushq %rbp
movq %rsp, %rbp
movb $1, -1(%rbp)
movb $5, -2(%rbp)
movsbl -1(%rbp), %edx
movsbl -2(%rbp), %eax
addl %edx, %eax
popq %rbp
ret
I have a few questions about the compiled asm:
Is movb used for 1-byte (char), movw for 2-byte (short), movl for 4-byte (int), and movq for 8-byte (long) integers? What then is just mov used for, without a suffix?
Why is an offset used for movb $1, -1(%rbp) and movb $5, -2(%rbp)? Why aren't the two numbers just moved into two different registers? For example, there's an addl %edx, %eax later on, so why aren't the two numbers just moved into those two registers?
Why is movsbl used here? Why aren't the numbers just moved directly into the registers?
Is pushq / popq pushing/popping an 8byte pointer onto the stack? If so, what's the point of the movq %rsp, %rbp?

Related

Addresses in C Language

The GCC compiler on CSLab translates the following C function:
int func(int x) {
return 13 + x;
}
Into the following Assembly code:
func:
pushq %rbp
movq %rsp, %rbp
movl %edi, -4(%rbp)
movl -4(%rbp), %eax
addl $13, %eax
popq %rbp
ret
I have completed this code and was then asked the following question:
In the Assembly code for func shown in the previous question, suppose %rsp has the value
0x7fffffffe3e0
What is the address corresponding to the parameter (local variable) x? Include the 0x prefix.
(Note that the address has 12 significant hex digits, or 6 bytes. The value for the top two bytes is 0. Omit the 0s to the left just as shown above.)
I answered 0xd and it was incorrect.
Taking the given value for %rsp as 0x7fffffffe3e0 (read as its value after pushq %rbp has run, since the push itself lowers %rsp by 8), we have
movq %rsp, %rbp
which copies the value of %rsp to %rbp, then we have
movl -4(%rbp), %eax
addl $13, %eax
We copy something to %eax and add 13 to it, so that something must be x. That something is -4(%rbp), which translates to "an object 4 bytes below the address value stored in %rbp".
Thus, the address of x must be 0x7fffffffe3e0 - 4, or 0x7fffffffe3dc.
Read up on your x86 assembly addressing modes.

Initializing an array in C/asm [duplicate]

This question already has answers here:
how does array[100] = {0} set the entire array to 0?
(4 answers)
Why is there no "sub rsp" instruction in this function prologue and why are function parameters stored at negative rbp offsets?
(2 answers)
Compiler using local variables without adjusting RSP
(1 answer)
Closed 2 years ago.
The following C function:
int main(void) {
char name[10] = "H";
}
Produces the following (unoptimized) assembly in Compiler Explorer:
main:
pushq %rbp
movq %rsp, %rbp
movq $72, -10(%rbp)
movw $0, -2(%rbp) <--------- ??
movl $0, %eax
popq %rbp
ret
What does the line above do? I would think we would want to null-terminate the string by adding a $0, but I don't understand why it's being added at -2.

What are the four execution steps before entering main? [duplicate]

This question already has answers here:
Function Prologue and Epilogue in C
(4 answers)
Closed 3 years ago.
Take the following 5-line file I have:
#include <stdio.h>
int main() {
printf("Hello");
return 0;
}
It corresponds to the following assembly:
main:
0x100000f60 <+0>: pushq %rbp
0x100000f61 <+1>: movq %rsp, %rbp
0x100000f64 <+4>: subq $0x10, %rsp
0x100000f68 <+8>: movl $0x0, -0x4(%rbp)
-> 0x100000f6f <+15>: leaq 0x34(%rip), %rdi ; "Hello"
We can notice the first line in main which prints "Hello" corresponds to the fifth instruction. What are the four preceding instructions: what do they do?
0x100000f60 <+0>: pushq %rbp
Push the caller's base pointer.
0x100000f61 <+1>: movq %rsp, %rbp
Copy the stack pointer into the base pointer (set up this function's stack frame)
0x100000f64 <+4>: subq $0x10, %rsp
Reserve stack space (presumably for the return value - you probably didn't compile this program with any optimizations enabled)
0x100000f68 <+8>: movl $0x0, -0x4(%rbp)
Put the return value (zero) on the stack.
-> 0x100000f6f <+15>: leaq 0x34(%rip), %rdi ; "Hello"
Load a pointer to the "Hello" string literal into rdi register.

Passing 128 bit register to C function from Assembly [duplicate]

This question already has an answer here:
Printing floating point numbers from x86-64 seems to require %rbp to be saved
(1 answer)
Closed 5 years ago.
I am attempting to test passing a floating point value to a C function from assembly on 64-bit Linux. The C file containing my C function looks like this:
#include <stdio.h>
extern void printer(double k){
printf("%f\n",k);
}
Its expected behavior is to simply print the floating point number passed to it. I am trying to accomplish this from an AT&T-syntax assembly file. If I am not mistaken, in 64-bit Linux, the calling convention is to pass floating point arguments in the XMM registers. My .s file is the following:
.extern printer
.data
var:
.double 120.1
.global main
main:
movups (var),%xmm0
call printer
mov $60,%rax
syscall
What I'm hoping this could do is have a variable (var) with value 120.1. This is then moved to the xmm0 register, which I expect is what is used to pass the argument k. This understanding of the calling convention is also backed up by the assembly code generated from the C file, a portion of which is below:
printer:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movsd %xmm0, -8(%rbp)
movq -8(%rbp), %rax
movq %rax, -16(%rbp)
movsd -16(%rbp), %xmm0
movl $.LC0, %edi
movl $1, %eax
call printf
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
My .s file assembles to an executable, but running it only gives a segmentation fault, and doesn't print the floating point value. I can only assume this is because I'm not properly moving the value to xmm0 and/or using the register to pass it to the function. Can somebody explain how I should pass the value to the function?
You have defined main in the data section, which makes it non-executable. Add a .text directive before main.

What do the instructions mov %edi and mov %rsi do?

I've written a basic C program that defines an integer variable x, sets it to zero and returns the value of that variable:
#include <stdio.h>
int main(int argc, char **argv) {
int x;
x = 0;
return x;
}
When I dump the object code using objdump (compiled on Linux X86-64 with gcc):
0x0000000000400474 <main+0>: push %rbp
0x0000000000400475 <main+1>: mov %rsp,%rbp
0x0000000000400478 <main+4>: mov %edi,-0x14(%rbp)
0x000000000040047b <main+7>: mov %rsi,-0x20(%rbp)
0x000000000040047f <main+11>: movl $0x0,-0x4(%rbp)
0x0000000000400486 <main+18>: mov -0x4(%rbp),%eax
0x0000000000400489 <main+21>: leaveq
0x000000000040048a <main+22>: retq
I can see the function prologue, but before we set x to 0 at address 0x000000000040047f there are two instructions that move %edi and %rsi onto the stack. What are these for?
In addition, unlike where we set x to 0, the mov instruction as shown in GAS syntax does not have a suffix.
If the suffix is not specified, and there are no memory operands for the instruction, GAS infers the operand size from the size of the destination register operand.
In this case, are -0x14(%rbp) and -0x20(%rbp) both memory operands, and what are their sizes? Since %edi is a 32-bit register, are 32 bits moved to -0x14(%rbp), whereas since %rsi is a 64-bit register, 64 bits are moved to -0x20(%rbp)?
In this simple case, why don't you ask your compiler directly? For GCC, clang and ICC there's the -fverbose-asm option.
main:
pushq %rbp #
movq %rsp, %rbp #,
movl %edi, -20(%rbp) # argc, argc
movq %rsi, -32(%rbp) # argv, argv
movl $0, -4(%rbp) #, x
movl -4(%rbp), %eax # x, D.2607
popq %rbp #
ret
So, yes, they save argc and argv onto the stack using the "old" frame pointer method. Newer architectures allow subtracting from and adding to the stack pointer directly, so the frame pointer can be omitted (-fomit-frame-pointer).
Purpose of ESI & EDI registers?
Based on this and the context, I'm not an expert, but my guess is these are capturing the main() input parameters. EDI takes a standard width, which would match the int argc, whereas RSI takes a long, which would match the char **argv pointer.
