I am following a tutorial that contains C code like this:
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
}
void main() {
function(1,2,3);
}
The author says that after compiling this code using gcc with the -S switch, we will get assembly output like this:
pushl $3
pushl $2
pushl $1
call function
However, when I compile it, my main part looks like this:
main:
.LFB1:
.cfi_startproc
endbr64
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $3, %edx
movl $2, %esi
movl $1, %edi
call function
nop
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
Do both mean the same thing, or am I doing something wrong somewhere?
The GCC compiler on CSLab translates the following C function:
int func(int x) {
return 13 + x;
}
Into the following Assembly code:
func:
pushq %rbp
movq %rsp, %rbp
movl %edi, -4(%rbp)
movl -4(%rbp), %eax
addl $13, %eax
popq %rbp
ret
I have completed this code and was then asked the following question:
In the Assembly code for func shown in the previous question, suppose %rsp has the value
0x7fffffffe3e0
What is the address corresponding to the parameter (local variable) x? Include the 0x prefix.
(Note that the address has 12 significant hex digits, or 6 bytes. The value for the top two hex digits is 0. Omit the 0s to the left just as shown above.)
I answered 0xd and it was incorrect.
Taking the given value for %rsp as 0x7fffffffe3e0, we have
movq %rsp, %rbp
which copies the value of %rsp to %rbp, then we have
movl -4(%rbp), %eax
addl $13, %eax
We copy something to %eax and add 13 to it, so that something must be x. That something is -4(%rbp), which translates to "an object 4 bytes below the address value stored in %rbp".
Thus, the address of x must be 0x7fffffffe3e0 - 4, or 0x7fffffffe3dc.
Read up on your x86 assembly addressing modes.
I have the following C program to see how the main function is called with argc and argv:
#include <stdio.h>
int main(int argc, char *argv[]) {
// use a non-zero value so we can easily tell it ran properly with echo $?
return 3;
}
The non-optimized assembly output from $ gcc ifile.c -S -o ifile.s is:
.file "ifile.c"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp) <== here
movq %rsi, -16(%rbp) <== here
movl $3, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
.section .note.GNU-stack,"",@progbits
I understand this, with the exception of the two lines just before the return value is moved into %eax:
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
What are these two lines doing? My guess is that the first line, since it has an offset of 4, stores the integer value of argc, and the second stores an (8-byte) pointer, padded to 16, for the strings that can be passed in argv. Is this a correct understanding? Where can I learn more about, not so much the full ABI, but the specific details and internals of how the main() function gets invoked?
I want to test my code (I know my code is still incomplete -- yes I am planning to complete it before I compile it) to see if it gives the correct assembly code by compiling with the -S switch. How do I do this?
I am not very familiar with compiling. All I did so far was save my file. Now I need to compile it to be able to run it.
typedef enum {MODE_A, MODE_B, MODE_C, MODE_D, MODE_E} mode_t;
long switch3 (long *p1, long *p2, mode_t action) {
long result = 0;
switch(action){
case MODE_A:
case MODE_B:
case MODE_C:
case MODE_D:
case MODE_E:
default:; // don't forget the colon
}
return result;
}
1. Open an editor, Vi or Emacs for example.
2. Type and save your code in a file, maybe main.c.
3. Exit the editor.
4. Type gcc -S main.c or clang -S main.c in the terminal. You can also add the -fverbose-asm flag to tell the compiler to annotate the output with more information, or the -masm=intel flag to get the assembly in Intel syntax, which many find easier to read.
5. On success, a file named main.s will be generated in the current directory, containing the assembly code; on failure, error messages will be printed on the screen.
Also note that your C code will only compile if it is valid, so you have to fix it first. At the very least, change default; to default:;
Here is the assembly code produced by clang -S main.c on my machine:
.section __TEXT,__text,regular,pure_instructions
.macosx_version_min 10, 11
.globl _switch3
.align 4, 0x90
_switch3: ## @switch3
.cfi_startproc
## BB#0:
pushq %rbp
Ltmp0:
.cfi_def_cfa_offset 16
Ltmp1:
.cfi_offset %rbp, -16
movq %rsp, %rbp
Ltmp2:
.cfi_def_cfa_register %rbp
movq %rdi, -8(%rbp)
movq %rsi, -16(%rbp)
movl %edx, -20(%rbp)
movq $0, -32(%rbp)
movl -20(%rbp), %edx
subl $4, %edx
movl %edx, -36(%rbp) ## 4-byte Spill
ja LBB0_2
jmp LBB0_1
LBB0_1:
jmp LBB0_2
LBB0_2:
jmp LBB0_3
LBB0_3:
movq -32(%rbp), %rax
popq %rbp
retq
.cfi_endproc
.subsections_via_symbols
To compile to assembly without assembling or linking using the GNU Compiler Collection (gcc), you can use the -S switch:
jan@jsn-dev:~/src/so> gcc -S main.c
main.c: In function ‘switch3’:
main.c:11:12: error: expected ‘:’ before ‘;’ token
default;
^
After correcting your code with the suggested fix, you get:
jan@jsn-dev:~/src/so> gcc -S main.c
jan@jsn-dev:~/src/so> cat main.s
.file "main.c"
.text
.globl switch3
.type switch3, @function
switch3:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movq %rdi, -24(%rbp)
movq %rsi, -32(%rbp)
movl %edx, -36(%rbp)
movq $0, -8(%rbp)
movq -8(%rbp), %rax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size switch3, .-switch3
.ident "GCC: (SUSE Linux) 4.8.3 20140627 [gcc-4_8-branch revision 212064]"
.section .note.GNU-stack,"",@progbits
How is it possible to disassemble C code? I have already read a few questions here (Stack Overflow). But if you want to disassemble, you need machine code, so how do I do this with NASM? For example, if I write a Hello World program in C, how can I disassemble it?
NASM is a bad idea: it is an assembler, not a disassembler. There are a few options. IDA Pro has given me some success, but if you really know your assembly, you can run nm to find the symbols, then hexdump the code from there and manually make assembly out of it. There really isn't a way to use NASM to produce recompilable code, though.
otool (or objdump) will produce assembly.
If you need an example, here is one:
#include <stdio.h>
int main(int argc, char **argv)
{
printf("Hello, World\n");
}
nm output:
hydrogen:tmp phyrrus9$ nm a.out
0000000100000000 T __mh_execute_header
0000000100000f40 T _main
U _printf
U dyld_stub_binder
otool output:
hydrogen:tmp phyrrus9$ otool -tv a.out
a.out:
(__TEXT,__text) section
_main:
0000000100000f40 pushq %rbp
0000000100000f41 movq %rsp, %rbp
0000000100000f44 subq $0x10, %rsp
0000000100000f48 leaq 0x37(%rip), %rdi ; this is our string
0000000100000f4f movb $0x0, %al
0000000100000f51 callq 0x100000f66 ; call printf
0000000100000f56 movl $0x0, %ecx
0000000100000f5b movl %eax, 0xfffffffffffffffc(%rbp)
0000000100000f5e movl %ecx, %eax
0000000100000f60 addq $0x10, %rsp
0000000100000f64 popq %rbp
0000000100000f65 ret
hexdump output not shown.
Actual assembly:
hydrogen:tmp phyrrus9$ cat tmp.s
.section __TEXT,__text,regular,pure_instructions
.globl _main
.align 4, 0x90
_main: ## @main
.cfi_startproc
## BB#0:
pushq %rbp
Ltmp2:
.cfi_def_cfa_offset 16
Ltmp3:
.cfi_offset %rbp, -16
movq %rsp, %rbp
Ltmp4:
.cfi_def_cfa_register %rbp
subq $16, %rsp
leaq L_.str(%rip), %rdi
movb $0, %al
callq _printf
movl $0, %ecx
movl %eax, -4(%rbp) ## 4-byte Spill
movl %ecx, %eax
addq $16, %rsp
popq %rbp
ret
.cfi_endproc
.section __TEXT,__cstring,cstring_literals
L_.str: ## @.str
.asciz "Hello, world!\n"
.subsections_via_symbols
Hope this helps you get a grasp.
I'm studying NASM on 64-bit Linux and have been trying to implement some code examples. However, I ran into a problem with the following example. The function donothing is implemented in NASM and is supposed to be called from a program written in C:
File main.c:
#include <stdio.h>
#include <stdlib.h>
int donothing(int, int);
int main() {
printf(" == %d\n", donothing(1, 2));
return 0;
}
File first.asm
global donothing
section .text
donothing:
push rbp
mov rbp, rsp
mov eax, [rbp-0x4]
pop rbp
ret
All donothing is supposed to do is return the value of its first parameter. But when donothing is called, the value 0 is printed instead of 1. I tried rbp+0x4, but that doesn't work either.
I compile the files using the following command:
nasm -f elf64 first.asm && gcc first.o main.c
When I compile the function 'test' in C with gcc -S, the generated assembly that reads the parameters looks similar to donothing:
int test(int a, int b) {
return a > b;
}
Assembly generated by gcc for the function 'test' above:
test:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp)
movl %esi, -8(%rbp)
movl -4(%rbp), %eax
cmpl -8(%rbp), %eax
setg %al
movzbl %al, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
So, what's wrong with donothing?
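In the x86-64 System V calling convention (the one used on Linux), the first few integer parameters are passed in registers rather than on the stack. In your case you should find the 1 and 2 in EDI and ESI (the low halves of RDI and RSI). Reading [rbp-0x4] just picks up whatever happens to be in memory below the saved frame pointer, because nothing stored the argument there.
As you can see in the compiled C code, it takes a from edi and b from esi (although the unoptimized code goes through an unnecessary intermediate step by first placing them in memory).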