Understanding argc and argv in main() assembly [duplicate] - c

This question already has answers here:
gcc argument register spilling on x86-64 (2 answers)
Why does clang produce inefficient asm with -O0 (for this simple floating point sum)? (1 answer)
x86 explanation, number of function arguments and local variables (2 answers)
Closed 2 years ago.
I have the following C program to see how the main function is called with argc and argv as follows:
#include <stdio.h>
int main(int argc, char *argv[]) {
// use a non-zero value so we can easily tell it ran properly with echo $?
return 3;
}
And the non-optimized assembly output with $ gcc ifile.c -S -o ifile.s gives us:
.file "ifile.c"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp) <== here
movq %rsi, -16(%rbp) <== here
movl $3, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
.section .note.GNU-stack,"",@progbits
I understand this with the exception of the two lines above preceding moving the return value into %eax:
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
What are these two lines doing? I am guessing the first line since it has an offset of 4 is populating the integer value of argc, and the second argument is passing an (8-byte) pointer padded to 16 for the strings that can be passed in the argv. Is this a correct understanding of these items? Where can I learn more about, not so much the full ABI, but the specific details/internals about how the main() function gets invoked and such?
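One way to check the widths involved is to look at the sizes of main's two parameters. The sketch below assumes an x86-64 System V target such as the Ubuntu GCC build shown above, where argc arrives in %edi as a 4-byte int and argv arrives in %rsi as an 8-byte pointer. The helper names count_args and print_arg_widths are hypothetical and not part of the question.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: counts arguments by walking argv up to the
   terminating NULL pointer that the C standard places at argv[argc]. */
int count_args(char *argv[])
{
    int n = 0;
    while (argv[n] != NULL)
        n++;
    return n;
}

/* Hypothetical helper: prints the parameter sizes. On x86-64 Linux this
   is expected to report 4 and 8, matching the movl and movq stores. */
void print_arg_widths(void)
{
    printf("sizeof(int) = %zu, sizeof(char **) = %zu\n",
           sizeof(int), sizeof(char **));
}
```

Under these assumptions, -4(%rbp) holds the 4-byte argc and -16(%rbp) is simply the next 8-byte-aligned slot below it for the argv pointer, rather than a pointer padded to 16 bytes.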

Related

How is x=x+1 evaluated by the compiler and how is it represented in assembly?

I'm trying to understand how the compiler "sees" the i+1 part of the expression i=i+1. I understand that i=3 means putting the value 3 in the memory location of variable i.
My guess about i=i+1 is that the compiler expects a value on the right side of the "=" operator, so it reads the value from the memory location of variable i (which is 3, after the assignment), adds 1 to it, and stores the final result of the "i+1" expression (3+1=4) back into the memory location of variable i. Is that correct?
If it is, it means that any variable or combination of variables and literals on the right side of an "=" operator is always replaced with the value stored in it, and those values can be added, subtracted, etc. with the values of other variables or literals (as in the x+1 expression), while the final result of those calculations is also a plain value (e.g. 5, a literal string, etc.) and is stored in the single variable on the left side of the "=" operator.
I'm also curious how this code looks in assembly, and what the main operations of this increment of i (i = i+1) are:
#include <stdio.h>
int main()
{
int i = 3;
i = i + 1; // i should have the value of 4 stored back in it;
return 0;
}
This is not answerable for the general case. It depends on the target platform. If you want to inspect the assembly, you can do so with the -S parameter with gcc. When I did that to your code, it gave me this:
/tmp$ cat main.s
.file "main.c"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $3, -4(%rbp)
addl $1, -4(%rbp)
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Debian 9.2.1-8) 9.2.1 20190909"
.section .note.GNU-stack,"",@progbits
A brief explanation of what is happening here. First we push the caller's base pointer (%rbp) onto the stack, so that it can be restored before we return.
.cfi_startproc
pushq %rbp
Then we set up the stack frame by copying the stack pointer into %rbp. Local variables are then addressed relative to %rbp.
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
Then we have this. Comments are mine.
movl $3, -4(%rbp) # i = 3;
addl $1, -4(%rbp) # i = i + 1;
Lastly, we return from the main function
movl $0, %eax # Store 0 in the "return register"
popq %rbp # Restore the caller's base pointer
.cfi_def_cfa 7, 8
ret # return
Note that there is not a 1-1 relationship between lines of C and lines of assembly, not even for very simple lines.
Please also note that C imposes requirements on the observable behavior of the program and not on the generated assembly. So, for instance, a compiler might remove the whole body of the main function because the variable i is not used in an observable way. And it will if you enable optimization. When I recompiled your code with -O3 I got this instead:
/tmp/$ cat main.s
.file "main.c"
.text
.section .text.startup,"ax",@progbits
.p2align 4
.globl main
.type main, @function
main:
.LFB11:
.cfi_startproc
xorl %eax, %eax
ret
.cfi_endproc
.LFE11:
.size main, .-main
.ident "GCC: (Debian 9.2.1-8) 9.2.1 20190909"
.section .note.GNU-stack,"",@progbits
Notice how much got removed from main. It is also interesting that movl $0, %eax has changed to xorl %eax, %eax. If you think about it, it's pretty obvious that this is a "set to zero" operation. One could reasonably ask why anyone would write it like that. Well, the optimizer certainly does not optimize for readability. There are a few reasons why the xor form is better. You can read about them here: What is the best way to set a register to zero in x86 assembly: xor, mov or and?
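The point about observable behavior can be tried out with a small variation on the question's code. This is a sketch, not taken from the answer: the function names are hypothetical, and volatile is used here only as one way to make the accesses to i observable so that the compiler is not allowed to drop them.

```c
/* Plain local: only the returned value is observable, so an optimizing
   compiler is free to fold this function down to "return 4". */
int increment_plain(void)
{
    int i = 3;
    i = i + 1;
    return i;
}

/* volatile local: every read and write of i counts as observable,
   so the read of i and the write back to i must remain in the output. */
int increment_volatile(void)
{
    volatile int i = 3;
    i = i + 1;
    return i;
}
```

Compiling both functions with gcc -O3 -S and comparing the two bodies should show the difference. Both still return 4.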

Is it possible to convert given C code to x86 assembly?

I'm working on an AWD obstacle avoidance robot in x86 assembly. I can find programs that already do this in C, but I can't find one written in x86 assembly.
How do I convert this C code to x86 assembly code?
The whole part of codes here:
http://www.mertarduino.com/arduino-obstacle-avoiding-robot-car-4wd/2018/11/22/
void compareDistance() // find the longest distance
{
if (leftDistance>rightDistance) //if left is less obstructed
{
turnLeft();
}
else if (rightDistance>leftDistance) //if right is less obstructed
{
turnRight();
}
else //if they are equally obstructed
{
turnAround();
}
}
int readPing() { // read the ultrasonic sensor distance
delay(70);
unsigned int uS = sonar.ping();
int cm = uS/US_ROUNDTRIP_CM;
return cm;
}
How do I convert this C code to x86 assembly code?
Converting source code to assembly is basically what a compiler does, so just compile it. Most (if not all) compilers have the option of outputting the intermediate assembly code.
If you use gcc -S main.c you will get a file called main.s containing the assembly code.
Here is an example:
$ cat hello.c
#include <stdio.h>
void print_hello() {
puts("Hello World!");
}
int main() {
print_hello();
}
$ gcc -S hello.c
$ cat hello.s
.file "hello.c"
.text
.section .rodata
.LC0:
.string "Hello World!"
.text
.globl print_hello
.type print_hello, @function
print_hello:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
leaq .LC0(%rip), %rdi
call puts@PLT
nop
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size print_hello, .-print_hello
.globl main
.type main, @function
main:
.LFB1:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $0, %eax
call print_hello
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1:
.size main, .-main
.ident "GCC: (Debian 8.3.0-6) 8.3.0"
.section .note.GNU-stack,"",@progbits
How do I convert this C code to x86 assembly code?
You can use the gcc -m32 -S main.c command to do that, where:
the -S flag indicates that the output must be assembly,
the -m32 flag indicates that you want to produce i386 (32-bit) output.
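To experiment with gcc -S on the decision logic from the question without the Arduino libraries, the function can be made stand-alone. The version below is a sketch under assumptions: the enum and the name compare_distance are hypothetical, and the distances are passed as parameters instead of being read from the sketch's global variables.

```c
/* Hypothetical stand-in for the robot's three turn functions. */
enum turn { TURN_LEFT, TURN_RIGHT, TURN_AROUND };

/* Same decision logic as compareDistance(), but taking the distances as
   parameters and returning the choice instead of driving the motors. */
enum turn compare_distance(int left_distance, int right_distance)
{
    if (left_distance > right_distance)
        return TURN_LEFT;      /* left is less obstructed */
    else if (right_distance > left_distance)
        return TURN_RIGHT;     /* right is less obstructed */
    else
        return TURN_AROUND;    /* equally obstructed */
}
```

Saved as, say, robot.c, this can be fed to gcc -S robot.c (or gcc -m32 -S robot.c) to inspect the compare-and-branch code the compiler generates.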

Understanding an unexpected result due to an unmatched prototype (C89)

I have a program goo.c
void foo(double);
#include <stdio.h>
void foo(int x){
printf ("in foo.c:: x= %d\n",x);
}
which is called by foo.c
int main(){
double x=3.0;
foo(x);
}
I compile and run
gcc foo.c goo.c
./a.out
Guess what? I get "x= 1" as result. Then I find the signature of 'foo' should have been void foo(int). Apparently, my double input value 3.0 has to be downcast to an int. But, if I try to see the value of (int) 3.0 with the test program:
int main(){
double x=3.0;
printf ("%d", ((int) x));
}
I get 3 as output, which makes the earlier "x= 1" even harder to understand. Any idea? For information, my gcc is run with the ANSI C standard. Thanks.
[EDIT] If I use gcc -S as suggested by JS1,
I get goo.s
.file "goo.c"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $32, %rsp
movabsq $4613937818241073152, %rax
movq %rax, -8(%rbp)
movq -8(%rbp), %rax
movq %rax, -24(%rbp)
movsd -24(%rbp), %xmm0
call foo
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 4.8.2-19ubuntu1) 4.8.2"
.section .note.GNU-stack,"",@progbits
and foo.s
.file "foo.c"
.section .rodata
.LC0:
.string "in foo.c:: x= %d\n"
.text
.globl foo
.type foo, @function
foo:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movl %edi, -4(%rbp)
movl -4(%rbp), %eax
movl %eax, %esi
movl $.LC0, %edi
movl $0, %eax
call printf
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size foo, .-foo
.ident "GCC: (Ubuntu 4.8.2-19ubuntu1) 4.8.2"
.section .note.GNU-stack,"",@progbits
Can anyone who knows how to read assembly help figure out the source of the problem?
Understanding why you get '1' requires a bit of ASM and x86-64 ABI
knowledge. First of all, goo.c and foo.c are two separate compilation
units. The only thing that foo.c knows about the foo function is
the bogus prototype.
The bogus prototype is as follows: void foo(double);. It's a function
that takes only a single double argument. The x86-64 ABI mandates that
doubles are passed through the xmm registers (the exact phrasing
is 'If the class is SSE, the next available vector register is used,
the registers are taken in the order from %xmm0 to %xmm7.').
That means that when the compiler sets up the arguments to call the
foo() function, it's going to pass the argument via %xmm0. In
simplified asm what happens is:
mov 3.0, %xmm0
call foo
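The "3.0" in the simplified assembly corresponds to the large constant in the movabsq instruction of the real listing. The sketch below shows one way to confirm that; it assumes an IEEE-754 binary64 double with the same size as uint64_t, and the helper name double_bits is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Returns the raw 64-bit pattern of a double. memcpy is used because it
   is a well-defined way to reinterpret an object's bytes in C. */
uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

Under those assumptions double_bits(3.0) is 0x4008000000000000, which is 4613937818241073152 in decimal, the immediate that main moves through %rax into %xmm0.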
Now, foo(), on its side, believes it's going to receive an int. The
x86-64 ABI says: 'If the class is INTEGER, the next available register
of the sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9 is used.'. The first
argument is supposed to be passed via %rdi. That means that foo()
will do something like:
mov %rdi, %rsi
mov 0xabcd, %rdi // 0xabcd being the address of the "%d" string
call printf
So you're going to end up printing whatever was in %rdi when foo() was entered (copied into %rsi for printf), and not %xmm0.
But why 1? You'll get an idea by issuing the following commands:
./a.out a
./a.out a b
./a.out a b c
See a pattern? Let's go back to the simplified assembly:
main:
mov 3.0, %xmm0
call foo
ret
foo:
mov %rdi, %rsi
mov 0xabcd, %rdi // 0xabcd being the address of the "%d" string
call printf
ret
As you can see, nothing sets %rdi before foo() is reached,
where its value is passed on to printf. That means 1 was passed to main
in the first place. Now, in the question, main is given the following
prototype: int main(). But the startup code actually calls it as if it
had the following prototype instead: int main (int argc, char *argv[],
char *envp[]). The first argument, stored in %rdi, is therefore
argc. That's why the program was printing 1.
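For contrast, here is a sketch of what happens when the caller sees a prototype that matches the definition. It is a variation on the question's code, not the original: foo returns its argument so the result can be checked, call_foo_with_double is a hypothetical wrapper, and in a real project the prototype would normally live in a shared header included by both source files.

```c
#include <stdio.h>

/* Correct prototype; in a real project this would live in a shared
   header (a hypothetical foo.h) included by both source files. */
int foo(int x);

/* Variation on the question's foo(): it also returns x so the value
   it received can be checked by the caller. */
int foo(int x)
{
    printf("in foo.c:: x= %d\n", x);
    return x;
}

/* With the right prototype in scope, the compiler converts the double
   argument to int before the call, so the value is passed as an int. */
int call_foo_with_double(void)
{
    double x = 3.0;
    return foo(x);
}
```

With the matching prototype the conversion happens at the call site, and foo receives 3 instead of whatever happened to be in %rdi.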

Why is a variable initialized before usage faster than an uninitialized variable [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
Closed 8 years ago.
For example, variable i in the C code below:
t03a.c:
#include <stdio.h>
int main(void)
{
int a=0;
int b=1;
int i=0;//When a variable is declared,the variable is initialised
for(i=0;i<1000000000;++i);
return 0;
}
t03b.c:
#include <stdio.h>
int main(void)
{
int a=0;
int b=1;
int i; // not initialized
for(i=0;i<1000000000;++i);
return 0;
}
The result (test by linux time):
t03a:
real 0m0.527s
user 0m0.250s
sys 0m0.004s
t03b:
real 0m2.499s
user 0m2.431s
sys 0m0.003s
Of course, I ran the test many times. Why is t03a faster than t03b?
I ran the test a couple of times. Below is a fairly typical result for both programs:
[xxx@arch-desktop: ~/]$ time ./t03a
real 0m2.819s
user 0m2.817s
sys 0m0.000s
[xxx@arch-desktop: ~/]$ time ./t03b
real 0m2.815s
user 0m2.813s
sys 0m0.000s
Compiled with gcc -std=c99 -o t03a t03a.c. Maybe you can try using an optimization flag (e.g. -O3).
I don't think there should be any difference if it's compiled in the right way. Declaring and defining it directly shouldn't be any different from defining it later.
On my machine (an Intel x86 with gcc), both routines compile to almost identical assembly. The only difference is that 0 is stored to i's stack slot twice in the version that initializes i at its declaration (once for the initializer and once for the loop's i=0). I can't imagine this taking much time.
Compiled with -O1 I get identical (if not especially useful) code for both examples.
Initializing variables is a useful practice for code style and for avoiding some null and logic errors.
That is the main purpose of initialization.
In my tests on Linux, however, there was no significant performance difference between t03a.c and t03b.c.
Be careful when analysing the results, because the temperature of the CPU, the mainboard, etc. also influence the execution time (you need to take the relative and absolute error into account if you want to demonstrate your claim when the difference is so small).
In my case the assembly code from t03a.c is the same as that from t03b.c, but with one more instruction. So I guess you were doing something else at the time and the scheduler gave another process a little more execution time, or maybe there was another reason. However, the execution times aren't that large, and the difference is too small to conclude anything: it is only one instruction (keep in mind that you are executing 1000000000+ instructions in less than 3 seconds, so 1 instruction won't make a perceptible difference).
.file "t03a.c"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $0, -8(%rbp)
movl $1, -12(%rbp)
movl $0, -4(%rbp)
movl $0, -4(%rbp)
jmp .L2
.L3:
addl $1, -4(%rbp)
.L2:
cmpl $999999999, -4(%rbp)
jle .L3
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (GNU) 4.8.3 20140911 (Red Hat 4.8.3-7)"
.section .note.GNU-stack,"",@progbits
.file "t03b.c"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $0, -8(%rbp)
movl $1, -12(%rbp)
movl $0, -4(%rbp)
jmp .L2
.L3:
addl $1, -4(%rbp)
.L2:
cmpl $999999999, -4(%rbp)
jle .L3
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (GNU) 4.8.3 20140911 (Red Hat 4.8.3-7)"
.section .note.GNU-stack,"",@progbits
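The advice in the answers above about measurement error can be followed by timing the loop several times inside the program rather than relying on a single run of time. This is a sketch under assumptions: count_to and best_time are hypothetical helpers, the counter is made volatile (unlike in the question) so the loop survives optimization, and clock() measures CPU time with limited resolution.

```c
#include <time.h>

/* Counts from 0 to n. The counter is volatile so that an optimizing
   compiler cannot delete the otherwise empty loop. */
long count_to(long n)
{
    volatile long i;
    for (i = 0; i < n; ++i)
        ;
    return i;
}

/* Hypothetical helper: runs count_to(n) `runs` times and returns the
   fastest CPU time in seconds as reported by clock(). */
double best_time(long n, int runs)
{
    double best = -1.0;
    for (int r = 0; r < runs; ++r) {
        clock_t start = clock();
        count_to(n);
        double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
        if (best < 0.0 || secs < best)
            best = secs;
    }
    return best;
}
```

Comparing best_time(1000000000L, 5) for two variants of the loop is less sensitive to scheduler noise than comparing two single runs.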

long compile time when using big arrays in the extern block

Why does gcc take a long time to compile C code if it has a big array in the extern block?
#define MAXNITEMS 100000000
int buff[MAXNITEMS];
int main (int argc, char *argv[])
{
return 0;
}
I suspect a bug somewhere. There is no reason for the compile to take longer, no matter how big the array is, since the compiler will just record the array's size for the .bss segment because you never assign a value to an element in it. Proof:
.file "big.c"
.comm buff,4000000000000000000,32
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu/Linaro 4.7.3-1ubuntu1) 4.7.3"
.section .note.GNU-stack,"",@progbits
As you can see, the only thing left of the array in the assembly is .comm buff,4000000000000000000,32.
I suggest you run gcc with -S to see the assembler code. Maybe your version of GCC has a bug. I tested with GCC 4.7.3 and the compile times here are the same, no matter which value I use.
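The behavior described above can be checked from C as well. The sketch below is scaled down as an assumption: it uses 1,000,000 items instead of the question's 100,000,000 to keep the example light, and buff_bytes and buff_ends_are_zero are hypothetical helpers.

```c
#include <stddef.h>

/* Scaled-down assumption: 1,000,000 items instead of the question's
   100,000,000, to keep the example light. */
#define MAXNITEMS 1000000

/* No initializer: the array is zero-initialized at program start and is
   typically placed in .bss, so only its size needs to be recorded. */
int buff[MAXNITEMS];

/* Total size of the array in bytes. */
size_t buff_bytes(void)
{
    return sizeof buff;
}

/* Returns 1 if the first and last elements are zero, as guaranteed for
   objects with static storage duration and no initializer. */
int buff_ends_are_zero(void)
{
    return buff[0] == 0 && buff[MAXNITEMS - 1] == 0;
}
```

Running gcc -S on this file should show a single size directive for buff rather than one entry per element.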
Related: Where are static variables stored (in C/C++)?
