I am new to c and gcc. I'm trying to follow along with an example in Computer Systems: A Programmer's Perspective. The author says that the following code when put into a file (code.c)
int accum = 0;
int sum(int x, int y)
{
int t = x + y;
accum += t;
return t;
}
and using the gcc as follows to output an assembly code file
gcc -O2 -S code.c
will produce assembly code as follows
sum:
pushl %ebp
movl %esp,%ebp
movl 12(%ebp),%eax
addl 8(%ebp),%eax
addl %eax,accum
movl %ebp,%esp
popl %ebp
ret
However on my machine (OS: Ubuntu 10.4 x64) I get the following
.file "code.c"
.intel_syntax noprefix
.text
.p2align 4,,15
.globl sum
.type sum, #function
sum:
.LFB0:
.cfi_startproc
lea eax, [rdi+rsi]
add DWORD PTR accum[rip], eax
ret
.cfi_endproc
.LFE0:
.size sum, .-sum
.globl accum
.bss
.align 4
.type accum, #object
.size accum, 4
accum:
.zero 4
.ident "GCC: (Ubuntu/Linaro 4.7.3-1ubuntu1) 4.7.3"
.section .note.GNU-stack,"",#progbits
Why am I seeing this difference?
Because the book is 11 years old and gcc has changed a great deal since it was written.
Related
Following code compile and run on GCC compiler.
#include <stdio.h>
int arr[10];
int func()
{
printf("In func\n");
return 0;
}
int main()
{
if (&arr[func()])
printf("In main\n");
return 0;
}
Output:
In main
Why does not execute printf("In func\n"); ?
There seems to be a subtle issue, either intended, or unintended with various combinations of the latest gcc. ver 7.3 on the latest kernel 4.15.8 on Archlinux. For whatever reason the call to func() is omitted for the code generated for main(). e.g.
$ gcc -S -masm=intel -o infunc2.asm infunc2.c
The generated assembly is:
$ cat infunc2.asm
.file "infunc2.c"
.intel_syntax noprefix
.text
.comm arr,40,32
.section .rodata
.LC0:
.string "In func"
.text
.globl func
.type func, #function
func:
.LFB0:
.cfi_startproc
push rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
mov rbp, rsp
.cfi_def_cfa_register 6
lea rdi, .LC0[rip]
call puts#PLT
mov eax, 0
pop rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size func, .-func
.section .rodata
.LC1:
.string "In main"
.text
.globl main
.type main, #function
main:
.LFB1:
.cfi_startproc
push rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
mov rbp, rsp
.cfi_def_cfa_register 6
lea rdi, .LC1[rip]
call puts#PLT
mov eax, 0
pop rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1:
.size main, .-main
.ident "GCC: (GNU) 7.3.0"
.section .note.GNU-stack,"",#progbits
Note the call to func() is labeled .LFB0: above. The procedure for main: does not call func or .LFB0: at all, despite it being present, and despite the "In func" string being present in .LC0:. I suspect this is not intended behavior.
For example, simple compilation without optimization -O0 the function is not called, e.g.:
$ gcc -g -O0 -o bin/if2 infunc2.c
$ ./bin/if2
In main
Changing the code to store the address of arr[func()] does force func() to be called, e.g.
#include <stdio.h>
int arr[10];
int func()
{
printf ("In func\n");
return 0;
}
int main (void)
{
int *p = &arr[func()];
if (p)
printf("In main\n");
return 0;
}
Then
$ gcc -Wall -Wextra -pedantic -std=gnu11 -Ofast -o bin/infunc infunc.c
$ ./bin/infunc
In func
In main
And the generated assembly supports the different behavior:
$ gcc -S -masm=intel -o infunc.asm infunc.c
$ cat infunc.asm
.file "infunc.c"
.intel_syntax noprefix
.text
.comm arr,40,32
.section .rodata
.LC0:
.string "In func"
.text
.globl func
.type func, #function
func:
.LFB0:
.cfi_startproc
push rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
mov rbp, rsp
.cfi_def_cfa_register 6
lea rdi, .LC0[rip]
call puts#PLT
mov eax, 0
pop rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size func, .-func
.section .rodata
.LC1:
.string "In main"
.text
.globl main
.type main, #function
main:
.LFB1:
.cfi_startproc
push rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
mov rbp, rsp
.cfi_def_cfa_register 6
sub rsp, 16
mov eax, 0
call func
cdqe
lea rdx, 0[0+rax*4]
lea rax, arr[rip]
add rax, rdx
mov QWORD PTR -8[rbp], rax
cmp QWORD PTR -8[rbp], 0
je .L4
lea rdi, .LC1[rip]
call puts#PLT
.L4:
mov eax, 0
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1:
.size main, .-main
.ident "GCC: (GNU) 7.3.0"
.section .note.GNU-stack,"",#progbits
I wish I could provide some logical explanation for the handling here, but I can only document it. Seems we need to talk with the guys on the gcc list.
Side effects discarded in address computation inside 'if'
This seems to be a regression in gcc that will appear depending on whether an individual distro applies enough patching to mask it. It is a gcc bug in work. Bug 84607
This is a gcc bug (#84607) and has been fixed in gcc 7.3.1 or later.
The problem is with your compilation. I use gcc to compile. I compiled your file like this:
gcc main.c -o prog
./prog
In func
In main
Seems good to me. Check the procedure on how to compile with you compiler if you use a different compiler than gcc. Also I use gcc 7.3
I am using GCC in 32-bit mode on a Windows 7 machine under cygwin. I have the following function:
unsigned f1(unsigned x, unsigned y)
{
return x*y;
}
I want the code to do an unsigned multiply and as such I would expect it to generate the mul instruction, not the imul instruction. I compile the program
with the following command:
gcc -m32 -S t4.c
The generated assembly code is:
.file "t4.c"
.text
.globl _f1
.def _f1; .scl 2; .type 32; .endef
_f1:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %eax
imull 12(%ebp), %eax
popl %ebp
ret
.ident "GCC: (GNU) 4.8.2"
I believe that the generated code has the wrong multiply instruction in it but I find it hard to believe that GCC has such a simple bug. Please comment.
The compiler relies on the "as-if" rule: No standard conforming program can detect a difference between what this program does and what the program should do, since the lowest 32 bits of the result are the same for both instructions.
I have a C code that declares global variable char file[MAX]. This variable is used in various functions directly to copy filename to it. I can compile this c file to assembly code but I don't know how to find the address of this variable? In x86 stack, how do I find the address of a global variable? Can you give me an example how global variable is referenced in assembly code?
EDIT: I don't see a .Data segment in the assembly code.
To store the address of file to register EAX:
AT&T syntax: movl $_file, %eax
intel syntax: mov eax, OFFSET _file
How to examine:
Firstly, write a simple code (test.c).
#define MAX 256
char file[MAX];
int main(void) {
volatile char *address = file;
return 0;
}
Then, compile it to asssembly code: gcc -S -O0 -o test.s test.c
.file "test.c"
.comm _file, 256, 5
.def ___main; .scl 2; .type 32; .endef
.text
.globl _main
.def _main; .scl 2; .type 32; .endef
_main:
LFB0:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
andl $-16, %esp
subl $16, %esp
call ___main
movl $_file, 12(%esp)
movl $0, %eax
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
LFE0:
.ident "GCC: (GNU) 4.8.1"
Or if you want intel syntax: gcc -S -O0 -masm=intel -o test_intel.s test.c
.file "test.c"
.intel_syntax noprefix
.comm _file, 256, 5
.def ___main; .scl 2; .type 32; .endef
.text
.globl _main
.def _main; .scl 2; .type 32; .endef
_main:
LFB0:
.cfi_startproc
push ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
mov ebp, esp
.cfi_def_cfa_register 5
and esp, -16
sub esp, 16
call ___main
mov DWORD PTR [esp+12], OFFSET FLAT:_file
mov eax, 0
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
LFE0:
.ident "GCC: (GNU) 4.8.1"
With a little more experiments and examination, I got the result.
hello_world.c
#include <stdio.h>
int main()
{
printf("Hello World\n");
return 0;
}
Running gcc hello_world.c -S generates a hello_world.s file in assembly language.
hello_world.s
.file "hello_world.c"
.section .rodata
.LC0:
.string "Hello World"
.text
.globl main
.type main, #function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $.LC0, %edi
call puts
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
.section .note.GNU-stack,"",#progbits
Is there some way to find out in what type of assembly language the code was generated in (besides knowing the syntax of all assembly languages.)?
Reference for myself or anyone else who didn't know this:
To get your processor architecture run the following:
uname -p
It is the AT&T syntax for the GNU assembler of the target code's CPU by default. There are options to alter that.
I wrote this simple C program:
int main() {
int i;
int count = 0;
for(i = 0; i < 2000000000; i++){
count = count + 1;
}
}
I wanted to see how the gcc compiler optimizes this loop (clearly add 1 2000000000 times should be "add 2000000000 one time"). So:
gcc test.c and then time on a.out gives:
real 0m7.717s
user 0m7.710s
sys 0m0.000s
$ gcc -O2 test.c and then time ona.out` gives:
real 0m0.003s
user 0m0.000s
sys 0m0.000s
Then I disassembled both with gcc -S. First one seems quite clear:
.file "test.c"
.text
.globl main
.type main, #function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
movq %rsp, %rbp
.cfi_offset 6, -16
.cfi_def_cfa_register 6
movl $0, -8(%rbp)
movl $0, -4(%rbp)
jmp .L2
.L3:
addl $1, -8(%rbp)
addl $1, -4(%rbp)
.L2:
cmpl $1999999999, -4(%rbp)
jle .L3
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
.section .note.GNU-stack,"",#progbits
L3 adds, L2 compare -4(%rbp) with 1999999999 and loops to L3 if i < 2000000000.
Now the optimized one:
.file "test.c"
.text
.p2align 4,,15
.globl main
.type main, #function
main:
.LFB0:
.cfi_startproc
rep
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
.section .note.GNU-stack,"",#progbits
I can't understand at all what's going on there! I've got little knowledge of assembly, but I expected something like
addl $2000000000, -8(%rbp)
I even tried with gcc -c -g -Wa,-a,-ad -O2 test.c to see the C code together with the assembly it was converted to, but the result was no more clear that the previous one.
Can someone briefly explain:
The gcc -S -O2 output.
If the loop is optimized as I expected (one sum instead of many sums)?
The compiler is even smarter than that. :)
In fact, it realizes that you aren't using the result of the loop. So it took out the entire loop completely!
This is called Dead Code Elimination.
A better test is to print the result:
#include <stdio.h>
int main(void) {
int i; int count = 0;
for(i = 0; i < 2000000000; i++){
count = count + 1;
}
// Print result to prevent Dead Code Elimination
printf("%d\n", count);
}
EDIT : I've added the required #include <stdio.h>; the MSVC assembly listing corresponds to a version without the #include, but it should be the same.
I don't have GCC in front of me at the moment, since I'm booted into Windows. But here's the disassembly of the version with the printf() on MSVC:
EDIT : I had the wrong assembly output. Here's the correct one.
; 57 : int main(){
$LN8:
sub rsp, 40 ; 00000028H
; 58 :
; 59 :
; 60 : int i; int count = 0;
; 61 : for(i = 0; i < 2000000000; i++){
; 62 : count = count + 1;
; 63 : }
; 64 :
; 65 : // Print result to prevent Dead Code Elimination
; 66 : printf("%d\n",count);
lea rcx, OFFSET FLAT:??_C#_03PMGGPEJJ#?$CFd?6?$AA#
mov edx, 2000000000 ; 77359400H
call QWORD PTR __imp_printf
; 67 :
; 68 :
; 69 :
; 70 :
; 71 : return 0;
xor eax, eax
; 72 : }
add rsp, 40 ; 00000028H
ret 0
So yes, Visual Studio does this optimization. I'd assume GCC probably does too.
And yes, GCC performs a similar optimization. Here's an assembly listing for the same program with gcc -S -O2 test.c (gcc 4.5.2, Ubuntu 11.10, x86):
.file "test.c"
.section .rodata.str1.1,"aMS",#progbits,1
.LC0:
.string "%d\n"
.text
.p2align 4,,15
.globl main
.type main, #function
main:
pushl %ebp
movl %esp, %ebp
andl $-16, %esp
subl $16, %esp
movl $2000000000, 8(%esp)
movl $.LC0, 4(%esp)
movl $1, (%esp)
call __printf_chk
leave
ret
.size main, .-main
.ident "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
.section .note.GNU-stack,"",#progbits
Compilers have a few tools at their disposal to make code more efficient or more "efficient":
If the result of a computation is never used, the code that performs the computation can be omitted (if the computation acted upon volatile values, those values must still be read but the results of the read may be ignored). If the results of the computations that fed it weren't used, the code that performs those can be omitted as well. If such omission makes the code for both paths on a conditional branch identical, the condition may be regarded as unused and omitted. This will have no effect on the behaviors (other than execution time) of any program that doesn't make out-of-bounds memory accesses or invoke what Annex L would call "Critical Undefined Behaviors".
If the compiler determines that the machine code that computes a value can only produce results in a certain range, it may omit any conditional tests whose outcome could be predicted on that basis. As above, this will not affect behaviors other than execution time unless code invokes "Critical Undefined Behaviors".
If the compiler determines that certain inputs would invoke any form of Undefined Behavior with the code as written, the Standard would allow the compiler to omit any code which would only be relevant when such inputs are received, even if the natural behavior of the execution platform given such inputs would have been benign and the compiler's rewrite would make it dangerous.
Good compilers do #1 and #2. For some reason, however, #3 has become fashionable.