Can a compiler generate useless assembly code?

Can a compiler generate useless assembly code? - c

I am trying to find the meaning of assembly code generated from a c program. Here is the program in C:
int* a = &argc;
int b = 8;
a = &b;
Here is the assembly code generated with explanations. There is one part that I do not understand:
Prologue of the main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ecx
subl $36, %esp
Load the address of argc in %eax:
movl %ecx, %eax
The part I do not get:
movl 4(%eax), %edx
movl %edx, -28(%ebp)
Stack-Smashing Protector code (setup):
movl %gs:20, %ecx
movl %ecx, -12(%ebp)
xorl %ecx, %ecx
Load values in a and b (see in main.c):
movl %eax, -16(%ebp)
movl $8, -20(%ebp)
Modify the value of a (a = &b):
leal -20(%ebp), %eax
movl %eax, -16(%ebp)
Stack-Smashing Protector code (verify the stack is ok):
movl $0, %eax
movl -12(%ebp), %edx
xorl %gs:20, %edx
je .L7
call __stack_chk_fail
If the stack is Ok:
.L7:
addl $36, %esp
popl %ecx
popl %ebp
leal -4(%ecx), %esp
ret
So the part I do not uinderstand is modifying the value in -28(%ebp), an address never used. Does someone knows why is this part generated?

The good way to see what the compiler does. I assume you have a file called main.c:
int main(int argc, char **argv)
{
int* a = &argc;
int b = 8;
a = &b;
}
Compile with debug info to an object file:
$ gcc -c -g main.c
View the assembly:
$ objdump -S main.o
main.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <main>:
int main(int argc, char **argv)
{
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 89 7d ec mov %edi,-0x14(%rbp)
7: 48 89 75 e0 mov %rsi,-0x20(%rbp)
int* a = &argc;
b: 48 8d 45 ec lea -0x14(%rbp),%rax
f: 48 89 45 f8 mov %rax,-0x8(%rbp)
int b = 8;
13: c7 45 f4 08 00 00 00 movl $0x8,-0xc(%rbp)
a = &b;
1a: 48 8d 45 f4 lea -0xc(%rbp),%rax
1e: 48 89 45 f8 mov %rax,-0x8(%rbp)
22: b8 00 00 00 00 mov $0x0,%eax
}
27: 5d pop %rbp
28: c3 retq
Then do the same with full optimization:
$ gcc -c -g -O3 main.c
And view the assembly again:
$ objdump -S main.o
main.o: file format elf64-x86-64
Disassembly of section .text.startup:
0000000000000000 <main>:
int main(int argc, char **argv)
{
int* a = &argc;
int b = 8;
a = &b;
}
0: 31 c0 xor %eax,%eax
2: c3 retq
So the answer is yes. The compiler can produce instructions not needed. That's why you turn on optimizations. When they are turned off, the compiler does its job in a very generic way without thinking at all. For example, it reserves space for variables that are not used.

Related

Why is this stack variable in C not in a register?

I have a reasonable understanding of how to execute buffer overflow attacks and how register allocation works in compilers.
What confuses me is why so many things are on the stack in C programs.
Consider this vulnerable program:
#include <stdio.h>
int main() {
int a = 0;
char str[] = "ABC";
gets(str);
printf("int: %d, str: %s\n", a, str);
return a;
}
Let's run it
> gcc run.c
> ./a.out asdfasdf
int: 1717859169, str: asdfasdf
Okay so str is overwritten as is int a. But why is int a even on the stack?
Wouldn't it be easiest to just do something like (x86 asm)
.global _main
.text
_main:
// omitting the gets() stuff
movq $0, %rax
retq
Now we have less memory traffic since nothing is on the stack and much less code.
tl;dr why is int a on the stack at all?

Per the comments on my post.
It happens because I am compiling without optimization and when I compile with optimizations gcc -O3 run.c I won't see the same behavior.
Here's some of the optimized assembly
> gcc -o run -O3 run.c
> objdump -d run
...
// Set eax = 0
100003f5c: 31 c0 xorl %eax, %eax
100003f5e: 48 83 c4 08 addq $8, %rsp
100003f62: 5b popq %rbx
100003f63: 5d popq %rbp
// Return 0
100003f64: c3 retq
And the more complicated unoptimized:
...
// put 0 on the stack
100003f33: c7 45 f8 00 00 00 00 movl $0, -8(%rbp) the stack
...
// take it off the stack and into ecx
100003f61: 8b 4d f8 movl -8(%rbp), %ecx
100003f64: 89 45 e4 movl %eax, -28(%rbp)
100003f67: 89 c8 movl %ecx, %eax
100003f69: 48 83 c4 20 addq $32, %rsp
100003f6d: 5d popq %rbp
// return 0
100003f6e: c3 retq

How can gcc -O3 make the run so fast?

Let test_speed.c be the following C code :
#include <stdio.h>
int main(){
int i;
for(i=0; i < 1000000000; i++) {}
printf("%d", i);
}
I run in the terminal :
gcc -o test_speed test_speed.c
and then :
time ./test_speed
I get :
Now i run the following :
gcc -O3 -o test_speed test_speed.c
and then :
time ./test_speed
I get :
How can the second run be this fast ? Is it already computed during the compilation ?

that's because -O3 aggressive optimization assumes that
for(i=0; i < 1000000000; i++) {}
has no side effect (except for the value of i) and removes the loop completely (directly setting i to 1000000000).
Disassembly (x86):
00000000 <_main>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 e4 f0 and $0xfffffff0,%esp
6: 83 ec 10 sub $0x10,%esp
9: e8 00 00 00 00 call e <_main+0xe>
e: c7 44 24 04 00 ca 9a movl $0x3b9aca00,0x4(%esp) <== 1000000000 in hex, no loop
15: 3b
16: c7 04 24 00 00 00 00 movl $0x0,(%esp)
1d: e8 00 00 00 00 call 22 <_main+0x22>
22: 31 c0 xor %eax,%eax
24: c9 leave
25: c3 ret
that optimization level is not suitable for calibrated active-CPU loops as you can see (the result is the same with -O2, but the loop remains unoptimized with just -O)

gcc "knows" that there is no body in the loop, and no dependency on any result, temporary or real -- so it removes the loop.
A good tool for analysis like this is godbolt.org which shows you the generated assembly. The difference between no optimization at all and the -O3 optmization is stark:
No optimization
With -O3

A compiler only has to keep the observable behavior of a program. Counting a variable without any I/O, interaction, or just using its value isn't observable, so as your loop doesn't do anything, the optimizer just throws it away completely and directly assigns the final value.

The compiler recognizes that the loop does nothing, and that removing it would not change the output of the program, so the loop was optimized away entirely.
Here's the assembly with -O0:
.L3:
.loc 1 4 0 is_stmt 0 discriminator 3
addl $1, -4(%rbp)
.L2:
.loc 1 4 0 discriminator 1
cmpl $999999999, -4(%rbp) # loop
jle .L3
.loc 1 5 0 is_stmt 1
movl -4(%rbp), %eax
movl %eax, %esi
movl $.LC0, %edi
movl $0, %eax
call printf
movl $0, %eax
.loc 1 6 0
leave
.cfi_def_cfa 7, 8
ret
And with -O3:
main:
.LFB23:
.file 1 "x1.c"
.loc 1 2 0
.cfi_startproc
.LVL0:
subq $8, %rsp
.cfi_def_cfa_offset 16
.LBB4:
.LBB5:
.file 2 "/usr/include/x86_64-linux-gnu/bits/stdio2.h"
.loc 2 104 0
movl $1000000000, %edx # stored value, no loop
movl $.LC0, %esi
movl $1, %edi
xorl %eax, %eax
call __printf_chk
.LVL1:
.LBE5:
.LBE4:
.loc 1 6 0
xorl %eax, %eax
addq $8, %rsp
.cfi_def_cfa_offset 8
ret
You can see that in the -O3 case the loop is removed entirely and the final value of i, 1000000000, is stored directly.

What c code would compile to something like `call *%eax`?

I'm working with c and assembly and I've seen call *%eax in a few spots. I wanted to write a small c program that would compile to something like this, but I'm stuck.
I was thinking about just writing up some assembly code like in this question: x86 assembly instruction: call *Reg only using AT&T syntax in my case to get a small example with the call in it. However, that wouldn't solve my burning question of what kind of c code compiles to that?
I understand that it is a call to the address that eax is pointing to.

Documentation: http://gcc.gnu.org/onlinedocs/gcc/Local-Reg-Vars.html#Local-Reg-Vars
Try this
#include <stdio.h>
typedef void (*FuncPtr)(void);
void _Func(void){
printf("Hello");
}
int main(int argc, char *argv[]){
register FuncPtr func asm ("eax") = _Func;
func();
return 0;
}
And its relative assembly:
.file "functorTest.c"
.section .rdata,"dr"
LC0:
.ascii "Hello\0"
.text
.globl __Func
.def __Func; .scl 2; .type 32; .endef
__Func:
pushl %ebp
movl %esp, %ebp
subl $24, %esp
movl $LC0, (%esp)
call _printf
leave
ret
.def ___main; .scl 2; .type 32; .endef
.globl _main
.def _main; .scl 2; .type 32; .endef
_main:
pushl %ebp
movl %esp, %ebp
andl $-16, %esp
call ___main
movl $__Func, %eax
call *%eax ; see?
movl $0, %eax
movl %ebp, %esp
popl %ebp
ret
.def _printf; .scl 2; .type 32; .endef

Just to add another example with local variables instead.
#include <stdio.h>
// A normal function with an int parameter
// and void return type
void fun(int a)
{
printf("Value of a is %d\n", a);
}
int main()
{
// fun_ptr is a pointer to function fun()
void (*fun_ptr)(int) = &fun;
// Invoking fun() using fun_ptr
(*fun_ptr)(10);
return 0;
}
The Assembly (There are some extra lines, for this was compiled in x86 on an x64 computer. I can provide the x64 if requested by new viewers)
000011c7 <main>:
11c7: 8d 4c 24 04 lea 0x4(%esp),%ecx
11cb: 83 e4 f0 and $0xfffffff0,%esp
11ce: ff 71 fc pushl -0x4(%ecx)
11d1: 55 push %ebp
11d2: 89 e5 mov %esp,%ebp
11d4: 51 push %ecx
11d5: 83 ec 14 sub $0x14,%esp
11d8: e8 28 00 00 00 call 1205 <__x86.get_pc_thunk.ax>
11dd: 05 23 2e 00 00 add $0x2e23,%eax
11e2: 8d 80 99 d1 ff ff lea -0x2e67(%eax),%eax
11e8: 89 45 f4 mov %eax,-0xc(%ebp)
11eb: 83 ec 0c sub $0xc,%esp
11ee: 6a 0a push $0xa
11f0: 8b 45 f4 mov -0xc(%ebp),%eax
11f3: ff d0 call *%eax
11f5: 83 c4 10 add $0x10,%esp
11f8: b8 00 00 00 00 mov $0x0,%eax
11fd: 8b 4d fc mov -0x4(%ebp),%ecx
1200: c9 leave
1201: 8d 61 fc lea -0x4(%ecx),%esp
1204: c3 ret

how to know where is the register variable stored?

I knew that register variables are stored in CPU registers.
And the same variables are stored in stack if the CPU registers are busy/full.
how can i know that the variable is stored in stack or CPU register?

No, you can't.
It's decided by the compiler, and might change between compilations if, for instance, the surrounding code changes the register pressure or if compiler flags are changed.

I am agree with Mr. Unwind's answer, but upto some extend this way may be helpful to you:
file name x.c:
int main(){
register int i=0;
i++;
printf("%d",i);
}
Assemble code:
~$ gcc x.c -S
output file name is x.s.
In my case ebx register is used, which may be difference at different compilation time.
~$ cat x.s
.file "x.c"
.section .rodata
.LC0:
.string "%d"
.text
.globl main
.type main, #function
main:
pushl %ebp
movl %esp, %ebp
andl $-16, %esp
pushl %ebx
subl $28, %esp
movl $0, %ebx
addl $1, %ebx // because i++
movl $.LC0, %eax
movl %ebx, 4(%esp)
movl %eax, (%esp)
call printf
addl $28, %esp
popl %ebx
movl %ebp, %esp
popl %ebp
ret
You can also disassemble your executable using objdunp:
$ gcc x.c -o x
$ objdump x -d
Partial assembly output using objdump command:
080483c4 <main>:
80483c4: 55 push %ebp
80483c5: 89 e5 mov %esp,%ebp
80483c7: 83 e4 f0 and $0xfffffff0,%esp
80483ca: 53 push %ebx
80483cb: 83 ec 1c sub $0x1c,%esp
80483ce: bb 00 00 00 00 mov $0x0,%ebx
80483d3: 83 c3 01 add $0x1,%ebx //due to i++
80483d6: b8 b0 84 04 08 mov $0x80484b0,%eax
80483db: 89 5c 24 04 mov %ebx,0x4(%esp)
80483df: 89 04 24 mov %eax,(%esp)
80483e2: e8 0d ff ff ff call 80482f4 <printf#plt>
80483e7: 83 c4 1c add $0x1c,%esp
80483ea: 5b pop %ebx
80483eb: 89 ec mov %ebp,%esp
80483ed: 5d pop %ebp
80483ee: c3 ret
80483ef: 90 nop
%ebx register reserved for register variable.

Am too agreeing with UnWind answer, on the other hand disassembling the code in GDB may give the storage of the variables. Disassembling a vague code which I have gives the locals of that frame as below,
(gdb) info locals
i = 0
ret = <value optimized out>
k = 0
ctx = (BN_CTX *) 0x632e1cc8
A1 = (BIGNUM *) 0x632e1cd0
A1_odd = (BIGNUM *) 0x632e1ce8
check = <value optimized out>
mont = (BN_MONT_CTX *) 0x632e2108
A = (const BIGNUM *) 0x632e2028
Now if try printing the address of the locals it does tell me the storage location as below,
(gdb) p &i
$16 = (int *) 0x143fba40
(gdb) p &k
$17 = (int *) 0x143fba38
(gdb) p &mont
Address requested for identifier "mont" which is in register $s7
(gdb)
Here objects i and k are on stack and mont is in register $s7.

According to the book "The Ansi C Programming Language - Second Edition" of Brian W. Kernighan & Dennis M.Ritchie (The founders of the C languages), you can not.
Chapter 4, Page 84,
"... And it is not possible to take the address of register variable,
regardless whether the variable is actually placed in a register."
Hope that helps!
Best of luck in the future,
Ron

why less than expression converts into less than or equal to expression in gcc

I am working on code optimization and going through gcc internals. I wrote a simple expression in my program and I checked the gimple representation of that expression and I got stuck why gcc had done this.
Say I have an expression :
if(i < 9)
then in the gimple representation it will be converted to
if(i <= 8)
I dont know why gcc do this. Is it some kind of optimization, if yes then can anyone tell me how it can optimize our program?

The canonalisation helps to detect CommonSubExpressions, such as:
#include <stdio.h>
int main(void)
{
unsigned u, pos;
char buff[40];
for (u=pos=0; u < 10; u++) {
buff[pos++] = (u <5) ? 'A' + u : 'a' + u;
buff[pos++] = (u <=4) ? '0' + u : 'A' + u;
}
buff[pos++] = 0;
printf("=%s=\n", buff);
return 0;
}
GCC -O1 will compile this into:
...
movl $1, %edx
movl $65, %ecx
.L4:
cmpl $4, %eax
ja .L2
movb %cl, (%rsi)
leal 48(%rax), %r8d
jmp .L3
.L2:
leal 97(%rax), %edi
movb %dil, (%rsi)
movl %ecx, %r8d
.L3:
mov %edx, %edi
movb %r8b, (%rsp,%rdi)
addl $1, %eax
addl $1, %ecx
addl $2, %edx
addq $2, %rsi
cmpl $10, %eax
jne .L4
movb $0, 20(%rsp)
movq %rsp, %rdx
movl $.LC0, %esi
movl $1, %edi
movl $0, %eax
call __printf_chk
...
GCC -O2 will actually remove the entire loop and replace it by a stream of assignments.

Consider the following C code:
int i = 10;
if(i < 9) {
puts("1234");
}
And also the equivalent C code:
int i = 10;
if(i <= 8) {
puts("asdf");
}
Under no optimisation, both generate the exact same assembly sequence:
40052c: c7 45 fc 0a 00 00 00 movl $0xa,-0x4(%rbp)
400533: 83 7d fc 08 cmpl $0x8,-0x4(%rbp)
400537: 7f 0a jg 400543 <main+0x1f>
400539: bf 3c 06 40 00 mov $0x40063c,%edi
40053e: e8 d5 fe ff ff callq 400418 <puts#plt>
400543: .. .. .. .. .. .. ..
Since I am not familiar with the GCC implementation, I can only speculate as to why the conversion is done at all. Perhaps it makes the job of the code generator easier because it only has to handle a single case. I expect someone can come up with a more definitive answer.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

Can a compiler generate useless assembly code? - c

Related

Why is this stack variable in C not in a register?

How can gcc -O3 make the run so fast?

What c code would compile to something like `call *%eax`?

how to know where is the register variable stored?

why less than expression converts into less than or equal to expression in gcc

Categories

Resources