Is there a difference, in the compiled function, between declaring a variable first and then assigning a value, and declaring and assigning in one statement? Does the compiled function do the same work? That is, does it still read the parameters, declare the variables and then assign the value, or do the compiled versions of the two examples differ?
example:
void foo(u32 value) {
u32 extvalue = NULL;
extvalue = value;
}
compared with
void foo(u32 value) {
u32 extvalue = value;
}
I am under the impression that there is no difference between those two functions in the compiled code, i.e. they will look the same and I will not be able to tell which is which.
It depends on the compiler and the optimization level, of course.
A dumb compiler/low optimization level when it sees:
u32 extvalue = NULL;
extvalue = value;
could set extvalue to NULL and then to value in the next line.
Since extvalue isn't read in between, the NULL initialization is a dead store, and most compilers assign value directly as an easy optimization.
Note that declaring a variable isn't really an instruction per se. The compiler just reserves automatic storage (a stack slot or a register) for the variable.
I've tested a simple piece of code with and without the initialization, and the result is different when using gcc 6.2.1 with the -O0 flag (don't optimize anything):
#include <stdio.h>
void foo(int value) {
int extvalue = 0;
extvalue = value;
printf("%d",extvalue);
}
disassembled:
Disassembly of section .text:
00000000 <_foo>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 ec 28 sub $0x28,%esp
6: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%ebp) <=== here we see the init
d: 8b 45 08 mov 0x8(%ebp),%eax
10: 89 45 f4 mov %eax,-0xc(%ebp)
13: 8b 45 f4 mov -0xc(%ebp),%eax
16: 89 44 24 04 mov %eax,0x4(%esp)
1a: c7 04 24 00 00 00 00 movl $0x0,(%esp)
21: e8 00 00 00 00 call 26 <_foo+0x26>
26: c9 leave
27: c3 ret
now:
void foo(int value) {
int extvalue;
extvalue = value;
printf("%d",extvalue);
}
disassembled:
Disassembly of section .text:
00000000 <_foo>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 ec 28 sub $0x28,%esp
6: 8b 45 08 mov 0x8(%ebp),%eax
9: 89 45 f4 mov %eax,-0xc(%ebp)
c: 8b 45 f4 mov -0xc(%ebp),%eax
f: 89 44 24 04 mov %eax,0x4(%esp)
13: c7 04 24 00 00 00 00 movl $0x0,(%esp)
1a: e8 00 00 00 00 call 1f <_foo+0x1f>
1f: c9 leave
20: c3 ret
21: 90 nop
22: 90 nop
23: 90 nop
The 0 initialization has disappeared here, which shows that at -O0 the compiler did not optimize away the redundant initialization in the first version.
If I switch to -O2 (a good optimization level), the code is identical in both cases: the compiler found that the initialization wasn't necessary (silently, without any warning):
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 ec 18 sub $0x18,%esp
6: 8b 45 08 mov 0x8(%ebp),%eax
9: c7 04 24 00 00 00 00 movl $0x0,(%esp)
10: 89 44 24 04 mov %eax,0x4(%esp)
14: e8 00 00 00 00 call 19 <_foo+0x19>
19: c9 leave
1a: c3 ret
I tried these functions in godbolt:
void foo(uint32_t value)
{
uint32_t extvalue = NULL;
extvalue = value;
}
void bar(uint32_t value)
{
uint32_t extvalue = value;
}
I ported to the actual type uint32_t rather than u32 which is not standard. The resulting non-optimized assembly generated by x86-64 GCC 6.3 is:
foo(unsigned int):
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-20], edi
mov DWORD PTR [rbp-4], 0
mov eax, DWORD PTR [rbp-20]
mov DWORD PTR [rbp-4], eax
nop
pop rbp
ret
bar(unsigned int):
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-20], edi
mov eax, DWORD PTR [rbp-20]
mov DWORD PTR [rbp-4], eax
nop
pop rbp
ret
So clearly the non-optimized code retains the (weird, as pointed out by others since it's not written to a pointer) NULL assignment, which is of course pointless.
I'd vote for the second one since it's shorter (less to hold in one's head when reading the code), and never allow/recommend the pointless setting to NULL before overwriting with the proper value. I would consider that a bug, since you're saying/doing something you don't mean.
I have a main.c file
int boyut(const char* string);
char greeting[6] = {"Helle"};
int main(){
greeting[5] = 0x00;
int a = boyut(greeting);
return 0;
}
int boyut(const char* string){
int len=0;
while(string[len]){
len++;
}
return len;
}
I compile it with GCC command gcc -Wall -m32 -nostdlib main.c -o main.o
When I check the disassembly, I see that the variable greeting is placed in the .data segment, and it is not pushed onto the stack before boyut is called. Inside boyut, the code acts as if greeting were in the stack segment, so it looks as though the variable is never actually accessed inside the function. Why is the compiler generating code like this? How can I correct this?
Disassembly of section .text:
080480f8 <main>:
80480f8: 55 push ebp
80480f9: 89 e5 mov ebp,esp
80480fb: 83 ec 18 sub esp,0x18
80480fe: c6 05 05 a0 04 08 00 mov BYTE PTR ds:0x804a005,0x0
8048105: 83 ec 0c sub esp,0xc
8048108: 68 00 a0 04 08 push 0x804a000
804810d: e8 0d 00 00 00 call 804811f <boyut>
8048112: 83 c4 10 add esp,0x10
8048115: 89 45 f4 mov DWORD PTR [ebp-0xc],eax
8048118: b8 00 00 00 00 mov eax,0x0
804811d: c9 leave
804811e: c3 ret
0804811f <boyut>:
804811f: 55 push ebp
8048120: 89 e5 mov ebp,esp
8048122: 83 ec 10 sub esp,0x10
8048125: c7 45 fc 00 00 00 00 mov DWORD PTR [ebp-0x4],0x0
804812c: eb 04 jmp 8048132 <boyut+0x13>
804812e: 83 45 fc 01 add DWORD PTR [ebp-0x4],0x1
8048132: 8b 55 fc mov edx,DWORD PTR [ebp-0x4]
8048135: 8b 45 08 mov eax,DWORD PTR [ebp+0x8]
8048138: 01 d0 add eax,edx
804813a: 0f b6 00 movzx eax,BYTE PTR [eax]
804813d: 84 c0 test al,al
804813f: 75 ed jne 804812e <boyut+0xf>
8048141: 8b 45 fc mov eax,DWORD PTR [ebp-0x4]
8048144: c9 leave
8048145: c3 ret
main.o: file format elf32-i386
Contents of section .data:
804a000 48656c6c 6500 Helle.
The function boyut is declared like this:
int boyut(const char* string);
That means boyut takes a pointer to char and returns an int. And indeed, the compiler pushes a pointer to char onto the stack (push 0x804a000 in main). This pointer points to the beginning of greeting, and boyut reads it back from [ebp+0x8]. This happens because in C, an array is implicitly converted to a pointer to its first element under most circumstances.
If you want to pass an array to a function so it is copied to the function, you have to wrap the array into a structure and pass that.
Consider the following piece of code.
#include <stdio.h>
void f(int *x, int *y)
{
(*x)++;
(*y)++;
}
int main()
{
int x=5, y=5;
f(&x, &y);
return 0;
}
I know that the function f is not reentrant. One of the stupid things I am thinking of is to do (*x)++ + (*y)++ in one line and discard the sum. I wonder whether multiple assembly instructions will be generated to evaluate this expression. Can an interrupt be served in the middle of evaluating the expression?
You won't get anything atomic with that...
c.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <f>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 89 7d f8 mov %rdi,-0x8(%rbp)
8: 48 89 75 f0 mov %rsi,-0x10(%rbp)
c: 48 8b 45 f8 mov -0x8(%rbp),%rax
10: 8b 00 mov (%rax),%eax
12: 8d 50 01 lea 0x1(%rax),%edx
15: 48 8b 45 f8 mov -0x8(%rbp),%rax
19: 89 10 mov %edx,(%rax)
1b: 48 8b 45 f0 mov -0x10(%rbp),%rax
1f: 8b 00 mov (%rax),%eax
21: 8d 50 01 lea 0x1(%rax),%edx
24: 48 8b 45 f0 mov -0x10(%rbp),%rax
28: 89 10 mov %edx,(%rax)
2a: 5d pop %rbp
2b: c3 retq
And it gets a lot better with -O2, but still it's not atomic.
c.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <f>:
0: 83 07 01 addl $0x1,(%rdi)
3: 83 06 01 addl $0x1,(%rsi)
6: c3 retq
And, at least for GCC, the exact same code is generated for (*x)++ + (*y)++. Anyway, could you elaborate a little on your question? It is too broad as it stands, and this code is reentrant as long as x and y do not point to the same objects in different entries. Otherwise, you should give us more details about what you intend to do.
Edit: It's (apparently, unless there's some hidden black magic...) impossible to do both increments as one atomic operation on an x86(-64) architecture. Anyway, it's non-portable to treat an operation as "atomic" just because it compiles to a single instruction. That reasoning is specific to x86(-64) CPUs.
I have this test.c on my Ubuntu14.04 x86_64 system.
void foo(int a, long b, int c) {
}
int main() {
foo(0x1, 0x2, 0x3);
}
I compiled this with gcc --no-stack-protector -g test.c -o test and got the assembly code with objdump -dS test -j .text
00000000004004ed <_Z3fooili>:
void foo(int a, long b, int c) {
4004ed: 55 push %rbp
4004ee: 48 89 e5 mov %rsp,%rbp
4004f1: 89 7d fc mov %edi,-0x4(%rbp)
4004f4: 48 89 75 f0 mov %rsi,-0x10(%rbp)
4004f8: 89 55 f8 mov %edx,-0x8(%rbp) // !!Attention here!!
}
4004fb: 5d pop %rbp
4004fc: c3 retq
00000000004004fd <main>:
int main() {
4004fd: 55 push %rbp
4004fe: 48 89 e5 mov %rsp,%rbp
foo(0x1, 0x2, 0x3);
400501: ba 03 00 00 00 mov $0x3,%edx
400506: be 02 00 00 00 mov $0x2,%esi
40050b: bf 01 00 00 00 mov $0x1,%edi
400510: e8 d8 ff ff ff callq 4004ed <_Z3fooili>
}
400515: b8 00 00 00 00 mov $0x0,%eax
40051a: 5d pop %rbp
40051b: c3 retq
40051c: 0f 1f 40 00 nopl 0x0(%rax)
I know that the function parameters should be pushed to stack from right to left in sequence. So I was expecting this
void foo(int a, long b, int c) {
push %rbp
mov %rsp,%rbp
mov %edi,-0x4(%rbp)
mov %rsi,-0x10(%rbp)
mov %edx,-0x14(%rbp) // c should be push on stack after b, not after a
But gcc seemed clever enough to push parameter c(0x3) right after a(0x1) to save the four bytes which should be reserved for data alignment of b(0x2). Can someone please explain this and show me some documentation on why gcc did this?
The parameters are passed in registers: edi, esi, edx here (further integer arguments would go in rcx, r8, r9, and only after that on the stack), just as the System V AMD64 calling convention used on Linux mandates.
What you see in your function is just how the compiler saves them upon entry when compiling with -O0, so they're in memory where a debugger can modify them. It is free to do it in any way it wants, and it cleverly does this space optimization.
The only reason it does this is that gcc -O0 always spills/reloads all C variables between C statements to support modifying variables and jumping between lines in a function with a debugger.
All this would be optimized out in release build in the end.
I wrote the following program:
#include <stdio.h>
int main()
{
int i = 0;
for (; i < 4; i++)
{
printf("%i",i);
}
return 0;
}
I compiled it using gcc test.c -o test.o, then disassembled it using objdump -d -Mintel test.o. The assembly code I got (at least the relevant part) is the following:
0804840c <main>:
804840c: 55 push ebp
804840d: 89 e5 mov ebp,esp
804840f: 83 e4 f0 and esp,0xfffffff0
8048412: 83 ec 20 sub esp,0x20
8048415: c7 44 24 1c 00 00 00 mov DWORD PTR [esp+0x1c],0x0
804841c: 00
804841d: eb 19 jmp 8048438 <main+0x2c>
804841f: 8b 44 24 1c mov eax,DWORD PTR [esp+0x1c]
8048423: 89 44 24 04 mov DWORD PTR [esp+0x4],eax
8048427: c7 04 24 e8 84 04 08 mov DWORD PTR [esp],0x80484e8
804842e: e8 bd fe ff ff call 80482f0 <printf@plt>
8048433: 83 44 24 1c 01 add DWORD PTR [esp+0x1c],0x1
8048438: 83 7c 24 1c 03 cmp DWORD PTR [esp+0x1c],0x3
804843d: 7e e0 jle 804841f <main+0x13>
804843f: b8 00 00 00 00 mov eax,0x0
8048444: c9 leave
8048445: c3 ret
I noticed that, although my compare operation was i < 4, the assembly code is (after disassembly) i <= 3. Why does that happen? Why would it use JLE instead of JL?
Loops that count upwards and have a constant limit are very common. The compiler has two options to implement the check for loop termination: JLE and JL. While the two ways seem absolutely equivalent, consider the following.
As you can see in the disassembly listing, the constant (3 in your case) is encoded in 1 byte. x86 sign-extends that 8-bit immediate, so it can hold values from -128 to 127. If your loop counted to 128 instead of 4, comparing against 128 would need the larger 32-bit immediate encoding of the CMP instruction, while comparing against 127 with JLE still fits in one byte. So JLE provides a marginal improvement in code density in such edge cases (which is ultimately good for performance because of caching).
It uses JLE because it shifted the constant by one.
if (x < 4) {
// runs when x is 3, 2, 1, 0, -1, ... INT_MIN.
}
is logically equivalent to
if (x <= 3) {
// runs when x is 3, 2, 1, 0, -1, ... INT_MIN.
}
Why the compiler chose one internal representation over another is often a matter of optimization, but it is hard to know whether optimization was the true driver. In any case, functional equivalents like this are the reason why back-mapping from assembly to source isn't 100% accurate. There are many ways to write a condition that has the same effect over the same inputs.
Hi, I'm a newbie in the assembly and OS world. Yes, this is my homework, and I'm stuck deep in the i386 manual. Please help me or give me a hint. Here's the code I have to analyze line by line. This function is part of EOS (educational OS) and handles interrupt requests in the HAL (hardware abstraction layer). I ran "objdump -d interrupt.o" and got this assembly code, for i386 of course.
00000000 <eos_ack_irq>:
0: 55 push %ebp ; push %ebp to stack to save stack before
1: b8 fe ff ff ff mov $0xfffffffe,%eax ; what is this??
6: 89 e5 mov %esp,%ebp ; couple with "push %ebp". known as prolog assembly function.
8: 8b 4d 08 mov 0x8(%ebp),%ecx ; set %ecx as value of (%ebp+8)...and what is this do??
b: 5d pop %ebp ; pop the top of stack to %ebp. i know this is for getting back to callee..
c: d3 c0 rol %cl,%eax ; ????? what is this for???
e: 21 05 00 00 00 00 and %eax,0x0 ; make %eax as 0. for what??
14: c3 ret ; return what register??
00000015 <eos_get_irq>:
15: 8b 15 00 00 00 00 mov 0x0,%edx
1b: b8 1f 00 00 00 mov $0x1f,%eax
20: 55 push %ebp
21: 89 e5 mov %esp,%ebp
23: 56 push %esi
24: 53 push %ebx
25: bb 01 00 00 00 mov $0x1,%ebx
2a: 89 de mov %ebx,%esi
2c: 88 c1 mov %al,%cl
2e: d3 e6 shl %cl,%esi
30: 85 d6 test %edx,%esi
32: 75 06 jne 3a <eos_get_irq+0x25>
34: 48 dec %eax
35: 83 f8 ff cmp $0xffffffff,%eax
38: 75 f0 jne 2a <eos_get_irq+0x15>
3a: 5b pop %ebx
3b: 5e pop %esi
3c: 5d pop %ebp
3d: c3 ret
0000003e <eos_disable_irq_line>:
3e: 55 push %ebp
3f: b8 01 00 00 00 mov $0x1,%eax
44: 89 e5 mov %esp,%ebp
46: 8b 4d 08 mov 0x8(%ebp),%ecx
49: 5d pop %ebp
4a: d3 e0 shl %cl,%eax
4c: 09 05 00 00 00 00 or %eax,0x0
52: c3 ret
00000053 <eos_enable_irq_line>:
53: 55 push %ebp
54: b8 fe ff ff ff mov $0xfffffffe,%eax
59: 89 e5 mov %esp,%ebp
5b: 8b 4d 08 mov 0x8(%ebp),%ecx
5e: 5d pop %ebp
5f: d3 c0 rol %cl,%eax
61: 21 05 00 00 00 00 and %eax,0x0
67: c3 ret
And here's the original C source:
/* ack the specified irq */
void eos_ack_irq(int32u_t irq) {
/* clear the corresponding bit in _irq_pending register */
_irq_pending &= ~(0x1<<irq);
}
/* get the irq number */
int32s_t eos_get_irq() {
/* get the highest bit position in the _irq_pending register */
int i = 31;
for(; i>=0; i--) {
if (_irq_pending & (0x1<<i)) {
return i;
}
}
return -1;
}
/* mask an irq */
void eos_disable_irq_line(int32u_t irq) {
/* turn on the corresponding bit */
_irq_mask |= (0x1<<irq);
}
/* unmask an irq */
void eos_enable_irq_line(int32u_t irq) {
/* turn off the corresponding bit */
_irq_mask &= ~(0x1<<irq);
}
So these functions ack, get, mask and unmask an interrupt request, and I'm stuck at the first one. If you would be so kind, could you give me a hint or an answer for analyzing the first function? I'll try to work out the others. I'm very sorry for posting another homework question (my TA doesn't read email).
21 05 00 00 00 00 (that and) is actually an and with a memory operand (namely and [0], eax), which the AT&T syntax obscures (though technically it does say so: note the absence of a $ sign before the 0). It makes more sense that way: it updates _irq_pending in memory, and the address of 0 suggests you disassembled the object file before linking, so the real address has not been filled in yet.
mov $0xfffffffe, %eax is doing exactly what it looks like it's doing (note that 0xfffffffe is all ones except the lowest bit), and that means the function has been implemented like this:
_irq_pending &= rotate_left(0xFFFFFFFE, irq);
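This saves a NOT operation. It has to be a rotate rather than a shift so that the 1 bits leaving the top re-enter at the bottom; a plain shift would fill the low bits with zeros and clear more than the one intended bit.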