#include <stdio.h>
int demo()
{
    static int i = 1;
    return i++;
}
int main()
{
    printf("%d %d %d\n", demo(), demo(), demo());
    return 0;
}
output:-
3 2 1
During the first demo call, 1 is returned.
I have heard that when return statement is executed, then control passes to the calling function without any further execution of code in called function.
So my question is: when 1 is returned by the first call, why is the value of i still incremented?
In other words I want to know that after returning 1 , why ++ is executed?
int demo()
{
    static int i = 1;
    return i++;
}
int main()
{
    printf("%d %d %d\n", demo(), demo(), demo());
    return 0;
}
The order in which the three demo() calls are evaluated is unspecified in C: the compiler is free to evaluate function arguments in any order.
Next comes the static keyword.
A static local variable lives in static storage (not on the stack) and persists for the entire duration of the program, even after the function ends and returns its value.
Because of this:
1st call: i=1, return 1, increment to 2.
2nd call: i=2, return 2, increment to 3.
3rd call: i=3, return 3, increment to 4.
Hope this helps !
Three points to keep in mind here:
Static variables inside functions persist for the rest of the program once they have been initialized the first time.
The returned variable also has the postfix ++ operator applied, which means: "use the value (i.e. return it) and increment it AFTERWARDS": the incremented value is not returned.
That's why the variable has "memory" of what happened and keeps getting incremented.
-> Why are you seeing "3 2 1" instead of "1 2 3"?
The order in which the arguments are evaluated is not known 'a priori'; it's up to the compiler to decide, see https://stackoverflow.com/a/12960263/1938163
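If you need a guaranteed order, one option is to make each call its own statement, since statements execute in sequence. The following is a minimal sketch; the helper name call_in_order is made up for illustration and is not part of the original code:

```c
/* Same counter as in the question. */
int demo(void)
{
    static int i = 1;
    return i++;
}

/* call_in_order is a hypothetical helper: each assignment is a separate
   statement, so the three calls happen strictly one after another. */
void call_in_order(int out[3])
{
    out[0] = demo();   /* first call  */
    out[1] = demo();   /* second call */
    out[2] = demo();   /* third call  */
}
```

Printing out[0], out[1], out[2] after the first call_in_order() gives 1 2 3 regardless of how the compiler orders function arguments.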
If you really want to know how it is possible that the value gets returned first and then incremented, take a look at the generated asm code:
demo(): # #demo()
movl demo()::i, %eax # move i and put it into eax
movl %eax, %ecx # Move eax into ecx -> eax will be used/returned!
addl $1, %ecx # Increment ecx
movl %ecx, demo()::i # save ecx into i -> this is for the next round!
ret # returns!
main: # #main
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movl $0, -4(%rbp)
callq demo() # Call demo()
movl %eax, -8(%rbp) # save eax in rbp-8 (contains 1, demo::i is 2 for the next round)
callq demo() # Call demo()
movl %eax, -12(%rbp) # save eax in rbp-12 (contains 2, demo::i is 3 for the next round)
callq demo() # Call demo()
leaq .L.str, %rdi # load str address
movl -8(%rbp), %esi # esi holds 1
movl -12(%rbp), %edx # edx holds 2
movl %eax, %ecx # move eax (3) into ecx (demo::i is 4 but doesn't get used)
movb $0, %al # needed by the ABI to call printf
callq printf # call printf() and display 3 2 1
movl $0, %ecx
movl %eax, -16(%rbp)
movl %ecx, %eax
addq $16, %rsp
popq %rbp
ret
demo()::i:
.L.str:
.asciz "%d %d %d\n"
The 64-bit ABI uses registers (RDI, RSI, RDX, RCX, R8 and R9) instead of the stack for argument passing.
[...] I want to know that after returning 1, why ++ is executed?
The postfix operator is defined by the C Standard to work like this:
6.5.2.4 Postfix increment and decrement operators
[...]
2 The result of the postfix ++ operator is the value of the operand. After the result is
obtained, the value of the operand is incremented. (That is, the value 1 of the appropriate
type is added to it.)
So i is incremented before the return is executed, but since the result of the postfix operation is the "original" value, return returns that "original" value.
Your function returns the old value of i and increments it. Since you used the static keyword, the value of i is kept and available for the next call (it does not vanish after the call).
I have heard that when return statement is executed, then control passes to the calling function without any further execution of code in called function.
You heard right. But that doesn't mean the expression in the return statement isn't evaluated. For example:
return a + b;
When this statement is executed, a + b is evaluated first and then its value is returned to the caller. In the same way, when
return i++;
is executed, i++ is evaluated: it yields the previous value of i and increments i by 1.
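To make the sequence explicit, here is a sketch of what return i++; amounts to when written out step by step. The function name demo_expanded and the variable old are made up for illustration:

```c
/* Hypothetical long-hand version of `return i++;`. */
int demo_expanded(void)
{
    static int i = 1;
    int old = i;   /* the value of the expression i++ */
    i = i + 1;     /* the side effect, finished before control leaves the function */
    return old;    /* the caller receives the old value */
}
```

Nothing runs "after" the return here: the increment is part of evaluating the returned expression.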
Consider the following code for producing a list of numbers from 0 to 9 along with values of 2 and -3 raised to the power of the corresponding number from the list:
#include <stdio.h>
int power(int m, int n);
int main()
{
    int i;
    for (i = 0; i < 10; ++i)
        printf("%d %d %d\n", i, power(2, i), power(-3, i));
    return 0;
}
int power(int base, int n)
{
    int i, p;
    p = 1;
    for (i = 1; i <= n; ++i)
        p = p * base;
    // return statement purposefully omitted
}
Of course the program does not work properly without the return statement for the power function, however by running the written code I get the following output:
0 1 1
1 2 2
2 3 3
3 4 4
4 5 5
5 6 6
6 7 7
7 8 8
8 9 9
9 10 10
And I'm wondering where the numbers in the second and third columns of the output are coming from. In the absence of a valid return value from power, control transfers back to the calling function, but why does it output these numbers?
As pointed out by @dyukha and @Daniel H, the value returned is "whatever is in your EAX register". Your function ends with a for loop, so the last instructions executed before returning were probably the loop test, checking whether i <= n (your loop condition). You can check what ends up in EAX by having your compiler generate the assembly for your code (option -S), then following which values are moved into the register before the
popq %rbp
retq
at the end of your function.
On my computer, I tried with Apple LLVM version 9.0.0 (clang-900.0.39.2), which generates the following for my function:
movl %edi, -8(%rbp)
movl %esi, -12(%rbp)
movl $1, -20(%rbp)
movl $1, -16(%rbp)
LBB2_1: ## =>This Inner Loop Header: Depth=1
movl -12(%rbp), %eax
cmpl -16(%rbp), %eax
jl LBB2_4
## BB#2: ## in Loop: Header=BB2_1 Depth=1
movl -20(%rbp), %eax
imull -8(%rbp), %eax
movl %eax, -20(%rbp)
## BB#3: ## in Loop: Header=BB2_1 Depth=1
movl -16(%rbp), %eax
addl $1, %eax
movl %eax, -16(%rbp)
jmp LBB2_1
LBB2_4:
movl -4(%rbp), %eax
popq %rbp
retq
As you can see, there are four notable addresses: -8(%rbp), -12(%rbp), -16(%rbp) and -20(%rbp). Given the order of declaration in the C code, and the order of initialisation, -8(%rbp) is base, -12(%rbp) is n, -16(%rbp) is i and -20(%rbp) is p.
LBB2_1 is your loop condition. The instructions for the check are
* move the value of n into %eax
* compare %eax with the value of i (cmpl records the outcome in the CPU flags; it does not store anything in %eax)
* if n is lower than i, go to label LBB2_4, else continue with the next instruction
The three instructions after BB#2 are your actual multiplication. The three instructions after BB#3 are your increment of i, which is followed by an unconditional jump to your loop condition at label LBB2_1.
The end of the power function takes whatever is at memory address -4(%rbp), puts it in %eax, and then leaves the function (restores the frame pointer and returns; the caller then reads the return value from %eax).
With the code produced by my compiler, I don't see the same result as you do: the last two columns are always 0 (-4(%rbp) is never set to anything). The exception is when I add a call to another function foo that takes two integers as parameters and has two local integer variables (to ensure that its stack frame has the same size as the one for power). That function does write to -4(%rbp). When I call it right before entering the loop, power returns the value that foo left at -4(%rbp).
As a colleague just told me, playing with undefined behaviour is dangerous as your compiler is allowed to treat it any way it likes. It could be summoning a demon for what it's worth.
In short, or TL;DR: this undefined behaviour is handled in some way by the compiler. Whether some value is moved from a local variable, initialized or not, or nothing in particular is moved into %eax, is up to the compiler. Either way, whatever was sitting there got returned when retq was executed.
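For comparison, this is the same power function with the return statement put back, so the result no longer depends on whatever happens to be in %eax. It is a sketch that only restores the missing line; the loop is unchanged from the question:

```c
/* power: raise base to the n-th power, n >= 0 */
int power(int base, int n)
{
    int i, p;

    p = 1;
    for (i = 1; i <= n; ++i)
        p = p * base;
    return p;   /* the line that was left out in the question */
}
```

With this version power(2, 9) is 512 and power(-3, 3) is -27, on every compiler.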
Suppose I have the following C code:
#include <stdio.h>
int main()
{
    int x = 11;
    int y = x + 3;
    printf("%d\n", x);
    return 0;
}
Then I compile it into asm using gcc and get this (with some flags removed):
main:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movl $11, -4(%rbp)
movl -4(%rbp), %eax
addl $3, %eax
movl %eax, -8(%rbp)
movl -4(%rbp), %eax
movl %eax, %esi
movl $.LC0, %edi
movl $0, %eax
call printf
movl $0, %eax
leave
ret
My question is why it is movl -4(%rbp), %eax followed by movl %eax, %esi, rather than a simple movl -4(%rbp), %esi (which works well according to my experiment)?
You probably did not enable optimizations.
Without optimization the compiler produces code like this. For one, it does not keep variables in registers but on the stack. This means that whenever you operate on a variable, it is first loaded into a register and then operated on.
So given that x is allocated at -4(%rbp), this is what the code looks like if you translate it directly without optimization. First you move 11 to the storage of x. This means:
movl $11, -4(%rbp)
That is the first statement done. The next statement evaluates x+3 and places the result in the storage of y (which is -8(%rbp)); this is done without regard to the previously generated code:
movl -4(%rbp), %eax
addl $3, %eax
movl %eax, -8(%rbp)
That is the second statement done. Note that it is divided into two parts: the evaluation of x+3 and the storing of the result. Then the compiler goes on to generate code for the printf statement, again without taking earlier statements into account.
If, on the other hand, you enable optimization, the compiler does a number of smart and, to humans, obvious things. One is that it allows variables to be kept in registers, or at least keeps track of where the value of a variable can be found. In this case the compiler would, for example, know in the second statement that x is not only stored at -4(%rbp) but also that its value is the constant 11 (yes, it knows the actual value). It can then use this to add 3 to it, which means it knows the result to be 14 (but it's smarter than that: it has also seen that you never use that variable, so it skips the statement entirely). The next statement is the printf statement, and here it can use the fact that it knows x to be 11 and pass that directly to printf. It also realizes that it never needs the storage of x at -4(%rbp). Finally, it may know what printf does (since you included stdio.h), so it can analyze the format string and do the conversion at compile time, replacing the printf statement with a call that directly writes 11 to standard output.
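As a rough source-level picture of that reasoning, the two functions below behave identically. The names print_x_naive and print_x_folded are made up for illustration, and what a particular compiler actually emits at a given optimization level may differ; this only sketches the constant propagation and dead-store removal described above:

```c
#include <stdio.h>

/* Hypothetical: the code as written, every variable gets a stack slot. */
int print_x_naive(void)
{
    int x = 11;
    int y = x + 3;   /* computed and stored, but never read */
    (void)y;         /* only here to silence the unused-variable warning */
    return printf("%d\n", x);
}

/* Hypothetical: what remains once x is known to be 11 and y is dropped. */
int print_x_folded(void)
{
    return printf("%d\n", 11);
}
```

Both print 11 followed by a newline; the second simply has nothing left to load from the stack.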
This is a homework question.
I am attempting to obtain information from the following assembly code (x86 linux machine, compiled with gcc -O2 optimization). I have commented each section to show what I know. A big chunk of my assumptions could be wrong, but I have done enough searching to the point where I know I should ask these questions here.
.section .rodata.str1.1,"aMS",@progbits,1
.LC0:
.string "result %lx\n" //Printed string at end of program
.text
main:
.LFB13:
xorl %esi, %esi // value of esi = 0; x
movl $1, %ecx // value of ecx = 1; result
xorl %edx, %edx // value of edx = 0; Loop increment variable (possibly mask?)
.L2:
movq %rcx, %rax // value of rax = 1; ?
addl $1, %edx // value of edx = 1; Increment loop by one;
salq $3, %rcx // value of rcx = 8; Shift left rcx;
andl $3735928559, %eax // value of eax = 1; Value AND 1 = 1;
orq %rax, %rsi // value of rsi = 1; 1 OR 0 = 1;
cmpl $22, %edx // edx != 22
jne .L2 // if true, go back to .L2 (loop again)
movl $.LC0, %edi // Point to string
xorl %eax, %eax // value of eax = 0;
jmp printf // print
.LFE13: ret // return
And I am supposed to turn it into the following C code with the blanks filled in
#include <stdio.h>
int main()
{
long x = 0x________;
long result = ______;
long mask;
for (mask = _________; mask _______; mask = ________) {
result |= ________;
}
printf("result %lx\n",result);
}
I have a couple of questions and sanity checks that I want to make sure I am getting right since none of the similar examples I have found are for optimized code. Upon compiling some trials myself I get something close but the middle part of L2 is always off.
MY UNDERSTANDING
At the beginning, esi is xor'd with itself, resulting in 0 which is represented by x. 1 is then added to ecx, which would be represented by the variable result.
x = 0; result = 1;
Then, I believe a loop increment variable is stored in edx and set to 0. This will be used in the third part of the for loop (update expression). I also think that this variable must be mask, because later on 1 is added to edx, signifying a loop increment (mask = mask++), along with edx being compared in the middle part of the for loop (test expression aka mask != 22).
mask = 0; (in a way)
The loop is then entered, with rax being set to 1. I don't understand where this is used at all since there is no fourth variable I have declared, although it shows up later to be anded and zeroed out .
movq %rcx, %rax;
The loop variable is then incremented by one
addl $1, %edx;
THE NEXT PART MAKES THE LEAST AMOUNT OF SENSE TO ME
The next three operations I feel make up the body expression of the loop, however I have no idea what to do with them. It would result in something similar to result |= x ... but I don't know what else
salq $3, %rcx
andl $3735928559, %eax
orq %rax, %rsi
The rest I feel I have a good grasp on. A comparison is made ( if mask != 22, loop again), and the results are printed.
PROBLEMS I AM HAVING
I don't understand a couple of things.
1) I don't understand how to figure out my variables. There seem to be 3 hardcoded ones along with one increment or temporary storage variable that is found in the assembly (rax, rcx, rdx, rsi). I think rsi would be the x , and rcx would be result, yet I am unsure of if mask would be rdx or rax, and either way, what would the last variable be?
2) What do the 3 expressions of which I am unsure of do? I feel that I have them mixed up with the incrementation somehow, but without knowing the variables I don't know how to go about solving this.
Any and all help will be great, thank you!
The answer is :
#include <stdio.h>
int main()
{
long x = 0xDEADBEEF;
long result = 0;
long mask;
for (mask = 1; mask != 0; mask = mask << 3) {
result |= mask & x;
}
printf("result %lx\n",result);
}
In the assembly :
rsi is result. We deduce that because it is the only value that gets ORed, and it is the second argument of printf (on x64 Linux, arguments are passed in rdi, rsi, rdx, and some others, in order).
x is a constant set to 0xDEADBEEF (3735928559 in the andl instruction is 0xDEADBEEF in decimal). This cannot be deduced with certainty, but it makes sense because it seems to be set as a constant in the C code and is never modified after that.
Now for the rest: it is obfuscated by an anti-optimization by GCC. You see, GCC detected that the loop would be executed exactly 22 times, and thought it was clever to mangle the condition and replace it with a useless counter. Knowing that, we see that edx is the useless counter and rcx is mask. We can then deduce the real condition and the real "increment" operation. We can see the <<= 3 in the assembly, and notice that if you shift a 64-bit int left by 3 bits 22 times, it becomes 0 (shifting by 3, 22 times, means shifting 66 bits, so everything is shifted out).
This anti-optimization is sadly really common for GCC. The assembly can be replaced with :
.LFB13:
xorl %esi, %esi
movl $1, %ecx
.L2:
movq %rcx, %rax
andl $3735928559, %eax
orq %rax, %rsi
salq $3, %rcx // implicit test for 0
jne .L2
movl $.LC0, %edi
xorl %eax, %eax
jmp printf
It does exactly the same thing, but we removed the useless counter and saved 3 assembly instructions. It also matches the C code better.
Let's work backwards a bit. We know that result must be the second argument to printf(). In the x86_64 calling convention, that's %rsi. The loop is everything between the .L2 label and the jne .L2 instruction. We see in the template that there's a result |= line at the end of the loop, and indeed, there's an orq instruction there with %rsi as its target, so that checks out. We can now see what it's initialized to at the top of main.
ElderBug is correct that the compiler spuriously optimized by adding a counter. But we can still figure out: which instruction runs immediately after the |= when the loop repeats? That must be the third part of the loop. What runs immediately before the body of the loop? That must be the loop initialization. Unfortunately, you'll have to figure out what would have happened on the 22nd iteration of the original loop to reverse-engineer the loop condition. (But sal is a left-shift, and that line is a vestige of the original loop condition, which would have been followed by a conditional branch before the %edx test was inserted.)
Note that the code keeps a copy of the value of mask around in %rcx before modifying it in %rax, and x is folded into a constant (take a close look at the andl line).
Also note that you can feed the .S file to gas to get a .o and see what it does.
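If you want to check the reconstructed loop without going through the assembler, a small C sketch can count the iterations directly. The function name collect_bits and the iterations out-parameter are made up for illustration, and uint64_t is used instead of long so that shifting the top bit out is well defined:

```c
#include <stdint.h>

/* Hypothetical helper: runs the reconstructed loop and reports how many
   times the body executed. */
uint64_t collect_bits(uint64_t x, int *iterations)
{
    uint64_t result = 0;
    int count = 0;

    /* mask visits bits 0, 3, 6, ..., 63 and becomes 0 on the next shift */
    for (uint64_t mask = 1; mask != 0; mask <<= 3) {
        result |= mask & x;
        ++count;
    }
    if (iterations)
        *iterations = count;
    return result;
}
```

Called with 0xDEADBEEF, the body runs 22 times, which matches the cmpl $22, %edx in the compiler's version, and the result is 0x48249249.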
I got this short C Code.
#include <stdint.h>
uint64_t multiply(uint32_t x, uint32_t y) {
uint64_t res;
res = x*y;
return res;
}
int main() {
uint32_t a = 3, b = 5, z;
z = multiply(a,b);
return 0;
}
There is also an Assembler Code for the given C code above.
I don't understand everything of that assembler code. I commented each line and you will find my question in the comments for each line.
The Assembler Code is:
.text
multiply:
pushl %ebp // stores the stack frame of the calling function on the stack
movl %esp, %ebp // takes the current stack pointer and uses it as the frame for the called function
subl $16, %esp // it leaves room on the stack, but why 16Bytes. sizeof(res) = 8Bytes
movl 8(%ebp), %eax // I don't know quite what "8(%ebp) mean? It has to do something with res, because
imull 12(%ebp), %eax // here is the multiplication done. And again "12(%ebp).
movl %eax, -8(%ebp) // Now, we got a negative number in front of. How to interpret this?
movl $0, -4(%ebp) // here as well
movl -8(%ebp), %eax // and here again.
movl -4(%ebp), %edx // also here
leave
ret
main:
pushl %ebp // stores the stack frame of the calling function on the stack
movl %esp, %ebp // takes the current stack pointer and uses it as the frame for the called function
andl $-8, %esp // what happens here and why?
subl $24, %esp // here, it leaves room for local variables, but why 24 bytes? a, b, c: the size of each of them is 4 Bytes. So 3*4 = 12
movl $3, 20(%esp) // 3 gets pushed on the stack
movl $5, 16(%esp) // 5 also get pushed on the stack
movl 16(%esp), %eax // what does 16(%esp) mean and what happened with z?
movl %eax, 4(%esp) // we got the here as well
movl 20(%esp), %eax // and also here
movl %eax, (%esp) // what does happen in this line?
call multiply // thats clear, the function multiply gets called
movl %eax, 12(%esp) // it looks like the same as two lines before, except it contains the number 12
movl $0, %eax // I suppose, this line is because of "return 0;"
leave
ret
Negative references relative to %ebp are for local variables on the stack.
movl 8(%ebp), %eax // I don't know quite what "8(%ebp) mean? It has to do something with res, because
%eax = x
imull 12(%ebp), %eax // here is the multiplication done. And again "12(%ebp).
%eax = %eax * y
movl %eax, -8(%ebp) // Now, we got a negative number in front of. How to interpret this?
(uint32_t)res = %eax // sets low 32 bits of res
movl $0, -4(%ebp) // here as well
clears upper 32 bits of res to extend 32-bit multiplication result to uint64_t
movl -8(%ebp), %eax // and here again.
movl -4(%ebp), %edx // also here
return res; // 64-bit results are returned in the register pair %edx:%eax
As for the main, see x86 calling convention which may help making sense of what happens.
andl $-8, %esp // what happens here and why?
The stack pointer is aligned to an 8-byte boundary. I believe it's an ABI requirement.
subl $24, %esp // here, it leaves room for local variables, but why 24 bytes? a, b, c: the size of each of them is 4 Bytes. So 3*4 = 12
Multiples of 8 (probably due to alignment requirements)
movl $3, 20(%esp) // 3 gets pushed on the stack
a = 3
movl $5, 16(%esp) // 5 also get pushed on the stack
b = 5
movl 16(%esp), %eax // what does 16(%esp) mean and what happened with z?
%eax = b
z is at 12(%esp) and is not used yet.
movl %eax, 4(%esp) // we got the here as well
put b on the stack (second argument to multiply())
movl 20(%esp), %eax // and also here
%eax = a
movl %eax, (%esp) // what does happen in this line?
put a on the stack (first argument to multiply())
call multiply // thats clear, the function multiply gets called
multiply returns 64-bit result in %edx:%eax
movl %eax, 12(%esp) // it looks like the same as two lines before, except it contains the number 12
z = (uint32_t) multiply()
movl $0, %eax // I suppose, this line is because of "return 0;"
yup. return 0;
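The movl $0, -4(%ebp) step above has a visible consequence in C: x*y is computed in 32 bits and only then widened, so the upper half of res is always zero. The sketch below contrasts that with widening before the multiplication. The function names are made up for illustration, and it assumes a platform where int is 32 bits wide:

```c
#include <stdint.h>

/* Same as the question's multiply(): the product is a 32-bit value that
   wraps around, and only afterwards is it zero-extended to 64 bits. */
uint64_t multiply_narrow(uint32_t x, uint32_t y)
{
    uint64_t res;
    res = x * y;
    return res;
}

/* Hypothetical variant: converting one operand first makes the
   multiplication itself 64 bits wide. */
uint64_t multiply_wide(uint32_t x, uint32_t y)
{
    return (uint64_t)x * y;
}
```

For 3 and 5 both return 15, but for 65536 and 65536 the first returns 0 while the second returns 4294967296.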
Arguments are pushed onto the stack when the function is called. Inside the function, the stack pointer at that time is saved as the base pointer. (You got that much already.) The base pointer is used as a fixed location from which to reference arguments (which are above it, hence the positive offsets) and local variables (which are below it, hence the negative offsets).
The advantage of using a base pointer is that it is stable throughout the entire function, even when the stack pointer changes (due to function calls and new scopes).
So 8(%ebp) is one argument, and 12(%ebp) is the other.
The code is likely using more space on the stack than it needs to, because it is using temporary variables that could be optimized out if you had optimization turned on.
You might find this helpful: http://en.wikibooks.org/wiki/X86_Disassembly/Functions_and_Stack_Frames
I started typing this as a comment but it was getting too long to fit.
You can compile your example with -masm=intel so the assembly is more readable. Also, don't confuse the push and pop instructions with mov. push always decrements esp before storing the value, and pop always increments esp after loading it, whereas mov leaves esp alone.
There are two ways to store values onto the stack. You can either push each item onto it one at a time, or you can allocate the required space up front and then store each value into its stack slot using mov + a relative offset from either esp or ebp.
In your example, gcc chose the second method since that's usually faster because, unlike the first method, you're not constantly adjusting esp before saving each value onto the stack.
To address your other question in the comments: the x86 instruction set does not have a mov instruction for copying a value directly from memory location a to another memory location b. It is not uncommon to see code like:
mov eax, [esp+16]
mov [esp+4], eax
mov eax, [esp+20]
mov [esp], eax
call multiply(unsigned int, unsigned int)
mov [esp+12], eax
Register eax is being used as an intermediate temporary variable to help copy data between the two stack locations. You can mentally translate the above as:
esp[4] = esp[16]; // argument 2
esp[0] = esp[20]; // argument 1
call multiply
esp[12] = eax; // eax has return value
Here's what the stack approximately looks like right before the call to multiply:
lower addr    esp      => uint32_t:a_copy = 3   <--. arg1 to 'multiply'
              esp + 4     uint32_t:b_copy = 5   <--. arg2 to 'multiply'
    ^         esp + 8     ????
    ^         esp + 12    uint32_t:z = ?        <--.
    |         esp + 16    uint32_t:b = 5           | local variables in 'main'
    |         esp + 20    uint32_t:a = 3        <--.
    |         ...
    |         ...
higher addr   ebp         previous frame
I am learning assembly and I have this function that contains some lines I just don't understand:
.globl
.text
factR:
cmpl $0, 4(%esp)
jne cont
movl $1, %eax
ret
cont:
movl 4(%esp),%eax
decl %eax
pushl %eax // (1)
call factR // (2)
addl $4,%esp // (3)
imull 4(%esp),%eax
ret
and the C code corresponding to it is:
int factR(int n) {
    if (n == 0)
        return 1;
    else
        return n * factR(n - 1);
}
I am not sure about the lines marked with numbers.
(1) pushl %eax: does it mean we put the contents of %eax in %esp?
(2) So we call factR(). Will the result of that be in %esp when we come back here to the next instructions?
(3) addl $4,%esp: not sure about this one, are we adding 4 to the number stored in %esp or do we add 4 to the pointer to get the next number or something similar?
It appears that the factR() function follows the C calling convention (cdecl): the caller pushes the arguments for the call onto the stack, and the caller also cleans up the stack (undoes the changes that were made for the call) when the function returns.
The first push (1) puts the contents of the %eax register on the stack as the argument for the following call. Then the actual call to the function is made (2). The return value comes back in %eax, not in %esp, which is why the imull that follows multiplies %eax by n. Then the stack is cleaned up (3) by resetting the stack pointer %esp to where it was before the argument was pushed in step 1. One 32-bit value was pushed, so the pointer must be adjusted by 4 bytes.