#include<stdio.h>
int * fun(int a1,int b)
{
int a[2];
a[0]=a1;
a[1]=b;
return a;
}
int main()
{
int *r=fun(3,5);
printf("%d\n",*r);
printf("%d\n",*r);
}
Output after running the code:
3
-1073855580
I understand that a[2] is local to fun() but why value is getting changed of same pointer?
The variable a is indeed local to fun. When you return from that function, the stack is popped. The memory itself remains unchanged (for the moment). When you dereference r the first time, the memory is what you'd expect it to be. And since the dereference happens before the call to printf, nothing bad happens. When printf executes, it modifies the stack and the value is wiped out. The second time through you're seeing whatever value happened to be put there by printf the first time through.
The order of events for a "normal" calling convention (I know, I know -- no such thing):
Dereference r (the first time through, this is what it should be)
Push value onto stack (notice this is making a copy of the value) (may wipe out a)
Push other parameters on to stack (order is usually right to left, IIRC) (may wipe out a)
Allocate room for return value on stack (may wipe out a)
Call printf
Push local printf variables onto stack (may wipe out a)
Do your thang
Return from function
If you change int a[2]; to static int a[2]; this will alleviate the problem.
Because r points to a location on the stack that is likely to be overwritten by a function call.
In this case, it's the first call to printf itself which is changing that location.
In detail, the return from fun has that particular location being preserved simply because nothing has overwritten it yet.
The *r is then evaluated (as 3) and passed to printf to be printed. The actual call to printf changes the contents of that location (since it uses the memory for its own stack frame), but the value has already been extracted at that point so it's safe.
On the subsequent call, *r has the different value, changed by the first call. That's why it's different in this case.
Of course, this is just the likely explanation. In reality, anything could be happening since what you've coded up there is undefined behaviour. Once you do that, all bets are off.
As you've mentioned, a[2] is local to fun(); meaning it is created on the stack right before the code within fun() starts executing. When fun exits the stack is popped, meaning it is unwound so that the stack pointer is pointing to where it was before fun started executing.
The compiler is now free to stick whatever it wants into those locations that were unwound. So, it is possible that the first location of a was skipped for a variety of reasons. Maybe it now represents an uninitialized variable. Maybe it was for memory alignment of another variable. Simple answer is, by returning a pointer to a local variable from a function, and then de-referencing that pointer, you're invoking undefined behavior and anything can happen, including demons flying out of your nose.
When you compile you code with the following command:
$ gcc -Wall yourProgram.c
It will yield a warning, which says.
In function ‘fun’:
warning: function returns address of local variable
When r is dereferenced in first printf statement, it's okay as the memory is preserved. However, the second printf statement overwrites the stack and so we get an undesired result.
Because printf is using the stack location and changes it after printing the first value.
Related
I came across this page that illustrates common ways in which dangling pointes are created.
The code below is used to illustrate dangling pointers by returning address of a local variable:
// The pointer pointing to local variable becomes
// dangling when local variable is static.
#include<stdio.h>
int *fun()
{
// x is local variable and goes out of scope
// after an execution of fun() is over.
int x = 5;
return &x;
}
// Driver Code
int main()
{
int *p = fun();
fflush(stdout);
// p points to something which is not valid anymore
printf("%d", *p);
return 0;
}
On running this, this is the compiler warning I get (as expected):
In function 'fun':
12:2: warning: function returns address of local variable [-Wreturn-local-addr]
return &x;
^
And this is the output I get (good so far):
32743
However, when I comment out the fflush(stdout) line, this is the output I get (with the same compiler warning):
5
What is the reason for this behaviour? How exactly is the presence/absence of the fflush command causing this behaviour change?
Returning a pointer to an object on the stack is bad, as you've mentioned. The reason you only see a problem with your fflush() call in place is that the stack is unmodified if it's not there. That is, the 5 is still in place, so the pointer dereference still gives that 5 to you. If you call a function (almost any function, probably) in between fun and printf, it will almost certainly overwrite that stack location, making the later dereference return whatever junk that function happened to leave there.
This is because calling fflush(stdout) writes onto the stack where x was.
Let me explain. The stack in assembly language (which is what all programming languages eventually run as in one way or another) is commonly used to store local variables, return addresses, and function parameters. When a function is called, it pushes these things onto the stack:
the address of where to continue executing code once the function completes.
the parameters to the function, in an order determined by the calling convention used.
the local variables that the function uses.
These things are then popped off of the stack, one by one, simply by changing where the CPU thinks the top of the stack is. This means the data still exists, but it's not guaranteed to continue to exist.
Calling another function after fun() overwrites the previous values above the top of the stack, in this case with the value of stdout, and so the pointer's referenced value changes.
Without calling another function, the data stays there and is still valid when the pointer is dereferenced.
this is the source code
#include <stdio.h>
#include <stdlib.h>
int *fun();
int main()
{
int *j;
j=fun();
printf("%d\n",*j);
printf("%d\n",*j);
return 0;
}
int *fun()
{
int k=35;
return &k;
}
output-
35
1637778
the first printf() prints 35 which is the value of k but
In the main() the second printf prints a garbage value rather than printing 35.why?
The problem here is the return from fun is returning the address of a local variable. That address becomes invalid the moment the function returns. You are simply getting lucky on the first call to printf.
Even though the local is technically destroyed when fun returns the C runtime does nothing to actively destroy it. Hence your first use of *j is working because the memory for the local hasn't been written over yet. The implementation of printf though is likely over writing this simply by using its own locals in the method. Hence in the second use of *j you're referring to whatever local printf used and not k.
In order to make this work you need to return an address that points to a value that lives longer than fun. Typically in C this is achieved with malloc
int *fun() {
int* pValue = malloc(sizeof(int));
*pValue = 23;
return pValue;
}
Because the return of malloc lives until you call free this will be valid across multiple uses of printf. The one catch is the calling function now has to tell the program when it is done with the retun of fun. To do this call free after the second call to printf
j=fun();
printf("%d\n",*j);
printf("%d\n",*j);
free(j);
Program invokes undefined behavior. You can't return a pointer to an automatic local variable. The variable no longer exist once fun returns. In this case the result you get, may be expected or unexpected.
Never return a pointer to an automatic local variable
You are returning local value it is stored in stack. When you move out of function it gets erased. You getting undefined behaviour.
In your case stack not changed after function returning, so first time you getting correct value. This is not same in all time.
Both are wrong, since you print a value that no longer exists: the memory to store int k in the function is ok only while the function is executing; you can't return a reference (pointer) to it, since it will no longer reference anything meaningful.
The following, instead, would work:
int *fun()
{
static int k=35;
return &k;
}
The static keyword "says" that the memory must "survive" even if the function is not running, thus the pointer you return will be valid.
As others already told, your program invokes undefined behavior.
That means, anything can happen where the behaviour is not defined.
In your case, the following happens: The address of the variable, sitting on the stack, is returned. After returning from the function, the next function call can - and will - reuse that space.
Between the function call erroneously returning this address and the call using the value, nothing happens - in your case. Be aware that even this might be different on systems where interrupts may occur, and as well on systems with signals being able to interrupt the normal program run.
The first printf() call now uses the stack for its own purpose - maybe it is even the call itself which overwrites the old value. So the second printf() call receives the value now written into that memory.
On undefined behaviour, anything may happen.
#include<stdio.h>
int *fun();
int main()
{
int *ptr;
ptr=fun();
printf("%d",*ptr);
printf("%d",*ptr);
}
int * fun()
{
int k=4;//If auto then cannot print it two times.....stack will be changed
return(&k);
}
O/P: 4
-2
Calling printf() for the first time prints the correct value.
Calling any function (even printf( ) ) immediately after the call to fun( ). This time printf( ) prints a garbage value. Why does this happen?Why do we not get a garbage value during the first print statement itself????
This is not behavior you can rely on; it may and likely will differ on different systems, even different versions of the compiler or different compiler switches.
Given that, what is likely happening is this: fun returns a pointer to where it stored k. That part of the stack is no longer reliable, because the function that allocated it has exited. Nonetheless, nobody has written over it yet, so the 4 is still where it was written. Then main prepares to call printf. To do so, it gets the first argument, *ptr. To do this, it loads from the place ptr points, which is the (former) address of k, so the load gets the 4 that is there. This 4 is stored in a register or stack location to be passed to printf. Then the address of the format string, "%d", is stored to be passed to printf. Then printf is called. At this point, printf uses a great deal of stack and writes new data where k used to be. However, the 4 that was passed as an argument is in a safe place, where arguments to printf should be, so printf prints it. Then printf returns. Then the main routine prepares to call printf again. This time, when it loads from where ptr points, the 4 is no longer there; it is some value that was written during the first call to printf. So that value is what is passed to printf and is what is printed.
Never write code that uses this behavior. It is not reliable, and it is not proper code.
Why does it surprise you? The behavior is undefined, but there's nothing unusual in observing what you observed.
All variables live somewhere in memory. When a variable gets formally destroyed (like local variables when function exits) the memory it used to occupy still exists and, most likely, still holds the last value that was written to it. That memory in now officially free, but it will continue to hold that last value until some other code reuses that memory for other purposes and overwrites it.
This is what you observe in your experiment. Even though the variable kno longer exists, pointer ptr still points to its former location. And that former location still happens to hold the last value of k, which is 4.
The very first printf "successfully" receives a copy of that value for printing. And that very first printf is actually the one that reuses the old memory location and overwrites the former value of k. All further attempts to dereference ptr will show that 4 is no longer there, which is why your second printf prints something else.
Variable k is local to fun(), means it will be destroyed when the function returns. This is a very bad coding technique and will always lead to problem.
And the reason why the first printf returns the correct value:
First of all it might or might not return the value. The thing is suppose k is written somewhere on stack memory. First time when the function returns printf might get the correct value because that part of memory might exist for a while. But this is not guaranteed.
Here is my code
#include<stdio.h>
int * fun(int a1,int b)
{
int a[2];
a[0]=a1;
a[1]=b;
//int c=5;
printf("%x\n",&a[0]);
return a;
}
int main()
{
int *r=fun(3,5);
printf("%d\n",r[0]);
printf("%d\n",r[0]);
}
I am running codeblocks on Windows 7
Every time I run the loop I get the outputs as
22fee8
3
2293700
Here is the part I do not understand :
r expects a pointer to a part of memory which is interpreted as a sequence of boxes (each box of 4 byte width - >Integers ) on invoking fun function
What should happen is printf of function will print the address of a or address of a[0]:
Seconded
NOW THE QUESTION IS :
each time I run the program I get the same address?
And the array a should be destroyed at the end of Function fun only pointer must remain after function call
Then why on earth does the line r[0] must print 3?
r is pointing to something that doesn't exist anymore. You are returning a pointer to something on the stack. That stack will rewind when fun() ends. It can point to anything after that but nothing has overwritten it because another function is never called.
Nothing forces r[0] to be 3 - it's just a result of going for the simplest acceptable behaviour.
Basically, you're right that a must be destroyed at the end of fun. All this means is that the returned pointer (r in your case) is completely unreliable. That is, even though r[0] == 3 for you on your particular machine with your particular compiler, there's no guarantee that this will always hold on every machine.
To understand why it is so consistent for you, think about this: what does is mean for a to be destroyed? Only that you can't use it in any reliable way. The simplest way of satisfying this simple requirement is for the stack pointer to move back to the point where fun was called. So when you use r[0], the values of a are still present, but they are junk data - you can't count on them existing.
This is what happens:
int a[2]; is allocated on the stack (or similar). Suppose it gets allocated at the stack at address 0x12345678.
Various data gets pushed on the stack at this address, as the array is filled. Everything works as expected.
The address 0x12345678 pointing at the stack gets returned. (Ironically, the address itself likely gets returned on the stack.)
The memory allocated on the stack for a ceases to be valid. For now the two int values still sit at the given address in RAM, containing the values assigned to them. But the stack pointer isn't reserving those cells, nor is anything else in the program keeping track of that data. Computers don't delete data by erasing the value etc, they delete cells by forgetting that anything of use is stored at that memory location.
When the function ends, those memory cells are free to be used for the rest of the program. For example, a value returned by the function might end up there instead.
The function returned a pointer to a segment on the stack where there used to be valid data. The pointer is still 0x12345678 but at that address, anything might be stored by now. Furthermore, the contents at that address may change as different items are pushed/popped from the stack.
Printing the contents of that address will therefore give random results. Or it could print the same garbage value each time the program is executed. In fact it isn't guaranteed to print anything at all: printing the contents of an invalid memory cell is undefined behavior in C. The program could even crash when you attempt it.
r is undefined after the stack of the function int * fun(int a1,int b) is released, right after it ends, so it can be 3 or 42 or whatever value. The fact that it still contains your expected value is because it haven't been used for anything else, as a chunk of your memory is reserved for your program and your program does not use the stack further. Then after the first 3 is printed you get another value, that means that stack was used for something else, you could blame printf() since it's the only thing runing and it does a LOT of things to get that numbers into the console.
Why does it always print the same results? Because you always do the same process, there's no magic in it. But there's no guarantee that it'll be 3 since that 'memory space' is not yours and you are only 'peeking' into it.
Also, check the optimization level of your compiler fun() and main(), being as simple as they are, could be inline'd or replaced if the binary is to be optimized reducing the 'randomness' expected in your results. Although I wouldn't expect it to change much either.
You can find pretty good answers here:
can-a-local-variables-memory-be-accessed-outside-its-scope
returning-the-address-of-local-or-temporary-variable
return-reference-to-local-variable
Though the examples are for C++, underlying idea is same.
This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Returning the address of local or temporary variable
The add function is implemented wrongly. It should return a value instead of a pointer.
Why aren't any errors when ans and *ans_ptr are printed and the program even gives correct result? I guess the variable of z is already out of scope and there should be segmentation fault.
#include <stdio.h>
int * add(int x, int y) {
int z = x + y;
int *ans_ptr = &z;
return ans_ptr;
}
int main() {
int ans = *(add(1, 2));
int *ans_ptr = add(1, 2);
printf("%d\n", *ans_ptr);
printf("%d\n", ans);
return 0;
}
The reason it 'works' is because you got lucky. Returning a pointer to a local variable is Undefined Behaviour!! You should NOT do it.
int * add(int x, int y) {
int z = x + y; //z is a local variable in this stack frame
int *ans_ptr = &z; // ans_ptr points to z
return ans_ptr;
}
// at return of function, z is destroyed, so what does ans_ptr point to? No one knows. UB results
Because C has no garbage collection, when the "z" variable goes out of scope, nothing happens to the actual memory. It is simply freed for another variable to overwrite if the compiler pleases.
Since no memory is allocated between calling "add" and printing, the value is still sitting in memory, and you can access it because you have its address. You "got lucky."
However, as Tony points out, you should NEVER do this. It will work some of the time, but as soon as your program gets more complex, you will start ending up with spurious values.
No. Your question displays a fundamental lack of understanding of how the C memory model works.
The value z is allocated at an address on the stack, in the frame which is created when control enters add(). ans_ptr is then set to this memory address and returned.
The space on the stack will be overwritten by the next function that is called, but remember that C never performs memory clean up unless explicitly told to (eg via a function like calloc()).
This means that the value in the memory location &z (from the just-vacated stack frame) is still intact in the immediately following statement, ie. the printf() statement in main().
You should never ever rely on this behaviour - as soon as you add additional code into the above it will likely break.
The answer is: this program works because you are fortunate, but it will take no time to betray, as the address you return is not reserved to you anymore and any one can use it again. Its like renting the room, making a duplicate key, releasing the room, and after you have released the room at some later time you try to enter it with a duplicate key. In this case if the room is empty and not rented to someone else then you are fortunate, otherwise it can land you in police custody (something bad), and if the lock of the room was changed you get a segfault, so you can't just trust on the duplicate key which you made without acquisition of the room.
The z is a local variable allocated in stack and its scope is as long as the particular call to the function block. You return the address of such a local variable. Once you return from the function, all the addresses local to the block (allocated in the function call stack frame) might be used for another call and be overwritten, therefore you might or might not get what you expect. Which is undefined behavior, and thus such operation is incorrect.
If you are getting correct output, then you are fortunate that the old value held by that memory location is not overwritten, but your program has access to the page in which the address lies, therefore you do not get a segmentation fault error.
A quick test shows, as the OP points out, that neither GCC 4.3 nor MSVC 10 provide any warnings. But the Clang Static Analyzer does:
ccc-analyzer -c foo.c
...
ANALYZE: foo.c add
foo.c:6:5: warning: Address of stack memory associated with local
variable 'z' returned to caller
return ans_ptr;
^ ~~~~~~~