I came across this page that illustrates common ways in which dangling pointes are created.
The code below is used to illustrate dangling pointers by returning address of a local variable:
// The pointer pointing to local variable becomes
// dangling when local variable is static.
#include<stdio.h>
int *fun()
{
// x is local variable and goes out of scope
// after an execution of fun() is over.
int x = 5;
return &x;
}
// Driver Code
int main()
{
int *p = fun();
fflush(stdout);
// p points to something which is not valid anymore
printf("%d", *p);
return 0;
}
On running this, this is the compiler warning I get (as expected):
In function 'fun':
12:2: warning: function returns address of local variable [-Wreturn-local-addr]
return &x;
^
And this is the output I get (good so far):
32743
However, when I comment out the fflush(stdout) line, this is the output I get (with the same compiler warning):
5
What is the reason for this behaviour? How exactly is the presence/absence of the fflush command causing this behaviour change?
Returning a pointer to an object on the stack is bad, as you've mentioned. The reason you only see a problem with your fflush() call in place is that the stack is unmodified if it's not there. That is, the 5 is still in place, so the pointer dereference still gives that 5 to you. If you call a function (almost any function, probably) in between fun and printf, it will almost certainly overwrite that stack location, making the later dereference return whatever junk that function happened to leave there.
This is because calling fflush(stdout) writes onto the stack where x was.
Let me explain. The stack in assembly language (which is what all programming languages eventually run as in one way or another) is commonly used to store local variables, return addresses, and function parameters. When a function is called, it pushes these things onto the stack:
the address of where to continue executing code once the function completes.
the parameters to the function, in an order determined by the calling convention used.
the local variables that the function uses.
These things are then popped off of the stack, one by one, simply by changing where the CPU thinks the top of the stack is. This means the data still exists, but it's not guaranteed to continue to exist.
Calling another function after fun() overwrites the previous values above the top of the stack, in this case with the value of stdout, and so the pointer's referenced value changes.
Without calling another function, the data stays there and is still valid when the pointer is dereferenced.
Related
this is the source code
#include <stdio.h>
#include <stdlib.h>
int *fun();
int main()
{
int *j;
j=fun();
printf("%d\n",*j);
printf("%d\n",*j);
return 0;
}
int *fun()
{
int k=35;
return &k;
}
output-
35
1637778
the first printf() prints 35 which is the value of k but
In the main() the second printf prints a garbage value rather than printing 35.why?
The problem here is the return from fun is returning the address of a local variable. That address becomes invalid the moment the function returns. You are simply getting lucky on the first call to printf.
Even though the local is technically destroyed when fun returns the C runtime does nothing to actively destroy it. Hence your first use of *j is working because the memory for the local hasn't been written over yet. The implementation of printf though is likely over writing this simply by using its own locals in the method. Hence in the second use of *j you're referring to whatever local printf used and not k.
In order to make this work you need to return an address that points to a value that lives longer than fun. Typically in C this is achieved with malloc
int *fun() {
int* pValue = malloc(sizeof(int));
*pValue = 23;
return pValue;
}
Because the return of malloc lives until you call free this will be valid across multiple uses of printf. The one catch is the calling function now has to tell the program when it is done with the retun of fun. To do this call free after the second call to printf
j=fun();
printf("%d\n",*j);
printf("%d\n",*j);
free(j);
Program invokes undefined behavior. You can't return a pointer to an automatic local variable. The variable no longer exist once fun returns. In this case the result you get, may be expected or unexpected.
Never return a pointer to an automatic local variable
You are returning local value it is stored in stack. When you move out of function it gets erased. You getting undefined behaviour.
In your case stack not changed after function returning, so first time you getting correct value. This is not same in all time.
Both are wrong, since you print a value that no longer exists: the memory to store int k in the function is ok only while the function is executing; you can't return a reference (pointer) to it, since it will no longer reference anything meaningful.
The following, instead, would work:
int *fun()
{
static int k=35;
return &k;
}
The static keyword "says" that the memory must "survive" even if the function is not running, thus the pointer you return will be valid.
As others already told, your program invokes undefined behavior.
That means, anything can happen where the behaviour is not defined.
In your case, the following happens: The address of the variable, sitting on the stack, is returned. After returning from the function, the next function call can - and will - reuse that space.
Between the function call erroneously returning this address and the call using the value, nothing happens - in your case. Be aware that even this might be different on systems where interrupts may occur, and as well on systems with signals being able to interrupt the normal program run.
The first printf() call now uses the stack for its own purpose - maybe it is even the call itself which overwrites the old value. So the second printf() call receives the value now written into that memory.
On undefined behaviour, anything may happen.
#include<stdio.h>
int *fun();
int main()
{
int *ptr;
ptr=fun();
printf("%d",*ptr);
printf("%d",*ptr);
}
int * fun()
{
int k=4;//If auto then cannot print it two times.....stack will be changed
return(&k);
}
O/P: 4
-2
Calling printf() for the first time prints the correct value.
Calling any function (even printf( ) ) immediately after the call to fun( ). This time printf( ) prints a garbage value. Why does this happen?Why do we not get a garbage value during the first print statement itself????
This is not behavior you can rely on; it may and likely will differ on different systems, even different versions of the compiler or different compiler switches.
Given that, what is likely happening is this: fun returns a pointer to where it stored k. That part of the stack is no longer reliable, because the function that allocated it has exited. Nonetheless, nobody has written over it yet, so the 4 is still where it was written. Then main prepares to call printf. To do so, it gets the first argument, *ptr. To do this, it loads from the place ptr points, which is the (former) address of k, so the load gets the 4 that is there. This 4 is stored in a register or stack location to be passed to printf. Then the address of the format string, "%d", is stored to be passed to printf. Then printf is called. At this point, printf uses a great deal of stack and writes new data where k used to be. However, the 4 that was passed as an argument is in a safe place, where arguments to printf should be, so printf prints it. Then printf returns. Then the main routine prepares to call printf again. This time, when it loads from where ptr points, the 4 is no longer there; it is some value that was written during the first call to printf. So that value is what is passed to printf and is what is printed.
Never write code that uses this behavior. It is not reliable, and it is not proper code.
Why does it surprise you? The behavior is undefined, but there's nothing unusual in observing what you observed.
All variables live somewhere in memory. When a variable gets formally destroyed (like local variables when function exits) the memory it used to occupy still exists and, most likely, still holds the last value that was written to it. That memory in now officially free, but it will continue to hold that last value until some other code reuses that memory for other purposes and overwrites it.
This is what you observe in your experiment. Even though the variable kno longer exists, pointer ptr still points to its former location. And that former location still happens to hold the last value of k, which is 4.
The very first printf "successfully" receives a copy of that value for printing. And that very first printf is actually the one that reuses the old memory location and overwrites the former value of k. All further attempts to dereference ptr will show that 4 is no longer there, which is why your second printf prints something else.
Variable k is local to fun(), means it will be destroyed when the function returns. This is a very bad coding technique and will always lead to problem.
And the reason why the first printf returns the correct value:
First of all it might or might not return the value. The thing is suppose k is written somewhere on stack memory. First time when the function returns printf might get the correct value because that part of memory might exist for a while. But this is not guaranteed.
why does printf prints 7 although the variable a was local to the function fun() and should no longer exist once the control returns from function fun().
Here is the c code
#include<stdio.h>
main()
{
int *fun();
int *c=fun();
printf("%d",*c);
getch();
}
int *fun()
{
int a=7;
return(&a);
}
output : 7
This is because even if the variable does not exist anymore, the memory location where it was has not yet been used for something else. Hence, the pointer still points to a memory location where the bits contains an int with value 7.
But this is definitely undefined behavior. You should not rely on it.
There is a difference between the language idioms and the physical operation of the hardware. In "C" words, yes, your variable should not be accessed anymore, but physically the variable a has been allocated on the stack of your program, which is not erased each time a function returns (it would take too much time), thus you can still read it.
Anyway, this is not recommended because other function calls may erase this data.
once the fun() return, the frame pointer has been set back to point the main() frame again. the pointer c pointed to some address in the memory, since the fun() has already returned, we don't know what's in the adress,but if nothing else has been written to the adress, it can still be the previous variable a. the C standard simply move the frame pointer when a function returns.
I think its because you are printing *c , which displays the value which is stored on the location i.e &a , just try to print c then you will get the address of 7 .
It is because the address is already passed to the variable c and the memory address can be read outside the function as well
I recently read about scope rules in C. It says that a local or auto variable is available only inside the block of the function in which it is declared. Once outside the function it no longer is visible. Also that its lifetime is only till the end of the final closing braces of the function body.
Now here is the problem. What happens when the address of a local variable is returned from the function to the calling function ?
For example :-
main()
{
int *p=fun();
}
int * fun()
{
int localvar=0;
return (&localvar);
}
once the control returns back from the function fun, the variable localvar is no longer alive. So how will main be able to access the contents at this address ?
The address can be returned, but the value stored at the address cannot reliably be read. Indeed, it is not even clear that you can safely assign it, though the chances are that on most machines there wouldn't be a problem with that.
You can often read the address, but the behaviour is undefined (read 'bad: to be avoided at all costs!'). In particular, the address may be used for other variables in other functions, so if you access it after calling other functions, you are definitely unlikely to see the last value stored in the variable by the function that returned the pointer to it.
Why then is a function returning a pointer ever required?
One reason is often 'dynamic memory'. The malloc() family of functions return a pointer to new (non-stack) memory.
Another reason is 'found something at this location in a value passed to me'. Consider strchr() or strstr().
Another reason is 'returning pointer to a static object, either hidden in the function or in the file containing the source for the function'. Consider asctime() et al (and worry about thread-safety).
There are probably a few others, but those are probably the most common.
Note that none of these return a pointer to a local (stack-based) variable.
The variable is gone, but the memory location still exists and might even still contain the value you set. It will however probably get overwritten pretty fast as more functions are called and the memory address gets reused for another function's local variables. You can learn more by reading about the Call Stack, which is where local variables of functions are stored.
Referencing that location in memory after the function has returned is dangerous. Of course the location still exists (and it may still contain your value), but you no longer have any claim to that memory region and it will likely be overwritten with new data as the program continues and new local variables are allocated on the stack.
gcc gives me the following warning:
t.c: In function ‘test’:
t.c:3:2: warning: function returns address of local variable [enabled by default]
Consider this test program:
int * test(int p) {
int loc = p;
return &loc;
}
int main(void) {
int *c = test(4);
test(5);
printf("%d\n", *c);
return 0;
}
What do you think this prints?
#include<stdio.h>
int * fun(int a1,int b)
{
int a[2];
a[0]=a1;
a[1]=b;
return a;
}
int main()
{
int *r=fun(3,5);
printf("%d\n",*r);
printf("%d\n",*r);
}
Output after running the code:
3
-1073855580
I understand that a[2] is local to fun() but why value is getting changed of same pointer?
The variable a is indeed local to fun. When you return from that function, the stack is popped. The memory itself remains unchanged (for the moment). When you dereference r the first time, the memory is what you'd expect it to be. And since the dereference happens before the call to printf, nothing bad happens. When printf executes, it modifies the stack and the value is wiped out. The second time through you're seeing whatever value happened to be put there by printf the first time through.
The order of events for a "normal" calling convention (I know, I know -- no such thing):
Dereference r (the first time through, this is what it should be)
Push value onto stack (notice this is making a copy of the value) (may wipe out a)
Push other parameters on to stack (order is usually right to left, IIRC) (may wipe out a)
Allocate room for return value on stack (may wipe out a)
Call printf
Push local printf variables onto stack (may wipe out a)
Do your thang
Return from function
If you change int a[2]; to static int a[2]; this will alleviate the problem.
Because r points to a location on the stack that is likely to be overwritten by a function call.
In this case, it's the first call to printf itself which is changing that location.
In detail, the return from fun has that particular location being preserved simply because nothing has overwritten it yet.
The *r is then evaluated (as 3) and passed to printf to be printed. The actual call to printf changes the contents of that location (since it uses the memory for its own stack frame), but the value has already been extracted at that point so it's safe.
On the subsequent call, *r has the different value, changed by the first call. That's why it's different in this case.
Of course, this is just the likely explanation. In reality, anything could be happening since what you've coded up there is undefined behaviour. Once you do that, all bets are off.
As you've mentioned, a[2] is local to fun(); meaning it is created on the stack right before the code within fun() starts executing. When fun exits the stack is popped, meaning it is unwound so that the stack pointer is pointing to where it was before fun started executing.
The compiler is now free to stick whatever it wants into those locations that were unwound. So, it is possible that the first location of a was skipped for a variety of reasons. Maybe it now represents an uninitialized variable. Maybe it was for memory alignment of another variable. Simple answer is, by returning a pointer to a local variable from a function, and then de-referencing that pointer, you're invoking undefined behavior and anything can happen, including demons flying out of your nose.
When you compile you code with the following command:
$ gcc -Wall yourProgram.c
It will yield a warning, which says.
In function ‘fun’:
warning: function returns address of local variable
When r is dereferenced in first printf statement, it's okay as the memory is preserved. However, the second printf statement overwrites the stack and so we get an undesired result.
Because printf is using the stack location and changes it after printing the first value.