why does printf prints 7 although the variable a was local to the function fun() and should no longer exist once the control returns from function fun().
Here is the c code
#include<stdio.h>
main()
{
int *fun();
int *c=fun();
printf("%d",*c);
getch();
}
int *fun()
{
int a=7;
return(&a);
}
output : 7
This is because even if the variable does not exist anymore, the memory location where it was has not yet been used for something else. Hence, the pointer still points to a memory location where the bits contains an int with value 7.
But this is definitely undefined behavior. You should not rely on it.
There is a difference between the language idioms and the physical operation of the hardware. In "C" words, yes, your variable should not be accessed anymore, but physically the variable a has been allocated on the stack of your program, which is not erased each time a function returns (it would take too much time), thus you can still read it.
Anyway, this is not recommended because other function calls may erase this data.
once the fun() return, the frame pointer has been set back to point the main() frame again. the pointer c pointed to some address in the memory, since the fun() has already returned, we don't know what's in the adress,but if nothing else has been written to the adress, it can still be the previous variable a. the C standard simply move the frame pointer when a function returns.
I think its because you are printing *c , which displays the value which is stored on the location i.e &a , just try to print c then you will get the address of 7 .
It is because the address is already passed to the variable c and the memory address can be read outside the function as well
Related
I came across this page that illustrates common ways in which dangling pointes are created.
The code below is used to illustrate dangling pointers by returning address of a local variable:
// The pointer pointing to local variable becomes
// dangling when local variable is static.
#include<stdio.h>
int *fun()
{
// x is local variable and goes out of scope
// after an execution of fun() is over.
int x = 5;
return &x;
}
// Driver Code
int main()
{
int *p = fun();
fflush(stdout);
// p points to something which is not valid anymore
printf("%d", *p);
return 0;
}
On running this, this is the compiler warning I get (as expected):
In function 'fun':
12:2: warning: function returns address of local variable [-Wreturn-local-addr]
return &x;
^
And this is the output I get (good so far):
32743
However, when I comment out the fflush(stdout) line, this is the output I get (with the same compiler warning):
5
What is the reason for this behaviour? How exactly is the presence/absence of the fflush command causing this behaviour change?
Returning a pointer to an object on the stack is bad, as you've mentioned. The reason you only see a problem with your fflush() call in place is that the stack is unmodified if it's not there. That is, the 5 is still in place, so the pointer dereference still gives that 5 to you. If you call a function (almost any function, probably) in between fun and printf, it will almost certainly overwrite that stack location, making the later dereference return whatever junk that function happened to leave there.
This is because calling fflush(stdout) writes onto the stack where x was.
Let me explain. The stack in assembly language (which is what all programming languages eventually run as in one way or another) is commonly used to store local variables, return addresses, and function parameters. When a function is called, it pushes these things onto the stack:
the address of where to continue executing code once the function completes.
the parameters to the function, in an order determined by the calling convention used.
the local variables that the function uses.
These things are then popped off of the stack, one by one, simply by changing where the CPU thinks the top of the stack is. This means the data still exists, but it's not guaranteed to continue to exist.
Calling another function after fun() overwrites the previous values above the top of the stack, in this case with the value of stdout, and so the pointer's referenced value changes.
Without calling another function, the data stays there and is still valid when the pointer is dereferenced.
Like every function is put on a stack frame for its execution and it is flushed after its completion. So, any local variable wont be available to other functions. But then how are we able to return a local variable to the caller?
int pickMin( int x, int y, int z ) {
int min = x ;
if ( y < min )
min = y ;
if ( z < min )
min = z ;
return min ; }
The above code works fine. However the in the below code, compiler does give a warning message- "warning: function returns address of local variable [-Wreturn-local-addr] return a; " but it prints a garbage value at the end, which I think is fine because the variable has already been flushed. But why didn't that happen in the ABOVE program?! I mean, it should also have returned me a garbage value.Moreover, I know that the problem in the below code can be solved using malloc, and then returning that value. :)
int *returnarray(){
int a[10]; int i;
for(i=0;i<10;++i){
a[i] = i;
}return a;}
C passes everything around by value. In your first snippet, return min returns an int variable. Its value is returned. The second snippet consists of return and an array name, which decays into a pointer.
The value that is returned, is the memory address of a local variable. The function where this variable existed has returned, though, and accessing the memory that this function used then invokes undefined behaviour.
The way to handle this kind of situation (ie: needing to return an array) is either by passing the target array to the function as an argument, or by allocating the memory using malloc and returning that pointer.
Heap memory is a tad slower, more error prone, and requires you to look after it though. Still, here's an example of both approaches
create_fill allocates, assigns and returns a pointer to the heap memory, fill_array doesn't return anything, but expects you to pass an array (which decays into a pointer), and a max length to fill. The advantage being: stack memory doesn't require as much care, and will outperform the heap.
The return statement des exactly that: it copies the value of a variable, and leaves it on top of the stack so the calling function can use it. Now, in C this works for simple values, not for arrays because your "array variable" a is actually the address of its first value only.
First, read carefully call stack wikipage. There are nice pictures on it. See also the x86 calling conventions wikipage.
Then, the result (of some C function) often goes thru a register when returned (or, for large struct-s, in a stack space allocated by the caller).
Details are ABI specific. For Linux on x86-64 (in 64 bits), the x86-64 ABI mentions the %rax register to return a result (in common cases, but when the result is a large struct the caller passes an address for it).
BTW, in principle, I believe that nothing in the C99 standard requires a stack, but I know no C implementations without a call stack (generally the processor stack, i.e. thru the stack register).
In first case you return value of a variable.
While, in second case you return address of a local variable which as you correctly said is not available for other functions. In C, the name of an array is the base address of that array. Hence in second case, "base address of array" is copied to any variable/pointer, which is assigned the return value of that function
this is the source code
#include <stdio.h>
#include <stdlib.h>
int *fun();
int main()
{
int *j;
j=fun();
printf("%d\n",*j);
printf("%d\n",*j);
return 0;
}
int *fun()
{
int k=35;
return &k;
}
output-
35
1637778
the first printf() prints 35 which is the value of k but
In the main() the second printf prints a garbage value rather than printing 35.why?
The problem here is the return from fun is returning the address of a local variable. That address becomes invalid the moment the function returns. You are simply getting lucky on the first call to printf.
Even though the local is technically destroyed when fun returns the C runtime does nothing to actively destroy it. Hence your first use of *j is working because the memory for the local hasn't been written over yet. The implementation of printf though is likely over writing this simply by using its own locals in the method. Hence in the second use of *j you're referring to whatever local printf used and not k.
In order to make this work you need to return an address that points to a value that lives longer than fun. Typically in C this is achieved with malloc
int *fun() {
int* pValue = malloc(sizeof(int));
*pValue = 23;
return pValue;
}
Because the return of malloc lives until you call free this will be valid across multiple uses of printf. The one catch is the calling function now has to tell the program when it is done with the retun of fun. To do this call free after the second call to printf
j=fun();
printf("%d\n",*j);
printf("%d\n",*j);
free(j);
Program invokes undefined behavior. You can't return a pointer to an automatic local variable. The variable no longer exist once fun returns. In this case the result you get, may be expected or unexpected.
Never return a pointer to an automatic local variable
You are returning local value it is stored in stack. When you move out of function it gets erased. You getting undefined behaviour.
In your case stack not changed after function returning, so first time you getting correct value. This is not same in all time.
Both are wrong, since you print a value that no longer exists: the memory to store int k in the function is ok only while the function is executing; you can't return a reference (pointer) to it, since it will no longer reference anything meaningful.
The following, instead, would work:
int *fun()
{
static int k=35;
return &k;
}
The static keyword "says" that the memory must "survive" even if the function is not running, thus the pointer you return will be valid.
As others already told, your program invokes undefined behavior.
That means, anything can happen where the behaviour is not defined.
In your case, the following happens: The address of the variable, sitting on the stack, is returned. After returning from the function, the next function call can - and will - reuse that space.
Between the function call erroneously returning this address and the call using the value, nothing happens - in your case. Be aware that even this might be different on systems where interrupts may occur, and as well on systems with signals being able to interrupt the normal program run.
The first printf() call now uses the stack for its own purpose - maybe it is even the call itself which overwrites the old value. So the second printf() call receives the value now written into that memory.
On undefined behaviour, anything may happen.
Here is my code
#include<stdio.h>
int * fun(int a1,int b)
{
int a[2];
a[0]=a1;
a[1]=b;
//int c=5;
printf("%x\n",&a[0]);
return a;
}
int main()
{
int *r=fun(3,5);
printf("%d\n",r[0]);
printf("%d\n",r[0]);
}
I am running codeblocks on Windows 7
Every time I run the loop I get the outputs as
22fee8
3
2293700
Here is the part I do not understand :
r expects a pointer to a part of memory which is interpreted as a sequence of boxes (each box of 4 byte width - >Integers ) on invoking fun function
What should happen is printf of function will print the address of a or address of a[0]:
Seconded
NOW THE QUESTION IS :
each time I run the program I get the same address?
And the array a should be destroyed at the end of Function fun only pointer must remain after function call
Then why on earth does the line r[0] must print 3?
r is pointing to something that doesn't exist anymore. You are returning a pointer to something on the stack. That stack will rewind when fun() ends. It can point to anything after that but nothing has overwritten it because another function is never called.
Nothing forces r[0] to be 3 - it's just a result of going for the simplest acceptable behaviour.
Basically, you're right that a must be destroyed at the end of fun. All this means is that the returned pointer (r in your case) is completely unreliable. That is, even though r[0] == 3 for you on your particular machine with your particular compiler, there's no guarantee that this will always hold on every machine.
To understand why it is so consistent for you, think about this: what does is mean for a to be destroyed? Only that you can't use it in any reliable way. The simplest way of satisfying this simple requirement is for the stack pointer to move back to the point where fun was called. So when you use r[0], the values of a are still present, but they are junk data - you can't count on them existing.
This is what happens:
int a[2]; is allocated on the stack (or similar). Suppose it gets allocated at the stack at address 0x12345678.
Various data gets pushed on the stack at this address, as the array is filled. Everything works as expected.
The address 0x12345678 pointing at the stack gets returned. (Ironically, the address itself likely gets returned on the stack.)
The memory allocated on the stack for a ceases to be valid. For now the two int values still sit at the given address in RAM, containing the values assigned to them. But the stack pointer isn't reserving those cells, nor is anything else in the program keeping track of that data. Computers don't delete data by erasing the value etc, they delete cells by forgetting that anything of use is stored at that memory location.
When the function ends, those memory cells are free to be used for the rest of the program. For example, a value returned by the function might end up there instead.
The function returned a pointer to a segment on the stack where there used to be valid data. The pointer is still 0x12345678 but at that address, anything might be stored by now. Furthermore, the contents at that address may change as different items are pushed/popped from the stack.
Printing the contents of that address will therefore give random results. Or it could print the same garbage value each time the program is executed. In fact it isn't guaranteed to print anything at all: printing the contents of an invalid memory cell is undefined behavior in C. The program could even crash when you attempt it.
r is undefined after the stack of the function int * fun(int a1,int b) is released, right after it ends, so it can be 3 or 42 or whatever value. The fact that it still contains your expected value is because it haven't been used for anything else, as a chunk of your memory is reserved for your program and your program does not use the stack further. Then after the first 3 is printed you get another value, that means that stack was used for something else, you could blame printf() since it's the only thing runing and it does a LOT of things to get that numbers into the console.
Why does it always print the same results? Because you always do the same process, there's no magic in it. But there's no guarantee that it'll be 3 since that 'memory space' is not yours and you are only 'peeking' into it.
Also, check the optimization level of your compiler fun() and main(), being as simple as they are, could be inline'd or replaced if the binary is to be optimized reducing the 'randomness' expected in your results. Although I wouldn't expect it to change much either.
You can find pretty good answers here:
can-a-local-variables-memory-be-accessed-outside-its-scope
returning-the-address-of-local-or-temporary-variable
return-reference-to-local-variable
Though the examples are for C++, underlying idea is same.
#include<stdio.h>
int * fun(int a1,int b)
{
int a[2];
a[0]=a1;
a[1]=b;
return a;
}
int main()
{
int *r=fun(3,5);
printf("%d\n",*r);
printf("%d\n",*r);
}
Output after running the code:
3
-1073855580
I understand that a[2] is local to fun() but why value is getting changed of same pointer?
The variable a is indeed local to fun. When you return from that function, the stack is popped. The memory itself remains unchanged (for the moment). When you dereference r the first time, the memory is what you'd expect it to be. And since the dereference happens before the call to printf, nothing bad happens. When printf executes, it modifies the stack and the value is wiped out. The second time through you're seeing whatever value happened to be put there by printf the first time through.
The order of events for a "normal" calling convention (I know, I know -- no such thing):
Dereference r (the first time through, this is what it should be)
Push value onto stack (notice this is making a copy of the value) (may wipe out a)
Push other parameters on to stack (order is usually right to left, IIRC) (may wipe out a)
Allocate room for return value on stack (may wipe out a)
Call printf
Push local printf variables onto stack (may wipe out a)
Do your thang
Return from function
If you change int a[2]; to static int a[2]; this will alleviate the problem.
Because r points to a location on the stack that is likely to be overwritten by a function call.
In this case, it's the first call to printf itself which is changing that location.
In detail, the return from fun has that particular location being preserved simply because nothing has overwritten it yet.
The *r is then evaluated (as 3) and passed to printf to be printed. The actual call to printf changes the contents of that location (since it uses the memory for its own stack frame), but the value has already been extracted at that point so it's safe.
On the subsequent call, *r has the different value, changed by the first call. That's why it's different in this case.
Of course, this is just the likely explanation. In reality, anything could be happening since what you've coded up there is undefined behaviour. Once you do that, all bets are off.
As you've mentioned, a[2] is local to fun(); meaning it is created on the stack right before the code within fun() starts executing. When fun exits the stack is popped, meaning it is unwound so that the stack pointer is pointing to where it was before fun started executing.
The compiler is now free to stick whatever it wants into those locations that were unwound. So, it is possible that the first location of a was skipped for a variety of reasons. Maybe it now represents an uninitialized variable. Maybe it was for memory alignment of another variable. Simple answer is, by returning a pointer to a local variable from a function, and then de-referencing that pointer, you're invoking undefined behavior and anything can happen, including demons flying out of your nose.
When you compile you code with the following command:
$ gcc -Wall yourProgram.c
It will yield a warning, which says.
In function ‘fun’:
warning: function returns address of local variable
When r is dereferenced in first printf statement, it's okay as the memory is preserved. However, the second printf statement overwrites the stack and so we get an undesired result.
Because printf is using the stack location and changes it after printing the first value.