this is the source code
#include <stdio.h>
#include <stdlib.h>
int *fun();
int main()
{
int *j;
j=fun();
printf("%d\n",*j);
printf("%d\n",*j);
return 0;
}
int *fun()
{
int k=35;
return &k;
}
Output:
35
1637778
The first printf() prints 35, which is the value of k, but the second printf() in main() prints a garbage value instead of 35. Why?
The problem here is the return from fun is returning the address of a local variable. That address becomes invalid the moment the function returns. You are simply getting lucky on the first call to printf.
Even though the local is technically destroyed when fun returns, the C runtime does nothing to actively destroy it. Hence your first use of *j works because the memory for the local hasn't been overwritten yet. The implementation of printf, though, is likely overwriting it simply by using its own locals. Hence in the second use of *j you're reading whatever printf left there and not k.
In order to make this work you need to return an address that points to a value that lives longer than fun. Typically in C this is achieved with malloc
int *fun() {
int* pValue = malloc(sizeof(int));
*pValue = 23;
return pValue;
}
Because the memory returned by malloc lives until you call free, this will be valid across multiple uses of printf. The one catch is that the calling function now has to tell the program when it is done with the return value of fun. To do this, call free after the second call to printf
j=fun();
printf("%d\n",*j);
printf("%d\n",*j);
free(j);
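Another option, not part of the answer above, is to let the caller supply the storage so that nothing needs to be freed. The sketch below assumes a hypothetical helper named fun_into; the name and signature are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant of fun(): the caller owns the int, so its
   lifetime is the caller's scope and no malloc/free pair is needed. */
void fun_into(int *out)
{
    assert(out != NULL);   /* precondition: caller passes valid storage */
    *out = 35;
}
```

In main this would read `int k; fun_into(&k);`, after which k can be printed any number of times.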
The program invokes undefined behavior. You can't return a pointer to an automatic local variable, because the variable no longer exists once fun returns. The result you get may or may not be what you expected.
Never return a pointer to an automatic local variable
You are returning the address of a local variable, which is stored on the stack. When the function returns, that variable's lifetime ends, so using the pointer is undefined behaviour.
In your case the stack had not changed right after the function returned, so the first time you got the correct value. That will not always be the case.
Both are wrong, since you print a value that no longer exists: the memory to store int k in the function is ok only while the function is executing; you can't return a reference (pointer) to it, since it will no longer reference anything meaningful.
The following, instead, would work:
int *fun()
{
static int k=35;
return &k;
}
The static keyword "says" that the memory must "survive" even if the function is not running, thus the pointer you return will be valid.
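One caveat worth illustrating: a static local is a single object shared by every call, so all callers receive the same address and see each other's writes. A minimal sketch, with fun_static and same_object as hypothetical names chosen for this example:

```c
/* Same idea as the static version above; the name fun_static is
   only used here for illustration. */
int *fun_static(void)
{
    static int k = 35;   /* one object for the whole program run */
    return &k;
}

/* Returns 1 when two calls hand back the same address, which is
   always the case for a static local. */
int same_object(void)
{
    int *a = fun_static();
    int *b = fun_static();
    return a == b;
}
```

This is fine for a single value, but it means the function is not suitable when each caller needs its own independent copy.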
As others have already said, your program invokes undefined behavior.
That means anything can happen, because the behaviour is not defined.
In your case, the following happens: The address of the variable, sitting on the stack, is returned. After returning from the function, the next function call can - and will - reuse that space.
Between the function call erroneously returning this address and the call using the value, nothing happens - in your case. Be aware that even this might be different on systems where interrupts may occur, and as well on systems with signals being able to interrupt the normal program run.
The first printf() call now uses the stack for its own purpose - maybe it is even the call itself which overwrites the old value. So the second printf() call receives the value now written into that memory.
On undefined behaviour, anything may happen.
Related
#include <stdio.h>
#include <stdlib.h>
int* test(int); // the prototype
int main()
{
int var = -20;
int *ptr = NULL;
ptr = test(var);
printf("This is what we got back in main(): %d \n", *ptr);
return 0;
}
int* test(int k)
{
int y = abs(k);
int *ptr1 = &y;
printf("The value of y in test() directly is %d \n", y);
printf("The value of y in test() indirectly is %d \n", *ptr1);
return ptr1;
}
The output is:
The value of y in test() directly is 20
The value of y in test() indirectly is 20
This is what we got back in main(): 20
Short Summary:
The code above has a user-defined function that returns a pointer holding the address of a variable (i.e., y). This variable is assigned the absolute value of the integer passed to the function (i.e., k).
My book " Computer Programming for Beginners" by A.J.Gonzalez states the following:
The variable y will cease to exist after the function exits, so the address returned to the calling function will turn out too be meaningless. However the value held in its address will persist and can be retrieved through the pointer.
My question is :
How did we come to this conclusion that a value held in the address will persist and can be retrieved through the pointer from the following printf statements:
1st statement: giving value directly using y.
2nd statement: using a pointer to give the value.
3rd statement: getting the value from main.
All that is all right but then from there how does one make the conclusion that the direct use variable loses its value but the indirectly used variable (i.e., the pointer ) retains its value ?
I tried looking at past questions but could not find anything relevant.
Will be grateful for your help. Thank You.
This program is illustrating an effect of undefined behavior.
After test returns, the pointer value it returns points to a variable that no longer exists. Formally, this is undefined behavior which means that the C standard makes no guarantees what will happen when you attempt to use this pointer. In practice, the memory that was used by y was not yet overwritten by some other value so dereferencing the pointer will often yield the value that was stored there.
But again, there's no guarantee that will actually happen. As an example, if we change the main function as follows:
int main()
{
int var = -20;
int *ptr = NULL;
ptr = test(var);
printf("This is what we got back in main(): %d \n", *ptr);
printf("%d %d %d %f\n", 1, 2, 3, 4.0);
printf("This is what we got back in main(): %d \n", *ptr);
return 0;
}
My machine outputs:
The value of y in test() directly is 20
The value of y in test() indirectly is 20
This is what we got back in main(): 20
1 2 3 4.000000
This is what we got back in main(): 0
Which demonstrates that the memory previously used by y has some other value.
The moral of the story: don't attempt to use a pointer to a variable which no longer exists.
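For a value as small as an int, the simplest repair is probably to return the value itself rather than its address. A sketch, assuming a hypothetical test_by_value that would replace test() from the question:

```c
#include <stdlib.h>   /* abs */

/* Returns the absolute value by copy; nothing in the caller refers
   to y after the function has returned. */
int test_by_value(int k)
{
    int y = abs(k);
    return y;
}
```

In main, `int result = test_by_value(var);` then gives a value that stays valid no matter how many other functions are called afterwards.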
My question is : How did we come to this conclusion that a value held in the address will persist and can be retrieved through the pointer
The function returns an address where a variable used to be stored, very likely somewhere on the stack. But since that address is no longer valid, what the pointer now contains is indeterminate. Meaning there's no guarantees of anything any longer. And you can't de-reference that pointer any longer, this code here is bugged:
printf("This is what we got back in main(): %d \n", *ptr);
It could print anything or cause a program crash, it is so-called "undefined behavior" (as per C standard C17 6.2.4/2), see What is undefined behavior and how does it work?.
What conclusions can we draw from this? Only one, returning a pointer to a local scope variable from a function is bad and a bug.
Other than that, there's nothing else to reason about, no deterministic behavior, no interesting phenomenon to study or learn anything meaningful from. See Can a local variable's memory be accessed outside its scope?
The book's reasoning is somewhat correct, however it assumes an architecture that pushes parameters onto the stack, such as Intel.
What happens is that the location of y is not re-used until after its value has been retrieved through ptr of main to be pushed onto the stack for the call of printf.
However, an interrupt could re-use the location and so in general this is undefined behavior.
Note: this is used by compilers when returning a struct as function return. The compiler copies the struct variable of the called function to the struct of the caller, before anything else takes place, so the memory has not yet been re-used.
The program in your book is erroneous: the function test() returns a pointer that was initialized to point to an automatic local variable (the variable y in test() itself), which ceased to exist as soon as the program returned from test(). Using the pointer to access the value it points to is therefore undefined behaviour, which may make your program crash or show weird behaviour. You cannot predict the value you will read, because the pointer points to memory that is no longer used for its original purpose and may now be used for something else. The output you got is one possibility, but it can be completely different after changing just the compiler version, the optimization options or the machine architecture.
A book trying to illustrate this, not warning you about the undefined behaviour stated by the language, is showing, at least, very bad pedagogical manners, and using bad code to teach unusable things. I cannot guess why the output of the program can be of interest for anything. So, please, correct me.
I came across this page that illustrates common ways in which dangling pointers are created.
The code below is used to illustrate dangling pointers by returning address of a local variable:
// The pointer pointing to local variable becomes
// dangling when local variable is not static.
#include<stdio.h>
int *fun()
{
// x is local variable and goes out of scope
// after an execution of fun() is over.
int x = 5;
return &x;
}
// Driver Code
int main()
{
int *p = fun();
fflush(stdout);
// p points to something which is not valid anymore
printf("%d", *p);
return 0;
}
On running this, this is the compiler warning I get (as expected):
In function 'fun':
12:2: warning: function returns address of local variable [-Wreturn-local-addr]
return &x;
^
And this is the output I get (good so far):
32743
However, when I comment out the fflush(stdout) line, this is the output I get (with the same compiler warning):
5
What is the reason for this behaviour? How exactly is the presence/absence of the fflush command causing this behaviour change?
Returning a pointer to an object on the stack is bad, as you've mentioned. The reason you only see a problem with your fflush() call in place is that the stack is unmodified if it's not there. That is, the 5 is still in place, so the pointer dereference still gives that 5 to you. If you call a function (almost any function, probably) in between fun and printf, it will almost certainly overwrite that stack location, making the later dereference return whatever junk that function happened to leave there.
This is because calling fflush(stdout) writes onto the stack where x was.
Let me explain. The stack in assembly language (which is what all programming languages eventually run as in one way or another) is commonly used to store local variables, return addresses, and function parameters. When a function is called, it pushes these things onto the stack:
the address of where to continue executing code once the function completes.
the parameters to the function, in an order determined by the calling convention used.
the local variables that the function uses.
These things are then popped off of the stack, one by one, simply by changing where the CPU thinks the top of the stack is. This means the data still exists, but it's not guaranteed to continue to exist.
Calling another function after fun() overwrites the previous values above the top of the stack, in this case probably with data belonging to the fflush(stdout) call, and so the pointer's referenced value changes.
Without calling another function, the data usually stays there, so dereferencing the pointer happens to read the old value. The access is still undefined behavior; it just appears to work.
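The same effect can be reproduced with well-defined code by modelling the stack as an array. This is only a toy model (the names slots, top, push and pop are invented here, and a real call stack is managed by the compiler and the CPU), but it shows that popping merely moves the top marker and that the next push overwrites the old slot.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_SLOTS 16

static int slots[STACK_SLOTS];   /* toy stand-in for stack memory */
static size_t top = 0;           /* index of the next free slot */

/* Stores value in the next free slot and returns that slot's index. */
static size_t push(int value)
{
    assert(top < STACK_SLOTS);
    slots[top] = value;
    return top++;
}

/* "Frees" the most recent slot by moving top; the data is not erased. */
static void pop(void)
{
    assert(top > 0);
    --top;
}
```

After `size_t i = push(5); pop();` the element slots[i] still holds 5, and a following push(99) lands in the same slot and replaces it. Reading slots[i] is legal here only because the array itself is still alive, which is not true of a real local variable after its function has returned.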
Consider the following code.
#include<stdio.h>
int *abc(); // this function returns a pointer of type int
int main()
{
int *ptr;
ptr = abc();
printf("%d", *ptr);
return 0;
}
int *abc()
{
int i = 45500, *p;
p = &i;
return p;
}
Output:
45500
I know that, according to link, this type of behavior is undefined. But why am I getting the correct value every time I run the program?
Every time you call abc it "marks" a region at the top of the stack as the place where it will write all of its local variables. It does that by moving the pointer that indicates where the top of stack is. That region is called the stack frame. When the function returns, it indicates that it does not want to use that region anymore by moving the stack pointer to where it was originally. As a result, if you call other functions afterwards, they will reuse that region of the stack for their own purposes. But in your case, you haven't called any other functions yet. So that region of the stack is left in the same state.
All the above explain the behavior of your code. It is not necessary that all C compilers implement functions that way and therefore you should not rely on that behavior.
Well, undefined behavior is, undefined. You can never rely on UB (or on an output of a program invoking UB).
Maybe, just maybe, in your environment and for your code, the memory location that held the local variable has not been reused yet and is still accessible, but there's no guarantee that you will see the same behavior on any other platform.
I've done the following:
char * copyact(char * from)
{
return ++from;
}
int main()
{
char *string = "school";
char *copy;
copy = copyact(string);
printf("%s", copy);
}
This is printing chool, however my idea is the application must crash when we try to print it in main(). By scope rules, parameter from is a variable local to copyact function. I'm doing from = from + 1; and returning address to that place. So when we get back to main, shouldn't the memory given to that location now be invalid because all local variables must be destroyed? Why is this thing still working?
Clarification: Don't we assign a memory location for the pointer &from in which it stores the address for the string? When the function exits, don't we also destroy the address of pointer that holds the valid address? or is it because by the time return is executed, the address it points to was already sent to copy= ?
1. Undefined behavior is not a crash
First of all please remember that when you do bad things with memory (like handling a variable after it has been destroyed) the result is undefined behavior and this means something completely different from a "crash".
Undefined behavior means that anything can happen (including a crash), but "anything" may also mean "nothing". Actually the worst kinds of bug are those in which undefined behavior doesn't do anything apparent immediately, but only provokes crazy behavior in some other unrelated and innocent part of the code one million instructions later. Or only when showing your program in front of a vast audience.
So please remember that undefined behavior is not crash. It's a crash only when you're lucky.
The sooner you understand the difference between a bug and a crash, the better. Bugs are your enemies, crashes are your friends (because they reveal a bug).
2. This code is not doing anything bad
The function returns a char *, and this value (a pointer) is computed by pre-incrementing a local variable. When the function returns the local variable is destroyed, but because the function was returning its value (a pointer) then the code is perfectly safe.
It would have been unsafe instead if the function was defined as
char *& copyact(char * from)
{
return ++from;
}
because in this case the return value is a reference to a pointer to char and it would have returned a reference to from that was however going to be already destroyed by the time the caller could access the returned reference.
By the way, the g++ compiler, for example, emits a warning when you compile the modified version:
vref.cpp: In function ‘char*& copyact(char*)’:
vref.cpp:3:9: warning: reference to local variable ‘from’ returned
Note however that even in this case you cannot expect that running the code would generate a crash for sure. For example on my computer running the buggy code with the modified version just prints "school" instead of "chool".
It doesn't make much sense, but this is quite normal once you enter Undefined Behavior realm.
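To see why the original C version is fine, it helps to note what is actually returned: a copy of a pointer value that points into the string literal, which exists for the whole program run. A sketch with const added, since string literals must not be modified (the name skip_first is made up for this example):

```c
/* The parameter from is a local copy of the caller's pointer. Returning
   from + 1 returns a pointer value by copy; it still points into the
   caller's string, not at the local variable from. */
const char *skip_first(const char *from)
{
    return from + 1;
}
```

With `const char *s = "school";`, the call skip_first(s) yields s + 1, which points at "chool". Note that for an empty string the result points just past the terminator and must not be dereferenced.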
It works, because your function gets the reference to the object that already exists outside of it. The result it returns is just a value. Though judging by the code, it returns the pointer to the string shifted by one from start. I am not sure that was the idea, also it will probably crash if the original was an empty string.
char * copyact(char * from)
{
return ++from;
}
char *string = "school";
char *copy;
copy = copyact(string);
You are making from point to "school", which already exists in memory,
and you return from+1, which points to "chool".
Here is an example of something you should not return:
char * copyact(char * from)
{
char a[10]; // local array with automatic storage duration
return a; // wrong: a no longer exists once the function returns
}
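If the function really needs to produce a new string, one common pattern is to have the caller pass in the buffer. A minimal sketch, assuming a hypothetical copy_into helper; snprintf is used so that the copy is truncated rather than overflowing dst:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Copies from into the caller's buffer dst (dst_size bytes, including
   the terminating '\0') and returns dst. The buffer outlives the call
   because it belongs to the caller. */
char *copy_into(char *dst, size_t dst_size, const char *from)
{
    assert(dst != NULL && dst_size > 0 && from != NULL);
    snprintf(dst, dst_size, "%s", from);
    return dst;
}
```

A caller would write `char buf[10]; copy_into(buf, sizeof buf, "school");` and can keep using buf for as long as buf itself is in scope.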
Here is my code:
#include <stdio.h>
//returning a pointer
int *fun()
{
int i = 10;
//printf ("%u\n",i);
//printf ("%u\n",&i);
return &i;
}
int main()
{
int *p;
p = fun();
printf ("p = %u\n", p);
printf ("i = %u \n",*p);
return 0;
}
If I remove the comments in the function fun, then the second printf in main shows 10 as the output. Otherwise it shows a garbage value. Any idea why?
Without the commented lines i is never used. So depending on your optimizer, i may never even be allocated. When you add printf within the function, the variable is now used so the compiler allocates memory for i on the stack frame (which happens to have not been reclaimed at the point your second set of printfs occurs). Of course, you cannot depend on when that memory will be reclaimed- but the next function call that occurs is very likely to overwrite the fun() stack frame.
If you set your compiler to disable code optimization you may have a different result. Or you can try setting the variable to volatile which tells the compiler that it doesn't know about all uses of the variable and so allocate it even if the optimizer says it's not needed (which won't stop your variable's memory from being deallocated after you leave the function, it'll just force the allocation in the first place).
As a side note this issue can come up in embedded systems where you have a pointer to a hardware register that triggers hardware actions when set (for instance you might have hardware registers that control a robots arm motion). If you don't declare the pointer to that register volatile then the compiler may optimize away your assignment thinking it's never used.
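A sketch of that embedded case. A real register would live at a fixed address taken from the hardware manual; here an ordinary variable, fake_motor_reg, stands in for it so the example can run anywhere, and all names are hypothetical.

```c
#include <stdint.h>

/* Stand-in for a memory-mapped register (an assumption for this sketch). */
static volatile uint32_t fake_motor_reg;

/* Because reg points to volatile data, the compiler is not allowed to
   drop the first store, even though the program never reads it back. */
void pulse_motor(volatile uint32_t *reg)
{
    *reg = 1u;   /* start */
    *reg = 0u;   /* stop  */
}
```

Without volatile, an optimizer could legally keep only the final store of 0, and the hardware would never see the start command.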
When fun returns, i goes out of scope so that the address you've returned now points to something else. Try to malloc() some memory and return that instead. And don't forget to call free() when you're done with it :)
Also, the fact that the second printf in main shows 10 is pure luck: that space/address has not yet been reused for something else.
As @clcto mentions in the first comment, the variable i is local to the function and is deallocated when the function returns.
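A sketch of the malloc approach for this code, with a check for allocation failure added (fun_heap is a hypothetical name, not something from the question):

```c
#include <stdlib.h>

/* Returns a heap-allocated int holding 10, or NULL if malloc fails.
   The caller is responsible for calling free() on the result. */
int *fun_heap(void)
{
    int *p = malloc(sizeof *p);
    if (p != NULL) {
        *p = 10;
    }
    return p;
}
```

In main, check the result against NULL before dereferencing it, and call free(p) after the last printf that uses *p.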
Now why uncommenting the two print statements in function fun() make the value of p to be 10?
There can be many reasons, which may depend on the internal behavior of C and your system. But my guess is that it happens because of how printf works.
As far as I know, it maintains a buffer. It fills the buffer and prints it to the console once it is completely full. So the first two printf calls in fun() push i to the buffer, which is not yet full. So when you return from fun(), it may be that i doesn't get deallocated because the buffer is still using it (or there may be some other reason I am not sure of, but i is preserved because of the buffer).
To support my guess, I tried flushing the buffer before printing again, and now it doesn't print 10. You can see the output here, and the modified code is below:
#include <stdio.h>
//returning a pointer
int *fun()
{
int i = 10;
printf ("%u\n",i);
printf ("%u\n",&i);
return &i;
}
int main()
{
int *p;
p = fun();
fflush(stdout);
printf ("p = %u\n", p);
printf ("i = %u \n",*p);
return 0;
}
I think my guess is wrong, as @KayakDave pointed out. It just happened to fit the situation. Kindly refer to his answer for the correct explanation.