Why y isn't dereferenced? - c

For the following code, this is how I understand it:
A reference to the pointer x is passed to function f,
and val gets the address of y, which is a local variable.
So why is x still OK after exiting function f? y should have been dereferenced.
*x is equal to 5, and both printf calls print the same address.
void f (int ** val)
{
int y = 5 ;
*val = & y;
printf("%p\n", (void *)&y);
}
int _tmain(int argc, _TCHAR* argv[])
{
int * x ;
f(&x );
if ( *x == 5 )
printf("%p", (void *)x);
}

It is Undefined Behaviour to access memory your program does not own.
The memory space occupied by y inside the function does not belong to your program once the function finishes, and yet you access it.
Anything could happen.
The worst thing to happen is for the program to behave as you expect.
When this happens, you believe it is ok to do what you did. IT IS NOT OK. Undefined Behaviour is bad.
Also it's not guaranteed that the same undefined behaviour happens on different runs of the program either. It can work as you expect for a while and crash when you demo it to the client (or your boss).
(Some good manifestations of UB are a crash, or lemon juice starting to ooze out of the USB port)
Anything can happen.

x is pointing to a local variable inside f which is no longer valid by the time f returns.
EDIT: Your post doesn't make it clear what you expect to happen, but as other answers describe much more clearly, x is pointing to memory which you do not own, and reading *x is undefined behavior, so all bets are off. If you read *x and it happens to be 5, that is probably because the value 5 is still on the stack. Try inserting some calls to printf immediately after the call to f, and you will probably get a different result.

The y variable sits on the stack, so f stores a stack address in x. The address is still mapped, but its content is no longer defined. If you add another function call (like printf) between the call to f and the check of *x == 5, you'll probably get a different result (since the stack was changed).

This is a classic...
The variable y is only alive as long as the function f is executed. Once it returns, the memory space occupied by y on the stack can be used for anything else.

y lives only within f(int**val), since it is declared in that scope.
Referring to its address outside of f() has no clear definition (or, as we love to say: Undefined Behaviour).

Because y might not be valid, but its value is still in memory.
It'll get nasty if you call some other function or do something else which writes over it.

As nearly everyone has already said, it's undefined behavior. The reason you are printing the correct value (5) is because your program hasn't reused that memory, yet. Wait until your program puts something else at that address, then you will see incorrect results.

Related

Cant understand how the program given in my book leads to the conclusion that a pointer can retrieve the value from the address

#include <stdio.h>
#include <stdlib.h>
int* test(int); // the prototype
int main()
{
int var = -20;
int *ptr = NULL;
ptr = test(var);
printf("This is what we got back in main(): %d \n", *ptr);
return 0;
}
int* test(int k)
{
int y = abs(k);
int *ptr1 = &y;
printf("The value of y in test() directly is %d \n", y);
printf("The value of y in test() indirectly is %d \n", *ptr1);
return ptr1;
}
The output is:
The value of y in test() directly is 20
The value of y in test() indirectly is 20
This is what we got back in main(): 20
Short Summary:
The above code has a user-defined function which returns a pointer that holds the address of a variable (i.e., y). This variable is assigned the absolute value of the integer passed to the function (i.e., k).
My book " Computer Programming for Beginners" by A.J.Gonzalez states the following:
The variable y will cease to exist after the function exits, so the address returned to the calling function will turn out to be meaningless. However the value held in its address will persist and can be retrieved through the pointer.
My question is :
How did we come to this conclusion that a value held in the address will persist and can be retrieved through the pointer from the following printf statements:
1st statement: giving value directly using y.
2nd statement: using a pointer to give the value.
3rd statement: getting the value from main.
All that is all right but then from there how does one make the conclusion that the direct use variable loses its value but the indirectly used variable (i.e., the pointer ) retains its value ?
I tried looking at past questions but could not find anything relevant.
Will be grateful for your help. Thank You.
This program is illustrating an effect of undefined behavior.
After test returns, the pointer value it returns points to a variable that no longer exists. Formally, this is undefined behavior which means that the C standard makes no guarantees what will happen when you attempt to use this pointer. In practice, the memory that was used by y was not yet overwritten by some other value so dereferencing the pointer will often yield the value that was stored there.
But again, there's no guarantee that will actually happen. As an example, if we change the main function as follows:
int main()
{
int var = -20;
int *ptr = NULL;
ptr = test(var);
printf("This is what we got back in main(): %d \n", *ptr);
printf("%d %d %d %f\n", 1, 2, 3, 4.0);
printf("This is what we got back in main(): %d \n", *ptr);
return 0;
}
My machine outputs:
The value of y in test() directly is 20
The value of y in test() indirectly is 20
This is what we got back in main(): 20
1 2 3 4.000000
This is what we got back in main(): 0
Which demonstrates that the memory previously used by y has some other value.
The moral of the story: don't attempt to use a pointer to a variable which no longer exists.
My question is : How did we come to this conclusion that a value held in the address will persist and can be retrieved through the pointer
The function returns an address where a variable used to be stored, very likely somewhere on the stack. But since that variable no longer exists, the value of the pointer is now indeterminate, meaning there are no longer any guarantees about anything. You can't dereference that pointer any more; this line is a bug:
printf("This is what we got back in main(): %d \n", *ptr);
It could print anything or cause a program crash, it is so-called "undefined behavior" (as per C standard C17 6.2.4/2), see What is undefined behavior and how does it work?.
What conclusions can we draw from this? Only one, returning a pointer to a local scope variable from a function is bad and a bug.
Other than that, there's nothing else to reason about, no deterministic behavior, no interesting phenomenon to study or learn anything meaningful from. See Can a local variable's memory be accessed outside its scope?
The book's reasoning is somewhat correct; however, it assumes an architecture that pushes parameters onto the stack, such as 32-bit x86 (Intel).
What happens is that the location of y is not re-used until after its value has been retrieved through ptr of main to be pushed onto the stack for the call of printf.
However, an interrupt could re-use the location and so in general this is undefined behavior.
Note: this is used by compilers when returning a struct as function return. The compiler copies the struct variable of the called function to the struct of the caller, before anything else takes place, so the memory has not yet been re-used.
The program in your book is erroneous: the function test() returns a pointer to an automatic local variable (the variable y in test() itself), which ceases to exist as soon as the program returns from test(). Using the pointer to access the value it points to is therefore undefined behaviour, and the program may crash or show weird behaviour (as you show in your post). You cannot predict the value you will read, because the pointer refers to memory that is no longer used for its original purpose and may now be used for something else. The output you got is one possibility, but it can be completely different after just changing the compiler version, the optimization options or the machine architecture.
A book trying to illustrate this, not warning you about the undefined behaviour stated by the language, is showing, at least, very bad pedagogical manners, and using bad code to teach unusable things. I cannot guess why the output of the program can be of interest for anything. So, please, correct me.

Why does gcc give me this result?

When I run this code gcc gives me the output 10.
Can someone explain to me why it gives me 10? :)
#include <stdio.h>
int f(int x) {
int y;
y = 2*x;
}
int g() {
int z;
return z;
}
int main() {
int x=5;
f(x);
printf("%d\n",g());
}
This is undefined behavior: you are reading a variable that was never given a value. Likely it gives 10 because the compiler has used the same memory location as for the variable in f(), but there is no guarantee of that. It should not be depended on, and is nothing more than a curiosity.
There's nothing to explain. Your code has two separate, unrelated problems: first, f doesn't return anything despite being declared as returning int (in C this becomes undefined behaviour only if the caller uses the result, which main does not, but it is still a bug); second, g returns an uninitialized value, which is undefined behaviour.
Practically, the way the functions will be put on the call stack will have caused the local y (which eventually has the value 10) to be in the same place as the return value of g() in the printf call, so you happen to see the value 10. But that's more or less a matter of luck.
Here:
int g() {
int z;
return z;
}
This reads:
int g():
reserve memory for an integer, call it z.
return whatever is in that reserved memory.
You never used that reserved memory for your integer. Its value is whatever was at that address before you chose to use it (or not use it, rather). That value could be anything.
You do the same in your other function. What you are doing is reading uninitialized memory. You can google that up for further information. See also the "stack" and the "heap", dynamic memory, and other related topics.
g returns an uninitialized variable from the stack; in your example that location was last set by the f function, giving you your answer of x*2 = 10.
Because you're not initializing z, and it's using the same location on the stack as y. Since you're not initializing it the old value is still there.
This is a perfect example of why people fear optimizations, and of what happens when they brag to their bosses about finding compiler bugs. As others have alluded to, this code will trigger warnings about using an uninitialized variable in g(). With your compiler settings, it is using the old value left on the stack by the call to f(5). Different compiler optimization settings will likely change how variables end up on the stack, and you'll end up getting different results when you make changes which appear unrelated. This is undefined behavior and there are no guarantees about what value will result; however, it is usually easy to explain by understanding the call order and how the compiler sets up the stack. If there are warnings when you're troubleshooting weird behavior like this, fix the warnings first, then start asking questions about why.

Memory Allocation: Why this C program works? [duplicate]

Possible Duplicate:
Returning the address of local or temporary variable
The add function is implemented wrongly. It should return a value instead of a pointer.
Why aren't there any errors when ans and *ans_ptr are printed, and why does the program even give the correct result? I guess the variable z is already out of scope and there should be a segmentation fault.
#include <stdio.h>
int * add(int x, int y) {
int z = x + y;
int *ans_ptr = &z;
return ans_ptr;
}
int main() {
int ans = *(add(1, 2));
int *ans_ptr = add(1, 2);
printf("%d\n", *ans_ptr);
printf("%d\n", ans);
return 0;
}
The reason it 'works' is because you got lucky. Returning a pointer to a local variable is Undefined Behaviour!! You should NOT do it.
int * add(int x, int y) {
int z = x + y; //z is a local variable in this stack frame
int *ans_ptr = &z; // ans_ptr points to z
return ans_ptr;
}
// at return of function, z is destroyed, so what does ans_ptr point to? No one knows. UB results
Because C has no garbage collection, when the "z" variable goes out of scope, nothing happens to the actual memory. It is simply freed for another variable to overwrite if the compiler pleases.
Since no memory is allocated between calling "add" and printing, the value is still sitting in memory, and you can access it because you have its address. You "got lucky."
However, as Tony points out, you should NEVER do this. It will work some of the time, but as soon as your program gets more complex, you will start ending up with spurious values.
No. Your question displays a fundamental lack of understanding of how the C memory model works.
The value z is allocated at an address on the stack, in the frame which is created when control enters add(). ans_ptr is then set to this memory address and returned.
The space on the stack will be overwritten by the next function that is called, but remember that C never clears memory unless explicitly told to (e.g. with memset(), or with calloc() for a fresh allocation).
This means that the value in the memory location &z (from the just-vacated stack frame) is still intact in the immediately following statement, ie. the printf() statement in main().
You should never ever rely on this behaviour - as soon as you add additional code into the above it will likely break.
The answer is: this program works because you are fortunate, but it will betray you in no time, as the address you return is no longer reserved for you and anyone can use it again. It's like renting a room, making a duplicate key, giving the room up, and then some time later trying to enter it with the duplicate key. If the room is empty and not rented to someone else, you are fortunate; otherwise it can land you in police custody (something bad), and if the lock was changed you get a segfault. So you can't trust a duplicate key to a room you no longer hold.
The z is a local variable allocated in stack and its scope is as long as the particular call to the function block. You return the address of such a local variable. Once you return from the function, all the addresses local to the block (allocated in the function call stack frame) might be used for another call and be overwritten, therefore you might or might not get what you expect. Which is undefined behavior, and thus such operation is incorrect.
If you are getting correct output, then you are fortunate that the old value held by that memory location has not been overwritten. And because your program still has access to the page in which the address lies, you do not get a segmentation fault.
A quick test shows, as the OP points out, that neither GCC 4.3 nor MSVC 10 provide any warnings. But the Clang Static Analyzer does:
ccc-analyzer -c foo.c
...
ANALYZE: foo.c add
foo.c:6:5: warning: Address of stack memory associated with local
variable 'z' returned to caller
return ans_ptr;
^ ~~~~~~~

Explain the output

#include<stdio.h>
int * fun(int a1,int b)
{
int a[2];
a[0]=a1;
a[1]=b;
return a;
}
int main()
{
int *r=fun(3,5);
printf("%d\n",*r);
printf("%d\n",*r);
}
Output after running the code:
3
-1073855580
I understand that a[2] is local to fun() but why value is getting changed of same pointer?
The variable a is indeed local to fun. When you return from that function, the stack is popped. The memory itself remains unchanged (for the moment). When you dereference r the first time, the memory is what you'd expect it to be. And since the dereference happens before the call to printf, nothing bad happens. When printf executes, it modifies the stack and the value is wiped out. The second time through you're seeing whatever value happened to be put there by printf the first time through.
The order of events for a "normal" calling convention (I know, I know -- no such thing):
1. Dereference r (the first time through, this is what it should be)
2. Push value onto stack (notice this is making a copy of the value) (may wipe out a)
3. Push other parameters on to stack (order is usually right to left, IIRC) (may wipe out a)
4. Allocate room for return value on stack (may wipe out a)
5. Call printf
6. Push local printf variables onto stack (may wipe out a)
7. Do your thang
8. Return from function
If you change int a[2]; to static int a[2]; this will alleviate the problem.
Because r points to a location on the stack that is likely to be overwritten by a function call.
In this case, it's the first call to printf itself which is changing that location.
In detail, the return from fun has that particular location being preserved simply because nothing has overwritten it yet.
The *r is then evaluated (as 3) and passed to printf to be printed. The actual call to printf changes the contents of that location (since it uses the memory for its own stack frame), but the value has already been extracted at that point so it's safe.
On the subsequent call, *r has the different value, changed by the first call. That's why it's different in this case.
Of course, this is just the likely explanation. In reality, anything could be happening since what you've coded up there is undefined behaviour. Once you do that, all bets are off.
As you've mentioned, a[2] is local to fun(); meaning it is created on the stack right before the code within fun() starts executing. When fun exits the stack is popped, meaning it is unwound so that the stack pointer is pointing to where it was before fun started executing.
The compiler is now free to stick whatever it wants into those locations that were unwound. So, it is possible that the first location of a was skipped for a variety of reasons. Maybe it now represents an uninitialized variable. Maybe it was for memory alignment of another variable. Simple answer is, by returning a pointer to a local variable from a function, and then de-referencing that pointer, you're invoking undefined behavior and anything can happen, including demons flying out of your nose.
When you compile your code with the following command:
$ gcc -Wall yourProgram.c
It will yield a warning, which says:
In function ‘fun’:
warning: function returns address of local variable
When r is dereferenced for the first printf statement, it's okay as the memory is still intact. However, that first printf call overwrites the stack, and so the second printf statement gives an undesired result.
Because printf is using the stack location and changes it after printing the first value.

Pointers in C weird behavior inside a function

can someone explain this to me
main()
{
int *x,y;
*x = 1;
y = *x;
printf("%d",y);
}
When I compile it with gcc, how come running this in the main function is OK, while running it in a different function, like the one below, won't work?
test()
{
int *x,y;
*x = 1;
y = *x;
printf("%d",y);
}
int *x,y;
*x = 1;
Undefined Behavior. x doesn't point to anything meaningful.
This will be correct:
int *x, y, z;
x = &z;
*x = 1;
y = *x;
or
int *x, y;
x = malloc(sizeof(int));
*x = 1;
y = *x;
//print
free(x);
Undefined behavior is, well, undefined. You can't know how it will behave. It can seem to work, crash, print unpredictable results or do anything else. Or it can behave differently on different runs. Don't rely on undefined behavior.
Technically, in standardese, you invoke what is called undefined behavior due to using an uninitialized value (the value of the pointer x).
What's going on under the hood is very likely this: your compiler allocates local variables on the stack. Calling functions likely changes the stack pointer, so different functions' local variables are at different places on the stack. This in turn makes the value of the uninitialized x be whatever happens to be at that place in the current stack frame. This value can be different, depending on the depth of the chain of functions you called. The actual value can depend on a lot of things, e.g. the whole history of processes run before your program started. There's no point in speculating what the actual value might be and what kind of erroneous behavior might possibly ensue. In the C community we refer to undefined behaviour as even having the possibility to make demons fly out of your nose. It might even start WW3 (assuming appropriate hardware is installed).
Seriously, a C programmer worth her money will take extreme care not to invoke undefined behavior.
Since x is a pointer, it doesn't contain the int itself; it points to another memory location which holds that value.
I think you assume that declaring a pointer to a value also reserves memory for it... not in C.
If you made the above error in your code, maybe it would be good if I gave you a little bit more graphic representation of what is actually going on in the code... this is a common novice error. The explanation below might seem a bit verbose and basic, but it might help your brain "see" what is actually going on.
Let's begin... if [xxxx] is a value being stored in a few bytes in the RAM, and [????] is an unknown value (in physical RAM), you can say that for x to be properly used it should be:
x == [xxxx] -> [xxxx]
x == address of a value (int)
when you write: *x=1 above, you are changing the value of an unknown area of RAM, so you are in fact doing:
x == [????] -> [0001] // address [????] is completely undefined !
In fact, we don't even know IF address [????] is allocated or accessible by your application (this is the undefined part); it's possible the address points to anything. Function code, DLL address, file handle structure... it all depends on the compiler/OS/application state, and can never be relied on.
so to be able to use a pointer to an int, we must first allocate memory for it, and assign the address of that memory to x, ex:
int y; // allocate on the stack
x = &y; // the & operator means "address of"
or
x = malloc(sizeof(int)); // in 'heap' memory (often called dynamic memory allocation)
// here malloc() returns the *address* of a memory space which is at least large enough
// to store an int, and is known to be reserved for your application.
at this point, we know that x holds a proper memory address so we'll just say it's currently set to [3948] (and contains an unknown value).
x == [3948] -> [????]
Using the * operator, you dereference the pointer address (i.e. look it up), to store a value AT that address.
*x = 1;
means:
x == [3948] -> [0001]
I hope this helps

Resources