Ampersand bug and lifetime in c - c

As we know, local variables have local scope and lifetime. Consider the following code:
int* abc()
{
int m;
return(&m);
}
void main()
{
int* p=abc();
*p=32;
}
This gives me a warning that a function returns the address of a local variable.
I see this as justification:
Local veriable m is deallocated once abc() completes. So we are dereferencing an invalid memory location in the main function.
However, consider the following code:
int* abc()
{
int m;
return(&m);
int p=9;
}
void main()
{
int* p=abc();
*p=32;
}
Here I am getting the same warning. But I guess that m will still retain its lifetime when returning. What is happening? Please explain the error. Is my justification wrong?

First, notice that int p=9; will never be reached, so your two versions are functionally identical. The program will allocate memory for m and return the address of that memory; any code below the return statement is unreacheable.
Second, the local variable m is not actually de-allocated after the function returns. Rather, the program considers the memory free space. That space might be used for another purpose, or it might stay unused and forever hold its old value. Because you have no guarantee about what happens to the memory once the abc() function exits, you should not attempt to access or modify it in any way.

As soon as return keyword is encountered, control passes back to the caller and the called function goes out of scope. Hence, all local variables are popped off the stack. So the last statement in your second example is inconsequential and the warning is justified

Logically, m no longer exists when you return from the function, and any reference to it is invalid once the function exits.
Physically, the picture is a bit more complicated. The memory cells that m occupied are certainly still there, and if you access those cells before anything else has a chance to write to them, they'll contain the value that was written to them in the function, so under the right circumstances it's possible for you to read what was stored in m through p after abc has returned. Do not rely on this behavior being repeatable; it is a coding error.
From the language standard (C99):
6.2.4 Storage durations of objects
...
2 The lifetime of an object is the portion of program execution during which storage is
guaranteed to be reserved for it. An object exists, has a constant address,25) and retains
its last-stored value throughout its lifetime.26) If an object is referred to outside of its
lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when
the object it points to reaches the end of its lifetime.
25) The term ‘‘constant address’’ means that two pointers to the object constructed at possibly different
times will compare equal. The address may be different during two different executions of the same
program.
26) In the case of a volatile object, the last store need not be explicit in the program.
Emphasis mine. Basically, you're doing something that the language definition explicitly calls out as undefined behavior, meaning the compiler is free to handle that situation any way it wants to. It can issue a diagnostic (which your compiler is doing), it can translate the code without issuing a diagnostic, it can halt translation at that point, etc.

The only way you can make m still valid memory (keeping the maximum resemblance with your code) when you exit the function, is to prepend it with the static keyword
int* abc()
{
static int m;
m = 42;
return &m;
}
Anything after a return is a "dead branch" that won't be ever executed.

int m should be locally visible. You should create it as int* m and return it directly.

Related

Pointer allocation and Memory Location [duplicate]

This question already has answers here:
Can a local variable's memory be accessed outside its scope?
(20 answers)
Closed 12 months ago.
In an Microprocessor it is said that the local variables are stored in stack. In my case if func1() is called by main function then the local variable (int a = 12;)will be created in stack. Once the Called Function is executed the and return back to main function the stack memory will be deleted. So the pointer address still holds (*b) the value 12. At stack if this 'a = 12' is deleted then 'b' should be a dangling pointer no?? Can anyone explain this ? If you have detailed explanation on what happens in memory when this code is being executed it would be helpful.
#include <stdio.h>
int* func1(void);
int main()
{
int* b = func1();
printf("%d\n",*b);
}
int* func1(void)
{
int a = 12;
int* b = &a;
return b;
}
The pointer is dangling. The memory may still hold the previous value, but dereferencing the pointer invokes undefined behaviour.
GCC will give you a warning about this, if you pass -Wall option.
From the C standard (6.2.4):
The lifetime of an object is the portion of program execution during
which storage is guaranteed to be reserved for it. An object exists,
has a constant address,25) and retains its last-stored value
throughout its lifetime.26) If an object is referred to outside of its
lifetime, the behavior is undefined. The value of a pointer becomes
indeterminate when the object it points to reaches the end of its
lifetime.
There are multiple layers here.
First, is the C programming language. It is a language. You say stuff in it and it has meaning. There are sentences that have meaning, but you can also construct grammatically valid sentences that are gibberish.
The code you posted, grammatically valid, is gibberish. The object a inside func1 stops existing when the function returns. *b tries to access an object that does not exists anymore. It is not defined what should happen when you access an object after its lifetime ended. You can read about undefined behavior.
Memory exists. It's not like it is disintegrated when a function returns. It's not like RAM chips are falling out of your PC. They are still there.
So your compiler will produce some machine instructions. These machine instructions will execute. Depending solely on your compiler decisions (the code is undefined behavior, compiler can do what it wants) the actual code can behavior in any way the compiler decides to. But, most probably, *b will generate machine instructions that will read the memory region where object a resided. That memory region may still hold the value 12 or it may have been overwritten somewhere between func1 returning and calling printf, resulting in reading some other value.
At stack if this 'a = 12' is deleted then 'b' should be a dangling pointer no?
Yes.
what happens in memory when this code
This depends on the actual machine instructions generated by the compiler. Compile your code and inspect the assembly.
Strictly speaking, the behaviour is undefined, but repeat the printf or otherwise inspect (in your debugger for example) *b after the printf. It is likely that it will no longer be 12.
It is not accurate to say that the stack memory is "deleted". Rather the stack pointer is restored to the address it had before the call was invoked. The memory is otherwise untouched but becomes available for use by subsequent calls (or for passing arguments to such calls), so after calling printf it is likely to have been reused and modified.
So yes, the pointer is "dangling" since it points to memory that is no longer "valid" in the sense that it does not belong to the context in which the pointer exists.

Scope and lifetime of local variables in C

I would like to understand the difference between the following two C programs.
First program:
void main()
{
int *a;
{
int b = 10;
a=&b;
}
printf("%d\n", *a);
}
Second program:
void main()
{
int *a;
a = foo();
printf("%d\n", *a);
}
int* foo()
{
int b = 10;
return &b;
}
In both cases, the address of a local variable (b) is returned to and assigned to a. I know that the memory a is pointing should not be accessed when b goes out of scope. However, when compiling the above two programs, I receive the following warning for the second program only:
warning C4172: returning address of local variable or temporary
Why do I not get a similar warning for the first program?
As you already know that b goes out of scope in each instance, and accessing that memory is illegal, I am only dumping my thoughts on why only one case throws the warning and other doesn't.
In the second case, you're returning the address of a variable stored on Stack memory. Thus, the compiler detects the issue and warns you about it.
The first case, however skips the compiler checking because the compiler sees that a valid initialized address is assigned to a. The compilers depends in many cases on the intellect of the coder.
Similar examples for depicting your first case could be,
char temp[3] ;
strcpy( temp, "abc" ) ;
The compiler sees that the temp have a memory space but it depends on the coder intellect on how many chars, they are going to copy in that memory region.
your foo() function has undefined behavior since it returns a pointer to a part of stack memory that is not used anymore and that will be overwritten soon on next function call or something
it is called "b is gone out of scope".
Sure the memory still exists and probably have not changed so far but this is not guaranteed.
The same applies to your first code since also the scope of b ends with the closing bracket of the block there b is declared.
Edit:
you did not get the warning in first code because you did not return anything. The warning explicitly refers to return. And since the compiler may allocate the stack space of the complete function at once and including all sub-blocks it may guarantee that the value will not be overwritten. but nevertheless it is undefined behavior.
may be you get additional warnings if you use a higher warning level.
In the first code snippet even though you explicitly add brackets the stack space you are using is in the same region; there are no jumps or returns in the code so the code still uses consecutive memory addresses from the stack. Several things happen:
The compiler will not push additional variables on the stack even if you take out the code block.
You are only restricting the visibility of variable b to that code-block; which is more or less the same as if you would declare it at the beginning and only use it once in the exact same place, but without the { ... }
The value for b is most likely saved in a register which so there would be no problem to print it later - but this is speculative.
For the second code snippet, the function call means a jump and a return which means:
pushing the current stack pointer and the context on the stack
push the relevant values for the function call on the stack
jump to the function code
execute the function code
restore the stack pointer to it's value before the function call
Because the stack pointer has been restored, anything that is on the stack is not lost (yet) but any operations on the stack will be likely to override those values.
I think it is easy to see why you get the warning in only one case and what the expected behavior can be...
Maybe it is related with the implementation of a compiler. In the second program,the compiler can identify that return call is a warning because the program return a variable out of scope. I think it is easy to identify using information about ebp register. But in the first program our compiler needs to do more work for achieving it.
Your both programs invoke undefined behaviour. Statements grouped together within curly braces is called a block or a compound statement. Any variable defined in a block has scope in that block only. Once you go out of the block scope, that variable ceases to exist and it is illegal to access it.
int main(void) {
int *a;
{ // block scope starts
int b = 10; // b exists in this block only
a = &b;
} // block scope ends
// *a dereferences memory which is no longer in scope
// this invokes undefined behaviour
printf("%d\n", *a);
}
Likewise, the automatic variables defined in a function have function scope. Once the function returns, the variables which are allocated on the stack are no longer accessible. That explains the warning you get for your second program. If you want to return a variable from a function, then you should allocate it dynamically.
int main(void) {
int *a;
a = foo();
printf("%d\n", *a);
}
int *foo(void) {
int b = 10; // local variable
// returning the address of b which no longer exists
// after the function foo returns
return &b;
}
Also, the signature of main should be one of the following -
int main(void);
int main(int argc, char *argv[]);
In your first program-
The variable b is a block level variable and the visibility is inside that block
only.
But the lifetime of b is lifetime of the function so it lives upto the exit of main function.
Since the b is still allocated space, *a prints the value stored in b ,since a points b.

Scope of a variable. Internal working basic

This is a very basic question about the scope of a variable suppose. I have the fiollowing code:
int main()
{
int *p;
p=func();
printf("%d",*p);
return 0;
}
int *func()
{
int i;
i=5;
return &i;
}
My question
The scope of i is finished in func() but, since I am returning the address of i will I be able to access and print5 in main()?
If not, why? does the compiler puts a garbage value in that address space (I don't think this is done).
What actually it means by the scope of a variable is ended ? Also does the memory allocated to i is freed when its scope ends?
Scope of the variable is the region where it can be accessed.
Lifetime of the variable is the time till when the variable is guaranteed to exist.
In your case lifetime of i is within the function not beyond it. It means i is not guaranteed to exist beyond the function. It is not required to and it is Undefined Behavior to access a local variable beyond the function.
The scope of i is finished in func() but, since I am returning the address of i will I be able to access and print 5 in main()?
You might, but it is Undefined Behavior. So don't do it.
If not, why? does the compiler puts a garbage value in that address space (I don't think this is done)
The compiler may put whatever it chooses to in that location, once the function returns the address location is holds an Indeterminate value.
What actually it means by the scope of a variable is ended ? Also does the memory allocated to i is freed when its scope ends?
i is a automatic/local variable and all automatic variables are freed once the scope {,} in which they are declared ends. Hence the name automatic.
It is undefined behaviour to access a variable after it has gone out of scope. This means it is not possible to say what will definitely happen. In the posted code, 5 might be printed, some other value may be printed or some other behaviour may occur (access violation for example).
Behaviour in your example is undefined. Your printf probably will output 5 but that'd be down to luck rather than good design.
In this case, when the scope of the variable is ended, further function calls may reuse the stack address &i changing the value your p variable points to.
No, accessing a variable that has gone out of scope leads to undefined behavior. The storage where thev variable used to be has been reclaimed, so you're likely to overwrite something else which can lead to crashes or just unpredictable behavior.
Your function will probably print 5, but you should never do this. It's undefined behavior since your program no longer owns the location pointed to by the pointer your return (in other words, your program no longer owns i).
Basically each time a function is called, the stack pointer is pushed down to accommodate the new stack frame. When the function call ends, the stack pointer is raised back up. This means that if a different function were to be called, it would have overlapping the same stack space as the previous function call.
To illustrate this a little better, consider this:
int main()
{
int *p;
p=func();
printf("%d\n",*p);
func2();
printf("%d\n",*p);
return 0;
}
int *func()
{
int i;
i=5;
return &i;
}
void func2()
{
int i = 1;
}
There's a pretty good chance that the output would be 5 1. This is because the second call will reuse the same stack space.
(Note that above code snippet is horrible -- you should never do something like that -- it's undefined behavior and highly implementation dependent.)
To answer your questions directly:
The scope of i is finished in func() but, since I am returning the address of i will I be able to access and print5 in main()?
No. You can, but you shouldn't. Such is the beauty of C. Depending on the compiler/OS/etc it might output 5, or it might output random garage.
If not, why? does the compiler puts a garbage value in that address space (I don't think this is done).
The space used for local variables is reused. The first half of the answer hopefully illustrated how this works. (Well, how it typically works.)
What actually it means by the scope of a variable is ended ? Also does the memory allocated to i is freed when its scope ends?
Stack based memory allocation is what's going on behind the scenes.

scope rules in C

I recently read about scope rules in C. It says that a local or auto variable is available only inside the block of the function in which it is declared. Once outside the function it no longer is visible. Also that its lifetime is only till the end of the final closing braces of the function body.
Now here is the problem. What happens when the address of a local variable is returned from the function to the calling function ?
For example :-
main()
{
int *p=fun();
}
int * fun()
{
int localvar=0;
return (&localvar);
}
once the control returns back from the function fun, the variable localvar is no longer alive. So how will main be able to access the contents at this address ?
The address can be returned, but the value stored at the address cannot reliably be read. Indeed, it is not even clear that you can safely assign it, though the chances are that on most machines there wouldn't be a problem with that.
You can often read the address, but the behaviour is undefined (read 'bad: to be avoided at all costs!'). In particular, the address may be used for other variables in other functions, so if you access it after calling other functions, you are definitely unlikely to see the last value stored in the variable by the function that returned the pointer to it.
Why then is a function returning a pointer ever required?
One reason is often 'dynamic memory'. The malloc() family of functions return a pointer to new (non-stack) memory.
Another reason is 'found something at this location in a value passed to me'. Consider strchr() or strstr().
Another reason is 'returning pointer to a static object, either hidden in the function or in the file containing the source for the function'. Consider asctime() et al (and worry about thread-safety).
There are probably a few others, but those are probably the most common.
Note that none of these return a pointer to a local (stack-based) variable.
The variable is gone, but the memory location still exists and might even still contain the value you set. It will however probably get overwritten pretty fast as more functions are called and the memory address gets reused for another function's local variables. You can learn more by reading about the Call Stack, which is where local variables of functions are stored.
Referencing that location in memory after the function has returned is dangerous. Of course the location still exists (and it may still contain your value), but you no longer have any claim to that memory region and it will likely be overwritten with new data as the program continues and new local variables are allocated on the stack.
gcc gives me the following warning:
t.c: In function ‘test’:
t.c:3:2: warning: function returns address of local variable [enabled by default]
Consider this test program:
int * test(int p) {
int loc = p;
return &loc;
}
int main(void) {
int *c = test(4);
test(5);
printf("%d\n", *c);
return 0;
}
What do you think this prints?

How does temporary storage work in C when a function returns?

I know C pretty well, however I'm confused of how temporary storage works.
Like when a function returns, all the allocation happened inside that function is freed (from the stack or however the implementation decides to do this).
For example:
void f() {
int a = 5;
} // a's value doesn't exist anymore
However we can use the return keyword to transfer some data to the outside world:
int f() {
int a = 5;
return a;
} // a's value exists because it's transfered to the outside world
Please stop me if any of this is wrong.
Now here's the weird thing, when you do this with arrays, it doesn't work.
int []f() {
int a[1] = {5};
return a;
} // a's value doesn't exist. WHY?
I know arrays are only accessible by pointers, and you can't pass arrays around like another data structure without using pointers. Is this the reason you can't return arrays and use them in the outside world? Because they're only accessible by pointers?
I know I could be using dynamic allocation to keep the data to the outside world, but my question is about temporary allocation.
Thanks!
When you return something, its value is copied. a does not exist outside the function in your second example; it's value does. (It exists as an rvalue.)
In your last example, you implicitly convert the array a to an int*, and that copy is returned. a's lifetime ends, and you're pointing at garbage.
No variable lives outside its scope, ever.
In the first example the data is copied and returned to the calling function, however the second returns a pointer so the pointer is copied and returned, however the data that is pointed to is cleaned up.
In implementations of C I use (primarily for embedded 8/16-bit microcontrollers), space is allocated for the return value in the stack when the function is called.
Before calling the function, assume the stack is this (the lines could represent various lengths, but are all fixed):
[whatever]
...
When the routine is called (e.g. sometype myFunc(arg1,arg2)), C throws the parameters for the function (arguments and space for the return value, which are all of fixed length) on to the stack, followed by the return address to continue code execution from, and possibly backs up some processor registers.
[myFunc local variables...]
[return address after myFunc is done]
[myFunc argument 1]
[myFunc argument 2]
[myFunc return value]
[whatever]
...
By the time the function fully completes and returns to the code it was called from, all of it's variables have been deallocated off the stack (they might still be there in theory, but there is no guarantee)
In any case, in order to return the array, you would need to allocate space for it somewhere else, then return the address to the 0th element.
Some compilers will store return values in temporary registers of the processor rather than using the stack, but it's rare (only seen it on some AVR compilers).
When you attempt to return a locally allocated array like that, the calling function gets a pointer to where the array used to live on the stack. This can make for some spectacularly gruesome crashes, when later on, something else writes to the array, and clobbers a stack frame .. which may not manifest itself until much later, if the corrupted frame is deep in the calling sequence. The maddening this with debugging this type of error is that real error (returning a local array) can make some other, absolutely perfect function blow up.
You still return a memory address, you can try to check its value, but the contents its pointing are not valid beyond the scope of function,so dont confuse value with reference.
int []f() {
int a[1] = {5};
return a;
} // a's value doesn't exist. WHY?
First, the compiler wouldn't know what size of array to return. I just got syntax errors when I used your code, but with a typedef I was able to get an error that said that functions can't return arrays, which I knew.
typedef int ia[1];
ia h(void) {
ia a = 5;
return a;
}
Secondly, you can't do that anyway. You also can't do
int a[1] = {4};
int b[1];
b = a; // Here both a and b are interpreted as pointer literals or pointer arithmatic
While you don't write it out like that, and the compiler really wouldn't even have to generate any code for it this operation would have to happen semantically for this to be possible so that a new variable name could be used to refer the value that was returned by the function. If you enclosed it in a struct then the compiler would be just fine with copying the data.
Also, outside of the declaration and sizeof statements (and possibly typeof operations if the compiler has that extension) whenever an array name appears in code it is thought of by the compiler as either a pointer literal or as a chunk of pointer arithmetic that results in a pointer. This means that the return statement would end looking like you were returning the wrong type -- a pointer rather than an array.
If you want to know why this can't be done -- it just can't. A compiler could implicitly think about the array as though it were in a struct and make it happen, but that's just not how the C standard says it is to be done.

Resources