I would like to understand the difference between the following two C programs.
First program:
void main()
{
int *a;
{
int b = 10;
a=&b;
}
printf("%d\n", *a);
}
Second program:
void main()
{
int *a;
a = foo();
printf("%d\n", *a);
}
int* foo()
{
int b = 10;
return &b;
}
In both cases, the address of a local variable (b) is returned to and assigned to a. I know that the memory a is pointing should not be accessed when b goes out of scope. However, when compiling the above two programs, I receive the following warning for the second program only:
warning C4172: returning address of local variable or temporary
Why do I not get a similar warning for the first program?
As you already know that b goes out of scope in each instance, and accessing that memory is illegal, I am only dumping my thoughts on why only one case throws the warning and other doesn't.
In the second case, you're returning the address of a variable stored on Stack memory. Thus, the compiler detects the issue and warns you about it.
The first case, however skips the compiler checking because the compiler sees that a valid initialized address is assigned to a. The compilers depends in many cases on the intellect of the coder.
Similar examples for depicting your first case could be,
char temp[3] ;
strcpy( temp, "abc" ) ;
The compiler sees that the temp have a memory space but it depends on the coder intellect on how many chars, they are going to copy in that memory region.
your foo() function has undefined behavior since it returns a pointer to a part of stack memory that is not used anymore and that will be overwritten soon on next function call or something
it is called "b is gone out of scope".
Sure the memory still exists and probably have not changed so far but this is not guaranteed.
The same applies to your first code since also the scope of b ends with the closing bracket of the block there b is declared.
Edit:
you did not get the warning in first code because you did not return anything. The warning explicitly refers to return. And since the compiler may allocate the stack space of the complete function at once and including all sub-blocks it may guarantee that the value will not be overwritten. but nevertheless it is undefined behavior.
may be you get additional warnings if you use a higher warning level.
In the first code snippet even though you explicitly add brackets the stack space you are using is in the same region; there are no jumps or returns in the code so the code still uses consecutive memory addresses from the stack. Several things happen:
The compiler will not push additional variables on the stack even if you take out the code block.
You are only restricting the visibility of variable b to that code-block; which is more or less the same as if you would declare it at the beginning and only use it once in the exact same place, but without the { ... }
The value for b is most likely saved in a register which so there would be no problem to print it later - but this is speculative.
For the second code snippet, the function call means a jump and a return which means:
pushing the current stack pointer and the context on the stack
push the relevant values for the function call on the stack
jump to the function code
execute the function code
restore the stack pointer to it's value before the function call
Because the stack pointer has been restored, anything that is on the stack is not lost (yet) but any operations on the stack will be likely to override those values.
I think it is easy to see why you get the warning in only one case and what the expected behavior can be...
Maybe it is related with the implementation of a compiler. In the second program,the compiler can identify that return call is a warning because the program return a variable out of scope. I think it is easy to identify using information about ebp register. But in the first program our compiler needs to do more work for achieving it.
Your both programs invoke undefined behaviour. Statements grouped together within curly braces is called a block or a compound statement. Any variable defined in a block has scope in that block only. Once you go out of the block scope, that variable ceases to exist and it is illegal to access it.
int main(void) {
int *a;
{ // block scope starts
int b = 10; // b exists in this block only
a = &b;
} // block scope ends
// *a dereferences memory which is no longer in scope
// this invokes undefined behaviour
printf("%d\n", *a);
}
Likewise, the automatic variables defined in a function have function scope. Once the function returns, the variables which are allocated on the stack are no longer accessible. That explains the warning you get for your second program. If you want to return a variable from a function, then you should allocate it dynamically.
int main(void) {
int *a;
a = foo();
printf("%d\n", *a);
}
int *foo(void) {
int b = 10; // local variable
// returning the address of b which no longer exists
// after the function foo returns
return &b;
}
Also, the signature of main should be one of the following -
int main(void);
int main(int argc, char *argv[]);
In your first program-
The variable b is a block level variable and the visibility is inside that block
only.
But the lifetime of b is lifetime of the function so it lives upto the exit of main function.
Since the b is still allocated space, *a prints the value stored in b ,since a points b.
Related
I've written a function that returns an array whilst I know that I should return a dynamically allocated pointer instead, but still I wanted to know what happens when I am returning an array declared locally inside a function (without declaring it as static), and I got surprised when I noticed that the memory of the internal array in my function wasn't deallocated, and I got my array back to main.
The main:
int main()
{
int* arr_p;
arr_p = demo(10);
return 0;
}
And the function:
int* demo(int i)
{
int arr[10] = { 0 };
for (int i = 0; i < 10; i++)
{
arr[i] = i;
}
return arr;
}
When I dereference arr_p I can see the 0-9 integers set in the demo function.
Two questions:
How come when I examined arr_p I saw that its address is the same as arr which is in the demo function?
How come demo_p is pointing to data which is not deallocated (the 0-9 numbers) already in demo? I expected that arr inside demo will be deallocated as we got out of demo scope.
One of the things you have to be careful of when programming is to pay good attention to what the rules say, and not just to what seems to work. The rules say you're not supposed to return a pointer to a locally-allocated array, and that's a real, true rule.
If you don't get an error when you write a program that returns a pointer to a locally-allocated array, that doesn't mean it was okay. (Although, it means you really ought to get a newer compiler, because any decent, modern compiler will warn about this.)
If you write a program that returns a pointer to a locally-allocated array and it seems to work, that doesn't mean it was okay, either. Be really careful about this: In general, in programming, but especially in C, seeming to work is not proof that your program is okay. What you really want is for your program to work for the right reasons.
Suppose you rent an apartment. Suppose, when your lease is up, and you move out, your landlord does not collect your key from you, but does not change the lock, either. Suppose, a few days later, you realize you forgot something in the back of one closet. Suppose, without asking, you sneak back to try to collect it. What happens next?
As it happens, your key still works in the lock. Is this a total surprise, or mildly unexpected, or guaranteed to work?
As it happens, your forgotten item still is in the closet. It has not yet been cleared out. Is this a total surprise, or mildly unexpected, or guaranteed to happen?
In the end, neither your old landlord, nor the police, accost you for this act of trespass. Once more, is this a total surprise, or mildly unexpected, or just about completely expected?
What you need to know is that, in C, reusing memory you're no longer allowed to use is just about exactly analogous to sneaking back in to an apartment you're no longer renting. It might work, or it might not. Your stuff might still be there, or it might not. You might get in trouble, or you might not. There's no way to predict what will happen, and there's no (valid) conclusion you can draw from whatever does or doesn't happen.
Returning to your program: local variables like arr are usually stored on the call stack, meaning they're still there even after the function returns, and probably won't be overwritten until the next function gets called and uses that zone on the stack for its own purposes (and maybe not even then). So if you return a pointer to locally-allocated memory, and dereference that pointer right away (before calling any other function), it's at least somewhat likely to "work". This is, again, analogous to the apartment situation: if no one else has moved in yet, it's likely that your forgotten item will still be there. But it's obviously not something you can ever depend on.
arr is a local variable in demo that will get destroyed when you return from the function. Since you return a pointer to that variable, the pointer is said to be dangling. Dereferencing the pointer makes your program have undefined behavior.
One way to fix it is to malloc (memory allocate) the memory you need.
Example:
#include <stdio.h>
#include <stdlib.h>
int* demo(int n) {
int* arr = malloc(sizeof(*arr) * n); // allocate
for (int i = 0; i < n; i++) {
arr[i] = i;
}
return arr;
}
int main() {
int* arr_p;
arr_p = demo(10);
printf("%d\n", arr_p[9]);
free(arr_p) // free the allocated memory
}
Output:
9
How come demo_p is pointing to data which is not deallocated (the 0-9 numbers) already in demo? I expected that arr inside demo will be deallocated as we got out of demo scope.
The life of the arr object has ended and reading the memory addresses previously occupied by arr makes your program have undefined behavior. You may be able to see the old data or the program may crash - or do something completely different. Anything can happen.
… I noticed that the memory of the internal array in my function wasn't deallocated…
Deallocation of memory is not something you can notice or observe, except by looking at the data that records memory reservations (in this case, the stack pointer). When memory is reserved or released, that is just a bookkeeping process about what memory is available or not available. Releasing memory does not necessarily erase memory or immediately reuse it for another purpose. Looking at the memory does not necessarily tell you whether it is in use or not.
When int arr[10] = { 0 }; appears inside a function, it defines an array that is allocated automatically when the function starts executing (or at certain times within the function execution if the definition is in some nested scope). This is commonly done by adjusting the stack pointer. In common systems, programs have a region of memory called the stack, and a stack pointer contains an address that marks the end of the portion of the stack that is currently reserved for use. When a function starts executing, the stack pointer is changed to reserve more memory for that function’s data. When execution of the function ends, the stack pointer is changed to release that memory.
If you keep a pointer to that memory (how you can do that is another matter, discussed below), you will not “notice” or “observe” any change to that memory immediately after the function returns. That is why you see the value of arr_p is the address that arr had, and it is why you see the old data in that memory.
If you call some other function, the stack pointer will be adjusted for the new function, that function will generally use the memory for its own purposes, and then the contents of that memory will have changed. The data you had in arr will be gone. A common example of this that beginners happen across is:
int main(void)
{
int *p = demo(10);
// p points to where arr started, and arr’s data is still there.
printf("arr[3] = %d.\n", p[3]);
// To execute this call, the program loads data from p[3]. Since it has
// not changed, 3 is loaded. This is passed to printf.
// Then printf prints “arr[3] = 3.\n”. In doing this, it uses memory
// on the stack. This changes the data in the memory that p points to.
printf("arr[3] = %d.\n", p[3]);
// When we try the same call again, the program loads data from p[3],
// but it has been changed, so something different is printed. Two
// different things are printed by the same printf statement even
// though there is no visible code changing p[3].
}
Going back to how you can have a copy of a pointer to memory, compilers follow rules that are specified abstractly in the C standard. The C standard defines an abstract lifetime of the array arr in demo and says that lifetime ends when the function returns. It further says the value of a pointer becomes indeterminate when the lifetime of the object it points to ends.
If your compiler is simplistically generating code, as it does when you compile using GCC with -O0 to turn off optimization, it typically keeps the address in p and you will see the behaviors described above. But, if you turn optimization on and compile more complicated programs, the compiler seeks to optimize the code it generates. Instead of mechanically generating assembly code, it tries to find the “best” code that performs the defined behavior of your program. If you use a pointer with indeterminate value or try to access an object whose lifetime has ended, there is no defined behavior of your program, so optimization by the compiler can produce results that are unexpected by new programmers.
As you know dear, the existence of a variable declared in the local function is within that local scope only. Once the required task is done the function terminates and the local variable is destroyed afterwards. As you are trying to return a pointer from demo() function ,but the thing is the array to which the pointer points to will get destroyed once we come out of demo(). So indeed you are trying to return a dangling pointer which is pointing to de-allocated memory. But our rule suggests us to avoid dangling pointer at any cost.
So you can avoid it by re-initializing it after freeing memory using free(). Either you can also allocate some contiguous block of memory using malloc() or you can declare your array in demo() as static array. This will store the allocated memory constant also when the local function exits successfully.
Thank You Dear..
#include<stdio.h>
#define N 10
int demo();
int main()
{
int* arr_p;
arr_p = demo();
printf("%d\n", *(arr_p+3));
}
int* demo()
{
static int arr[N];
for(i=0;i<N;i++)
{
arr[i] = i;
}
return arr;
}
OUTPUT : 3
Or you can also write as......
#include <stdio.h>
#include <stdlib.h>
#define N 10
int* demo() {
int* arr = (int*)malloc(sizeof(arr) * N);
for(int i = 0; i < N; i++)
{
arr[i]=i;
}
return arr;
}
int main()
{
int* arr_p;
arr_p = demo();
printf("%d\n", *(arr_p+3));
free(arr_p);
return 0;
}
OUTPUT : 3
Had the similar situation when i have been trying to return char array from the function. But i always needed an array of a fixed size.
Solved this by declaring a struct with a fixed size char array in it and returning that struct from the function:
#include <time.h>
typedef struct TimeStamp
{
char Char[9];
} TimeStamp;
TimeStamp GetTimeStamp()
{
time_t CurrentCalendarTime;
time(&CurrentCalendarTime);
struct tm* LocalTime = localtime(&CurrentCalendarTime);
TimeStamp Time = { 0 };
strftime(Time.Char, 9, "%H:%M:%S", LocalTime);
return Time;
}
Happy new year everyone.
I am studying C language. I had a question when some code run about pointer.
#include <stdio.h>
int * b() {
int a = 8;
int *p = &a;
printf("the addr of a in b: %p\n", p); the addr of a in b: 0x7ffccfcba984
return p;
}
int main () {
int *c = b();
printf("the addr of a in main: %p\n", c); // the addr of a in main: 0x7ffccfcba984
printf("The value of ptr is : %d\n", *c ); // 8
return 0;
}
Can you feel something odd in this code?
I learned that a variables declared inside a function is deallocated at the end of the function.
However, I can still access variables outside the function
like above code when trying to access the address of "a" variable. If the deallocation is true, int a should be deallocated at the end of the b function. It is like a free is not used after variables is declared.
Is there some knowledge I am missing about deallocation?
Could you tell me why I can still access it?
Once you leave a function variables "fall out of scope" meaning they are no longer valid.
Using the address of an out of scope variable breaks that boundary and leads to undefined behaviour, as in, it's not valid to do. The &a pointer is effectively invalidated when you exit that function. If you use it then the program may behave erratically, might crash, or might work fine. It's not defined what happens.
In this trivial example you're not going to get the same behaviour as in a real program. Make another function call to a function that exercises the stack and you'll likely see some problems since the stack is being re-used.
Local variables aren't "allocated" per-se, they are simply scoped, and when that scope is exited they are invalidated.
In something like C++ there may be a deallocation process when things fall out of scope, as that language can define destructors and such, but that's not the same as C. In C they just cease to exist.
I learned that a variables declared inside a function is deallocated at the end of the function.
If the deallocation is true, int a should be deallocated at the end of the b function.
Yes. You are not wrong.
A variable will be destructured when it goes out of its scope. Although you use a pointer variable to save a pointer to that variable, accessing it through that pointer is actually an undefined behavior.
Yes, you can access it and see the results you expect because of luck or environment, but it may still cause a crash, unexpected results, etc. Because this behavior is undefined and wrong.
I have created a local variable b in the function foo(). This function returns the address of the variable b. As b is a local variable to the functionfoo(), the address of the variable is not supposed to be there as the execution of the function foo() finishes. I get the following warning message:
C:\Users\User\Documents\C codes\fun.c||In function 'foo':
C:\Users\User\Documents\C codes\fun.c|6|warning: function returns address of local variable [-Wreturn-local-addr]|
But the pointer ptr of the main() function successfully receives the address of b and prints it. As far as I know about stack, the address of the local variable b should cease to exist after the control of the program returns to the main() function. But the output proves otherwise.
#include<stdio.h>
int* foo()
{
int b = 8;
return &b;
}
int main()
{
int *ptr;
ptr = foo();
printf("b = %d", *ptr);
return 0;
}
Output:
b = 8
The lifetime of an object is the portion of program execution during which storage (memory) for it is reserved.
When we say an object ceases to exist in the C model, we mean that reservation ends. The memory is no longer reserved for the object. This is merely a bookkeeping operation; the memory might or might not be used for another purpose. The fact that the reservation ends does not mean anything enforces a rule that you must not use the memory.
What happens when you use an object after its memory reservation ends is similar to what happens when you trespass in the real world. Maybe nobody notices and you get away with it. Maybe somebody notices and stops you. Maybe somebody changes what was there, and maybe they do not.
In the case you tested, nothing changed the memory where b was, and you got away with using memory that was no longer reserved for b. But you cannot rely on this; there is no rule that says it will always work.
Consider the following code.
#include<stdio.h>
int *abc(); // this function returns a pointer of type int
int main()
{
int *ptr;
ptr = abc();
printf("%d", *ptr);
return 0;
}
int *abc()
{
int i = 45500, *p;
p = &i;
return p;
}
Output:
45500
I know according to link this type of behavior is undefined. But why i am getting correct value everytime i run the program.
Every time you call abc it "marks" a region at the top of the stack as the place where it will write all of its local variables. It does that by moving the pointer that indicates where the top of stack is. That region is called the stack frame. When the function returns, it indicates that it does not want to use that region anymore by moving the stack pointer to where it was originally. As a result, if you call other functions afterwards, they will reuse that region of the stack for their own purposes. But in your case, you haven't called any other functions yet. So that region of the stack is left in the same state.
All the above explain the behavior of your code. It is not necessary that all C compilers implement functions that way and therefore you should not rely on that behavior.
Well, undefined behavior is, undefined. You can never rely on UB (or on an output of a program invoking UB).
Maybe, just maybe in your environment and for your code, the memory location allocated for the local variable is not reclaimed by the OS and still accessible, but there's no guarantee that it will have the same behavior for any other platform.
As we know, local variables have local scope and lifetime. Consider the following code:
int* abc()
{
int m;
return(&m);
}
void main()
{
int* p=abc();
*p=32;
}
This gives me a warning that a function returns the address of a local variable.
I see this as justification:
Local veriable m is deallocated once abc() completes. So we are dereferencing an invalid memory location in the main function.
However, consider the following code:
int* abc()
{
int m;
return(&m);
int p=9;
}
void main()
{
int* p=abc();
*p=32;
}
Here I am getting the same warning. But I guess that m will still retain its lifetime when returning. What is happening? Please explain the error. Is my justification wrong?
First, notice that int p=9; will never be reached, so your two versions are functionally identical. The program will allocate memory for m and return the address of that memory; any code below the return statement is unreacheable.
Second, the local variable m is not actually de-allocated after the function returns. Rather, the program considers the memory free space. That space might be used for another purpose, or it might stay unused and forever hold its old value. Because you have no guarantee about what happens to the memory once the abc() function exits, you should not attempt to access or modify it in any way.
As soon as return keyword is encountered, control passes back to the caller and the called function goes out of scope. Hence, all local variables are popped off the stack. So the last statement in your second example is inconsequential and the warning is justified
Logically, m no longer exists when you return from the function, and any reference to it is invalid once the function exits.
Physically, the picture is a bit more complicated. The memory cells that m occupied are certainly still there, and if you access those cells before anything else has a chance to write to them, they'll contain the value that was written to them in the function, so under the right circumstances it's possible for you to read what was stored in m through p after abc has returned. Do not rely on this behavior being repeatable; it is a coding error.
From the language standard (C99):
6.2.4 Storage durations of objects
...
2 The lifetime of an object is the portion of program execution during which storage is
guaranteed to be reserved for it. An object exists, has a constant address,25) and retains
its last-stored value throughout its lifetime.26) If an object is referred to outside of its
lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when
the object it points to reaches the end of its lifetime.
25) The term ‘‘constant address’’ means that two pointers to the object constructed at possibly different
times will compare equal. The address may be different during two different executions of the same
program.
26) In the case of a volatile object, the last store need not be explicit in the program.
Emphasis mine. Basically, you're doing something that the language definition explicitly calls out as undefined behavior, meaning the compiler is free to handle that situation any way it wants to. It can issue a diagnostic (which your compiler is doing), it can translate the code without issuing a diagnostic, it can halt translation at that point, etc.
The only way you can make m still valid memory (keeping the maximum resemblance with your code) when you exit the function, is to prepend it with the static keyword
int* abc()
{
static int m;
m = 42;
return &m;
}
Anything after a return is a "dead branch" that won't be ever executed.
int m should be locally visible. You should create it as int* m and return it directly.