Stack frame memory allocation - c

Like every function is put on a stack frame for its execution and it is flushed after its completion. So, any local variable wont be available to other functions. But then how are we able to return a local variable to the caller?
int pickMin( int x, int y, int z ) {
int min = x ;
if ( y < min )
min = y ;
if ( z < min )
min = z ;
return min ; }
The above code works fine. However the in the below code, compiler does give a warning message- "warning: function returns address of local variable [-Wreturn-local-addr] return a; " but it prints a garbage value at the end, which I think is fine because the variable has already been flushed. But why didn't that happen in the ABOVE program?! I mean, it should also have returned me a garbage value.Moreover, I know that the problem in the below code can be solved using malloc, and then returning that value. :)
int *returnarray(){
int a[10]; int i;
for(i=0;i<10;++i){
a[i] = i;
}return a;}

C passes everything around by value. In your first snippet, return min returns an int variable. Its value is returned. The second snippet consists of return and an array name, which decays into a pointer.
The value that is returned, is the memory address of a local variable. The function where this variable existed has returned, though, and accessing the memory that this function used then invokes undefined behaviour.
The way to handle this kind of situation (ie: needing to return an array) is either by passing the target array to the function as an argument, or by allocating the memory using malloc and returning that pointer.
Heap memory is a tad slower, more error prone, and requires you to look after it though. Still, here's an example of both approaches
create_fill allocates, assigns and returns a pointer to the heap memory, fill_array doesn't return anything, but expects you to pass an array (which decays into a pointer), and a max length to fill. The advantage being: stack memory doesn't require as much care, and will outperform the heap.

The return statement des exactly that: it copies the value of a variable, and leaves it on top of the stack so the calling function can use it. Now, in C this works for simple values, not for arrays because your "array variable" a is actually the address of its first value only.

First, read carefully call stack wikipage. There are nice pictures on it. See also the x86 calling conventions wikipage.
Then, the result (of some C function) often goes thru a register when returned (or, for large struct-s, in a stack space allocated by the caller).
Details are ABI specific. For Linux on x86-64 (in 64 bits), the x86-64 ABI mentions the %rax register to return a result (in common cases, but when the result is a large struct the caller passes an address for it).
BTW, in principle, I believe that nothing in the C99 standard requires a stack, but I know no C implementations without a call stack (generally the processor stack, i.e. thru the stack register).

In first case you return value of a variable.
While, in second case you return address of a local variable which as you correctly said is not available for other functions. In C, the name of an array is the base address of that array. Hence in second case, "base address of array" is copied to any variable/pointer, which is assigned the return value of that function

Related

Returning an array from function in C

I've written a function that returns an array whilst I know that I should return a dynamically allocated pointer instead, but still I wanted to know what happens when I am returning an array declared locally inside a function (without declaring it as static), and I got surprised when I noticed that the memory of the internal array in my function wasn't deallocated, and I got my array back to main.
The main:
int main()
{
int* arr_p;
arr_p = demo(10);
return 0;
}
And the function:
int* demo(int i)
{
int arr[10] = { 0 };
for (int i = 0; i < 10; i++)
{
arr[i] = i;
}
return arr;
}
When I dereference arr_p I can see the 0-9 integers set in the demo function.
Two questions:
How come when I examined arr_p I saw that its address is the same as arr which is in the demo function?
How come demo_p is pointing to data which is not deallocated (the 0-9 numbers) already in demo? I expected that arr inside demo will be deallocated as we got out of demo scope.
One of the things you have to be careful of when programming is to pay good attention to what the rules say, and not just to what seems to work. The rules say you're not supposed to return a pointer to a locally-allocated array, and that's a real, true rule.
If you don't get an error when you write a program that returns a pointer to a locally-allocated array, that doesn't mean it was okay. (Although, it means you really ought to get a newer compiler, because any decent, modern compiler will warn about this.)
If you write a program that returns a pointer to a locally-allocated array and it seems to work, that doesn't mean it was okay, either. Be really careful about this: In general, in programming, but especially in C, seeming to work is not proof that your program is okay. What you really want is for your program to work for the right reasons.
Suppose you rent an apartment. Suppose, when your lease is up, and you move out, your landlord does not collect your key from you, but does not change the lock, either. Suppose, a few days later, you realize you forgot something in the back of one closet. Suppose, without asking, you sneak back to try to collect it. What happens next?
As it happens, your key still works in the lock. Is this a total surprise, or mildly unexpected, or guaranteed to work?
As it happens, your forgotten item still is in the closet. It has not yet been cleared out. Is this a total surprise, or mildly unexpected, or guaranteed to happen?
In the end, neither your old landlord, nor the police, accost you for this act of trespass. Once more, is this a total surprise, or mildly unexpected, or just about completely expected?
What you need to know is that, in C, reusing memory you're no longer allowed to use is just about exactly analogous to sneaking back in to an apartment you're no longer renting. It might work, or it might not. Your stuff might still be there, or it might not. You might get in trouble, or you might not. There's no way to predict what will happen, and there's no (valid) conclusion you can draw from whatever does or doesn't happen.
Returning to your program: local variables like arr are usually stored on the call stack, meaning they're still there even after the function returns, and probably won't be overwritten until the next function gets called and uses that zone on the stack for its own purposes (and maybe not even then). So if you return a pointer to locally-allocated memory, and dereference that pointer right away (before calling any other function), it's at least somewhat likely to "work". This is, again, analogous to the apartment situation: if no one else has moved in yet, it's likely that your forgotten item will still be there. But it's obviously not something you can ever depend on.
arr is a local variable in demo that will get destroyed when you return from the function. Since you return a pointer to that variable, the pointer is said to be dangling. Dereferencing the pointer makes your program have undefined behavior.
One way to fix it is to malloc (memory allocate) the memory you need.
Example:
#include <stdio.h>
#include <stdlib.h>
int* demo(int n) {
int* arr = malloc(sizeof(*arr) * n); // allocate
for (int i = 0; i < n; i++) {
arr[i] = i;
}
return arr;
}
int main() {
int* arr_p;
arr_p = demo(10);
printf("%d\n", arr_p[9]);
free(arr_p) // free the allocated memory
}
Output:
9
How come demo_p is pointing to data which is not deallocated (the 0-9 numbers) already in demo? I expected that arr inside demo will be deallocated as we got out of demo scope.
The life of the arr object has ended and reading the memory addresses previously occupied by arr makes your program have undefined behavior. You may be able to see the old data or the program may crash - or do something completely different. Anything can happen.
… I noticed that the memory of the internal array in my function wasn't deallocated…
Deallocation of memory is not something you can notice or observe, except by looking at the data that records memory reservations (in this case, the stack pointer). When memory is reserved or released, that is just a bookkeeping process about what memory is available or not available. Releasing memory does not necessarily erase memory or immediately reuse it for another purpose. Looking at the memory does not necessarily tell you whether it is in use or not.
When int arr[10] = { 0 }; appears inside a function, it defines an array that is allocated automatically when the function starts executing (or at certain times within the function execution if the definition is in some nested scope). This is commonly done by adjusting the stack pointer. In common systems, programs have a region of memory called the stack, and a stack pointer contains an address that marks the end of the portion of the stack that is currently reserved for use. When a function starts executing, the stack pointer is changed to reserve more memory for that function’s data. When execution of the function ends, the stack pointer is changed to release that memory.
If you keep a pointer to that memory (how you can do that is another matter, discussed below), you will not “notice” or “observe” any change to that memory immediately after the function returns. That is why you see the value of arr_p is the address that arr had, and it is why you see the old data in that memory.
If you call some other function, the stack pointer will be adjusted for the new function, that function will generally use the memory for its own purposes, and then the contents of that memory will have changed. The data you had in arr will be gone. A common example of this that beginners happen across is:
int main(void)
{
int *p = demo(10);
// p points to where arr started, and arr’s data is still there.
printf("arr[3] = %d.\n", p[3]);
// To execute this call, the program loads data from p[3]. Since it has
// not changed, 3 is loaded. This is passed to printf.
// Then printf prints “arr[3] = 3.\n”. In doing this, it uses memory
// on the stack. This changes the data in the memory that p points to.
printf("arr[3] = %d.\n", p[3]);
// When we try the same call again, the program loads data from p[3],
// but it has been changed, so something different is printed. Two
// different things are printed by the same printf statement even
// though there is no visible code changing p[3].
}
Going back to how you can have a copy of a pointer to memory, compilers follow rules that are specified abstractly in the C standard. The C standard defines an abstract lifetime of the array arr in demo and says that lifetime ends when the function returns. It further says the value of a pointer becomes indeterminate when the lifetime of the object it points to ends.
If your compiler is simplistically generating code, as it does when you compile using GCC with -O0 to turn off optimization, it typically keeps the address in p and you will see the behaviors described above. But, if you turn optimization on and compile more complicated programs, the compiler seeks to optimize the code it generates. Instead of mechanically generating assembly code, it tries to find the “best” code that performs the defined behavior of your program. If you use a pointer with indeterminate value or try to access an object whose lifetime has ended, there is no defined behavior of your program, so optimization by the compiler can produce results that are unexpected by new programmers.
As you know dear, the existence of a variable declared in the local function is within that local scope only. Once the required task is done the function terminates and the local variable is destroyed afterwards. As you are trying to return a pointer from demo() function ,but the thing is the array to which the pointer points to will get destroyed once we come out of demo(). So indeed you are trying to return a dangling pointer which is pointing to de-allocated memory. But our rule suggests us to avoid dangling pointer at any cost.
So you can avoid it by re-initializing it after freeing memory using free(). Either you can also allocate some contiguous block of memory using malloc() or you can declare your array in demo() as static array. This will store the allocated memory constant also when the local function exits successfully.
Thank You Dear..
#include<stdio.h>
#define N 10
int demo();
int main()
{
int* arr_p;
arr_p = demo();
printf("%d\n", *(arr_p+3));
}
int* demo()
{
static int arr[N];
for(i=0;i<N;i++)
{
arr[i] = i;
}
return arr;
}
OUTPUT : 3
Or you can also write as......
#include <stdio.h>
#include <stdlib.h>
#define N 10
int* demo() {
int* arr = (int*)malloc(sizeof(arr) * N);
for(int i = 0; i < N; i++)
{
arr[i]=i;
}
return arr;
}
int main()
{
int* arr_p;
arr_p = demo();
printf("%d\n", *(arr_p+3));
free(arr_p);
return 0;
}
OUTPUT : 3
Had the similar situation when i have been trying to return char array from the function. But i always needed an array of a fixed size.
Solved this by declaring a struct with a fixed size char array in it and returning that struct from the function:
#include <time.h>
typedef struct TimeStamp
{
char Char[9];
} TimeStamp;
TimeStamp GetTimeStamp()
{
time_t CurrentCalendarTime;
time(&CurrentCalendarTime);
struct tm* LocalTime = localtime(&CurrentCalendarTime);
TimeStamp Time = { 0 };
strftime(Time.Char, 9, "%H:%M:%S", LocalTime);
return Time;
}

puzzled with the output

why does printf prints 7 although the variable a was local to the function fun() and should no longer exist once the control returns from function fun().
Here is the c code
‎#include<stdio.h>
main()
{
int *fun();
int *c=fun();
printf("%d",*c);
getch();
}
int *fun()
{
int a=7;
return(&a);
}
output : 7
This is because even if the variable does not exist anymore, the memory location where it was has not yet been used for something else. Hence, the pointer still points to a memory location where the bits contains an int with value 7.
But this is definitely undefined behavior. You should not rely on it.
There is a difference between the language idioms and the physical operation of the hardware. In "C" words, yes, your variable should not be accessed anymore, but physically the variable a has been allocated on the stack of your program, which is not erased each time a function returns (it would take too much time), thus you can still read it.
Anyway, this is not recommended because other function calls may erase this data.
once the fun() return, the frame pointer has been set back to point the main() frame again. the pointer c pointed to some address in the memory, since the fun() has already returned, we don't know what's in the adress,but if nothing else has been written to the adress, it can still be the previous variable a. the C standard simply move the frame pointer when a function returns.
I think its because you are printing *c , which displays the value which is stored on the location i.e &a , just try to print c then you will get the address of 7 .
It is because the address is already passed to the variable c and the memory address can be read outside the function as well

Array Destruction At The End Of Function Call

Here is my code
#include<stdio.h>
int * fun(int a1,int b)
{
int a[2];
a[0]=a1;
a[1]=b;
//int c=5;
printf("%x\n",&a[0]);
return a;
}
int main()
{
int *r=fun(3,5);
printf("%d\n",r[0]);
printf("%d\n",r[0]);
}
I am running codeblocks on Windows 7
Every time I run the loop I get the outputs as
22fee8
3
2293700
Here is the part I do not understand :
r expects a pointer to a part of memory which is interpreted as a sequence of boxes (each box of 4 byte width - >Integers ) on invoking fun function
What should happen is printf of function will print the address of a or address of a[0]:
Seconded
NOW THE QUESTION IS :
each time I run the program I get the same address?
And the array a should be destroyed at the end of Function fun only pointer must remain after function call
Then why on earth does the line r[0] must print 3?
r is pointing to something that doesn't exist anymore. You are returning a pointer to something on the stack. That stack will rewind when fun() ends. It can point to anything after that but nothing has overwritten it because another function is never called.
Nothing forces r[0] to be 3 - it's just a result of going for the simplest acceptable behaviour.
Basically, you're right that a must be destroyed at the end of fun. All this means is that the returned pointer (r in your case) is completely unreliable. That is, even though r[0] == 3 for you on your particular machine with your particular compiler, there's no guarantee that this will always hold on every machine.
To understand why it is so consistent for you, think about this: what does is mean for a to be destroyed? Only that you can't use it in any reliable way. The simplest way of satisfying this simple requirement is for the stack pointer to move back to the point where fun was called. So when you use r[0], the values of a are still present, but they are junk data - you can't count on them existing.
This is what happens:
int a[2]; is allocated on the stack (or similar). Suppose it gets allocated at the stack at address 0x12345678.
Various data gets pushed on the stack at this address, as the array is filled. Everything works as expected.
The address 0x12345678 pointing at the stack gets returned. (Ironically, the address itself likely gets returned on the stack.)
The memory allocated on the stack for a ceases to be valid. For now the two int values still sit at the given address in RAM, containing the values assigned to them. But the stack pointer isn't reserving those cells, nor is anything else in the program keeping track of that data. Computers don't delete data by erasing the value etc, they delete cells by forgetting that anything of use is stored at that memory location.
When the function ends, those memory cells are free to be used for the rest of the program. For example, a value returned by the function might end up there instead.
The function returned a pointer to a segment on the stack where there used to be valid data. The pointer is still 0x12345678 but at that address, anything might be stored by now. Furthermore, the contents at that address may change as different items are pushed/popped from the stack.
Printing the contents of that address will therefore give random results. Or it could print the same garbage value each time the program is executed. In fact it isn't guaranteed to print anything at all: printing the contents of an invalid memory cell is undefined behavior in C. The program could even crash when you attempt it.
r is undefined after the stack of the function int * fun(int a1,int b) is released, right after it ends, so it can be 3 or 42 or whatever value. The fact that it still contains your expected value is because it haven't been used for anything else, as a chunk of your memory is reserved for your program and your program does not use the stack further. Then after the first 3 is printed you get another value, that means that stack was used for something else, you could blame printf() since it's the only thing runing and it does a LOT of things to get that numbers into the console.
Why does it always print the same results? Because you always do the same process, there's no magic in it. But there's no guarantee that it'll be 3 since that 'memory space' is not yours and you are only 'peeking' into it.
Also, check the optimization level of your compiler fun() and main(), being as simple as they are, could be inline'd or replaced if the binary is to be optimized reducing the 'randomness' expected in your results. Although I wouldn't expect it to change much either.
You can find pretty good answers here:
can-a-local-variables-memory-be-accessed-outside-its-scope
returning-the-address-of-local-or-temporary-variable
return-reference-to-local-variable
Though the examples are for C++, underlying idea is same.

Where does the return statement save its data?

#include<stdio.h>
int add(int,int);
int main()
{
int a;
a=add(5,7);
printf("%d",a);
}
int add(int x,int y)
{
x=x+y;
return(x);
}
Last night I got a doubt on return statement. See x is an auto variable defined inside the add function and as ANSI said an auto variable exists and has its lifetime only inside the function but here the return statement can make the variable a to exist even outside the function. Where does it store the value ? Stack or heap ?
Neither on the stack nor on the heap.
At least in the x86 architecture, the return value of functions is usually copied to the eax register, so it persists outside of the function. This is one of the reasons why you can't return an array but merely a pointer to it.
You can, however, return structs and other larger variables by value, in which case the compiler does some tricks by using the stack and additional registers such as edx.
What return actually does depends on the machine architecture to which the code is compiled. On a machine with registers the value of x is copied to a register. The caller will pick up the value from there and do something with it. Here, that value is copied to a variable named a.
If this code is compiled on a stack machine, the return value might be just left on the stack itself. The object code for calling the add function will look something like this:
push 7
push 5
call add
seti 0x28ccf4
add will pop the values from the stack, add them together and push the result back to the stack. (If the function deals with data structures that will not find into a slot on the stack, their addresses are pushed instead.) seti is an operation that will pop a value from the stack and assign it to the integer variable at a particular address. (Here, 0x28ccf4 is the address of a.). As the top of the stack will now contain the result of adding 5 and 7, the value of a will become 12.
It varies. In a typical case, you can count on anything up to at least the size of an int being returned in some register(s) (e.g., typically eax on x86, $v0-$v1 on MIPS, r0 on ARM).
On many machines, larger return values (e.g., a struct with several members) will result in the caller allocating space on the stack to hold the return value, and either passing a pointer to that space as a hidden argument to the function, or the function having implicit knowledge of the location of that space relative to (for example) the stack pointer when the function was entered.
Though the question is tagged C, I'll also mention that in C++ there are a few special rules about return values. Specifically, there's a return value optimization and a named return value optimization. These allow a compiler to elide code for copying a return value, even if/when (for example) the copy would normally have visible side effects. These are intended to allow optimization where the function constructs a return value where the calling code will need it, rather than constructing it locally in the function, and then making a copy back to where the calling code can use it.
It is stored in a register in most platforms I know about, but this could be changed by compiler implementation.
It is worth noticing that while return x where x is a value and not pointer, such as in you case is legitimate, if x is a pointer, you should make sure that the buffer / data it points to is valid when the function returns, otherwise it will be a valid pointer that points to invalid location.
This is most likely covered in programming language course if you are a CS student. Anyway, you can check out the Wikipedia.
And in fact, it is stored in the call stack, aka stack.
The return value of add function is copied back to 'a' in the main function. The auto variables declared within a function are present in the activation record for that function on the call stack and the activation record is removed at the end of function execution, after copying any return value back to the caller's variable..

How does temporary storage work in C when a function returns?

I know C pretty well, however I'm confused of how temporary storage works.
Like when a function returns, all the allocation happened inside that function is freed (from the stack or however the implementation decides to do this).
For example:
void f() {
int a = 5;
} // a's value doesn't exist anymore
However we can use the return keyword to transfer some data to the outside world:
int f() {
int a = 5;
return a;
} // a's value exists because it's transfered to the outside world
Please stop me if any of this is wrong.
Now here's the weird thing, when you do this with arrays, it doesn't work.
int []f() {
int a[1] = {5};
return a;
} // a's value doesn't exist. WHY?
I know arrays are only accessible by pointers, and you can't pass arrays around like another data structure without using pointers. Is this the reason you can't return arrays and use them in the outside world? Because they're only accessible by pointers?
I know I could be using dynamic allocation to keep the data to the outside world, but my question is about temporary allocation.
Thanks!
When you return something, its value is copied. a does not exist outside the function in your second example; it's value does. (It exists as an rvalue.)
In your last example, you implicitly convert the array a to an int*, and that copy is returned. a's lifetime ends, and you're pointing at garbage.
No variable lives outside its scope, ever.
In the first example the data is copied and returned to the calling function, however the second returns a pointer so the pointer is copied and returned, however the data that is pointed to is cleaned up.
In implementations of C I use (primarily for embedded 8/16-bit microcontrollers), space is allocated for the return value in the stack when the function is called.
Before calling the function, assume the stack is this (the lines could represent various lengths, but are all fixed):
[whatever]
...
When the routine is called (e.g. sometype myFunc(arg1,arg2)), C throws the parameters for the function (arguments and space for the return value, which are all of fixed length) on to the stack, followed by the return address to continue code execution from, and possibly backs up some processor registers.
[myFunc local variables...]
[return address after myFunc is done]
[myFunc argument 1]
[myFunc argument 2]
[myFunc return value]
[whatever]
...
By the time the function fully completes and returns to the code it was called from, all of it's variables have been deallocated off the stack (they might still be there in theory, but there is no guarantee)
In any case, in order to return the array, you would need to allocate space for it somewhere else, then return the address to the 0th element.
Some compilers will store return values in temporary registers of the processor rather than using the stack, but it's rare (only seen it on some AVR compilers).
When you attempt to return a locally allocated array like that, the calling function gets a pointer to where the array used to live on the stack. This can make for some spectacularly gruesome crashes, when later on, something else writes to the array, and clobbers a stack frame .. which may not manifest itself until much later, if the corrupted frame is deep in the calling sequence. The maddening this with debugging this type of error is that real error (returning a local array) can make some other, absolutely perfect function blow up.
You still return a memory address, you can try to check its value, but the contents its pointing are not valid beyond the scope of function,so dont confuse value with reference.
int []f() {
int a[1] = {5};
return a;
} // a's value doesn't exist. WHY?
First, the compiler wouldn't know what size of array to return. I just got syntax errors when I used your code, but with a typedef I was able to get an error that said that functions can't return arrays, which I knew.
typedef int ia[1];
ia h(void) {
ia a = 5;
return a;
}
Secondly, you can't do that anyway. You also can't do
int a[1] = {4};
int b[1];
b = a; // Here both a and b are interpreted as pointer literals or pointer arithmatic
While you don't write it out like that, and the compiler really wouldn't even have to generate any code for it this operation would have to happen semantically for this to be possible so that a new variable name could be used to refer the value that was returned by the function. If you enclosed it in a struct then the compiler would be just fine with copying the data.
Also, outside of the declaration and sizeof statements (and possibly typeof operations if the compiler has that extension) whenever an array name appears in code it is thought of by the compiler as either a pointer literal or as a chunk of pointer arithmetic that results in a pointer. This means that the return statement would end looking like you were returning the wrong type -- a pointer rather than an array.
If you want to know why this can't be done -- it just can't. A compiler could implicitly think about the array as though it were in a struct and make it happen, but that's just not how the C standard says it is to be done.

Resources