I was just looking into the memory allocation of a program in C. I know that all the global and static variables are stored in a heap. Also, the stack stores all the function calls. I do have one doubt here though. Say I am calling the following function:
int ret;
int num = 10;
int arr[3] = {1,2,3};
int *ptr = &arr[0];
ret = giveNumber(num, ptr);
Here, I read that the parameters of the function call giveNumber() would also be stored on the same stack. But in what order will they be stored? If I popped the top of the stack, which parameter would be popped first, num or ptr?
I know that all the global and static variables are stored in a heap.
No, that's not true.
The standard does not say where they are stored; that is left to the implementation. Typically, initialized globals and statics go in the data segment and zero-initialized ones go in the BSS.
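To make the point about static storage concrete, here is a minimal sketch. The names initialized_global, zeroed_global and count_calls are made up for illustration, and the segment names in the comments describe what typical implementations do, not something the standard promises.

```c
int initialized_global = 42;   /* static storage; typically placed in the data segment */
int zeroed_global;             /* static storage; typically placed in the BSS, starts at 0 */

/* Returns how many times it has been called so far. The counter has
   static storage duration, so it keeps its value between calls and
   does not live in the function's stack frame. */
int count_calls(void) {
    static int calls;          /* zero-initialized before the program starts */
    calls++;
    return calls;
}
```

None of these objects comes from malloc, so none of them is on the heap in the usual sense of the word.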
If I popped the top of stack, which parameter will be popped first, num or ptr
The order of evaluation of arguments to a function is Unspecified.
So it depends on your compiler implementation. A compiler might evaluate the arguments from:
Left to Right or
Right to Left or
Any other random order
So the behavior & order you see would depend on this.
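If you want to see what your own compiler does, a rough sketch like the following can help. The trace() helper and the body of giveNumber() are assumptions made up for this example; the question does not show them. Whatever order you observe is only what one compiler chose for one build, not something to rely on.

```c
#include <stdio.h>

static int order_log[2];   /* ids of the argument expressions, in evaluation order */
static int order_count;

/* Records that argument expression `id` was evaluated, then yields `value`. */
static int trace(int id, int value) {
    order_log[order_count++] = id;
    return value;
}

/* Hypothetical body; the original question does not show one. */
static int giveNumber(int num, const int *ptr) {
    return num + *ptr;
}

/* Calls giveNumber with two traced arguments and returns the id (1 or 2)
   of the argument expression that happened to be evaluated first. */
int first_evaluated(void) {
    int arr[3] = {1, 2, 3};
    order_count = 0;
    int ret = giveNumber(trace(1, 10), &arr[trace(2, 0)]);
    printf("ret = %d, argument %d was evaluated first\n", ret, order_log[0]);
    return order_log[0];
}
```

Calling first_evaluated() from main may report 1 with one compiler and 2 with another; both results are conforming.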
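Adding to what @Als has already mentioned: most compilers on x86 follow the cdecl calling convention, in which arguments are pushed onto the stack from right to left. With the call above, ptr would typically be pushed first, so num would end up on top of the stack. Note that this is the order in which the arguments are passed, which is not necessarily the order in which they are evaluated. Learn more here: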
http://en.wikibooks.org/wiki/X86_Disassembly/Calling_Conventions#Standard_C_Calling_Conventions
Consider the following code.
#include<stdio.h>
int *abc(); // this function returns a pointer of type int
int main()
{
int *ptr;
ptr = abc();
printf("%d", *ptr);
return 0;
}
int *abc()
{
int i = 45500, *p;
p = &i;
return p;
}
Output:
45500
I know that, according to link, this type of behavior is undefined. But why am I getting the correct value every time I run the program?
Every time you call abc it "marks" a region at the top of the stack as the place where it will write all of its local variables. It does that by moving the pointer that indicates where the top of stack is. That region is called the stack frame. When the function returns, it indicates that it does not want to use that region anymore by moving the stack pointer to where it was originally. As a result, if you call other functions afterwards, they will reuse that region of the stack for their own purposes. But in your case, you haven't called any other functions yet. So that region of the stack is left in the same state.
All of the above explains the behavior of your code. C compilers are not required to implement functions that way, so you should not rely on that behavior.
Well, undefined behavior is just that: undefined. You can never rely on UB (or on the output of a program that invokes UB).
Maybe, just maybe, in your environment and for your code, the memory used for the local variable has not been reused yet and is still accessible, but there's no guarantee that you will see the same behavior on any other platform.
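If the goal is just to get the value out of the function, there are ways to do it with fully defined behavior. A minimal sketch, where abc_by_value and abc_into are made-up names and not part of the original program:

```c
/* Option 1: return the value itself; the caller receives its own copy. */
int abc_by_value(void) {
    int i = 45500;
    return i;
}

/* Option 2: the caller owns the storage and passes its address in,
   so nothing points into a stack frame that has already ended. */
void abc_into(int *out) {
    *out = 45500;
}
```

In main you would then write either int v = abc_by_value(); or int v; abc_into(&v); and print v as before.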
I would like to understand the difference between the following two C programs.
First program:
void main()
{
int *a;
{
int b = 10;
a=&b;
}
printf("%d\n", *a);
}
Second program:
void main()
{
int *a;
a = foo();
printf("%d\n", *a);
}
int* foo()
{
int b = 10;
return &b;
}
In both cases, the address of a local variable (b) is assigned to a. I know that the memory a is pointing to should not be accessed once b goes out of scope. However, when compiling the above two programs, I receive the following warning for the second program only:
warning C4172: returning address of local variable or temporary
Why do I not get a similar warning for the first program?
Since you already know that b goes out of scope in each case and that accessing that memory is illegal, I will only share my thoughts on why one case triggers the warning and the other doesn't.
In the second case, you're returning the address of a variable stored in stack memory. The compiler detects this and warns you about it.
The first case, however, escapes the compiler's check because the compiler sees that a valid, initialized address is assigned to a. In many cases the compiler relies on the programmer's judgment.
Similar examples for depicting your first case could be,
char temp[3] ;
strcpy( temp, "abc" ) ;
The compiler sees that temp has memory space, but it is up to the programmer how many chars they copy into that memory region. Here "abc" needs four bytes including the terminating null character, one more than temp provides.
Your foo() function has undefined behavior, since it returns a pointer to a part of stack memory that is no longer in use and that will soon be overwritten, for example by the next function call.
This is what "b has gone out of scope" means.
Sure, the memory still exists and has probably not changed so far, but this is not guaranteed.
The same applies to your first program, since the scope of b also ends with the closing brace of the block where b is declared.
Edit:
You did not get the warning in the first program because you did not return anything. The warning explicitly refers to return. Since the compiler may allocate the stack space for the complete function at once, including all sub-blocks, the value may in practice not be overwritten. Nevertheless, it is undefined behavior.
You may get additional warnings if you use a higher warning level.
In the first code snippet even though you explicitly add brackets the stack space you are using is in the same region; there are no jumps or returns in the code so the code still uses consecutive memory addresses from the stack. Several things happen:
The compiler will not push additional variables on the stack even if you take out the code block.
You are only restricting the visibility of variable b to that code-block; which is more or less the same as if you would declare it at the beginning and only use it once in the exact same place, but without the { ... }
The value of b is most likely kept in a register, so there would be no problem printing it later, but this is speculative.
For the second code snippet, the function call means a jump and a return which means:
push the current stack pointer and the context onto the stack
push the relevant values for the function call onto the stack
jump to the function code
execute the function code
restore the stack pointer to its value before the function call
Because the stack pointer has been restored, anything that was on the stack is not lost (yet), but any later operations on the stack are likely to overwrite those values.
I think it is easy to see why you get the warning in only one case and what the expected behavior can be...
Maybe it is related to the compiler's implementation. In the second program, the compiler can easily tell that the return statement deserves a warning, because the program returns the address of a variable that is about to go out of scope. I think this is easy to detect using information about the ebp register. In the first program, however, the compiler would need to do more work to detect it.
Both of your programs invoke undefined behaviour. Statements grouped together within curly braces are called a block or a compound statement. Any variable defined in a block has scope in that block only. Once you leave the block, that variable ceases to exist and it is illegal to access it.
int main(void) {
int *a;
{ // block scope starts
int b = 10; // b exists in this block only
a = &b;
} // block scope ends
// *a refers to b, whose lifetime has already ended
// this invokes undefined behaviour
printf("%d\n", *a);
}
Likewise, automatic variables defined in a function exist only while that function is executing. Once the function returns, the variables that were allocated on the stack are no longer accessible. That explains the warning you get for your second program. If you want to return a pointer to an object from a function, then you should allocate the object dynamically.
int main(void) {
int *a;
a = foo();
printf("%d\n", *a);
}
int *foo(void) {
int b = 10; // local variable
// returning the address of b which no longer exists
// after the function foo returns
return &b;
}
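The dynamic allocation mentioned above could look roughly like this. foo_heap is a made-up name for a replacement of foo(), and the error handling is kept to the bare minimum:

```c
#include <stdlib.h>

/* Hypothetical replacement for foo(): the int is allocated with malloc,
   so it stays valid after the function returns.
   Returns NULL if the allocation fails. */
int *foo_heap(void) {
    int *b = malloc(sizeof *b);
    if (b == NULL) {
        return NULL;
    }
    *b = 10;
    return b;   /* the caller is responsible for calling free() */
}
```

The caller would check the result for NULL, use *a, and call free(a) when done.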
Also, the signature of main should be one of the following -
int main(void);
int main(int argc, char *argv[]);
In your first program:
The variable b is a block-level variable, and it is visible inside that block only.
In practice, though, most compilers reserve the storage for b for the whole function call, so it stays allocated until main exits.
Since b's storage is still allocated, *a prints the value stored there, because a points to b. Note that the standard ends b's lifetime at the closing brace, so this is still undefined behaviour.
I wanted to know in which section of the program are function pointers stored? As in, is it on the program stack or is there a separate section for the same?
#include <stddef.h>
void f(void){}
int main(void){
int x[10];
void (*fp)(void) = NULL;
fp = f;
return 0;
}
Now, will the address of x and fp be in the same segment of the program's stack memory?
A function pointer is no different from any other pointer in terms of storage, which is again no different from any other variable. So yes, they'll all be stored together in the same place, which is the stack for local variables.
With a good compiler, they won't exist anywhere because their values are never used and contribute nothing to the output of the program.
The answer to this precise question is that your two examples (an array of ints and a pointer-to-a-function) are both local variables and both are kept on "the stack" (the stack is a bit conceptual but at the level of your question, it's the right way to think about it), so the addresses of x and fp are both there.
What you might be getting at, however (with "which section of the program are function pointers stored"), may be something a bit different: if you assign a value to the pointer-to-function, that is, you assign it the address of an actual function, the address it contains will almost certainly be somewhere else, because executable code is located in a different part of memory than the execution stack.
(The array of ints is allocated entirely on the stack and if you treat x as a pointer, it will point into the stack area.)
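A small sketch to see this on your own machine. show_locals and the calls counter are made up for the example; the printed addresses are implementation-specific, so all you can expect is that the two locals usually sit close together.

```c
#include <stdio.h>

static int calls;                 /* counts how often f has run */
static void f(void) { calls++; }

/* Prints where the two local objects are stored, then calls f through
   the pointer. Returns the number of times f has run so far. */
int show_locals(void) {
    int x[10] = {0};
    void (*fp)(void) = f;         /* fp is a local object; f's code lives elsewhere */

    printf("x  is stored at %p\n", (void *)x);
    printf("fp is stored at %p\n", (void *)&fp);

    fp();                         /* the call goes through the pointer */
    return calls;
}
```

Note that &fp is the address of the pointer object itself, which is what the question asks about. The value held in fp is the address of f's code, and that is a different thing.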
If I create a variable within a new set of curly braces, is that variable popped off the stack on the closing brace, or does it hang out until the end of the function? For example:
void foo() {
int c[100];
{
int d[200];
}
//code that takes a while
return;
}
Will d be taking up memory during the code that takes a while section?
No, braces do not act as a stack frame. In C, braces only denote a naming scope, but nothing gets destroyed nor is anything popped off the stack when control passes out of it.
As a programmer writing code, you can often think of it as if it is a stack frame. The identifiers declared within the braces are only accessible within the braces, so from a programmer's point of view, it is like they are pushed onto the stack as they are declared and then popped when the scope is exited. However, compilers don't have to generate code that pushes/pops anything on entry/exit (and generally, they don't).
Also note that local variables may not use any stack space at all: they could be held in CPU registers or in some other auxiliary storage location, or be optimized away entirely.
So, the d array, in theory, could consume memory for the entire function. However, the compiler may optimize it away, or share its memory with other local variables whose usage lifetimes do not overlap.
The time during which the variable is actually taking up memory is obviously compiler-dependent (and many compilers don't adjust the stack pointer when inner blocks are entered and exited within functions).
However, a closely related but possibly more interesting question is whether the program is allowed to access that inner object outside the inner scope (but within the containing function), ie:
void foo() {
int c[100];
int *p;
{
int d[200];
p = d;
}
/* Can I access p[0] here? */
return;
}
(In other words: is the compiler allowed to deallocate d, even if in practice most don't?).
The answer is that the compiler is allowed to deallocate d, and accessing p[0] where the comment indicates is undefined behaviour (the program is not allowed to access the inner object outside of the inner scope). The relevant part of the C standard is 6.2.4p5:
For such an object [one that has automatic storage duration] that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate. If an initialization is specified for the object, it is performed each time the declaration is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.
Your question is not clear enough to be answered unambiguously.
On the one hand, compilers don't normally do any local memory allocation-deallocation for nested block scopes. The local memory is normally allocated only once at function entry and released at function exit.
On the other hand, when the lifetime of a local object ends, the memory occupied by that object can be reused for another local object later. For example, in this code
void foo()
{
{
int d[100];
}
{
double e[20];
}
}
both arrays will usually occupy the same memory area, meaning that the total amount of local storage needed by function foo is whatever is necessary for the larger of the two arrays, not for both of them at the same time.
Whether the latter qualifies as d continuing to occupy memory till the end of function in the context of your question is for you to decide.
It's implementation dependent. I wrote a short program to test what gcc 4.3.4 does, and it allocates all of the stack space at once at the start of the function. You can examine the assembly that gcc produces using the -S flag.
No, d[] will not be on the stack for the remainder of routine. But alloca() is different.
Edit: Kristopher Johnson (and simon and Daniel) are right, and my initial response was wrong. With gcc 4.3.4 on Cygwin, the code:
void foo(int[]);
void bar(void);
void foobar(int);
void foobar(int flag) {
if (flag) {
int big[100000000];
foo(big);
}
bar();
}
gives:
_foobar:
pushl %ebp
movl %esp, %ebp
movl $400000008, %eax
call __alloca
cmpl $0, 8(%ebp)
je L2
leal -400000000(%ebp), %eax
movl %eax, (%esp)
call _foo
L2:
call _bar
leave
ret
Live and learn! And a quick test seems to show that AndreyT is also correct about multiple allocations.
Added much later: The above test shows the gcc documentation is not quite right. For years it has said (emphasis added):
"The space for a variable-length array is deallocated as soon as the array name's scope ends."
They might. They might not. The answer I think you really need is: Don't ever assume anything. Modern compilers do all kinds of architecture and implementation-specific magic. Write your code simply and legibly to humans and let the compiler do the good stuff. If you try to code around the compiler you're asking for trouble - and the trouble you usually get in these situations is usually horribly subtle and difficult to diagnose.
Your variable d is typically not popped off the stack. Curly braces do not denote a stack frame. Otherwise, you would not be able to do something like this:
char var = getch();
{
char next_var = var + 1;
use_variable(next_var);
}
If curly braces caused a true stack push/pop (like a function call would), then the above code would not compile because the code inside the braces would not be able to access the variable var that lives outside the braces (just like a sub-function cannot directly access variables in the calling function). We know that this is not the case.
Curly braces are simply used for scoping. The compiler will treat any access to the "inner" variable from outside the enclosing braces as invalid, and it may re-use that memory for something else (this is implementation-dependent). However, it may not be popped off of the stack until the enclosing function returns.
Update: Here's what the C spec has to say. Regarding objects with automatic storage duration (section 6.2.4):
For an object that does not have a variable length array type, its
lifetime extends from entry into the block with which it is associated
until execution of that block ends in any way.
The same section defines the term "lifetime" as (emphasis mine):
The lifetime of an object is the portion of program execution during
which storage is guaranteed to be reserved for it. An object exists,
has a constant address, and retains its last-stored value throughout
its lifetime. If an object is referred to outside of its lifetime, the
behavior is undefined.
The key word here is, of course, 'guaranteed'. Once you leave the scope of the inner set of braces, the array's lifetime is over. Storage may or may not still be allocated for it (your compiler might re-use the space for something else), but any attempts to access the array invoke undefined behavior and bring about unpredictable results.
The C spec has no notion of stack frames. It speaks only to how the resulting program will behave, and leaves the implementation details to the compiler (after all, the implementation would look quite different on a stackless CPU than it would on a CPU with a hardware stack). There is nothing in the C spec that mandates where a stack frame will or will not end. The only real way to know is to compile the code on your particular compiler/platform and examine the resulting assembly. Your compiler's current set of optimization options will likely play a role in this as well.
If you want to ensure that the array d is no longer eating up memory while your code is running, you can either convert the code in curly braces into a separate function or explicitly malloc and free the memory instead of using automatic storage.
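Both options can be sketched as follows. The function names and the summing work are made up for illustration. Keep in mind that a compiler is free to inline the helper, so even the first option is not a hard guarantee about stack usage.

```c
#include <stdlib.h>

/* Option 1: move the block into its own function. d belongs to this
   function's frame, which is normally released when it returns. */
long sum_in_helper(void) {
    int d[200];
    long sum = 0;
    for (int i = 0; i < 200; i++) {
        d[i] = i;
        sum += d[i];
    }
    return sum;
}

/* Option 2: allocate d explicitly and free it as soon as it is no
   longer needed. Returns -1 if the allocation fails. */
long sum_on_heap(void) {
    int *d = malloc(200 * sizeof *d);
    long sum = 0;
    if (d == NULL) {
        return -1;
    }
    for (int i = 0; i < 200; i++) {
        d[i] = i;
        sum += d[i];
    }
    free(d);   /* the memory is given back before any slow code runs */
    return sum;
}
```

foo() would call one of these in place of the inner block and then carry on with the code that takes a while.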
I believe that it does go out of scope, but is not popped off the stack until the function returns. So it will still take up memory on the stack until the function completes, but it will not be accessible after the first closing curly brace.
Much information on the standard has already been given, indicating that this is indeed implementation-specific.
So, one experiment might be of interest. If we try the following code:
#include <stdio.h>
int main() {
int* x;
int* y;
{
int a;
x = &a;
printf("%p\n", (void*) x);
}
{
int b;
y = &b;
printf("%p\n", (void*) y);
}
}
Using gcc, we get the same address twice here: Coliru
But if we try the following code:
#include <stdio.h>
int main() {
int* x;
int* y;
{
int a;
x = &a;
}
{
int b;
y = &b;
}
printf("%p\n", (void*) x);
printf("%p\n", (void*) y);
}
Using gcc, we get two different addresses here: Coliru
So you can't really be sure what is going on.