Does the C compiler allocate memory for a variable?

I was wondering that in which stage, memory gets allocated to the variable.
Is it in the compilation stage or is it at the execution time?

Yes, and yes, both.
For a global variable (declared at file scope), the compiler reserves memory in the executable image. So this is compile-time.
For an automatic variable (declared in a function), the compiler adds instructions to allocate the variable on the stack. So this is run-time.
int a; // file scope
int f(void)
{
    int b; // function scope
    ...
}
Notes:
The compiler emits one set of instructions that allocates all local variables of a function at once. Generally, there is no per-variable overhead (there can be exceptions I don't discuss here). These instructions are executed every time the function is called.
The compiler does not allocate storage for your strings. This is an error beginners often make. Consider:
char *s; // a pointer to a string
scanf("%s", s); // no, the compiler will not allocate storage for the string to read.

It depends on the kind of variable/object.
Globally and statically allocated variables are known at compile time and their offsets in the data segment are baked into the program. So in a way, they got allocated at compile time.
Variables local to function scope are allocated on the stack. You could say that the compiler knew about them and the kind of storage they needed, but obviously they got allocated (in the sense of coming into existence) at run time, during a function call.
Another interesting object is the heap allocated object, which can be created with malloc/calloc in C and new or related mechanisms in C++. They are allocated at run time in the heap section.

There's a third kind of memory which is allocated dynamically by using malloc() and friends.
This memory is taken from the so-called heap, while automatic variables (those you have in functions) are taken from the so-called stack.
Then, if you're having a variable with an initializer (e.g. int i = 5;) that never changes value, a compiler may figure that out and not allocate memory at all. Instead it would just use 5 wherever you use that variable in your code.

Related

When is memory allocated and de-allocated static and dynamic memory in C?

I'm learning C now and trying to figure out how the memory management of C works. Please correct me if I am wrong, but as I know for:
Static memory allocation - this happens during compile time. The compiler allocates the necessary memory needed for static memory.
Static memory deallocation - the memory is deallocated automatically when the block/function is finished running (for local variables) or when the entire program has finished executing (for global variables).
Dynamic memory allocation - the memory is allocated during run-time because the size of the input is unknown at this time.
Dynamic memory deallocation - the memory is deallocated when the free() is executed.
Is this about right? Am I missing anything?
There are 3 different kinds of storage duration in C language:
static: the lifetime of the variable is the lifetime of the program. It is allocated at load time (only defined at compile time) and only freed when the operating system unloads the program. Static variables are variables declared outside any function, and local variables (declared in a function or block) having the static modifier.
automatic: automatic variables are declared inside a block (or function), with no storage modifier. Their lifetime starts at the beginning of the block and ends at the end of the block. They are generally allocated at the beginning of the block and deallocated at its end, but because of the as-if rule, optimizing compilers could allocate them sooner and free them later, for example if the block is located inside a loop.
dynamic: they are allocated manually through malloc, and will only be deallocated by free.
Common implementations use a system stack for automatic variables and a memory pool (asking the operating system when more memory is needed) for dynamic ones, but this is an implementation detail.
When multithreading is used, a fourth kind of storage duration is available: thread storage duration. Those variables are declared with the _Thread_local storage class specifier. Their lifetime is the duration of the thread, and each thread has its own copy of them.
For common implementation, they are managed the same as static variables: they are allocated by the operating system when the thread is created, and reclaimed (still by OS) when the thread ends.
Some remarks regarding your wordings:
Static memory allocation - this happens during compile time.
Beware, compile time and load time are different. At build time only a file is created, and memory is only allocated by the system at run time
Static memory deallocation - the memory is deallocated automatically when the block/function is finished running (for local variables)...
There is a confusion between scope (local vs. global) and storage duration. A function can contain static variables, that is one of the reasons for the static keyword
Dynamic memory allocation - the memory is allocated during run-time because the size of the input is unknown at this time
This is one possible reason for the programmer to use dynamic memory, but there might be others, for example because the code would be cleaner that way. In particular, dynamic memory is a nice tool when you want to mimic Object Oriented Programming in C language.
I think most of what you say is correct. I just wanted to add a few points.
For global and static variables that are initialized, the values are present in the resulting binary, so in that sense static allocation happens at compile time (strictly speaking it is space in the file, not memory). But consider uninitialized global variables (the bss section): only their total length is written to the binary image, because writing thousands of zeros to the compiled image would be silly. In this case memory allocation is handled by the loader at load time: it allocates the required space, maps it to the virtual addresses of your variables, and zeroes out the memory.
Also, free does not necessarily give the unused memory back to the operating system. Usually the C standard library keeps track of freed chunks and coalesces them when it can, so that the next malloc does not need an sbrk or equivalent system call, because those are relatively costly. I believe this is highly dependent on the library implementation.

When/where are local arrays allocated?

https://www.gnu.org/software/libc/manual/html_node/Memory-Allocation-and-C.html describes automatic allocation of local variables. I understand that local variables are commonly allocated on the stack. I can imagine how an int might be allocated on the stack; just push its value. But how might an array be allocated?
For example, if you declare an array char str[10];, does that 10 bytes of space go on the stack, or is it allocated somewhere else, and only the str pointer is pushed to the stack? If the latter, where is the 10 bytes of space allocated?
Furthermore, when exactly are local variables, including arrays, allocated? I commonly see heap allocation referred to as "dynamic allocation", implying that automatic variables are not dynamically allocated. But automatic variables may be declared within flow-of-control constructs and function bodies, so the compiler can't possibly know before runtime exactly how much space will be occupied by automatic variables. So automatic variables must also be dynamically allocated, right?
Edit: I would like to emphasize the first half of this question. I am most interested in understanding when and where the space for local arrays is allocated. On the stack? Somewhere else?
Edit 2: I made a mistake when I originally included the C++ tag for this question. I meant to ask only about the C language and its implementations. I apologize for any confusion.
In the C 2018 standard, clause 6.2.4, paragraphs 6 and 7 tell us about the lifetimes of objects with automatic storage duration. Paragraph 6 covers such objects that are not variable length arrays:
… its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time.
Thus, if we have this code:
{
PointA;
int x = 3;
PointB;
}
then x exists in the C model as soon as execution reaches PointA—its block was entered, and that is when the lifetime of x begins. However, although x already exists at PointA, its value is indeterminate. The initialization only occurs when the definition is reached.
Paragraph 7 tells us about variable length arrays:
… its lifetime extends from the declaration of the object until execution of the program leaves the scope of the declaration.
So, if we have this code:
{
PointA;
int x[n]; // n is some variable.
PointB;
}
then x does not exist at PointA. Its lifetime begins when int x[n]; is reached.
Keep in mind this existence is only in terms of C’s abstract model of computing. Compilers are allowed to optimize code as long as the observable results (such as output of the program) are the same. So the actual code generated by a compiler might not create x when the block is entered. (It might not create x at all; it could be optimized away completely.)
For example, if you declare an array char str[10];, does that 10 bytes of space go on the stack, or is it allocated somewhere else, and only the str pointer is pushed to the stack? If the latter, where is the 10 bytes of space allocated?
In general, the array's storage is allocated on the stack, just like any other local variable. This is compiler- and target-specific. Even on an x86_64 machine, a 4 billion byte array is probably not allocated on the stack. I'd expect one of: a compile error, a link error, a runtime error, or it works somehow. In the last case, the implementation might call malloc() (or new[] in C++) and leave a pointer to the array on the stack in place of the array.
Notice that the array and its "pointer" are the same thing: the name str designates the storage itself. So your wording "allocated somewhere else, and only the str pointer is pushed" might indicate confusion. The allocation and the name for it are not independent data.
What you ask depends on the language implementation (the compiler). To answer your question, this is (a simplified overview of) what compilers usually do for compiled languages (like C/C++):
When the compiler finishes parsing a function, it keeps a symbol table of all local variables declared in this function, even those declared "syntactically" during the instruction flow of the function (like loop-local variables). Later, when it needs to generate the final (assembly) code, it generates the necessary instructions to reserve (by pushing, or just by moving the stack pointer) sufficient space for all local variables. So loop-local variables, for instance, are not allocated when the loop starts execution. Rather, they are allocated at the beginning of the execution of the function containing the loop. The compiler also adds instructions to remove this allocated stack space before returning from the function.
So automatic variables, like your char array, are allocated entirely on the stack in this (common) scenario.
[EDIT] Variable length arrays (before C99)
The discussion above was for arrays having lengths known at compile time like this:
void f () {
char n[10];
....
}
If we stay in C language terms (before C99), variable-length arrays (arrays whose lengths are not known at compile-time, but rather at runtime) are declared as a pointer like this:
void f() {
char *n;
... //array is later allocated using some kind of memory allocation construct
}
This, in fact, just declares a pointer to the array. A pointer's size is known to the compiler. So, as I said above, the compiler will be able to reserve the necessary storage for the pointer on the stack (just the pointer, not the real array) regardless of what the size of the array will be at runtime. When execution reaches the line that allocates the array (using malloc, for instance), the array storage is dynamically allocated on the heap, and its address is stored in the local automatic variable n. In languages without garbage collection, this requires freeing (deallocating) the reserved storage from the heap manually (i.e. the programmer should add an instruction to do it in the program when the array is no longer needed). This is not necessary for constant-sized arrays (which are allocated on the stack) because the compiler removes the stack frame before returning from the function, as I said earlier.
[EDIT2]
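C99 variable length arrays cannot be given a fixed slot in the stack frame at compile time, because their size is only known at run time. The compiler must add code to the resulting machine code that handles their creation and destruction at run time (commonly by adjusting the stack pointer).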

All about C memory management

I want to move on from writing small one/two/three-file C programs to slightly larger C projects. For that I really want to get the memory management right. Now I know that similar questions have been asked, but they don't quite answer mine. I know some of the theory already and put it to use. Therefore I'd like to present what I know, or think I know, so that you can correct me or add information I am missing.
There is the stack, static memory and the heap
static int n; // goes to static memory, thread lifetime
char *array = malloc(256 * sizeof(*array)); // goes to heap, needs free(array)
func(int n) {
float f; // goes to stack, dies with func return
static double d; // thread lifetime again
}
Static memory can't overflow since it's set for all static variables,
however heap and stack can overflow, heap overflows when non allocated memory is accessed in most cases, stack is set to ~1MB Windows or ~8MB Linux, if that is exhausted (I got a message "core dumped" on my Ubuntu for setting up an array of structs for every pixel of an image on the stack)
Does static, stack and heap memory behave like this in every scope?
I know heap memory does. But does a global non static array of structs go on the stack? What if I have a static array in file b where there is no main?
Goes on to static memory right? And what is if a function has a local static variable with initialized value? Is it initialized everytime i call the function? and do functions take from the stack?
How to avoid exhausting the stack in large programs with long lifed stack variables and plenty of them? And most of all, where do string literals go?? They are supposed to be pointers not arrays, but if i change the pointer what happens to the original string? One more thing: It always bothers me how it looks straight up bad practise to write code like
if(!strcmp(a, "comparethis")) do ...
or
fprintf(stderr, "There was a problem .... %d", something);
EDIT
Where is the difference in terms of memory management?
char arr[3] = {'A', 'B', 'C'};
char arr[3] = "ABC";
char *arr = "ABC";
EDIT END
Is it good to include string literals anyway or rather read them from a file or whatnot?
Finally, sorry if my grammar failed me here and there but this was written in a fast manner. Thanks for reading and please no hate. Outline my mistakes, don't judge them, if I can ask that much. I wanna improve and fast.
Have a good day anyway.
There is the stack, static memory and the heap
Not exactly. The C standard defines automatic storage, static storage and dynamic storage. The stack is a possible (in fact the common) implementation for automatic storage, as is the heap for dynamic storage. These definitions govern the lifetime of variables:
automatic objects have their life bound by their containing block
static objects come to life at the beginning of the program, and their life extends to the end of the program
dynamic variables are created by malloc (and associated) function(s) from the standard library and shall be destroyed by free.
Provided these rules are obeyed, an implementation is free to physically store objects wherever it wants. In particular, some implementations were known to store automatic arrays on the heap and automatically destroy them at block exit.
Static memory can't overflow since it's set for all static variables
In the days of MS-DOS, the small memory model required all static variables to fit in a single 64 KB segment. If you wanted more, you could not compile in that mode. The failure could be a compile error if a single compilation unit exceeded the limit, or a link error if only the total size exceeded 64 KB.
stack is set to ~1MB Windows or ~8MB Linux
Compiler or linker options allow changing the stack size on common toolchains.
Now for your questions:
does a global non static array of structs go on the stack?
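No. An array declared at file scope has static storage duration whether or not it is declared static; at file scope the keyword only changes linkage. Only an array declared inside a block without static is automatic, and for that one the location can depend on the implementation: provided it is automatically destroyed when leaving the block, it can even be stored on the heap.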
What if I have a static array in file b where there is no main? Goes on to static memory right?
Yes it has to be static, whether the translation unit contains a main or not
And what is if a function has a local static variable with initialized value? Is it initialized everytime i call the function?
As it is static, it is created and initialized once, before the program starts running, and keeps its value between calls. It is not reinitialized on subsequent calls.
and do functions take from the stack?
What do you mean here? Common implementations do use the stack for the function's return address and all its automatic variables.
How to avoid exhausting the stack in large programs with long lifed stack variables and plenty of them?
You can increase the stack size at compile time, or use a different design. For example iterative algorithms are less stack consuming than recursive ones
And most of all, where do string literals go?
A string literal has static storage duration. Some implementations store them in a read-only segment, but there is no requirement for it. It is simply undefined behaviour to try to modify a string literal.
Where is the difference in terms of memory management?
char arr[3] = {'A', 'B', 'C'};
char arr[3] = "ABC";
char *arr = "ABC";
The first and second both declare a non-const array of 3 characters initialized with A, B and C, and no terminating null.
The third is quite different: it declares a pointer to a (null-terminated) string literal. That means that arr[0] = 'X'; is undefined behaviour because it modifies a literal. And sizeof(arr) is not the length of the string literal but sizeof(char *).
Is it good to include string literals anyway or rather read them from a file or whatnot?
I cannot understand the question. A string literal is available inside the program. A string stored in a file requires access to the file. So at least the file name has to be a string literal.

Variable-length arrays are created on heap, but we cannot free them?

I've verified that variable-length arrays are created on heap (see code below), but we cannot use a free operation to free them (cause a fault trap 6).
I was taught that the heap is managed by user and thus we have to explicitly free anything on heap if we don't need them. So who will be responsible to free these memories? Is this a defect of C?
Code that shows variable-length arrays are created on heap. (Platform: Mac OS X, gcc, 64bits)
#include<stdio.h>
#include<stdlib.h>
int main(int argc, char **argv)
{
int n = atoi(argv[1]);
int a1[10];
int a2[n];
int a3[10];
printf("address for a1:%p,address for a2:%p, address for a3:%p\n",a1,a2,a3);
printf("a1-a2: %lx, a1-a3: %lx\n",(a1-a2),(a1-a3));
//free(t); // will cause fault trap 6
return 0;
}
The result is:
$ ./run 10
address for a1:0x7fff5d095aa0,address for a2:0x7fff5d0959e0, address for a3:0x7fff5d095a70
a1-a2: 30, a1-a3: c
It's obvious that a1 and a3 is consecutive and thus are on stack, but a2 has a lower address, thus on heap.
Variable length arrays have automatic storage duration and have either block scope or function prototype scope.
So this array
int a2[n];
has automatic storage duration and the block scope of the function main. It was not created in the heap. It is the compiler that generates the corresponding code to free the allocated memory for the array when the control will exit the block scope.
According to the C Standard (6.2.4 Storage durations of objects)
7 For such an object that does have a variable length array type, its
lifetime extends from the declaration of the object until execution of
the program leaves the scope of the declaration.35) If the scope is
entered recursively, a new instance of the object is created each
time. The initial value of the object is indeterminate.
You may apply function free only to objects that were allocated using one of the memory allocation functions like malloc, calloc or realloc.
How an implementation allocates storage for VLAs is left to the implementation. VLAs have automatic storage duration and you should not try to free() them.
You should treat a VLA just like any other local variable for all practical purposes.
You only free() memory that you allocated using the malloc() family of functions.
VLAs are not supported by all implementations; they are a conditional feature.
The macro
__STDC_NO_VLA__
is used to test if VLAs are supported or not by implementation (if it's 1 then VLAs are not supported).
In my opinion, VLAs should not be used, mainly because:
They are optional in C11
Allocation failure is not portably detectable
Is this a defect of C?
Absolutely not. You made several wrong assumptions, which lead you to a wrong conclusion.
First, let's get terminology straight: "stack" is called automatic storage area; "heap" is called dynamic storage area. C standard does not make any claims about any of the things listed below:
Relative order of addresses in automatic and dynamic areas
Relative order of addresses of items within the same storage area
Presence or absence of gaps between allocations within the same area
This makes it impossible to determine if a variable is in an automatic or in a dynamic area simply by looking at numeric addresses, without making a guess. In particular, what appears "obvious" to you has nothing to do with what is actually happening.
So who will be responsible to free these memories?
You are responsible for calling free on everything that you allocated in the dynamic storage area. You do not allocate your variable-length array in the dynamic storage area*, hence you are not responsible for calling free on it.
* If a compiler implementation were to allocate a VLA in the dynamic storage area, the compiler would be responsible for calling free on that pointer.

Difference between static memory allocation and dynamic memory allocation

I would like to know what is the difference between static memory allocation and dynamic memory allocation?
Could you explain this with any example?
This is a standard interview question:
Dynamic memory allocation
Memory is allocated at runtime using calloc(), malloc() and friends. It is sometimes also referred to as 'heap' memory, although it has nothing to do with the heap data structure.
int * a = malloc(sizeof(int));
Heap memory is persistent until free() is called. In other words, you control the lifetime of the variable.
Automatic memory allocation
This is what is commonly known as 'stack' memory, and is allocated when you enter a new scope (usually when a new function is pushed on the call stack). Once you move out of the scope, the values of automatic memory addresses are undefined, and it is an error to access them.
int a = 43;
Note that scope does not necessarily mean function. Scopes can nest within a function, and the variable will be in-scope only within the block in which it was declared. Note also that where this memory is allocated is not specified. (On a sane system it will be on the stack, or registers for optimisation)
Static memory allocation
Is allocated at compile time*, and the lifetime of a variable in static memory is the lifetime of the program.
In C, static memory can be allocated using the static keyword. The scope is the compilation unit only.
Things get more interesting when the extern keyword is considered. When an extern variable is defined the compiler allocates memory for it. When an extern variable is declared, the compiler requires that the variable be defined elsewhere. Failure to declare/define extern variables will cause linking problems, while failure to declare/define static variables will cause compilation problems.
in file scope, the static keyword is optional (outside of a function):
int a = 32;
But not in function scope (inside of a function):
static int a = 32;
Technically, extern and static are two separate classes of variables in C.
extern int a; /* Declaration */
int a; /* Definition */
*Notes on static memory allocation
It's somewhat confusing to say that static memory is allocated at compile time, especially if we start considering that the compilation machine and the host machine might not be the same or might not even be on the same architecture.
It may be better to think that the allocation of static memory is handled by the compiler rather than allocated at compile time.
For example, the compiler may create a large data section in the compiled binary, and when the program is loaded in memory, the address within the data segment of the program will be used as the location of the allocated memory. This has the marked disadvantage of making the compiled binary very large if it uses a lot of static memory. It's possible to write a multi-gigabyte binary generated from less than half a dozen lines of code. Another option is for the compiler to inject initialisation code that will allocate memory in some other way before the program is executed. This code will vary according to the target platform and OS. In practice, modern compilers use heuristics to decide which of these options to use. You can try this out yourself by writing a small C program that allocates a large static array of either 10k, 1m, 10m, 100m, 1G or 10G items. For many compilers, the binary size will keep growing linearly with the size of the array, and past a certain point, it will shrink again as the compiler uses another allocation strategy.
Register Memory
The last storage class is 'register' variables. As expected, register variables should be allocated in a CPU register, but the decision is actually left to the compiler. You may not take the address of a register variable with the address-of operator.
register int meaning = 42;
printf("%p\n",&meaning); /* this is wrong and will fail at compile time. */
Most modern compilers are smarter than you at picking which variables should be put in registers :)
References:
The libc manual
K&R's The C programming language, Appendix A, Section 4.1, "Storage Class". (PDF)
C11 standard, section 5.1.2, 6.2.2.3
Wikipedia also has good pages on Static Memory allocation, Dynamic Memory Allocation and Automatic memory allocation
The C Dynamic Memory Allocation page on Wikipedia
This Memory Management Reference has more details on the underlying implementations for dynamic allocators.
There are three types of allocation — static, automatic, and dynamic.
Static Allocation means, that the memory for your variables is allocated when the program starts. The size is fixed when the program is created. It applies to global variables, file scope variables, and variables qualified with static defined inside functions.
Automatic memory allocation occurs for (non-static) variables defined inside functions, and is usually stored on the stack (though the C standard doesn't mandate that a stack is used). You do not have to reserve extra memory using them, but on the other hand, have also limited control over the lifetime of this memory. E.g: automatic variables in a function are only there until the function finishes.
void func() {
int i; /* `i` only exists during `func` */
}
Dynamic memory allocation is a bit different. You now control the exact size and the lifetime of these memory locations. If you don't free it, you'll run into memory leaks, which may cause your application to crash, since at some point of time, system cannot allocate more memory.
int* func() {
int* mem = malloc(1024);
return mem;
}
int* mem = func(); /* still accessible */
In the upper example, the allocated memory is still valid and accessible, even though the function terminated. When you are done with the memory, you have to free it:
free(mem);
Static memory allocation: The compiler allocates the required memory space for a declared variable. By using the address-of operator, the reserved address is obtained, and this address may be assigned to a pointer variable. Since most declared variables have static memory, this way of assigning a pointer value to a pointer variable is known as static memory allocation. Memory is assigned at compile time.
Dynamic memory allocation: It uses functions such as malloc() or calloc() to get memory dynamically. If these functions are used to get memory dynamically and the values they return are assigned to pointer variables, such assignments are known as dynamic memory allocation. Memory is assigned at run time.
Static Memory Allocation:
Variables get allocated permanently
Allocation is done before program execution
It uses the data structure called stack for implementing static allocation
Less efficient
There is no memory reusability
Dynamic Memory Allocation:
Variables get allocated only if the program unit gets active
Allocation is done during program execution
It uses the data structure called heap for implementing dynamic allocation
More efficient
There is memory reusability . Memory can be freed when not required
Difference between static memory allocation and dynamic memory allocation
Static memory allocation | Dynamic memory allocation
Memory is allocated before the execution of the program begins (during compilation). | Memory is allocated during the execution of the program.
No memory allocation or deallocation actions are performed during execution. | Memory bindings are established and destroyed during execution.
Variables remain permanently allocated. | Allocated only when the program unit is active.
Implemented using stacks and heaps. | Implemented using data segments.
Pointer is needed to access variables. | No need for dynamically allocated pointers.
Faster execution than dynamic. | Slower execution than static.
More memory space required. | Less memory space required.
Static memory allocation: memory is allocated before execution of the program, at compile time.
Dynamic memory allocation: memory is allocated during execution of the program, at run time.
Static memory allocation (size fixed at compile time). For a local array like this one, the memory will typically be on the stack.
int a[10];
Dynamic memory allocation. Memory allocated will be in heap.
int *a = malloc(sizeof(int) * 10);
and the latter should be freed since there is no Garbage Collector(GC) in C.
free(a);
