Variable-length arrays are created on heap, but we cannot free them? - c

I've verified that variable-length arrays are created on the heap (see the code below), but we cannot call free on them (doing so causes fault trap 6).
I was taught that the heap is managed by the user, and thus we have to explicitly free anything on the heap that we no longer need. So who is responsible for freeing this memory? Is this a defect of C?
Code that shows variable-length arrays are created on the heap. (Platform: Mac OS X, gcc, 64-bit)
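#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
    int n = atoi(argv[1]);
    int a1[10];
    int a2[n];
    int a3[10];
    /* %p expects a void *, so the arrays are cast explicitly */
    printf("address for a1:%p,address for a2:%p, address for a3:%p\n", (void *)a1, (void *)a2, (void *)a3);
    /* differences are in units of int, printed in hex; subtracting pointers into different arrays is not defined by the standard */
    printf("a1-a2: %lx, a1-a3: %lx\n", (unsigned long)(a1 - a2), (unsigned long)(a1 - a3));
    //free(a2); // will cause fault trap 6
    return 0;
}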
The result is:
$ ./run 10
address for a1:0x7fff5d095aa0,address for a2:0x7fff5d0959e0, address for a3:0x7fff5d095a70
a1-a2: 30, a1-a3: c
It's obvious that a1 and a3 are consecutive and thus on the stack, but a2 has a lower address and is thus on the heap.

Variable length arrays have automatic storage duration and have either block scope or function prototype scope.
So this array
int a2[n];
has automatic storage duration and block scope (the body of the function main). It was not created on the heap. The compiler generates the code that releases the array's storage when control leaves the block.
According to the C Standard (6.2.4 Storage durations of objects)
7 For such an object that does have a variable length array type, its lifetime extends from the declaration of the object until execution of the program leaves the scope of the declaration. If the scope is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate.
You may apply function free only to objects that were allocated using one of the memory allocation functions like malloc, calloc or realloc.

How an implementation allocates storage for VLAs is left to the implementation. VLAs have automatic storage duration, and you should not try to free() them.
You should treat a VLA just like any other local variable for all practical purposes.
You only free() memory that you allocated with the malloc() family of functions.
VLAs are not supported by all implementations; since C11 they are a conditional feature.
The macro
__STDC_NO_VLA__
is used to test whether the implementation supports VLAs (if it is defined as 1, VLAs are not supported).
In my opinion, VLAs should not be used, mainly because:
They are optional in C11
Allocation failure is not portably detectable

Is this a defect of C?
Absolutely not. You made several wrong assumptions, which lead you to a wrong conclusion.
First, let's get the terminology straight: what you call the "stack" is the automatic storage area, and what you call the "heap" is the dynamic storage area. The C standard makes no claims about any of the following:
Relative order of addresses in automatic and dynamic areas
Relative order of addresses of items within the same storage area
Presence or absence of gaps between allocations within the same area
This makes it impossible to determine if a variable is in an automatic or in a dynamic area simply by looking at numeric addresses, without making a guess. In particular, what appears "obvious" to you has nothing to do with what is actually happening.
So who will be responsible to free these memories?
You are responsible for calling free on everything that you allocated in the dynamic storage area. You do not allocate your variable-length array in the dynamic storage area*, hence you are not responsible for calling free on it.
* If a compiler implementation were to allocate a VLA in the dynamic storage area, the compiler would be responsible for calling free on that pointer.

Related

When/where are local arrays allocated?

https://www.gnu.org/software/libc/manual/html_node/Memory-Allocation-and-C.html describes automatic allocation of local variables. I understand that local variables are commonly allocated on the stack. I can imagine how an int might be allocated on the stack; just push its value. But how might an array be allocated?
For example, if you declare an array char str[10];, does that 10 bytes of space go on the stack, or is it allocated somewhere else, and only the str pointer is pushed to the stack? If the latter, where is the 10 bytes of space allocated?
Furthermore, when exactly are local variables, including arrays, allocated? I commonly see heap allocation referred to as "dynamic allocation", implying that automatic variables are not dynamically allocated. But automatic variables may be declared within flow-of-control constructs and function bodies, so the compiler can't possibly know before runtime exactly how much space will be occupied by automatic variables. So automatic variables must also be dynamically allocated, right?
Edit: I would like to emphasize the first half of this question. I am most interested in understanding when and where the space for local arrays is allocated. On the stack? Somewhere else?
Edit 2: I made a mistake when I originally included the C++ tag for this question. I meant to ask only about the C language and its implementations. I apologize for any confusion.
In the C 2018 standard, clause 6.2.4, paragraphs 6 and 7 tell us about the lifetimes of objects with automatic storage duration. Paragraph 6 covers such objects that are not variable length arrays:
… its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time.
Thus, if we have this code:
{
PointA;
int x = 3;
PointB;
}
then x exists in the C model as soon as execution reaches PointA—its block was entered, and that is when the lifetime of x begins. However, although x already exists at PointA, its value is indeterminate. The initialization only occurs when the definition is reached.
Paragraph 7 tells us about variable length arrays:
… its lifetime extends from the declaration of the object until execution of the program leaves the scope of the declaration.
So, if we have this code:
{
PointA;
int x[n]; // n is some variable.
PointB;
}
then x does not exist at PointA. Its lifetime begins when int x[n]; is reached.
Keep in mind this existence is only in terms of C’s abstract model of computing. Compilers are allowed to optimize code as long as the observable results (such as output of the program) are the same. So the actual code generated by a compiler might not create x when the block is entered. (It might not create x at all; it could be optimized away completely.)
For example, if you declare an array char str[10];, does that 10 bytes of space go on the stack, or is it allocated somewhere else, and only the str pointer is pushed to the stack? If the latter, where is the 10 bytes of space allocated?
In general, the array's storage is allocated on the stack, just like any other local variable, though this is compiler- and target-specific. Even on an x86_64 machine, a 4-billion-byte array is probably not allocated on the stack. I'd expect one of: a compile error, a link error, a runtime error, or it works somehow. In the last case, the implementation might call new[] or malloc() and leave a pointer to the array on the stack in place of the array.
Notice that the array's storage and the name str refer to the same thing: no separate str pointer is stored anywhere. Your wording "allocated somewhere else, and only the str pointer" might indicate confusion on this point. The allocation and the name for it are not independent data.
What you ask depends on the language implementation (the compiler). To answer your question, this is (a simplified overview of) what compilers usually do for compiled languages (like C/C++):
When the compiler finishes parsing a function, it keeps a symbol table of all local variables declared in that function, even those declared partway through the function's instruction flow (like local loop variables). Later, when it generates the final (assembly) code, it emits the instructions needed to reserve (by pushing, or just by moving the stack pointer) sufficient space for all local variables. So local loop variables, for instance, are not allocated when the loop starts executing. Rather, they are allocated at the start of the function containing the loop. The compiler also adds instructions to release this stack space before returning from the function.
So automatic variables, like your char array, are allocated entirely on the stack in this (common) scenario.
[EDIT] Variable length arrays (before C99)
The discussion above was for arrays having lengths known at compile time like this:
void f () {
char n[10];
....
}
If we stay in C language terms (before C99), arrays whose lengths are known only at runtime, not at compile time, are declared as a pointer like this:
void f() {
char *n;
... //array is later allocated using some kind of memory allocation construct
}
This, in fact, just declares a pointer to the array. The size of a pointer is known to the compiler. So, as I said above, the compiler can reserve the necessary storage for the pointer on the stack (just the pointer, not the actual array) regardless of the array's size at runtime. When execution reaches the line that allocates the array (using malloc, for instance), the array's storage is dynamically allocated on the heap, and its address is stored in the local automatic variable n. In languages without garbage collection, this storage must be freed manually (i.e. the programmer must add a call that releases it once the array is no longer needed). This is not necessary for constant-sized arrays (which are allocated on the stack), because the compiler removes the stack frame before returning from the function, as I said earlier.
[EDIT2]
The size of a C99 variable-length array is not known at compile time, so the compiler cannot reserve a fixed amount of stack space for it in advance. Instead, it must emit code that creates and destroys the array at runtime.

Does the C compiler allocate memory for a variable? [duplicate]

I was wondering at which stage memory gets allocated for a variable.
Is it in the compilation stage or is it at the execution time?
Yes, and yes, both.
For a global variable (declared at file scope), the compiler reserves memory in the executable image. So this is compile-time.
For an automatic variable (declared in a function), the compiler adds instructions to allocate the variable on the stack. So this is run-time.
int a; // file scope
int f(void)
{
int b; // function scope
...
Notes:
The compiler emits one set of instructions that allocates all of a function's local variables at once. Generally, there is no per-variable overhead (there can be exceptions that I won't discuss here). These instructions are executed every time the function is called.
The compiler does not allocate storage for your strings. This is an error beginners often make. Consider:
char *s; // a pointer to a string, not yet pointing at any storage
scanf("%s", s); // wrong: the compiler does not allocate storage for the string being read
It depends on the kind of variable/object.
Globally and statically allocated variables are known at compile time and their offsets in the data segment are baked into the program. So in a way, they got allocated at compile time.
Variables local to function scope are allocated on the stack. You could say that the compiler knew about them and the kind of storage they needed, but obviously they got allocated (in the sense of coming into existence) at run time, during a function call.
Another interesting object is the heap allocated object, which can be created with malloc/calloc in C and new or related mechanisms in C++. They are allocated at run time in the heap section.
There's a third kind of memory, which is allocated dynamically by using malloc() and friends.
This memory is taken from the so-called heap, while automatic variables (the ones you have in functions) are taken from the so-called stack.
Then, if you're having a variable with an initializer (e.g. int i = 5;) that never changes value, a compiler may figure that out and not allocate memory at all. Instead it would just use 5 wherever you use that variable in your code.

How does memory allocation in malloc differ from that of an array?

I have written code like this:
int * ptr;
ptr = (int*) malloc(5 * sizeof(int));
Please tell me how memory is allocated to "ptr" using this malloc function?
How is it different from memory allocation in an integer array?
1. Answer in terms of the language
It differs in the storage duration. If you use an integer array, it could be at the file scope (outside any function):
int arr[5];
This has static storage duration, which means the object is alive for the whole execution time of the program.
The other possibility is inside a function:
void foo(void)
{
int arr[5];
}
This variant has automatic storage duration, so the object is only alive as long as the program execution is inside this function (more generally: inside the scope of the variable which is the enclosing pair of braces: { ... })
If you use malloc() instead, the object has dynamic storage duration. This means you decide for how long the object is alive. It stays alive until you call free() on it.
2. Answer in terms of the OS
malloc() is typically implemented on the heap. This is an area of your address space where your program can dynamically request more memory from the OS.
In contrast, with an object with automatic storage duration, typical implementations place it on the stack (where a new frame is created for each function call) and with static storage duration, it will be in a data segment of your program already from the start.
This part of the answer is intentionally a bit vague; C implementations exist for a vast variety of systems, and while stack, heap and data segments are used in many modern operating systems with virtual memory management, they are by no means required -- take for example embedded platforms and microcontrollers where your program might run without any operating system. So, writing portable C code, you should (most of the time) only be interested in what the language specifies.
The C standard is very clear on this.
The memory pointed to by ptr has dynamic storage duration.
The memory associated with, say, an int foo[5]; declared inside a function has automatic storage duration.
The key difference between the two is that in the dynamic case you need to call free on ptr to release the memory once you're done with it, whereas foo will be released automatically once it goes out of scope.
The mechanism for acquiring the memory in either case is intentionally left to the compiler.
For further research, Google the terms that I've italicised.

what is the main reason of using malloc() in C

I'm currently using C/C++. But what is the main reason for using malloc/new instead of just declaring a variable on the stack? Like: int a;
Or another example:
int *a; // then use this to track an array
a = (int *)malloc(sizeof(int));
Automatic variables have a lifetime that ends when the program leaves the block of code that declares them. Sometimes you want them to live longer than that; dynamic allocation gives you full control over their lifetime. Sometimes they are too big for the stack; dynamic memory is (typically) less restricted.
With that flexibility comes responsibility: you need to delete them when you've finished with them, but not before. This is difficult to get right if you try to hold onto a raw pointer and do it yourself; so (in C++) learn about RAII, and use ready-made management types like smart pointers and containers to do the work for you.
TL;DR: stack and heap are two different memory areas, they serve different purposes, they have their own access pattern (with everything that implies) and policies (yes, stack memory is way more controlled than heap memory on hw-enforced systems).
Stack memory is a precious, limited resource that needs to be allocated contiguously (you can't chunk it as you would for a heap allocation).
As a consequence of this, its default dimension on many x86 platforms is usually around 1 MB (although this can be increased). Definitely small with regard to how much memory you might allocate with a heap allocation. Your question isn't an exact duplicate of this question but I believe you should take a look at it if you're interested in other reasons for the stack being a limited resource.
Other reasons include visibility scopes (stack stuff is collected/destroyed at the end of the scope while heap allocated memory can be used in other parts of your application until you free it).
The C standard function malloc() and its friends (free(), calloc() and realloc()) allow you to dynamically allocate arbitrarily large chunks of memory (within what the OS allows) at runtime. This is informally called dynamic storage duration (the formal name in N1570 is allocated storage duration) and has the following characteristics:
lifetime is controlled by malloc() and free() calls. This is unlike variables with automatic storage duration, whose lifetime is restricted by scope (e.g. a block, more precisely { and }, with the exception of VLAs), and unlike variables with static storage duration, whose lifetime is the whole execution of the program
scope is determined by the availability of a pointer to the allocated data
In other words you use malloc() when you need maximum flexibility of data allocation in runtime. Automatic variables are practically limited by stack size and static variables require compile-time reservation for exact (i.e. static) amount of memory.

Difference between static memory allocation and dynamic memory allocation

I would like to know what is the difference between static memory allocation and dynamic memory allocation?
Could you explain this with any example?
This is a standard interview question:
Dynamic memory allocation
It is memory allocated at runtime using calloc(), malloc() and friends. It is sometimes also referred to as 'heap' memory, although it has nothing to do with the heap data structure.
int * a = malloc(sizeof(int));
Heap memory is persistent until free() is called. In other words, you control the lifetime of the variable.
Automatic memory allocation
This is what is commonly known as 'stack' memory, and is allocated when you enter a new scope (usually when a new function is pushed on the call stack). Once you move out of the scope, the values of automatic memory addresses are undefined, and it is an error to access them.
int a = 43;
Note that scope does not necessarily mean function. Scopes can nest within a function, and the variable will be in-scope only within the block in which it was declared. Note also that where this memory is allocated is not specified. (On a sane system it will be on the stack, or registers for optimisation)
Static memory allocation
Is allocated at compile time*, and the lifetime of a variable in static memory is the lifetime of the program.
In C, static memory can be allocated using the static keyword. The scope is the compilation unit only.
Things get more interesting when the extern keyword is considered. When an extern variable is defined the compiler allocates memory for it. When an extern variable is declared, the compiler requires that the variable be defined elsewhere. Failure to declare/define extern variables will cause linking problems, while failure to declare/define static variables will cause compilation problems.
In file scope (outside of a function), the static keyword is optional:
int a = 32;
But not in function scope (inside of a function):
static int a = 32;
Technically, extern and static are two separate classes of variables in C.
extern int a; /* Declaration */
int a; /* Definition */
*Notes on static memory allocation
It's somewhat confusing to say that static memory is allocated at compile time, especially if we start considering that the compilation machine and the host machine might not be the same or might not even be on the same architecture.
It may be better to think that the allocation of static memory is handled by the compiler rather than allocated at compile time.
For example, the compiler may create a large data section in the compiled binary, and when the program is loaded into memory, the address within the program's data segment is used as the location of the allocated memory. This has the marked disadvantage of making the compiled binary very large if it uses a lot of static memory. It's possible to write a multi-gigabyte binary generated from less than half a dozen lines of code. Another option is for the compiler to inject initialisation code that allocates the memory in some other way before the program is executed. This code will vary according to the target platform and OS. In practice, modern compilers use heuristics to decide which of these options to use. You can try this out yourself by writing a small C program that allocates a large static array of 10k, 1m, 10m, 100m, 1G or 10G items. For many compilers, the binary size will keep growing linearly with the size of the array, and past a certain point it will shrink again as the compiler switches to another allocation strategy.
Register Memory
The last storage class is 'register' variables. As the name suggests, register variables are meant to be kept in a CPU register, but the decision is actually left to the compiler. You may not take the address of a register variable with the address-of operator.
register int meaning = 42;
printf("%p\n",&meaning); /* this is wrong and will fail at compile time. */
Most modern compilers are smarter than you at picking which variables should be put in registers :)
References:
The libc manual
K&R's The C programming language, Appendix A, Section 4.1, "Storage Class". (PDF)
C11 standard, section 5.1.2, 6.2.2.3
Wikipedia also has good pages on Static Memory allocation, Dynamic Memory Allocation and Automatic memory allocation
The C Dynamic Memory Allocation page on Wikipedia
This Memory Management Reference has more details on the underlying implementations for dynamic allocators.
There are three types of allocation — static, automatic, and dynamic.
Static Allocation means, that the memory for your variables is allocated when the program starts. The size is fixed when the program is created. It applies to global variables, file scope variables, and variables qualified with static defined inside functions.
Automatic memory allocation occurs for (non-static) variables defined inside functions, and is usually stored on the stack (though the C standard doesn't mandate that a stack is used). You do not have to reserve extra memory using them, but on the other hand, have also limited control over the lifetime of this memory. E.g: automatic variables in a function are only there until the function finishes.
void func() {
int i; /* `i` only exists during `func` */
}
Dynamic memory allocation is a bit different. You now control the exact size and the lifetime of these memory locations. If you don't free the memory, you'll run into memory leaks, which may cause your application to crash, since at some point the system cannot allocate more memory.
int* func() {
int* mem = malloc(1024);
return mem;
}
int* mem = func(); /* still accessible */
In the example above, the allocated memory is still valid and accessible, even though the function has returned. When you are done with the memory, you have to free it:
free(mem);
Static memory allocation: The compiler allocates the required memory space for a declared variable. By using the address-of operator, the reserved address is obtained, and this address may be assigned to a pointer variable. Since most declared variables have static memory, this way of assigning a pointer value to a pointer variable is known as static memory allocation. Memory is assigned at compile time.
Dynamic memory allocation: It uses functions such as malloc() or calloc() to get memory dynamically. If these functions are used to get memory dynamically and the values they return are assigned to pointer variables, such assignments are known as dynamic memory allocation. Memory is assigned at run time.
Static Memory Allocation:
Variables get allocated permanently
Allocation is done before program execution
It uses the data structure called stack for implementing static allocation
Less efficient
There is no memory reusability
Dynamic Memory Allocation:
Variables get allocated only if the program unit gets active
Allocation is done during program execution
It uses the data structure called heap for implementing dynamic allocation
More efficient
There is memory reusability. Memory can be freed when not required
Difference between static memory allocation and dynamic memory allocation
Static memory allocation | Dynamic memory allocation
Memory is allocated before the execution of the program begins (during compilation). | Memory is allocated during the execution of the program.
No memory allocation or deallocation actions are performed during execution. | Memory bindings are established and destroyed during execution.
Variables remain permanently allocated. | Allocated only when the program unit is active.
Implemented using stacks and heaps. | Implemented using data segments.
A pointer is needed to access variables. | No need for dynamically allocated pointers.
Faster execution than dynamic. | Slower execution than static.
More memory space required. | Less memory space required.
Static memory allocation allocates memory before the execution of the program, at compile time.
Dynamic memory allocation allocates memory during the execution of the program, at run time.
Static memory allocation. Memory allocated will be in stack.
int a[10];
Dynamic memory allocation. Memory allocated will be in heap.
int *a = malloc(sizeof(int) * 10);
and the latter should be freed since there is no Garbage Collector(GC) in C.
free(a);

Resources