Runtime Memory allocation on stack - c

I want to know about runtime memory allocation on stack area and how its different from runtime memory allocation on Heap area.
I know how memory get allocated by using library function.
#include <alloca.h>
void *alloca(size_t size);   /* run-time memory on the stack */
#include <stdlib.h>
void *malloc(size_t size);   /* run-time memory on the heap */
I also know that if we are using the alloca function we don't need to free that memory explicitly; because it is associated with the stack, it gets freed automatically.
I want to know which system calls are associated with alloca and malloc, and how they work in each case.

In short, they usually don't use system calls, unless they run out of available memory.
The behavior is different for each, so I'll explain them separately.
malloc
Let's say initially your program has 1MB (for example) of memory available for allocation. malloc is a (standard) library function that takes this 1MB, looks at how much memory you want to allocate, cuts a part of the 1MB out and gives it to you. For book-keeping, it keeps a linked list of unallocated blocks. The free function then adds the block being freed back to the free list, effectively freeing the memory (even though the OS still doesn't get any of it back, unless free decides that you have way too much memory and actually gives some of it back to the OS).
Only when you run out of your 1MB does malloc actually ask the operating system for more memory. The system call itself is platform dependent. You can take a look at this answer for example.
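To make that concrete, here is a deliberately simplified sketch in the same spirit (fixed-size blocks carved from a pre-obtained pool; toy_malloc and toy_free are made-up names, not the real allocator). A real malloc adds size headers, block splitting, coalescing and alignment, and only asks the OS for more memory once its pool is exhausted.

#include <stddef.h>
#include <stdio.h>

/* Deliberately simplified: every allocation is the same size, so the free
 * list needs no size bookkeeping, no splitting and no coalescing. */
#define BLOCK_SIZE  64
#define POOL_BLOCKS 16384                 /* pretend the OS handed us ~1 MB up front */

static union block {
    union block   *next;                  /* valid while the block sits on the free list */
    unsigned char  payload[BLOCK_SIZE];   /* valid while the block is handed out */
} pool[POOL_BLOCKS];

static union block *free_list;            /* blocks returned by toy_free */
static size_t       pool_used;            /* blocks carved out of the pool so far */

static void *toy_malloc(void)
{
    if (free_list) {                      /* reuse a freed block first ... */
        union block *b = free_list;
        free_list = b->next;
        return b->payload;
    }
    if (pool_used < POOL_BLOCKS)          /* ... otherwise carve a fresh one from the pool */
        return pool[pool_used++].payload;
    return NULL;                          /* a real malloc would ask the OS for more here */
}

static void toy_free(void *p)
{
    union block *b = p;                   /* payload sits at offset 0 of the union */
    b->next = free_list;                  /* push the block back onto the free list */
    free_list = b;
}

int main(void)
{
    void *a = toy_malloc();
    toy_free(a);
    void *b = toy_malloc();               /* gets the same block back, no "OS" involved */
    printf("same block reused: %s\n", a == b ? "yes" : "no");
}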
alloca
This is not a standard function, and it could be implemented in various ways, none of which probably ever call any system functions (unless they are nice enough to increase your stack size, but you never know).
What alloca does (or equivalently the (C99) standard variable length arrays (VLA) do) is to increase the stack frame of the current function by adjusting proper registers (for example esp in x86). Any variable that happens to be on the same stack frame but located after the variable length array (or allocaed memory) would then be addressed by ebp + size_of_vla + constant instead of the good old simple ebp + constant.
Since the stack pointer is recovered to the frame of the previous function upon function return (or generally on exit of any {} block), any stack memory allocated would be automatically released.
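As a small illustration (assuming a platform that provides the non-standard <alloca.h>, such as glibc), both of the buffers below live in the current function's stack frame and vanish when it returns, with no free() call:

#include <alloca.h>   /* non-standard, but provided by glibc and most Unix toolchains */
#include <stdio.h>
#include <string.h>

static void greet(const char *name)
{
    size_t n = strlen(name) + sizeof("hello, ");   /* room for "hello, " + name + NUL */

    char vla_buf[n];                  /* C99 variable length array: stack pointer moves down by n */
    char *alloca_buf = alloca(n);     /* same idea via the alloca extension */

    snprintf(vla_buf, n, "hello, %s", name);
    snprintf(alloca_buf, n, "hello, %s", name);
    printf("%s / %s\n", vla_buf, alloca_buf);
}                                     /* stack pointer restored here: both buffers are released */

int main(void)
{
    greet("world");
}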

The alloca() function is typically implemented by the compiler vendor, and doesn't have to be a "system call" at all.
Since all it needs to do is allocate space on the local stack frame, it can be implemented very simply and thus be incredibly fast when compared to malloc().
The Linux manual page for it says:
The inlined code often consists of a single instruction adjusting the stack pointer, and does not check for stack overflow.
Also, I'm not sure you realize that the memory gets deallocated "automatically" when the function that called alloca() exits. This is very important, you can't use alloca() to do long-lived allocations.

The alloca function is, according to its man page, a function that is inlined and specially treated by the compiler, which expands it in place (at least gcc does).
Its behavior is implementation-defined, and as such it should not be used, because you cannot guarantee that it will always work the same way.

Related

Where does malloc() allocate memory? Is it the data section or the heap section of the virtual address space of the process?

Ever since I was introduced to C, I was told that in C dynamic memory allocation is done using the functions in the malloc family. I also learned that memory dynamically allocated using malloc is allocated on the heap section of the process.
Various OS textbooks say that malloc involves a system call (not always, but at times) to allocate structures on the heap for the process. Now, supposing that malloc returns a pointer to a chunk of bytes allocated on the heap, why should it need a system call? The activation records of a function are placed in the stack section of the process, and since the "stack section" is already a part of the virtual address space of the process, pushing and popping activation records and manipulating stack pointers just starts from the highest possible address of the virtual address space; it does not even require a system call.
Now, on the same grounds, since the "heap section" is also a part of the virtual address space of the process, why should a system call be necessary for allocating a chunk of bytes in this section? A routine like malloc could handle the "free" list and "allocated" list on its own. All it needs to know is the end of the "data section". Certain texts say that system calls are necessary to "attach memory to the process for dynamic memory allocation", but if malloc allocates memory on the "heap section", why is it required to attach memory to the process during malloc at all? It could simply be taken from a portion already part of the process.
While going through the text "The C Programming Language" [2e] by Kernighan and Ritchie, I came across their implementation of the malloc function [section 8.7, pages 185-189]. The authors say:
malloc calls upon the operating system to obtain more memory as necessary.
Which is what the OS texts say, but counter intuitive to my thought above (if malloc allocates space on heap).
Since asking the system for memory is a comparatively expensive operation, the authors do not do that on every call to malloc, so they create a function morecore which requests at least NALLOC units; this larger block is chopped up as needed. And the basic free list management is done by free.
But the thing is that the authors use sbrk() to ask the operating system for memory in morecore. Now Wikipedia says:
brk and sbrk are basic memory management system calls used in Unix and Unix-like operating systems to control the amount of memory allocated to the data segment of the process.
Where
a data segment (often denoted .data) is a portion of an object file or the corresponding address space of a program that contains initialized static variables, that is, global variables and static local variables.
Which I guess is not the "heap section". [Data section is the second section from bottom in the picture above, while heap is the third section from bottom.]
I am totally confused. I want to know what really happens and how both of these concepts can be correct. Please help me understand by joining the scattered pieces together...
In your diagram, the section labeled "data" is more precisely called "static data"; the compiler pre-allocates this memory for all the global variables when the process starts.
The heap that malloc() uses is the rest of the process's data segment. This initially has very little memory assigned to it in the process. If malloc() needs more memory, it can use sbrk() to extend the size of the data segment, or it can use mmap() to create additional memory segments elsewhere in the address space.
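If you want to see this for yourself, here is a small experiment (assuming Linux/glibc, where the deprecated-but-available sbrk(0) reports the current program break; the exact behaviour depends on the allocator and its thresholds):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    printf("break at start:         %p\n", sbrk(0));
    void *first = malloc(32);            /* first call: the allocator grabs an initial chunk via brk */
    printf("after first malloc:     %p\n", sbrk(0));
    void *second = malloc(32);           /* served from that chunk: the break usually does not move again */
    printf("after second malloc:    %p\n", sbrk(0));
    void *big = malloc(4 * 1024 * 1024); /* large request: typically satisfied with mmap instead */
    printf("after a 4 MiB malloc:   %p\n", sbrk(0));
    free(first);
    free(second);
    free(big);
}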
Why does malloc() need to do this? Why not simply make the entire address space available for it to use? There are historical and practical reasons for this.
The historical reason is that early computers didn't have virtual memory. All the memory assigned to a process was swapped in bulk to disk when switching between processes. So it was important to only assign memory pages that were actually needed.
The practical reason is that this is useful for detecting various kinds of errors. If you've ever gotten a segmentation violation error because you dereferenced an uninitialized pointer, you've benefited from this. Much of the process's virtual address space is not allocated to the process, which makes it likely that uninitialized pointers point to unavailable memory, and you get an error when trying to use them.
There's also an unallocated gap between the heap (growing upwards) and the stack (growing downward). This is used to detect stack overflow -- when the stack tries to use memory in that gap, it gets a fault that's translated to the stack overflow signal.
This is the Standard C Library specification for malloc(), in its entirety:
7.22.3.4 The malloc function
Synopsis
#include <stdlib.h>
void *malloc(size_t size);
Description
The malloc function allocates space for an object whose size is specified by size and whose value is indeterminate.
Returns
The malloc function returns either a null pointer or a pointer to the
allocated space.
That's it. There's no mention of the Heap, the Stack or any other memory location, which means that the underlying mechanisms for obtaining the requested memory are implementation details.
In other words, you don't care where the memory comes from, from a C perspective. A conforming implementation is free to implement malloc() in any way it sees fit, so long as it conforms to the above specification.
I was told that in C dynamic memory allocation is done using the functions in the malloc family. I also learned that memory dynamically allocated using malloc is allocated on the heap section of the process.
Correct on both points.
Now supposing that malloc returns pointer to chunk of bytes allocated on the heap, why should it need a system call.
It needs to request an adjustment to the size of the heap, to make it bigger.
...the "stack section" is already a part of the virtual address space of the process, pushing and popping of activation records, manipulation of stack pointers, [...] does not even require a system call.
The stack segment is grown implicitly, yes, but that's a special feature of the stack segment. There's typically no such implicit growing of the data segment. (Note, too, that the implicit growing of the stack segment isn't perfect, as witness the number of people who post questions to SO asking why their programs crash when they allocate huge arrays as local variables.)
Now on the same grounds since the "heap section" is also a part of the virtual address space of the process, why should a system call be necessary for allocating a chunk of bytes in this section.
Answer 1: because it's always been that way.
Answer 2: because you want accidental stray pointer references to crash, not to implicitly allocate memory.
malloc calls upon the operating system to obtain more memory as necessary.
Which is what the OS texts say, but counter intuitive to my thought above (if malloc allocates space on heap).
Again, malloc does request space on the heap, but it must use an explicit system call to do so.
But the thing is that the authors use sbrk() to ask the operating system for memory in morecore. Now Wikipedia says:
brk and sbrk are basic memory management system calls used in Unix and Unix-like operating systems to control the amount of memory allocated to the data segment of the process.
Different people use different nomenclatures for the different segments. There's not much of a distinction between the "data" and "heap" segments. You can think of the heap as a separate segment, or you can think of those system calls -- the ones that "allocate space on the heap" -- as simply making the data segment bigger. That's the nomenclature the Wikipedia article is using.
Some updates:
I said that "There's not much of a distinction between the 'data' and 'heap' segments." I suggested that you could think of them as subparts of a single, more generic data segment. And actually there are three subparts: initialized data, uninitialized data or "bss", and the heap. Initialized data has initial values that are explicitly copied out of the program file. Uninitialized data starts out as all bits zero, and so does not need to be stored in the program file; all the program file says is how many bytes of uninitialized data it needs. And then there's the heap, which can be thought of as a dynamic extension of the data segment, which starts out with a size of 0 but may be dynamically adjusted at runtime via calls to brk and sbrk.
I said, "you want accidental stray pointer references to crash, not to implicitly allocate memory", and you asked about this. This was in response to your supposition that explicit calls to brk or sbrk ought not to be required to adjust the size of the heap, and your suggestion that the heap could grow automatically, implicitly, just like the stack does. But how would that work, really?
The way automatic stack allocation works is that as the stack pointer grows (typically "downward"), it eventually reaches a point that it points to unallocated memory -- that blue section in the middle of the picture you posted. At that point, your program literally gets the equivalent of a "segmentation violation". But the operating system notices that the violation involves an address just below the existing stack, so instead of killing your program on an actual segmentation violation, it quick-quick makes the stack segment a little bigger, and lets your program proceed as if nothing had happened.
So I think your question was, why not have the upward-growing heap segment work the same way? And I suppose an operating system could be written that worked that way, but most people would say it was a bad idea.
I said that in the stack-growing case, the operating system notices that the violation involves an address "just below" the existing stack, and decides to grow the stack at that point. There's a definition of "just below", and I'm not sure what it is, but these days I think it's typically a few tens or hundreds of kilobytes. You can find out by writing a program that allocates a local variable
char big_stack_array[100000];
and seeing if your program crashes.
Now, sometimes a stray pointer reference -- that would otherwise cause a segmentation violation style crash -- is just the result of the stack normally growing. But sometimes it's a result of a program doing something stupid, like the common error of writing
char *retbuf;
printf("type something:\n");
fgets(retbuf, 100, stdin);
And the conventional wisdom is that you do not want to (that is, the operating system does not want to) coddle a broken program like this by automatically allocating memory for it (at whatever random spot in the address space the uninitialized retbuf pointer seems to point) to make it seem to work.
If the heap were set up to grow automatically, the OS would presumably define an analogous threshold of "close enough" to the existing heap segment. Apparently stray pointer references within that region would cause the heap to automatically grow, while references beyond that (farther into the blue region) would crash as before. That threshold would probably have to be bigger than the threshold governing automatic stack growth. malloc would have to be written to make sure not to try to grow the heap by more than that amount. And true, stray pointer references -- that is, program bugs -- that happened to reference unallocated memory in that zone would not be caught. (Which is, it's true, what can happen for buggy, stray pointer references just off the end of the stack today.)
But, really, it's not hard for malloc to keep track of things, and explicitly call sbrk when it needs to. The cost of requiring explicit allocation is small, and the cost of allowing automatic allocation -- that is, the cost of the stray pointer bugs not caught -- would be larger. This is a different set of tradeoffs than for the stack growth case, where an explicit test to see if the stack needed growing -- a test which would have to occur on every function call -- would be significantly expensive.
Finally, one more complication. The picture of the virtual memory layout that you posted -- with its nice little stack, heap, data, and text segments -- is a simple and perhaps outdated one. These days I believe things can be a lot more complicated. As @chux wrote in a comment, "your malloc() understanding is only one of many ways allocation is handled. A clear understanding of one model may hinder (or help) understanding of the many possibilities." Among those complicating possibilities are:
A program may have multiple stack segments maintaining multiple stacks, if it supports coroutines or multithreading.
The mmap and shm_open system calls may cause additional memory segments to be allocated, scattered anywhere within that blue region between the heap and the stack.
For large allocations, malloc may use mmap rather than sbrk to get memory from the OS, since it turns out this can be advantageous.
See also Why does malloc() call mmap() and brk() interchangeably?
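To make the mmap route concrete, a large block can be obtained directly from the OS like this (Linux assumed; this is roughly what glibc's malloc does above its mmap threshold, though the real thing is more involved):

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    size_t n = 4 * 1024 * 1024;                       /* a "large" request */
    void *p = mmap(NULL, n, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    /* ... p behaves like any other writable memory ... */
    munmap(p, n);
    return 0;
}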
As the bard said, "There are more things in heaven and earth, Horatio, than are dreamt of in your philosophy." :-)
Not all virtual addresses are available at the beginning of a process.
The OS does maintain a virtual-to-physical map, but (at any given time) only some of the virtual addresses are in the map. Reading or writing a virtual address that isn't in the map causes an instruction-level exception. sbrk puts more addresses in the map.
The stack is just like the data section but has a fixed size, and there is no sbrk-like system call to extend it. We could say there is no heap section, only a fixed-size stack section and a data section which can be grown upward by sbrk.
The heap section you describe is actually the part of the data section managed by malloc and free. Clearly the code for heap management is not in the OS kernel but in the C library, executing in CPU user mode.

If I want a global VLA, could I use alloca() in the main function?

I have a main function for my app, and I allocate, for example, paths to configuration files, etc. Currently I use malloc for them, but they are never freed and are always available for use throughout the lifetime of the app. I never even free them, because the OS automatically reclaims allocated memory when the application terminates. At this point, is there any reason not to use alloca instead of malloc? The program ends when main returns, and alloca memory is only released once the function it was allocated in returns, so memory allocated in main with alloca is only deallocated once the program ends, which is what I want. Are these statements correct, and is there any reason not to use alloca (alloca is bad practice, so by alloca I mean alloca or a VLA in main) in main for a 'global VLA'-like object that lasts until the program terminates?
You can use alloca/VLA in main, but why?
The typical reason to use them is if you have some performance sensitive part that is called a lot, and you don't want the overhead of malloc/free. For main, your data is allocated once at the beginning of the program, so the overhead of a few malloc calls is negligible.
Another reason to not use alloca/VLA's in main is that they consume stack space, which is a very limited resource compared to heap space.
Depends on how much memory you need. If it is small enough (say a few hundred bytes or so), you can safely do alloca in main() or use VLAs.
But then, if the sizes of these arrays have a known upper-limit which is not very large, it would be even better and safer to declare them globally with that upper-limit as the size. That way you don't consume stack space and you don't have to malloc and then ensure the allocation succeeded. It is also then clear to whoever is reading that this piece of memory lives as long as the program does.
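For example (MAX_PATH_LEN and config_path are just illustrative names):

#define MAX_PATH_LEN 4096                 /* known upper bound */

static char config_path[MAX_PATH_LEN];    /* static storage: lives for the whole program,
                                             uses no stack space, nothing to malloc or check */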
If the sizes can be arbitrarily large then the best thing to do is to continue using malloc() like you are already. Btw even if you are calling malloc() in main() and use it for the lifetime of the program, it is still considered good practice to free it before exit.
Technically no, because any variable declared in a function will not be global. But you can do something like this:
char *buffer;
int main(void) {
    int size = 100;   /* or any size computed at run time */
    char buf[size];   /* VLA living in main's stack frame */
    buffer = buf;     /* the global pointer now refers to it */
    /* ... */
}
That would give you an interface to access the buffer globally.
At this point, is there any reason not to use alloca instead of malloc
This is one question that typically should be asked the other way around. Is there any reason to use alloca instead of malloc? Consider changing if you have performance issues, but if you just want to avoid using free, I'd say that's a bad reason.
But I don't really see the point here. If you have an allocated buffer that you want to live from when the program starts to when it ends, then just free it in the end of the main function.
#include <stdlib.h>

int main(void) {
    size_t size = 100;          /* whatever size you need */
    char *buf = malloc(size);
    // Do work
    free(buf);
}
I wrote a long answer about alloca and VLAs that you might find useful: Do I really need malloc?
VLA (as defined by the standard) and non-standard alloca are both meant to be used for allocating temporary, small arrays at local scope. Nothing else.
Allocating large objects on the stack is a well-known source for subtle & severe stack overflow bugs. This is the reason you should avoid large VLA and alloca objects. Whenever you need large objects at file scope, they should either be static arrays or dynamically allocated with malloc.
It should be noted that stack allocation is usually faster than heap allocation, because stack allocation doesn't need to concern itself with look-ups, fragmentation and other heap implementation-specific concerns. Stack allocation just says "these 100 bytes are mine" and then you are ready to go.
Regarding general confusion about "stack vs heap" please see What gets allocated on the stack and the heap?
You can't even place a standard VLA at file scope, because the array size needs to be an integer constant expression there. Plus the standard (C17 6.7.6) explicitly says that you aren't allowed to:
If an identifier is declared to be an object with static or thread storage
duration, it shall not have a variable length array type.
As for alloca it isn't standard C and bad for that reason. But it's also bad because it doesn't have any type safety, so VLA is preferred over alloca - it is safer and more portable.
It should be noted that the main purpose of VLA in modern programming is however to enable pointers to VLA, rather than allocating array objects of VLA type, which is a feature of limited use.
I never even free them because the OS already automatically reclaims allocated memory when an application terminates.
While that is correct, it is still considered good practice to call free() manually. Because if you have any heap corruption or pointer-related bugs somewhere in the program, you'll get a crash upon calling free(). Which is a good thing, since it allows you to catch such (common) bugs early on during development.
(If you are concerned about the performance of free(), you can exclude the free() calls from the release build and only use them in debug build. Though performance is rarely an issue when closing down the program - usually you can just shut down the GUI if any then let the program chew away on clean-up code in the background.)
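One way to arrange that (a sketch, not a standard facility) is to route clean-up through a macro that compiles to nothing in release builds:

#include <stdlib.h>

#ifdef NDEBUG
#  define debug_free(p) ((void)(p))   /* release build: skip the clean-up work */
#else
#  define debug_free(p) free(p)       /* debug build: crash early on heap corruption */
#endif

/* usage: char *p = malloc(100); ... debug_free(p); */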

What is the point of __builtin_alloca [duplicate]

alloca() allocates memory on the stack rather than on the heap, as in the case of malloc(). So, when I return from the routine the memory is freed. So, actually this solves my problem of freeing up dynamically allocated memory. Freeing of memory allocated through malloc() is a major headache and if somehow missed leads to all sorts of memory problems.
Why is the use of alloca() discouraged in spite of the above features?
The answer is right there in the man page (at least on Linux):
RETURN VALUE
The alloca() function returns a pointer to the beginning of the allocated space. If the allocation causes stack overflow, program behaviour is undefined.
Which isn't to say it should never be used. One of the OSS projects I work on uses it extensively, and as long as you're not abusing it (alloca'ing huge values), it's fine. Once you go past the "few hundred bytes" mark, it's time to use malloc and friends, instead. You may still get allocation failures, but at least you'll have some indication of the failure instead of just blowing out the stack.
One of the most memorable bugs I had was to do with an inline function that used alloca. It manifested itself as a stack overflow (because it allocates on the stack) at random points of the program's execution.
In the header file:
void DoSomething() {
    wchar_t* pStr = alloca(100);
    //......
}
In the implementation file:
void Process() {
    for (int i = 0; i < 1000000; i++) {
        DoSomething();
    }
}
So what happened was the compiler inlined DoSomething function and all the stack allocations were happening inside Process() function and thus blowing the stack up. In my defence (and I wasn't the one who found the issue; I had to go and cry to one of the senior developers when I couldn't fix it), it wasn't straight alloca, it was one of ATL string conversion macros.
So the lesson is - do not use alloca in functions that you think might be inlined.
Old question but nobody mentioned that it should be replaced by variable length arrays.
char arr[size];
instead of
char *arr=alloca(size);
It's in standard C99 and has existed as a compiler extension in many compilers.
alloca() is very useful if you can't use a standard local variable because its size would need to be determined at runtime and you can
absolutely guarantee that the pointer you get from alloca() will NEVER be used after this function returns.
You can be fairly safe if you
do not return the pointer, or anything that contains it.
do not store the pointer in any structure allocated on the heap
do not let any other thread use the pointer
The real danger comes from the chance that someone else will violate these conditions sometime later. With that in mind it's great for passing buffers to functions that format text into them :)
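For instance, something in that spirit (a hypothetical helper, purely for illustration): a scratch line whose size is only known at run time, used and discarded within the call:

#include <alloca.h>   /* non-standard */
#include <stdio.h>
#include <string.h>

static void log_pair(const char *key, const char *value)
{
    size_t n = strlen(key) + strlen(value) + sizeof(": ");   /* key + ": " + value + NUL */
    char *line = alloca(n);           /* scratch buffer in this call's stack frame */
    snprintf(line, n, "%s: %s", key, value);
    fputs(line, stderr);
    fputc('\n', stderr);
}                                      /* the buffer is gone as soon as we return */

int main(void)
{
    log_pair("config", "/etc/example.conf");
}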
As noted in this newsgroup posting, there are a few reasons why using alloca can be considered difficult and dangerous:
Not all compilers support alloca.
Some compilers interpret the intended behaviour of alloca differently, so portability is not guaranteed even between compilers that support it.
Some implementations are buggy.
One issue is that it isn't standard, although it's widely supported. Other things being equal, I'd always use a standard function rather than a common compiler extension.
still alloca use is discouraged, why?
I don't perceive such a consensus. Lots of strong pros; a few cons:
C99 provides variable length arrays, which would often be used preferentially as the notation's more consistent with fixed-length arrays and intuitive overall
many systems have less overall memory/address-space available for the stack than they do for the heap, which makes the program slightly more susceptible to memory exhaustion (through stack overflow): this may be seen as a good or a bad thing - one of the reasons the stack doesn't automatically grow the way heap does is to prevent out-of-control programs from having as much adverse impact on the entire machine
when used in a more local scope (such as a while or for loop) or in several scopes, the memory accumulates per iteration/scope and is not released until the function exits: this contrasts with normal variables defined in the scope of a control structure (e.g. for (int i = 0; i < 2; ++i) { X } would accumulate alloca-ed memory requested at X, but memory for a fixed-sized array would be recycled per iteration).
modern compilers typically do not inline functions that call alloca, but if you force them then the alloca will happen in the callers' context (i.e. the stack won't be released until the caller returns)
a long time ago alloca transitioned from a non-portable feature/hack to a Standardised extension, but some negative perception may persist
the lifetime is bound to the function scope, which may or may not suit the programmer better than malloc's explicit control
having to use malloc encourages thinking about the deallocation - if that's managed through a wrapper function (e.g. WonderfulObject_DestructorFree(ptr)), then the function provides a point for implementation clean up operations (like closing file descriptors, freeing internal pointers or doing some logging) without explicit changes to client code: sometimes it's a nice model to adopt consistently
in this pseudo-OO style of programming, it's natural to want something like WonderfulObject* p = WonderfulObject_AllocConstructor(); - that's possible when the "constructor" is a function returning malloc-ed memory (as the memory remains allocated after the function returns the value to be stored in p), but not if the "constructor" uses alloca
a macro version of WonderfulObject_AllocConstructor could achieve this, but "macros are evil" in that they can conflict with each other and non-macro code and create unintended substitutions and consequent difficult-to-diagnose problems
missing free operations can be detected by Valgrind, Purify etc., but missing "destructor" calls can't always be detected at all - one very tenuous benefit in terms of enforcement of intended usage
some alloca() implementations (such as GCC's) use an inlined macro for alloca(), so runtime substitution of a memory-usage diagnostic library isn't possible the way it is for malloc/realloc/free (e.g. Electric Fence)
some implementations have subtle issues: for example, from the Linux manpage:
On many systems alloca() cannot be used inside the list of arguments of a function call, because the stack space reserved by alloca() would appear on the stack in the middle of the space for the function arguments.
I know this question is tagged C, but as a C++ programmer I thought I'd use C++ to illustrate the potential utility of alloca: the code below (and here at ideone) creates a vector tracking differently sized polymorphic types that are stack allocated (with lifetime tied to function return) rather than heap allocated.
#include <alloca.h>
#include <iostream>
#include <new>       // placement new
#include <vector>

struct Base
{
    virtual ~Base() { }
    virtual int to_int() const = 0;
};

struct Integer : Base
{
    Integer(int n) : n_(n) { }
    int to_int() const { return n_; }
    int n_;
};

struct Double : Base
{
    Double(double n) : n_(n) { }
    int to_int() const { return -n_; }
    double n_;
};

inline Base* factory(double d) __attribute__((always_inline));

inline Base* factory(double d)
{
    if ((double)(int)d != d)
        return new (alloca(sizeof(Double))) Double(d);
    else
        return new (alloca(sizeof(Integer))) Integer(d);
}

int main()
{
    std::vector<Base*> numbers;

    numbers.push_back(factory(29.3));
    numbers.push_back(factory(29));
    numbers.push_back(factory(7.1));
    numbers.push_back(factory(2));
    numbers.push_back(factory(231.0));

    for (std::vector<Base*>::const_iterator i = numbers.begin();
         i != numbers.end(); ++i)
    {
        std::cout << *i << ' ' << (*i)->to_int() << '\n';
        (*i)->~Base(); // optionally / else Undefined Behaviour iff the
                       // program depends on side effects of destructor
    }
}
Lots of interesting answers to this "old" question, even some relatively new answers, but I didn't find any that mention this....
When used properly and with care, consistent use of alloca()
(perhaps application-wide) to handle small variable-length allocations
(or C99 VLAs, where available) can lead to lower overall stack
growth than an otherwise equivalent implementation using oversized
local arrays of fixed length. So alloca() may be good for your stack if you use it carefully.
I found that quote in.... OK, I made that quote up. But really, think about it....
@j_random_hacker is very right in his comments under other answers: Avoiding the use of alloca() in favor of oversized local arrays does not make your program safer from stack overflows (unless your compiler is old enough to allow inlining of functions that use alloca(), in which case you should upgrade, or unless you use alloca() inside loops, in which case you should... not use alloca() inside loops).
I've worked on desktop/server environments and embedded systems. A lot of embedded systems don't use a heap at all (they don't even link in support for it), for reasons that include the perception that dynamically allocated memory is evil due to the risks of memory leaks on an application that never ever reboots for years at a time, or the more reasonable justification that dynamic memory is dangerous because it can't be known for certain that an application will never fragment its heap to the point of false memory exhaustion. So embedded programmers are left with few alternatives.
alloca() (or VLAs) may be just the right tool for the job.
I've seen time & time again where a programmer makes a stack-allocated buffer "big enough to handle any possible case". In a deeply nested call tree, repeated use of that (anti-?)pattern leads to exaggerated stack use. (Imagine a call tree 20 levels deep, where at each level for different reasons, the function blindly over-allocates a buffer of 1024 bytes "just to be safe" when generally it will only use 16 or less of them, and only in very rare cases may use more.) An alternative is to use alloca() or VLAs and allocate only as much stack space as your function needs, to avoid unnecessarily burdening the stack. Hopefully when one function in the call tree needs a larger-than-normal allocation, others in the call tree are still using their normal small allocations, and the overall application stack usage is significantly less than if every function blindly over-allocated a local buffer.
But if you choose to use alloca()...
Based on other answers on this page, it seems that VLAs should be safe (they don't compound stack allocations if called from within a loop), but if you're using alloca(), be careful not to use it inside a loop, and make sure your function can't be inlined if there's any chance it might be called within another function's loop.
All of the other answers are correct. However, if the thing you want to alloc using alloca() is reasonably small, I think that it's a good technique that's faster and more convenient than using malloc() or otherwise.
In other words, alloca( 0x00ffffff ) is dangerous and likely to cause overflow, exactly as much as char hugeArray[ 0x00ffffff ]; is. Be cautious and reasonable and you'll be fine.
I don't think anyone has mentioned this: Use of alloca in a function will hinder or disable some optimizations that could otherwise be applied in the function, since the compiler cannot know the size of the function's stack frame.
For instance, a common optimization by C compilers is to eliminate use of the frame pointer within a function, frame accesses are made relative to the stack pointer instead; so there's one more register for general use. But if alloca is called within the function, the difference between sp and fp will be unknown for part of the function, so this optimization cannot be done.
Given the rarity of its use, and its shady status as a standard function, compiler designers quite possibly disable any optimization that might cause trouble with alloca, if it would take more than a little effort to make it work with alloca.
UPDATE:
Since variable-length local arrays have been added to C, and since these present very similar code-generation issues to the compiler as alloca, I see that 'rarity of use and shady status' does not apply to the underlying mechanism; but I would still suspect that use of either alloca or VLA tends to compromise code generation within a function that uses them. I would welcome any feedback from compiler designers.
Everyone has already pointed out the big thing which is potential undefined behavior from a stack overflow but I should mention that the Windows environment has a great mechanism to catch this using structured exceptions (SEH) and guard pages. Since the stack only grows as needed, these guard pages reside in areas that are unallocated. If you allocate into them (by overflowing the stack) an exception is thrown.
You can catch this SEH exception and call _resetstkoflw to reset the stack and continue on your merry way. It's not ideal, but it's another mechanism to at least know something has gone wrong when the stuff hits the fan. *nix might have something similar that I'm not aware of.
I recommend capping your max allocation size by wrapping alloca and tracking it internally. If you were really hardcore about it you could throw some scope sentries at the top of your function to track any alloca allocations in the function scope and sanity check this against the max amount allowed for your project.
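A common way to keep such a cap in practice (a sketch of one pattern, not the wrapper described above, since alloca cannot live inside an ordinary wrapper function - its memory would be released when the wrapper returns) is to fall back to the heap above a chosen limit:

#include <alloca.h>   /* non-standard */
#include <stdlib.h>
#include <string.h>

#define SMALL_LIMIT 1024              /* illustrative cap */

static void do_work(size_t n)
{
    char *heap = NULL;
    char *buf;

    if (n <= SMALL_LIMIT)
        buf = alloca(n);              /* cheap; released when do_work returns */
    else
        buf = heap = malloc(n);       /* too big to risk on the stack */
    if (buf == NULL)
        return;

    memset(buf, 0, n);                /* ... use buf ... */
    free(heap);                       /* free(NULL) is a harmless no-op */
}

int main(void)
{
    do_work(100);                     /* stack path */
    do_work(1 << 20);                 /* heap path */
}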
Also, in addition to not allowing for memory leaks alloca does not cause memory fragmentation which is pretty important. I don't think alloca is bad practice if you use it intelligently, which is basically true for everything. :-)
One pitfall with alloca is that longjmp rewinds it.
That is to say, if you save a context with setjmp, then alloca some memory, then longjmp to the context, you may lose the alloca memory. The stack pointer is back where it was and so the memory is no longer reserved; if you call a function or do another alloca, you will clobber the original alloca.
To clarify, what I'm specifically referring to here is a situation whereby longjmp does not return out of the function where the alloca took place! Rather, a function saves context with setjmp; then allocates memory with alloca and finally a longjmp takes place to that context. That function's alloca memory is not all freed; just all the memory that it allocated since the setjmp. Of course, I'm speaking about an observed behavior; no such requirement is documented of any alloca that I know.
The focus in the documentation is usually on the concept that alloca memory is associated with a function activation, not with any block; that multiple invocations of alloca just grab more stack memory which is all released when the function terminates. Not so; the memory is actually associated with the procedure context. When the context is restored with longjmp, so is the prior alloca state. It's a consequence of the stack pointer register itself being used for allocation, and also (necessarily) saved and restored in the jmp_buf.
Incidentally, this, if it works that way, provides a plausible mechanism for deliberately freeing memory that was allocated with alloca.
I have run into this as the root cause of a bug.
Here's why:
char x;
char *y = malloc(1);
char *z = alloca(&x - y);   /* &x - y: the (huge) distance between a stack address and a heap address */
*z = 1;
Not that anyone would write this code, but the size argument you're passing to alloca almost certainly comes from some sort of input, which could maliciously aim to get your program to alloca something huge like that. After all, if the size isn't based on input or doesn't have the possibility to be large, why didn't you just declare a small, fixed-size local buffer?
Virtually all code using alloca and/or C99 vlas has serious bugs which will lead to crashes (if you're lucky) or privilege compromise (if you're not so lucky).
alloca () is nice and efficient... but it is also deeply broken.
broken scope behavior (function scope instead of block scope)
use inconsistent with malloc (an alloca()-ted pointer shouldn't be freed, hence you have to track where your pointers come from to free() only those you got with malloc())
bad behavior when you also use inlining (the scope sometimes goes to the caller function, depending on whether the callee is inlined or not).
no stack boundary check
undefined behavior in case of failure (does not return NULL like malloc... and what does failure means as it does not check stack boundaries anyway...)
not ANSI standard
In most cases you can replace it using local variables and majorant size. If it's used for large objects, putting them on the heap is usually a safer idea.
If you really need it, in C you can use VLAs (there are no VLAs in C++, too bad). They are much better than alloca() regarding scope behavior and consistency. As I see it, VLAs are a kind of alloca() made right.
Of course a local structure or array using a majorant of the needed space is still better, and if you don't have such majorant heap allocation using plain malloc() is probably sane.
I see no sane use case where you really really need either alloca() or VLA.
Processes only have a limited amount of stack space available - far less than the amount of memory available to malloc().
By using alloca() you dramatically increase your chances of getting a Stack Overflow error (if you're lucky, or an inexplicable crash if you're not).
A place where alloca() is even more dangerous than malloc() is the kernel - the kernel of a typical operating system has a fixed-size stack space hard-coded into one of its headers; it is not as flexible as the stack of an application. Making a call to alloca() with an unwarranted size may cause the kernel to crash.
Certain compilers warn about usage of alloca() (and even VLAs for that matter) under certain options that ought to be turned on while compiling kernel code - here, it is better to allocate memory in the heap, which is not fixed by a hard-coded limit.
alloca is not worse than a variable-length array (VLA), but it's riskier than allocating on the heap.
On x86 (and most often on ARM), the stack grows downwards, and that brings with it a certain amount of risk: if you accidentally write beyond the block allocated with alloca (due to a buffer overflow for example), then you will overwrite the return address of your function, because that one is located "above" on the stack, i.e. after your allocated block.
The consequence of this is two-fold:
The program will crash spectacularly and it will be impossible to tell why or where it crashed (stack will most likely unwind to a random address due to the overwritten frame pointer).
It makes buffer overflow many times more dangerous, since a malicious user can craft a special payload which would be put on the stack and can therefore end up executed.
In contrast, if you write beyond a block on the heap you "just" get heap corruption. The program will probably terminate unexpectedly but will unwind the stack properly, thereby reducing the chance of malicious code execution.
Sadly the truly awesome alloca() is missing from the almost awesome tcc. Gcc does have alloca().
It sows the seed of its own destruction. With return as the destructor.
Like malloc(), it returns an invalid pointer on failure which will segfault on modern systems with an MMU (and hopefully restart those without).
Unlike auto variables you can specify the size at run time.
It works well with recursion. You can use static variables to achieve something similar to tail recursion and use just a few others to pass info to each iteration.
If you push too deep you are assured of a segfault (if you have an MMU).
Note that malloc() offers no more, as it returns NULL (which will also segfault if dereferenced) when the system is out of memory. I.e. all you can do is bail or just try to assign it anyway.
To use malloc() I use globals and assign them NULL. If the pointer is not NULL I free it before I use malloc().
You can also use realloc() as the general case if you want to copy any existing data. You need to check the pointer beforehand to work out if you are going to copy or concatenate after the realloc().
Actually, alloca is not guaranteed to use the stack.
Indeed, the gcc-2.95 implementation of alloca allocates memory from the heap using malloc itself. That implementation is also buggy: it may lead to a memory leak and to some unexpected behavior if you call it inside a block with a further use of goto. Not to say that you should never use it, but sometimes alloca leads to more overhead than it relieves you from.
In my opinion, alloca(), where available, should be used only in a constrained manner. Very much like the use of "goto", quite a large number of otherwise reasonable people have strong aversion not just to the use of, but also the existence of, alloca().
For embedded use, where the stack size is known and limits can be imposed via convention and analysis on the size of the allocation, and where the compiler cannot be upgraded to support C99+, use of alloca() is fine, and I've been known to use it.
When available, VLAs may have some advantages over alloca(): The compiler can generate stack limit checks that will catch out-of-bounds access when array style access is used (I don't know if any compilers do this, but it can be done), and analysis of the code can determine whether the array access expressions are properly bounded. Note that, in some programming environments, such as automotive, medical equipment, and avionics, this analysis has to be done even for fixed size arrays, both automatic (on the stack) and static allocation (global or local).
On architectures that store both data and return addresses/frame pointers on the stack (from what I know, that's all of them), any stack allocated variable can be dangerous because the address of the variable can be taken, and unchecked input values might permit all sorts of mischief.
Portability is less of a concern in the embedded space, however it is a good argument against use of alloca() outside of carefully controlled circumstances.
Outside of the embedded space, I've used alloca() mostly inside logging and formatting functions for efficiency, and in a non-recursive lexical scanner, where temporary structures (allocated using alloca()) are created during tokenization and classification, then a persistent object (allocated via malloc()) is populated before the function returns. The use of alloca() for the smaller temporary structures greatly reduces fragmentation when the persistent object is allocated.
Why does no one mention this example from the GNU documentation?
https://www.gnu.org/software/libc/manual/html_node/Advantages-of-Alloca.html
Nonlocal exits done with longjmp (see Non-Local Exits) automatically free the space allocated with alloca when they exit through the function that called alloca. This is the most important reason to use alloca.
Suggest reading order 1->2->3->1:
1. https://www.gnu.org/software/libc/manual/html_node/Advantages-of-Alloca.html
2. Intro and Details from Non-Local Exits
3. Alloca Example
I don't think that anybody has mentioned this, but alloca also has some serious security issues not necessarily present with malloc (though these issues also arise with any stack based arrays, dynamic or not). Since the memory is allocated on the stack, buffer overflows/underflows have much more serious consequences than with just malloc.
In particular, the return address for a function is stored on the stack. If this value gets corrupted, your code could be made to go to any executable region of memory. Compilers go to great lengths to make this difficult (in particular by randomizing address layout). However, this is clearly worse than just a stack overflow, since the best case is a SEGFAULT if the return address is corrupted, but it could also start executing a random piece of memory or, in the worst case, some region of memory which compromises your program's security.
IMO the biggest risk with alloca and variable length arrays is it can fail in a very dangerous manner if the allocation size is unexpectedly large.
Allocations on the stack typically have no checking in user code.
Modern operating systems will generally put a guard page in place below* to detect stack overflow. When the stack overflows, the kernel may either expand the stack or kill the process. Linux expanded this guard region in 2017 to be significantly larger than a page, but it's still finite in size.
So as a rule it's best to avoid allocating more than a page on the stack before making use of the previous allocations. With alloca or variable length arrays it's easy to end up allowing an attacker to make arbitrary size allocations on the stack and hence skip over any guard page and access arbitrary memory.
* on most widespread systems today the stack grows downwards.
Most answers here largely miss the point: there's a reason why using _alloca() is potentially worse than merely storing large objects in the stack.
The main difference between automatic storage and _alloca() is that the latter suffers from an additional (serious) problem: the allocated block is not controlled by the compiler, so there's no way for the compiler to optimize or recycle it.
Compare:
while (condition) {
    char buffer[0x100]; // Chill.
    /* ... */
}
with:
while (condition) {
    char* buffer = _alloca(0x100); // Bad!
    /* ... */
}
The problem with the latter should be obvious.

Are activation records created on stack or heap in C?

I am reading about memory allocation and activation records. I am having some doubts. Can anyone make the following crystal clear ?
A). My first doubt is that "Are activation records created on stack or heap in C" ?
B). These are a few lines from an abstract which I am referring to:
Even though memory on stack area is created during run time- the
amount of memory (activation record size) is determined at compile
time. Static and global memory area is compile time determined and
this is part of the binary. At run time, we cannot change this. Only
memory area freely available for the process to change during runtime
is heap.At compile time compiler only reserves the stack space for
activation record. This gets used (allocated on actual memory) only
during program run. Only DATA segment part of the program like static
variables, string literals etc. are allocated during compile time. For
heap area, how much memory to be allocated is also determined at run
time.
Can anyone please elaborate on these lines, as I am unable to understand anything?
I am sure the explanation would be of great help to me.
As a quick answer, I don't even really know what an activation record is. The rest of the quote has very poor English and is quite misleading.
Honestly, the abstract is talking about absolutes when in reality there are no absolutes at all. You do define a main stack at compile time, yes (though you can create many stacks at runtime as well).
Yes, when you want to allocate memory, one usually creates a pointer to store that information, but where you place that is completely up to you. It can be stack, it can be global memory, it can be in the heap from another allocation, or you can just leak memory and not store it anywhere it all if you wish. Perhaps this is what is meant by an activation record?
Or perhaps, it means that when dynamic memory is created, somewhere in memory, there has to be some sort of information that keeps track of used and unused memory. For many allocators, this is a list of pointers stored somewhere in the allocated memory, though others store it in a different piece of memory and some could even place that on the stack. It all depends on the needs of the memory system.
Finally, where dynamic memory is allocated from can vary as well. It can come from a call to the OS, though in some cases, it can also just be overlayed onto existing global (or even stack) memory - which is not uncommon in embedded programming.
As you can see, this abstract is not even close to what dynamic memory represents.
Additional info:
Many are jumping all over me stating that 'C' has no stack in the standard. Correct. That said, how many people have truly coded in C without one? I'll leave that alone for now.
Defined memory, as you call it, is anything declared with the 'static' keyword within a function or any variable declared outside of a function without the 'extern' keyword in front of it. This is memory that the compiler knows about and can reserve space for without any additional help.
Allocated memory - is not a good term as defined memory can also be considered allocated. Instead, use the term dynamic memory. This is memory that you allocate from a heap at run-time. An example:
#include <stdlib.h>

char *foo;
int my_value;

int main(void)
{
    foo = malloc(10 * sizeof(char));
    // Do stuff with foo
    free(foo);
    return 0;
}
foo is "defined" as you say as a pointer. If nothing else were done, it would only reserve that much memory, but when the malloc is reached in main(), it now points to at least 10 bytes of dynamic memory as well. Once the free is reached, that memory is now made available to the program for other uses. It's allocated size is 'dynamic'. Compare that to my_value which will always be the size of an int and nothing else.
In C (given how it is almost universally implemented*) An activation record is exactly the same thing as a stack frame which is the same thing as a call frame. They are always created on the stack.
The stack segment is a memory area the process gets "for free" from the OS when it is created. It does not need to malloc or free it. On x86, a machine register (e.g. RSP) points to the end of the segment, and stack frames/activation records/call frames are "allocated" by decrementing the pointer in that register by however many bytes are to be allocated. E.g.:
int my_func() {
    int x = 123;
    int y = 234;
    int z = 345;
    ...
    return 1;
}
An unoptimizing C compiler could generate assembly code for keeping those three variables in the stack frame like this:
my_func:
    ; "allocate" 24 bytes of stack space
    sub rsp, 24
    ; Initialize the allocated stack memory
    mov [rsp], 345      ; z = 345
    mov [rsp+8], 234    ; y = 234
    mov [rsp+16], 123   ; x = 123
    ...
    ; "free" the allocated stack space
    add rsp, 24
    ; return 1
    mov rax, 1
    ret
In other contexts and languages activation records can be implemented differently. For example using linked lists. But as the language is C and the context is low-level programming I don't think it is useful to discuss that.
In theory, a C99 (or C11) compatible implementation (e.g. a C compiler & C standard library implementation) does not even need (in all cases) a call stack. For example, one could imagine a whole-program compiler (notably for a freestanding C implementation) which would analyze the entire program and decide that stack frames are unneeded (e.g. each local variable could be allocated statically, or fit in a register). Or one could imagine an implementation allocating the call frames as continuation frames (perhaps after CPS transformation by the compiler) elsewhere (e.g. in some "heap"), using techniques similar to those described in Appel's old book Compiling with Continuations (describing an SML/NJ compiler).
(remember that a programming language is a specification -not some software-, often written in English, perhaps with additional formalization, in some technical report or standard document. AFAIK, the C99 or C11 standards do not even mention any stack or activation record. But in practice, most C implementations are made of a compiler and a standard library implementation.)
In practice, activation records are call frames (for C, they are synonyms; things are more complex with nested functions) and are allocated on a hardware-assisted call stack on all reasonable C implementations I know. On Z/Architecture there is no hardware stack pointer register, so it is a convention (dedicating some register to play the role of the stack pointer).
So look first at call stack wikipage. It has a nice picture worth many words.
Are activation records created on stack or heap
In practice, they (activation records) are call frames on the call stack (allocated following calling conventions and ABIs). Of course the layout, slot usage, and size of a call frame is computed at compile-time by the compiler.
In practice, a local variable may correspond to some slot inside the call frame. But sometimes, the compiler would keep it only in a register, or reuse the same slot (which has a fixed offset in the call frame) for various usages, e.g. for several local variables in different blocks, etc.
But most C compilers are optimizing compilers. They are able to inline a function, or sometimes make a tail call to it (then the caller's call frame is reused as or overwritten by the callee call frame), so details are more complex.
See also this How was C ported to architectures that had no hardware stack? question on retro.

Heap Memory in C Programming

What exactly is heap memory?
Whenever a call to malloc is made, memory is assigned from something called the heap. Where exactly is the heap? I know that a program in main memory is divided into an instruction segment where program statements are present, a data segment where global data resides, and a stack segment where local variables and corresponding function parameters are stored. Now, what about the heap?
The heap is part of your process's address space. The heap can be grown or shrunk; you manipulate it by calling brk(2) or sbrk(2). This is in fact what malloc(3) does.
Allocating from the heap is more convenient than allocating memory on the stack because it persists after the calling routine returns; thus, you can call a routine, say funcA(), to allocate a bunch of memory and fill it with something; that memory will still be valid after funcA() returns. If funcA() allocates a local array (on the stack) then when funcA() returns, the on-stack array is gone.
A drawback of using the heap is that if you forget to release heap-allocated memory, you may exhaust it. The failure to release heap-allocated memory (e.g., failing to free() memory gotten from malloc()) is sometimes called a memory leak.
Another nice feature of the heap, vs. just allocating a local array/struct/whatever on the stack, is that you get a return value saying whether your allocation succeeded; if you try to allocate a local array on the stack and you run out, you don't get an error code; typically your thread will simply be aborted.
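In code, that difference looks like this (the sizes are just illustrative):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *p = malloc(64UL * 1024 * 1024);   /* heap: failure is reported... */
    if (p == NULL) {
        fputs("out of memory, handle it gracefully\n", stderr);
        return 1;
    }
    free(p);

    /* ...whereas a local "char big[64UL * 1024 * 1024];" would give no error
     * code at all: the thread typically just dies with a stack overflow. */
    return 0;
}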
The heap is the diametrical opposite of the stack. The heap is a large pool of memory that can be used dynamically – it is also known as the “free store”. This is memory that is not automatically managed – you have to explicitly allocate (using functions such as malloc), and deallocate (e.g. free) the memory. Failure to free the memory when you are finished with it will result in what is known as a memory leak – memory that is still “being used”, and not available to other processes. Unlike the stack, there are generally no restrictions on the size of the heap (or the variables it creates), other than the physical size of memory in the machine. Variables created on the heap are accessible anywhere in the program.
Oh, and heap memory requires you to use pointers.
A summary of the heap:
the heap is managed by the programmer, the ability to modify it is
somewhat boundless
in C, variables are allocated and freed using functions like malloc() and free()
the heap is large, and is usually limited by the physical memory available
the heap requires pointers to access it
credit to craftofcoding
Basically, after memory has been consumed by the needs of the program, what is left over is the heap. In C that will be the memory the computer makes available; for virtual machines it will be less than that.
But, this is the memory that can be used at run-time as your program needs memory dynamically.
You may want to look at this for more info:
http://computer.howstuffworks.com/c28.htm
Reading through this, this is actually beyond the realms of C. C doesn't specify that there's a heap behind malloc; it could just as easily be called a linked list; you're just calling it a heap by convention.
What the standard guarantees is that malloc will return either a null pointer or a pointer to an object that has dynamic storage duration, and your heap is just one type of data structure which facilitates the provision of such a storage duration. It's the common choice. Nonetheless, the very developers who wrote your heap have recognised that it might not be a heap, and so you'll see no reference to the term heap in the POSIX malloc manual, for example.
Other things that are beyond the realms of standard C include such details of the machine code binary which is no longer C source code following compilation. The layout details, though typical, are all implementation-specific as opposed to C-specific.
The heap, or whichever book-keeping data structure is used to account for allocations, is generated during runtime; as malloc is called, new entries are (presumably) added to it and as free is called, new entries are (again, presumably) removed from it.
As a result, there's generally no need to have a section in the machine code binary for objects allocated using malloc, however there are cases where applications are shipped standalone baked into microprocessors, and in some of these cases you might find that flash or otherwise non-volatile memory might be reserved for that use.
