Ways to avoid using malloc? [closed] - c

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
With all recent noise C gets, I read that there are some ways to minimize usage of malloc in C, and that it is a very good practice. I how ever have no idea when or how or what such practices are good. So my question would be, maybe some experienced C programmers could give some examples where one could (or should) write something without malloc, but what would be really non-obvious way for newbie C programmer (and thus said newbie would simply use malloc)? Maybe you have some experiences from factoring out malloc into something else.
P.S. some posts I read were referencing Quake 3 source code and how it avoids use of malloc, so if someone has knowledge of this it would be interesting to know what is done there, since at least for know I would like to avoid digging into quakes code aimlessly. (since well if they avoid using malloc searching for malloc will not give much results I suppose, also code base is most likely not as simple as individual examples could be)

I don't know about totally avoiding malloc, but you can certainly reduce it.
The basic concept is a memory pool. That is a large buffer which you have allocated that you can use for many objects instead of requesting lots of small allocations.
You might use this in a real-world situation where you are sending events into a queue to be processed by another thread. The event objects might be smallish structures and you really need to avoid making thousands of calls to malloc every second.
The answer of course is to draw these event objects from a pool. If you need to, you can even use parts of your pool buffer to form a list so that you can quickly index memory that has been returned to the pool. These are generally known as free-lists.
You do have to be careful about memory alignment, as you can severely impact performance by having misaligned data. But you can handle all that with a little maths.
Don't freak out about these concepts. A pool doesn't actually have to be that sophisticated. Consider this:
int ** matrix = malloc( rows * sizeof(int*) );
for( int i = 0; i < rows; i++ ) {
matrix[i] = malloc( cols * sizeof(int) );
}
I see this all the time, and it's a pet peeve of mine. Why would you do that, when you can do this:
int ** matrix = malloc( rows * sizeof(int*) );
matrix[0] = malloc( rows * cols * sizeof(int) );
for( int i = 1; i < rows; i++ ) {
matrix[i] = matrix[i-1] + cols;
}
And of course, that reduces to this (beware of potential alignment issues in your first row though - I've ignored it here for the sake of clarity)
int ** matrix = malloc( rows * sizeof(int*) + rows * cols * sizeof(int) );
matrix[0] = (int*)matrix + rows;
for( int i = 1; i < rows; i++ ) {
matrix[i] = matrix[i-1] + cols;
}
The cool thing about that last example is how easy it is to delete your matrix =)
free( matrix );
Oh, and zeroing the matrix is just as easy...
memset( matrix[0], 0, rows * cols * sizeof(int) );

In the scenario where you need small, dynamic sized arrays in local scope, there is alloca() which allocates from the stack and doesn't need you to explicitly free the memory (it gets freed when the function returns), and there are variable length arrays (VLA):
void meh(int s) {
float *foo = alloca(s * sizeof(float));
float frob[s];
} // note: foo and frob are freed upon returning

If you know all the sizes of arrays, lists, stacks, trees, whatever data structures your program needs beforehand, you can allocate the required memory statically by defining arrays of constant number of elements. Pros: no memory management, no memory fragmentation, fast. Cons: limited use, wasted memory.
You can implement a custom memory allocator on top of malloc() or whatever your OS provides, allocate a big chunk of memory once and then carve it up without calling standard malloc() functions. Pros: fast. Cons: not quite trivial to implement right.
Another (and a rather perverse) way of avoiding malloc() would be to store most of your data in files instead of memory. Pros: virtually none.
You may also use local variables and deep function calls (or explicit recursion) to allocate space for data on the go if you're certain that the program's stack is going to be big enough. Pros: no memory management, easy, fast. Cons: limited use.
As an example of a working midsize project that avoids malloc() I can offer my pet project, Smaller C compiler. It statically allocates a number of arrays and it also allocates small local variables inside recursive functions. Beware, the code hasn't been beautified yet and it's not something small or easy to understand if you're fairly new to programming, C or compilers.

The primary reason for not using malloc in some particular cases is probably the fact that it employs a generic, one-size-fits-all approach to memory allocation.
Other approaches, such as memory pools and slab allocation may offer benefits in the case of having well-known allocation needs.
For example, it is much more advantageous for an allocator to assume that the allocated objects will be of a fixed size, or assume that their lifetime will be relatively short. A generic allocator cannot make such assumptions and cannot therefore perform optimally in such scenarios.
The potential benefits can include a decreased memory footprint due to the specialized allocator having a more condensed bookkeeping. A generic allocator most probably holds a larger amount of metadata for each allocated object, whereas an allocator that "knows" in advance what the object's size will be can probably omit it from the metadata.
It can also make a difference in allocation speed - a custom allocator will probably be able to find an empty slot faster.
This is all talking in relatives here, but the questions you should ask before choosing a custom allocation scheme are:
Do you need to allocate and deallocate a large number of objects having the same size? (Slab allocation)
Can these objects be disposed at once without the overhead of individual calls? (Memory pools)
Is there a logical grouping of the individually allocated objects? (Cache aware allocation)
The bottom line is, you have to inspect the allocation needs and patterns of your program carefully, and then decide whether a custom allocation scheme can be beneficial.

There are several reasons to avoid malloc - the biggest one in my mind is that "no malloc, no free" to paraphrase Bob Marley... So, no memory leaks from "forgetting" to call free.
And of course, you should always check for NULL when allocating memory dynamically. Avoiding this will reduce the amount of code and the complexity of the code.
Unfortunately, the alternative, running out of stack or global variable size is often worse, as it either crashes immediately with no meaningful error message to the user (stackoverflow) or buffer overflow in global variables - checking the boundaries in global variables will avoid this, but what do you do if you detect it? There aren't many choices.
The other part is of course that a call to malloc can be substantially expensive, compared to local variables. This is particularly the case when you hit malloc/free calls in "hot-paths" - parts of the code that is called very often. There is also memory overhead in using malloc on small memory sections - the overhead from past experience in Visual studio is around 32 bytes of "header" and rounded to 16 or 32 bytes boundaries - so an allocation of 1 byte actually takes up 64 bytes. An allocation of 17 bytes would also take up 64 bytes...
Of course, like ALL engineering/software design, it is not "you MUST NOT USE malloc", but "avoid malloc if there is a simple/suitable alternative". It's wrong to use all global variables that are several times larger than they need to be, just to avoid malloc - but it's equally wrong to call malloc/free for every frame or every object of a graphics drawing loop.
I haven't looked at the code of Quake, but i worked on some code in 3DMark 2000 [I think my name is still in the credits of the product]. That's written in C++, but it avoids using new/delete in the rendering code. It's all done in the setup/teardown of a frame, with very few exceptions.

Allocating a bigger block of memory is usually faster, so my advice would be allocate a big block and then create a memory pool from it. Implement your own functions to "free" memory back to the pool and to allocate memory from it.

Related

Why is there no "recalloc" in the C standard?

Everyone knows that:
realloc resizes an existing block of memory or copies it to a larger block.
calloc ensures the memory is zeroed out and guards against arithmetic overflows and is generally geared toward large arrays.
Why doesn't the C standard provide a function like the following that combines both of the above?
void *recalloc(void *ptr, size_t num, size_t size);
Wouldn't it be useful for resizing huge hash tables or custom memory pools?
Generally in C, the point of the standard library is not to provide a rich set of cool functions. It is to provide an essential set of building blocks, from which you can build your own cool functions.
Your proposal for recalloc would be trivial to write, and therefore is not something the standard lib should provide.
Other languages take a different approach: C# and Java have super-rich libraries that make even complicated tasks trivial. But they come with enormous overhead. C has minimal overhead, and that aids in making it portable to all kinds of embedded devices.
I assume you're interested in only zeroing out the new part of the array:
Not every memory allocator knows how much memory you're using in an array. for example, if I do:
char* foo = malloc(1);
foo now points to at least a chunk of memory 1 byte large. But most allocators will allocate much more than 1 byte (for example, 8, to keep alignment).
This can happen with other allocations, too. The memory allocator will allocate at least as much memory as you request, though often just a little bit more.
And it's this "just a little bit more" part that screws things up (in addition to other factors that make this hard). Because we don't know if it's useful memory or not. If it's just padding, and you recalloc it, and the allocator doesn't zero it, then you now have "new" memory that has some nonzeros in it.
For example, what if I recalloc foo to get it to point to a new buffer that's at least 2 bytes large. Will that extra byte be zeroed? Or not? It should be, but note that the original allocation gave us 8 bytes, so are reallocation doesn't allocate any new memory. As far as the allocator can see, it doesn't need to zero any memory (because there's no "new" memory to zero). Which could lead to a serious bug in our code.

Growing an array on the stack

This is my problem in essence. In the life of a function, I generate some integers, then use the array of integers in an algorithm that is also part of the same function. The array of integers will only be used within the function, so naturally it makes sense to store the array on the stack.
The problem is I don't know the size of the array until I'm finished generating all the integers.
I know how to allocate a fixed size and variable sized array on the stack. However, I do not know how to grow an array on the stack, and that seems like the best way to solve my problem. I'm fairly certain this is possible to do in assembly, you just increment stack pointer and store an int for each int generated, so the array of ints would be at the end of the stack frame. Is this possible to do in C though?
I would disagree with your assertion that "so naturally it makes sense to store the array on the stack". Stack memory is really designed for when you know the size at compile time. I would argue that dynamic memory is the way to go here
C doesn't define what the "stack" is. It only has static, automatic and dynamic allocations. Static and automatic allocations are handled by the compiler, and only dynamic allocation puts the controls in your hands. Thus, if you want to manually deallocate an object and allocate a bigger one, you must use dynamic allocation.
Don't use dynamic arrays on the stack (compare Why is the use of alloca() not considered good practice?), better allocate memory from the heap using malloc and resize it using realloc.
Never Use alloca()
IMHO this point hasn't been made well enough in the standard references.
One rule of thumb is:
If you're not prepared to statically allocate the maximum possible size as a
fixed length C array then you shouldn't do it dynamically with alloca() either.
Why? The reason you're trying to avoid malloc() is performance.
alloca() will be slower and won't work in any circumstance static allocation will fail. It's generally less likely to succeed than malloc() too.
One thing is sure. Statically allocating the maximum will outdo both malloc() and alloca().
Static allocation is typically damn near a no-op. Most systems will advance the stack pointer for the function call anyway. There's no appreciable difference for how far.
So what you're telling me is you care about performance but want to hold back on a no-op solution? Think about why you feel like that.
The overwhelming likelihood is you're concerned about the size allocated.
But as explained it's free and it gets taken back. What's the worry?
If the worry is "I don't have a maximum or don't know if it will overflow the stack" then you shouldn't be using alloca() because you don't have a maximum and know it if it will overflow the stack.
If you do have a maximum and know it isn't going to blow the stack then statically allocate the maximum and go home. It's a free lunch - remember?
That makes alloca() either wrong or sub-optimal.
Every time you use alloca() you're either wasting your time or coding in one of the difficult-to-test-for arbitrary scaling ceilings that sleep quietly until things really matter then f**k up someone's day.
Don't.
PS: If you need a big 'workspace' but the malloc()/free() overhead is a bottle-neck for example called repeatedly in a big loop, then consider allocating the workspace outside the loop and carrying it from iteration to iteration. You may need to reallocate the workspace if you find a 'big' case but it's often possible to divide the number of allocations by 100 or even 1000.
Footnote:
There must be some theoretical algorithm where a() calls b() and if a() requires a massive environment b() doesn't and vice versa.
In that event there could be some kind of freaky play-off where the stack overflow is prevented by alloca(). I have never heard of or seen such an algorithm. Plausible specimens will be gratefully received!
The innards of the C compiler requires stack sizes to be fixed or calculable at compile time. It's been a while since I used C (now a C++ convert) and I don't know exactly why this is. http://gribblelab.org/CBootcamp/7_Memory_Stack_vs_Heap.html provides a useful comparison of the pros and cons of the two approaches.
I appreciate your assembly code analogy but C is largely managed, if that makes any sense, by the Operating System, which imposes/provides the task, process and stack notations.
In order to address your issue dynamic memory allocation looks ideal.
int *a = malloc(sizeof(int));
and dereference it to store the value .
Each time a new integer needs to be added to the existing list of integers
int *temp = realloc(a,sizeof(int) * (n+1)); /* n = number of new elements */
if(temp != NULL)
a = temp;
Once done using this memory free() it.
Is there an upper limit on the size? If you can impose one, so the size is at most a few tens of KiB, then yes alloca is appropriate (especially if this is a leaf function, not one calling other functions that might also allocate non-tiny arrays this way).
Or since this is C, not C++, use a variable-length array like int foo[n];.
But always sanity-check your size, otherwise it's a stack-clash vulnerability waiting to happen. (Where a huge allocation moves the stack pointer so far that it ends up in the middle of another memory region, where other things get overwritten by local variables and return addresses.) Some distros enable hardening options that make GCC generate code to touch every page in between when moving the stack pointer by more than a page.
It's usually not worth it to check the size and use alloc for small, malloc for large, since you also need another check at the end of your function to call free if the size was large. It might give a speedup, but this makes your code more complicated and more likely to get broken during maintenance if future editors don't notice that the memory is only sometimes malloced. So only consider a dual strategy if profiling shows this is actually important, and you care about performance more than simplicity / human-readability / maintainability for this particular project.
A size check for an upper limit (else log an error and exit) is more reasonable, but then you have to choose an upper limit beyond which your program will intentionally bail out, even though there's plenty of RAM you're choosing not to use. If there is a reasonable limit where you can be pretty sure something's gone wrong, like the input being intentionally malicious from an exploit, then great, if(size>limit) error(); int arr[size];.
If neither of those conditions can be satisfied, your use case is not appropriate for C automatic storage (stack memory) because it might need to be large. Just use dynamic allocation autom don't want malloc.
Windows x86/x64 the default user-space stack size is 1MiB, I think. On x86-64 Linux it's 8MiB. (ulimit -s). Thread stacks are allocated with the same size. But remember, your function will be part of a chain of function calls (so if every function used a large fraction of the total size, you'd have a problem if they called each other). And any stack memory you dirty won't get handed back to the OS even after the function returns, unlike malloc/free where a large allocation can give back the memory instead of leaving it on the free list.
Kernel thread stack are much smaller, like 16 KiB total for x86-64 Linux, so you never want VLAs or alloca in kernel code, except maybe for a tiny max size, like up to 16 or maybe 32 bytes, not large compared to the size of a pointer that would be needed to store a kmalloc return value.

C allocation and memory overhead

Apologies if this is a stupid question, but it's been kinda bothering me for a long time.
I'd like to know some details on how the memory manager knows what memory is in use.
Imagine a one-chip microcomputer with 1024B of RAM - not much to spare.
Now you allocate 100 ints - each int is 4 bytes, each pointer 4 bytes too (yea, 8bit one-chip will most likely have smaller pointers, but w/e).
So you've just used 800B of ram for 100 ints? But it's worse - the allocation system must somehow take note on where the memory is malloc'd and where it's free - 200 extra bytes or something? Or some bit marks?
If this is true, why is C favoured over assembler so often?
Is this really how it works? So super inefficient?
(Or am I having a totally incorrect idea about it?)
It may surprise younger developers to learn that greying old ones like myself used to write in C on systems with 1 or 2k of RAM.
In systems this size, dynamic memory allocation would have been a luxury that we could not afford. It's not just the pointer overhead of managing the free store, but also the effects of free store fragmentation making memory allocation inefficient and very probably leading to a fatal out-of-memory condition (virtual memory was not an option).
So we used to use static memory allocation (i.e. global variables), kept a very tight control on the depth of function all nesting, and an even tighter control over nested interrupt handling.
When writing on these systems, I didn't even link the standard library. I wrote my own C startup routines and provided custom minimal I/O routines.
One program I wrote in a 2k ram system used the lower part of RAM as the data area and the upper part as the stack. In the final cut, I proved that the maximal use of stack reached so far down in memory that it was 1 byte away from the last variable in the data area.
Ah, the good old days...
EDIT:
To answer your question specifically, the original K&R free store manager used to add a header block to the beginning of each block of memory allocated via malloc.
The header block looked something like this:
union header {
struct {
union header *ptr;
unsigned size;
} s;
};
Where ptr is the address of the next header block and size is the size of the memory allocated (in blocks). The malloc function would actually return the address computed by &header + sizeof(header). The free function would subtract the size of the header from the pointer you provided in order to re-link the block back into the free list.
There are several approaches how you can do that:
as you write, malloc() one memory block for every int you have. Completely inefficient, thus I strike it out.
malloc() an array of 100 ints. That needs in total 100*sizeof(int) + 1* sizeof(int*) + whatever malloc() internally needs. Much better.
statically allocate the array. Here you just need 100*sizeof(int).
allocate the array on the stack. That needs the same, but only for the current function call.
Which of these you need depends on how long you need the memory and other criteria.
If you have that few RAM, it might even be questionable if it is useful to use malloc() at all. It can be an option if several code blocks need a lot of RAM, but not at the same time.
On how the memory addresses are tracked, that as well depends:
for malloc(), you have to put the pointer in a place where you don't lose it.
for an array on the stack, it is expressed relatively to the current frame pointer. The code sets up the frame and thus knows about the offset, so it is normally not needed to store it anywhere.
for an array in the data segment, the compiler and linker know about the address and statically put the address where it is needed.
If this is true, why is C favoured over assembler so often?
You're simplifying the problem too much. C or assembler - doesn't matter, you still need to manage the memory chunks. The main issue is fragmentation, not the actual management overhead. In a system like the one you described, you would probably just allocate the memory and never ever release it, thus no need to check what's free - whatever is below the watermark is free, and that's it.
Is this really how it works? So super inefficient?
There are many algorithms around this problem, but if you're simplifying - yes, it basically is. In reality its a much more complicated problem - and there are much more complicated systems revolving around servicing memory, dealing with fragmentation, garbage collection (on a OS level), etc etc.

How to find how much memory is actually used up by a malloc call?

If I call:
char *myChar = (char *)malloc(sizeof(char));
I am likely to be using more than 1 byte of memory, because malloc is likely to be using some memory on its own to keep track of free blocks in the heap, and it may effectively cost me some memory by always aligning allocations along certain boundaries.
My question is: Is there a way to find out how much memory is really used up by a particular malloc call, including the effective cost of alignment, and the overhead used by malloc/free?
Just to be clear, I am not asking to find out how much memory a pointer points to after a call to malloc. Rather, I am debugging a program that uses a great deal of memory, and I want to be aware of which parts of the code are allocating how much memory. I'd like to be able to have internal memory accounting that very closely matches the numbers reported by top. Ideally, I'd like to be able to do this programmatically on a per-malloc-call basis, as opposed to getting a summary at a checkpoint.
There isn't a portable solution to this, however there may be operating-system specific solutions for the environments you're interested in.
For example, with glibc on Linux, you can use the mallinfo() function from <malloc.h> which returns a struct mallinfo. The uordblks and hblkhd members of this structure contains the dynamically allocated address space used by the program including book-keeping overhead - if you take the difference of this before and after each malloc() call, you will know the amount of space used by that call. (The overhead is not necessarily constant for every call to malloc()).
Using your example:
char *myChar;
size_t s = sizeof(char);
struct mallinfo before, after;
int mused;
before = mallinfo();
myChar = malloc(s);
after = mallinfo();
mused = (after.uordblks - before.uordblks) + (after.hblkhd - before.hblkhd);
printf("Requested size %zu, used space %d, overhead %zu\n", s, mused, mused - s);
Really though, the overhead is likely to be pretty minor unless you are making a very very high number of very small allocations, which is a bad idea anyway.
It really depends on the implementation. You should really use some memory debugger. On Linux Valgrind's Massif tool can be useful. There are memory debugging libraries like dmalloc, ...
That said, typical overhead:
1 int for storing size + flags of this block.
possibly 1 int for storing size of previous/next block, to assist in coallescing blocks.
2 pointers, but these may only be used in free()'d blocks, being reused for application storage in allocated blocks.
Alignment to an approppriate type, e.g: double.
-1 int (yes, that's a minus) of the next/previous chunk's field containing our size if we are an allocated block, since we cannot be coallesced until we're freed.
So, a minimum size can be 16 to 24 bytes. and minimum overhead can be 4 bytes.
But you could also satisfy every allocation via mapping memory pages (typically 4Kb), which would mean overhead for smaller allocations would be huge. I think OpenBSD does this.
There is nothing defined in the C library to query the total amount of physical memory used by a malloc() call. The amount of memory allocated is controlled by whatever memory manager is hooked up behind the scenes that malloc() calls into. That memory manager can allocate as much extra memory as it deemes necessary for its internal tracking purposes, on top of whatever extra memory the OS itself requires. When you call free(), it accesses the memory manager, which knows how to access that extra memory so it all gets released properly, but there is no way for you to know how much memory that involves. If you need that much fine detail, then you need to write your own memory manager.
If you do use valgrind/Massif, there's an option to show either the malloc value or the top value, which differ a LOT in my experience. Here's an excerpt from the Valgrind manual http://valgrind.org/docs/manual/ms-manual.html :
...However, if you wish to measure all the memory used by your program,
you can use the --pages-as-heap=yes. When this option is enabled,
Massif's normal heap block profiling is replaced by lower-level page
profiling. Every page allocated via mmap and similar system calls is
treated as a distinct block. This means that code, data and BSS
segments are all measured, as they are just memory pages. Even the
stack is measured...

Why would you ever want to allocate memory on the heap rather than the stack? [duplicate]

This question already has answers here:
Closed 13 years ago.
Possible Duplicate:
When is it best to use a Stack instead of a Heap and vice versa?
I've read a few of the other questions regarding the heap vs stack, but they seem to focus more on what the heap/stack do rather than why you would use them.
It seems to me that stack allocation would almost always be preferred since it is quicker (just moving the stack pointer vs looking for free space in the heap), and you don't have to manually free allocated memory when you're done using it. The only reason I can see for using heap allocation is if you wanted to create an object in a function and then use it outside that functions scope, since stack allocated memory is automatically unallocated after returning from the function.
Are there other reasons for using heap allocation instead of stack allocation that I am not aware of?
There are a few reasons:
The main one is that with heap allocation, you have the most flexible control over the object's lifetime (from malloc/calloc to free);
Stack space is typically a more limited resource than heap space, at least in default configurations;
A failure to allocate heap space can be handled gracefully, whereas running out of stack space is often unrecoverable.
Without the flexible object lifetime, useful data structures such as binary trees and linked lists would be virtually impossible to write.
You want an allocation to live beyond a function invocation
You want to conserve stack space (which is typically limited to a few MBs)
You're working with re-locatable memory (Win16, databases, etc.), or want to recover from allocation failures.
Variable length anything. You can fake around this, but your code will be really nasty.
The big one is #1. As soon as you get into any sort of concurrency or IPC #1 is everywhere. Even most non-trivial single threaded applications are tricky to devise without some heap allocation. That'd practically be faking a functional language in C/C++.
So I want to make a string. I can make it on the heap or on the stack. Let's try both:
char *heap = malloc(14);
if(heap == NULL)
{
// bad things happened!
}
strcat(heap, "Hello, world!");
And for the stack:
char stack[] = "Hello, world!";
So now I have these two strings in their respective places. Later, I want to make them longer:
char *tmp = realloc(heap, 20);
if(tmp == NULL)
{
// bad things happened!
}
heap = tmp;
memmove(heap + 13, heap + 7);
memcpy(heap + 7, "cruel ", 6);
And for the stack:
// umm... What?
This is only one benefit, and others have mentioned other benefits, but this is a rather nice one. With the heap, we can at least try to make our allocated space larger. With the stack, we're stuck with what we have. If we want room to grow, we have to declare it all up front, and we all know how it stinks to see this:
char username[MAX_BUF_SIZE];
The most obvious rationale for using the heap is when you call a function and need something of unknown length returned. Sometimes the caller may pass a memory block and size to the function, but at other times this is just impractical, especially if the returned stuff is complex (e.g. a collection of different objects with pointers flying around, etc.).
Size limits are a huge dealbreaker in a lot of cases. The stack is usually measured in the low megabytes or even kilobytes (that's for everything on the stack), whereas all modern PCs allow you a few gigabytes of heap. So if you're going to be using a large amount of data, you absolutely need the heap.
just to add
you can use alloca to allocate memory on the stack, but again memory on the stack is limited and also the space exists only during the function execution only.
that does not mean everything should be allocated on the heap. like all design decisions this is also somewhat difficult, a "judicious" combination of both should be used.
Besides manual control of object's lifetime (which you mentioned), the other reasons for using heap would include:
Run-time control over object's size (both initial size and it's "later" size, during the program's execution).
For example, you can allocate an array of certain size, which is only known at run time.
With the introduction of VLA (Variable Length Arrays) in C99, it became possible to allocate arrays of fixed run-time size without using heap (this is basically a language-level implementation of 'alloca' functionality). However, in other cases you'd still need heap even in C99.
Run-time control over the total number of objects.
For example, when you build a binary tree stucture, you can't meaningfully allocate the nodes of the tree on the stack in advance. You have to use heap to allocated them "on demand".
Low-level technical considerations, as limited stack space (others already mentioned that).
When you need a large, say, I/O buffer, even for a short time (inside a single function) it makes more sense to request it from the heap instead of declaring a large automatic array.
Stack variables (often called 'automatic variables') is best used for things you want to always be the same, and always be small.
int x;
char foo[32];
Are all stack allocations, These are fixed at compile time too.
The best reason for heap allocation is that you cant always know how much space you need. You often only know this once the program is running. You might have an idea of limits but you would only want to use the exact amount of space required.
If you had to read in a file that could be anything from 1k to 50mb, you would not do this:-
int readdata ( FILE * f ) {
char inputdata[50*1024*1025];
...
return x;
}
That would try to allocate 50mb on the stack, which would usually fail as the stack is usually limited to 256k anyway.
The stack and heap share the same "open" memory space and will have to eventually come to a point where they meet, if you use the entire segment of memory. Keeping the balance between the space that each of them use will have amortized cost later for allocation and de-allocation of memory a smaller asymptotic value.

Resources