What is the origin of the term "heap" for the free store? - heap-memory

I am trying to find the official (or a good enough) reason that the free store is commonly referred to as the heap.
Except for the fact that it grows from the end of the data segment, I can't really think of a good reason, especially since it has very little to do with the heap data structure.
Note: Quite a few people mentioned that it's just a whole bunch of things that are kind of unorganized. But to me the term heap physically means a bunch of things that are physically dependent on one another. You pull one out from underneath, everything else collapses on it, etc. In other words, to me heap sounds loosely organized (e.g., latest things are on top). This is not exactly how a heap actually works on most computers, though if you put stuff towards the beginning of the heap and then grew it I guess it could work.

Knuth rejects the term "heap" used as a synonym for the free memory store.
Several authors began about 1975 to call the pool of available memory a "heap." But in the present series of books, we will use that word only in its more traditional sense related to priority queues. (Fundamental Algorithms, 3rd ed., p. 435)

For what it's worth, ALGOL68, which predated C, had an actual keyword heap that was used to allocate space for a variable from the "global heap", as opposed to loc which allocated it on the stack.
But I suspect the use may be simply because there's no real structure to it. By that, I mean that you're not guaranteed to get the best-fit block or the next block in memory, rather you'll take what you're given depending on the whims of the allocation strategy.
Like most names, it was probably thought of by some coder who just needed a name.
I've often heard of it referred to as an arena sometimes (an error message from many moons ago saying that the "memory arena was corrupted"). This brings up images of chunks of memory doing battle in gladiatorial style inside your address space (a la the movie Tron).
Bottom line, it's just a name for an area of memory, you could just as well call it the brk-pool or sbrk-pool (after the calls to modify it) or any of a dozen other names.
I remember when we were putting together comms protocol stacks even before the OSI 7-layer model was a twinkle in someone's eye, we used a layered approach and had to come up with names at each layer for the blocks.
We used blocks, segments, chunks, sections and various other names, all which simply indicated a fixed length thing. It may be that heap had a similar origin:
Carol: "Hey, Bob, what's a good name for a data structure that just doles out random bits of memory from a big area?"
Bob: "How about 'steaming pile of horse dung'?"
Carol: "Thanks, Bob, I'll just opt for 'heap', if that's okay with you. By the way, how are things going with the divorce?"

It's named a heap for the contrasting image it conjures up to that of a stack.
In a stack of items, items sit one on top of the other in the order they were placed there, and you can only remove the top one (without toppling the whole thing over).
In a heap, there is no particular order to the way items are placed. You can reach in and remove items in any order because there is no clear 'top' item.
It does a fairly good job of describing the two ways of allocating and freeing memory in a stack and a heap. Yum!

I don't know if it is the first but ALGOL 68 had keyword 'heap' for allocating memory from the free memory heap.
The first use of 'heap' is likely to be found somewhere between 1958, when John McCarthy invented garbage collection for Lisp and the development of ALGOL 68

It's unordered. A "heap" of storage.
It isn't an ordered heap data structure.

It's a large, tangled pile of disorganized junk, in which nothing can be found unless you know exactly where to look.

I always figured it was kind of a clever way of describing it in comparison to the stack. The heap is like an overgrown, disorganized stack.

Just like java and javascript, heap (free store) and heap (data structure) are named such to insure that our fraternity is inpenetrable to outsiders, thus keeping our jobs secure.

Heap usually has these features:
You can't predict the address of a block malloc() returns.
You have no idea of how the address returned by the next malloc() is related to the address of the previous one.
You can malloc() blocks of variable sizes.
So you have something that can store a bunch of blocks of variable sizes and return them in some generally unpredictable order. How else would you call it if not "heap"?

Related

Is it a memory leak in C?

An answer by user:surendra nath on this post stated that the code posted by OP has a memory leak, whereas, most of the time I've come across that a leak occurs when we fail to free a dynamically allocated memory region but I couldn't see any dynamic allocation in OPs code.
He quoted this wiki definition, here.
And another definition of Memory Leak from user:artificial idiot to post was:
Subtle definition: Failure to release reachable memory which is no
longer needed for your program to function correctly. This is nearly
impossible to detect with automated tools or by programmers who are
not familiar with the code. While technically it is not a leak, it has
the same implications as the naive one. This is not my own idea only.
You can come across projects that are written in a garbage collected
language but still mention fixing memory leaks in their changelogs.
So, my question:
Is the point stated by "surendra nath" on OPs code, can be said as a memory leak considering the definition of a memory leak given by "artificial idiot" or wikipedia? And if so, then why?
Does the memory leak in C always refer to the failure to free a previously dynamically allocated memory?
P.S. - I don't know if the definitions from wiki & "artificial idiot" have the same meaning, since wiki definitions are sometimes too broad to understand.
The issue that artificial idiot raises in The Best Memory Leak Definition can sometimes be a concern. Suppose you have a program that does something like this:
Construct an array of 1 million dynamically-allocated objects.
Do something with that array.
Construct another array of 1 million dynamically-allocated objects (with no references to anything in the first array).
Do something with that array.
Free the second array.
Free the first array.
During steps 3-5, the first array is still using memory, even though none of its data is needed. As a result, the program as a whole uses twice as much total memory as it really needs. It should free the first array after step 2 if it no longer needs it -- then its memory could be reused for the second array.
This is the kind of memory leak that can occur even in garbage-collected languages -- if the variable holding the reference to the first array doesn't get reassigned or go out of scope, it will keep all that data from being GC'ed.
A sufficiently clever optimizer could theoretically address this automatically. If it can analyze the control and data flow, and determine that the first array is never used after step 2, it could reorder step 6 to that place. But in many cases it may be difficult for the compiler to determine this.
Writing modular code will often avoid problems like this. If you create separate functions for processing each of the arrays, the functions should allocate them, use them, and free them.

C - Design your own free( ) function

Today, I appeared for an interview and the interviewer asked me this,
Tell me the steps how will you design your own free( ) function for
deallocate the allocated memory.
How can it be more efficient than C's default free() function ? What can you conclude ?
I was confused, couldn't think of the way to design.
What do you think guys ?
EDIT : Since we need to know about how malloc() works, can you tell me the steps to write our own malloc() function
That's actually a pretty vague question, and that's probably why you got confused. Does he mean, given an existing malloc implementation, how would you go about trying to develop a more efficient way to free the underlying memory? Or was he expecting you to start discussing different kinds of malloc implementations and their benefits and problems? Did he expect you to know how virtual memory functions on the x86 architecture?
Also, by more efficient, does he mean more space efficient or more time efficient? Does free() have to be deterministic? Does it have to return as much memory to the OS as possible because it's in a low-memory, multi-tasking environment? What's our criteria here?
It's hard to say where to start with a vague question like that, other than to start asking your own questions to get clarification. After all, in order to design your own free function, you first have to know how malloc is implemented. So chances are, the question was really about whether or not you knew anything about how malloc can be implemented.
If you're not familiar with the internals of memory management, the easiest way to get started with understanding how malloc is implemented is to first write your own.
Check out this IBM DeveloperWorks article called "Inside Memory Management" for starters.
But before you can write your own malloc/free, you first need memory to allocate/free. Unfortunately, in a protected mode OS, you can't directly address the memory on the machine. So how do you get it?
You ask the OS for it. With the virtual memory features of the x86, any piece of RAM or swap memory can be mapped to a memory address by the OS. What your program sees as memory could be physically fragmented throughout the entire system, but thanks to the kernel's virtual memory manager, it all looks the same.
The kernel usually provides system calls that allow you to map in additional memory for your process. On older UNIX OS's this was usually brk/sbrk to grow heap memory onto the edge of your process or shrink it off, but a lot of systems also provide mmap/munmap to simply map a large block of heap memory in. It's only once you have access to a large, contiguous looking block of memory that you need malloc/free to manage it.
Once your process has some heap memory available to it, it's all about splitting it into chunks, with each chunk containing its own meta information about its size and position and whether or not it's allocated, and then managing those chunks. A simple list of structs, each containing some fields for meta information and a large array of bytes, could work, in which case malloc has to run through the list until if finds a large enough unallocated chunk (or chunks it can combine), and then map in more memory if it can't find a big enough chunk. Once you find a chunk, you just return a pointer to the data. free() can then use that pointer to reverse back a few bytes to the member fields that exist in the structure, which it can then modify (i.e. marking chunk.allocated = false;). If there's enough unallocated chunks at the end of your list, you can even remove them from the list and unmap or shrink that memory off your process's heap.
That's a real simple method of implementing malloc though. As you can imagine, there's a lot of possible ways of splitting your memory into chunks and then managing those chunks. There's as many ways as there are data structures and algorithms. They're all designed for different purposes too, like limiting fragmentation due to small, allocated chunks mixed with small, unallocated chunks, or ensuring that malloc and free run fast (or sometimes even more slowly, but predictably slowly). There's dlmalloc, ptmalloc, jemalloc, Hoard's malloc, and many more out there, and many of them are quite small and succinct, so don't be afraid to read them. If I remember correctly, "The C Programming Language" by Kernighan and Ritchie even uses a simple malloc implementation as one of their examples.
You can't blindly design free() without knowing how malloc() works under the hood because your implementation of free() would need to know how to manipulate the bookkeeping data and that's impossible without knowing how malloc() is implemented.
So an unswerable question could be how you would design malloc() and free() instead which is not a trivial question but you could answer it partially for example by proposing some very simple implementation of a memory pool that would not be equivalent to malloc() of course but would indicate your presence of knowledge.
One common approach when you only have access to user space (generally known as memory pool) is to get a large chunk of memory from the OS on application start-up. Your malloc needs to check which areas of the right size of that pool are still free (through some data structure) and hand out pointers to that memory. Your free needs to mark the memory as free again in the data structure and possibly needs to check for fragmentation of the pool.
The benefits are that you can do allocation in nearly constant time, the drawback is that your application consumes more memory than actually is needed.
Tell me the steps how will you design your own free( ) function for deallocate the allocated memory.
#include <stdlib.h>
#undef free
#define free(X) my_free(X)
inline void my_free(void *ptr) { }
How can it be more efficient than C's default free() function ?
It is extremely fast, requiring zero machine cycles. It also makes use-after-free bugs go away. It's a very useful free function for use in programs which are instantiated as short-lived batch processes; it can usefully be deployed in some production situations.
What can you conclude ?
I really want this job, but in another company.
Memory usage patterns could be a factor. A default implementation of free can't assume anything about how often you allocate/deallocate and what sizes you allocate when you do.
For example, if you frequently allocate and deallocate objects that are of similar size, you could gain speed, memory efficiency, and reduced fragmentation by using a memory pool.
EDIT: as sharptooth noted, only makes sense to design free and malloc together. So the first thing would be to figure out how malloc is implemented.
malloc and free only have a meaning if your app is to work on top of an OS. If you would like to write your own memory management functions you would have to know how to request the memory from that specific OS or you could reserve the heap memory right away using existing malloc and then use your own functions to distribute/redistribute the allocated memory through out your app
There is an architecture that malloc and free are supposed to adhere to -- essentially a class architecture permitting different strategies to coexist. Then the version of free that is executed corresponds to the version of malloc used.
However, I'm not sure how often this architecture is observed.
The knowledge of working of malloc() is necessary to implement free(). You can find a implementation of malloc() and free() using the sbrk() system call in K&R The C Programming Language Chapter 8, Section 8.7 "Example--A Storage Allocator" pp.185-189.

Efficient memory reallocation question

Let's say I have a program(C++, for example) that allocates multiple objects, never bigger than a given size(let's call it MAX_OBJECT_SIZE).
I also have a region(I'll call it a "page") on the heap(allocated with, say, malloc(REGION_SIZE), where REGION_SIZE >= MAX_OBJECT_SIZE).
I keep reserving space in that page until the filled space equals PAGE_SIZE(or at least gets > PAGE_SIZE - MAX_OBJECT_SIZE).
Now, I want to allocate more memory. Obviously my previous "page" won't be enough. So I have at least two options:
Use realloc(page, NEW_SIZE), where NEW_SIZE > PAGE_SIZE;
Allocate a new "page"(page2) and put the new object there.
If I wanted to have a custom allocate function, then:
Using the first method, I'd see how much I had filled, and then put my new object there(and add the size of the object to my filled memory variable).
Using the second method, I'd have a list(vector? array?) of pages, then look for the current page, and then use a method similar to 1 on the selected page.
Eventually, I'd need a method to free memory too, but I can figure out that part.
So my question is: What is the most efficient way to solve a problem like this? Is it option 1, option 2 or some other option I haven't considered here? Is a small benchmark needed/enough to draw conclusions for real-world situations?
I understand that different operations may perform differently, but I'm looking for an overall metric.
In my experience option 2 is much easier to work with has minimal overhead. Realloc does not guarantee it will increase the size of existing memory. And in practice it almost never does. If you use it you will need to go back and remap all of the old objects. That would require that you remember where every object allocated was... That can be a ton over overhead.
But it's hard to qualify "most efficient" without knowing exactly what metrics you use.
This is the memory manager I always use. It works for the entire application not just one object.
allocs:
for every allocation determine the size of the object allocated.
1 look at a link list of frees for objects of that size to see if anything has been freed if so take the first free
2 look for in a look up table and if not found
2.1 allocate an array of N objects of the size being allocated.
3 return the next free object of the desired size.
3.1 if the array is full add a new page.
N objects can be programmer tunned. If you know you have a million 16 byte objects you might want that N to be slightly higher.
for objects over some size X, do not keep an array simply allocate a new object.
frees:
determine the size of the object, add it to the link list of frees.
if the size of the object allocated is less than the size of a pointer the link list does not need to incur any memory overhead. simply use the already allocated memory to store the nodes.
The problem with this method is memory is never returned to the operating system until the application has exited or the programmer decides to defragment the memory. defragmenting is another post. it can be done.
It is not clear from your question why you need to allocate a big block of memory in advance rather than allocating memory for each object as needed. I'm assuming you are using it as a contiguous array. Otherwise, it would make more sense to malloc the memory of each object as it is needed.
If it is indeed acting as an array,malloc-ing another block gives you another chunk of memory that you have to access via another pointer (in your case page2). Thus it is no longer on contiguous block and you cannot use the two blocks as part of one array.
realloc, on the other hand, allocates one contiguous block of memory. You can use it as a single array and do all sorts of pointer arithmetic not possible if there are separate blocks. realloc is also useful when you actually want to shrink the block you are working with, but that is probably not what you are seeking to do here.
So, if you are using this as an array, realloc is basically the better option. Otherwise, there is nothing wrong with malloc. Actually, you might want to use malloc for each object you create rather than having to keep track of and micro-manage blocks of memory.
You have not given any details on what platform you are experimenting. There are some performance differences for realloc between Linux and Windows, for example.
Depending on the situation, realloc might have to allocate a new memory block if it can't grow the current one and copy the old memory to the new one, which is expensive.
If you don't really need a contiguous block of memory you should avoid using realloc.
My sugestion would be to use the second approach, or use a custom allocator (you could implement a simple buddy allocator [2]).
You could also use more advanced memory allocators, like
APR memory pools
Google's TCMalloc
In the worst case, option 1 could cause a "move" of the original memory, that is an extrawork to be done. If the memory is not moved, anyway the "extra" size is initialized, which is other work too. So realloc would be "defeated" by the malloc method, but to say how much, you should do tests (and I think there's a bias on how the system is when the memory requests are done).
Depending on how many times you expect the realloc/malloc have to be performed, it could be an useful idea or an unuseful one. I would use malloc anyway.
The free strategy depends on the implementation. To free all the pages as whole, it is enough to "traverse" them; instead of an array, I would use linked "pages": add sizeof(void *) to the "page" size, and you can use the extra bytes to store the pointer to the next page.
If you have to free a single object, located anywhere in one of the pages, it becomes a little bit more complex. My idea is to keep a list of non-sequential free "block"/"slot" (suitable to hold any object). When a new "block" is requested, first you pop a value from this list; if it is empty, then you get the next "slot" in the last in use page, and eventually a new page is triggered. Freeing an object, means just to put the empty slot address in a stack/list (whatever you prefer to use).
In linux (and probably other POSIX systems) there is a third possibility, that is to use a memory mapped region with shm_open. Such a region is initialized by zeroes once you access it, but AFAIK pages that you never access come with no cost, if it isn't just the address-range in virtual memory that you reserve. So you could just reserve a large chunk of memory at the beginning (more than you ever would need) of your execution and then fill it incrementally from the start.
What is the most efficient way to solve a problem like this? Is it option 1, option 2 or some other option I haven't considered here? Is a small benchmark needed/enough to draw conclusions for real-world situations?
Option 1. For it to be efficient, NEW_SIZE has to depend on old size non-linearly. Otherwise you risk running into O(n^2) performance of realloc() due to the redundant copying. I generally do new_size = old_size + old_size/4 (increase by 25% percent) as theoretically best new_size = old_size*2 might in worst case reserve too much unused memory.
Option 2. It should be more optimal as most modern OSs (thanks to C++'s STL) are already well optimized for flood of small memory allocations. And small allocations have lesser chance to cause memory fragmentation.
In the end it all depends how often you allocate the new objects and how do you handle freeing. If you allocate a lot with #1 you would have some redundant copying when expanding but freeing is dead simple since all objects are in the same page. If you would need to free/reuse the objects, with #2 you would be spending some time walking through the list of pages.
From my experience #2 is better, as moving around large memory blocks might increase rate of heap fragmentation. The #2 is also allows to use pointers as objects do not change their location in memory (though for some applications I prefer to use pool_id/index pairs instead of raw pointers). If walking through pages becomes a problem later, it can be too optimized.
In the end you should also consider option #3: libc. I think that libc's malloc() is efficient enough for many many tasks. Please test it before investing more of your time. Unless you are stuck on some backward *NIX, there should be no problem using malloc() for every smallish object. I used custom memory management only when I needed to put objects in exotic places (e.g. shm or mmap). Keep in mind the multi-threading too: malloc()/realloc()/free() generally are already optimized and MT-ready; you would have to reimplement the optimizations anew to avoid threads being constantly colliding on memory management. And if you want to have memory pools or zones, there are already bunch of libraries for that too.

What are alternatives to malloc() in C?

I am writing C for an MPC 555 board and need to figure out how to allocate dynamic memory without using malloc.
Typically malloc() is implemented on Unix using sbrk() or mmap(). (If you use the latter, you want to use the MAP_ANON flag.)
If you're targetting Windows, VirtualAlloc may help. (More or less functionally equivalent to anonymous mmap().)
Update: Didn't realize you weren't running under a full OS, I somehow got the impression instead that this might be a homework assignment running on top of a Unix system or something...
If you are doing embedded work and you don't have a malloc(), I think you should find some memory range that it's OK for you to write on, and write your own malloc(). Or take someone else's.
Pretty much the standard one that everybody borrows from was written by Doug Lea at SUNY Oswego. For example glibc's malloc is based on this. See: malloc.c, malloc.h.
You might want to check out Ralph Hempel's Embedded Memory Manager.
If your runtime doesn't support malloc, you can find an open source malloc and tweak it to manage a chunk of memory yourself.
malloc() is an abstraction that is use to allow C programs to allocate memory without having to understand details about how memory is actually allocated from the operating system. If you can't use malloc, then you have no choice other than to use whatever facilities for memory allocation that are provided by your operating system.
If you have no operating system, then you must have full control over the layout of memory. At that point for simple systems the easiest solution is to just make everything static and/or global, for more complex systems, you will want to reserve some portion of memory for a heap allocator and then write (or borrow) some code that use that memory to implement malloc.
An answer really depends on why you might need to dynamically allocate memory. What is the system doing that it needs to allocate memory yet cannot use a static buffer? The answer to that question will guide your requirements in managing memory. From there, you can determine which data structure you want to use to manage your memory.
For example, a friend of mine wrote a thing like a video game, which rendered video in scan-lines to the screen. That team determined that memory would be allocated for each scan-line, but there was a specific limit to how many bytes that could be for any given scene. After rendering each scan-line, all the temporary objects allocated during that rendering were freed.
To avoid the possibility of memory leaks and for performance reasons (this was in the 90's and computers were slower then), they took the following approach: They pre-allocated a buffer which was large enough to satisfy all the allocations for a scan-line, according to the scene parameters which determined the maximum size needed. At the beginning of each scan-line, a global pointer was set to the beginning of the scan line. As each object was allocated from this buffer, the global pointer value was returned, and the pointer was advanced to the next machine-word-aligned position following the allocated amount of bytes. (This alignment padding was including in the original calculation of buffer size, and in the 90's was four bytes but should now be 16 bytes on some machinery.) At the end of each scan-line, the global pointer was reset to the beginning of the buffer.
In "debug" builds, there were two scan buffers, which were protected using virtual memory protection during alternating scan lines. This method detects stale pointers being used from one scan-line to the next.
The buffer of scan-line memory may be called a "pool" or "arena" depending on whome you ask. The relevant detail is that this is a very simple data structure which manages memory for a certain task. It is not a general memory manager (or, properly, "free store implementation") such as malloc, which might be what you are asking for.
Your application may require a different data structure to keep track of your free storage. What is your application?
You should explain why you can't use malloc(), as there might be different solutions for different reasons, and there are several reasons why it might be forbidden or unavailable on small/embedded systems:
concern over memory fragmentation. In this case a set of routines that allocate fixed size memory blocks for one or more pools of memory might be the solution.
the runtime doesn't provide a malloc() - I think most modern toolsets for embedded systems do provide some way to link in a malloc() implementation, but maybe you're using one that doesn't for whatever reason. In that case, using Doug Lea's public domain malloc might be a good choice, but it might be too large for your system (I'm not familiar with the MPC 555 off the top of my head). If that's the case, a very simple, custom malloc() facility might be in order. It's not too hard to write, but make sure you unit test the hell out of uit because it's also easy to get details wrong. For example, I have a set of very small routines that use a brain dead memory allocation strategy using blocks on a free list (the allocator can be compile-time configured for first, best or last fit). I give it an array of char at initialization, and subsequent allocation calls will split free blocks as necessary. It's nowhere near as sophisticated as Lea's malloc(), but it's pretty dang small so for simple uses it'll do the trick.
many embedded projects forbid the use of dynamic memory allocation - in this case, you have to live with statically allocated structures
Write your own. Since your allocator will probably be specialized to a few types of objects, I recommend the Quick Fit scheme developed by Bill Wulf and Charles Weinstock. (I have not been able to find a free copy of this paper, but many people have access to the ACM digital library.) The paper is short, easy to read, and well suited to your problem.
If you turn out to need a more general allocator, the best guide I have found on the topic of programming on machines with fixed memory is Donald Knuth's book The Art of Computer Programming, Volume 1. If you want examples, you can find good ones in Don's epic book-length treatment of the source code of TeX, TeX: The Program.
Finally, the undergraduate textbook by Bryant and O'Hallaron is rather expensive, but it goes through the implementation of malloc in excruciating detail.
Write your own. Preallocate a big chunk of static RAM, then write some functions to grab and release chunks of it. That's the spirit of what malloc() does, except that it asks the OS to allocate and deallocate memory pages dynamically.
There are a multitude of ways of keeping track of what is allocated and what is not (bitmaps, used/free linked lists, binary trees, etc.). You should be able to find many references with a few choice Google searches.
malloc() and its related functions are the only game in town. You can, of course, roll your own memory management system in whatever way you choose.
If there are issues allocating dynamic memory from the heap, you can try allocating memory from the stack using alloca(). The usual caveats apply:
The memory is gone when you return.
The amount of memory you can allocate is dependent on the maximum size of your stack.
You might be interested in: liballoc
It's a simple, easy-to-implement malloc/free/calloc/realloc replacement which works.
If you know beforehand or can figure out the available memory regions on your device, you can also use their libbmmm to manage these large memory blocks and provide a backing-store for liballoc. They are BSD licensed and free.
FreeRTOS contains 3 examples implementations of memory allocation (including malloc()) to achieve different optimizations and use cases appropriate for small embedded systems (AVR, ARM, etc). See the FreeRTOS manual for more information.
I don't see a port for the MPC555, but it shouldn't be difficult to adapt the code to your needs.
If the library supplied with your compiler does not provide malloc, then it probably has no concept of a heap.
A heap (at least in an OS-less system) is simply an area of memory reserved for dynamic memory allocation. You can reserve such an area simply by creating a suitably sized statically allocated array and then providing an interface to provide contiguous chunks of this array on demand and to manage chunks in use and returned to the heap.
A somewhat neater method is to have the linker allocate the heap from whatever memory remains after stack and static memory allocation. That way the heap is always automatically as large as it possibly can be, allowing you to use all available memory simply. This will require modification of the application's linker script. Linker scripts are specific to the particular toolchain, and invariable somewhat arcane.
K&R included a simple implementation of malloc for example.

Can I write a C application without using the heap?

I'm experiencing what appears to be a stack/heap collision in an embedded environment (see this question for some background).
I'd like to try rewriting the code so that it doesn't allocate memory on the heap.
Can I write an application without using the heap in C? For example, how would I use the stack only if I have a need for dynamic memory allocation?
I did it once in an embedded environment where we were writing "super safe" code for biomedical machines.
Malloc()s were explicitly forbidden, partly for the resources limits and for the unexpected behavior you can get from dynamic memory (look for malloc(), VxWorks/Tornado and fragmentation and you'll have a good example).
Anyway, the solution was to plan in advance the needed resources and statically allocate the "dynamic" ones in a vector contained in a separate module, having some kind of special purpose allocator give and take back pointers. This approach avoided fragmentation issues altogether and helped getting finer grained error info, if a resource was exhausted.
This may sound silly on big iron, but on embedded systems, and particularly on safety critical ones, it's better to have a very good understanding of which -time and space- resources are needed beforehand, if only for the purpose of sizing the hardware.
Funnily enough, I once saw a database application which completly relied on static allocated memory. This application had a strong restriction on field and record lengths. Even the embedded text editor (I still shiver calling it that) was unable to create texts with more than 250 lines of text. That solved some question I had at this time: why are only 40 records allowed per client?
In serious applications you can not calculate in advance the memory requirements of your running system. Therefore it is a good idea to allocate memory dynamically as you need it. Nevertheless it is common case in embedded systems to preallocate memory you really need to prevent unexpected failures due to memory shortage.
You might allocate dynamic memory on the stack using the alloca() library calls. But this memory is tight to the execution context of the application and it is a bad idea to return memory of this type the caller, because it will be overwritten by later subroutine calls.
So I might answer your question with a crisp and clear "it depends"...
You can use alloca() function that allocates memory on the stack - this memory will be freed automatically when you exit the function. alloca() is GNU-specific, you use GCC so it must be available.
See man alloca.
Another option is to use variable-length arrays, but you need to use C99 mode.
It's possible to allocate a large amount of memory from the stack in main() and have your code sub-allocate it later on. It's a silly thing to do since it means your program is taking up memory that it doesn't actually need.
I can think of no reason (save some kind of silly programming challenge or learning exercise) for wanting to avoid the heap. If you've "heard" that heap allocation is slow and stack allocation is fast, it's simply because the heap involves dynamic allocation. If you were to dynamically allocate memory from a reserved block within the stack, it would be just as slow.
Stack allocation is easy and fast because you may only deallocate the "youngest" item on the stack. It works for local variables. It doesn't work for dynamic data structures.
Edit: Having seen the motivation for the question...
Firstly, the heap and the stack have to compete for the same amount of available space. Generally, they grow towards each other. This means that if you move all your heap usage into the stack somehow, then rather than stack colliding with heap, the stack size will just exceed the amount of RAM you have available.
I think you just need to watch your heap and stack usage (you can grab pointers to local variables to get an idea of where the stack is at the moment) and if it's too high, reduce it. If you have lots of small dynamically-allocated objects, remember that each allocation has some memory overhead, so sub-allocating them from a pool can help cut down on memory requirements. If you use recursion anywhere think about replacing it with an array-based solution.
You can't do dynamic memory allocation in C without using heap memory. It would be pretty hard to write a real world application without using Heap. At least, I can't think of a way to do this.
BTW, Why do you want to avoid heap? What's so wrong with it?
1: Yes you can - if you don't need dynamic memory allocation, but it could have a horrible performance, depending on your app. (i.e. not using the heap won't give you better apps)
2: No I don't think you can allocate memory dynamically on the stack, since that part is managed by the compiler.
Yes, it's doable. Shift your dynamic needs out of memory and onto disk (or whatever mass storage you have available) -- and suffer the consequent performance penalty.
E.g., You need to build and reference a binary tree of unknown size. Specify a record layout describing a node of the tree, where pointers to other nodes are actually record numbers in your tree file. Write routines that let you add to the tree by writing an additional record to file, and walk the tree by reading a record, finding its child as another record number, reading that record, etc.
This technique allocates space dynamically, but it's disk space, not RAM space. All the routines involved can be written using statically allocated space -- on the stack.
Embedded applications need to be careful with memory allocations but I don't think using the stack or your own pre-allocated heap is the answer. If possible, allocate all required memory (usually buffers and large data structures) at initialization time from a heap. This requires a different style of program than most of us are used to now but it's the best way to get close to deterministic behavior.
A large heap that is sub-allocated later would still be subject to running out of memory and the only thing to do then is have a watchdog kick in (or similar action). Using the stack sounds appealing but if you're going to allocate large buffers/data structures on the stack you have to be sure that the stack is large enough to handle all possible code paths that your program could execute. This is not easy and in the end is similar to a sub-allocated heap.
My foremost concern is, does abolishing the heap really helps?
Since your wish of not using heap stems from stack/heap collision, assuming the start of stack and start of heap are set properly (e.g. in the same setting, small sample programs have no such collision problem), then the collision means the hardware has not enough memory for your program.
Not using heap, one may indeed save some waste space from heap fragmentation; but if your program does not use the heap for a bunch of irregular large size allocation, the waste there are probably not much. I will see your collision problem more of an out of memory problem, something not fixable by merely avoiding heap.
My advices on tackling this case:
Calculate the total potential memory usage of your program. If it is too close to but not yet exceeding the amount of memory you prepared for the hardware, then you may
Try using less memory (improve the algorithms) or using the memory more efficiently (e.g. smaller and more-regular-sized malloc() to reduce heap fragmentation); or
Simply buy more memory for the hardware
Of course you may try pushing everything into pre-defined static memory space, but it is very probable that it will be stack overwriting into static memory this time. So improve the algorithm to be less memory-consuming first and buy more memory the second.
I'd attack this problem in a different way - if you think the the stack and heap are colliding, then test this by guarding against it.
For example (assuming a *ix system) try mprotect()ing the last stack page (assuming a fixed size stack) so it is not accessible. Or - if your stack grows - then mmap a page in the middle of the stack and heap. If you get a segv on your guard page you know you've run off the end of the stack or heap; and by looking at the address of the seg fault you can see which of the stack & heap collided.
It is often possible to write your embedded application without using dynamic memory allocation. In many embedded applications the use of dynamic allocation is deprecated because of the problems that can arise due to heap fragmentation. Over time it becomes highly likely that there will not be a suitably sized region of free heap space to allow the memory to be allocated and unless there is a scheme in place to handle this error the application will crash. There are various schemes to get around this, one being to always allocate fixed size objects on the heap so that a new allocation will always fit into a freed memory area. Another to detect the allocation failure and to perform a defragmentation process on all of the objects on the heap (left as an exercise for the reader!)
You do not say what processor or toolset you are using but in many the static, heap and stack are allocated to separate defined segments in the linker. If this is the case then it must be that your stack is growing outside the memory space that you have defined for it. The solution that you require is to reduce the heap and/or static variable size (assuming that these two are contiguous) so that there is more available for the stack. It may be possible to reduce the heap unilaterally although this can increase the probability of fragmentation problems. Ensuring that there are no unnecessary static variables will free some space at the cost of possibly increasing the stack usage if the variable is made auto.

Resources