Memory pools implementation in C

I am looking for a good memory pool implementation in C.
It should include the following:
Anti fragmentation.
Be super fast :)
Ability to "bundle" several allocations of different sizes under some identifier and delete all the allocations with the given identifier.
Thread safe

I think the excellent talloc, developed as part of Samba, might be what you're looking for. The part I find most interesting is that any pointer returned from talloc is a valid memory context. Their example is:
struct foo *X = talloc(mem_ctx, struct foo);
X->name = talloc_strdup(X, "foo");
// ...
talloc_free(X); // frees memory for both X and X->name
In response to your particular points:
(1) Not sure what anti-fragmentation is in this case. In C you're not going to get compacting garbage collection anyway, so I think your choices are somewhat limited.
(2) It advertises being only 4% slower than plain malloc(3), which is quite fast.
(3) See example above.
(4) It is thread safe as long as different threads use different contexts & the underlying malloc is thread safe.

Have you looked into
nedmalloc http://www.nedprod.com/programs/portable/nedmalloc/
ptmalloc http://www.malloc.de/en/
Both leverage a memory pool but keep it mostly transparent to the user.
In general, you will find best performance in your own custom memory pool (you can optimize for your pattern). I ended up writing a few for different access patterns.

For memory pools that have been thoroughly tried and tested you may want to just use the APR ones:
http://apr.apache.org/docs/apr/1.4/apr__pools_8h.html
Mind you, single pools are not thread safe; you'll have to handle that yourself.

bget is another choice. It's well tested and production ready.
http://www.fourmilab.ch/bget/

Related

memory allocation/deallocation for embedded devices

Currently we use the standard malloc/free functions on Linux for memory allocation/de-allocation in our C-based embedded application. I have heard that this causes memory fragmentation as the heap grows and shrinks with repeated allocation and de-allocation, which results in performance degradation. Other programming languages with efficient garbage collection solve this issue by freeing memory when it is no longer in use.
Are there any alternative approaches which would solve this issue in C-based embedded programs?
You may take a look at a solution called memory pool allocation.
See: Memory pools implementation in C
Yes, there's an easy solution: don't use dynamic memory allocation outside of initialization.
It is common (in my experience) in embedded systems to only allow calls to malloc when a program starts. This is usually done by convention: there's nothing in C to enforce it, although you can create your own wrapper for malloc to do so. This requires more work to analyze what memory your program could possibly use, since you have to allocate it all at once. The benefit you get, however, is a complete understanding of what memory your program uses.
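A minimal sketch of such a wrapper, assuming a single-threaded start-up phase. The names init_alloc() and init_alloc_lock() are made up for illustration and are not part of any standard API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names: init_alloc() and init_alloc_lock() are not a standard API. */
static bool init_phase_over = false;

/* Allocate during start-up only; returns NULL once the init phase is locked. */
void *init_alloc(size_t size)
{
    assert(size != 0);        /* malloc(0) is implementation-defined, so reject it in debug builds */
    if (init_phase_over)
        return NULL;          /* a stricter build might abort() here instead */
    return malloc(size);
}

/* Call once, when initialization is finished. */
void init_alloc_lock(void)
{
    init_phase_over = true;
}
```

If the rest of the code base only ever calls init_alloc(), any allocation attempted after init_alloc_lock() shows up immediately as a NULL return instead of as fragmentation months later.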
In some cases this is fairly straightforward, in particular if your system has enough memory to allocate everything it could possibly need all at once. In severely memory-limited systems, however, you're left with managing the memory yourself. I've seen this done by writing "custom allocators" from which you allocate and free memory. I'll provide an example.
Let's say you're implementing some mathematical program that needs lots of big matrices (not terribly huge, but for example 1000x1000 floats). Your system may not have the memory to allocate many of these matrices, but if you can allocate at least one of them, you could create a pool of memory used for matrix objects, and every time you need a matrix you grab memory from that pool, and when you're done with it you return it to the pool. This is easy if you can return them in the same order you got them in, meaning the memory pool works just like a stack. If this isn't the case, perhaps you could just clear the entire pool at the end of each "iteration" (assuming this math system is periodic).
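A rough sketch of the stack-like pool described above, assuming a C11 compiler and a statically reserved buffer. POOL_BYTES and the pool_* names are illustrative, and the size would have to be tuned to hold the matrices in question:

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BYTES (64u * 1024u)   /* assumed size, tune for the target */

static _Alignas(max_align_t) unsigned char pool[POOL_BYTES];
static size_t pool_top = 0;        /* offset of the first free byte */

/* Take `size` bytes from the top of the stack, or NULL if the pool is full. */
void *pool_push(size_t size)
{
    const size_t align = _Alignof(max_align_t);
    size_t rounded = (size + align - 1) / align * align;
    if (rounded < size || rounded > POOL_BYTES - pool_top)
        return NULL;               /* overflow or not enough room */
    void *p = &pool[pool_top];
    pool_top += rounded;
    return p;
}

/* Return memory in LIFO order: `p` and everything pushed after it are freed. */
void pool_pop_to(void *p)
{
    size_t off = (size_t)((unsigned char *)p - pool);
    assert(off <= pool_top);
    pool_top = off;
}

/* End of an "iteration": drop everything at once. */
void pool_reset(void)
{
    pool_top = 0;
}
```

pool_pop_to() covers the case where objects are returned in the reverse order they were taken, and pool_reset() covers the "clear the entire pool at the end of each iteration" case.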
With more detail about what exactly you're trying to implement I could provide more relevant/specific examples.
Edit: See sg7's answer as well: that user provides a link to well-established frameworks which implement what I describe here.

When would one use malloc over zmalloc?

There is precious little information online or on stackoverflow with regards to a function I recently encountered called zmalloc. (In fact, this is only the 3rd zmalloc-tagged question on SO).
I gleaned the following:
zmalloc automatically keeps track of, and frees unfreed memory, similar to C++ smart pointers.
zmalloc apparently enables some metrics, at least in the case of the redis source.
So my questions are:
What flexibility does one lose, then, in using zmalloc over malloc? i.e. what benefits does malloc continue to offer that zmalloc does not?
Is zmalloc non-standard in C11? Is this a custom-built function?
It looks like zmalloc is part of the redis-tools (https://github.com/antirez/redis-tools). redis is a kind of database which keeps stuff in memory (http://redis.io/).
Typically malloc replacements are developed because some target systems do not provide a suitable malloc, or because the caller needs extra functionality. I think zmalloc is a pretty simple wrapper of the system malloc/free, just keeping track of the overall memory allocated. No automatic free involved. The post you pointed to also explains the need: The database can be configured to not use more than some amount of memory and thus needs to keep track of the overall consumption.
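To illustrate what "a simple wrapper that keeps track of the overall memory allocated" can look like, here is a sketch. It is not the Redis code; the tracked_* names are invented, and the counter is not thread safe:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header stored in front of each block; the union keeps the payload aligned. */
typedef union {
    size_t      size;
    max_align_t align;
} header_t;

static size_t used_bytes = 0;      /* total payload bytes currently allocated */

void *tracked_malloc(size_t size)
{
    if (size > (size_t)-1 - sizeof(header_t))
        return NULL;               /* request too large to add a header */
    header_t *h = malloc(sizeof(header_t) + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    used_bytes += size;
    return h + 1;                  /* caller sees the memory after the header */
}

void tracked_free(void *p)
{
    if (p == NULL)
        return;
    header_t *h = (header_t *)p - 1;
    assert(used_bytes >= h->size);
    used_bytes -= h->size;
    free(h);
}

size_t tracked_used(void)
{
    return used_bytes;
}
```

A caller that must stay under a configured limit can compare tracked_used() against that limit before allocating more. Nothing is freed automatically; every tracked_malloc() still needs its tracked_free().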

simple c malloc

While there are lots of different sophisticated implementations of malloc / free for C/C++, I'm looking for a really simple and (especially) small one that works on a fixed-size buffer and supports realloc. Thread-safety etc. are not needed and my objects are small and do not vary much in size. Is there any implementation that you could recommend?
EDIT:
I'll use that implementation for a communication buffer at the receiver to transport objects with variable size (unknown to the receiver). The allocated objects won't live long, but there are possibly several objects used at the same time.
As everyone seems to recommend the standard malloc, I should perhaps reformulate my question. What I need is the "simplest" implementation of malloc on top of a buffer that I can start to optimize for my own needs. Perhaps the original question was unclear because I'm not looking for an optimized malloc, only for a simple one. I don't want to start with a glibc-malloc and extend it, but with a light-weight one.
Kernighan & Ritchie seem to have provided a small malloc / free in their C book - that's exactly what I was looking for (reimplementation found here). I'll only add a simple realloc.
I'd still be glad about suggestions for other implementations that are as simple and concise as this one (for example, using doubly-linked lists).
I recommend the one that came with the standard library bundled with your compiler.
One should also note there is no standard-conforming way to redefine malloc/free.
The malloc/free/realloc that come with your compiler are almost certainly better than some functions you're going to plug in.
It is possible to improve things for fixed-size objects, but that usually doesn't involve trying to replace the malloc but rather supplementing it with memory pools. Typically, you would use malloc to get a large chunk of memory that you can divide into discrete blocks of the appropriate size, and manage those blocks.
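As a sketch of that idea (one malloc call, divided into equal blocks that are managed through a free list), assuming a C11 compiler; the block_pool_* names are illustrative only:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct block { struct block *next; } block_t;

typedef struct {
    void    *chunk;      /* the single malloc'd region */
    block_t *free_list;  /* singly linked list of unused blocks */
} block_pool_t;

/* Carve `count` blocks of at least `block_size` bytes out of one malloc call. */
int block_pool_init(block_pool_t *bp, size_t block_size, size_t count)
{
    const size_t align = _Alignof(max_align_t);
    if (count == 0 || block_size > (size_t)-1 - align)
        return -1;
    if (block_size < sizeof(block_t))
        block_size = sizeof(block_t);      /* a free block must hold a link */
    block_size = (block_size + align - 1) / align * align;
    if (block_size > (size_t)-1 / count)
        return -1;                         /* total size would overflow */
    bp->chunk = malloc(block_size * count);
    if (bp->chunk == NULL)
        return -1;
    bp->free_list = NULL;
    for (size_t i = 0; i < count; i++) {
        block_t *b = (block_t *)((unsigned char *)bp->chunk + i * block_size);
        b->next = bp->free_list;
        bp->free_list = b;
    }
    return 0;
}

/* Pop a block from the free list, or NULL when the pool is exhausted. */
void *block_pool_get(block_pool_t *bp)
{
    block_t *b = bp->free_list;
    if (b != NULL)
        bp->free_list = b->next;
    return b;
}

/* Push a block back onto the free list. */
void block_pool_put(block_pool_t *bp, void *p)
{
    assert(p != NULL);
    block_t *b = p;
    b->next = bp->free_list;
    bp->free_list = b;
}

void block_pool_destroy(block_pool_t *bp)
{
    free(bp->chunk);
    bp->chunk = NULL;
    bp->free_list = NULL;
}
```

Getting and putting a block are both constant time, and because every block has the same size the pool itself cannot fragment.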
It sounds to me that you are looking for a memory pool. The Apache Runtime library has a pretty good one, and it is cross-platform too.
It may not be entirely light-weight, but the source is open and you can modify it.
There's a relatively simple memory pool implementation in CCAN:
http://ccodearchive.net/info/antithread/alloc.html
This looks like it fits your bill. Sure, alloc.c is 1230 lines, but a good chunk of that is test code and list manipulation. It's a bit more complex than the code you implemented, but decent memory allocation is complicated.
I would generally not reinvent the wheel with allocation functions unless my memory-usage pattern either is not supported by malloc/etc. or memory can be partitioned into one or more pre-allocated zones, each containing one or two LIFO heaps (freeing any object releases all objects in the same heap that were allocated after it). In a common version of the latter scenario, the only time anything is freed, everything is freed; in such a case, malloc() may be usefully rewritten as:
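#include <stddef.h>

char *malloc_ptr;   /* must point at the start of the pre-allocated zone */

/* Bump allocator: nothing is ever freed individually.
   Callers must keep sizes a multiple of the alignment they need. */
void *malloc(size_t size)
{
    void *ret = malloc_ptr;
    malloc_ptr += size;
    return ret;
}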
Zero bytes of overhead per allocated object. One example where malloc() was insufficient and a custom memory manager was used instead: an application in which variable-length test records produced variable-length result records (which could be longer or shorter); the application needed to support fetching results and adding more tests mid-batch. Tests were stored at increasing addresses starting at the bottom of the buffer, while results were stored at decreasing addresses starting at the top. As a background task, tests after the current one would be copied to the start of the buffer (since there was only one pointer that was used to read tests for processing, the copy logic would update that pointer as required). Had the application used malloc/free, it's possible that the interleaving of allocations for tests and results could have fragmented memory, but with the system used there was no such risk.
Echoing advice to measure first and only specialize if performance sucks - should be easy to abstract your malloc/free/reallocs such that replacement is straightforward.
Given the specialized platform I can't comment on effectiveness of the runtimes. If you do investigate your own then object pooling (see other answers) or small object allocation a la Loki or this is worth a look. The second link has some interesting commentary on the issue as well.

Resources for memory management in embedded application

How should I manage memory in my mission critical embedded application?
I found some articles with google, but couldn't pinpoint a really useful practical guide.
The DO-178b forbids dynamic memory allocations, but how will you manage the memory then? Preallocate everything in advance and send a pointer to each function that needs allocation? Allocate it on the stack? Use a global static allocator (but then it's very similar to dynamic allocation)?
Answers can be of the form of regular answer, reference to a resource, or reference to good opensource embedded system for example.
Clarification: The issue here is not whether memory management is available for the embedded system, but what a good design for an embedded system is, to maximize reliability.
I don't understand why statically preallocating a buffer pool, and dynamically getting and dropping it, is different from dynamically allocating memory.
As someone who has dealt with embedded systems, though not to such rigor so far (I have read DO-178B, though):
If you look at the u-boot bootloader, a lot is done with a globally placed structure. Depending on your exact application, you may be able to get away with a global structure and stack. Of course, there are re-entrancy and related issues there that don't really apply to a bootloader but might for you.
Preallocate, preallocate, preallocate. If you can, at design time, bound the size of an array/list structure/etc., declare it as a global (or static global -- look Ma, encapsulation).
The stack is very useful, use it where needed -- but be careful, as it can be easy to keep allocating off of it until you have no stack space left. Some code I once found myself debugging would allocate 1k buffers for string management in multiple functions...occasionally, the usage of the buffers would hit another program's stack space, as the default stack size was 4k.
The buffer pool case may depend on exactly how it's implemented. If you know you need to pass around fixed-size buffers of a size known at compile time, a buffer pool is likely easier to demonstrate correct than a complete dynamic allocator. You just need to verify buffers cannot be lost, and validate that your handling won't fail. There seem to be some good tips here: http://www.cotsjournalonline.com/articles/view/101217
Really, though, I think your answers might be found in joining http://www.do178site.com/
I've worked in a DO-178B environment (systems for airplanes). As I understand it, the main reason for not allowing dynamic allocation is certification. Certification is done through tests (unit, coverage, integration, ...). With those tests you have to prove that the behavior of your program is 100% predictable, nearly to the point that the memory footprint of your process is the same from one execution to the next. As dynamic allocation is done on the heap (and can fail), you cannot easily prove that (I imagine it should be possible if you master all the tools from the hardware to any piece of code written, but ...). You do not have this problem with static allocation. That is also why C++ was not used at the time in such environments. (It was about 15 years ago; that might have changed ...)
Practically, you have to write a lot of struct pools and allocation functions that guarantee that you have something deterministic. You can imagine a lot of solutions. The key is that you have to prove (with TONS of tests) a high level of deterministic behavior. It's easier to prove that your hand-crafted development works deterministically than to prove that Linux + gcc is deterministic in allocating memory.
Just my 2 cents. It was a long time ago, things might have changed, but concerning certification like DO-178B, the point is to prove your app will work the same any time in any context.
Disclaimer: I've not worked specifically with DO-178b, but I have written software for certified systems.
On the certified systems for which I have been a developer, ...
Dynamic memory allocation was acceptable ONLY during the initialization phase.
Dynamic memory de-allocation was NEVER acceptable.
This left us with the following options ...
Use statically allocated structures.
Create a pool of structures and then get/release them from/back to the pool.
For flexibility, we could dynamically allocate the size of the pools or number of structures during the initialization phase. However, once past that init phase, we were stuck with what we had.
Our company found that creating pools of structures and then getting/releasing them from/back to the pool was most useful. We were able to keep to the model, and keep things deterministic with minimal problems.
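A pool of structures with get/release can be sketched without touching the heap at all. This is only an illustration: message_t, MSG_POOL_SIZE and the msg_* functions are hypothetical, and a real system would size the pool from its worst-case analysis:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MSG_POOL_SIZE 8          /* assumed capacity, fixed at build time */

typedef struct {
    int  id;
    char payload[32];
} message_t;                     /* hypothetical structure type */

static message_t msg_pool[MSG_POOL_SIZE];
static bool      msg_in_use[MSG_POOL_SIZE];

/* Get a free structure, or NULL when the pool is exhausted. */
message_t *msg_get(void)
{
    for (size_t i = 0; i < MSG_POOL_SIZE; i++) {
        if (!msg_in_use[i]) {
            msg_in_use[i] = true;
            return &msg_pool[i];
        }
    }
    return NULL;
}

/* Release a structure back to the pool. */
void msg_release(message_t *m)
{
    assert(m >= msg_pool && m < msg_pool + MSG_POOL_SIZE);
    size_t i = (size_t)(m - msg_pool);
    assert(msg_in_use[i]);       /* catches a double release in debug builds */
    msg_in_use[i] = false;
}
```

The memory footprint is fixed at link time, and the worst-case cost of msg_get() is bounded by MSG_POOL_SIZE, which is the kind of property that is easy to argue about during certification.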
Hope that helps.
Real-time, long-running, mission-critical systems should not dynamically allocate and free memory from the heap. If you need dynamic memory and cannot design around it, then write your own fixed-pool allocation and management scheme. Yes, allocate a fixed amount ahead of time whenever possible. Anything else is asking for eventual trouble.
Allocating everything from stack is commonly done in embedded systems or elsewhere where the possibility of an allocation failing is unacceptable. I don't know what DO-178b is, but if the problem is that malloc is not available on your platform, you can also implement it yourself (implementing your own heap), but this still may lead to an allocation failing when you run out of space, of course.
There's no way to be 100% sure.
You may look at the FreeRTOS memory allocator examples. Those use a static pool, if I'm not mistaken.
You might find this question interesting as well, dynamic allocation is often prohibited in space hardened settings (actually, core memory is still useful there).
Typically, when malloc() is not available, I just use the stack. As Tronic said, the whole reason behind not using malloc() is that it can fail. If you are using a global static pool, it is conceivable that your internal malloc() implementation could be made fail proof.
It really, really, really depends on the task at hand and what the board is going to be exposed to.

Patterns for freeing memory in C?

I'm currently working on a C-based application and am a bit stuck on freeing memory in a non-antipattern fashion. I am a memory-management amateur.
My main problem is I declare memory structures in various different scopes, and these structures get passed around by reference to other functions. Some of those functions may throw errors and exit().
How do I go about freeing my structures if I exit() in one scope, but not all my data structures are in that scope?
I get the feeling I need to wrap it all up in a pseudo exception handler and have the handler deal with freeing, but that still seems ugly because it would have to know about everything I may or may not need to free...
Consider wrappers to malloc and using them in a disciplined way. Track the memory that you do allocate (in a linked list maybe) and use a wrapper to exit to enumerate your memory to free it. You could also name the memory with an additional parameter and member of your linked list structure. In applications where allocated memory is highly scope dependent you will find yourself leaking memory and this can be a good method to dump the memory and analyze it.
UPDATE:
Threading in your application will make this very complex. See other answers regarding threading issues.
You don't need to worry about freeing memory when exit() is called. When the process exits, the operating system will free all of the associated memory.
I think to answer this question appropriately, we would need to know about the architecture of your entire program (or system, or whatever the case may be).
The answer is: it depends. There are a number of strategies you can use.
As others have pointed out, on a modern desktop or server operating system, you can exit() and not worry about the memory your program has allocated.
This strategy changes, for example, if you are developing on an embedded operating system where exit() might not clean everything up. Typically what I see is when individual functions return due to an error, they make sure to clean up anything they themselves have allocated. You wouldn't see any exit() calls after calling, say, 10 functions. Each function would in turn indicate an error when it returns, and each function would clean up after itself. The original main() function (if you will - it might not be called main()) would detect the error, clean up any memory it had allocated, and take the appropriate actions.
When you just have scopes-within-scopes, it's not rocket science. Where it gets difficult is if you have multiple threads of execution, and shared data structures. Then you might need a garbage collector or a way to count references and free the memory when the last user of the structure is done with it. For example, if you look at the source to the BSD networking stack, you'll see that it uses a refcnt (reference count) value in some structures that need to be kept "alive" for an extended period of time and shared among different users. (This is basically what garbage collectors do, as well.)
You can create a simple memory manager for malloc'd memory that is shared between scopes/functions.
Register it when you malloc it, de-register it when you free it. Have a function that frees all registered memory before you call exit.
It adds a bit of overhead, but it helps keep track of memory. It can also help you hunt down pesky memory leaks.
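A minimal sketch of such a register/de-register manager, using a global linked list. The reg_* names are invented for this example and the list is not protected against concurrent use:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct reg_node {
    struct reg_node *next;
    void            *ptr;
} reg_node_t;

static reg_node_t *registry = NULL;   /* every live block is listed here */

/* malloc a block and register it; returns NULL on failure. */
void *reg_malloc(size_t size)
{
    reg_node_t *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->ptr = malloc(size);
    if (n->ptr == NULL) {
        free(n);
        return NULL;
    }
    n->next = registry;
    registry = n;
    return n->ptr;
}

/* free and de-register a block obtained from reg_malloc(). */
void reg_free(void *p)
{
    for (reg_node_t **pp = &registry; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->ptr == p) {
            reg_node_t *n = *pp;
            *pp = n->next;
            free(n->ptr);
            free(n);
            return;
        }
    }
    assert(p == NULL && "reg_free: pointer was not registered");
}

/* Free everything still registered; returns the number of blocks released. */
size_t reg_free_all(void)
{
    size_t count = 0;
    while (registry != NULL) {
        reg_node_t *n = registry;
        registry = n->next;
        free(n->ptr);
        free(n);
        count++;
    }
    return count;
}
```

Calling reg_free_all() right before exit() releases whatever is left, and a non-zero return value at a point where nothing should be live is a cheap leak detector. The linear search in reg_free() is the "bit of overhead" mentioned above.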
Michael's advice is sound - if you are exiting, you don't need to worry about freeing the memory since the system will reclaim it anyway.
One exception to that is shared memory segments - at least under System V Shared Memory. Those segments can persist longer than the program that creates them.
One option not mentioned so far is to use an arena-based memory allocation scheme, built on top of standard malloc(). If the entire application uses a single arena, your cleanup code can release that arena, and all is freed at once. (APR - Apache Portable Runtime - provides a pools feature which I believe is similar; David Hanson's "C Interfaces and Implementations" provides an arena-based memory allocation system; I've written one that you could use if you wanted to.) You can think of this as "poor man's garbage collection".
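For illustration, a deliberately simple arena built on top of malloc() might look like the sketch below. It is not the APR or Hanson implementation; the arena_* names are made up, and for brevity it issues one malloc() per allocation where a production arena would carve allocations out of larger chunks:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Each allocation carries a header that links it into its arena. */
typedef union arena_hdr {
    union arena_hdr *next;
    max_align_t      align;   /* keeps the payload suitably aligned */
} arena_hdr_t;

typedef struct {
    arena_hdr_t *head;        /* most recent allocation in this arena */
} arena_t;

void arena_init(arena_t *a)
{
    a->head = NULL;
}

/* Allocate `size` bytes that live until the arena is released. */
void *arena_alloc(arena_t *a, size_t size)
{
    assert(a != NULL);
    if (size > (size_t)-1 - sizeof(arena_hdr_t))
        return NULL;
    arena_hdr_t *h = malloc(sizeof(arena_hdr_t) + size);
    if (h == NULL)
        return NULL;
    h->next = a->head;
    a->head = h;
    return h + 1;
}

/* Free every allocation made from the arena; returns how many were freed. */
size_t arena_release(arena_t *a)
{
    size_t count = 0;
    while (a->head != NULL) {
        arena_hdr_t *h = a->head;
        a->head = h->next;
        free(h);
        count++;
    }
    return count;
}
```

Because each arena_t is an independent object, allocations of different sizes can be grouped per request, per connection, or per phase, and each group is dropped with a single arena_release() call.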
As a general memory discipline, every time you allocate memory dynamically, you should understand which code is going to release it and when it can be released. There are a few standard patterns. The simplest is "allocated in this function; released before this function returns". This keeps the memory largely under control (if you don't run too many iterations on the loop that contains the memory allocation), and scopes it so that it can be made available to the current function and the functions it calls. Obviously, you have to be reasonably sure that the functions you call are not going to squirrel away (cache) pointers to the data and try to reuse them later after you've released and reused the memory.
The next standard pattern is exemplified by fopen() and fclose(); there's a function that allocates a pointer to some memory, which can be used by the calling code, and then released when the program has finished with it. However, this often becomes very similar to the first case - it is usually a good idea to call fclose() in the function that called fopen() too.
Most of the remaining 'patterns' are somewhat ad hoc.
People have already pointed out that you probably don't need to worry about freeing memory if you're just exiting (or aborting) your code in case of error. But just in case, here's a pattern I developed and use a lot for creating and tearing down resources in case of error. NOTE: I'm showing a pattern here to make a point, not writing real code!
/* Assumes foo_t is a pointer to the foo structure; bar_t and baz_t are handles. */
int foo_create(foo_t *foo_out) {
    int res;
    foo_t foo;
    bar_t bar;
    baz_t baz;

    res = bar_create(&bar);
    if (res != 0)
        goto fail_bar;
    res = baz_create(&baz);
    if (res != 0)
        goto fail_baz;
    foo = malloc(sizeof *foo);
    if (foo == NULL) {
        res = -1; /* placeholder out-of-memory code; without it res would still be 0 */
        goto fail_alloc;
    }
    foo->bar = bar;
    foo->baz = baz;
    /* etc. etc. you get the idea */
    *foo_out = foo;
    return 0; /* meaning OK */

    /* tear down stuff */
fail_alloc:
    baz_destroy(baz);
fail_baz:
    bar_destroy(bar);
fail_bar:
    return res; /* propagate error code */
}
I can bet I'm going to get some comments saying "this is bad because you use goto". But this is a disciplined and structured use of goto that makes code clearer, simpler, and easier to maintain if applied consistently. You can't achieve a simple, documented tear-down path through the code without it.
If you want to see this in real in-use commercial code, take a look at, say, arena.c from the MPS (which is coincidentally a memory management system).
It's a kind of poor-man's try...finish handler, and gives you something a bit like destructors.
I'm going to sound like a greybeard now, but in my many years of working on other people's C code, lack of clear error paths is often a very serious problem, especially in network code and other unreliable situations. Introducing them has occasionally made me quite a bit of consultancy income.
There are plenty of other things to say about your question -- I'm just going to leave it with this pattern in case that's useful.
Very simply, why not have a reference-counted implementation? When you create an object and pass it around, you increment and decrement its reference count (remember to make this atomic if you have more than one thread).
That way, when an object is no longer used (zero references) you can safely delete it, or automatically delete it in the reference count decrement call.
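A small sketch of this, assuming a C11 compiler with <stdatomic.h>. The shared_obj_t type and the obj_* functions are hypothetical, and the object here is deleted automatically inside the decrement call:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef struct {
    atomic_int refs;      /* number of current users */
    int        value;     /* hypothetical payload */
} shared_obj_t;

/* Create an object owned by the caller (reference count 1). */
shared_obj_t *obj_create(int value)
{
    shared_obj_t *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    atomic_init(&o->refs, 1);
    o->value = value;
    return o;
}

/* Take an additional reference before handing the object to another user. */
shared_obj_t *obj_retain(shared_obj_t *o)
{
    atomic_fetch_add(&o->refs, 1);
    return o;
}

/* Drop a reference; returns 1 if this call freed the object, 0 otherwise. */
int obj_release(shared_obj_t *o)
{
    int prev = atomic_fetch_sub(&o->refs, 1);
    assert(prev > 0);     /* catches over-release in debug builds */
    if (prev == 1) {
        free(o);
        return 1;
    }
    return 0;
}
```

Each user calls obj_retain() when it stores the pointer and obj_release() when it is done, so no single scope needs to know who the last user will be.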
This sounds like a task for the Boehm garbage collector.
http://www.hpl.hp.com/personal/Hans_Boehm/gc/
Depends on the system of course whether you can or should afford to use it.

Resources