I need a generic malloc implementation that uses one big fixed-size buffer. Something similar to the "Zero-malloc memory allocator" SQLite has. Do you know of any such implementations? It should be light-weight and portable that can be used for embedded applications.
Thanks in advance.
Two suggestions:
IF you need something production quality and well tested, just borrow SQLite's allocator. SQLite's source code is very well-written, documented, extremely well-tested and has a very permissive open-source license.
IF you need something small and simple, either to learn or to use in an embedded environment, consider this implementation [shameless plug!] - just 350 LOC of commented C code.
The SQLite source code is freely available. If you like that a particular implementation, why not use it?
Most current malloc implementations work by carving up a large chunk of memory they obtained from the OS. If that block runs out, malloc asks the OS for a new large block.
You could base your own implementation on an existing malloc implementation (for example the glibc one), and instead of obtaining a block from the OS, you use a single static buffer. When that runs out, malloc will start failing, just as it does when the OS can't provide any new blocks.
Related
Currently we use malloc/free Linux commands for memory allocation/de-allocation in our C based embedded application. I heard that this would cause memory fragmentation as the heap size increases/decreases because of memory allocation/de-allocation which would result in performance degradation. Other programming languages with efficient Garbage Collection solves this issue by freeing the memory when not in use.
Are there any alternate approaches which would solve this issue in C based embedded programs ?
You may take a look at a solution called memory pool allocation.
See: Memory pools implementation in C
Yes, there's an easy solution: don't use dynamic memory allocation outside of initialization.
It is common (in my experience) in embedded systems to only allow calls to malloc when a program starts (this is usually done by convention, there's nothing in C to enforce this. Although you can create your own wrapper for malloc to do this). This requires more work to analyze what memory your program could possibly use since you have to allocate it all at once. The benefit you get, however, is a complete understanding of what memory your program uses.
In some cases this is fairly straightforward, in particular if your system has enough memory to allocate everything it could possibly need all at once. In severely memory-limited systems, however, you're left with the managing the memory yourself. I've seen this done by writing "custom allocators" which you allocate and free memory from. I'll provide an example.
Let's say you're implementing some mathematical program that needs lots of big matrices (not terribly huge, but for example 1000x1000 floats). Your system may not have the memory to allocate many of these matrices, but if you can allocate at least one of them, you could create a pool of memory used for matrix objects, and every time you need a matrix you grab memory from that pool, and when you're done with it you return it to the pool. This is easy if you can return them in the same order you got them in, meaning the memory pool works just like a stack. If this isn't the case, perhaps you could just clear the entire pool at the end of each "iteration" (assuming this math system is periodic).
With more detail about what exactly you're trying to implement I could provide more relevant/specific examples.
Edit: See sg7's answer as well: that user provides a link to well-established frameworks which implement what I describe here.
I'm currently playing around with the CSR 1000 chip and I wanted to allocate memory. I tried using malloc but the compiler tells me:
undefined reference to `malloc'
I assume that is because gcc is run with -nostdlib parameter
So please could somebody with CSR uEnergy SDK experience, tell me why I can't allocate memory, and how I should do it instead??
If there is an SDK bundled with that chip that provides basic routines for memory allocation then use those, alternatively you can write your own allocator or use an existing one off of the web (with some fiddling).
As a quick solution you can probably mark a region in memory using a modified linker script or by using the gcc 'section' attribute (more here) and then use that as your heap arena in your malloc allocator.
A very simple allocator would not keep any accounting information such as headers/footers but rather allocate linearly one region after another (free-ing would essentially be a no-op in this case), this won't get you far but you will be able to run simple programs.
You probably want something more sophisticated, you could also look into implementing some kind of memory pool or any of the standard allocation algorithms.
The classic book The C Programming Language by Dennis Ritchie and Brian Kernighan provides a simple memory allocator if I re-call correctly. You may want to have a look at that.
I have three months of experience with this chip.
The malloc function is found in standard C library, which is typically available in desktop software development or embedded linux. But this is a small and resource-limited embedded chip. There is no standard C library.
If you browse the uEnergy SDK installation directory, something like this: C:\uEnergy_SDK-2.0.0\doc\reference\html\index.html. Click Modules tag on the top. You will find that under the section "C Standard Library APIs", CSR provides a few functions that mimic a subset of the standard C library. Unfortunately, there is no methods like malloc.
In general, when you work with small embedded systems, it is quite often that there is no dynamic memory allocation. However, for RF applications which are usually event-driven, there is typically a simple dynamic memory allocation function provided so incoming packets can be handed to you by the OS to your application. I used TI's CC2430 and its Zigbee stacks. They provide functions osal_mem_alloc and osal_mem_free, which mimic the malloc and free in the standard C library.
From my experience working with both chips, I found that CSR is much more protective than TI, in the same way as iOS vs. Android. You don't know what MCU they use except they tell you that it is a 16-bit RISC.
I suspect they have the dynamic memory allocation internally but your application just can't use those functions. RF packets are handed to you by the OS in the AppProcessLmEvent function, from there you get your data via the p_event_data pointer. You don't have to deallocate it as the OS will do it for you once you finish handling that event.
So back to your question, you can allocate memory so you just reserve a block of memory as global array and work on it.
Hope this helps.
add #include <malloc.h> to the head of your file
While there are lots of different sophisticated implementations of malloc / free for C/C++, I'm looking for a really simple and (especially) small one that works on a fixed-size buffer and supports realloc. Thread-safety etc. are not needed and my objects are small and do not vary much in size. Is there any implementation that you could recommend?
EDIT:
I'll use that implementation for a communication buffer at the receiver to transport objects with variable size (unknown to the receiver). The allocated objects won't live long, but there are possibly several objects used at the same time.
As everyone seems to recommend the standard malloc, I should perhaps reformulate my question. What I need is the "simplest" implementation of malloc on top of a buffer that I can start to optimize for my own needs. Perhaps the original question was unclear because I'm not looking for an optimized malloc, only for a simple one. I don't want to start with a glibc-malloc and extend it, but with a light-weight one.
Kerninghan & Ritchie seem to have provided a small malloc / free in their C book - that's exactly what I was looking for (reimplementation found here). I'll only add a simple realloc.
I'd still be glad about suggestions for other implementations that are as simple and concise as this one (for example, using doubly-linked lists).
I recommend the one that came with standard library bundled with your compiler.
One should also note there is no legal way to redefine malloc/free
The malloc/free/realloc that come with your compiler are almost certainly better than some functions you're going to plug in.
It is possible to improve things for fixed-size objects, but that usually doesn't involve trying to replace the malloc but rather supplementing it with memory pools. Typically, you would use malloc to get a large chunk of memory that you can divide into discrete blocks of the appropriate size, and manage those blocks.
It sounds to me that you are looking for a memory pool. The Apache Runtime library has a pretty good one, and it is cross-platform too.
It may not be entirely light-weight, but the source is open and you can modify it.
There's a relatively simple memory pool implementation in CCAN:
http://ccodearchive.net/info/antithread/alloc.html
This looks like fits your bill. Sure, alloc.c is 1230 lines, but a good chunk of that is test code and list manipulation. It's a bit more complex than the code you implemented, but decent memory allocation is complicated.
I would generally not reinvent the wheel with allocation functions unless my memory-usage pattern either is not supported by malloc/etc. or memory can be partitioned into one or more pre-allocated zones, each containing one or two LIFO heaps (freeing any object releases all objects in the same heap that were allocated after it). In a common version of the latter scenario, the only time anything is freed, everything is freed; in such a case, malloc() may be usefully rewritten as:
char *malloc_ptr;
void *malloc(int size)
{
void *ret;
ret = (void*)malloc_ptr;
malloc_ptr += size;
return ret;
}
Zero bytes of overhead per allocated object. An example of a scenario where a custom memory manager was used for a scenario where malloc() was insufficient was an application where variable-length test records produced variable-length result records (which could be longer or shorter); the application needed to support fetching results and adding more tests mid-batch. Tests were stored at increasing addresses starting at the bottom of the buffer, while results were stored at decreasing addresses starting at the top. As a background task, tests after the current one would be copied to the start of the buffer (since there was only one pointer that was used to read tests for processing, the copy logic would update that pointer as required). Had the application used malloc/free, it's possible that the interleaving of allocations for tests and results could have fragmented memory, but with the system used there was no such risk.
Echoing advice to measure first and only specialize if performance sucks - should be easy to abstract your malloc/free/reallocs such that replacement is straightforward.
Given the specialized platform I can't comment on effectiveness of the runtimes. If you do investigate your own then object pooling (see other answers) or small object allocation a la Loki or this is worth a look. The second link has some interesting commentary on the issue as well.
How should I manage memory in my mission critical embedded application?
I found some articles with google, but couldn't pinpoint a really useful practical guide.
The DO-178b forbids dynamic memory allocations, but how will you manage the memory then? Preallocate everything in advance and send a pointer to each function that needs allocation? Allocate it on the stack? Use a global static allocator (but then it's very similar to dynamic allocation)?
Answers can be of the form of regular answer, reference to a resource, or reference to good opensource embedded system for example.
clarification: The issue here is not whether or not memory management is availible for the embedded system. But what is a good design for an embedded system, to maximize reliability.
I don't understand why statically preallocating a buffer pool, and dynamically getting and dropping it, is different from dynamically allocating memory.
As someone who has dealt with embedded systems, though not to such rigor so far (I have read DO-178B, though):
If you look at the u-boot bootloader, a lot is done with a globally placed structure. Depending on your exact application, you may be able to get away with a global structure and stack. Of course, there are re-entrancy and related issues there that don't really apply to a bootloader but might for you.
Preallocate, preallocate, preallocate. If you can at design-time bind the size of an array/list structure/etc, declare it as a global (or static global -- look Ma, encapsulation).
The stack is very useful, use it where needed -- but be careful, as it can be easy to keep allocating off of it until you have no stack space left. Some code I once found myself debugging would allocate 1k buffers for string management in multiple functions...occasionally, the usage of the buffers would hit another program's stack space, as the default stack size was 4k.
The buffer pool case may depend on exactly how it's implemented. If you know you need to pass around fixed-size buffers of a size known at compile time, dealing with a buffer pool is likely more easy to demonstrate correctness than a complete dynamic allocator. You just need to verify buffers cannot be lost, and validate your handling won't fail. There seem to be some good tips here: http://www.cotsjournalonline.com/articles/view/101217
Really, though, I think your answers might be found in joining http://www.do178site.com/
I've worked in a DO-178B environment (systems for airplanes). What I have understood, is that the main reason for not allowing dynamic allocation is mainly certification. Certification is done through tests (unitary, coverage, integration, ...). With those tests you have to prove that you the behavior of your program is 100% predictable, nearly to the point that the memory footprint of your process is the same from one execution to the next. As dynamic allocation is done on the heap (and can fail) you can not easily prove that (I imagine it should be possible if you master all the tools from the hardware to any piece of code written, but ...). You have not this problem with static allocation. That also why C++ was not used at this time in such environments. (it was about 15 years ago, that might have changed ...)
Practically, you have to write a lot of struct pools and allocation functions that guarantee that you have something deterministic. You can imagine a lot of solutions. The key is that you have to prove (with TONS of tests) a high level of deterministic behavior. It's easier to prove that your hand crafted developpement work deterministically that to prove that linux + gcc is deterministic in allocating memory.
Just my 2 cents. It was a long time ago, things might have changed, but concerning certification like DO-178B, the point is to prove your app will work the same any time in any context.
Disclaimer: I've not worked specifically with DO-178b, but I have written software for certified systems.
On the certified systems for which I have been a developer, ...
Dynamic memory allocation was
acceptable ONLY during the
initialization phase.
Dynamic memory de-allocation was NEVER acceptable.
This left us with the following options ...
Use statically allocated structures.
Create a pool of structures and then get/release them from/back to the pool.
For flexibility, we could dynamically allocate the size of the pools or number of structures during the initialization phase. However, once past that init phase, we were stuck with what we had.
Our company found that pools of structures and then get/releasing from/back into the pool was most useful. We were able to keep to the model, and keep things deterministic with minimal problems.
Hope that helps.
Real-time, long running, mission critical systems should not dynamically allocate and free memory from heap. If you need and cannot design around it to then write your own allocated and fixed pool management scheme. Yes, allocated fixed ahead of time whenever possible. Anything else is asking for eventual trouble.
Allocating everything from stack is commonly done in embedded systems or elsewhere where the possibility of an allocation failing is unacceptable. I don't know what DO-178b is, but if the problem is that malloc is not available on your platform, you can also implement it yourself (implementing your own heap), but this still may lead to an allocation failing when you run out of space, of course.
There's no way to be 100% sure.
You may look at FreeRTOS' memory allocators examples. Those use static pool, if i'm not mistaken.
You might find this question interesting as well, dynamic allocation is often prohibited in space hardened settings (actually, core memory is still useful there).
Typically, when malloc() is not available, I just use the stack. As Tronic said, the whole reason behind not using malloc() is that it can fail. If you are using a global static pool, it is conceivable that your internal malloc() implementation could be made fail proof.
It really, really, really depends on the task at hand and what the board is going to be exposed to.
I am writing C for an MPC 555 board and need to figure out how to allocate dynamic memory without using malloc.
Typically malloc() is implemented on Unix using sbrk() or mmap(). (If you use the latter, you want to use the MAP_ANON flag.)
If you're targetting Windows, VirtualAlloc may help. (More or less functionally equivalent to anonymous mmap().)
Update: Didn't realize you weren't running under a full OS, I somehow got the impression instead that this might be a homework assignment running on top of a Unix system or something...
If you are doing embedded work and you don't have a malloc(), I think you should find some memory range that it's OK for you to write on, and write your own malloc(). Or take someone else's.
Pretty much the standard one that everybody borrows from was written by Doug Lea at SUNY Oswego. For example glibc's malloc is based on this. See: malloc.c, malloc.h.
You might want to check out Ralph Hempel's Embedded Memory Manager.
If your runtime doesn't support malloc, you can find an open source malloc and tweak it to manage a chunk of memory yourself.
malloc() is an abstraction that is use to allow C programs to allocate memory without having to understand details about how memory is actually allocated from the operating system. If you can't use malloc, then you have no choice other than to use whatever facilities for memory allocation that are provided by your operating system.
If you have no operating system, then you must have full control over the layout of memory. At that point for simple systems the easiest solution is to just make everything static and/or global, for more complex systems, you will want to reserve some portion of memory for a heap allocator and then write (or borrow) some code that use that memory to implement malloc.
An answer really depends on why you might need to dynamically allocate memory. What is the system doing that it needs to allocate memory yet cannot use a static buffer? The answer to that question will guide your requirements in managing memory. From there, you can determine which data structure you want to use to manage your memory.
For example, a friend of mine wrote a thing like a video game, which rendered video in scan-lines to the screen. That team determined that memory would be allocated for each scan-line, but there was a specific limit to how many bytes that could be for any given scene. After rendering each scan-line, all the temporary objects allocated during that rendering were freed.
To avoid the possibility of memory leaks and for performance reasons (this was in the 90's and computers were slower then), they took the following approach: They pre-allocated a buffer which was large enough to satisfy all the allocations for a scan-line, according to the scene parameters which determined the maximum size needed. At the beginning of each scan-line, a global pointer was set to the beginning of the scan line. As each object was allocated from this buffer, the global pointer value was returned, and the pointer was advanced to the next machine-word-aligned position following the allocated amount of bytes. (This alignment padding was including in the original calculation of buffer size, and in the 90's was four bytes but should now be 16 bytes on some machinery.) At the end of each scan-line, the global pointer was reset to the beginning of the buffer.
In "debug" builds, there were two scan buffers, which were protected using virtual memory protection during alternating scan lines. This method detects stale pointers being used from one scan-line to the next.
The buffer of scan-line memory may be called a "pool" or "arena" depending on whome you ask. The relevant detail is that this is a very simple data structure which manages memory for a certain task. It is not a general memory manager (or, properly, "free store implementation") such as malloc, which might be what you are asking for.
Your application may require a different data structure to keep track of your free storage. What is your application?
You should explain why you can't use malloc(), as there might be different solutions for different reasons, and there are several reasons why it might be forbidden or unavailable on small/embedded systems:
concern over memory fragmentation. In this case a set of routines that allocate fixed size memory blocks for one or more pools of memory might be the solution.
the runtime doesn't provide a malloc() - I think most modern toolsets for embedded systems do provide some way to link in a malloc() implementation, but maybe you're using one that doesn't for whatever reason. In that case, using Doug Lea's public domain malloc might be a good choice, but it might be too large for your system (I'm not familiar with the MPC 555 off the top of my head). If that's the case, a very simple, custom malloc() facility might be in order. It's not too hard to write, but make sure you unit test the hell out of uit because it's also easy to get details wrong. For example, I have a set of very small routines that use a brain dead memory allocation strategy using blocks on a free list (the allocator can be compile-time configured for first, best or last fit). I give it an array of char at initialization, and subsequent allocation calls will split free blocks as necessary. It's nowhere near as sophisticated as Lea's malloc(), but it's pretty dang small so for simple uses it'll do the trick.
many embedded projects forbid the use of dynamic memory allocation - in this case, you have to live with statically allocated structures
Write your own. Since your allocator will probably be specialized to a few types of objects, I recommend the Quick Fit scheme developed by Bill Wulf and Charles Weinstock. (I have not been able to find a free copy of this paper, but many people have access to the ACM digital library.) The paper is short, easy to read, and well suited to your problem.
If you turn out to need a more general allocator, the best guide I have found on the topic of programming on machines with fixed memory is Donald Knuth's book The Art of Computer Programming, Volume 1. If you want examples, you can find good ones in Don's epic book-length treatment of the source code of TeX, TeX: The Program.
Finally, the undergraduate textbook by Bryant and O'Hallaron is rather expensive, but it goes through the implementation of malloc in excruciating detail.
Write your own. Preallocate a big chunk of static RAM, then write some functions to grab and release chunks of it. That's the spirit of what malloc() does, except that it asks the OS to allocate and deallocate memory pages dynamically.
There are a multitude of ways of keeping track of what is allocated and what is not (bitmaps, used/free linked lists, binary trees, etc.). You should be able to find many references with a few choice Google searches.
malloc() and its related functions are the only game in town. You can, of course, roll your own memory management system in whatever way you choose.
If there are issues allocating dynamic memory from the heap, you can try allocating memory from the stack using alloca(). The usual caveats apply:
The memory is gone when you return.
The amount of memory you can allocate is dependent on the maximum size of your stack.
You might be interested in: liballoc
It's a simple, easy-to-implement malloc/free/calloc/realloc replacement which works.
If you know beforehand or can figure out the available memory regions on your device, you can also use their libbmmm to manage these large memory blocks and provide a backing-store for liballoc. They are BSD licensed and free.
FreeRTOS contains 3 examples implementations of memory allocation (including malloc()) to achieve different optimizations and use cases appropriate for small embedded systems (AVR, ARM, etc). See the FreeRTOS manual for more information.
I don't see a port for the MPC555, but it shouldn't be difficult to adapt the code to your needs.
If the library supplied with your compiler does not provide malloc, then it probably has no concept of a heap.
A heap (at least in an OS-less system) is simply an area of memory reserved for dynamic memory allocation. You can reserve such an area simply by creating a suitably sized statically allocated array and then providing an interface to provide contiguous chunks of this array on demand and to manage chunks in use and returned to the heap.
A somewhat neater method is to have the linker allocate the heap from whatever memory remains after stack and static memory allocation. That way the heap is always automatically as large as it possibly can be, allowing you to use all available memory simply. This will require modification of the application's linker script. Linker scripts are specific to the particular toolchain, and invariable somewhat arcane.
K&R included a simple implementation of malloc for example.