Popular use of Dynamic memory allocation - c

I have been reading coding standards in C and most of them discourages use of dynamic memory allocation.But In popular use Dynamic memory allocation leads .Any solid reason for this.I am asking the reasons for its use despite the Demerits it posses ?
These are my references
JPL Standards :http://lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf
Power of 10 :http://spinroot.com/gerard/pdf/P10.pdf

Dynamic memory allocation is generally banned in embedded systems programming, particularly in safety-critical embedded software. All industry standards for safety-critical software bans it: MISRA-C, DO178B, IEC 61508, ISO 26262 and so on.
There are many well-known issues with dynamic memory allocation: slow and possibly indeterministic access time, memory leaks and heap fragmentation.
None of these issues are desired in any kind of program. But in PC/desktop etc programming, they are regarded as a necessary evil, mainly because the mainstream operative systems restrict the amount of static process memory given to each process and if you want to store data beyond that, you have to store it on the heap.
It is also convenient to use dynamic memory when the amount of data isn't known until runtime. However, there exist no computer in the known world with unlimited memory, so "I want to use a completely variable amount of data, I don't know how much" is kind of a nonsense argument. A proper software engineer always designs for the worst case scenario.
Particularly in embedded systems, where the amount of RAM is limited and the consequences of bugs are far more dire than an out-of-memory message box popping up,
your program must have 100% deterministic behavior. You can't design in things like "this program will work until it runs out of RAM, then it will crash and burn". You can't allow a variable "x" number of trains to exist in your railway supervisory system, you must specify the upper limit and design the system after that.
So no matter all the issues with dynamic memory mentioned above, you don't want to use dynamic memory in these kind of systems, simply because it doesn't make any sense.
Recursion is also banned from these systems, for pretty much the same reasons.

Dynamic memory allocation in C sits on the blurry line between abstract mathematics and real-world engineering. Mathematically you say, "put this data in some memory", and indeed malloc() just gives you "some memory", basically pretending that there is an unbounded amount of memory. (And on many real-world systems malloc() does in fact never fail, due to over-comitting.)
Real engineering has to face the boundedness of all resources, and if you approach a problem knowing full well that you have X amound of memory available, then you have to plan where the memory goes. This is more cumbersome and challenging, but it can also lead to better code and better performance, if for no other reason than that it forces you to think carefully about the data flow of your program.
By contrast to common desktop machines on which malloc() never fails, there are also, at the opposite end of the spectrum, embedded machines which don't have sophisticated memory managers and on which malloc() essentially "always fails". If you are able to program without it, then you are able to program for such platforms. On the other hand, if your programming style always assumes the unlimited availability of magic memory, then you will find programming on such platforms very difficult.

Related

How come the stack cannot be increased during runtime in most operating system?

Is it to avoid fragmentation? Or some other reason? A set lifetime for a memory allocation is a pretty useful construct, compared to malloc() which has a manual lifetime.
The space used for stack is increased and decreased frequently during program execution, as functions are called and return. The maximum space allowed for the stack is commonly a fixed limit.
In older computers, memory was a very limited resource, and it may still be in small devices. In these cases, the hardware capability may impose a necessary limit on the maximum stack size.
In modern systems with plenty of memory, a great deal may be available for stack, but the maximum allowed is generally set to some lower value. That provides a means of catching “runaway” programs. Generally, controlling how much stack space a program uses is a neglected part of software engineering. Limits have been set based on practical experience, but we could do better.1
Theoretically, a program could be given a small initial limit and could, if it finds itself working on a “large” problem, tell the operating system it intends to use more. That would still catch most “runaway” programs while allowing well crafted programs to use more space. However, by and large we design programs to use stack for program control (managing function calls and returns, along with a modest amount of space for local data) and other memory for the data being operated on. So stack space is largely a function of program design (which is fixed) rather than problem size. That model has been working well, so we keep using it.
Footnote
1 For example, compilers could report, for each routine that does not use objects with run-time variable size, the maximum space used by the routine in any path through it. Linkers or other tools could report, for any call tree [hence without loops] the maximum stack space used. Additional tools could help analyze stack use in call graphs with potential recursion.
How come the stack cannot be increased during runtime in most operating system?
This is wrong for Linux. On recent Linux systems, each thread has its own call stack (see pthreads(7)), and an application could (with clever tricks) increase some call stacks using mmap(2) and mremap(2) after querying the call stacks thru /proc/ (see proc(5) and use /proc/self/maps) like e.g. pmap(1) does.
Of course, such code is architecture specific, since in some cases the call stack grows towards increasing addresses and in other cases towards decreasing addresses.
Read also Operating Systems: Three Easy Pieces and the OSDEV wiki, and study the source code of GNU libc.
BTW, Appel's book Compiling with Continuations, his old paper Garbage Collection can be faster than Stack Allocation and this paper on Compiling with Continuations and LLVM could interest you, and both are very related to your question: sometimes, there is almost "no call stack" and it makes no sense to "increase it".

memory allocation/deallocation for embedded devices

Currently we use malloc/free Linux commands for memory allocation/de-allocation in our C based embedded application. I heard that this would cause memory fragmentation as the heap size increases/decreases because of memory allocation/de-allocation which would result in performance degradation. Other programming languages with efficient Garbage Collection solves this issue by freeing the memory when not in use.
Are there any alternate approaches which would solve this issue in C based embedded programs ?
You may take a look at a solution called memory pool allocation.
See: Memory pools implementation in C
Yes, there's an easy solution: don't use dynamic memory allocation outside of initialization.
It is common (in my experience) in embedded systems to only allow calls to malloc when a program starts (this is usually done by convention, there's nothing in C to enforce this. Although you can create your own wrapper for malloc to do this). This requires more work to analyze what memory your program could possibly use since you have to allocate it all at once. The benefit you get, however, is a complete understanding of what memory your program uses.
In some cases this is fairly straightforward, in particular if your system has enough memory to allocate everything it could possibly need all at once. In severely memory-limited systems, however, you're left with the managing the memory yourself. I've seen this done by writing "custom allocators" which you allocate and free memory from. I'll provide an example.
Let's say you're implementing some mathematical program that needs lots of big matrices (not terribly huge, but for example 1000x1000 floats). Your system may not have the memory to allocate many of these matrices, but if you can allocate at least one of them, you could create a pool of memory used for matrix objects, and every time you need a matrix you grab memory from that pool, and when you're done with it you return it to the pool. This is easy if you can return them in the same order you got them in, meaning the memory pool works just like a stack. If this isn't the case, perhaps you could just clear the entire pool at the end of each "iteration" (assuming this math system is periodic).
With more detail about what exactly you're trying to implement I could provide more relevant/specific examples.
Edit: See sg7's answer as well: that user provides a link to well-established frameworks which implement what I describe here.

What is the optimal amount to malloc at a given time when the total needed is not known?

I've implemented a multi-level cache simulator that needs to store the values currently in the simulator. With current configurations, the maximum size of all values being stored could reach 2G. Obviously I'm not going to assume this worst case scenario and allocate all of that memory up-front. Instead, I have the program set to allocate memory as needed in chunks. The expense of this allocation is exacerbated by the fact that I'm callocing in order to provide 0 values when no write has occurred previously at the specified location.
My question is, is there a good heuristic for how much memory should be allocated each time more is needed? Currently I'm using an arbitrary value and I considered some solution that would use some ratio of the total system memory (I presume it's possible to dynamically detect this at compile and/or runtime), but even with the latter I'm using an arbitrary ratio with still doesn't sit well with me.
Any insight into best practices for this kind of situation would be appreciated!
A common rule of thumb is to grow geometrically, for example by doubling, on each reallocation.
It's best to understand allocation patterns of your program, if this is a problem you need to optimize for. This comes by understanding the program's implementation, the architecture(s) it runs within, and by observation (e.g. time and memory profiling).
The truth is, you can optimize from many perspectives, but things change over time (inputs change, environments change). In the user-land, your memory usage is already second guessed.
Given your allocation sizes, I assume you are already depending on a system which will default to a backing store as needed. As such, you don't have much control over what is paged or when. Peeking at available physical memory is not worth consideration in this case, and you will have to work hard to do better than the system's existing virtual memory implementation. Several of these systems try to use all available memory (e.g. "Unused RAM is wasted RAM").
Having said that and if those assumptions are correct: It's often better to just reduce your allocation sizes and working sets and do I/O yourself as needed.
Your OSs probably use disk caching as well; reads and writes are probably faster than you suspect for large blocks of memory.
Even deeper: Use virtual memory or memory mapped files for these large data sets. Your kernel will likely handle these cases very well.
Obviously I'm not going to assume this worst case scenario and allocate all of that memory up-front.
Then you will likely be surprised to learn that a 2 GB calloc alone may be better than other alternatives people come up with in some environments because a large calloc could just reserve a domain in virtual memory, loading/initializing pages only when you access them. Depending on your usage, this approach will be much better than some alternatives you may be given.
A good starting point for many problems when understanding a program or input's allocation patterns is to start out conservative, and then make the most beneficial adjustments based on observation. In many cases, you will need little more information than a) accurately determining how much to resize by when resizing is necessary b) reusing allocations where appropriate c) designing your data well for the problem at hand.

Is realloc() safe in embedded system?

While developing a piece of software for embedded system I used realloc() function many times. Now I've been said that I "should not use realloc() in embedded" without any explanation.
Is realloc() dangerous for embedded system and why?
Yes, all dynamic memory allocation is regarded as dangerous, and it is banned from most "high integrity" embedded systems, such as industrial/automotive/aerospace/med-tech etc etc. The answer to your question depends on what sort of embedded system you are doing.
The reasons it's banned from high integrity embedded systems is not only the potential memory leaks, but also a lot of dangerous undefined/unspecified/impl.defined behavior asociated with those functions.
EDIT: I also forgot to mention heap fragmentation, which is another danger. In addition, MISRA-C also mentions "data inconsistency, memory exhaustion, non-deterministic behaviour" as reasons why it shouldn't be used. The former two seem rather subjective, but non-deterministic behaviour is definitely something that isn't allowed in these kind of systems.
References:
MISRA-C:2004 Rule 20.4 "Dynamic heap memory allocation shall not be used."
IEC 61508 Functional safety, 61508-3 Annex B (normative) Table B1, >SIL1: "No dynamic objects", "No dynamic variables".
It depends on the particular embedded system. Dynamic memory management on an small embedded system is tricky to begin with, but realloc is no more complicated than a free and malloc (of course, that's not what it does). On some embedded systems you'd never dream of calling malloc in the first place. On other embedded systems, you almost pretend it's a desktop.
If your embedded system has a poor allocator or not much RAM, then realloc might cause fragmentation problems. Which is why you avoid malloc too, cause it causes the same problems.
The other reason is that some embedded systems must be high reliability, and malloc / realloc can return NULL. In these situations, all memory is allocated statically.
In many embedded systems, a custom memory manager can provide better semantics than are available with malloc/realloc/free. Some applications, for example, can get by with a simple mark-and-release allocator. Keep a pointer to the start of not-yet-allocated memory, allocate things by moving the pointer upward, and jettison them by moving the pointer below them. That won't work if it's necessary to jettison some things while keeping other things that were allocated after them, but in situations where that isn't necessary the mark-and-release allocator is cheaper than any other allocation method. In some cases where the mark-and-release allocator isn't quite good enough, it may be helpful to allocate some things from the start of the heap and other things from the end of the heap; one may free up the things allocated from one end without affecting those allocated from the other.
Another approach that can sometimes be useful in non-multitasking or cooperative-multitasking systems is to use memory handles rather than direct pointers. In a typical handle-based system, there's a table of all allocated objects, built at the top of memory working downward, and objects themselves are allocated from the bottom up. Each allocated object in memory holds either a reference to the table slot that references it (if live) or else an indication of its size (if dead). The table entry for each object will hold the object's size as well as a pointer to the object in memory. Objects may be allocated by simply finding a free table slot (easy, since table slots are all fixed size), storing the address of the object's table slot at the start of free memory, storing the object itself just beyond that, and updating the start of free memory to point just past the object. Objects may be freed by replacing the back-reference with a length indication, and freeing the object in the table. If an allocation would fail, relocate all live objects starting at the top of memory, overwriting any dead objects, and updating the object table to point to their new addresses.
The performance of this approach is non-deterministic, but fragmentation is not a problem. Further, it may be possible in some cooperative multitasking systems to perform garbage collection "in the background"; provided that the garbage collector can complete a pass in the time it takes to chug through the slack space, long waits can be avoided. Further, some fairly simple "generational" logic may be used to improve average-case performance at the expense of worst-case performance.
realloc can fail, just like malloc can. This is one reason why you probably should not use either in an embedded system.
realloc is worse than malloc in that you will need to have the old and new pointers valid during the realloc. In other words, you will need 2X the memory space of the original malloc, plus any additional amount (assuming realloc is increasing the buffer size).
Using realloc is going to be very dangerous, because it may return a new pointer to your memory location. This means:
All references to the old pointer must be corrected after realloc.
For a multi-threaded system, the realloc must be atomic. If you are disabling interrupts to achieve this, the realloc time might be long enough to cause a hardware reset by the watchdog.
Update: I just wanted to make it clear. I'm not saying that realloc is worse than implementing realloc using a malloc/free. That would be just as bad. If you can do a single malloc and free, without resizing, it's slightly better, yet still dangerous.
The issues with realloc() in embedded systems are no different than in any other system, but the consequences may be more severe in systems where memory is more constrained, and the sonsequences of failure less acceptable.
One problem not mentioned so far is that realloc() (and any other dynamic memory operation for that matter) is non-deterministic; that is it's execution time is variable and unpredictable. Many embedded systems are also real-time systems, and in such systems, non-deterministic behaviour is unacceptable.
Another issue is that of thread-safety. Check your library's documantation to see if your library is thread-safe for dynamic memory allocation. Generally if it is, you will need to implement mutex stubs to integrate it with your particular thread library or RTOS.
Not all emebdded systems are alike; if your embedded system is not real-time (or the process/task/thread in question is not real-time, and is independent of the real-time elements), and you have large amounts of memory unused, or virtual memory capabilities, then the use of realloc() may be acceptable, if perhaps ill-advised in most cases.
Rather than accept "conventional wisdom" and bar dynamic memory regardless, you should understand your system requirements, and the behaviour of dynamic memory functions and make an appropriate decision. That said, if you are building code for reuability and portability to as wide a range of platforms and applications as possible, then reallocation is probably a really bad idea. Don't hide it in a library for example.
Note too that the same problem exists with C++ STL container classes that dynamically reallocate and copy data when the container capacity is increased.
Well, it's better to avoid using realloc if it's possible, since this operation is costly especially being put into the loop: for example, if some allocated memory needs to be extended and there no gap between after current block and the next allocated block - this operation is almost equals: malloc + memcopy + free.

Resources for memory management in embedded application

How should I manage memory in my mission critical embedded application?
I found some articles with google, but couldn't pinpoint a really useful practical guide.
The DO-178b forbids dynamic memory allocations, but how will you manage the memory then? Preallocate everything in advance and send a pointer to each function that needs allocation? Allocate it on the stack? Use a global static allocator (but then it's very similar to dynamic allocation)?
Answers can be of the form of regular answer, reference to a resource, or reference to good opensource embedded system for example.
clarification: The issue here is not whether or not memory management is availible for the embedded system. But what is a good design for an embedded system, to maximize reliability.
I don't understand why statically preallocating a buffer pool, and dynamically getting and dropping it, is different from dynamically allocating memory.
As someone who has dealt with embedded systems, though not to such rigor so far (I have read DO-178B, though):
If you look at the u-boot bootloader, a lot is done with a globally placed structure. Depending on your exact application, you may be able to get away with a global structure and stack. Of course, there are re-entrancy and related issues there that don't really apply to a bootloader but might for you.
Preallocate, preallocate, preallocate. If you can at design-time bind the size of an array/list structure/etc, declare it as a global (or static global -- look Ma, encapsulation).
The stack is very useful, use it where needed -- but be careful, as it can be easy to keep allocating off of it until you have no stack space left. Some code I once found myself debugging would allocate 1k buffers for string management in multiple functions...occasionally, the usage of the buffers would hit another program's stack space, as the default stack size was 4k.
The buffer pool case may depend on exactly how it's implemented. If you know you need to pass around fixed-size buffers of a size known at compile time, dealing with a buffer pool is likely more easy to demonstrate correctness than a complete dynamic allocator. You just need to verify buffers cannot be lost, and validate your handling won't fail. There seem to be some good tips here: http://www.cotsjournalonline.com/articles/view/101217
Really, though, I think your answers might be found in joining http://www.do178site.com/
I've worked in a DO-178B environment (systems for airplanes). What I have understood, is that the main reason for not allowing dynamic allocation is mainly certification. Certification is done through tests (unitary, coverage, integration, ...). With those tests you have to prove that you the behavior of your program is 100% predictable, nearly to the point that the memory footprint of your process is the same from one execution to the next. As dynamic allocation is done on the heap (and can fail) you can not easily prove that (I imagine it should be possible if you master all the tools from the hardware to any piece of code written, but ...). You have not this problem with static allocation. That also why C++ was not used at this time in such environments. (it was about 15 years ago, that might have changed ...)
Practically, you have to write a lot of struct pools and allocation functions that guarantee that you have something deterministic. You can imagine a lot of solutions. The key is that you have to prove (with TONS of tests) a high level of deterministic behavior. It's easier to prove that your hand crafted developpement work deterministically that to prove that linux + gcc is deterministic in allocating memory.
Just my 2 cents. It was a long time ago, things might have changed, but concerning certification like DO-178B, the point is to prove your app will work the same any time in any context.
Disclaimer: I've not worked specifically with DO-178b, but I have written software for certified systems.
On the certified systems for which I have been a developer, ...
Dynamic memory allocation was
acceptable ONLY during the
initialization phase.
Dynamic memory de-allocation was NEVER acceptable.
This left us with the following options ...
Use statically allocated structures.
Create a pool of structures and then get/release them from/back to the pool.
For flexibility, we could dynamically allocate the size of the pools or number of structures during the initialization phase. However, once past that init phase, we were stuck with what we had.
Our company found that pools of structures and then get/releasing from/back into the pool was most useful. We were able to keep to the model, and keep things deterministic with minimal problems.
Hope that helps.
Real-time, long running, mission critical systems should not dynamically allocate and free memory from heap. If you need and cannot design around it to then write your own allocated and fixed pool management scheme. Yes, allocated fixed ahead of time whenever possible. Anything else is asking for eventual trouble.
Allocating everything from stack is commonly done in embedded systems or elsewhere where the possibility of an allocation failing is unacceptable. I don't know what DO-178b is, but if the problem is that malloc is not available on your platform, you can also implement it yourself (implementing your own heap), but this still may lead to an allocation failing when you run out of space, of course.
There's no way to be 100% sure.
You may look at FreeRTOS' memory allocators examples. Those use static pool, if i'm not mistaken.
You might find this question interesting as well, dynamic allocation is often prohibited in space hardened settings (actually, core memory is still useful there).
Typically, when malloc() is not available, I just use the stack. As Tronic said, the whole reason behind not using malloc() is that it can fail. If you are using a global static pool, it is conceivable that your internal malloc() implementation could be made fail proof.
It really, really, really depends on the task at hand and what the board is going to be exposed to.

Resources