Direct stack and heap access; Virtual- or hardware- level? - heap-memory

When I'm on SO I read a lot of comments guiding (Especially in C)
"dynamic allocation allways goes to the heap, automatic allocation on the stack"
But especially regarding to plain C I disaggree with that. As the ISO/IEC9899 doesn't even drop a word of heap or stack. It just mentions three storage duriations (static, automatic, and allocated) and advises how each of them has to be treat.
What would give a compiler the option to do it even wise versa if it would like to.
So my question is:
Are the heap and the stack physical existing that (even if not in C) a standardized language can say "... has to happen on heap and ... on the stack"?
Or are they just a virtuell system of managing memory access so that a language can't make rules about them, as it can't even be ensured the enviroment supports them?
In my knowledgebase only the second would make sense. But I read allready many times people writing comments like "In language XY this WILL happen on the stack/heap". But if I'm right this had to be indeterminable as long the language isn't just made for such systems which guarantee to have a stack and heap. And all thoose comments would be wrong.
Thats what lead me to ask about this question. Am I that wrong, or is there a big error in reasoning going around about that?

You are correct in that the C spec doesn't require the use of a heap or stack, as long as it implements the storage classes correctly.
However, virtually every compiler will use stacks for automatic variables and heaps for allocated variables. While you could implement a compiler that doesn't use a stack or heap, it probably wouldn't perform very well and wouldn't be familiar to most devs.
So when people say "always", they really mean "virtually always".

Related

Popular use of Dynamic memory allocation

I have been reading coding standards in C and most of them discourages use of dynamic memory allocation.But In popular use Dynamic memory allocation leads .Any solid reason for this.I am asking the reasons for its use despite the Demerits it posses ?
These are my references
JPL Standards :http://lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf
Power of 10 :http://spinroot.com/gerard/pdf/P10.pdf
Dynamic memory allocation is generally banned in embedded systems programming, particularly in safety-critical embedded software. All industry standards for safety-critical software bans it: MISRA-C, DO178B, IEC 61508, ISO 26262 and so on.
There are many well-known issues with dynamic memory allocation: slow and possibly indeterministic access time, memory leaks and heap fragmentation.
None of these issues are desired in any kind of program. But in PC/desktop etc programming, they are regarded as a necessary evil, mainly because the mainstream operative systems restrict the amount of static process memory given to each process and if you want to store data beyond that, you have to store it on the heap.
It is also convenient to use dynamic memory when the amount of data isn't known until runtime. However, there exist no computer in the known world with unlimited memory, so "I want to use a completely variable amount of data, I don't know how much" is kind of a nonsense argument. A proper software engineer always designs for the worst case scenario.
Particularly in embedded systems, where the amount of RAM is limited and the consequences of bugs are far more dire than an out-of-memory message box popping up,
your program must have 100% deterministic behavior. You can't design in things like "this program will work until it runs out of RAM, then it will crash and burn". You can't allow a variable "x" number of trains to exist in your railway supervisory system, you must specify the upper limit and design the system after that.
So no matter all the issues with dynamic memory mentioned above, you don't want to use dynamic memory in these kind of systems, simply because it doesn't make any sense.
Recursion is also banned from these systems, for pretty much the same reasons.
Dynamic memory allocation in C sits on the blurry line between abstract mathematics and real-world engineering. Mathematically you say, "put this data in some memory", and indeed malloc() just gives you "some memory", basically pretending that there is an unbounded amount of memory. (And on many real-world systems malloc() does in fact never fail, due to over-comitting.)
Real engineering has to face the boundedness of all resources, and if you approach a problem knowing full well that you have X amound of memory available, then you have to plan where the memory goes. This is more cumbersome and challenging, but it can also lead to better code and better performance, if for no other reason than that it forces you to think carefully about the data flow of your program.
By contrast to common desktop machines on which malloc() never fails, there are also, at the opposite end of the spectrum, embedded machines which don't have sophisticated memory managers and on which malloc() essentially "always fails". If you are able to program without it, then you are able to program for such platforms. On the other hand, if your programming style always assumes the unlimited availability of magic memory, then you will find programming on such platforms very difficult.

C Language: Why does malloc() return a pointer, and not the value?

From my understanding of C it seems that you are supposed to use malloc(size) whenever you are trying to initialize, for instance, an array whose size you do not know of until runtime.
But I was wondering why the function malloc() returns a pointer to the location of the variable and why you even need that.
Basically, why doesn't C just hide it all from you, so that whenever you do something like this:
// 'n' gets stdin'ed from the user
...
int someArray[n];
for(int i = 0; i < n; i++)
someArray[i] = 5;
you can do it without ever having to call malloc() or some other function? Do other languages do it like this (by hiding the memory properties/location altogether)? I feel that as a beginner this whole process of dealing with the memory locations of variables you use just confuse programmers (and since other languages don't use it, C seems to make a simple initialization process such as this overly complicated)...
Basically, what I'm trying to ask is why malloc() is even necessary, because why the language doesn't take care of all that for you internally without the programmer having to be concerned about or having to see memory. Thanks
*edit: Ok, maybe there are some versions of C that I'm not aware of that allows you to forgo the use of malloc() but let's try to ignore that for now...
C lets you manage every little bit of your program. You can manage when memory gets allocated; you can manage when it gets deallocated; you can manage how to grow a small allocation, etc.
If you prefer not to manage that and let the compiler do it for you, use another language.
Actually C99 allows this (so you're not the only one thinking of it). The feature is called VLA (VAriable Length Array).
It's legal to read an int and then have an array of that size:
int n;
fscanf("%d", &n);
int array[n];
Of course there are limitations since malloc uses the heap and VLAs use the stack (so the VLAs can't be as big as the malloced objects).
*edit: Ok, maybe there are some versions of C that I'm not aware of that allows you to forgo the use of malloc() but let's try to ignore
that for now...
So we can concentrate on the flame ?
Basically, what I'm trying to ask is why malloc() is even necessary,
because why the language doesn't take care of all that for you
internally without the programmer having to be concerned about or
having to see memory.
The very point of malloc(), it's raison d'ĂȘtre, it's function, if you will, is to allocate a block of memory. The way we refer to a block of memory in C is by its starting address, which is by definition a pointer.
C is close to 40 years old, and it's not nearly as "high level" as some more modern languages. Some languages, like Java, attempt to prevent mistakes and simplify programming by hiding pointers and explicit memory management from the programmer. C is not like that. Why? Because it just isn't.
Basically, what I'm trying to ask is why malloc() is even necessary, because why the language doesn't take care of all that for you internally without the programmer having to be concerned about or having to see memory. Thanks
One of the hallmarks of C is its simplicity (C compilers are relatively easy to implement); one way of making a language simple is to force the programmer to do all his own memory management. Clearly, other languages do manage objects on the heap for you - Java and C# are modern examples, but the concept isn't new at all; Lisp implementations have been doing it for decades. But that convenience comes at a cost in both compiler complexity and runtime performance.
The Java/C# approach helps eliminate whole classes of memory-management bugs endemic to C (memory leaks, invalid pointer dereferences, etc.). By the same token, C provides a level of control over memory management that allows the programmer to achieve high levels of performance that would be difficult (not impossible) to match in other languages.
If the only purpose of dynamic allocation were to allocate variable-length arrays, then malloc() might not be necessary. (But note that malloc() was around long before variable-length arrays were added to the language.)
But the size of a VLA is fixed (at run time) when the object is created. It can't be resized, and it's deallocated only when you leave the scope in which it's declared. (And VLAs, unlike malloc(), don't have a mechanism for reporting allocation failures.)
malloc() gives you a lot more flexibility.
Consider creating a linked list. Each node is a structure, containing some data and a pointer to the next node in the list. You might know the size of each node in advance, but you don't know how many nodes to allocate. For example, you might read lines from a text file, creating and appending a new node for each line.
You can also use malloc() along with realloc() to create a buffer (say, an array of unsigned char) whose size can be changed after you created it.
Yes, there are languages that don't expose pointers, and that handle memory management for you.
A lot of them are implemented in C.
Maybe the question should be "why do you need something like int array[n] when you can use pointers?"
After all, pointers allow you to keep an object alive beyond the scope it was created in, you can use pointer to slice and dice arrays (for example strchr() returns a pointer to a string), pointers are light-weight objects, so it's cheap to pass them to functions and return them from functions, etc.
But the real answer is "that's how it is". Other options are possible, and the proof is that there are other languages that do other things (and even C99 allows different things).
C is treated as highly developed low-level language, basically malloc is used in dynamic arrays which is a key component in stack & queue. for other languages that hides the pointer part from the developer are not well capable of doing hardware related programming.
The short answer to your question is to ponder this question: What if you also need to control exactly when the memory is de-allocated?
C is a compiled language, not an interpreted one. If you don't know n at compile time, how is the compiler supposed to produce a binary?

Is making smaller functions generally more efficient memory-wise since variables get deallocated more frequently?

Is dividing the work into 5 functions as opposed to one big function more memory efficient in C since at a given time there are fewer variables in memory, as the stack-frame gets deallocated more often? Does it depend on the compiler, and optimization? if so in what compilers is it faster?
Answer given there are a lot of local variables and the stack frames comes from a centralized main and not created on the top of each other.
I know other advantages of breaking out the function into smaller functions. Please answer this question, only in respect to memory usage.
It might reduce "high water mark" of stack usage for your program, and if so that might reduce the overall memory requirement of the program.
Yes, it depends on optimization. If the optimizer inlines the function calls, you might well find that all the variables of all the functions inlined are wrapped into one big stack frame. Any compiler worth using is capable of inlining[*], so the fact that it can happen doesn't depend on compiler. Exactly when it happens, will differ.
If your local variables are small, though, then it's fairly rare for your program to use more stack than has been automatically allocated to you at startup. Unless you go past what you're given initially, how much you use makes no difference to overall memory requirements.
If you're putting great big structures on the stack (multiple kilobytes), or if you're on a machine where a kilobyte is a lot of memory, then it might make a difference to overall memory usage. So, if by "a lot of local variables" you mean few dozen ints and pointers then no, nothing you do makes any significant difference. If by "a lot of local variables" you mean a few dozen 10k buffers, or if your function recurses very deep so that you have hundreds of levels of your few dozen ints, then it's a least possible it could make a difference, depending on the OS and configuration.
The model that stack and heap grow towards each other through general RAM, and the free memory in the middle can be used equally by either one of them, is obsolete. With the exception of a very few, very restricted systems, memory models are not designed that way any more. In modern OSes, we have so-called "virtual memory", and stack space is allocated to your program one page at a time. Most of them automatically allocate more pages of stack as it is used, up to a configured limit that's usually very large. A few don't automatically extend stack (Symbian last I used it, which was some years ago, didn't, although arguably Symbian is not a "modern" OS). If you're using an embedded OS, check what the manual says about stack.
Either way, the only thing that affects total memory use is how many pages of stack you need at any one time. If your system automatically extends stack, you won't even notice how much you're using. If it doesn't, you'll need to ensure that the program is given sufficient stack for its high-water mark, and that's when you might notice excessive stack use.
In short, this is one of those things that in theory makes a difference, but in practice that difference is almost always insignificant. It only matters if your program uses massive amounts of stack relative to the resources of the environment it runs in.
[*] People programming in C for PICs or something, using a C compiler that is basically a non-optimizing assembler, are allowed to be offended that I've called their compiler "not worth using". The stack on such devices is so different from "typical" systems that the answer is different anyway.
I think in most cases the area of memory allocated for the stack (for the entire program) remains constant. The amount in use will change based on the depth of call stack and that amount would be less when fewer variables are used (but note that function calls push the return address and stack pointer also).
Also it depends on how the functions are called. If two functions are called in series, for example, and the stack of the first is popped before the call to the second, then you'll be using less of the stack..but if the first function calls the second then you're back to where you were with one big function (plus the function call overhead).
There's no memory allocation on stack - just moving the stack pointer towards next value. While stack size itself is predefined. So there's no difference in memory usage (apart of situations when you get stack overflow).
Yes, in the same vein that using a finer coat of paint on a jet plane increases its aerodynamic properties. Ok, that's a bad analogy, but the point is that if there is ever a question of making things clear and telegraphic or trying to use more functions, go with telegraphic. In most cases these are not mutually exclusive anyway as the beginners tend to give subroutines or functions too much to do.
In terms of memory I think that if you are truly splitting up up work (f, then g, then h) then you will see some minute available memory increases but if these are interdependent then you will not.
As #Joel Burget says, memory management is not really a consideration in code structuring.
Just my take.
Splitting a huge function into smaller ones does have its benefits, among them is potentially more optimized memory usage.
Say, you have this function.
void huge_func(int input) {
char a[1024];
char b[1024];
// do something with input and a
// do something with input and b
}
And you split it to two.
void func_a(int input) {
char a[1024];
// do something with input and a
}
void func_b(int input) {
char b[1024];
// do something with input and b
}
Calling huge_func will take at least 2048 bytes of memory, and calling func_a then func_b achieves the same outcome with about half less memory. However, if inside func_a you call func_b, the amount of memory used is about the same as huge_func. Essentially, as what #sje397 wrote.
I might be wrong to say this but I do not think there is any compiler optimization that could help you reduce the usage of stack memory. I believe the layout of stack memory must ensure that sufficient memory is reserved for all declared variables, whether used or not.

Why isn't there a "memsize" in C which returns the size of a memory block allocated in the heap using malloc?

ok. It can be called anything else as in _msize in Visual Studio.
But why is it not in the standard to return the size of the memory given the memory block alloced using malloc? Since we can not tell how much memory is pointed to by the return pointer following malloc, we could use this "memsize" call to return that information should we need it. "memsize" would be implementation specific as are malloc/free
Just asking as I had to write a wrapper sometime back to store some additional bytes for the size.
Because the C library, including malloc, was designed for minimum overhead. A function like the one you want would require the implementation to record the exact size of the allocation, while implementations may now choose to "round" the size up as they please, to prevent actually reallocating in realloc.
Storing the size requires an extra size_t per allocation, which may be heavy for embedded systems. (And for the PDP-11s and 286s that were still abundant when C89 was written.)
To turn this around, why should there be? There's plenty of stuff in the Standards already, particularly the C++ standard. What are your use cases?
You ask for an adequately-sized chunk of memory, and you get it (or a null pointer or exception). There may or may not be additional bytes allocated, and some of these may be reserved. This is conceptually simple: you ask for what you want, and you get something you can use.
Why complicate it?
I don't think there is any definite answer. The developers of the standard probably considered it, and weighed the pros and cons. Anything that goes into a standard must be implemented by every implementation, so adding things to it places a significant burden on developers. I guess they just didn't find that feature useful enough to warrant this.
In C++, the wrapper that you talk about is provided by the standard. If you allocate a block of memory with std::vector, you can use the member function vector::size() to determine the size of the array and use vector::capacity() to determine the size of the allocation (which might be different).
C, on the other hand, is a low-level language which leaves such concerns to be managed by the developer, since tracking it dynamically (as you suggest) is not strictly necessary and would be redundant in many cases.

Resources for memory management in embedded application

How should I manage memory in my mission critical embedded application?
I found some articles with google, but couldn't pinpoint a really useful practical guide.
The DO-178b forbids dynamic memory allocations, but how will you manage the memory then? Preallocate everything in advance and send a pointer to each function that needs allocation? Allocate it on the stack? Use a global static allocator (but then it's very similar to dynamic allocation)?
Answers can be of the form of regular answer, reference to a resource, or reference to good opensource embedded system for example.
clarification: The issue here is not whether or not memory management is availible for the embedded system. But what is a good design for an embedded system, to maximize reliability.
I don't understand why statically preallocating a buffer pool, and dynamically getting and dropping it, is different from dynamically allocating memory.
As someone who has dealt with embedded systems, though not to such rigor so far (I have read DO-178B, though):
If you look at the u-boot bootloader, a lot is done with a globally placed structure. Depending on your exact application, you may be able to get away with a global structure and stack. Of course, there are re-entrancy and related issues there that don't really apply to a bootloader but might for you.
Preallocate, preallocate, preallocate. If you can at design-time bind the size of an array/list structure/etc, declare it as a global (or static global -- look Ma, encapsulation).
The stack is very useful, use it where needed -- but be careful, as it can be easy to keep allocating off of it until you have no stack space left. Some code I once found myself debugging would allocate 1k buffers for string management in multiple functions...occasionally, the usage of the buffers would hit another program's stack space, as the default stack size was 4k.
The buffer pool case may depend on exactly how it's implemented. If you know you need to pass around fixed-size buffers of a size known at compile time, dealing with a buffer pool is likely more easy to demonstrate correctness than a complete dynamic allocator. You just need to verify buffers cannot be lost, and validate your handling won't fail. There seem to be some good tips here: http://www.cotsjournalonline.com/articles/view/101217
Really, though, I think your answers might be found in joining http://www.do178site.com/
I've worked in a DO-178B environment (systems for airplanes). What I have understood, is that the main reason for not allowing dynamic allocation is mainly certification. Certification is done through tests (unitary, coverage, integration, ...). With those tests you have to prove that you the behavior of your program is 100% predictable, nearly to the point that the memory footprint of your process is the same from one execution to the next. As dynamic allocation is done on the heap (and can fail) you can not easily prove that (I imagine it should be possible if you master all the tools from the hardware to any piece of code written, but ...). You have not this problem with static allocation. That also why C++ was not used at this time in such environments. (it was about 15 years ago, that might have changed ...)
Practically, you have to write a lot of struct pools and allocation functions that guarantee that you have something deterministic. You can imagine a lot of solutions. The key is that you have to prove (with TONS of tests) a high level of deterministic behavior. It's easier to prove that your hand crafted developpement work deterministically that to prove that linux + gcc is deterministic in allocating memory.
Just my 2 cents. It was a long time ago, things might have changed, but concerning certification like DO-178B, the point is to prove your app will work the same any time in any context.
Disclaimer: I've not worked specifically with DO-178b, but I have written software for certified systems.
On the certified systems for which I have been a developer, ...
Dynamic memory allocation was
acceptable ONLY during the
initialization phase.
Dynamic memory de-allocation was NEVER acceptable.
This left us with the following options ...
Use statically allocated structures.
Create a pool of structures and then get/release them from/back to the pool.
For flexibility, we could dynamically allocate the size of the pools or number of structures during the initialization phase. However, once past that init phase, we were stuck with what we had.
Our company found that pools of structures and then get/releasing from/back into the pool was most useful. We were able to keep to the model, and keep things deterministic with minimal problems.
Hope that helps.
Real-time, long running, mission critical systems should not dynamically allocate and free memory from heap. If you need and cannot design around it to then write your own allocated and fixed pool management scheme. Yes, allocated fixed ahead of time whenever possible. Anything else is asking for eventual trouble.
Allocating everything from stack is commonly done in embedded systems or elsewhere where the possibility of an allocation failing is unacceptable. I don't know what DO-178b is, but if the problem is that malloc is not available on your platform, you can also implement it yourself (implementing your own heap), but this still may lead to an allocation failing when you run out of space, of course.
There's no way to be 100% sure.
You may look at FreeRTOS' memory allocators examples. Those use static pool, if i'm not mistaken.
You might find this question interesting as well, dynamic allocation is often prohibited in space hardened settings (actually, core memory is still useful there).
Typically, when malloc() is not available, I just use the stack. As Tronic said, the whole reason behind not using malloc() is that it can fail. If you are using a global static pool, it is conceivable that your internal malloc() implementation could be made fail proof.
It really, really, really depends on the task at hand and what the board is going to be exposed to.

Resources