OpenSSL 1.1.0: HMAC_CTX must now be allocated, why?

Checking the changes in recent OpenSSL releases, I noticed that the HMAC_CTX structure must now be allocated on the heap. The headers only forward-declare it (in ossl_typ.h).
I wonder what the idea behind this is. Given that heap-allocated memory creates overhead, they must have a good reason for making the library slower; I just can't find the rationale.
Does anyone here know what made the developers decide to force heap allocation for this?

I've seen a lot of the OpenSSL structures going the same way. I would think it's because the implementers of OpenSSL want to "hide" the implementation state from users of the library. That way the user can't "mess" with it in ways the implementers don't want, and the implementers can change their implementation without user code caring. It's basically the C version of the C++ PIMPL pattern.
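For illustration, this is roughly what the 1.1.0 workflow looks like with the opaque context (a sketch; most error handling trimmed to the allocation check):

#include <openssl/hmac.h>

/* HMAC_CTX is an incomplete type in 1.1.0: its size is unknown to user
   code, so it can only live behind a pointer the library hands out. */
int hmac_sha256(const unsigned char *key, int key_len,
                const unsigned char *data, size_t data_len,
                unsigned char md[EVP_MAX_MD_SIZE], unsigned int *md_len)
{
    HMAC_CTX *ctx = HMAC_CTX_new();   /* allocated inside the library */
    if (ctx == NULL)
        return 0;
    int ok = HMAC_Init_ex(ctx, key, key_len, EVP_sha256(), NULL)
          && HMAC_Update(ctx, data, data_len)
          && HMAC_Final(ctx, md, md_len);
    HMAC_CTX_free(ctx);               /* only the library knows sizeof(HMAC_CTX) */
    return ok;
}

The struct's layout can now change between OpenSSL versions without breaking the ABI of programs compiled against the headers, which is exactly the PIMPL benefit described above.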

Related

Memory allocation that resizes a buffer ONLY if it can grow in place?

After reading the man-page for realloc(), I came to the realization that it works a little differently than I thought it did. I originally thought that realloc() would attempt to resize a buffer, previously allocated with one of the malloc-family functions, and if it could NOT extend the buffer in place, then it would fail. However, the man-page states:
The realloc() function returns a pointer to the newly allocated memory, which is suitably aligned for any built-in type and may be different from ptr, or NULL if the request fails.
The "may be different from ptr" part is what I'm talking about.
Basically, what I want is a function, similar to realloc(), but which fails if it cannot extend the buffer in place. It seems that there is no function in the standard C library that does this; however, I'm assuming there may be some OS-specific functions that accomplish the same thing.
Could someone tell me what functions are out there that do what I described above, and which OS's they are specific to? Preferably, I'd like to know at least the functions specific to Linux and Windows (and Mac OS would be a nice bonus too :) ).
This may be a duplicate of this post, but I don't think it is for the following reasons:
The question in the post I linked to simply asks whether there is a function that extends a buffer in place, whereas I'm asking which functions extend a buffer in place.
The accepted answer for that post does not contain the information I need.
EDIT
Some people were wondering what use case I need this for, so I'll explain below:
I'm writing a C preprocessor (yes, I know... don't reinvent the wheel... well, I'm doing it anyway, so there). One component of the C preprocessor is a cache for storing pp-tokens, which come from various source files; each source file's set of pp-tokens may be fragmented within the cache. The cache itself is a linked list of large chunks of memory. Ideally, I'd like to keep this linked list short, which is why I'd like to first try resizing the buffer (in place); however, if resizing in place is not possible, then I want to just add another node (i.e. chunk of memory) to the linked list.
Within each cache buffer, there are additional linked-list nodes, which provide a means for iterating through all the pp-tokens of each individual source file, which may be fragmented across the various cache buffers that make up the cache.
The reasons I need the kind of memory reallocation I discussed earlier are the following:
If resizing a cache buffer could not be done in place, and a new buffer had to be allocated and the old memory contents copied, then I'd have a lot of dangling pointers. Jonathan Leffler suggested that I instead store offsets within the buffer, rather than pointers, which I had not even thought about, and is a great idea! However, reason #2...
I want the implementation of the cache to be as fast as possible, and, please correct me if I'm wrong, but it seems to me that (for my use case) it would on average be faster to just add a new cache buffer to the linked list when a given cache buffer cannot be resized in place, rather than allocating a new buffer, copying all the previous contents, and freeing the old buffer. As a side note, I am planning on doubling the size of the allocated cache buffer each time resizing is needed.
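That fallback design might look roughly like this (a sketch with hypothetical names; chunk sizes double on each growth, as mentioned above):

#include <stdlib.h>

/* A cache chunk: one node in a linked list of large token buffers. */
struct chunk {
    struct chunk *next;
    size_t capacity;        /* bytes available in data[] */
    size_t used;            /* bytes currently occupied  */
    unsigned char data[];   /* C99 flexible array member */
};

/* Append a new chunk with double the previous capacity. Standard C has
   no "grow in place or fail" primitive, so this sidesteps realloc()
   entirely: existing tokens never move, so no pointers dangle. */
static struct chunk *cache_grow(struct chunk *tail)
{
    size_t cap = tail ? tail->capacity * 2 : 4096;
    struct chunk *c = malloc(sizeof *c + cap);
    if (c == NULL)
        return NULL;
    c->next = NULL;
    c->capacity = cap;
    c->used = 0;
    if (tail)
        tail->next = c;
    return c;
}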
Memory management (in the form of malloc and friends) is generally implemented as a library; it is not part of the Operating System. (An implementation of the library will probably need to use some OS facilities to acquire raw memory -- although that's not a given -- but there is no need to involve the OS for allocating and freeing individual allocations.) So you're not going to find an "OS-specific" solution.
There are a number of different memory allocation libraries available. If you decide to use an alternative to the one preinstalled with your particular distribution, you will probably want to arrange for it to be used by the standard library as well. Details for how to do that vary.
Most allocation libraries do include some additional interfaces, but I don't know of any library which offers the function you're looking for. More common is an API for finding out how much memory is actually in an allocation (which is often more than the amount requested by the malloc). For many libraries, realloc will only expand the allocation in place if it was already big enough, but there may be libraries which are willing to merge a following free block in order to make non-copying realloc possible.
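That size query does exist on the major platforms: glibc has malloc_usable_size() in <malloc.h>, Windows has _msize(), and macOS has malloc_size() in <malloc/malloc.h>. A sketch of using the glibc variant to grow only when the slack already exists (note that none of these can merge a neighboring free block for you):

#include <malloc.h>   /* malloc_usable_size(): glibc-specific */
#include <stdlib.h>

/* Returns nonzero if the allocation behind p already has room for
   new_size bytes, i.e. it can "grow" without any copy or move. */
static int can_grow_in_place(void *p, size_t new_size)
{
    return malloc_usable_size(p) >= new_size;
}

In practice a subsequent realloc() to new_size will then stay put, since the block already has the space; portable code should still make that realloc() call rather than silently use the extra bytes.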
There's a list of some commonly-used libraries in the Wikipedia page on dynamic memory allocation, which also has a good overview of implementation techniques.
And, of course, you could always write your own memory manager (or modify an open source library) to implement that feature. However, while that would be an interesting and satisfying project, I'd strongly suggest you think about (and research) the reasons why this seemingly simple idea has not been implemented in common memory management libraries. There are good reasons.

Direct stack and heap access; Virtual- or hardware- level?

When I'm on SO I read a lot of comments claiming (especially for C):
"Dynamic allocation always goes to the heap, automatic allocation on the stack."
But especially with regard to plain C I disagree with that, as ISO/IEC 9899 doesn't say a word about a heap or a stack. It just mentions three storage durations (static, automatic, and allocated) and specifies how each of them has to be treated.
That would even give a compiler the option to do it the other way around if it wanted to.
So my question is:
Are the heap and the stack physically existing things, so that a standardized language (even if not C) can say "... has to happen on the heap and ... on the stack"?
Or are they just a virtual scheme for managing memory access, so that a language can't make rules about them, since it can't even be ensured that the environment supports them?
To my mind only the second makes sense. But I have already read many comments like "in language XY this WILL happen on the stack/heap". If I'm right, that would be indeterminable unless the language targets only systems that guarantee a stack and a heap, and all those comments would be wrong.
That's what led me to ask this question. Am I that wrong, or is there a widespread error in reasoning going around about this?
You are correct in that the C spec doesn't require the use of a heap or stack, as long as the implementation handles the storage durations correctly.
However, virtually every compiler will use stacks for automatic variables and heaps for allocated variables. While you could implement a compiler that doesn't use a stack or heap, it probably wouldn't perform very well and wouldn't be familiar to most devs.
So when people say "always", they really mean "virtually always".
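To make the terminology concrete, here are the three storage durations the standard actually names, with the places a typical implementation happens to put them (a sketch; none of the placements is mandated):

#include <stdlib.h>

static int counter;             /* static storage duration: typically .data/.bss */

void f(void)
{
    int local = 0;              /* automatic storage duration: typically a stack */
    int *p = malloc(sizeof *p); /* allocated storage duration: typically a heap  */
    if (p != NULL) {
        *p = local + counter;
        free(p);
    }
}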

When would one use malloc over zmalloc?

There is precious little information online or on stackoverflow with regards to a function I recently encountered called zmalloc. (In fact, this is only the 3rd zmalloc-tagged question on SO).
I gleaned the following:
zmalloc automatically keeps track of, and frees unfreed memory, similar to C++ smart pointers.
zmalloc apparently enables some metrics, at least in the case of the redis source.
So my questions are:
What flexibility does one lose, then, in using zmalloc over malloc? i.e. what benefits does malloc continue to offer that zmalloc does not?
Is zmalloc non-standard in C11? Is this a custom-built function?
It looks like zmalloc is part of the redis-tools (https://github.com/antirez/redis-tools). redis is a kind of database which keeps stuff in memory (http://redis.io/).
Typically, malloc replacements are developed because some target systems don't provide a suitable malloc, or because the caller needs extra functionality. I think zmalloc is a pretty simple wrapper around the system malloc/free that just keeps track of the overall memory allocated; no automatic free is involved. The post you pointed to also explains the need: the database can be configured not to use more than a certain amount of memory, and thus needs to keep track of its overall consumption.
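The usual technique for such a wrapper (and roughly what redis's zmalloc does, though the real code differs in detail) is to stash the requested size in a hidden header in front of the pointer handed back to the caller:

#include <stdlib.h>

static size_t used_memory = 0;       /* running total of live allocations */

void *zmalloc_sketch(size_t size)
{
    size_t *p = malloc(sizeof(size_t) + size);
    if (p == NULL)
        return NULL;
    *p = size;                       /* remember the size in the header */
    used_memory += size;
    return p + 1;                    /* caller sees the bytes after it  */
}

void zfree_sketch(void *ptr)
{
    if (ptr == NULL)
        return;
    size_t *p = (size_t *)ptr - 1;   /* step back to the hidden header  */
    used_memory -= *p;
    free(p);
}

(A production version would pad the header to max_align_t so the returned pointer stays suitably aligned for any type.)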

Heap Header and free() in C

I was wondering if anyone could point me to a resource that gave an in-depth explanation of the heap. I would like to know more about the headers utilized in practice and how the free() function actually "deallocates" memory by deleting the header information. Many resources just give the generic
struct heapHeader
{
    struct heapHeader *next;
    struct heapHeader *previous;
    unsigned int size;
};
and then go on to say that this is never implemented in practice. So, to sum it up, I would like to know more about how heap headers are implemented in "practice" and how functions such as free() interact with said headers.
The C language standard does not define the specifics of a heap. It specifies malloc, calloc, realloc and free in terms of what task they perform, their parameters, and what the programmer can do with the result.
If you ask for implementation details, you are tempted to make assumptions that might bite you later. Unless you have very specific reasons to do so, you shouldn't need to investigate the inner workings. The way malloc and free work may change with the next OS revision, the next compiler version, or even the compile options used.
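That said, the classic textbook scheme behind headers like the one in the question is a free list: malloc carves out blocks and prefixes each with bookkeeping data; free steps back to that header and threads the block onto the list for reuse. A toy sketch (no splitting, coalescing, or alignment handling, unlike any production allocator):

#include <stddef.h>

struct block {
    struct block *next;   /* next free block; meaningful only while free */
    size_t size;          /* payload size recorded at allocation time    */
};

static struct block *free_list = NULL;

/* free() doesn't "delete" anything: it just makes the block findable
   again, so a later allocation of a fitting size can reuse it. */
static void toy_free(void *ptr)
{
    if (ptr == NULL)
        return;
    struct block *b = (struct block *)ptr - 1;  /* back up to the header */
    b->next = free_list;
    free_list = b;
}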
The following are interesting articles; the first is comprehensive and descriptive about heap management, the second a simple implementation example:
[1] http://www.apriorit.com/our-company/dev-blog/209-linux-memory-management
[2] http://lambda.uta.edu/cse5317/notes/node45.html
Hope that helps.

Resources for memory management in embedded application

How should I manage memory in my mission-critical embedded application?
I found some articles with Google, but couldn't pinpoint a really useful, practical guide.
DO-178B forbids dynamic memory allocation, but how do you manage memory then? Preallocate everything in advance and pass a pointer to each function that needs storage? Allocate it on the stack? Use a global static allocator (but then it's very similar to dynamic allocation)?
Answers can take the form of a regular answer, a reference to a resource, or a reference to a good open-source embedded system as an example.
Clarification: the issue here is not whether memory management is available for the embedded system, but what a good design for an embedded system is, to maximize reliability.
I don't understand why statically preallocating a buffer pool and dynamically getting and releasing from it is different from dynamically allocating memory.
As someone who has dealt with embedded systems, though not to such rigor so far (I have read DO-178B, though):
If you look at the u-boot bootloader, a lot is done with a globally placed structure. Depending on your exact application, you may be able to get away with a global structure and stack. Of course, there are re-entrancy and related issues there that don't really apply to a bootloader but might for you.
Preallocate, preallocate, preallocate. If you can bind the size of an array/list/structure at design time, declare it as a global (or static global; look Ma, encapsulation).
The stack is very useful, use it where needed -- but be careful, as it can be easy to keep allocating off of it until you have no stack space left. Some code I once found myself debugging would allocate 1k buffers for string management in multiple functions...occasionally, the usage of the buffers would hit another program's stack space, as the default stack size was 4k.
The buffer pool case may depend on exactly how it's implemented. If you know you need to pass around fixed-size buffers of a size known at compile time, it is likely easier to demonstrate correctness for a buffer pool than for a complete dynamic allocator. You just need to verify that buffers cannot be lost and that your handling won't fail. There seem to be some good tips here: http://www.cotsjournalonline.com/articles/view/101217
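A minimal sketch of such a pool, with everything sized at compile time (hypothetical names; a real implementation would add locking and guards against double release):

#include <stddef.h>

#define POOL_BUFS 16
#define BUF_SIZE  256

static unsigned char pool[POOL_BUFS][BUF_SIZE];   /* all storage is static */
static unsigned char *free_slots[POOL_BUFS];
static size_t free_count;

void pool_init(void)
{
    for (size_t i = 0; i < POOL_BUFS; i++)
        free_slots[i] = pool[i];
    free_count = POOL_BUFS;
}

unsigned char *pool_get(void)          /* NULL when exhausted: bounded, testable */
{
    return free_count ? free_slots[--free_count] : NULL;
}

void pool_release(unsigned char *buf)  /* caller must only return pool buffers */
{
    free_slots[free_count++] = buf;
}

Because the pool never grows or shrinks, the memory footprint is identical on every run, which is precisely the determinism the certification answers below are after.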
Really, though, I think your answers might be found in joining http://www.do178site.com/
I've worked in a DO-178B environment (systems for airplanes). What I have understood is that the main reason for not allowing dynamic allocation is certification. Certification is done through tests (unit, coverage, integration, ...). With those tests you have to prove that the behavior of your program is 100% predictable, nearly to the point that the memory footprint of your process is the same from one execution to the next. As dynamic allocation is done on the heap (and can fail), you cannot easily prove that (I imagine it would be possible if you mastered all the tools from the hardware to every piece of code written, but ...). You don't have this problem with static allocation. That is also why C++ was not used at the time in such environments (it was about 15 years ago; that might have changed ...).
Practically, you have to write a lot of struct pools and allocation functions that guarantee deterministic behavior. You can imagine a lot of solutions. The key is that you have to prove (with TONS of tests) a high level of determinism. It's easier to prove that your hand-crafted development works deterministically than to prove that linux + gcc allocates memory deterministically.
Just my 2 cents. It was a long time ago and things might have changed, but concerning certification like DO-178B, the point is to prove your app will work the same every time in any context.
Disclaimer: I've not worked specifically with DO-178b, but I have written software for certified systems.
On the certified systems for which I have been a developer, ...
Dynamic memory allocation was acceptable ONLY during the initialization phase.
Dynamic memory de-allocation was NEVER acceptable.
This left us with the following options ...
Use statically allocated structures.
Create a pool of structures and then get/release them from/back to the pool.
For flexibility, we could dynamically allocate the size of the pools or number of structures during the initialization phase. However, once past that init phase, we were stuck with what we had.
Our company found that pools of structures and then get/releasing from/back into the pool was most useful. We were able to keep to the model, and keep things deterministic with minimal problems.
Hope that helps.
Real-time, long-running, mission-critical systems should not dynamically allocate and free memory from a heap. If you need it and cannot design around it, then write your own fixed-pool allocation and management scheme. Yes, allocate fixed pools ahead of time whenever possible. Anything else is asking for eventual trouble.
Allocating everything from stack is commonly done in embedded systems or elsewhere where the possibility of an allocation failing is unacceptable. I don't know what DO-178b is, but if the problem is that malloc is not available on your platform, you can also implement it yourself (implementing your own heap), but this still may lead to an allocation failing when you run out of space, of course.
There's no way to be 100% sure.
You may look at FreeRTOS' memory allocator examples. Those use a static pool, if I'm not mistaken.
You might find this question interesting as well; dynamic allocation is often prohibited in space-hardened settings (actually, core memory is still useful there).
Typically, when malloc() is not available, I just use the stack. As Tronic said, the whole reason behind not using malloc() is that it can fail. If you are using a global static pool, it is conceivable that your internal malloc() implementation could be made fail-proof.
It really, really, really depends on the task at hand and what the board is going to be exposed to.
