Secure way to realloc - c

I'm writing a C library which often needs to move various sensitive data around. I want the benefits of realloc (extending the allocated block instead of copying when memory is available) while having some way to erase the contents of the old block if copying is necessary.
Is there some lightweight implementation of malloc/realloc/free that could be used with mingw-gcc, or some other trick to achieve this, or must I over-allocate and just allocate-and-copy without relying on realloc?

On Linux, mmap the block, mlock it, and then do mremap instead of using realloc.
Protecting against hidden copies isn't enough. You also need to make sure the memory never ever gets swapped to disk before you get a chance to zero it.
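For illustration, here is a minimal Linux-only sketch of that combination, with made-up function names and abbreviated error handling (mlock keeps the pages out of swap; mremap with MREMAP_MAYMOVE lets the kernel remap the same physical pages instead of copying them through a new plaintext buffer). Note that RLIMIT_MEMLOCK caps how much memory you can lock this way:

    /* Hypothetical helpers for the mmap + mlock + mremap approach. */
    #define _GNU_SOURCE
    #include <stddef.h>
    #include <sys/mman.h>

    void *secure_alloc(size_t len)
    {
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return NULL;
        if (mlock(p, len) != 0) {      /* keep pages resident, never swapped */
            munmap(p, len);
            return NULL;
        }
        return p;
    }

    void *secure_resize(void *p, size_t old_len, size_t new_len)
    {
        /* Grows in place when possible; otherwise the kernel remaps the
         * existing pages at a new address, and the mlock is preserved. */
        void *q = mremap(p, old_len, new_len, MREMAP_MAYMOVE);
        return q == MAP_FAILED ? NULL : q;
    }

Freeing would be the reverse: wipe the block (e.g. with explicit_bzero where available), munlock it, then munmap it.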

Related

GNU C: What will happen if I override malloc() and free() but not realloc()?

I am coding for an embedded system using the ARM cross toolchain arm-none-eabi-gcc. The code runs FreeRTOS, which has its own heap memory management, so I want to override malloc(), free() and realloc() in libc and wrap them to simply call the FreeRTOS functions. There is one problem: FreeRTOS does not have realloc() (which is strange), but my code definitely needs it. So I want to understand: what will happen if I only override malloc() and free() but keep the libc version of realloc()? Also, providing my own realloc() that just calls malloc() with the new size and does a memcpy() after the new block is allocated doesn't look safe enough to me, because in my application the new size is usually larger than the old size, so a memcpy() with a size larger than the actually allocated source block could cause an out-of-bounds access. Is that possible?
Thanks in advance.
-woody
Partially replacing the allocator (replacing some functions but not others) can't work. At worst, you will get serious heap data structure corruption from one implementation interpreting another's data structures as its own. It's possible to harden against this so that things just fail to link or fail to allocate (provide null pointer results) at runtime if this is done, and I did this in musl libc as described in these commits:
https://git.musl-libc.org/cgit/musl/commit/src/malloc?id=c9f415d7ea2dace5bf77f6518b6afc36bb7a5732
https://git.musl-libc.org/cgit/musl/commit/src/malloc?id=618b18c78e33acfe54a4434e91aa57b8e171df89
https://git.musl-libc.org/cgit/musl/commit/src/malloc?id=b4b1e10364c8737a632be61582e05a8d3acf5690
but I doubt many other implementations take the same precautions. And they won't make what you want actually work; they'd just prevent catastrophic results.
If you really need realloc, you're going to have to make a working one for the implementation you're adopting. The easiest way to do that is to make it just malloc, memcpy, and free, but indeed you need a way to determine the length argument to pass to memcpy. If you just pass the new length, it might be safe on a microcontroller without an MMU, as long as your lengths aren't so large that they risk running over into an MMIO range or something. But the right thing to do is to read the malloc implementation enough to understand where it stores the allocated size, and write your own code to extract that. At that point you can write a correct/valid realloc using memcpy.
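As a sketch of that last approach, assuming you have written an allocation_size() helper (the name is made up here) that recovers the stored block size from your allocator's header, for example the block header that FreeRTOS's heap_4.c places just before the returned pointer:

    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical: returns the usable size recorded by the allocator
     * for the block behind p. You must implement this by reading your
     * malloc implementation's block header. */
    size_t allocation_size(void *p);

    void *my_realloc(void *old, size_t new_size)
    {
        if (old == NULL)
            return malloc(new_size);
        if (new_size == 0) {
            free(old);
            return NULL;
        }
        void *new_block = malloc(new_size);
        if (new_block == NULL)
            return NULL;                 /* old block is left intact */
        size_t old_size = allocation_size(old);
        /* Copy only as many bytes as are valid in both blocks. */
        memcpy(new_block, old, old_size < new_size ? old_size : new_size);
        free(old);
        return new_block;
    }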

Memory allocation that resizes a buffer ONLY if it can grow in place?

After reading the man-page for realloc(), I came to the realization that it works a little differently than I thought it did. I originally thought that realloc() would attempt to resize a buffer, previously allocated with one of the malloc-family functions, and if it could NOT extend the buffer in place, then it would fail. However, the man-page states:
The realloc() function returns a pointer to the newly allocated memory, which is suitably aligned for any built-in type and may be different from ptr, or NULL if the request fails.
The "may be different from ptr" part is what I'm talking about.
Basically, what I want is a function, similar to realloc(), but which fails if it cannot extend the buffer in place. It seems that there is no function in the standard C library that does this; however, I'm assuming there may be some OS-specific functions that accomplish the same thing.
Could someone tell me what functions are out there that do what I described above, and which OS's they are specific to? Preferably, I'd like to know at least the functions specific to Linux and Windows (and Mac OS would be a nice bonus too :) ).
This may be a duplicate of this post, but I don't think it is for the following reasons:
The question in the post I linked to simply asks, is there a function that extends a buffer in place, whereas, I'm asking, which functions extend a buffer in place.
The accepted answer for that post does not contain the information I need.
EDIT
Some people were wondering what is the use case I need this for, so I'll explain, below:
I'm writing a C preprocessor (yes, I know... don't reinvent the wheel... well, I'm doing it anyway, so there). One component of the C preprocessor is a cache for storing pp-tokens which come from various source files, where each source file's set of pp-tokens may be fragmented within the cache. The cache itself is a linked list of large chunks of memory. Ideally, I'd like to keep this linked list short, which is why I'd like to first try resizing the buffer (in place); however, if resizing in place is not possible, then I want to just add another node (i.e. chunk of memory) to the linked list.
Within each cache buffer, there are additional linked-list nodes, which provide a means for iterating through all the pp-tokens of each individual source file, which may be fragmented across the various cache buffers that make up the cache.
The reasons I need the kind of memory reallocation I discussed earlier are the following:
If resizing a cache buffer could not be done in place, and a new buffer had to be allocated and the old memory contents copied, then I'd have a lot of dangling pointers. Jonathan Leffler suggested that I instead store offsets within the buffer, rather than pointers, which I had not even thought about, and is a great idea! However, reason #2...
I want the implementation of the cache to be as fast as possible, and, please correct me if I'm wrong, but it seems to me that (for my use case) it would be faster on average to just add a new cache buffer to the linked list if a given cache buffer could not be resized in place, rather than allocating a new buffer and copying all previous contents and freeing the old buffer. As a sidenote, I am planning on doubling the size of the allocated cache buffer each time cache resizing is needed.
Memory management (in the form of malloc and friends) is generally implemented as a library; it is not part of the Operating System. (An implementation of the library will probably need to use some OS facilities to acquire raw memory -- although that's not a given -- but there is no need to involve the OS for allocating and freeing individual allocations.) So you're not going to find an "OS-specific" solution.
There are a number of different memory allocation libraries available. If you decide to use an alternative to the one preinstalled with your particular distribution, you will probably want to arrange for it to be used by the standard library as well. Details for how to do that vary.
Most allocation libraries do include some additional interfaces, but I don't know of any library which offers the function you're looking for. More common is an API for finding out how much memory is actually in an allocation (which is often more than the amount requested by the malloc). For many libraries, realloc will only expand the allocation in place if it was already big enough, but there may be libraries which are willing to merge a following free block in order to make non-copying realloc possible.
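As an illustration of that kind of interface, here is a sketch using glibc's malloc_usable_size(); note that it only reports slack already present in the allocation, it will not try to merge an adjacent free block:

    #include <malloc.h>   /* glibc: malloc_usable_size() */
    #include <stddef.h>

    /* Returns nonzero if the allocation behind p can already hold
     * new_size bytes without moving. If it returns zero, the caller
     * would add a new chunk to its linked list instead. */
    int fits_in_place(void *p, size_t new_size)
    {
        return malloc_usable_size(p) >= new_size;
    }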
There's a list of some commonly-used libraries in the Wikipedia page on dynamic memory allocation, which also has a good overview of implementation techniques.
And, of course, you could always write your own memory manager (or modify an open source library) to implement that feature. However, while that would be an interesting and satisfying project, I'd strongly suggest you think about (and research) the reasons why this seemingly simple idea has not been implemented in common memory management libraries. There are good reasons.

When is it more appropriate to use valloc() as opposed to malloc()?

C (and C++) include a family of dynamic memory allocation functions, most of which are intuitively named and easy to explain to a programmer with a basic understanding of memory. malloc() simply allocates memory, while calloc() allocates some memory and clears it eagerly. There are also realloc() and free(), which are pretty self-explanatory.
The manpage for malloc() also mentions valloc(), which allocates (size) bytes aligned to the page border.
Unfortunately, my background isn't thorough enough in low-level intricacies; what are the implications of allocating and using page border-aligned memory, and when is this appropriate as opposed to regular malloc() or calloc()?
The manpage for valloc contains an important note:
The function valloc() appeared in 3.0BSD. It is documented as being obsolete in 4.3BSD, and as legacy in SUSv2. It does not appear in POSIX.1-2001.
valloc is obsolete and nonstandard - to answer your question, it would never be appropriate to use in new code.
While there are some reasons to want to allocate aligned memory - this question lists a few good ones - it is usually better to let the memory allocator figure out which bit of memory to give you. If you are certain that you need your freshly-allocated memory aligned to something, use aligned_alloc (C11) or posix_memalign (POSIX) instead.
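A minimal sketch of that replacement, assuming a POSIX system (page_alloc is a made-up name): query the page size at runtime and ask posix_memalign for that alignment.

    #include <stdlib.h>
    #include <unistd.h>

    void *page_alloc(size_t size)
    {
        long page = sysconf(_SC_PAGESIZE);
        void *p = NULL;
        if (page <= 0 || posix_memalign(&p, (size_t)page, size) != 0)
            return NULL;
        return p;   /* release with free() */
    }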
Allocations with page alignment usually are not done for speed - they're because you want to take advantage of some feature of your processor's MMU, which typically works with page granularity.
One example is if you want to use mprotect(2) to change the access rights on that memory. Suppose, for instance, that you want to store some data in a chunk of memory, and then make it read only, so that any buggy part of your program that tries to write there will trigger a segfault. Since mprotect(2) can only change permissions page by page (since this is what the underlying CPU hardware can enforce), the block where you store your data had better be page aligned, and its size had better be a multiple of the page size. Otherwise the area you set read-only might include other, unrelated data that still needs to be written.
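A rough sketch of that read-only trick, assuming a POSIX system (the function name is illustrative; the size is rounded up to whole pages because mprotect works only on page-aligned ranges):

    #define _DEFAULT_SOURCE          /* MAP_ANONYMOUS on glibc */
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    char *make_read_only_copy(const char *data, size_t len)
    {
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        size_t size = ((len + page - 1) / page) * page;

        char *buf = mmap(NULL, size, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED)
            return NULL;
        memcpy(buf, data, len);

        /* From here on, any stray write to buf triggers a segfault. */
        if (mprotect(buf, size, PROT_READ) != 0) {
            munmap(buf, size);
            return NULL;
        }
        return buf;   /* munmap(buf, size) when done */
    }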
Or, perhaps you are going to generate some executable code in memory and then want to execute it later. Memory you allocate by default probably isn't set to allow code execution, so you'll have to use mprotect to give it execute permission. Again, this has to be done with page granularity.
Another example is if you want to allocate memory now, but might want to mmap something on top of it later.
So in general, a need for page-aligned memory would relate to some fairly low-level application, often involving something system-specific. If you needed it, you'd know. (And as mentioned, you should allocate it not with valloc, but using posix_memalign, or perhaps an anonymous mmap.)
First of all, valloc is obsolete; memalign should be used instead.
Second, it's not part of the C (or C++) standard at all.
It's a special allocation which is aligned to a _SC_PAGESIZE boundary.
When is it useful? I guess never, unless you have some specific low-level requirement. If you needed it, you would know, since it's rarely useful (maybe just when trying some micro-optimizations or creating shared memory between processes).
The self-evident answer is that it is appropriate to use valloc when malloc is unsuitable (less efficient) for the application (virtual) memory usage pattern and valloc is better suited (more efficient). This will depend on the OS and libraries and architecture and application...
malloc traditionally allocated real memory from freed memory if available and by increasing the brk point if not, in which case it is cleared by the OS for security reasons.
calloc in a dumb implementation does a malloc and then (re)clears the memory, while a smart implementation would avoid reclearing newly allocated memory that is automatically cleared by the operating system.
valloc relates to virtual memory. In a virtual memory system backed by the file system, you can allocate a large amount of memory or file/swap space, even more than physical memory, and it will be paged in on demand, so alignment is a factor. In Unix, creating a file of a specified size and adding/deleting pages is done using inodes to define the file, but actual disk blocks aren't touched until needed, at which point they are created cleared. So I would expect a valloc implementation to increase the size of the data segment's backing store without actually allocating physical or swap pages, or running a loop to clear it all - the file and paging system does that as needed. Thus valloc should be a heck of a lot faster than malloc. But as with calloc, how particular idiosyncratic *nix/C flavours do it is up to them, and the valloc man page is totally unhelpful about these expectations.
Traditionally this was implemented with brk/sbrk. Of course in a virtual memory system, whether a paged or a segmented system, there is no real need for any of this brk/sbrk stuff and it is enough to simply write the last location in a file or address space to extend up to that point.
Re the allocation to page boundaries, that is not usually something the user wants or needs, but rather is usually something the system wants or needs.
A (probably more expensive) way to simulate valloc is to determine the page boundary and then call aligned_alloc or posix_memalign with this alignment spec.
The fact that valloc is deprecated or has been removed or is not required in some OS' doesn't mean that it isn't still useful and required for best efficiency in others. If it has been deprecated or removed, one would hope that there are replacements that are as efficient (but I wouldn't bet on it, and might, indeed have, written my own malloc replacement).
Over the last 40 years the tradeoffs of real and (once invented) virtual memory have changed periodically, and mainstream OS has tended to go for frills rather than efficiency, with programmers who don't have (time or space) efficiency as a major imperative. In the embedded systems, efficiency is more critical, but even there efficiency is often not well supported by the standard OS and/or tools. But when in doubt, you can roll your own malloc replacement for your application that does what you need, rather than depend on what someone else woke up and decided to do/implement, or to undo/deprecate.
So the real answer is you don't necessarily want to use valloc or malloc or calloc or any of the replacements your current subversion of an OS provides.

Do we really have to free() when we malloc()? What makes it different from an automatic variable then?

The OS will just recover it (after the program exits) right? So what's the use other than good programming style? Or is there something I'm misunderstanding? What makes it different from "automatic" allocation since both can be changed during run time, and both end after program execution?
When your application is working with vast amounts of data, you must free in order to conserve heap space. If you don't, several bad things can happen:
the OS will stop allocating memory for you (crashing)
the OS will start swapping your data to disk (thrashing)
other applications will have less space to put their data
The fact that the OS collects all the space you allocate when the application exits does not mean you should rely upon this to write a solid application. This would be like trying to rely on the compiler to optimize poor programming. Memory management is crucial for good performance, scalability, and reliability.
As others have mentioned, malloc allocates space in the heap, while auto variables are created on the stack. There are uses for both, but they are indeed very different. Heap space must be allocated and managed by the OS and can store data dynamically and of different sizes.
If you call malloc() a thousand times without using free(), the system will hand you a thousand different addresses; but if you use free() after each malloc, the same memory address can be given back to you each time.
So the chances of memory leaks, bus errors, out-of-bounds accesses and crashes are kept to a minimum.
It's safe (and good practice) to use free().
In C/C++ "auto" variables are allocated on the stack. They are destroyed right at the exit from the function. This will happen automatically. You do not need to write anything for this.
Heap allocations (result of a call to malloc) are either released explicitly (with a call to free) or they are cleaned up when the process ends.
If you are writing a small program that will be used maybe once or twice, then it is OK not to free your heap allocations. This is not nice, but acceptable.
If you are writing a medium or big project, or are planning to include your code in another project, you should definitely release every heap allocation. Not doing this will create HUGE trouble. The heap memory is not endless; a program may use it all. Even if you allocate only a small amount of memory, a leak still creates unneeded pressure on the OS, causes swapping, etc.
The bottom line: freeing allocations is much more than just a style or a good habit.
An automatic variable is destroyed (and its memory is re-usable) as soon as you exit the scope in which it is defined. For most variables that's much earlier than program exit.
If you malloc and don't free, then the memory isn't re-usable until the program exits. Not even then, on some systems with very minimal OS.
So yes, there's a big difference between an automatic variable and a leaked memory allocation. Call a function that leaks an allocation enough times, and you'll run out of memory. Call a function with an automatic variable in it as many times as you like; the memory is re-usable.
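A tiny illustration of the difference: loop over leaky() long enough and the process eventually runs out of memory; loop over fine() forever and it never does.

    #include <stdlib.h>

    void leaky(void)
    {
        char *buf = malloc(4096);   /* never freed: ~4 KB lost per call */
        (void)buf;                  /* ... use buf ... */
    }                               /* pointer gone, memory still allocated */

    void fine(void)
    {
        char buf[4096];             /* automatic: released on return */
        (void)buf;                  /* ... use buf ... */
    }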
It is good programming style and it's more than that. Not doing proper memory management in non-trivial programs will eventually influence the usability of your program. Sure the OS can reclaim any resources that you've allocated/used after your program terminates, but that doesn't alleviate the burden or potential issues during program execution.
Consider the web browser that you've used to post this question: if the browser is written in a language that requires memory management and the code didn't do it properly, how long do you think it would be before you'd notice that it's eating up all your memory? How long do you think the browser would remain usable? Now consider that users often leave browsers open for long periods of time: without proper memory management, they would become unusable after a few page loads.
If your program does not exit immediately and you're not freeing your memory you're going to end up wasting it. Either you'll run out of memory eventually, or you'll start swapping to disk (which is slow, and also not unlimited).
An automatic variable lives on the stack, and its size must be known at compile time. If you need to store data whose size you don't know in advance - for example, a binary tree where the user adds and removes objects - you need the heap. Besides that, stack size may be limited (it depends on your target); in the Linux kernel, for example, the stack is usually 4 KB-8 KB. Large stack frames can also thrash the cache, which affects performance.
Yes, you absolutely have to use free() after malloc() (as well as closing files and other resources when you're done). While it's true that the OS will recover it after execution, a long-running process will leak memory that way. If your program is as simple as a main method that runs a single function and then exits, it's probably not a big deal, albeit incredibly sloppy. You should get in the habit of managing memory properly in C, because one day you may want to write a nontrivial program that runs for more than a second, and if you don't learn how to do it in advance, you'll have a huge headache dealing with memory leaks.

What if I allocate memory using mmap instead of malloc?

What are the disadvantages of allocating memory using mmap (with MAP_PRIVATE and MAP_ANONYMOUS) than using malloc? For data in function scope, I would use stack memory anyway and therefore not malloc.
One disadvantage that comes to mind is with dynamic data structures such as trees and linked lists, where you frequently need to allocate and deallocate small chunks of data. Using mmap there would be expensive for two reasons: allocation happens at a granularity of 4096 bytes, and each allocation requires a system call.
But in other scenarios, do you think malloc is better than mmap? Secondly, am I overestimating disadvantage of mmap for dynamic data structures?
One advantage of mmap over malloc I can think of is that memory is immediately returned to the OS when you do munmap, whereas with malloc/free, I guess memory up to the data segment break point is never returned but kept for reuse.
Yes, malloc is better than mmap. It's much easier to use, much more fine-grained and much more portable. In the end, it will call mmap anyway.
If you start doing everyday memory management with mmap, you'll want to implement some way of parceling it out in smaller chunks than pages and you will end up reimplementing malloc -- in a suboptimal way, probably.
First off, mmap() is a platform specific construct, so if you plan on writing portable C, it's already out.
Second, malloc() is essentially implemented in terms of mmap(), but it's a sort of intelligent library wrapper around the system call: it will request new memory from the system when needed, but until then it will pick a piece of memory in an area that's already committed to the process.
Therefore, if you want to do ordinary dynamic memory allocation, use malloc(), end of story. Use of mmap() for memory allocation should be reserved for special situations (e.g. if you actually want a whole page for yourself, aligned at the page boundary), and always abstracted into a single piece of library code so that others may easily understand what you're doing.
One feature that mmap has and malloc doesn't: mmap allows you to allocate using huge pages (the flags argument has MAP_HUGETLB set), while malloc doesn't have that option.
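For reference, a minimal sketch of allocating directly with mmap (POSIX; MAP_HUGETLB is Linux-specific, needs huge pages configured on the system, and requires the length to be a multiple of the huge page size):

    #define _DEFAULT_SOURCE          /* MAP_ANONYMOUS on glibc */
    #include <stddef.h>
    #include <sys/mman.h>

    /* The kernel rounds the length up to whole pages, hands back
     * zero-filled memory, and returns it to the OS on munmap(). */
    void *big_alloc(size_t len)
    {
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS /* | MAP_HUGETLB */,
                       -1, 0);
        return p == MAP_FAILED ? NULL : p;
    }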
