Does realloc keep the memory alignment of posix_memalign? - c

Aligned malloc is posix_memalign, that's OK, but what about the aligned realloc? Does realloc retain the alignment or how to assure that reallocated memory has the same alignment? Assume Linux and x86_64.

No, realloc on the memory returned from posix_memalign is not guaranteed by either ISO or POSIX to maintain the same alignment. A realloc may simply expand the current block at the same address but it may also move the block to a different address whose alignment is less strict than the original.
If you want the same alignment, it's probably best to allocate another block and copy the data over.
There is, unfortunately, no posix_memalign_realloc function in the Single UNIX Specification either.
If you don't want to go through the hassle of copying data every time, you could try the realloc (a) and, if the alignment of that was not as expected, then and only then call posix_memalign to get a correctly aligned address and copy the data in to there, freeing the old address when done.
This may result in:
zero copies (if the current block can be expanded in-place);
one copy (if realloc copies but happens to give you a correctly aligned block); or
two copies (if realloc copies and then you also have to copy due to misalignment).
It may also result in less copying than indicated depending on the underlying memory management implementation. For example, a "copy" may simply involve remapping memory blocks rather than physically moving the data.
So you may want to keep some statistics to see if this scheme is worthwhile.
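A minimal sketch of the "realloc first, fall back to posix_memalign" scheme described above. The name aligned_realloc and its return codes are made up for illustration, and the code assumes (as glibc currently allows, but no standard promises) that a posix_memalign pointer may be passed to realloc:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not a standard function.
 * Resizes *block to new_size bytes while keeping `alignment`.
 * Returns  0 on success,
 *         -1 if realloc failed (*block is untouched and still valid),
 *         -2 if the block was resized but alignment could not be restored
 *            (*block is valid and holds the data, but may be misaligned). */
int aligned_realloc(void **block, size_t new_size, size_t alignment)
{
    assert(block != NULL && new_size > 0);
    /* posix_memalign wants a power of two that is a multiple of sizeof(void *) */
    assert(alignment >= sizeof(void *) && (alignment & (alignment - 1)) == 0);

    void *p = realloc(*block, new_size);
    if (p == NULL)
        return -1;

    if (((uintptr_t)p & (alignment - 1)) != 0) {
        /* realloc moved the data to a less aligned address: copy once more */
        void *q = NULL;
        if (posix_memalign(&q, alignment, new_size) != 0) {
            *block = p;
            return -2;
        }
        memcpy(q, p, new_size);
        free(p);
        p = q;
    }
    *block = p;
    return 0;
}
```

Counting how often the misaligned branch is taken would give you the statistics mentioned above.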
(a) Just keep in mind that neither POSIX nor Linux man pages specify whether or not you even can pass these pointers to realloc, only that you can pass them to free.
However, based on the current GNU libc source code, it appears to work, although that's no guarantee it will continue to work in future :-)
My fear was that it would allocate memory normally (standard alignment) and pass back an offset address (ie, not the actual address allocated, but one N bytes beyond that) which free was intelligent enough to turn back into the actual address before weaving its magic.
One way of doing that would be to store the actual address immediately before the returned address though this of course would lead to wastage even for regular allocations.
In that case, free may have been made intelligent (since the specs say it must be able to handle the allocations done by posix_memalign) but realloc may not have been given the same intelligence (since the docs are silent on that matter).
However, based on GNU glibc 2.14.1, it actually allocates more memory than needed then fiddles with the arena to free up the pre-space and post-space, so that the address returned is a "real" address, usable by free or realloc.
But, as stated, the documentation doesn't guarantee this.

If you look at the glibc source code for realloc, it calls straight into malloc whenever it has to move the block. So the new memory is only aligned in the same way as memory from malloc.

Related

How does free() function know how much bytes to deallocate and how to access that information with in our program? [duplicate]

In C programming, you can pass any kind of pointer you like as an argument to free, how does it know the size of the allocated memory to free? Whenever I pass a pointer to some function, I have to also pass the size (ie an array of 10 elements needs to receive 10 as a parameter to know the size of the array), but I do not have to pass the size to the free function. Why not, and can I use this same technique in my own functions to save me from needing to cart around the extra variable of the array's length?
When you call malloc(), you specify the amount of memory to allocate. The amount of memory actually used is slightly more than this, and includes extra information that records (at least) how big the block is. You can't (reliably) access that other information - and nor should you :-).
When you call free(), it simply looks at the extra information to find out how big the block is.
Most implementations of C memory allocation functions will store accounting information for each block, either in-line or separately.
One typical way (in-line) is to actually allocate both a header and the memory you asked for, padded out to some minimum size. So for example, if you asked for 20 bytes, the system may allocate a 48-byte block:
16-byte header containing size, special marker, checksum, pointers to next/previous block and so on.
32 bytes data area (your 20 bytes padded out to a multiple of 16).
The address then given to you is the address of the data area. Then, when you free the block, free will simply take the address you give it and, assuming you haven't stuffed up that address or the memory around it, check the accounting information immediately before it. Graphically, that would be along the lines of:
    ____ The allocated block ____
   /                             \
  +--------+--------------------+
  | Header | Your data area ... |
  +--------+--------------------+
           ^
           |
           +-- The address you are given
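To make the picture concrete, here is a toy wrapper around malloc that keeps its own header in front of the data area. It is only a sketch: toy_header, toy_alloc, toy_size and toy_free are invented names, and a real allocator's header looks nothing like this one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative header only; real allocators use their own layouts. */
typedef union {
    size_t size;         /* number of bytes the caller asked for */
    max_align_t align;   /* keeps the data area suitably aligned */
} toy_header;

void *toy_alloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(toy_header))
        return NULL;                      /* header + data would overflow */
    toy_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;                         /* address just past the header */
}

size_t toy_size(const void *p)
{
    assert(p != NULL);
    return ((const toy_header *)p - 1)->size;   /* step back to the header */
}

void toy_free(void *p)
{
    if (p != NULL)
        free((toy_header *)p - 1);        /* release the whole block */
}
```

The caller never sees the header, which is exactly why free needs nothing but the pointer.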
Keep in mind the size of the header and the padding are totally implementation defined (actually, the entire thing is implementation-defined (a) but the in-line accounting option is a common one).
The checksums and special markers that exist in the accounting information are often the cause of errors like "Memory arena corrupted" or "Double free" if you overwrite them or free them twice.
The padding (to make allocation more efficient) is why you can sometimes write a little bit beyond the end of your requested space without causing problems (still, don't do that, it's undefined behaviour and, just because it works sometimes, doesn't mean it's okay to do it).
(a) I've written implementations of malloc in embedded systems where you got 128 bytes no matter what you asked for (that was the size of the largest structure in the system), assuming you asked for 128 bytes or less (requests for more would be met with a NULL return value). A very simple bit-mask (i.e., not in-line) was used to decide whether a 128-byte chunk was allocated or not.
Others I've developed had different pools for 16-byte chunks, 64-bytes chunks, 256-byte chunks and 1K chunks, again using a bit-mask to decide what blocks were used or available.
Both these options managed to reduce the overhead of the accounting information and to increase the speed of malloc and free (no need to coalesce adjacent blocks when freeing), particularly important in the environment we were working in.
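A rough sketch of the fixed-size scheme from the footnote, assuming a pool of 32 chunks of 128 bytes tracked by one 32-bit mask. The names and the pool size are made up for illustration and are not taken from any particular embedded system:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE  128u
#define CHUNK_COUNT 32u            /* one bit per chunk in a 32-bit mask */

static _Alignas(max_align_t) unsigned char pool[CHUNK_COUNT][CHUNK_SIZE];
static uint32_t used_mask;         /* bit i set means pool[i] is allocated */

void *pool_alloc(size_t size)
{
    if (size > CHUNK_SIZE)
        return NULL;               /* larger requests are refused outright */
    for (unsigned i = 0; i < CHUNK_COUNT; i++) {
        if ((used_mask & (UINT32_C(1) << i)) == 0) {
            used_mask |= UINT32_C(1) << i;
            return pool[i];        /* always a whole 128-byte chunk */
        }
    }
    return NULL;                   /* pool exhausted */
}

void pool_free(void *p)
{
    if (p == NULL)
        return;
    ptrdiff_t offset = (unsigned char *)p - (unsigned char *)pool;
    assert(offset >= 0 && (size_t)offset < sizeof pool);
    assert((size_t)offset % CHUNK_SIZE == 0);
    /* no header to read and no neighbours to coalesce: just clear the bit */
    used_mask &= ~(UINT32_C(1) << ((size_t)offset / CHUNK_SIZE));
}
```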
From the comp.lang.c FAQ list: How does free know how many bytes to free?
The malloc/free implementation remembers the size of each block as it is allocated, so it is not necessary to remind it of the size when freeing. (Typically, the size is stored adjacent to the allocated block, which is why things usually break badly if the bounds of the allocated block are even slightly overstepped)
This answer is relocated from How does free() know how much memory to deallocate? where I was abruptly prevented from answering by an apparent duplicate question. This answer then should be relevant to this duplicate:
For the case of malloc, the heap allocator stores a mapping of the original returned pointer, to relevant details needed for freeing the memory later. This typically involves storing the size of the memory region in whatever form relevant to the allocator in use, for example raw size, or a node in a binary tree used to track allocations, or a count of memory "units" in use.
free will not fail if you "rename" the pointer, or duplicate it in any way. It is not however reference counted, and only the first free will be correct. Additional frees are "double free" errors.
Attempting to free any pointer with a value different to those returned by previous mallocs, and as yet unfreed is an error. It is not possible to partially free memory regions returned from malloc.
On a related note GLib library has memory allocation functions which do not save implicit size - and then you just pass the size parameter to free. This can eliminate part of the overhead.
The heap manager stored the amount of memory belonging to the allocated block somewhere when you called malloc.
I never implemented one myself, but I guess the memory right in front of the allocated block might contain the meta information.
The original technique was to allocate a slightly larger block and store the size at the beginning, then give the application the rest of the block. The extra space holds a size and possibly links to thread the free blocks together for reuse.
There are certain issues with those tricks, however, such as poor cache and memory management behavior. Using memory right in the block tends to page things in unnecessarily and it also creates dirty pages which complicate sharing and copy-on-write.
So a more advanced technique is to keep a separate directory. Exotic approaches have also been developed where areas of memory use the same power-of-two sizes.
In general, the answer is: a separate data structure is allocated to keep state.
malloc() and free() are system/compiler dependent so it's hard to give a specific answer.
More information on this other question.
To answer the second half of your question: yes, you can, and a fairly common pattern in C is the following:
typedef struct {
    size_t numElements;
    int elements[1]; /* but enough space malloced for numElements at runtime */
} IntArray_t;
#define SIZE 10
IntArray_t* myArray = malloc(sizeof(IntArray_t) + SIZE * sizeof(int));
myArray->numElements = SIZE;
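For comparison, the same idea can be sketched with a C99 flexible array member instead of the one-element array. The int_vec type and its helper functions are hypothetical names used only for this example:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    size_t count;     /* travels with the data, so no separate length argument */
    int items[];      /* C99 flexible array member */
} int_vec;

int_vec *int_vec_new(size_t count)
{
    if (count > (SIZE_MAX - sizeof(int_vec)) / sizeof(int))
        return NULL;                           /* size computation would overflow */
    int_vec *v = malloc(sizeof *v + count * sizeof v->items[0]);
    if (v == NULL)
        return NULL;
    v->count = count;
    for (size_t i = 0; i < count; i++)
        v->items[i] = 0;
    return v;
}

/* Takes only the pointer; the length is read from the struct itself. */
long int_vec_sum(const int_vec *v)
{
    assert(v != NULL);
    long total = 0;
    for (size_t i = 0; i < v->count; i++)
        total += v->items[i];
    return total;
}
```

A single free(v) releases the count and the elements together.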
To answer the second question: yes, you could (kind of) use the same technique as malloc()
by simply storing the size of the array in its first cell.
That lets you pass the array around without sending an additional size argument.
When we call malloc, it simply consumes more bytes than were requested. The extra bytes hold information such as a checksum, the size and other bookkeeping data.
When we call free, it goes directly to that additional information, where it finds the address and how much memory is to be freed.

Determining size of array allocated via HeapAlloc()

I'm using WinAPI's HeapAlloc() function for allocating memory, and I want to find out the size of it somewhere else in my code. Do I have to keep track of the sizes myself or is there another way?
HeapAlloc rounds up allocations to the nearest alignment. If you ask for 2 bytes, it will give you at least two bytes, but may give you more. As the documentation says:
If the HeapAlloc function succeeds, it allocates at least the amount of memory requested.
The specific alignment that HeapAlloc uses is not documented, but if I remember correctly, all of the Heap Manager APIs use an 8-byte alignment on 32-bit x86 and a 16-byte alignment on 64-bit x86. This old knowledge base article jibes with my recollection. Of course, because it is not explicitly documented and subject to change in future versions of Windows and/or on different architectures, you should not rely on hard-coded alignment values.
However, if these functions do allocate more memory than requested, the caller is free to use all of that memory. To determine the actual size of the allocation, you call the HeapSize function. Raymond Chen blogged about this some time ago.
So the behavior you're seeing actually makes sense, even though you are going about making the determination in entirely the wrong way. As has been pointed out in the comments already, sizeof doesn't tell you the size of the allocation. You need HeapSize to do that. All sizeof tells you is the size of the element matches[lastMeal] at compile time, which is also 8 bytes instead of 2 for alignment reasons.
As for your edit: best practice is to track this information yourself. Whenever you pass a pointer, pass the size of the allocation along with it. Note that this should be the expected size of the allocation (the 2 bytes that you asked for), not the actual size of the allocation (the 8 bytes that the Heap Manager returned for internal alignment considerations). When you free the memory by calling HeapFree, it knows how big the actual allocation was and will free it as necessary. There is, however, no way for your client code to determine the size of the allocation that you initially requested, which is why you need to track it yourself.

Dynamic Memory Allocation

How malloc() stores metadata?
void* p;
void* q;
p = malloc(sizeof(char));
q = malloc(sizeof(int));
I know that the returned pointer p points to the start of the allocated block of memory. If I then iterate and print
p[-1], p[-2], ... q[-1], q[-2], ...
or p[1], p[2], p[3], p[4], ...
or q[1], q[2], q[3], q[4], ...
I find some values that help malloc() store its data. However, I can't understand
precisely what that metadata means. I only know that some of the values hold the block size or the address of the next free block, but I can't find anything more on the web.
Please, can you give me a detailed explanation of those values?
How this metadata works and is used depends entirely on the memory management in your libc. Here are some useful writeups to get you started:
Malloc - Typically, classic malloc.
DLMalloc - Doug Lea's Malloc.
GC Malloc - Malloc with garbage collection.
TC Malloc - Thread caching malloc.
Each of these has different aims, benefits and possible deficiencies. For example, perhaps you are concerned about possible heap overflow issues and protections. This may lead to one choice. Perhaps you are looking for better fragment management. This might lead to the selection of Doug Lea's malloc. You really need to specify which library you are using or research them all to understand how the metadata is used to maintain bins, coalesce adjacent free regions, etc.
David Hoelzer has an excellent answer, but I wanted to follow it up a touch more:
"Meta Data" for malloc is 100% implementation driven. Here is a quick overview of some implementations I have written or used:
The meta data stores "next" block pointers.
It stores canary values to make sure you haven't written past the end of a block.
There is no meta data at all because all the blocks are the same size and they are placed in a stack to allow O(1) access - A Unit Allocator. In this case, you'd hit memory from an adjacent block.
The data stores next block ptr, prev block ptr, alignment id, block size, and an identifier that tells a memory debugger exactly where the allocation occurred. Very useful for leak tracking.
It hits the previous block's info because the data is stored at the tail of the block instead of the head...
Or mix and match all the above. In any case, reading mem[-1] is a bad idea, and in some cases (thinking embedded systems), could indeed cause a segmentation fault on a mere read if said read happens to address beyond the current memory page and into a forbidden area of memory.
Update as per OP
The 4th scheme I described is one that has quite a bit of information per block - 16 bytes of information not being uncommon as that size won't throw off common alignment schemes. It would contain a 4-byte pointer to the next allocated block, a 4-byte pointer to the previous, an identifier for alignment - a single byte can handle this directly for common sizes of 0-256 byte alignment, but that byte could also represent 256 possible alignment enum values instead, 3 bytes of pad or canary values, and a 4-byte unique identifier to identify where in the code the call was made, though it could be made smaller. This can be a lookup value to a debug table that contains __FILE__, __LINE__, and whatever other info you wish to save with the allocation.
This would be one of the heavier varieties as it has a lot of information that needs to be updated when values are allocated and freed.
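As a rough illustration of that 16-byte layout, here is one way it might be declared. The field names, the canary value and the helper functions are assumptions made for this sketch, and the two links are stored as 32-bit values (as on a 32-bit target) rather than as real pointers:

```c
#include <assert.h>
#include <stdint.h>

#define CANARY_BYTE 0xA5u   /* arbitrary marker value chosen for this sketch */

typedef struct {
    uint32_t next;        /* 4-byte link to the next allocated block */
    uint32_t prev;        /* 4-byte link to the previous allocated block */
    uint8_t  alignment;   /* alignment identifier */
    uint8_t  canary[3];   /* 3 bytes of pad or canary values */
    uint32_t site_id;     /* lookup key into a debug table of file/line records */
} debug_block_header;

static_assert(sizeof(debug_block_header) == 16, "header should occupy 16 bytes");

void header_init(debug_block_header *h, uint8_t alignment, uint32_t site_id)
{
    h->next = 0;
    h->prev = 0;
    h->alignment = alignment;
    for (int i = 0; i < 3; i++)
        h->canary[i] = CANARY_BYTE;
    h->site_id = site_id;
}

/* Returns 1 if the canary bytes are unchanged, 0 if something overwrote them. */
int header_intact(const debug_block_header *h)
{
    for (int i = 0; i < 3; i++)
        if (h->canary[i] != CANARY_BYTE)
            return 0;
    return 1;
}
```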
It is illegal to access p[-1], etc in your example. The results will be undefined and maybe cause memory corruption, or segmentation fault. In general you don't get any control over malloc() at all or information about what it is doing.
That said, some malloc() "replacement" libraries will give you finer grained control or information - you link these into your binary "on top" of the system malloc().
The C standard does not specify how malloc stores its metadata (or even if it will have accessible metadata); therefore, the metadata format and location are implementation-dependent. Portable code must therefore never attempt to access or parse the metadata.
Some malloc implementations provide, as an extension, functions like malloc_usable_size or malloc_size which can tell you the size of allocated blocks. While the presence of these functions is also implementation dependent, they are at least reliable and correct ways to get the information you need if they are present.
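For example, on glibc the extension is malloc_usable_size, declared in <malloc.h>. The sketch below assumes glibc on Linux and will not build on platforms that lack the function; usable_for_request is just a name invented for the example:

```c
#include <assert.h>
#include <malloc.h>   /* glibc-specific: declares malloc_usable_size */
#include <stdlib.h>

/* Reports how many bytes the allocator says are usable for a request
 * of `requested` bytes, or 0 if the allocation itself failed. */
size_t usable_for_request(size_t requested)
{
    void *p = malloc(requested);
    if (p == NULL)
        return 0;
    size_t usable = malloc_usable_size(p);
    assert(usable >= requested);   /* may be larger because of padding */
    free(p);
    return usable;
}
```

The value returned is the padded block size, not the size you originally asked for, so it does not replace tracking the requested length yourself.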

Is the bookkeeping of allocated memory blocks redundant?

When we use malloc() we provide a size in bytes.
When we use free() we provide nothing.
This is because the OS of course knows about it already, it must have stored the information somewhere.
By the way, our software must also remember how many memory blocks it has requested, so that we can (for instance) safely iterate starting from the pointer and going ahead.
So, my question is: isn't this redundant? Can't we simply ask the OS the size of the memory pointed by a given pointer since it knows it? And if not, why not?
"When we use malloc() we provide a size in bytes. When we use free() we provide nothing. This is because the OS of course knows about it already, it must have stored the information somewhere."
Even though it gives you memory and it keeps track of what memory range belongs to your process, the OS doesn't concern itself with the internal details of your memory. malloc stores the size of the allocated chunk in its own place, also reserved inside your process (usually, it's a few bytes before the logical address returned by malloc). free simply reads that reserved information and deallocates automatically.
"By the way, our software must also remember how many memory blocks it has requested, so that we can (for instance) safely iterate starting from the pointer and going ahead. So, my question is: isn't this redundant? Can't we simply ask the OS the size of the memory pointed by a given pointer since it knows it? And if not, why not?"
Given the above, it is redundant to store that information, yes. But you pretty much have to store it, because the way malloc does its book-keeping is an implementation detail.
If you know how your particular implementation works and you want to take that risk for your software, you are free (no pun intended) to do it. If you don't want to base your logic on an implementation detail (and you'd be right not to want to), you'll have to do this redundant book-keeping side-by-side with malloc's own book-keeping.
No, it's not redundant. malloc() manages, in cooperation with free() and a few other functions, a zillion tiny, individually addressed blocks within relatively large blocks which are generally obtained with sbrk(). The OS only knows about the large range(s), and has no clue which tiny blocks within them are in use or not. To add to the differences, sbrk() only lets you move the end of your data segment, not split it into parts to free independently. Though one could allocate memory using sbrk exclusively, you would be unable to free arbitrary chunks for reuse, or coalesce smaller chunks into larger ones, or split chunks without writing a bunch of bookkeeping code for this purpose - which ends up essentially being the same as writing malloc. Additionally, using malloc/free/... allows you to call sbrk only rarely, which is a performance bonus since sbrk is a system call with special overhead.
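To see why moving a single end-of-segment marker is not enough, consider this bump allocator sketch. A static array stands in for the region that would come from sbrk, and arena_top plays the role of the break; all names and sizes are invented for illustration:

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 4096u

static _Alignas(max_align_t) unsigned char arena[ARENA_SIZE];
static size_t arena_top;   /* everything below this offset counts as in use */

void *bump_alloc(size_t size)
{
    const size_t align = _Alignof(max_align_t);
    if (size == 0 || size > ARENA_SIZE)
        return NULL;
    size_t rounded = (size + align - 1) / align * align;
    if (rounded > ARENA_SIZE - arena_top)
        return NULL;                    /* out of space */
    void *p = &arena[arena_top];
    arena_top += rounded;               /* the only state: one moving boundary */
    return p;
}

/* The only "free" this scheme supports: give everything back at once. */
void bump_reset(void)
{
    arena_top = 0;
}
```

There is no way to release one block in the middle without the kind of per-block bookkeeping that malloc adds on top.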
When we use free() we provide nothing.
Not quite true; we provide the pointer that was returned by malloc.
Can't we simply ask the OS the size of the memory pointed by a given pointer since it knows it?
Nope. Pointers are simply addresses; apart from their type, they carry no information about the size of the object they point to. How malloc/calloc/realloc and free keep track of object sizes and allocated vs. free blocks is up to the individual implementation; they may reserve some space immediately before the allocated memory to store the size, they may build an internal map of addresses and sizes, or they may do something else completely.
It would be nice if you could query a pointer for the size of the object it points to; unfortunately, that's simply not a feature of the language.

