Determining size of array allocated via HeapAlloc() - c

I'm using WinAPI's HeapAlloc() function for allocating memory, and I want to find out the size of it somewhere else in my code. Do I have to keep track of the sizes myself or is there another way?

HeapAlloc rounds up allocations to the nearest alignment. If you ask for 2 bytes, it will give you at least two bytes, but may give you more. As the documentation says:
If the HeapAlloc function succeeds, it allocates at least the amount of memory requested.
The specific alignment that HeapAlloc uses is not documented, but if I remember correctly, all of the Heap Manager APIs use an 8-byte alignment on 32-bit x86 and a 16-byte alignment on 64-bit x86. This old knowledge base article jives with my recollection. Of course, because it is not explicitly documented and subject to change in future versions of Windows and/or on different architectures, you should not rely on hard-coded alignment values.
However, if these functions do allocate more memory than request, the caller is free to use all of that memory. To determine the actual size of the allocation, you call the HeapSize function. Raymond Chen blogged about this some time ago.
So the behavior you're seeing actually makes sense, even though you are going about making the determination in entirely the wrong way. As has been pointed out in the comments already, sizeof doesn't tell you the size of the allocation. You need HeapSize to do that. All sizeof tells you is the size of the element matches[lastMeal] at compile time, which is also 8 bytes instead of 2 for alignment reasons.
As for your edit: best practice is to track this information yourself. Whenever you pass a pointer, pass the size of the allocation along with it. Note that this should be the expected size of the allocation (the 2 bytes that you asked for), not the actual size of the allocation (the 8 bytes that the Heap Manager returned for internal alignment considerations). When you free the memory by calling HeapFree, it knows how big the actual allocation was and will free it as necessary. There is, however, no way for your client code to determine the size of the allocation that you initially requested, which is why you need to track it yourself.

Related

How does free() function know how much bytes to deallocate and how to access that information with in our program? [duplicate]

In C programming, you can pass any kind of pointer you like as an argument to free, how does it know the size of the allocated memory to free? Whenever I pass a pointer to some function, I have to also pass the size (ie an array of 10 elements needs to receive 10 as a parameter to know the size of the array), but I do not have to pass the size to the free function. Why not, and can I use this same technique in my own functions to save me from needing to cart around the extra variable of the array's length?
When you call malloc(), you specify the amount of memory to allocate. The amount of memory actually used is slightly more than this, and includes extra information that records (at least) how big the block is. You can't (reliably) access that other information - and nor should you :-).
When you call free(), it simply looks at the extra information to find out how big the block is.
Most implementations of C memory allocation functions will store accounting information for each block, either in-line or separately.
One typical way (in-line) is to actually allocate both a header and the memory you asked for, padded out to some minimum size. So for example, if you asked for 20 bytes, the system may allocate a 48-byte block:
16-byte header containing size, special marker, checksum, pointers to next/previous block and so on.
32 bytes data area (your 20 bytes padded out to a multiple of 16).
The address then given to you is the address of the data area. Then, when you free the block, free will simply take the address you give it and, assuming you haven't stuffed up that address or the memory around it, check the accounting information immediately before it. Graphically, that would be along the lines of:
____ The allocated block ____
/ \
+--------+--------------------+
| Header | Your data area ... |
+--------+--------------------+
^
|
+-- The address you are given
Keep in mind the size of the header and the padding are totally implementation defined (actually, the entire thing is implementation-defined (a) but the in-line accounting option is a common one).
The checksums and special markers that exist in the accounting information are often the cause of errors like "Memory arena corrupted" or "Double free" if you overwrite them or free them twice.
The padding (to make allocation more efficient) is why you can sometimes write a little bit beyond the end of your requested space without causing problems (still, don't do that, it's undefined behaviour and, just because it works sometimes, doesn't mean it's okay to do it).
(a) I've written implementations of malloc in embedded systems where you got 128 bytes no matter what you asked for (that was the size of the largest structure in the system), assuming you asked for 128 bytes or less (requests for more would be met with a NULL return value). A very simple bit-mask (i.e., not in-line) was used to decide whether a 128-byte chunk was allocated or not.
Others I've developed had different pools for 16-byte chunks, 64-bytes chunks, 256-byte chunks and 1K chunks, again using a bit-mask to decide what blocks were used or available.
Both these options managed to reduce the overhead of the accounting information and to increase the speed of malloc and free (no need to coalesce adjacent blocks when freeing), particularly important in the environment we were working in.
From the comp.lang.c FAQ list: How does free know how many bytes to free?
The malloc/free implementation remembers the size of each block as it is allocated, so it is not necessary to remind it of the size when freeing. (Typically, the size is stored adjacent to the allocated block, which is why things usually break badly if the bounds of the allocated block are even slightly overstepped)
This answer is relocated from How does free() know how much memory to deallocate? where I was abrubtly prevented from answering by an apparent duplicate question. This answer then should be relevant to this duplicate:
For the case of malloc, the heap allocator stores a mapping of the original returned pointer, to relevant details needed for freeing the memory later. This typically involves storing the size of the memory region in whatever form relevant to the allocator in use, for example raw size, or a node in a binary tree used to track allocations, or a count of memory "units" in use.
free will not fail if you "rename" the pointer, or duplicate it in any way. It is not however reference counted, and only the first free will be correct. Additional frees are "double free" errors.
Attempting to free any pointer with a value different to those returned by previous mallocs, and as yet unfreed is an error. It is not possible to partially free memory regions returned from malloc.
On a related note GLib library has memory allocation functions which do not save implicit size - and then you just pass the size parameter to free. This can eliminate part of the overhead.
The heap manager stored the amount of memory belonging to the allocated block somewhere when you called malloc.
I never implemented one myself, but I guess the memory right in front of the allocated block might contain the meta information.
The original technique was to allocate a slightly larger block and store the size at the beginning, then give the application the rest of the blog. The extra space holds a size and possibly links to thread the free blocks together for reuse.
There are certain issues with those tricks, however, such as poor cache and memory management behavior. Using memory right in the block tends to page things in unnecessarily and it also creates dirty pages which complicate sharing and copy-on-write.
So a more advanced technique is to keep a separate directory. Exotic approaches have also been developed where areas of memory use the same power-of-two sizes.
In general, the answer is: a separate data structure is allocated to keep state.
malloc() and free() are system/compiler dependent so it's hard to give a specific answer.
More information on this other question.
To answer the second half of your question: yes, you can, and a fairly common pattern in C is the following:
typedef struct {
size_t numElements
int elements[1]; /* but enough space malloced for numElements at runtime */
} IntArray_t;
#define SIZE 10
IntArray_t* myArray = malloc(sizeof(intArray_t) + SIZE * sizeof(int));
myArray->numElements = SIZE;
to answer the second question, yes you could (kind of) use the same technique as malloc()
by simply assigning the first cell inside every array to the size of the array.
that lets you send the array without sending an additional size argument.
When we call malloc it's simply consume more byte from it's requirement. This more byte consumption contain information like check sum,size and other additional information.
When we call free at that time it directly go to that additional information where it's find the address and also find how much block will be free.

Dynamic Memory Allocation

How malloc() stores metadata?
void* p;
void* q;
p = malloc(sizeof(char));
q = malloc(sizeof(int));
I know that the return value p[0] points to the start of allocated block of memory,
than if I iterate and print
p[-1], p[-2].... q[-1], q[-2]....
or p[1], p[2], p[3], p[4]....
or q[1], q[2], q[3], q[4]....
I find some value that help malloc() to store data howether I can t understand
precisely what that metadata means..I only know that some of them are for block size, for the adress of the next free block but i can t find on the web nothing more
Please, Can you give me some detailed explanation of those value?
How this metadata works and is used depends entirely on the memory management in your libc. Here are some useful writeups to get you started:
Malloc - Typically, classic malloc.
DLMalloc - Doug Lee's Malloc.
GC Malloc - Malloc with garbage collection.
TC Malloc - Thread caching malloc.
Each of these has different aims, benefits and possible deficiencies. For example, perhaps you are concerned about possible heap overflow issues and protections. This may lead to one choice. Perhaps you are looking for better fragment management. This might lead to the selection of Doug Lee's malloc. You really need to specify which library you are using or research them all to understand how the metadata is used to maintain bins, coalesce adjustment free regions, etc.
David Hoelzer has an excellent answer, but I wanted to follow it up a touch more:
"Meta Data" for malloc is 100% implementation driven. Here is a quick overview of some implementations I have written or used:
The meta data stores "next" block pointers.
It stores canary values to make sure you haven't written passed the end of a block
There is no meta data at all because all the blocks are the same size and they are placed in a stack to allow O(1) access - A Unit Allocator. In this case, you'd hit memory from an adjacent block.
The data stores next block ptr, prev block ptr, alignment id, block size, and an identifier that tells a memory debugger exactly where the allocation occured. Very useful for leak tracking.
It hits the previous block's info because the data is stored at the tail of the block instead of the head...
Or mix and match all the above. In any case, reading mem[-1] is a bad idea, and in some cases (thinking embedded systems), could indeed cause a segment fault on only a read if said read happen to address beyond the current memory page and into a forbidden area of memory.
Update as per OP
The 4th scheme I described is one that has quite a bit of information per block - 16-bytes of information not being uncommon as that size won't throw off common alignment schemes. It would contain a 4-byte pointer to the next allocated block, a 4-byte pointer to the previous, an identifier for alignment - a single byte can handle this directly for common sizes of 0-256 byte alignement, but that byte could also represent 256 possible alignment enum values instead, 3 bytes of pad or canary values, and a 4-byte unique identifier to identify where in the code the call was made, though it could be made smaller. This can be a lookup value to a debug table that contains __file__, __line__, and whatever other info you wish to save with the alloction.
This would be one of the heavier varieties as it has a lot of information that needs to be updated when values are allocated and freed.
It is illegal to access p[-1], etc in your example. The results will be undefined and maybe cause memory corruption, or segmentation fault. In general you don't get any control over malloc() at all or information about what it is doing.
That said, some malloc() "replacement" libraries will give you finer grained control or information - you link these into your binary "on top" of the system malloc().
The C standard does not specify how malloc stores its metadata (or even if it will have accessible metadata); therefore, the metadata format and location are implementation-dependent. Portable code must therefore never attempt to access or parse the metadata.
Some malloc implementations provide, as an extension, functions like malloc_usable_size or malloc_size which can tell you the size of allocated blocks. While the presence of these functions is also implementation dependent, they are at least reliable and correct ways to get the information you need if they are present.

Does realloc keep the memory alignment of posix_memalign?

Aligned malloc is posix_memalign, that's OK, but what about the aligned realloc? Does realloc retain the alignment or how to assure that reallocated memory has the same alignment? Assume Linux and x86_64.
No, realloc on the memory returned from posix_memalign is not guaranteed by either ISO or POSIX to maintain the same alignment. A realloc may simply expand the current block at the same address but it may also move the block to a different address whose alignment is less strict than the original.
If you want the same alignment, it's probably best to allocate another block and copy the data over.
There is, unfortunately, no posix_memalign_realloc function in the Single UNIX Specification either.
If you don't want to go through the hassle of copying data every time, you could try the realloc (a) and, if the alignment of that was not as expected, then and only then call posix_memalign to get a correctly aligned address and copy the data in to there, freeing the old address when done.
This may result in:
zero copies (if the current block can be expanded in-place);
one copy (if realloc copies but happens to give you a correctly aligned block); or
two copies (if realloc copies and then you also have to copy due to misalignment).
It may also result in less copying than indicated depending on the underlying memory management implementation. For example, a "copy" may simply involve remapping memory blocks rather than physically moving the data.
So you may want to keep some statistics to see if this scheme is worthwhile.
(a) Just keep in mind that neither POSIX nor Linux man pages specify whether or not you even can pass these pointers to realloc, only that you can pass them to free.
However, based on the current GNU libc source code, it appears to work, although that's no guarantee it will continue to work in future :-)
My fear was that it would allocate memory normally (standard alignment) and pass back an offset address (ie, not the actual address allocated, but one N bytes beyond that) which free was intelligent enough to turn back into the actual address before weaving its magic.
One way of doing that would be to store the actual address immediately before the returned address though this of course would lead to wastage even for regular allocations.
In that case, free may have been made intelligent (since the specs say it must be able to handle the allocations done by posix_memalign) but realloc may not have been given the same intelligence (since the docs are silent on that matter).
However, based on GNU glibc 2.14.1, it actually allocates more memory than needed then fiddles with the arena to free up the pre-space and post-space, so that the address returned is a "real" address, usable by free or realloc.
But, as stated, the documentation doesn't guarantee this.
If you look at the glibc source code for realloc, it calls directly on to malloc. So the memory is aligned in the same way as malloc.

How to find how much memory is actually used up by a malloc call?

If I call:
char *myChar = (char *)malloc(sizeof(char));
I am likely to be using more than 1 byte of memory, because malloc is likely to be using some memory on its own to keep track of free blocks in the heap, and it may effectively cost me some memory by always aligning allocations along certain boundaries.
My question is: Is there a way to find out how much memory is really used up by a particular malloc call, including the effective cost of alignment, and the overhead used by malloc/free?
Just to be clear, I am not asking to find out how much memory a pointer points to after a call to malloc. Rather, I am debugging a program that uses a great deal of memory, and I want to be aware of which parts of the code are allocating how much memory. I'd like to be able to have internal memory accounting that very closely matches the numbers reported by top. Ideally, I'd like to be able to do this programmatically on a per-malloc-call basis, as opposed to getting a summary at a checkpoint.
There isn't a portable solution to this, however there may be operating-system specific solutions for the environments you're interested in.
For example, with glibc on Linux, you can use the mallinfo() function from <malloc.h> which returns a struct mallinfo. The uordblks and hblkhd members of this structure contains the dynamically allocated address space used by the program including book-keeping overhead - if you take the difference of this before and after each malloc() call, you will know the amount of space used by that call. (The overhead is not necessarily constant for every call to malloc()).
Using your example:
char *myChar;
size_t s = sizeof(char);
struct mallinfo before, after;
int mused;
before = mallinfo();
myChar = malloc(s);
after = mallinfo();
mused = (after.uordblks - before.uordblks) + (after.hblkhd - before.hblkhd);
printf("Requested size %zu, used space %d, overhead %zu\n", s, mused, mused - s);
Really though, the overhead is likely to be pretty minor unless you are making a very very high number of very small allocations, which is a bad idea anyway.
It really depends on the implementation. You should really use some memory debugger. On Linux Valgrind's Massif tool can be useful. There are memory debugging libraries like dmalloc, ...
That said, typical overhead:
1 int for storing size + flags of this block.
possibly 1 int for storing size of previous/next block, to assist in coallescing blocks.
2 pointers, but these may only be used in free()'d blocks, being reused for application storage in allocated blocks.
Alignment to an approppriate type, e.g: double.
-1 int (yes, that's a minus) of the next/previous chunk's field containing our size if we are an allocated block, since we cannot be coallesced until we're freed.
So, a minimum size can be 16 to 24 bytes. and minimum overhead can be 4 bytes.
But you could also satisfy every allocation via mapping memory pages (typically 4Kb), which would mean overhead for smaller allocations would be huge. I think OpenBSD does this.
There is nothing defined in the C library to query the total amount of physical memory used by a malloc() call. The amount of memory allocated is controlled by whatever memory manager is hooked up behind the scenes that malloc() calls into. That memory manager can allocate as much extra memory as it deemes necessary for its internal tracking purposes, on top of whatever extra memory the OS itself requires. When you call free(), it accesses the memory manager, which knows how to access that extra memory so it all gets released properly, but there is no way for you to know how much memory that involves. If you need that much fine detail, then you need to write your own memory manager.
If you do use valgrind/Massif, there's an option to show either the malloc value or the top value, which differ a LOT in my experience. Here's an excerpt from the Valgrind manual http://valgrind.org/docs/manual/ms-manual.html :
...However, if you wish to measure all the memory used by your program,
you can use the --pages-as-heap=yes. When this option is enabled,
Massif's normal heap block profiling is replaced by lower-level page
profiling. Every page allocated via mmap and similar system calls is
treated as a distinct block. This means that code, data and BSS
segments are all measured, as they are just memory pages. Even the
stack is measured...

How does free know how much to free?

In C programming, you can pass any kind of pointer you like as an argument to free, how does it know the size of the allocated memory to free? Whenever I pass a pointer to some function, I have to also pass the size (ie an array of 10 elements needs to receive 10 as a parameter to know the size of the array), but I do not have to pass the size to the free function. Why not, and can I use this same technique in my own functions to save me from needing to cart around the extra variable of the array's length?
When you call malloc(), you specify the amount of memory to allocate. The amount of memory actually used is slightly more than this, and includes extra information that records (at least) how big the block is. You can't (reliably) access that other information - and nor should you :-).
When you call free(), it simply looks at the extra information to find out how big the block is.
Most implementations of C memory allocation functions will store accounting information for each block, either in-line or separately.
One typical way (in-line) is to actually allocate both a header and the memory you asked for, padded out to some minimum size. So for example, if you asked for 20 bytes, the system may allocate a 48-byte block:
16-byte header containing size, special marker, checksum, pointers to next/previous block and so on.
32 bytes data area (your 20 bytes padded out to a multiple of 16).
The address then given to you is the address of the data area. Then, when you free the block, free will simply take the address you give it and, assuming you haven't stuffed up that address or the memory around it, check the accounting information immediately before it. Graphically, that would be along the lines of:
____ The allocated block ____
/ \
+--------+--------------------+
| Header | Your data area ... |
+--------+--------------------+
^
|
+-- The address you are given
Keep in mind the size of the header and the padding are totally implementation defined (actually, the entire thing is implementation-defined (a) but the in-line accounting option is a common one).
The checksums and special markers that exist in the accounting information are often the cause of errors like "Memory arena corrupted" or "Double free" if you overwrite them or free them twice.
The padding (to make allocation more efficient) is why you can sometimes write a little bit beyond the end of your requested space without causing problems (still, don't do that, it's undefined behaviour and, just because it works sometimes, doesn't mean it's okay to do it).
(a) I've written implementations of malloc in embedded systems where you got 128 bytes no matter what you asked for (that was the size of the largest structure in the system), assuming you asked for 128 bytes or less (requests for more would be met with a NULL return value). A very simple bit-mask (i.e., not in-line) was used to decide whether a 128-byte chunk was allocated or not.
Others I've developed had different pools for 16-byte chunks, 64-bytes chunks, 256-byte chunks and 1K chunks, again using a bit-mask to decide what blocks were used or available.
Both these options managed to reduce the overhead of the accounting information and to increase the speed of malloc and free (no need to coalesce adjacent blocks when freeing), particularly important in the environment we were working in.
From the comp.lang.c FAQ list: How does free know how many bytes to free?
The malloc/free implementation remembers the size of each block as it is allocated, so it is not necessary to remind it of the size when freeing. (Typically, the size is stored adjacent to the allocated block, which is why things usually break badly if the bounds of the allocated block are even slightly overstepped)
This answer is relocated from How does free() know how much memory to deallocate? where I was abrubtly prevented from answering by an apparent duplicate question. This answer then should be relevant to this duplicate:
For the case of malloc, the heap allocator stores a mapping of the original returned pointer, to relevant details needed for freeing the memory later. This typically involves storing the size of the memory region in whatever form relevant to the allocator in use, for example raw size, or a node in a binary tree used to track allocations, or a count of memory "units" in use.
free will not fail if you "rename" the pointer, or duplicate it in any way. It is not however reference counted, and only the first free will be correct. Additional frees are "double free" errors.
Attempting to free any pointer with a value different to those returned by previous mallocs, and as yet unfreed is an error. It is not possible to partially free memory regions returned from malloc.
On a related note GLib library has memory allocation functions which do not save implicit size - and then you just pass the size parameter to free. This can eliminate part of the overhead.
The heap manager stored the amount of memory belonging to the allocated block somewhere when you called malloc.
I never implemented one myself, but I guess the memory right in front of the allocated block might contain the meta information.
The original technique was to allocate a slightly larger block and store the size at the beginning, then give the application the rest of the blog. The extra space holds a size and possibly links to thread the free blocks together for reuse.
There are certain issues with those tricks, however, such as poor cache and memory management behavior. Using memory right in the block tends to page things in unnecessarily and it also creates dirty pages which complicate sharing and copy-on-write.
So a more advanced technique is to keep a separate directory. Exotic approaches have also been developed where areas of memory use the same power-of-two sizes.
In general, the answer is: a separate data structure is allocated to keep state.
malloc() and free() are system/compiler dependent so it's hard to give a specific answer.
More information on this other question.
To answer the second half of your question: yes, you can, and a fairly common pattern in C is the following:
typedef struct {
size_t numElements
int elements[1]; /* but enough space malloced for numElements at runtime */
} IntArray_t;
#define SIZE 10
IntArray_t* myArray = malloc(sizeof(intArray_t) + SIZE * sizeof(int));
myArray->numElements = SIZE;
to answer the second question, yes you could (kind of) use the same technique as malloc()
by simply assigning the first cell inside every array to the size of the array.
that lets you send the array without sending an additional size argument.
When we call malloc it's simply consume more byte from it's requirement. This more byte consumption contain information like check sum,size and other additional information.
When we call free at that time it directly go to that additional information where it's find the address and also find how much block will be free.

Resources