Pointer Meta Information - c

An interesting feature of realloc() is that it somehow knows how long your data is when it is copying or extending your allocated memory.
I read that what happens is that behind the scenes there is some meta information stored about a pointer (which contains its allocated memory size), usually immediately before the address the pointer is pointing at (but of course, subject to implementation).
So my question is, if there is such data stored why isn't it exposed via an API, so things like the C string for example, won't have to look for a \0 to know where the end of a string is.
There could be hundreds of other uses as well.

As you said yourself, it's subject to the implementation of the standard libraries' memory manager.
C doesn't have a standard for this type of functionality, probably because generally speaking C was designed to be simple (but also provide a lot of capabilities).
With that said, there isn't really much use for this type of functionality outside of features for debugging memory allocation.
Some compilers do provide this type of functionality (specifically memory block size), but it's made pretty clear that it's only for the purpose of debugging.
It's not uncommon to write your own memory manager, allowing you to have full control over allocations, and even make assumptions for other library components. As you mentioned, your implementation can store the size of an allocation somewhere in a header, and a string implementation can reference that value rather than walking the bytes for \0 termination.
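To make the header idea concrete, here is a minimal sketch of a wrapper that stores the requested size in front of the data. The names sized_malloc, sized_size and sized_free are made up for illustration and are not part of any standard library; the layout is just one possible choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header placed immediately before the caller's data.
   The union pads the header to the strictest alignment, so the data
   that follows it stays suitably aligned for any object type. */
typedef union {
    size_t requested;    /* the size the caller asked for */
    max_align_t align_;  /* padding only, never read */
} sized_header;

/* Allocate n bytes and remember n in a header in front of them. */
void *sized_malloc(size_t n)
{
    if (n > SIZE_MAX - sizeof(sized_header))
        return NULL;                 /* size computation would overflow */
    sized_header *h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->requested = n;
    return h + 1;                    /* first byte after the header */
}

/* Return the size originally passed to sized_malloc(). */
size_t sized_size(const void *p)
{
    assert(p != NULL);
    return ((const sized_header *)p - 1)->requested;
}

/* Release a block obtained from sized_malloc(). */
void sized_free(void *p)
{
    if (p != NULL)
        free((sized_header *)p - 1);
}
```

A pointer from sized_malloc must only ever be released with sized_free; handing it to plain free() would pass an address the underlying allocator never returned.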

This information is not always accurate.
On Windows, the size is rounded up to a multiple of at least 8.
realloc does fine with this because it cares only about the allocated size, but you won't get the requested size.

Here it's worth understanding how allocators work a bit. For example, if you take a buddy allocator, it allocates chunks in predetermined sizes. For simplicity, let's say powers of 2 (real world allocators usually use tighter sizes).
In this kind of simplified case, if you call malloc and request 19 bytes of memory for your personal use, then the allocator is actually going to give you a 32-byte chunk worth of memory. It doesn't care that it gave you more than you needed, as that still satisfies your basic request by giving you something as large or larger than what you needed.
These chunk sizes are usually stored somewhere, somehow, for a general-purpose allocator that can handle variable-sized requests to be able to free chunks and do things like merge free chunks together and split them apart. Yet they're identifying the chunk size, not the size of your request. So you can't use that chunk size to see where a string should end, e.g., since according to you, you wanted one with 19 bytes, not 32 bytes.
realloc also doesn't care about your requested 19-byte size. It's working at the level of bits and bytes, and so it copies the whole chunk's worth of memory to a new, larger chunk if it has to.
So these kinds of chunk sizes generally aren't useful for implementing things like data structures. For these, you want to be working at the data size proportional to the amount of memory you requested which is something many allocators don't even bother to store (they don't need to in order to work given their efficiency needs). So it's often up to you to keep track of that size which is in tune with the logic of your software in some form (a sentinel like a null terminator is one form).
As for why querying these sizes is not available in the standard, it's probably because the need to know it would be a rather obscure, low-level kind of need given that it has nothing to do with the amount of memory you actually requested to work with. I have often wished for it for some very low-level debugging needs, but there are platform/compiler-specific alternatives that I have used instead, and I found they weren't quite as handy as I thought they would be (even for just low-level debugging).
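For reference, the platform-specific alternatives I am aware of are malloc_usable_size (glibc), malloc_size (macOS) and _msize (Microsoft CRT). The sketch below wraps them behind a made-up USABLE_SIZE macro and a made-up helper function; it assumes one of those three environments, and the value it reports is the block size, which may be larger than what was requested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#if defined(_WIN32)
#  include <malloc.h>            /* _msize */
#  define USABLE_SIZE(p) _msize(p)
#elif defined(__APPLE__)
#  include <malloc/malloc.h>     /* malloc_size */
#  define USABLE_SIZE(p) malloc_size(p)
#else
#  include <malloc.h>            /* malloc_usable_size (glibc and compatible) */
#  define USABLE_SIZE(p) malloc_usable_size(p)
#endif

/* Allocate `requested` bytes (assumed nonzero), ask the allocator how
   large the block really is, and release it again. */
size_t usable_size_for_request(size_t requested)
{
    void *p = malloc(requested);
    assert(p != NULL);
    size_t usable = USABLE_SIZE(p);
    free(p);
    return usable;
}
```

Calling usable_size_for_request(19) will return something that is at least 19, but the exact number depends on the allocator, which is exactly why it cannot stand in for the length of your data.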

You asked:
if there is such data stored why isn't it exposed via an API, so things like the C string for example, won't have to look for a \0 to know where the end of a string is.
You can allocate enough memory to hold 100 characters but use that memory to hold a string that is only 5 characters long. If you relied on the pointer metadata, you would get the wrong result.
char* cp = malloc(100);
strcpy(cp, "hello");
The second reason you would need the terminating null character is when you use stack memory to create a string.
char str[100] = "hello";
Without the terminating null character, you won't be able to determine the length of the string held in str.
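The difference between capacity and content length can be checked directly. This is only an illustrative sketch; stored_length is a made-up helper that searches for the terminator without reading past the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the string currently stored in a buffer of `capacity` bytes.
   Returns `capacity` if no terminator is found inside the buffer. */
size_t stored_length(const char *buf, size_t capacity)
{
    assert(buf != NULL);
    const char *nul = memchr(buf, '\0', capacity);
    return nul != NULL ? (size_t)(nul - buf) : capacity;
}
```

With char str[100] = "hello";, sizeof str is 100 while stored_length(str, sizeof str) and strlen(str) are both 5. Only the terminator (or a separately stored length) tells you about the 5.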
You said:
There could be hundreds of other uses as well.
I don't have a good response to that other than to say that custom memory allocators often provide access to the metadata. The reason for not including such APIs in the standard library is not obvious to me.

Related

Why does `realloc` not re-allocate in-place when possible?

From c99:
The realloc function deallocates the old object pointed to by ptr and returns a pointer to a new object that has the size specified by size. The contents of the new object shall be the same as that of the old object prior to deallocation, up to the lesser of the new and old sizes. Any bytes in the new object beyond the size of the old object have indeterminate values.
[..]
The realloc function returns a pointer to the new object (which may have the same value as a pointer to the old object), or a null pointer if the new object could not be allocated.
I am surprised that the standards do not specify that realloc should "try" to do in-place reallocation. Typically, if the reallocation size is lower than the currently allocated size, I would have expected the standards to ensure that the realloc would return the same pointer.
Is there a logic for the standards not to specify that realloc should be in-place if the size is reduced?
The notion of "try" doesn't mean much in the context of a standard - how hard does the implementer have to try before they have tried enough? How would one measure compliance?
Many common implementations will work exactly as you suggest: If you're resizing downward, or even resizing upward and the following memory happens to be free, they might return the original pointer after adjusting the housekeeping but not having to copy any data. Yay!
But I can think of lots of reasons why an allocator would not do this even if it were possible:
Some allocators keep different arenas for different sizes, where (making this up) it's a different pool for chunks from 1-128 bytes than there are for 64kbytes and larger. The whole scheme breaks down if the "big" pool has to keep small allocations around. This is especially the case if you're intentionally keeping "big" allocations on page boundaries.
Multi-thread aware applications often have to take special care to avoid contention so that memory allocation is not a bottleneck. If you're realloc'ing a chunk that was allocated in a different thread, it might be non-blocking to give you a new chunk (with copy) and defer on freeing the old pointer, but allowing you to keep the same pointer would block this or some other thread.
A debugging allocator will intentionally return different pointers to make sure the program doesn't incorrectly hang onto an old pointer by mistake: this breaks things sooner rather than later.
I cannot think of a case where a "please try" statement in the standard would change any decisions of any library designer. If keeping the same pointer makes sense for a given implementation, then of course they're going to use it, but if there's an overriding technical reason not to, then they won't.
I'm also not sure I can think of a case where this nudge would make any difference to the user of a library either. You still have to code it to account for all the cases, even one that "tries", so it's not like it's going to save you any code.
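To illustrate what "account for all the cases" looks like in practice, here is a small sketch of a resize helper. The name resize_ints is made up; the point is that the caller's code is the same whether realloc moved the block, kept it in place, or failed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Resize *buf to hold new_count ints.
   Returns 1 on success, 0 on failure. On failure *buf is untouched
   and still points at the old, valid block. */
int resize_ints(int **buf, size_t new_count)
{
    assert(buf != NULL && new_count > 0);
    if (new_count > SIZE_MAX / sizeof **buf)
        return 0;                        /* byte count would overflow */
    int *tmp = realloc(*buf, new_count * sizeof **buf);
    if (tmp == NULL)
        return 0;                        /* old block is still allocated */
    *buf = tmp;                          /* may or may not equal the old pointer */
    return 1;
}
```

Assigning the result to a temporary first avoids losing the only pointer to the old block when realloc returns NULL.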
In the end, this is an implementation detail that a standard would never handcuff an implementer about, and the library will be judged on its own merits (performance, codesize, reliability, etc.) and this is just one aspect.
You can always code your own allocator if you really need that behavior for some reason.
EDIT: Another reason why an allocator would want to return a different pointer even if reallocing the same size: reducing memory fragmentation.
If my realloc request comes in at a time when there's a lot of free space on either side, the allocator could realize: I could extend this chunk in place (fast and easy), or I could move it some other place and coalesce what's left behind into a much larger free block.
This has been a nagging problem for a customer-written project: written ages ago in 32-bit Delphi, it runs for days at a time with a lot of memory pressure, and eventually the memory is so fragmented that it's unavailable to service modest requests even though there are many hundreds of megabytes free.
Ref: Are concatenated Delphi strings held in a hidden temporary variable that retains a reference to the string?
There's little I can do about this in Delphi, but in C it's very easy to imagine "aggressive avoiding of memory fragmentation" being a property of an allocator.
A standard is a general-purpose description; it doesn't need to specify whether a function tries to do something, it describes the general behavior of the function.
It is clear and logical that, for optimization purposes, the system will try to resize the buffer in the same location, but since there is no guarantee that this happens, the standard can't specify it.

Memory allocation that resizes a buffer ONLY if it can grow in place?

After reading the man-page for realloc(), I came to the realization that it works a little differently than I thought it did. I originally thought that realloc() would attempt to resize a buffer, previously allocated with one of the malloc-family functions, and if it could NOT extend the buffer in place, then it would fail. However, the man-page states:
The realloc() function returns a pointer to the newly allocated memory, which is suitably aligned for any built-in type and may be different from ptr, or NULL if the request fails.
The "may be different from ptr" part is what I'm talking about.
Basically, what I want is a function, similar to realloc(), but which fails if it cannot extend the buffer in place. It seems that there is no function in the standard C library that does this; however, I'm assuming there may be some OS-specific functions that accomplish the same thing.
Could someone tell me what functions are out there that do what I described above, and which OS's they are specific to? Preferably, I'd like to know at least the functions specific to Linux and Windows (and Mac OS would be a nice bonus too :) ).
This may be a duplicate of this post, but I don't think it is for the following reasons:
The question in the post I linked to simply asks, is there a function that extends a buffer in place, whereas, I'm asking, which functions extend a buffer in place.
The accepted answer for that post does not contain the information I need.
EDIT
Some people were wondering what is the use case I need this for, so I'll explain, below:
I'm writing a C preprocessor (yes, I know... don't reinvent the wheel... well, I'm doing it anyways, so there). And one component of the C preprocessor is a cache for storing pp-tokens which come from various source files, where each source file's set of pp-tokens may be fragmented within the cache. The cache itself, is a linked-list of large chunks of memory. Ideally, I'd like to keep this linked-list short, hence why I'd like to first try resizing the buffer (in place); however, if resizing in place is not possible, then I want to just add another node (i.e. chunk of memory) to the linked list.
Within each cache buffer, there are additional linked-list nodes, which provide a means for iterating through all the pp-tokens of each individual source file, which may be fragmented across the various cache buffers that make up the cache.
The reasons I need the kind of memory reallocation I discussed earlier are the following:
If resizing a cache buffer could not be done in place, and a new buffer had to be allocated and the old memory contents copied, then I'd have a lot of dangling pointers. Jonathan Leffler suggested that I instead store offsets within the buffer, rather than pointers, which I had not even thought about, and is a great idea! However, reason #2...
I want the implementation of the cache to be as fast as possible, and, please correct me if I'm wrong, but it seems to me that (for my use case) it would be faster on average to just add a new cache buffer to the linked list if a given cache buffer could not be resized in place, rather than allocating a new buffer and copying all previous contents and freeing the old buffer. As a sidenote, I am planning on doubling the size of the allocated cache buffer each time cache resizing is needed.
Memory management (in the form of malloc and friends) is generally implemented as a library; it is not part of the Operating System. (An implementation of the library will probably need to use some OS facilities to acquire raw memory -- although that's not a given -- but there is no need to involve the OS for allocating and freeing individual allocations.) So you're not going to find an "OS-specific" solution.
There are a number of different memory allocation libraries available. If you decide to use an alternative to the one preinstalled with your particular distribution, you will probably want to arrange for it to be used by the standard library as well. Details for how to do that vary.
Most allocation libraries do include some additional interfaces, but I don't know of any library which offers the function you're looking for. More common is an API for finding out how much memory is actually in an allocation (which is often more than the amount requested by the malloc). For many libraries, realloc will only expand the allocation in place if it was already big enough, but there may be libraries which are willing to merge a following free block in order to make non-copying realloc possible.
There's a list of some commonly-used libraries in the Wikipedia page on dynamic memory allocation, which also has a good overview of implementation techniques.
And, of course, you could always write your own memory manager (or modify an open source library) to implement that feature. However, while that would be an interesting and satisfying project, I'd strongly suggest you think about (and research) the reasons why this seemingly simple idea has not been implemented in common memory management libraries. There are good reasons.
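For the use case in the question, one alternative that needs no special allocator support is to never resize at all and simply chain a new, larger chunk when the current one is full. The following is a rough sketch under that assumption; all names (chunk, cache, cache_alloc, cache_free) and the starting size of 64 bytes are invented for illustration, and it hands out raw bytes with no alignment guarantee beyond that of unsigned char.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct chunk {
    struct chunk *next;      /* previously filled chunk, or NULL */
    size_t capacity;         /* bytes available in data[] */
    size_t used;             /* bytes already handed out */
    unsigned char data[];    /* C99 flexible array member */
} chunk;

typedef struct {
    chunk *head;             /* most recently added chunk */
} cache;

static chunk *chunk_new(size_t capacity, chunk *next)
{
    if (capacity > SIZE_MAX - sizeof(chunk))
        return NULL;
    chunk *c = malloc(sizeof *c + capacity);
    if (c == NULL)
        return NULL;
    c->next = next;
    c->capacity = capacity;
    c->used = 0;
    return c;
}

/* Hand out n bytes. Existing pointers never move: when the current
   chunk is full, a new chunk of twice the capacity is linked in. */
void *cache_alloc(cache *ca, size_t n)
{
    assert(ca != NULL);
    chunk *c = ca->head;
    if (c == NULL || c->capacity - c->used < n) {
        size_t cap = 64;                              /* arbitrary starting size */
        if (c != NULL && c->capacity <= SIZE_MAX / 2)
            cap = c->capacity * 2;                    /* doubling growth */
        if (cap < n)
            cap = n;
        c = chunk_new(cap, ca->head);
        if (c == NULL)
            return NULL;
        ca->head = c;
    }
    void *p = c->data + c->used;
    c->used += n;
    return p;
}

/* Release every chunk in the cache. */
void cache_free(cache *ca)
{
    assert(ca != NULL);
    while (ca->head != NULL) {
        chunk *next = ca->head->next;
        free(ca->head);
        ca->head = next;
    }
}
```

Because the capacity doubles each time, the list stays short (logarithmic in the total size), and no copy ever happens, so pointers into older chunks remain valid.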

How does free() function know how much bytes to deallocate and how to access that information with in our program? [duplicate]

In C programming, you can pass any kind of pointer you like as an argument to free, so how does it know the size of the allocated memory to free? Whenever I pass a pointer to some function, I also have to pass the size (i.e., an array of 10 elements needs to receive 10 as a parameter to know the size of the array), but I do not have to pass the size to the free function. Why not, and can I use this same technique in my own functions to save me from needing to cart around the extra variable of the array's length?
When you call malloc(), you specify the amount of memory to allocate. The amount of memory actually used is slightly more than this, and includes extra information that records (at least) how big the block is. You can't (reliably) access that other information - and nor should you :-).
When you call free(), it simply looks at the extra information to find out how big the block is.
Most implementations of C memory allocation functions will store accounting information for each block, either in-line or separately.
One typical way (in-line) is to actually allocate both a header and the memory you asked for, padded out to some minimum size. So for example, if you asked for 20 bytes, the system may allocate a 48-byte block:
16-byte header containing size, special marker, checksum, pointers to next/previous block and so on.
32 bytes data area (your 20 bytes padded out to a multiple of 16).
The address then given to you is the address of the data area. Then, when you free the block, free will simply take the address you give it and, assuming you haven't stuffed up that address or the memory around it, check the accounting information immediately before it. Graphically, that would be along the lines of:
 ____ The allocated block ____
/                             \
+--------+--------------------+
| Header | Your data area ... |
+--------+--------------------+
         ^
         |
         +-- The address you are given
Keep in mind the size of the header and the padding are totally implementation defined (actually, the entire thing is implementation-defined (a) but the in-line accounting option is a common one).
The checksums and special markers that exist in the accounting information are often the cause of errors like "Memory arena corrupted" or "Double free" if you overwrite them or free them twice.
The padding (to make allocation more efficient) is why you can sometimes write a little bit beyond the end of your requested space without causing problems (still, don't do that, it's undefined behaviour and, just because it works sometimes, doesn't mean it's okay to do it).
(a) I've written implementations of malloc in embedded systems where you got 128 bytes no matter what you asked for (that was the size of the largest structure in the system), assuming you asked for 128 bytes or less (requests for more would be met with a NULL return value). A very simple bit-mask (i.e., not in-line) was used to decide whether a 128-byte chunk was allocated or not.
Others I've developed had different pools for 16-byte chunks, 64-byte chunks, 256-byte chunks and 1K chunks, again using a bit-mask to decide what blocks were used or available.
Both these options managed to reduce the overhead of the accounting information and to increase the speed of malloc and free (no need to coalesce adjacent blocks when freeing), particularly important in the environment we were working in.
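A bit-mask allocator of the kind described in footnote (a) can be sketched in a few lines. This is not the original embedded code, just an illustration with made-up names and sizes (32 blocks of 128 bytes): there is no per-block header, and one bit per block records whether it is in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCK_SIZE 128   /* every allocation gets exactly this much */
#define POOL_BLOCKS     32    /* one bit per block in a 32-bit mask */

static _Alignas(max_align_t) unsigned char pool_mem[POOL_BLOCKS][POOL_BLOCK_SIZE];
static uint32_t pool_used;    /* bit i set => block i is allocated */

/* Return a free 128-byte block, or NULL if n is too large or the pool is full. */
void *pool_alloc(size_t n)
{
    if (n > POOL_BLOCK_SIZE)
        return NULL;
    for (unsigned i = 0; i < POOL_BLOCKS; i++) {
        uint32_t bit = UINT32_C(1) << i;
        if ((pool_used & bit) == 0) {
            pool_used |= bit;
            return pool_mem[i];
        }
    }
    return NULL;
}

/* Mark the block as free again. The block index is derived from the
   address, so no size or header needs to be stored anywhere. */
void pool_free(void *p)
{
    if (p == NULL)
        return;
    ptrdiff_t off = (unsigned char *)p - &pool_mem[0][0];
    assert(off >= 0 && off < (ptrdiff_t)sizeof pool_mem);
    assert(off % POOL_BLOCK_SIZE == 0);
    pool_used &= ~(UINT32_C(1) << (off / POOL_BLOCK_SIZE));
}
```

Note that freeing needs neither a size nor any coalescing step, which is where the speed and overhead savings mentioned above come from.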
From the comp.lang.c FAQ list: How does free know how many bytes to free?
The malloc/free implementation remembers the size of each block as it is allocated, so it is not necessary to remind it of the size when freeing. (Typically, the size is stored adjacent to the allocated block, which is why things usually break badly if the bounds of the allocated block are even slightly overstepped)
This answer is relocated from How does free() know how much memory to deallocate? where I was abruptly prevented from answering by an apparent duplicate question. This answer then should be relevant to this duplicate:
For the case of malloc, the heap allocator stores a mapping of the original returned pointer, to relevant details needed for freeing the memory later. This typically involves storing the size of the memory region in whatever form relevant to the allocator in use, for example raw size, or a node in a binary tree used to track allocations, or a count of memory "units" in use.
free will not fail if you "rename" the pointer, or duplicate it in any way. It is not however reference counted, and only the first free will be correct. Additional frees are "double free" errors.
Attempting to free any pointer with a value different to those returned by previous mallocs, and as yet unfreed is an error. It is not possible to partially free memory regions returned from malloc.
On a related note, the GLib library has memory allocation functions which do not save an implicit size; instead, you pass the size parameter to the free function. This can eliminate part of the overhead.
The heap manager stored the amount of memory belonging to the allocated block somewhere when you called malloc.
I never implemented one myself, but I guess the memory right in front of the allocated block might contain the meta information.
The original technique was to allocate a slightly larger block and store the size at the beginning, then give the application the rest of the block. The extra space holds a size and possibly links to thread the free blocks together for reuse.
There are certain issues with those tricks, however, such as poor cache and memory management behavior. Using memory right in the block tends to page things in unnecessarily and it also creates dirty pages which complicate sharing and copy-on-write.
So a more advanced technique is to keep a separate directory. Exotic approaches have also been developed where areas of memory use the same power-of-two sizes.
In general, the answer is: a separate data structure is allocated to keep state.
malloc() and free() are system/compiler dependent so it's hard to give a specific answer.
More information on this other question.
To answer the second half of your question: yes, you can, and a fairly common pattern in C is the following:
typedef struct {
    size_t numElements;
    int elements[1]; /* but enough space malloced for numElements at runtime */
} IntArray_t;
#define SIZE 10
IntArray_t* myArray = malloc(sizeof(IntArray_t) + SIZE * sizeof(int));
myArray->numElements = SIZE;
To answer the second question: yes, you could (kind of) use the same technique as malloc()
by simply assigning the first cell inside every array to the size of the array.
That lets you send the array without sending an additional size argument.
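Since C99 the same idea can be written with a flexible array member, which avoids indexing past a declared one-element array. The sketch below uses made-up names (int_array, int_array_new, int_array_sum) and shows a function that takes the array without a separate length argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    size_t count;   /* number of elements that follow */
    int items[];    /* C99 flexible array member */
} int_array;

/* Allocate an array of `count` zero-initialized ints with its length stored in front. */
int_array *int_array_new(size_t count)
{
    if (count > (SIZE_MAX - sizeof(int_array)) / sizeof(int))
        return NULL;                       /* size computation would overflow */
    int_array *a = calloc(1, sizeof *a + count * sizeof(int));
    if (a != NULL)
        a->count = count;
    return a;
}

/* No length parameter needed: the array carries its own count. */
long int_array_sum(const int_array *a)
{
    assert(a != NULL);
    long total = 0;
    for (size_t i = 0; i < a->count; i++)
        total += a->items[i];
    return total;
}
```

The whole object is a single allocation, so one free(a) releases both the count and the elements.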
When we call malloc, it simply consumes more bytes than were requested. These extra bytes hold information such as a checksum, the size, and other bookkeeping data.
When we call free, it goes directly to that additional information, where it finds the address and how large a block is to be freed.

Why is there no "recalloc" in the C standard?

Everyone knows that:
realloc resizes an existing block of memory or copies it to a larger block.
calloc ensures the memory is zeroed out and guards against arithmetic overflows and is generally geared toward large arrays.
Why doesn't the C standard provide a function like the following that combines both of the above?
void *recalloc(void *ptr, size_t num, size_t size);
Wouldn't it be useful for resizing huge hash tables or custom memory pools?
Generally in C, the point of the standard library is not to provide a rich set of cool functions. It is to provide an essential set of building blocks, from which you can build your own cool functions.
Your proposal for recalloc would be trivial to write, and therefore is not something the standard lib should provide.
Other languages take a different approach: C# and Java have super-rich libraries that make even complicated tasks trivial. But they come with enormous overhead. C has minimal overhead, and that aids in making it portable to all kinds of embedded devices.
I assume you're interested in only zeroing out the new part of the array:
Not every memory allocator knows how much memory you're using in an array. For example, if I do:
char* foo = malloc(1);
foo now points to at least a chunk of memory 1 byte large. But most allocators will allocate much more than 1 byte (for example, 8, to keep alignment).
This can happen with other allocations, too. The memory allocator will allocate at least as much memory as you request, though often just a little bit more.
And it's this "just a little bit more" part that screws things up (in addition to other factors that make this hard). Because we don't know if it's useful memory or not. If it's just padding, and you recalloc it, and the allocator doesn't zero it, then you now have "new" memory that has some nonzeros in it.
For example, what if I recalloc foo to get it to point to a new buffer that's at least 2 bytes large? Will that extra byte be zeroed? Or not? It should be, but note that the original allocation gave us 8 bytes, so our reallocation doesn't allocate any new memory. As far as the allocator can see, it doesn't need to zero any memory (because there's no "new" memory to zero). Which could lead to a serious bug in our code.
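One way around this ambiguity is to have the caller say how many elements were in use before, since the allocator cannot know. The following is only a sketch with a made-up name and a signature that differs from the one proposed in the question (it takes an extra old_num argument); it rejects zero sizes and overflowing products up front.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Resize ptr from old_num to new_num elements of `size` bytes each and
   zero any newly added elements. Returns NULL on failure, in which case
   ptr is left untouched and still valid. */
void *recalloc_sketch(void *ptr, size_t old_num, size_t new_num, size_t size)
{
    assert(ptr != NULL || old_num == 0);
    if (size == 0 || new_num == 0 || new_num > SIZE_MAX / size)
        return NULL;                       /* zero-sized or overflowing request */
    unsigned char *p = realloc(ptr, new_num * size);
    if (p == NULL)
        return NULL;
    if (new_num > old_num)                 /* zero only the part the caller has not used */
        memset(p + old_num * size, 0, (new_num - old_num) * size);
    return p;
}
```

Because the zeroing is based on the caller's element counts rather than on the allocator's block size, the padding problem described above does not arise.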

