Best way to expand dynamic memory in C

I'm looking for a way to allocate additional memory (in C) at runtime, for an existing structure (that already had its memory assigned initially). I have a feeling I might be able to use memmove or something similar but that's still just a copy operation, and doesn't increase the amount of memory available to a structure at runtime. Also I don't want to have to copy the entire structure every time I need to do this, which will be many hundreds of times during the program (the structure is already huge). Can anyone help?
UPDATE: Thanks everyone for the replies. To give more detail, what I am trying to do is run an MPI-parallelised code that creates many instances of the structure (call it 'S') initially. Each instance of the structure contains an array 'T' which records the time of a particular event happening as the code is run. These events occur at runtime, and the number of events differs for each instance of S. For example, S[0] might see 100 events (and therefore need an array of 100 elements in length) but S[1] might see only 1 event (and S[2] 30 events, etc.) Therefore it would be very wasteful to allocate huge amounts of memory at the start for every instance of S (for which there are millions) since some might fill the array but others would not even come close. Indeed I have tried this and it is too much for the machine I am running it on.
I will try some of the ideas here and post my progress. Many thanks!

You could probably use realloc().

There is no way to do what you describe, because there is no way to guarantee that there will be available memory next to the one that your structure is currently occupying.
The standard thing to do is to allocate more memory and copy your data.
Of course if you can know (an estimate of) the size of the memory allocation that you need you can preallocate it and avoid copying.
Note, however, that the structures in C have a fixed size once they are declared, so it seems you don't really need to allocate more memory for an existing structure...

realloc is the only way to expand an existing dynamic allocation. realloc will try to expand the existing buffer in place; if that fails, it will allocate a new buffer of the total size required and copy the data over from the old buffer. If you don't want to call realloc every time (which internally will copy the data much of the time), you can allocate more memory than you currently require:
new_buf = realloc(buf_ptr, (actual_size + additional_size) * 2); /* only overwrite buf_ptr if new_buf is not NULL */
This reduces the frequency of realloc calls (and the copying they may involve).
Note: the implementation of realloc differs on some platforms; it may never try to expand the existing block and instead always allocate a new buffer of the total size. On those platforms the data is copied on every call to realloc.
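As a minimal sketch of that growth strategy (the growable_buf type and the buf_append function are invented for this example, not part of any library), here is a buffer that doubles its capacity and only overwrites its pointer after checking realloc's result:

#include <stdlib.h>

typedef struct {
    double *data;      /* stored values, e.g. event times */
    size_t  used;      /* elements currently in use */
    size_t  capacity;  /* elements the allocation can hold */
} growable_buf;

/* Append one value, doubling the capacity whenever the buffer is full. */
static int buf_append(growable_buf *b, double value)
{
    if (b->used == b->capacity) {
        size_t new_capacity = b->capacity ? b->capacity * 2 : 8;
        double *tmp = realloc(b->data, new_capacity * sizeof *tmp);
        if (tmp == NULL)
            return -1;              /* failure: b->data is still valid */
        b->data = tmp;
        b->capacity = new_capacity;
    }
    b->data[b->used++] = value;
    return 0;
}

Starting from data = NULL and capacity = 0 works because realloc(NULL, n) behaves like malloc(n).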

It sounds like you are looking for the C feature called a flexible array member. It is only well-defined in C99 and later.
The last member of the struct will have to be declared as a flexible array member, which you initially malloc, and later realloc (and of course memcpy to do the actual copying).
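A minimal sketch of that approach (the struct and function names here are invented for the example), assuming C99 or later:

#include <stdlib.h>

struct event_log {
    size_t count;     /* number of events recorded so far */
    double times[];   /* flexible array member, sized at allocation time */
};

/* Allocate a log with room for n event times. */
struct event_log *log_create(size_t n)
{
    struct event_log *log = malloc(sizeof *log + n * sizeof log->times[0]);
    if (log != NULL)
        log->count = 0;
    return log;
}

/* Resize the log to hold n event times.
 * Returns NULL on failure, in which case the original log is untouched. */
struct event_log *log_grow(struct event_log *log, size_t n)
{
    return realloc(log, sizeof *log + n * sizeof log->times[0]);
}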

Related

How does free() function know how much bytes to deallocate and how to access that information with in our program? [duplicate]

In C programming, you can pass any kind of pointer you like as an argument to free, how does it know the size of the allocated memory to free? Whenever I pass a pointer to some function, I have to also pass the size (ie an array of 10 elements needs to receive 10 as a parameter to know the size of the array), but I do not have to pass the size to the free function. Why not, and can I use this same technique in my own functions to save me from needing to cart around the extra variable of the array's length?
When you call malloc(), you specify the amount of memory to allocate. The amount of memory actually used is slightly more than this, and includes extra information that records (at least) how big the block is. You can't (reliably) access that other information - and nor should you :-).
When you call free(), it simply looks at the extra information to find out how big the block is.
Most implementations of C memory allocation functions will store accounting information for each block, either in-line or separately.
One typical way (in-line) is to actually allocate both a header and the memory you asked for, padded out to some minimum size. So for example, if you asked for 20 bytes, the system may allocate a 48-byte block:
16-byte header containing size, special marker, checksum, pointers to next/previous block and so on.
32 bytes data area (your 20 bytes padded out to a multiple of 16).
The address then given to you is the address of the data area. Then, when you free the block, free will simply take the address you give it and, assuming you haven't stuffed up that address or the memory around it, check the accounting information immediately before it. Graphically, that would be along the lines of:
 ____ The allocated block ____
/                             \
+--------+--------------------+
| Header | Your data area ... |
+--------+--------------------+
          ^
          |
          +-- The address you are given
Keep in mind the size of the header and the padding are totally implementation defined (actually, the entire thing is implementation-defined (a) but the in-line accounting option is a common one).
The checksums and special markers that exist in the accounting information are often the cause of errors like "Memory arena corrupted" or "Double free" if you overwrite them or free them twice.
The padding (to make allocation more efficient) is why you can sometimes write a little bit beyond the end of your requested space without causing problems (still, don't do that, it's undefined behaviour and, just because it works sometimes, doesn't mean it's okay to do it).
(a) I've written implementations of malloc in embedded systems where you got 128 bytes no matter what you asked for (that was the size of the largest structure in the system), assuming you asked for 128 bytes or less (requests for more would be met with a NULL return value). A very simple bit-mask (i.e., not in-line) was used to decide whether a 128-byte chunk was allocated or not.
Others I've developed had different pools for 16-byte chunks, 64-bytes chunks, 256-byte chunks and 1K chunks, again using a bit-mask to decide what blocks were used or available.
Both these options managed to reduce the overhead of the accounting information and to increase the speed of malloc and free (no need to coalesce adjacent blocks when freeing), particularly important in the environment we were working in.
From the comp.lang.c FAQ list: How does free know how many bytes to free?
The malloc/free implementation remembers the size of each block as it is allocated, so it is not necessary to remind it of the size when freeing. (Typically, the size is stored adjacent to the allocated block, which is why things usually break badly if the bounds of the allocated block are even slightly overstepped)
This answer is relocated from How does free() know how much memory to deallocate? where I was abruptly prevented from answering by an apparent duplicate question. This answer then should be relevant to this duplicate:
For the case of malloc, the heap allocator stores a mapping of the original returned pointer, to relevant details needed for freeing the memory later. This typically involves storing the size of the memory region in whatever form relevant to the allocator in use, for example raw size, or a node in a binary tree used to track allocations, or a count of memory "units" in use.
free will not fail if you "rename" the pointer, or duplicate it in any way. It is not however reference counted, and only the first free will be correct. Additional frees are "double free" errors.
Attempting to free any pointer with a value different to those returned by previous mallocs, and as yet unfreed is an error. It is not possible to partially free memory regions returned from malloc.
On a related note, the GLib library has memory allocation functions which do not save the size implicitly; you then pass the size parameter when freeing. This can eliminate part of the overhead.
The heap manager stored the amount of memory belonging to the allocated block somewhere when you called malloc.
I never implemented one myself, but I guess the memory right in front of the allocated block might contain the meta information.
The original technique was to allocate a slightly larger block and store the size at the beginning, then give the application the rest of the block. The extra space holds a size and possibly links to thread the free blocks together for reuse.
There are certain issues with those tricks, however, such as poor cache and memory management behavior. Using memory right in the block tends to page things in unnecessarily and it also creates dirty pages which complicate sharing and copy-on-write.
So a more advanced technique is to keep a separate directory. Exotic approaches have also been developed where areas of memory use the same power-of-two sizes.
In general, the answer is: a separate data structure is allocated to keep state.
malloc() and free() are system/compiler dependent so it's hard to give a specific answer.
More information on this other question.
To answer the second half of your question: yes, you can, and a fairly common pattern in C is the following:
typedef struct {
    size_t numElements;
    int elements[1]; /* but enough space malloced for numElements at runtime */
} IntArray_t;

#define SIZE 10
IntArray_t* myArray = malloc(sizeof(IntArray_t) + SIZE * sizeof(int));
myArray->numElements = SIZE;
To answer the second question: yes, you could (kind of) use the same technique as malloc(), by simply storing the size of the array in its first cell.
That lets you pass the array around without passing an additional size argument.
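A hedged sketch of that trick (sized_alloc, sized_length and sized_free are invented names, and real allocators keep considerably more elaborate bookkeeping): the length is hidden just before the pointer handed to the caller.

#include <stdlib.h>

/* Allocate space for n ints, remembering n just before the returned pointer. */
int *sized_alloc(size_t n)
{
    size_t *p = malloc(sizeof(size_t) + n * sizeof(int));
    if (p == NULL)
        return NULL;
    p[0] = n;                 /* hidden length field */
    return (int *)(p + 1);    /* the caller only ever sees the int array */
}

size_t sized_length(const int *arr)
{
    return ((const size_t *)arr)[-1];   /* read the hidden length back */
}

void sized_free(int *arr)
{
    free((size_t *)arr - 1);  /* step back to the start of the real allocation */
}

On typical platforms the size_t prefix leaves the int array correctly aligned, but a production version would have to handle alignment explicitly.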
When we call malloc, it consumes slightly more bytes than we requested. Those extra bytes hold information such as a checksum, the size, and other bookkeeping details.
When we call free, it goes straight to that additional information, where it finds how big the block is and therefore how much to release.

malloc and other associated functions

I have an array named 'ArrayA' and it is full of ints, but I want to add another 5 cells to the end of the array every time a condition is met. How would I do this? (The internet is not being very helpful.)
If this is a static array, you will have to create a new one with more space and copy the data yourself. If it was allocated with malloc(), as the title to your question suggests, then you can use realloc() to do this more-or-less automatically. Note that the address of your array will, in general, have changed.
It is precisely because of the need for "dynamic" arrays that grow (and shrink) as needed, that languages like C++ introduced vectors. They do the management under the covers.
You need the realloc function.
Also note that adding 5 cells at a time is not the best solution for performance.
It is best to double the size of your array every time an increase is needed.
Use two variables, one for the size (the number of integers in use) and one for the capacity (the number of integers the allocation can hold), as in the sketch below.
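A minimal sketch of that size/capacity scheme (the names arrayA, size, capacity and append are just illustrative):

#include <stdlib.h>

static int   *arrayA   = NULL;
static size_t size     = 0;   /* integers actually in use */
static size_t capacity = 0;   /* integers the current allocation can hold */

/* Append one int, doubling the capacity whenever the array is full. */
static int append(int value)
{
    if (size == capacity) {
        size_t new_capacity = capacity ? capacity * 2 : 16;
        int *tmp = realloc(arrayA, new_capacity * sizeof *tmp);
        if (tmp == NULL)
            return -1;        /* arrayA is still valid and unchanged */
        arrayA = tmp;
        capacity = new_capacity;
    }
    arrayA[size++] = value;
    return 0;
}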
In a modern OS it is generally safe to assume that if you allocate a lot of memory that you don't use then it will not actually consume physical RAM, but only exist as virtual mappings. The OS will provide physical RAM as soon as a page (today generally in chunks of 4Kb) is used for the first time.
You can specifically enforce this behavior by using mmap to create a large anonymous mapping (MAP_PRIVATE | MAP_ANONYMOUS) e.g. as much as you intend to hold at maximum. On modern x64 systems virtual mappings can be up to 64Tb large. It is logically memory available to your program, but in practice pages will be added to it as you start using them.
realloc, as described by the other posters, is the naive way to resize a malloc'd block, but make sure that realloc was successful. It can fail!
Problems with memory arise when you use memory once, then stop using it without deallocating it. In contrast, allocated but untouched memory generally does not use resources other than VM table entries.
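A sketch of that reservation technique on a 64-bit Linux/POSIX system (the 64 GiB figure is just an example; whether such a large reservation succeeds also depends on the system's overcommit settings):

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    size_t reserve = (size_t)1 << 36;   /* 64 GiB of address space, not of RAM */
    unsigned char *base = mmap(NULL, reserve, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    base[0] = 1;              /* first touch: the OS commits this one page */
    base[4096 * 1000] = 2;    /* touching a distant page commits only that page */

    munmap(base, reserve);
    return 0;
}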

Efficient memory reallocation question

Let's say I have a program (C++, for example) that allocates multiple objects, never bigger than a given size (let's call it MAX_OBJECT_SIZE).
I also have a region (I'll call it a "page") on the heap (allocated with, say, malloc(PAGE_SIZE), where PAGE_SIZE >= MAX_OBJECT_SIZE).
I keep reserving space in that page until the filled space equals PAGE_SIZE (or at least gets > PAGE_SIZE - MAX_OBJECT_SIZE).
Now, I want to allocate more memory. Obviously my previous "page" won't be enough. So I have at least two options:
Use realloc(page, NEW_SIZE), where NEW_SIZE > PAGE_SIZE;
Allocate a new "page" (page2) and put the new object there.
If I wanted to have a custom allocate function, then:
Using the first method, I'd see how much I had filled, and then put my new object there (and add the size of the object to my filled-memory variable).
Using the second method, I'd have a list (vector? array?) of pages, then look for the current page, and then use a method similar to 1 on the selected page.
Eventually, I'd need a method to free memory too, but I can figure out that part.
So my question is: What is the most efficient way to solve a problem like this? Is it option 1, option 2 or some other option I haven't considered here? Is a small benchmark needed/enough to draw conclusions for real-world situations?
I understand that different operations may perform differently, but I'm looking for an overall metric.
In my experience option 2 is much easier to work with and has minimal overhead. realloc does not guarantee it will grow the existing memory in place, and in practice it often doesn't. If the block moves, you will need to go back and fix up the pointers to all of the old objects; that would require remembering where every allocated object was, which can be a ton of overhead.
But it's hard to qualify "most efficient" without knowing exactly what metrics you use.
This is the memory manager I always use. It works for the entire application, not just one object.
allocs:
For every allocation, determine the size of the object being allocated, then:
1. Look at a linked list of freed objects of that size; if anything has been freed, take the first one.
2. Otherwise look the size up in a lookup table, and if there is no current array for that size,
2.1. allocate an array of N objects of the size being requested.
3. Return the next free object of the desired size.
3.1. If the array is full, add a new page.
N can be tuned by the programmer. If you know you have a million 16-byte objects you might want N to be somewhat higher. For objects over some size X, do not keep an array; simply allocate each one directly. (A sketch of the free-list part appears after this answer.)
frees:
Determine the size of the object and add it to the linked list of freed objects of that size.
If the object is at least the size of a pointer, the linked list does not need to incur any memory overhead: simply reuse the already allocated object's memory to store the list nodes.
The problem with this method is that memory is never returned to the operating system until the application exits or the programmer decides to defragment the memory. Defragmenting is another post; it can be done.
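A stripped-down sketch of the free-list part of that scheme for a single size class (the names pool_alloc/pool_free and the sizes are invented for this example; a real version would keep one such pool per object size): freed objects are threaded into a singly linked list by reusing their own storage as the next pointer.

#include <stdlib.h>

#define OBJ_SIZE      64    /* the one object size this pool handles */
#define OBJS_PER_PAGE 256   /* the tunable N from the description */

/* Each slot is big enough for an object and, when freed, for a next pointer. */
typedef union slot { union slot *next; char object[OBJ_SIZE]; } slot;

static slot  *free_list = NULL;          /* freed objects, ready for reuse */
static slot  *page      = NULL;          /* current partially used page */
static size_t used      = OBJS_PER_PAGE; /* forces a page allocation on first use */

void *pool_alloc(void)
{
    if (free_list != NULL) {             /* reuse a previously freed object */
        slot *s = free_list;
        free_list = s->next;
        return s;
    }
    if (used == OBJS_PER_PAGE) {         /* current page exhausted: add a page */
        page = malloc(OBJS_PER_PAGE * sizeof(slot));
        if (page == NULL)
            return NULL;
        used = 0;
    }
    return &page[used++];                /* next free object in the page */
}

void pool_free(void *p)
{
    slot *s = p;                         /* the object's own memory holds the link */
    s->next = free_list;
    free_list = s;
}

As the answer notes, pages are never handed back to the operating system; only individual objects are recycled through the free list.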
It is not clear from your question why you need to allocate a big block of memory in advance rather than allocating memory for each object as needed. I'm assuming you are using it as a contiguous array. Otherwise, it would make more sense to malloc the memory of each object as it is needed.
If it is indeed acting as an array, malloc-ing another block gives you another chunk of memory that you have to access via another pointer (in your case page2). It is then no longer one contiguous block and you cannot use the two blocks as part of one array.
realloc, on the other hand, allocates one contiguous block of memory. You can use it as a single array and do all sorts of pointer arithmetic not possible if there are separate blocks. realloc is also useful when you actually want to shrink the block you are working with, but that is probably not what you are seeking to do here.
So, if you are using this as an array, realloc is basically the better option. Otherwise, there is nothing wrong with malloc. Actually, you might want to use malloc for each object you create rather than having to keep track of and micro-manage blocks of memory.
You have not given any details on what platform you are experimenting. There are some performance differences for realloc between Linux and Windows, for example.
Depending on the situation, realloc might have to allocate a new memory block if it can't grow the current one and copy the old memory to the new one, which is expensive.
If you don't really need a contiguous block of memory you should avoid using realloc.
My suggestion would be to use the second approach, or use a custom allocator (you could implement a simple buddy allocator).
You could also use more advanced memory allocators, like
APR memory pools
Google's TCMalloc
In the worst case, option 1 can cause a "move" of the original memory, which is extra work. Even when the memory is not moved, the allocator still has to set up the extra space, which is work too. So realloc can be "defeated" by the malloc method, but to say by how much you would have to run tests (and I suspect there is a bias depending on the state of the system when the memory requests are made).
Depending on how many times you expect realloc/malloc to be performed, this may or may not be a useful idea. I would use malloc anyway.
The free strategy depends on the implementation. To free all the pages as a whole, it is enough to "traverse" them; instead of an array, I would use linked "pages": add sizeof(void *) to the "page" size, and use the extra bytes to store the pointer to the next page (a sketch follows this answer).
If you have to free a single object located anywhere in one of the pages, it becomes a little more complex. My idea is to keep a list of non-sequential free "blocks"/"slots" (each able to hold any object). When a new "block" is requested, first pop a value from this list; if the list is empty, take the next "slot" in the last in-use page, allocating a new page when needed. Freeing an object then just means pushing the empty slot's address onto that stack/list (whichever you prefer to use).
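A small sketch of those linked pages (type and function names are invented for the example): each page reserves room for a next pointer, so freeing everything is just a walk down the chain.

#include <stdlib.h>

#define PAGE_BYTES 4096

typedef struct page {
    struct page *next;        /* the extra sizeof(void *) bytes */
    char data[PAGE_BYTES];    /* object slots are carved out of this area */
} page;

static page *head = NULL;     /* most recently added page */

/* Add a fresh page to the front of the chain. */
static page *add_page(void)
{
    page *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->next = head;
    head = p;
    return p;
}

/* Free every page by traversing the chain. */
static void free_all_pages(void)
{
    while (head != NULL) {
        page *next = head->next;
        free(head);
        head = next;
    }
}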
In linux (and probably other POSIX systems) there is a third possibility, that is to use a memory mapped region with shm_open. Such a region is initialized by zeroes once you access it, but AFAIK pages that you never access come with no cost, if it isn't just the address-range in virtual memory that you reserve. So you could just reserve a large chunk of memory at the beginning (more than you ever would need) of your execution and then fill it incrementally from the start.
What is the most efficient way to solve a problem like this? Is it option 1, option 2 or some other option I haven't considered here? Is a small benchmark needed/enough to draw conclusions for real-world situations?
Option 1. For it to be efficient, NEW_SIZE has to depend on the old size non-linearly. Otherwise you risk running into O(n^2) behaviour of realloc() due to the redundant copying. I generally do new_size = old_size + old_size/4 (an increase of 25%), since the theoretically better new_size = old_size*2 might in the worst case reserve far too much unused memory.
Option 2. It should be more optimal, as most modern allocators (thanks in part to C++'s STL) are already well tuned for a flood of small memory allocations, and small allocations are less likely to cause memory fragmentation.
In the end it all depends on how often you allocate the new objects and how you handle freeing. If you allocate a lot with #1 you would have some redundant copying when expanding, but freeing is dead simple since all objects are in the same page. If you need to free/reuse the objects, with #2 you would spend some time walking through the list of pages.
From my experience #2 is better, as moving around large memory blocks might increase the rate of heap fragmentation. #2 also lets you keep plain pointers, since objects never change their location in memory (though for some applications I prefer to use pool_id/index pairs instead of raw pointers). If walking through the pages becomes a problem later, it too can be optimized.
In the end you should also consider option #3: libc. I think libc's malloc() is efficient enough for a great many tasks. Please test it before investing more of your time. Unless you are stuck on some backward *NIX, there should be no problem using malloc() for every smallish object. I have used custom memory management only when I needed to put objects in exotic places (e.g. shm or mmap). Keep multi-threading in mind too: malloc()/realloc()/free() are generally already optimized and MT-ready; you would have to reimplement those optimizations anew to avoid threads constantly colliding on memory management. And if you want memory pools or zones, there are already a bunch of libraries for that too.

Determining realloc() behaviour before calling it

As I understand it, when asked to reserve a larger block of memory, the realloc() function will do one of three different things:
if free contiguous block exists
    grow current block
else if sufficient memory
    allocate new memory
    copy old memory to new
    free old memory
else
    return null
Growing the current block is a very cheap operation, so this is behaviour I'd like to take advantage of. However, if I'm reallocating memory because I want to (for example) insert a char at the start of an existing string, I don't want realloc() to copy the memory. I'll end up copying the entire string with realloc(), then copying it again manually to free up the first array element.
Is it possible to determine what realloc() will do? If so, is it possible to achieve in a cross-platform way?
realloc()'s behavior is likely dependent on its specific implementation. And basing your code on that would be a terrible hack which, to say the least, violates encapsulation.
A better solution for your specific example (sketched in code after this list) is:
Find the size of the current buffer
Allocate a new buffer (with malloc()), greater than the previous one
Copy the prefix you want to the new buffer
Copy the string in the previous buffer to the new buffer, starting after the prefix
Release the previous buffer
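A hedged sketch of those steps (the function name prepend is invented for this example): allocate a fresh buffer, copy the prefix into it, then copy the old string after the prefix; the caller releases the previous buffer.

#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string holding prefix followed by s,
 * or NULL on failure. The caller still owns and frees the old buffer. */
char *prepend(const char *prefix, const char *s)
{
    size_t plen = strlen(prefix);
    size_t slen = strlen(s);
    char *out = malloc(plen + slen + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, prefix, plen);         /* the prefix goes first */
    memcpy(out + plen, s, slen + 1);   /* then the old string, including its '\0' */
    return out;
}

For the example in the question, prepend("c", old) followed by free(old) replaces the realloc-then-shuffle sequence with a single copy.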
As noted in the comments, case 3 in the question (no memory) is wrong; realloc() will return NULL if there is no memory available [question now fixed].
Steve McConnell in 'Code Complete' points out that if you save the return value from realloc() in the only copy of the original pointer when realloc() fails, you've just leaked memory. That is:
void *ptr = malloc(1024);
...
if ((ptr = realloc(ptr, 2048)) == 0)
{
    /* Oops - cannot free original memory allocation any more! */
}
Different implementations of realloc() will behave differently. The only safe thing to assume is that the data will always be moved - that you will always get a new address when you realloc() memory.
As someone else pointed out, if you are concerned about this, maybe it is time to look at your algorithms.
Would storing your string backwards help?
Otherwise...
just malloc() more space than you need, and when you run out of room, copy to a new buffer. A simple technique is to double the space each time; this works pretty well because the larger the string (i.e. the more time copying to a new buffer takes), the less often it needs to happen.
Using this method you can also right-justify your string in the buffer, so it's easy to add characters to the start.
If obstacks are a good match for your memory allocation needs, you can use their fast growing functionality. Obstacks are a feature of glibc, but they are also available in the libiberty library, which is fairly portable.
No - and if you think about it, it can't work. In a multi-threaded application, between you checking what realloc() is going to do and actually doing it, another thread could allocate memory.
If you're worried about this sort of thing, it might be time to look at the data structures you're using to see if you can fix the problem there. Depending on how these strings are constructed, you can do so quite efficiently with a well designed buffer.
Why not keep some empty buffer space in the left of the string, like so:
char* buf = malloc(1024);
char* start = buf + 1024 - 3;
start[0]='t';
start[1]='o';
start[2]='\0';
To add "on" to the beginning of your string to make it "onto\0":
start -= 2;
if (start < buf)
    DO_MEMORY_STUFF(start, buf);  /* time to reallocate! */
start[0] = 'o';
start[1] = 'n';
This way, you won't have to keep copying your buffer every single time you want to do an insertion at the beginning.
If you have to do insertions at both the beginning and end, just have some space allocated at both ends; insertions in the middle will still need you to shuffle elements around, obviously.
A better approach is to use a linked list. Allocate each of your data objects on a page, and when you need more, allocate another page and link to it, either from the previous page or from an index page. That way you find out immediately if the next allocation fails, and you never need to copy memory.
I don't think it's possible in a cross-platform way.
Here is the code of the uClibc implementation, which might give you a clue how to do it in a platform-dependent way; actually it's better to look at the glibc source, but this one was at the top of a Google search :)
