Allocate from buffer in C - c

I am building a simple particle system and want to use a single array buffer of structs to manage my particles. That said, I can't find a C function that allows me to malloc() and free() from an arbitrary buffer. Here is some pseudocode to show my intent:
Particle* particles = (Particle*) malloc( sizeof(Particle) * numParticles );
Particle* firstParticle = <buffer_alloc>( particles );
initialize_particle( firstParticle );
// ... Some more stuff
if (firstParticle->life < 0)
<buffer_free>( firstParticle );
// # program's end
free(particles);
Where <buffer_alloc> and <buffer_free> are functions that allocate and free memory chunks from arbitrary pointers (possibly with additional metadata such as buffer length, etc.). Do such functions exist and/or is there a better way to do this? Thank you!

Yeah, you’d have to write your own. It’s so simple it’s really silly, but its performance will scream in comparison to simply using malloc() and free() all the time....
enum { maxParticles = 1000 }; // an enum constant is a valid array size in C; a const int is not
static Particle particleBuf[maxParticles]; // global static array
static Particle* headParticle;
void initParticleAllocator()
{
Particle* p = particleBuf;
Particle* pEnd = &particleBuf[maxParticles-1];
// create a linked list of unallocated Particles
while (p!=pEnd)
{
*((Particle**)p) = p+1;
++p;
}
*((Particle**)p) = NULL; // terminate the end of the list
headParticle = particleBuf; // point 'head' at the 1st unalloc'ed one
}
Particle* ParticleAlloc()
{
// grab the next unalloc'ed Particle from the list
Particle* ret = headParticle;
if (ret)
headParticle = *(Particle**)ret;
return ret; // will return NULL if no more available
}
void ParticleFree(Particle* p)
{
// return p to the list of unalloc'ed Particles
*((Particle**)p) = headParticle;
headParticle = p;
}
You could modify the approach above to not start with any global static array at all: use malloc() at first when the user calls ParticleAlloc(), but when Particles are returned, don't call free(); instead add them to the linked list of unalloc'ed Particles. The next caller to ParticleAlloc() then gets one off the free list rather than from malloc(), and whenever the free list is empty, ParticleAlloc() falls back on malloc().
Or use a mix of the two strategies, which would really be the best of both worlds: if you know that your user will almost certainly be using at least 1000 Particles but occasionally might need more, you could start with a static array of 1000 and fall back on calling malloc() if you run out. If you do it that way, the malloc()'ed ones do not need special handling; just add them to your list of unalloc'ed Particles when they come back to ParticleFree(). You do NOT need to bother calling free() on them when your program exits; the OS will free the process's entire memory space, so any leaked memory will clear up at that point.
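Here is a rough sketch of the mixed strategy, in case it helps. The Particle fields, POOL_SIZE, and the function names particle_alloc/particle_free are made up for illustration, and a union is used for the free-list link instead of the pointer cast shown above, so treat it as one possible shape rather than the definitive version.

```c
#include <stdlib.h>

/* Hypothetical particle layout; the question does not show the real one. */
typedef struct Particle {
    float x, y, vx, vy;
    float life;
} Particle;

/* A free slot reuses its own bytes to store the link to the next free slot. */
typedef union Slot {
    Particle particle;
    union Slot *next;
} Slot;

enum { POOL_SIZE = 1000 };   /* assumed typical particle count */

static Slot pool[POOL_SIZE];
static Slot *free_list;
static int pool_ready;

static void pool_init(void)
{
    /* Thread every static slot onto the free list. */
    for (int i = 0; i < POOL_SIZE - 1; ++i)
        pool[i].next = &pool[i + 1];
    pool[POOL_SIZE - 1].next = NULL;
    free_list = pool;
    pool_ready = 1;
}

Particle *particle_alloc(void)
{
    if (!pool_ready)
        pool_init();
    if (free_list) {                 /* fast path: pop the free list */
        Slot *s = free_list;
        free_list = s->next;
        return &s->particle;
    }
    /* Free list empty: fall back on malloc(). Allocate a whole Slot so the
       block can join the free list when it is returned. */
    Slot *s = malloc(sizeof *s);
    return s ? &s->particle : NULL;
}

void particle_free(Particle *p)
{
    if (!p)
        return;
    Slot *s = (Slot *)p;             /* particle is a member of the union */
    s->next = free_list;             /* push back; free() is never called */
    free_list = s;
}
```

Both static and malloc()'ed particles go back through the same particle_free(), which is the "no special handling" point made above. This sketch is not thread-safe.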
I should mention that since your question was tagged "C" and not "C++", I answered it in the form of a C solution. In C++, the best way to implement this same thing would be to add "operator new" and "operator delete" methods to your Particle class. They would contain basically the same code as I showed above, but they override (not overload) the global 'new' operator and, for the Particle class only, define a specialized allocator that replaces global 'new'. The cool thing is that users of Particle objects don't even have to know that there's a special allocator; they simply use 'new' and 'delete' as normal and remain blissfully unaware that their Particle objects are coming from a special pre-allocated pool.

Oh, sorry. This question is C only I see. Not C++. Well, if it was C++ the following would help you out.
Look at Boost's pool allocation library.
It sounds to me that each of your allocations is the same size? The size of a particle, correct? If so the pool allocation functions from Boost will work really well and you don't have to write your own.

You would have to write your own, or find someone who has already written them and reuse what they wrote. There isn't a standard C library to manage that scenario, AFAIK.
You'd probably need 4 functions for your 'buffer allocation' code:
typedef struct ba_handle ba_handle;
ba_handle *ba_create(size_t element_size, size_t initial_space);
void ba_destroy(ba_handle *ba);
void *ba_alloc(ba_handle *ba);
void ba_free(ba_handle *ba, void *space);
The create function would do the initial allocation of space, and arrange to parcel out the information in units of the element_size. The returned handle allows you to have separate buffer allocations for different types (or even for the same type several times). The destroy function forcibly releases all the space associated with the handle.
The allocate function provides you with a new unit of space for use. The free function releases that for reuse.
Behind the scenes, the code keeps track of which units are in use (a bit map, perhaps) and might allocate extra space as needed, or might deny space when the initial allocation is used up. You could arrange for it to fail more or less dramatically when it runs out of space (so the allocator never returns a null pointer). The free function can also validate that the pointer it is given was one supplied by the buffer allocator handle that is currently in use. This allows it to detect some errors that regular free() does not normally detect (though the GNU C library version of malloc() et al does seem to do some sanity checking that others do not necessarily do).
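For what it's worth, a minimal sketch of those four functions could look like the following. It makes several assumptions that are mine, not part of any existing library: initial_space is taken to be a count of elements, the buffer never grows (ba_alloc returns NULL when full), one flag byte per unit stands in for the bit map, and ba_free silently ignores pointers it does not recognise.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct ba_handle ba_handle;

struct ba_handle {
    size_t element_size;    /* bytes per unit */
    size_t count;           /* number of units in the buffer */
    unsigned char *space;   /* count * element_size bytes */
    unsigned char *in_use;  /* one flag byte per unit; a bit map would be denser */
};

ba_handle *ba_create(size_t element_size, size_t initial_space)
{
    if (element_size == 0 || initial_space == 0 ||
        initial_space > SIZE_MAX / element_size)
        return NULL;
    ba_handle *ba = malloc(sizeof *ba);
    if (!ba)
        return NULL;
    ba->element_size = element_size;
    ba->count = initial_space;
    ba->space = malloc(element_size * initial_space);
    ba->in_use = calloc(initial_space, 1);
    if (!ba->space || !ba->in_use) {
        free(ba->space);
        free(ba->in_use);
        free(ba);
        return NULL;
    }
    return ba;
}

void ba_destroy(ba_handle *ba)
{
    if (!ba)
        return;
    free(ba->space);        /* releases every unit at once */
    free(ba->in_use);
    free(ba);
}

void *ba_alloc(ba_handle *ba)
{
    for (size_t i = 0; i < ba->count; ++i) {
        if (!ba->in_use[i]) {
            ba->in_use[i] = 1;
            return ba->space + i * ba->element_size;
        }
    }
    return NULL;            /* deny space once the buffer is used up */
}

void ba_free(ba_handle *ba, void *space)
{
    uintptr_t base = (uintptr_t)ba->space;
    uintptr_t addr = (uintptr_t)space;
    if (addr < base)
        return;                             /* not from this buffer */
    size_t offset = (size_t)(addr - base);
    if (offset % ba->element_size != 0)
        return;                             /* not on a unit boundary */
    size_t i = offset / ba->element_size;
    if (i >= ba->count || !ba->in_use[i])
        return;                             /* foreign pointer or double free */
    ba->in_use[i] = 0;
}
```

The linear scan in ba_alloc keeps the sketch short; a free list would make allocation constant-time. A stricter version might abort or log in ba_free instead of returning quietly.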

Maybe try something like this instead...
Particle * particles[numParticles];
particles[0] = malloc(sizeof(Particle));
initialize_particle( particles[0] );
// ... Some more stuff
if (particles[0]->life < 0)
free( particles[0] );
// # program's end
// don't free(particles);

I am building a simple particle system and want to use a single array buffer of structs to manage my particles.
I think you answered it:
static Particle myParticleArray[numParticles];
Gets allocated at the start of the program and deallocated at the end, simple. Or do like your pseudocode and malloc the array all at once. You might ask yourself why allocate a single particle, why not allocate the whole system? Write your API functions to take a pointer to a particle array and an index.
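To make the "pointer to a particle array and an index" idea concrete, here is a small sketch. The Particle fields and the function names are invented for illustration; the only thing borrowed from the question is that a negative life means the particle is dead, which lets a dead slot double as a free slot.

```c
#include <stddef.h>

/* Hypothetical particle layout. */
typedef struct Particle {
    float x, y, vx, vy;
    float life;              /* negative means the slot is free */
} Particle;

/* Mark every slot as free. */
void particles_init(Particle *particles, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        particles[i] = (Particle){ 0.0f, 0.0f, 0.0f, 0.0f, -1.0f };
}

/* Claim the first dead slot; returns its index, or count if all are alive. */
size_t particle_spawn(Particle *particles, size_t count, float life)
{
    for (size_t i = 0; i < count; ++i) {
        if (particles[i].life < 0.0f) {
            particles[i] = (Particle){ 0.0f, 0.0f, 0.0f, 0.0f, life };
            return i;
        }
    }
    return count;
}

/* Advance one particle; it frees itself by letting life drop below zero. */
void particle_update(Particle *particles, size_t index, float dt)
{
    Particle *p = &particles[index];
    p->x += p->vx * dt;
    p->y += p->vy * dt;
    p->life -= dt;
}
```

With this layout there is no per-particle allocation at all: the array is the whole system, and "freeing" is just a particle's life running out.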

Related

Creating a million (or more) string array in C

I'm running into a problem creating an array big enough to store words from a large text document (think books).
Usually, I would just do:
char wordList[1000000][30];
But, as expected the program crashes as soon as it tries to initialize the array. So, I tried a few different things, such as:
char *wordList[30]
int k=0;
while(k<1000000){
wordList[k]= malloc(sizeof(char)*30);
k++;
}
This, too, didn't work. So I'm wondering if there is an easier way. I know it's possible. For the first option, my research has led me to believe that the variable is allocated on the stack (which has little memory) and segfaults.
I'm not sure why the second one fails. Any suggestions? I've searched everywhere I could to find an answer, but most of the suggestions are in java or c++ where you just call new, or arraylist etc.
wordList is an array of 30 char* pointers. You are accessing way beyond the limit of this array. Specifically, you are accessing up to a million elements, but the array only has 30. This causes undefined behaviour.
You need to instead make sure wordList has enough space for 1000000 pointers.
This should be instead:
char *wordList[1000000];
This allows flexibility on the length of the words. The only fixed size here is the array size.
If you instead allocate a fixed size for every word:
wordList[k]= malloc(sizeof(char)*30);
then this will run into issues if a word is longer than 29 characters, since the last byte is needed for the \0 terminator. There are not many words that long, though. Words as long as:
supercalifragilisticexpialidocious
Are hard to come by.
Furthermore, it depends on how you are reading these words from your text document. If you parse the words, and instead use a temporary buffer to store them, then you can do this:
wordList[k]= malloc(strlen(word)+1); /* +1 for null-terminator */
Which will allocate memory for any sized word you copy into wordList[k]. This will be more efficient for smaller words like "the" and "or", instead of allocating 30 spaces for any word.
Note: Reserving a million pointers beforehand is also very wasteful; this should be done on an as-needed basis. It might even be better to use char **wordList, to allow more flexibility with how many words you allocate.
For example, you could allocate a starting size:
size_t start_size = 1000;
char **wordList = malloc(start_size * sizeof(*wordList));
Then if more words are found, you can realloc() more space as needed.
realloc() resizes the block of memory its first argument points to and returns a pointer to the resized block, which may have moved.
An example would be:
if (start_size == word_count) {
    start_size *= 2;
    char **tmp = realloc(wordList, start_size * sizeof(*wordList));
    if (tmp == NULL) {
        /* handle exit; the old wordList block is still allocated */
    }
    wordList = tmp;
}
This leaves wordList pointing at a block with room for twice as many pointers as before.
You should also check the return of malloc() and realloc(), as they can return NULL if unsuccessful. At the end of your program, you should also free() the pointers allocated from malloc().
In practice, a good rule of thumb when programming in C is that "big" data should be allocated in heap memory.
(I am guessing that you are coding for an ordinary laptop or desktop machine running some common operating system; I'm actually thinking of a desktop Linux computer, like the machine I am answering here; but you could adapt my answer to a desktop running some Windows, to a tablet running some Android, to an Apple computer running some MacOSX)
Notice that your current wordList is 30 megabytes (that is sizeof(wordList)==30000000). That is big! But in some cases not big enough (probably not enough for the whole King James Bible, and certainly not for the entire laws, decrees, and jurisprudence of the US or of France, or for the archive of a century-old newspaper). You can easily find textual data of more than several dozen megabytes today, such as all the messages on StackOverflow.
You may need to understand more about current operating systems to understand all of my answer; I recommend reading Operating Systems: Three Easy Pieces and, if your computer runs Linux and you want to code for Linux, Advanced Linux Programming.
You don't want to allocate a big array (or any kind of big data) as local variables (or with alloca(3) ...) on your call stack, because the call stack is limited (typically to one megabyte, or a few of them). On some special computers (think of expensive servers running some specially configured Linux) you could raise that (machine call stack) limit, but perhaps not easily, to several dozens of megabytes. Expecting a gigabyte call stack is not reasonable.
You probably don't want to have huge global or static data (those allocated at compile time in your data segment) of fixed size. If you did that, your program might still run out of memory (because you underestimated that fixed size) or might not even start on smaller computers (e.g. if your data segment had 20 GB, your executable might start on my desktop with 32 GB, but would fail to start, at execve(2) time, on your laptop with only 16 GB).
The remaining option is the usual practice: allocate all "big" data in heap memory by indirectly using primitives that grow your virtual address space. In standard C, you'll extensively use malloc and friends (e.g. calloc), with free to release memory. FWIW, the underlying primitives (to grow the virtual address space) on Linux include mmap(2) and related system calls (and may be called by the malloc implementation on your system). But the standard C dynamic memory allocation techniques (that is malloc & free) hide these gory (implementation-specific) details in the C standard library (so your code using malloc & free could become portable with some effort on your part).
Read some coding rules, e.g. the GNU ones, related to memory usage, and robust programs, notably:
Avoid arbitrary limits on the length or number of any data structure, including file names, lines, files, and symbols, by allocating all data structures dynamically
(emphasis is mine)
Practically speaking, your wordList (that is a poor name; it is not a list but a vector or a table) should probably be a dynamically allocated array of pointers to dynamically allocated strings. You could declare it as char**wordList; and you want to keep its allocated size and used length (perhaps in two other global variables, size_t allocatedSize, usedLength; ...). You might prefer to use a struct ending with a flexible array member.
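As an illustration of the flexible array member variant, here is a minimal sketch. The type name WordVec and the two helper functions are hypothetical; the point is only that the bookkeeping fields and the pointer array live in one heap block.

```c
#include <stdlib.h>

/* Hypothetical container: sizes and the pointer array share one allocation. */
typedef struct WordVec {
    size_t allocated;   /* slots available in words[] */
    size_t used;        /* slots currently filled */
    char *words[];      /* flexible array member (C99) */
} WordVec;

/* Allocate an empty vector with room for `allocated` words, or NULL. */
WordVec *wordvec_create(size_t allocated)
{
    WordVec *v = malloc(sizeof *v + allocated * sizeof v->words[0]);
    if (!v)
        return NULL;
    v->allocated = allocated;
    v->used = 0;
    return v;
}

/* Grow to at least `allocated` slots. Returns the (possibly moved) vector,
   or NULL on failure, in which case the old vector is still valid. */
WordVec *wordvec_reserve(WordVec *v, size_t allocated)
{
    if (allocated <= v->allocated)
        return v;
    WordVec *bigger = realloc(v, sizeof *v + allocated * sizeof v->words[0]);
    if (!bigger)
        return NULL;
    bigger->allocated = allocated;
    return bigger;
}
```

Because realloc may move the whole struct, callers have to replace their pointer with the one wordvec_reserve returns, which is the main cost of this layout compared with a separate char** field.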
Don't forget to check against failure of malloc. Perhaps you want to initialize your data with something like:
allocatedSize=1000;
usedLength=0;
wordList= calloc(allocatedSize, sizeof(char*));
if (!wordList) { perror("initial calloc wordlist"); exit(EXIT_FAILURE); };
Here is a routine to add a new word to your wordList; that routine does not check if the word is indeed new (maybe you want to use some other data structure, like some hash-table, or some self balancing binary search tree); if you want to keep only unique words, read some Introduction to Algorithms. Otherwise, you could use:
void add_new_word(const char*w) {
if (usedLength >= allocatedSize) {
size_t newsize = 4*usedLength/3+10;
(heuristically, we don't want to re-allocate wordList too often; hence the "geometrical" growth above)
char**newlist = calloc(newsize, sizeof(char*));
if (!newlist) { perror("calloc newlist"); exit(EXIT_FAILURE); };
memcpy (newlist, wordList, usedLength*sizeof(char*));
free (wordList);
wordList = newlist;
allocatedSize = newsize;
};
// here we are sure that wordList is not full,
// so usedLength < allocatedSize
char *dw = strdup(w);
if (!dw) { perror("strdup failure"); exit(EXIT_FAILURE); };
Here we are using the very common strdup(3) function to copy a string into a heap-allocated one. If your system doesn't have it, it is really easy to write using strlen, malloc, strcpy ...
wordList[usedLength++] = dw;
} // end of add_new_word
After a call to add_new_word you know that a new word has been added at index usedLength-1 of wordList.
Notice that my add_new_word checks against failure of malloc (including the one called from strdup) and satisfies the "robustness" criterion: all data is heap allocated!
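If your system lacks strdup, a replacement along the lines mentioned above (strlen, malloc, strcpy) might look like this. The name my_strdup is just a placeholder chosen to avoid clashing with a system-provided strdup.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup(3): returns a heap copy of s, or NULL
   if malloc fails. The caller releases the copy with free(). */
char *my_strdup(const char *s)
{
    size_t size = strlen(s) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(size);
    if (copy)
        strcpy(copy, s);
    return copy;
}
```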
BTW, some computers are (IMHO wrongly) enabling memory overcommitment. This is a system administration issue. I dislike that feature (because when it is enabled, malloc would never fail, but programs would crash badly when memory resources are exhausted).
FWIW, here is the routine to release memory, to be called near end of program (or registered thru atexit(3))
void destroy_word_list(void) {
if (!wordList) return;
for (size_t ix=0; ix<usedLength; ix++) free(wordList[ix]);
free (wordList);
usedLength = 0;
allocatedSize = 0;
wordList = NULL;
} // end of destroy_word_list
You may want to use valgrind e.g. to debug memory leaks.

malloc checkpoints

I am sure someone must have implemented something like this already!
What I am looking for is the ability to "checkpoint" the heap state and then clear all allocations that have happened since the last checkpoint.
Basically what I am looking for is a natural corollary of the _CrtMemCheck Apis.
Something like(preferably cross-platform)
//we save the heap state here in s1
_CrtMemCheckpoint( &s1 );
//allocs and frees
//Get rid of all allocs since checkpoint s1 that have not been freed!
_CrtMemClearAllObjectsSince(&s1);
There is no standard way to use mark/release memory allocation in C. If you know for a fact that all malloc/free calls will be used in a LIFO fashion, you may be able to link in your own malloc/free functions using something like the following:
#define MY_HEAP_SIZE 12345678
unsigned char my_mem[MY_HEAP_SIZE];
unsigned char *my_alloc_ptr = my_mem;
void *malloc(size_t size)
{
void *ret = my_alloc_ptr;
if (size <= MY_HEAP_SIZE && ((my_alloc_ptr - my_mem)+size) <= MY_HEAP_SIZE)
{
my_alloc_ptr += size;
return (void*)ret;
}
else
return (void*)0;
}
void free(void *ptr)
{
if (ptr)
my_alloc_ptr = ptr;
}
This approach requires zero bytes of overhead per allocation block, but calling free() on any block will also free all blocks that were allocated later. An alternative approach, which could be used if the external code doesn't use malloc/free in LIFO order but it would be okay if blocks aren't freed until your code does so, would be to make free() do nothing, but have some other function which behaves like free above. More sophisticated variations are possible as well, but in cases where the first approach will suffice, there's no beating its efficiency. Very nice for embedded systems (though I'd usually call it something other than malloc).
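The "free() does nothing, some other function releases" variation could be sketched roughly like this. All the names (arena_alloc, arena_free, arena_checkpoint, arena_release, ARENA_SIZE) are invented here, the functions are deliberately not called malloc/free, and the code assumes a C11 compiler for max_align_t. Unlike the version above, it rounds each request up so that returned pointers stay suitably aligned.

```c
#include <stddef.h>

enum { ARENA_SIZE = 1 << 16 };          /* assumed heap size for the sketch */

/* The max_align_t member forces the byte array to be suitably aligned. */
static union {
    unsigned char bytes[ARENA_SIZE];
    max_align_t align;
} arena;
static size_t arena_used;

typedef size_t arena_mark;

/* Remember how much of the arena is in use right now. */
arena_mark arena_checkpoint(void)
{
    return arena_used;
}

/* Drop every allocation made since the checkpoint, freed or not. */
void arena_release(arena_mark mark)
{
    if (mark <= arena_used)
        arena_used = mark;
}

void *arena_alloc(size_t size)
{
    const size_t unit = sizeof(max_align_t);
    if (size > ARENA_SIZE - arena_used)
        return NULL;
    /* Round up so the next block starts on an aligned offset. */
    size_t rounded = (size + unit - 1) / unit * unit;
    if (rounded > ARENA_SIZE - arena_used)
        return NULL;
    void *p = arena.bytes + arena_used;
    arena_used += rounded;
    return p;
}

/* Individual frees are ignored; memory comes back only via arena_release. */
void arena_free(void *p)
{
    (void)p;
}
```

Usage is the mark/release pattern from the question: take a checkpoint, run the code that allocates, then call arena_release with the saved mark.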
You can modify malloc()/free() using hooks to remember allocated memory (for example, suppose that you record each new pointer in an array of pointers). Then you can have two functions:
int get_checkpoint(), that returns the next free array index,
void free_until(int checkpoint), that frees memory from the current stored pointer in the array backwards, until checkpoint is reached.
This way, you can do:
int cpoint = get_checkpoint();
LibraryDoSomething();
free_until(cpoint);
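A sketch of the bookkeeping behind get_checkpoint() and free_until() follows. To stay portable it uses explicit wrapper functions (tracked_malloc and tracked_free, names invented here) instead of real malloc hooks, and it assumes a fixed-size table of MAX_TRACKED entries, so it only covers code you can route through the wrappers.

```c
#include <stdlib.h>

enum { MAX_TRACKED = 1024 };   /* assumed limit on recorded allocations */

static void *tracked[MAX_TRACKED];
static int tracked_count;

/* malloc() that records the returned pointer in the table. */
void *tracked_malloc(size_t size)
{
    if (tracked_count == MAX_TRACKED)
        return NULL;                       /* table full */
    void *p = malloc(size);
    if (p)
        tracked[tracked_count++] = p;
    return p;
}

/* free() that clears the matching table entry. The slot is set to NULL
   rather than removed so earlier checkpoints keep their meaning. */
void tracked_free(void *p)
{
    if (!p)
        return;
    for (int i = tracked_count - 1; i >= 0; --i) {
        if (tracked[i] == p) {
            tracked[i] = NULL;
            break;
        }
    }
    free(p);
}

/* Returns the next free array index. */
int get_checkpoint(void)
{
    return tracked_count;
}

/* Frees everything recorded since the checkpoint, newest first. */
void free_until(int checkpoint)
{
    while (tracked_count > checkpoint)
        free(tracked[--tracked_count]);    /* free(NULL) is a no-op */
}
```

Slots cleared by tracked_free are only reclaimed by a later free_until, so a long-running program that never rolls back would eventually fill the table.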
Of course, this technique is still dangerous; calling a C library function can have side effects (such as allocations the library still relies on) that freeing everything back to the checkpoint can easily break. The best advice is still that of Amardeep.
Another possible and interesting solution could be the use of LD_PRELOAD. As the man page for LD_PRELOAD states "This can be used to selectively override functions in other shared libraries."
Thus, you can have your own implementations of malloc and free wherein you can implement the required checks and then call the default malloc or free.
You can check the details here: http://somethingswhichidintknow.blogspot.com/2009/10/dll-injection.html

What are the main things that should be considered while deallocating the memory in C?

I tried to find a tutorial that explicitly explains what needs to be kept in mind while deallocating memory, but I could not find one. Can anybody let me know the principal things a programmer should keep in mind while deallocating memory in C? I am currently dealing with linked lists. There are some cases where a new linked list is created from 2 or more existing linked lists. For example:
list l1;
list l2
list l3 = list_append(l1,l2)
list l4 = list_append(l3,l1)
list l5 = list_append(l3,l4)
What sequence of deallocation do I have to follow to free the memory?
Here list_append is a function that returns a copy of the list.
When using the malloc/free family of functions there are two rules to be obeyed.
You can only free valid memory returned by a malloc family allocator, and freeing it renders it invalid (so double freeing is an error as is freeing memory not obtained from malloc).
It is an error to access memory after it has been freed.
And here is the important part: the allocator provides no facilities to help you obey these rules.
You have to manage it yourself. This means that you have to arrange the logic of your program to ensure that these rules are always followed. This is C, and huge amounts of tedious and complex responsibility are dumped on your shoulders.
Let me suggest two patterns that are fairly safe:
Allocate and free in the same context
//...
{
SomeData *p = malloc(sizeof(SomeData));
if (!p) { /* handle failure to allocate */ }
// Initialize p
// use p various ways
// free any blocks allocated and assigned to members of p
free(p);
}
//...
Here you know that the data p points to is allocated once and freed once and only used in between. If the initialization and freeing of SomeData's contents are non-trivial you should wrap them up in a couple of functions so that this reduces to
//...
{
SomeData *p = NewSomeData(i,f,"name"/*,...*/); // this handles initialization
if (!p) { /* handle failure to allocate */ }
// use p various ways
ReleaseSomeData(p); // this handles freeing any blocks
// allocated and assigned to members of p
}
//...
Call this one "Scope Ownership". You'll note that it is not much different from local automatic variables and provides you with only a few options not available with automatic variables.
Call the second option "Structure Ownership": Here responsibility for deleting the allocated block is handed to a larger structure:
List L = NewList();
//...
while (something) {
// ...
Node n= NewNode(nodename);
if (!n) { /* handle failure to allocate */ }
ListAdd(L,n); // <=== Here the list takes ownership of the
// node and you should only access the node
// through the list.
n = NULL; // This is not a memory leak because L knows where the new block is, and
// deleting the knowledge outside of L prevents you from breaking the
// "access only through the structure" constraint.
//...
}
// Later
RemoveListNode(L,key); // <== This routine manages the deletion of one node
// found using key. This is why keeping a separate copy
// of n to access the node would have been bad (because
// your separate copy won't get notified that the block
// is no longer valid).
// much later
ReleaseList(L); // <== Takes care of deleting all remaining nodes
Given that you have a list with nodes to be added and removed, you might consider the Structure Ownership pattern, so remember: once you give the node to the structure you only access it through the structure.
The question in general makes little sense, the only reasonable answer seems rather obvious:
That the memory was dynamically allocated in the first instance
That following deallocation you do not attempt to use the memory again
That you maintain at least one pointer to the allocation until you need to deallocate it (i.e. don't let your only reference to the block go out of scope or be destroyed).
This second requirement can be assisted by setting the pointer to NULL or zero after deallocation, but the pointer may be held elsewhere, so it is not foolproof.
The third requirement is particularly an issue in complex data structures where the allocated memory may contain structures that themselves contain pointers to allocated memory. You will of course need to deallocate these prior to deallocating the higher level structure.
Can anybody let me know what are the principal things that a programmer should keep in mind while deallocating the memory in C.
The basic principles are pretty straightforward: any memory allocated using the *alloc family of functions, including malloc, calloc or realloc, must be deallocated by a corresponding call to free().
When passing a pointer (memory address) to free(), keep in mind that the only valid memory addresses you can pass to free() are memory addresses which were previously returned by one of the *alloc functions. Once a memory address has been passed to free(), that memory address is no longer valid and cannot be used for any other purpose.
The first principle is:
Whatever you allocate (with calloc / malloc) you need to free eventually.
In your case, if lists are deep copied on every append, I don't see what the problem is. You need to free every list separately.
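To show why the order does not matter when every append makes a deep copy, here is a small sketch. The node layout and the helpers list_push, list_free and list_length are assumptions for illustration; only the name list_append and its copying behaviour come from the question.

```c
#include <stdlib.h>

/* Hypothetical list type; the question does not show the real one. */
typedef struct node {
    int value;
    struct node *next;
} node;
typedef node *list;

/* Free every node owned by one list. */
void list_free(list l)
{
    while (l) {
        node *next = l->next;
        free(l);
        l = next;
    }
}

/* Prepend a value; returns the new head, or NULL if malloc fails. */
list list_push(list head, int value)
{
    node *n = malloc(sizeof *n);
    if (!n)
        return NULL;
    n->value = value;
    n->next = head;
    return n;
}

/* Returns a fresh list holding copies of a's nodes followed by b's.
   Neither input is modified or shared with the result. NULL means an
   empty result or an allocation failure. */
list list_append(list a, list b)
{
    node *head = NULL;
    node **tail = &head;
    const node *parts[2] = { a, b };
    for (int i = 0; i < 2; ++i) {
        for (const node *n = parts[i]; n; n = n->next) {
            node *copy = malloc(sizeof *copy);
            if (!copy) {
                list_free(head);
                return NULL;
            }
            copy->value = n->value;
            copy->next = NULL;
            *tail = copy;
            tail = &copy->next;
        }
    }
    return head;
}

size_t list_length(list l)
{
    size_t n = 0;
    for (; l; l = l->next)
        ++n;
    return n;
}
```

Since l3, l4 and l5 share no nodes with their inputs, each of the five lists gets exactly one list_free call, in any order. If list_append shared nodes instead of copying, this would no longer hold and some ownership scheme would be needed.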
Use valgrind(1) to show you, in your particular case, what objects still exist at termination and then modify your code to ensure that unnecessary ones are freed.
Well the best answer is a question.. why the heck are you writing code in C instead of using a higher level language with better memory management?

Where should I deallocate memory within functions?

I'm writing a shell in C. While I don't expect many other people to use it, I'd like to practice writing maintainable and well-organized code. I've noticed the following pattern in a number of my functions, so before it solidifies, I'd like it to be fully vetted.
As an example, consider the following function:
int foo(int param...) {
// declare variables
struct bar *a, *b, *c;
// do some work
a = bar_creator();
b = bar_modifier(a);
c = bar_modifier(b);
// cleanup
free(a);
free(b);
free(c);
return 1;
}
Things to note:
three phases: declaration, initiation/modification, cleanup
newly allocated structures are often returned from functions as modified copies of other objects
a huge number of objects are not needed, so memory usage is not an issue
As it stands, the three sections have been relatively distinct. This allows me to match up the first and last sections and ensure everything is accounted for. Now I wonder if a better style might be to deallocate something as soon as it is not needed. A motivation for this might be to minimize the context within which a code section makes sense.
What is your approach to deallocation of resources? What are the advantages of the given strategy?
edit
To clear up any confusion as to the behavior of functions:
/**
* returns a newly created bar
*/
struct bar *bar_creator();
/**
* takes a bar, and returns a _new_ copy of it that may have been modified.
* the original is not modified.
*/
struct bar *bar_modifier(struct bar *param);
Personally, my preference is to free objects directly after I'm done using them, and only allocate directly before I need them. This forces me to understand what memory my program is actually using. Another benefit of this technique is that it reduces total memory consumption if you allocate additional memory after you free memory in the method.
there are two different situations to consider:
(1) an object is created in the local scope and it is not needed outside this local scope.
In this case you could allocate storage with alloca() or with a RAII approach. Using alloca() has the big advantage that you don't have to care about calling free(), because the allocated memory is automatically freed when the function that called alloca() returns.
(2) an object is created in the local scope and it is needed outside this local scope.
In this case there is no general advice. I would free the memory when the object is no longer needed.
EDITED: use alloca() instead of calloc()
I tend to group frees at the end unless I am reusing a variable and need to free it first. This way it's clearer what exactly needs to be destroyed, which is helpful if you are considering an early return or if the function is a bit more complex. Often your function will have a few different control flows and you want to be sure they all hit the clean up at the end, which is easier to see when the cleanup code is at the end.
I usually favor the smallest scope possible, thus I create objects as late as possible, and I release (free) them as early as possible.
I will tend to have:
char * foo;
/* some work */
{
foo = create();
/* use foo */
destroy(foo);
}
/* some other work */
{
foo = create();
/* use foo */
destroy(foo);
}
Even if I could have reused the memory, I prefer to alloc it twice and release it twice. Most of the time the performance hit of this technique is very small, as most of the time the two objects are different anyway, and if it's a problem, I tend to optimize this very late in the dev process.
Now if you have 2 objects with the same scope (or three as in your example), it's the same thing:
{
foo1 = create();
foo2 = create();
foo3 = create();
/* do something */
destroy(foo1);
destroy(foo2);
destroy(foo3);
}
But this particular layout is only relevant when the three objects have the same scope.
I tend to avoid this kind of layout:
{
foo1 = create();
{
foo2 = create();
/* use foo2 */
}
destroy(foo1);
/* use foo2 again */
destroy(foo2);
}
As I consider this broken.
Of course the {} are only here for the example, but you can also use them in the actual code, or vim folds or anything that denote scope.
When I need a larger scope (eg global or shared), I use reference count and a retain release mechanism (replace create with retain and destroy with release), and this has always ensured me a nice and simple memory management.
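A bare-bones version of such a retain/release mechanism might look like the sketch below. The Buffer type and the function names are made up for the example, and the counter is not atomic, so it assumes single-threaded use.

```c
#include <stdlib.h>

/* Hypothetical shared object with an embedded reference count. */
typedef struct Buffer {
    unsigned refcount;
    size_t size;
    unsigned char *data;
} Buffer;

/* Create a zero-filled buffer owned by the caller (refcount 1), or NULL. */
Buffer *buffer_create(size_t size)
{
    Buffer *b = malloc(sizeof *b);
    if (!b)
        return NULL;
    b->data = calloc(size ? size : 1, 1);
    if (!b->data) {
        free(b);
        return NULL;
    }
    b->refcount = 1;
    b->size = size;
    return b;
}

/* Register one more owner; returns b for convenience. */
Buffer *buffer_retain(Buffer *b)
{
    if (b)
        b->refcount++;
    return b;
}

/* Drop one owner; the last release frees the object. */
void buffer_release(Buffer *b)
{
    if (b && --b->refcount == 0) {
        free(b->data);
        free(b);
    }
}
```

Each owner pairs one retain (or the initial create) with exactly one release, which gives the same create/destroy symmetry as the scoped examples above, just spread across owners.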
Usually dynamically allocated memory has a long lifetime (longer than a function call) so it is meaningless to talk about where within a function it is deallocated.
If memory is only needed within the scope of a function, depending on the language it should be statically allocated, if appropriate, on the stack (declared as a local variable in the function, it will be allocated when the function is called and freed when the function exits, as shown in an example by another poster).
As far as naming is concerned, only functions that allocate memory and return it need to be specially named. For anything else, don't bother saying "modifier"; use that letterspace for describing what the function does. I.e. by default, assume that a function is not allocating memory unless specifically named so (e.g. createX, allocX, etc.).
In languages or situations (i.e. to provide consistency with code elsewhere in the program) where static allocation is not appropriate, mimic the stack allocation pattern by allocating at the beginning of the function call, and freeing at the end.
For clarity, if your function simply modifies the object, don't use a function at all. Use a procedure. This makes it absolutely clear that no new memory is being allocated. In other words, eliminate your pointers b and c - they are unnecessary. They can modify what a is pointing to without returning a value.
From the looks of your code, either you are freeing already freed memory, or bar_modifier is misleadingly named in that it is not simply modifying the memory pointed to by a, but creating brand new dynamically allocated memory. In this case, they shouldn't be named bar_modifier but create_SomethingElse.
Why are you freeing it 3 times?
If bar_creator() is the only function that allocates memory dynamically you only need to free one of the pointers that point to that area of memory.
When you have finished with it!
Do not let cheap memory prices promote lazy programming.
You need to be careful about what happens when the memory allocation fails. Because C doesn't have support for exceptions, I use goto to manage unwinding dynamic state on error. Here's a trivial manipulation of your original function, demonstrating the technique:
int foo(int param...) {
// declare variables
struct bar *a, *b, *c;
// do some work
a = bar_creator();
if(a == (struct bar *) 0)
goto err0;
b = bar_modifier(a);
if(b == (struct bar *) 0)
goto err1;
c = bar_modifier(b);
if(c == (struct bar *) 0)
goto err2;
// cleanup
free(a);
free(b);
free(c);
return 1;
err2:
free(b);
err1:
free(a);
err0:
return -1;
}
When using this technique, I always want to have a return statement preceding the error labels, to visually distinguish the normal return case from the error case. Now, this presumes that you use a wind / unwind paradigm for your dynamically allocated memory... What you're doing looks more sequential, so I'd probably have something closer to the following:
a = bar_creator();
if(a == (struct bar *) 0)
goto err0;
/* work with a */
b = bar_modifier(a);
free(a);
if(b == (struct bar *) 0)
goto err0;
/* work with b */
c = bar_modifier(b);
free(b);
if(c == (struct bar *) 0)
goto err0;
/* work with c */
free(c);
return 1;
err0:
return -1;
Let the compiler clean the stack for you?
int foo(int param...) {
// declare variables
struct bar a, b, c;
// do some work
bar_creator(/*retvalue*/&a);
bar_modifier(a,/*retvalue*/&b);
bar_modifier(b,/*retvalue*/&c);
return 1;
}
For complicated code I would use structure charts to show the way the subroutines work together and then for allocation/deallocation I try to make these occur at roughly the same level in the charts for a given object.
In your case, I might be tempted to define a new function called bar_destroyer, call this 3 times at the end of function foo, and do the free() in there.
Consider using a different pattern. Allocate variables on the stack if it is reasonable to do so (using declarations, not alloca). Consider making your bar_creator a bar_initialiser which takes a struct bar *.
Then you can make your bar_modifier look like
void bar_modifier(const struct bar * source, struct bar *dest);
Then you don't need to worry so much about memory allocation.
In general it is nicer in C to have the caller allocate memory, not the callee - hence why strcpy is a "nicer" function, in my opinion, than strdup.
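Here is a rough sketch of the caller-allocates pattern, assuming a hypothetical one-field `struct bar` and made-up function bodies (only the `bar_modifier` signature comes from the text above):

```c
/* Hypothetical contents; the question does not show struct bar. */
struct bar { int value; };

/* The caller owns the storage; these functions only fill it in. */
void bar_initialiser(struct bar *b)
{
    b->value = 1;
}

void bar_modifier(const struct bar *source, struct bar *dest)
{
    dest->value = source->value * 2;
}

int foo(void)
{
    struct bar a, b, c;   /* automatic storage: nothing to free */

    bar_initialiser(&a);
    bar_modifier(&a, &b);
    bar_modifier(&b, &c);
    return c.value;
}
```

With no allocation inside the callees there is no failure path to unwind, which is why the error labels disappear entirely.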

C Memory Management

I've always heard that in C you have to really watch how you manage memory. I'm still beginning to learn C, but thus far I have not had to do any memory-management-related activities at all. I always imagined having to release variables and do all sorts of ugly things. But this doesn't seem to be the case.
Can someone show me (with code examples) an example of when you would have to do some "memory management" ?
There are two places where variables can be put in memory. When you create a variable like this:
int a;
char c;
char d[16];
The variables are created in the "stack". Stack variables are automatically freed when they go out of scope (that is, when the code can't reach them anymore). You might hear them called "automatic" variables, but that has fallen out of fashion.
Many beginner examples will use only stack variables.
The stack is nice because it's automatic, but it also has two drawbacks: (1) The compiler needs to know in advance how big the variables are, and (2) the stack space is somewhat limited. For example: in Windows, under default settings for the Microsoft linker, the stack is set to 1 MB, and not all of it is available for your variables.
If you don't know at compile time how big your array is, or if you need a big array or struct, you need "plan B".
Plan B is called the "heap". You can usually create variables as big as the Operating System will let you, but you have to do it yourself. Earlier postings showed you one way you can do it, although there are other ways:
int size;
// ...
// Set size to some value, based on information available at run-time. Then:
// ...
char *p = (char *)malloc(size);
(Note that variables in the heap are not manipulated directly, but via pointers)
Once you create a heap variable, the problem is that the compiler can't tell when you're done with it, so you lose the automatic releasing. That's where the "manual releasing" you were referring to comes in. Your code is now responsible to decide when the variable is not needed anymore, and release it so the memory can be taken for other purposes. For the case above, with:
free(p);
What makes this second option "nasty business" is that it's not always easy to know when the variable is not needed anymore. Forgetting to release a variable when you don't need it will cause your program to consume more memory than it needs to. This situation is called a "leak". The "leaked" memory cannot be used for anything until your program ends and the OS recovers all of its resources. Even nastier problems are possible if you release a heap variable by mistake before you are actually done with it.
In C and C++, you are responsible for cleaning up your heap variables as shown above. However, there are languages and environments such as Java and .NET languages like C# that use a different approach, where the heap gets cleaned up on its own. This second method, called "garbage collection", is much easier on the developer but you pay a penalty in overhead and performance. It's a balance.
(I have glossed over many details to give a simpler, but hopefully more leveled answer)
Here's an example. Suppose you have a strdup() function that duplicates a string:
char *strdup(const char *src)
{
char * dest;
dest = malloc(strlen(src) + 1);
if (dest == NULL)
abort();
strcpy(dest, src);
return dest;
}
And you call it like this:
int main(void)
{
char *s;
s = strdup("hello");
printf("%s\n", s);
s = strdup("world");
printf("%s\n", s);
}
You can see that the program works, but you have allocated memory (via malloc) without freeing it up. You have lost your pointer to the first memory block when you called strdup the second time.
This is no big deal for this small amount of memory, but consider the case:
for (i = 0; i < 1000000000; ++i) /* billion times */
s = strdup("hello world"); /* 12 bytes, counting the terminating '\0' */
You have now used up about 12 GB of memory (possibly more, depending on your memory manager), and if you have not crashed, your process is probably running pretty slowly.
To fix, you need to call free() for everything that is obtained with malloc() after you finish using it:
s = strdup("hello");
free(s); /* now not leaking memory! */
s = strdup("world");
...
Hope this example helps!
You have to do "memory management" when you want to use memory on the heap rather than the stack. If you don't know how large to make an array until runtime, then you have to use the heap. For example, you might want to store something in a string, but don't know how large its contents will be until the program is run. In that case you'd write something like this:
char *string = malloc(stringlength); // stringlength is the number of bytes to allocate
// Do something with the string...
free(string); // Free the allocated memory
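As a slightly fuller sketch of the same idea, here is a function whose result size is only known at run time. The name `join_strings` is made up for this example:

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated "a" + "b", or NULL if allocation fails.
   The caller must free() the result. */
char *join_strings(const char *a, const char *b)
{
    size_t len_a = strlen(a);
    size_t len_b = strlen(b);
    char *out = malloc(len_a + len_b + 1);  /* +1 for the terminating '\0' */

    if (out == NULL)
        return NULL;
    memcpy(out, a, len_a);
    memcpy(out + len_a, b, len_b + 1);      /* copies b's terminator too */
    return out;
}
```

A caller would write something like `char *s = join_strings("foo", "bar");`, use `s`, and then call `free(s)` once it is done with it.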
I think the most concise way to answer the question is to consider the role of the pointer in C. The pointer is a lightweight yet powerful mechanism that gives you immense freedom at the cost of immense capacity to shoot yourself in the foot.
In C the responsibility of ensuring your pointers point to memory you own is yours and yours alone. This requires an organized and disciplined approach, unless you forsake pointers, which makes it hard to write effective C.
The posted answers to date concentrate on automatic (stack) and heap variable allocations. Using stack allocation does make for automatically managed and convenient memory, but in some circumstances (large buffers, recursive algorithms) it can lead to the horrendous problem of stack overflow. Knowing exactly how much memory you can allocate on the stack is very dependent on the system. In some embedded scenarios a few dozen bytes might be your limit, in some desktop scenarios you can safely use megabytes.
Heap allocation is less inherent to the language. It is basically a set of library calls that grants you ownership of a block of memory of a given size until you are ready to return ('free') it. It sounds simple, but is associated with untold programmer grief. The problems are simple (freeing the same memory twice, or not at all [memory leaks], not allocating enough memory [buffer overflow], etc.) but difficult to avoid and debug. A highly disciplined approach is absolutely mandatory in practice, but of course the language doesn't actually mandate it.
I'd like to mention another type of memory allocation that's been ignored by other posts. It's possible to statically allocate variables by declaring them outside any function. I think in general this type of allocation gets a bad rap because it's used by global variables. However there's nothing that says the only way to use memory allocated this way is as an undisciplined global variable in a mess of spaghetti code. The static allocation method can be used simply to avoid some of the pitfalls of the heap and automatic allocation methods. Some C programmers are surprised to learn that large and sophisticated C embedded and games programs have been constructed with no use of heap allocation at all.
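To illustrate, here is a minimal sketch of a fixed-capacity pool that lives entirely in static storage. `struct item`, `MAX_ITEMS` and the function names are invented for the example; a real program would size the pool for its worst case:

```c
#include <stddef.h>

#define MAX_ITEMS 4   /* arbitrary capacity chosen for the example */

struct item { int id; int in_use; };

/* Statically allocated: exists for the whole run, never malloc'ed or freed. */
static struct item item_pool[MAX_ITEMS];

/* Hands out an unused slot, or NULL when the pool is exhausted. */
struct item *item_acquire(int id)
{
    for (size_t i = 0; i < MAX_ITEMS; i++) {
        if (!item_pool[i].in_use) {
            item_pool[i].in_use = 1;
            item_pool[i].id = id;
            return &item_pool[i];
        }
    }
    return NULL;
}

/* Marks the slot as reusable; the memory itself stays where it is. */
void item_release(struct item *it)
{
    if (it != NULL)
        it->in_use = 0;
}
```

Because the pool is `static` at file scope it is invisible to other translation units, so the storage is static without being a free-for-all global.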
There are some great answers here about how to allocate and free memory. In my opinion the more challenging side of using C is ensuring that the only memory you use is memory you've allocated. If this isn't done correctly, what you end up with is the cousin of this site, a buffer overflow, and you may be overwriting memory that is in use by some other part of your program, with very unpredictable results.
An example:
#include <stdlib.h>
#include <string.h>
int main() {
char* myString = (char*)malloc(5*sizeof(char));
strcpy(myString, "abcd"); /* copy into the buffer; "myString = ..." would only repoint the pointer and leak the block */
}
At this point you've allocated 5 bytes for myString and filled it with "abcd\0" (strings end in a null - \0).
If your copy was instead
strcpy(myString, "abcde");
You would be writing "abcde" into the 5 bytes you've had allocated to your program, and the trailing null character would land just past the end of them, in a part of memory that hasn't been allocated for your use. It could be free, but it could equally be in use by something else in your program. This is the critical part of memory management, where a mistake will have unpredictable (and sometimes unrepeatable) consequences.
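One way to guard against this kind of overrun is to compare the required size with the buffer's capacity before copying. The helper below is a sketch with a made-up name, not a standard library function:

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst only if it fits, terminator included.
   Returns 0 on success, -1 if dst (of capacity cap bytes) is too small. */
int copy_checked(char *dst, size_t cap, const char *src)
{
    size_t needed = strlen(src) + 1;   /* characters plus '\0' */

    if (needed > cap)
        return -1;
    memcpy(dst, src, needed);
    return 0;
}
```

With a 5-byte buffer, copying "abcd" succeeds, while "abcde" is rejected and leaves the buffer untouched.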
A thing to remember is to always initialize your pointers to NULL, since an uninitialized pointer may contain an arbitrary value that happens to be a valid memory address, which lets pointer errors go ahead silently. If every pointer is initialized to NULL, you can always catch the case where you use it before setting it. The reason is that most operating systems leave the page at virtual address 0 unmapped, so dereferencing a null pointer raises a fault (a segmentation fault or access violation) instead of quietly reading or writing memory.
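The same reasoning applies after `free()`: the pointer still holds the old address, so resetting it to NULL keeps later mistakes loud. A small helper along these lines (the name `release` is just for this sketch) does both steps together:

```c
#include <stdlib.h>

/* Frees *pp and resets it to NULL so a later accidental use is an
   obvious null dereference instead of a silent use-after-free. */
void release(char **pp)
{
    if (pp != NULL) {
        free(*pp);      /* free(NULL) is a no-op, so repeating this is safe */
        *pp = NULL;
    }
}
```

Typical use is `char *p = NULL;` at declaration, `p = malloc(n);` later, and `release(&p);` when finished. Calling `release(&p)` a second time is harmless.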
Also, you might want to use dynamic memory allocation when you need to define a huge array, say int[10000]. Put something that large (or larger) on the stack, especially inside a recursive function or on a system with a small stack, and you risk a stack overflow.
Another good example would be an implementation of a data structure, say linked list or binary tree. I don't have a sample code to paste here but you can google it easily.
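For what it's worth, here is a minimal binary search tree sketch showing where the allocation and the matching cleanup go. The names are illustrative, and it ignores balancing entirely:

```c
#include <stdlib.h>

struct node {
    int key;
    struct node *left, *right;
};

/* Inserts key and returns the root of the (sub)tree. If allocation
   fails the tree is left unchanged. Duplicate keys are ignored. */
struct node *tree_insert(struct node *root, int key)
{
    if (root == NULL) {
        struct node *n = malloc(sizeof *n);
        if (n != NULL) {
            n->key = key;
            n->left = n->right = NULL;
        }
        return n;
    }
    if (key < root->key)
        root->left = tree_insert(root->left, key);
    else if (key > root->key)
        root->right = tree_insert(root->right, key);
    return root;
}

int tree_count(const struct node *root)
{
    if (root == NULL)
        return 0;
    return 1 + tree_count(root->left) + tree_count(root->right);
}

/* Every node was malloc'ed individually, so every node must be freed. */
void tree_free(struct node *root)
{
    if (root == NULL)
        return;
    tree_free(root->left);
    tree_free(root->right);
    free(root);
}
```

The number of nodes depends on the input, which is exactly the situation where stack variables alone are not enough.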
(I'm writing because I feel the answers so far aren't quite on the mark.)
The case where memory management is really worth mentioning is when you have a problem / solution that requires you to create complex structures. (If your program crashes because you allocate too much space on the stack at once, that's a bug.) Typically, the first data structure you'll need to learn is some kind of list. Here's a singly linked one, off the top of my head:
typedef struct listelem { struct listelem *next; void *data;} listelem;
listelem * create(void * data)
{
listelem *p = calloc(1, sizeof(listelem));
if(p) p->data = data;
return p;
}
listelem * delete(listelem * p)
{
listelem *next = p->next;
free(p);
return next;
}
void deleteall(listelem * p)
{
while(p) p = delete(p);
}
void foreach(listelem * p, void (*fun)(void *data) )
{
for( ; p != NULL; p = p->next) fun(p->data);
}
listelem * merge(listelem *p, listelem *q)
{
listelem *head = p; /* remember the start; p is about to walk to the tail */
if(p == NULL) return q;
while(p->next != NULL) p = p->next;
p->next = q;
return head;
}
Naturally, you'd like a few other functions, but basically, this is what you need memory management for. I should point out that there are a number of tricks that are possible with "manual" memory management, e.g.,
Using the fact that malloc returns a pointer suitably aligned for any object type (the language standard guarantees this; in practice it means at least 4-byte and usually 8- or 16-byte alignment), so the low bits of the pointer are always zero,
allocating extra space for some sinister purpose of your own,
creating memory pools..
Get a good debugger... Good luck!
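As one harmless instance of "allocating extra space", here is a sketch that puts a length header and its payload in a single allocation using a C99 flexible array member. `struct blob` and `blob_create` are names invented for the example, and the size computation does not guard against overflow for absurdly large `len`:

```c
#include <stdlib.h>
#include <string.h>

/* One allocation holds both the header and the payload that follows it. */
struct blob {
    size_t len;
    char data[];   /* C99 flexible array member */
};

struct blob *blob_create(const void *src, size_t len)
{
    struct blob *b = malloc(sizeof *b + len);  /* header + extra space */
    if (b == NULL)
        return NULL;
    b->len = len;
    memcpy(b->data, src, len);
    return b;      /* released with a single free(b) */
}
```

One `malloc` and one `free` per object means there is no separate payload pointer to forget about.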
@Euro Micelli
One negative to add is that pointers to the stack are no longer valid when the function returns, so you cannot return a pointer to a stack variable from a function. This is a common error and a major reason why you can't get by with just stack variables. If your function needs to return a pointer, then you have to malloc and deal with memory management.
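A short sketch of the difference, with the broken variant shown only inside a comment. The function names and the greeting text are made up for illustration:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* WRONG (shown only as a comment): buf dies when the function returns.
 *
 *     char *greet_bad(const char *name) {
 *         char buf[64];
 *         snprintf(buf, sizeof buf, "hello, %s", name);
 *         return buf;            // dangling pointer
 *     }
 */

/* The result lives on the heap, so it stays valid after the return.
   The caller takes ownership and must free() it. */
char *greet(const char *name)
{
    static const char prefix[] = "hello, ";
    size_t size = sizeof prefix + strlen(name);  /* sizeof counts the '\0' */
    char *out = malloc(size);

    if (out != NULL)
        snprintf(out, size, "%s%s", prefix, name);
    return out;
}
```

The price of the working version is that every caller of `greet` now has a `free()` to remember.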
@Ted Percival:
...you don't need to cast malloc()'s return value.
You are correct, of course. I believe that has always been true, although I don't have a copy of K&R to check.
I don't like a lot of the implicit conversions in C, so I tend to use casts to make "magic" more visible. Sometimes it helps readability, sometimes it doesn't, and sometimes it causes a silent bug to be caught by the compiler. Still, I don't have a strong opinion about this, one way or another.
This is especially likely if your compiler understands C++-style comments.
Yeah... you caught me there. I spend a lot more time in C++ than C. Thanks for noticing that.
In C, you actually have two different choices. One, you can let the system manage the memory for you. Alternatively, you can do that by yourself. Generally, you would want to stick to the former as long as possible. However, auto-managed memory in C is extremely limited and you will need to manually manage the memory in many cases, such as:
a. You want the variable to outlive the function that creates it, and you don't want a global variable. For example:
struct pair{
int val;
struct pair *next;
};
struct pair* new_pair(int val){
struct pair* np = malloc(sizeof(struct pair));
if(np == NULL) return NULL; /* allocation failed; let the caller decide what to do */
np->val = val;
np->next = NULL;
return np;
}
b. You want to have dynamically allocated memory. The most common example is an array without a fixed length:
int *my_special_array;
my_special_array = malloc(sizeof(int) * number_of_element);
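for(i=0; i<number_of_element; i++) my_special_array[i] = 0; /* assumed loop body: the original was cut off after "i" */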
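When the final length is not known even at allocation time, the usual extension of this idea is to grow the array with `realloc`. The following is a sketch with invented names (`struct int_vec`, `int_vec_push`), using a simple doubling policy as an assumption:

```c
#include <stdlib.h>

struct int_vec {
    int *items;
    size_t count;      /* elements in use */
    size_t capacity;   /* elements allocated */
};

/* Appends value, growing the buffer when it is full.
   Returns 0 on success, -1 if memory could not be obtained
   (the existing contents remain intact in that case). */
int int_vec_push(struct int_vec *v, int value)
{
    if (v->count == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 4;
        int *grown = realloc(v->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;         /* v->items is still valid and still owned by v */
        v->items = grown;
        v->capacity = new_cap;
    }
    v->items[v->count++] = value;
    return 0;
}

void int_vec_free(struct int_vec *v)
{
    free(v->items);
    v->items = NULL;
    v->count = v->capacity = 0;
}
```

Start from `struct int_vec v = {0};` so that the first `realloc` receives a null pointer and behaves like `malloc`. Assigning the result to a temporary first avoids losing the old block if `realloc` fails.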
c. You want to do something REALLY dirty. For example, I would want a struct to represent many kinds of data and I don't like union (union looks soooo messy):
struct data{
int data_type;
void *data_in_mem; /* a void * can hold a pointer to any object type */
};
struct animal{/*something*/};
struct person{/*some other thing*/};
struct animal* read_animal();
struct person* read_person();
/*In main*/
struct data sample;
sample.data_type = input_type;
switch(input_type){
case DATA_PERSON:
sample.data_in_mem = read_person();
break;
case DATA_ANIMAL:
sample.data_in_mem = read_animal();
break;
default:
printf("Oh hoh! I warn you, that again and I will seg fault your OS");
}
See, a void * is enough to point at ANYTHING. (A long is not a safe substitute: it is not guaranteed to be wide enough for a pointer, and on 64-bit Windows it is only 32 bits.) Just remember to free it, or you WILL regret it. This is among my favorite tricks to have fun in C :D.
However, generally, you would want to stay away from your favorite tricks (T___T). You WILL break your OS, sooner or later, if you use them too often. As long as you don't use *alloc and free, it is safe to say that you are still virgin, and that the code still looks nice.
Sure. If you create an object that exists outside of the scope you use it in. Here is a contrived example (bear in mind my syntax will be off; my C is rusty, but this example will still illustrate the concept):
class MyClass
{
SomeOtherClass *myObject;
public:
MyClass()
{
//The object is created when the class is constructed
//(sizeof(*myObject) is the size of the object, not of the pointer)
myObject = (SomeOtherClass*)malloc(sizeof(*myObject));
}
~MyClass()
{
//The class is destructed
//If you don't free the object here, you leak memory
free(myObject);
}
void SomeMemberFunction()
{
//Some use of the object
myObject->SomeOperation();
}
};
In this example, I'm using an object of type SomeOtherClass during the lifetime of MyClass. The SomeOtherClass object is used in several functions, so I've dynamically allocated the memory: the SomeOtherClass object is created when MyClass is created, used several times over the life of the object, and then freed once MyClass is freed.
Obviously if this were real code, there would be no reason (aside from possibly stack memory consumption) to create myObject in this way, but this type of object creation/destruction becomes useful when you have a lot of objects, and want to finely control when they are created and destroyed (so that your application doesn't suck up 1GB of RAM for its entire lifetime, for example), and in a Windowed environment, this is pretty much mandatory, as objects that you create (buttons, say), need to exist well outside of any particular function's (or even class') scope.
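Since the question is about plain C, here is roughly how the same ownership pattern might look without classes. `struct car` and `struct engine` are hypothetical stand-ins for MyClass and SomeOtherClass, and the create/destroy functions play the role of constructor and destructor:

```c
#include <stdlib.h>

/* Hypothetical stand-ins for SomeOtherClass and MyClass. */
struct engine { int rpm; };
struct car    { struct engine *engine; };

/* "Constructor": the owned engine is created together with the car. */
struct car *car_create(void)
{
    struct car *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    c->engine = malloc(sizeof *c->engine);
    if (c->engine == NULL) {
        free(c);               /* undo the partial construction */
        return NULL;
    }
    c->engine->rpm = 0;
    return c;
}

/* Used across several calls during the car's lifetime. */
void car_accelerate(struct car *c, int delta)
{
    c->engine->rpm += delta;
}

/* "Destructor": frees the owned engine, then the car itself. */
void car_destroy(struct car *c)
{
    if (c == NULL)
        return;
    free(c->engine);
    free(c);
}
```

Nothing calls `car_destroy` automatically in C, so pairing every `car_create` with exactly one `car_destroy` is the programmer's job.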

Resources