Writing a memory management function in C?

In my beginning programs in C, I noticed I call free a lot, so I thought of making a call-once function that frees up everything. Is this code a valid way of doing it, or are there any other suggestions to improve it?
#include <stdio.h>
#include <stdlib.h>
void *items_to_free[1024];
int intItemsToFree = 0;
void mm_init(void)
{
int i;
for (i = 0; i < 1024; i++)
{
items_to_free[i] = NULL;
}
}
void mm_release(void)
{
int i;
for (i = 0; i < 1024; i++)
{
if (items_to_free[i])
{
printf("Freeing %p\n", items_to_free[i]);
free(items_to_free[i]);
items_to_free[i] = NULL;
}
}
}
void mm_add(void *p)
{
items_to_free[intItemsToFree++] = p;
}
int main(void)
{
int *i = NULL;
/* initialize memory management */
mm_init();
/* allocate something */
i = (int *) malloc(sizeof(int) * 10);
/* add it to the memory collection */
mm_add(i);
/* run program as usual */
printf("App doing something...");
/* at the end, free all memory */
mm_release();
return 0;
}
Output:
App doing something...Freeing 0x100103b30

While for a simple application this may seem like a good idea, in reality it's not. Let's consider two cases:
1) mm_release is called at program termination
This means that mm_release is completely useless and is a waste of time. Any mainstream OS from the last few decades reclaims that memory for you in one big gulp when the process exits. Doing it yourself piece by piece is just a waste of time.
2) mm_release is called somewhere in between
This means that mm_release has to be specialized. You release memory during execution because you are done with some memory and you want to give it back so it could be used somewhere else in your program. mm_release would have to be given exact information on what should be released and what not. This is exactly what free does.
So as you can see, mm_release is really not helping you at all. In the first case, it's useless and you can simply get rid of it. In the second case, you are better off directly using free since you are selectively freeing memory anyway.
Note also that your method is very thread-unfriendly: mm_add updates a global array and counter with no synchronization, so two threads registering pointers at the same time can corrupt the list. It also never checks that fewer than 1024 pointers have been registered.
You may think that mm_release could group the allocated memory in related sets, where you can free all the memory in a set with one call. While this may look attractive, it's again quite useless in reality. Either you don't have many memory allocations that are semantically similar enough to be grouped, or if you do, they are already put together in an array or equivalent.
So either the memory sets have single elements (which means you don't gain anything by using this method), or you are simply avoiding a for loop at the cost of an unnecessarily complicated library.
Last but not least, memory is a resource in the same system just as many others. You open files one by one and close them one by one. You get semaphores one by one and you release them one by one. You open pipes one by one and you close them one by one. Heck, you even open { one by one and close it with } one by one. It doesn't make sense to make an exception for memory.
In fact, some people who were very afraid of memory tried your method in the past. They called it garbage collector and it's an insult to regularity in resource management. (Those same people were also very afraid of pointers and basically programming in general)

Related

How to deal with assert() in a function, when you have dynamically allocated memory in main?

I have the following C function:
void mySwap(void * p1, void * p2, int elementSize)
{
void * temp = (void*) malloc(elementSize);
assert(temp != NULL);
memcpy(temp, p1, elementSize);
memcpy(p1, p2, elementSize);
memcpy(p2, temp, elementSize);
free(temp);
}
that I want to use in a generic sorting function. Let's suppose that I use it to sort a dynamically allocated array owned by main(). Now let's suppose that at some point temp in mySwap() is actually NULL and the whole program is aborted without freeing the dynamically allocated array in main(). I thought that both mySwap() and the sorting function could return a bool value indicating whether the allocation was successful or not and by using if statements I could free the array in main() and exit(EXIT_FAILURE), but it doesn't seem like a very elegant solution. What would be a good way to prevent a memory leak in such an instance?
assert is typically used during debugging to identify problems/errors that should never occur.
Out of memory is something that can occur, and so either should not be handled by assert, or, if you do use assert, beware that it will abort the program. Once the program aborts, all memory used by the program is deallocated, so don't worry about that.
Note: If you don't want to have unwieldy if statements everywhere just to handle errors that hardly ever occur, you can use setjmp/longjmp to return to a recoverable state.
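A minimal sketch of that setjmp/longjmp idea, assuming a single recovery point for the whole operation. The names recovery_point, checked_malloc, my_swap and swap_with_recovery are made up for illustration; they are not from the question.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical recovery point; it must be set with setjmp() before any
   code that can call checked_malloc() runs. */
static jmp_buf recovery_point;

/* Hypothetical wrapper: on failure it jumps back to the recovery point
   instead of returning NULL, so callers need no if statements. */
void *checked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        longjmp(recovery_point, 1);
    return p;
}

/* Same swap as in the question; assert() now only guards against
   programmer error, not against running out of memory. */
void my_swap(void *p1, void *p2, size_t element_size)
{
    assert(p1 != NULL && p2 != NULL);
    void *temp = checked_malloc(element_size);
    memcpy(temp, p1, element_size);
    memcpy(p1, p2, element_size);
    memcpy(p2, temp, element_size);
    free(temp);
}

/* Returns 0 on success, -1 if an allocation failed somewhere below. */
int swap_with_recovery(int *a, int *b)
{
    if (setjmp(recovery_point) != 0)
        return -1;              /* we arrived here via longjmp() */
    my_swap(a, b, sizeof *a);
    return 0;
}
```

The caller (main() in the question) checks one return value, frees its array, and exits however it likes. The recovery point is only valid while the function that called setjmp() is still running.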
You have to realize that the reason malloc fails is because your computer has run out of memory. From that point onwards, there's nothing meaningful that your program can do, except terminate as gracefully as it can.
The OS will free the memory upon program termination, so that's not something you need to worry about.
Still, in the normal case, it is of course good practice to free() yourself, manually. Not so much for the sake of "making the memory available again" - the OS will ensure that - but to verify that your program has not gone terribly wrong along the way and created heap corruption, leaks or other bugs. If you have such bugs in your program, it will often crash during the free() call, which is a good thing, as the bugs will surface.
assert should preferably not be used in production code. Build your own error handling if needed, that's something better than just violently terminating your own program in the middle of execution.
Avoid the problem by not using malloc.
Instead of allocating a block of memory for every swap, do the swap one byte at a time;
for (int i = 0; i < elementSize; ++i) {
char tmp = ((char*)p1)[i];
((char*)p1)[i] = ((char*)p2)[i];
((char*)p2)[i] = tmp;
}
Only use assert() to catch programmer-error during development; in release builds compiled with NDEBUG defined it doesn't do anything. If you need to test other things, use proper error-handling, whether that means abort(), return-codes or emulating exceptions using setjmp()/longjmp().
As an aside, do not cast the result of malloc().

Is it really important to free allocated memory if the program's just about to exit? [duplicate]

I understand that if you're allocating memory to store something temporarily, say in response to a user action, and by the time the code gets to that point again you don't need the memory anymore, you should free the memory so it doesn't cause a leak. In case that wasn't clear, here's an example of when I know it's important to free memory:
#include <stdio.h>
#include <stdlib.h>
void countToNumber(int n)
{
int *numbers = malloc(sizeof(int) * n);
int i;
for (i=0; i<n; i++) {
numbers[i] = i+1;
}
for (i=0; i<n; i++) {
// Yes, simply using "i+1" instead of "numbers[i]" in the printf would make the array unnecessary.
// But the point of the example is using malloc/free, so pretend it makes sense to use one here.
printf("%d ", numbers[i]);
}
putchar('\n');
free(numbers); // Freeing is absolutely necessary here; this function could be called any number of times.
}
int main(int argc, char *argv[])
{
puts("Enter a number to count to that number.");
puts("Entering zero or a negative number will quit the program.");
int n;
while (scanf("%d", &n) == 1 && n > 0) {
countToNumber(n);
}
return 0;
}
Sometimes, however, I'll need that memory for the whole time the program is running, and even if I end up allocating more for the same purpose, the data stored in the previously-allocated memory is still being used. So the only time I'd end up needing to free the memory is just before the program exits.
But if I don't end up freeing the memory, would that really cause a memory leak? I'd think the operating system would reclaim the memory as soon as the process exits. And even if it doesn't cause a memory leak, is there another reason it's important to free the memory, provided this isn't C++ and there isn't a destructor that needs to be called?
For example, is there any need for the free call in the below example?
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
void *ptr = malloc(1024);
// do something with ptr
free(ptr);
return 0;
}
In that case the free isn't really inconvenient, but in cases where I'm dynamically allocating memory for structures that contain pointers to other dynamically-allocated data, it would be nice to know I don't need to set up a loop to do it. Especially if the pointer in the struct is to an object with the same struct, and I'd need to recursively delete them.
Generally, the OS will reclaim the memory, so no, you don't have to free() it. But it is really good practice to do it, and in some cases it may actually make a difference. Couple of examples:
You execute your program as a subprocess of another process. Depending on how that is done, the memory may not be released until the parent finishes. If the parent never finishes, that's a permanent leak.
You change your program to do something else. Now you need to hunt down every exit path and free everything, and you'll likely forget some.
Reclaiming the memory is up to the OS. All major ones do it, but if you port your program to another system it may not.
Static analysis and debug tools work better with correct code.
If the memory is shared between processes, it may only be freed after all processes terminate, or possibly not even then.
By the way, this is just about memory. Freeing other resources, such as closing a file (fclose()) is much more important, as some OSes (Windows) don't properly flush the stream.
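For the nested structures the question mentions, the cleanup does not have to be recursive. Here is a rough sketch with a made-up struct node whose members each own a separately allocated payload; the names node_push and node_free_all are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical self-referential struct of the kind the question describes. */
struct node {
    struct node *next;
    char *payload;              /* separately allocated */
};

/* Prepends a node and returns the new head, or NULL on allocation
   failure (in which case the old list is left untouched). */
struct node *node_push(struct node *head, size_t payload_size)
{
    assert(payload_size > 0);
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->payload = malloc(payload_size);
    if (n->payload == NULL) {
        free(n);
        return NULL;
    }
    n->next = head;
    return n;
}

/* Frees every node and its payload; returns how many nodes were freed. */
size_t node_free_all(struct node *head)
{
    size_t freed = 0;
    while (head != NULL) {
        struct node *next = head->next;   /* read before freeing */
        free(head->payload);
        free(head);
        head = next;
        freed++;
    }
    return freed;
}
```

With one teardown function like this, calling it before exit costs a single line, and leak checkers stay quiet.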

Why doesn't the memory footprint of this program increase?

I am in a scratchbox cross compile environment and have this
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main()
{
int * ptr;
int i=0;
while(1)
{
ptr = (int*)malloc( 10485760 * sizeof(int) );
if(ptr == NULL)
{
printf("Could not malloc\n");
exit(1);
}
else
{
printf("Malloc done\n");
for (i = 0 ; i <= 10485759 ; i++)
{
ptr[i] = i ;
}
sleep (5);
continue;
}
}
}
When I run the binary and do
ps -p pid -o cmd,rss,%mem
I do not see any increase in the memory footprint of the process. Why is that?
You probably built with aggressive optimization enabled.
On most modern systems gcc knows that malloc returns a non-aliased pointer. That is to say, it will never return the same pointer twice and it will never return a pointer you have saved 'live' somewhere else.
I find this very very hard to imagine, but it is possible that malloc is being called once and its return value being used over and over. The reasons being:
It knows your memory is a dead store. ie: you write to it, but it never gets read from. The pointer is known to not be aliased so it hasn't escaped to be read from somewhere else, and it isn't marked volatile. Your for loop itself /could/ get thrown away.
At that point it could just use the same memory over and over.
Now here is why I find it hard to believe: Just how much does gcc know about malloc? Malloc could have any kind of side effects like incrementing a global 'number of times called' to 'paints my room a random shade of blue'. It seems really weird that it would drop the call and assume it to be side-effect-free. Hell, 'malloc' could be implemented to return NULL every 100th call (maybe not quite to spec, but who is to say).
What it is NOT doing is freeing it on your behalf. That goes beyond what it 'could' know and into the territory of 'doing things it's just not allowed to'. You're allowed to leak memory, lame though it may be.
2 things would be useful here:
1) Compile environment: which OS, compiler, and command line flags.
and 2) disassembly of the final binary. (objdump or from the compiler)
rss and %mem are both in terms of "physical memory being used by the process at the moment". It has plenty of opportunity to page stuff out. Try adding vsz. I bet that grows as you expect.
Your compiler is helping you out by freeing the allocated memory (assuming that the optimized version of your code even gets around to doing the malloc) when it realizes that you aren't using it. You might try printing out the value of the pointer (printf("%p\n", (void *)ptr);) - I suspect you'll be getting repeating values. A more reliable check would write a known bitstring into memory, having already looked to see if the allocated memory already contains that string. In other words, instead of writing i, write 0xdeadbeef0cabba6e over and over again (using a 64-bit element type, since that pattern does not fit in an int), after checking to see if that bit pattern is already in the space you have allocated.
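The marker check described above could look roughly like this. It is only a sketch: count_marker and fill_marker are hypothetical helper names, and the buffer is assumed to be an array of uint64_t rather than int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The bit pattern suggested in the answer above. */
#define MARKER UINT64_C(0xdeadbeef0cabba6e)

/* Counts how many 64-bit words in buf already hold the marker. */
size_t count_marker(const uint64_t *buf, size_t words)
{
    assert(buf != NULL);
    size_t hits = 0;
    for (size_t i = 0; i < words; i++)
        if (buf[i] == MARKER)
            hits++;
    return hits;
}

/* Overwrites the whole buffer with the marker. */
void fill_marker(uint64_t *buf, size_t words)
{
    assert(buf != NULL);
    for (size_t i = 0; i < words; i++)
        buf[i] = MARKER;
}
```

In the loop from the question you would call count_marker() on each freshly malloc'd block, print the result, and then call fill_marker(). A non-zero count suggests the block is memory you have written before. Reading fresh malloc memory like this will be flagged by tools such as Valgrind, so treat it as a one-off experiment.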

Integer to string in C without preallocated char array

Please look at the following code, which just converts an unsigned int to a string (there may be some unhandled cases, but that's not my question), allocating a char array on the heap and returning it, leaving the user the responsibility to free it after use. Can you explain to me why such a function (and similar ones) does not exist in the C standard library? Yes, printf("%s\n", itos(5)) is a memory leak, but this programming pattern is already used and is considered good practice[1]. IMO, if such functions had existed since the dawn of C we would have had a few more memory leaks but tons fewer buffer overflows!
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
char* itos(unsigned int value)
{
int string_l = (value == 0) ? 1 : (int)log10(value) + 1;
char *string = malloc((string_l + 1) * sizeof(char));
unsigned int residual = value; /* unsigned, because value may exceed INT_MAX */
int it;
for (it = string_l - 1; it >= 0; it--) {
int digit;
digit = residual % 10;
residual = residual / 10;
string[it] = '0' + digit;
}
string[string_l] = '\0';
return string;
}
int main(void)
{
char* string = itos(534534345);
printf("%s\n", string);
free(string);
return 0;
}
[1] http://www.opengroup.org/onlinepubs/009695399/functions/getaddrinfo.html
EDIT:
Habbie's answer:
char *string;
asprintf(&string, "%d", 155);
printf("%s\n", string);
free(string);
turns out asprintf is what you need :)
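Keep in mind that asprintf is a GNU/BSD extension rather than standard C. Where it isn't available, much the same effect can be had with two calls to snprintf. This is a sketch, and utos is a made-up name:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats an unsigned int into a freshly allocated
   string. Returns NULL on failure; the caller must free() the result. */
char *utos(unsigned int value)
{
    /* With a NULL buffer and size 0, snprintf only reports the length
       the output would need (not counting the terminating '\0'). */
    int len = snprintf(NULL, 0, "%u", value);
    if (len < 0)
        return NULL;

    char *s = malloc((size_t)len + 1);
    if (s == NULL)
        return NULL;

    int written = snprintf(s, (size_t)len + 1, "%u", value);
    assert(written == len);     /* same format, same arguments */
    (void)written;              /* silence the warning under NDEBUG */
    return s;
}
```

Unlike the log10 version, this lets the library work out the length, so there is no floating point involved.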
In my eyes, memory management is up to the caller, not the callee. For instance, when I'm not using the standard malloc() implementation throughout my program, I would be very upset about having to find and call the corresponding free(). The upshot is I wouldn't use such a function.
Edit: Your getaddrinfo() example is perfect, they provide both getaddrinfo() and freeaddrinfo(), that's the only way to make sure I'm calling the right free().
Programming has evolved since it was created - this is simply something that wasn't present since the dawn of C, but has evolved in other languages since. I particularly like the way Objective-C handles this by returning a string object which has been autoreleased (meaning it will be automatically freed later on, after the object has gone out of scope). You could implement something similar in C if you wanted to:
create a pool for temporary allocations outside your main loop
allocate from the pool as needed using your own allocation function
periodically free the pool at a shallow level in your call stack (for example once per cycle in your very outer main loop)
Another way to achieve the same thing, but allowing you to use system functions that use malloc to allocate memory:
outside your main loop create a list (initially empty) of 'to-be-freed' objects
write a function called 'autofree' that adds pointers to the list and returns the pointer
whenever you need to use it like this: printf("%s\n", autofree(itos(5)));
each time round your main loop, free all the pointers in the list and empty the list
If you do this in a nice way, you can create multiple such autofree pools and nest them around inner loops that potentially create lots of allocations that you want to be freed sooner rather than back in your main loop.
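A rough sketch of the second scheme (the 'to-be-freed' list). The pool type, its fixed capacity and the autofree_drain name are my own assumptions; the answer above only specifies a function called autofree that records the pointer and returns it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical fixed capacity; a real pool might grow on demand. */
#define AUTOFREE_CAPACITY 64

struct autofree_pool {
    void *items[AUTOFREE_CAPACITY];
    size_t count;
};

/* Registers p for later release and returns it, so the call can be
   nested inside an expression. NULL is passed through untouched. If the
   pool is full, p is freed at once and NULL is returned, so the caller
   sees it as a failed allocation. */
void *autofree(struct autofree_pool *pool, void *p)
{
    assert(pool != NULL);
    if (p == NULL)
        return NULL;
    if (pool->count == AUTOFREE_CAPACITY) {
        free(p);
        return NULL;
    }
    pool->items[pool->count++] = p;
    return p;
}

/* Frees everything registered so far and empties the pool. */
void autofree_drain(struct autofree_pool *pool)
{
    assert(pool != NULL);
    for (size_t i = 0; i < pool->count; i++)
        free(pool->items[i]);
    pool->count = 0;
}
```

Usage would be along the lines of declaring struct autofree_pool pool = {0}; outside the main loop, wrapping calls as autofree(&pool, itos(5)), and calling autofree_drain(&pool) once per iteration. Because the pool is passed explicitly, nesting a second pool around an inner loop is just another local variable.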

C Memory Management

I've always heard that in C you have to really watch how you manage memory. I'm still beginning to learn C, but thus far I have not had to do any memory-management-related activities at all. I always imagined having to release variables and do all sorts of ugly things. But this doesn't seem to be the case.
Can someone show me (with code examples) an example of when you would have to do some "memory management" ?
There are two places where variables can be put in memory. When you create a variable like this:
int a;
char c;
char d[16];
The variables are created in the "stack". Stack variables are automatically freed when they go out of scope (that is, when the code can't reach them anymore). You might hear them called "automatic" variables, but that has fallen out of fashion.
Many beginner examples will use only stack variables.
The stack is nice because it's automatic, but it also has two drawbacks: (1) The compiler needs to know in advance how big the variables are, and (2) the stack space is somewhat limited. For example: in Windows, under default settings for the Microsoft linker, the stack is set to 1 MB, and not all of it is available for your variables.
If you don't know at compile time how big your array is, or if you need a big array or struct, you need "plan B".
Plan B is called the "heap". You can usually create variables as big as the Operating System will let you, but you have to do it yourself. Earlier postings showed you one way you can do it, although there are other ways:
int size;
// ...
// Set size to some value, based on information available at run-time. Then:
// ...
char *p = (char *)malloc(size);
(Note that variables in the heap are not manipulated directly, but via pointers)
Once you create a heap variable, the problem is that the compiler can't tell when you're done with it, so you lose the automatic releasing. That's where the "manual releasing" you were referring to comes in. Your code is now responsible to decide when the variable is not needed anymore, and release it so the memory can be taken for other purposes. For the case above, with:
free(p);
What makes this second option "nasty business" is that it's not always easy to know when the variable is not needed anymore. Forgetting to release a variable when you don't need it will cause your program to consume more memory than it needs to. This situation is called a "leak". The "leaked" memory cannot be used for anything until your program ends and the OS recovers all of its resources. Even nastier problems are possible if you release a heap variable by mistake before you are actually done with it.
In C and C++, you are responsible to clean up your heap variables like shown above. However, there are languages and environments such as Java and .NET languages like C# that use a different approach, where the heap gets cleaned up on its own. This second method, called "garbage collection", is much easier on the developer but you pay a penalty in overhead and performance. It's a balance.
(I have glossed over many details to give a simpler, but hopefully more leveled answer)
Here's an example. Suppose you have a strdup() function that duplicates a string:
char *strdup(char *src)
{
char * dest;
dest = malloc(strlen(src) + 1);
if (dest == NULL)
abort();
strcpy(dest, src);
return dest;
}
And you call it like this:
int main(void)
{
char *s;
s = strdup("hello");
printf("%s\n", s);
s = strdup("world");
printf("%s\n", s);
return 0;
}
You can see that the program works, but you have allocated memory (via malloc) without freeing it up. You have lost your pointer to the first memory block when you called strdup the second time.
This is no big deal for this small amount of memory, but consider the case:
for (i = 0; i < 1000000000; ++i) /* billion times */
s = strdup("hello world"); /* 12 bytes, counting the terminating '\0' */
You have now used up about 12 gig of memory (possibly more, depending on your memory manager), and if your process has not crashed it is probably running pretty slowly.
To fix, you need to call free() for everything that is obtained with malloc() after you finish using it:
s = strdup("hello");
free(s); /* now not leaking memory! */
s = strdup("world");
...
Hope this example helps!
You have to do "memory management" when you want to use memory on the heap rather than the stack. If you don't know how large to make an array until runtime, then you have to use the heap. For example, you might want to store something in a string, but don't know how large its contents will be until the program is run. In that case you'd write something like this:
char *string = malloc(stringlength); // stringlength is the number of bytes to allocate
// Do something with the string...
free(string); // Free the allocated memory
I think the most concise way to answer the question is to consider the role of the pointer in C. The pointer is a lightweight yet powerful mechanism that gives you immense freedom at the cost of immense capacity to shoot yourself in the foot.
In C the responsibility of ensuring your pointers point to memory you own is yours and yours alone. This requires an organized and disciplined approach, unless you forsake pointers, which makes it hard to write effective C.
The posted answers to date concentrate on automatic (stack) and heap variable allocations. Using stack allocation does make for automatically managed and convenient memory, but in some circumstances (large buffers, recursive algorithms) it can lead to the horrendous problem of stack overflow. Knowing exactly how much memory you can allocate on the stack is very dependent on the system. In some embedded scenarios a few dozen bytes might be your limit, in some desktop scenarios you can safely use megabytes.
Heap allocation is less inherent to the language. It is basically a set of library calls that grants you ownership of a block of memory of given size until you are ready to return ('free') it. It sounds simple, but is associated with untold programmer grief. The problems are simple (freeing the same memory twice, or not at all [memory leaks], not allocating enough memory [buffer overflow], etc) but difficult to avoid and debug. A highly disciplined approach is absolutely mandatory in practice but of course the language doesn't actually mandate it.
I'd like to mention another type of memory allocation that's been ignored by other posts. It's possible to statically allocate variables by declaring them outside any function. I think in general this type of allocation gets a bad rap because it's used by global variables. However there's nothing that says the only way to use memory allocated this way is as an undisciplined global variable in a mess of spaghetti code. The static allocation method can be used simply to avoid some of the pitfalls of the heap and automatic allocation methods. Some C programmers are surprised to learn that large and sophisticated C embedded and games programs have been constructed with no use of heap allocation at all.
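To make the static approach concrete, here is a sketch of a tiny bump allocator carved out of a statically allocated array. It is not taken from any particular embedded or games codebase; arena_alloc, arena_reset and the sizes are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Each slot is aligned strictly enough for the common scalar types. */
typedef union {
    long long ll;
    long double ld;
    void *p;
} arena_slot;

#define ARENA_SLOTS 256

/* Static storage: reserved for the whole run, no malloc() involved. */
static arena_slot arena[ARENA_SLOTS];
static size_t arena_used = 0;       /* slots handed out so far */

/* Hands out size bytes from the arena, or NULL if it is exhausted.
   Individual blocks are never freed; see arena_reset(). */
void *arena_alloc(size_t size)
{
    assert(size > 0);
    /* Round up to whole slots without risking overflow. */
    size_t slots = size / sizeof(arena_slot)
                 + (size % sizeof(arena_slot) != 0);
    if (slots > ARENA_SLOTS - arena_used)
        return NULL;
    void *p = &arena[arena_used];
    arena_used += slots;
    return p;
}

/* Releases everything at once, e.g. at the end of a frame or request. */
void arena_reset(void)
{
    arena_used = 0;
}
```

The trade-off is that the worst-case memory use is fixed at build time. In return there is no fragmentation and no way to leak past the next arena_reset().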
There are some great answers here about how to allocate and free memory, and in my opinion the more challenging side of using C is ensuring that the only memory you use is memory you've allocated - if this isn't done correctly what you end up with is the cousin of this site - a buffer overflow - and you may be overwriting memory that's being used by another application, with very unpredictable results.
An example:
int main() {
char* myString = (char*)malloc(5*sizeof(char));
strcpy(myString, "abcd"); /* needs <string.h>; copies into the allocated block */
}
At this point you've allocated 5 bytes for myString and filled it with "abcd\0" (strings end in a null - \0). Note that writing myString = "abcd"; instead would not fill the block at all: it would repoint myString at a string literal and leak the allocation.
If your copy was
strcpy(myString, "abcde");
You would be writing "abcde" into the 5 bytes you've had allocated to your program, and the trailing null character would be put just past the end of them - a part of memory that hasn't been allocated for your use and could be free, but could equally be in use by something else - This is the critical part of memory management, where a mistake will have unpredictable (and sometimes unrepeatable) consequences.
A thing to remember is to always initialize your pointers to NULL, since an uninitialized pointer may contain a pseudorandom valid memory address which can make pointer errors go ahead silently. By enforcing a pointer to be initialized with NULL, you can always catch if you are using this pointer without initializing it. The reason is that operating systems "wire" the virtual address 0x00000000 to general protection exceptions to trap null pointer usage.
Also you might want to use dynamic memory allocation when you need to define a huge array, say int[10000]. You can't just put it in stack because then, hm... you'll get a stack overflow.
Another good example would be an implementation of a data structure, say linked list or binary tree. I don't have a sample code to paste here but you can google it easily.
(I'm writing because I feel the answers so far aren't quite on the mark.)
The case where memory management is really worth mentioning is when you have a problem / solution that requires you to create complex structures. (If your programs crash because you allocate too much space on the stack at once, that's a bug.) Typically, the first data structure you'll need to learn is some kind of list. Here's a singly linked one, off the top of my head:
typedef struct listelem { struct listelem *next; void *data;} listelem;
listelem * create(void * data)
{
listelem *p = calloc(1, sizeof(listelem));
if(p) p->data = data;
return p;
}
listelem * delete(listelem * p)
{
listelem *next = p->next;
free(p);
return next;
}
void deleteall(listelem * p)
{
while(p) p = delete(p);
}
void foreach(listelem * p, void (*fun)(void *data) )
{
for( ; p != NULL; p = p->next) fun(p->data);
}
listelem * merge(listelem *p, listelem *q)
{
listelem *head = p; /* remember the start; p is advanced to the tail below */
if(p == NULL) return q;
while(p->next != NULL) p = p->next;
p->next = q;
return head;
}
Naturally, you'd like a few other functions, but basically, this is what you need memory management for. I should point out that there are a number of tricks that are possible with "manual" memory management, e.g.,
Using the fact that malloc is guaranteed (by the language standard) to return a pointer suitably aligned for any object type, which in practice leaves the low bits of the address zero,
allocating extra space for some sinister purpose of your own,
creating memory pools..
Get a good debugger... Good luck!
@Euro Micelli
One negative to add is that pointers to the stack are no longer valid when the function returns, so you cannot return a pointer to a stack variable from a function. This is a common error and a major reason why you can't get by with just stack variables. If your function needs to return a pointer, then you have to malloc and deal with memory management.
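A small illustration of that point. The broken version is left in a comment so nothing here actually returns a dangling pointer; copy_string is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* WRONG, shown only as a comment: buf lives on the stack and is gone
   once the function returns, so the caller gets a dangling pointer.

   char *make_greeting_broken(void)
   {
       char buf[16] = "hello";
       return buf;
   }
*/

/* Returns a heap copy of text, or NULL if the allocation fails.
   The caller owns the result and must free() it. */
char *copy_string(const char *text)
{
    assert(text != NULL);
    size_t size = strlen(text) + 1;     /* +1 for the terminating '\0' */
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, text, size);
    return copy;
}
```

The heap block outlives the call that created it, which is exactly what the stack version cannot offer. The price is that someone now has to call free().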
@Ted Percival:
...you don't need to cast malloc()'s return value.
You are correct, of course. I believe that has always been true, although I don't have a copy of K&R to check.
I don't like a lot of the implicit conversions in C, so I tend to use casts to make "magic" more visible. Sometimes it helps readability, sometimes it doesn't, and sometimes it causes a silent bug to be caught by the compiler. Still, I don't have a strong opinion about this, one way or another.
This is especially likely if your compiler understands C++-style comments.
Yeah... you caught me there. I spend a lot more time in C++ than C. Thanks for noticing that.
In C, you actually have two different choices. One, you can let the system manage the memory for you. Alternatively, you can do that by yourself. Generally, you would want to stick to the former as long as possible. However, auto-managed memory in C is extremely limited and you will need to manually manage the memory in many cases, such as:
a. You want the variable to outlive the functions, and you don't want to have global variable. ex:
struct pair{
int val;
struct pair *next;
};
struct pair* new_pair(int val){
struct pair* np = malloc(sizeof(struct pair));
if (np == NULL) return NULL; /* allocation failed */
np->val = val;
np->next = NULL;
return np;
}
b. you want to have dynamically allocated memory. Most common example is array without fixed length:
int *my_special_array;
my_special_array = malloc(sizeof(int) * number_of_element);
for(i=0; i
c. You want to do something REALLY dirty. For example, I would want a struct to represent many kinds of data and I don't like union (union looks soooo messy):
struct data{
int data_type;
long data_in_mem;
};
struct animal{/*something*/};
struct person{/*some other thing*/};
struct animal* read_animal();
struct person* read_person();
/*In main*/
struct data sample;
sample.data_type = input_type;
switch(input_type){
case DATA_PERSON:
sample.data_in_mem = (long)read_person();
break;
case DATA_ANIMAL:
sample.data_in_mem = (long)read_animal();
break;
default:
printf("Oh hoh! I warn you, that again and I will seg fault your OS");
}
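See, a long value is enough to hold ANYTHING, at least on platforms where long is as wide as a pointer (it is not on 64-bit Windows; intptr_t or void * is the portable choice). Just remember to free it, or you WILL regret it. This is among my favorite tricks to have fun in C :D.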
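The same tagged-pointer idea can be sketched with a void * field in place of long, since the C standard does not promise that long is wide enough to hold a pointer. The struct fields and function names below are invented for illustration; the original leaves struct person and struct animal unspecified.

```c
#include <assert.h>
#include <stdlib.h>

enum data_type { DATA_PERSON, DATA_ANIMAL };

/* Hypothetical contents. */
struct animal { int legs; };
struct person { int age; };

struct data {
    enum data_type type;    /* says what payload points at */
    void *payload;          /* heap object owned by this struct */
};

/* Wraps a heap-allocated person; payload is NULL if allocation failed. */
struct data data_from_person(int age)
{
    struct data d = { DATA_PERSON, NULL };
    struct person *p = malloc(sizeof *p);
    if (p != NULL) {
        p->age = age;
        d.payload = p;
    }
    return d;
}

/* Reads the payload back; the tag is checked before the cast. */
int data_person_age(const struct data *d)
{
    assert(d != NULL && d->type == DATA_PERSON && d->payload != NULL);
    return ((const struct person *)d->payload)->age;
}

/* Frees the payload, whatever its type, and clears the pointer. */
void data_release(struct data *d)
{
    assert(d != NULL);
    free(d->payload);
    d->payload = NULL;
}
```

No casts to integer types are needed, and free() works on the void * directly.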
However, generally, you would want to stay away from your favorite tricks (T___T). You WILL break your OS, sooner or later, if you use them too often. As long as you don't use *alloc and free, it is safe to say that you are still virgin, and that the code still looks nice.
Sure. If you create an object that exists outside of the scope you use it in. Here is a contrived example (bear in mind my syntax will be off; my C is rusty, but this example will still illustrate the concept):
class MyClass
{
SomeOtherClass *myObject;
public:
MyClass()
{
//The object is created when the class is constructed
//sizeof(*myObject) is the size of the object, not of the pointer
myObject = (SomeOtherClass*)malloc(sizeof(*myObject));
}
~MyClass()
{
//The class is destructed
//If you don't free the object here, you leak memory
free(myObject);
}
void SomeMemberFunction()
{
//Some use of the object
myObject->SomeOperation();
}
};
In this example, I'm using an object of type SomeOtherClass during the lifetime of MyClass. The SomeOtherClass object is used in several functions, so I've dynamically allocated the memory: the SomeOtherClass object is created when MyClass is created, used several times over the life of the object, and then freed once MyClass is freed.
Obviously if this were real code, there would be no reason (aside from possibly stack memory consumption) to create myObject in this way, but this type of object creation/destruction becomes useful when you have a lot of objects, and want to finely control when they are created and destroyed (so that your application doesn't suck up 1GB of RAM for its entire lifetime, for example), and in a Windowed environment, this is pretty much mandatory, as objects that you create (buttons, say), need to exist well outside of any particular function's (or even class') scope.

Resources