When I have a pointer that is repeatedly used as an argument to realloc, with its return value saved back into it, I understand that realloc won't touch the old object if no allocation can take place, returning NULL instead. Should I still be worried about the old object in a construction like:
int *p = (int *)malloc(sizeof(int *));
if (p = (int *)realloc(p, 2 * sizeof(int *)))
etc...
Now if realloc succeeds, I need to free(p) when I'm done with it.
When realloc fails, I have assigned its return value NULL to p, and free(p) doesn't do anything (as it is a free(NULL)). At the same time (according to the standard) the old object is not deallocated.
So should I have a copy of p (e.g. int *last_p = p;) for keeping track of the old object, so that I can free(last_p) when realloc fails?
So should I have a copy of p (e.g. int *last_p = p;) for keeping track of the old object, so that I can free(last_p) when realloc fails?
Basically: yes, of course.
It is generally not recommended to use the pattern you showed:
p = (int *)realloc(p, ...)
This is considered bad practice for the reason you found out.
Use this instead:
void *new = realloc(p, ...);
if (new != NULL)
p = new;
else
... /* handle the failure; p still points to the old, valid block */
It is also considered bad practice to cast the return value of malloc, realloc etc. in C. The cast is not required, and it can hide the bug of a missing #include <stdlib.h>.
Should you save a temporary pointer?
It can be a good thing in certain situations, as has been pointed out in previous answers, provided that you have a plan for how to continue execution after the failure.
The most important thing is not how you handle the error. The important thing is that you do something, and don't just assume there is no error. Exiting is a perfectly valid way of handling an error.
Don't do it, unless you plan a sensible recovery
However, do note that in most situations, a failure from realloc is pretty hard to recover from. Often, exiting is the only sensible option. If you cannot acquire enough memory for your task, what are you going to do? I have encountered only one situation where recovering was sensible. I had an algorithm for a problem, and I realized that I could make a significant improvement to performance if I allocated a few gigabytes of RAM. It worked fine with just a few kilobytes, but it got noticeably faster with the extra RAM usage. So that code was basically like this:
int *huge_buffer = malloc(1000*1000*1000*sizeof *huge_buffer);
if(!huge_buffer)
slow_version();
else
fast_version();
In those cases, just do this:
p = realloc(p, 2 * sizeof *p);
if(!p) {
fprintf(stderr, "Error allocating memory\n");
exit(EXIT_FAILURE);
}
Do note both changes to the call: I removed the cast and changed the sizeof. Read more about that here: Do I cast the result of malloc?
Or even better, if you don't generally care about recovery: write a wrapper.
void *my_realloc(void *p, size_t size) {
void *tmp = realloc(p, size);
if(tmp) return tmp;
fprintf(stderr, "Error reallocating\n");
free(p);
exit(EXIT_FAILURE);
return NULL; // Will never be executed, but to avoid warnings
}
Note that this might contradict what I write below, where I say that it's not always necessary to free before exiting. The reason is that since proper error handling is so easy to do once it's all abstracted out into a single function, I might as well do it right. It only took one extra line in this case, for the whole program.
Related: What REALLY happens when you don't free after malloc?
About backwards compatibility in general
Some would say that it's good practice to free before exiting, just because it actually does matter in certain circumstances. My opinion is that these circumstances are quite specialized, like coding for embedded systems with an OS that does not free memory automatically when a program terminates. If you're coding in such an environment, you should know that. And if you're coding for an OS that does this for you, then why not utilize it to keep down code complexity?
In general, I think some C coders focus too much on backwards compatibility with ancient computers from the 1970s that you only encounter in museums today. In most cases, it's pretty fair to assume ASCII, two's complement, 8-bit chars and such things.
A comparison would be to still code web pages so that they are possible to view in Netscape Navigator, Mosaic and Lynx. Only spend time on that if there really is a need.
Even if you skip backwards compatibility, use some guards
However, whenever you make assumptions, it can be a good thing to include some meta code that makes the compilation fail for the wrong target. For instance this, if your program relies on 8-bit chars:
_Static_assert(CHAR_BIT == 8, "Char bits");
(CHAR_BIT comes from limits.h.) That will make your program fail to compile instead of misbehaving. If you're doing cross compiling, this might possibly be more complicated. I don't know how to do it properly then.
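If your code leans on more assumptions of that kind, each one can be guarded the same way. A sketch (which exact conditions you assert depends entirely on what your program assumes; CHAR_BIT comes from limits.h):
#include <limits.h>

/* Compile-time guards: the build fails on any target that breaks our assumptions. */
_Static_assert(CHAR_BIT == 8, "requires 8-bit char");
_Static_assert(sizeof(int) == 4, "requires 32-bit int");
_Static_assert((-1 & 3) == 3, "requires two's complement integers");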
So should I have a copy of p (e.g. int *last_p = p;) for keeping track of the old object, so that I can free(last_p) when realloc fails?
Yes, exactly. Usually the result is assigned to a new pointer:
void *pnew = realloc(p, 2 * sizeof(int *));
if (pnew == NULL) {
free(p);
/* handle error */
return ... ;
}
p = pnew;
Related
I have a simple exercise in C (see code below). The program takes a vector with three components and doubles each component. The IDE showed me this warning (green squiggle): C6011: dereferencing null pointer 'v', on this line: v[0] = 12;. I think it's a bug in the analyzer, because in the debugger I saw the program exit with code 0. What do you think about it?
#include <stdlib.h>
#include <stdint.h>
void twice_three(uint32_t *x) {
for (size_t i = 0; i < 3; ++i) {
x[i] = 2 * x[i];
}
}
int main(void) {
uint32_t *v = malloc(3 * sizeof(uint32_t));
v[0] = 12;
v[1] = 59;
v[2] = 83;
twice_three(v);
free(v);
return 0;
}
NB: I'm using Visual Studio.
First of all, note that the warning is generated by your compiler (or static analyzer, or linter), not by your debugger, as you initially wrote.
The warning is telling you that your program might dereference a null pointer. The reason for this warning is that you perform a malloc() and then use the result (the pointer) without checking for NULL. In this specific code example, malloc() will most likely just return the requested block of memory. On any desktop computer or laptop, there's generally no reason why it would fail to allocate 12 bytes. That's why your application just runs fine and exits successfully. However, if this were part of a larger application and/or run on a memory-limited system such as an embedded system, malloc() could fail and return NULL. Note that malloc() does not only fail if there is not enough memory available; it can also fail if there is no large enough consecutive block of memory available, due to fragmentation.
According to the C standard, dereferencing a NULL pointer is undefined behavior, meaning that anything could happen. On modern computers it would likely get your application killed (which could lead to data loss or corruption, depending on what the application does). On older computers or embedded systems the problem might be undetected and your application would read from or (worse) write to the address NULL (which is most likely 0, but even that isn't guaranteed by the C standard). This could lead to data corruption, crashes or other unexpected behavior at an arbitrary time after this happened.
Note that the compiler/analyzer/linter doesn't know anything about your application or the platform you will be running it on, and it doesn't make any assumptions about it. It just warns you about this possible problem. It's up to you to determine if this specific warning is relevant for your situation and how to deal with it.
Generally speaking, there are three things you can do about it:
If you know for sure that malloc() would never fail (for example, in such a toy example that you would only run on a modern computer with gigabytes of memory) or if you don't care about the results (because the application will be killed by your OS and you don't mind), then there's no need for this warning. Just disable it in your compiler, or ignore the warning message.
If you don't expect malloc() to fail, but do want to be informed when it happens, the quick-and-dirty solution is to add assert(v != NULL); after the malloc. Note that this will also exit your application when it happens, but in a slightly more controlled way, and you'll get an error message stating where the problem occurred. I would recommend this for simple hobby projects, where you do not want to spend much time on error handling and corner cases but just want to have some fun programming :-)
When there is a realistic chance that malloc() could fail, and you want well-defined behavior from your application, you should definitely add code to handle that situation (check for NULL). If this is the case, you would generally have to do more than just add an if-statement. You would have to think about how the application can continue to work or gracefully shut down without requiring more memory allocations. And on an embedded system, you would also have to think about things such as memory fragmentation.
The easiest fix for the example code in question is to add the NULL-check. This makes the warning go away, and (assuming malloc() does not fail) your program still runs the same.
int main(void) {
uint32_t *v = malloc(3 * sizeof(uint32_t));
if (v != NULL) {
v[0] = 12;
v[1] = 59;
v[2] = 83;
twice_three(v);
free(v);
}
return 0;
}
I believe your IDE is warning you that you didn't make sure that malloc returned something other than NULL. malloc can return NULL when you run out of memory to allocate.
It's debatable whether such a check is needed. In the unlikely event malloc returns NULL, your program would end up getting killed (on modern computers with virtualized memory). So the question is whether you want a clean message or not on exit in the very, very rare situation that you run out of memory.
If you do add a check, don't use assert. That's useless. For starters, it only works in dev builds (not production builds), where malloc returning NULL is unlikely and where it's already super easy to find memory leaks (e.g. by using Valgrind). Use a proper check: if (!v) { perror(NULL); exit(1); }.
Since people are trying to debate the issue in the comments despite the rules, it looks like I'll have to go into my claim in more detail.
A couple of people suggested in the comments that "anything could happen" if one doesn't check for NULL, but that's simply not true on modern computers with virtualized memory.
When the C spec doesn't define the behaviour of something (what is called "undefined behaviour"), it doesn't mean anything can happen; it just means the C language doesn't care what the compiler/machine does in such situations. And a NULL dereference is very well defined on such systems. Catching such situations is a raison d'ĂȘtre of memory virtualization!
Just like you can rely on other compiler-specific features such as gcc's field packing attributes, one can argue it's fine to rely on memory virtualization to detect a failure by malloc.
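For completeness, here is the "proper check" suggested above spelled out as a minimal sketch:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    void *v = malloc(1024);
    if (!v) {
        /* unlike assert, this runs in every build configuration */
        perror(NULL);
        exit(1);
    }
    /* ... use v ... */
    free(v);
    return 0;
}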
Always check the result of malloc.
Use objects, not types, in sizeof.
int main(void) {
uint32_t *v = malloc(3 * sizeof(*v));
if(v)
{
v[0] = 12;
v[1] = 59;
v[2] = 83;
twice_three(v);
}
free(v);
return 0;
}
I have the following C function:
void mySwap(void * p1, void * p2, int elementSize)
{
void * temp = (void*) malloc(elementSize);
assert(temp != NULL);
memcpy(temp, p1, elementSize);
memcpy(p1, p2, elementSize);
memcpy(p2, temp, elementSize);
free(temp);
}
that I want to use in a generic sorting function. Let's suppose that I use it to sort a dynamically allocated array owned by main(). Now let's suppose that at some point temp in mySwap() is actually NULL and the whole program is aborted without freeing the dynamically allocated array in main(). I thought that both mySwap() and the sorting function could return a bool value indicating whether the allocation was successful or not, and by using if statements I could free the array in main() and exit(EXIT_FAILURE), but it doesn't seem like a very elegant solution. What would be a good way to prevent a memory leak in such an instance?
assert is typically used during debugging to identify problems/errors that should never occur.
Out of memory is something that can occur, and so either should not be handled by assert, or, if you do use assert, beware that it will abort the program. Once the program aborts, all memory used by the program is deallocated, so don't worry about that.
Note: If you don't want to have unwieldy if statements everywhere just to handle errors that hardly ever occur, you can use setjmp/longjmp to return to a recoverable state.
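A minimal sketch of that idea (one caveat: anything allocated after the setjmp and still live at the longjmp must be tracked separately, or it leaks):
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

static jmp_buf out_of_memory;   /* the recoverable state we return to */

static void *xmalloc(size_t n)
{
    void *p = malloc(n);
    if (!p)
        longjmp(out_of_memory, 1);   /* unwind straight back to main */
    return p;
}

static void deep_work(void)
{
    char *buf = xmalloc(64);
    /* ... work that may allocate at any depth ... */
    free(buf);
}

int main(void)
{
    if (setjmp(out_of_memory)) {
        fputs("allocation failed, shutting down cleanly\n", stderr);
        return EXIT_FAILURE;   /* free long-lived resources here */
    }
    deep_work();
    return 0;
}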
You have to realize that the reason malloc fails is that your computer has run out of memory. From that point onwards, there's nothing meaningful your program can do, except terminate as gracefully as it can.
The OS will free the memory upon program termination, so that's not something you need to worry about.
Still, in the normal case, it is of course good practice to free() yourself, manually. Not so much for the sake of "making the memory available again" - the OS will ensure that - but to verify that your program has not gone terribly wrong along the way and created heap corruption, leaks or other bugs. If you have such bugs in your program, it will likely crash during the free() call, which is a good thing, as the bugs will surface.
assert should preferably not be used in production code. Build your own error handling if needed, that's something better than just violently terminating your own program in the middle of execution.
Avoid the problem by not using malloc.
Instead of allocating a block of memory for every swap, do the swap one byte at a time:
for (int i = 0; i < elementSize; ++i) {
char tmp = ((char*)p1)[i];
((char*)p1)[i] = ((char*)p2)[i];
((char*)p2)[i] = tmp;
}
Only use assert() to catch programmer-error during development, in release-builds it doesn't do anything. If you need to test other things, use proper error-handling, whether that means abort(), return-codes or emulating exceptions using setjmp()/longjmp().
As an aside, do not cast the result of malloc().
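Wrapped into a complete function, the byte-at-a-time idea might look like this (a sketch; note that elementSize becomes a size_t):
#include <stddef.h>

/* Generic swap without malloc: it cannot fail at run time. */
void mySwap(void *p1, void *p2, size_t elementSize)
{
    char *a = p1;
    char *b = p2;
    for (size_t i = 0; i < elementSize; ++i) {
        char tmp = a[i];
        a[i] = b[i];
        b[i] = tmp;
    }
}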
I'm new to C and haven't really grasped when C decides to free an object and when it decides to keep an object.
heap_t is a pointer to a struct heap.
heap_t create_heap(){
heap_t h_t = (heap_t)malloc(sizeof(heap));
h_t->it = 0;
h_t->len = 10;
h_t->arr = (token_t)calloc(10, sizeof(token));
//call below a couple of times to fill up arr
app_heap(h_t, ENUM, "enum", 1);
return h_t;
}
and then passing h_t to:
int app_heap(heap_t h, enum symbol s, char* word, int line){
int it = h->it;
int len = h->len;
if (it + 1 < len ){
token temp;
h->arr[it] = temp;
h->arr[it].sym = s;
h->arr[it].word = word;
h->arr[it].line = line;
h->it = it + 1;
printf(h->arr[it].word);
return 1;
} else {
h->len = len*2;
h->arr = realloc(h->arr, len*2);
return app_heap(h, s, word, line);
}
}
Why does my h_t->arr fill up with junk and eventually I get a segmentation fault? How do I fix this? Any C coding tips/styles to avoid stuff like this?
First, to answer your question about the crash, I think the reason you are getting segmentation fault is that you fail to multiply len by sizeof(token) in the call to realloc. You end up writing past the end of the block that has been allocated, eventually triggering a segfault.
As far as "deciding to free an object and when [...] to keep an object" goes, C does not decide any of it for you: it simply does it when you tell it to by calling free, without asking you any further questions. This "obedience" ends up costing you sometimes, because you can accidentally free something you still need. It is a good idea to NULL out the pointer, to improve your chance of catching the issue faster (unfortunately, this is not enough to eliminate the problem altogether, because of shared pointers).
free(h->arr);
h->arr = NULL; // Doing this is a good practice
To summarize, managing memory in C is a tedious task that requires a lot of thinking and discipline. You need to check the result of every allocation call to see if it has failed, and perform many auxiliary tasks when it does.
C does not "decide" anything, if you have allocated something yourself with an explicit call to e.g. malloc(), it will stay allocated until you free() it (or until the program terminates, typically).
I think this:
token temp;
h->arr[it] = temp;
h->arr[it].sym = s;
/* more accesses */
is very weird, the first two lines don't do anything sensible.
As pointed out by dasblinkenlight, you're failing to scale the re-allocation into bytes, which will cause dramatic shrinkage of the array when it tries to grow, and corrupt it totally.
You shouldn't cast the return values of malloc() and realloc(), in C.
Remember that realloc() might fail, in which case you will lose your pointer if you overwrite it like you do.
Lots of repetition in your code, i.e. realloc(h->arr, len*2) instead of realloc(h->arr, h->len * sizeof *h->arr) and so on.
Note how the last bullet point also fixes the realloc() scaling bug mentioned above.
You're not reallocating to the proper size, the realloc statement needs to be:
realloc(h->arr, sizeof(token) * len*2);
^^^^^^^^^^^^
(Or perhaps better: realloc(h->arr, sizeof *h->arr * h->len);)
In C, you are responsible for freeing the memory you allocate. You have to free() the memory you've malloc/calloc/realloc'ed when it's suitable to do so. The C runtime never frees anything, except when the program has terminated (some more esoteric systems might not release the memory even then).
Also, try to be consistent; the general form for allocating is always T *foo = malloc(sizeof *foo). And don't duplicate stuff.
e.g.
h_t->arr = (token_t)calloc(10, sizeof(token));
^^^^^^^^ ^^ ^^^^^^^^^^^^^
Don't cast the return value of malloc in C. It's unnecessary and might hide a serious compiler warning and bug if you forget to include stdlib.h.
the cast is token_t but the sizeof applies to token; why are they different, and are they the same type as *h_t->arr?
You already have the magic 10 value, use h_t->len
If you ever change the type of h_t->arr, you have to remember to change the sizeof(..)
So make this
h_t->arr = calloc(h_t->len, sizeof *h_t->arr);
Two main causes of dangling pointers in C are not assigning NULL to a pointer after freeing its allocated memory, and shared pointers.
There is a solution to the first problem, of automatically nulling out the pointer.
void SaferFree(void *AFree[])
{
free(AFree[0]);
AFree[0] = NULL;
}
The caller, instead calling
free(p);
will call
SaferFree(&p);
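Note that passing an int ** where a void ** is expected is not strictly conforming C, so compilers will warn about the call above. A common way around that is a macro that does the free-and-null in place, keeping the pointer's real type (a sketch of a well-known idiom):
#include <stdlib.h>

/* Evaluates its argument twice, so only pass a plain pointer
   variable, never an expression with side effects. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)
The caller then writes FREE_AND_NULL(p); and gets the same nulling-out behavior.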
In respect to the second and harder-to-solve issue:
The rule of three says:
If you need to explicitly declare either the destructor, copy constructor or copy assignment operator yourself, you probably need to explicitly declare all three of them.
Sharing a pointer in C is simply copying it (copy assignment). It means that following the rule of three (or the general rule of zero) when programming in C obliges the programmer to supply a way to construct and especially to destruct such an assignment. That is possible, but it is not an easy task, especially since C does not supply a destructor that is implicitly activated, as in C++.
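To make that concrete, here is a sketch of what supplying such a "destructor" by hand can look like in C: a reference-counted wrapper, so that every shared copy is recorded and only the last release frees (all names are illustrative):
#include <stdlib.h>

typedef struct {
    int   refs;
    void *data;   /* the shared payload */
} shared;

/* Construct with a single owner. */
shared *shared_new(void *data)
{
    shared *s = malloc(sizeof *s);
    if (s) { s->refs = 1; s->data = data; }
    return s;
}

/* The C stand-in for "copy assignment": sharing is recorded explicitly. */
shared *shared_copy(shared *s) { s->refs++; return s; }

/* The C stand-in for a destructor: only the last release frees. */
void shared_release(shared *s)
{
    if (s && --s->refs == 0) { free(s->data); free(s); }
}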
I've seen a lot of code that checks for NULL pointers whenever an allocation is made. This makes the code verbose, and if it's not done consistently - only when the programmer felt like it - it doesn't even ensure that the program won't crash when the address space runs out. Besides, if the program can't make more allocations, it wouldn't be able to do its function anyway, right?
So my question is, isn't it better for most programs not to check at all and just let the program crash if memory runs out? At least the code is more readable that way.
Note
I'm talking about desktop apps that run on modern computers (at least 2 GB address space), and that most definitely don't operate space shuttles, life support systems, or BP's oil platforms. Most importantly I'm talking about programs that use malloc but never really go above 5 MB of memory usage.
Always check the return value, but for clarity, it's common to wrap malloc() in a function which never returns NULL:
void *
emalloc(size_t amt){
void *v = malloc(amt);
if(!v){
fprintf(stderr, "out of mem\n");
exit(EXIT_FAILURE);
}
return v;
}
Then, later you can use
char *foo = emalloc(56);
foo[12] = 'A';
With no guilty conscience.
Yes, you should check for a null return value from malloc. Even if you can't recover from the failure of memory allocation you should explicitly exit. Carrying on as though memory allocation had succeeded leaves your application in an inconsistent state and is likely to cause "undefined behavior" which should be avoided.
For example, you may end up writing inconsistent data to external storage which may hinder the ability of the next run of the application to recover. It's much safer to exit swiftly in a more controlled fashion.
Many applications that want to exit on allocation failure wrap malloc in a function that checks the return value and explicitly aborts on failure.
Arguably, this is one advantage of the C++ default new approach to throw an exception on allocation failure. It requires no effort to exit on memory allocation failure.
Similar to Dave's approach above, but adds a macro that automatically passes
the file name and line number to our allocation routine so that we can report
that information in the event of a failure.
#include <stdio.h>
#include <stdlib.h>
#define ZMALLOC(theSize) zmalloc(__FILE__, __LINE__, theSize)
static void *zmalloc(const char *file, int line, int size)
{
void *ptr = malloc(size);
if(!ptr)
{
printf("Could not allocate: %d bytes (%s:%d)\n", size, file, line);
exit(1);
}
return(ptr);
}
int main()
{
/* -- Set 'forceFailure' to a non-zero value in order to observe
how 'zmalloc' behaves when it cannot allocate the
requested memory -- */
int bytes = 10 * sizeof(int);
int forceFailure = 0;
int *anArray = NULL;
if(forceFailure)
bytes = -1;
anArray = ZMALLOC(bytes);
free(anArray);
return(0);
}
But it is much more difficult to troubleshoot if you don't log where the malloc failed. "Failed to allocate memory in line XX" is preferable to just crashing.
You should definitely check the return value of malloc. It is helpful in debugging, and the code becomes more robust.
Always check malloc'ed memory?
In a hosted environment, error checking the return of malloc does not make much sense nowadays. Most machines have a 64-bit virtual address space; you'd need a lot of time to exhaust that. Your program will most likely fail at a completely different place, namely when your physical+swap memory is exhausted. It will have shown completely ridiculous performance before that, because it was only swapping, and the user will have hit Ctrl-C long before you ever get there.
Segfaulting "nicely" on a null pointer reference would be a clear point to see where things fail in a debugger. But in my practice I have never seen a failed malloc as a cause.
When programming for embedded systems the picture changes completely. There you definitively should check for failed malloc.
Edit: To clarify that after the edit of the question. The kind of programs/systems described there are clearly not "embedded". I have never seen malloc fail under the circumstances described there.
I'd like to add that edge cases should always be checked even if you think they are safe or cannot lead to other issues than a crash. Null pointer dereference can potentially be exploited (http://uninformed.org/?v=4&a=5&t=sumry).
I've always heard that in C you have to really watch how you manage memory. I'm still beginning to learn C, but thus far I have not had to do any memory-management-related activities at all. I always imagined having to release variables and do all sorts of ugly things. But this doesn't seem to be the case.
Can someone show me (with code examples) an example of when you would have to do some "memory management" ?
There are two places where variables can be put in memory. When you create a variable like this:
int a;
char c;
char d[16];
The variables are created on the "stack". Stack variables are automatically freed when they go out of scope (that is, when execution leaves the block in which they are declared). You might hear them called "automatic" variables, but that has fallen out of fashion.
Many beginner examples will use only stack variables.
The stack is nice because it's automatic, but it also has two drawbacks: (1) The compiler needs to know in advance how big the variables are, and (2) the stack space is somewhat limited. For example: in Windows, under default settings for the Microsoft linker, the stack is set to 1 MB, and not all of it is available for your variables.
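For instance, a single local variable like this can blow through a 1 MB stack on its own (a sketch):
void stack_hog(void)
{
    char big[2 * 1024 * 1024];   /* a 2 MB automatic array: likely a stack overflow */
    big[0] = 'x';                /* touch it so the compiler can't discard it */
}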
If you don't know at compile time how big your array is, or if you need a big array or struct, you need "plan B".
Plan B is called the "heap". You can usually create variables as big as the Operating System will let you, but you have to do it yourself. Earlier postings showed you one way you can do it, although there are other ways:
int size;
// ...
// Set size to some value, based on information available at run-time. Then:
// ...
char *p = (char *)malloc(size);
(Note that variables in the heap are not manipulated directly, but via pointers)
Once you create a heap variable, the problem is that the compiler can't tell when you're done with it, so you lose the automatic releasing. That's where the "manual releasing" you were referring to comes in. Your code is now responsible to decide when the variable is not needed anymore, and release it so the memory can be taken for other purposes. For the case above, with:
free(p);
What makes this second option "nasty business" is that it's not always easy to know when the variable is not needed anymore. Forgetting to release a variable when you don't need it will cause your program to consume more memory than it needs. This situation is called a "leak". The "leaked" memory cannot be used for anything until your program ends and the OS recovers all of its resources. Even nastier problems are possible if you release a heap variable by mistake before you are actually done with it.
In C and C++, you are responsible to clean up your heap variables like shown above. However, there are languages and environments such as Java and .NET languages like C# that use a different approach, where the heap gets cleaned up on its own. This second method, called "garbage collection", is much easier on the developer but you pay a penalty in overhead and performance. It's a balance.
(I have glossed over many details to give a simpler, but hopefully more leveled answer)
Here's an example. Suppose you have a strdup() function that duplicates a string:
char *strdup(char *src)
{
char * dest;
dest = malloc(strlen(src) + 1);
if (dest == NULL)
abort();
strcpy(dest, src);
return dest;
}
And you call it like this:
int main(void)
{
char *s;
s = strdup("hello");
printf("%s\n", s);
s = strdup("world");
printf("%s\n", s);
}
You can see that the program works, but you have allocated memory (via malloc) without freeing it up. You have lost your pointer to the first memory block when you called strdup the second time.
This is no big deal for this small amount of memory, but consider the case:
for (i = 0; i < 1000000000; ++i) /* billion times */
s = strdup("hello world"); /* 11 bytes */
You have now used up 11 gig of memory (possibly more, depending on your memory manager) and if you have not crashed your process is probably running pretty slowly.
To fix, you need to call free() for everything that is obtained with malloc() after you finish using it:
s = strdup("hello");
free(s); /* now not leaking memory! */
s = strdup("world");
...
Hope this example helps!
You have to do "memory management" when you want to use memory on the heap rather than the stack. If you don't know how large to make an array until runtime, then you have to use the heap. For example, you might want to store something in a string, but don't know how large its contents will be until the program is run. In that case you'd write something like this:
char *string = malloc(stringlength); // stringlength is the number of bytes to allocate
// Do something with the string...
free(string); // Free the allocated memory
I think the most concise way to answer the question in to consider the role of the pointer in C. The pointer is a lightweight yet powerful mechanism that gives you immense freedom at the cost of immense capacity to shoot yourself in the foot.
In C the responsibility of ensuring your pointers point to memory you own is yours and yours alone. This requires an organized and disciplined approach, unless you forsake pointers, which makes it hard to write effective C.
The posted answers to date concentrate on automatic (stack) and heap variable allocations. Using stack allocation does make for automatically managed and convenient memory, but in some circumstances (large buffers, recursive algorithms) it can lead to the horrendous problem of stack overflow. Knowing exactly how much memory you can allocate on the stack is very dependent on the system. In some embedded scenarios a few dozen bytes might be your limit, in some desktop scenarios you can safely use megabytes.
Heap allocation is less inherent to the language. It is basically a set of library calls that grants you ownership of a block of memory of a given size until you are ready to return ('free') it. It sounds simple, but is associated with untold programmer grief. The problems are simple to state (freeing the same memory twice, or not at all [memory leaks], not allocating enough memory [buffer overflow], etc.) but difficult to avoid and debug. A highly disciplined approach is absolutely mandatory in practice, but of course the language doesn't actually mandate it.
I'd like to mention another type of memory allocation that's been ignored by other posts. It's possible to statically allocate variables by declaring them outside any function. I think in general this type of allocation gets a bad rap because it's used by global variables. However there's nothing that says the only way to use memory allocated this way is as an undisciplined global variable in a mess of spaghetti code. The static allocation method can be used simply to avoid some of the pitfalls of the heap and automatic allocation methods. Some C programmers are surprised to learn that large and sophisticated C embedded and games programs have been constructed with no use of heap allocation at all.
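As a sketch of that style, a fixed-capacity pool declared at file scope provides heap-like storage with no malloc/free at all (the names here are illustrative):
#include <stddef.h>

#define POOL_CAPACITY 1024

static int    pool[POOL_CAPACITY];   /* statically allocated; lives for the whole program */
static size_t pool_used;

/* Hand out chunks from the static pool; NULL when exhausted. */
static int *pool_alloc(size_t count)
{
    if (count > POOL_CAPACITY - pool_used)
        return NULL;
    int *p = pool + pool_used;
    pool_used += count;
    return p;
}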
There are some great answers here about how to allocate and free memory, and in my opinion the more challenging side of using C is ensuring that the only memory you use is memory you've allocated - if this isn't done correctly what you end up with is the cousin of this site - a buffer overflow - and you may be overwriting memory that's being used by another application, with very unpredictable results.
An example:
int main() {
char* myString = (char*)malloc(5*sizeof(char));
strcpy(myString, "abcd");
}
At this point you've allocated 5 bytes for myString and filled it with "abcd\0" (strings end in a null - \0).
If your string copy had instead been
strcpy(myString, "abcde");
you would be writing "abcde" into the 5 bytes allocated to your program, and the trailing null character would be put past the end of them - into a part of memory that hasn't been allocated for your use and could be free, but could equally be being used by another application. This is the critical part of memory management, where a mistake will have unpredictable (and sometimes unrepeatable) consequences.
A thing to remember is to always initialize your pointers to NULL, since an uninitialized pointer may contain a pseudo-random but seemingly valid memory address, which can let pointer errors pass silently. By enforcing that a pointer is initialized with NULL, you can always catch it being used before initialization, because operating systems "wire" the virtual address 0x00000000 to a protection exception precisely to trap null pointer usage.
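A minimal sketch of the habit:
#include <stdlib.h>

void example(void)
{
    int *p = NULL;   /* never leave a pointer indeterminate */
    /* ... code that may or may not allocate into p ... */
    free(p);         /* well defined even if nothing was allocated: free(NULL) is a no-op */
}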
Also, you might want to use dynamic memory allocation when you need to define a huge array, say int[10000]. You can't just put it on the stack, because then, hm... you'll get a stack overflow.
Another good example would be an implementation of a data structure, say linked list or binary tree. I don't have a sample code to paste here but you can google it easily.
(I'm writing because I feel the answers so far aren't quite on the mark.)
The reason you have to do memory management worth mentioning is when you have a problem/solution that requires you to create complex structures. (If your program crashes because you allocated too much space on the stack at once, that's a bug.) Typically, the first data structure you'll need to learn is some kind of list. Here's a singly linked one, off the top of my head:
typedef struct listelem { struct listelem *next; void *data;} listelem;
listelem * create(void * data)
{
listelem *p = calloc(1, sizeof(listelem));
if(p) p->data = data;
return p;
}
listelem * delete(listelem * p)
{
listelem *next = p->next;
free(p);
return next;
}
void deleteall(listelem * p)
{
while(p) p = delete(p);
}
void foreach(listelem * p, void (*fun)(void *data) )
{
for( ; p != NULL; p = p->next) fun(p->data);
}
listelem * merge(listelem *p, listelem *q)
{
listelem *head = p;
while(p != NULL && p->next != NULL) p = p->next;
if(p) {
p->next = q;
return head;
} else
return q;
}
Naturally, you'd like a few other functions, but basically this is what you need memory management for. I should point out that there are a number of tricks that are possible with "manual" memory management, e.g.,
Using the fact that malloc is guaranteed (by the language standard) to return a pointer suitably aligned for any object type (so its low bits are zero and can be borrowed),
allocating extra space for some sinister purpose of your own (sketched below),
creating memory pools..
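As a sketch of the second trick, a size header can be hidden in front of the block that is handed back to the caller (illustrative only; note that the payload's alignment drops to that of size_t):
#include <stdlib.h>

void *sized_alloc(size_t n)
{
    size_t *p = malloc(sizeof(size_t) + n);
    if (!p) return NULL;
    *p = n;        /* stash the length just before the payload */
    return p + 1;
}

size_t sized_len(void *payload)
{
    return ((size_t *)payload)[-1];
}

void sized_free(void *payload)
{
    if (payload) free((size_t *)payload - 1);
}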
Get a good debugger... Good luck!
@Euro Micelli
One negative to add is that pointers to the stack are no longer valid when the function returns, so you cannot return a pointer to a stack variable from a function. This is a common error and a major reason why you can't get by with just stack variables. If your function needs to return a pointer, then you have to malloc and deal with memory management.
@Ted Percival:
...you don't need to cast malloc()'s return value.
You are correct, of course. I believe that has always been true, although I don't have a copy of K&R to check.
I don't like a lot of the implicit conversions in C, so I tend to use casts to make "magic" more visible. Sometimes it helps readability, sometimes it doesn't, and sometimes it causes a silent bug to be caught by the compiler. Still, I don't have a strong opinion about this, one way or another.
This is especially likely if your compiler understands C++-style comments.
Yeah... you caught me there. I spend a lot more time in C++ than C. Thanks for noticing that.
In C, you actually have two different choices. One, you can let the system manage the memory for you. Alternatively, you can do that by yourself. Generally, you would want to stick to the former as long as possible. However, auto-managed memory in C is extremely limited and you will need to manually manage the memory in many cases, such as:
a. You want the variable to outlive the function, and you don't want to have a global variable. For example:
struct pair{
int val;
struct pair *next;
};
struct pair* new_pair(int val){
struct pair* np = malloc(sizeof(struct pair));
if (np == NULL)
return NULL; /* allocation failed; don't dereference */
np->val = val;
np->next = NULL;
return np;
}
b. you want to have dynamically allocated memory. Most common example is array without fixed length:
int *my_special_array;
my_special_array = malloc(sizeof(int) * number_of_element);
for(i=0; i < number_of_element; i++) { /* ... use my_special_array[i] ... */ }
c. You want to do something REALLY dirty. For example, I would want a struct to represent many kinds of data, and I don't like union (union looks soooo messy):
struct data{
int data_type;
long data_in_mem;
};
struct animal{/*something*/};
struct person{/*some other thing*/};
struct animal* read_animal();
struct person* read_person();
/*In main*/
struct data sample;
sample.data_type = input_type;
switch(input_type){
case DATA_PERSON:
sample.data_in_mem = (long)read_person();
break;
case DATA_ANIMAL:
sample.data_in_mem = (long)read_animal();
break;
default:
printf("Oh hoh! I warn you, that again and I will seg fault your OS");
}
See, a long value is enough to hold a pointer on most platforms (intptr_t would be the portable choice). Just remember to free it, or you WILL regret it. This is among my favorite tricks to have fun in C :D.
However, generally, you would want to stay away from your favorite tricks (T___T). You WILL break your program, sooner or later, if you use them too often. As long as you don't use *alloc and free, it is safe to say that your hands are still clean, and that the code still looks nice.
Sure. If you create an object that exists outside of the scope you use it in. Here is a contrived example (bear in mind my syntax will be off; my C is rusty, but this example will still illustrate the concept):
class MyClass
{
SomeOtherClass *myObject;
public MyClass()
{
//The object is created when the class is constructed
myObject = (SomeOtherClass*)malloc(sizeof(SomeOtherClass));
}
public ~MyClass()
{
//The class is destructed
//If you don't free the object here, you leak memory
free(myObject);
}
public void SomeMemberFunction()
{
//Some use of the object
myObject->SomeOperation();
}
};
In this example, I'm using an object of type SomeOtherClass during the lifetime of MyClass. The SomeOtherClass object is used in several functions, so I've dynamically allocated the memory: the SomeOtherClass object is created when MyClass is created, used several times over the life of the object, and then freed once MyClass is freed.
Obviously if this were real code, there would be no reason (aside from possibly stack memory consumption) to create myObject in this way, but this type of object creation/destruction becomes useful when you have a lot of objects, and want to finely control when they are created and destroyed (so that your application doesn't suck up 1GB of RAM for its entire lifetime, for example), and in a Windowed environment, this is pretty much mandatory, as objects that you create (buttons, say), need to exist well outside of any particular function's (or even class') scope.