How important is it to check for malloc failures?

Is always protecting mallocs important? By protecting I mean:
char *test_malloc = malloc(sizeof(char) * 10000);
if (!test_malloc)
    exit(EXIT_FAILURE);
I mean, in electronic devices, I don't doubt it's essential.
But what about programs I'm running on my own machine, where I'm sure the allocation size will be positive and not astronomical?
Some people say, "Ah imagine there’s not enough memory in your computer at this moment."

There's always the possibility that the system can run out of memory and not have any more to allocate, so it's always a good idea to do so.
Given that there's almost no way to recover from a failed malloc, I prefer to use a wrapper function to do the allocation and checking. That makes the code more readable.
void *safe_malloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL) {
        perror("malloc failed!");
        exit(EXIT_FAILURE);
    }
    return p;
}

The other reason it's a good idea to check for malloc failures (and why I always do) is to catch programming mistakes.
It's true, memory is effectively infinite on many machines today, so malloc almost never fails because it ran out of memory. But it often fails (for me, at least) because I screwed up and overwrote memory in some way, screwing up the heap, and the next call to malloc often returns NULL to let you know it noticed the corruption. And that's something I really want to know about, and fix.
Over the years, probably 10% of the time malloc has returned NULL on me was because it was out of memory, and the other 90% was because of those other problems, my problems, that I really needed to know about.
So, to this day, I still maintain a pretty religious habit of always checking malloc.

Is it important to check for error conditions? Absolutely!
Now when it comes to malloc, you might find that some implementations of the C standard library – glibc, for example – will never return NULL and instead abort, on the assumption that if allocating memory fails, a program won't be able to recover from that. IMHO that assumption is ill-founded. But it places malloc in that weird group of functions where, even if you properly implement out-of-memory handling that works without requesting any more memory, your program still gets crashed.
That being said: you should definitely check for error conditions, for the sole reason that you might run your program in an environment that gives you a chance to react to them properly.
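For illustration, here is a minimal sketch of such handling; it assumes the POSIX write() function so that the error path itself requests no further memory, and the wrapper name xmalloc is just a convention, not anything standard:
#include <stdlib.h>
#include <unistd.h>  /* write(); POSIX, chosen because it allocates nothing */

static const char oom_msg[] = "fatal: out of memory\n";

void *xmalloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL) {
        /* a static message and a raw write(): the error path itself
           requests no additional memory, unlike stdio might */
        (void) write(STDERR_FILENO, oom_msg, sizeof oom_msg - 1);
        _exit(EXIT_FAILURE);
    }
    return p;
}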

The Standard makes no distinction between a number of kinds of implementations:
Those where a program could allocate as much memory as malloc() will supply without affecting system stability, and where a program that could behave usefully without the storage from a failed allocation may run normally.
Those where system memory exhaustion may cause programs to terminate unexpectedly even without malloc() ever indicating a failure, but where attempting to dereference a pointer returned via malloc() will either access a valid memory region or force program termination.
Those where malloc() might return null, and dereferencing a null pointer might have unwanted effects beyond abnormal program termination, but where programs might be unexpectedly terminated due to a lack of storage even when malloc() reports success.
If there's no way a program would be able to usefully continue operation if malloc() fails, it's probably best to allocate memory via a wrapper function that calls malloc() and forcibly terminates the program if it returns null.

Related

Is it mandatory to check if realloc worked?

In C is it mandatory to check if the realloc function made it?
void *tmp = realloc(data, new_size);
if (tmp == NULL) return 1;
data = tmp;
In C is it mandatory to check if the realloc function made it?
The quick answer is: NO! Checking for failure is not mandatory. If realloc fails, it returns a null pointer; storing that into the original pointer overwrites the previous value, potentially making the block unreachable for later freeing. Dereferencing the null pointer has undefined behavior, typically a crash on architectures with protected virtual memory. In practice, on these systems, unless you pass an insanely large number for the new size, the call will not fail, so you won't get hit by this sloppy code:
data = realloc(data, new_size); // assume realloc succeeds
If you care to be friendly to the next guy trying to debug the program in more stressed environments, you could add:
data = realloc(data, new_size); // assume realloc succeeds
assert(data);
The long answer is: YES you should check for realloc failure in a reliable production program and handle the failure gracefully.
Obviously realloc can fail if the requested amount of memory is too large for the heap to honor, but it can also fail for internal reasons on requests for smaller amounts, even amounts smaller than the size of the allocated block passed as an argument, and even in the absence of heap corruption caused by undefined behavior. There is just no reason to assume that realloc() will always succeed.
If you know the current size allocated for the object you mean to realloc, you can ignore a realloc failure to shrink the object.
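For instance (a sketch; data and used are hypothetical variables, with used > 0):
/* Shrinking a buffer to its used size: if realloc fails, the old,
   larger block is still valid, so the failure can safely be ignored. */
char *shrunk = realloc(data, used);
if (shrunk != NULL)
    data = shrunk;   /* shrink succeeded; the pointer may have moved */
/* else: keep using the original, oversized block; nothing is lost */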
For other requests, you should handle the error gracefully. If the error causes the function to abort its operation, any memory allocated for the current task should be freed to avoid memory leaks if the calling program continues. This is a recommendation to avoid memory or resource leaks and allow the program to run reliably for a long time, but depending on your local constraints, you may get away with slack.
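Something along these lines, for example (do_task is a hypothetical function):
/* On any allocation failure, free what was already allocated for the
   task and report the failure to the caller instead of leaking. */
int do_task(size_t n)
{
    char *a = malloc(n);
    if (a == NULL)
        return -1;            /* nothing to clean up yet */
    char *tmp = realloc(a, 2 * n);
    if (tmp == NULL) {
        free(a);              /* abort the task without leaking */
        return -1;
    }
    a = tmp;
    /* ... perform the task using a ... */
    free(a);
    return 0;
}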
To summarize: depending on local constraints (from quick and dirty throw away code to reliable sturdy production code running indefinitely in a life support system), you might not care about the potential improbable failures or it could be mandatory for you to detect, handle and document any unexpected situations.
Detecting and reporting less improbable errors, such as fopen() failure to open files, fgets() failure at end of file, or scanf() conversion errors from invalid user input, is recommended to avoid wasting hours trying to make sense of unexpected behavior, or worse, relying on corrupted data that did not produce obviously incorrect results.
Is it mandatory to check if realloc worked?
Yes, or at least, as mandatory as is to check any other success/failure return.
realloc can certainly fail, just as malloc can. In fact, realloc can fail even if you're trying to make a block smaller.
Now, if you're working on the assumption that malloc on a virtual memory system can never fail, and if you're not checking the return value from malloc, either, then no, with the same justification, you could skip checking realloc's return value also. (It'd be silly to check realloc's return value but not malloc's.)
In practice, the question of whether to check for memory allocation failure (malloc and/or realloc) on a modern, virtual-memory system is the subject of some debate. There are plenty of programmers who believe that checking the return values is a waste of time, since "malloc never fails". Me, I always check (except in the very most rushed, throwaway code), because even after all these years, the number of times I've said "I'm getting so tired of typing if( … != NULL) all the time!" is still zero, while the number of times that a checked and caught NULL return has alerted me promptly to a bug in my code, and saved me hours of pointless debugging, is countless.
So in summary, and at the risk of repeating myself, I'd say that, yes, it's mandatory. See also Always check malloc'ed memory?
It is a return value from a library function. The language has no knowledge of the semantics of any function and it is not syntactically mandatory to check or use any return value. That is to say it will compile.
However it may be a semantic error not to check, and an unchecked failure may cause unintended runtime behaviour - i.e. a bug. The language does not prevent you from writing bugs.
It may be "mandatory" in a "local" sense, in cases where development is subject to a coding standard that makes it so. Any reasonable coding standard would likely impose such a requirement, and it might be enforced by static analysis tools or peer review used by the development team. But it is not universally mandatory: you are free to write bad code if such standards are not applied to your project.
If you use realloc() on a pointer that was not obtained from malloc() or calloc(), then as far as I know there's no retrospective way to check realloc()'s return value, since the call will abort with an error:
realloc(): invalid pointer
Aborted (core dumped)
Example:
int myint = 10;
int *ptr = &myint;
ptr = (int *) realloc(ptr, 2 * sizeof(int));

The need of checking a pointer returned by malloc in C

Suppose I have the following line in my code:
struct info *pinfo = malloc(sizeof(struct info));
Usually there is another line of code like this one:
if (!pinfo)
<handle this error>
But is it really worth it? Especially if the object is so small that the code generated to check it might need more memory than the object itself?
It's true that running out of memory is rare, especially for little test programs that are only allocating tens of bytes of memory, especially on modern systems that have many gigabytes of memory available.
Yet malloc failures are very common, especially for little test programs.
malloc can fail for two reasons:
1. There's not enough memory to allocate.
2. malloc detects that the memory-allocation heap is messed up, perhaps because you did something wrong with one of your previous memory allocations.
Now, it turns out that #2 happens all the time.
And, it turns out that #1 is pretty common, too, although not because there's not enough memory to satisfy the allocation the programmer meant to do, but because the programmer accidentally passed a preposterously huge number to malloc, accidentally asking for more memory than there is in the known universe.
So, yes, it turns out that checking for malloc failure is a really good idea, even though it seems like malloc "can't fail".
The other thing to think about is, what if you take the shortcut and don't check for malloc failure? If you sail along and use the null pointer that malloc gave you instead, that'll cause your program to immediately crash, and that'll alert you to your problem just as well as an "out of memory" message would have, without your having to wear your fingers to the bone typing if(!pinfo) and fprintf(stderr, "out of memory\n"), right?
Well, no.
Depending on what your program accidentally does with the null pointer, it's possible it won't crash right away. Anyway, the crash you get, with a message like "Segmentation violation - core dumped" doesn't tell you much, doesn't tell you where your problem is. You can get segmentation violations for all sorts of reasons (especially in little test programs, especially if you're a beginner not quite sure what you're doing). You can spend hours in a futile effort to figure out why your program is crashing, without realizing it's because malloc is returning a null pointer. So, definitely, you should always check for malloc failure, even in the tiniest test programs.
Deciding which errors to test for, versus those that "can't happen" or for whatever reason aren't worth catching, is a hard problem in general. It can take a fair amount of experience to know what is and isn't worth checking for. But, truly, anybody who's programmed in C for very long can tell you emphatically: malloc failure is definitely worth checking for.
If your program is calling malloc all over the place, checking each and every call can be a real nuisance. So a popular strategy is to use a malloc wrapper:
void *my_malloc(size_t n)
{
    void *ret = malloc(n);
    if (ret == NULL) {
        fprintf(stderr, "malloc failed (%s)\n", strerror(errno));
        exit(1);
    }
    return ret;
}
There are three ways of thinking about this function:
1. Whenever you have some processing that you're doing repetitively, all over the place (in this case, checking for malloc failure), see if you can move it off to (centralize it in) a single function, like this.
2. Unlike malloc, my_malloc can't fail. It never returns a null pointer. It's almost magic. You can call it whenever and wherever you want, and you never have to check its return value (see the call-site sketch just after this list). It lets you pretend that you never have to worry about running out of memory (which was sort of the goal all along).
3. Like any magical result, my_malloc's benefit — that it never seems to fail — comes at a price. If the underlying malloc fails, my_malloc summarily exits (since it can't return in that case), meaning that the rest of your program doesn't get a chance to clean up. If the program were, say, a text editor, and whenever it had a little error it printed "out of memory" and then basically threw away the file the user had been editing for the last hour, the user might not be too pleased. So you can't use the simple my_malloc trick in production programs that might lose data. But it's a huge convenience for programs that don't have to worry about that sort of thing.
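For instance, a call site can then be as simple as this (MAXLINE is a hypothetical constant):
char *line = my_malloc(MAXLINE);   /* no NULL check needed here */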
If malloc fails then chances are the system is out of memory or it's something else your program can't handle. It should abort immediately and at most log some diagnostics. Not handling NULL from malloc will make you end up in undefined behavior land. One might argue that having to abort because of a failure of malloc is already catastrophic but just letting it exhibit UB falls under a worse category.
But what if the malloc fails? You will dereference the NULL pointer, which is UB (undefined behaviour) and your program will (probably) fail!
Sometimes code which checks the correctness of the data is longer than the code which does something with it :).
This is very simple: if you don't check for NULL, you might end up with a runtime error. Checking for NULL helps you protect the program from an unexpected crash and handle the error case gracefully.
If you just want to quickly test some algorithm, then fine, but know it can fail. For example run it in the debugger.
When you include it in your Real World Program, then add all the error checking and handling needed.

Why some people don't check for NULL after calling malloc?

Some time ago I downloaded some source code from the Internet. There were several malloc calls, and after that there was no check for NULL. As far as I know you need to check for NULL after calling malloc.
Is there a good reason for somebody not to check for NULL after calling malloc? Am I missing something?
As Jens Gustedt mentioned in a comment, by the time malloc() returns an error your program is likely to be in a heap of trouble already. Does it make sense to put in a bunch of error handling code to handle the situation, when the program is likely not going to be able to do much of anything anyway? For many programs the answer might be 'no', for others it might be very important to do something appropriate.
You can try allocating your memory through a simple 'malloc-or-die' wrapper function that guarantees that the allocation succeeds or the program will terminate:
void* m_malloc(size_t size)
{
    void* p;
    // make sure a size request of `0` doesn't trigger
    // an error situation needlessly
    if (size == 0) size = 1;
    p = malloc(size);
    if (!p) {
        // attempt to log the error or whatever
        abort();
    }
    return p;
}
One problem that you then run into is that there's not much you can reliably do except maybe terminate the program. Even logging the problem is likely to require some memory allocation, so the logging facility will probably have its own problems (unless your allocation failure is due to trying to allocate an unreasonably large block of memory).
You might try to solve that issue by allocating a 'fail-safe' block early in your program that can be freed when you need to log the problem (I think there are quite a few programs that use this strategy). But how much work you are willing to put into this kind of error handling depends on your specific needs. If your program needs to ensure that something of significant complexity is done when malloc() returns an error, you'll need to have corresponding safeguards to make sure you can do those things in a very low-memory situation. Generally this means additional complexity, and it may not always be worth the effort.
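As a sketch of that fail-safe strategy (the names and the 64 KiB size are arbitrary choices, not a standard API):
#include <stdio.h>
#include <stdlib.h>

static void *emergency_reserve;

void reserve_init(void)
{
    emergency_reserve = malloc(64 * 1024);
}

void oom_handler(void)
{
    /* Release the reserve so the error path (logging, saving state)
       has some memory to work with. */
    free(emergency_reserve);
    emergency_reserve = NULL;
    fprintf(stderr, "out of memory\n");
    /* ... try to save user data here ... */
    abort();
}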
People don't check because they're lazy, it makes their code uglier, and they don't want to figure out how to recover from errors everywhere.
I've heard a few programmers say, "If I can't malloc a block the system is going to crash soon anyway because VM is full, so why should I bother checking?"
I disagree. You should check for errors, even if it means just logging the error and calling exit() or throwing an exception. While we were trending towards systems with huge disks and always-on paged memory, the industry has flipped and now we have smartphones and tablets with limited RAM and no on-demand paging. Plus even on the desktop our datasets have grown so much that sometimes malloc will fail.
If you don't want to add extra lines of code everywhere, just write your own malloc replacement that calls malloc and checks for errors and use it instead of malloc.
They just don't care about unexpected crashes!
When you do a malloc, it's very likely you are going to store something there immediately. So if you don't check for NULL, the program may crash when it subsequently tries to store something at that location.
This is unlikely in small programs, where malloc hardly ever fails for small amounts of memory, so malloc doesn't return NULL.
But it's usually good practice to check malloc's return for NULL even in small programs, in my opinion.
If you need more memory and malloc cannot give you more, can you do anything about it?
I guess exit gracefully.
But if you exit anyway, I guess they think it doesn't really matter how you exit: you might as well crash and avoid what they see as the "overhead" of checking for NULL.
Perhaps the functionality was such that they didn't have any need for cleanup code?
I don't agree, though. You should check for NULL on malloc's return.

Can I rely on malloc returning NULL?

I read that on Unix systems, malloc can return a non-NULL pointer even if the memory is not actually available, and trying to use the memory later on will trigger an error. Since I cannot catch such an error by checking for NULL, I wonder how useful it is to check for NULL at all?
On a related note, Herb Sutter says that handling C++ memory errors is futile, because the system will go into spasms of paging long before an exception will actually occur. Does this apply to malloc as well?
Quoting Linux manuals:
By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available. This is a really bad bug. In case it turns out that the system is out of memory, one or more processes will be killed by the infamous OOM killer. In case Linux is employed under circumstances where it would be less desirable to suddenly lose some randomly picked processes, and moreover the kernel version is sufficiently recent, one can switch off this overcommitting behavior using a command like:
# echo 2 > /proc/sys/vm/overcommit_memory
You ought to check for a NULL return, especially on 32-bit systems, since the process address space can be exhausted far before the RAM: on 32-bit Linux, for example, user processes might have a usable address space of 2G–3G as opposed to over 4G of total RAM. On 64-bit systems it might be useless to check the malloc return code, but it can be considered good practice anyway, and it does make your program more portable. And, remember, dereferencing a null pointer certainly kills your process; some swapping might not hurt much compared to that.
If malloc happens to return NULL when one tries to allocate only a small amount of memory, then one must be cautious when trying to recover from the error condition as any subsequent malloc can fail too, until enough memory is available.
The default C++ operator new is often a wrapper over the same allocation mechanisms employed by malloc().
On Linux, you can indeed not rely on malloc returning NULL if sufficient memory is not available due to the kernel's overallocation strategy, but you should still check for it because in some circumstances malloc will return NULL, e.g. when you ask for more memory than is available in the machine in total. The Linux malloc(3) manpage calls the overallocation "a really bad bug" and contains advice on how to turn it off.
I've never heard about this behavior also occurring in other Unix variants.
As for the "spasms of paging", that depends on the machine setup. E.g., I tend not to setup a swap partition on laptop Linux installations, since the exact behavior you fear might kill the hard disk. I would still like the C/C++ programs that I run to check malloc return values, give appropriate error messages and when possible clean up after themselves.
Checking the return of malloc doesn't, by itself, do much to make your allocations safer or less error prone. It can even be a trap if this is the only test that you implement.
When called with an argument of 0, the standard allows malloc to return a sort of unique address, which is not a null pointer but which you nevertheless don't have the right to access. So if you just test whether the return is 0 but don't test the arguments to malloc, calloc or realloc, you might encounter a segfault much later.
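A small demonstration of that trap (a sketch; the zero size stands in for a miscomputed count):
#include <stdlib.h>

int main(void)
{
    size_t n = 0;           /* e.g. a miscomputed element count */
    char *p = malloc(n);    /* may be NULL or a unique non-null pointer */
    if (p != NULL) {
        /* the NULL check passed, yet p[0] would already be out of
           bounds: the pointer must not be dereferenced */
    }
    free(p);                /* free(NULL) is a no-op, so this is safe */
    return 0;
}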
This error condition (memory exhausted) is quite rare in "hosted" environments. Usually you are in trouble long before you hassle with this kind of error. (But if you are writing runtime libraries, are a kernel hacker or rocket builder this is different, and there the test makes perfect sense.)
People then tend to decorate their code with complicated captures of that error condition that span several lines, doing perror and stuff like that, that can have an impact on the readability of the code.
I think that this "check the return of malloc" is much overestimated, sometimes even defended quite dogmatically. Other things are much more important:
always initialize variables, always; for pointer variables this is crucial,
let the program crash nicely before things get too bad; uninitialized pointer members in structs are an important cause of errors that are difficult to find,
always check the arguments to malloc and co.; if the size is a compile-time constant like sizeof toto there can't be a problem, but
always ensure that your vector allocation handles the zero case properly.
An easy way to check the return of malloc is to wrap it in something like memset(malloc(n), 0, 1). This just writes a 0 to the first byte and crashes nicely if malloc had an error or n was 0 to start with.
To view this from an alternative point of view:
"malloc can return a non-NULL pointer even if the memory is not actually available" does not mean that it always returns non-NULL. There might (and will) be cases where NULL is returned (as others already said), so this check is necessary nevertheless.

Wrapping calls to malloc()/realloc()... is this a good idea?

For an assignment, I need to allocate a dynamic buffer, using malloc() for the initial buffer and realloc() to expand that buffer if needed. Everywhere I use (re|m)alloc(), the code looks like the following:
char *buffer = malloc(size);
if (buffer == NULL) {
    perror("malloc");
    exit(EXIT_FAILURE);
}
The program only reads data from a file and outputs it, so I thought just exiting the program when (re|m)alloc fails would be a good idea. Now, the real question is:
Would it be beneficial to wrap the calls, e.g. like this?
void *Malloc(size_t size) {
    void *buffer = malloc(size);
    if (buffer == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    return buffer;
}
Or is this a bad idea?
It's a bad idea in the form presented, because in anything other than a trivial program written for an assignment, you want to do something more useful/graceful than bailing out. So best not to get into bad habits. This isn't to say that wrappers around allocation are bad per se (centralizing error handling can be a good thing), just that a wrapper that you don't check the return value of (e.g., that doesn't return at all on failure) is a bad idea, barring your having provided some mechanism for allowing code to hook into the bail-out logic.
If you do want to do it in the form you've presented, I'd strongly recommend using a more clearly different name than Malloc instead of malloc. Like malloc_or_die. :-)
In most instances, the attempt to use a null pointer will crash the program soon enough anyway, and it will make it easier to debug, since you get a nice core dump to wallow around in, which you don't get if you call exit().
The only advice I'd give is to dereference the returned pointer as soon as possible after allocating it, even if only gratuitously, so that the core dump can lead you straight to the faulty malloc call.
There is rarely much you can do to recover from memory exhaustion, so exiting is usually the right thing to do. But do it in such a way as to make the post-mortem easier.
Just to be clear, though, memory exhaustion usually occurs long after the OS is crippled by page-swapping activity. This strategy is really only useful to catch ridiculous allocations, like trying to malloc(a_small_negative_number) due to a bug.
Should I bother detecting OOM (out of memory) errors in my C code?
That's my answer to a similar question. In summary, I'm in favor of designing apps so they recover from any kind of crashes, and then treat out-of-memory as a reason to crash.
In your case it's OK. Just remember to give a message with the reason for the premature exit, and it would be nice to specify the line number. Something like this:
void* malloc2(size_t size, int line_num){
    void *buffer = malloc(size);
    if (buffer == NULL) {
        fprintf(stderr, "ERROR: cannot alloc for line %d\n", line_num);
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    return buffer;
}
#define Malloc(n) malloc2((n), __LINE__)
EDIT: as others mentioned, it's not a good habit for an experienced programmer, but for a beginner who has difficulties even with keeping track of the program flow in the "happy" case, it's OK.
The ideas that "checking for malloc for failure is useless because of overcommit" or that "the OS will already be crippled by the time malloc fails" are seriously outdated. Robust operating systems have never overcommitted memory, and historically-not-so-robust ones (like Linux) nowadays have easy ways to disable overcommit and protect against the OS becoming crippled due to memory exhaustion - as long as apps do their part not to crash and burn when malloc fails!
There are many reasons malloc can fail on a modern system:
Insufficient physical resources to instantiate the memory.
Virtual address space exhausted, even when there is plenty of physical memory free. This can happen easily on a 32-bit machine (or 32-bit userspace) with more than 4GB of RAM+swap.
Memory fragmentation. If your allocation patterns are very bad, you could end up with 4 million 16-byte chunks spaced evenly 1000 bytes apart, and unable to satisfy a malloc(1024) call.
How you deal with memory exhaustion depends on the nature of your program.
Certainly from a standpoint of the system's health as a whole, it's nice for your program to die. That reduces resource starvation and may allow other apps to keep running. On the other hand, the user will be very upset if that means losing hours of work editing a video, typing a paper, drafting a blog post, coding, etc. Or they could be happy if their mp3 player suddenly dying with out-of-memory means their disk stops thrashing and they're able to get back to their word processor and click "save".
As for OP's original question, I would strongly advise against writing malloc wrappers that die on failure, or writing code that just assumes it will segfault immediately upon using the null pointer if malloc failed. It's an easy bad habit to get into, and once you've written code that's full of unchecked allocations, it will be impossible to later reuse that code in any program where robustness matters.
A much better solution is just to keep returning failure to the calling function, and let the calling function return failure to its calling function, etc., until you get all the way back to main or similar, where you can write if (failure) exit(1);. This way, the code is immediately reusable in other situations where you might actually want to check for errors and take some kind of recovery steps to free up memory, save/dump valuable data to disk, etc.
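A sketch of that structure (load_data is a hypothetical function):
#include <stdio.h>
#include <stdlib.h>

int load_data(char **out, size_t n)
{
    char *buf = malloc(n);
    if (buf == NULL)
        return -1;        /* report failure to the caller */
    /* ... fill buf ... */
    *out = buf;
    return 0;
}

int main(void)
{
    char *data;
    if (load_data(&data, 1 << 20) != 0) {
        fprintf(stderr, "out of memory\n");
        return 1;         /* only main decides to terminate */
    }
    /* ... use data ... */
    free(data);
    return 0;
}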
I think that it is a bad idea, first because checking the return of malloc doesn't buy you much on modern systems, and second because it gives you a false sense of security that when you use such a call, all your allocations are fine.
(I am supposing you are writing for a hosted environment and not embedded, standalone.)
Modern systems with a large virtual address space will just never return (void*)0 from malloc or realloc except, maybe, if the arguments were bogus. You will encounter problems much, much later, when your system starts to swap like crazy or even runs out of swap.
So no, don't check the return of these functions; it doesn't make much sense. Instead, check the arguments to malloc against 0 (and for realloc, whether both are 0 simultaneously) with an assertion, since then the problem is not inside malloc or realloc but in the way you are calling them.
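For example, such assertions might look like this (a sketch; the wrapper names are illustrative):
#include <assert.h>
#include <stdlib.h>

void *checked_arg_malloc(size_t n)
{
    assert(n != 0);                  /* catch bogus sizes at the source */
    return malloc(n);
}

void *checked_arg_realloc(void *p, size_t n)
{
    assert(!(p == NULL && n == 0));  /* both null and 0 at once is suspicious */
    return realloc(p, n);
}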
