How to handle error conditions in a void function - c

I'm making a data structures and algorithms library in C for learning purposes (so this doesn't necessarily have to be bullet-proof), and I'm wondering how void functions should handle errors on preconditions. If I have a function for destroying a list as follows:
void List_destroy(List* list) {
    /*
        ...
        free()'ing pointers in the list. Nothing to return.
        ...
    */
}
This has the precondition that list != NULL; otherwise the function will blow up in the caller's face with a segfault.
So as far as I can tell I have a few options: one, I throw in an assert() statement to check the precondition, but that means the function would still blow up in the caller's face (which, as far as I have been told, is a big no-no when it comes to libraries), but at least I could provide an error message; or two, I check the precondition, and if it fails I jump to an error block and just return;, silently chugging along, but then the caller doesn't know the List* was NULL.
Neither of these options seems particularly appealing. Moreover, implementing a return value for a simple destroy() function seems like it should be unnecessary.
EDIT: Thank you everyone. I settled on implementing (in all my basic list functions, actually) consistent behavior for NULL List* pointers being passed to the functions. All the functions jump to an error block, report an error message to stderr along the lines of "Cannot destroy NULL list." (or push, or pop, or whatever), and exit(1). I reasoned that there's really no sensible reason why a caller should be passing NULL List* pointers anyway, and if they didn't know they were, then by all means I should probably let them know.
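For concreteness, a minimal sketch of that settled behavior (the opaque List type and the exact message are just placeholders taken from the question):
#include <stdio.h>
#include <stdlib.h>

typedef struct List List;   /* the question's (opaque) list type */

void List_destroy(List* list) {
    if (list == NULL) {
        fprintf(stderr, "Cannot destroy NULL list.\n");  /* report, then bail out */
        exit(1);
    }
    /* ... free()'ing pointers in the list ... */
}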

Destructors (in the abstract sense, not the C++ sense) should indeed never fail, no matter what. Consistent with this, free is specified to return without doing anything if passed a null pointer. Therefore, I would consider it reasonable for your List_destroy to do the same.
However, a prompt crash would also be reasonable, because in general the expectation is that C library functions crash when handed invalid pointers. If you take this option, you should crash by going ahead and dereferencing the pointer and letting the kernel fire a SIGSEGV, not by assert, because assert has a different crash signature.
Absolutely do not change the function signature so that it can potentially return a failure code. That is the mistake made by the authors of close() for which we are still paying 40 years later.
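Concretely, a free-style List_destroy might look something like this (the node layout here is invented, since the question doesn't show one):
#include <stdlib.h>

typedef struct Node { struct Node* next; void* data; } Node;  /* assumed layout */
typedef struct List { Node* head; } List;                     /* assumed layout */

void List_destroy(List* list) {
    if (list == NULL)          /* mirror free(NULL): quietly do nothing */
        return;
    Node* node = list->head;
    while (node != NULL) {
        Node* next = node->next;
        free(node->data);      /* assumes the list owns its elements */
        free(node);
        node = next;
    }
    free(list);
}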

Generally, you have several options if a constraint of one of your functions is violated:
Do nothing, successfully
Return some value indicating failure (or set something pointed-to by an argument to some error code)
Crash randomly (i.e. introduce undefined behaviour)
Crash reliably (i.e. use assert or call abort or exit or the like)
Here (though this is my personal opinion) is a good rule of thumb:
the first option is the right choice if you think it's OK not to obey the constraints (i.e. they aren't real constraints); a good example of this is free.
the second option is the right choice if the caller can't know in advance whether the call will succeed; a good example is fopen.
the third and fourth options are a good choice if the former two don't apply. A good example is memcpy. I prefer assert (a form of the fourth option) because it gives you both, depending on whether the code is compiled with NDEBUG defined: a reliable crash for people who are unwilling to read your documentation, and undefined behaviour for people who do read it (they will avoid it by obeying your constraints). Dereferencing a pointer argument can serve as an assert of sorts, because it will make your program crash (which is the right thing; people not reading your documentation should crash as early as possible) if they pass an invalid pointer.
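For illustration, an assert-guarded function along the lines of memcpy (the name and signature are made up): with a debug build a bad call dies reliably at the assert, and with NDEBUG defined the assert disappears and the bad call is plain undefined behaviour:
#include <assert.h>
#include <stddef.h>

/* Documented constraint: dst and src must be valid, non-overlapping buffers. */
void copy_ints(int* dst, const int* src, size_t n) {
    assert(dst != NULL && src != NULL);   /* reliable crash in debug builds */
    for (size_t i = 0; i < n; i++)        /* with NDEBUG: plain UB on bad pointers */
        dst[i] = src[i];
}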
So, in your case, I would make it similar to free and would succeed without doing anything.
HTH

If you do not wish to return any value from the function, then it is a good idea to add one more argument for an error code.
void List_destroy(List* list, int* errCode) {
    if (list == NULL) { *errCode = -1; return; }  /* report the bad argument to the caller */
    /* ... free()'ing pointers in the list ... */
    *errCode = 0;
}
Edit:
Changed & to * as question is tagged for C.
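A caller of this variant would then check the out-parameter, roughly like so (List_create is a hypothetical constructor, and the error values follow the sketch above):
#include <stdio.h>

int main(void) {
    int err = 0;
    List* mylist = List_create();        /* hypothetical constructor */
    List_destroy(mylist, &err);
    if (err != 0)
        fprintf(stderr, "List_destroy reported error %d\n", err);
    return 0;
}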

I would say that simply returning when the list is NULL would make sense, as this would indicate that the list is empty (not an error condition). If the list is an invalid pointer, you can't detect that; let the kernel handle it for you by raising a segfault and let the programmer fix it.

Related

The need of checking a pointer returned by malloc in C

Suppose I have the following line in my code:
struct info *pinfo = malloc(sizeof(struct info));
Usually there is another line of code like this one:
if (!pinfo)
<handle this error>
But is it really worth it? Especially if the object is so small that the code generated to check it might need more memory than the object itself.
It's true that running out of memory is rare, especially for little test programs that are only allocating tens of bytes of memory, especially on modern systems that have many gigabytes of memory available.
Yet malloc failures are very common, especially for little test programs.
malloc can fail for two reasons:
There's not enough memory to allocate.
malloc detects that the memory-allocation heap is messed up, perhaps because you did something wrong with one of your previous memory allocations.
Now, it turns out that #2 happens all the time.
And, it turns out that #1 is pretty common, too, although not because there's not enough memory to satisfy the allocation the programmer meant to do, but because the programmer accidentally passed a preposterously huge number to malloc, accidentally asking for more memory than there is in the known universe.
So, yes, it turns out that checking for malloc failure is a really good idea, even though it seems like malloc "can't fail".
The other thing to think about is, what if you take the shortcut and don't check for malloc failure? If you sail along and use the null pointer that malloc gave you instead, that'll cause your program to immediately crash, and that'll alert you to your problem just as well as an "out of memory" message would have, without your having to wear your fingers to the bone typing if(!pinfo) and fprintf(stderr, "out of memory\n"), right?
Well, no.
Depending on what your program accidentally does with the null pointer, it's possible it won't crash right away. Anyway, the crash you get, with a message like "Segmentation violation - core dumped" doesn't tell you much, doesn't tell you where your problem is. You can get segmentation violations for all sorts of reasons (especially in little test programs, especially if you're a beginner not quite sure what you're doing). You can spend hours in a futile effort to figure out why your program is crashing, without realizing it's because malloc is returning a null pointer. So, definitely, you should always check for malloc failure, even in the tiniest test programs.
Deciding which errors to test for, versus those that "can't happen" or for whatever reason aren't worth catching, is a hard problem in general. It can take a fair amount of experience to know what is and isn't worth checking for. But, truly, anybody who's programmed in C for very long can tell you emphatically: malloc failure is definitely worth checking for.
If your program is calling malloc all over the place, checking each and every call can be a real nuisance. So a popular strategy is to use a malloc wrapper:
#include <stdio.h>    /* fprintf */
#include <stdlib.h>   /* malloc, exit */
#include <string.h>   /* strerror */
#include <errno.h>    /* errno */

void *my_malloc(size_t n)
{
    void *ret = malloc(n);
    if (ret == NULL) {
        fprintf(stderr, "malloc failed (%s)\n", strerror(errno));
        exit(1);
    }
    return ret;
}
There are three ways of thinking about this function:
Whenever you have some processing that you're doing repetitively, all over the place (in this case, checking for malloc failure), see if you can move it off to (centralize it in) a single function, like this.
Unlike malloc, my_malloc can't fail. It never returns a null pointer. It's almost magic. You can call it whenever and wherever you want, and you never have to check its return value. It lets you pretend that you never have to worry about running out of memory (which was sort of the goal all along).
Like any magical result, my_malloc's benefit — that it never seems to fail — comes at a price. If the underlying malloc fails, my_malloc summarily exits (since it can't return in that case), meaning that the rest of your program doesn't get a chance to clean up. If the program were, say, a text editor, and whenever it had a little error it printed "out of memory" and then basically threw away the file the user had been editing for the last hour, the user might not be too pleased. So you can't use the simple my_malloc trick in production programs that might lose data. But it's a huge convenience for programs that don't have to worry about that sort of thing.
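Typical call-site use, assuming the my_malloc wrapper above is in scope (the struct info payload here is just an example):
struct info { int id; double value; };   /* example payload type */

void add_record(double v) {
    struct info *pinfo = my_malloc(sizeof *pinfo);   /* no if (!pinfo) needed here */
    pinfo->id = 0;
    pinfo->value = v;
    /* ... store pinfo somewhere and free() it later ... */
}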
If malloc fails then chances are the system is out of memory or it's something else your program can't handle. It should abort immediately and at most log some diagnostics. Not handling NULL from malloc will make you end up in undefined behavior land. One might argue that having to abort because of a failure of malloc is already catastrophic but just letting it exhibit UB falls under a worse category.
But what if the malloc fails? You will dereference the NULL pointer, which is UB (undefined behaviour) and your program will (probably) fail!
Sometimes code which checks the correctness of the data is longer than the code which does something with it :).
This is very simple: if you don't check for NULL you might end up with a runtime error. Checking for NULL will help you avoid an unexpected crash and handle the error case gracefully.
If you just want to quickly test some algorithm, then fine, but know it can fail. For example run it in the debugger.
When you include it in your Real World Program, then add all the error checking and handling needed.

Best practice for handling input checks for a void function

This may be a huge noob question, but I am relatively new to C and to using 'assert'.
Say I'm building a large program and have a void function test() which takes in an array and performs some manipulation to the array.
Now, as I build this program, I'll want to make sure that all my inputs for my functions are valid, so I want to make sure the array passed into test() is valid (i.e. not null let's say).
I can write something like:
if (array == NULL) return;
However, when I'm testing and it just returns, it becomes hard to know if my method succeeded at manipulating my array unless I check the array itself. Is it normal practice to add an assert in this case to ensure my condition, for my own debugging purposes? I've heard that assert is not compiled into production code, so the assert would only be there to help me, the programmer, test and debug. It seems kind of weird to have both an if statement and an assert, but I don't see how the if statement could quickly let me know if my test method succeeded, and I don't see how assert could be a valid check for production code. So it seems like they're both needed?
If the contract of your function is that it requires a valid pointer, the best possible behavior is to crash loudly when a null or otherwise invalid pointer is passed. You can't test the validity of a pointer in general, but in the case of null pointers, dereferencing them will crash on most systems anyway. An assert would be an appropriate way of documenting this and ensuring a crash (unless NDEBUG is defined) to aid in diagnosing usage errors.
Changing your function to return an error status is not a good idea. It complicates the interface and lets the contract violation go unnoticed until later (or not at all if the caller does not check the return value).
You have the basic ideas.
Asserts are used to ensure some condition never occurs. The assert you've indicated ( assert( a != NULL ) ) would be used if it is not valid to call the function where a could be NULL.
As such, testing if ( a == NULL ) would make no sense. The assert indicates you consider this invalid. If you've significantly tested your code with the assert, you've "proven" (at least that's the idea) that you never call the function with a as NULL.
However, if by design, you intend that the function should gracefully ignore a when it is null, and do nothing, then a test is more appropriate. The assert would not be, because if you intend that the function is to be called with a null for some PURPOSE in mind, you don't need to be informed when that happens by having the debug mode trigger a breakpoint.
Which is to say, it makes little sense to combine the two.
A more "advanced" usage may check other values for validity. Say I have some class which processes bitmaps, and the bitmap is "owned" by the class, dynamically allocated and deleted appropriately. If I'm going to call any function in the class that performs operations on the bitmap, I must know it could never be NULL. Asserts would be appropriate to test the member pointer storing the bitmap data to be sure those functions aren't being called when the pointer IS null.
This is key to using asserts. You're attempting to prove the condition never occurs during debugging. That's the basic concept. As you're grappling with the possibility that it may be valid for such a value TO BE null, and still otherwise operate gracefully, you may find the combination of asserts AND tests to be reasonable. That is, you want to avoid crashes if other users of your code happen to make a call when the value IS null, but still not crash in production code if it happens to BE null.
Sometimes that's a performance hit you don't want to accept, and so you fire asserts so that consumers of your code know they're doing something you declare should never be done.
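A sketch of that combination, with invented function and parameter names: debug builds trap the bad call via assert, and release builds (compiled with NDEBUG) fail soft via the plain check:
#include <assert.h>
#include <stddef.h>

void test(int *array, size_t len) {
    assert(array != NULL);     /* debug builds: stop here and point at the caller */
    if (array == NULL)         /* release builds (NDEBUG): ignore the call gracefully */
        return;
    for (size_t i = 0; i < len; i++)
        array[i] *= 2;         /* some manipulation of the array */
}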

Should I really worry about fixing null derefs if there are no crashes?

Clang's scan-build reports quite a few dereferences of null pointers in my project; however, I don't really see any unusual behavior (in 6 years of using it), i.e.:
Dereference of null pointer (loaded from variable chan)
char *tmp;
CList *chan = NULL;
/* This is weird because chan is set via do_lookup so why could it be NULL? */
chan = do_lookup(who, me, UNLINK);
if (chan)
    tmp = do_lookup2(you, me, 0);
prot(get_sec_var(chan->zsets));
                 ^^^^
I know null derefs can cause crashes, but is this really as big a security concern as some people make it out to be? What should I do in this case?
It is Undefined Behavior to dereference a NULL pointer. It can show any behavior, it might crash or not but you MUST fix those!
The truth about Undefined Behavior is that it obeys Murphy's Law
"Anything that can go wrong will go wrong"
It makes no sense to check chan for NULL at one point:
if (chan)
    tmp = do_lookup2(you,me,0);     /* not evaluated if `chan` is NULL */
prot(get_sec_var(chan->zsets));     /* will be evaluated in any case */
... yet NOT checking it right at the next line.
Don't you have to execute both these statements within the if branch?
Clang is warning you because you check for chan being NULL, and then you unconditionally dereference it in the next line anyway. This cannot possibly be correct. Either do_lookup cannot return NULL, then the check is useless and should be removed. Or it can, then the last line can cause undefined behaviour and MUST be fixed. Als is 100% correct: NULL pointer dereferences are undefined behaviour and are always a potential risk.
Probably you want to enclose your code in a block, so that all of it is governed by the check for NULL, and not just the next line.
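Applied to the snippet from the question, that might look like this (assuming both statements are only meaningful when chan is non-NULL):
chan = do_lookup(who, me, UNLINK);
if (chan) {
    tmp = do_lookup2(you, me, 0);
    prot(get_sec_var(chan->zsets));   /* now only reached when chan is valid */
}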
You have to fix these as soon as possible. Or probably sooner. The Standard says that the NULL pointer is a pointer that points to "no valid memory location", so dereferencing it is undefined behaviour. It means that it may work, it may crash, and it may do strange things at other parts of your program, or maybe cause daemons to fly out of your nose.
Fix them. Now.
Here's how: put the dereference statement inside the if - doing otherwise (as you do: checking for NULL then dereferencing anyways) makes no sense.
if (pointer != NULL) {
    something = pointer->field;
}
^^ this is good practice.
If you have never experienced problems with this code, it's probably because:
do_lookup(who, me, UNLINK);
always returns a valid pointer.
But what will happen if this function changes? Or its parameters vary?
You definitely have to check for NULL pointers before dereferencing them.
if (chan)
    prot(get_sec_var(chan->zsets));
If you are absolutely sure that neither do_lookup nor its parameters will change in the future (and you can bet the safe execution of your program on it), and the cost of changing all the occurrences of similar functions is excessively high compared to the benefits you would gain in doing so, then:
you may decide to leave your code broken.
Many programmers did that in the past, and many more will do that in the future. Otherwise what would explain the existence of Windows ME?
If your program crashes because of a NULL pointer dereference, this can be classified as a Denial of Service (DoS).
If this program is used together with other programs (e.g. they invoke it), the security aspects now start to depend on what those other programs do when this one crashes. The overall effect can be the same DoS or something worse (exploitation, sensitive info leakage, and so on).
If your program does not crash because of a NULL pointer dereference and instead continues running while corrupting itself and possibly the OS and/or other programs within the same address space, you can have a whole spectrum of security issues.
Don't put on the line (or online) broken code, unless you can afford dealing with consequences of potential hacking.

Should my library handle SIGSEGV on bad pointer input?

I'm writing a small library that takes a FILE * pointer as input.
If I immediately check this FILE * pointer and find it leads to a segfault, is it more correct to handle the signal, set errno, and exit gracefully; or to do nothing and use the caller's installed signal handler, if he has one?
The prevailing wisdom seems to be "libraries should never cause a crash." But my thinking is that, since this particular signal is certainly the caller's fault, I shouldn't attempt to hide that information from him. He may have his own handler installed to react to the problem in his own way. The same information CAN be retrieved with errno, but the default disposition for SIGSEGV was set for a good reason, and passing the signal up respects this philosophy by either forcing the caller to handle his errors, or by crashing and protecting him from further damage.
Would you agree with this analysis, or do you see some compelling reason to handle SIGSEGV in this situation?
Taking over signal handlers is not a library's business; I'd say it's somewhat offensive of them unless explicitly asked for. To minimize crashes, a library may validate its input to some extent. Beyond that: garbage in, garbage out.
The prevailing wisdom seems to be "libraries should never cause a crash."
I don't know where you got that from - if they pass an invalid pointer, you should crash. Any library will.
I would consider it reasonable to check for the special case of a NULL pointer. But beyond that, if they pass junk, they violated the function's contract and they get a crash.
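In code, that policy might look roughly like this; the function name and the errno/return-value convention are invented for the example:
#include <errno.h>
#include <stdio.h>

/* Hypothetical library entry point taking a FILE* from the caller. */
int mylib_read_header(FILE *fp) {
    if (fp == NULL) {        /* the one special case that is cheap to detect */
        errno = EINVAL;
        return -1;
    }
    /* Anything else invalid (closed, freed, garbage) is the caller's
       contract violation and will simply crash when fp is used. */
    return fgetc(fp);
}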
This is a subjective question, and possibly not fit for SO, but I will present my opinion:
Think about it this way: If you have a function that takes a nul-terminated char * string and is documented as such, and the caller passes a string without the nul terminator, should you catch the signal and slap the caller on the wrist? Or should you let it crash and make the bad programmer using your API fix his/her code?
If your code takes a FILE * pointer, and your documentation says "pass any open FILE *", and they pass a closed or invalidated FILE * object, they've broken the contract. Checking for this case would slow down the code of people who properly use your library to accommodate people who don't, whereas letting it crash will keep the code as fast as possible for the people who read the documentation and write good code.
Do you expect someone who passes an invalid FILE * pointer to check for and correctly handle an error? Or are they more likely to blindly carry on, causing another crash later, in which case handling this crash may just disguise the error?
Kernels shouldn't crash if you feed them a bad pointer, but libraries probably should. That doesn't mean you should do no error checking; a good program dies immediately in the face of unreasonably bad data. I'd much rather a library call bail with assert(f != NULL) than to just trundle on and eventually dereference the NULL pointer.
Sorry, but people who say a library should crash are just being lazy (perhaps in consideration time, as well as development efforts). Libraries are collections of functions. Library code should not "just crash" any more than other functions in your software should "just crash".
Granted, libraries may have some issues around how to pass errors across the API boundary, if multiple languages or (relatively) exotic language features like exceptions would normally be involved, but there's nothing TOO special about that. Really, it's just part of the burden of writing libraries, as opposed to in-application code.
Except where you really can't justify the overhead, every interface between systems should implement sanity checking, or better, design by contract, to prevent security issues, as well as bugs.
There are a number of ways to handle this. What you should probably do, in order of preference, is one of the following:
Use a language that supports exceptions (or better, design by contract) within libraries, and throw an exception on or allow the contract to fail.
Provide an error handling signal/slot or hook/callback mechanism, and call any registered handlers. Require that, when your library is initialised, at least one error handler is registered.
Support returning some error code in every function that could possibly fail, for any reason. But this is the old, relatively insane way of doing things from C (as opposed to C++) days.
Set some global "an error has occurred" flag, and allow clearing that flag before calls. This is also old, and completely insane, mostly because it moves the error-status maintenance burden to the caller, AND is unsafe when it comes to threading.
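As a rough illustration of the second option above, a callback-style error hook in C could look like this (every name here is invented for the sketch):
#include <stdio.h>
#include <stdlib.h>

typedef void (*lib_error_fn)(const char *msg);  /* user-supplied error handler */
static lib_error_fn lib_on_error = NULL;

void lib_set_error_handler(lib_error_fn fn) {
    lib_on_error = fn;
}

static void lib_fail(const char *msg) {
    if (lib_on_error)
        lib_on_error(msg);   /* let the application decide what to do */
    else
        abort();             /* nothing registered: fail loudly */
}

int lib_read_config(FILE *fp) {
    if (fp == NULL) {
        lib_fail("lib_read_config: fp is NULL");
        return -1;
    }
    /* ... real work ... */
    return 0;
}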

checking for NULL before calling free

A lot of C code that frees pointers calls:
if (p)
    free(p);
But why? I thought the C standard says the free function doesn't do anything when given a NULL pointer. So why the extra explicit check?
The construct:
free(NULL);
has always been OK in C, back to the original UNIX compiler written by Dennis Ritchie. Pre-standardisation, some poor compilers might not have fielded it correctly, but these days any compiler that does not handle it cannot legitimately call itself a compiler for the C language. Using it typically leads to clearer, more maintainable code.
As I understand, the no-op on NULL was not always there.
In the bad old days of C (back around 1986, on a pre-ANSI standard cc compiler) free(NULL) would dump core. So most devs tested for NULL/0 before calling free. The world has come a long way, and it appears that we don't need to do the test anymore. But old habits die hard ;)
http://discuss.joelonsoftware.com/default.asp?design.4.194233.15
I tend to write "if (p) free(p)" a lot, even if I know it's not needed.
I partially blame myself because I learned C the old days when free(NULL) would segfault and I still feel uncomfortable not doing it.
But I also blame the C standard for not being consistent. If, for example, fclose(NULL) were well defined, I would have no problem writing:
free(p);
fclose(f);
Which is something that happens very often when cleaning up things.
Unfortunately, it seems strange to me to write
free(p);
if (f) fclose(f);
and I end up with
if (p) free(p);
if (f) fclose(f);
I know, it's not a rational reason but that's my case :)
Compilers, even when inlining, are not smart enough to know the function will return immediately. Pushing parameters etc. on the stack and setting up the call is obviously more expensive than testing a pointer. I think it is always good practice to avoid executing anything, even when that anything is a no-op.
Testing for null is a good practice. An even better practice is to ensure your code does not reach this state and therefore eliminate the need for the test altogether.
There are two distinct reasons why a pointer variable could be NULL:
because the variable is used for what in type theory is called an option type, and holds either a pointer to an object, or NULL to represent nothing,
because it points to an array, and may therefore be NULL if the array has zero length (as malloc(0) is allowed to return NULL, implementation-defined).
Although this is only a logical distinction (in C there are neither option types nor special pointers to arrays and we just use pointers for everything), it should always be made clear how a variable is used.
That the C standard requires free(NULL) to do nothing is the necessary counterpart to the fact that a successful call to malloc(0) may return NULL. It is not meant as a general convenience, which is why for example fclose() does require a non-NULL argument. Abusing the permission to call free(NULL) by passing a NULL that does not represent a zero-length array feels hackish and wrong.
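A small example of that counterpart relationship; whether malloc(0) returns NULL is implementation-defined, so both outcomes have to be fine:
#include <stdlib.h>

int main(void) {
    size_t n = 0;                        /* a zero-length "array" */
    int *arr = malloc(n * sizeof *arr);  /* may be NULL, may be a unique pointer */
    /* ... use the n elements here, i.e. none ... */
    free(arr);                           /* fine either way: free(NULL) is a no-op */
    return 0;
}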
If you rely on free(0) being okay, and it's normal for your pointer to be null at this point, please say so in a comment: // may be NULL
This may be merely self-explanatory code, saying yes I know, I also use p as a flag.
There can be a custom implementation of free() in a mobile environment. In that case free(0) can cause a problem. (Yeah, bad implementation.)
if (p)
    free(p);
why another explicit check?
If I write something like that, it's to convey the specific knowledge that the pointer may be NULL...to assist in readability and code comprehension. Because it looks a bit weird to make that an assert:
assert(p || !p);
free(p);
(Beyond looking strange, compilers are known to complain about "condition always true" if you turn your warnings up in many such cases.)
So I see it as good practice, if it's not clear from the context.
The converse case, of a pointer being expected to be non null, is usually evident from the previous lines of code:
...
Unhinge_Widgets(p->widgets);
free(p); // why `assert(p)`...you just dereferenced it!
...
But if it's non-obvious, having the assert may be worth the characters typed.
