I'm trying to write a function in C to solve a math problem. In that function, there are several steps, and each step needs to allocate some memory with the size depending on the calculation results in previous steps (so I can't allocate them all at the beginning of the function). The pseudo code looks like:
int func(){
int *p1, *p2, *p3, *p4;
...
p1 = malloc(...);
if(!p1){
return -1; //fail in step 1
}
...
p2 = malloc(...);
if(!p2){
free(p1);
return -2; //fail in step 2
}
...
p3 = malloc(...);
if(!p3){
free(p1);
free(p2);
return -3; //fail in step 3
}
...
p4 = malloc(...);
if(!p4){
free(p1);
free(p2);
free(p3); /* I have to write too many "free"s here! */
return -4; //fail in step 4
}
...
free(p1);
free(p2);
free(p3);
free(p4);
return 0; //normal exit
}
The above way to handle malloc failures is so ugly. Thus, I do it in the following way:
int func(){
int *p1=NULL, *p2=NULL, *p3=NULL, *p4=NULL;
int retCode=0;
...
/* other "malloc"s and "if" blocks here */
...
p3 = malloc(...);
if(!p3){
retCode = -3; //fail in step 3
goto FREE_ALL_EXIT;
}
...
p4 = malloc(...);
if(!p4){
retCode = -4; //fail in step 4
goto FREE_ALL_EXIT;
}
...
FREE_ALL_EXIT:
free(p1);
free(p2);
free(p3);
free(p4);
return retCode; //normal exit
}
Although I believe it's now more concise, clear, and beautiful, my teammate is still strongly against the use of 'goto'. He suggested the following method:
int func(){
int *p1=NULL, *p2=NULL, *p3=NULL, *p4=NULL;
int retCode=0;
...
do{
/* other "malloc"s and "if" blocks here */
p4 = malloc(...);
if(!p4){
retCode = -4; //fail in step 4
break;
}
...
}while(0);
free(p1);
free(p2);
free(p3);
free(p4);
return retCode; //normal exit
}
Hmmm, this does avoid the use of 'goto', but it adds a level of indentation, which makes the code ugly.
So my question is, is there any other method to handle many 'malloc' failures in a good code style? Thank you all.
goto in this case is legitimate. I see no particular advantage to the do{}while(0) block, as it's less obvious what pattern it is following.
First of all, there's nothing wrong with goto; this is a perfectly legitimate use of it. The do { ... } while(0) with break statements is just goto in disguise, and it only serves to obfuscate the code. Gotos are really the best solution in this case.
Another option is to put a wrapper around malloc (e.g. call it xmalloc) which kills the program if malloc fails. For example:
void *xmalloc(size_t size)
{
void *mem = malloc(size);
if(mem == NULL)
{
fprintf(stderr, "Out of memory trying to malloc %zu bytes!\n", size);
abort();
}
return mem;
}
Then use xmalloc everywhere in place of malloc, and you no longer need to check the return value, since it will return a valid pointer if it returns at all. But of course, this is only usable if you want allocation failures to be an unrecoverable failure. If you want to be able to recover, then you really do need to check the result of every allocation (though honestly, you'll probably have another failure very soon after).
Ask your teammate how he would re-write this sort of code:
if (!grabResource1()) goto res1failed;
if (!grabResource2()) goto res2failed;
if (!grabResource3()) goto res3failed;
(do stuff)
res3failed:
releaseResource2();
res2failed:
releaseResource1();
res1failed:
return;
And ask how he would generalize it to n resources. (Here, "grabbing a resource" could mean locking a mutex, opening a file, allocating memory, etc. The "free on NULL is OK" hack does not solve everything...)
Here, the alternative to goto is to create a chain of nested functions: Grab a resource, call a function that grabs another resource and calls another function that grabs a resource and calls another function... When a function fails, its caller can free its resource and return failure, so the releasing happens as the stack unwinds. But do you really think this is easier to read than the gotos?
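For comparison, here is a minimal sketch of that nested-function chain. It assumes three malloc'd buffers stand in for the resources; step2, step3 and run_nested are made-up names, not anything from the question.

```c
#include <stdlib.h>
#include <string.h>

/* Innermost level: grabs the third resource and does the real work. */
static int step3(char *buf1, char *buf2)
{
    char *buf3 = malloc(64);
    if (!buf3)
        return -3;              /* nothing of ours to release yet */
    /* (do stuff) with all three buffers */
    memset(buf1, 0, 16);
    memset(buf2, 0, 32);
    memset(buf3, 0, 64);
    free(buf3);
    return 0;
}

/* Middle level: owns exactly one resource, then delegates. */
static int step2(char *buf1)
{
    char *buf2 = malloc(32);
    if (!buf2)
        return -2;
    int rc = step3(buf1, buf2); /* a failure below unwinds through here */
    free(buf2);                 /* released on success and on failure */
    return rc;
}

/* Outermost level: the function callers actually use. */
int run_nested(void)
{
    char *buf1 = malloc(16);
    if (!buf1)
        return -1;
    int rc = step2(buf1);
    free(buf1);
    return rc;
}
```

Each function releases only what it grabbed, so the cleanup order is correct by construction, but the logic is now spread over one function per resource.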
(Aside: C++ has constructors, destructors, and the RAII idiom to handle this sort of thing. But in C, this is the one case where goto is clearly the right answer, IMO.)
There's nothing wrong with goto in error handling, and there's actually no difference in the generated code between using a do { ... } while(0); with breaks and using goto (both typically compile to jump instructions). I would say that seems normal. One thing you could do that is shorter is create an array of int * and iterate through it while calling malloc. If one fails, free the ones that are non-null and return an error code. This is the cleanest way I can think of, so something like:
int *arr[4] = {NULL}; /* start with all pointers NULL */
int retCode = 0;
unsigned int i;
for (i = 0; i < 4; ++i)
if (!(arr[i] = malloc(sizeof(int)))) {
retCode = -(int)(i + 1); //or w/e error
break;
}
if (retCode)
for (i = 0; i < 4; i++)
if (arr[i])
free(arr[i]);
else
break;
or something along those lines (used brain compiler for this so I might be wrong)
Not only does this shorten your code but also avoids goto's (which I don't see anything wrong with) so you and your teammate can both be happy :D
David Hanson wrote the book C Interfaces and Implementations: Techniques for Creating Reusable Software. His Mem interface provides functions that are "similar to those in the standard C library, but they don't accept zero sizes and never return null pointers." The source code includes a production implementation and a checking implementation.
He also implements an Arena interface. The Arena interface releases you from the obligation to call free() for every malloc(). Instead, there's just a single call to free the entire arena.
CII source code
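To give an idea of what "a single call to free the entire arena" means, here is a minimal sketch. It is not Hanson's Arena interface: arena, arena_alloc, arena_free_all and func_with_arena are hypothetical names, and every allocation is simply chained onto a list.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Header placed in front of every allocation (hypothetical layout). */
typedef union arena_block {
    union arena_block *next;  /* previously allocated block */
    max_align_t align;        /* keeps the payload suitably aligned */
} arena_block;

typedef struct {
    arena_block *head;        /* most recent allocation, or NULL */
} arena;

/* Returns NULL on failure; the arena stays valid either way. */
void *arena_alloc(arena *a, size_t size)
{
    arena_block *b;

    if (size > SIZE_MAX - sizeof *b)
        return NULL;
    b = malloc(sizeof *b + size);
    if (!b)
        return NULL;
    b->next = a->head;
    a->head = b;
    return b + 1;             /* payload starts right after the header */
}

/* Releases everything that was ever allocated from the arena. */
void arena_free_all(arena *a)
{
    arena_block *b = a->head;
    while (b) {
        arena_block *next = b->next;
        free(b);
        b = next;
    }
    a->head = NULL;
}

/* Two dependent allocations, one cleanup call. */
int func_with_arena(size_t n)
{
    arena a = { NULL };
    int rc = 0;
    int *p1, *p2;

    p1 = arena_alloc(&a, n * sizeof *p1);
    if (!p1) { rc = -1; goto done; }
    memset(p1, 0, n * sizeof *p1);

    p2 = arena_alloc(&a, 2 * n * sizeof *p2);
    if (!p2) { rc = -2; goto done; }
    memset(p2, 0, 2 * n * sizeof *p2);
done:
    arena_free_all(&a);
    return rc;
}
```

The exit path no longer grows with the number of allocations, which is the point of the arena idea; whether a list of blocks or one big region is better depends on the program.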
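If an allocation fails, simply assign the error code as normal. Conditionalize each malloc so that it only runs while no error has occurred, like so:
if (retCode == 0) malloc...
and then at the end of your code, add this:
int * p_array[] = { p1, p2, p3, p4};
/* a failure in step k (retCode == -k) means only the first k-1 pointers were allocated */
for (int x = (retCode < 0 ? -retCode - 1 : 4) - 1; x >= 0; x-- )
{
free(p_array[x]);
}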
There are two main ways; which is better?
Deal with error right now.
int func(){
rv = process_1();
if(!rv){
// deal with error_1
return -1;
}
rv = process_2();
if(!rv){
// deal with error_1
// deal with error_2
return -1;
}
return 0;
}
Deal with errors at go-to. I found a lot of this style of code in the Linux kernel code.
int func(){
rv = process_1();
if(!rv){
goto err_1;
}
rv = process_2();
if(!rv){
goto err_2;
}
return 0;
err_2:
// deal with error_2
err_1:
// deal with error_1
return -1;
}
This is really prone to becoming a flame war, but here is my opinion:
A lot of people will say that goto is inherently evil and that you should never use it.
While I can agree to a certain degree, I can also say that when it comes to cleaning up multiple variables (e.g. with fclose / free / etc.), I find goto to be the cleanest (or at least the most readable) way of doing it.
To be clear, I advise always using the simplest way of handling errors, not always using goto.
For example,
bool MyFunction(void)
{
char *logPathfile = NULL;
FILE *logFile = NULL;
char *msg = NULL;
bool returnValue = false;
logPathfile = malloc(...);
if (!logPathfile) {
// Error message (use possibly perror (3) / strerror (3))
goto END_FUNCTION;
}
sprintf(logPathfile, "%s", "/home/user/exemple.txt");
logFile = fopen(logPathfile, "w");
if (!logFile) {
// Error message (use possibly perror (3) / strerror (3))
goto END_FUNCTION;
}
msg = malloc(...);
if (!msg) {
// Error message (use possibly perror (3) / strerror (3))
goto END_FUNCTION;
}
/* ... other code, with possibly other failure test that end with goto */
// Function's end
returnValue = true;
/* GOTO */END_FUNCTION:
free(logPathfile);
if (logFile) {
fclose(logFile);
}
free(msg);
return returnValue;
}
By using goto to handle the error, you really reduce the risk of a memory leak.
And if in the future you have to add another variable that needs cleaning up, you can add the memory management really simply.
Or if you have to add another test (let's say, for example, that the filename should not begin with "/root/"), you reduce the risk of forgetting to free the memory, because the goto will handle it.
As you said, you can also use this flow structure to add rollback actions.
Depending on the situation, you may not need multiple goto labels, though.
Let's say that in the previous code, if there is an error, we have to delete the created file.
Simply add
/* rollback action */
if (!returnValue) {
if (logPathfile) {
remove(logPathfile);
}
}
right after the goto label, and you're done :)
=============
edit :
The complexity added by using goto is, as far as I know, the following:
Every variable that will be cleaned up, or that is used for the cleanup, has to be initialized.
That should not be a problem, since setting a pointer to a valid value (NULL or other) should always be done when declaring the variable.
For example:
void MyFunction(int nbFile)
{
FILE **array = NULL; /* array of FILE pointers */
size_t size = 0;
array = malloc(nbFile * sizeof(*array));
if (!array) {
// Error message (use possibly perror (3) / strerror (3))
goto END_FUNCTION;
}
for (int i = 0; i < nbFile; ++i) {
array[i] = fopen("/some/path", "w");
if (!array[i]) {
// Error message (use possibly perror (3) / strerror (3))
goto END_FUNCTION;
}
++size;
}
/* ... other code, with possibly other failure test that end with goto */
/* GOTO */END_FUNCTION:
/* We need size to fclose array[i], so size should be initialized */
for (size_t i = 0; i < size; ++i) {
fclose(array[i]);
}
free(array);
}
(Yeah, I know that if I had used calloc instead of malloc, I could have tested array[i] != NULL to know whether I need to fclose, but this is for the sake of the explanation...)
You will probably have to add another variable for the function's return value.
I usually set this variable to indicate failure at the beginning (e.g. setting it to false) and give it its success value just before the goto label.
Sometimes this can seem weird, but it is, in my opinion, still understandable (just add a comment :) )
I'd recommend that you read thoroughly the examples you have found (even more so if they are in the kernel code of an operating system). The situation you describe corresponds to an algorithm that must make a decision at each stage of execution, where a failure at a later stage requires undoing the previous steps.
You first allocate some resource #1, and continue.
Then you allocate another resource (say resource #2); if that fails, you have to free resource #1, as it is no longer needed.
...
Finally you allocate resource #N; if that fails, you must free resources #1 to #N-1.
The pattern you show lets you write the whole set of resource allocations in one straight line, deciding after each one whether to continue.
In this scenario a policy like the following is often recommended to novice C programmers, as it avoids the use of goto, but it becomes less readable because it nests deeper with each step:
if ((res_1 = some_allocation(blablah)) != ERROR_CODE) {
if ((res_2 = some_other_allocation(blablatwo)) != ANOTHER_ERROR_CODE) {
...
if ((res_N = some_N_allocation(blablaN)) != NTH_ERROR_CODE) {
do_what_is_needed();
return_resource_N(res_N); /* free resN */
} else {
do_action_corresponding_to_failed_N(); /* error for failing N */
}
return_resource_N_minus_one(resN_1); /* free resN_1 */
...
} else {
do_action_corresponding_to_failed_2(); /* error for failing #2 */
}
return_resource_1(res_1); /* free #1. (A): (see below) */
} else {
do_action_corresponding_to_failed_1(); /* error for failing #1 */
}
/* there's nothing to undo here, as we have returned the first resource in (A) above. */
There is little to say about this code except that it has no gotos but is far less readable: it's a mess of nested blocks in which, when you fail for resource N, you have to release up to N-1 resources. It is error prone, because you can easily put a deallocation in the wrong position. On the other hand, it allocates and deallocates everything in just one place and is as compact as the code with gotos.
Writing this code with gotos gives this:
if ((res_1 = some_allocation(blablah)) == ERROR_CODE) {
do_action_corresponding_to_failed_1(); /* error for failing #1 */
goto end;
}
if ((res_2 = some_other_allocation(blablatwo)) == ANOTHER_ERROR_CODE) {
do_action_corresponding_to_failed_2(); /* error for failing #2 */
goto res1;
}
...
if ((res_N = some_N_allocation(blablaN)) == NTH_ERROR_CODE) {
do_action_corresponding_to_failed_N(); /* error for failing #N */
goto resN1;
}
do_what_is_needed();
return_resource_N(res_N); /* free resN */
resN1: return_resource_N_minus_one(resN_1); /* free resN_1 */
...
res1: return_resource_1(res_1); /* free #1. (A): (see below) */
end: /* there's nothing to undo here, as we have returned the first resource in (A) above. */
There's only one thing that can be said for the first code: it may perform better on some architectures. Dealing with goto can be a pain for the compiler, as it normally has to make assumptions about all the blocks that may end up jumping to the same label, which makes the code harder to optimize. (With structured blocks the flow is clear, since each block can only be entered from one or two places.) So the goto version may be somewhat slower, though not by much.
You will agree with me that the equivalent goto code you posted is more readable, and probably at exactly the same level of correctness.
Another legitimate use of goto is when you have several nested loops and need to exit more than the innermost one.
for(...) {
for(...) {
...
for (...) {
goto out;
}
...
}
}
out:
This is also specific to C, as some other languages allow you to label the construct you want to exit from and name it in the break statement.
E.g. in Java:
external_loop: for(...) {
for(...) {
...
for (...) {
break external_loop;
}
...
}
}
In this case you don't need to jump, as the break knows how many loops we need to exit.
One last thing to say. With just the while() construct, all other control constructs can be simulated by introducing state variables (e.g. stepping out of each loop by checking a variable used precisely for that). And with even less: if we allow recursive function calls, even the while() loop can be simulated, and optimizers are capable of finding a faster implementation without recursion for the simulated block. Why does nobody in school say never to use if statements because they are evil? Because newbies tend to learn one construct better than the others and then get into the habit of using it everywhere. This happens frequently with goto and not with other constructs, which are more difficult to understand but easier to use once they have been understood.
Using goto for everything (a legacy of languages like assembler and early Fortran) and then maintaining that code normally ends in what is called spaghetti programming. A programmer just picks a place at random in the main code of a program, opens an editor and inserts his/her code there:
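As an illustration of the state-variable idea, here is a small sketch that leaves two nested loops without goto. The function find_in_grid and its row-major grid layout are assumptions made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Looks for `target` in a rows x cols grid stored row by row.
 * The `found` flag is checked by both loop conditions, so setting it
 * ends the inner loop and then the outer one. */
bool find_in_grid(const int *grid, size_t rows, size_t cols, int target,
                  size_t *out_row, size_t *out_col)
{
    bool found = false;

    for (size_t r = 0; r < rows && !found; ++r) {
        for (size_t c = 0; c < cols && !found; ++c) {
            if (grid[r * cols + c] == target) {
                *out_row = r;
                *out_col = c;
                found = true;
            }
        }
    }
    return found;
}
```

It works, but every loop level has to test the flag, which is the extra bookkeeping that goto (or a labeled break in other languages) avoids.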
Let's say that we have to do several steps, named A to F:
{
code_for_A();
code_for_B();
code_for_C();
code_for_D();
code_for_E();
code_for_F();
}
and later, some steps, named G and H have to be added to be executed at the end. Spaghetti programming can make the code end being something like this:
{
code_for_A();
code_for_B();
code_for_C(); /* programmer opened the editor in this place */
goto A;-------.
|
B:<---------------+-.
code_for_G(); | | /* the code is added in the middle of the file */
code_for_H(); | |
goto C;-------+-+--.
| | |
A:<---------------' | |
code_for_D(); | |
code_for_E(); | |
code_for_F(); | |
goto B; --------' |
|
C:<--------------------'
}
While this code is correct (it executes steps A to H in sequence), it will take a programmer some time to work out how the code flows from A to H, by following the gotos back and forth.
For an alternate option that can sometimes be used to "hide" the gotos, one of our programmers got us using what he calls "do once" loops. They look like this:
failed = true; // default to failure
do // once
{
if( fail == func1(parm1) )
{ // emit error
break;
}
failed = false; // we only succeed if we get all the way through
}while(0);
// do common cleanup
// additional failure handling and/or return success/fail result
Obviously, the if block inside the 'do once' would be repeated. For example, we like this structure for setting up a network connection because there are many steps that have the possibility of failure. This structure can get tricky to use if you need a switch or another loop embedded within, but it has proven to be a surprisingly handy way to deal with error detection and common cleanup for us.
If you hate it, don't use it. (smile) We like it.
I am a student and I am writing an HTTP proxy application in C. I have trouble with memory management. In all my previous applications I simply wrote a wrapper around malloc which aborted when malloc failed.
void *xmalloc(size_t size)
{
void *ptr;
assert(size);
ptr = malloc(size);
if (!ptr)
abort();
return ptr;
}
This I now find insufficient, as I just want to refuse the client and continue serving other clients when memory allocation fails due to a temporary shortage of memory. If I don't want to clutter my code with checks after each malloc call (I have quite a lot of them per function in the parsing code), what other options are there to handle memory management, which one is best for my purposes, and what is a common way for server applications to handle memory management and memory shortage?
Consider this function from my current code which parses one line from header portion of HTTP message (xstrndup calls xmalloc):
int http_header_parse(http_hdr_table *t, const char *s)
{
const char *p;
const char *b;
char *tmp_name;
char *tmp_value;
int ret = -1;
assert(t);
assert(s);
p = b = s;
/* field name */
for (; ; p++) {
if (*p == ':') {
if (p-b <= 0) goto out;
tmp_name = xstrndup(b, p-b);
b = ++p;
break;
}
if (is_ctl_char(*p) || is_sep_char(*p)) goto out;
}
while (*p == ' ' || *p == '\t') {
p++; b++;
}
/* field value */
for (; ; p++) {
if (is_crlf(p)) {
if (p-b <= 0) goto err_value;
tmp_value = xstrndup(b, p-b);
p += 2;
break;
}
if (!*p) goto err_value;
}
http_hdr_table_set(t, tmp_name, tmp_value);
ret = 0;
xfree(tmp_value);
err_value:
xfree(tmp_name);
out:
return ret;
}
I would like to keep things simple and handle memory allocation errors at one place and to not clutter code with malloc error handling code. What should I do? Thank you.
P.S: I am writing the application to run on POSIX/Unix-like systems. Also feel free to criticize my current coding style and practices.
If you want to use a relatively low level language like C, then you shouldn't be too worried about adding something like if(tmp_value == NULL) goto out; in 2 places.
If you can't stand the idea of 2 trivial lines of extra code, then maybe try a language that supports exceptions properly (e.g. C++) and add throw/try/catch instead. Note: I really don't like C++, but using C++ would have to make more sense than implementing your own "exception like" features and an entire layer of automated resource de-allocation in C.
Modern languages give you garbage collection and exceptions. C doesn't, so you have to work hard. There's no magical solution here.
Some tips:
Create a session structure, and keep all your allocated memory pointed from it. When the session is aborted, always call a cleanup function. This way, even if you have to check for failures in many places, at least all failures are handled the same way.
You can even create a session_allocate() function, which allocates memory and keeps it on a linked list pointed to from the session structure. Everything you allocate using this function would be freed when the session is destroyed.
Try to concentrate all allocations in the beginning of the session. After you've allocated all you need, the rest of your code won't need to worry about failures.
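A minimal sketch of the first tip, assuming a session only owns three buffers. The struct fields and the functions session_setup, session_cleanup and serve_one_client are hypothetical names for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-client session: every owned buffer hangs off it. */
typedef struct {
    char *request_buf;
    char *header_buf;
    char *response_buf;
} session;

/* Safe on a partially built session; leaves it empty and reusable. */
void session_cleanup(session *s)
{
    free(s->request_buf);
    free(s->header_buf);
    free(s->response_buf);
    s->request_buf = s->header_buf = s->response_buf = NULL;
}

/* Any failure just returns; the caller's single cleanup handles it. */
static int session_setup(session *s, size_t request_size, size_t header_size)
{
    if (!(s->request_buf = malloc(request_size)))
        return -1;
    if (!(s->header_buf = malloc(header_size)))
        return -1;
    if (!(s->response_buf = malloc(request_size)))
        return -1;
    return 0;
}

int serve_one_client(size_t request_size, size_t header_size)
{
    session s = { NULL, NULL, NULL };
    int rc = session_setup(&s, request_size, header_size);

    if (rc == 0) {
        /* ... handle the request using the buffers in s ... */
    }
    session_cleanup(&s);   /* same path for success and failure */
    return rc;
}
```

The checks are still there, but none of them has to know what was allocated before it, because the cleanup function covers every case.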
If you're on a system that supports fork(), which Linux does, you can run each client connection in its own process. When a client connection is first established, you fork your main process into a child process to handle the rest of the request. Then you can abort() like you always have and only the specific client connection is affected. This is a classic Unix server model.
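Roughly, the fork-per-connection idea looks like the sketch below. serve_client and handle_connection are hypothetical names, and the simulate_oom flag only stands in for an xmalloc() that aborts.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-client worker, run only in the child process. */
static void serve_client(int simulate_oom)
{
    if (simulate_oom)
        abort();          /* what the aborting xmalloc() would do */
    /* ... parse the request, write the response ... */
}

/* Returns 0 if the child finished normally, 1 if it was killed by a
 * signal (e.g. SIGABRT), and -1 if fork() or waitpid() failed. */
int handle_connection(int simulate_oom)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {       /* child: serves exactly one client */
        serve_client(simulate_oom);
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? 1 : 0;
}
```

A real server would not block in waitpid() after every fork; it would go back to accepting connections and reap children later (for example from a SIGCHLD handler). The blocking call just keeps the sketch short.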
If you don't want to or can't use fork(), you need to abort the request by throwing an exception. In C, that would be done by using setjmp() when the connection is first established and then calling longjmp() when out of memory is detected. This will reset execution and the stack back to where setjmp() was called.
The problem is, this will leak all the resources allocated up to that point (for example, other memory allocations that had succeeded up to the point of getting out of memory). So additionally, your memory allocator will have to track all the memory allocations for each request. When longjmp() is called, the setjmp() return location will then have to free all the memory that was associated with the aborted request.
This is what Apache does using pools. Apache uses pools to track resource allocations so it can automatically free them in the case of an abort or because the code just didn't free them: http://www.apachetutor.org/dev/pools.
You should also consider the pool model and not just simply wrap malloc() so one client can't use up all the memory in the system.
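Here is a rough sketch of how setjmp()/longjmp() and per-request tracking fit together. It is not how Apache implements pools; the request struct, req_alloc, the fixed table of 16 pointers and the 4096-byte budget are all assumptions chosen to keep the example small.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum { REQ_MAX_ALLOCS = 16 };

typedef struct {
    jmp_buf on_oom;                 /* where to resume on failure */
    void *allocs[REQ_MAX_ALLOCS];   /* everything this request owns */
    size_t count;
    size_t budget;                  /* bytes the request may still use */
} request;

/* Never returns NULL: on any failure it jumps back to the setjmp(). */
void *req_alloc(request *r, size_t size)
{
    void *p = NULL;

    if (r->count < REQ_MAX_ALLOCS && size <= r->budget)
        p = malloc(size);
    if (!p)
        longjmp(r->on_oom, 1);
    r->allocs[r->count++] = p;
    r->budget -= size;
    return p;
}

void req_free_all(request *r)
{
    while (r->count > 0)
        free(r->allocs[--r->count]);
}

/* Parsing code: no error checks after the allocations. */
static void parse_request(request *r, size_t header_bytes)
{
    char *name = req_alloc(r, 32);
    char *value = req_alloc(r, header_bytes);
    memset(name, 0, 32);
    memset(value, 0, header_bytes);
}

/* Keeps setjmp() in a frame that stays active while parsing runs. */
static int run_request(request *r, size_t header_bytes)
{
    if (setjmp(r->on_oom) != 0)
        return -1;                  /* arrived here via longjmp() */
    parse_request(r, header_bytes);
    return 0;
}

int serve_request(size_t header_bytes)
{
    request r;
    int rc;

    r.count = 0;
    r.budget = 4096;
    rc = run_request(&r, header_bytes);
    req_free_all(&r);               /* same cleanup for both outcomes */
    return rc;
}
```

The budget is one simple way to get the "one client can't use up all the memory" property: a request that asks for too much is refused the same way as a real malloc() failure.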
Another possibility would be to use Boehm's GC by using its GC_malloc instead of malloc (you won't need to call free or GC_free); its GC_oom_fn function pointer (called internally from GC_malloc when no memory is available any more) can be set to your particular out-of-memory handler (which would deny the incoming HTTP request, perhaps with a longjmp).
The major advantage of using Boehm GC is that you don't care any more about free-ing your dynamically allocated data (provided it was allocated using GC_malloc or friends, e.g. GC_malloc_atomic for data without any pointers inside).
Notice that memory management is not a modular property. The liveness of some given data is a whole program property, see garbage collection wikipage, and RAII programming idiom.
You could of course use alloca, but that has issues that mean it must be used with care. Alternatively, you can write your code so that you minimise and localise the use of malloc. For example your function above could be rewritten to localise the allocations:
static size_t field_name_length(const char *s)
{
const char *p = s;
for ( ; *p != ':'; ++p) {
if (is_ctl_char(*p) || is_sep_char(*p))
return 0;
}
return (size_t) (p - s);
}
static size_t value_length(const char *s)
{
const char *p = s;
for (; *p && !is_crlf(p); ++p) {
/* nothing */
}
return *p ? (size_t) (p - s) : 0;
}
int http_header_parse(http_hdr_table *t, const char *s)
{
const char *v;
int ret = -1;
size_t v_len = 0;
size_t f_len = field_name_length(s);
if (f_len) {
v = s + f_len + 1;
v += strspn(v, " \t"); /* skip whitespace after the colon */
v_len = value_length(v);
}
if (v_len > 0 && f_len > 0) {
/* Allocation is localised to this block */
char *name = xstrndup(s, f_len);
char *value = xstrndup(v, v_len);
if (name && value) {
http_hdr_table_set(t, name, value);
ret = 0;
}
xfree(value);
xfree(name);
}
return ret;
}
Or, even better, you could modify http_hdr_table_set to accept the pointers and lengths and avoid allocation completely.
For example,
I need to malloc two pieces of memory, so:
void *a = malloc (1);
if (!a)
return -1;
void *b = malloc (1);
if (!b)
{
free (a);
return -1;
}
Notice that if the second malloc fails, I have to free "a" first. The problem is, this can get very messy if there are many such mallocs and error checks, unless I use the notorious "goto" statement and carefully arrange the order of the frees along with the labels:
void *a = malloc (1);
if (!a)
goto X;
void *b = malloc (1);
if (!b)
goto Y;
return 0; //normal exit
Y:
free (a);
X:
return -1;
Do you have any better solution to this situation? Thanks in advance.
We do like this:
void *a = NULL;
void *b = NULL;
void *c = NULL;
a = malloc(1);
if (!a) goto errorExit;
b = malloc(1);
if (!b) goto errorExit;
c = malloc(1);
if (!c) goto errorExit;
return 0;
errorExit:
//freeing a null pointer is safe.
free(a);
free(b);
free(c);
return -1;
Using goto is not a bad thing, in my opinion. Using it for resource cleanup is just right for it.
Source code as famous as the Linux kernel uses the technique.
Just don't use goto to go backwards. That leads to disaster and confusion. Only jump forward is my recommendation.
As previously mentioned by Zan Lynx, use the goto statement.
You can also allocate a larger chunk of memory up front for further use.
Or you can invest your time in developing something like a memory pool.
Or do this:
void *a,*b;
char * p = malloc(2);
if (!p) return -1;
a = p;
b = p+1;
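When the pieces have different types, the offsets inside the single block also have to respect alignment. A small sketch, assuming one block holds an int array followed by a double array (workspace and workspace_init are made-up names, and overflow checks on the sizes are left out):

```c
#include <stddef.h>
#include <stdlib.h>

/* Two arrays backed by ONE allocation (hypothetical layout). */
typedef struct {
    void *block;      /* the only pointer that is passed to free() */
    int *counts;      /* n_counts ints at the start of the block */
    double *weights;  /* n_weights doubles after the ints */
} workspace;

/* Returns 0 on success, -1 if the single malloc fails. */
int workspace_init(workspace *w, size_t n_counts, size_t n_weights)
{
    size_t align = _Alignof(double);
    size_t counts_bytes = n_counts * sizeof(int);
    /* round up so the doubles start on a suitably aligned offset */
    size_t weights_off = (counts_bytes + align - 1) / align * align;
    unsigned char *p = malloc(weights_off + n_weights * sizeof(double));

    if (!p) {
        w->block = NULL;
        w->counts = NULL;
        w->weights = NULL;
        return -1;
    }
    w->block = p;
    w->counts = (int *)p;
    w->weights = (double *)(p + weights_off);
    return 0;
}
```

There is one failure point and one free(w.block) at the end, at the cost of computing the layout by hand.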
I think OOP techniques could give you a nice and clean solution to this problem:
typedef struct {
void *a;
void *b;
} MyObj;
void delete_MyObj(MyObj* obj)
{
if (obj) {
if (obj->a)
free(obj->a);
if (obj->b)
free(obj->b);
free(obj);
}
}
MyObj* new_MyObj()
{
MyObj* obj = (MyObj*)malloc(sizeof(MyObj));
if (!obj) return NULL;
memset(obj, 0, sizeof(MyObj));
obj->a = malloc(1);
obj->b = malloc(1);
if (!obj->a || !obj->b) {
delete_MyObj(obj);
return NULL;
}
return obj;
}
int main()
{
MyObj* obj = new_MyObj();
if (obj) {
/* use obj */
delete_MyObj(obj);
}
}
There's nothing really wrong with your goto code IMO (I'd use more verbose labels).
In this case though, the goto statements you've written create exactly the same structure as reversing the ifs.
That is, a conditional forward goto that doesn't leave any scope does exactly the same as an if statement with no else. The difference is that the goto happens not to leave scope, whereas the if is constrained not to leave scope. That's why the if is usually easier to read: the reader has more clues up front.
void *a = malloc (1);
if (a) {
void *b = malloc (1);
if (b) {
return 0; //normal exit
}
free(a);
}
return -1;
For a couple of levels this is OK, although taken too far you get "arrow code" with too many levels of indentation. That becomes unreadable for entirely different reasons.
Use a garbage collector like boehmgc.
It works, it's easy to use, and, contrary to common opinion, there is no slowdown.
https://en.wikipedia.org/wiki/Boehm_garbage_collector
http://www.hpl.hp.com/personal/Hans_Boehm/gc/
Depending on the platform / context your application is running in, continuing after malloc() returns NULL may not make much sense anymore.
So in many cases, a simple
if (!a)
catastrophic_failure_bail_out();
might be the most sensible solution and keep the code clean and readable.
I'm trying to learn C by writing a simple parser / compiler. So far it's been a very enlightening experience; however, coming from a strong background in C#, I'm having some problems adjusting, in particular to the lack of exceptions.
Now I've read Cleaner, more elegant, and harder to recognize and I agree with every word in that article; in my C# code I avoid throwing exceptions whenever possible. However, now that I'm faced with a world where I can't throw exceptions, my error handling is completely swamping the otherwise clean and easy-to-read logic of my code.
At the moment I'm writing code which needs to fail fast if there is a problem, and it is also potentially deeply nested. I've settled on an error handling pattern whereby "Get" functions return NULL on an error, and other functions return -1 on failure. In both cases the function that fails calls NS_SetError(), so all the calling function needs to do is clean up and immediately return on a failure.
My issue is that the number of if (Action() < 0) return -1; statements that I have is doing my head in - it's very repetitive and completely obscures the underlying logic. I've ended up creating myself a simple macro to try and improve the situation, for example:
#define NOT_ERROR(X) if ((X) < 0) return -1
int NS_Expression(void)
{
NOT_ERROR(NS_Term());
NOT_ERROR(Emit("MOVE D0, D1\n"));
if (strcmp(current->str, "+") == 0)
{
NOT_ERROR(NS_Add());
}
else if (strcmp(current->str, "-") == 0)
{
NOT_ERROR(NS_Subtract());
}
else
{
NS_SetError("Expected: operator");
return -1;
}
return 0;
}
Each of the functions NS_Term, NS_Add and NS_Subtract does an NS_SetError() and returns -1 in the case of an error. It's better, but it still feels like I'm abusing macros, and it doesn't allow for any cleanup (some functions, in particular Get functions that return a pointer, are more complex and require clean-up code to be run).
Overall it just feels like I'm missing something. Despite the fact that error handling in this way is supposedly easier to recognize, in many of my functions I'm really struggling to identify whether or not errors are being handled correctly:
Some functions return NULL on an error
Some functions return < 0 on an error
Some functions never produce an error
My functions do a NS_SetError(), but many other functions don't.
Is there a better way that I can structure my functions, or does everyone else also have this problem?
Also is having Get functions (that return a pointer to an object) return NULL on an error a good idea, or is it just confusing my error handling?
It's a bigger problem when you have to repeat the same finalizing code before each return from an error. In such cases it is widely accepted to use goto:
int func ()
{
if (a() < 0) {
goto failure_a;
}
if (b() < 0) {
goto failure_b;
}
if (c() < 0) {
goto failure_c;
}
return SUCCESS;
failure_c:
undo_b();
failure_b:
undo_a();
failure_a:
return FAILURE;
}
You can even create your own macros around this to save you some typing, something like this (I haven't tested this though):
#define CALL(funcname, ...) \
if (funcname(__VA_ARGS__) < 0) { \
goto failure_ ## funcname; \
}
Overall, it is a much cleaner and less redundant approach than the trivial handling:
int func ()
{
if (a() < 0) {
return FAILURE;
}
if (b() < 0) {
undo_a();
return FAILURE;
}
if (c() < 0) {
undo_b();
undo_a();
return FAILURE;
}
return SUCCESS;
}
As an additional hint, I often use chaining to reduce the number of if's in my code:
if (a() < 0 || b() < 0 || c() < 0) {
return FAILURE;
}
Since || is a short-circuit operator, the above would substitute three separate if's. Consider using chaining in a return statement as well:
return (a() < 0 || b() < 0 || c() < 0) ? FAILURE : SUCCESS;
One technique for cleanup is to use a do/while loop that will never actually iterate. It gives you goto without using goto.
#define NOT_ERROR(x) if ((x) < 0) break;
#define NOT_NULL(x) if ((x) == NULL) break;
// Initialise things that may need to be cleaned up here.
char* somePtr = NULL;
do
{
NOT_NULL(somePtr = malloc(1024));
NOT_ERROR(something(somePtr));
NOT_ERROR(somethingElse(somePtr));
// etc
// if you get here everything's ok.
return somePtr;
}
while (0);
// Something went wrong so clean-up.
free(somePtr);
return NULL;
You lose a level of indentation though.
Edit: I'd like to add that I've nothing against goto, it's just that for the use-case of the questioner he doesn't really need it. There are cases where using goto beats the pants off any other method, but this isn't one of them.
You're probably not going to like to hear this, but the C way to do exceptions is via the goto statement. This is one of the reasons it is in the language.
The other reason is that goto is the natural expression of the implementation of a state machine. What common programming task is best represented by a state machine? A lexical analyzer. Look at the output from lex sometime. Gotos.
So it sounds to me like now is the time for you to get chummy with that pariah of language syntax elements, the goto.
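To show what a goto-driven state machine looks like in hand-written form, here is a toy tokenizer sketch. The token rules (runs of digits are integers, letters/underscore start identifiers) and the function name count_tokens are assumptions for the example, not lex output.

```c
#include <ctype.h>
#include <stddef.h>

/* Counts integer and identifier tokens in `s`.
 * Each label is one state; each goto is one transition.
 * Returns 0 at end of input, -1 on an unexpected character. */
int count_tokens(const char *s, size_t *n_ints, size_t *n_idents)
{
    *n_ints = 0;
    *n_idents = 0;

start:
    if (*s == '\0')
        return 0;
    if (isspace((unsigned char)*s)) {
        ++s;
        goto start;
    }
    if (isdigit((unsigned char)*s)) {
        ++*n_ints;
        goto in_int;
    }
    if (isalpha((unsigned char)*s) || *s == '_') {
        ++*n_idents;
        goto in_ident;
    }
    return -1;

in_int:                       /* consume the rest of the digits */
    ++s;
    if (isdigit((unsigned char)*s))
        goto in_int;
    goto start;

in_ident:                     /* consume letters, digits, underscores */
    ++s;
    if (isalnum((unsigned char)*s) || *s == '_')
        goto in_ident;
    goto start;
}
```

For instance, the input "x1 42 y" yields one integer and two identifiers, while "a+b" stops with -1 at the '+'.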
Besides goto, standard C has another construct to handle exceptional flow control: setjmp/longjmp. It has the advantage that you can break out of multiply nested control statements more easily than with break, as was proposed by someone, and, in addition to what goto provides, it carries a status indication that can encode the reason for what went wrong.
Another issue is just the syntax of your construct. It is not a good idea to use a control statement that can inadvertently be added to. In your case
if (bla) NOT_ERROR(X);
else printf("wow!\n");
would go fundamentally wrong. I'd use something like
#define NOT_ERROR(X) \
if ((X) >= 0) { (void)0; } \
else return -1
instead.
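A small sketch of why the if/else form is safer, assuming a hypothetical step() function and a hypothetical caller named guarded(). With an if-only macro, a following else either fails to compile or attaches to the macro's hidden if, depending on how the macro is written; with the form below it binds to the caller's own if (a compiler may still suggest adding braces):

```c
/* The if/else form: the macro's own `if` already has an `else`,
   so a later `else` in the caller cannot attach to it. */
#define NOT_ERROR(X)              \
    if ((X) >= 0) { (void)0; }    \
    else return -1

/* Hypothetical step: reports failure by returning a negative value. */
static int step(int v) { return v; }

/* Returns -1 when the guarded step fails, 1 when the guard is
   skipped, and 0 when the guarded step succeeds. */
int guarded(int enabled, int v)
{
    if (enabled)
        NOT_ERROR(step(v));
    else
        return 1;
    return 0;
}
```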
This must be thought about on at least two levels: how your functions interact, and what you do when something breaks.
Most large C frameworks I see always return a status and "return" values by reference (this is the case of the WinAPI and of many C Mac OS APIs). You want to return a bool?
StatusCode FooBar(int a, int b, int c, bool* output);
You want to return a pointer?
StatusCode FooBar(int a, int b, int c, char** output);
Well, you get the idea.
On the calling function's side, the pattern I see the most often is to use a goto statement that points to a cleanup label:
if (statusCode < 0) goto error;
/* snip */
return everythingWentWell;
error:
cleanupResources();
return somethingWentWrong;
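Putting the two halves together, here is a minimal sketch of a function that returns a status and hands its result back through an out-parameter, with a single error label for cleanup. DoubleString and the StatusCode values are hypothetical names chosen for the example:

```c
#include <stdlib.h>
#include <string.h>

typedef enum {
    STATUS_OK = 0,
    STATUS_NO_MEMORY = -1,
    STATUS_BAD_ARG = -2
} StatusCode;

/* Hypothetical function: builds a new string holding `src` twice and
   returns it through `output`. The caller frees the result. */
StatusCode DoubleString(const char *src, char **output)
{
    StatusCode status = STATUS_OK;
    char *scratch = NULL;
    char *result = NULL;
    size_t len;

    if (src == NULL || output == NULL) { status = STATUS_BAD_ARG; goto error; }
    len = strlen(src);

    scratch = malloc(len + 1);
    if (scratch == NULL) { status = STATUS_NO_MEMORY; goto error; }
    memcpy(scratch, src, len + 1);

    result = malloc(2 * len + 1);
    if (result == NULL) { status = STATUS_NO_MEMORY; goto error; }
    memcpy(result, scratch, len);
    memcpy(result + len, scratch, len + 1);   /* copies the terminator too */

    free(scratch);
    *output = result;
    return STATUS_OK;

error:
    free(scratch);   /* free(NULL) is a no-op, so no checks are needed */
    free(result);
    return status;
}
```

The out-parameter is only written on success, so a caller that ignores the status at least never sees a half-built result.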
What about this?
int NS_Expression(void)
{
int ok = 1;
ok = ok && NS_Term();
ok = ok && Emit("MOVE D0, D1\n");
ok = ok && NS_AddSub();
return ok;
}
The short answer is: let your functions return an error code that cannot possibly be a valid value - and always check the return value. For functions returning pointers, this is NULL. For functions returning a non-negative int, it's a negative value, commonly -1, and so on...
If every possible return value is also a valid value, use call-by-reference:
int my_atoi(const char *str, int *val)
{
// convert str to int
// store the result in *val
// return 0 on success, -1 (or any other value except 0) otherwise
}
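One possible body for such a function, as a sketch built on the standard strtol(). Rejecting trailing characters and out-of-range values are choices made for this example, not something the outline above prescribes:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Returns 0 on success and stores the result in *val;
   returns -1 on any error and leaves *val untouched. */
int my_atoi(const char *str, int *val)
{
    char *end;
    long n;

    if (str == NULL || val == NULL)
        return -1;

    errno = 0;
    n = strtol(str, &end, 10);

    if (end == str || *end != '\0')   /* no digits, or trailing junk */
        return -1;
    if (errno == ERANGE || n < INT_MIN || n > INT_MAX)
        return -1;                    /* does not fit in an int */

    *val = (int)n;
    return 0;
}
```

Because the status and the value travel separately, every int, including 0 and -1, remains a valid result.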
Checking the return value of every function might seem tedious, but that's the way errors are handled in C. Consider the function nc_dial(). All it does is check its arguments for validity and make a network connection by calling getaddrinfo(), socket(), setsockopt(), bind()/listen() or connect(), and finally free unused resources and update metadata. This could be done in approximately 15 lines. However, the function has nearly 100 lines due to error checking. But that's the way it is in C. Once you get used to it, you can easily mask the error checking in your head.
Furthermore, there's nothing wrong with multiple if (Action() == 0) return -1;. To the contrary: it is usually a sign of a cautious programmer. It's good to be cautious.
And as a final comment: don't use macros for anything but defining values if you can't justify their use while someone is pointing a gun at your head. More specifically, never use control flow statements in macros: it confuses the shit out of the poor guy who has to maintain your code 5 years after you left the company. There's nothing wrong with if (foo) return -1;. It's simple, clean and obvious to the point that you can't do any better.
Once you drop your tendency to hide control flow in macros, there's really no reason to feel like you're missing something.
A goto statement is the easiest and potentially cleanest way to implement exception style processing. Using a macro makes it easier to read if you include the comparison logic inside the macro args. If you organize the routines to perform normal (i.e. non-error) work and only use the goto on exceptions, it is fairly clean for reading. For example:
/* Exception macro */
#define TRY_EXIT(Cmd) { if (!(Cmd)) {goto EXIT;} }
/* My memory allocator */
char * MyAlloc(int bytes)
{
char * pMem = NULL;
/* Must have a size */
TRY_EXIT( bytes > 0 );
/* Allocation must succeed */
pMem = (char *)malloc(bytes);
TRY_EXIT( pMem != NULL );
/* Initialize memory */
TRY_EXIT( initializeMem(pMem, bytes) != -1 );
/* Success */
return (pMem);
EXIT:
/* Exception: Cleanup and fail */
if (pMem != NULL)
free(pMem);
return (NULL);
}
It never occurred to me to use goto or do { } while(0) for error handling in this way - it's pretty neat. However, after thinking about it I realised that in many cases I can do the same thing by splitting the function into two:
int FooInner(char* somePtr);
char* Foo(void)
{
// Initialise things that may need to be cleaned up here.
char* somePtr = malloc(1024);
if (somePtr == NULL) // comparison, not assignment
{
return NULL;
}
if (FooInner(somePtr) < 0)
{
// Something went wrong so clean-up.
free(somePtr);
return NULL;
}
return somePtr;
}
int FooInner(char* somePtr)
{
if (something(somePtr) < 0) return -1;
if (somethingElse(somePtr) < 0) return -1;
// etc
// if you get here everything's ok.
return 0;
}
This does now mean that you get an extra function, but my preference is for many short functions anyway.
After Philip's advice I've also decided to avoid using control flow macros - it's clear enough what is going on as long as you put the checks on one line.
At the very least it's reassuring to know that I'm not just missing something - everyone else has this problem too! :-)
Use setjmp.
http://en.wikipedia.org/wiki/Setjmp.h
http://aszt.inf.elte.hu/~gsd/halado_cpp/ch02s03.html
http://www.di.unipi.it/~nids/docs/longjump_try_trow_catch.html
#include <setjmp.h>
#include <stdio.h>
jmp_buf x;
void f()
{
longjmp(x,5); // throw 5;
}
int main()
{
// output of this program is 5.
int i = 0;
if ( (i = setjmp(x)) == 0 )// try{
{
f();
} // } --> end of try{
else // catch(i){
{
switch( i )
{
case 1:
case 2:
default: fprintf( stdout, "error code = %d\n", i); break;
}
} // } --> end of catch(i){
return 0;
}
#include <stdio.h>
#include <setjmp.h>
#define TRY do{ jmp_buf ex_buf__; if( !setjmp(ex_buf__) ){
#define CATCH } else {
#define ETRY } }while(0)
#define THROW longjmp(ex_buf__, 1)
int
main(int argc, char** argv)
{
TRY
{
printf("In Try Statement\n");
THROW;
printf("I do not appear\n");
}
CATCH
{
printf("Got Exception!\n");
}
ETRY;
return 0;
}
Yes, two hated constructs combined. Is it as bad as it sounds or can it be seen as a good way to control usage of goto and also provide a reasonable cleanup strategy?
At work we had a discussion about whether or not to allow goto in our coding standard. In general nobody wanted to allow free usage of goto but some were positive about using it for cleanup jumps. As in this code:
void func()
{
char* p1 = malloc(16);
if( !p1 )
goto err_cleanup;
char* p2 = malloc(16);
if( !p2 )
goto err_cleanup;
goto norm_cleanup;
err_cleanup:
if( p1 )
free(p1);
if( p2 )
free(p2);
norm_cleanup:
return;
}
The obvious benefit of such use is that you don't end up with this code:
void func()
{
char* p1 = malloc(16);
if( !p1 ){
return;
}
char* p2 = malloc(16);
if( !p2 ){
free(p1);
return;
}
char* p3 = malloc(16);
if( !p3 ){
free(p1);
free(p2);
return;
}
}
Especially in constructor-like functions with many allocations this can sometimes grow very bad, not least when someone has to insert something in the middle.
So, in order to be able to use goto, but still clearly isolate it from being used freely, a set of flow controlling macros was created for handling the task. Looks something like this (simplified):
#define FAIL_SECTION_BEGIN int exit_code[GUID] = 0;
#define FAIL_SECTION_DO_EXIT_IF( cond, exitcode ) if(cond){exit_code[GUID] = exitcode; goto exit_label[GUID];}
#define FAIL_SECTION_ERROR_EXIT(code) exit_label[GUID]: if(exit_code[GUID]) int code = exit_code[GUID];else goto end_label[GUID]
#define FAIL_SECTION_END end_label[GUID]:
We can use this as follows:
int func()
{
char* p1 = NULL;
char* p2 = NULL;
char* p3 = NULL;
FAIL_SECTION_BEGIN
{
p1 = malloc(16);
FAIL_SECTION_DO_EXIT_IF( !p1, -1 );
p2 = malloc(16);
FAIL_SECTION_DO_EXIT_IF( !p2, -1 );
p3 = malloc(16);
FAIL_SECTION_DO_EXIT_IF( !p3, -1 );
}
FAIL_SECTION_ERROR_EXIT( code )
{
if( p3 )
free(p3);
if( p2 )
free(p2);
if( p1 )
free(p1);
return code;
}
FAIL_SECTION_END
return 0;
}
It looks nice, and comes with many benefits, BUT, are there any drawbacks we should be thinking about before rolling this out into development? It is after all very flow controlling and goto:ish. Both are discouraged. What are the arguments for discouraging them in this case?
Thanks.
Error handling is one of the rare situations where goto is not so bad.
But if I had to maintain that code, I would be very upset that the gotos are hidden by macros.
So in this case goto is OK for me, but the macros are not.
Using goto to go to a common error handler/cleanup/exit sequence is absolutely fine.
This code:
void func()
{
char* p1 = malloc(16);
if( !p1 )
goto cleanup;
char* p2 = malloc(16);
if( !p2 )
goto cleanup;
cleanup:
if( p1 )
free(p1);
if( p2 )
free(p2);
}
can be legally written as:
void func()
{
char* p1 = malloc(16);
char* p2 = malloc(16);
free(p1);
free(p2);
}
whether or not the memory allocations succeed.
This works because free() does nothing if passed a NULL pointer. You can use the same idiom when designing your own APIs to allocate and free other resources:
// return handle to new Foo resource, or 0 if allocation failed
FOO_HANDLE AllocFoo();
// release Foo indicated by handle, - do nothing if handle is 0
void ReleaseFoo( FOO_HANDLE h );
Designing APIs like this can considerably simplify resource management.
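A minimal sketch of that idiom. The Foo struct, the live_foos counter and UseTwoFoos are hypothetical and exist only so the example is complete; the point is that ReleaseFoo accepts a null handle just as free() accepts NULL:

```c
#include <stdlib.h>

typedef struct Foo { int value; } Foo;   /* hypothetical resource */
typedef Foo *FOO_HANDLE;

static int live_foos;   /* hypothetical counter, for illustration only */

/* Return a handle to a new Foo, or NULL if allocation failed. */
FOO_HANDLE AllocFoo(void)
{
    FOO_HANDLE h = malloc(sizeof *h);
    if (h != NULL) { h->value = 0; live_foos++; }
    return h;
}

/* Release the Foo; do nothing if the handle is NULL. */
void ReleaseFoo(FOO_HANDLE h)
{
    if (h == NULL) return;
    live_foos--;
    free(h);
}

/* Caller: one check, then unconditional release of everything. */
int UseTwoFoos(void)
{
    FOO_HANDLE a = AllocFoo();
    FOO_HANDLE b = AllocFoo();
    int ok = (a != NULL && b != NULL);
    /* ... use a and b only when ok ... */
    ReleaseFoo(b);
    ReleaseFoo(a);
    return ok ? 0 : -1;
}
```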
Cleanup with goto is a common C idiom and is used in the Linux kernel*.
*Perhaps Linus' opinion is not the best example of a good argument, but it does show goto being used in a relatively large-scale project.
If the first malloc fails, you then clean up both p1 and p2. Because of the goto, p2 is not initialised and may point to anything. I ran this quickly with gcc to check, and attempting to free(p2) did indeed cause a segfault.
In your last example the variables are scoped inside the braces (i.e. they only exist in the FAIL_SECTION_BEGIN block).
Assuming the code works without the braces, you'd still have to initialise all the pointers to NULL before FAIL_SECTION_BEGIN to avoid segfaulting.
I have nothing against goto and macros, but I prefer Neil Butterworth's idea:
void func(void)
{
void *p1 = malloc(16);
void *p2 = malloc(16);
void *p3 = malloc(16);
if (!p1 || !p2 || !p3) goto cleanup;
/* ... */
cleanup:
if (p1) free(p1);
if (p2) free(p2);
if (p3) free(p3);
}
Or, if it's more appropriate:
void func(void)
{
void *p1 = NULL;
void *p2 = NULL;
void *p3 = NULL;
p1 = malloc(16);
if (!p1) goto cleanup;
p2 = malloc(16);
if (!p2) goto cleanup;
p3 = malloc(16);
if (!p3) goto cleanup;
/* ... */
cleanup:
if (p1) free(p1);
if (p2) free(p2);
if (p3) free(p3);
}
The term "Structured Programming", which we all know as the anti-goto thing, originally started and developed as a bunch of coding patterns built from gotos (or JMPs). Those patterns were called the while and if patterns, amongst others.
So, if you are using gotos, use them in a structured way. That limits the damage. And those macros seem a reasonable approach.
The original code would benefit from using multiple return statements - there is no need to hop around the error-return cleanup code. Plus, you normally need the allocated space released on an ordinary return too - otherwise you are leaking memory. And you can rewrite the example without goto if you are careful. This is a case where you can usefully declare variables before otherwise necessary:
void func()
{
char *p1 = 0;
char *p2 = 0;
char *p3 = 0;
if ((p1 = malloc(16)) != 0 &&
(p2 = malloc(16)) != 0 &&
(p3 = malloc(16)) != 0)
{
// Use p1, p2, p3 ...
}
free(p1);
free(p2);
free(p3);
}
When there are non-trivial amounts of work after each allocation operation, then you can use a label before the first of the free() operations, and a goto is OK - error handling is the main reason for using goto these days, and anything much else is somewhat dubious.
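A sketch of that second case, where real work sits between the allocations and one label in front of the free() calls serves both the success path and every failure path. The function sum_of_squares and its return codes are hypothetical:

```c
#include <stdlib.h>

/* Hypothetical computation: sums the squares of 1..n using two
   buffers, the second of which is filled from the first.
   Returns 0 on success and a negative step number on failure. */
int sum_of_squares(size_t n, long *out)
{
    int rc = 0;
    long *values = NULL;
    long *squares = NULL;
    long total = 0;
    size_t i;

    if (n == 0 || out == NULL)
        return -3;                       /* nothing allocated yet */

    values = malloc(n * sizeof *values);
    if (values == NULL) { rc = -1; goto cleanup; }
    for (i = 0; i < n; i++) values[i] = (long)(i + 1);

    squares = malloc(n * sizeof *squares);
    if (squares == NULL) { rc = -2; goto cleanup; }
    for (i = 0; i < n; i++) squares[i] = values[i] * values[i];

    for (i = 0; i < n; i++) total += squares[i];
    *out = total;

cleanup:                                 /* reached on success and on failure */
    free(squares);
    free(values);
    return rc;
}
```

Initialising every pointer to NULL up front is what makes the shared cleanup block safe from any jump.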
I look after some code which does have macros with embedded goto statements. It is confusing on first encounter to see a label that is 'unreferenced' by the visible code, yet that cannot be removed. I prefer to avoid such practices. Macros are OK when I don't need to know what they do - they just do it. Macros are not so OK when you have to know what they expand to to use them accurately. If they don't hide information from me, they are more of a nuisance than a help.
Illustration - names disguised to protect the guilty:
#define rerrcheck if (currval != &localval && globvar->currtub && \
globvar->currtub->te_flags & TE_ABORT) \
{ if (globvar->currtub->te_state) \
globvar->currtub->te_state->ts_flags |= TS_FAILED;\
else \
delete_tub_name(globvar->currtub->te_name); \
goto failure; \
}
#define rgetunsigned(b) {if (_iincnt>=2) \
{_iinptr+=2;_iincnt-=2;b = ldunsigned(_iinptr-2);} \
else {b = _igetunsigned(); rerrcheck}}
There are several dozen variants on rgetunsigned() that are somewhat similar - different sizes and different loader functions.
One place where these are used contains this loop - in a larger block of code in a single case of a large switch with some small and some big blocks of code (not particularly well structured):
for (i = 0 ; i < no_of_rows; i++)
{
row_t *tmprow = &val->v_coll.cl_typeinfo->clt_rows[i];
rgetint(tmprow->seqno);
rgetint(tmprow->level_no);
rgetint(tmprow->parent_no);
rgetint(tmprow->fieldnmlen);
rgetpbuf(tmprow->fieldname, IDENTSIZE);
rgetint(tmprow->field_no);
rgetint(tmprow->type);
rgetint(tmprow->length);
rgetlong(tmprow->xid);
rgetint(tmprow->flags);
rgetint(tmprow->xtype_nm_len);
rgetpbuf(tmprow->xtype_name, IDENTSIZE);
rgetint(tmprow->xtype_owner_len);
rgetpbuf(tmprow->xtype_owner_name, IDENTSIZE);
rgetpbuf(tmprow->xtype_owner_name,
tmprow->xtype_owner_len);
rgetint(tmprow->alignment);
rgetlong(tmprow->sourcetype);
}
It isn't obvious that the code there is laced with goto statements! And clearly, a full exegesis of the sins of the code it comes from would take all day - they are many and varied.
The first example looks much more readable to me than the macroised version.
And mouviciel said it much better than I did
#define malloc_or_die(size) if(malloc(size) == NULL) exit(1)
It's not like you can really recover from failed mallocs unless you have software worth writing a transaction system for. If you do, add rollback code to malloc_or_die.
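As written, the macro throws away the pointer that malloc returns, so the idea is easier to use as a function. A minimal sketch, where xmalloc is a hypothetical name and the message text is made up:

```c
#include <stdio.h>
#include <stdlib.h>

/* Allocate `size` bytes or terminate the program; never returns NULL. */
void *xmalloc(size_t size)
{
    /* Ask for at least one byte, since malloc(0) may legitimately
       return NULL without being out of memory. */
    void *p = malloc(size ? size : 1);
    if (p == NULL) {
        fprintf(stderr, "out of memory (%lu bytes requested)\n",
                (unsigned long)size);
        exit(EXIT_FAILURE);
    }
    return p;
}
```

Callers can then use the result without checking it, at the price of giving up any chance to recover.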
For a real example of good use of goto, check out parsing dispatch code that uses computed goto.
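A toy sketch of computed-goto dispatch. It relies on the "labels as values" extension of GCC and Clang and is not ISO C; the four opcodes and the run function are hypothetical, and the code assumes every byte of the program is a valid opcode and that the program ends with a halt:

```c
/* Opcodes: 0 = halt, 1 = increment, 2 = decrement, 3 = double.
   GNU extension: &&label takes a label's address, goto *p jumps to it. */
int run(const unsigned char *code)
{
    static void *dispatch[] = { &&op_halt, &&op_inc, &&op_dec, &&op_double };
    int acc = 0;

    goto *dispatch[*code++];      /* jump straight to the first handler */

op_inc:
    acc++;
    goto *dispatch[*code++];
op_dec:
    acc--;
    goto *dispatch[*code++];
op_double:
    acc *= 2;
    goto *dispatch[*code++];
op_halt:
    return acc;
}
```

Each handler jumps directly to the next one instead of returning to a central switch, which is the reason interpreters use this technique.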