How to implement an FSM - C

I want to parse output from a command-line tool using the FSM programming model. What is the simplest possible implementation of an FSM for this task?

Basically, the core idea of a finite state machine is that the machine is in a "state" and, for every state, the behaviour of the machine is different from other states.
A simple way to do this is to have an integer variable (or an enum) which stores the status, and a switch() statement which implements, for every case, the required logic.
Suppose you have a file of the following kind:
something
begin
something
something2
end
something
and your duty is to print the part between begin/end. You read the file line by line, and switch state based on the content of the line:
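#include <stdio.h>
#include <string.h>

enum state { NOTHING, INBLOCK };

int main(void)
{
    enum state status = NOTHING;
    char line[256];

    /* Read standard input line by line. */
    while (fgets(line, sizeof line, stdin) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* strip the trailing newline */
        switch (status) {
        case NOTHING:
            if (strcmp(line, "begin") == 0) status = INBLOCK;
            break;
        case INBLOCK:
            if (strcmp(line, "end") == 0)
                status = NOTHING;
            else
                puts(line);
            break;
        }
    }
    return 0;
}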
In this example, only the core idea is shown: a "status" of the machine and a means to change status (the "line" read from the file). Real-life examples probably have more variables to keep more information for every state and, perhaps, the "status" of the machine can be stored in a function pointer, to avoid the burden and rigidity of the switch() statement. Even so, the programming paradigm is clean and powerful.

The FSM model can be implemented in C by storing the current state as a function pointer: each state is a function that processes some data and then selects the next state. FSMs are a good fit for parsing command-line arguments or captured output. The function pointer is initialized to a preset start function. The start function assigns the pointer, which is passed along with the machine, to the appropriate next function, which in turn decides the next function, and so on.
Here is a very simple implementation of an FSM:
#include <stddef.h>

struct fsm
{
    void (*ptr_to_fsm)(struct fsm *fsm);
    char *data;
};

void start(struct fsm *fsm);
void stop(struct fsm *fsm);

int main(void)
{
    struct fsm fsm = { start, NULL };   /* Begin in the start state. */
    while (fsm.ptr_to_fsm != NULL)
    {
        fsm.ptr_to_fsm(&fsm);
    }
    return 0;
}

void start(struct fsm *fsm)
{
    if (fsm->data == NULL)
    {
        fsm->ptr_to_fsm = stop;
        return;
    }
    /* Check more conditions, and branch out to other functions based on the results. */
    fsm->ptr_to_fsm = stop; /* This sketch has no other states, so it stops here. */
}

void stop(struct fsm *fsm)
{
    fsm->ptr_to_fsm = NULL; /* The while loop will terminate. */
    /* And you're done (unless you have some freeing to do). */
}
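As a hedged sketch, the function-pointer approach could be applied to the begin/end example from the first answer roughly like this. The names struct parser, parser_init and parser_feed, and the fixed 256-byte capture buffer, are assumptions made for illustration; they do not come from either answer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct parser;
typedef void (*state_fn)(struct parser *p, const char *line);

struct parser {
    state_fn state;     /* the current state is the function to call next */
    size_t captured;    /* number of lines captured so far */
    char out[256];      /* captured lines, each followed by '\n' */
};

static void outside(struct parser *p, const char *line);
static void inside(struct parser *p, const char *line);

/* State: not in a block. Only "begin" causes a transition. */
static void outside(struct parser *p, const char *line)
{
    if (strcmp(line, "begin") == 0)
        p->state = inside;
}

/* State: in a block. "end" leaves it; any other line is captured. */
static void inside(struct parser *p, const char *line)
{
    if (strcmp(line, "end") == 0) {
        p->state = outside;
        return;
    }
    size_t used = strlen(p->out);
    size_t need = strlen(line) + 1;        /* the line plus '\n' */
    if (used + need < sizeof p->out) {     /* lines that do not fit are dropped */
        memcpy(p->out + used, line, need - 1);
        p->out[used + need - 1] = '\n';
        p->out[used + need] = '\0';
        p->captured++;
    }
}

void parser_init(struct parser *p)
{
    assert(p != NULL);
    p->state = outside;
    p->captured = 0;
    p->out[0] = '\0';
}

/* Feed one line (without its trailing newline) to the machine. */
void parser_feed(struct parser *p, const char *line)
{
    assert(p != NULL && line != NULL);
    p->state(p, line);
}
```

Feeding the six lines of the sample file one at a time leaves the two lines between begin and end in out. Adding a state means adding a function, not another case label.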

Related

In C, what is the best practice for handling errors in your own functions? [duplicate]

What do you consider "best practice" when it comes to handling errors in a consistent way in a C library?
There are two ways I've been thinking of:
Always return error code. A typical function would look like this:
MYAPI_ERROR getObjectSize(MYAPIHandle h, int* returnedSize);
Always provide an error pointer. A typical function would look like this:
int getObjectSize(MYAPIHandle h, MYAPI_ERROR* returnedError);
When using the first approach it's possible to write code like this where the error handling check is directly placed on the function call:
int size;
if(getObjectSize(h, &size) != MYAPI_SUCCESS) {
// Error handling
}
Which looks better than the error handling code here.
MYAPI_ERROR error;
int size;
size = getObjectSize(h, &error);
if(error != MYAPI_SUCCESS) {
// Error handling
}
However, I think using the return value for returning data makes the code more readable. It's obvious that something was written to the size variable in the second example.
Do you have any ideas on why I should prefer one of those approaches, or perhaps mix them or use something else? I'm not a fan of global error states since they tend to make multithreaded use of the library way more painful.
EDIT:
C++ specific ideas on this would also be interesting to hear about as long as they are not involving exceptions since it's not an option for me at the moment...
I've used both approaches, and they both worked fine for me. Whichever one I use, I always try to apply this principle:
If the only possible errors are programmer errors, don't return an error code, use asserts inside the function.
An assertion that validates the inputs clearly communicates what the function expects, while too much error checking can obscure the program logic. Deciding what to do for all the various error cases can really complicate the design. Why figure out how functionX should handle a null pointer if you can instead insist that the programmer never pass one?
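A minimal sketch of that principle, assuming a hypothetical helper count_char() whose only failure mode is a caller passing a null pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Counts how often c occurs in s. Passing NULL is a programmer error,
   so it is caught by an assertion instead of being reported as a code. */
size_t count_char(const char *s, char c)
{
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (*s == c)
            n++;
    }
    return n;
}
```

The signature stays free of error plumbing, and the assertion documents the contract. Note that assert() compiles to nothing when NDEBUG is defined, so it is not a substitute for checking errors that can happen at run time.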
I like the error-as-return-value way. If you're designing the API and you want to make use of your library as painless as possible, think about these additions:
store all possible error-states in one typedef'ed enum and use it in your lib. Don't just return ints or even worse, mix ints or different enumerations with return-codes.
provide a function that converts errors into something human readable. Can be simple. Just error-enum in, const char* out.
I know this idea makes multithreaded use a bit difficult, but it would be nice if the application programmer could set a global error callback. That way they will be able to put a breakpoint into the callback during bug-hunt sessions.
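The global error callback could be sketched roughly as follows. All names here (myapi_error_t, myapi_set_error_callback, myapi_get_size, count_errors) are hypothetical, and the single process-wide pointer is exactly the part that is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { MYAPI_SUCCESS = 0, MYAPI_INVALID_ARG } myapi_error_t;
typedef void (*myapi_error_cb)(myapi_error_t err, const char *where);

/* One callback for the whole process: simple, but not thread-safe. */
static myapi_error_cb g_error_cb = NULL;

void myapi_set_error_callback(myapi_error_cb cb)
{
    g_error_cb = cb;
}

/* Every failure path in the library goes through this helper. */
static myapi_error_t report(myapi_error_t err, const char *where)
{
    if (err != MYAPI_SUCCESS && g_error_cb != NULL)
        g_error_cb(err, where);
    return err;
}

myapi_error_t myapi_get_size(const int *obj, int *out_size)
{
    if (obj == NULL || out_size == NULL)
        return report(MYAPI_INVALID_ARG, "myapi_get_size");
    *out_size = *obj;
    return MYAPI_SUCCESS;
}

/* Example callback an application might install; a breakpoint goes here. */
int error_count = 0;
void count_errors(myapi_error_t err, const char *where)
{
    assert(err != MYAPI_SUCCESS && where != NULL);
    error_count++;
}
```

The library still returns the error code; the callback is only an extra hook that fires at the point where the error is first detected.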
There's a nice set of slides from CMU's CERT with recommendations for when to use each of the common C (and C++) error handling techniques. One of the best slides is this decision tree:
I would personally change two things about this flowchart.
First, I would clarify that sometimes objects should use return values to indicate errors. If a function only extracts data from an object but doesn't mutate the object, then the integrity of the object itself is not at risk and indicating errors using a return value is more appropriate.
Second, it's not always appropriate to use exceptions in C++. Exceptions are good because they can reduce the amount of source code devoted to error handling, they mostly don't affect function signatures, and they're very flexible in what data they can pass up the callstack. On the other hand, exceptions might not be the right choice for a few reasons:
C++ exceptions have very particular semantics. If you don't want those semantics, then C++ exceptions are a bad choice. An exception must be dealt with immediately after being thrown and the design favors the case where an error will need to unwind the callstack a few levels.
C++ functions that throw exceptions can't later be wrapped to not throw exceptions, at least not without paying the full cost of exceptions anyway. Functions that return error codes can be wrapped to throw C++ exceptions, making them more flexible. C++'s new gets this right by providing a non-throwing variant.
C++ exceptions are relatively expensive but this downside is mostly overblown for programs making sensible use of exceptions. A program simply shouldn't throw exceptions on a codepath where performance is a concern. It doesn't really matter how fast your program can report an error and exit.
Sometimes C++ exceptions are not available. Either they're literally not available in one's C++ implementation, or one's code guidelines ban them.
Since the original question was about a multithreaded context, I think the local error indicator technique (what's described in SirDarius's answer) was underappreciated in the original answers. It's threadsafe, doesn't force the error to be immediately dealt with by the caller, and can bundle arbitrary data describing the error. The downside is that it must be held by an object (or I suppose somehow associated externally) and is arguably easier to ignore than a return code.
I use the first approach whenever I create a library. There are several advantages of using a typedef'ed enum as a return code.
If the function returns a more complicated output such as an array and its length you do not need to create arbitrary structures to return.
rc = func(..., int **return_array, size_t *array_length);
It allows for simple, standardized error handling.
if ((rc = func(...)) != API_SUCCESS) {
/* Error Handling */
}
It allows for simple error handling in the library function.
/* Check for valid arguments */
if (NULL == return_array || NULL == array_length)
return API_INVALID_ARGS;
Using a typedef'ed enum also allows for the enum name to be visible in the debugger. This allows for easier debugging without the need to constantly consult a header file. Having a function to translate this enum into a string is helpful as well.
The most important issue regardless of approach used is to be consistent. This applies to function and argument naming, argument ordering and error handling.
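Put together, the pieces of this answer might look like the sketch below. The function make_range() and the api_rc_t values other than API_SUCCESS and API_INVALID_ARGS are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { API_SUCCESS = 0, API_INVALID_ARGS, API_NO_MEMORY } api_rc_t;

/* Allocates an array holding 0..n-1; the caller owns it and frees it. */
api_rc_t make_range(size_t n, int **return_array, size_t *array_length)
{
    /* Check for valid arguments */
    if (NULL == return_array || NULL == array_length)
        return API_INVALID_ARGS;

    int *a = malloc((n > 0 ? n : 1) * sizeof *a);
    if (NULL == a)
        return API_NO_MEMORY;

    for (size_t i = 0; i < n; i++)
        a[i] = (int)i;

    *return_array = a;
    *array_length = n;
    return API_SUCCESS;
}

/* Usage: test the return code first, then trust the outputs. */
int range_demo(void)
{
    int *a = NULL;
    size_t len = 0;
    api_rc_t rc;

    if ((rc = make_range(4, &a, &len)) != API_SUCCESS)
        return (int)rc;

    assert(len == 4 && a[3] == 3);
    free(a);
    return 0;
}
```

The outputs are written only on success, so a caller that checks rc never reads a half-initialized array or length.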
Returning an error code is the usual approach for error handling in C.
But recently we experimented with the outgoing error pointer approach as well.
It has some advantages over the return value approach:
You can use the return value for more meaningful purposes.
Having to write out that error parameter reminds you to handle the error or propagate it. (You never forget to check the return value of fclose, do you?)
If you use an error pointer, you can pass it down as you call functions. If any of the functions set it, the value won't get lost.
By setting a data breakpoint on the error variable, you can catch where the error first occurred. By setting a conditional breakpoint you can catch specific errors too.
It makes it easier to automate the check of whether you handle all errors. The code convention may force you to name your error pointer err and make it the last argument, so a script can match the string err); and then check whether it's followed by if (*err. In practice we made macros called CER (check err return) and CEG (check err goto), so you don't need to type the check out every time you just want to return on error, and the visual clutter is reduced.
Not all functions in our code have this outgoing parameter, though.
The outgoing parameter is used for cases where you would normally throw an exception.
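The answer does not show how CER is defined, so the macro below is only a guess at its shape. It assumes an integer error type where 0 means "no error" and that it is used in functions returning void. The parse_* functions are likewise invented for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef int err_t;   /* assumption: 0 means "no error" */

/* Hypothetical definition: leave the current void function if *err is set. */
#define CER(err) do { if (*(err) != 0) return; } while (0)

static void parse_header(const char *buf, err_t *err)
{
    if (buf == NULL || buf[0] != '#')
        *err = 1;    /* missing header marker */
}

static void parse_body(const char *buf, err_t *err)
{
    if (buf[1] == '\0')
        *err = 2;    /* empty body */
}

/* The same err pointer is passed down; the first failure is preserved
   because later steps are skipped once it is set. */
void parse(const char *buf, err_t *err)
{
    assert(err != NULL);
    parse_header(buf, err); CER(err);
    parse_body(buf, err);   CER(err);
}
```

Because every call site ends in err); followed by a check, a script can verify mechanically that no error is dropped, as the answer describes.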
Here's a simple program to demonstrate the first 2 bullets of Nils Pipenbrinck's answer here.
His first 2 bullets are:
store all possible error-states in one typedef'ed enum and use it in your lib. Don't just return ints or even worse, mix ints or different enumerations with return-codes.
provide a function that converts errors into something human readable. Can be simple. Just error-enum in, const char* out.
Assume you have written a module named mymodule. First, in mymodule.h, you define your enum-based error codes, and you write some error strings which correspond to these codes. Here I am using an array of C strings (char *), which only works well if your first enum-based error code has value 0, and you don't manipulate the numbers thereafter. If you do use error code numbers with gaps or other starting values, you'll simply have to change from using a mapped C-string array (as I do below) to using a function which uses a switch statement or if / else if statements to map from enum error codes to printable C strings (which I don't demonstrate). The choice is yours.
mymodule.h
/// @brief Error codes for library "mymodule"
typedef enum mymodule_error_e
{
/// No error
MYMODULE_ERROR_OK = 0,
/// Invalid arguments (ex: NULL pointer where a valid pointer is required)
MYMODULE_ERROR_INVARG,
/// Out of memory (RAM)
MYMODULE_ERROR_NOMEM,
/// Make up your error codes as you see fit
MYMODULE_ERROR_MYERROR,
// etc etc
/// Total # of errors in this list (NOT AN ACTUAL ERROR CODE);
/// NOTE: that for this to work, it assumes your first error code is value 0 and you let it naturally
/// increment from there, as is done above, without explicitly altering any error values above
MYMODULE_ERROR_COUNT,
} mymodule_error_t;
// Array of strings to map enum error types to printable strings
// - see important NOTE above!
const char* const MYMODULE_ERROR_STRS[] =
{
"MYMODULE_ERROR_OK",
"MYMODULE_ERROR_INVARG",
"MYMODULE_ERROR_NOMEM",
"MYMODULE_ERROR_MYERROR",
};
// To get a printable error string
const char* mymodule_error_str(mymodule_error_t err);
// Other functions in mymodule
mymodule_error_t mymodule_func1(void);
mymodule_error_t mymodule_func2(void);
mymodule_error_t mymodule_func3(void);
mymodule.c contains my mapping function to map from enum error codes to printable C strings:
mymodule.c
#include <stdio.h>
#include "mymodule.h"
/// @brief Function to get a printable string from an enum error type
/// @param[in] err a valid error code for this module
/// @return A printable C string corresponding to the error code input above, or NULL if an invalid error code
/// was passed in
const char* mymodule_error_str(mymodule_error_t err)
{
const char* err_str = NULL;
// Ensure error codes are within the valid array index range
if (err >= MYMODULE_ERROR_COUNT)
{
goto done;
}
err_str = MYMODULE_ERROR_STRS[err];
done:
return err_str;
}
// Let's just make some empty dummy functions to return some errors; fill these in as appropriate for your
// library module
mymodule_error_t mymodule_func1(void)
{
return MYMODULE_ERROR_OK;
}
mymodule_error_t mymodule_func2(void)
{
return MYMODULE_ERROR_INVARG;
}
mymodule_error_t mymodule_func3(void)
{
return MYMODULE_ERROR_MYERROR;
}
main.c contains a test program to demonstrate calling some functions and printing some error codes from them:
main.c
#include <stdio.h>
#include "mymodule.h"
int main()
{
printf("Demonstration of enum-based error codes in C (or C++)\n");
printf("err code from mymodule_func1() = %s\n", mymodule_error_str(mymodule_func1()));
printf("err code from mymodule_func2() = %s\n", mymodule_error_str(mymodule_func2()));
printf("err code from mymodule_func3() = %s\n", mymodule_error_str(mymodule_func3()));
return 0;
}
Output:
Demonstration of enum-based error codes in C (or C++)
err code from mymodule_func1() = MYMODULE_ERROR_OK
err code from mymodule_func2() = MYMODULE_ERROR_INVARG
err code from mymodule_func3() = MYMODULE_ERROR_MYERROR
References:
You can run this code yourself here: https://onlinegdb.com/ByEbKLupS.
My answer I frequently reference to see this type of error handling: STM32 how to get last reset status
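The answer above mentions, but does not demonstrate, the switch-based mapping needed when error codes have gaps or do not start at 0. A minimal sketch of that variant follows; the numeric values 10 and 20 and the self-check function are assumptions added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum mymodule_error_e
{
    MYMODULE_ERROR_OK = 0,
    MYMODULE_ERROR_INVARG = 10,   /* deliberate gap: an array lookup would break */
    MYMODULE_ERROR_NOMEM = 20,
} mymodule_error_t;

/* Returns the printable name of err, or NULL for an unknown code. */
const char *mymodule_error_str(mymodule_error_t err)
{
    switch (err)
    {
    case MYMODULE_ERROR_OK:     return "MYMODULE_ERROR_OK";
    case MYMODULE_ERROR_INVARG: return "MYMODULE_ERROR_INVARG";
    case MYMODULE_ERROR_NOMEM:  return "MYMODULE_ERROR_NOMEM";
    }
    return NULL;
}

/* Cheap self-check that every named code still has a string. */
void mymodule_error_selfcheck(void)
{
    assert(mymodule_error_str(MYMODULE_ERROR_OK) != NULL);
    assert(mymodule_error_str(MYMODULE_ERROR_INVARG) != NULL);
    assert(mymodule_error_str(MYMODULE_ERROR_NOMEM) != NULL);
}
```

Leaving out a default label lets compilers that warn about unhandled enumerators flag a code that was added to the enum but not to the switch.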
I personally prefer the former approach (returning an error indicator).
Where necessary the return result should just indicate that an error occurred, with another function being used to find out the exact error.
In your getSize() example I'd consider that sizes must always be zero or positive, so returning a negative result can indicate an error, much like UNIX system calls do.
I can't think of any library that I've used that goes for the latter approach with an error object passed in as a pointer. stdio, etc all go with a return value.
The UNIX approach is most similar to your second suggestion. Return either the result or a single "it went wrong" value. For instance, open will return the file descriptor on success or -1 on failure. On failure it also sets errno, an external global integer to indicate which failure occurred.
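A short sketch of that convention on a POSIX system; the wrapper probe_file() is a hypothetical helper, while open(), close() and errno are the real POSIX interfaces.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Returns 0 if path can be opened for reading, otherwise the errno
   value that open() left behind. */
int probe_file(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return errno;   /* read it immediately; later calls may overwrite it */
    close(fd);
    return 0;
}
```

The return value of open() only says that something went wrong; the reason has to be fetched from errno before any other library call gets a chance to change it.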
For what it's worth, Cocoa has also been adopting a similar approach. A number of methods return BOOL, and take an NSError ** parameter, so that on failure they set the error and return NO. Then the error handling looks like:
NSError *error = nil;
if ([myThing doThingError: &error] == NO)
{
// error handling
}
which is somewhere between your two options :-).
Use setjmp.
http://en.wikipedia.org/wiki/Setjmp.h
http://aszt.inf.elte.hu/~gsd/halado_cpp/ch02s03.html
http://www.di.unipi.it/~nids/docs/longjump_try_trow_catch.html
#include <setjmp.h>
#include <stdio.h>
jmp_buf x;
void f()
{
longjmp(x,5); // throw 5;
}
int main()
{
// output of this program is 5.
int i = 0;
if ( (i = setjmp(x)) == 0 )// try{
{
f();
} // } --> end of try{
else // catch(i){
{
switch( i )
{
case 1:
case 2:
default: fprintf( stdout, "error code = %d\n", i); break;
}
} // } --> end of catch(i){
return 0;
}
#include <stdio.h>
#include <setjmp.h>
#define TRY do{ jmp_buf ex_buf__; if( !setjmp(ex_buf__) ){
#define CATCH } else {
#define ETRY } }while(0)
#define THROW longjmp(ex_buf__, 1)
int
main(int argc, char** argv)
{
TRY
{
printf("In Try Statement\n");
THROW;
printf("I do not appear\n");
}
CATCH
{
printf("Got Exception!\n");
}
ETRY;
return 0;
}
When I write programs, during initialization, I usually spin off a thread for error handling, and initialize a special structure for errors, including a lock. Then, when I detect an error, through return values, I enter in the info from the exception into the structure and send a SIGIO to the exception handling thread, then see if I can't continue execution. If I can't, I send a SIGURG to the exception thread, which stops the program gracefully.
I have done a lot of C programming in the past, and I really appreciated the error code return value. But it has several possible pitfalls:
Duplicate error numbers, this can be solved with a global errors.h file.
Forgetting to check the error code, this should be solved with a cluebat and long debugging hours. But in the end you will learn (or you will know that someone else will do the debugging).
I ran into this Q&A a number of times, and wanted to contribute a more comprehensive answer. I think the best way to think about this is how to return errors to the caller, and what you return.
How
There are 3 ways to return information from a function:
Return Value
Out Argument(s)
Out of Band, which includes non-local goto (setjmp/longjmp), file or global scoped variables, the file system, etc.
Return Value
You can only return a single value (object); however, it can be an arbitrarily complex value. Here is an example of an error returning function:
enum error hold_my_beer(void);
One benefit of return values is that it allows chaining of calls for less intrusive error handling:
!hold_my_beer() &&
!hold_my_cigarette() &&
!hold_my_pants() ||
abort();
This is not just about readability, but may also allow processing an array of such function pointers in a uniform way.
Out Argument(s)
You can return more than one object via arguments, but best practice suggests keeping the total number of arguments low (say, <=4):
void look_ma(enum error *e, char *what_broke);
enum error e;
char what_broke[64]; /* caller-provided buffer; the size is illustrative */
look_ma(&e, what_broke);
if(e == FURNITURE) {
reorder(what_broke);
} else if(e == SELF) {
tell_doctor(what_broke);
}
This forces the caller to pass in an object, which may make it more likely that it's being checked. If you have a set of calls all returning errors, and you decide to allocate a new variable for each, it adds some clutter in the caller.
Out of Band
The best known example is probably the (thread-local) errno variable, which the called function sets. It's very easy for the caller to not check this variable, and you only get one, which may be an issue if your function is complicated (for instance, two parts of the function returning the same error code).
With setjmp() you define a place and how you want to handle an int value, and you transfer control to that location via a longjmp(). See Practical usage of setjmp and longjmp in C.
What
Indicator
Code
Object
Callback
Indicator
An error indicator only tells you that there is a problem but nothing about the nature of said problem:
struct foo *f = foo_init();
if(!f) {
/// handle the absence of foo
}
This is the least powerful way for a function to communicate error state; however, it's perfect if the caller cannot respond to the error in a graduated manner anyways.
Code
An error code tells the caller about the nature of the problem, and may allow for a suitable response (from the above). It can be a return value, or like the look_ma() example above an error argument.
Object
With an error object, the caller can be informed about arbitrarily complicated issues. For example, an error code and a suitable human-readable message. It can also inform the caller that multiple things went wrong, or an error per item when processing a collection:
struct collection friends;
enum error *reason = malloc(friends.size * sizeof(enum error));
...
ask_for_favor(friends, reason);
for(int i = 0; i < friends.size; i++) {
if(reason[i] == NOT_FOUND) find(friends.items[i]); /* items: assumed member holding the elements */
}
Instead of pre-allocating the error array, you can also (re)allocate it dynamically as needed of course.
Callback
A callback is the most powerful way to handle errors, as you can tell the function what behavior you would like to see happen when something goes wrong. A callback argument can be added to each function or, if customization is only required per instance of a struct, stored in the struct like this:
struct foo {
...
void (*error_handler)(char *);
};
void default_error_handler(char *message) {
assert(message);
printf("%s", message);
}
void foo_set_error_handler(struct foo *f, void (*eh)(char *)) {
assert(f);
f->error_handler = eh;
}
struct foo *foo_init() {
struct foo *f = malloc(sizeof(struct foo));
if(!f) return NULL;
foo_set_error_handler(f, default_error_handler);
return f;
}
struct foo *f = foo_init();
foo_something(f);
One interesting benefit of a callback is that it can be invoked multiple times, or not at all in the absence of errors, in which case there is no overhead on the happy path.
There is, however, an inversion of control. The calling code does not know if the callback was invoked. As such, it may make sense to use an indicator as well.
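Combining a callback with an indicator might look like the following sketch. The function sum_digits(), the handler signature with a context pointer, and count_rejects() are all assumptions for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef void (*error_handler)(const char *bad_item, void *ctx);

/* Adds up the items that are single decimal digits. Every other item is
   reported through eh (if set). Returns the number of rejected items, so
   the caller has an indicator even though the details went to the callback. */
size_t sum_digits(const char *const *items, size_t n, int *sum,
                  error_handler eh, void *ctx)
{
    assert(items != NULL && sum != NULL);
    size_t rejected = 0;
    *sum = 0;
    for (size_t i = 0; i < n; i++) {
        const char *s = items[i];
        if (s != NULL && isdigit((unsigned char)s[0]) && s[1] == '\0') {
            *sum += s[0] - '0';
        } else {
            rejected++;
            if (eh != NULL)
                eh(s, ctx);
        }
    }
    return rejected;
}

/* Example handler: counts invocations in the int that ctx points to. */
void count_rejects(const char *bad_item, void *ctx)
{
    (void)bad_item;
    ++*(int *)ctx;
}
```

The handler runs once per bad item and not at all on clean input, while the return value tells the calling code whether anything was reported.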
I was pondering this issue recently as well, and wrote up some macros for C that simulate try-catch-finally semantics using purely local return values. Hope you find it useful.
Here is an approach which I think is interesting, while requiring some discipline.
This assumes a handle-type variable is the instance on which all API functions operate.
The idea is that the struct behind the handle stores the previous error as a struct with the necessary data (code, message...), and the user is provided with a function that returns a pointer to this error object. Each operation will update the pointed-to object, so the user can check its status without even calling functions. As opposed to the errno pattern, the error code is not global, which makes the approach thread-safe, as long as each handle is properly used.
Example:
MyHandle * h = MyApiCreateHandle();
/* first call checks for pointer nullity, since we cannot retrieve error code
on a NULL pointer */
if (h == NULL)
return 0;
/* from here h is a valid handle */
/* get a pointer to the error struct that will be updated with each call */
MyApiError * err = MyApiGetError(h);
MyApiFileDescriptor * fd = MyApiOpenFile(h, "/path/to/file.ext");
/* we want to know what can go wrong */
if (err->code != MyApi_ERROR_OK) {
fprintf(stderr, "(%d) %s\n", err->code, err->message);
MyApiDestroy(h);
return 0;
}
MyApiRecord record;
/* here the API could refuse to execute the operation if the previous one
yielded an error, and eventually close the file descriptor itself if
the error is not recoverable */
MyApiReadFileRecord(h, &record, sizeof(record));
/* we want to know what can go wrong, here using a macro checking for failure */
if (MyApi_FAILED(err)) {
fprintf(stderr, "(%d) %s\n", err->code, err->message);
MyApiDestroy(h);
return 0;
}
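On the library side, the handle could keep its error object roughly as sketched below. Only MyApiGetError and MyApi_ERROR_OK appear in the example above; the struct layouts, MyApiInit, MyApiSetValue and MyApi_ERROR_RANGE are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { MyApi_ERROR_OK = 0, MyApi_ERROR_RANGE } MyApiErrorCode;

typedef struct {
    MyApiErrorCode code;
    const char *message;
} MyApiError;

typedef struct {
    MyApiError last_error;   /* updated by every operation on this handle */
    int value;
} MyHandle;

static void set_error(MyHandle *h, MyApiErrorCode code, const char *message)
{
    h->last_error.code = code;
    h->last_error.message = message;
}

void MyApiInit(MyHandle *h)
{
    assert(h != NULL);
    h->value = 0;
    set_error(h, MyApi_ERROR_OK, "ok");
}

/* The returned pointer stays valid for the lifetime of the handle. */
const MyApiError *MyApiGetError(const MyHandle *h)
{
    assert(h != NULL);
    return &h->last_error;
}

/* Hypothetical operation: stores v if it lies within 0..100. */
void MyApiSetValue(MyHandle *h, int v)
{
    assert(h != NULL);
    if (v < 0 || v > 100) {
        set_error(h, MyApi_ERROR_RANGE, "value out of range");
        return;
    }
    h->value = v;
    set_error(h, MyApi_ERROR_OK, "ok");
}
```

Because the error lives inside the handle, two threads working on two different handles never touch the same error object.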
The first approach is better IMHO:
It's easier to write functions that way. When you notice an error in the middle of the function, you just return an error value. In the second approach you need to assign the error value to one of the parameters and then return something... but what would you return? You don't have a correct value and you don't return an error value.
It's more popular, so it will be easier to understand and maintain.
I definitely prefer the first solution :
int size;
if(getObjectSize(h, &size) != MYAPI_SUCCESS) {
// Error handling
}
I would slightly modify it to:
int size;
MYAPI_ERROR rc;
rc = getObjectSize(h, &size);
if (rc != MYAPI_SUCCESS) {
// Error handling
}
In addition, I would never mix a legitimate return value with an error, even if the current scope of the function allows you to do so; you never know which way the function's implementation will go in the future.
And since we are already talking about error handling, I would suggest goto Error; as the error handling code, unless some undo function can be called to handle the error correctly.
What you could do instead of returning your error, and thus forbidding you from returning data with your function, is using a wrapper for your return type:
typedef struct {
enum {SUCCESS, ERROR} status;
union {
int errCode;
MyType value;
} ret;
} MyTypeWrapper;
Then, in the called function:
MyTypeWrapper MYAPIFunction(MYAPIHandle h) {
MyTypeWrapper wrapper;
// [...]
// If there is an error somewhere:
wrapper.status = ERROR;
wrapper.ret.errCode = MY_ERROR_CODE;
// Everything went well:
wrapper.status = SUCCESS;
wrapper.ret.value = myProcessedData;
return wrapper;
}
Please note that with this method, the wrapper is only slightly larger than MyType: it holds the union (as large as the bigger of int and MyType) plus the status field and any alignment padding the compiler adds. You also don't have to pass another argument when you call your function (returnedSize or returnedError in the two methods you presented).
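A small runnable instance of the wrapper idea, using integer division as a stand-in for the real work. The names DivResult, checked_div, div_value and the error constants are assumptions for illustration.

```c
#include <assert.h>
#include <limits.h>

typedef struct {
    enum { DIV_OK, DIV_ERROR } status;
    union {
        int err_code;   /* valid only when status == DIV_ERROR */
        int value;      /* valid only when status == DIV_OK */
    } ret;
} DivResult;

enum { DIV_BY_ZERO = 1, DIV_OVERFLOW = 2 };

DivResult checked_div(int a, int b)
{
    DivResult r;
    if (b == 0) {
        r.status = DIV_ERROR;
        r.ret.err_code = DIV_BY_ZERO;
    } else if (a == INT_MIN && b == -1) {
        r.status = DIV_ERROR;
        r.ret.err_code = DIV_OVERFLOW;
    } else {
        r.status = DIV_OK;
        r.ret.value = a / b;
    }
    return r;
}

/* Reading the value of a failed result is a programmer error. */
int div_value(DivResult r)
{
    assert(r.status == DIV_OK);
    return r.ret.value;
}
```

The caller must look at status before touching the union, which makes it harder to use a result without having considered the error case.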
In addition to what has been said, prior to returning your error code, fire off an assert or similar diagnostic when an error is returned, as it will make tracing a lot easier. The way I do this is to have a customised assert that still gets compiled in at release but only gets fired when the software is in diagnostics mode, with an option to silently report to a log file or pause on screen.
I personally return error codes as negative integers with no_error as zero, but it does leave you with the following possible bug:
if (MyFunc())
DoSomething();
An alternative is have a failure always returned as zero, and use a LastError() function to provide details of the actual error.
EDIT: If you need access only to the last error and you don't work in a multithreaded environment, you can return only true/false (or some kind of #define if you work in C and don't have bool variables), and have a global error buffer that holds the last error:
int getObjectSize(MYAPIHandle h, int* returnedSize);
MYAPI_ERROR LastError;
MYAPI_ERROR* getLastError(void) { return &LastError; }
#define FUNC_SUCCESS 1
#define FUNC_FAIL 0
if(getObjectSize(h, &size) != FUNC_SUCCESS ) {
MYAPI_ERROR* error = getLastError();
// error handling
}
The second approach lets the compiler produce more optimized code: when the address of a variable is passed to a function, the compiler cannot keep that variable's value in a register across subsequent calls to other functions. The completion code is usually used only once, just after the call, whereas the "real" data returned from the call may be used more often.
I prefer error handling in C using the following technique:
struct lnode *insert(char *data, int len, struct lnode *list) {
struct lnode *p, *q;
uint8_t good;
struct {
uint8_t alloc_node : 1;
uint8_t alloc_str : 1;
} cleanup = { 0, 0 };
// allocate node.
p = (struct lnode *)malloc(sizeof(struct lnode));
good = cleanup.alloc_node = (p != NULL);
// good? then allocate str (one extra byte for the terminator)
if (good) {
p->str = (char *)malloc(len + 1);
good = cleanup.alloc_str = (p->str != NULL);
}
// good? copy data and terminate it, since strcmp() is used below
if(good) {
memcpy ( p->str, data, len );
p->str[len] = '\0';
}
// still good? insert in list
if(good) {
if(NULL == list) {
p->next = NULL;
list = p;
} else {
q = list;
// compare against every node, including the last one
good = (strcmp(q->str,p->str) != 0);
while(q->next != NULL && good) {
q = q->next;
// duplicate found--not good
good = (strcmp(q->str,p->str) != 0);
}
if (good) {
p->next = q->next;
q->next = p;
}
}
}
// not-good? cleanup.
if(!good) {
if(cleanup.alloc_str) free(p->str);
if(cleanup.alloc_node) free(p);
}
// good? return list or else return NULL
return (good ? list : NULL);
}
Source: http://blog.staila.com/?p=114
In addition to the other great answers, I suggest that you try to separate the error flag and the error code in order to save one line on each call, i.e.:
if( !doit(a, b, c, &errcode) )
{ /* handle */
/* thine */
/* error */
}
When you have lots of error-checking, this little simplification really helps.
I have seen five main approaches used in error reporting by functions in C:
return value with no error code reporting or no return value
return value that is an error code only
return value that is a valid value or an error code value
return value indicating an error with some way of fetching an error code possibly with error context information
function argument that returns a value with an error code possibly with error context information
In addition to the choice of function error return mechanism there is also the consideration of error code mnemonics and ensuring that the error code mnemonics do not clash with any other error code mnemonics being used. Typically this requires the use of a Three Letter Prefix approach to the naming of mnemonics defining them with #define, enum, or const static int. See this discussion "static const" vs "#define" vs "enum"
There are a couple of different outcomes once an error is detected and that may be a consideration how functions provide error codes and error information. These outcomes are really divided into two camps, recoverable errors and unrecoverable errors:
document the system state and then abort
wait and retry the failed action
notify a human being and request assistance
continue execution in a degraded state
An error type may use more than one of these outcomes depending on the context of the error. For instance a file open that fails because the file doesn't exist may be retried with a different file name or notify a user and ask for assistance or continue execution in a degraded state.
Details on Five Main Approaches
Some functions do not provide an error code. The functions either can't fail or, if they fail, they fail silently. Examples of this type of function are the various is...() character test functions such as isdigit(), which indicates whether a character value is a digit or not. A character value either is or is not a digit or an alphabetic character. Similarly with the strcmp() function, comparing two strings results in a value indicating which one is higher in the collating sequence than the other, should they not be the same.
In some cases an error code is not necessary because a value indicating failure is a valid result. For example the strchr() function from the Standard Library returns a pointer to the searched for character if found in the string to be scanned or NULL if it is not found. In this case a failure to find the character is a valid and useful indicator. A function using strchr() may require the character searched for not be in the string to be successful and finding the character is an error condition.
Other functions do not return an error code but instead report an error through an external mechanism. This is used by most of the math library functions in the Standard Library which require the user to set errno to a value of zero, call the function, and then check that the value of errno is still zero. The range of output values from many of the math functions do not allow a special return value to be used to indicate an error and they do not have an error reporting argument in their interfaces.
Some functions perform an action and return an error code value with one of the possible error code values indicating success and the rest of the range of values indicating an error code. For example a function may return a value of 0 if successful or a positive or negative non-zero value indicating an error with the value returned being the error code.
Some functions may perform an action and return either a value from a range of valid values if successful or a value from a range of invalid values indicating an error code. A simple approach is to use a positive value (0, 1, 2, ...) for valid values and a negative value for error codes allowing a check such as if(status < 0) return error;.
Some functions return a valid value or an invalid value indicating an error requiring the additional step of fetching the error code by some means. For example the fopen() function returns either a pointer to a FILE object or it returns an invalid pointer value of NULL and sets errno to an error code indicating the reason for the failure. A number of Windows API functions that return a HANDLE value to reference a resource may also return a value of INVALID_HANDLE_VALUE and the function GetLastError() is used to obtain the error code. The OPOS Control Objects standard requires an OPOS Control Object to provide two functions, GetResultCode() and GetResultCodeExtended(), to allow for the retrieval of error status information in the event a COM object method call fails.
This same approach is used in other APIs that use a handle or reference to a resource in which there is a range of valid values with one or more values outside of that range used to indicate an error. A mechanism is then provided to fetch additional error information such as an error code.
A similar approach is used with functions that return a boolean value of true to indicate the function was successful or false to indicate an error. The programmer must then examine other data to determine an error code such as GetLastError() with the Windows API.
Some functions have a pointer argument containing the address of a memory area where the called function stores an error code or error information. Where this approach really shines is when, in addition to a simple error code, there is additional error context that helps to pinpoint the error. For example a JSON string parsing function may not only return an error code but also a pointer to where in the JSON string the parsing failed.
I have also seen functions where the function returned an error indicator such as a boolean value with the argument used for error information. I recall that the error information argument could in some cases be NULL indicating the caller didn't want to know the specifics of a failure.
This approach to returning error code or error information seems to be uncommon in my experience though for some reason I think I've seen it used in the Windows API from time to time or perhaps with an XML parser.
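To make the error-information argument concrete, here is a small sketch of a parser that returns a boolean and fills in an optional error struct, including a pointer to where parsing failed. Everything here (parse_digits(), struct parse_error, the PARSE_* codes) is invented for the example; it is far simpler than a JSON parser and ignores overflow.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

enum { PARSE_OK = 0, PARSE_EMPTY = 1, PARSE_BAD_CHAR = 2 };

struct parse_error {
    int code;           /* one of the PARSE_* values */
    const char *where;  /* offending character for PARSE_BAD_CHAR, else NULL */
};

/* Parses a string of decimal digits into *out.
 * err may be NULL if the caller does not care about the details.
 * Overflow is not checked in this sketch. */
bool parse_digits(const char *text, long *out, struct parse_error *err)
{
    const char *p = text;
    long value = 0;
    int code = (*text == '\0') ? PARSE_EMPTY : PARSE_OK;

    while (code == PARSE_OK && *p != '\0') {
        if (isdigit((unsigned char)*p)) {
            value = value * 10 + (*p - '0');
            p++;
        } else {
            code = PARSE_BAD_CHAR;  /* p stays on the bad character */
        }
    }
    if (err != NULL) {
        err->code = code;
        err->where = (code == PARSE_BAD_CHAR) ? p : NULL;
    }
    if (code != PARSE_OK)
        return false;
    *out = value;
    return true;
}
```

The boolean gives the quick success/failure check, and the struct carries the specifics only for callers that ask for them.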
Considerations for multi-threading
When using the approach of an additional error code access through a mechanism as in checking a global such as errno or using a function such as GetLastError() there is the problem of sharing the global across multiple threads.
Modern compilers and libraries deal with this by using thread local storage to ensure that each thread has its own storage that is not shared by other threads. However there is still the issue of multiple functions sharing the same thread local storage location for status information, which may require some accommodation. For instance, a function that uses several files may need to work around the issue that all of the fopen() calls that may fail share a single errno in the same thread.
If the API uses some type of handle or reference then error code storage can be made handle specific. The fopen() function could be wrapped in another function which performs the fopen() and then sets an API control block with both the FILE * returned by the fopen() as well as the value of errno.
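A possible shape for such a wrapper is sketched below. The struct file_ctl type and the file_ctl_open()/file_ctl_close() names are assumptions for illustration. Note also that it is POSIX, not the C standard, that specifies fopen() setting errno on failure.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical control block: one per file, so each failure keeps its own code. */
struct file_ctl {
    FILE *fp;          /* NULL if the open failed */
    int saved_errno;   /* errno captured right after fopen(), 0 on success */
};

struct file_ctl file_ctl_open(const char *path, const char *mode)
{
    struct file_ctl ctl;

    errno = 0;
    ctl.fp = fopen(path, mode);
    /* Capture errno immediately, before any other call can overwrite it. */
    ctl.saved_errno = (ctl.fp == NULL) ? errno : 0;
    return ctl;
}

void file_ctl_close(struct file_ctl *ctl)
{
    if (ctl->fp != NULL) {
        fclose(ctl->fp);
        ctl->fp = NULL;
    }
}
```

With this, a function that opens several files can inspect each control block's saved_errno later, regardless of what happened to the thread's errno in between.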
The approach I prefer
My preference is for an error code to be returned as a function return value so that I can either check it at the point of call or save it for later. In most cases, an error is something to be dealt with immediately which is why I prefer this approach.
An approach I have used with functions is to have the function return a simple struct which contains two members, a status code and the return value. For example:
struct FuncRet {
short sStatus; // status or error code
double dValue; // calculated value
};
struct FuncRet Func(double dInput)
{
struct FuncRet ret = {0, 0}; // ret.sStatus == 0 indicates success
// calculate return value ret.dValue and set
// status code ret.sStatus in the event of an error.
return ret;
}
// ... source code before using our function.
{
struct FuncRet s;
if ((s = Func(aDble)).sStatus == 0) {
// do things with the valid value s.dValue
} else {
// error so deal with the error reported in s.sStatus
}
}
This allows me to do an immediate check for an error. Many functions end up returning a status without returning an actual value as well because the data returned is complex. One or more arguments may be modified by the function but the function doesn't return a value other than a status code.
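For the last case, where the function returns only a status and hands its results back through arguments, a minimal sketch might look like this. The function divide_ints() and its DIV_* codes are invented for the example.

```c
#include <limits.h>
#include <stddef.h>

enum { DIV_OK = 0, DIV_BY_ZERO = 1, DIV_NULL_ARG = 2, DIV_OVERFLOW = 3 };

/* Returns a status code only; the results go out through quot and rem.
 * The output arguments are left untouched when an error is reported. */
int divide_ints(int num, int den, int *quot, int *rem)
{
    if (quot == NULL || rem == NULL)
        return DIV_NULL_ARG;
    if (den == 0)
        return DIV_BY_ZERO;
    if (num == INT_MIN && den == -1)
        return DIV_OVERFLOW;   /* the quotient would not fit in an int */
    *quot = num / den;
    *rem = num % den;
    return DIV_OK;
}
```

The call site can still do the immediate check: if (divide_ints(a, b, &q, &r) != DIV_OK) { /* handle it */ }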

How do you avoid using global variables in inherently stateful programs?

I am currently writing a small game in C and feel like I can't get away from global variables.
For example I am storing the player position as a global variable because it's needed in other files. I have set myself some rules to keep the code clean.
Only use a global variable in the file it's defined in, if possible
Never directly change the value of a global from another file (reading from another file using extern is okay)
So for example graphics settings would be stored as file scope variables in graphics.c. If code in other files wants to change the graphics settings they would have to do so through a function in graphics.c like graphics_setFOV(float fov).
Do you think those rules are sufficient for avoiding global variable hell in the long term?
How bad are file scope variables?
Is it okay to read variables from other files using extern?
Typically, this kind of problem is handled by passing around a shared context:
graphics_api.h
#ifndef GRAPHICS_API
#define GRAPHICS_API
typedef void *HANDLE;
HANDLE init_graphics(void);
void destroy_graphics(HANDLE handle);
void use_graphics(HANDLE handle);
#endif
graphics.c
#include <stdio.h>
#include <stdlib.h>
#include "graphics_api.h"
typedef struct {
int width;
int height;
} CONTEXT;
HANDLE init_graphics(void) {
CONTEXT *result = malloc(sizeof(CONTEXT));
if (result) {
result->width = 640;
result->height = 480;
}
return (HANDLE) result;
}
void destroy_graphics(HANDLE handle) {
CONTEXT *context = (CONTEXT *) handle;
if (context) {
free(context);
}
}
void use_graphics(HANDLE handle) {
CONTEXT *context = (CONTEXT *) handle;
if (context) {
printf("width = %5d\n", context->width);
printf("height = %5d\n", context->height);
}
}
main.c
#include <stdio.h>
#include "graphics_api.h"
int main(void) {
HANDLE handle = init_graphics();
if (handle) {
use_graphics(handle);
destroy_graphics(handle);
}
return 0;
}
Output
width = 640
height = 480
Hiding the details of the context behind a void pointer keeps the user from directly accessing or changing the data it points to, because code outside graphics.c never sees the definition of CONTEXT.
How do you avoid using global variables in inherently stateful programs?
By passing arguments...
// state.h
/// state object:
struct state {
int some_value;
};
/// Initializes state
/// @return zero on success
int state_init(struct state *s);
/// Destroys state
/// @return zero on success
int state_fini(struct state *s);
/// Does some operation with state
/// @return zero on success
int state_set_value(struct state *s, int new_value);
/// Retrieves the stored value from state
/// @return zero on success
int state_get_value(struct state *s, int *value);
// state.c
#include "state.h"
int state_init(struct state *s) {
s->some_value = -1;
return 0;
}
int state_fini(struct state *s) {
// add free() etc. if needed here
// call fini of other objects here
return 0;
}
int state_set_value(struct state *s, int value) {
if (value < 0) {
return -1; // ERROR - invalid argument
// you may return EINVAL here
}
s->some_value = value;
return 0; // success
}
int state_get_value(struct state *s, int *value) {
if (s->some_value < 0) { // value not set yet
return -1;
}
*value = s->some_value;
return 0;
}
// main.c
#include "state.h"
#include <stdlib.h>
#include <stdio.h>
int main() {
struct state state; // local variable
int err = state_init(&state);
if (err) abort();
int value;
err = state_get_value(&state, &value);
if (err != 0) {
printf("Getting value errored: %d\n", err);
}
err = state_set_value(&state, 50);
if (err) abort();
err = state_get_value(&state, &value);
if (err) abort();
printf("Current value is: %d\n", value);
err = state_fini(&state);
if (err) abort();
}
The only case where global variables have to be used (preferably just a single pointer to some stack variable anyway) is signal handlers. The standard way is to only increment a single global variable of type volatile sig_atomic_t inside the signal handler and do nothing else, then execute all signal-handling logic from the normal flow in the rest of the code by checking the value of that variable. On POSIX systems, the other asynchronous notifications from the kernel that take a sigevent structure, like timer_create, can pass arguments to the notified function through the members of union sigval.
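The signal handler pattern could be sketched roughly like this, using only standard C. The names pending_interrupts, on_interrupt(), install_interrupt_counter() and take_pending_interrupts() are made up for the example.

```c
#include <signal.h>

/* The one unavoidable global: touched by the handler, polled by normal code. */
static volatile sig_atomic_t pending_interrupts = 0;

/* The handler does nothing except bump the counter. */
static void on_interrupt(int signo)
{
    (void)signo;
    pending_interrupts++;
}

/* Returns 0 on success, -1 if the handler could not be installed. */
int install_interrupt_counter(void)
{
    return (signal(SIGINT, on_interrupt) == SIG_ERR) ? -1 : 0;
}

/* Called from the normal program flow (e.g. once per main-loop iteration).
 * Returns how many interrupts were seen since the last call.
 * Note: a signal arriving between the read and the reset is lost in this
 * simple sketch; production code may prefer sigaction() and signal masking. */
int take_pending_interrupts(void)
{
    int n = pending_interrupts;
    pending_interrupts = 0;
    return n;
}
```

All real work then happens in ordinary code that calls take_pending_interrupts() and reacts when the result is non-zero.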
Do you think those rules are sufficient for avoiding global variable hell in the long term?
Subjectively: no. I believe that a potentially uneducated programmer has too much freedom in creating global variables given the first rule. In complex programs I would use a hard rule: Do not use global variables. If, after researching all other approaches, every other possibility has been exhausted and you have to use a global variable, make sure the global variables leave the smallest possible memory footprint.
In simple short programs I wouldn't care much.
How bad are file scope variables?
This is opinion based - there are good cases where projects use many global variables. I believe that topic is exhausted in "Are global variables bad?" and numerous other internet resources.
Is it okay to read variables from other files using extern?
Yes, it's ok.
There are no "hard rules" and each project has its own rules. I also recommend reading the c2 wiki page "Global Variables Are Bad".
The first thing you have to ask yourself is: Just why did the programming world come to loathe global variables? Obviously, as you noted, the way to model a global state is essentially a global (set of) variable(s). So what's the problem with that?
The Problem
All parts of the program have access to that state. The whole program becomes tightly coupled. Global variables violate the prime directive in programming, divide and conquer. Once all functions operate on the same data you can as well do away with the functions: They are no longer logical separations of concern but degrade to a notational convenience to avoid large files.
Write access is worse than read access: You'll have a hard time finding out just why on earth the state is unexpected at a certain point; the change can have happened anywhere. It is tempting to take shortcuts: "Ah, we can make the state change right here instead of passing a computation result back up three layers to the caller; that makes the code much smaller."
Even read access can be used to cheat and e.g. change behavior of some deep-down code depending on some global information: "Ah, we can skip rendering, there is no display yet!" A decision which should not be made in the rendering code but at top level. What if top level renders to a file!?
This creates both a debugging and a development/maintenance nightmare. If every piece of the code potentially relies on the presence and semantics of certain variables — and can change them! — it becomes exponentially harder to debug or change the program. The code agglomerating around the global data is like a cast, or perhaps a Boa Constrictor, which starts to immobilize and strangle your program.
Such programming can be avoided with (self-)discipline, but imagine a large project with many teams! It's much better to "physically" prevent access. Not coincidentally all programming languages after C, even if they are otherwise fundamentally different, come with improved modularization.
So what can we do?
The solution is indeed to pass parameters to functions, as KamilCuk said; but each function should only get the information it legitimately needs. Of course it is best if the access is read-only and the result is a return value: Pure functions cannot change state at all and thus perfectly separate concerns.
But simply passing a pointer to the global state around does not cut the mustard: That's only a thinly veiled global variable.
Instead, the state should be separated into sub-states. Only top-level functions (which typically do not do much themselves but mostly delegate) have access to the overall state and hand sub-states to the functions they call. Third-tier functions get sub-sub states, etc. The corresponding implementation in C is a nested struct; pointers to the members — const whenever possible — are passed to functions which therefore cannot see, let alone alter, the rest of the global state. Separation of concerns is thus guaranteed.
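A small sketch of that nested-struct layout follows. The struct and function names (game_state, player, graphics, player_move(), and so on) are assumptions chosen to match the game in the question, not code from it.

```c
/* Sub-states: each one is all that a lower-level function ever sees. */
struct player   { float x, y; int health; };
struct graphics { int width, height; float fov; };

/* Overall state: only top-level functions receive a pointer to this. */
struct game_state {
    struct player   player;
    struct graphics graphics;
};

/* May change the player, but cannot see the graphics settings at all. */
void player_move(struct player *p, float dx, float dy)
{
    p->x += dx;
    p->y += dy;
}

/* Read-only access: const makes accidental writes a compile error. */
int player_is_alive(const struct player *p)
{
    return p->health > 0;
}

float graphics_aspect(const struct graphics *g)
{
    return (float)g->width / (float)g->height;
}

/* Top-level function: mostly delegates, handing out sub-states. */
void game_tick(struct game_state *game)
{
    if (player_is_alive(&game->player))
        player_move(&game->player, 1.0f, 0.0f);
}
```

Because player_move() only receives a struct player *, there is no way for it to reach game->graphics, which is the "physical" prevention of access argued for above.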

Using GOTO for a FSM in C

I am creating a finite state machine in C.
I learned FSMs from the hardware point of view (HDL). So I'm used to a switch with one case per state.
I also like to apply the Separation of Concerns concept when programming.
I mean I'd like to get this flow:
Calculate the next state depending on the current state and input flags
Validate this next state (in case the user requests a transition that is not allowed)
Process the next state when it is allowed
As a start I implemented 3 functions:
static e_InternalFsmStates fsm_GetNextState();
static bool_t fsm_NextStateIsAllowed(e_InternalFsmStates nextState);
static void fsm_ExecuteNewState(e_InternalFsmStates);
At the moment they all contain a big switch-case which is the same:
switch (FSM_currentState) {
case FSM_State1:
[...]
break;
case FSM_State2:
[...]
break;
default:
[...]
break;
}
Now that it works, I'd like to improve the code.
I know that in the 3 functions I'll execute the same branch of the switch.
So I am thinking to use gotos in this way:
//
// Compute next state
//
switch (FSM_currentState) {
case FSM_State1:
next_state = THE_NEXT_STATE;
goto VALIDATE_FSM_State1_NEXT_STATE;
case FSM_State2:
next_state = THE_NEXT_STATE;
goto VALIDATE_FSM_State2_NEXT_STATE;
[...]
default:
[...]
goto ERROR;
}
//
// Validate next state
//
VALIDATE_FSM_State1_NEXT_STATE:
// Some code to Set stateIsValid to TRUE/FALSE;
if (stateIsValid == TRUE)
goto EXECUTE_STATE1;
else
goto ERROR;
VALIDATE_FSM_State2_NEXT_STATE:
// Some code to Set stateIsValid to TRUE/FALSE;
if (stateIsValid == TRUE)
goto EXECUTE_STATE2;
else
goto ERROR;
//
// Execute next state
//
EXECUTE_STATE1:
// Do what I need for state1
goto END;
EXECUTE_STATE2:
// Do what I need for state2
goto END;
//
// Error
//
ERROR:
// Error handling
goto END;
END:
return; // End of function
Of course, I could do the 3 parts (calculate, validate and process the next state) in a single switch case. But for code readability and code reviews, I feel like it will be easier to separate them.
Finally my question, is it dangerous to use GOTOs in this way?
Would you have any advice when using FSM like that?
Thank you for your comments!
After reading the answers and comments below, here is what I am going to try:
e_FSM_InternalStates nextState = FSM_currentState;
bool_t isValidNextState;
//
// Compute and validate next state
//
switch (FSM_currentState) {
case FSM_State1:
if (FSM_inputFlags.flag1 == TRUE)
{
nextState = FSM_State2;
}
[...]
isValidNextState = fsm_validateState1Transition(nextState);
break;
case FSM_State2:
if (FSM_inputFlags.flag2 == TRUE)
{
nextState = FSM_State3;
}
[...]
isValidNextState = fsm_validateState2Transition(nextState);
break;
}
//
// If nextState is invalid go to Error
//
if (isValidNextState == FALSE) {
nextState = FSM_StateError;
}
//
// Execute next state
//
switch (nextState) {
case FSM_State1:
// Execute State1
[...]
break;
case FSM_State2:
// Execute State2
[...]
break;
case FSM_StateError:
// Execute Error
[...]
break;
}
FSM_currentState = nextState;
While goto has its benefits in C, it should be used sparingly and with extreme caution. What you intend is not a recommendable use case.
Your code will be less maintainable and more confusing. switch/case is actually a kind of "calculated" goto (that's why there are case labels).
You are basically thinking about this the wrong way. For a state machine, you should first verify the input, then calculate the next state, then the output. There are various ways to do so, but it is often a good idea to use two switches and, possibly, a single error-handling label or an error flag:
bool error_flag = false;
while ( run_fsm ) {
switch ( current_state ) {
case STATE1:
if ( input1 == 1 )
next_state = STATE2;
...
else
goto error_handling; // use goto
error_flag = true; // or the error-flag (often better)
break;
...
}
if ( error_flag )
break;
switch ( next_state ) {
case STATE1:
output3 = 2;
// if outputs depend on inputs, similar to the upper `switch`
break;
...
}
current_state = next_state;
}
error_handling:
...
This way you are transitioning and verifying input at once. This makes sense, as you have to evaluate the inputs anyway to set the next state, so invalid input just falls out of the tests naturally.
An alternative is to have an output_state and state variable instead of next_state and current_state. In the first switch you set output_state and state, the second is switch ( output_state ) ....
If the single cases become too long, you should use functions to determine the next_state and/or output_state/outputs. It depends very much on the FSM (number of inputs, outputs, states, complexity, e.g. one-hot vs. "encoded"; if you are familiar with HDL, you will know).
If you need more complex error-handling (e.g. recover) inside the loop, leave the loop as-is and add an outer loop, possibly change the error-flag to an error-code and add another switch for it in the outer loop. Depending on complexity, pack the inner loop into its own function, etc.
Sidenote: The compiler might very well optimize the structured approach (without goto) to the same/similar code as with the goto
Whether it is "dangerous" is probably somewhat a matter of opinion. The usual reason people say to avoid GOTO is that it tends to lead to spaghetti code that's hard to follow. Is that an absolute rule? Probably not, but I think it's definitely fair to say that it is the trend. Secondarily, most programmers at this point are trained to believe that GOTO is bad, so, even if it's not in some case, you may run into some level of maintainability issue with other people coming into the project later.
How much risk you have in your case, probably depends on how big of a chunk of code you're going to have under those state labels and how sure you are that it won't change much. More code (or potential for large revisions), means more risk. In addition to just straight questions of readability, you'll have increased chances for assignments to variables interfering between cases or being dependent on the path you took to get to a certain state. Using functions helps with this (in many cases) by creating local scope for variables.
All in all, I would recommend avoiding the GOTO.
You don't really need to use switch-case; the compiler will typically optimize it into machine code with a jump table anyway. Switch-cases for state machines tend to be somewhat hard to read, especially the more complex ones.
The spaghetti-gotos are unacceptable and bad programming practice: there are a few valid uses of goto, this is not one of them.
Instead, consider having a one-line state machine which looks like:
state = STATE_MACHINE[state]();
Here is an answer of mine (taken from the electrical engineering site, it pretty much applies universally) which is based on a function-pointer lookup table.
typedef enum
{
STATE_S1,
STATE_S2,
...
STATE_N // the number of states in this state machine
} state_t;
typedef state_t (*state_func_t)(void);
state_t do_state_s1 (void);
state_t do_state_s2 (void);
static const state_func_t STATE_MACHINE [STATE_N] =
{
&do_state_s1,
&do_state_s2,
...
};
int main(void)
{
state_t state = STATE_S1;
while (1)
{
state = STATE_MACHINE[state]();
}
}
state_t do_state_s1 (void)
{
state_t result = STATE_S1;
// stuff
if (...)
result = STATE_S2;
return result;
}
state_t do_state_s2 (void)
{
state_t result = STATE_S2;
// other stuff
if (...)
result = STATE_S1;
return result;
}
You can easily modify the function signatures to contain an error code as well, such as for example:
typedef err_t (*state_func_t)(state_t*);
with functions as
err_t do_state_s1 (state_t* state);
in which case the caller would end up as:
error = STATE_MACHINE[state](&state);
if(error != NO_ERROR)
{
// handle errors here
}
Leave all error handling to the caller as shown in the above example.
My rule of thumb is to use GOTOs only to jump forward in the code, but never backwards. In the end this boils down to using GOTO only for exception handling, which otherwise doesn't exist in C.
In your particular case I would absolutely not recommend the use of GOTO.
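For contrast, the forward-only, error-handling use of goto mentioned in the rule of thumb usually looks something like the sketch below. The function duplicate_pair() is invented for the example; the point is the single cleanup label that every failure path jumps forward to.

```c
#include <stdlib.h>
#include <string.h>

/* Copies two strings into freshly allocated buffers.
 * Returns 0 on success (caller must free both results), -1 on failure. */
int duplicate_pair(const char *a, const char *b, char **out_a, char **out_b)
{
    char *copy_a = NULL;
    char *copy_b = NULL;
    int status = -1;

    copy_a = malloc(strlen(a) + 1);
    if (copy_a == NULL)
        goto cleanup;               /* forward jump only */

    copy_b = malloc(strlen(b) + 1);
    if (copy_b == NULL)
        goto cleanup;               /* copy_a is released below */

    strcpy(copy_a, a);
    strcpy(copy_b, b);
    *out_a = copy_a;
    *out_b = copy_b;
    copy_a = NULL;                  /* ownership moved to the caller */
    copy_b = NULL;
    status = 0;

cleanup:
    free(copy_a);                   /* free(NULL) is a no-op */
    free(copy_b);
    return status;
}
```

There is exactly one label, it sits at the end of the function, and control never jumps backwards, which is quite different from jumping between state labels.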

TCL_LINK_STRING causing segmentation fault (core dumped)

I'm trying to share a variable between C and Tcl. The problem is that when I try to read the variable in the C thread, it causes a segmentation fault. I'm not sure this is the right way to do it, but it seems to work for ints. The segmentation fault happens on the line where I try to print "Var", but what I actually want is to read the variable and perform the corresponding action when it changes.
Here is the C code that I'm using
void mode_service(ClientData clientData) {
while(1) {
char* Var = (char *) clientData;
printf("%s\n", Var);
usleep(100000); //100ms
}
}
static int mode_thread(ClientData cdata, Tcl_Interp *interp, int objc, Tcl_Obj *const objv[]) {
Tcl_ThreadId id;
ClientData limitData;
limitData = cdata;
id = 0;
Tcl_CreateThread(&id, mode_service, limitData, TCL_THREAD_STACK_DEFAULT, TCL_THREAD_NOFLAGS);
printf("Tcl_CreateThread id = %d\n", (int) id);
// Wait thread process, before returning to TCL prog
int i, aa;
for (i=0 ; i<100000; i++) {aa = i;}
// Return thread ID to tcl prog to allow mutex use
Tcl_SetObjResult(interp, Tcl_NewIntObj((int)id));
printf("returning\n");
return TCL_OK;
}
int DLLEXPORT Modemanager_Init(Tcl_Interp *interp){
if (Tcl_InitStubs(interp, TCL_VERSION, 0) == NULL) {
return TCL_ERROR;
}
if (Tcl_PkgProvide(interp, "PCIe", "1.0") == TCL_ERROR) {
return TCL_ERROR;
}
// Create global Var
int *sharedPtr=NULL;
//sharedPtr = sharedPtr = (char *) Tcl_Alloc(sizeof(char));
Tcl_LinkVar(interp, "mode", (char *) &sharedPtr, TCL_LINK_STRING);
Tcl_CreateObjCommand(interp, "mode_thread", mode_thread, sharedPtr, NULL);
return TCL_OK;
}
In the Tcl code, I'm changing the variable mode whenever the user presses a button, for example:
set mode "Idle"
button .startSamp -text "Sample Start" -width 9 -height 3 -background $btnColor -relief flat -state normal -command {set mode "Sampling"}
set threadId [mode_thread]
puts "Created thread $threadId, waiting"
Your code is a complete mess! You need to decide what you are doing and then do just that. In particular, you are using Tcl_LinkVar so you need to decide what sort of variable you are linking to. If you get a mismatch between the storage, the C access pattern and the declared semantic type, you'll get crashes.
Because your code is in too complicated a mess for me to figure out exactly what you want to do, I'll illustrate with less closely related examples. You'll need to figure out from them how to change things in your code to get the result you need.
Linking Integer Variables
Let's do the simple case: a global int variable (declared outside any function).
int sharedVal;
You want your C code to read that variable and get the value. Easy! Just read it as it is in scope. You also want Tcl code to be able to write to that variable. Easy! In the package initialization function, put this:
Tcl_LinkVar(interp /* == the Tcl interpreter context */,
"sharedVal" /* == the Tcl name */,
(char *) &sharedVal /* == pointer to C variable */,
TCL_LINK_INT /* == what is it! An integer */);
Note that after that (until you Tcl_UnlinkVar) whenever Tcl code reads from the Tcl variable, the current value will be fetched from the C variable and converted.
If you want that variable to be on the heap, you then do:
int *sharedValPtr = malloc(sizeof(int));
C code accesses using *sharedValPtr, and you bind to Tcl with:
Tcl_LinkVar(interp /* == the Tcl interpreter context */,
"sharedVal" /* == the Tcl name */,
(char *) sharedValPtr /* == pointer to C variable */,
TCL_LINK_INT /* == what is it! An integer */);
Linking String Variables
There's a bunch of other semantic types as well as TCL_LINK_INT (see the documentation for a list) but they all follow that pattern except for TCL_LINK_STRING. With that, you do:
char *sharedStr = NULL;
Tcl_LinkVar(interp, "sharedStr", (char *) &sharedStr, TCL_LINK_STRING);
You also need to be aware that the string will always be allocated with Tcl_Alloc (which is substantially faster than most system memory allocators for typical Tcl memory usage patterns) and not with any other memory allocator, and so will also always be deallocated with Tcl_Free. Practically, that means if you set the string from the C side, you must use Tcl_Alloc to allocate the memory.
Posting Update Notifications
The final piece to note is that when you set the variable from the C side but want Tcl to notice that the change has happened (e.g., because a trace has been set or because you've surfaced the value in a Tk GUI), you should call Tcl_UpdateLinkedVar to let Tcl know that a change has happened that it should pay attention to. If you never use traces (or Tk GUIs, or the vwait command) to watch the variable for updates, you can ignore this API call.
Donal's answer is correct, but I'll try to show you what you did with your ClientData.
To clarify: All (or almost all, I don't know) Tcl functions that take a function pointer also take a parameter of type ClientData that is passed to your function when Tcl calls it.
Let's take a look at this line:
Tcl_CreateObjCommand(interp, "mode_thread", mode_thread, NULL, NULL);
// ------------------------------------------------------^^^^
You always pass NULL as ClientData to the mode_thread function.
In the mode_thread function you use the passed ClientData (NULL) to pass it as ClientData to the new Thread:
limitData = cdata;
// ...
Tcl_CreateThread(&id, mode_service, limitData, TCL_THREAD_STACK_DEFAULT, TCL_THREAD_NOFLAGS);
In the mode_service function you use the ClientData (which is still NULL) as pointer to a char array:
char* Var = (char *) clientData;
Which is a pointer to the address 0x00.
And then you tell printf to dereference this NULL pointer:
printf("%s\n", Var);
Which obviously crashes your program.

How to Pass Simple, Anonymous Functions as Parameters in C

I'm sure some variation of this question has been asked before but all other, similar questions on SO seem to be much more complex, involving passing arrays and other forms of data. My scenario is much simpler so I hope there is a simple/elegant solution.
Is there a way that I can create an anonymous function, or pass a line of code as a function pointer to another function?
In my case, I have a series of diverse operations. Before and after each line of code, there are tasks I want to accomplish, that never change. Instead of duplicating the beginning code and ending code, I'd like to write a function that takes a function pointer as a parameter and executes all of the code in the necessary order.
My problem is that it's not worth defining 30 functions for each operation since they are each one line of code. If I can't create an anonymous function, is there a way that I can simplify my C code?
If my request isn't entirely clear, here's a bit of pseudo-code for clarification. My code is much more meaningful than this but the code below gets the point across.
void Tests()
{
//Step #1
printf("This is the beginning, always constant.");
something_unique = a_var * 42; //This is the line I'd like to pass as an anon-function.
printf("End code, never changes");
a_var++;
//Step #2
printf("This is the beginning, always constant.");
a_diff_var = "arbitrary"; //This is the line I'd like to pass as an anon-function.
printf("End code, never changes");
a_var++;
...
...
//Step #30
printf("This is the beginning, always constant.");
var_30 = "Yup, still executing the same code around a different operation. Would be nice to refactor..."; //This is the line I'd like to pass as an anon-function.
printf("End code, never changes");
a_var++;
}
Not in the traditional sense of anonymous functions, but you can macro it:
#define do_something(blah) do {\
printf("This is the beginning, always constant.");\
blah;\
printf("End code, never changes");\
a_var++;\
} while (0)
Then it becomes
do_something(something_unique = a_var * 42);
No, you cannot. Anonymous functions are only available in functional languages (and languages with functional subsets), and as we all know, c is dysfunctional ;^)
In C and pre-0x C++, no.
In C++0x (since standardized as C++11), yes, using lambda functions.
The best way to simplify your code would probably be to put a for loop around a switch statement.
int a_var;
for ( a_var = 0; a_var <= 30; a_var++ )
{
starteroperations();
switch (a_var)
{
case 0:
operation0(); break;
case ...:
operationx(); break;
case 30:
...
}
closingoperations();
}
If you can use Clang, you can take advantage of blocks. To learn blocks, you can use Apple's documentation, Clang's block language specification and implementation notes, and Apple's proposal to the ISO C working group to add blocks to the standard C language, as well as a ton of blog posts.
Using blocks, you could write:
/* Block variables are declared like function pointers
* but use ^ ("block pointer") instead of * ("normal pointer"). */
void (^before)(void) = ^void (void) { puts("before"); };
/* Blocks infer the return type, so you don't need to declare it
* in the block definition. */
void (^after)(void) = ^(void) { puts("after"); };
/* The default arguments are assumed to be void, so you could even
* just define after as
*
* ^{ puts("after"); };
*/
before();
foo = bar + baz*kablooie;
after();
This example gives the anonymous blocks names by assigning to a block variable. You can also define and call a block directly:
^{ puts("!"); } ();
/*| definition | invocation of anonymous function |*/
This also makes defining "struct-objects" (OOP in C using structs) very simple.
GCC supports inner/nested functions as an extension to standard C (Clang does not implement this extension). This would let you define the function immediately before taking its address, which might be an alternative if your control flow structure allows it: inner function pointers cannot be allowed to escape from their immediate scope. As the GCC docs say:
If you try to call the nested function through its address after the containing function has exited, all hell will break loose. If you try to call it after a containing scope level has exited, and if it refers to some of the variables that are no longer in scope, you may be lucky, but it's not wise to take the risk. If, however, the nested function does not refer to anything that has gone out of scope, you should be safe.
Using nested functions, you could write:
/* Nested functions are defined just like normal functions.
* The difference is that they are not defined at "file scope"
* but instead are defined inside another function. */
void before(void) { puts("before"); };
void after(void) { puts("after"); };
before();
foo = bar + baz*kablooie;
after();
Either you go the case way suggested by @dcpomero, or you do the following:
typedef void job(int);
job test1; void test1(int a_var) { something_unique = a_var * 42; }
job test2; void test2(int a_var) { a_diff_var = "arbitrary"; }
job test3; void test3(int a_var) { var_30 = "Yup, still executing the same code around a different operation. Would be nice to refactor..."; }
job * tests[] = { test1, test2, test3, testn };
void Tests()
{
int i;
for (i=0; i < sizeof tests/sizeof tests[0]; i++) {
printf("This is the beginning, always constant.");
tests[i](a_var);
printf("End code, never changes");
a_var++;
}
}

Resources