Is there a typical state machine implementation pattern? - c

We need to implement a simple state machine in C.
Is a standard switch statement the best way to go?
We have a current state (state) and a trigger for the transition.
switch(state)
{
case STATE_1:
state = DoState1(transition);
break;
case STATE_2:
state = DoState2(transition);
break;
}
...
DoState2(int transition)
{
// Do State Work
...
if(transition == FROM_STATE_2) {
// New state when doing STATE 2 -> STATE 2
}
if(transition == FROM_STATE_1) {
// New State when moving STATE 1 -> STATE 2
}
return new_state;
}
Is there a better way for simple state machines
EDIT:
For C++, I think the Boost Statechart library might be the way to go. However, it does not help with C. Lets concentrate on the C use case.

I prefer to use a table driven approach for most state machines:
typedef enum { STATE_INITIAL, STATE_FOO, STATE_BAR, NUM_STATES } state_t;
typedef struct instance_data instance_data_t;
typedef state_t state_func_t( instance_data_t *data );
state_t do_state_initial( instance_data_t *data );
state_t do_state_foo( instance_data_t *data );
state_t do_state_bar( instance_data_t *data );
state_func_t* const state_table[ NUM_STATES ] = {
do_state_initial, do_state_foo, do_state_bar
};
state_t run_state( state_t cur_state, instance_data_t *data ) {
return state_table[ cur_state ]( data );
};
int main( void ) {
state_t cur_state = STATE_INITIAL;
instance_data_t data;
while ( 1 ) {
cur_state = run_state( cur_state, &data );
// do other program logic, run other state machines, etc
}
}
This can of course be extended to support multiple state machines, etc. Transition actions can be accommodated as well:
typedef void transition_func_t( instance_data_t *data );
void do_initial_to_foo( instance_data_t *data );
void do_foo_to_bar( instance_data_t *data );
void do_bar_to_initial( instance_data_t *data );
void do_bar_to_foo( instance_data_t *data );
void do_bar_to_bar( instance_data_t *data );
transition_func_t * const transition_table[ NUM_STATES ][ NUM_STATES ] = {
{ NULL, do_initial_to_foo, NULL },
{ NULL, NULL, do_foo_to_bar },
{ do_bar_to_initial, do_bar_to_foo, do_bar_to_bar }
};
state_t run_state( state_t cur_state, instance_data_t *data ) {
state_t new_state = state_table[ cur_state ]( data );
transition_func_t *transition =
transition_table[ cur_state ][ new_state ];
if ( transition ) {
transition( data );
}
return new_state;
};
The table driven approach is easier to maintain and extend and simpler to map to state diagrams.

You might have seen my answer to another C question where I mentioned FSM! Here is how I do it:
FSM {
STATE(x) {
...
NEXTSTATE(y);
}
STATE(y) {
...
if (x == 0)
NEXTSTATE(y);
else
NEXTSTATE(x);
}
}
With the following macros defined
#define FSM
#define STATE(x) s_##x :
#define NEXTSTATE(x) goto s_##x
This can be modified to suit the specific case. For example, you may have a file FSMFILE that you want to drive your FSM, so you could incorporate the action of reading next char into the the macro itself:
#define FSM
#define STATE(x) s_##x : FSMCHR = fgetc(FSMFILE); sn_##x :
#define NEXTSTATE(x) goto s_##x
#define NEXTSTATE_NR(x) goto sn_##x
now you have two types of transitions: one goes to a state and read a new character, the other goes to a state without consuming any input.
You can also automate the handling of EOF with something like:
#define STATE(x) s_##x : if ((FSMCHR = fgetc(FSMFILE) == EOF)\
goto sx_endfsm;\
sn_##x :
#define ENDFSM sx_endfsm:
The good thing of this approach is that you can directly translate a state diagram you draw into working code and, conversely, you can easily draw a state diagram from the code.
In other techniques for implementing FSM the structure of the transitions is buried in control structures (while, if, switch ...) and controlled by variables value (tipically a state variable) and it may be a complex task to relate the nice diagram to a convoluted code.
I learned this technique from an article appeared on the great "Computer Language" magazine that, unfortunately, is no longer published.

In Martin Fowler's UML Distilled, he states (no pun intended) in Chapter 10 State Machine Diagrams (emphasis mine):
A state diagram can be implemented in three main ways: nested switch, the State pattern, and
state tables.
Let's use a simplified example of the states of a mobile phone's display:
Nested switch
Fowler gave an example of C# code, but I've adapted it to my example.
public void HandleEvent(PhoneEvent anEvent) {
switch (CurrentState) {
case PhoneState.ScreenOff:
switch (anEvent) {
case PhoneEvent.PressButton:
if (powerLow) { // guard condition
DisplayLowPowerMessage(); // action
// CurrentState = PhoneState.ScreenOff;
} else {
CurrentState = PhoneState.ScreenOn;
}
break;
case PhoneEvent.PlugPower:
CurrentState = PhoneState.ScreenCharging;
break;
}
break;
case PhoneState.ScreenOn:
switch (anEvent) {
case PhoneEvent.PressButton:
CurrentState = PhoneState.ScreenOff;
break;
case PhoneEvent.PlugPower:
CurrentState = PhoneState.ScreenCharging;
break;
}
break;
case PhoneState.ScreenCharging:
switch (anEvent) {
case PhoneEvent.UnplugPower:
CurrentState = PhoneState.ScreenOff;
break;
}
break;
}
}
State pattern
Here's an implementation of my example with the GoF State pattern:
State Tables
Taking inspiration from Fowler, here's a table for my example:
Source State Target State Event Guard Action
--------------------------------------------------------------------------------------
ScreenOff ScreenOff pressButton powerLow displayLowPowerMessage
ScreenOff ScreenOn pressButton !powerLow
ScreenOn ScreenOff pressButton
ScreenOff ScreenCharging plugPower
ScreenOn ScreenCharging plugPower
ScreenCharging ScreenOff unplugPower
Comparison
Nested switch keeps all the logic in one spot, but the code can be hard to read when there are a lot of states and transitions. It's possibly more secure and easier to validate than the other approaches (no polymorphism or interpreting).
The State pattern implementation potentially spreads the logic over several separate classes, which may make understanding it as a whole a problem. On the other hand, the small classes are easy to understand separately. The design is particularly fragile if you change the behavior by adding or removing transitions, as they're methods in the hierarchy and there could be lots of changes to the code. If you live by the design principle of small interfaces, you'll see this pattern doesn't really do so well. However, if the state machine is stable, then such changes won't be needed.
The state tables approach requires writing some kind of interpreter for the content (this might be easier if you have reflection in the language you're using), which could be a lot of work to do up front. As Fowler points out, if your table is separate from your code, you could modify the behavior of your software without recompiling. This has some security implications, however; the software is behaving based on the contents of an external file.
Edit (not really for C language)
There is a fluent interface (aka internal Domain Specific Language) approach, too, which is probably facilitated by languages that have first-class functions. The Stateless library exists and that blog shows a simple example with code. A Java implementation (pre Java8) is discussed. I was shown a Python example on GitHub as well.

I also have used the table approach. However, there is overhead. Why store a second list of pointers? A function in C without the () is a const pointer. So you can do:
struct state;
typedef void (*state_func_t)( struct state* );
typedef struct state
{
state_func_t function;
// other stateful data
} state_t;
void do_state_initial( state_t* );
void do_state_foo( state_t* );
void do_state_bar( state_t* );
void run_state( state_t* i ) {
i->function(i);
};
int main( void ) {
state_t state = { do_state_initial };
while ( 1 ) {
run_state( state );
// do other program logic, run other state machines, etc
}
}
Of course depending on your fear factor (i.e. safety vs speed) you may want to check for valid pointers. For state machines larger than three or so states, the approach above should be less instructions than an equivalent switch or table approach. You could even macro-ize as:
#define RUN_STATE(state_ptr_) ((state_ptr_)->function(state_ptr_))
Also, I feel from the OP's example, that there is a simplification that should be done when thinking about / designing a state machine. I don't thing the transitioning state should be used for logic. Each state function should be able to perform its given role without explicit knowledge of past state(s). Basically you design for how to transition from the state you are in to another state.
Finally, don't start the design of a state machine based on "functional" boundaries, use sub-functions for that. Instead divide the states based on when you will have to wait for something to happen before you can continue. This will help minimize the number of times you have to run the state machine before you get a result. This can be important when writing I/O functions, or interrupt handlers.
Also, a few pros and cons of the classic switch statement:
Pros:
it is in the language, so it is documented and clear
states are defined where they are called
can execute multiple states in one function call
code common to all states can be executed before and after the switch statement
Cons:
can execute multiple states in one function call
code common to all states can be executed before and after the switch statement
switch implementation can be slow
Note the two attributes that are both pro and con. I think the switch allows the opportunity for too much sharing between states, and the interdependency between states can become unmanageable. However for a small number of states, it may be the most readable and maintainable.

there is also the logic grid which is more maintainable as the state machine gets bigger

For a simple state machine just use a switch statement and an enum type for your state. Do your transitions inside the switch statement based on your input. In a real program you would obviously change the "if(input)" to check for your transition points. Hope this helps.
typedef enum
{
STATE_1 = 0,
STATE_2,
STATE_3
} my_state_t;
my_state_t state = STATE_1;
void foo(char input)
{
...
switch(state)
{
case STATE_1:
if(input)
state = STATE_2;
break;
case STATE_2:
if(input)
state = STATE_3;
else
state = STATE_1;
break;
case STATE_3:
...
break;
}
...
}

switch() is a powerful and standard way of implementing state machines in C, but it can decrease maintainability down if you have a large number of states. Another common method is to use function pointers to store the next state. This simple example implements a set/reset flip-flop:
/* Implement each state as a function with the same prototype */
void state_one(int set, int reset);
void state_two(int set, int reset);
/* Store a pointer to the next state */
void (*next_state)(int set, int reset) = state_one;
/* Users should call next_state(set, reset). This could
also be wrapped by a real function that validated input
and dealt with output rather than calling the function
pointer directly. */
/* State one transitions to state one if set is true */
void state_one(int set, int reset) {
if(set)
next_state = state_two;
}
/* State two transitions to state one if reset is true */
void state_two(int set, int reset) {
if(reset)
next_state = state_one;
}

I found a really slick C implementation of Moore FSM on the edx.org course Embedded Systems - Shape the World UTAustinX - UT.6.02x, chapter 10, by Jonathan Valvano and Ramesh Yerraballi....
struct State {
unsigned long Out; // 6-bit pattern to output
unsigned long Time; // delay in 10ms units
unsigned long Next[4]; // next state for inputs 0,1,2,3
};
typedef const struct State STyp;
//this example has 4 states, defining constants/symbols using #define
#define goN 0
#define waitN 1
#define goE 2
#define waitE 3
//this is the full FSM logic coded into one large array of output values, delays,
//and next states (indexed by values of the inputs)
STyp FSM[4]={
{0x21,3000,{goN,waitN,goN,waitN}},
{0x22, 500,{goE,goE,goE,goE}},
{0x0C,3000,{goE,goE,waitE,waitE}},
{0x14, 500,{goN,goN,goN,goN}}};
unsigned long currentState; // index to the current state
//super simple controller follows
int main(void){ volatile unsigned long delay;
//embedded micro-controller configuration omitteed [...]
currentState = goN;
while(1){
LIGHTS = FSM[currentState].Out; // set outputs lines (from FSM table)
SysTick_Wait10ms(FSM[currentState].Time);
currentState = FSM[currentState].Next[INPUT_SENSORS];
}
}

For simple cases, you can use your switch style method. What I have found that works well in the past is to deal with transitions too:
static int current_state; // should always hold current state -- and probably be an enum or something
void state_leave(int new_state) {
// do processing on what it means to enter the new state
// which might be dependent on the current state
}
void state_enter(int new_state) {
// do processing on what is means to leave the current state
// might be dependent on the new state
current_state = new_state;
}
void state_process() {
// switch statement to handle current state
}
I don't know anything about the boost library, but this type of approach is dead simple, doesn't require any external dependencies, and is easy to implement.

You might want to look into the libero FSM generator software. From a state description language and/or a (windows) state diagram editor you may generate code for C, C++, java and many others ... plus nice documentation and diagrams.
Source and binaries from iMatix

This article is a good one for the state pattern (though it is C++, not specifically C).
If you can put your hands on the book "Head First Design Patterns", the explanation and example are very clear.

One of my favourite patterns is the state design pattern. Respond or behave differently to the same given set of inputs.
One of the problems with using switch/case statements for state machines is that as you create more states, the switch/cases becomes harder/unwieldy to read/maintain, promotes unorganized spaghetti code, and increasingly difficult to change without breaking something. I find using design patterns helps me to organize my data better, which is the whole point of abstraction.
Instead of designing your state code around what state you came from, instead structure your code so that it records the state when you enter a new state. That way, you effectively get a record of your previous state. I like #JoshPetit's answer, and have taken his solution one step further, taken straight from the GoF book:
stateCtxt.h:
#define STATE (void *)
typedef enum fsmSignal
{
eEnter =0,
eNormal,
eExit
}FsmSignalT;
typedef struct fsm
{
FsmSignalT signal;
// StateT is an enum that you can define any which way you want
StateT currentState;
}FsmT;
extern int STATECTXT_Init(void);
/* optionally allow client context to set the target state */
extern STATECTXT_Set(StateT stateID);
extern void STATECTXT_Handle(void *pvEvent);
stateCtxt.c:
#include "stateCtxt.h"
#include "statehandlers.h"
typedef STATE (*pfnStateT)(FsmSignalT signal, void *pvEvent);
static FsmT fsm;
static pfnStateT UsbState ;
int STATECTXT_Init(void)
{
UsbState = State1;
fsm.signal = eEnter;
// use an enum for better maintainability
fsm.currentState = '1';
(*UsbState)( &fsm, pvEvent);
return 0;
}
static void ChangeState( FsmT *pFsm, pfnStateT targetState )
{
// Check to see if the state has changed
if (targetState != NULL)
{
// Call current state's exit event
pFsm->signal = eExit;
STATE dummyState = (*UsbState)( pFsm, pvEvent);
// Update the State Machine structure
UsbState = targetState ;
// Call the new state's enter event
pFsm->signal = eEnter;
dummyState = (*UsbState)( pFsm, pvEvent);
}
}
void STATECTXT_Handle(void *pvEvent)
{
pfnStateT newState;
if (UsbState != NULL)
{
fsm.signal = eNormal;
newState = (*UsbState)( &fsm, pvEvent );
ChangeState( &fsm, newState );
}
}
void STATECTXT_Set(StateT stateID)
{
prevState = UsbState;
switch (stateID)
{
case '1':
ChangeState( State1 );
break;
case '2':
ChangeState( State2);
break;
case '3':
ChangeState( State3);
break;
}
}
statehandlers.h:
/* define state handlers */
extern STATE State1(void);
extern STATE State2(void);
extern STATE State3(void);
statehandlers.c:
#include "stateCtxt.h:"
/* Define behaviour to given set of inputs */
STATE State1(FsmT *fsm, void *pvEvent)
{
STATE nextState;
/* do some state specific behaviours
* here
*/
/* fsm->currentState currently contains the previous state
* just before it gets updated, so you can implement behaviours
* which depend on previous state here
*/
fsm->currentState = '1';
/* Now, specify the next state
* to transition to, or return null if you're still waiting for
* more stuff to process.
*/
switch (fsm->signal)
{
case eEnter:
nextState = State2;
break;
case eNormal:
nextState = null;
break;
case eExit:
nextState = State2;
break;
}
return nextState;
}
STATE State3(FsmT *fsm, void *pvEvent)
{
/* do some state specific behaviours
* here
*/
fsm->currentState = '2';
/* Now, specify the next state
* to transition to
*/
return State1;
}
STATE State2(FsmT *fsm, void *pvEvent)
{
/* do some state specific behaviours
* here
*/
fsm->currentState = '3';
/* Now, specify the next state
* to transition to
*/
return State3;
}
For most State Machines, esp. Finite state machines, each state will know what its next state should be, and the criteria for transitioning to its next state. For loose state designs, this may not be the case, hence the option to expose the API for transitioning states. If you desire more abstraction, each state handler can be separated out into its own file, which are equivalent to the concrete state handlers in the GoF book. If your design is simple with only a few states, then both stateCtxt.c and statehandlers.c can be combined into a single file for simplicity.

In my experience using the 'switch' statement is the standard way to handle multiple possible states. Although I am surpirsed that you are passing in a transition value to the per-state processing. I thought the whole point of a state machine was that each state performed a single action. Then the next action/input determines which new state to transition into. So I would have expected each state processing function to immediately perform whatever is fixed for entering state and then afterwards decide if transition is needed to another state.

There is a book titled Practical Statecharts in C/C++.
However, it is way too heavyweight for what we need.

For compiler which support __COUNTER__ , you can use them for simple (but large) state mashines.
#define START 0
#define END 1000
int run = 1;
state = START;
while(run)
{
switch (state)
{
case __COUNTER__:
//do something
state++;
break;
case __COUNTER__:
//do something
if (input)
state = END;
else
state++;
break;
.
.
.
case __COUNTER__:
//do something
if (input)
state = START;
else
state++;
break;
case __COUNTER__:
//do something
state++;
break;
case END:
//do something
run = 0;
state = START;
break;
default:
state++;
break;
}
}
The advantage of using __COUNTER__ instead of hard coded numbers is that you
can add states in the middle of other states, without renumbering everytime everything.
If the compiler doesnt support __COUNTER__, in a limited way its posible to use with precaution __LINE__

You can use minimalist UML state machine framework in c. https://github.com/kiishor/UML-State-Machine-in-C
It supports both finite and hierarchical state machine. It has only 3 API's, 2 structures and 1 enumeration.
The State machine is represented by state_machine_t structure. It is an abstract structure that can be inherited to create a state machine.
//! Abstract state machine structure
struct state_machine_t
{
uint32_t Event; //!< Pending Event for state machine
const state_t* State; //!< State of state machine.
};
State is represented by pointer to state_t structure in the framework.
If framework is configured for finite state machine then state_t contains,
typedef struct finite_state_t state_t;
// finite state structure
typedef struct finite_state_t{
state_handler Handler; //!< State handler function (function pointer)
state_handler Entry; //!< Entry action for state (function pointer)
state_handler Exit; //!< Exit action for state (function pointer)
}finite_state_t;
The framework provides an API dispatch_event to dispatch the event to the state machine and two API's for the state traversal.
state_machine_result_t dispatch_event(state_machine_t* const pState_Machine[], uint32_t quantity);
state_machine_result_t switch_state(state_machine_t* const, const state_t*);
state_machine_result_t traverse_state(state_machine_t* const, const state_t*);
For further details on how to implement hierarchical state machine refer the GitHub repository.
code examples
https://github.com/kiishor/UML-State-Machine-in-C/blob/master/demo/simple_state_machine/readme.md
https://github.com/kiishor/UML-State-Machine-in-C/blob/master/demo/simple_state_machine_enhanced/readme.md

In C++, consider the State pattern.

Your question is similar to "is there a typical Data Base implementation pattern"?
The answer depends upon what do you want to achieve? If you want to implement a larger deterministic state machine you may use a model and a state machine generator.
Examples can be viewed at www.StateSoft.org - SM Gallery. Janusz Dobrowolski

I would also prefer a table driven approach. I have used switch statements in the past. The main problem I have encountered is debugging transitions and ensuring that the designed state machine has been implemented properly. This occurred in cases where there was a large number of states and events.
With the table driven approach are the states and transitions are summarized in one place.
Below is a demo of this approach.
/*Demo implementations of State Machines
*
* This demo leverages a table driven approach and function pointers
*
* Example state machine to be implemented
*
* +-----+ Event1 +-----+ Event2 +-----+
* O---->| A +------------------->| B +------------------->| C |
* +-----+ +-----+ +-----+
* ^ |
* | Event3 |
* +-----------------------------------------------------+
*
* States: A, B, C
* Events: NoEvent (not shown, holding current state), Event1, Event2, Event3
*
* Partly leveraged the example here: http://web.archive.org/web/20160808120758/http://www.gedan.net/2009/03/18/finite-state-machine-matrix-style-c-implementation-function-pointers-addon/
*
* This sample code can be compiled and run using GCC.
* >> gcc -o demo_state_machine demo_state_machine.c
* >> ./demo_state_machine
*/
#include <stdio.h>
#include <assert.h>
// Definitions of state id's, event id's, and function pointer
#define N_STATES 3
#define N_EVENTS 4
typedef enum {
STATE_A,
STATE_B,
STATE_C,
} StateId;
typedef enum {
NOEVENT,
EVENT1,
EVENT2,
EVENT3,
} Event;
typedef void (*StateRoutine)();
// Assert on number of states and events defined
static_assert(STATE_C==N_STATES-1,
"Number of states does not match defined number of states");
static_assert(EVENT3==N_EVENTS-1,
"Number of events does not match defined number of events");
// Defining State, holds both state id and state routine
typedef struct {
StateId id;
StateRoutine routine;
} State;
// General functions
void evaluate_state(Event e);
// State routines to be executed at each state
void state_routine_a(void);
void state_routine_b(void);
void state_routine_c(void);
// Defining each state with associated state routine
const State state_a = {STATE_A, state_routine_a};
const State state_b = {STATE_B, state_routine_b};
const State state_c = {STATE_C, state_routine_c};
// Defning state transition matrix as visualized in the header (events not
// defined, result in mainting the same state)
State state_transition_mat[N_STATES][N_EVENTS] = {
{ state_a, state_b, state_a, state_a},
{ state_b, state_b, state_c, state_b},
{ state_c, state_c, state_c, state_a}};
// Define current state and initialize
State current_state = state_a;
int main()
{
while(1) {
// Event to receive from user
int ev;
printf("----------------\n");
printf("Current state: %c\n", current_state.id + 65);
printf("Event to occur: ");
// Receive event from user
scanf("%u", &ev);
evaluate_state((Event) ev); // typecast to event enumeration type
printf("-----------------\n");
};
return (0);
}
/*
* Determine state based on event and perform state routine
*/
void evaluate_state(Event ev)
{
//Determine state based on event
current_state = state_transition_mat[current_state.id][ev];
printf("Transitioned to state: %c\n", current_state.id + 65);
// Run state routine
(*current_state.routine)();
}
/*
* State routines
*/
void state_routine_a() {
printf("State A routine ran. \n");
}
void state_routine_b() {
printf("State B routine ran. \n");
}
void state_routine_c() {
printf("State C routine ran. \n");
}

Boost has the statechart library. http://www.boost.org/doc/libs/1_36_0/libs/statechart/doc/index.html
I can't speak to the use of it, though. Not used it myself (yet)

Related

How to implement a fsm

I want to parse output from a commandline tool using the fsm programming model. What is the simplest implementation of a fsm that is possible for this task?
Basically, the core idea of a finite state machine is that the machine is in a "state" and, for every state, the behaviour of the machine is different from other states.
A simple way to do this is to have an integer variable (or an enum) which stores the status, and a switch() statement which implements, for every case, the required logic.
Suppose you have a file of the followin kind:
something
begin
something
something2
end
something
and you duty is to print the part between begin/end. You read the file line by line, and switch state basing on the content of the line:
// pseudo-C code
enum state {nothing, inblock};
enum state status;
string line;
status = nothing;
while (!eof(file)) {
readline(line);
switch(status) {
case nothing:
if (line == "begin") status=inblock;
break;
case inblock:
if (line == "end")
status=nothing;
else print(line);
break;
}
}
In this example, only the core idea is shown: a "status" of the machine and a mean to change status (the "line" read from file). In real life examples probably there are more variables to keep more informations for every state and, perhaps, the "status" of the machine can be stored in a function pointer, to avoid the burden and rigidity of the switch() statement but, even so, the programming paradigm is clean and powerful.
The fsm model works in C by assigning function pointers to certain functions that have to process certain data. One good use for fsms is for parsing commandline arguments, for parsing captured output.... The function pointer is assigned to a preset starting function. The start function assigns the function pointer, which must be passed along, to the appropriate next function. And that decides the next function and so on.
Here is a very simple implementation of a fsm:
struct _fsm
{
void (*ptr_to_fsm)(struct _fsm fsm);
char *data;
}
struct _fsm fsm;
fsm->ptr_to_fsm = start; // There is a function called start.
while (fsm->ptr_to_fsm != NULL)
{
fsm->ptr_to_fsm(&fsm);
}
void start (struct _fsm fsm)
{
if (fsm->data == NULL)
{
fsm->ptr_to_fsm = stop; // There is a function called stop.
}
/* Check more more conditions, and branch out on other functions based on the results. */
return;
}
void stop (struct _fsm fsm)
{
fsm->ptr_to_fsm = NULL; /* The while loop will terminate. */
/* And you're done (unless you have to do free`ing. */
}

How do you avoid using global variables in inherently stateful programs?

I am currently writing a small game in C and feel like I can't get away from global variables.
For example I am storing the player position as a global variable because it's needed in other files. I have set myself some rules to keep the code clean.
Only use a global variable in the file it's defined in, if possible
Never directly change the value of a global from another file (reading from another file using extern is okay)
So for example graphics settings would be stored as file scope variables in graphics.c. If code in other files wants to change the graphics settings they would have to do so through a function in graphics.c like graphics_setFOV(float fov).
Do you think those rules are sufficient for avoiding global variable hell in the long term?
How bad are file scope variables?
Is it okay to read variables from other files using extern?
Typically, this kind of problem is handled by passing around a shared context:
graphics_api.h
#ifndef GRAPHICS_API
#define GRAPHICS_API
typedef void *HANDLE;
HANDLE init_graphics(void);
void destroy_graphics(HANDLE handle);
void use_graphics(HANDLE handle);
#endif
graphics.c
#include <stdio.h>
#include <stdlib.h>
#include "graphics_api.h"
typedef struct {
int width;
int height;
} CONTEXT;
HANDLE init_graphics(void) {
CONTEXT *result = malloc(sizeof(CONTEXT));
if (result) {
result->width = 640;
result->height = 480;
}
return (HANDLE) result;
}
void destroy_graphics(HANDLE handle) {
CONTEXT *context = (CONTEXT *) handle;
if (context) {
free(context);
}
}
void use_graphics(HANDLE handle) {
CONTEXT *context = (CONTEXT *) handle;
if (context) {
printf("width = %5d\n", context->width);
printf("height = %5d\n", context->height);
}
}
main.c
#include <stdio.h>
#include "graphics_api.h"
int main(void) {
HANDLE handle = init_graphics();
if (handle) {
use_graphics(handle);
destroy_graphics(handle);
}
return 0;
}
Output
width = 640
height = 480
Hiding the details of the context by using a void pointer prevents the user from changing the data contained within the memory to which it points.
How do you avoid using global variables in inherently stateful programs?
By passing arguments...
// state.h
/// state object:
struct state {
int some_value;
};
/// Initializes state
/// #return zero on success
int state_init(struct state *s);
/// Destroys state
/// #return zero on success
int state_fini(struct state *s);
/// Does some operation with state
/// #return zero on success
int state_set_value(struct state *s, int new_value);
/// Retrieves some operation from state
/// #return zero on success
int state_get_value(struct state *s, int *value);
// state.c
#include "state.h"
int state_init(struct state *s) {
s->some_value = -1;
return 0;
}
int state_fini(struct state *s) {
// add free() etc. if needed here
// call fini of other objects here
return 0;
}
int state_set_value(struct state *s, int value) {
if (value < 0) {
return -1; // ERROR - invalid argument
// you may return EINVAL here
}
s->some_value = value;
return 0; // success
}
int state_get_value(struct state *s, int *value) {
if (s->some_value < 0) { // value not set yet
return -1;
}
*value = s->some_value;
return 0;
}
// main.c
#include "state.h"
#include <stdlib.h>
#include <stdio.h>
int main() {
struct state state; // local variable
int err = state_init(&state);
if (err) abort();
int value;
err = state_get_value(&state, &value);
if (err != 0) {
printf("Getting value errored: %d\n", err);
}
err = state_set_value(&state, 50);
if (err) abort();
err = state_get_value(&state, &value);
if (err) abort();
printf("Current value is: %d\n", value);
err = state_fini(&state);
if (err) abort();
}
The only single case where global variables (preferably only a single pointer to some stack variable anyway) have to be used are signal handlers. The standard way would be to only increment a single global variable of type sig_atomic_t inside a signal handler and do nothing else - then execute all signal handling related logic from the normal flow in the rest of the code by checking the value of that variable. (On POSIX system) all other asynchronous communication from the kernel, like timer_create, that take sigevent structure, they can pass arguments to notified function by using members in union sigval.
Do you think those rules are sufficient for avoiding global variable hell in the long term?
Subjectively: no. I believe that a potentially uneducated programmer has too much freedom in creating global variables given the first rule. In complex programs I would use a hard rule: Do not use global variables. If finally after researching all other ways and all other possibilities have been exhausted and you have to use a global variables, make sure global variables leave the smallest possible memory footprint.
In simple short programs I wouldn't care much.
How bad are file scope variables?
This is opinion based - there are good cases where projects use many global variables. I believe that topic is exhausted in are global variables bad and numerous other internet resources.
Is it okay to read variables from other files using extern?
Yes, it's ok.
There are no "hard rules" and each project has it's own rules. I also recommend to read c2 wiki global variables are bad.
The first thing you have to ask yourself is: Just why did the programming world come to loath global variables? Obviously, as you noted, the way to model a global state is essentially a global (set of) variable(s). So what's the problem with that?
The Problem
All parts of the program have access to that state. The whole program becomes tightly coupled. Global variables violate the prime directive in programming, divide and conquer. Once all functions operate on the same data you can as well do away with the functions: They are no longer logical separations of concern but degrade to a notational convenience to avoid large files.
Write access is worse than read access: You'll have a hard time finding out just why on earth the state is unexpected at a certain point; the change can have happened anywhere. It is tempting to take shortcuts: "Ah, we can make the state change right here instead of passing a computation result back up three layers to the caller; that makes the code much smaller."
Even read access can be used to cheat and e.g. change behavior of some deep-down code depending on some global information: "Ah, we can skip rendering, there is no display yet!" A decision which should not be made in the rendering code but at top level. What if top level renders to a file!?
This creates both a debugging and a development/maintenance nightmare. If every piece of the code potentially relies on the presence and semantics of certain variables — and can change them! — it becomes exponentially harder to debug or change the program. The code agglomerating around the global data is like a cast, or perhaps a Boa Constrictor, which starts to immobilize and strangle your program.
Such programming can be avoided with (self-)discipline, but imagine a large project with many teams! It's much better to "physically" prevent access. Not coincidentally all programming languages after C, even if they are otherwise fundamentally different, come with improved modularization.
So what can we do?
The solution is indeed to pass parameters to functions, as KamilCuk said; but each function should only get the information they legitimately need. Of course it is best if the access is read-only and the result is a return value: Pure functions cannot change state at all and thus perfectly separate concerns.
But simply passing a pointer to the global state around does not cut the mustard: That's only a thinly veiled global variable.
Instead, the state should be separated into sub-states. Only top-level functions (which typically do not do much themselves but mostly delegate) have access to the overall state and hand sub-states to the functions they call. Third-tier functions get sub-sub states, etc. The corresponding implementation in C is a nested struct; pointers to the members — const whenever possible — are passed to functions which therefore cannot see, let alone alter, the rest of the global state. Separation of concerns is thus guaranteed.

State machine implementation in C

I have been trying to implement a state machine in C language and I am not
sure whether my implementation is appropriate. I have tried to implement the state diagram in pseudo object manner. So I have declared the public interface of StateMachine "class" in StateMachine.h header file:
typedef enum{
STATE_01,
STATE_02,
STATE_03,
STATE_04
}state_e;
typedef enum{
NO_EVENT,
EVENT_01,
EVENT_02,
EVENT_03,
EVENT_04,
EVENT_05,
EVENT_06,
EVENT_07
}event_e;
// state machine instance
typedef struct{
state_e currState;
}StateMachine;
extern void InitState(StateMachine*);
extern void ProcSTATE_01(StateMachine*, event_e);
extern void ProcSTATE_02(StateMachine*, event_e);
extern void ProcSTATE_03(StateMachine*, event_e);
extern void ProcSTATE_04(StateMachine*, event_e);
extern void ProcEvent(StateMachine*, event_e);
The implementation of the StateMachine object "methods" is in the StateMachine.c module:
void InitState(StateMachine *sm)
{
sm->currState = STATE_01;
}
void ProcSTATE_01(StateMachine *sm, event_e event)
{
if(event == EVENT_01){
sm->currState = STATE_02;
}
}
void ProcSTATE_02(StateMachine *sm, event_e event)
{
// ...
}
void ProcSTATE_03(StateMachine *sm, event_e event)
{
// ...
}
void ProcSTATE_04(StateMachine *sm, event_e event)
{
// ...
}
void ProcEvent(StateMachine *sm, event_e event)
{
switch(sm->currState){
case STATE_01:
ProcSTATE_01(sm, event);
break;
case STATE_02:
ProcSTATE_02(sm, event);
break;
case STATE_03:
ProcSTATE_03(sm, event);
break;
case STATE_04:
ProcSTATE_04(sm, event);
break;
}
}
The usage of the StateMachine object is following:
int main(int argc, char** argv) {
StateMachine app;
ProcEvent(&app, EVENT_01);
return 0;
}
I would like to ask anybody more experienced whether he can see any potential problems which are connected with this implementation. Thanks for any comments.
You'll need to think about where your events come from. Do they only get generated from the state transition handlers themselves, or are they exogenous. If they are exogenous, then you'll likely want an event loop external to your handler which pulls off events from a queue and injects them in to the appropriate handler (which might enqueue another event to the queue for later handling). Also, you might want to store your state-event handler functions in an array (indexed on the state number) so you can programmatically call them (instead of having to have a giant switch statement).
There is no "correct" way to implement a state machine (like the state table aproach mentioned by Doug), but there is code that can give you direct insights about what is easy to use and what is not.
There is a book by Miro Samek about using hirarchical state machines in c/c++. There he compares the different styles with the state machine style he has developed. And i must say, his style is the one that is most reasonable and readable to me.
Part of the book is also available online at state-machine.com.
Your state machine skeleton in fact is very similar to the state machine part of Miros aproach, his aproach also adds a hirarchical transition possibility, queues per state machine, a system tick mechanism, event recycling, a zero event copying policy and so on. If you look at his web-site you will see that your aproach is a good one that can be extended in many, many ways.

Using GOTO for a FSM in C

I am creating a finite state machine in C.
I learned FSM from the hardware point of view (HDL language). So I'm used a switch with one case per state.
I also like to apply the Separation of Concerns concept when programing.
I mean I'd like to get this flow:
Calculate the next state depending on the current state and input flags
Validate this next state (if the user request a transition that is not allowed)
Process the next state when it is allowed
As a start I implemented 3 functions:
static e_InternalFsmStates fsm_GetNextState();
static bool_t fsm_NextStateIsAllowed(e_InternalFsmStates nextState);
static void fsm_ExecuteNewState(e_InternalFsmStates);
At the moment they all contain a big switch-case which is the same:
switch (FSM_currentState) {
case FSM_State1:
[...]
break;
case FSM_State2:
[...]
break;
default:
[...]
break;
}
Now that it works, I'd like to improve the code.
I know that in the 3 functions I'll execute the same branch of the switch.
So I am thinking to use gotos in this way:
//
// Compute next state
//
switch (FSM_currentState) {
case FSM_State1:
next_state = THE_NEXT_STATE
goto VALIDATE_FSM_State1_NEXT_STATE;
case FSM_State2:
next_state = THE_NEXT_STATE
goto VALIDATE_FSM_State2_NEXT_STATE;
[...]
default:
[...]
goto ERROR;
}
//
// Validate next state
//
VALIDATE_FSM_State1_NEXT_STATE:
// Some code to Set stateIsValid to TRUE/FALSE;
if (stateIsValid == TRUE)
goto EXECUTE_STATE1;
else
goto ERROR;
VALIDATE_FSM_State2_NEXT_STATE:
// Some code to Set stateIsValid to TRUE/FALSE;
if (stateIsValid == TRUE)
goto EXECUTE_STATE2;
else
goto ERROR;
//
// Execute next state
//
EXECUTE_STATE1:
// Do what I need for state1
goto END;
EXECUTE_STATE2:
// Do what I need for state2
goto END;
//
// Error
//
ERROR:
// Error handling
goto END;
END:
return; // End of function
Of course, I could do the 3 parts (calculate, validate and process the next state) in a single switch case. But for code readability and code reviews, I feel like it will be easier to separate them.
Finally my question, is it dangerous to use GOTOs in this way?
Would you have any advice when using FSM like that?
Thank you for your comments!
After reading the answers and comments below, here is what I am going to try:
e_FSM_InternalStates nextState = FSM_currentState;
bool_t isValidNextState;
//
// Compute and validate next state
//
switch (FSM_currentState) {
case FSM_State1:
if (FSM_inputFlags.flag1 == TRUE)
{
nextState = FSM_State2;
}
[...]
isValidNextState = fsm_validateState1Transition(nextState);
case FSM_State2:
if (FSM_inputFlags.flag2 == TRUE)
{
nextState = FSM_State3;
}
[...]
isValidNextState = fsm_validateState2Transition(nextState);
}
//
// If nextState is invalid go to Error
//
if (isValidNextState == FALSE) {
nextState = FSM_StateError;
}
//
// Execute next state
//
switch (nextState) {
case FSM_State1:
// Execute State1
[...]
case FSM_State2:
// Execute State1
[...]
case FSM_StateError:
// Execute Error
[...]
}
FSM_currentState = nextState;
While goto has its benefits in C, it should be used sparesly and with extreme caution. What you intend is no recommendable use-case.
Your code will be less maintainable and more confusing. switch/case is actually some kind of "calculated" goto (thats's why there are case labels).
You are basicaly thinking the wrong way. For a state-machine, you should first verify input, then calculate the next state, then the output. There are various ways to do so, but it is often a good idea to use two switches and - possibly - a single error-handling label or a error-flag:
bool error_flag = false;
while ( run_fsm ) {
switch ( current_state ) {
case STATE1:
if ( input1 == 1 )
next_state = STATE2;
...
else
goto error_handling; // use goto
error_flag = true; // or the error-flag (often better)
break;
...
}
if ( error_flag )
break;
switch ( next_state ) {
case STATE1:
output3 = 2;
// if outputs depend on inputs, similar to the upper `switch`
break;
...
}
current_state = next_state;
}
error_handling:
...
This way you are transitioning and verifying input at once. This makes senase, as you have to evaluate the inputs anyway to set the next state, so invalid input just falls of the tests naturally.
An alternative is to have an output_state and state variable instead of next_state and current_state. In the first switch you set output_state and state, the second is switch ( output_state ) ....
If the single cases become too long, you should use functions to determine the next_state and/or output_state/outputs. It depends very much on the FSM (number of inputs, outputs, states, complexity (e.g. one-hot vs. "encoded" - if you are family with HDL, you will know).
If you need more complex error-handling (e.g. recover) inside the loop, leave the loop as-is and add an outer loop, possibly change the error-flag to an error-code and add another switch for it in the outer loop. Depending on complexity, pack the inner loop into its own function, etc.
Sidenote: The compiler might very well optimize the structured approach (without goto) to the same/similar code as with the goto
Whether it is "dangerous" is probably somewhat a matter of opinion. The usual reason people say to avoid GOTO is that it tends to lead to spaghetti code that's hard to follow. Is that an absolute rule? Probably not, but I think it's definitely fair to say that it is the trend. Secondarily, most programmers at this point are trained to believe that GOTO is bad, so, even if it's not in some case, you may run into some level of maintainability issue with other people coming into the project later.
How much risk you have in your case, probably depends on how big of a chunk of code you're going to have under those state labels and how sure you are that it won't change much. More code (or potential for large revisions), means more risk. In addition to just straight questions of readability, you'll have increased chances for assignments to variables interfering between cases or being dependent on the path you took to get to a certain state. Using functions helps with this (in many cases) by creating local scope for variables.
All in all, I would recommend avoiding the GOTO.
You don't really need to use switch-case, it will actually get optimized away by the compiler into machine code with a function pointer jump table. Switch-cases for state machines tend to be somewhat hard to read, especially the more complex ones.
The spaghetti-gotos are unacceptable and bad programming practice: there are a few valid uses of goto, this is not one of them.
Instead, consider to have a one-line state machine which looks like:
state = STATE_MACHINE[state]();
Here is an answer of mine (taken from the electrical engineering site, it pretty much applies universally) which is based on a function-pointer lookup table.
typedef enum
{
STATE_S1,
STATE_S2,
...
STATE_N // the number of states in this state machine
} state_t;
typedef state_t (*state_func_t)(void);
state_t do_state_s1 (void);
state_t do_state_s2 (void);
static const state_func_t STATE_MACHINE [STATE_N] =
{
&do_state_s1,
&do_state_s2,
...
};
void main()
{
state_t state = STATE_S1;
while (1)
{
state = STATE_MACHINE[state]();
}
}
state_t do_state_s1 (void)
{
state_t result = STATE_S1;
// stuff
if (...)
result = STATE_S2;
return result;
}
state_t do_state_s2 (void)
{
state_t result = STATE_S2;
// other stuff
if (...)
result = STATE_S1;
return result;
}
You can easily modify the function signatures to contain an error code as well, such as for example:
typedef err_t (*state_func_t)(state_t*);
with functions as
err_t do_state_s1 (state_t* state);
in which case the caller would end up as:
error = STATE_MACHINE[state](&state);
if(error != NO_ERROR)
{
// handle errors here
}
Leave all error handling to the caller as show in the above example.
My rule of thumb is to use GOTOs only to jump forward in the code, but never backwards. In the end this boils down to using GOTO only for exception handling, which otherwise doesn't exist in C.
In your particular case I would absolutely not recommend the use of GOTO.

C programming lookup table

After some help as have some code which works but trying to fully understand a small section which uses a lookup table as part of a state machine. The state machine I understand and have been using for a tutorial I am writing found here http://coder-tronics.com/state-machine-tutorial-pt1
Instead of using more switch case statements inside the UponExit, UponEnter and ActionWhileInState, someone suggested using this method which I like but don't fully understand.
Now the part I am unsure about is these lines to do with the typedef and lookup table
typedef void (* const voidFunc)(void);
voidFunc UponEnter[S_MAX] = {hEnter_OFF, hEnter_ON, hEnter_PROCESS};
voidFunc ActionWhileInState[S_MAX] = {hInState_OFF, hInState_ON, hInState_PROCESS};
voidFunc UponExit[S_MAX] = {hExit_OFF, hExit_ON, hExit_PROCESS};
I looked up typedef and lookup tables and have a basic understanding, but was hoping someone could give a brief walk through of how this works in this case?
The full part of code this relates to is below for completeness.
enum states { S_OFF, S_ON, S_PROCESS, S_MAX };
enum events { E_OFF, E_ON, E_START, E_MAX};
typedef void (* const voidFunc)(void);
void hEnter_OFF(void); void hEnter_ON(void); void hEnter_PROCESS(void);
void hInState_OFF(void); void hInState_ON(void); void hInState_PROCESS(void);
void hExit_OFF(void); void hExit_ON(void); void hExit_PROCESS(void);
voidFunc UponEnter[S_MAX] = {hEnter_OFF, hEnter_ON, hEnter_PROCESS};
voidFunc ActionWhileInState[S_MAX] = {hInState_OFF, hInState_ON, hInState_PROCESS};
voidFunc UponExit[S_MAX] = {hExit_OFF, hExit_ON, hExit_PROCESS};
enum states StateMachine(enum events event, enum states Current_State)
{
int Next_State = Current_State;
switch ( Current_State )
{
case S_OFF:
switch (event )
{
// A transition to the next state will occur here
case E_ON:
Next_State = S_ON;
break;
default: // Default case placed here to avoid Eclipse warnings as Eclipse expects
break; //to handle all enumerated values
}
break;
case S_ON:
switch (event )
{
// A transition to the next state will occur here
case E_OFF:
Next_State = S_OFF;
break;
case E_START:
Next_State = S_PROCESS;
break;
default: // Default case placed here to avoid Eclipse warnings as Eclipse expects
break; //to handle all enumerated values
}
break;
case S_PROCESS:
switch (event )
{
// A transition to the next state will occur here
case E_OFF:
Next_State = S_OFF;
break;
default: // Default case placed here to avoid Eclipse warnings as Eclipse expects
break; //to handle all enumerated values
}
break;
// The program should never arrive here
default:
break;
}
if (Next_State != Current_State)
{
// Function call for Upon Exit function, it can be omitted but allows extra functionality
UponExit[Current_State]();
// Function call for Upon Enter function, it can be omitted but allows extra functionality
if (Next_State != S_MAX) UponEnter[Next_State]();
}
else // I think ActionWhileInState should only be called when NOT doing a transition!
{
if ( event != E_MAX) ActionWhileInState[Current_State]();
}
return Next_State;
}
Thanks,
Ant
typedef void (* const voidFunc)(void);
is a typedef for a function pointer (for a pointer to a function expecting no parameters and not returning anything to be exact). So your arrays
voidFunc UponEnter[S_MAX] = {hEnter_OFF, hEnter_ON, hEnter_PROCESS};
voidFunc ActionWhileInState[S_MAX] = {hInState_OFF, hInState_ON, hInState_PROCESS};
voidFunc UponExit[S_MAX] = {hExit_OFF, hExit_ON, hExit_PROCESS};
Each hold three different function pointers, one for each state. So a line like
UponExit[Current_State]();
calls the function the pointer UponExit[Current_State] points to which is hExit_OFF if Current_State is 0, hExit_ON if Current_State is 1 or hExit_PROCESS if Current_State is 2.
You could also write the line like that:
(*UponExit[Current_State])();
What looks more complicated but makes clear that UponExit[Current_State] is a function pointer that is "dereferenced" to a call to the function it points to.
Edit:
You have an enum for your states:
enum states { S_OFF, S_ON, S_PROCESS, S_MAX };
Thus, S_OFF == 0, S_ON == 1, S_PROCESS == 2 and S_MAX == 3 (look here)

Resources