So I have seen that many suggestions on implementing a state machine in C involve a state struct or the like, but I was wondering why we can't just use a while(1) for very simple state machines. For example,
enum { state1, state2 };
int currentstate = state1;
void state1function(void) {
    dosomething();
    if (user_chose_state2()) {   /* hypothetical check: user chooses to go to state 2 */
        currentstate = state2;
    }
}
int main(void) {
    while (1) {
        if (currentstate == state1) {
            state1function();
        }
        else if (currentstate == state2) {
            state2function();
        }
    }
}
Basically, I keep track of the state in a global variable, and in the while loop I call a function depending on the state. This seems simple to me and I don't really see why it wouldn't work.
Can anyone please tell me why something like this would not work/would not be recommended?
Thanks
Sooner or later, using this approach, you will find that it would be convenient to have:
An explicit transition table.
OnEntry(), OnExit(), Do(), OnEvent() functions for each state.
Actions performed on a transition.
Guards. (explicit conditions for transitions to be triggered)
Nested state machines.
Concurrent state machines. Meaning: Multiple FSM running next to each other.
Communicating concurrent, nested state machines.
Somewhere along this ladder of sophistication, you will most likely abandon the brute-force style you started with, which might have looked like the code in your question.
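To make the first few rungs of that ladder concrete, here is a minimal sketch of an explicit transition table with guards and transition actions. The door states, the events, and every name in it (door_t, door_dispatch and so on) are made up for illustration; it is one possible layout, not a canonical one.

```c
#include <stddef.h>

/* Hypothetical states and events for a door controller. */
typedef enum { ST_CLOSED, ST_OPEN, ST_LOCKED } state_t;
typedef enum { EV_OPEN, EV_CLOSE, EV_LOCK, EV_UNLOCK } event_t;

typedef struct {
    state_t state;
    int has_key;        /* consulted by the guard below */
    int action_count;   /* incremented by every transition action */
} door_t;

typedef int  (*guard_fn)(const door_t *);
typedef void (*action_fn)(door_t *);

static int  guard_has_key(const door_t *d) { return d->has_key; }
static void count_transition(door_t *d)    { d->action_count++; }

typedef struct {
    state_t   from;
    event_t   event;
    guard_fn  guard;    /* NULL means "always allowed" */
    action_fn action;   /* NULL means "no action" */
    state_t   to;
} transition_t;

/* The whole behaviour of the machine is readable from this table. */
static const transition_t transitions[] = {
    { ST_CLOSED, EV_OPEN,   NULL,          count_transition, ST_OPEN   },
    { ST_OPEN,   EV_CLOSE,  NULL,          count_transition, ST_CLOSED },
    { ST_CLOSED, EV_LOCK,   guard_has_key, count_transition, ST_LOCKED },
    { ST_LOCKED, EV_UNLOCK, guard_has_key, count_transition, ST_CLOSED },
};

/* Returns 1 if a transition fired, 0 if the event was ignored. */
int door_dispatch(door_t *d, event_t ev)
{
    size_t i;
    for (i = 0; i < sizeof transitions / sizeof transitions[0]; i++) {
        const transition_t *t = &transitions[i];
        if (t->from != d->state || t->event != ev)
            continue;
        if (t->guard != NULL && !t->guard(d))
            continue;               /* guard vetoed this transition */
        if (t->action != NULL)
            t->action(d);
        d->state = t->to;
        return 1;
    }
    return 0;
}
```

Adding a state or an event then means adding rows to the table rather than editing a chain of if/else branches.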
The while(1) construct is quite unrelated to state machines. It is used whenever a single thread of execution (main() or an OS thread) is long-running, as is typical on embedded systems or in server applications. Whether the application is written as a state machine or in some other form does not really matter.
Depending on the problem you're trying to solve, a global or a static is a straightforward solution. Using a struct comes in handy when you need to manage more than one state machine at a time, and/or when you need to change state from more than one thread or process.
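As a rough illustration of the struct variant, the sketch below keeps all state in a struct so that several machines can run side by side. The traffic-light states, the tick durations, and the names light_fsm_t, light_init and light_tick are invented for the example. For use from several threads you would presumably add a lock to the struct; that part is not shown.

```c
typedef enum { LIGHT_RED, LIGHT_GREEN, LIGHT_YELLOW } light_state_t;

/* All state lives in the struct, so each instance is independent. */
typedef struct {
    light_state_t state;
    unsigned ticks_in_state;
} light_fsm_t;

void light_init(light_fsm_t *fsm)
{
    fsm->state = LIGHT_RED;
    fsm->ticks_in_state = 0;
}

/* Advance one machine by one tick; the durations are hypothetical. */
void light_tick(light_fsm_t *fsm)
{
    fsm->ticks_in_state++;
    switch (fsm->state) {
    case LIGHT_RED:
        if (fsm->ticks_in_state >= 3) {
            fsm->state = LIGHT_GREEN;
            fsm->ticks_in_state = 0;
        }
        break;
    case LIGHT_GREEN:
        if (fsm->ticks_in_state >= 2) {
            fsm->state = LIGHT_YELLOW;
            fsm->ticks_in_state = 0;
        }
        break;
    case LIGHT_YELLOW:
        if (fsm->ticks_in_state >= 1) {
            fsm->state = LIGHT_RED;
            fsm->ticks_in_state = 0;
        }
        break;
    }
}
```

Two light_fsm_t variables can be ticked at different rates without interfering with each other, which a single global state variable cannot offer.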
A while loop that never waits will consume a lot of CPU. I think an event mechanism using a mutex or semaphores would be useful.
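A minimal sketch of the "wait instead of spin" idea follows. Note the substitution: it uses a POSIX pipe and poll() as the event source rather than a mutex or semaphore, simply because that is easy to show in a single thread. The function name run_event_loop and the one-byte event format are assumptions made for the example.

```c
#include <poll.h>
#include <unistd.h>

/* Blocks until a one-byte event arrives on fd. Returns the number of
 * events handled once the writing side has been closed, or -1 on error. */
int run_event_loop(int fd)
{
    int handled = 0;
    for (;;) {
        struct pollfd pfd;
        unsigned char ev;
        ssize_t n;

        pfd.fd = fd;
        pfd.events = POLLIN;
        pfd.revents = 0;
        if (poll(&pfd, 1, -1) < 0)    /* sleeps here; no CPU is burned */
            return -1;

        n = read(fd, &ev, 1);
        if (n == 0)                   /* writer closed: no more events */
            return handled;
        if (n < 0)
            return -1;
        /* A real program would feed ev to its state machine here. */
        handled++;
    }
}
```

The loop is still "infinite" in shape, but it only wakes up when there is something to do.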
C is not asynchronous (by default). An infinite loop will just cause your program to "freeze" until the results are done.
I am writing a simple ncurses program with a menu and different sections (create/view etc), all using the keyboard. Currently I have one getkey routine and then switches to determine which section the keyboard input is for, like this:
ch = getch();
if(menu){
switch(ch){
...
if(create){
switch(ch){
...
if(view){
switch(ch){
...
Is this the best way to do this, or should I have a separate getkey routine for each section (menu_getkey(), view_getkey() and so on)?
This gets into a bit of design (and so might be a bit subjective), but I think your approach is fine. It allows you to handle common input logic close to the getch() before diving into specifics, and in general it's usually a good idea to handle events (e.g., keyboard input) in a single location if possible -- having multiple event loops tends to get messy as programs grow.
An equivalent way of writing it would be something like the following:
enum { MENU, CREATE, VIEW } input_focus = MENU;
...
void input_loop(void) {
for (;;) {
int ch = getch();
/* Put common input logic here. */
switch (input_focus) {
case MENU:
handle_menu_input(ch);
break;
case CREATE:
handle_create_input(ch);
break;
case VIEW:
handle_view_input(ch);
break;
}
}
}
Your individual cases might have to have some more logic in them, but you get the idea.
Having short functions helps with readability. Modern compilers are smart enough to inline functions that are only called once (remove the function call and insert the code directly instead), so don't worry about performance. (And in case someone nit-picky reads this -- yes, it would require link time optimization for functions in different compilation units.)
Having a function pointer that you update to always point to the current input focus instead of having the switch would be another option, but it's overkill here. It's ~kinda what C++ does for virtual functions.
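For completeness, the function pointer variant might look roughly like this. The handler names follow the ones used above; the keys 'c' and 'q', the counter, and dispatch_key are made up for the example, and in the real program ch would come from getch().

```c
typedef void (*input_handler_t)(int ch);

static void handle_menu_input(int ch);
static void handle_create_input(int ch);

/* Always points at the handler for the section that has focus. */
static input_handler_t current_handler = handle_menu_input;
static int keys_seen_by_create = 0;   /* only here for demonstration */

static void handle_menu_input(int ch)
{
    if (ch == 'c')                    /* hypothetical key: open "create" */
        current_handler = handle_create_input;
}

static void handle_create_input(int ch)
{
    if (ch == 'q')                    /* hypothetical key: back to the menu */
        current_handler = handle_menu_input;
    else
        keys_seen_by_create++;
}

/* Called once per key press, replacing the switch on input_focus. */
void dispatch_key(int ch)
{
    current_handler(ch);
}
```

Changing focus becomes an assignment to current_handler, at the cost of making it harder to see all sections in one place.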
In one of my first code reviews (a while back), I was told that it's good practice to include a default clause in all switch statements. I recently remembered this advice but can't remember what the justification was. It sounds fairly odd to me now.
Is there a sensible reason for always including a default statement?
Is this language dependent? I don't remember what language I was using at the time - maybe this applies to some languages and not to others?
Switch cases should almost always have a default case.
Reasons to use a default
1. To 'catch' an unexpected value
switch(type)
{
case 1:
//something
case 2:
//something else
default:
// unknown type! based on the language,
// there should probably be some error-handling
// here, maybe an exception
}
2. To handle 'default' actions, where the cases are for special behavior.
You see this a LOT in menu-driven programs and bash shell scripts. You might also see this when a variable is declared outside the switch-case but not initialized, and each case initializes it to something different. Here the default needs to initialize it too so that down the line code that accesses the variable doesn't raise an error.
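A small sketch of that second reason in C, with made-up error classes and delays (retry_delay_ms is a hypothetical function): the variable is declared without an initial value, the cases handle the special values, and default supplies the ordinary behaviour so the variable is assigned on every path.

```c
/* Returns a hypothetical retry delay in milliseconds for an error class. */
int retry_delay_ms(int error_class)
{
    int delay;                 /* deliberately not initialised here */

    switch (error_class) {
    case 1:                    /* special case: transient error */
        delay = 100;
        break;
    case 2:                    /* special case: rate limited */
        delay = 5000;
        break;
    default:                   /* ordinary behaviour for everything else */
        delay = 1000;
        break;
    }
    return delay;              /* assigned on every path thanks to default */
}
```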
3. To show someone reading your code that you've covered that case.
variable = (variable == "value") ? 1 : 2;
switch(variable)
{
case 1:
// something
case 2:
// something else
default:
// will NOT execute because of the line preceding the switch.
}
This was an over-simplified example, but the point is that someone reading the code shouldn't wonder why variable cannot be something other than 1 or 2.
The only case I can think of to NOT use default is when the switch is checking something where it's rather obvious every other alternative can be happily ignored:
switch(keystroke)
{
case 'w':
// move up
case 'a':
// move left
case 's':
// move down
case 'd':
// move right
// no default really required here
}
No.
What if there is no default action? Context matters. What if you only care to act on a few values?
Take the example of reading keypresses for a game
switch(a)
{
case 'w':
// Move Up
break;
case 's':
// Move Down
break;
case 'a':
// Move Left
break;
case 'd':
// Move Right
break;
}
Adding:
default: // Do nothing
Is just a waste of time and increases the complexity of the code for no reason.
NOT having the default case can actually be beneficial in some situations.
If your switch cases are enum values, by not having a default case, you can get a compiler warning if you are missing any cases. That way, if new enum values are added in the future and you forget to add cases for these values in the switch, you can find out about the problem at compile time. You should still make sure the code takes appropriate action for unhandled values, in case an invalid value was cast to the enum type. So this may work best for simple cases where you can return within the enum case rather than break.
enum SomeEnum
{
ENUM_1,
ENUM_2,
// More ENUM values may be added in future
};
int foo(SomeEnum value)
{
switch (value)
{
case ENUM_1:
return 1;
case ENUM_2:
return 2;
}
// handle invalid values here
return 0;
}
I would always use a default clause, no matter what language you are working in.
Things can and do go wrong. Values will not be what you expect, and so on.
Not wanting to include a default clause implies you are confident that you know the set of possible values. If you believe you know the set of possible values then, if the value is outside this set of possible values, you'd want to be informed of it - it's certainly an error.
That's the reason why you should always use a default clause and throw an error, for example in Java:
switch (myVar) {
case 1: ......; break;
case 2: ......; break;
default: throw new RuntimeException("unreachable");
}
There's no reason to include more information than just the "unreachable" string; if it actually happens, you're going to need to look at the source and the values of the variables etc anyway, and the exception stacktrace will include that line number, so no need to waste your time writing more text into the exception message.
Should a "switch" statement always include a default clause? No. It should usually include a default.
Including a default clause only makes sense if there's something for it to do, such as assert an error condition or provide a default behavior. Including one "just because" is cargo-cult programming and provides no value. It's the "switch" equivalent of saying that all "if" statements should include an "else".
Here's a trivial example of where it makes no sense:
void PrintSign(int i)
{
switch (Math.Sign(i))
{
case 1:
Console.Write("positive ");
break;
case -1:
Console.Write("negative ");
break;
default: // useless
}
Console.Write("integer");
}
This is the equivalent of:
void PrintSign(int i)
{
int sgn = Math.Sign(i);
if (sgn == 1)
Console.Write("positive ");
else if (sgn == -1)
Console.Write("negative ");
else // also useless
{
}
Console.Write("integer");
}
In my company, we write software for the Avionics and Defense market, and we always include a default statement, because ALL cases in a switch statement must be explicitly handled (even if it is just a comment saying 'Do nothing'). We cannot afford the software just to misbehave or simply crash on unexpected (or even what we think impossible) values.
It can be discussed that a default case is not always necessary, but by always requiring it, it is easily checked by our code analyzers.
As far as I see it, the answer is that 'default' is optional. Saying a switch must always contain a default is like saying every 'if-elseif' must contain an 'else'.
If there is a logic to be done by default, then the 'default' statement should be there, but otherwise the code could continue executing without doing anything.
I disagree with the most voted answer of Vanwaril above.
Any code adds complexity. Also tests and documentation must be done for it. So it is always good if you can program using less code. My opinion is that I use a default clause for non-exhaustive switch statements while I use no default clause for exhaustive switch statements. To be sure that I did that right I use a static code analysis tool. So let's go into the details:
Nonexhaustive switch statements: Those should always have a default value. As the name suggests those are statements which do not cover all possible values. This also might not be possible, e.g. a switch statement on an integer value or on a String. Here I would like to use the example of Vanwaril (It should be mentioned that I think he used this example to make a wrong suggestion. I use it here to state the opposite --> Use a default statement):
switch(keystroke)
{
case 'w':
// move up
case 'a':
// move left
case 's':
// move down
case 'd':
// move right
default:
// cover all other values of the non-exhaustive switch statement
}
The player could press any other key. Then we could not do anything (this can be shown in the code just by adding a comment to the default case) or it should for example print something on the screen. This case is relevant as it may happen.
Exhaustive switch statements: Those switch statements cover all possible values, e.g. a switch statement on an enumeration of grade system types. When developing code the first time it is easy to cover all values. However, as we are humans there is a small chance of forgetting some. Additionally, if you add an enum value later, every switch statement has to be adapted to make it exhaustive again, which opens the path to error hell. The simple solution is a static code analysis tool. The tool should check every switch statement and verify that it is either exhaustive or has a default clause. Here is an example of an exhaustive switch statement. First we need an enum:
public enum GradeSystemType {System1To6, SystemAToD, System0To100}
Then we need a variable of this enum like GradeSystemType type = .... An exhaustive switch statement would then look like this:
switch(type)
{
case GradeSystemType.System1To6:
// do something
case GradeSystemType.SystemAToD:
// do something
case GradeSystemType.System0To100:
// do something
}
So if we extend GradeSystemType by, for example, System1To3, the static code analysis tool should detect that there is no default clause and that the switch statement is not exhaustive, so we are safe.
Just one additional thing. If we always use a default clause it might happen that the static code analysis tool is not capable of detecting exhaustive or non-exhaustive switch statements as it always detects the default clause. This is super bad as we will not be informed if we extend the enum by another value and forget to add it to one switch statement.
Having a default clause when it's not really needed is defensive programming.
This usually leads to code that is overly complex because of too much error handling code.
This error handling and detection code harms the readability of the code, makes maintenance harder, and eventually leads to more bugs than it solves.
So I believe that if the default shouldn't be reached - you don't have to add it.
Note that "shouldn't be reached" means that if it is reached, it's a bug in the software. You do still need to check values that may be invalid because of user input, etc.
I would say it depends on the language, but in C if you're switching on an enum type and you handle every possible value, you're probably better off NOT including a default case. That way, if you add an additional enum tag later and forget to add it to the switch, a competent compiler will give you a warning about the missing case.
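A short C sketch of that pattern, using a made-up color_t enum. With GCC and Clang the missing-case diagnostic is -Wswitch, which is part of -Wall, and the presence of a default label suppresses it.

```c
typedef enum { COLOR_RED, COLOR_GREEN, COLOR_BLUE } color_t;

/* No default: if a new color_t enumerator is added and not handled here,
 * GCC and Clang report it via -Wswitch (enabled by -Wall). */
const char *color_name(color_t c)
{
    switch (c) {
    case COLOR_RED:   return "red";
    case COLOR_GREEN: return "green";
    case COLOR_BLUE:  return "blue";
    }
    /* Reached only if c holds a value outside the enumeration. */
    return "unknown";
}
```

The return after the switch plays the role of default at run time without hiding a forgotten enumerator at compile time.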
If you know that the switch statement will only ever have a strictly defined set of labels or values, just do this to cover the bases; that way you will always get a valid outcome. Just put the default over the label that would programmatically/logically be the best handler for other values.
switch(ResponseValue)
{
default:
case No:
return false;
case Yes:
return true;
}
At least it is not mandatory in Java. According to the JLS, at most one default case can be present, which means having no default case is acceptable. At times it also depends on the context in which you are using the switch statement. For example, in Java the following switch block does not require a default case:
private static void switch1(String name) {
switch (name) {
case "Monday":
System.out.println("Monday");
break;
case "Tuesday":
System.out.println("Tuesday");
break;
}
}
But in the following method which expects to return a String, default case comes handy to avoid compilation errors
private static String switch2(String name) {
switch (name) {
case "Monday":
System.out.println("Monday");
return name;
case "Tuesday":
System.out.println("Tuesday");
return name;
default:
return name;
}
}
You can also avoid the compilation error in the method above without a default case by just having a return statement at the end, but providing a default case makes it more readable.
If the switch value (switch(variable)) can't reach the default case, then default case is not at all needed. Even if we keep the default case, it is not at all executed. It is dead code.
It is an optional coding 'convention'. Depending on the use is whether or not it is needed. I personally believe that if you do not need it it shouldn't be there. Why include something that won't be used or reached by the user?
If the case possibilities are limited (i.e. a Boolean) then the default clause is redundant!
Some (outdated) guidelines say so, such as MISRA C:
The requirement for a final default clause is defensive programming. This clause shall either take appropriate action or contain a suitable comment as to why no action is taken.
That advice is outdated because it is not based on currently relevant criteria. The glaring omission being what Harlan Kassler said:
Leaving out the default case enables the compiler to optionally warn or fail when it sees an unhandled case. Static verifiability is after all better than any dynamic check, and therefore not a worthy sacrifice for when you need the dynamic check as well.
As Harlan also demonstrated, the functional equivalent of a default case can be recreated after the switch. Which is trivial when each case is an early return.
The typical need for a dynamic check is input handling, in a wide sense. If a value comes from outside the program's control, it can't be trusted.
This is also where MISRA takes the standpoint of extreme defensive programming, whereby as long as an invalid value is physically representable, it must be checked for, no matter if the program is provably correct. This makes sense if the software needs to be as reliable as possible in the presence of hardware errors. But as Ophir Yoktan said, most software is better off not "handling" bugs. The latter practice is sometimes called offensive programming.
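One way to picture the split between input handling and internal code is the sketch below. The command_t enum and both function names are hypothetical: the value is checked dynamically once, where it enters the program, and the internal switch then leaves out default so the compiler can still flag a missing case.

```c
typedef enum { CMD_START, CMD_STOP, CMD_PAUSE } command_t;

/* Boundary check: raw comes from outside the program (file, socket, user),
 * so it is validated dynamically. Returns 1 and stores the command on
 * success, 0 if raw is not a known command. */
int command_from_int(int raw, command_t *out)
{
    switch (raw) {
    case CMD_START:
    case CMD_STOP:
    case CMD_PAUSE:
        *out = (command_t)raw;
        return 1;
    default:
        return 0;          /* untrusted input: reject, do not guess */
    }
}

/* Internal code: cmd was already validated, so the switch is exhaustive and
 * omits default to keep the compiler's missing-case warning available. */
int command_priority(command_t cmd)
{
    switch (cmd) {
    case CMD_START: return 2;
    case CMD_STOP:  return 3;
    case CMD_PAUSE: return 1;
    }
    return -1;             /* only reachable through a bug elsewhere */
}
```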
You should have a default to catch un-expected values coming in.
However, I disagree with Adrian Smith that the error message for the default case should be something totally meaningless. There may be an unhandled case you didn't foresee (which is kind of the point) that your user will end up seeing, and a message like "unreachable" is entirely pointless and doesn't help anyone in that situation.
Case in point, how many times have you had an utterly meaningless BSOD? Or a fatal exception # 0x352FBB3C32342?
If there is no default case in a switch statement, the behavior can be unpredictable if that case
arises at some point of time, which was not predictable at development stage. It is a good practice
to include a default case.
switch ( x ){
case 0 : { - - - -}
case 1 : { - - - -}
}
/* What happens if case 2 arises and there is a pointer
* initialization to be made in the cases . In such a case ,
* we can end up with a NULL dereference */
Such a practice can result in a bug like NULL dereference, memory leak as well as other types of
serious bugs.
For example we assume that each condition initializes a pointer. But if default case is
supposed to arise and if we don’t initialize in this case, then there is every possibility of landing up
with a null pointer exception. Hence it is suggested to use a default case statement, even though it
may be trivial.
The default case may not be necessary when switching on an enum. When the switch covers every value, the default case will never execute, so in that case it is not necessary.
Depends on how the switch in particular language works, however in most languages when no case is matched, the execution falls through the switch statement without warning. Imagine you expected some set of values and handled them in switch, however you get another value in the input. Nothing happens and you don't know nothing happened. If you caught the case in default, you would know there was something wrong.
Should switch statements always contain a default clause ?
No. A switch statement can exist without a default case. In switch(x), the default case is triggered when the value of x does not match any of the other case values.
I believe this is quite language specific, and for C++ it is a minor point for the enum class type, which appears safer than a traditional C enum. BUT
If you look at the implementation of std::byte, it's something like:
enum class byte : unsigned char {} ;
Source: https://en.cppreference.com/w/cpp/language/enum
And also consider this:
Otherwise, if T is a enumeration type that is either scoped or
unscoped with fixed underlying type, and if the braced-init-list has
only one initializer, and if the conversion from the initializer to
the underlying type is non-narrowing, and if the initialization is
direct-list-initialization, then the enumeration is initialized with
the result of converting the initializer to its underlying type.
(since C++17)
Source: https://en.cppreference.com/w/cpp/language/list_initialization
This is an example of an enum class holding a value that is not one of its defined enumerators. For this reason you cannot place complete trust in enums. Depending on the application, this might be important.
However, I really like what #Harlan Kassler said in his post and will start using that strategy in some situations myself.
Just an example of unsafe enum class:
enum class Numbers : unsigned
{
One = 1u,
Two = 2u
};
int main()
{
Numbers zero{ 0u };
return 0;
}
This question may sound cliched, but I am in a situation here.
I am trying to implement a finite state automaton to parse a certain string in C. As I started writing the code, I realised the code may be more readable if I used labels to mark the different states and use goto to jump from one state to another as the case comes.
Using the standard breaks and flag variables is quite cumbersome in this case and hard to keep track of the state.
What approach is better? More than anything else I am worried it may leave a bad impression on my boss, as I am on an internship.
There is nothing inherently wrong with goto. The reason they are often considered "taboo" is because of the way that some programmers (often coming from the assembly world) use them to create "spaghetti" code that is nearly impossible to understand. If you can use goto statements while keeping your code clean, readable, and bug-free, then more power to you.
Using goto statements and a section of code for each state is definitely one way of writing a state machine. The other method is to create a variable that will hold the current state and to use a switch statement (or similar) to select which code block to execute based on the value of the state variable. See Aidan Cully's answer for a good template using this second method.
In reality, the two methods are very similar. If you write a state machine using the state variable method and compile it, the generated assembly may very well resemble code written using the goto method (depending on your compiler's level of optimization). The goto method can be seen as optimizing out the extra variable and loop from the state variable method. Which method you use is a matter of personal choice, and as long as you are producing working, readable code I would hope that your boss wouldn't think any differently of you for using one method over the other.
If you are adding this code to an existing code base which already contains state machines, I would recommend that you follow whichever convention is already in use.
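To give an idea of what the goto method can look like when kept tidy, here is a small sketch that recognises an optionally signed decimal integer. The language being parsed and the name is_integer_goto are just assumptions for the example; each label is one state and each goto is one transition.

```c
/* Returns 1 if s is an optionally signed decimal integer, else 0. */
int is_integer_goto(const char *s)
{
    /* start state: consume an optional sign */
    if (*s == '+' || *s == '-')
        s++;
    goto need_digit;

need_digit:                       /* at least one digit is required */
    if (*s >= '0' && *s <= '9') {
        s++;
        goto more_digits;
    }
    goto reject;

more_digits:                      /* further digits or end of string */
    if (*s >= '0' && *s <= '9') {
        s++;
        goto more_digits;
    }
    if (*s == '\0')
        goto accept;
    goto reject;

accept:
    return 1;
reject:
    return 0;
}
```

All jumps stay inside one short function and only move between clearly named states, which is what keeps it from turning into spaghetti.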
Using a goto for implementing a state machine often makes good sense. If you're really concerned about using a goto, a reasonable alternative is often to have a state variable that you modify, and a switch statement based on that:
typedef enum {s0,s1,s2,s3,s4,...,sn,sexit} state;
state nextstate;
int done = 0;
nextstate = s0; /* set up to start with the first state */
while(!done)
switch(nextstate)
{
case s0:
nextstate = do_state_0();
break;
case s1:
nextstate = do_state_1();
break;
case s2:
nextstate = do_state_2();
break;
case s3:
.
.
.
.
case sn:
nextstate = do_state_n();
break;
case sexit:
done = 1;
break;
default:
/* some sort of unknown state */
break;
}
I'd use a FSM generator, like Ragel, if I wanted to leave a good impression on my boss.
The main benefit of this approach is that you are able to describe your state machine at a higher level of abstraction and don't need to concern yourself with whether to use goto or a switch. Not to mention, in the particular case of Ragel, that you can automatically get pretty diagrams of your FSM, insert actions at any point, automatically minimize the number of states, and get various other benefits. Did I mention that the generated FSMs are also very fast?
The drawbacks are that they're harder to debug (automatic visualization helps a lot here) and that you need to learn a new tool (which is probably not worth it if you have a simple machine and you are not likely to write machines frequently.)
I would use a variable that tracks what state you are in and a switch to handle them:
fsm_ctx_t ctx = ...;
state_t state = INITIAL_STATE;
while (state != DONE)
{
switch (state)
{
case INITIAL_STATE:
case SOME_STATE:
state = handle_some_state(ctx);
break;
case OTHER_STATE:
state = handle_other_state(ctx);
break;
}
}
Goto isn't necessarily evil, and I have to strongly disagree with Denis. Yes, goto might be a bad idea in most cases, but there are uses. The biggest fear with goto is so-called "spaghetti code": untraceable code paths. If you can avoid that, if it will always be clear how the code behaves, and you don't jump out of the function with a goto, there is nothing against goto. Just use it with caution, and if you are tempted to use it, really evaluate the situation and try to find a better solution. If you are unable to do so, goto can be used.
Avoid goto unless the complexity added to avoid it is more confusing.
In practical engineering problems, there's room for goto used very sparingly. Academics and non-engineers wring their hands needlessly over using goto. That said, if you paint yourself into an implementation corner where a lot of goto is the only way out, rethink the solution.
A correctly working solution is usually the primary objective. Making it correct and maintainable (by minimizing complexity) has many life cycle benefits. Make it work first, and then clean it up gradually, preferably by simplifying and removing ugliness.
I don't know your specific code, but is there a reason something like this:
typedef enum {
STATE1, STATE2, STATE3
} myState_e;
void myFsm(void)
{
myState_e State = STATE1;
while(1)
{
switch(State)
{
case STATE1:
State = STATE2;
break;
case STATE2:
State = STATE3;
break;
case STATE3:
State = STATE1;
break;
}
}
}
wouldn't work for you? It doesn't use goto, and is relatively easy to follow.
Edit: All those State = fragments violate DRY, so I might instead do something like:
typedef int (*myStateFn_t)(int OldState, void *ObjP);
int myStateFn_Reset(int OldState, void *ObjP);
int myStateFn_Start(int OldState, void *ObjP);
int myStateFn_Process(int OldState, void *ObjP);
/* Each #define sits next to its table entry so index and function stay in sync. */
myStateFn_t myStateFns[] = {
#define MY_STATE_RESET 0
    myStateFn_Reset,
#define MY_STATE_START 1
    myStateFn_Start,
#define MY_STATE_PROCESS 2
    myStateFn_Process
};
int myStateFn_Reset(int OldState, void *ObjP)
{
    return shouldStart(ObjP) ? MY_STATE_START : MY_STATE_RESET;
}
int myStateFn_Start(int OldState, void *ObjP)
{
    resetState(ObjP);
    return MY_STATE_PROCESS;
}
int myStateFn_Process(int OldState, void *ObjP)
{
    return (process(ObjP) == DONE) ? MY_STATE_RESET : MY_STATE_PROCESS;
}
int stateValid(int StateFnSize, int State)
{
    return (State >= 0 && State < StateFnSize);
}
/* Runs the function for State and returns the next state. */
int stateFnRunOne(myStateFn_t *StateFns, int StateFnSize, int State, void *ObjP)
{
    return (StateFns[State])(State, ObjP);
}
/* Keeps running until a state function returns an out-of-range state. */
void stateFnRun(myStateFn_t *StateFns, int StateFnSize, int CurState, void *ObjP)
{
    int NextState;
    while(stateValid(StateFnSize, CurState))
    {
        NextState = stateFnRunOne(StateFns, StateFnSize, CurState, ObjP);
        if(! stateValid(StateFnSize, NextState))
            LOG_THIS(CurState, NextState);
        CurState = NextState;
    }
}
which is, of course, much longer than the first attempt (funny thing about DRY). But it's also more robust - failure to return the state from one of the state functions will result in a compiler warning, rather than silently ignore a missing State = in the earlier code.
I would recommend the "Dragon book": Compilers: Principles, Techniques, and Tools by Aho, Sethi and Ullman. (It is rather expensive to buy, but you will surely find it in a library.) There you will find everything you need to parse strings and build finite automata. I could not find any place in it that uses a goto. Usually the states are a data table and the transitions are functions like accept_space().
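To show what "states as a data table" can mean in practice, here is a small table-driven recogniser written for this post (it is not taken from the book). It accepts digits with an optional fractional part, such as 42 or 3.14; the state names, character classes, and is_decimal are all invented for the example.

```c
enum { S_START, S_INT, S_DOT, S_FRAC, S_ERR, S_COUNT };   /* states */
enum { C_DIGIT, C_DOT, C_OTHER, C_COUNT };                /* input classes */

/* next_state[current state][input class] */
static const unsigned char next_state[S_COUNT][C_COUNT] = {
    /*              digit    '.'     other */
    /* S_START */ { S_INT,  S_ERR,  S_ERR },
    /* S_INT   */ { S_INT,  S_DOT,  S_ERR },
    /* S_DOT   */ { S_FRAC, S_ERR,  S_ERR },
    /* S_FRAC  */ { S_FRAC, S_ERR,  S_ERR },
    /* S_ERR   */ { S_ERR,  S_ERR,  S_ERR },
};

static int classify(char ch)
{
    if (ch >= '0' && ch <= '9') return C_DIGIT;
    if (ch == '.')              return C_DOT;
    return C_OTHER;
}

/* Returns 1 if s matches digits, optionally followed by '.' and digits. */
int is_decimal(const char *s)
{
    int state = S_START;
    for (; *s != '\0'; s++)
        state = next_state[state][classify(*s)];
    return state == S_INT || state == S_FRAC;   /* the accepting states */
}
```

The driver loop never changes; only the table does, which is why neither goto nor a large switch is needed.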
I can't see much of a difference between goto and switch. I might prefer switch/while because it gives you a place guaranteed to execute after the switch (where you could throw in logging and reason about your program). With GOTO you just keep jumping from label to label, so to throw in logging you'd have to put it at every label.
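A tiny sketch of that "guaranteed place after the switch", with made-up states; the trace array stands in for whatever logging you would actually do, and run_with_trace is a hypothetical name.

```c
typedef enum { ST_A, ST_B, ST_DONE } st_t;

/* Runs the machine from ST_A, recording the state reached after each step.
 * Returns the number of entries written to trace. */
int run_with_trace(st_t trace[], int max_trace)
{
    st_t state = ST_A;
    int n = 0;
    while (state != ST_DONE) {
        switch (state) {
        case ST_A:    state = ST_B;    break;
        case ST_B:    state = ST_DONE; break;
        case ST_DONE: break;
        }
        /* Single spot that runs after every step: log, check invariants... */
        if (n < max_trace)
            trace[n++] = state;
    }
    return n;
}
```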
But aside from that there shouldn't be much difference. Either way, if you didn't break it up into functions and not every state uses/initializes all local variables you may end up with a mess of almost spaghetti code not knowing which states changed which variables and making it very difficult to debug/reason about.
As an aside, can you maybe parse the string using a regular expression? Most programming languages have libraries that allow using them. The regular expressions often create an FSM as part of their implementation. Generally regular expressions work for non arbitrarily nested items and for everything else there is a parser generator(ANTLR/YACC/LEX). It is generally much easier to maintain a grammar/regex than the underlying state machine. Also you said you were on an internship, and generally they might give you easier work than say a senior developer, so there is a strong chance that a regex may work on the string. Also regular expressions generally aren't emphasized in college so try using Google to read up on them.
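In C on a POSIX system, the regular expression route could look something like this sketch using regcomp() and regexec() from <regex.h>. The pattern (an optionally signed integer) and the function name is_integer_regex are only assumptions, since the actual string format is not given in the question.

```c
#include <regex.h>
#include <stddef.h>

/* Returns 1 if s is an optionally signed decimal integer, 0 if not,
 * and -1 if the pattern failed to compile. */
int is_integer_regex(const char *s)
{
    regex_t re;
    int rc;

    /* REG_NOSUB: only a yes/no answer is needed, no match positions. */
    if (regcomp(&re, "^[+-]?[0-9]+$", REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, s, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

In real code you would compile the pattern once and reuse it, rather than recompiling on every call as this sketch does.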
I have a rather long switch-case statement. Some of the cases are really short and trivial. A few are longer and need some variables that are never used anywhere else, like this:
switch (action) {
case kSimpleAction:
// Do something simple
break;
case kComplexAction: {
int specialVariable = 5;
// Do something complex with specialVariable
} break;
}
The alternative would be to declare that variable before going into the switch like this:
int specialVariable = 5;
switch (action) {
case kSimpleAction:
// Do something simple
break;
case kComplexAction:
// Do something complex with specialVariable
break;
}
This can get rather confusing since it is not clear to which case the variable belongs and it uses some unnecessary memory.
However, I have never seen this usage anywhere else.
Do you think it is a good idea to declare variables locally in a block for a single case?
If specialVariable is not used after the switch block, declare it in the "case" block.
In general, variables should be declared in the smallest possible scope in which they will be used.
If the switch statement becomes unmanageably huge, you may want to convert to a table of function pointers. By having the code for each case in separate functions, you don't have to worry about variable declaration and definitions.
Another advantage is that you can put each case function into a separate translation unit. This will speed up the build process by only compiling the cases that have changed. Also improves quality by isolating changes to their smallest scope.
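A rough sketch of such a table, with hypothetical action names and handlers. Here everything sits in one file for brevity; in the layout described above, do_simple and do_complex would each live in their own translation unit and only be declared here.

```c
#include <stddef.h>

typedef enum { ACT_SIMPLE, ACT_COMPLEX, ACT_COUNT } action_t;
typedef int (*action_fn)(int arg);

static int do_simple(int arg)
{
    return arg + 1;
}

static int do_complex(int arg)
{
    int specialVariable = 5;       /* local to the only code that needs it */
    return arg * specialVariable;
}

/* Indexed by action_t; designated initialisers (C99) keep the order explicit. */
static const action_fn handlers[ACT_COUNT] = {
    [ACT_SIMPLE]  = do_simple,
    [ACT_COMPLEX] = do_complex,
};

/* Returns the handler's result, or -1 for an unknown action. */
int run_action(int action, int arg)
{
    if (action < 0 || action >= ACT_COUNT || handlers[action] == NULL)
        return -1;
    return handlers[action](arg);
}
```

The range check in run_action takes over the job a default clause would have done in the switch.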
Yes, define variables in the narrowest scope needed.
So example 1 is preferred.
I'm all for
case X:
{
type var;
...;
}
break; // I like to keep breaks outside of the blocks if I can
If the stuff in there gets too complicated and starts getting in the way of your ability to see the entire switch/case as a switch/case then consider moving as much as you can into one or two inline functions that get called by the cases code. This can improve readability without throwing function call overhead in there.
Agree with Max -- use the smallest scope possible. That way, when the next person needs to update it, he/she doesn't need to worry about whether the variable is used in other sections of the switch statement.
My own rule for switch statements is that there should be a maximum of a single statement inside each case, excluding a break. This means the statement is either an initialisation, an assignment or a function call. Putting any more complex code in a case is a recipe for disaster - I "fondly" remember all the Windows code I've seen (inspired by Petzold) which processed message parameters in-line in the same case of a windows procedure.
So call a function, and put the variable in there!
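Applied to the switch from the question, that rule might look like the sketch below. The bodies of the two helpers and the names simple_action, complex_action and perform are placeholders, since the question does not say what the actions really do.

```c
typedef enum { kSimpleAction, kComplexAction } action_kind_t;

static int simple_action(int x)
{
    return x + 1;
}

static int complex_action(int x)
{
    int specialVariable = 5;   /* scoped to the one function that uses it */
    return x * specialVariable;
}

/* Every case is a single statement, so the whole switch stays readable. */
int perform(action_kind_t action, int x)
{
    int result = 0;
    switch (action) {
    case kSimpleAction:
        result = simple_action(x);
        break;
    case kComplexAction:
        result = complex_action(x);
        break;
    }
    return result;
}
```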