Should switch statements always contain a default clause?

In one of my first code reviews (a while back), I was told that it's good practice to include a default clause in all switch statements. I recently remembered this advice but can't remember what the justification was. It sounds fairly odd to me now.
Is there a sensible reason for always including a default statement?
Is this language dependent? I don't remember what language I was using at the time - maybe this applies to some languages and not to others?

Switch cases should almost always have a default case.
Reasons to use a default
1. To 'catch' an unexpected value
switch(type)
{
case 1:
//something
case 2:
//something else
default:
// unknown type! based on the language,
// there should probably be some error-handling
// here, maybe an exception
}
2. To handle 'default' actions, where the cases are for special behavior.
You see this a LOT in menu-driven programs and bash shell scripts. You might also see this when a variable is declared outside the switch-case but not initialized, and each case initializes it to something different. Here the default needs to initialize it too so that down the line code that accesses the variable doesn't raise an error.
3. To show someone reading your code that you've covered that case.
variable = (variable == "value") ? 1 : 2;
switch(variable)
{
case 1:
// something
case 2:
// something else
default:
// will NOT execute because of the line preceding the switch.
}
This was an over-simplified example, but the point is that someone reading the code shouldn't wonder why variable cannot be something other than 1 or 2.
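Reason 2 above (a default that supplies the ordinary behavior, with cases only for the special values) can be sketched in C roughly as follows. The function name retry_delay_ms and the delay values are made up for illustration; they do not come from the answer.

```c
/* Hypothetical example: the variable is declared outside the switch and
 * not initialized, so every path through the switch must assign it. */
int retry_delay_ms(int attempt)
{
    int delay;

    switch (attempt) {
    case 1:
        delay = 100;    /* special case: first retry is quick */
        break;
    case 2:
        delay = 500;    /* special case: second retry waits longer */
        break;
    default:
        /* The 'default' action. Without this branch, delay would be
         * read uninitialized for any other attempt number. */
        delay = 2000;
        break;
    }
    return delay;
}
```

Removing the default branch here would not be a style issue but a real bug, since the return statement would read an indeterminate value.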
The only case I can think of to NOT use default is when the switch is checking something where it's rather obvious that every other alternative can be happily ignored:
switch(keystroke)
{
case 'w':
// move up
case 'a':
// move left
case 's':
// move down
case 'd':
// move right
// no default really required here
}

No.
What if there is no default action? Context matters. What if you only care to act on a few values?
Take the example of reading keypresses for a game:
switch(a)
{
case 'w':
// Move Up
break;
case 's':
// Move Down
break;
case 'a':
// Move Left
break;
case 'd':
// Move Right
break;
}
Adding:
default: // Do nothing
is just a waste of time and increases the complexity of the code for no reason.

NOT having the default case can actually be beneficial in some situations.
If your switch cases are enum values, by not having a default case, you can get a compiler warning if you are missing any cases. That way, if new enum values are added in the future and you forget to add cases for these values in the switch, you can find out about the problem at compile time. You should still make sure the code takes appropriate action for unhandled values, in case an invalid value was cast to the enum type. So this may work best for simple cases where you can return within the enum case rather than break.
enum SomeEnum
{
ENUM_1,
ENUM_2,
// More ENUM values may be added in future
};
int foo(SomeEnum value)
{
switch (value)
{
case ENUM_1:
return 1;
case ENUM_2:
return 2;
}
// handle invalid values here
return 0;
}

I would always use a default clause, no matter what language you are working in.
Things can and do go wrong. Values will not be what you expect, and so on.
Not wanting to include a default clause implies you are confident that you know the set of possible values. If you believe you know the set of possible values then, if the value is outside this set of possible values, you'd want to be informed of it - it's certainly an error.
That's the reason why you should always use a default clause and throw an error, for example in Java:
switch (myVar) {
case 1: ......; break;
case 2: ......; break;
default: throw new RuntimeException("unreachable");
}
There's no reason to include more information than just the "unreachable" string; if it actually happens, you're going to need to look at the source and the values of the variables etc. anyway, and the exception stack trace will include that line number, so no need to waste your time writing more text into the exception message.

Should a "switch" statement always include a default clause? No. It should usually include a default.
Including a default clause only makes sense if there's something for it to do, such as assert an error condition or provide a default behavior. Including one "just because" is cargo-cult programming and provides no value. It's the "switch" equivalent of saying that all "if" statements should include an "else".
Here's a trivial example of where it makes no sense:
void PrintSign(int i)
{
switch (Math.Sign(i))
{
case 1:
Console.Write("positive ");
break;
case -1:
Console.Write("negative ");
break;
default: // useless
break;
}
Console.Write("integer");
}
This is the equivalent of:
void PrintSign(int i)
{
int sgn = Math.Sign(i);
if (sgn == 1)
Console.Write("positive ");
else if (sgn == -1)
Console.Write("negative ");
else // also useless
{
}
Console.Write("integer");
}

In my company, we write software for the Avionics and Defense market, and we always include a default statement, because ALL cases in a switch statement must be explicitly handled (even if it is just a comment saying 'Do nothing'). We cannot afford the software just to misbehave or simply crash on unexpected (or even what we think impossible) values.
It can be argued that a default case is not always necessary, but always requiring it makes the rule easy for our code analyzers to check.

As far as I see it, the answer is that 'default' is optional. Saying a switch must always contain a default is like saying every 'if-elseif' must contain an 'else'.
If there is logic to be done by default, then the 'default' statement should be there; otherwise the code can continue executing without doing anything.

I disagree with the most voted answer of Vanwaril above.
Any code adds complexity, and tests and documentation must be written for it as well. So it is always good if you can program using less code. My approach is to use a default clause for non-exhaustive switch statements and no default clause for exhaustive switch statements. To be sure that I did that right, I use a static code analysis tool. So let's go into the details:
Nonexhaustive switch statements: Those should always have a default value. As the name suggests those are statements which do not cover all possible values. This also might not be possible, e.g. a switch statement on an integer value or on a String. Here I would like to use the example of Vanwaril (It should be mentioned that I think he used this example to make a wrong suggestion. I use it here to state the opposite --> Use a default statement):
switch(keystroke)
{
case 'w':
// move up
case 'a':
// move left
case 's':
// move down
case 'd':
// move right
default:
// cover all other values of the non-exhaustive switch statement
}
The player could press any other key. Then we could either do nothing (this can be shown in the code just by adding a comment to the default case) or, for example, print something on the screen. This case is relevant as it may happen.
Exhaustive switch statements: Those switch statements cover all possible values, e.g. a switch statement on an enumeration of grade system types. When developing the code for the first time, it is easy to cover all values. However, as we are human, there is a small chance of forgetting some. Additionally, if you add an enum value later, all switch statements have to be adapted to make them exhaustive again, which opens the path to error hell. The simple solution is a static code analysis tool. The tool should check all switch statements and verify that they are either exhaustive or have a default clause. Here is an example of an exhaustive switch statement. First we need an enum:
public enum GradeSystemType {System1To6, SystemAToD, System0To100}
Then we need a variable of this enum like GradeSystemType type = .... An exhaustive switch statement would then look like this:
switch(type)
{
case GradeSystemType.System1To6:
// do something
case GradeSystemType.SystemAToD:
// do something
case GradeSystemType.System0To100:
// do something
}
So if we extend GradeSystemType with, for example, System1To3, the static code analysis tool should detect that there is no default clause and that the switch statement is not exhaustive, so we are safe.
Just one additional thing. If we always use a default clause, the static code analysis tool may not be able to distinguish exhaustive from non-exhaustive switch statements, as it always finds the default clause. This is super bad, as we will not be informed if we extend the enum by another value and forget to add it to one switch statement.

Having a default clause when it's not really needed is defensive programming.
This usually leads to code that is overly complex because of too much error handling code.
This error handling and detection code harms the readability of the code, makes maintenance harder, and eventually leads to more bugs than it solves.
So I believe that if the default shouldn't be reached, you don't have to add it.
Note that "shouldn't be reached" means that if it is reached, it's a bug in the software. You still need to check values that may be invalid because of user input, etc.

I would say it depends on the language, but in C if you're switching on an enum type and you handle every possible value, you're probably better off NOT including a default case. That way, if you add an additional enum tag later and forget to add it to the switch, a competent compiler will give you a warning about the missing case.
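As a rough sketch of what that looks like in C: GCC enables the -Wswitch warning as part of -Wall, and Clang enables it by default. The enum, the function name and the file name in the comment are invented for this example.

```c
/* Build with, for example: cc -Wall -c colors.c
 * (colors.c is a hypothetical file name for this sketch). */
enum color { COLOR_RED, COLOR_GREEN, COLOR_BLUE };

int color_code(enum color c)
{
    switch (c) {
    case COLOR_RED:
        return 1;
    case COLOR_GREEN:
        return 2;
    case COLOR_BLUE:
        return 3;
    /* No default on purpose: if an enumerator is added later and not
     * handled here, -Wswitch can report the missing case. */
    }
    /* Reached only if c holds a value that is not a named enumerator,
     * for example after a cast from an arbitrary integer. */
    return 0;
}
```

If a default label were added to this switch, -Wswitch would stay silent about missing enumerators; GCC and Clang offer the stricter -Wswitch-enum for code bases that want both a default and the warning.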

If you know that the switch statement will only ever have a strictly defined set of labels or values, just do this to cover the bases; that way you will always get a valid outcome. Just put the default over the label that would programmatically/logically be the best handler for other values.
switch(ResponseValue)
{
default:
case No:
return false;
case Yes:
return true;
}

At least it is not mandatory in Java. According to the JLS, at most one default case can be present, which means having no default case is acceptable. At times it also depends on the context in which you are using the switch statement. For example, in Java, the following switch block does not require a default case:
private static void switch1(String name) {
switch (name) {
case "Monday":
System.out.println("Monday");
break;
case "Tuesday":
System.out.println("Tuesday");
break;
}
}
But in the following method, which is expected to return a String, a default case comes in handy to avoid compilation errors:
private static String switch2(String name) {
switch (name) {
case "Monday":
System.out.println("Monday");
return name;
case "Tuesday":
System.out.println("Tuesday");
return name;
default:
return name;
}
}
You can avoid the compilation error for the above method without a default case by just having a return statement at the end, but providing a default case makes it more readable.

If the switch value (switch(variable)) can't reach the default case, then the default case is not needed at all. Even if we keep it, it is never executed. It is dead code.

It is an optional coding 'convention'. Whether or not it is needed depends on the use. I personally believe that if you do not need it, it shouldn't be there. Why include something that won't be used or reached by the user?
If the case possibilities are limited (i.e. a Boolean) then the default clause is redundant!

Some (outdated) guidelines say so, such as MISRA C:
The requirement for a final default clause is defensive programming. This clause shall either take appropriate action or contain a suitable comment as to why no action is taken.
That advice is outdated because it is not based on currently relevant criteria. The glaring omission being what Harlan Kassler said:
Leaving out the default case enables the compiler to optionally warn or fail when it sees an unhandled case. Static verifiability is after all better than any dynamic check, and therefore not a worthy sacrifice for when you need the dynamic check as well.
As Harlan also demonstrated, the functional equivalent of a default case can be recreated after the switch. Which is trivial when each case is an early return.
The typical need for a dynamic check is input handling, in a wide sense. If a value comes from outside the program's control, it can't be trusted.
This is also where MISRA takes the standpoint of extreme defensive programming, whereby as long as an invalid value is physically representable, it must be checked for, no matter if the program is provably correct. Which makes sense if the software needs to be as reliable as possible in the presence of hardware errors. But as Ophir Yoktan said, most software is better off not "handling" bugs. The latter practice is sometimes called offensive programming.

You should have a default to catch unexpected values coming in.
However, I disagree with Adrian Smith that your error message for default should be something totally meaningless. There may be an unhandled case you didn't foresee (which is kind of the point) that your user will end up seeing, and a message like "unreachable" is entirely pointless and doesn't help anyone in that situation.
Case in point, how many times have you had an utterly meaningless BSOD? Or a fatal exception # 0x352FBB3C32342?
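One way to make the default branch say something useful is to put the offending value and the location into the message. The following is only a sketch in C; describe_type and its message format are hypothetical and not taken from any answer here.

```c
#include <stdio.h>

/* Hypothetical helper: writes a description of 'type' into buf and
 * returns the number of characters snprintf would have written. */
int describe_type(int type, char *buf, size_t buflen)
{
    switch (type) {
    case 1:
        return snprintf(buf, buflen, "type 1");
    case 2:
        return snprintf(buf, buflen, "type 2");
    default:
        /* Report which value arrived and which function rejected it,
         * instead of a bare "unreachable". */
        return snprintf(buf, buflen, "%s: unexpected type %d",
                        __func__, type);
    }
}
```

With a message like this, whoever reads the log at least knows which value slipped through, which a line number alone does not tell them.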

If there is no default case in a switch statement, the behavior can be unpredictable if a case arises at run time that was not anticipated during development. It is good practice to include a default case.
switch ( x ){
case 0 : { - - - -}
case 1 : { - - - -}
}
/* What happens if case 2 arises and there is a pointer
* initialization to be made in the cases . In such a case ,
* we can end up with a NULL dereference */
Such a practice can result in bugs such as a NULL dereference, a memory leak, or other serious problems.
For example, assume that each case initializes a pointer. If an unhandled value arises and the pointer is not initialized for it, there is every possibility of ending up dereferencing a null or uninitialized pointer. Hence it is suggested to use a default case, even though it may be trivial.
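The pointer scenario described above might look like this in C. This is a minimal sketch; name_length and the strings are invented for illustration.

```c
#include <stddef.h>

/* Hypothetical lookup: the pointer is assigned inside the switch and
 * dereferenced afterwards. */
size_t name_length(int x)
{
    const char *name;
    size_t len = 0;

    switch (x) {
    case 0:
        name = "zero";
        break;
    case 1:
        name = "one";
        break;
    default:
        /* Without this branch, name would be indeterminate for x == 2
         * and the loop below would dereference a garbage pointer. */
        name = "";
        break;
    }

    while (name[len] != '\0')
        len++;
    return len;
}
```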

The default case may not be necessary in a switch over an enum. When the switch contains every value, the default case will never execute, so in that case it is not necessary.

It depends on how switch works in the particular language; however, in most languages, when no case is matched, execution simply passes over the switch statement without any warning. Imagine you expected some set of values and handled them in the switch, but you get another value in the input. Nothing happens, and you don't know that nothing happened. If you caught the case in default, you would know there was something wrong.

Should switch statements always contain a default clause ?
No. Switch statements can exist without a default case. In a switch statement, the default case is triggered when the switch value (x in switch(x)) does not match any of the other case values.

I believe this is quite language specific, and in the C++ case it is a minor point for enum class types, which appear safer than a traditional C enum. BUT
If you look at the implementation of std::byte, it's something like:
enum class byte : unsigned char {} ;
Source: https://en.cppreference.com/w/cpp/language/enum
And also consider this:
Otherwise, if T is a enumeration type that is either scoped or
unscoped with fixed underlying type, and if the braced-init-list has
only one initializer, and if the conversion from the initializer to
the underlying type is non-narrowing, and if the initialization is
direct-list-initialization, then the enumeration is initialized with
the result of converting the initializer to its underlying type.
(since C++17)
Source: https://en.cppreference.com/w/cpp/language/list_initialization
This is an example of an enum class representing values that are not defined enumerators. For this reason you cannot place complete trust in enums. Depending on the application, this might be important.
However, I really like what Harlan Kassler said in his post and will start using that strategy in some situations myself.
Just an example of an unsafe enum class:
enum class Numbers : unsigned
{
One = 1u,
Two = 2u
};
int main()
{
Numbers zero{ 0u };
return 0;
}
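The same concern exists in plain C, where an enum object can hold any value of its underlying integer type. A small validation function, written in the "no default, fallback after the switch" style mentioned above, could look like this. It is sketched in C rather than C++, and the names are hypothetical.

```c
enum number { NUMBER_ONE = 1, NUMBER_TWO = 2 };

/* Returns 1 if n is one of the named enumerators, 0 otherwise. */
int number_is_valid(enum number n)
{
    switch (n) {
    case NUMBER_ONE:
    case NUMBER_TWO:
        return 1;
    /* No default: the compiler can still warn if an enumerator is
     * added later and not listed here. */
    }
    /* Values such as 0 or 3 are representable but not named. */
    return 0;
}
```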

Related

Absence of default case in switch command - could this be a runtime error?

I have an argument with my compilation course lecturer:
In the test that was part of this course, some of the questions referred to the identification and classification of code segments written in C.
Each of these questions asks at what stage the error will be exposed:
a. Lexical analysis
b. Syntactic analysis
c. Semantic analysis
d. Running time (under certain conditions)
e. This is not an error.
One of the questions in this style was as follows:
Switch command that does not have the default component. For example:
switch (key){
case 1: .........
case 2: .........
case 3:..........
}
Now, in the official test solution, in the above case only option e was correct.
However, I argue that Option d cannot necessarily be rejected outright, and that it is also true.
As an argument, I showed (after the test) the following two examples to my lecturer:
1)
(from : https://cwe.mitre.org/data/definitions/478.html)
2)
(from : Should switch statements always contain a default clause?)
However, he is not yet convinced that the runtime error option applies in this case. He said that questions of this kind concern only the commands or snippets shown directly in the question, and since the snippet does not convey the intent of the code, what is actually being asked is whether this structure is invalid by itself; therefore a runtime error is ruled out here (I personally do not see any contradiction here ...).
I would be happy if you could share your views on this issue.
Your lecturer is correct.
Omitting the default case in a switch is perfectly valid code and will not directly lead to any kind of problem. It may well be exactly what the programmer intended, and do the correct thing.
Of course, it is always possible to add some code that would cause problems, but that is a problem with this added code, not with the switch per se. Code style rules like "always add a default case" may guard against certain types of programming mistakes, but not following them does not automatically cause these mistakes - it "just" requires more caution.
From a code style perspective, it is usually better to be explicit about intentionally ignoring certain cases, or to add some default handler for guarding against unexpected values, but that does not mean that omitting such a handler is always incorrect by itself.
(Note that the currently second most upvoted answer in the question you linked to yourself argues for omitting default cases that are not doing anything useful, in order to reduce clutter - I don't fully agree with that, but it is a matter of style)

Why could defining enums in the function block be bad practice?

I've coded a stateflow handler and, to reduce the risk of using the stateflow enums outside the stateflow handler, I've defined the stateflow enums inside the function block.
My code looks like:
static void RequestHandler(bool isThisRequestANewRequest)
{
typedef enum
{
STATE_NEW_REQUEST,
STATE_1,
STATE_2,
STATE_ERROR,
} States;
static States state = STATE_ERROR;
if(isThisRequestANewRequest == true)
{
state = STATE_NEW_REQUEST;
}
switch(state)
{
case STATE_NEW_REQUEST:
//init request flags
state = STATE_1;
//lint -fallthrough
case STATE_1:
//do something
break;
case STATE_2:
//do something else
break;
case STATE_ERROR:
default:
//do something in case of error
break;
}
}
Can this be considered good practice? Is there any risk? Are there any cons? (maintenance, reading, ...)
One colleague of mine told me that it was not, but I'm waiting for fact-based answers, not just raw opinions.
Note: My question applies to both single-threaded and multithreaded tasks.
Generally, reducing scope as much as possible is good practice. If you want some sort of canonical reference to that, the closest I can come up with is MISRA-C:2012, rule 8.9 which recommends that objects that are only used by a single function should be declared at block scope. I don't see why the same wouldn't apply to types.
It is however bad practice to rely on fall-through in switch statements, since that blocks static analysers (like Lint in this case) from finding real bugs caused by missing break. It also makes the code harder to read and maintain - I would personally consider fall-through switches much worse practice than code repetition.
If you are to execute multiple states per function call, consider using simple if statements:
if(state == STATE_NEW_REQUEST)
{
...
state = STATE_1;
}
if(state == STATE_1)
{
...
}
Otherwise, if you are only executing one state per function call, you can use a switch. In general, the need to "execute several states per state" is a hint that the broader program design could be improved.

How do you switch() on an enum that has multiple similar values in C/C++?

Let's say that I have an enum as such:
typedef enum
{
gray = 4, //Gr[ae]y should be the same
grey = 4,
blue = 5,
red = 6
} FOO;
I then want to switch on this:
switch(f){
case gray:
case grey:
printf("The color of an elephant\n"); break;
case blue:
printf("The color of the sky\n"); break;
case red:
printf("The color of an apple\n"); break;
default:
printf("I don't know this color\n");
}
Basically I have an enum that has values that are essentially synonyms that I want to handle exactly the same way. I tried the above switch, but it doesn't compile for me. Is there a way to do this, or am I stuck using if/else logic? (I'd rather not, as there are 20+ enums and the switch is much cleaner looking.)
EDIT: Yes, I know that I can just pick one or the other (and no locales are not the solution), but doesn't it seem kind of odd that enums explicitly allow you to declare duplicate values yet you then can't use them in a switch statement? I want to use enums so that I can statically enforce in a library API that they are sending proper values (yes I know you can get around with typecasting, I'm just trying to prevent stupid mistakes and such). If I do so it now seems like I lose the ability to use it in a switch statement.
The compiler is just reducing the logic down to if/else logic. If
case 4:
case 5:
bar(); break;
is legal, why can't
case 4:
case 4:
bar(); break;
be legal? The compiler should be able to optimize that to one statement and move on.
You can't.
The C standard requires all the constant expressions in the case labels for a given switch statement to have distinct values. This is checked at compile time. Having two case labels with the same value is a constraint violation, requiring a compile-time diagnostic. (This could be a non-fatal warning, but I don't know of any compiler that doesn't treat it as a fatal error.)
The rule is stated in N1570 6.8.4.2p3:
The expression of each case label shall be an integer constant
expression and no two of the case constant expressions in the same
switch statement shall have the same value after conversion.
C++ has similar rules.
This means, for example that this:
switch (blah) {
case 2+2:
case 4:
/* ... */
}
is also illegal. The compiler checks the values of the expressions, regardless of whether they have some distinct meaning to a human reader.
You'll just have to pick either gray or grey.
(In principle, the standard could have permitted two case labels to have the same value as long as they're grouped together, as in your example. But it wasn't defined that way, probably because it wasn't considered useful enough.)
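Picking one spelling for the label does not cost any behavior, because both enumerators have the same value and therefore reach the same case. A minimal sketch using the enum from the question; the function name describe_color is made up here.

```c
typedef enum {
    gray = 4,   /* gray and grey are synonyms with the same value */
    grey = 4,
    blue = 5,
    red  = 6
} FOO;

const char *describe_color(FOO f)
{
    switch (f) {
    case gray:  /* also reached for grey, since grey == gray */
        return "The color of an elephant";
    case blue:
        return "The color of the sky";
    case red:
        return "The color of an apple";
    default:
        return "I don't know this color";
    }
}
```

Callers can still pass either grey or gray, so the API keeps both spellings; only the switch has to settle on one.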

To use goto or not?

This question may sound cliched, but I am in a situation here.
I am trying to implement a finite state automaton to parse a certain string in C. As I started writing the code, I realised the code may be more readable if I used labels to mark the different states and use goto to jump from one state to another as the case comes.
Using the standard breaks and flag variables is quite cumbersome in this case and hard to keep track of the state.
What approach is better? More than anything else I am worried it may leave a bad impression on my boss, as I am on an internship.
There is nothing inherently wrong with goto. The reason they are often considered "taboo" is because of the way that some programmers (often coming from the assembly world) use them to create "spaghetti" code that is nearly impossible to understand. If you can use goto statements while keeping your code clean, readable, and bug-free, then more power to you.
Using goto statements and a section of code for each state is definitely one way of writing a state machine. The other method is to create a variable that will hold the current state and to use a switch statement (or similar) to select which code block to execute based on the value of the state variable. See Aidan Cully's answer for a good template using this second method.
In reality, the two methods are very similar. If you write a state machine using the state variable method and compile it, the generated assembly may very well resemble code written using the goto method (depending on your compiler's level of optimization). The goto method can be seen as optimizing out the extra variable and loop from the state variable method. Which method you use is a matter of personal choice, and as long as you are producing working, readable code I would hope that your boss wouldn't think any different of you for using one method over the other.
If you are adding this code to an existing code base which already contains state machines, I would recommend that you follow whichever convention is already in use.
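To make the "one section of code per state" idea concrete, here is a small goto-based recognizer in C. It is only a sketch: the task (an optional sign followed by one or more digits) and all names are chosen for illustration, not taken from the question.

```c
/* Returns 1 if s is an optional sign followed by one or more digits.
 * Each label is one state; each goto is one transition. */
int is_signed_integer(const char *s)
{
    /* State: start. An optional sign may appear here. */
    if (*s == '+' || *s == '-')
        s++;
    goto need_digit;

need_digit:
    /* State: at least one digit is required. */
    if (*s >= '0' && *s <= '9') {
        s++;
        goto more_digits;
    }
    goto reject;

more_digits:
    /* State: further digits are optional; end of string accepts. */
    if (*s >= '0' && *s <= '9') {
        s++;
        goto more_digits;
    }
    if (*s == '\0')
        goto accept;
    goto reject;

accept:
    return 1;

reject:
    return 0;
}
```

All jumps stay inside one function and every label has a comment naming its state, which is roughly what keeps this style from turning into spaghetti.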
Using a goto for implementing a state machine often makes good sense. If you're really concerned about using a goto, a reasonable alternative is often to have a state variable that you modify, and a switch statement based on that:
typedef enum {s0,s1,s2,s3,s4,...,sn,sexit} state;
state nextstate;
int done = 0;
nextstate = s0; /* set up to start with the first state */
while(!done)
switch(nextstate)
{
case s0:
nextstate = do_state_0();
break;
case s1:
nextstate = do_state_1();
break;
case s2:
nextstate = do_state_2();
break;
case s3:
.
.
.
.
case sn:
nextstate = do_state_n();
break;
case sexit:
done = 1;
break;
default:
/* some sort of unknown state */
break;
}
I'd use a FSM generator, like Ragel, if I wanted to leave a good impression on my boss.
The main benefit of this approach is that you are able to describe your state machine at a higher level of abstraction and don't need to concern yourself of whether to use goto or a switch. Not to mention in the particular case of Ragel that you can automatically get pretty diagrams of your FSM, insert actions at any point, automatically minimize the amount of states and various other benefits. Did I mention that the generated FSMs are also very fast?
The drawbacks are that they're harder to debug (automatic visualization helps a lot here) and that you need to learn a new tool (which is probably not worth it if you have a simple machine and you are not likely to write machines frequently.)
I would use a variable that tracks what state you are in and a switch to handle them:
fsm_ctx_t ctx = ...;
state_t state = INITIAL_STATE;
while (state != DONE)
{
switch (state)
{
case INITIAL_STATE:
case SOME_STATE:
state = handle_some_state(ctx);
break;
case OTHER_STATE:
state = handle_other_state(ctx);
break;
}
}
Goto isn't necessarily evil, and I have to strongly disagree with Denis. Yes, goto might be a bad idea in most cases, but there are uses. The biggest fear with goto is so-called "spaghetti code": untraceable code paths. If you can avoid that, if it will always be clear how the code behaves, and if you don't jump out of the function with a goto, there is nothing against goto. Just use it with caution, and if you are tempted to use it, really evaluate the situation and look for a better solution. If you are unable to find one, goto can be used.
Avoid goto unless the complexity added to avoid it is more confusing.
In practical engineering problems, there's room for goto used very sparingly. Academics and non-engineers wring their hands needlessly over using goto. That said, if you paint yourself into an implementation corner where a lot of goto is the only way out, rethink the solution.
A correctly working solution is usually the primary objective. Making it correct and maintainable (by minimizing complexity) has many life cycle benefits. Make it work first, and then clean it up gradually, preferably by simplifying and removing ugliness.
I don't know your specific code, but is there a reason something like this:
typedef enum {
STATE1, STATE2, STATE3
} myState_e;
void myFsm(void)
{
myState_e State = STATE1;
while(1)
{
switch(State)
{
case STATE1:
State = STATE2;
break;
case STATE2:
State = STATE3;
break;
case STATE3:
State = STATE1;
break;
}
}
}
wouldn't work for you? It doesn't use goto, and is relatively easy to follow.
Edit: All those State = fragments violate DRY, so I might instead do something like:
typedef int (*myStateFn_t)(int OldState, void *ObjP);
int myStateFn_Reset(int OldState, void *ObjP);
int myStateFn_Start(int OldState, void *ObjP);
int myStateFn_Process(int OldState, void *ObjP);
myStateFn_t myStateFns[] = {
#define MY_STATE_RESET 0
myStateFn_Reset,
#define MY_STATE_START 1
myStateFn_Start,
#define MY_STATE_PROCESS 2
myStateFn_Process
};
int myStateFn_Reset(int OldState, void *ObjP)
{
return shouldStart(ObjP) ? MY_STATE_START : MY_STATE_RESET;
}
int myStateFn_Start(int OldState, void *ObjP)
{
resetState(ObjP);
return MY_STATE_PROCESS;
}
int myStateFn_Process(int OldState, void *ObjP)
{
return (process(ObjP) == DONE) ? MY_STATE_RESET : MY_STATE_PROCESS;
}
int stateValid(int StateFnSize, int State)
{
return (State >= 0 && State < StateFnSize);
}
int stateFnRunOne(myStateFn_t *StateFns, int StateFnSize, int State, void *ObjP)
{
return (*StateFns[State])(State, ObjP);
}
void stateFnRun(myStateFn_t *StateFns, int StateFnSize, int CurState, void *ObjP)
{
int NextState;
while(stateValid(StateFnSize, CurState))
{
NextState = stateFnRunOne(StateFns, StateFnSize, CurState, ObjP);
if(! stateValid(StateFnSize, NextState))
LOG_THIS(CurState, NextState);
CurState = NextState;
}
}
which is, of course, much longer than the first attempt (funny thing about DRY). But it's also more robust - failure to return the state from one of the state functions will result in a compiler warning, rather than silently ignore a missing State = in the earlier code.
I would recommend the "Dragon Book": Compilers: Principles, Techniques, and Tools by Aho, Sethi and Ullman. (It is rather expensive to buy, but you will surely find it in a library.) There you will find everything you need to parse strings and build finite automata. There is no place I could find with a goto. Usually the states are a data table and transitions are functions like accept_space().
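A table-driven automaton in that spirit could be sketched as below. This is not code from the book; the states, the input classes and the recognized language (an optional sign followed by digits) are assumptions made for the example.

```c
enum state { S_START, S_SIGN, S_DIGITS, S_ERROR, S_COUNT };
enum input { IN_SIGN, IN_DIGIT, IN_OTHER, IN_COUNT };

/* transitions[current state][input class] gives the next state. */
static const enum state transitions[S_COUNT][IN_COUNT] = {
    /*               IN_SIGN   IN_DIGIT  IN_OTHER */
    /* S_START  */ { S_SIGN,   S_DIGITS, S_ERROR },
    /* S_SIGN   */ { S_ERROR,  S_DIGITS, S_ERROR },
    /* S_DIGITS */ { S_ERROR,  S_DIGITS, S_ERROR },
    /* S_ERROR  */ { S_ERROR,  S_ERROR,  S_ERROR },
};

/* Maps a character to its input class (a column of the table). */
static enum input classify(char c)
{
    if (c == '+' || c == '-')
        return IN_SIGN;
    if (c >= '0' && c <= '9')
        return IN_DIGIT;
    return IN_OTHER;
}

/* Returns 1 if s is an optional sign followed by one or more digits. */
int accepts_signed_integer(const char *s)
{
    enum state st = S_START;

    for (; *s != '\0'; s++)
        st = transitions[st][classify(*s)];
    return st == S_DIGITS;   /* S_DIGITS is the only accepting state */
}
```

The driver loop contains neither goto nor switch; changing the automaton means editing the table, not the control flow.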
I can't see much of a difference between goto and switch. I might prefer switch/while because it gives you a place guaranteed to execute after the switch (where you could throw in logging and reason about your program). With GOTO you just keep jumping from label to label, so to throw in logging you'd have to put it at every label.
But aside from that there shouldn't be much difference. Either way, if you didn't break it up into functions and not every state uses/initializes all local variables you may end up with a mess of almost spaghetti code not knowing which states changed which variables and making it very difficult to debug/reason about.
As an aside, can you maybe parse the string using a regular expression? Most programming languages have libraries that allow using them. The regular expressions often create an FSM as part of their implementation. Generally regular expressions work for non arbitrarily nested items and for everything else there is a parser generator(ANTLR/YACC/LEX). It is generally much easier to maintain a grammar/regex than the underlying state machine. Also you said you were on an internship, and generally they might give you easier work than say a senior developer, so there is a strong chance that a regex may work on the string. Also regular expressions generally aren't emphasized in college so try using Google to read up on them.
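In C, one readily available option is the POSIX regex API from <regex.h> (regcomp, regexec, regfree). Note that this is POSIX, not ISO C, so the sketch below assumes a POSIX system; the wrapper name matches is made up.

```c
#include <regex.h>
#include <stddef.h>

/* Returns 1 if s matches pattern (POSIX extended syntax), 0 if it does
 * not, and -1 if the pattern fails to compile. */
int matches(const char *pattern, const char *s)
{
    regex_t re;
    int rc;

    /* REG_NOSUB: only a yes/no answer is needed, no submatch offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, s, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

For instance, matches("^[+-]?[0-9]+$", "-42") tests for an optional sign followed by digits without any hand-written state machine.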

Is it a good idea to define a variable in a local block for a case of a switch statement?

I have a rather long switch-case statement. Some of the cases are really short and trivial. A few are longer and need some variables that are never used anywhere else, like this:
switch (action) {
case kSimpleAction:
// Do something simple
break;
case kComplexAction: {
int specialVariable = 5;
// Do something complex with specialVariable
} break;
}
The alternative would be to declare that variable before going into the switch like this:
int specialVariable = 5;
switch (action) {
case kSimpleAction:
// Do something simple
break;
case kComplexAction:
// Do something complex with specialVariable
break;
}
This can get rather confusing since it is not clear to which case the variable belongs and it uses some unnecessary memory.
However, I have never seen this usage anywhere else.
Do you think it is a good idea to declare variables locally in a block for a single case?
If specialVariable is not used after the switch block, declare it in the "case" block.
In general, a variable should be declared in the smallest scope in which it is used.
If the switch statement becomes unmanageably huge, you may want to convert to a table of function pointers. By having the code for each case in separate functions, you don't have to worry about variable declaration and definitions.
Another advantage is that you can put each case function into a separate translation unit. This will speed up the build process by only compiling the cases that have changed. Also improves quality by isolating changes to their smallest scope.
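A table of function pointers replacing the switch might look roughly like this. The action names echo the question's kSimpleAction and kComplexAction, but the handlers, their signatures and their return values are hypothetical.

```c
#include <stddef.h>

enum action { ACTION_SIMPLE, ACTION_COMPLEX, ACTION_COUNT };

static int do_simple(int input)
{
    return input + 1;
}

static int do_complex(int input)
{
    int special_variable = 5;   /* local to the only handler that needs it */
    return input * special_variable;
}

/* One handler per action, indexed by the enum value. */
static int (*const handlers[ACTION_COUNT])(int) = {
    [ACTION_SIMPLE]  = do_simple,
    [ACTION_COMPLEX] = do_complex,
};

/* Returns the handler's result, or -1 for an action outside the table. */
int dispatch(int action, int input)
{
    if (action < 0 || action >= ACTION_COUNT || handlers[action] == NULL)
        return -1;
    return handlers[action](input);
}
```

The bounds check in dispatch plays the role a default clause would play in the switch version, and each handler could live in its own translation unit as described above.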
Yes, define variables in the narrowest scope needed.
So example 1 is preferred.
I'm all for
case X:
{
type var;
...;
}
break; // I like to keep breaks outside of the blocks if I can
If the stuff in there gets too complicated and starts getting in the way of your ability to see the entire switch/case as a switch/case then consider moving as much as you can into one or two inline functions that get called by the cases code. This can improve readability without throwing function call overhead in there.
Agree with Max: use the smallest possible scope. That way, when the next person needs to update it, he/she doesn't need to worry about whether the variable is used in other sections of the switch statement.
My own rule for switch statements is that there should be a maximum of a single statement inside each case, excluding a break. This means the statement is either an initialisation, an assignment or a function call. Putting any more complex code in a case is a recipe for disaster - I "fondly" remember all the Windows code I've seen (inspired by Petzold) which processed message parameters in-line in the same case of a windows procedure.
So call a function, and put the variable in there!

Resources