Why was the switch statement designed to need a break? - c

Given a simple switch statement
switch (int)
{
case 1 :
{
printf("1\n");
break;
}
case 2 :
{
printf("2\n");
}
case 3 :
{
printf("3\n");
}
}
The absence of a break statement in case 2, implies that execution will continue inside the code for case 3.
This is not an accident; it was designed that way. Why was this decisions made? What benefit does this provide vs. having an automatic break semantic for the blocks? What was the rationale?

Many answers seem to focus on the ability to fall through as the reason for requiring the break statement.
I believe it was simply a mistake, due largely because when C was designed there was not nearly as much experience with how these constructs would be used.
Peter Van der Linden makes the case in his book "Expert C Programming":
We analyzed the Sun C compiler sources
to see how often the default fall
through was used. The Sun ANSI C
compiler front end has 244 switch
statements, each of which has an
average of seven cases. Fall through
occurs in just 3% of all these cases.
In other words, the normal switch
behavior is wrong 97% of the time.
It's not just in a compiler - on the
contrary, where fall through was used
in this analysis it was often for
situations that occur more frequently
in a compiler than in other software,
for instance, when compiling operators
that can have either one or two
operands:
switch (operator->num_of_operands) {
case 2: process_operand( operator->operand_2);
/* FALLTHRU */
case 1: process_operand( operator->operand_1);
break;
}
Case fall through is so widely
recognized as a defect that there's
even a special comment convention,
shown above, that tells lint "this is
really one of those 3% of cases where
fall through was desired."
I think it was a good idea for C# to require an explicit jump statement at the end of each case block (while still allowing multiple case labels to be stacked - as long as there's only a single block of statements). In C# you can still have one case fall through to another - you just have to make the fall thru explicit by jumping to the next case using a goto.
It's too bad Java didn't take the opportunity to break from the C semantics.

In a lot of ways c is just a clean interface to standard assembly idioms. When writing jump table driven flow control, the programmer has the choice of falling through or jumping out of the "control structure", and a jump out requires an explicit instruction.
So, c does the same thing...

To implement Duff's device, obviously:
dsend(to, from, count)
char *to, *from;
int count;
{
int n = (count + 7) / 8;
switch (count % 8) {
case 0: do { *to = *from++;
case 7: *to = *from++;
case 6: *to = *from++;
case 5: *to = *from++;
case 4: *to = *from++;
case 3: *to = *from++;
case 2: *to = *from++;
case 1: *to = *from++;
} while (--n > 0);
}
}

If cases were designed to break implicitly then you couldn't have fallthrough.
case 0:
case 1:
case 2:
// all do the same thing.
break;
case 3:
case 4:
// do something different.
break;
default:
// something else entirely.
If the switch was designed to break out implicitly after every case you wouldn't have a choice about it. The switch-case structure was designed the way it is to be more flexible.

The case statements in a switch statements are simply labels.
When you switch on a value, the switch statement essentially does a goto to the label with the matching value.
This means that the break is necessary to avoid passing through to the code under the next label.
As for the reason why it was implemented this way - the fall-through nature of a switch statement can be useful in some scenarios. For example:
case optionA:
// optionA needs to do its own thing, and also B's thing.
// Fall-through to optionB afterwards.
// Its behaviour is a superset of B's.
case optionB:
// optionB needs to do its own thing
// Its behaviour is a subset of A's.
break;
case optionC:
// optionC is quite independent so it does its own thing.
break;

To allow things like:
switch(foo) {
case 1:
/* stuff for case 1 only */
if (0) {
case 2:
/* stuff for case 2 only */
}
/* stuff for cases 1 and 2 */
case 3:
/* stuff for cases 1, 2, and 3 */
}
Think of the case keyword as a goto label and it comes a lot more naturally.

It eliminates code duplication when several cases need to execute the same code (or the same code in sequence).
Since on the assembly language level it doesn't care whether you break between each one or not there is zero overhead for fall through cases anyways, so why not allow them since they offer significant advantages in certain cases.

I happened to run in to a case of assigning values in vectors to structs: it had to be done in such a manner that if the data vector was shorter than the number of data members in the struct, the rest of the members would remain in their default value. In that case omitting break was quite useful.
switch (nShorts)
{
case 4: frame.leadV1 = shortArray[3];
case 3: frame.leadIII = shortArray[2];
case 2: frame.leadII = shortArray[1];
case 1: frame.leadI = shortArray[0]; break;
default: TS_ASSERT(false);
}

As many here have specified, it's to allow a single block of code to work for multiple cases. This should be a more common occurrence for your switch statements than the "block of code per case" you specify in your example.
If you have a block of code per case without fall-through, perhaps you should consider using an if-elseif-else block, as that would seem more appropriate.

Related

PIC16f877a switch not reading correctly

I am having a problem with a switch case when using the UART functions. I receive data and store it into the eeprom. I think call a switch statement to see what was sent. I read the eeprom and the information is right but I am just not able to read the right one. It always comes base as an error (default case). I am using Hi-tech C compiler.
unsigned char tempVal;
tempVal = eeprom_read(cmdByteAddr);
switch(tempVal){
//Get temperature
case 30:
writeByte('T');
break;
//Get temp high
case 31:
writeByte('T');
writeByte('H');
break;
//Get temp low
case 32:
writeByte('T');
writeByte('L');
break;
//Get humidity
case 41:
writeByte('H');
break;
//Get humidity high
case 42:
writeByte('H');
writeByte('H');
break;
//Get humidity low
case 43:
writeByte('H');
writeByte('L');
break;
//Error
default:
writeByte('E');
writeByte(eeprom_read(cmdByteAddr));
break;
}
The value returned from eeprom_read() is not one of your cases. The switch() is working correctly. Adjust code to present a more meaningful error using the same switch variable and not another call to eeprom_read().
default:
writeByte('E');
writeByte(tempVal);
break; // Not sure why you want `break` here.
If you still get unsatisfactory results, try unsigned tempVal. Sometimes a compiler get confused, although it should not, on sub-int sized data. You may need to writeUnsigned(tempVal) or its equivalent.
You may want to print cmdByteAddr also. Maybe it is outside the EE range.

For loop is not incrementing

I have been working on this code class today and assure you I have gone through it a number of times. For some reason whenever I set my breakpoints to determine the value of "channelsel" all I get is "0". I never get 1,2,3 or 4 (my MAXCHANNELS is 5).
I'm using: P18F45K22 microcontroller, and mplab c18.
Please take a look at the following code, and thank you in advance
int channelsel = 0;
for (channelsel = 0; channelsel < MAXCHANNELS; channelsel++)
{
switch(channelsel)
{
case 0:
SetChanADC(ADC_CH0);
break;
case 1:
SetChanADC(ADC_CH1);
break;
case 2:
SetChanADC(ADC_CH2);
break;
case 3:
SetChanADC(ADC_CH3);
break;
case 4:
SetChanADC(ADC_CH4);
break;
default:
SetChanADC(ADC_CH0);
break;
}
ConvertADC();
while(BusyADC() == TRUE) Delay1TCY();
sampledValue = ReadADC();
setCurrentTemperatureForChannel(channelsel, sampledValue);
sprintf (buf, "current Temp of channel %i is %x \n\r", channelsel, sampledValue);
puts1USART(buf);
Delay10KTCYx(10);
}
Declare channelsel as volatile
volatile int channelsel
It is quite likely that your compiler is optimizing away the rest of the statements so that they are not even in the disassembly. When dealing with values that update extremely fast or conditional statements who are in close proximity to the the control values declaration and assignment, volatile tells the compiler that we always want a fresh value for this variable and to take any shortcut. Variables that depend on IO should always be declared volatile and cases like this are good candidates for its use. Compilers are all different and your mileage may vary.
If you are sure that your hardware is configured correctly, this would be my suggestion. If in doubt, please post your disassembled code for this segment.
I have been working on PIC18, its a bug that my co-worker discovered, for loops don't work with the c18 compiler, if you change it to a while loop it will work fine.

switch case with two variables

I'm seeing in a sample code of TI the following switch case,
I was wondering what is the meaning of the second variable that the switch argument receives,
__interrupt void Timer_A(void)
{
switch (TAIV, 10) // Efficient switch-implementation
{
case 2: break; // TACCR1 not used
case 4: break; // TACCR2 not used
case 10: P1OUT ^= 0x01; // overflow
break;
}
}
my guess is that there is a priority to first check the case value of "10" but I'm not really sure.
I think there is an intrinsic call missing:
switch (__even_in_range(TAIV, 10))
{
__even_in_range is an intrinsic used for MSP-430 mcu. It is provided by both TI compiler cl430 for MSP-430 and IAR compiler for MSP-430. It requires two arguments, the interrupt vector register and the last value in the allowed range, which in this example is 10. The intrinsic is used to help the compiler to generate efficient code.
See IAR for MSP-430 compiler documentation which gives this example in page 25:
#pragma vector=TIMERA1_VECTOR
__interrupt void Timer_A1_ISR(void)
{
switch (__even_in_range(TAIV, 10))
{
case 2: P1POUT ˆ= 0x04;
break;
case 4: P1POUT ˆ= 0x02;
break;
case 10: P1POUT ˆ= 0x01;
break;
}
}
and says:
The effect of the intrinsic function is that the generated code can only handle even values within the given range, which is exactly what is required in this case as the interrupt vector register for Timer A can only be 0, 2, 4, 6, 8, or 10.
The description of __even_in_range in page 237 says:
Instructs the compiler to rely on the specified value being even and within the specified range. The code will be generated accordingly and will only work if the requirement is fulfilled
There is no multi-argument switch in C. An errant refactorer has used the comma operator, which, given its left to right associativity, yields an expression equal to 10.
Your code reduces to switch (10), notwithstanding that TAIV is evaluated and might be doing something useful (a macro perhaps).
Comma operator back in action.
It boils down to case 10.

Switch case weird scoping

Reviewing some 3rd party C code I came across something like:
switch (state) {
case 0:
if (c=='A') { // open brace
// code...
break; // brace not closed!
case 1:
// code...
break;
} // close brace!
case 2:
// code...
break;
}
Which in the code I was reviewing appeared to be just a typo but I was surprised that it compiled with out error.
Why is this valid C?
What is the effect on the execution of this code compared to closing the brace at the expected place?
Is there any case where this could be of use?
Edit: In the example I looked at all breaks were present (as above) - but answer could also include behaviour if break absent in case 0 or 1.
Not only is it valid, similar structure has been used in real code, e.g., Duff's Device, which is an unrolled loop for copying a buffer:
send(to, from, count)
register short *to, *from;
register count;
{
register n = (count + 7) / 8;
switch(count % 8) {
case 0: do { *to = *from++;
case 7: *to = *from++;
case 6: *to = *from++;
case 5: *to = *from++;
case 4: *to = *from++;
case 3: *to = *from++;
case 2: *to = *from++;
case 1: *to = *from++;
} while(--n > 0);
}
}
Since a switch statement really just computes an address and jumps to it, it's easy to see why it can overlap with other control structures; the lines within other control structures have addresses that can be jump targets, too!
In the case you presented, imagine if there were no switch or breaks in your code. When you've finished executing the then portion of a if statement, you just keep going, so you'd fall through into the case 2:. Now, since you have the switch and break, it matters what break can break out of. According to the MSDN page, “The C break statement”,
The break statement terminates the execution of the nearest enclosing do, for, switch, or while statement in which it appears. Control passes to the statement that follows the terminated statement.
Since the nearest enclosing do, for, switch, or while statement is your switch (notice that if is not included in that list), then if you're inside the then block, you transfer to the outside of the switch statement. What's a bit more interesting, though, is what happens if you enter case 0, but c == 'A' is false. Then the if transfers control to just after the closing brace of the then block, and you start executing the code in case 2.
In C and C++ it is legal to jump into loops and if blocks so long as you don't jump over any variable declarations. You can check this answer for an example using goto, but I don't see why the same ideas wouldn't apply to switch blocks.
The semantics are different than if the } was above case 1 as you would expect.
This code actually says if state == 0 and c != 'A' then go to case 2 since that's where the closing brace of the if statement is. It then processes that code and hits the break statement at the end of the case 2 code.

Switch case optimization scenario

I am aware of various switch case opimization techniques, but as per my understanding most of the modern compilers do not care about how you write switch cases, they optimize them anyway.
Here is the issue:
void func( int num)
set = 1,2,3,4,6,7,8,10,11,15
{
if (num is not from set )
regular_action();
else
unusual_stuff();
}
The set would always have values mentioned above or something resembling with many of the elements closely spaced.
E.g.
set = 0,2,3,6,7,8,11,15,27 is another possible value.
The passed no is not from this set most of the times during my program run, but when it is from the set I need to take some actions.
I have tried to simulate the above behavior with following functions just to figure out which way the switch statement should be written. Below functions do not do anything except the switch case - jump tables - comparisons.
I need to determine whether compare_1 is faster or compare_2 is faster. On my dual core machine, compare_2 always looks faster but I am unable to figure out why does this happen? Is the compiler so smart that it optimizes in such cases too?
There is no way of feeling that one function is faster than the other. Do measurements (without the printf) and also compare the assembler that is produced (use the option -S to the compiler).
Here are some suggestions for optimizing a switch statement:
Remove the switch statement
Redesign your code so that a switch statement is not necessary. For example, implementing virtual base methods in a base class. Or using an array.
Filter out common choices. If there are many choices in a range, reduce the choices to the first item in the range (although the compiler may do this automagically for you.)
Keep choices contiguous
This is very easy for the compiler to implement as a single indexed jump table.
Many choices, not contiguous
One method is to implement an associated array (key, function pointer). The code may search the table or for larger tables, they could be implemented as a linked list. Other options are possible.
Few choices, not contiguous
Often implemented by compilers as an if-elseif ladder.
Profiling
The real proof is in setting compiler optimization switches and profiling.
The Assembly Listing
You may want to code up some switch statements and see how the compiler generates the assembly code. See which version generates the optimal assembly code for your situation.
If your set really consists of numbers in the range 0 to 63, use:
#define SET 0x.......ULL
if (num < 64U && (1ULL<<num & SET)) foo();
else bar();
Looking at your comparison functions, the second one is always faster because it is optimized to always execute the default statement. The default statement is execute "in order" as it appears in the switch, so in the second function it is immediately executed. It is very efficiently giving you the same answer for every switch!
Default case must always appear as the last case in a switch. See http://www.tutorialspoint.com/cplusplus/cpp_switch_statement.htm
for example, where it states "A switch statement can have an optional default case, which must appear at the end of the switch. The default case can be used for performing a task when none of the cases is true. No break is needed in the default case."
Here are the functions mentioned above
#define MAX 100000000
void compare_1(void)
{
unsigned long i;
unsigned long j;
printf("%s\n", __FUNCTION__);
for(i=0;i<MAX;i++)
{
j = rand()%100;
switch(j)
{
case 1:
case 2:
case 3:
case 4:
case 6:
case 7:
case 8:
case 10:
case 11:
case 15:
break ;
default:
break ;
}
}
}
void unreg(void)
{
int i;
int j;
printf("%s\n", __FUNCTION__);
for(i=0;i<MAX;i++)
{
j = rand()%100;
switch(j)
{
default:
break ;
case 1:
case 2:
case 3:
case 4:
case 6:
case 7:
case 8:
case 10:
case 11:
case 15:
break ;
}
}
}

Resources