Reviewing some 3rd party C code I came across something like:
switch (state) {
case 0:
if (c=='A') { // open brace
// code...
break; // brace not closed!
case 1:
// code...
break;
} // close brace!
case 2:
// code...
break;
}
Which in the code I was reviewing appeared to be just a typo but I was surprised that it compiled with out error.
Why is this valid C?
What is the effect on the execution of this code compared to closing the brace at the expected place?
Is there any case where this could be of use?
Edit: In the example I looked at all breaks were present (as above) - but answer could also include behaviour if break absent in case 0 or 1.
Not only is it valid, similar structure has been used in real code, e.g., Duff's Device, which is an unrolled loop for copying a buffer:
send(to, from, count)
register short *to, *from;
register count;
{
register n = (count + 7) / 8;
switch(count % 8) {
case 0: do { *to = *from++;
case 7: *to = *from++;
case 6: *to = *from++;
case 5: *to = *from++;
case 4: *to = *from++;
case 3: *to = *from++;
case 2: *to = *from++;
case 1: *to = *from++;
} while(--n > 0);
}
}
Since a switch statement really just computes an address and jumps to it, it's easy to see why it can overlap with other control structures; the lines within other control structures have addresses that can be jump targets, too!
In the case you presented, imagine if there were no switch or breaks in your code. When you've finished executing the then portion of a if statement, you just keep going, so you'd fall through into the case 2:. Now, since you have the switch and break, it matters what break can break out of. According to the MSDN page, “The C break statement”,
The break statement terminates the execution of the nearest enclosing do, for, switch, or while statement in which it appears. Control passes to the statement that follows the terminated statement.
Since the nearest enclosing do, for, switch, or while statement is your switch (notice that if is not included in that list), then if you're inside the then block, you transfer to the outside of the switch statement. What's a bit more interesting, though, is what happens if you enter case 0, but c == 'A' is false. Then the if transfers control to just after the closing brace of the then block, and you start executing the code in case 2.
In C and C++ it is legal to jump into loops and if blocks so long as you don't jump over any variable declarations. You can check this answer for an example using goto, but I don't see why the same ideas wouldn't apply to switch blocks.
The semantics are different than if the } was above case 1 as you would expect.
This code actually says if state == 0 and c != 'A' then go to case 2 since that's where the closing brace of the if statement is. It then processes that code and hits the break statement at the end of the case 2 code.
Related
Currently we are developing a client/server application. In the client code we use a step sequence, performed by a switch/case. The code is working fine, but for me it's seems not quite right.
So here's the code.
while(true)
{
switch(sub_step_1){
case 1: /* Step 1: */
...
sub_step_1++;
break;
case 2: /* Step 2: */
...
sub_step_1++;
break;
case 3: /* Step 3: */
...
sub_step_1++;
break;
case 4: /* Step 4: */
...
sub_step_1=0;
break;
default:
return 0;
}
}
The step sequence is execute inside a function, and at the end it just returns to the main. Since everything works fine, I just wanted to ask if there is a possible way to optimize this sequence, espacially for debugging.
One thing that struck me as wrong is not so important, but your comment.
/* Step 3: */
is not step 3. it is actually step 1.
You dont go from switch case 1 to switch case 2 and so on because theres a break statement.
Just remove the sub_step_1 counter, then breaks, cases and switch, so your function becomes like this:
/* Step 1: */
...
/* Step 2: */
...
/* Step 3: */
...
/* Step 4: */
...
return 0;
Why not just have a sequence of function ?
step1();
step2();
step3();
step4();
Or simply sequential code if you don't want to pass the context.
I came across this syntax on the Wikipedia page on Duff's Device.
The page mentions, that such construct helps with some modulo-divide requirement in this case, particularly 8, as here:
send(to, from, count)
register short *to, *from;
register count;
{
register n = (count + 7) / 8;
switch (count % 8) {
case 0: do { *to = *from++;
case 7: *to = *from++;
case 6: *to = *from++;
case 5: *to = *from++;
case 4: *to = *from++; /* {int somevar =0;} */
case 3: *to = *from++;
case 2: *to = *from++;
case 1: *to = *from++;
} while (--n > 0);
}
}
How does control even reach case 5 when the primary loop block is at case 0?
How is it syntactically correct? Also, why then is variable declaration inside the case statements constrained to block scope only?
[Edit]
Quoting lines from the same page, "...the first edition of The C Programming Language which requires only that the body of the switch be a syntactically valid (compound) statement within which case labels can appear prefixing any sub-statement...".
I believe that's an answer, however, why did the masters chose to do it thus?
That idiom is known as Duff's Device, and depends on the fact that switch/case is a very thin abstraction over goto.
In C, it is permissible to jump from outside to inside a block like that (though a good compiler will warn if doing so will miss any variable initialisations).
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How does Duff's device work?
I am trying to understand how this is working. Any help would be appreciated.
#include<stdio.h>
void duff(int count)
{
int n=(count+7)/8;
printf("n=%d count =%d\n",n,count%8);
switch(count%8){
case 0: do{ printf("case 0\n");
case 7: printf("case 7\n");
case 6: printf("case 6\n");
case 5: printf("case 5\n");
case 4: printf("case 4\n");
case 3: printf("case 3\n");
case 2: printf("case 2\n");
case 1: printf("case 1\n");
}while( --n >0);
}
}
main(){
int count;
scanf("%d",&count);
duff(count);
}
Basically if the switch case evaluates to case statement 2, then the do statment of the while is never executed. But i ran this program and it gave me the output, but unable to explain:
output:
3
n=1 count =3
case 3
case 2
case 1
This is known as duff's device and is used in code optimization techniques to reduce branch instructions. The reason that it works is that by default case statements without breaks fall through to the next case so when you hit case 3, you keep going through to case 2 and case 1.
Both the do and the case "statements" are essentially just "goto labels". They don't add any actual code. They just tell while and switch (respectively) where to jump to. In other words, there is no code for the do to (not) execute.
(That said, it is somewhat remarkable/bizarre that C's grammar allows cases to exist in children of the switch, rather just as direct children of a switch.)
There are no break statements between the cases so the cases fall through. Therefore n=3 causes case 3: case 2: and case 1: to be executed.
I am aware of various switch case opimization techniques, but as per my understanding most of the modern compilers do not care about how you write switch cases, they optimize them anyway.
Here is the issue:
void func( int num)
set = 1,2,3,4,6,7,8,10,11,15
{
if (num is not from set )
regular_action();
else
unusual_stuff();
}
The set would always have values mentioned above or something resembling with many of the elements closely spaced.
E.g.
set = 0,2,3,6,7,8,11,15,27 is another possible value.
The passed no is not from this set most of the times during my program run, but when it is from the set I need to take some actions.
I have tried to simulate the above behavior with following functions just to figure out which way the switch statement should be written. Below functions do not do anything except the switch case - jump tables - comparisons.
I need to determine whether compare_1 is faster or compare_2 is faster. On my dual core machine, compare_2 always looks faster but I am unable to figure out why does this happen? Is the compiler so smart that it optimizes in such cases too?
There is no way of feeling that one function is faster than the other. Do measurements (without the printf) and also compare the assembler that is produced (use the option -S to the compiler).
Here are some suggestions for optimizing a switch statement:
Remove the switch statement
Redesign your code so that a switch statement is not necessary. For example, implementing virtual base methods in a base class. Or using an array.
Filter out common choices. If there are many choices in a range, reduce the choices to the first item in the range (although the compiler may do this automagically for you.)
Keep choices contiguous
This is very easy for the compiler to implement as a single indexed jump table.
Many choices, not contiguous
One method is to implement an associated array (key, function pointer). The code may search the table or for larger tables, they could be implemented as a linked list. Other options are possible.
Few choices, not contiguous
Often implemented by compilers as an if-elseif ladder.
Profiling
The real proof is in setting compiler optimization switches and profiling.
The Assembly Listing
You may want to code up some switch statements and see how the compiler generates the assembly code. See which version generates the optimal assembly code for your situation.
If your set really consists of numbers in the range 0 to 63, use:
#define SET 0x.......ULL
if (num < 64U && (1ULL<<num & SET)) foo();
else bar();
Looking at your comparison functions, the second one is always faster because it is optimized to always execute the default statement. The default statement is execute "in order" as it appears in the switch, so in the second function it is immediately executed. It is very efficiently giving you the same answer for every switch!
Default case must always appear as the last case in a switch. See http://www.tutorialspoint.com/cplusplus/cpp_switch_statement.htm
for example, where it states "A switch statement can have an optional default case, which must appear at the end of the switch. The default case can be used for performing a task when none of the cases is true. No break is needed in the default case."
Here are the functions mentioned above
#define MAX 100000000
void compare_1(void)
{
unsigned long i;
unsigned long j;
printf("%s\n", __FUNCTION__);
for(i=0;i<MAX;i++)
{
j = rand()%100;
switch(j)
{
case 1:
case 2:
case 3:
case 4:
case 6:
case 7:
case 8:
case 10:
case 11:
case 15:
break ;
default:
break ;
}
}
}
void unreg(void)
{
int i;
int j;
printf("%s\n", __FUNCTION__);
for(i=0;i<MAX;i++)
{
j = rand()%100;
switch(j)
{
default:
break ;
case 1:
case 2:
case 3:
case 4:
case 6:
case 7:
case 8:
case 10:
case 11:
case 15:
break ;
}
}
}
Given a simple switch statement
switch (int)
{
case 1 :
{
printf("1\n");
break;
}
case 2 :
{
printf("2\n");
}
case 3 :
{
printf("3\n");
}
}
The absence of a break statement in case 2, implies that execution will continue inside the code for case 3.
This is not an accident; it was designed that way. Why was this decisions made? What benefit does this provide vs. having an automatic break semantic for the blocks? What was the rationale?
Many answers seem to focus on the ability to fall through as the reason for requiring the break statement.
I believe it was simply a mistake, due largely because when C was designed there was not nearly as much experience with how these constructs would be used.
Peter Van der Linden makes the case in his book "Expert C Programming":
We analyzed the Sun C compiler sources
to see how often the default fall
through was used. The Sun ANSI C
compiler front end has 244 switch
statements, each of which has an
average of seven cases. Fall through
occurs in just 3% of all these cases.
In other words, the normal switch
behavior is wrong 97% of the time.
It's not just in a compiler - on the
contrary, where fall through was used
in this analysis it was often for
situations that occur more frequently
in a compiler than in other software,
for instance, when compiling operators
that can have either one or two
operands:
switch (operator->num_of_operands) {
case 2: process_operand( operator->operand_2);
/* FALLTHRU */
case 1: process_operand( operator->operand_1);
break;
}
Case fall through is so widely
recognized as a defect that there's
even a special comment convention,
shown above, that tells lint "this is
really one of those 3% of cases where
fall through was desired."
I think it was a good idea for C# to require an explicit jump statement at the end of each case block (while still allowing multiple case labels to be stacked - as long as there's only a single block of statements). In C# you can still have one case fall through to another - you just have to make the fall thru explicit by jumping to the next case using a goto.
It's too bad Java didn't take the opportunity to break from the C semantics.
In a lot of ways c is just a clean interface to standard assembly idioms. When writing jump table driven flow control, the programmer has the choice of falling through or jumping out of the "control structure", and a jump out requires an explicit instruction.
So, c does the same thing...
To implement Duff's device, obviously:
dsend(to, from, count)
char *to, *from;
int count;
{
int n = (count + 7) / 8;
switch (count % 8) {
case 0: do { *to = *from++;
case 7: *to = *from++;
case 6: *to = *from++;
case 5: *to = *from++;
case 4: *to = *from++;
case 3: *to = *from++;
case 2: *to = *from++;
case 1: *to = *from++;
} while (--n > 0);
}
}
If cases were designed to break implicitly then you couldn't have fallthrough.
case 0:
case 1:
case 2:
// all do the same thing.
break;
case 3:
case 4:
// do something different.
break;
default:
// something else entirely.
If the switch was designed to break out implicitly after every case you wouldn't have a choice about it. The switch-case structure was designed the way it is to be more flexible.
The case statements in a switch statements are simply labels.
When you switch on a value, the switch statement essentially does a goto to the label with the matching value.
This means that the break is necessary to avoid passing through to the code under the next label.
As for the reason why it was implemented this way - the fall-through nature of a switch statement can be useful in some scenarios. For example:
case optionA:
// optionA needs to do its own thing, and also B's thing.
// Fall-through to optionB afterwards.
// Its behaviour is a superset of B's.
case optionB:
// optionB needs to do its own thing
// Its behaviour is a subset of A's.
break;
case optionC:
// optionC is quite independent so it does its own thing.
break;
To allow things like:
switch(foo) {
case 1:
/* stuff for case 1 only */
if (0) {
case 2:
/* stuff for case 2 only */
}
/* stuff for cases 1 and 2 */
case 3:
/* stuff for cases 1, 2, and 3 */
}
Think of the case keyword as a goto label and it comes a lot more naturally.
It eliminates code duplication when several cases need to execute the same code (or the same code in sequence).
Since on the assembly language level it doesn't care whether you break between each one or not there is zero overhead for fall through cases anyways, so why not allow them since they offer significant advantages in certain cases.
I happened to run in to a case of assigning values in vectors to structs: it had to be done in such a manner that if the data vector was shorter than the number of data members in the struct, the rest of the members would remain in their default value. In that case omitting break was quite useful.
switch (nShorts)
{
case 4: frame.leadV1 = shortArray[3];
case 3: frame.leadIII = shortArray[2];
case 2: frame.leadII = shortArray[1];
case 1: frame.leadI = shortArray[0]; break;
default: TS_ASSERT(false);
}
As many here have specified, it's to allow a single block of code to work for multiple cases. This should be a more common occurrence for your switch statements than the "block of code per case" you specify in your example.
If you have a block of code per case without fall-through, perhaps you should consider using an if-elseif-else block, as that would seem more appropriate.