How cases get evaluated in switch statements (C)

I'm currently learning C, and I'm already familiar with basic programming concepts.
I have a question about the switch statement.
For example, in the following code:
for(int i =0 ; i<20; i++){
    switch(i){
        case 0: i+=5; /*label 1*/
        case 1: i+=2; /*label 2*/
        case 5: i+=5; /*label 3*/
        default : i+=4; /*label 4*/
    }
    printf("%d\t",i);
}
the output is 16 21.
That means the case at label 1 is executed first and then, since there is no break, labels 2, 3 and 4 are also executed. My question is: once label 1 has executed and i has been updated to 5, do the other cases check their condition first (whether i is 1 or 5) before executing, or do they simply execute without checking?

It's a very good question, and actually reveals the internals of the switch statement in C and C++, which can sometimes be confused with cascading if-else statements.
The switch statement in C/C++ works as follows:
(1) first it evaluates the expression given as the condition of the switch statement
(2) it stores the result on the stack or in a general-purpose register
(3) using that result, it attempts to jump to the corresponding case statement with as few comparisons as possible, by using a jump table (when one can be built).
It is because of (1) and (2) that the switch you created does not behave the way you may expect: the initial expression is not reevaluated while the case statements execute.
In contrast with cascading if-else statements, your case statements are essentially blocks of instructions compiled in sequential order, referenced by a jump table as mentioned at (3). Once execution reaches a case statement, it automatically cascades into the following case statements unless a break is encountered. The break instructs the compiler to emit a jump past the end of the switch statement, so that the remaining case statements are not executed.
Check out this commented disassembly of your switch statement, just to have a better grip of what's happening under the hood:
0x56555585 <+56>: mov -0x10(%ebp),%eax ;<--- store "i" (the switch condition) into EAX
0x56555588 <+59>: cmp $0x1,%eax ;<--- check "case 1"
0x5655558b <+62>: je 0x5655559a <main+77> ;<--- jump if equal to "case 1"
0x5655558d <+64>: cmp $0x5,%eax ;<--- check "case 5"
0x56555590 <+67>: je 0x5655559e <main+81> ;<--- jump if equal to "case 5"
0x56555592 <+69>: test %eax,%eax ;<--- check "case 0"
0x56555594 <+71>: jne 0x565555a2 <main+85> ;<--- jump to "default" if not equal
0x56555596 <+73>: addl $0x5,-0x10(%ebp) ;<--- case 0
0x5655559a <+77>: addl $0x2,-0x10(%ebp) ;<--- case 1
0x5655559e <+81>: addl $0x5,-0x10(%ebp) ;<--- case 5
0x565555a2 <+85>: addl $0x4,-0x10(%ebp) ;<--- default
Note: this is built with -m32 -O0 gcc options to use 32bit code which is much easier to read, and disable optimizations.
You can clearly see that after the jump is made (to any case statement) there is no further reevaluation of i (-0x10(%ebp)). Also, when the case is executed, it automatically cascades to the next one if no break is used.
Now, you may ask yourself why switch behaves in this odd way. The answer is at (3): it lets the program jump to the corresponding case statement with as few comparisons as possible.
The switch statements in C/C++ show their true strength when the number of case statements really scales up and especially when the spread between the values used for the case statements is constant.
For example, let's assume we have a large switch statement with 100 case values, with a constant spread of 1 between case values and that the switch expression (i) evaluates to 100 (last case in the switch):
switch (i) {
case 1: /*code for case 1*/ break;
case 2: /*code for case 2*/ break;
[...]
case 99: /*code for case 99*/ break;
case 100: /*code for case 100*/ break;
}
If you used cascading if-else statements, reaching the last branch would take 100 comparisons, but this switch can obtain the same result using just a couple of instructions, in this order:
first, the compiler indexes all the case statements in a jump table
second, it evaluates the condition in the switch and stores the result (i.e. fetches i)
third, it calculates the corresponding index in the jump table based on the result (i.e. subtracting 1, the first case value, from i gives index 99)
fourth, it jumps directly to the corresponding case without any further operation
The same will apply if your case values have a spread of 2:
switch (i) {
case 1: /*code for case 1*/ break;
case 3: /*code for case 3*/ break;
[...]
case 99: /*code for case 99*/ break;
case 101: /*code for case 101*/ break;
}
Your compiler may detect this spread too and, after subtracting the first case value (which is 1), divide by 2 to obtain the same index for the jump table.
These complicated inner workings of the switch statement make it a very powerful tool in C/C++ when you want to branch your code based on a value you can only evaluate at run time, and when that value belongs to a set that is evenly spread or, at least, to groups of values with an even spread.
When the case values don't have an even spread, the switch becomes less efficient and starts to perform much as cascading if-else statements would.

In the labeled parts of the switch statement there are no break statements
for(int i =0 ; i<20; i++){
    switch(i){
        case 0: i+=5; /*label 1*/
        case 1: i+=2; /*label 2*/
        case 5: i+=5; /*label 3*/
        default : i+=4; /*label 4*/
    }
    printf("%d\t",i);
}
So when i is equal to 0 (in the first iteration of the loop) then all these statements
case 0: i+=5; /*label 1*/
case 1: i+=2; /*label 2*/
case 5: i+=5; /*label 3*/
default : i+=4; /*label 4*/
are executed sequentially after passing the control to the case label 0: and i becomes equal to 16.
You can imagine the switch statement in the first iteration of the loop the following way
goto Label0;
Label0: i+=5; /*label 1*/
Label1: i+=2; /*label 2*/
Label5: i+=5; /*label 3*/
Labeldefault : i+=4; /*label 4*/
Labels are just labels. They do not execute any code. After evaluating the expression in the switch statement
switch(i)
the control is passed to the corresponding labeled statement and all the following statements are executed sequentially if there is no jump statement.
Then the third expression of the for loop (i++) increments i, which becomes equal to 17.
So in the next iteration of the for loop control passes directly to the default label, and i becomes equal to 21.

Related

Which is better, a function with one return statement or multiple ones? [Embedded C]

While developing code for a microcontroller, I got a warning that my function has multiple return statements.
I can replace them with a single one at the end of the function, but I thought this version was better. Could someone explain which is better, and why?
unsigned char getDays(unsigned char oldDay,unsigned char newDay,unsigned char currentMonth){
    unsigned char xtemp;
    if(oldDay < newDay){ //in the same month
        xtemp = newDay - oldDay ;
        return xtemp ;
    }
    else{
        switch(currentMonth){
            case 2:
            case 4:
            case 6:
            case 8:
            case 9:
            case 0x11:
            case 1:
                xtemp = newDay + 0x31 - oldDay;
                return xtemp ;
                break;
            case 3:
                xtemp = newDay + 0x28-oldDay;
                return xtemp ;
                break;
            case 5:
            case 7:
            case 0x10:
            case 0x12:
                xtemp = newDay+0x30-oldDay;
                return xtemp ;
        }
    }
}
Given that all paths with a return statement calculate a value of xtemp and then finish with return xtemp;, and there are no loops so the flow is clear, I suggest a single return statement would suffice.
That said, I think the number of return statements is the most minor problem with that code.
Your switch statement has no default clause so, if currentMonth is not any of the chosen case values, the function falls off the end (with no return statement at all). That causes the caller to have undefined behaviour if it uses the return value of your function. Having a single return statement at the end would eliminate that problem, if the code is structured so xtemp is always initialised or assigned a value.
I'd also be concerned about readability - a set of magic values, some expressed as decimal and some as hex, increases difficulty for mere mortals to understand the code - which in turn makes it harder to get right. In fact, my hunch is that - because you have used a hex value in at least one place where a decimal value appears to have been intended - that you have not actually got that code working correctly.
Rather than a switch, I'd probably use some carefully constructed if/else if statements.
Thanks a lot for your help. With your help, this is the final code I ended up with:
unsigned char getDays(unsigned char oldDay,unsigned char newDay,unsigned char currentMonth){
    if (oldDay > newDay){
        switch(currentMonth){
            case 0x3: return newDay + 0x28 - oldDay;
            case 0x5: case 0x7: case 0x10: case 0x12: return newDay + 0x30 - oldDay;
            default: return newDay + 0x31 - oldDay;
        }
    }
    return newDay - oldDay; //in the same month
}

Switch case weird scoping

Reviewing some 3rd party C code I came across something like:
switch (state) {
    case 0:
        if (c=='A') { // open brace
            // code...
            break; // brace not closed!
    case 1:
            // code...
            break;
        } // close brace!
    case 2:
        // code...
        break;
}
In the code I was reviewing this appeared to be just a typo, but I was surprised that it compiled without error.
Why is this valid C?
What is the effect on the execution of this code compared to closing the brace at the expected place?
Is there any case where this could be of use?
Edit: In the example I looked at all breaks were present (as above) - but answer could also include behaviour if break absent in case 0 or 1.
Not only is it valid, similar structure has been used in real code, e.g., Duff's Device, which is an unrolled loop for copying a buffer:
send(to, from, count)
register short *to, *from;
register count;
{
register n = (count + 7) / 8;
switch(count % 8) {
case 0: do { *to = *from++;
case 7: *to = *from++;
case 6: *to = *from++;
case 5: *to = *from++;
case 4: *to = *from++;
case 3: *to = *from++;
case 2: *to = *from++;
case 1: *to = *from++;
} while(--n > 0);
}
}
Since a switch statement really just computes an address and jumps to it, it's easy to see why it can overlap with other control structures; the lines within other control structures have addresses that can be jump targets, too!
In the case you presented, imagine if there were no switch or breaks in your code. When you've finished executing the then portion of an if statement, you just keep going, so you'd fall through into the case 2:. Now, since you have the switch and break, it matters what break can break out of. According to the MSDN page, “The C break statement”,
The break statement terminates the execution of the nearest enclosing do, for, switch, or while statement in which it appears. Control passes to the statement that follows the terminated statement.
Since the nearest enclosing do, for, switch, or while statement is your switch (notice that if is not included in that list), then if you're inside the then block, you transfer to the outside of the switch statement. What's a bit more interesting, though, is what happens if you enter case 0, but c == 'A' is false. Then the if transfers control to just after the closing brace of the then block, and you start executing the code in case 2.
In C and C++ it is legal to jump into loops and if blocks so long as you don't jump over any variable declarations. You can check this answer for an example using goto, but I don't see why the same ideas wouldn't apply to switch blocks.
The semantics are different than if the } was above case 1 as you would expect.
This code actually says if state == 0 and c != 'A' then go to case 2 since that's where the closing brace of the if statement is. It then processes that code and hits the break statement at the end of the case 2 code.

C switch statement with do-while interleaved [duplicate]

I am trying to understand how this is working. Any help would be appreciated.
#include <stdio.h>

void duff(int count)
{
    int n = (count + 7) / 8;
    printf("n=%d count =%d\n", n, count % 8);
    switch (count % 8) {
    case 0: do { printf("case 0\n");
    case 7:      printf("case 7\n");
    case 6:      printf("case 6\n");
    case 5:      printf("case 5\n");
    case 4:      printf("case 4\n");
    case 3:      printf("case 3\n");
    case 2:      printf("case 2\n");
    case 1:      printf("case 1\n");
            } while (--n > 0);
    }
}

int main(void)
{
    int count;
    /* Stop if no number could be read, so count is never used uninitialized. */
    if (scanf("%d", &count) != 1)
        return 1;
    duff(count);
    return 0;
}
Basically, if the switch expression selects case 2, then the do of the do-while is never executed. But I ran this program and it produced the output below, which I am unable to explain:
output:
3
n=1 count =3
case 3
case 2
case 1
This is known as Duff's device and is used as a code optimization technique to reduce branch instructions. It works because case statements without breaks fall through to the next case by default, so when you hit case 3, you keep going through case 2 and case 1.
Both the do and the case "statements" are essentially just "goto labels". They don't add any actual code. They just tell while and switch (respectively) where to jump to. In other words, there is no code for the do to (not) execute.
(That said, it is somewhat remarkable/bizarre that C's grammar allows cases to exist in children of the switch, rather than just as direct children of a switch.)
There are no break statements between the cases, so the cases fall through. Therefore count % 8 == 3 causes case 3:, case 2: and case 1: to be executed.

Switch case optimization scenario

I am aware of various switch case optimization techniques, but as per my understanding most modern compilers do not care how you write switch cases; they optimize them anyway.
Here is the issue:
void func( int num)
set = 1,2,3,4,6,7,8,10,11,15
{
if (num is not from set )
regular_action();
else
unusual_stuff();
}
The set would always have values mentioned above or something resembling with many of the elements closely spaced.
E.g.
set = 0,2,3,6,7,8,11,15,27 is another possible value.
Most of the time during my program run the number passed in is not from this set, but when it is from the set I need to take some actions.
I have tried to simulate the above behavior with the following functions, just to figure out which way the switch statement should be written. The functions below do nothing except the switch case (jump tables, comparisons).
I need to determine whether compare_1 or compare_2 is faster. On my dual-core machine compare_2 always looks faster, but I am unable to figure out why. Is the compiler so smart that it optimizes in such cases too?
You cannot judge by feel that one function is faster than the other. Take measurements (without the printf) and also compare the assembler that is produced (use the -S compiler option).
Here are some suggestions for optimizing a switch statement:
Remove the switch statement
Redesign your code so that a switch statement is not necessary. For example, implementing virtual base methods in a base class. Or using an array.
Filter out common choices. If there are many choices in a range, reduce the choices to the first item in the range (although the compiler may do this automagically for you.)
Keep choices contiguous
This is very easy for the compiler to implement as a single indexed jump table.
Many choices, not contiguous
One method is to implement an associative array (key, function pointer). The code may search the table; larger tables could instead be implemented as a linked list. Other options are possible.
Few choices, not contiguous
Often implemented by compilers as an if-elseif ladder.
Profiling
The real proof is in setting compiler optimization switches and profiling.
The Assembly Listing
You may want to code up some switch statements and see how the compiler generates the assembly code. See which version generates the optimal assembly code for your situation.
If your set really consists of numbers in the range 0 to 63, use:
#define SET 0x.......ULL
if (num < 64U && (1ULL<<num & SET)) foo();
else bar();
Looking at your comparison functions, the second one is always faster because it is optimized to always execute the default statement. The default statement is executed "in order" as it appears in the switch, so in the second function it is executed immediately. It is very efficiently giving you the same answer for every switch!
By convention the default case appears as the last case in a switch, although the C language itself allows the default label anywhere in the switch body. See http://www.tutorialspoint.com/cplusplus/cpp_switch_statement.htm
for example, where it states "A switch statement can have an optional default case, which must appear at the end of the switch. The default case can be used for performing a task when none of the cases is true. No break is needed in the default case."
Here are the functions mentioned above
#define MAX 100000000
void compare_1(void)
{
unsigned long i;
unsigned long j;
printf("%s\n", __FUNCTION__);
for(i=0;i<MAX;i++)
{
j = rand()%100;
switch(j)
{
case 1:
case 2:
case 3:
case 4:
case 6:
case 7:
case 8:
case 10:
case 11:
case 15:
break ;
default:
break ;
}
}
}
void compare_2(void)
{
int i;
int j;
printf("%s\n", __FUNCTION__);
for(i=0;i<MAX;i++)
{
j = rand()%100;
switch(j)
{
default:
break ;
case 1:
case 2:
case 3:
case 4:
case 6:
case 7:
case 8:
case 10:
case 11:
case 15:
break ;
}
}
}

Why was the switch statement designed to need a break?

Given a simple switch statement
switch (x)
{
    case 1 :
    {
        printf("1\n");
        break;
    }
    case 2 :
    {
        printf("2\n");
    }
    case 3 :
    {
        printf("3\n");
    }
}
The absence of a break statement in case 2, implies that execution will continue inside the code for case 3.
This is not an accident; it was designed that way. Why was this decision made? What benefit does this provide compared with automatic break semantics for the blocks? What was the rationale?
Many answers seem to focus on the ability to fall through as the reason for requiring the break statement.
I believe it was simply a mistake, due largely to the fact that when C was designed there was not nearly as much experience with how these constructs would be used.
Peter van der Linden makes the case in his book "Expert C Programming":
We analyzed the Sun C compiler sources to see how often the default fall through was used. The Sun ANSI C compiler front end has 244 switch statements, each of which has an average of seven cases. Fall through occurs in just 3% of all these cases.
In other words, the normal switch behavior is wrong 97% of the time. It's not just in a compiler - on the contrary, where fall through was used in this analysis it was often for situations that occur more frequently in a compiler than in other software, for instance, when compiling operators that can have either one or two operands:
switch (operator->num_of_operands) {
case 2: process_operand( operator->operand_2);
/* FALLTHRU */
case 1: process_operand( operator->operand_1);
break;
}
Case fall through is so widely recognized as a defect that there's even a special comment convention, shown above, that tells lint "this is really one of those 3% of cases where fall through was desired."
I think it was a good idea for C# to require an explicit jump statement at the end of each case block (while still allowing multiple case labels to be stacked - as long as there's only a single block of statements). In C# you can still have one case fall through to another - you just have to make the fall thru explicit by jumping to the next case using a goto.
It's too bad Java didn't take the opportunity to break from the C semantics.
In a lot of ways C is just a clean interface to standard assembly idioms. When writing jump-table-driven flow control, the programmer has the choice of falling through or jumping out of the "control structure", and a jump out requires an explicit instruction.
So, C does the same thing...
To implement Duff's device, obviously:
dsend(to, from, count)
char *to, *from;
int count;
{
int n = (count + 7) / 8;
switch (count % 8) {
case 0: do { *to = *from++;
case 7: *to = *from++;
case 6: *to = *from++;
case 5: *to = *from++;
case 4: *to = *from++;
case 3: *to = *from++;
case 2: *to = *from++;
case 1: *to = *from++;
} while (--n > 0);
}
}
If cases were designed to break implicitly then you couldn't have fallthrough.
case 0:
case 1:
case 2:
// all do the same thing.
break;
case 3:
case 4:
// do something different.
break;
default:
// something else entirely.
If the switch was designed to break out implicitly after every case you wouldn't have a choice about it. The switch-case structure was designed the way it is to be more flexible.
The case statements in a switch statement are simply labels.
When you switch on a value, the switch statement essentially does a goto to the label with the matching value.
This means that the break is necessary to avoid passing through to the code under the next label.
As for the reason why it was implemented this way - the fall-through nature of a switch statement can be useful in some scenarios. For example:
case optionA:
// optionA needs to do its own thing, and also B's thing.
// Fall-through to optionB afterwards.
// Its behaviour is a superset of B's.
case optionB:
// optionB needs to do its own thing
// Its behaviour is a subset of A's.
break;
case optionC:
// optionC is quite independent so it does its own thing.
break;
To allow things like:
switch(foo) {
case 1:
/* stuff for case 1 only */
if (0) {
case 2:
/* stuff for case 2 only */
}
/* stuff for cases 1 and 2 */
case 3:
/* stuff for cases 1, 2, and 3 */
}
Think of the case keyword as a goto label and it comes a lot more naturally.
It eliminates code duplication when several cases need to execute the same code (or the same code in sequence).
At the assembly language level it makes no difference whether you break between each one or not, so there is zero overhead for fall-through cases anyway. Why not allow them, since they offer significant advantages in certain cases?
I happened to run into a case of assigning values from vectors to structs: it had to be done in such a manner that if the data vector was shorter than the number of data members in the struct, the rest of the members would keep their default values. In that case omitting break was quite useful.
switch (nShorts)
{
case 4: frame.leadV1 = shortArray[3];
case 3: frame.leadIII = shortArray[2];
case 2: frame.leadII = shortArray[1];
case 1: frame.leadI = shortArray[0]; break;
default: TS_ASSERT(false);
}
As many here have specified, it's to allow a single block of code to work for multiple cases. This should be a more common occurrence for your switch statements than the "block of code per case" you specify in your example.
If you have a block of code per case without fall-through, perhaps you should consider using an if-elseif-else block, as that would seem more appropriate.
