Why case: always requires constant expression while if() doesn't? - c

This may be a duplicate, but I couldn't find one.
Suppose I have following C code :
int a;
printf("Enter number :");
scanf("%d",&a); // suppose entered only an integer
// ignoring return value of scanf()
I got a case to check whether a is zero or non-zero.
if(a)
    printf("%d is non-zero",a);
else
    printf("%d is zero",a);
Everything works with if-else, and I also know the other if-else variations that achieve this. The problem comes with switch-case: it is said that anything we can do with if-else can also be implemented with switch-case, yet the following code fails to compile.
switch(a)
{
    case a:
        printf("%d is non-zero",a);
        break;
    default:
        printf("%d is zero",a);
        break;
}
I also know that reversing the cases, as below, works and gives me my answer.
switch(a)
{
    case 0:
        printf("%d is zero",a);
        break;
    default:
        printf("%d is non-zero",a);
        break;
}
But the question is: why? Why is if(a) valid while case a: is not? Is switch-case a compile-time operation and if() a run-time one?

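The reason is that switch cases can be implemented as jump tables (typically using an unconditional branch through a table of addresses). So the case values have to be resolved at compile time.
This can make them faster than a chain of ifs, particularly when there are many densely packed case values, although an optimizing compiler may well generate similar code for both.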
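To make the jump-table idea concrete, here is a minimal sketch. The command codes and both function names are made up for illustration, and a plain lookup table stands in for the table of branch targets a compiler might emit. The point is only that constant, densely packed labels let the selection collapse into one bounds check plus one indexed access.

```c
#include <stddef.h>

/* Hypothetical example: map small, dense command codes to names. */
const char *name_with_switch(int code)
{
    switch (code) {
    case 0:  return "stop";
    case 1:  return "start";
    case 2:  return "pause";
    case 3:  return "resume";
    default: return "unknown";
    }
}

/* Roughly what a table-based implementation amounts to:
   one range check, then one indexed load, no chain of comparisons. */
const char *name_with_table(int code)
{
    static const char *const names[] = { "stop", "start", "pause", "resume" };

    if (code < 0 || (size_t)code >= sizeof names / sizeof names[0])
        return "unknown";
    return names[code];
}
```

Both functions return the same result for every input. Whether a given compiler really builds a table for the switch version depends on the compiler and its optimization settings.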

Case expressions must be constants. a is a variable, so it is not allowed. 0 is a constant, so that's fine. Only allowing constant expressions means that it is easier for the compiler to optimize the code.
The expression for the condition of an if statement has no such constraint.
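If the goal is to keep the "non-zero first" ordering from the question, one possible workaround is to switch on the result of a comparison instead of on a itself. A sketch, assuming a hypothetical helper named describe: the comparison a != 0 always yields the int 0 or 1, so the labels can stay constant.

```c
/* Returns "non-zero" or "zero", mirroring if (a) ... else ... */
const char *describe(int a)
{
    switch (a != 0) {          /* a != 0 evaluates to 1 or 0 */
    case 1:
        return "non-zero";
    default:
        return "zero";
    }
}
```

The labels are still constant expressions; only the controlling expression changed.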

As others have said, it's the way the language was defined.
If you have
int x, y, z;
int a;
... some code calculates x, y, z and a ...
switch(a)
{
    case x:
        .. do stuff here ...
        break;
    case y:
        .. some more stuff ...
        break;
    case z:
        ... another bit of code ....
        break;
}
the compiler cannot figure out beforehand, at compile time, where a should go if it's 1, 2, 3, 99, 465 or 5113212. So the code here is no more efficient than if we did
if (a == x) ... do stuff here ...
else if (a == y) ... some more stuff
else if (a == z) ... another bit of code
Further, what if x and y have the same value? Do we want BOTH "do stuff" and "some more stuff" to be executed, or just one, and if so which one: the first or the second? What if the compiler reorders the comparisons because a different order is more efficient?
Switch is mainly intended for when you have a lot of choices of something, but each choice is known when you build the code. If that's not the case, you need something else.
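When the values to compare against are only known at run time, the if-else chain is the "something else". A small sketch, with a hypothetical function name and return codes, that also shows how the chain answers the duplicate-value question: the comparisons run in source order, so the first match wins.

```c
/* Returns 1, 2 or 3 for the first of x, y, z that equals a, or 0 if none. */
int first_match(int a, int x, int y, int z)
{
    if (a == x)
        return 1;      /* "do stuff here" */
    else if (a == y)
        return 2;      /* "some more stuff" */
    else if (a == z)
        return 3;      /* "another bit of code" */
    return 0;          /* no label matched */
}
```

With x and y both equal to a, only the first branch runs, which is exactly the ordering guarantee a switch with variable labels would have to define.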

Additional information from Wikipedia that I want to share:
If the range of input values is identifiably 'small' and has only a few gaps, some
compilers that incorporate an optimizer may actually implement the switch statement as
a jump table or an array of indexed function pointers instead of a lengthy series of
conditional instructions. This allows the switch statement to determine instantly what
branch to execute without having to go through a list of comparisons.

It's a design decision by the creators of the language. If case labels are constant, the compiler can optimize some cases by using a jump table. If they are not, the code will be equivalent to the multi-way if statement anyway, and the potential improvement goes away.
There is no problem defining a switch statement with variable case labels, or even different conditions for each branch, it is just that the designers of C didn't do that. Likely because they didn't see that as an advantage for the code they were writing.
The construct exists in other languages, like the COBOL I sometimes use. There it is not unusual to have a degenerated version like:
EVALUATE TRUE
WHEN x IS EQUAL TO 7
Do something
WHEN y IS LESS THAN 12
Do something else
WHEN z
Do yet another thing
END-EVALUATE
Here we have the if-else if-else chain masked as a switch (EVALUATE), which works by evaluating the conditions in order until it matches the first value.
In C the designers didn't want this, because it offers absolutely no performance advantage over the chained if-statements. On the other hand, if we require that all the conditions are constants...

Beyond the compile-time/jump-table issue, if and switch are not the same, and even if case accepted a variable, the two pieces of code wouldn't behave the same. The body of an if is executed if and only if the condition expression evaluates to a non-zero value, while a case is entered if and only if the controlling expression and the label have the same value. So a hypothetical case a: inside switch(a) would always match, even when a is zero.
There is a big difference between if-then-else and switch statements: remember that breaks are not mandatory, and execution falls through all the following cases if nothing stops it. This behavior is very similar to a jump table, since execution inside a switch simply jumps somewhere and continues until it finds a break. This use is rare, but it can be useful and easier to write than the if-then-else version.
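A short sketch of deliberate fall-through, using a made-up function that records which stages ran as a bitmask. Entering at an earlier label runs every later stage too, which would take nested or repeated conditions to express with if-else.

```c
/* Returns a bitmask of the stages executed when entering at `start`:
   bit 0 = stage 1, bit 1 = stage 2, bit 2 = stage 3. */
unsigned stages_run(int start)
{
    unsigned mask = 0;

    switch (start) {
    case 1:
        mask |= 1u;    /* falls through */
    case 2:
        mask |= 2u;    /* falls through */
    case 3:
        mask |= 4u;
        break;
    default:
        break;         /* unknown stage: nothing runs */
    }
    return mask;
}
```

Because accidental fall-through is a common mistake, GCC and Clang offer a -Wimplicit-fallthrough warning; marking the intended cases, as the comments do here, keeps the two apart for readers.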
The standard requires labels to be compile-time constants, and as other people have already said, the idea behind it is a jump table for performance. Even if it's not mandatory (the C standard needs to be flexible), the C99 rationale document seems to confirm this:
Case ranges of the form, lo .. hi, were seriously considered, but ultimately not adopted in the Standard on the grounds that it added no new capability, just a problematic coding convenience. The construct seems to promise more than it could be mandated to deliver:
A great deal of code or jump table space might be generated for an innocent-looking case range such as 0 .. 65535.
The range 'A' .. 'Z' would specify all the integers between the character code for
“upper-case-A” and that for “upper-case-Z”. In some common character sets this range
would include non-alphabetic characters, and in others it might not include all the
alphabetic characters, especially in non-English character sets.
Wikipedia has an article about jump tables.

Related

C case number+1: need brackets?

Programming in C I found out it was convenient in a switch-case to make little groups of cases by giving them the same name and add a number to it like:
case initiating:
break;
case (initiating+1):
break;
etc etc.
Currently I am still using brackets around (initiating+1). But I wonder, do I have to do that?
would
case (initiating+2):
work?
I could not really find an answer.
As you can see here a switch requires a constant expression. Every label should be known at compile time. Brackets are not required but can improve readability. Be sure that you really need (initiating + 2). 2 is a magic number and does not provide any additional information to the reader of your program. Using an enum will give you the same result but better readability.
The best way to verify if this works is by simply writing an example down and compiling it.
If initiating is a constant it should work.
Case labels are not required to be enclosed in parentheses, even if they are expressions containing arithmetic operations.
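A minimal sketch to confirm this, assuming initiating is an enumeration constant (spelled INITIATING here, with a made-up value of 10). Both the bare and the parenthesised forms are integer constant expressions, so both compile.

```c
enum { INITIATING = 10 };   /* hypothetical base value */

/* Returns the offset from INITIATING for the three grouped cases, else -1. */
int classify(int state)
{
    switch (state) {
    case INITIATING:
        return 0;
    case INITIATING + 1:      /* no parentheses needed */
        return 1;
    case (INITIATING + 2):    /* parentheses allowed, same meaning */
        return 2;
    default:
        return -1;
    }
}
```

If initiating were an ordinary int variable, even a const one, the labels would not be constant expressions in C and the switch would be rejected. Giving each grouped state its own enumerator name avoids the magic offsets altogether.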

C: Empty case before default, useless?

In a program, I saw code like this (simplified):
switch (x){
    case 1:
        //dostuff
        break;
    /*___________________*/
    //Here it is important
    case 2:
    default:
        //dostuff
        break;
}
Now I was wondering why someone writes a case and leaves it empty before the default case.
(Clearly it would make sense before another case).
I know that in C, there is a fallthrough if there is no break, so if x is 2, the program runs in the case 2: part, and directly falls through to the default-case.
So is the case 2:-case useless, since there is no code in it, and default will be done also without the label, so the same things are done with and without the case?
Is there a reason to write code like this (like easier modification when maintaining, but in my opinion not really relevant), or did the programmer just not remove it by mistake?
I have used switch several times in different languages, but never would have needed such code...
There is no reason for it. The explicit case 2 could be an attempt of writing self-documenting code, but here it doesn't really add anything, as the code lacks meaningful comments that explain what's unique with case 2.
Sometimes you could write code like this to explicitly document to the reader that you have considered all possible values that a variable can have. For example such self-documenting code might make sense with enums.
In this case, it really just looks like it is code still in development. Or it's some sloppily written left-overs that made it to the production code.
Last time I used MISRA C (embedded C rules for vehicles), all switch statements had to have a default clause. Potentially, this could be a reason for what you are seeing, as this would mean that ALL values passed into the switch would do something. Admittedly, that would mean the case 2 is redundant, but it might make things clearer when reading the code as a whole. It could also be some sort of embedded compiler optimisation (sometimes embedded compilers generate more efficient code when given a switch, rather than several ifs).
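Here is a sketch of the enum situation where such an "empty" label can earn its place. The enum and function names are invented for illustration. Listing every enumerator shows the reader that GREEN and BLUE were considered and deliberately share the default handling, while default still catches out-of-range values.

```c
enum color { RED, GREEN, BLUE };

/* Returns 1 for warm colors, 0 for everything else. */
int is_warm(enum color c)
{
    switch (c) {
    case RED:
        return 1;
    case GREEN:     /* considered: handled like the default */
    case BLUE:      /* considered: handled like the default */
    default:        /* also covers values outside the enum */
        return 0;
    }
}
```

Removing the GREEN and BLUE labels would not change the behavior, which matches the answers above: the labels document intent rather than alter control flow.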

In C language, what is the best practice to check return value of a function for a branching statement?

I'm trying to have an embedded software development point of view, and I'd like to ask which one is better to go with, and what are the possible advantages and disadvantages?
bool funct(){
bool retVal = 0;
//do something
return retVal;
}
//First Choice
if(funct()){
//do something
}
//Second Choice
bool retVal = funct();
if(retVal)
{
//do something
}
Either is probably OK in this example. However, the second has a slight advantage when debugging: when stepping through the code you will know whether the condition is true before the branch is taken, and you can coerce the variable to a different value if you want to test the alternate path. Being able to see the result of a call after the event is useful in any case during debugging.
In more complex expressions the approach may be more important, for example in:
if( x() || y() ) ...
if x() returns true, then y() will not be evaluated, which may or may not be desirable if y() has side effects, so the semantics of that are not the same as:
bool xx = x() ;
bool yy = y() ;
if( xx || yy ) ...
Using explicit assignment allows the required semantics to be clearly expressed.
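The difference can be observed with a small sketch. Here x() and y() are stand-ins invented for the example, and a counter plays the role of y()'s side effect.

```c
#include <stdbool.h>

int calls_to_y = 0;               /* the "side effect" of y() */

bool x(void) { return true; }
bool y(void) { calls_to_y++; return false; }

/* y() is skipped whenever x() returns true. */
bool short_circuit(void)
{
    return x() || y();
}

/* Both functions always run, whatever x() returns. */
bool evaluate_both(void)
{
    bool xx = x();
    bool yy = y();
    return xx || yy;
}
```

Both functions return true, but only evaluate_both() increments calls_to_y.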
//First Choise
if(funct()){
//do something
}
This is totally fine as you check the return value of function to take the decision and your function returns either 0 or 1.
There is also an advantage here over the second choice, as you save the space of one variable, retVal, that would only hold the return value for the check.
If the return value is needed not just for the check in the if condition but also elsewhere in the program, then I would suggest storing it (choice 2).
Both methods will work fine. If you define better as code that will execute (very slightly) faster and take up (very slightly) less room when it is compiled, then alternative 1) is better. Alternative 1) will read the value of the function into a register and branch on the value in two commands and use no memory. Alternative 2) will read the value of the function into register, write the value to memory, read the value from memory into a register and branch on the value - for a total of four commands and four bytes of storage (assumes a 32 bit processor).
The first choice (note the spelling) is better, but for reasons entirely unrelated to what you might think.
The reason is that it is one line of code shorter, and therefore you have one less line of code to have to worry about, one less line of code to have to read when trying to understand how it works, one less line of code to have to maintain in the future.
Performance considerations are completely pointless under any real-life scenario, and as a matter of fact I would be willing to guess that any halfway decent compiler will produce the exact same machine code for both of these choices.
If you have questions of such a basic nature, I would strongly advise you to quit trying to "have an embedded software development point of view". Embedded is hard; try for non-embedded which is a lot easier. Once you master non-embedded, then you can try embedded.

Mingling switch and while in C [duplicate]

This code works for some reason but it does not make sense at all.
#include <stdio.h>
int main(void)
{
    switch(1)
    {
    case 0:
        while(1)
        {
        case 1: puts("Works"); break;
        }
    }
    return 0;
}
Can someone explain why it does work and what applications does this have?
The case labels are almost exactly like labels used by goto [1]. If you think of your code in those terms, it should be clear that it's valid. Namely, you can consider a switch statement to be a glorified conditional goto.
That said, I would slap anyone who wrote code like that in a production environment [2].
[1] In fact, they are both listed in the same grammar section (6.8.1) of the C99 standard.
[2] Yes, this is almost identical to Duff's device, but that last had any practical use decades ago.
The reason this works is somewhat non-intuitive: case labels of a switch statement act very much like regular labels, i.e. the ones designed for use with goto statement. You can place such labels anywhere in your code.
It turns out that the same rule applies to the case label: you can place them anywhere inside their corresponding switch statement, which incidentally includes the bodies of any nested loops.
The reasons why you may want to place labels inside control statements within the body of your switch statement are even less intuitive: it turns out that you can perform loop unrolling with a cumbersome-looking but very intuitive construct called Duff's Device. It is this construct that led to popularizing the idea of embedding case labels inside other control structures within switch statements.
You can interleave statements through the labels of switch, since they're just labels. What happens here is:
you have an infinite loop defined using while (1);
the switch (1) statement jumps to the case 1: label;
where "Works" is printed, then break; exits the infinite loop.
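For reference, this is a sketch of Duff's device adapted to copy between two ordinary buffers. The function name is mine, and the original device wrote to a single output register rather than advancing the destination pointer. The switch jumps into the middle of the do-while body to handle the remainder, and the loop then continues in full groups of eight.

```c
#include <stddef.h>

/* Copies count bytes from `from` to `to`, unrolled eight at a time. */
void duff_copy(char *to, const char *from, size_t count)
{
    if (count == 0)
        return;                       /* the loop body must run at least once */

    size_t n = (count + 7) / 8;       /* number of passes through the loop */

    switch (count % 8) {              /* jump into the loop for the remainder */
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

This is shown only to explain why the construct is legal. In practice memcpy or a plain loop is the better choice, since modern compilers unroll loops on their own.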

How to design software tests for subtle errors

This is a super simple example, in C here, to illustrate a subtle error that I don't know how to expose as a bug through a test.
Consider:
#include <stdio.h>
int main()
{
    int a;
    int b;
    int input;
    printf("Enter 1 or 2: ");
    scanf("%d", &input);
    switch(input) {
    case 1:
        a = 10;
        /* ERROR HERE, I FORGOT A BREAK! */
    case 2:
        b = 20;
        break;
    default:
        printf("You didn't listen!\n");
        return 1;
        break;
    }
    if(input == 1) {
        b = 30;
        printf("%d, %d\n", a, b);
    } else {
        printf("%d\n",b);
    }
    return 0;
}
As noted in the code, a break is missing so when 1 is entered, it falls through to case 2. The output for 1 though doesn't reflect this as it overwrites b later. So all tests that we can design, say by entering a number from the set {1, 2, 10} all result in the correct output.
In reality, the assignments inside the switch could be very expensive and so this bug could be quite costly. But, assuming it was written this way from day one, there's no benchmark to see that the cost is higher than expected.
So what can be done to flush out these kinds of errors? Is there a way to design test cases to expose it in production software?
EDIT
So I guess I wasn't entirely clear -- I wrote it in C to illustrate the type of problem encountered, but in reality it's not specific to C. The point I'm trying to make is that the code goes into sections we never intended it to go into (in this case because of a forgotten break to illustrate the point). My actual case is a Fortran code with 700,000 lines and it is going into branches we never intended it to go into because of poor if/switch design that is legal from a language point of view but potentially very expensive.
Is it possible to design a test or look at some data from some tool that will tell us it's going into branches it shouldn't be? I caught a mistake by printing "I shouldn't be here!" inside all the cases and saw that it was printed, there's gotta be a better way than randomly seeing it and putting print statements.
You can define a coding convention for switch statements, so that each branch imposes a special state, such as a variable being assigned that very case's value. For example:
switch (v) {
case 1:
    vcheck = 1;
    ...
    break;
case 2:
    vcheck = 2;
    ...
    break;
}
And test vcheck in your test case.
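A runnable sketch of that convention follows. The two function names are hypothetical, and the real work of each branch is left out. The buggy variant omits the break on purpose to mimic the question, so a test that compares the recorded value against the input exposes the fall-through even though the program's visible output would not.

```c
/* Correct version: returns the marker of the branch that ran. */
int handle_fixed(int v)
{
    int vcheck = 0;

    switch (v) {
    case 1:
        vcheck = 1;
        break;
    case 2:
        vcheck = 2;
        break;
    }
    return vcheck;
}

/* Same code with the break after case 1 deliberately missing. */
int handle_buggy(int v)
{
    int vcheck = 0;

    switch (v) {
    case 1:
        vcheck = 1;
        /* break forgotten: execution continues into case 2 */
    case 2:
        vcheck = 2;
        break;
    }
    return vcheck;
}
```

A test that expects the returned marker to equal the input passes for handle_fixed(1) and fails for handle_buggy(1), which returns 2.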
Other than that, you can use tools that perform static code analysis or validation of MISRA rules, and engage them in your build process. They will bring some peace of mind... :-)
Finally, (my favorite) you can write a script that checks for such cases and warns against them.
The correct state for case 1 is that b won't be set.
Check to see if b is set.
You may need to break your code down into smaller segments to test this if you're setting b later, but that's just good modularity.
It seems like you're asking "how do I test untestable code?". The answer is that it takes skill and planning to write testable code, it can't just be an afterthought.
There's tons of stuff on the web to help you write testable code:
https://www.google.com/search?q=writing+testable+code
For your specific example, it is in no way a mistake / bug by definition. The possibility to fallthrough is wanted in the language. If you want to forbid certain dangerous language features, then linters are the way to go.
To avoid completely unforeseen mistakes, there is one rule: always code defensively and make use of asserts wherever you can put them.
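Applied to the example from the question, a defensive version could look like the sketch below. The sentinel UNSET and the function name compute_b are assumptions added for illustration. Initializing b to a sentinel and asserting on it turns the silent fall-through into an immediate failure in debug builds.

```c
#include <assert.h>

enum { UNSET = -1 };    /* hypothetical sentinel: b has not been assigned */

/* Returns the final value of b for the given input, or UNSET if invalid. */
int compute_b(int input)
{
    int b = UNSET;

    switch (input) {
    case 1:
        /* a = 10 in the original; b must stay untouched here */
        break;
    case 2:
        b = 20;
        break;
    default:
        return UNSET;
    }

    if (input == 1) {
        assert(b == UNSET);   /* fires if case 1 ever falls into case 2 */
        b = 30;
    }
    return b;
}
```

If the break after case 1 were removed, the assertion would fail on the first run with input 1 instead of the extra work going unnoticed. Note that assert is compiled out when NDEBUG is defined, so this only guards debug and test builds.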
Why is this a bug I ask? Using any kind of black-box testing the code works. So, if the only requirement is that the code works, then there is no bug.
But the code is flawed. That missing break makes the code more difficult to understand and harder to maintain.
Coding conventions are rules on how the code should look. Adherence to coding conventions can not be done by testing the compiled program, they must be done on the source code.
Testing coding conventions is done through code inspection (automatic or manual).
Edit:
If you are worried about performance, then use instrumentation tools to find the "hot spots". You will find that most of the execution time is probably spent in just a few modules. Review those modules and every call to them. You will find that you only need to review 10,000-30,000 lines of code. Since the review scope is limited, the review should take 1-3 weeks.
Bottom line: Code review is vastly superior to testing when it comes to finding subtle bugs.
If input==1 and you see b = 30 on the output, you know something is wrong. Also, remember in the else clause, you should write something to b before reading from it. In the case of default:, (say, input==100) you might end up reading from a location without properly setting it.
Besides, code reviews, if you can afford them, should help greatly in finding things like these.

Resources