C: Empty case before default, useless?

In a program, I saw code like this (simplified):
switch (x) {
case 1:
    //dostuff
    break;
/*___________________*/
//Here it is important
case 2:
default:
    //dostuff
    break;
}
Now I was wondering why someone writes a case and leaves it empty before the default case.
(Clearly it would make sense before another case).
I know that in C, there is a fallthrough if there is no break, so if x is 2, the program runs in the case 2: part, and directly falls through to the default-case.
So is the case 2:-case useless, since there is no code in it, and default will be done also without the label, so the same things are done with and without the case?
Is there a reason to write code like this (like easier modification when maintaining, but in my opinion not really relevant), or did the programmer just not remove it by mistake?
I have used switch several times in different languages, but never would have needed such code...

There is no reason for it. The explicit case 2 could be an attempt at writing self-documenting code, but here it doesn't really add anything, as the code lacks meaningful comments that explain what's unique about case 2.
Sometimes you could write code like this to explicitly document to the reader that you have considered all possible values that a variable can have. For example such self-documenting code might make sense with enums.
In this case, it really just looks like it is code still in development. Or it's some sloppily written left-overs that made it to the production code.
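As a minimal sketch of the self-documenting use with enums mentioned above: the enum `color` and the function `is_warm` are hypothetical names made up for illustration, not taken from the code in the question.

```c
#include <stdbool.h>

/* Hypothetical enum, used only for illustration. */
enum color { COLOR_RED, COLOR_ORANGE, COLOR_BLUE, COLOR_GREEN };

/* Returns true for "warm" colors, false for everything else. */
bool is_warm(enum color c)
{
    switch (c) {
    case COLOR_RED:
    case COLOR_ORANGE:
        return true;

    /* Listed explicitly to show the reader that these values were
       considered. They share the code of the default label, so
       removing the two case labels would not change behavior. */
    case COLOR_BLUE:
    case COLOR_GREEN:
    default:
        return false;
    }
}
```

Here the empty labels carry information because the names say something. A bare `case 2:` without a comment does not.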

Last time I used MISRA C (embedded C rules for vehicles), all switch statements had to have a default clause. Potentially, this could be a reason for what you are seeing, as this would mean that ALL values passed into the switch would do something. Admittedly, that would mean the case 2 is redundant, but it might make things clearer when reading the code as a whole. It could also be some sort of embedded compiler optimisation (sometimes embedded compilers generate more efficient code when given a switch, rather than several ifs).

Related

C case number+1: need brackets?

Programming in C I found out it was convenient in a switch-case to make little groups of cases by giving them the same name and add a number to it like:
case initiating:
    break;
case (initiating+1):
    break;
etc etc.
Currently I am still using brackets around (initiating+1). But I wonder, do I have to do that?
would
case (initiating+2):
work?
I could not really find an answer.
A switch requires a constant expression for each case label: every label must be known at compile time. Brackets are not required but can improve readability. Be sure that you really need (initiating + 2). The 2 is a magic number and does not give the reader of your program any additional information. Using an enum will give you the same result with better readability.
The best way to verify if this works is by simply writing an example down and compiling it.
If initiating is a constant it should work.
Case labels are not required to be enclosed in parentheses, even if they are expressions containing arithmetic operations.
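A small sketch of both spellings, assuming `INITIATING` is an enum constant (the enum and the function `state_name` are hypothetical names for illustration):

```c
/* Hypothetical states; the names are assumptions for illustration. */
enum state {
    INITIATING,          /* 0 */
    INITIATING_STEP_2,   /* 1, the same value as INITIATING + 1 */
    RUNNING              /* 2, the same value as INITIATING + 2 */
};

const char *state_name(int s)
{
    switch (s) {
    case INITIATING:
        return "initiating";
    case INITIATING + 1:      /* no parentheses: still a constant expression */
        return "initiating, step 2";
    case (INITIATING + 2):    /* parentheses are allowed but optional */
        return "running";
    default:
        return "unknown";
    }
}
```

Writing `case INITIATING_STEP_2:` and `case RUNNING:` instead would select the same values and avoid the magic numbers.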

How to design software tests for subtle errors

This is a super simple example, in C here, to illustrate a subtle error that I don't know how to expose as a bug through a test.
Consider:
#include <stdio.h>
int main()
{
    int a;
    int b;
    int input;
    printf("Enter 1 or 2: ");
    scanf("%d", &input);
    switch(input) {
    case 1:
        a = 10;
        /* ERROR HERE, I FORGOT A BREAK! */
    case 2:
        b = 20;
        break;
    default:
        printf("You didn't listen!\n");
        return 1;
        break;
    }
    if(input == 1) {
        b = 30;
        printf("%d, %d\n", a, b);
    } else {
        printf("%d\n",b);
    }
    return 0;
}
As noted in the code, a break is missing so when 1 is entered, it falls through to case 2. The output for 1 though doesn't reflect this as it overwrites b later. So all tests that we can design, say by entering a number from the set {1, 2, 10} all result in the correct output.
In reality, the assignments inside the switch could be very expensive and so this bug could be quite costly. But, assuming it was written this way from day one, there's no benchmark to see that the cost is higher than expected.
So what can be done to flush out these kinds of errors? Is there a way to design test cases to expose it in production software?
EDIT
So I guess I wasn't entirely clear -- I wrote it in C to illustrate the type of problem encountered, but in reality it's not specific to C. The point I'm trying to make is that the code goes into sections we never intended it to go into (in this case because of a forgotten break to illustrate the point). My actual case is a Fortran code with 700,000 lines and it is going into branches we never intended it to go into because of poor if/switch design that is legal from a language point of view but potentially very expensive.
Is it possible to design a test or look at some data from some tool that will tell us it's going into branches it shouldn't be? I caught a mistake by printing "I shouldn't be here!" inside all the cases and saw that it was printed, there's gotta be a better way than randomly seeing it and putting print statements.
You can define a coding convention for switch statements so that each branch sets a distinctive state, such as assigning the case's own value to a variable. For example:
switch (v) {
case 1:
    vcheck = 1;
    ...
    break;
case 2:
    vcheck = 2;
    ...
    break;
}
And test vcheck in your test case.
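A rough sketch of how that convention could be checked, assuming the switch lives in a function of its own. The function name `process` and the "0 means no branch ran" convention are assumptions for illustration.

```c
/* Set by each branch to its own case value (a hypothetical convention). */
static int vcheck;

void process(int v)
{
    vcheck = 0;               /* 0 means "no branch ran" */
    switch (v) {
    case 1:
        vcheck = 1;
        /* ... work for case 1 ... */
        break;
    case 2:
        vcheck = 2;
        /* ... work for case 2 ... */
        break;
    default:
        break;
    }
}
```

A test then calls `process(1)` and checks that `vcheck` is 1. If the break after case 1 were missing, execution would fall into case 2, `vcheck` would end up as 2, and the check would fail.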
Other than that, you can use tools that perform static code analysis or validation of MISRA rules, and integrate them into your build process. They will give you some peace of mind... :-)
Finally (my favorite), you can write a script that checks for such cases and warns against them.
The correct state for case 1 is that b won't be set.
Check to see if b is set.
You may need to break your code down into smaller segments to test this if you're setting b later, but that's just good modularity.
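One possible way to make "is b set?" testable is to pull the switch out into its own function and start both variables at a sentinel value. This is only a sketch: the function `init_values` and the sentinel `UNSET` are hypothetical, and it assumes `INT_MIN` is never a legitimate value.

```c
#include <limits.h>

#define UNSET INT_MIN   /* hypothetical sentinel meaning "never assigned" */

/* Returns 0 on success, 1 for an invalid input. */
int init_values(int input, int *a, int *b)
{
    *a = UNSET;
    *b = UNSET;
    switch (input) {
    case 1:
        *a = 10;
        break;      /* without this break, *b would become 20 */
    case 2:
        *b = 20;
        break;
    default:
        return 1;
    }
    return 0;
}
```

A unit test for input 1 can now check that `*b` is still `UNSET` after the call, which the version with the missing break would fail.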
It seems like you're asking "how do I test untestable code?". The answer is that it takes skill and planning to write testable code, it can't just be an afterthought.
There's tons of stuff on the web to help you write testable code:
https://www.google.com/search?q=writing+testable+code
For your specific example, it is in no way a mistake / bug by definition. The possibility to fallthrough is wanted in the language. If you want to forbid certain dangerous language features, then linters are the way to go.
To avoid completely unforeseen mistakes, there is one rule: always code defensively and use asserts wherever you can put them.
Why is this a bug I ask? Using any kind of black-box testing the code works. So, if the only requirement is that the code works, then there is no bug.
But the code is flawed. That missing break makes the code more difficult to understand and harder to maintain.
Coding conventions are rules on how the code should look. Adherence to coding conventions cannot be checked by testing the compiled program; it must be checked on the source code.
Testing coding conventions is done through code inspection (automatic or manual).
Edit:
If you are worried about performance, then use instrumentation tools to find the "hot spots". You will find that most of the execution time is probably spent in just a few modules. Review those modules and every call to them. You will find that you only need to review 10-30 000 lines of code. Since the review scope is limited, the review should take 1-3 weeks.
Bottom line: Code review is vastly superior to testing when it comes to finding subtle bugs.
If input==1 and you see b = 30 on the output, you know something is wrong. Also, remember in the else clause, you should write something to b before reading from it. In the case of default:, (say, input==100) you might end up reading from a location without properly setting it.
Besides, code reviews, if you can afford them, should help greatly in finding things like these.

My Simpler Dead-code Remover

I am doing a simulation of a dead-code remover in a very simple manner.
For that, my idea is:
Step 1: Read the input C-Program line by line and store it in a doubly linked-list or Array.(Since deletion and insertion will be easier than in file operations).
Doubt:Is my approach correct? If so, How to minimize traversing a Linked-List each time.
Step 2: Analyzing of the read strings will be done in parallel, and tables are created to maintain variables names and their details, functions and their calls,etc.,
Step 3: Each entry in the variable table will be searched for, and each use of a variable will be replaced by the value it holds at that point.
(E.g.)
i=0;
if(i==3) will be replaced by if(0==3).
But on situation like..
get(a);
i=a;
if(i){}
here,'i' will not be replaced since it depends on another variable. 'a' will not be replaced since it depends on user input.
Doubt: if user input is,
if(5*5+6){print hello;} ,
it will surely be an unnecessary check. How can I evaluate this expression to simplify the code to
{
print hello;
}
Step 4: Strings will be searched for if(0), while(0) etc., and using a stack, the action block is removed. if(0){/*this will be removed*/}
Step 5:(E.g) function foo(){/**/} ... if(0) foo(); ..., Once all the dead codes are removed, foo()'s entry in the function table is checked to get no.of.times it gets referred in the code. If it is 0, that function has to be removed using the same stack method.
Step 6: In the remaining functions, the lines below the return statements (if any) are removed except the '}'. This removal is done till the end of the function. The end of the function is identified using stack.
Step 7: And I will assume that my dead-free code is ready now. Store the linked-list or array in an output file.
My Questions are..
1. Is my idea meaningful, and is it implementable? How can I improve this algorithm?
2. While trying to implement this idea, I have to deal more with string manipulation than with removing dead code. Is there any way to reduce the string manipulation in this algorithm?
Do not do it this way. C is a free-form language, and trying to process it line-by-line will result in supporting a subset of C that is so ridiculously restricted that it doesn't deserve the name.
What you need to do is to write a proper parser. There is copious literature about that out there. Find out which textbook your school uses for its compiler-construction course, and work through that -- or just take the course! Only when you've got the parser down should you even begin to consider semantics. Then do your work on abstract syntax trees instead of strings. Alternatively, find an already written and tested parser for C that you can reuse (but you'll still need to learn quite a bit in order to integrate it with your own processing).
If you end up writing the parser yourself, and it's only for your own edification, consider using a simpler language than C as your subject. Even though C at its core is fairly compact as languages go, getting all details of the declaration syntax right is surprisingly tricky, and will probably distract you from what you're actually interested in. And the presence of the preprocessor is an issue in itself which can make it very difficult to design meaningful source-to-source transformations.
By the way, the transformations you sketch are known in the trade as "constant propagation", or (in more ambitious variants that will clone functions and loop bodies when they have differing constant inputs) "partial evaluation". Googling those terms may be interesting.
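To give a feel for the evaluation step asked about in the question (`if(5*5+6)`), here is a minimal sketch of folding a constant integer expression with a tiny recursive-descent parser. All names are hypothetical. It only handles non-negative integer literals, `+`, `-`, `*` and parentheses, and ignores overflow. A real tool would do this on the abstract syntax tree produced by a full parser rather than on raw text.

```c
#include <ctype.h>
#include <stdbool.h>

/* Cursor into the expression text and an error flag (hypothetical names). */
static const char *p;
static bool ok;

static void skip_spaces(void)
{
    while (isspace((unsigned char)*p))
        p++;
}

static long parse_sum(void);

/* atom := integer literal | '(' sum ')' */
static long parse_atom(void)
{
    skip_spaces();
    if (*p == '(') {
        p++;
        long v = parse_sum();
        skip_spaces();
        if (*p == ')')
            p++;
        else
            ok = false;       /* missing closing parenthesis */
        return v;
    }
    if (!isdigit((unsigned char)*p)) {
        ok = false;           /* e.g. a variable name: not a constant */
        return 0;
    }
    long v = 0;
    while (isdigit((unsigned char)*p))
        v = v * 10 + (*p++ - '0');
    return v;
}

/* product := atom { '*' atom } */
static long parse_product(void)
{
    long v = parse_atom();
    for (;;) {
        skip_spaces();
        if (*p == '*') { p++; v *= parse_atom(); }
        else return v;
    }
}

/* sum := product { ('+' | '-') product } */
static long parse_sum(void)
{
    long v = parse_product();
    for (;;) {
        skip_spaces();
        if (*p == '+') { p++; v += parse_product(); }
        else if (*p == '-') { p++; v -= parse_product(); }
        else return v;
    }
}

/* Folds a constant expression into *result.
   Returns false if the text is not a purely constant expression. */
bool fold_constant(const char *text, long *result)
{
    p = text;
    ok = true;
    *result = parse_sum();
    skip_spaces();
    return ok && *p == '\0';
}
```

With this, `fold_constant("5*5+6", &r)` succeeds with `r` equal to 31, which is nonzero, so the `if` around `print hello;` could be dropped. For `"i+3"` it returns false, which corresponds to the "depends on another variable" case in the question.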

break in a case with return.. and for default

My OCD makes me add "break" when writing case statements, even if they will not be executed. Consider the following code example:
switch(option) {
    case 1:
        a = 1;
        b = 7;
        break;
    case 2:
        a = 2;
        b = 4;
        return (-1);
        break;
    default:
        a = -1;
        break;
}
My two questions are:
For "case 2:", I don't really need the break, but is it a good idea to have it there anyway?
For "default:". Is it purely OCD, or is there any real reason to have the break here?
You don't need either break, but there's no harm in having them. In my opinion, keeping your code structured is worth having a couple of extraneous statements.
I agree with having a break in a final default case, and don't agree with breaks after returns. (A colleague does those and it hurts my eyes.)
I also indent switches so as to reduce proliferation of indent levels. :) i.e.:
switch(option) {
case 1:
    a = 1;
    b = 7;
    break;
case 2:
    a = 2;
    b = 4;
    return -1;
default:
    a = -1;
    break;
}
(I also think that, since the return statement is not a function, it isn't appropriate to enforce a superfluous style that makes it look like one.)
The break after your default case is just a matter of personal preference.
Putting a break after return almost seems contradictory to me. I'd remove the break, just to make the return statement really stand out.
I would consider the break after return to be bad form, you will get warnings about unreachable code on some compilers.
The break on your default case is completely appropriate, case fall through is a tool and should be especially marked when used.
I prefer to always have a break in each case, including the default, and to avoid return inside switches altogether. For short switches with just 2-3 cases (including default), return is OK, but only if all cases do it the same way. A 'pointless' break I see as just that: it only makes for more code to read. The same goes for empty defaults that just break: totally pointless. Ease of reading the code is, in my opinion, more important than what happens if someone happens to change this or that.
Neither break does anything for you, but neither does harm.
Personally, I usually leave them out if I have a return - but I also try to avoid having multiple return points in a function if possible.
However, I do think the break in the default: case is good - for one reason: If you were to leave it out, and somebody added a new case after default:, the behavior would be different if they "forget" to add in a break.
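A short sketch of that scenario, with hypothetical names: `default:` was written without a break, and someone later added `case 2:` after it.

```c
/* Hypothetical example: case 2 was added later, after default. */
int classify(int option)
{
    int a = 0;
    switch (option) {
    case 1:
        a = 1;
        break;
    default:
        a = -1;
        /* no break here: execution continues into case 2 */
    case 2:
        a = 2;
        break;
    }
    return a;
}
```

For an unknown option such as 99, the default branch sets `a` to -1, and then execution falls into case 2 and overwrites it, so the function returns 2. With a break at the end of the default branch it would return -1.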
As others have pointed out, placing a break after a return or in the default case is mostly a matter of personal style.
When I don't have to follow any specific style rules, I prefer something like this:
switch(foo){
    case 0:
        baz = 1;
        break;
    case 1:
        bar %= 2;
        return -1;
        /* NOTREACHED */
    case 2:
        bar = -1;
        return -2;
        /* NOTREACHED */
        break;
    default:
        break;
}
Between cases 1 and 2, I tend to prefer 2. Even though the comment says NOTREACHED, comments can lie (unintentionally, of course) when the code changes. I like the NOTREACHED comment since it can satisfy lint that you know what you are doing and serves as a reminder that you are exiting the function early. The reasoning that placing a break after the return will mitigate errors if the return is deleted seems flawed to me. You are still going to get bogus behavior regardless of whether you fall through to the next case or exit the switch and continue on as before.
Of course, if I can avoid it I would not return from a function within the body of a switch.
I'm told that in C, C++ and Java, if you don't put in those breaks, control flow falls into the following cases and executes their instructions, no matter whether the variable has the values assigned to those cases.
Regarding the comment that others have made that they leave the break in the default case in case someone comes by later and adds a case after it: Where I work, the coding standard says to always put the default as the last case; so in our situation, a break on that case is just redundant. (This is one case where I agree wholeheartedly with the company's coding standard, because with the default case always being the last one, you always know where to find it, even in a long switch-statement.)
As for breaks after returns, I tend to omit the break unless there are any execution paths that do not return, as I find it redundant. (My exception to this is, on the rare occasions when there are several execution paths in a case and I can't tell with a quick scan of the code whether or not they all return, then I'll leave the break in just to be safe.)
I don't personally put the breaks in, but it could help when someone else decides to move the return (-1) to outside of the switch and forgets to add break.
I would put the break in to show that you do not intend to fall through to next case.
In this exact example, for both questions, it is personal preference. In general, the rule is this: anything without a break will fall through. That means (as Pod said) it's a good idea to put breaks in default cases in case they are not last. This also means if your case contains a return, then a following break is indeed not necessary.
Please excuse my limited knowledge, but what's OCD?
Apart from that, Brian Kernighan provides a good explanation on when you should (not) use break within a switch statement.

What is this strange C code format?

What advantage, if any, is provided by formatting C code as follows:
while(lock_file(lockdir)==0)
    {
    count++;
    if(count==20)
        {
        fprintf(stderr,"Can't lock dir %s\n",lockdir);
        exit(1);
        }
    sleep(3);
    }
if(rmdir(serverdir)!=0)
    {
    switch(errno)
        {
        case EEXIST:
            fprintf(stderr,"Server dir %s not empty\n",serverdir);
            break;
        default:
            fprintf(stderr,"Can't delete dir %s\n",serverdir);
        }
    exit(1);
    }
unlock_file(lockdir);
versus something more typical such as
while(lock_file(lockdir)==0) {
    count++;
    if(count==20) {
        fprintf(stderr,"Can't lock dir %s\n",lockdir);
        exit(1);
    }
    sleep(3);
}
if(rmdir(serverdir)!=0) {
    switch(errno) {
        case EEXIST:
            fprintf(stderr,"Server dir %s not empty\n",serverdir);
            break;
        default:
            fprintf(stderr,"Can't delete dir %s\n",serverdir);
    }
    exit(1);
}
unlock_file(lockdir);
I just find the top version difficult to read, and it is hard to get the indentation level correct for statements outside of a long block, especially for long blocks containing several nested blocks.
Only advantage I can see is just to be different and leave your fingerprints on code that you've written.
I notice vim formatting would have to be hand-rolled to handle the top case.
The top example is known as "Whitesmiths style". Wikipedia's entry on Indent Styles explains several styles along with their advantages and disadvantages.
The indentation you're seeing is Whitesmiths style. It's described in the first edition of Code Complete as "begin-end Block Boundaries". The basic argument for this style is that in languages like C (and Pascal) an if governs either a single statement or a block. Thus the whole block, not just its contents should be shown subordinate to the if-statement by being indented consistently.
XXXXXXXXXXXXXXX      if (test)
   XXXXXXXXXXXX          one_thing();

XXXXXXXXXXXXXXX      if (test)
   X                     {
   XXXXX                 one_thing();
   XXXXX                 another_thing();
   X                     }
Back when I first read this book (in the 90s) I found the argument for "begin-end Block Boundaries" to be convincing, though I didn't like it much when I put it into practice (in Pascal). I like it even less in C and find it confusing to read. I ended up using what Steve McConnell calls "Emulating Pure Blocks" (Sun's Java Style, which is almost K&R).
XXXXXXXXXXXXXX X     if (test) {
   XXXXXX                one_thing();
   XXXXXX                another_thing();
X                    }
This is the most common style used to program in Java (which is what I do all day). It's also most similar to my previous language which was a "pure block" language, requiring no "emulation". There are no single-statement bodies, blocks are inherent in the control structure syntax.
IF test THEN
    oneThing;
    anotherThing
END
Nothing. Indentation and other coding standards are a matter of preference.
Personal Preference I would have thought? I guess it has the code block in one vertical line so possibly easier to work out at a glance? Personally I prefer the brace to start directly under the previous line
It looks pretty standard to me. The only personal change I'd make is aligning the curly-braces with the start of the previous line, rather than the start of the next line, but that's just a personal choice.
Anyway, the style of formatting you're looking at there is a standard one for C and C++, and is used because it makes the code easier to read, and in particular by looking at the level of indentation you can tell where you are with nested loops, conditionals, etc. E.g.:
if (x == 0)
{
    if (y == 2)
    {
        if (z == 3)
        {
            do_something (x);
        }
    }
}
OK in that example it's pretty easy to see what's happening, but if you put a lot of code inside those if statements, it can sometimes be hard to tell where you are without consistent indentation.
In your example, have a look at the position of the exit(1) statement -- if it weren't indented like that, it would be hard to tell where this was. As it is, you can tell it's at the end of that big if statement.
Code formatting is personal taste. As long as it is easy to read, it would pay for maintenance!
By following formatting and commenting standards, you first of all show respect for the other people who will read and edit the code you write. If you reject the rules and write esoteric code, the most probable result is that you will not be able to communicate effectively with other programmers. Code format is a personal choice only if the software is written by you and for you and nobody else is expected to read it, but how much modern software is written by just one person?
The "advantage" of Whitesmiths style (as the top one in your example is called) is that it mirrors the actual logical structure of the code:
indent if there is a logical dependency
place corresponding brackets on the same column so they are easy to find
opening and closing of a context (which may open/close a stack frame etc) are visible, not hidden
So: fewer if/else errors, fewer loops gone wrong, fewer catches at the wrong level, and better overall logical consistency.
But as benefactual wrote: within certain rational limits, formatting is a matter of personal preference.
It's just another style: people code how they like to code, and that is one accepted style (though not my preferred one). I don't think it has much of a disadvantage or advantage over the more common style in which brackets are not indented but the code within them is. Perhaps one could justify it by saying that it more clearly delimits code blocks.
In order for this format to have "advantage", we really need some equivalent C code in another format to compare to!
Where I work, this indentation scheme is used in order to facilitate a home-grown folding editor mechanism.
Thus, I see nothing fundamentally wrong with this format - within certain rational limits, formatting is a matter of personal preference.

Resources