Both the kernel coding style and GNOME's C style guide state that:
Do not unnecessarily use braces where a single statement will do.
if (condition)
action();
but at the same time braces should sometimes be used anyway, as in the else branch of:
if (condition) {
do_this();
do_that();
} else {
otherwise();
}
Are there any technical or usability reasons to prefer it this way? Are there any objective reasons not to put the braces there every time?
There are only stylistic and ease-of-editing-related reasons.
Whether you omit the braces or not, C compilers must act as if the braces were there (plus a pair around the whole selection statement (if or if-else)).
6.8.4p3:
A selection statement is a block whose scope is a strict subset of the
scope of its enclosing block. Each associated substatement is also a
block whose scope is a strict subset of the scope of the selection
statement.
The existence of these implicit blocks can be nicely demonstrated with enums:
#include <stdio.h>
int main()
{
enum{ e=0};
printf("%d\n", (int)e);
if(1) printf("%d\n", (sizeof(enum{e=1}),(int)e));
if(sizeof(enum{e=2})) printf("%d\n", (int)e);
printf("%d\n", (int)e);
//prints 0 1 2 0
}
A similar rule also exists for iteration statements: 6.8.5p5.
These implicit blocks also mean that a compound literal defined inside an iteration or selection statement is limited to such an implicit block. That is why example http://port70.net/~nsz/c/c11/n1570.html#6.5.2.5p15 from the standard puts a compound literal in between a label and an explicit goto instead of simply using a while statement, which would limit the scope of the literal, regardless of whether or not explicit braces were used.
While it may be tempting, don't ever do:
if (!Ptr) Ptr = &(type){0}; //WRONG way to provide a default for Ptr
The above leads to UB (and actually does not work with gcc -O3) because of the scoping rules: the compound literal lives only in the implicit block of the if's substatement, so Ptr dangles as soon as the if ends.
The correct way to do the above is either with:
type default_val = {0};
if (!Ptr) Ptr = &default_val; //OK
or with:
Ptr = Ptr ? Ptr : &(type){0}; //OK
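As a minimal sketch of the conditional-expression form in context (struct config and effective_retries are hypothetical names, not from the question): because the compound literal appears in a plain expression statement, it belongs to the function body's block and stays valid until the function returns.

```c
#include <stddef.h>

/* Hypothetical type used only for this illustration. */
struct config {
    int retries;
};

/* Returns cfg->retries, falling back to a zeroed default when cfg is NULL. */
int effective_retries(const struct config *cfg)
{
    /* The compound literal is associated with the enclosing block (the
     * function body), not with an implicit if-block, so the pointer
     * remains valid for the rest of the function. */
    cfg = cfg ? cfg : &(struct config){0};
    return cfg->retries;
}
```

Calling effective_retries(NULL) reads the zeroed default, while a non-null argument is used unchanged.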
These implicit blocks are new in C99, and the inner ones (for selection statements (= ifs)) are well rationalized (C99RationaleV5.10.pdf, section 6.8) as aids in refactoring: they prevent braces that are added to previously unbraced branches from changing the meaning of the code.
The outermost block around the whole selection statement doesn't appear to be so well rationalized, unfortunately (more accurately, it's not rationalized at all). It appears to be copied from the rule for iteration statements, which in turn appears to copy the C++ rules, where for-loop-local variables are destructed at the very end of the whole for loop (as if the for loop were braced).
(Unfortunately, I think that for selection statement the outermost implicit {} does more harm than good as it prevents you from having macros that stack-allocate in just the scope of the caller but also need a check, because then you can only check such macros with ?: but not with if, which is weird.)
Well, there's one special case in which braces do need to be used: Suppose you have the following code:
if (a)
if (b)
f();
else g();
As it is indented, one could assume the else g(); statement belongs to the first if(a) statement, but C syntax rules say that it is interpreted as (now with braces):
if (a) {
if (b) {
f();
}
else {
g();
}
}
If you wanted the other possibility, then you must use braces. For example, you can write it this way:
if (a) {
if (b)
f();
}
else
g();
which actually means:
if (a) {
if (b) {
f();
}
}
else {
g();
}
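The two interpretations can be compared side by side with a small sketch (the function names and return codes are made up for the illustration). Note that some compilers warn about the unbraced form precisely because of this ambiguity.

```c
/* Unbraced: the else binds to the nearest if, i.e. to if (b). */
int else_binds_to_inner(int a, int b)
{
    int result = 0;
    if (a)
        if (b)
            result = 1;
        else
            result = 2;   /* reached only when a is true and b is false */
    return result;
}

/* Braced: the else is forced onto the outer if (a). */
int else_binds_to_outer(int a, int b)
{
    int result = 0;
    if (a) {
        if (b)
            result = 1;
    } else
        result = 2;       /* reached whenever a is false */
    return result;
}
```

With a false and b false, the first function returns 0 and the second returns 2; with a true and b false, the first returns 2 and the second returns 0.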
Note
As all elementary programming books recommend: if you are in doubt about operator precedence, use parentheses. Extending that to statements: if you are in doubt, use braces! :)
I hate those "if in doubt" guidelines with a passion. They engender laziness that pushes the cost onto the code reader.
Such guidelines lead to code that is more cluttered, slower to read, and therefore harder to debug.
If in doubt go and read the precedence table.
If still hesitating, write some test code to verify the interpretation.
Repeat this every time you code until precedence becomes second nature.
When you are sure you have a firm grasp of precedence, then and only then write the production code.
If you really can't manage that, then always break up your statements so that they contain no more than two levels of grouping parentheses in any one statement. If that means you have to make up lots of temporary variable names, that's a good thing.
Related
case 1:
{ //question is about this curly brace
int val;
scanf("%d", &val);
if(top1 == NULL){
enqueue(top1, val, bottom1);
}
else{
enqueue(top1, val);
}
break;
}
Without the curly brace after case 1: it gave an error:
a label can only be part of a statement and a declaration is not a statement: int val;
That is how the C grammar is defined. Variable declarations are not considered statements:
int x; //a declaration
int y = 3; //another declaration
x = y + 1; //a statement
A label is required to be followed by a statement. A label plus a declaration is not allowed.
foo: int x; //error
bar: x = y +1; //ok
That is, IMO, inconvenient:
while (...)
{
//...
goto end;
end: //error, no statement
}
And remember that a case is just a special kind of label, so:
case 1: int x; //error
case 2: x = y +1; //ok
The issue with braces is that in C they are used to build a compound statement, that of course is a kind of statement. The lines inside the compound statements can be both declarations and statements, all mixed (old C versions only allowed declarations at the beginning, not in the middle). So:
case 1: int x; //error: label plus declaration
case 2: { int x; } //ok: label plus compound statement
As a footnote, since modern C allows to intermix declarations and statements you can also write:
case 1:; int x; //ok: label plus empty statement.
because an isolated ; is an empty statement, and it can be used to satisfy the grammar wherever a no-op statement is needed.
Whether to use a ; or a { ... } is a matter of readability. In the end: example I'd use a ;, but in the case: example I prefer the { ... }.
while (...)
{
//...
goto end;
end:; //ok, empty statement
}
switch (...)
{
case 1: //ok, compound statement
{
int x;
}
}
Of course more creative solutions can be written, such as:
case 1: {} int x; //ok, label plus empty compound statement
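The fragments above can be put together into a compilable sketch. Both function names are hypothetical; the first uses an empty statement and a compound statement after case labels, the second uses an empty statement after a label at the end of a block.

```c
/* case 1 uses an empty statement, case 2 a compound statement. */
int classify(int n)
{
    switch (n) {
    case 1:;                      /* label plus empty statement */
        int doubled = n * 2;      /* the declaration is now a separate block item */
        return doubled;
    case 2: {                     /* label plus compound statement */
        int tripled = n * 3;      /* scope limited to these braces */
        return tripled;
    }
    default:
        return 0;
    }
}

/* Adds up the non-zero elements; the label sits at the end of the loop body. */
int sum_skipping_zeros(const int *v, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        if (v[i] == 0)
            goto next;
        sum += v[i];
    next:;                        /* the empty statement gives the label something to label */
    }
    return sum;
}
```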
C Rules About Case Labels
The rules of the C standard that prevent a declaration from following a case label are:
A case label must be followed by a statement (C 2011 [N1570] 6.8.1).
The C standard defines a statement as one of labeled-statement, compound-statement, expression-statement, selection-statement, iteration-statement, or jump-statement (6.8). None of these is a declaration.
The C standard treats declarations and statements separately. The rule that allows declarations to be largely mingled with statements is that a compound-statement is a list of block-items in braces (that is, { block-item-listopt }) (6.8.2), and a block-item is defined as a declaration or a statement. So, inside braces, you can mix declarations and statements. But a case label must be part of a statement; it is not a separate thing you can insert anywhere.
Declarations can be included inside a switch using two alternatives. One is to use an empty statement after the case label, as in:
case 123:
;
int foo;
…
Another is to use a compound statement after the case label, as in:
case 123:
{
int foo;
…
}
Generally, the latter is preferable, because the scope of foo is limited to the compound statement, so it cannot be used accidentally in another section of the switch statement.
Reasons For the Rules
I do not see a reason for this other than history. Originally, declarations were even more restricted than they are now. Inside functions, declarations had to be the first statements inside braces. You could not put a declaration after any statement. That has been relaxed in modern C, but why is there still a restriction on what follows a case label?
There cannot be a semantic reason that a declaration cannot follow a case label in modern C, because the empty-statement example above would have the same semantics as:
case 123:
int foo;
That is, the compiler would have to be prepared to create and initialize a new object at the same point in execution. Since it has to do that for the legal example code, it would be able to do it for this version too.
I also do not see a syntactic or grammatical barrier. The colon after the constant expression of a case label is pretty distinct. (The constant expression can have colons in it from ? : operators, but the first : not associated with a ? will be the end of the case label.) Once parsing reaches that colon, the current parsing state seems clean. I do not see why it could not recognize either a declaration or a statement there, just as it was prepared to do before the case.
(If somebody can find a problem in the grammar that would be caused by allowing a case label to be followed by a declaration, that would be interesting.)
First of all, the error you posted is a syntax error related to the format of a case-labeled statement. It allows only an executable statement, not a declaration. Put an empty statement before the declaration and you'll be OK. Try the following:
#include <stdio.h>
int main()
{
switch(3) {
int x = 3; /* the initializer is ignored by the compiler
* you can include declarations here, but as
* this code is never executed, the initializer
* never takes place. */
case 3:; /* <=== note this semicolon, which allows the next declaration */
int y = 5;
printf("x = %d, y = %d\n", x, y);
break;
}
}
The first variable, x, will be declared properly, but the initializer will not be executed, as the case statement selected is the one corresponding to case label 3. The printf will print
x = 0, y = 5
(Note: that is what happens on my machine. Because x is never initialized, reading it is undefined behaviour, so any value may be printed.)
C has evolved over the years concerning the use of declarations in a block.
In ancient C, variables could be declared only at the beginning of a block (the piece of code between { and }), but this restriction has been dropped in favour of declaring a variable wherever you need it (even after some executable statements following the start of a block). A case label, however, may only be followed by an executable statement, not a declaration, and that is the reason for your compiler error.
If you follow the ancient C way, you can declare new local variables only right after the opening { curly brace of the switch, as in:
switch(something) {
int var1;
case BLA_BLA:
/* use var1 here */
which, although counterintuitive, is valid since the old K&R code. The problem with this approach is that the variable is valid from the point of definition until the end of the switch statement, and so, it is global to all the case parts.
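A small sketch of that older style, with hypothetical names: the declaration sits before the first case label, is never "executed", and is shared by every case, so it is assigned inside each case rather than initialized at the declaration.

```c
/* Scales v according to kind; tmp is visible in all case parts. */
int scale(int kind, int v)
{
    switch (kind) {
        int tmp;          /* no initializer: control never passes through here */
    case 0:
        tmp = v * 2;
        return tmp;
    case 1:
        tmp = v * 3;      /* same object name, shared across the cases */
        return tmp;
    default:
        return v;
    }
}
```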
Another way is the form you propose, in which you open a new block with curly braces. This has also worked since old K&R code, and it makes it easier to control the scope of the variables defined. Personally, I prefer this second approach. A block is an executable statement, so there's no problem in using it as the labeled case statement of the switch (the declarations happen inside it).
Case labels don't delimit blocks of code; they label executable statements, so their syntax is tied to the labeled statement, which ends at the semicolon of the statement the label is attached to, or at the closing curly brace of a compound statement.
Are inner-scopes ever used in C, or is this similar to something like a goto statement that isn't used too much in production code? The only thing I can think of that might use it is making something volatile temporarily, for example:
#include <stdio.h>
int main(void)
{
int a=1;
const int b=4;
printf("a=%d, b=%d\n", a, b);
{
int b=5;
printf("a=%d, b=%d\n", a, b);
}
}
But that seems like a pretty non-practical example. How would these be used in practice?
One case when blocks are required is defining local variables in a switch statement.
switch(foo())
{
case 0:
printf("no '{}' block generally required, except...\n");
break;
case 1:
{
int n = bar();
printf("%d\n", (n + 1) * n);
}
break;
//...
}
Without the {} block, the code would not compile because case 1: expects a statement right after, and declarations are not statements in C.
(Incidentally, the block is usually required in C++, too, though for an entirely different reason, because initialization of n would be skipped by other case labels.)
Are inner-scopes ever used in C, or is this similar to something like a goto statement that isn't used too much in production code?
Not so much by themselves, but inner scopes come up naturally all the time as the bodies of control statements such as loops and conditional statements. For example, a variable declared inside the body of a loop goes out of scope at the end of each iteration of the loop and is instantiated again during the next iteration.
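A minimal sketch of that behaviour (the function name is made up): step is a fresh object in every iteration, so the increment applied in one pass never carries over to the next.

```c
/* Adds 1 + 2 + ... + n using a variable that is re-created on each pass. */
int sum_with_fresh_step(int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        int step = 1;     /* instantiated and initialized again every iteration */
        step += i;        /* this change is gone when the iteration ends */
        total += step;
    }
    return total;
}
```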
We don't use them very often in production code, but they are very useful when you need a variable's scope to be very specific.
As for a use case: suppose you are writing production code and you want the scope of some of your variables to be tightly limited. In these circumstances, we just use a block { ... } to limit the variable's scope.
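As an illustrative sketch (hypothetical function, assuming n is at least 1), two bare blocks let the same temporary name be reused without one phase leaking into the other:

```c
/* Returns the difference between the largest and smallest element of v. */
int min_max_spread(const int *v, int n)
{
    int lo, hi;
    {
        int best = v[0];              /* visible only in this block */
        for (int i = 1; i < n; i++)
            if (v[i] < best)
                best = v[i];
        lo = best;
    }
    {
        int best = v[0];              /* a different object with the same name */
        for (int i = 1; i < n; i++)
            if (v[i] > best)
                best = v[i];
        hi = best;
    }
    return hi - lo;
}
```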
I would like to return an error code using longjmp, and pass it on from the function that called setjmp. Simplified code:
int do_things(stuff ........)
{
int error_code;
jmp_buf jb;
if ((error_code = setjmp(jb)) == 0) {
/* do stuff */
return 0;
}
else {
return error_code;
}
}
But I've read:
"An invocation of the setjmp macro shall appear only in one of the following contexts:"
the entire controlling expression of a selection or iteration statement
if (setjmp(jb)) {
switch (setjmp(jb)) {
while (setjmp(jb)) {
or
one operand of a relational or equality operator with the other operand
an integer constant expression, with the resulting expression being
the entire controlling expression of a selection or iteration statement
if (setjmp(jb) < 3) {
or
the operand of a unary ! operator with the resulting
expression being the entire controlling expression of a
selection or iteration statement
if (!setjmp(jb)) {
or
the entire expression of an expression statement (possibly cast to void).
setjmp(jb);
Is there a nice way to get the return value?
(without using switch and writing a case for every possible value)
EDIT
Thanks to Matt for finding it in the c99 rationale.
What I came up with now, is:
int do_things(stuff ........)
{
volatile int error_code;
jmp_buf jb;
if (setjmp(jb) == 0) {
working_some(&error_code, ....);
working_again(&error_code, ....);
working_more(&error_code, ....);
working_for_fun(&error_code, ....);
return 0;
}
else {
general_cleanup();
return error_code;
}
}
One more variable, doesn't seem very nice...
From the C99 rationale:
One proposed requirement on setjmp is that it be usable like any other function, that is, that it
be callable in any expression context, and that the expression evaluate correctly whether the
return from setjmp is direct or via a call to longjmp. Unfortunately, any implementation of
setjmp as a conventional called function cannot know enough about the calling environment to
save any temporary registers or dynamic stack locations used part way through an expression evaluation. (A setjmp macro seems to help only if it expands to inline assembly code or a call
to a special built-in function.) The temporaries may be correct on the initial call to setjmp, but
are not likely to be on any return initiated by a corresponding call to longjmp. These
considerations dictated the constraint that setjmp be called only from within fairly simple
expressions, ones not likely to need temporary storage.
An alternative proposal considered by the C89 Committee was to require that implementations
recognize that calling setjmp is a special case, and hence that they take whatever precautions
are necessary to restore the setjmp environment properly upon a longjmp call. This proposal
was rejected on grounds of consistency: implementations are currently allowed to implement
library functions specially, but no other situations require special treatment.
My interpretation of this is that it was considered to be too restrictive to specify that a = setjmp(jb); must work. So the Standard leaves it undefined. But a particular compiler may choose to support this (and hopefully, would document it). To be portable, I guess you should use some preprocessor checks to verify that the code is being compiled with a compiler that is known to support this.
POSIX.1 indeed specifies these limitations. However, on Linux, setjmp(3) doesn't mention them, so on that platform you can simply do:
int retval = setjmp(jb);
Of course this comes at the cost of some portability, but I don't know how bad it is.
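For comparison, here is a minimal sketch that stays within the contexts the standard permits, along the lines of the asker's edit. The worker function and the error value 42 are invented for the illustration; the error code travels through a separate volatile variable instead of the value of setjmp.

```c
#include <setjmp.h>

/* Hypothetical worker: records an error code and jumps back on bad input. */
static void work(jmp_buf env, volatile int *error_code, int input)
{
    if (input < 0) {
        *error_code = 42;   /* made-up error value */
        longjmp(env, 1);    /* setjmp in the caller now returns 1 */
    }
}

int do_things(int input)
{
    jmp_buf jb;
    volatile int error_code = 0;   /* volatile: modified between setjmp and longjmp */

    /* setjmp compared with an integer constant, as the entire controlling
     * expression of the if: one of the permitted contexts. */
    if (setjmp(jb) == 0) {
        work(jb, &error_code, input);
        return 0;
    } else {
        return error_code;
    }
}
```

The extra variable is the price of portability here; the value passed to longjmp is used only as a "something failed" flag.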
Here's an example of a macro that wraps iterator functions in C,
Macro definition:
/* helper macros for iterating over tree types */
#define NODE_TREE_TYPES_BEGIN(ntype) \
{ \
GHashIterator *__node_tree_type_iter__ = ntreeTypeGetIterator(); \
for (; !BLI_ghashIterator_done(__node_tree_type_iter__); BLI_ghashIterator_step(__node_tree_type_iter__)) { \
bNodeTreeType *ntype = BLI_ghashIterator_getValue(__node_tree_type_iter__);
#define NODE_TREE_TYPES_END \
} \
BLI_ghashIterator_free(__node_tree_type_iter__); \
} (void)0
Example use:
NODE_TREE_TYPES_BEGIN(nt)
{
if (nt->ext.free) {
nt->ext.free(nt->ext.data);
}
}
NODE_TREE_TYPES_END;
However nested use (while functional), causes shadowing (gcc's -Wshadow)
NODE_TREE_TYPES_BEGIN(nt_a)
{
NODE_TREE_TYPES_BEGIN(nt_b)
{
/* do something */
}
NODE_TREE_TYPES_END;
}
NODE_TREE_TYPES_END;
The only way I can think of to avoid this is to pass a unique identifier to NODE_TREE_TYPES_BEGIN and NODE_TREE_TYPES_END. So my question is...
Is there a way to prevent shadowing of variables declared within an iterator macro when its scope is nested?
You don't need to insert the same unique identifier in two places, if you can restructure the block so that it never needs the second macro to close it - then you only have one macro invocation and can use simple solutions like __LINE__ or __COUNTER__.
You can restructure the block by taking further advantage of for, to insert operations intended to happen after the block, in a position textually before it:
#define NODE_TREE_TYPES(ntype) \
for (GHashIterator *__node_tree_type_iter__ = ntreeTypeGetIterator(); \
__node_tree_type_iter__; \
(BLI_ghashIterator_free(__node_tree_type_iter__), __node_tree_type_iter__ = NULL)) \
for (bNodeTreeType *ntype = NULL; \
(ntype = BLI_ghashIterator_getValue(__node_tree_type_iter__), !BLI_ghashIterator_done(__node_tree_type_iter__)); \
BLI_ghashIterator_step(__node_tree_type_iter__))
The outer level of your original macro pairs is a compound statement, containing exactly three things: a declaration+initialization, an enclosed for structure, and a single free operation after which the declared variable is not used again.
This makes it very easy to restructure as a for of its own instead of an explicit compound statement: the declaration+initialization goes in the first clause of the for (wouldn't be as easy if you'd had two variables, although it is still possible); the enclosed for can be placed after the end of the for header we're building, since it's a single statement; and the free operation is placed in the third clause. Since the variable is not used in any further statements, we can take advantage of it: combine the free with an explicit assignment of NULL, using the comma operator, and then make the middle clause a check that the variable is not NULL, ensuring the loop runs exactly once.
The nested for gets a similar but more minor modification. Its statement body contains a declaration and per-loop initialization, but we can still hoist this out; put the declaration in the unused first clause of the for (which will still put it in the new scope), and initialize it in the second clause so that it happens at the start of every iteration; combine that initialization with the actual test using the comma operator again. This removes all boilerplate from the statement block and therefore means you no longer have any braces, and thus no need for a second macro to close the braces.
Then you have a single macro invocation you can use like this:
NODE_TREE_TYPES (nt) {
if (nt->ext.free) {
nt->ext.free(nt->ext.data);
}
}
(you can then apply the generation of a unique identifier to this to get rid of shadowing easily, using techniques shown in other questions)
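One common way to generate such an identifier is to paste __LINE__ onto a prefix. The sketch below is not the Blender macro; it applies the idea to a made-up FOR_RANGE macro whose hidden loop bound gets a different name on every source line, so nested uses on separate lines do not shadow each other.

```c
/* Two-level concatenation so that __LINE__ is expanded before pasting. */
#define CONCAT_INNER(a, b) a##b
#define CONCAT(a, b) CONCAT_INNER(a, b)
#define UNIQUE_NAME(prefix) CONCAT(prefix, __LINE__)

/* Hypothetical iterator macro: var runs over [0, n); limit is the hidden bound. */
#define FOR_RANGE_IMPL(var, n, limit) \
    for (int limit = (n), var = 0; var < limit; var++)
#define FOR_RANGE(var, n) FOR_RANGE_IMPL(var, n, UNIQUE_NAME(range_limit_))

/* Nested use: each invocation is on its own line, so the hidden names differ. */
int count_cells(int rows, int cols)
{
    int count = 0;
    FOR_RANGE(r, rows) {
        FOR_RANGE(c, cols) {
            count++;
        }
    }
    return count;
}
```

Two invocations on the same source line would still collide; __COUNTER__ avoids that, but it is a compiler extension rather than standard C.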
Is this ugly? Does abusing the for statement and comma operator make the average C programmer's skin crawl? Oh lord yes. BUT, it's a bit cleaner, and it's arguably the "right" way to mess about if you really have to mess about.
Having a "close" macro that inserts compound-statement-breaks or hides close braces is a much worse idea, because not only does it give you problems with identifiers and matching scope, but it also hides the block structure of the program from the reader; abuse of the for statement at least means that the block structure of the program, and variable scope and so on, is not mutilated as well.
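The run-once for trick described in this answer can also be shown in isolation. This is a sketch with invented names (WITH_BUFFER, fill_and_sum), using malloc/free in place of the hash iterator: the resource is acquired in the first clause, the third clause releases it and sets the pointer to NULL, and the middle clause then ends the loop after a single pass.

```c
#include <stdlib.h>

/* Runs the following block exactly once with buf allocated, then frees it.
 * If malloc fails, the block is skipped entirely. */
#define WITH_BUFFER(buf, size) \
    for (char *buf = malloc(size); buf; (free(buf), buf = NULL))

/* Fills a scratch buffer with 0..n-1 and returns the sum, or -1 on allocation failure. */
int fill_and_sum(int n)
{
    int sum = -1;
    WITH_BUFFER(scratch, (size_t)n) {
        sum = 0;
        for (int i = 0; i < n; i++) {
            scratch[i] = (char)i;
            sum += scratch[i];
        }
    }
    return sum;
}
```

One caveat of this design: a break or return inside the block skips the third clause, so the buffer would leak.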