Nitpicking booleans in C

I was reading comp.lang.c's description of boolean values, pre-C99. It mentions that some people prefer to define their own boolean values as:
#define TRUE (1==1)
#define FALSE (!TRUE)
However, the standard defines the equality operator to always return a signed int with a value of 1 when two values compare equal (C11 6.5.9), and the logical not operator to return an int with a value of 0 if its operand compares unequal to 0 (C11 6.5.3.3).
If this is the case, and the above definitions use literals, won't the evaluation happen at compile time, making the resulting definitions:
#define TRUE (1)
#define FALSE (0)
And a follow-up question. Is there any case where it makes sense to define the true and false labels to anything other than 1 and 0, respectively?
And pardon that I reference C11 when my question concerns C89; I only have the C11 standard at hand.

(1==1) and (!TRUE) are useful definitions on some compilers (I don't have a concrete example off the top of my head) that track whether an integer came from a boolean comparison. This enables them to warn for
if (i)
while at the same time not warning for
if (i != 0)
and also not warning for
j = i != 0;
if (j)
even though in all three cases, the conditional is a non-constant int.
This way, no warning would be generated for int b = TRUE;...if (b), since b would be considered a truth-integer.
You can make a legitimate argument that such warnings are useless, but others can make an equally legitimate argument that they have value. The scheme will produce many false positives in common code, but it may make code more readable if the code is written in a way that avoids such warnings.
At the same time, such definitions are harmless for other compilers that do not track this, since they just see constant expressions that evaluate to 1 and 0.
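As a sketch of the idea (hypothetical, since no shipping compiler is named above): with these definitions, every use of TRUE and FALSE is itself a comparison result, so a checker that tracks boolean provenance would stay quiet for all of the following.
#define TRUE  (1==1)
#define FALSE (!TRUE)

int f(int i)
{
    int j = (i != 0);   /* j is derived from a comparison: a "truth-integer" */
    if (j)              /* no warning: boolean provenance is known */
        return TRUE;    /* no warning: TRUE is itself a comparison result */
    return FALSE;
}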

Related

Does the C standard explicitly indicate truth value as 0 or 1?

We know that any number not equal to 0 is viewed as true in C, so we can write:
int a = 16;
while (a--)
    printf("%d\n", a); // prints numbers from 15 to 0
However, I was wondering whether true / false are defined as 1/0 in C, so I tried the code below:
printf("True = %d, False = %d\n", (0 == 0), (0 != 0)); // prints: True = 1, False = 0
Does C standard explicitly indicate the truth values of true and false as 1 and 0 respectively?
Does the C standard explicitly indicate the truth values of true and false as 0 and 1 respectively?
The C standard defines true and false as macros in stdbool.h which expand to 1 and 0 respectively.
C11-§7.18:
The remaining three macros are suitable for use in #if preprocessing directives. They are true, which expands to the integer constant 1, [and] false, which expands to the integer constant 0 [...]
For the operators == and !=, the standard says:
C11-§6.5.9/3:
The == (equal to) and != (not equal to) operators are analogous to the relational operators except for their lower precedence.108) Each of the operators yields 1 if the specified relation is true and 0 if it is false. The result has type int. For any pair of operands, exactly one of the relations is true.
It is not explicitly indicated in C11. All language-level operations will return 1 for true (and accept any nonzero value, including NaN, as true).
If you are concerned about _Bool, then true must be 1, because the standard only requires it to hold 0 and 1 (§6.2.5/2).
Also in <stdbool.h> the macro true expands to 1 (§7.18/3)
==, !=, <, >, <= and >= return 0 or 1 (§6.5.8/6, §6.5.9/3).
!, && and || return 0 or 1 (§6.5.3.3/5, §6.5.13/3, §6.5.14/3)
defined expands to 0 or 1 (§6.10.1/1)
But standard library functions, e.g. islower, just say "nonzero" for true (e.g. §7.4.1/1, §7.17.5.1/3, §7.30.2.1/1, §7.30.2.2.1/4). See the demo after the quotes below.
§6.2.5/2: An object declared as type _Bool is large enough to store the values 0 and 1.
§6.5.3.3/5: The result of the logical negation operator ! is 0 if the value of its operand compares unequal to 0, 1 if the value of its operand compares equal to 0. …
§6.5.8/6: Each of the operators < (less than), > (greater than), <= (less than or equal to), and >= (greater than or equal to) shall yield 1 if the specified relation is true and 0 if it is false.107) …
§6.5.9/3: The == (equal to) and != (not equal to) operators are analogous to the relational operators except for their lower precedence.108) Each of the operators yields 1 if the specified relation is true and 0 if it is false. …
§6.5.13/3: The && operator shall yield 1 if both of its operands compare unequal to 0; …
§6.5.14/3: The || operator shall yield 1 if either of its operands compare unequal to 0; …
§6.10.1/1: … it may contain unary operator expressions of the form — defined identifier — or — defined ( identifier ) — which evaluate to 1 if …
§7.4.1 (Character classification functions)/1: The functions in this subclause return nonzero (true) if and only if …
§7.18/3: The remaining three macros are suitable for use in #if preprocessing directives. They are — true — which expands to the integer constant 1, …
§7.17.5.1/3: The atomic_is_lock_free generic function returns nonzero (true) if and only if the object’s operations are lock-free. …
§7.30.2.1 (Wide character classification functions)/1: The functions in this subclause return nonzero (true) if and only if …
§7.30.2.2.1/4: The iswctype function returns nonzero (true) if and only if …
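A short program (my own illustration, not from the standard) that contrasts the two guarantees: language operators yield exactly 0 or 1, while library predicates such as islower only promise nonzero.
#include <stdio.h>
#include <ctype.h>

int main(void)
{
    /* Operators are guaranteed to yield exactly 0 or 1 ... */
    printf("%d %d %d\n", 5 == 5, 5 < 3, !0);   /* prints: 1 0 1 */
    printf("%d %d\n", 2 && 3, 0 || 7);         /* prints: 1 1 */

    /* ... but library predicates only promise "nonzero" for true,
       so normalize before comparing against 1: */
    printf("%d\n", islower('a') != 0);         /* prints: 1 */
    return 0;
}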
There are two areas of the standard you need to be aware with when dealing with Boolean values (by which I mean true/false values rather than the specific C bool/_Bool type) in C.
The first has to do with the result of expressions and can be found in various portions of C11 6.5 Expressions (relational and equality operators, for example). The bottom line is that, whenever a Boolean value is generated by an expression, it ...
... yields 1 if the specified relation is true and 0 if it is false. The result has type int.
So, yes, the result of any Boolean-generating expression will be one for true, or zero for false. This matches what you will find in stdbool.h where the standard macros true and false are defined the same way.
Keep in mind however that, following the robustness principle of "be conservative in what you send, liberal in what you accept", the interpretation of integers in the Boolean context is somewhat more relaxed.
Again, from various parts of 6.5, you'll see language like:
The || operator shall yield 1 if either of its operands compare unequal to 0; otherwise, it yields 0. The result has type int.
From that (and other parts), it's obvious that zero is considered false and any other value is true.
As an aside, the language specifying what values are used for Boolean generation and interpretation also appears back in C99 and C89, so it has been around for quite some time. Even K&R (the ANSI C second edition and the first edition) specified this, with text segments such as:
Relational expressions like i > j and logical expressions connected by && and || are defined to have value 1 if true, and 0 if false.
In the test part of if, while, for, etc, "true" just means "non-zero".
The && operator ... returns 1 if both its operands compare unequal to zero, 0 otherwise.
The || operator ... returns 1 if either its operands compare unequal to zero, and 0 otherwise.
The macros in stdbool.h appear back in C99 as well, but not in C89 or K&R since that header file did not exist at that point.
You are mixing up a lot of different things: control statements, operators and boolean types. Each have their own rules.
Control statements work like, for example, the if statement, C11 6.8.4.1:
In both forms, the first substatement is executed if the expression compares unequal to 0.
while, for, etc. have the same rule. This has nothing to do with "true" or "false".
As for operators that are supposedly yielding a boolean result, they are actually yielding an int with value 1 or 0. For example the equality operators, C11 6.5.9:
Each of the operators yields 1 if the specified relation is true and 0
if it is false
All of the above is because C did not have a boolean type until the year 1999, and even when it got one, the above rules weren't changed. So unlike most other programming languages, where statements and operators yield a boolean type (as in C++ and Java), in C they just yield an int, with a value that is zero or nonzero. For example, sizeof(1==1) will give 4 in C but 1 in C++.
The actual boolean type in C is named _Bool and requires a modern compiler. The header stdbool.h defines macros bool, true and false, that expand to _Bool, 1 and 0 respectively (for compatibility with C++).
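A quick way to check the sizeof claim above (a minimal sketch; the printed sizes are typical, not guaranteed by the standard):
#include <stdio.h>
#include <stdbool.h>   /* C99: bool, true, false */

int main(void)
{
    /* In C, 1 == 1 has type int, so this typically prints 4.
       Compiled as C++, the same expression has type bool (size 1). */
    printf("sizeof(1 == 1) = %zu\n", sizeof(1 == 1));

    /* In C99 through C17, true expands to the integer constant 1,
       so it is also int-sized here. */
    printf("sizeof(true)   = %zu\n", sizeof(true));
    return 0;
}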
It is however considered good programming practice to treat control statements and operators as if they actually required/yielded a boolean type. Certain coding standards like MISRA-C recommend such practice. That is:
if(ptr == NULL) instead of if(ptr).
if((data & mask) != 0) instead of if(data & mask).
The aim of such style is to increase type safety with the aid of static analysis tools, which in turn reduces bugs. Arguably, this style is only meaningful if you do use static analysers. Though in some cases it leads to more readable, self-documenting code, for example
if(c == '\0')
Good, the intent is clear, the code is self-documenting.
versus
if(c)
Bad. Could mean anything, and we have to go look for the type of c to understand the code. Is it an integer, a pointer or a character?
I've programmed in many languages, and I've seen true be 1 or -1 depending on the language. The logic behind true being 1 was that a bit was either 0 or 1. The logic behind true being -1 was that the NOT operator was a one's complement: it changed all the 1s to 0s and all the 0s to 1s in an int, so for an int, NOT 0 = -1 and NOT(-1) = 0. This has tripped me up enough that I don't compare something == true, but instead compare it != false. That way, my programming style works in every language. So my answer is not to worry about it, but to program so that your code works correctly either way.
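In C, that habit looks something like the sketch below (isdigit here is just one example of a routine that only promises a nonzero result for truth):
#include <ctype.h>
#include <stdio.h>

int main(void)
{
    int c = '7';

    if (isdigit(c) != 0)      /* robust: compare against false/zero */
        puts("digit");

    /* Fragile alternative: isdigit() may return any nonzero value,
       so comparing it to a particular "true" value can fail:
       if (isdigit(c) == 1) ...
    */
    return 0;
}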
This answer needs to be looked at a bit more closely.
The actual definition in C++ is that anything not 0 is treated as true. Why is this relevant? Because C++ doesn't know what an integer is by how we think about it; we create that meaning. All it holds is the shell and the rules for what that means. It does know what bits are, though, and bits are what make up an integer.
1 as an integer is loosely represented in bits, say in an 8-bit signed int, as 0000 0001. Much of what we see visually is a bit of a lie; -1 is a much more common way to represent "true" because of the signed nature of 'integer'. 1 can't really mean true proper. Why? Because its NOT operation is 1111 1110, and that's a really major issue for a boolean. When we talk about a boolean, it's just 1 bit: 0 is false and 1 is true, and all the logic operations hold trivially. This is why '-1' should be designated as 'true' for signed integers: 1111 1111 NOTed becomes 0000 0000, so the logic holds and we're good. Unsigned ints are a little trickier and were a lot more commonly used in the past; there, 1 means true because it's easy to imply the logic that 'anything not 0 is true'.
That's the explanation. I'd say the accepted answer here is wrong: there is no clear definition in the C/C++ definition. A boolean is a boolean; you can treat an integer as a boolean, but the fact that the output is an integer says nothing about the operation actually being done being bitwise.
It happens because of the equality operators in your printf statement, == and !=.
Since (0 == 0) holds true, it yields the value 1, whereas (0 != 0) does not hold, so it yields the value 0.
I think I might have found the perfect solution to your problem.
Yes, 0 is false and any non-zero number is true, though C had no boolean data type before C99.
But this is not the problem; the actual problem is how you are dealing with the modification of the variable a in your code:
int a = 16;
while (a--) {
    printf("%d\n", a);
}
When execution reaches the while (condition) statement, the value of a is first read for the condition, and then the arithmetic operation takes place, in this case a = a - 1 (a -= 1). So there will eventually be a pass where a == 1: the condition is satisfied, the decrement a-- makes a == 0, and the print statement prints a as 0.
This scenario depends on whether you use --a or a--. For --a, the decrement is performed first and then the new value is read for the condition; for a--, the old value is read first and then the decrement is performed.
So with --a, when a == 1 the decrement happens first (a becomes 0) and then a is evaluated for the condition, which comes out false. Try the code below:
int a = 16;
while (--a) {
    printf("%d\n", a); // prints numbers from 15 to 1 as intended
}
Or handle the modification of a inside the while loop body:
int a = 16;
while (a) {
    a = a - 1; // or a -= 1
    printf("%d\n", a); // also prints numbers from 15 to 1 as intended
}

#define TRUE !FALSE vs #define TRUE 1

Putting aside the fact that stdbool.h has existed since C99, when defining macros to handle Boolean types in C, is there any difference between the following?
#define FALSE 0
#define TRUE 1 // Option 1
#define TRUE !FALSE // Option 2
From the live example here, it doesn't seem to make a difference. Is there a technical benefit to either option? (Not including the fact that the second example would work better with C++ bool objects.)
ISO C and C99 both define ! like so.
The result of the logical negation operator ! is 0 if the value of its operand compares unequal to 0, 1 if the value of its operand compares equal to 0. The result has type int. The expression !E is equivalent to (0==E).
So !0 evaluates to 1. Given a standards compliant C compiler both your options will have the same result. In addition there's no runtime penalty, compilers will constant fold !0 to 1 at compile time.
If you want to take this to the logical extreme and make no assumptions about what true or false are...
#define TRUE (1==1)
#define FALSE (!TRUE)
This has the advantage of always being true no matter the language. For example, in shell 0 is usually considered "true" or "not an error".
This sort of thing is an anachronism from a time when C did not have an agreed upon standard. For example, the first edition of Code Complete advocates this on page 369. When it was published back in 1993 there was a good chance your C compiler was not going to be ISO compliant and stdbool.h did not exist. "Code Complete" is also intended for the polyglot programmer working in many different languages. Some, like shell and Lisp, define truth differently.
There is no benefit to option 2, as ! 0 is guaranteed by the C standard to evaluate to 1.
Defining TRUE in that manner is a staple of old sources, presumably in an attempt to follow the style guide that calls for avoiding "magical constants" whenever possible.
#define FALSE 0
#define TRUE 1 // Option 1
#define TRUE !FALSE // Option 2
There is no difference in the values. Both 1 and !0 are constant expressions of type int with the same value, 1 (by the Standard's definition of the semantics of the ! operator).
There is a possible difference in that the second definition is not properly parenthesized. Remember that macro expansion is performed textually. Expanding an unparenthesized macro in the middle of an expression can lead to operator precedence problems. I've written up a contrived example here.
Since the unary ! operator has very high precedence, you're not likely to run into a problem. The only case I can think of is if you use it as a prefix to the indexing operator. For example, given:
int arr[] = { 10, 20 };
Option 1 gives:
TRUE[arr] == 20
while option 2 gives:
TRUE[arr] == 0
To see why, remember that array indexing is commutative (see this question and my answer), and that the indexing operator [] binds more tightly than !.
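Here is a compilable version of that example (the macro names are mine, chosen so both options can coexist); it shows the unparenthesized expansion being captured by the indexing operator:
#include <stdio.h>

#define FALSE        0
#define TRUE_PAREN   (!FALSE)   /* properly parenthesized */
#define TRUE_BARE    !FALSE     /* option 2 as written */

int main(void)
{
    int arr[] = { 10, 20 };

    /* (!0)[arr] is 1[arr], i.e. arr[1] */
    printf("%d\n", TRUE_PAREN[arr]);   /* prints 20 */

    /* !FALSE[arr] parses as !(arr[0]) because [] binds tighter than ! */
    printf("%d\n", TRUE_BARE[arr]);    /* prints !10, i.e. 0 */
    return 0;
}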
The lessons here are:
For any macro that's intended to be used as an expression, the entire macro definition should be enclosed in parentheses -- even if you can't think of a case where it would matter.
Keep It Simple. In C, 0 is the only false value, and 1 is the canonical true value. (Any non-zero value is "true", but the built-in "Boolean" operators always yield 0 or 1.) Using the ! operator to define TRUE in terms of FALSE (or vice versa) is just an unnecessary complication.
Use <stdbool.h> if you can. If you can't (because you're stuck with a pre-C99 compiler), I recommend this:
typedef enum { false, true } bool;
It's not quite the same as C99's _Bool / bool (conversions to this bool type aren't normalized to 0 or 1), but it's close enough for almost all purposes.
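A quick illustration of that caveat (my own example): nothing normalizes values stored in the enum-based bool, so a "truthy" value can still fail a comparison against true.
#include <stdio.h>

/* the pre-C99 fallback suggested above */
typedef enum { false, true } bool;

int main(void)
{
    bool b = 2;      /* legal, but b is neither false (0) nor true (1) */

    if (b)           /* nonzero, so this branch runs ... */
        puts("b is truthy");

    if (b == true)   /* ... yet this comparison fails: 2 != 1 */
        puts("b == true");
    else
        puts("b != true, despite being truthy");
    return 0;
}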
Not much difference.
#define TRUE 1 has a slight advantage over #define TRUE !FALSE in that 1 is a single item unaffected by operator precedence.
!FALSE could be made (!FALSE) to cope with arcane code that tries to apply ++, --, [], ., or -> to the macro, all of which bind more tightly than !.
In the C language, TRUE is properly defined as (!FALSE) because while zero (0) is FALSE and FALSE is zero (0), any other value is TRUE. You can use almost any variable as a boolean expression, and if it is non-zero the value of the expression is TRUE. A NULL pointer is zero for just that reason. So is the end-of-string character ('\0'). There is a great deal of code written to take advantage of that fact. Consider:
while ( *d++ = *s++ );
The copy will end when the end-of-string character is copied. This idiom is very common. Never mind the buffer size issues.
This is one reason why it is a bad idea to test for equality to TRUE if you do not have a modern dedicated boolean type where the only possible values are TRUE and FALSE. I suggest you make it a habit to test for inequality to FALSE anyway, for safety's sake. You may not always get to work with the new and shiny.
Not much difference, but I think #define TRUE 1 is better.
If you use #define TRUE !FALSE, FALSE is 0 and TRUE reads as "anything except 0".
If you use #define TRUE 1, FALSE is 0 and TRUE is 1. No problem.
They are just the same: !0 is 1, so !FALSE is 1.
#define TRUE !FALSE has no technical benefit at all, although it has existed for a long time and appears in many places.
#define TRUE !FALSE can also be misunderstood: one could think that TRUE represents every value that is not 0, when in fact only 1 equals TRUE; other values like 2, 3, 255... (which are != 0) do not equal TRUE.
To prevent this misunderstanding, many organizations require that #define TRUE !FALSE no longer be used, or that comparisons to TRUE be changed to comparisons against FALSE:
// Should not
if (var_bool == TRUE) {
    ...
}
// Should
if (var_bool != FALSE) {
    ...
}

Why #define TRUE (1==1) in a C boolean macro instead of simply as 1?

I've seen these definitions in C:
#define TRUE (1==1)
#define FALSE (!TRUE)
Is this necessary? What's the benefit over simply defining TRUE as 1, and FALSE as 0?
This approach will use an actual boolean type (and resolve to true and false) if the compiler supports one; specifically, C++.
However, it would be better to check whether C++ is in use (via the __cplusplus macro) and actually use true and false, as sketched below.
In a C compiler, this is equivalent to 0 and 1.
(Note that removing the parentheses will break that, due to operator precedence.)
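A sketch of that check (one common arrangement, not a canonical one):
/* Use the real boolean constants when compiled as C++,
   and comparison results when compiled as C: */
#ifdef __cplusplus
#define TRUE  true
#define FALSE false
#else
#define TRUE  (1==1)
#define FALSE (!TRUE)
#endif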
The answer is portability. The numeric values of TRUE and FALSE aren't important. What is important is that a statement like if (1 < 2) evaluates to if (TRUE) and a statement like if (1 > 2) evaluates to if (FALSE).
Granted, in C, (1 < 2) evaluates to 1 and (1 > 2) evaluates to 0, so as others have said, there's no practical difference as far as the compiler is concerned. But by letting the compiler define TRUE and FALSE according to its own rules, you're making their meanings explicit to programmers, and you're guaranteeing consistency within your program and any other library (assuming the other library follows C standards ... you'd be amazed).
Some History
Some BASICs defined FALSE as 0 and TRUE as -1. Like many modern languages, they interpreted any non-zero value as TRUE, but they evaluated boolean expressions that were true as -1. Their NOT operation was implemented by adding 1 and flipping the sign, because it was efficient to do it that way. So 'NOT x' became -(x+1). A side effect of this is that a value like 5 evaluates to TRUE, but NOT 5 evaluates to -6, which is also TRUE! Finding this sort of bug is not fun.
Best Practices
Given the de facto rules that zero is interpreted as FALSE and any non-zero value is interpreted as TRUE, you should never compare boolean-looking expressions to TRUE or FALSE. Examples:
if (thisValue == FALSE) // Don't do this!
if (thatValue == TRUE) // Or this!
if (otherValue != TRUE) // Whatever you do, don't do this!
Why? Because many programmers use the shortcut of treating ints as bools. They aren't the same, but compilers generally allow it. So, for example, it's perfectly legal to write
if (strcmp(yourString, myString) == TRUE) // Wrong!!!
That looks legitimate, and the compiler will happily accept it, but it probably doesn't do what you'd want. That's because the return value of strcmp() is
0 if yourString == myString
<0 if yourString < myString
>0 if yourString > myString
So the line above is true only when strcmp() happens to return exactly 1, which is just one possible "greater than" result.
The right way to do this is either
// Valid, but still treats int as bool.
if (strcmp(yourString, myString))
or
// Better: linguistically clear, compiler will optimize.
if (strcmp(yourString, myString) != 0)
Similarly:
if (someBoolValue == FALSE) // Redundant.
if (!someBoolValue) // Better.
return (x > 0) ? TRUE : FALSE; // You're fired.
return (x > 0); // Simpler, clearer, correct.
if (ptr == NULL) // Perfect: compares pointers.
if (!ptr) // Sleazy, but short and valid.
if (ptr == FALSE) // Whatisthisidonteven.
You'll often find some of these "bad examples" in production code, and many experienced programmers swear by them: they work, some are shorter than their (pedantically?) correct alternatives, and the idioms are almost universally recognized. But consider: the "right" versions are no less efficient, they're guaranteed to be portable, they'll pass even the strictest linters, and even new programmers will understand them.
Isn't that worth it?
The (1 == 1) trick is useful for defining TRUE in a way that is transparent to C, yet provides better typing in C++. The same code can be interpreted as C or C++ if you are writing in a dialect called "Clean C" (which compiles either as C or C++) or if you are writing API header files that can be used by C or C++ programmers.
In C translation units, 1 == 1 has exactly the same meaning as 1; and 1 == 0 has the same meaning as 0. However, in the C++ translation units, 1 == 1 has type bool. So the TRUE macro defined that way integrates better into C++.
An example of how it integrates better: if a function foo has overloads for int and for bool, then foo(TRUE) will choose the bool overload. If TRUE is just defined as 1, it won't work nicely in C++; foo(TRUE) will pick the int overload.
Of course, C99 introduced bool, true, and false and these can be used in header files that work with C99 and with C.
However:
this practice of defining TRUE and FALSE as (0==0) and (1==0) predates C99.
there are still good reasons to stay away from C99 and work with C90.
If you're working in a mixed C and C++ project, and don't want C99, define the lower-case true, false and bool instead.
#ifndef __cplusplus
typedef int bool;
#define true (0==0)
#define false (!true)
#endif
That being said, the 0==0 trick was (is?) used by some programmers even in code that was never intended to interoperate with C++ in any way. That doesn't buy anything and suggests that the programmer has a misunderstanding of how booleans work in C.
In case the C++ explanation wasn't clear, here is a test program:
#include <cstdio>

void foo(bool x)
{
    std::puts("bool");
}

void foo(int x)
{
    std::puts("int");
}

int main()
{
    foo(1 == 1);
    foo(1);
    return 0;
}
The output:
bool
int
As to the question from the comments of how overloaded C++ functions are relevant to mixed C and C++ programming: they just illustrate a type difference. A valid reason for wanting a true constant to be bool when compiled as C++ is clean diagnostics. At its highest warning levels, a C++ compiler might warn us about a conversion if we pass an integer as a bool parameter. One reason for writing in Clean C is not only that our code is more portable (since it is understood by C++ compilers, not only C compilers), but also that we can benefit from the diagnostic opinions of C++ compilers.
#define TRUE (1==1)
#define FALSE (!TRUE)
is equivalent to
#define TRUE 1
#define FALSE 0
in C.
The result of the relational and equality operators is 0 or 1. 1==1 is guaranteed to evaluate to 1, and !(1==1) is guaranteed to evaluate to 0.
There is absolutely no reason to use the first form. Note that the first form is, however, no less efficient, as on nearly all compilers a constant expression is evaluated at compile time rather than at run time. This is allowed according to this rule:
(C99, 6.6p2) "A constant expression can be evaluated during translation rather than runtime, and accordingly may be used in any place that a constant may be."
PC-Lint will even issue a message (506, constant value boolean) if you don't use a literal for TRUE and FALSE macros:
For C, TRUE should be defined to be 1. However, other languages use quantities other than 1 so some programmers feel that !0 is playing it safe.
Also in C99, the stdbool.h definitions for boolean macros true and false directly use literals:
#define true 1
#define false 0
Aside from C++ (already mentioned), another benefit is for static analysis tools. The compiler will do away with any inefficiencies, but a static analyser can use its own abstract types to distinguish between comparison results and other integer types, so it knows implicitly that TRUE must be the result of a comparison and should not be assumed to be compatible with an integer.
Obviously C says that they are compatible, but you may choose to prohibit deliberate use of that feature to help highlight bugs; for example, where somebody might have confused & and &&, or bungled their operator precedence.
There is no practical difference: 0 evaluates to false and 1 evaluates to true. Whether you use a boolean expression (1 == 1) or the literal 1 to define true makes no difference; both evaluate to int.
Notice that the C standard library provides a specific header for defining booleans: stdbool.h.
We don't know the exact value that TRUE is equal to, and compilers can have their own definitions. So what you provide for is using the compiler's internal definition. This is not always necessary if you have good programming habits, but it can avoid problems with some bad coding styles, for example:
if ( (a > b) == TRUE)
This could be a disaster if you manually define TRUE as 1 while the internal value of TRUE is something else.
Typically in the C programming language, 1 is defined as true and 0 is defined as false, hence you see the following quite often:
#define TRUE 1
#define FALSE 0
However, any number not equal to 0 would also evaluate to true in a conditional statement. Therefore, by using the below:
#define TRUE (1==1)
#define FALSE (!TRUE)
You can explicitly show that you are trying to play it safe by making FALSE equal to whatever isn't TRUE.

What is ":-!!" in C code?

I bumped into this strange macro code in /usr/include/linux/kernel.h:
/* Force a compilation error if condition is true, but also produce a
result (of value 0 and type size_t), so the expression can be used
e.g. in a structure initializer (or where-ever else comma expressions
aren't permitted). */
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
#define BUILD_BUG_ON_NULL(e) ((void *)sizeof(struct { int:-!!(e); }))
What does :-!! do?
This is, in effect, a way to check whether the expression e can be evaluated to be 0, and if not, to fail the build.
The macro is somewhat misnamed; it should be something more like BUILD_BUG_OR_ZERO, rather than ...ON_ZERO. (There have been occasional discussions about whether this is a confusing name.)
You should read the expression like this:
sizeof(struct { int: -!!(e); })
(e): Compute expression e.
!!(e): Logically negate twice: 0 if e == 0; otherwise 1.
-!!(e): Numerically negate the expression from step 2: 0 if it was 0; otherwise -1.
struct{int: -!!(0);} --> struct{int: 0;}: If it was zero, then we declare a struct with an anonymous integer bitfield that has width zero. Everything is fine and we proceed as normal.
struct{int: -!!(1);} --> struct{int: -1;}: On the other hand, if it isn't zero, then it will be some negative number. Declaring any bitfield with negative width is a compilation error.
So we'll either wind up with a bitfield that has width 0 in a struct, which is fine, or a bitfield with negative width, which is a compilation error. Then we take sizeof that struct, so we get a size_t with the appropriate value (which will be zero in the case where e is zero).
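To make the mechanics concrete, here is a sketch of how such a macro gets used inside an expression (the names other than BUILD_BUG_ON_ZERO are mine; a struct whose only member is an unnamed bit-field relies on GCC accepting it, as the kernel does):
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))

struct msg { char payload[8]; };

/* Compiles: the condition is false, so the macro contributes 0
   and MSG_SIZE is just sizeof(struct msg). */
enum { MSG_SIZE = sizeof(struct msg)
                + BUILD_BUG_ON_ZERO(sizeof(struct msg) != 8) };

/* Would fail to compile: the condition is true, giving int:-1.
enum { BAD = BUILD_BUG_ON_ZERO(sizeof(struct msg) != 16) };
*/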
Some people have asked: Why not just use an assert?
keithmo's answer here has a good response:
These macros implement a compile-time test, while assert() is a run-time test.
Exactly right. You don't want to detect problems in your kernel at runtime that could have been caught earlier! It's a critical piece of the operating system. To whatever extent problems can be detected at compile time, so much the better.
The : is a bitfield. As for !!, that is logical double negation and so returns 0 for false or 1 for true. And the - is a minus sign, i.e. arithmetic negation.
It's all just a trick to get the compiler to barf on invalid inputs.
Consider BUILD_BUG_ON_ZERO. When -!!(e) evaluates to a negative value, that produces a compile error. Otherwise -!!(e) evaluates to 0, and a struct containing only a 0-width bitfield has size 0. Hence the macro evaluates to a size_t with value 0.
The name is weak in my view because the build in fact fails when the input is not zero.
BUILD_BUG_ON_NULL is very similar, but yields a pointer rather than an int.
Some people seem to be confusing these macros with assert().
These macros implement a compile-time test, while assert() is a runtime test.
Well, I am quite surprised that the alternatives to this syntax have not been mentioned. Another common (but older) mechanism is to call a function that isn't defined and rely on the optimizer to compile out the function call if your assertion is correct.
#define MY_COMPILETIME_ASSERT(test) \
do { \
    extern void you_did_something_bad(void); \
    if (!(test)) \
        you_did_something_bad(); \
} while (0)
While this mechanism works (as long as optimizations are enabled), it has the downside of not reporting an error until link time, when it fails to find the definition for the function you_did_something_bad(). That's why kernel developers started using tricks like the negative bit-field widths and negative-sized arrays (the latter of which stopped breaking builds in GCC 4.4).
In sympathy for the need for compile-time assertions, GCC 4.3 introduced the error function attribute that allows you to extend upon this older concept, but generate a compile-time error with a message of your choosing -- no more cryptic "negative sized array" error messages!
#define MAKE_SURE_THIS_IS_FIVE(number) \
do { \
    extern void this_isnt_five(void) __attribute__((error( \
        "I asked for five and you gave me " #number))); \
    if ((number) != 5) \
        this_isnt_five(); \
} while (0)
In fact, as of Linux 3.9, we now have a macro called compiletime_assert which uses this feature, and most of the macros in bug.h have been updated accordingly. Still, this macro can't be used as an initializer. However, by using statement expressions (another GCC C extension), you can!
#define ANY_NUMBER_BUT_FIVE(number) \
({ \
    typeof(number) n = (number); \
    extern void this_number_is_five(void) __attribute__(( \
        error("I told you not to give me a five!"))); \
    if (n == 5) \
        this_number_is_five(); \
    n; \
})
This macro will evaluate its parameter exactly once (in case it has side-effects) and create a compile-time error that says "I told you not to give me a five!" if the expression evaluates to five or is not a compile-time constant.
So why aren't we using this instead of negative-sized bit-fields? Alas, there are currently many restrictions on the use of statement expressions, including their use as constant initializers (for enum constants, bit-field widths, etc.), even when the statement expression is itself completely constant (i.e., can be fully evaluated at compile time and otherwise passes the __builtin_constant_p() test). Further, they cannot be used outside of a function body.
Hopefully, GCC will amend these shortcomings soon and allow constant statement expressions to be used as constant initializers. The challenge here is the language specification defining what a legal constant expression is. C++11 added the constexpr keyword for just this type of thing, but no counterpart exists in C11. While C11 did get static assertions, which solve part of this problem, they won't address all of these shortcomings. So I hope that GCC can make constexpr functionality available as an extension via -std=gnu99 / -std=gnu11 or some such, and allow its use in statement expressions et al.
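For reference, the C11 static assertion mentioned above looks like this (my own example); it works at file scope and inside functions, though, unlike the sizeof trick, it is a declaration rather than an expression:
#include <assert.h>   /* provides the static_assert convenience macro in C11 */

struct msg { char payload[8]; };

/* Checked entirely at compile time; no code is generated. */
static_assert(sizeof(struct msg) == 8, "struct msg must stay 8 bytes");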
It's creating a size 0 bitfield if the condition is false, but a size -1 (-!!1) bitfield if the condition is true/non-zero. In the former case, there is no error and the struct is initialized with an int member. In the latter case, there is a compile error (and no such thing as a size -1 bitfield is created, of course).
