I bumped into this strange macro code in /usr/include/linux/kernel.h:
/* Force a compilation error if condition is true, but also produce a
result (of value 0 and type size_t), so the expression can be used
e.g. in a structure initializer (or where-ever else comma expressions
aren't permitted). */
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
#define BUILD_BUG_ON_NULL(e) ((void *)sizeof(struct { int:-!!(e); }))
What does :-!! do?
This is, in effect, a way to check whether the expression e can be evaluated to be 0, and if not, to fail the build.
The macro is somewhat misnamed; it should be something more like BUILD_BUG_OR_ZERO, rather than ...ON_ZERO. (There have been occasional discussions about whether this is a confusing name.)
You should read the expression like this:
sizeof(struct { int: -!!(e); })
1. (e): Compute expression e.
2. !!(e): Logically negate twice: 0 if e == 0; otherwise 1.
3. -!!(e): Numerically negate the expression from step 2: 0 if it was 0; otherwise -1.
struct{int: -!!(0);} --> struct{int: 0;}: If it was zero, then we declare a struct with an anonymous integer bitfield that has width zero. Everything is fine and we proceed as normal.
struct{int: -!!(1);} --> struct{int: -1;}: On the other hand, if it isn't zero, then it will be some negative number. Declaring any bitfield with negative width is a compilation error.
So we'll either wind up with a bitfield that has width 0 in a struct, which is fine, or a bitfield with negative width, which is a compilation error. Then we take sizeof that struct, so we get a size_t whose value is zero in the case where e is zero.
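To make the steps above concrete, here is a minimal sketch of the macro used in a file-scope initializer, where a statement-style check could not appear. TABLE_LEN, table_mask, and get_table_mask are made-up names for illustration, and the zero-sized struct relies on GCC-compatible behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Same definition as quoted from the kernel header above. */
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))

/* Hypothetical table length that must stay a power of two. */
#define TABLE_LEN 8

/* The macro contributes 0 when the check passes and breaks the build
 * when TABLE_LEN is not a power of two. */
static const size_t table_mask =
    (TABLE_LEN - 1) + BUILD_BUG_ON_ZERO(TABLE_LEN & (TABLE_LEN - 1));

size_t get_table_mask(void)
{
    return table_mask;
}
```

Changing TABLE_LEN to 6 should turn the bitfield width into -1 and stop the build with a "negative width" diagnostic.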
Some people have asked: Why not just use an assert?
keithmo's answer here has a good response:
These macros implement a compile-time test, while assert() is a run-time test.
Exactly right. You don't want to detect problems in your kernel at runtime that could have been caught earlier! It's a critical piece of the operating system. To whatever extent problems can be detected at compile time, so much the better.
The : is a bitfield. As for !!, that is logical double negation and so returns 0 for false or 1 for true. And the - is a minus sign, i.e. arithmetic negation.
It's all just a trick to get the compiler to barf on invalid inputs.
Consider BUILD_BUG_ON_ZERO. When -!!(e) evaluates to a negative value, that produces a compile error. Otherwise -!!(e) evaluates to 0, and a 0 width bitfield has size of 0. And hence the macro evaluates to a size_t with value 0.
The name is weak in my view because the build in fact fails when the input is not zero.
BUILD_BUG_ON_NULL is very similar, but yields a pointer rather than a size_t.
Some people seem to be confusing these macros with assert().
These macros implement a compile-time test, while assert() is a runtime test.
Well, I am quite surprised that the alternatives to this syntax have not been mentioned. Another common (but older) mechanism is to call a function that isn't defined and rely on the optimizer to compile-out the function call if your assertion is correct.
#define MY_COMPILETIME_ASSERT(test) \
do { \
extern void you_did_something_bad(void); \
if (!(test)) \
you_did_something_bad(); \
} while (0)
While this mechanism works (as long as optimizations are enabled), it has the downside of not reporting an error until you link, at which time it fails to find the definition for the function you_did_something_bad(). That's why kernel developers started using tricks like negative bit-field widths and negative-sized arrays (the latter of which stopped breaking builds in GCC 4.4).
In sympathy for the need for compile-time assertions, GCC 4.3 introduced the error function attribute that allows you to extend upon this older concept, but generate a compile-time error with a message of your choosing -- no more cryptic "negative sized array" error messages!
#define MAKE_SURE_THIS_IS_FIVE(number) \
do { \
extern void this_isnt_five(void) __attribute__((error( \
"I asked for five and you gave me " #number))); \
if ((number) != 5) \
this_isnt_five(); \
} while (0)
In fact, as of Linux 3.9, we now have a macro called compiletime_assert which uses this feature, and most of the macros in bug.h have been updated accordingly. Still, this macro can't be used as an initializer. However, by using statement expressions (another GCC C extension), you can!
#define ANY_NUMBER_BUT_FIVE(number) \
({ \
typeof(number) n = (number); \
extern void this_number_is_five(void) __attribute__(( \
error("I told you not to give me a five!"))); \
if (n == 5) \
this_number_is_five(); \
n; \
})
This macro will evaluate its parameter exactly once (in case it has side-effects) and create a compile-time error that says "I told you not to give me a five!" if the expression evaluates to five or is not a compile-time constant.
So why aren't we using this instead of negative-sized bit-fields? Alas, there are currently many restrictions on the use of statement expressions, including their use as constant initializers (for enum constants, bit-field width, etc.), even if the statement expression is completely constant itself (i.e., can be fully evaluated at compile time and otherwise passes the __builtin_constant_p() test). Further, they cannot be used outside of a function body.
Hopefully, GCC will amend these shortcomings soon and allow constant statement expressions to be used as constant initializers. The challenge here is the language specification defining what is a legal constant expression. C++11 added the constexpr keyword for just this type of thing, but no counterpart exists in C11. While C11 did get static assertions, which will solve part of this problem, it won't solve all of these shortcomings. So I hope that GCC can make a constexpr functionality available as an extension via -std=gnu99 and -std=gnu11 or some such and allow its use on statement expressions et al.
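For comparison, here is a small sketch of the C11 static assertions mentioned above, assuming a compiler in C11 mode or later. The function name low_byte_to_high is invented for the example. A static assertion is a declaration, so it works at file scope and inside a function body, but it is not an expression that yields a value.

```c
#include <assert.h>
#include <limits.h>

/* File scope: checked at compile time, generates no code. */
_Static_assert(CHAR_BIT >= 8, "a byte must have at least 8 bits");

/* Hypothetical helper: moves the low byte of x into bits 8..15. */
unsigned low_byte_to_high(unsigned x)
{
    /* Block scope works too; the message is shown if the test fails. */
    _Static_assert(sizeof(unsigned) * CHAR_BIT >= 16,
                   "unsigned is too narrow for an 8-bit shift");
    return (x & 0xFFu) << 8;
}
```

Unlike BUILD_BUG_ON_ZERO, a bare _Static_assert cannot appear in the middle of an initializer expression, which is the part of the problem it leaves unsolved.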
It's creating a size 0 bitfield if the condition is false, but a size -1 (-!!1) bitfield if the condition is true/non-zero. In the former case, there is no error and the struct is declared with a zero-width unnamed int bitfield. In the latter case, there is a compile error (and no such thing as a size -1 bitfield is created, of course).
My system uses libc6 2.29. In /usr/include/assert.h I can find the definition of assert() macro:
/* The first occurrence of EXPR is not evaluated due to the sizeof,
but will trigger any pedantic warnings masked by the __extension__
for the second occurrence. The ternary operator is required to
support function pointers and bit fields in this context, and to
suppress the evaluation of variable length arrays. */
# define assert(expr) \
((void) sizeof ((expr) ? 1 : 0), __extension__ ({ \
if (expr) \
; /* empty */ \
else \
__assert_fail (#expr, __FILE__, __LINE__, __ASSERT_FUNCTION); \
}))
I wonder why to use the comma operator, and what is meant by 'The first occurrence of EXPR is not evaluated due to the sizeof'.
What problem there would be using the following definition:
# define assert(expr) \
({ \
if (expr) \
; /* empty */ \
else \
__assert_fail (#expr, __FILE__, __LINE__, __ASSERT_FUNCTION); \
})
Edit:
what value does the operator ({ }) get if expr is true?
Is it possible to rewrite the definition of assert() as follows?
# define assert(expr) \
((void) sizeof ((expr) ? 1 : 0), __extension__ ({ \
if (!expr) \
__assert_fail (#expr, __FILE__, __LINE__, __ASSERT_FUNCTION); \
}))
Where are the problems with this last definition?
I'm not 100% certain on this but I'll give it a go.
First, let's review a few things being used here.
The comma operator discards the first n-1 expression results and returns the nth result. It's often used as a sequence point, as it's guaranteed that the expressions will be evaluated in order.
The __extension__ keyword used here, which is a GCC extension rather than a glibc macro, suppresses warnings about GNU-specific extensions in the expression that follows it when compiling with pedantic warnings enabled (for example -pedantic, often combined with -ansi). Normally, using a compiler-specific extension in such a mode would produce a warning (or an error if you're running under -Werror, which is quite common). Since these headers are meant for use with GNU-compatible compilers, libc allows itself to use some extensions where it can safely do so.
Now, since the actual assertion logic might use a GNU extension (as is indicated by the use of __extension__), any real warnings that the expression passed to assert(expr) might have raised on its own would be masked, because that expression is located inside the __extension__ block.
Therefore, there needed to be a way to allow the compiler the chance to show those warnings, but without evaluating the actual expression (since the expression could have side effects and a double-evaluation could cause undesired behaviour).
You can do this by using the sizeof operator, which takes an expression, looks at its type, and finds the number of chars it takes up - without actually evaluating the expression.
For example, if we have a function int blow_up_the_world(), then the expression sizeof(blow_up_the_world()) would find the size of the result of the expression (in this case, int) without actually evaluating the expression. Using sizeof() in this case meant the world would, in fact, not be blown up.
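A quick way to convince yourself of this is a sketch like the following, where the function counts how often it actually runs. The counter and the two helper functions are assumptions added for the demonstration.

```c
#include <assert.h>
#include <stddef.h>

static int calls = 0;   /* how many times the function really ran */

int blow_up_the_world(void)
{
    ++calls;
    return 42;
}

/* sizeof only inspects the type of its operand (int here);
 * the call inside it is never executed. */
size_t size_without_calling(void)
{
    return sizeof(blow_up_the_world());
}

int call_count(void)
{
    return calls;
}
```

After calling size_without_calling(), the counter is still zero.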
However, if the expr passed to assert(expr) contained code that would otherwise trigger a compiler warning (e.g. using an extension under -pedantic or -ansi modes), the compiler would still show those warnings even though the code was inside the sizeof() - warnings that would otherwise be masked inside the __extension__ block.
Next, we see that instead of passing expr directly to sizeof, they use a ternary. The type of a ternary is the common type of its two result expressions, which here is int. This matters because passing certain things directly to sizeof either yields a run-time value, as with variable length arrays, which could have undesired effects, or produces an error, as when sizeof is applied to a function name.
Lastly, they wanted all of that to happen before the actual evaluation while keeping assert() an expression. So instead of using a do{}while() block or something similar, which would turn assert() into a statement, they used the comma operator to discard the result of the sizeof() trick.
({ is not standard C and will trigger warnings or errors in standard C compilation modes.
So they are using __extension__, which will disable any such diagnostics.
However __extension__ will also mask non-standard constructs in expr, which you do want diagnosed.
Which is why they need expr repeated twice, once inside __extension__ and once outside.
However expr only needs to be evaluated once.
So they inhibit another evaluation by placing the other occurrence of expr inside sizeof.
Just sizeof(expr) is not enough though, because it won't work for things like function names.
So sizeof((expr) ? 1 : 0) is used instead, which doesn't have this problem.
So the two parts of the generated expression are (a) sizeof((expr) ? 1 : 0) and (b) the __extension__(...) part.
The first part is only needed to produce diagnostics if something is wrong with the expr.
The second part does the actual assertion.
Finally, the two parts are connected with the comma operator.
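The two-part shape can be imitated in plain C as a rough sketch. MY_CHECK, record_failure, and next_value are hypothetical names, and the failure handler only counts failures instead of aborting, so that the behaviour can be observed. This is not glibc's implementation, just the same pattern without the statement expression.

```c
#include <assert.h>

static int failures = 0;     /* failed checks seen so far */
static int evaluations = 0;  /* how often next_value() really ran */

void record_failure(const char *expr_text)
{
    (void) expr_text;        /* a real handler would print this */
    ++failures;
}

int next_value(void)
{
    return ++evaluations;
}

/* (a) the sizeof part lets the compiler diagnose expr without running it;
 * (b) the conditional after the comma performs the single real evaluation. */
#define MY_CHECK(expr) \
    ((void) sizeof ((expr) ? 1 : 0), \
     (expr) ? (void) 0 : record_failure(#expr))

int demo_evaluations(void)
{
    MY_CHECK(next_value() == 1);  /* would fail if expr ran twice */
    return evaluations;
}

int demo_failures(void)
{
    MY_CHECK(1 == 2);             /* deliberately false */
    return failures;
}
```

If the first occurrence of expr were evaluated as well, next_value() would return 2 in the second occurrence and the first check would be recorded as a failure.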
I need to write a macro which traps any invalid index i for an array of length n. Here is what I got so far:
#define TRAP(i, n) (((unsigned int) (i) < (n))? (i): (abort(), 0))
The problem with this definition, however, is that the index expression i is evaluated twice; in the expression a[TRAP(f(), n)], for instance, f may have a side effect or take a long time to execute. I cannot introduce a temporary variable since the macro needs to expand to an expression. Also, defining TRAP as an ordinary function implies a run-time overhead and makes it harder for the compiler to optimize away the trap.
Is there a way to rewrite TRAP so that i is evaluated only once?
Edit: I'm using ANSI C89
You can evaluate once, and use the result, by doing something like this:
#define TRAP2(i, n) ({unsigned int _i = (i); _i < (n)? _i: (abort(), 0);})
This is a gcc specific solution, that will compile when used as the RHS of an assignment. It defines a (very) local variable, which might hide a prior definition of another variable, but that doesn't matter, as long as you don't try to use the prior version in the macro. But as people say, why do this in the first place?
Use the macro TRAP when the index expression doesn't contain a function call and use a (non-macro) function trap when it does. This way the function call overhead only occurs in the rarer latter case.
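A minimal sketch of the function variant in C89 might look like this. The names trap, next_index, and read_with_trap are assumptions made for the example. Because i is an ordinary parameter, the index expression is evaluated once by the caller.

```c
#include <assert.h>
#include <stdlib.h>

static int calls = 0;  /* counts how often the index source ran */

/* Hypothetical index source with a visible side effect. */
unsigned int next_index(void)
{
    ++calls;
    return 2;
}

/* Returns i if it is a valid index for an array of length n,
 * otherwise aborts the program. */
unsigned int trap(unsigned int i, unsigned int n)
{
    if (i >= n)
        abort();
    return i;
}

int read_with_trap(void)
{
    int a[4] = {10, 20, 30, 40};
    return a[trap(next_index(), 4)];
}

int next_index_calls(void)
{
    return calls;
}
```

Whether the call overhead matters in practice depends on the compiler; many compilers inline such a small function when optimizing.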
I was reading comp.lang.c's description of boolean values, pre-C99. It mentions that some people prefer to define their own boolean values as:
#define TRUE (1==1)
#define FALSE (!TRUE)
However, the standard defines the equality operator to always return a signed int with a value of 1 when two values compare equal (C11 - 6.5.9), and the logical not operator shall return an int with a value of 0 if the value compares unequal to 0 (C11 - 6.5.3.3).
If this is the case and the above definitions use literals, won't the evaluation happen at compile time and the resulting definitions be:
#define TRUE (1)
#define FALSE (0)
And a follow-up question. Is there any case where it makes sense to define the true and false labels to anything other than 1 and 0, respectively?
And pardon that I reference C11 when my question concerns C89 but I only have the C11 standard at hand.
(1==1) and (!TRUE) are useful definitions on some compilers (I don't have a concrete example off of the top of my head) that track whether an integer came from a boolean comparison. This enables them to warn for
if (i)
while at the same time not warning for
if (i != 0)
and also not warning for
j = i != 0;
if (j)
even though in all three cases, the conditional is a non-constant int.
This way, no warning would be generated for int b = TRUE;...if (b), since b would be considered a truth-integer.
You can make a legitimate argument that such warnings are useless, but others can make an equally legitimate argument that such warnings do have use. It will have many false positives in common code, but it may make code more readable if it is written in a way that avoids such warnings.
At the same time, such definitions are harmless for other compilers that do not track this, since they just see constant expressions that evaluate to 1 and 0.
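That these definitions are ordinary integer constant expressions can be checked with a short sketch. The enum and typedef names are made up for the example. Enum constants and array sizes both require constant expressions, so this only compiles if TRUE and FALSE are evaluated at compile time.

```c
#include <assert.h>

#define TRUE  (1==1)
#define FALSE (!TRUE)

/* Enumeration constants must be integer constant expressions. */
enum { TRUE_AS_ENUM = TRUE, FALSE_AS_ENUM = FALSE };

/* The array size would become -1, a compile error, if either
 * macro had a value other than the expected one. */
typedef char true_is_one[(TRUE == 1) ? 1 : -1];
typedef char false_is_zero[(FALSE == 0) ? 1 : -1];

int true_value(void)  { return TRUE_AS_ENUM; }
int false_value(void) { return FALSE_AS_ENUM; }
```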
#if sizeof(int) != 4
/* do something */
Using sizeof inside #if doesn't work while inside #define it works, why?
#define size(x) sizeof(x)/sizeof(x[0]) /*works*/
Nothing is evil - everything can be misused, or in your case misunderstood. The sizeof operator is a compiler feature, but compiler features are not available to the preprocessor (which runs before the compiler gets involved), and so cannot be used in #if preprocessor directives.
However, when you say:
#define size(x) sizeof(x)/sizeof(x[0])
and use it:
size(a)
the preprocessor performs a textual substitution that is handed to the compiler:
sizeof(a)/sizeof(a[0])
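As a small sketch of that substitution in action (element_count is a made-up helper, and the extra parentheses in the macro are my addition to keep it safe inside larger expressions):

```c
#include <assert.h>
#include <stddef.h>

/* Text replacement only: the preprocessor never evaluates sizeof. */
#define size(x) (sizeof(x) / sizeof((x)[0]))

size_t element_count(void)
{
    int a[10] = {0};
    /* After preprocessing this reads (sizeof(a) / sizeof((a)[0])),
     * which the compiler evaluates as a constant. */
    return size(a);
}
```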
C "Preprocessor" Macros Only Evaluate Constants and Other Macros
The short answer is a preprocessor expression only provides a meaningful evaluation of an expression composed of other preprocessor macros and constants.
Try this, you will not get an error:
#if sizeof < 2
int f(int x) { return x; }
#endif
If you generate assembly, you will find that sizeof < 2 compiles the function and sizeof >= 2 does not. Neither returns an error.
What's going on? It turns out that, except for preprocessor macros themselves, all identifiers in a preprocessor ("macro") expression are replaced with 0. So the above #if is the same as saying:
#if Easter_Bunny < 2
or
#if 0 < 2
This is why you don't actually get any sort of error when mistakenly using the sizeof operator in a preprocessor expression.
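The identifier-becomes-0 rule can be observed with a sketch like this one. The macro and function names are invented for the example, and with GCC the -Wundef option should warn about both conditions.

```c
#include <assert.h>

/* Easter_Bunny is not a macro, so #if treats it as 0. */
#if Easter_Bunny < 2
#define FIRST_BRANCH_TAKEN 1
#else
#define FIRST_BRANCH_TAKEN 0
#endif

/* In a C #if expression, sizeof is just another identifier, so it is 0 too. */
#if sizeof < 2
#define SECOND_BRANCH_TAKEN 1
#else
#define SECOND_BRANCH_TAKEN 0
#endif

int first_branch_taken(void)  { return FIRST_BRANCH_TAKEN; }
int second_branch_taken(void) { return SECOND_BRANCH_TAKEN; }
```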
As it happens, sizeof is an operator, but it's also an identifier, and identifiers that are not themselves macros all turn into 0 in preprocessor expressions. The preprocessor runs, at least conceptually, before the compiler. It can turn non-C syntax into C so at the point it is running, the C program hasn't even been parsed yet. It isn't possible to reference actual C objects yet: they don't exist.
And naturally, a sizeof in the replacement text of a definition is simply passed through to the compiler as, well, the replacement text where the macro is used.
The preprocessor cannot evaluate the results of the sizeof operator. That is calculated by the compiler, long after the preprocessor is finished.
Since the second expression results in a compile-time computation, it works. The first is an impossible test for the preprocessor.
#define is merely text replacement. #if, being a conditional preprocessor directive, has to evaluate its expression, but at the time of preprocessing the preprocessor has no idea what sizeof() is. The preprocessor runs before the compiler proper parses the code.
sizeof is replaced at compile time.
Preprocessing runs before compile starts.
The compiler doesn't touch either line. Rather, the preprocessor rips through the file, replacing any instances of size(x) with your macro. The compiler DOES see these replacements.
The preprocessor doesn't know the sizeof operator; it just cannot understand it. So #if doesn't work, since it has to understand it to work, because it is a conditional preprocessor directive; it needs to know whether it evaluates to true or false.
But #define doesn't need to understand sizeof, as #define is just for text replacement. Preprocessor searches size macro (defined in #define) in the source code, and replaces it with what it is defined to be, which is in your case sizeof(x)/sizeof(x[0]).
The reason it doesn't work is because the pre-processor macros are 'evaluated' in a pass before the code reaches the compiler. So in the if pre-processor directive, the sizeof(int) (actually the sizeof(int) != 4) cannot be evaluated because that is done by the compiler, not the pre-processor.
The define statement though, simply does a text substitution, and so when it comes to the compiler, everywhere you had 'size(x)' you would have 'sizeof(x)/sizeof(x[0])' instead, and then this evaluates there at the compile stage... at every point in the code where you had 'size(x)'
If you want to check the size of the integer in the preprocessor, use your make system to discover the size of integer on your system before running the preprocessor and write it to a header file as e.g. #define SIZEOF_INT 4, include this header file and do #if SIZEOF_INT == 4
For example, if you use cmake, you can use the CMAKE_SIZEOF_INT variable which has the size of the integer which you can put in a macro.
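A sketch of how such a generated macro might be consumed follows. SIZEOF_INT is hard-coded here as an assumption (normally the build system would write it), and fixed32 and the other names are illustrative. The negative-array typedef cross-checks the configured value against the real sizeof once the compiler runs.

```c
#include <assert.h>

/* Assumption: normally provided by a header the build system generates. */
#ifndef SIZEOF_INT
#define SIZEOF_INT 4
#endif

/* The preprocessor can test the macro, since it is a plain constant. */
#if SIZEOF_INT == 4
typedef int fixed32;
#elif SIZEOF_INT == 2
typedef long fixed32;
#else
#error "unsupported int size"
#endif

/* Compile-time cross-check: the array size becomes -1 (an error)
 * if the configured value disagrees with the compiler. */
typedef char sizeof_int_matches[(sizeof(int) == SIZEOF_INT) ? 1 : -1];

unsigned long fixed32_size(void)
{
    return (unsigned long) sizeof(fixed32);
}
```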