libc6: comma operator in assert macro definition - c

My system uses libc6 2.29. In /usr/include/assert.h I can find the definition of assert() macro:
/* The first occurrence of EXPR is not evaluated due to the sizeof,
but will trigger any pedantic warnings masked by the __extension__
for the second occurrence. The ternary operator is required to
support function pointers and bit fields in this context, and to
suppress the evaluation of variable length arrays. */
# define assert(expr) \
    ((void) sizeof ((expr) ? 1 : 0), __extension__ ({ \
        if (expr) \
            ; /* empty */ \
        else \
            __assert_fail (#expr, __FILE__, __LINE__, __ASSERT_FUNCTION); \
    }))
I wonder why the comma operator is used, and what is meant by 'The first occurrence of EXPR is not evaluated due to the sizeof'.
What problem would there be with the following definition:
# define assert(expr) \
    ({ \
        if (expr) \
            ; /* empty */ \
        else \
            __assert_fail (#expr, __FILE__, __LINE__, __ASSERT_FUNCTION); \
    })
Edit:
What value does the ({ }) statement expression have if expr is true?
Is it possible to rewrite the definition of assert() as follows?
# define assert(expr) \
    ((void) sizeof ((expr) ? 1 : 0), __extension__ ({ \
        if (!expr) \
            __assert_fail (#expr, __FILE__, __LINE__, __ASSERT_FUNCTION); \
    }))
Where are the problems with this last definition?

I'm not 100% certain on this but I'll give it a go.
First, let's review a few things being used here.
The comma operator evaluates its operands from left to right, discards every result except the last, and yields the last one. There is a sequence point after each left operand, so the order of evaluation is guaranteed.
__extension__ is a GCC keyword (not a glibc macro). Prefixing an expression with it suppresses the warnings GCC would otherwise issue for GNU extensions used inside that expression when pedantic warnings are requested, for example with -pedantic (often combined with -ansi). Normally, using a compiler-specific extension in those modes produces a warning (or an error under -Werror, which is quite common). Because glibc is normally built and used with GNU-compatible compilers, its headers allow themselves some extensions where they can safely do so, and mark them this way.
Now, since the actual assertion logic uses a GNU extension (as indicated by the use of __extension__), any genuine warnings that the expression passed to assert(expr) might have raised would also be suppressed, because that expression sits inside the __extension__ block.
Therefore, there needed to be a way to allow the compiler the chance to show those warnings, but without evaluating the actual expression (since the expression could have side effects and a double-evaluation could cause undesired behaviour).
You can do this by using the sizeof operator, which takes an expression, looks at its type, and finds the number of chars it takes up - without actually evaluating the expression.
For example, if we have a function int blow_up_the_world(), then the expression sizeof(blow_up_the_world()) would find the size of the result of the expression (in this case, int) without actually evaluating the expression. Using sizeof() in this case meant the world would, in fact, not be blown up.
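The point about sizeof can be checked with a small sketch. The counter and the demo function are hypothetical names added for illustration; only blow_up_the_world() comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

static int launches = 0;   /* counts how often the function really runs */

/* Stand-in for the answer's blow_up_the_world(): it has a visible side effect. */
int blow_up_the_world(void)
{
    ++launches;
    return launches;
}

int launch_count(void)
{
    return launches;
}

void demo_sizeof_unevaluated(void)
{
    /* Only the type of the call expression (int) is inspected; no call is made. */
    size_t n = sizeof(blow_up_the_world());

    assert(n == sizeof(int));
    assert(launch_count() == 0);   /* the world is still intact */
}
```

This holds for any operand that is not a variable length array, which is the one case where sizeof does evaluate its operand.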
However, if the expr passed to assert(expr) contained code that would otherwise trigger a compiler warning (e.g. using an extension under -pedantic or -ansi modes), the compiler would still show those warnings even though the code was inside the sizeof() - warnings that would otherwise be masked inside the __extension__ block.
Next, instead of passing expr directly to sizeof, they wrap it in a conditional (ternary) expression. The type of (expr) ? 1 : 0 is always int, whatever the type of expr is. This matters because applying sizeof directly to certain operands either produces a value computed at run time (a variable length array, whose size expression would then be evaluated) or is an error (a function name or a bit-field).
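As a rough illustration of the bit-field case, here is a minimal sketch. The struct and function names are made up for this example; sizeof(f->ready) on its own would be rejected by the compiler, while the ternary form is accepted and has type int.

```c
#include <assert.h>
#include <stddef.h>

struct flags {
    unsigned ready : 1;   /* a bit-field: sizeof(f->ready) does not compile */
};

/* The conditional expression has type int regardless of the operand's type,
   so sizeof is well defined here, and f->ready is still not evaluated. */
size_t probe_size(const struct flags *f)
{
    return sizeof((f->ready) ? 1 : 0);
}

void demo_probe_size(void)
{
    struct flags f = { 1u };

    assert(probe_size(&f) == sizeof(int));
}
```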
Lastly, they wanted all of that, but before the actual evaluation and wanted to keep assert() as an expression, so instead of using a do{}while() block or something similar, which would ultimately result in assert() being a statement, they instead used the comma operator to discard the result of the first sizeof() trick.
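To see why it matters that assert() remains an expression, consider this small sketch. The function name is hypothetical; it only uses the standard assert from <assert.h>.

```c
#include <assert.h>

/* assert(...) is a void expression, so it can be the left operand of a
   comma operator. A do { } while (0) macro could not be used here. */
int checked_double(int x)
{
    return (assert(x > 0), 2 * x);
}
```

With NDEBUG defined, assert(x > 0) becomes ((void)0), which is still an expression, so the function keeps compiling either way.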

({ is not standard C and will trigger warnings or errors in standard C compilation modes.
So they are using __extension__, which will disable any such diagnostics.
However __extension__ will also mask non-standard constructs in expr, which you do want diagnosed.
Which is why they need expr repeated twice, once inside __extension__ and once outside.
However expr only needs to be evaluated once.
So they inhibit another evaluation by placing the other occurrence of expr inside sizeof.
Just sizeof(expr) is not enough though, because it won't work for things like function names.
So sizeof((expr) ? 1 : 0) is used instead, which doesn't have this problem.
So the two parts of the generated expression are (a) sizeof((expr) ? 1 : 0) and (b) the __extension__ ({ ... }) part.
The first part is only needed to produce diagnostics if something is wrong with the expr.
The second part does the actual assertion.
Finally, the two parts are connected with the comma operator.
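The two-part structure can be tried out with a simplified sketch. MY_ASSERT, record_failure and the counters are hypothetical names; record_failure stands in for __assert_fail so that a failed check is counted instead of aborting. The statement expression is a GNU extension, so this assumes GCC or Clang.

```c
#include <assert.h>

static int failures = 0;   /* hypothetical stand-in for aborting */

static void record_failure(const char *text)
{
    (void) text;
    ++failures;
}

/* Part (a): the sizeof probe, outside __extension__, never evaluated.
   Part (b): the real check, inside a GNU statement expression. */
#define MY_ASSERT(expr) \
    ((void) sizeof ((expr) ? 1 : 0), __extension__ ({ \
        if (expr) \
            ; /* empty */ \
        else \
            record_failure(#expr); \
    }))

static int calls = 0;

static int next_value(void)
{
    return ++calls;
}

void demo_my_assert(void)
{
    MY_ASSERT(next_value() == 1);   /* passes */
    assert(calls == 1);             /* the operand ran once, not twice */
    assert(failures == 0);

    MY_ASSERT(calls == 2);          /* fails: calls is still 1 */
    assert(failures == 1);
}
```

Although expr appears twice in the expansion, next_value() runs only once, because the first occurrence is an unevaluated sizeof operand.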

Related

Working of conditional compilation #if and #else (and others) in c

I tried to write a program using some conditional compilation pre-processing directives instead of "if-else" as follows.
#include <stdio.h>
int main ()
{
    int x;
    scanf ("%d", &x);
#if (x==5)
    printf ("x is 5");
#else
    printf ("x not 5");
#endif
}
But it always prints the else part, even when the value of x is 5. My question is simply: why?
Is it possible to make this program work, i.e. take the value of x from the user, check it with an #if directive, and print the statement under #if?
During compilation it shows the warning "'x' is not defined, evaluates to 0". But x seems defined to me. Does that mean x should be defined using #define? Please explain the concept behind conditional compilation.
x is not an integer literal or an integer literal expression (integer literals + operators) or a macro expanding to those, so in a conditional, the preprocessor replaces it with 0 (6.10.1p4). 0==5 is false, so the #else branch is taken.
The preprocessor doesn't know about C declarations, types and such. It only works with tokens (and macros that ultimately expand to those).
6.10.1p4
After all replacements due to macro expansion and the defined unary
operator have been performed, all remaining identifiers (including
those lexically identical to keywords) are replaced with the pp-number
0, and then each preprocessing token is converted into a token.
Preprocessing takes place before compilation, so the preprocessor does not know anything about your C code or variables. You can't use any C variables in its conditions.
Conditional compilation is for different purposes.
#define DEBUG
/* ....*/
#ifdef DEBUG
printf("Some debug value %d\n", val);
#endif
Operands in #if directives can only be constants, things defined with #define, and the special defined operator. Any other identifiers in the expression are replaced with 0. The x in your sample code is not defined with #define, so (x==5) becomes (0==5).
In the C 2018 standard, clause 6.10.1 tells us that evaluation of the expression in an #if statement includes:
Preprocessor macros (things defined with #define) are replaced according to their definitions.
Uses of the defined operator are replaced with 0 or 1.
Any remaining identifiers are replaced with 0.
Because the x in your sample code is not defined with #define, it is replaced with 0 in the #if statement. This results in (0==5), which is false, so code between the #if and the #else is skipped.
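The replacement rule can be observed directly with a short sketch. BRANCH_TAKEN and the function names are hypothetical; the variable x mirrors the question. Note that compiling with -Wundef makes GCC warn about the #if line.

```c
#include <assert.h>

int x = 5;   /* an ordinary variable; the preprocessor never sees its value */

#if (x == 5)                   /* x is not a macro, so this is #if (0 == 5) */
#define BRANCH_TAKEN "if"
#else
#define BRANCH_TAKEN "else"
#endif

const char *preprocessor_branch(void)
{
    return BRANCH_TAKEN;
}

/* The test the question actually needs is an ordinary run-time if. */
const char *runtime_branch(int value)
{
    if (value == 5)
        return "x is 5";
    return "x not 5";
}

void demo_branches(void)
{
    assert(preprocessor_branch()[0] == 'e');   /* "else" was selected */
    assert(runtime_branch(x)[2] == 'i');       /* "x is 5" */
}
```

The #else branch is selected even though x holds 5, while the run-time check behaves as intended.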
In a preprocessor statement, you cannot evaluate variables based on values that will be set during program execution.
It's the "pre-processor". "Pre" means "before".
You're trying to use a runtime value during preprocessing! The preprocessor of course has no access to that information during the build.
This problem isn't limited to runtime values, but is more fundamental. Even if you were trying to use a (named) compile-time constant such as constexpr int x = 2, you couldn't do that. These are two languages interleaving, like generating HTML with PHP; the HTML has no knowledge of PHP variables, and the PHP has no knowledge of what widgets the user clicks on the page. These are completely different execution contexts with no built-in interaction or cross-compatibility.

Is a repeated macro invocation via token concatenation unspecified behavior?

The C11 standard admits vagueness with regard to at least one situation that can arise in macro expansion: when a function-like macro expands to its uninvoked name and is then invoked by the next preprocessing token. The example given in the standard is this.
#define f(a) a*g
#define g(a) f(a)
// may produce either 2*f(9) or 2*9*g
f(2)(9)
That example does not clarify what happens when a macro, M, is expanded, and all or part of the result contributes via token concatenation to a second preprocessing token, M, which is invoked.
Question: Is such an invocation blocked?
Here is an example of such an invocation. This issue tends to only come up when using a fairly complicated set of macros, so this example is contrived for the sake of simplicity.
// arity gives the arity of its args as a decimal integer (good up to 4 args)
#define arity(...) arity_help(__VA_ARGS__,4,3,2,1,)
#define arity_help(_1,_2,_3,_4,_5,...) _5
// define 'test' to mimic 'arity' by calling it twice
#define test(...) test_help_A( arity(__VA_ARGS__) )
#define test_help_A(k) test_help_B(k)
#define test_help_B(k) test_help_##k
#define test_help_1 arity(1)
#define test_help_2 arity(1,2)
#define test_help_3 arity(1,2,3)
#define test_help_4 arity(1,2,3,4)
// does this expand to '1' or 'arity(1)'?
test(X)
test(X) expands to test_help_A( arity(X) ), which invokes test_help_A on rescanning, which expands its arg before substitution, and so is identical to test_help_A(1), which produces test_help_B(1), which produces test_help_1. This much is clear.
So, the question comes in here. test_help_1 is produced using a token, 1, that came from an expansion of arity. So can the expansion of test_help_1 invoke arity again? My versions of gcc and clang each think so.
Can anyone argue that the interpretations made by gcc and clang are required by something in the standard?
Is anyone aware of an implementation that interprets this situation differently?
I think that gcc's and clang's interpretations are correct. The two expansions of arity are not in the same call path. The first descends from the expansion of test_help_A's argument, the second from the expansion of test_help_A itself.
The idea of these rules is to guarantee that there can't be infinite recursion, which is guaranteed, here. There is progress in the evaluation of the macro between the two calls.
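The behaviour reported for gcc and clang can be checked with a sketch like the following, which reuses the macros from the question unchanged. The two functions are hypothetical additions; if arity were not expanded a second time, the bodies would contain a call to an undeclared arity function and fail to build. This only shows what these compilers do, not what the standard requires.

```c
#include <assert.h>

/* Macros copied from the question. */
#define arity(...) arity_help(__VA_ARGS__,4,3,2,1,)
#define arity_help(_1,_2,_3,_4,_5,...) _5
#define test(...) test_help_A( arity(__VA_ARGS__) )
#define test_help_A(k) test_help_B(k)
#define test_help_B(k) test_help_##k
#define test_help_1 arity(1)
#define test_help_2 arity(1,2)
#define test_help_3 arity(1,2,3)
#define test_help_4 arity(1,2,3,4)

/* X and Y are never declared: they are consumed by arity_help and vanish. */
int arity_of_one(void)
{
    return test(X);
}

int arity_of_two(void)
{
    return test(X, Y);
}

void demo_arity(void)
{
    assert(arity_of_one() == 1);
    assert(arity_of_two() == 2);
}
```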

Expansion of module_param() macro: a struct with a single member or a bitfield? [duplicate]

I bumped into this strange macro code in /usr/include/linux/kernel.h:
/* Force a compilation error if condition is true, but also produce a
result (of value 0 and type size_t), so the expression can be used
e.g. in a structure initializer (or where-ever else comma expressions
aren't permitted). */
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
#define BUILD_BUG_ON_NULL(e) ((void *)sizeof(struct { int:-!!(e); }))
What does :-!! do?
This is, in effect, a way to check whether the expression e can be evaluated to be 0, and if not, to fail the build.
The macro is somewhat misnamed; it should be something more like BUILD_BUG_OR_ZERO, rather than ...ON_ZERO. (There have been occasional discussions about whether this is a confusing name.)
You should read the expression like this:
sizeof(struct { int: -!!(e); })
1. (e): Compute expression e.
2. !!(e): Logically negate twice: 0 if e == 0; otherwise 1.
3. -!!(e): Numerically negate the expression from step 2: 0 if it was 0; otherwise -1.
4. struct{int: -!!(0);} --> struct{int: 0;}: If it was zero, then we declare a struct with an anonymous integer bitfield that has width zero. Everything is fine and we proceed as normal.
5. struct{int: -!!(1);} --> struct{int: -1;}: On the other hand, if it isn't zero, then the width will be -1. Declaring any bitfield with negative width is a compilation error.
So we'll either wind up with a bitfield that has width 0 in a struct, which is fine, or a bitfield with negative width, which is a compilation error. Then we take sizeof that struct, so we get a size_t whose value is zero in the case where e is zero. (A struct with no named members, and its size of zero, is a GNU C extension.)
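A minimal sketch of how the macro is used, assuming GCC or Clang in C mode (it relies on the GNU extension that gives the member-less struct a size of zero). The array and function names are hypothetical; the macro is copied from the question.

```c
#include <assert.h>
#include <stddef.h>

#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))

/* The condition is false, so the macro contributes a constant 0 and the
   array has 4 elements. Changing the condition to sizeof(int) >= 2 would
   stop the build with a "negative width in bit-field" error. */
static int table[4 + BUILD_BUG_ON_ZERO(sizeof(int) < 2)];

size_t table_length(void)
{
    return sizeof table / sizeof table[0];
}

void demo_build_bug(void)
{
    assert(table_length() == 4);
    assert(BUILD_BUG_ON_ZERO(0) == 0);
}
```

Because the result is an integer constant expression, the macro can appear in an array size or initializer, which is the use the kernel comment describes.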
Some people have asked: Why not just use an assert?
keithmo's answer here has a good response:
These macros implement a compile-time test, while assert() is a run-time test.
Exactly right. You don't want to detect problems in your kernel at runtime that could have been caught earlier! It's a critical piece of the operating system. To whatever extent problems can be detected at compile time, so much the better.
The : is a bitfield. As for !!, that is logical double negation and so returns 0 for false or 1 for true. And the - is a minus sign, i.e. arithmetic negation.
It's all just a trick to get the compiler to barf on invalid inputs.
Consider BUILD_BUG_ON_ZERO. When -!!(e) evaluates to a negative value, that produces a compile error. Otherwise -!!(e) evaluates to 0, and a 0 width bitfield has size of 0. And hence the macro evaluates to a size_t with value 0.
The name is weak in my view because the build in fact fails when the input is not zero.
BUILD_BUG_ON_NULL is very similar, but yields a pointer rather than an int.
Some people seem to be confusing these macros with assert().
These macros implement a compile-time test, while assert() is a runtime test.
Well, I am quite surprised that the alternatives to this syntax have not been mentioned. Another common (but older) mechanism is to call a function that isn't defined and rely on the optimizer to compile-out the function call if your assertion is correct.
#define MY_COMPILETIME_ASSERT(test) \
    do { \
        extern void you_did_something_bad(void); \
        if (!(test)) \
            you_did_something_bad(); \
    } while (0)
While this mechanism works (as long as optimizations are enabled), it has the downside of not reporting an error until you link, at which time it fails to find the definition for the function you_did_something_bad(). That's why kernel developers started using tricks like negative bit-field widths and negative-sized arrays (the latter of which stopped breaking builds in GCC 4.4).
In sympathy for the need for compile-time assertions, GCC 4.3 introduced the error function attribute that allows you to extend upon this older concept, but generate a compile-time error with a message of your choosing -- no more cryptic "negative sized array" error messages!
#define MAKE_SURE_THIS_IS_FIVE(number) \
    do { \
        extern void this_isnt_five(void) __attribute__((error( \
            "I asked for five and you gave me " #number))); \
        if ((number) != 5) \
            this_isnt_five(); \
    } while (0)
In fact, as of Linux 3.9, we now have a macro called compiletime_assert which uses this feature, and most of the macros in bug.h have been updated accordingly. Still, this macro can't be used as an initializer. However, by using statement expressions (another GCC C extension), you can!
#define ANY_NUMBER_BUT_FIVE(number) \
    ({ \
        typeof(number) n = (number); \
        extern void this_number_is_five(void) __attribute__(( \
            error("I told you not to give me a five!"))); \
        if (n == 5) \
            this_number_is_five(); \
        n; \
    })
This macro will evaluate its parameter exactly once (in case it has side-effects) and create a compile-time error that says "I told you not to give me a five!" if the expression evaluates to five or is not a compile-time constant.
So why aren't we using this instead of negative-sized bit-fields? Alas, there are currently many restrictions on the use of statement expressions, including their use as constant initializers (for enum constants, bit-field width, etc.) even if the statement expression is completely constant itself (i.e., can be fully evaluated at compile-time and otherwise passes the __builtin_constant_p() test). Further, they cannot be used outside of a function body.
Hopefully, GCC will amend these shortcomings soon and allow constant statement expressions to be used as constant initializers. The challenge here is the language specification defining what is a legal constant expression. C++11 added the constexpr keyword for just this type of thing, but no counterpart exists in C11. While C11 did get static assertions, which will solve part of this problem, it won't solve all of these shortcomings. So I hope that gcc can make a constexpr functionality available as an extension via -std=gnu99 and -std=gnu11 or some such and allow its use on statement expressions et al.
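For reference, the C11 static assertion mentioned above looks roughly like this in use. The enum, table and function names are hypothetical and chosen only for illustration.

```c
#include <assert.h>   /* provides the static_assert macro in C11 */

enum color { RED, GREEN, BLUE, COLOR_COUNT };

static const char *const color_names[] = { "red", "green", "blue" };

/* Checked by the compiler: adding an enumerator without extending the table
   stops the build and prints the message below. */
static_assert(sizeof color_names / sizeof color_names[0] == COLOR_COUNT,
              "color_names must have one entry per enum color");

const char *color_name(enum color c)
{
    return color_names[c];
}

void demo_color_names(void)
{
    assert(color_name(GREEN)[0] == 'g');
}
```

Unlike the bit-field trick, a static assertion is a declaration rather than an expression, so it cannot yield a value inside an initializer; that is the part of the problem it does not solve.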
It's creating a size 0 bitfield if the condition is false, but a size -1 (-!!1) bitfield if the condition is true/non-zero. In the former case, there is no error and the struct is initialized with an int member. In the latter case, there is a compile error (and no such thing as a size -1 bitfield is created, of course).

sizeof() is not executed by preprocessor

#if sizeof(int) != 4
/* do something */
Using sizeof inside #if doesn't work while inside #define it works, why?
#define size(x) sizeof(x)/sizeof(x[0]) /*works*/
Nothing is evil - everything can be misused, or in your case misunderstood. The sizeof operator is a compiler feature, but compiler features are not available to the preprocessor (which runs before the compiler gets involved), and so cannot be used in #if preprocessor directives.
However, when you say:
#define size(x) sizeof(x)/sizeof(x[0])
and use it:
size(a)
the preprocessor performs a textual substitution that is handed to the compiler:
sizeof(a)/sizeof(a[0])
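A small sketch of the substitution at work. The array and function names are hypothetical, and this version of the size macro adds parentheses around the expansion, which the definition in the question lacks.

```c
#include <assert.h>
#include <stddef.h>

/* The preprocessor only pastes this text in; the compiler evaluates sizeof. */
#define size(x) (sizeof(x) / sizeof((x)[0]))

static const int primes[] = { 2, 3, 5, 7, 11 };

size_t prime_count(void)
{
    return size(primes);   /* becomes (sizeof(primes) / sizeof((primes)[0])) */
}

void demo_size_macro(void)
{
    assert(prime_count() == 5);
}
```

The macro only works on real arrays; applied to a pointer it silently computes the wrong value.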
C "Preprocessor" Macros Only Evaluate Constants and Other Macros
The short answer is a preprocessor expression only provides a meaningful evaluation of an expression composed of other preprocessor macros and constants.
Try this, you will not get an error:
#if sizeof < 2
int f(int x) { return x; }
#endif
If you generate assembly, you will find that with #if sizeof < 2 the function is compiled, and with #if sizeof >= 2 it is not. Neither produces an error.
What's going on? It turns out that, except for preprocessor macros themselves, all identifiers in a preprocessor ("macro") expression are replaced with 0. So the above #if is the same as saying:
#if Easter_Bunny < 2
or
#if 0 < 2
This is why you don't actually get any sort of error when mistakenly using the sizeof operator in a preprocessor expression.
As it happens, sizeof is an operator, but it's also an identifier, and identifiers that are not themselves macros all turn into 0 in preprocessor expressions. The preprocessor runs, at least conceptually, before the compiler. It can turn non-C syntax into C so at the point it is running, the C program hasn't even been parsed yet. It isn't possible to reference actual C objects yet: they don't exist.
And naturally, a sizeof in the replacement text of a definition is simply passed through to the compiler as, well, the replacement text where the macro is used.
The preprocessor cannot evaluate the results of the sizeof operator. That is calculated by the compiler, long after the preprocessor is finished.
Since the second expression results in a compile-time computation, it works. The first is an impossible test for the preprocessor.
#define is merely text replacement. #if, being a conditional preprocessor directive, has to evaluate its expression, but at preprocessing time the preprocessor has no idea what sizeof() is. The preprocessor runs before the compiler proper parses the code.
sizeof is replaced at compile time.
Preprocessing runs before compile starts.
The compiler doesn't touch either line. Rather, the preprocessor rips through the file, replacing any instances of size(x) with your macro. The compiler DOES see these replacements.
The preprocessor doesn't know the sizeof operator; it simply cannot understand it. So #if doesn't work, since it is a conditional preprocessor directive and needs to know whether its expression evaluates to true or false.
But #define doesn't need to understand sizeof, as #define is just for text replacement. Preprocessor searches size macro (defined in #define) in the source code, and replaces it with what it is defined to be, which is in your case sizeof(x)/sizeof(x[0]).
The reason it doesn't work is because the pre-processor macros are 'evaluated' in a pass before the code reaches the compiler. So in the if pre-processor directive, the sizeof(int) (actually the sizeof(int) != 4) cannot be evaluated because that is done by the compiler, not the pre-processor.
The define statement though, simply does a text substitution, and so when it comes to the compiler, everywhere you had 'size(x)' you would have 'sizeof(x)/sizeof(x[0])' instead, and then this evaluates there at the compile stage... at every point in the code where you had 'size(x)'
If you want to check the size of the integer in the preprocessor, use your make system to discover the size of an integer on your system before running the preprocessor and write it to a header file as e.g. #define SIZEOF_INT 4, include this header file and do #if SIZEOF_INT == 4
For example, if you use cmake, you can use the CMAKE_SIZEOF_INT variable which has the size of the integer which you can put in a macro.
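As an alternative that the answers above do not mention, the limit macros from <limits.h> are ordinary integer constants and can therefore be tested in #if. The sketch below is only an illustration: INT_IS_32_BIT and the function names are hypothetical, and the final check assumes int has no padding bits.

```c
#include <assert.h>
#include <limits.h>

/* INT_MAX is a macro expanding to an integer constant, so #if can test it. */
#if INT_MAX == 2147483647
#define INT_IS_32_BIT 1
#else
#define INT_IS_32_BIT 0
#endif

/* sizeof is fine in a static assertion, because the compiler evaluates it. */
static_assert(sizeof(int) * CHAR_BIT >= 16, "int must have at least 16 bits");

int int_is_32_bit(void)
{
    return INT_IS_32_BIT;
}

void demo_int_width(void)
{
    /* Assumes int has no padding bits, which holds on common platforms. */
    assert(int_is_32_bit() == (sizeof(int) * CHAR_BIT == 32));
}
```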

Resources