Example of usage:
void closefrom (int lowfd);
int main()
{
// expected warning
closefrom(-1);
return 0;
}
Specifically, I need to implement a diagnostic (a compiler warning) for calls to a function located in glibc:
void closefrom (int lowfd);
If lowfd is negative, the compiler should issue a warning.
Here is some information about closefrom:
https://sourceware.org/pipermail/libc-alpha/2021-August/129718.html
Maybe a function attribute could somehow help with this?
You can do it like this, but don’t:
void foo(int x) {}
#define foo(x) foo(((int [(x)+1]){0}, (x)))
int main(void)
{
foo(-1);
return 0;
}
The macro attempts to create a compound literal consisting of an array with (x)+1 elements. If x is negative, (x)+1 will be negative or zero, and the compiler will complain about an improper size for the array. (This may require using compiler switches to disable a zero-length array extension.) If x is zero or positive, the compound literal will be discarded by the comma operator, and foo will be called with argument x. (When a macro name is used in its own replacement, it is not replaced again, so the function foo will be called.)
Note that overflow in (x)+1 is possible. You could add a check against INT_MAX to deal with that.
This requires that x be some compile-time expression, which is implied in your question about how to check at compile time.
With Clang or GCC, you can combine a standard _Static_assert with a non-standard statement expression:
#define foo(x) ({ _Static_assert((x) >= 0, "Argument to foo must be nonnegative."); foo(x); })
If foo is always void, you can simply use a _Static_assert with the do … while idiom to give it statement form when a semicolon is appended:
#define foo(x) do { _Static_assert((x) >= 0, "Argument to foo must be nonnegative."); foo(x); } while (0)
If x can be something other than an int, such as a floating-point type, you might want to work on the conditions a bit to deal with issues that arise in the tests and conversions.
I want to issue a warning only if the compiler is able to calculate the value (by optimizations, or if the value is constant). Otherwise just use the function without a warning.
In short, with GNU extensions:
void foo (int lowfd);
#if defined(__GNUC__) && defined(__OPTIMIZE__)
#define curb(expr, msg) \
__extension__({ \
if (__builtin_constant_p(expr)) { \
if (!(expr)) { \
__attribute__((__noinline__, \
__warning__(msg))) void warnit() {__asm__("");}; warnit(); \
} \
} \
})
#else
#define curb(expr, msg) (void)0
#endif
#define foo(x) (curb((x) >= 0, "x is lower than 0"), foo(x))
int main()
{
// expected warning
foo(-1);
return 0;
}
I keep a fuller version of this idea in my tree at https://gitlab.com/Kamcuk/kamillibc/-/blob/master/libs/curb/src/curb.h#L46: it issues a warning or an error if the expression is known at compile time and, if it isn't, either fails at run time or does nothing. The && defined(__OPTIMIZE__) check seems not to be needed in this case.
Related
I am building some generic things in C.
Here is the code:
// main.c
#include <stdio.h>
#define T int;
#include "test.h"
int main()
{
return 0;
}
// test.h
#define _array_is_pointer(T) ( \
{ \
T _value; \
__builtin_classify_type(_value) == 5; \
})
#ifdef T
#if _array_is_pointer(T)
struct array_s
{
T *items;
}
void array_push(struct array_s * array, T value)
{
// push method for pointer.
}
#else
struct array_s
{
T *items;
}
void array_push(struct array_s * array, T value)
{
// push method for non-pointer.
}
#endif
#endif
I would like the preprocessor to run different code depending on whether T is a pointer or a non-pointer.
But I got the error: token "{" is not valid in preprocessor expressions.
Is it possible to do that?
I would like the preprocessor to run different code depending on whether T is a pointer or a non-pointer.
Is it possible to do that?
No, it is not possible. The preprocessor is not aware of types.
If you really want this, pass a flag saying whether T is a pointer as a separate macro.
#define T int*
#define T_IS_A_POINTER 1
#include "test.h"
Or have separate calls:
#define T int*
#include "test_a_pointer.h"
#define T int
#include "test_not_a_pointer.h"
The preprocessor doesn't know whether T is a pointer, because preprocessing happens before semantic analysis of the program. All the preprocessor sees are tokens; it knows that 42 is a number and take42 is an identifier, but that's it. The only definitions it knows about are preprocessor #defines.
Moreover, in C, functions (even built-in compile-time constructs like sizeof and __builtin_classify_type) cannot be evaluated by the preprocessor. The preprocessor cannot evaluate block expressions either, but there wouldn't be much point, since it has no idea what a variable is and thus has no use for declarations. The only identifiers you can use in an #if preprocessor conditional are macros which expand to integer constants (or to entire expressions containing only arithmetic operations on integer constants).
There is the _Generic construct introduced in C11, which allows you to generate different expressions based on the type of a controlling expression. But it can only be used to generate expressions, not declarations, so it's probably not much help either.
There is no problem with writing a multi-line macro such as:
#define _array_is_pointer(T) ( \
{ \
T _value; \
__builtin_classify_type(_value) == 5; \
})
But, as you know, the preprocessor runs before the code reaches the compiler proper and produces the expanded source code. In this step, all five lines are pasted wherever you wrote _array_is_pointer(T), so the resulting code would contain:
#if (
{
T _value;
__builtin_classify_type(_value) == 5;
})
And here is the problem. You cannot write multiple statements like this in an #if condition, nor can you use {} there. Hence, you got the error token "{" is not valid in preprocessor expressions.
Hence, you have to write a single integer constant expression in the #if condition.
It looks like the typeof operator is likely to be accepted into the next C standard, and I was looking to see if there was a way to leverage this to create a macro using portable ISO-C that can get the length of an array passed into it or fail to compile if a pointer is passed into it. Normally generic selection can be used to force a compiler error when using an unwanted type by leaving it out of the generic association list, but in this case, we need a default association to deal with arrays of any length, so instead I am trying to force a compiler error for the generic association for the type we don't want. Here's an example of what the macro could look like:
#define ARRAY_SIZE(X) _Generic(&(X), \
typeof(&X[0]) *: sizeof(struct{_Static_assert(0, "Trying to get the array length of a pointer"); int _a;}), \
default: (sizeof(X) / sizeof(X[0])) \
)
The problem is that _Static_assert is tripping even when the generic association selected is the default association. For the sake of simplicity, since the issue at hand is not related to anything being introduced in C23, we'll make a test program that works explicitly to reject a pointer to int:
#include <stdio.h>
#include <stdlib.h>
#define ARRAY_SIZE(X) _Generic(&(X), \
int **: sizeof(struct{_Static_assert(0, "Trying to get the array length of a pointer"); int _a;}), \
default: (sizeof(X) / sizeof(X[0])) \
)
int main(void) {
int x[100] = {0};
int *y = x;
int (*z)[100] = {&x};
printf("length of x: %zu\n", ARRAY_SIZE(x));
printf("length of y: %zu\n", ARRAY_SIZE(y));
printf("length of z: %zu\n", ARRAY_SIZE(z));
printf("length of *z: %zu\n", ARRAY_SIZE(*z));
return EXIT_SUCCESS;
}
Building the above with -std=c11, I find _Static_assert tripping on all expansions of ARRAY_SIZE when I would expect to only have problems with the pointers that will use the int ** generic association.
According to 6.5.1.1 p3 of the C11 standard for Generic Selection,
None of the expressions from any other generic association of the generic selection is evaluated
Is this a bug in gcc and clang, or is there something I've missed in the standard that would cause the compile-time evaluation of this _Static_assert in the unused generic association?
It doesn't matter which generic association is selected.
When the expression that is part of a _Static_assert has the value 0, this is a constraint violation and the compiler is required to generate a diagnostic.
You can't really mix _Static_assert with expressions that should return a value, such as a function-like macro. You could perhaps work around that with a "poor man's static assert", like one of the ugly tricks we used before C11:
#define POOR_STATIC_ASSERT(expr) (int[expr]){0}
#define CHECK(X) _Generic((&X), \
int **: 0,\
default: (sizeof(X) / sizeof(X[0])) \
)
#define ARRAY_SIZE(X) ( (void)POOR_STATIC_ASSERT(CHECK(X)), CHECK(X) )
Here the macro CHECK yields the array size if the type is valid and zero if it is not. The comma operator first evaluates the poor man's static assert on that value and discards it, then evaluates CHECK again so that its result becomes the value of the function-like macro ARRAY_SIZE. For an invalid type this leads to a somewhat cryptic error from an ISO C compiler, such as "error: ISO C forbids zero-size array".
The next problem is that &(X) in _Generic is by no means guaranteed to boil down to an int**, so this macro isn't safe or reliable. Regarding array sizes though, there's a trick we can use. A pointer to an array of no size (incomplete type) is compatible with a pointer to any array of the same element type, no matter its size. The macro could be rewritten as:
#define POOR_STATIC_ASSERT(expr) (int[expr]){0}
#define CHECK(X) _Generic((&X), \
int (*)[]: sizeof(X) / sizeof(X[0]), \
default: 0)
#define ARRAY_SIZE(X) ( (void)POOR_STATIC_ASSERT(CHECK(X)), CHECK(X) )
This will work for any int array no matter size but fail for everything else.
Utilizing some of the suggestions from Lundin's answer, I have come up with the following solution to the simplified problem:
#define STATIC_ASSERT_EXPRESSION(X, ERROR_MESSAGE) (sizeof(struct {_Static_assert((X), ERROR_MESSAGE); int _a;}))
#define NO_POINTERS(X) _Generic(&(X), \
int (*)[]: 1, \
default: 0 \
)
#define ARRAY_SIZE(X) ( (void)STATIC_ASSERT_EXPRESSION(NO_POINTERS(X), "Cannot retrieve the number of array elements from a pointer"), (sizeof(X) / sizeof(X[0])) )
For the actual use-case to be type generic using typeof, which should be coming to the C23 standard, replace the NO_POINTERS macro using this:
#define NO_POINTERS(X) _Generic(&(X), \
typeof(*X) (*)[]: 1, \
default: 0 \
)
By moving the _Static_assert outside of the Generic Selection, it will only be evaluated with the value that actually returns from the selection, so it won't get fired off for existing in an unused selection. Additionally, the number of elements calculation was also removed from the generic selection so that expression of the generic selection could safely be used in a Static Assert even if your array was a Variable-Length array which requires its size to be calculated at run-time.
The Static Assert itself is placed inside of an anonymous struct that we take the sizeof so that it is a part of an expression. And then as in Lundin's example, we use the comma operator to have that expression evaluated, and then thrown out and use the results of the array size calculation.
With this, we reject pointers while getting the number of elements in both static arrays and VLAs, plus we get a nice compiler error message when trying to pass in a pointer.
P.S. I have used int and int * for simplicity; it could also be struct and struct *.
I am trying to implement a macro that copies the data in one variable to another, independent of the variable's data type. In the solution below I am using the '_Generic' compiler feature.
program 1:
#include<stdio.h>
#include <string.h>
#define copyVar(var,newVar) _Generic((var),int:({memcpy(newVar,(void *)&var,sizeof(int));}),\
int *:({memcpy(newVar,(void *)var,sizeof(int));}),default:newVar=var)
int main() {
int data = 2;
int *copy;copy = (int *)malloc(sizeof(int));
copyVar(data,copy);
printf("copied Data=%i",*copy);
}
Program 2:
#include<stdio.h>
#include <string.h>
#define copyVar(var,newVar) _Generic((var),int:({memcpy(newVar,(void *)&var,sizeof(int));}),\
int *:({memcpy(newVar,(void *)var,sizeof(int));}),default:newVar=var)
int main() {
int data = 2;
int *copy;copy = (int *)malloc(sizeof(int));
copyVar(&data,copy);
printf("copied Data=%i",*copy);
}
Now the problem is that 'program 1' compiles successfully, despite some warnings.
But when compiling program 2, gcc reports an error:
error: lvalue required as unary '&' operand
#define copyVar(var,newVar) _Generic((var),int:({memcpy(newVar,(void *)&var,sizeof(int));}),\
I assume this happens because the int: selection of _Generic gets preprocessed with one more ampersand:
(void *)&&var
Why does gcc evaluate all selections?
Your code has various problems: you copy data into an uninitialized pointer, you have superfluous void* casts, you treat _Generic as some sort of compound statement instead of an expression, and so on.
But to answer your question, your code doesn't work because the result of &something is not an lvalue. Since the & operator needs an lvalue, you cannot do & &something. (And you cannot do &&something either, because that gets treated as the && operator by the "maximal munch" rule.)
So your code doesn't work for the same reason as this code doesn't work:
int x;
int**p = & &x;
gcc tells you that &x is not an lvalue:
lvalue required as unary '&' operand
EDIT - clarification
This _Generic macro, like any macro, works like pre-processor text replacement. So when you have this code in the macro:
_Generic((var), ...
int: ... (void *)&var
int*: ... (void *)var
It gets pre-processed as
_Generic((&data), ...
int: ... (void *)& &data
int*: ... (void *)&data
And all paths of the _Generic expression are pre-processed. _Generic itself is not part of the pre-processor, but gets evaluated later on, like any expression containing operators. The whole expression is checked for syntactic correctness, even though only one part of the expression is evaluated and executed.
The originally intended use of _Generic is to select a function, as here:
#define copyVar(var,newVar) \
_Generic((var), \
int: function1, \
int*: function2, \
default:function3)(&(var), &(newVar))
Here the generic expression chooses the function and then this function is applied to whatever are the arguments.
You would have to write the three stub functions that correspond to the three different cases.
If you have them small and nice and as inline in your header file, the optimizer will usually ensure that this mechanism does not have a run time overhead.
This can be solved with a two-level _Generic (relying on the GNU statement-expression and __auto_type extensions):
#define copyVar(var,newVar) \
_Generic((var), \
int : ({ __auto_type _v = var; memcpy(newVar, (void *) _Generic((_v), int: &_v , int *: _v) , sizeof(int));}) , \
int *: ({ __auto_type _v = var; memcpy(newVar, (void *) _v , sizeof(int));}) , \
default: newVar=var \
)
Preamble: I want to statically check the number of struct members in a C program, so I created two macros, each of which creates a constant int storing __LINE__ in a variable:
#include <stdio.h>
#include <string.h>
#define BEGIN(log) const int __##log##_begin = __LINE__;
#define END(log) const int __##log##_end = __LINE__;
BEGIN(TEST);
struct TEST {
int t1;
int t2;
float t3;
int t4;
int t5;
int t6;
};
END(TEST)
main()
{
static_assert(__TEST_end - __TEST_begin == 6 + 3, "not_equal");
}
When I use a C++ compiler with the -std=c++11 option (c++ test.cpp -std=c++11), it works fine, but the same code (with static_assert replaced by _Static_assert) doesn't work in C (gcc version 4.8.4). It fails with an error that seems strange, since this expression could be evaluated at compile time:
test.c: In function ‘main’: test.c:18:17: error: expression in static
assertion is not constant _Static_assert(__TEST_end - __TEST_begin
== 6 + 4, "not_equal");
How can I fix this error or achieve the original goal in C?
In C, a variable is not a constant expression even if it is defined with const. _Static_assert requires its first argument to be an integer constant expression. Therefore the same thing that can be done in C++ cannot be done in C.
You can do a runtime check instead; use assert.
Note that this method won't guard you against a programmer typing out two members on the same line, using multiple single-line declarations of the same type, or adding an empty line (or a comment). Instead of forcing a programmer to follow a strict coding pattern just so that this assert will catch the error, it is less error-prone to simply require the programmer to define the correct number of members. It is strictly better, because an undetectable error is possible either way, but at least the programmer doesn't have to worry about the strict coding pattern.
A solution for your issue would be to use an anonymous enumeration. Instead of:
#define BEGIN(log) const int __##log##_begin = __LINE__
#define END(log) const int __##log##_end = __LINE__
Do:
#define BEGIN(log) enum { __##log##_begin = __LINE__ }
#define END(log) enum { __##log##_end = __LINE__ }
This is allowed in C11 since, unlike a const int (or even static const int) variable, an enumeration constant is defined to be an integer constant expression.
(As an aside, I omitted the terminal semicolon from my version of your BEGIN()/END() macros. In my opinion, declaration macros should not include a terminal semicolon, that should be provided by the macro user, so the macro behaves more like a non-macro C declaration.)
I often see instances in which using a macro is better than using a function.
Could someone explain to me, with an example, the disadvantage of a macro compared to a function?
Macros are error-prone because they rely on textual substitution and do not perform type-checking. For example, this macro:
#define square(a) a * a
works fine when used with an integer:
square(5) --> 5 * 5 --> 25
but does very strange things when used with expressions:
square(1 + 2) --> 1 + 2 * 1 + 2 --> 1 + 2 + 2 --> 5
square(x++) --> x++ * x++ --> increments x twice
Putting parentheses around arguments helps but doesn't completely eliminate these problems.
When macros contain multiple statements, you can get in trouble with control-flow constructs:
#define swap(x, y) t = x; x = y; y = t;
if (x < y) swap(x, y); -->
if (x < y) t = x; x = y; y = t; --> if (x < y) { t = x; } x = y; y = t;
The usual strategy for fixing this is to put the statements inside a "do { ... } while (0)" loop.
If you have two structures that happen to contain a field with the same name but different semantics, the same macro might work on both, with strange results:
struct shirt
{
int numButtons;
};
struct webpage
{
int numButtons;
};
#define num_button_holes(shirt) ((shirt).numButtons * 4)
struct webpage page;
page.numButtons = 2;
num_button_holes(page) -> 8
Finally, macros can be difficult to debug, producing weird syntax errors or runtime errors that you have to expand to understand (e.g. with gcc -E), because debuggers cannot step through macros, as in this example:
#define print(x, y) printf(x y) /* accidentally forgot comma */
print("foo %s", "bar") /* prints "foo %sbar" */
Inline functions and constants help to avoid many of these problems with macros, but aren't always applicable. Where macros are deliberately used to specify polymorphic behavior, unintentional polymorphism may be difficult to avoid. C++ has a number of features such as templates to help create complex polymorphic constructs in a typesafe way without the use of macros; see Stroustrup's The C++ Programming Language for details.
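Macro features:
A macro is expanded by the preprocessor
No type checking of arguments
Code size can increase, since the body is pasted at every use
Arguments with side effects may be evaluated more than once
No call overhead (though compilers often inline small functions too)
Before compilation, the macro name is replaced by the macro body
Useful where a small piece of code appears many times
Errors are reported at the point of expansion, not in the macro definition
Function features:
A function is compiled
Argument types are checked
Code size stays the same however often it is called
Each argument is evaluated exactly once
A call may add some overhead unless the function is inlined
During a function call, control is transferred to the function
Useful where a large piece of code appears many times
The compiler checks the function body for errors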
Side-effects are a big one. Here's a typical case:
#define min(a, b) (a < b ? a : b)
min(x++, y)
gets expanded to:
(x++ < y ? x++ : y)
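Whenever x < y, x gets incremented twice in the same expression, and the value returned is no longer the original x. (The ?: operator has a sequence point after its first operand, so this is well defined, just not what the caller expects.)
Writing multi-line macros is also a pain: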
#define foo(a,b,c) \
a += 10; \
b += 10; \
c += 10;
They require a \ at the end of each line.
Macros can't "return" anything unless you make them a single expression:
int foo(int *a, int *b){
side_effect0();
side_effect1();
return a[0] + b[0];
}
Can't do that in a macro unless you use GCC's statement expressions. (EDIT: You can use a comma operator though... overlooked that... But it might still be less readable.)
Order of Operations: (courtesy of @ouah)
#define min(a,b) (a < b ? a : b)
min(x & 0xFF, 42)
gets expanded to:
(x & 0xFF < 42 ? x & 0xFF : 42)
But & has lower precedence than <. So 0xFF < 42 gets evaluated first.
When in doubt, use functions (or inline functions).
However, the answers here mostly explain the problems with macros; they do not justify the simple view that macros are evil just because silly accidents are possible. You can be aware of the pitfalls and learn to avoid them. Then use macros only when there is a good reason to.
There are certain exceptional cases where there are advantages to using macros, these include:
Generic functions: as noted below, you can have a macro that can be used on different types of input arguments.
A variable number of arguments can map to different functions instead of using C's va_args, e.g. https://stackoverflow.com/a/24837037/432509.
They can optionally include local info, such as debug strings (__FILE__, __LINE__, __func__), check for pre/post conditions, assert on failure, or even use static asserts so the code won't compile on improper use (mostly useful for debug builds).
Inspecting input args: you can do tests on input args, such as checking their type or sizeof, checking that struct members are present before casting (can be useful for polymorphic types), or checking that an array meets some length condition. See: https://stackoverflow.com/a/29926435/432509
While it's noted that functions do type checking, C will coerce values too (ints/floats for example). In rare cases this may be problematic. It's possible to write macros which are more exacting than a function about their input args. See: https://stackoverflow.com/a/25988779/432509
Use as wrappers to functions: in some cases you may want to avoid repeating yourself, e.g. func(FOO, "FOO");, so you could define a macro that expands the string for you: func_wrapper(FOO);
When you want to manipulate variables in the caller's local scope, passing a pointer to a pointer normally works just fine, but in some cases it's still less trouble to use a macro (assignments to multiple variables for per-pixel operations is an example where you might prefer a macro over a function, though it still depends a lot on the context, since inline functions may be an option).
Admittedly, some of these rely on compiler extensions which aren't standard C, meaning you may end up with less portable code, or have to ifdef them in so they're only taken advantage of when the compiler supports them.
Avoiding multiple argument instantiation
Noting this since it's one of the most common causes of errors in macros (passing in x++ for example, where a macro may increment multiple times).
It's possible to write macros that avoid side effects from multiple instantiation of arguments.
C11 Generic
If you would like a square macro that works with various types and you have C11 support, you could do this:
static inline float _square_fl(float a) { return a * a; }
static inline double _square_dbl(double a) { return a * a; }
static inline int _square_i(int a) { return a * a; }
static inline unsigned int _square_ui(unsigned int a) { return a * a; }
static inline short _square_s(short a) { return a * a; }
static inline unsigned short _square_us(unsigned short a) { return a * a; }
/* ... long, char ... etc */
#define square(a) \
_Generic((a), \
float: _square_fl(a), \
double: _square_dbl(a), \
int: _square_i(a), \
unsigned int: _square_ui(a), \
short: _square_s(a), \
unsigned short: _square_us(a))
Statement expressions
This is a compiler extension supported by GCC, Clang, EKOPath & Intel C++ (but not MSVC);
#define square(a_) __extension__ ({ \
typeof(a_) a = (a_); \
(a * a); })
So the disadvantage with macros is you need to know to use these to begin with, and that they aren't supported as widely.
One benefit is, in this case, you can use the same square function for many different types.
Example 1:
#define SQUARE(x) ((x)*(x))
int main() {
int x = 2;
int y = SQUARE(x++); // Undefined behavior even though it doesn't look
// like it here
return 0;
}
whereas:
int square(int x) {
return x * x;
}
int main() {
int x = 2;
int y = square(x++); // fine
return 0;
}
Example 2:
struct foo {
int bar;
};
#define GET_BAR(f) ((f)->bar)
int main() {
struct foo f;
int a = GET_BAR(&f); // fine
int b = GET_BAR(&a); // error, but the message won't make much sense unless you
// know what the macro does
return 0;
}
Compared to:
struct foo {
int bar;
};
int get_bar(struct foo *f) {
return f->bar;
}
int main() {
struct foo f;
int a = get_bar(&f); // fine
int b = get_bar(&a); // error, but compiler complains about passing int* where
// struct foo* should be given
return 0;
}
Parameters are not type-checked, and the code is repeated at every use, which can lead to code bloat. The macro syntax can also lead to any number of weird edge cases where semicolons or order of precedence can get in the way. Here's a link that demonstrates some macro evil
One drawback of macros is that debuggers work from the source code, which does not contain the expanded macros, so stepping through a macro in a debugger is not necessarily useful. In particular, you cannot set a breakpoint inside a macro like you can inside a function.
Functions do type checking. This gives you an extra layer of safety.
Adding to this answer..
Macros are substituted directly into the program by the preprocessor (since they basically are preprocessor directives). So, when used in many places, they typically take up more code space than the corresponding function. On the other hand, a function call needs some time to transfer control and return results, and this overhead can be avoided by using macros (or by the compiler inlining the function).
Also, macros have some special tools that can help with program portability across different platforms.
Macros don't need to be assigned a data type for their arguments in contrast with functions.
Overall they are a useful tool in programming. And both macroinstructions and functions can be used depending on the circumstances.
I did not notice, in the answers above, one advantage of functions over macros that I think is very important:
Functions can be passed as arguments, macros cannot.
Concrete example: You want to write an alternate version of the standard 'strpbrk' function that will accept, rather than an explicit list of characters to search for within another string, a (pointer to a) function that will return 0 until a character is found that passes some test (user-defined). One reason you might want to do this is so that you can exploit other standard library functions: instead of providing an explicit string full of punctuation, you could pass ctype.h's 'ispunct' instead, etc. If 'ispunct' was implemented only as a macro, this wouldn't work.
There are lots of other examples. For example, if your comparison is accomplished by macro rather than function, you can't pass it to stdlib.h's 'qsort'.
An analogous situation in Python is 'print' in version 2 vs. version 3 (non-passable statement vs. passable function).
If you pass a function call as an argument to a macro, it will be evaluated every time the argument appears in the expansion.
For example, if you call one of the most popular macros:
#define MIN(a,b) ((a)<(b) ? (a) : (b))
like that
int min = MIN(functionThatTakeLongTime(1),functionThatTakeLongTime(2));
functionThatTakeLongTime will be called 3 times (once per argument for the comparison, and once more for the selected result), which can significantly reduce performance.