Purpose of using parentheses in macros

I am curious to know the use of parentheses for both filp and x pointers in the following assignment operation:
#define init_sync_kiocb(x, filp) \
do { \
struct task_struct *tsk = current; \
(x)->ki_flags = 0; \
(x)->ki_users = 1; \
(x)->ki_key = KIOCB_SYNC_KEY; \
(x)->ki_filp = (filp); /* This line here */ \
....
....
Source:
https://github.com/gp-b2g/gp-peak-kernel/blob/master/include/linux/aio.h#L135

These are used in a macro definition which is handled by the preprocessor as text substitution. The fact that it is text substitution can result in weird expressions. Consider:
p = &a_struct_array[10];
init_sync_kiocb(p + 20, filp)
without the parens, it turns into:
p + 20->ki_filp = (filp);
with the parens:
(p + 20)->ki_filp = (filp);
I couldn't come up with one, but I bet similar examples can be found for filp too; at the very least, you can never know for sure what will be passed in.
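To make the difference concrete, here is a minimal sketch. The struct item type and the SET_ID macro are hypothetical names made up for illustration; they are not taken from the kernel source.

```c
struct item { int id; };

/* Hypothetical macro for illustration; both parameters are parenthesised. */
#define SET_ID(p, v) ((p)->id = (v))

/* Without the parentheses, SET_ID(arr + 2, value) would expand to
 * arr + 2->id = value, which does not compile because -> binds
 * tighter than +. */
int set_third_id(struct item *arr, int value)
{
    SET_ID(arr + 2, value);   /* expands to ((arr + 2)->id = (value)) */
    return arr[2].id;
}
```

Calling set_third_id(arr, 7) on an array of at least three items stores 7 in arr[2].id and leaves the other elements alone.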

The left-hand side is just a typical safety measure, since x is a macro parameter. It could expand to something that makes the -> operator fail unless the "thing that needs to be a struct pointer" is protected.
The right-hand side is less obvious to me but might be done just for reasons of consistency and symmetry; always protect macro arguments with parentheses. Some people treat that as a hard rule, and perhaps that project's style guide does, too.

It is inside a macro. This is a common and good habit. Imagine you invoke the macro init_sync_kiocb as e.g.
init_sync_kiocb(pp?*pp:&x,fil?fil:somfil+1);
with the parentheses this gets expanded as
(pp?*pp:&x)->ki_filp = (fil?fil:somfil+1);
without the parentheses the macro expansion would be wrong (a type error or a parse error):
pp?*pp:&x->ki_filp = fil?fil:somfil+1;

Don't forget to mention that this is part of a function-like macro expansion. Such parameters should always be parenthesised to avoid bugs when the passed-in expressions are complex.

Related

Function or macro definition which one to use

I have some macros that I use a lot
So I was thinking in my case would it be better to use a function or use the macro definition?
Example of a macro code that I use:
#define Test(Id, TName, i, FName, var1, var2, var3) do { \
if (GetTable(Id, TName)) { \
while (i < 5) { \
if (GetField(Id, FName)) { \
const char *user = PushName(Id, FName); \
if (!CheckNameisValid(user)) \
continue; \
var1 = GetTimestamp(user); \
var2 = GetSex(user); \
var3 = GetCntLogin(user); \
i++; \
} \
} \
} \
} while (false);
What would be better to use for as per the above code?
Keep using macro definition or migrate to function ?

Given how the macro is written, it would be much better to use a function for this purpose. The macro does not use any construction such as referring to a structure member whose name is passed as an argument to the macro.
The macro has multiple problems:
variable i is assumed to have been declared and initialized appropriately, but it is only incremented if GetField(Id, FName) succeeds and CheckNameisValid(user) is nonzero, potentially causing an infinite loop.
the trailing ; should not be part of the macro expansion, so that the macro can be used as a single statement (for example as the body of an if that has an else).
it is unclear if the variables var1, var2, var3 should be updated once or multiple times, and the caller has no way to tell what actually happened as a side effect of this macro.
Defining functions with clear semantics is much preferred. Don't even worry about inline functions, modern compilers can determine which functions are worth inlining even if not defined as such.
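The point about the trailing semicolon can be shown with a small sketch. RESET_PAIR and reset_or_mark are hypothetical names, unrelated to the macro in the question.

```c
/* Hypothetical macro: note there is no semicolon after while (0). */
#define RESET_PAIR(a, b) do { (a) = 0; (b) = 0; } while (0)

/* Had the definition ended in "while (0);", the call below would expand to
 * "do { ... } while (0);;" and the extra empty statement would detach
 * the else from its if, which is a compile error. */
int reset_or_mark(int flag, int *a, int *b)
{
    if (flag)
        RESET_PAIR(*a, *b);
    else
        *a = -1;
    return *a;
}
```

Because the caller supplies the semicolon, the macro behaves like an ordinary function call in every statement position.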

would it be better to use a function or use the macro definition?
A function.
What would be better to use for as per the above code?
A function.
Keep using macro definition or migrate to function ?
Migrate to function.

Is there a way to use the parameter as a prefix in a C macro in GCC

I am trying to do some template stuff in C:
I need to create a 2d-vector struct with functions to work with them and I need to have them for the types int, unsigned int, float, double and maybe even other types that all support basic math functionality.
I could write all the code for every type. If I write a bug in one of them, I may forget to fix the bug in all others. I did it before in C++ and I just used templates, but now that I want it in C (C11)... there are none.
So I look at macros, and I see that it is possible to use the ## tag to literally use the macro's passed parameter.
Here's some code I tried:
#define DECLARE_VEC2(N, T) \
typedef struct { T x, y;} N; \
inline N *##N_new() { return malloc(sizeof(N));}
DECLARE_VEC2(vec2i, int);
DECLARE_VEC2(vec2f, float);
DECLARE_VEC2(vec2d, double);
It doesn't work. The ##N is not replaced with vec2i/vec2f/vec2d but just remains ##N or even simply N. The compiler won't accept it and when I run gcc with the -E flag on the file, I see that the preprocessor indeed doesn't replace it.
If, however, I add a space between ##N and the _new, the preprocessor does replace it, but the space is also there which obviously gives problems when compiling it.
If, however, I change the function name to new_##N, it not only preprocesses fine, but is also working flawlessly.
BUT I DON'T WANT THE LATTER!
I create all my structs with functionality this way: typename_function() and don't want to change it because of this stupid problem.
So, the question is: is there a way to solve this?
Note: I also tried stuff like adding an extra parameter to the macro containing the underscore, or even declaring a macro within the macro adding the underscore, but they all come to the same problem: there is need for some kind of separator that makes the preprocessor understand that the name ends before the underscore or whatever is behind it.
Edit: strangely enough, when using something like ##N##_new(), the preprocessor replaces the name fine, but then won't accept it with errors like:
error: pasting "*" and "vec2i" does not give a valid preprocessing token
inline N * ##N##_new() { return malloc(sizeof(N));}
although the resulting code seems to be correct (gcc -E):
typedef struct { int x, y;} vec2i; inline vec2i *vec2i_new() { return malloc(sizeof(vec2i));};

## is not a prefix operator, it's an infix one — a binary concatenation operator similar to +. So you'd use it like this:
#define DECLARE_VEC2(N, T) \
typedef struct { T x, y;} N; \
inline N * N##_new() { return malloc(sizeof(N));}
Note that whitespace around ## is ignored, so if you consider it more readable, you can also do this for identical effects:
#define DECLARE_VEC2(N, T) \
typedef struct { T x, y;} N; \
inline N * N ## _new() { return malloc(sizeof(N));}
This should also explain why you were getting the error "does not give a valid preprocessing token": you were effectively trying to concatenate * and vec2i into a single token.
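For reference, a compilable sketch of the corrected macro might look like the following. The static qualifier, the (void) parameter list and the N##_add helper are additions made here for illustration and are not part of the question's code.

```c
#include <stdlib.h>

/* N##_new pastes the macro argument N and the token _new into a single
 * identifier such as vec2i_new. N##_add is an illustrative extra helper. */
#define DECLARE_VEC2(N, T)                                        \
    typedef struct { T x, y; } N;                                 \
    static inline N *N##_new(void) { return malloc(sizeof(N)); }  \
    static inline N N##_add(N a, N b)                             \
    {                                                             \
        N r = { a.x + b.x, a.y + b.y };                           \
        return r;                                                 \
    }

/* No trailing semicolon: the expansion already ends in a function body. */
DECLARE_VEC2(vec2i, int)
DECLARE_VEC2(vec2d, double)
```

After these two invocations, vec2i_new(), vec2i_add(), vec2d_new() and vec2d_add() all exist with the typename_function naming the question asks for.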

Multiline macro function with "return" statement

I'm currently working on a project, and a particular part needs a multi-line macro function (a regular function won't work here as far as I know).
The goal is to make a stack manipulation macro, that pulls data of an arbitrary type off the stack (being the internal stack from a function call, not a high-level "stack" data type). If it were a function, it'd look like this:
type MY_MACRO_FUNC(void *ptr, type);
Where type is the type of data being pulled from the stack.
I currently have a working implementation of this for my platform (AVR):
#define MY_MACRO_FUNC(ptr, type) (*(type*)ptr); \
(ptr = /* Pointer arithmetic and other stuff here */)
This allows me to write something like:
int i = MY_MACRO_FUNC(ptr, int);
As you can see in the implementation, this works because the statement which assigns i is the first line in the macro: (*(type*)ptr).
However, what I'd really like is to be able to have a statement before this, to verify that ptr is a valid pointer before anything gets broken. But, this would cause the macro to be expanded with the int i = pointing to that pointer check. Is there any way to get around this issue in standard C? Thanks for any help!

As John Bollinger points out, macros expanding to multiple statements can have surprising results. A way to make several statements (and declarations!) a single statement is to wrap them into a block (surrounded by do … while(0), see for example here).
In this case, however, the macro should evaluate to something, so it must be an expression (and not a statement). Everything but declarations and iteration and jump statements (for, while, goto) can be transformed to an expression: Several expressions can be sequenced with the comma operator, if-else-clauses can be replaced by the conditional operator (?:).
Given that the original value of ptr can be recovered (I’ll assume "arithmetic and other stuff here" as adding 4 for the sake of having an example)
#define MY_MACRO_FUNC(ptr, type) \
( (ptr) && (uintptr_t)(ptr)%4 == 0 \
? (ptr) += 4 , *(type*)((ptr) - 4) \
: (abort() , (type){ 0 }) )
Note, that I put parentheses around ptr and around the whole expression, see e.g. here for an explanation.
The second and third operand of ?: must be of the same type, so I included (type){0} after the abort call. This expression is never evaluated. You just need some valid dummy object; here, type cannot be a function type.
If you use C89 and can’t use compound literals, you can use (type)0, but that wouldn’t allow for structure or union types.
Just as a note, GCC has an extension called Statements and Declarations in Expressions.
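A self-contained variant of this idea, as a sketch only: POP_VALUE and sum_first_two are hypothetical names, the cursor is assumed to be an unsigned char * (so that pointer arithmetic is standard C), and the buffer is assumed to hold suitably aligned objects of the requested type. Note that the cursor argument is evaluated more than once.

```c
#include <stdlib.h>

/* Hypothetical expression macro: advances cur past one object of the given
 * type and yields that object's value; aborts if cur is a null pointer. */
#define POP_VALUE(cur, type)                                   \
    ((cur) != NULL                                             \
         ? ((cur) += sizeof(type),                             \
            *(type *)(void *)((cur) - sizeof(type)))           \
         : (abort(), (type){ 0 }))

int sum_first_two(int *values)
{
    unsigned char *cur = (unsigned char *)values;
    int first = POP_VALUE(cur, int);    /* reads values[0] */
    int second = POP_VALUE(cur, int);   /* reads values[1] */
    return first + second;
}
```

Since the whole macro is one parenthesised expression, it can appear on the right-hand side of an initialisation or inside an unbraced if without splitting into several statements.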

This is very nasty:
#define MY_MACRO_FUNC(ptr, type) (*(type*)ptr); \
(ptr = /* Pointer arithmetic and other stuff here */)
It may have unexpected results in certain innocuous-looking circumstances, such as
if (foo) bar = MY_MACRO_FUNC(ptr, int);
Consider: what happens then if foo is 0?
I think you would be better off implementing this in a form that assigns the popped value instead of 'returning' it:
#define MY_POP(stack, type, v) do { \
if (!stack) abort_abort_abort(); \
v = *((type *) stack); \
stack = (... compute new value ...); \
} while (0)

How to prevent shadowing with iterator macros in C

Here's an example of a macro that wraps iterator functions in C,
Macro definition:
/* helper macros for iterating over tree types */
#define NODE_TREE_TYPES_BEGIN(ntype) \
{ \
GHashIterator *__node_tree_type_iter__ = ntreeTypeGetIterator(); \
for (; !BLI_ghashIterator_done(__node_tree_type_iter__); BLI_ghashIterator_step(__node_tree_type_iter__)) { \
bNodeTreeType *ntype = BLI_ghashIterator_getValue(__node_tree_type_iter__);
#define NODE_TREE_TYPES_END \
} \
BLI_ghashIterator_free(__node_tree_type_iter__); \
} (void)0
Example use:
NODE_TREE_TYPES_BEGIN(nt)
{
if (nt->ext.free) {
nt->ext.free(nt->ext.data);
}
}
NODE_TREE_TYPES_END;
However nested use (while functional), causes shadowing (gcc's -Wshadow)
NODE_TREE_TYPES_BEGIN(nt_a)
{
NODE_TREE_TYPES_BEGIN(nt_b)
{
/* do something */
}
NODE_TREE_TYPES_END;
}
NODE_TREE_TYPES_END;
The only way I can think of to avoid this is to pass a unique identifier to NODE_TREE_TYPES_BEGIN and NODE_TREE_TYPES_END. So my question is...
Is there a way to prevent shadowing of variables declared within an iterator macro when its scope is nested?

You don't need to insert the same unique identifier in two places, if you can restructure the block so that it never needs the second macro to close it - then you only have one macro invocation and can use simple solutions like __LINE__ or __COUNTER__.
You can restructure the block by taking further advantage of for, to insert operations intended to happen after the block, in a position textually before it:
#define NODE_TREE_TYPES(ntype) \
for (GHashIterator *__node_tree_type_iter__ = ntreeTypeGetIterator(); \
__node_tree_type_iter__; \
(BLI_ghashIterator_free(__node_tree_type_iter__), __node_tree_type_iter__ = NULL)) \
for (bNodeTreeType *ntype = NULL; \
(ntype = BLI_ghashIterator_getValue(__node_tree_type_iter__), !BLI_ghashIterator_done(__node_tree_type_iter__)); \
BLI_ghashIterator_step(__node_tree_type_iter__))
The outer level of your original macro pairs is a compound statement, containing exactly three things: a declaration+initialization, an enclosed for structure, and a single free operation after which the declared variable is not used again.
This makes it very easy to restructure as a for of its own instead of an explicit compound statement: the declaration+initialization goes in the first clause of the for (wouldn't be as easy if you'd had two variables, although it is still possible); the enclosed for can be placed after the end of the for header we're building, since it's a single statement; and the free operation is placed in the third clause. Since the variable is not used in any further statements, we can take advantage of it: combine the free with an explicit assignment of NULL, using the comma operator, and then make the middle clause a check that the variable is not NULL, ensuring the loop runs exactly once.
The nested for gets a similar but more minor modification. Its statement body contains a declaration and per-loop initialization, but we can still hoist this out; put the declaration in the unused first clause of the for (which will still put it in the new scope), and initialize it in the second clause so that it happens at the start of every iteration; combine that initialization with the actual test using the comma operator again. This removes all boilerplate from the statement block and therefore means you no longer have any braces, and thus no need for a second macro to close the braces.
Then you have a single macro invocation you can use like this:
NODE_TREE_TYPES (nt) {
if (nt->ext.free) {
nt->ext.free(nt->ext.data);
}
}
(you can then apply the generation of a unique identifier to this to get rid of shadowing easily, using techniques shown in other questions)
Is this ugly? Does abusing the for statement and comma operator make the average C programmer's skin crawl? Oh lord yes. BUT, it's a bit cleaner, and it's arguably the "right" way to mess about if you really have to mess about.
Having a "close" macro that inserts compound-statement-breaks or hides close braces is a much worse idea, because not only does it give you problems with identifiers and matching scope, but it also hides the block structure of the program from the reader; abuse of the for statement at least means that the block structure of the program, and variable scope and so on, is not mutilated as well.
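As a rough sketch of the unique-identifier step, here is a single-macro loop whose hidden variable gets a name derived from __LINE__. FOR_RANGE, UNIQUE and sum_products are hypothetical names and have nothing to do with the Blender API; the approach assumes that nested uses sit on different source lines, since two uses on one line would still collide.

```c
#define CONCAT_INNER(a, b) a##b
#define CONCAT(a, b) CONCAT_INNER(a, b)
/* UNIQUE(name) appends the current line number, e.g. range_end_42. */
#define UNIQUE(name) CONCAT(name, __LINE__)

/* Hypothetical single-macro iterator: the hidden bound variable gets a
 * per-line name, so nesting on different lines does not shadow it. */
#define FOR_RANGE(var, n) \
    for (int UNIQUE(range_end_) = (n), var = 0; var < UNIQUE(range_end_); ++var)

int sum_products(int rows, int cols)
{
    int sum = 0;
    FOR_RANGE(r, rows) {
        FOR_RANGE(c, cols) {
            sum += r * c;
        }
    }
    return sum;
}
```

The extra CONCAT level is needed so that __LINE__ is expanded to a number before ## pastes it onto the name.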

Macro vs Function in C

I often see instances in which using a macro is better than using a function.
Could someone explain to me, with an example, the disadvantage of a macro compared to a function?

Macros are error-prone because they rely on textual substitution and do not perform type-checking. For example, this macro:
#define square(a) a * a
works fine when used with an integer:
square(5) --> 5 * 5 --> 25
but does very strange things when used with expressions:
square(1 + 2) --> 1 + 2 * 1 + 2 --> 1 + 2 + 2 --> 5
square(x++) --> x++ * x++ --> increments x twice
Putting parentheses around arguments helps but doesn't completely eliminate these problems.
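A small sketch of what parentheses do and do not fix. The names SQUARE_BAD, SQUARE and square_int are chosen here so that all three variants can sit side by side.

```c
/* Unparenthesised version from the text, renamed for comparison. */
#define SQUARE_BAD(a) a * a
/* Fully parenthesised version: fixes precedence, not double evaluation. */
#define SQUARE(a) ((a) * (a))

/* A function evaluates its argument exactly once. */
static inline int square_int(int a) { return a * a; }

int square_bad_of_sum(void) { return SQUARE_BAD(1 + 2); }  /* 1 + 2 * 1 + 2 */
int square_of_sum(void)     { return SQUARE(1 + 2); }      /* (1 + 2) * (1 + 2) */

/* Passing (*x)++ to the function increments *x once; passing it to either
 * macro would modify *x twice without sequencing, which is undefined. */
int square_fn_postinc(int *x) { return square_int((*x)++); }
```

So the parenthesised macro handles square(1 + 2) correctly, but only the function is safe with an argument such as x++.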
When macros contain multiple statements, you can get in trouble with control-flow constructs:
#define swap(x, y) t = x; x = y; y = t;
if (x < y) swap(x, y); -->
if (x < y) t = x; x = y; y = t; --> if (x < y) { t = x; } x = y; y = t;
The usual strategy for fixing this is to put the statements inside a "do { ... } while (0)" loop.
If you have two structures that happen to contain a field with the same name but different semantics, the same macro might work on both, with strange results:
struct shirt
{
int numButtons;
};
struct webpage
{
int numButtons;
};
#define num_button_holes(shirt) ((shirt).numButtons * 4)
struct webpage page;
page.numButtons = 2;
num_button_holes(page) -> 8
Finally, macros can be difficult to debug: they can produce weird syntax errors or runtime errors that you have to expand to understand (e.g. with gcc -E), and debuggers cannot step through them. For example:
#define print(x, y) printf(x y) /* accidentally forgot comma */
print("foo %s", "bar") /* expands to printf("foo %s" "bar"), i.e. printf("foo %sbar"): %s has no matching argument */
Inline functions and constants help to avoid many of these problems with macros, but aren't always applicable. Where macros are deliberately used to specify polymorphic behavior, unintentional polymorphism may be difficult to avoid. C++ has a number of features such as templates to help create complex polymorphic constructs in a typesafe way without the use of macros; see Stroustrup's The C++ Programming Language for details.

Macro features:
Macro is preprocessed
No type checking
Code length increases (the body is repeated at every use)
Arguments with side effects may be evaluated more than once
Speed of execution can be faster (no call overhead)
Before compilation, the macro name is replaced by the macro value
Useful where small code appears many times
Errors in a macro are only reported where it is expanded
Function features:
Function is compiled
Type checking is done
Code length remains the same
Arguments are evaluated exactly once
Speed of execution can be slower (call overhead, unless inlined)
During a function call, transfer of control takes place
Useful where large code appears many times
Errors are reported at the function definition

Side-effects are a big one. Here's a typical case:
#define min(a, b) (a < b ? a : b)
min(x++, y)
gets expanded to:
(x++ < y ? x++ : y)
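x gets incremented twice whenever the comparison is true. (This particular expansion is not undefined behavior, because ?: has a sequence point after its first operand, but it is almost certainly not what the caller intended.)
Writing multi-line macros is also a pain: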
#define foo(a,b,c) \
a += 10; \
b += 10; \
c += 10;
They require a \ at the end of each line.
Macros can't "return" anything unless you make it a single expression:
int foo(int *a, int *b){
side_effect0();
side_effect1();
return a[0] + b[0];
}
Can't do that in a macro unless you use GCC's statement expressions. (EDIT: You can use a comma operator though... overlooked that... But it might still be less readable.)
Order of Operations: (courtesy of @ouah)
#define min(a,b) (a < b ? a : b)
min(x & 0xFF, 42)
gets expanded to:
(x & 0xFF < 42 ? x & 0xFF : 42)
But & has lower precedence than <. So 0xFF < 42 gets evaluated first.
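The precedence problem is easy to reproduce. In this sketch, MIN_BAD is the macro from above under a different name, MIN_OK is the parenthesised variant, and the two wrapper functions are hypothetical helpers.

```c
#define MIN_BAD(a, b) (a < b ? a : b)
#define MIN_OK(a, b)  ((a) < (b) ? (a) : (b))

/* The condition parses as x & (0xFF < 42), i.e. x & 0, which is always
 * false, so the result is 42 for every x. */
int low_byte_min_bad(int x) { return MIN_BAD(x & 0xFF, 42); }

/* With parentheses the condition is (x & 0xFF) < (42), as intended. */
int low_byte_min_ok(int x)  { return MIN_OK(x & 0xFF, 42); }
```

For x = 0x101 the low byte is 1, so the parenthesised macro yields 1 while the unparenthesised one yields 42. Compilers such as GCC may warn about the first form when warnings are enabled.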

When in doubt, use functions (or inline functions).
However, the answers here mostly explain the problems with macros, rather than taking the simple view that macros are evil because silly accidents are possible. You can be aware of the pitfalls and learn to avoid them. Then use macros only when there is a good reason to.
There are certain exceptional cases where there are advantages to using macros, these include:
Generic functions: as noted below, you can have a macro that can be used on different types of input arguments.
A variable number of arguments can map to different functions instead of using C's va_args, e.g. https://stackoverflow.com/a/24837037/432509.
They can optionally include local info, such as debug strings (__FILE__, __LINE__, __func__), check pre/post conditions, assert on failure, or even use static asserts so the code won't compile on improper use (mostly useful for debug builds).
Inspecting input args: you can do tests on input args such as checking their type or sizeof, or checking that struct members are present before casting (can be useful for polymorphic types), or check that an array meets some length condition. See: https://stackoverflow.com/a/29926435/432509
While it's noted that functions do type checking, C will coerce values too (ints/floats for example). In rare cases this may be problematic. It's possible to write macros which are more exacting than a function about their input args. See: https://stackoverflow.com/a/25988779/432509
Their use as wrappers to functions: in some cases you may want to avoid repeating yourself, e.g. func(FOO, "FOO");, where you could define a macro that expands the string for you: func_wrapper(FOO);
When you want to manipulate variables in the caller's local scope. Passing a pointer to a pointer normally works just fine, but in some cases it's less trouble to use a macro (assignments to multiple variables for per-pixel operations is an example where you might prefer a macro over a function, though it still depends a lot on the context, since inline functions may be an option).
Admittedly, some of these rely on compiler extensions which aren't standard C. Meaning you may end up with less portable code, or have to ifdef them in, so they're only taken advantage of when the compiler supports.
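The wrapper case from the list (func(FOO, "FOO") versus func_wrapper(FOO)) needs only the standard # stringification operator. A sketch, in which format_named, FORMAT_NAMED and enum color are hypothetical stand-ins for func and func_wrapper:

```c
#include <stdio.h>
#include <string.h>

enum color { RED, GREEN, BLUE };

/* Hypothetical function that wants both a value and its spelling. */
static int format_named(char *buf, size_t size, int value, const char *name)
{
    return snprintf(buf, size, "%s=%d", name, value);
}

/* #x turns the argument's source text into a string literal, so the
 * caller writes the name only once. */
#define FORMAT_NAMED(buf, size, x) format_named((buf), (size), (x), #x)
```

FORMAT_NAMED(buf, sizeof buf, GREEN) then writes "GREEN=1" into buf, which a plain function could not do without being handed the string separately.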
Avoiding multiple argument instantiation
Noting this since it's one of the most common causes of errors in macros (passing in x++ for example, where a macro may increment multiple times).
It's possible to write macros that avoid side-effects with multiple instantiation of arguments.
C11 Generic
If you'd like a square macro that works with various types and you have C11 support, you could do this:
static inline float _square_fl(float a) { return a * a; }
static inline double _square_dbl(double a) { return a * a; }
static inline int _square_i(int a) { return a * a; }
static inline unsigned int _square_ui(unsigned int a) { return a * a; }
static inline short _square_s(short a) { return a * a; }
static inline unsigned short _square_us(unsigned short a) { return a * a; }
/* ... long, char ... etc */
#define square(a) \
_Generic((a), \
float: _square_fl(a), \
double: _square_dbl(a), \
int: _square_i(a), \
unsigned int: _square_ui(a), \
short: _square_s(a), \
unsigned short: _square_us(a))
Statement expressions
This is a compiler extension supported by GCC, Clang, EKOPath & Intel C++ (but not MSVC);
#define square(a_) __extension__ ({ \
typeof(a_) a = (a_); \
(a * a); })
So the disadvantage with macros is you need to know to use these to begin with, and that they aren't supported as widely.
One benefit is, in this case, you can use the same square function for many different types.

Example 1:
#define SQUARE(x) ((x)*(x))
int main() {
int x = 2;
int y = SQUARE(x++); // Undefined behavior even though it doesn't look
// like it here
return 0;
}
whereas:
int square(int x) {
return x * x;
}
int main() {
int x = 2;
int y = square(x++); // fine
return 0;
}
Example 2:
struct foo {
int bar;
};
#define GET_BAR(f) ((f)->bar)
int main() {
struct foo f;
int a = GET_BAR(&f); // fine
int b = GET_BAR(&a); // error, but the message won't make much sense unless you
// know what the macro does
return 0;
}
Compared to:
struct foo {
int bar;
};
int get_bar(struct foo *f) {
return f->bar;
}
int main() {
struct foo f;
int a = get_bar(&f); // fine
int b = get_bar(&a); // error, but compiler complains about passing int* where
// struct foo* should be given
return 0;
}

No type checking of parameters and code is repeated which can lead to code bloat. The macro syntax can also lead to any number of weird edge cases where semi-colons or order of precedence can get in the way. Here's a link that demonstrates some macro evil

One drawback to macros is that debuggers read source code, which does not have expanded macros, so stepping through a macro in a debugger is not necessarily useful. Needless to say, you cannot set a breakpoint inside a macro like you can with functions.

Functions do type checking. This gives you an extra layer of safety.

Adding to this answer..
Macros are substituted directly into the program by the preprocessor (since they basically are preprocessor directives). So they typically use more code space than the corresponding function, because the body is repeated at every use. On the other hand, a function requires more time to be called and to return results, and this overhead can be avoided by using macros (or by letting the compiler inline the function).
Also, macros offer some special tools that can help with program portability across different platforms.
Macros don't need to be assigned a data type for their arguments in contrast with functions.
Overall they are a useful tool in programming. And both macroinstructions and functions can be used depending on the circumstances.

I did not notice, in the answers above, one advantage of functions over macros that I think is very important:
Functions can be passed as arguments, macros cannot.
Concrete example: You want to write an alternate version of the standard 'strpbrk' function that will accept, rather than an explicit list of characters to search for within another string, a (pointer to a) function that will return 0 until a character is found that passes some test (user-defined). One reason you might want to do this is so that you can exploit other standard library functions: instead of providing an explicit string full of punctuation, you could pass ctype.h's 'ispunct' instead, etc. If 'ispunct' was implemented only as a macro, this wouldn't work.
There are lots of other examples. For example, if your comparison is accomplished by macro rather than function, you can't pass it to stdlib.h's 'qsort'.
An analogous situation in Python is 'print' in version 2 vs. version 3 (non-passable statement vs. passable function).
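A sketch of the strpbrk-like idea described above. The name find_first is hypothetical; ispunct and isdigit are the standard ctype.h functions, which can be passed by name because the library must provide them as real functions even if it also defines macros for them.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns a pointer to the first character of s for which pred reports a
 * nonzero result, or NULL if there is none. */
const char *find_first(const char *s, int (*pred)(int))
{
    for (; *s != '\0'; ++s) {
        /* ctype predicates expect an unsigned char value (or EOF). */
        if (pred((unsigned char)*s))
            return s;
    }
    return NULL;
}
```

For example, find_first("abc, def1", ispunct) points at the comma and find_first("abc, def1", isdigit) points at the trailing 1.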

If you pass a function call as an argument to a macro, it is evaluated every time the parameter appears in the expansion.
For example, if you call one of the most popular macro:
#define MIN(a,b) ((a)<(b) ? (a) : (b))
like that
int min = MIN(functionThatTakeLongTime(1),functionThatTakeLongTime(2));
functionThatTakeLongTime will be evaluated 3 times (twice for the comparison and once more for the selected operand), which can significantly hurt performance.
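Counting the calls makes the extra evaluation visible. In this sketch slow_identity is a hypothetical stand-in for the expensive function, and call_count is a counter added purely for the demonstration.

```c
#define MIN(a, b) ((a) < (b) ? (a) : (b))

static int call_count = 0;

/* Stand-in for an expensive function; counts how often it runs. */
static int slow_identity(int v)
{
    ++call_count;
    return v;
}

/* Calls passed straight to the macro: both run for the comparison,
 * then the selected one runs again. */
int count_calls_in_min(void)
{
    call_count = 0;
    int m = MIN(slow_identity(1), slow_identity(2));
    (void)m;
    return call_count;
}

/* Storing the results first means each call runs exactly once. */
int count_calls_with_temporaries(void)
{
    call_count = 0;
    int a = slow_identity(1);
    int b = slow_identity(2);
    int m = MIN(a, b);
    (void)m;
    return call_count;
}
```

Storing the results in local variables before invoking the macro (or using an inline function instead of the macro) avoids the repeated call.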