Concatenation of tokens in variadic macros - c

In C, is it possible to concatenate each of the variable arguments in a variadic macro?
Example:
MY_MACRO(A, B, C) // will yield HDR_A, HDR_B, HDR_C
MY_MACRO(X, Y) // will yield HDR_X, HDR_Y
The normal ## operator has special meaning for variadic macros (avoiding the comma for empty argument list). And concatenation when used with __VA_ARGS__ takes place with the first token only.
Example:
#define MY_MACRO(...) HDR_ ## __VA_ARGS__
MY_MACRO(X, Y) // yields HDR_X, Y
Suggestions?

First, the comma rule you are mentioning is a gcc extension, standard C doesn't have it and most probably will never have it since the feature can be achieved by different means.
What you are looking for is meta programming with macros, which is possible, but you'd need some tricks to achieve that. P99 provides you with tools for that:
#define MY_PREFIX(NAME, X, I) P99_PASTE2(NAME, X)
#define MY_MACRO(...) P99_FOR(HDR_, P99_NARG(__VA_ARGS__), P00_SEQ, MY_PREFIX, __VA_ARGS__)
Here MY_PREFIX describes what has to be done with the individual items.
P00_SEQ declares how the items should be separated.
P99_NARG just counts the number of arguments.
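If pulling in P99 is not an option, the same effect can be approximated by hand for a small, fixed maximum number of arguments. The following is only a sketch: the helper names (HDR_EACH_1 to HDR_EACH_4, HDR_PICK) and the limit of four arguments are assumptions made for illustration and are not part of P99, and an empty argument list is not handled.

```c
#include <assert.h>

/* All helper names below are hypothetical; none of them come from P99. */

/* One helper per argument count: paste the first item, delegate the rest. */
#define HDR_EACH_1(a)      HDR_##a
#define HDR_EACH_2(a, ...) HDR_##a, HDR_EACH_1(__VA_ARGS__)
#define HDR_EACH_3(a, ...) HDR_##a, HDR_EACH_2(__VA_ARGS__)
#define HDR_EACH_4(a, ...) HDR_##a, HDR_EACH_3(__VA_ARGS__)

/* Selects the helper that matches the number of arguments (1 to 4). */
#define HDR_PICK(p1, p2, p3, p4, NAME, ...) NAME
#define MY_MACRO(...) \
    HDR_PICK(__VA_ARGS__, HDR_EACH_4, HDR_EACH_3, HDR_EACH_2, HDR_EACH_1, 0)(__VA_ARGS__)

enum { HDR_A = 1, HDR_B = 2, HDR_C = 4 };

/* MY_MACRO(A, B, C) expands to HDR_A, HDR_B, HDR_C. */
static int hdr_sum(void)
{
    int values[] = { MY_MACRO(A, B, C) };
    return values[0] + values[1] + values[2];
}

static void hdr_check(void)
{
    assert(hdr_sum() == 7);
}
```

The trailing 0 passed to HDR_PICK only ensures that the variadic part is never empty, which keeps the sketch within standard C99.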

Related

C Macro Arguments including comma

I would like to pass 2 arguments to a macro using another macro:
#define A_AND_B 5,1
#define ADD(a, b) a + b
int add_them(void)
{
int result = ADD(A_AND_B);
return result ;
}
I would hope that expands to
int result = 5 + 1;
and I get 6. Instead I get
Error 'ADD' undeclared (first use in this function)
Error macro "ADD" requires 2 arguments, but only 1 given
Is there a way around this?
As is often the case, you need an extra level of (macro) indirection:
#define A_AND_B 5,1
#define ADD(...) ADD_(__VA_ARGS__)
#define ADD_(a, b) a + b
int add_them(void)
{
int result = ADD(A_AND_B);
return result ;
}
ADD is defined as variadic so that it will work as either ADD(A_AND_B) or ADD(A, B).
This works because __VA_ARGS__ in the replacement body of ADD is replaced with the actual arguments before the replacement body is scanned for macros.
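A compilable sketch of this indirection follows. The function names are made up for illustration, and ADD_ is parenthesized here, which the snippet above does not do.

```c
#include <assert.h>

#define A_AND_B 5, 1

/* ADD_ does the real work and requires exactly two arguments. */
#define ADD_(a, b) ((a) + (b))
/* ADD accepts any argument list; A_AND_B is expanded to 5, 1 before ADD_ is scanned. */
#define ADD(...) ADD_(__VA_ARGS__)

/* Hypothetical demo functions. */
static int add_from_macro(void)    { return ADD(A_AND_B); } /* ((5) + (1)) */
static int add_from_literals(void) { return ADD(2, 3); }    /* ((2) + (3)) */

static void add_check(void)
{
    assert(add_from_macro() == 6);
    assert(add_from_literals() == 5);
}
```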
Per C 2018 6.10.3.1, a compiler first identifies the arguments for a function-like macro and then performs macro replacement on the arguments, followed by macro replacement for the function-like macro. This means that, in ADD(A_AND_B), the argument is identified as A_AND_B before it is replaced with 5,1. As the macro invocation has only this single argument and the macro is defined to have two parameters, an error is diagnosed.
Given your definition of ADD, there is no way to change this behavior in a compiler that conforms to the C standard.
You can instead use another macro to expand the arguments and apply the desired macro:
#define Apply(Macro, Arguments) Macro(Arguments)
then int result = Apply(ADD, A_AND_B); will work. That will identify ADD and A_AND_B as arguments to Apply. Then it will expand those, producing an unchanged ADD and 5,1. Then the macro replacement for Apply produces ADD(5,1). Then this is again processed for macro replacement, which replaces ADD(5,1) in the ordinary way.
Note that good practice is usually to define ADD as #define ADD(a, b) ((a) + (b)) to avoid unexpected interactions with other operators neighboring the use of the macro.
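Putting the Apply approach and the parenthesized ADD together gives something like the following sketch. The function names are hypothetical and only there so the result can be checked.

```c
#include <assert.h>

#define A_AND_B 5, 1

/* The two-parameter ADD is left as the question defined it, plus parentheses. */
#define ADD(a, b) ((a) + (b))

/* Apply expands its arguments first, then forms Macro(Arguments) for rescanning. */
#define Apply(Macro, Arguments) Macro(Arguments)

static int apply_demo(void)
{
    return Apply(ADD, A_AND_B);      /* becomes ADD(5, 1), then ((5) + (1)) */
}

static int apply_precedence_demo(void)
{
    return 2 * Apply(ADD, A_AND_B);  /* parentheses keep this at 2 * 6 */
}

static void apply_check(void)
{
    assert(apply_demo() == 6);
    assert(apply_precedence_demo() == 12);
}
```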

Generalized iteration over arguments of macro in the C preprocessor

There were several questions here regarding variadic macros in C. These include:
How to make a variadic macro (variable number of arguments) which explains the basics, e.g., passing a variable number of arguments to functions such as printf
Is it possible to iterate over arguments in variadic macros?, which explains how to iteratively apply a macro to each of the arguments of the variadic macro.
https://github.com/swansontec/map-macro which explains how to do so on pairs
My question is related to the iteration technique. I am interested in a macro with this generalized semantics.
ITERATE(Before, Action, Between, After, Empty, ...)
that will place Before prior to all expansions, apply Action to each argument, place Between between every two consecutive applications, and finally place the expansion of After. Moreover, if the number of arguments is zero, it should expand to Empty. With such a macro, it should be possible to write
// Loop elements
#define A(x) (x)
#define Before (
#define After )
#define Between ||
#define Empty 1
// Define an OR macro
#define OR(...) ITERATE(Before, A, Between, After, Empty, __VA_ARGS__)
// Use it
OR() // Expands to 1
OR(a) // Expands to ((a))
OR(a,b) // Expands to ((a)||(b))
OR(a,b,c) // Expands to ((a)||(b)||(c))
The purpose of course is not to write an OR function. A generalized functionality could be for more intricate applications. E.g., a macro for defining classes and functions, something to print the trace, etc.
I never liked the recursive REPEAT() macro idiom. It generates horrible error messages that take an hour to read and are themselves recursive, so you don't know where the error is, and it's hard to grasp how the OBSTRUCT(REPEAT_INDIRECT) () stuff works. Overall, overloading the macro on the number of arguments and using an external tool (a shell script or the m4 preprocessor) to generate the C source code is much easier to read, maintain and fix, and you can expand the macros on the tool's side, removing the burden of recursive expansion from the C side. With that in mind, your ITERATE can be generated with existing preprocessor libraries; P99_FOR or Boost.Preprocessor's BOOST_PP_SEQ_FOR_EACH come to mind.
Also, typing shift all the time is strange - I prefer snake case. Here's a reduced example without Before and After macros with overloading the macro on number of arguments:
#define _in_ITERATE_0(f,b,e) e()
#define _in_ITERATE_1(f,b,e,_1) f(_1)
#define _in_ITERATE_2(f,b,e,_1,...) f(_1)b()_in_ITERATE_1(f,b,e,__VA_ARGS__)
#define _in_ITERATE_3(f,b,e,_1,...) f(_1)b()_in_ITERATE_2(f,b,e,__VA_ARGS__)
// or you could expand it instead of reusing previous one with same result:
#define _in_ITERATE_4(f,b,e,_1,_2,_3,_4) f(_1)b()f(_2)b()f(_3)b()f(_4)
// etc.... generate
#define _in_ITERATE_N(_0,_1,_2,_3,_4,_5,_6,_7,_8,_9,N,...) _in_ITERATE_##N
#define ITERATE(func, between, empty, ...) \
_in_ITERATE_N(0,##__VA_ARGS__,9,8,7,6,5,4,3,2,1,0)(func, between, empty, ##__VA_ARGS__)
#define _in_OR_OP(x) (x)
#define _in_OR_EMPTY() 1
#define _in_OR_BETWEEN() ||
#define OR(...) (ITERATE(_in_OR_OP, _in_OR_BETWEEN, _in_OR_EMPTY, ##__VA_ARGS__))
// Use it
OR() // Expands to (1)
OR(a) // Expands to ((a))
OR(a,b) // Expands to ((a)||(b))
OR(a,b,c) // Expands to ((a)||(b)||(c))
outputs:
(1)
((a))
((a)||(b))
((a)||(b)||(c))
For more examples of overloading a macro on the number of arguments, see this thread. I am using the GNU ## extension to remove the comma before __VA_ARGS__ because I am used to it; I think __VA_OPT__(,) should be preferred nowadays, but I am not sure.
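If neither the GNU ## extension nor __VA_OPT__ is available, the same overloading idea still works in standard C as long as the empty case is given up. The sketch below is a variation on the macros above, not the answer's own code: the names ITERATE_NONEMPTY, IT_1 to IT_4, IT_PICK, SQUARE and PLUS are hypothetical, and at most four arguments are supported.

```c
#include <assert.h>

/* Hypothetical helpers: apply f to each item and put b() between the results. */
#define IT_1(f, b, x)      f(x)
#define IT_2(f, b, x, ...) f(x) b() IT_1(f, b, __VA_ARGS__)
#define IT_3(f, b, x, ...) f(x) b() IT_2(f, b, __VA_ARGS__)
#define IT_4(f, b, x, ...) f(x) b() IT_3(f, b, __VA_ARGS__)

/* Picks IT_n for n = 1..4; the trailing 0 keeps the variadic part non-empty. */
#define IT_PICK(p1, p2, p3, p4, NAME, ...) NAME
#define ITERATE_NONEMPTY(f, b, ...) \
    IT_PICK(__VA_ARGS__, IT_4, IT_3, IT_2, IT_1, 0)(f, b, __VA_ARGS__)

#define SQUARE(x) ((x) * (x))
#define PLUS() +
#define SUM_OF_SQUARES(...) (ITERATE_NONEMPTY(SQUARE, PLUS, __VA_ARGS__))

static int sum_of_squares_demo(void)
{
    /* (((1) * (1)) + ((2) * (2)) + ((3) * (3))) */
    return SUM_OF_SQUARES(1, 2, 3);
}

static void iterate_check(void)
{
    assert(sum_of_squares_demo() == 14);
    assert(SUM_OF_SQUARES(5) == 25);
}
```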

Surprising expansion of variadic GNU C preprocessor macros in the presence of the ## operator

If we define a macro
#define M(x, ...) { x, __VA_ARGS__ }
and then use it passing itself as an argument
M(M(1, 2), M(3, 4), M(5, 6))
then it expands to the expected form:
{ { 1, 2 }, { 3, 4 }, { 5, 6 } }
However, when we use the ## operator (to prevent dangling comma from appearing in the output in the case of the single argument invocations, as documented in the GCC manual), i.e.
#define M0(x, ...) { x, ## __VA_ARGS__ }
then the expansion of arguments in
M0(M0(1,2), M0(3,4), M0(5,6))
seems to stop after the first argument, i.e. we get:
{ { 1,2 }, M0(3,4), M0(5,6) }
Is this behavior a bug, or does it stem from some principle?
(I have also checked it with clang, and it behaves in the same way as GCC)
Way down at the end of this answer there is a possible solution.
Is this behavior a bug, or does it stem from some principle?
It stems from two principles whose interaction is pretty subtle. So I agree that it is surprising, but it's not a bug.
The two principles are the following:
Inside the replacement of a macro invocation, that macro is not expanded. (See the GCC Manual Section 3.10.5, Self-Referential Macros or the C Standard, §6.10.3.4 paragraph 2.) This precludes recursive macro expansion, which in most cases would produce infinite recursion if allowed. Although it is likely that no-one anticipated such uses, it turns out that there would be ways of using recursive macro expansion which would not result in infinite recursion (see the Boost Preprocessor Library documentation for a thorough discussion of this issue), but the standard isn't going to get changed now.
If ## is applied to a macro argument, it suppresses macro expansion of that argument. (See the GCC Manual section 3.5, Concatenation or the C Standard, §6.10.3.3 paragraph 2.) The suppression of expansion is part of the C Standard, but GCC/Clang's extension to allow use of ## to conditionally suppress the comma preceding __VA_ARGS__ is non-standard. (See the GCC Manual Section 3.6, Variadic Macros.) Apparently, the extension still respects the standard's rule about not expanding concatenated macro arguments.
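The first principle can be observed in isolation with a self-referential object-like macro. This is a minimal sketch in the spirit of the example in the GCC manual; the names level and read_level are hypothetical.

```c
#include <assert.h>

static int level = 1;          /* an ordinary variable */

/* Inside its own replacement, `level` is left alone instead of recursing. */
#define level (4 + level)

static int read_level(void)
{
    return level;              /* expands once, to (4 + level) */
}

static void self_reference_check(void)
{
    assert(read_level() == 5);
}
```

Without the rule, the replacement would have to be expanded again and again without end; with it, the inner name simply refers to the variable.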
Now, the curious thing about the second point, with respect to optional comma suppression, is that you hardly ever notice it in practice. You can use ## to conditionally suppress commas and arguments will still get expanded as normal:
#define SHOW_ARGS(arg1, ...) Arguments are (arg1, ##__VA_ARGS__)
#define DOUBLE(a) (2 * a)
SHOW_ARGS(DOUBLE(2))
SHOW_ARGS(DOUBLE(2), DOUBLE(3))
This expands to:
Arguments are ((2 * 2))
Arguments are ((2 * 2), (2 * 3))
Both DOUBLE(2) and DOUBLE(3) are expanded normally, despite the fact that one of them is an argument to the concatenation operator.
But there's a subtlety to macro expansion. Expansion happens twice:
First, macro arguments are expanded. (This expansion is in the context of the text which invokes the macro.) These expanded arguments are substituted for the parameters in the macro replacement body (but only where the parameter is not an argument to # or ##).
Then the # and ## operators are applied to the replacement token list.
Finally, the resulting replacement tokens are inserted into the input stream, so that they are expanded again. This time, the expansion is in the context of the macro so recursive invocation is suppressed.
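The exception in the first step (no pre-expansion for operands of # or ##) is easiest to see with stringification. The following sketch uses hypothetical names (VERSION, STR_RAW, STR) and is only meant to illustrate the ordering described above.

```c
#include <assert.h>
#include <string.h>

#define VERSION 42

#define STR_RAW(x) #x          /* x is an operand of #, so it is used unexpanded */
#define STR(x)     STR_RAW(x)  /* x is a plain parameter, so it is expanded first */

static const char *raw_text(void)      { return STR_RAW(VERSION); } /* "VERSION" */
static const char *expanded_text(void) { return STR(VERSION); }     /* "42" */

static void stringify_check(void)
{
    assert(strcmp(raw_text(), "VERSION") == 0);
    assert(strcmp(expanded_text(), "42") == 0);
}
```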
With that in mind, we see that in SHOW_ARGS(DOUBLE(2), DOUBLE(3)), DOUBLE(2) is expanded in step 1, before being inserted into the replacement token list, and DOUBLE(3) is expanded in step 3, as part of the replacement token list.
This doesn't make a difference with DOUBLE inside SHOW_ARGS, since they're different macros. But the difference would become apparent if they were the same macro.
To see the difference, consider the following macro:
#define INVOKE(A, ...) A(__VA_ARGS__)
That macro creates a macro invocation (or a function invocation, but here we're only interested in the case where it's a macro). That is, it turns INVOKE(X, Y) into X(Y). (That's a simplification of a useful feature, where the named macro is actually invoked several times, possibly with slightly different arguments.)
That works fine with SHOW_ARGS:
INVOKE(SHOW_ARGS, one arg)
⇒ Arguments are (one arg)
But if we try to INVOKE the macro INVOKE itself, we find that the ban on recursive invocation takes effect:
INVOKE(INVOKE, SHOW_ARGS, one arg)
⇒ INVOKE(SHOW_ARGS, one arg)
"Of course", we could expand INVOKE as an argument to INVOKE:
INVOKE(SHOW_ARGS, INVOKE(SHOW_ARGS, one arg))
⇒ Arguments are (Arguments are (one arg))
That works fine because there is no ## inside INVOKE, so expansion of the argument is not suppressed. But if the expansion of the argument had been suppressed, then the argument would be inserted into the macro body unexpanded, and then it would become a recursive expansion.
So that's what is going on in your example:
#define M0(x, ...) { x, ## __VA_ARGS__ }
M0(M0(1,2), M0(3,4), M0(5,6))
⇒ { { 1,2 }, M0(3,4), M0(5,6) }
Here, the first argument to the outer M0, M0(1,2), is not used with ##, so it is expanded as part of the invocation. The other two arguments are part of __VA_ARGS__, which is used with ##. Consequently, they are not expanded prior to being substituted into the macro's replacement list. But as part of the macro's replacement list, their expansion is suppressed by the no-recursive-macros rule.
You can easily work around that by defining two versions of the M0 macro, with the same contents but different names (as suggested in a comment to the OP):
#define M0(x, ...) { x, ## __VA_ARGS__ }
#define M1(x, ...) { x, ## __VA_ARGS__ }
M0(M1(1,2), M1(3,4), M1(5,6))
⇒ { { 1,2 }, { 3,4 }, { 5,6 } }
But that's not very pleasant.
Solution: Use __VA_OPT__
C++20 added a feature designed specifically to assist with suppressing commas in variadic invocations: the __VA_OPT__ function-like macro; C23 adopted it as well. Inside a variadic macro expansion, __VA_OPT__(x) expands to its argument provided that there is at least one token in the variadic arguments. But if __VA_ARGS__ expands to an empty token list, so does __VA_OPT__(x). Thus, __VA_OPT__(,) can be used for conditional suppression of a comma just like the GCC ## extension, but unlike ##, it does not trigger suppression of macro expansion.
Recent versions of GCC and Clang also implement __VA_OPT__ as an extension in earlier C modes as well as in C++. (See the GCC Manual Section 3.6, Variadic Macros.) So if you're willing to rely on relatively recent compiler versions, there is a very clean solution:
#define M0(x, ...) { x __VA_OPT__(,) __VA_ARGS__ }
M0(M0(1,2), M0(3,4), M0(5,6))
⇒ { { 1 , 2 } , { 3 , 4 }, { 5 , 6 } }
Notes:
You can see these examples on Godbolt
This question was originally closed as a duplicate of Variadic macros: expansion of pasted tokens but I don't think that answer is really adequate to this particular situation.

Renaming a macro in C

Let's say I have already defined 9 macros from
ABC_1 to ABC_9
If there is another macro XYZ(num) whose objective is to call one of the ABC_{i} based on the value of num, what is a good way to do this? i.e. XYZ(num) should call/return ABC_num.
This is what the concatenation operator ## is for:
#define XYZ(num) ABC_ ## num
Macro arguments that are operands of ## are evaluated differently, however: they are not expanded before being pasted (which is what allows name-pasting), only in the rescan pass. So if the number is stored in a second macro (or is the result of any kind of expansion, rather than a plain literal), you'll need another layer of evaluation:
#define XYZ(num) XYZ_(num)
#define XYZ_(num) ABC_ ## num
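A small sketch of the difference, assuming hypothetical values for ABC_1 and ABC_2 and a hypothetical macro WHICH that holds the number:

```c
#include <assert.h>

/* Hypothetical values; the question does not say what the ABC_n macros are. */
#define ABC_1 10
#define ABC_2 20

#define XYZ(num)  XYZ_(num)     /* num is expanded here, before being passed on */
#define XYZ_(num) ABC_ ## num   /* num is pasted as-is */

#define WHICH 2                 /* the number is stored in another macro */

static int pick_literal(void) { return XYZ(1); }      /* ABC_1 */
static int pick_macro(void)   { return XYZ(WHICH); }  /* ABC_2, not ABC_WHICH */

static void xyz_check(void)
{
    assert(pick_literal() == 10);
    assert(pick_macro() == 20);
}
```

With the single-level definition, XYZ(WHICH) would instead produce the identifier ABC_WHICH.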
In the comments you say that num should be a variable, not a constant. The preprocessor builds compile-time expressions, not dynamic ones, so a macro isn't really going to be very useful here.
If you really wanted XYZ to have a macro definition, you could use something like this:
#define XYZ(num) ((int[]){ \
0, ABC_1, ABC_2, ABC_3, ABC_4, ABC_5, ABC_6, ABC_7, ABC_8, ABC_9 \
}[num])
Assuming ABC_{i} are defined as int values (at any rate they must all be the same type - this applies to any method of dynamically selecting one of them), this selects one with a dynamic num by building a temporary array and selecting from it.
This has no obvious advantages over a completely non-macro solution, though. (Even if you wanted to use macro metaprogramming to generate the list of names, you could still do that in a function or array definition.)
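One possible shape for such a non-macro solution is a lookup table behind a function. This is only a sketch: the values given to ABC_1 through ABC_9, the function name xyz and the fallback parameter are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical values; the question does not say what ABC_1..ABC_9 are. */
#define ABC_1 10
#define ABC_2 20
#define ABC_3 30
#define ABC_4 40
#define ABC_5 50
#define ABC_6 60
#define ABC_7 70
#define ABC_8 80
#define ABC_9 90

static const int abc_table[] = {
    ABC_1, ABC_2, ABC_3, ABC_4, ABC_5, ABC_6, ABC_7, ABC_8, ABC_9
};

/* Returns ABC_num for num in 1..9, or fallback when num is out of range. */
static int xyz(size_t num, int fallback)
{
    if (num < 1 || num > sizeof abc_table / sizeof abc_table[0])
        return fallback;
    return abc_table[num - 1];
}

static void table_check(void)
{
    assert(xyz(1, -1) == 10);
    assert(xyz(9, -1) == 90);
    assert(xyz(0, -1) == -1);
}
```

Unlike the compound-literal macro, the table is built once and an out-of-range num is caught instead of indexing past the end of the array.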
Yes, that's possible, using concatenation. For example:
#define FOO(x, y) BAR ##x(y)
#define BAR1(y) "hello " #y
#define BAR2(y) int y()
#define BAR3(y) return y
FOO(2, main)
{
puts(FOO(1, world));
FOO(3, 0);
}
This becomes:
int main()
{
puts("hello " "world");
return 0;
}

Macro expansion of __typeof__ to function name

I wrote the following code in plain C:
#define _cat(A, B) A ## _ ## B
#define cat(A, B) _cat(A, B)
#define plus(A, B) cat(cat(plus,__typeof__(A)),__typeof__(B))(A, B)
int main(int argc, const char * argv[])
{
double x = 1, y = 0.5;
double r = plus(x, y);
printf("%lf",r);
return 0;
}
Here, I would like the macro plus to be expanded becoming a function name which contains the types of the parameters. In this example I would like it to expand the following way
double r = plus(x, y)
...
/* First becomes*/
double r = cat(cat(plus,double),double)(x, y)
...
/* Then */
double r = cat(plus_double,double)(x, y)
...
/* And finally */
double r = plus_double_double(x, y)
However all I got from the preprocessor is
double r = plus___typeof__(x)___typeof__(y)(x,y)
and gcc will obviously refuse to compile.
Now, I know that typeof evaluates at compile-time and it is my understanding that a macro is only prevented from being evaluated when it is contained in a second macro which directly involves the stringify # and the concatenation ## tokens (here's the reason why I split cat in the way you see). If this is right, why doesn't __typeof__(x) get evaluated to double by the preprocessor? Seems to me that the behaviour should be perfectly clear at build time. Shouldn't __typeof__(x) evaluate to double before even going in _cat?
I searched and searched but I couldn't find anything... Am I doing something really really stupid?
I'm running Mac OS X Mountain Lion but I'm mostly interested in getting it work on any POSIX platform.
The reason this does not work is that typeof is not a macro but a reserved word in GCC's dialect of C, and is thus handled after the preprocessor has finished its work. A good analogy would be the sizeof operator, which is not a macro either and is not expanded by the preprocessor. To do (approximately) what you want (pick a different function based on the type of the arguments), try the _Generic construct (new in C11).
Macro expansion occurs before C token analysis (see https://stackoverflow.com/a/1479972/1583175 for a diagram of the phases of translation)
The macro preprocessor is unaware of the type information -- it merely does text processing
The preprocessor knows nothing about types, only about textual tokens. __typeof__() gets evaluated by the compiler pass, after the preprocessor has finished performing macro replacements.
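A minimal sketch of the _Generic route mentioned above, assuming a C11 compiler. The four plus_* functions and the choice to cover only int and double are assumptions made for illustration; the selection is made by the compiler from the argument types, not by the preprocessor.

```c
#include <assert.h>

/* Hypothetical per-type implementations. */
static double plus_double_double(double a, double b) { return a + b; }
static double plus_double_int(double a, int b)       { return a + b; }
static double plus_int_double(int a, double b)       { return a + b; }
static int    plus_int_int(int a, int b)             { return a + b; }

/* Every inner selection lists both types, so each one is valid for any
   supported B, including those in the branch that is not chosen. */
#define plus(A, B) _Generic((A), \
        double: _Generic((B), double: plus_double_double, int: plus_double_int), \
        int:    _Generic((B), double: plus_int_double,    int: plus_int_int)     \
    )(A, B)

static double plus_demo(void)
{
    double x = 1, y = 0.5;
    return plus(x, y);          /* resolves to plus_double_double(x, y) */
}

static void generic_check(void)
{
    assert(plus_demo() == 1.5);
    assert(plus(1, 2) == 3);
    assert(plus(1, 0.5) == 1.5);
}
```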

Resources