Understanding recursive Macro Expansions - c

I came across this question in an Embedded interview question set.
#define cat(x,y) x##y
This concatenates x and y. But cat(cat(1,2),3) does not expand as expected and instead produces a preprocessor warning. Why?
Does C not encourage recursive macro expansion? My assumption is that the expression should expand to 1##2##3. Am I wrong?

The problem is that cat(cat(1,2),3) isn't expanded the way you might expect, with cat(1,2) giving 12 and cat(12,3) giving 123.
Macro parameters that are preceded or followed by ## in a replacement list aren't expanded at the time of substitution. As a result, cat(cat(1,2),3) expands to cat(1,2)3, which can't be expanded further since there is no macro named cat(1,2)3. The paste itself joins the ) that ends cat(1,2) with 3, which does not form a valid preprocessing token; that is what the diagnostic is about.
So the simple rule is that macros whose replacement lists depend on ## usually can't be called in a nested fashion.
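A common workaround, sketched here rather than taken from the question, is to route the call through a second macro so that the arguments are fully expanded before ## sees them. The names xcat and nested_cat_value are hypothetical.

```c
#include <assert.h>

#define cat(x, y)  x##y        /* pastes its arguments exactly as written */
#define xcat(x, y) cat(x, y)   /* hypothetical helper: x and y are expanded first */

/* cat(cat(1,2),3) is left out of the compiled code because the paste
   of ")" and "3" is not a valid token. */

/* xcat(xcat(1,2),3): the inner call becomes 12, then cat(12,3) gives 123. */
int nested_cat_value(void)
{
    int value = xcat(xcat(1, 2), 3);
    assert(value == 123);
    return value;
}
```

Because neither parameter of xcat is an operand of ##, the inner xcat(1, 2) is expanded to 12 before it reaches cat.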

Related

How to define a macro that takes an arbitrary number of arguments and expands to give only the even arguments? [duplicate]

I am working on a recursive macro. However, it seems that it is not expanded recursively. Here is a minimal working example to show what I mean:
// ignore input, do nothing
#define ignore(...)
// choose between 6 names, depending on arity
#define choose(_1,_2,_3,_4,_5,_6,NAME,...) NAME
// if more than one parameter is given to this macro, then execute f, otherwise ignore
#define ifMore(f,...) choose(__VA_ARGS__,f,f,f,f,f,ignore)(__VA_ARGS__)
// call recursively if there are more parameters
#define recursive(first,args...) first:ifMore(recursive,args)
recursive(a,b,c,d)
// should print: a:b:c:d
// prints: a:recursive(b,c,d)
The recursive macro should expand itself recursively and always concatenate the result, separated by a colon. However, it doesn't work. The recursive macro is generated correctly (as can be seen in the result a:recursive(b,c,d), which includes a well-formed call to the macro again), but the generated recursive call is not expanded.
Why is this the case and how can I get the behaviour I want?
You can't get the behaviour you want. The C preprocessor is, by design, not Turing complete.
You can use multiple macros to get multiple replacements, but you will not achieve true recursion with an arbitrary number of replacements.
As others have mentioned, pure recursion is impossible with C macros. It is, however, possible to simulate recursion-like effects.
The Boost Pre-Processor tools do this well for both C and C++ and are a stand-alone library:
http://www.boost.org/doc/libs/1_60_0/libs/preprocessor/doc/index.html
The preprocessor will not re-expand the macro that you define. That is, it replaces the macro invocation with the text from its definition and does not expand that same macro again inside the result. See, for example, "Can we have recursive macros?", "Macro recursive expansion to a sequence" and "C preprocessor, recursive macros".
That is, recursive(a,b,c,d) will be expanded to a:recursive(b,c,d) and the pre-processor will then continue to the next line in the base code. It will not loop around to try to continue to expand the string (see the links that I cited).
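The "multiple macros" idea mentioned above can be sketched as one macro per argument count, each calling the next lower one instead of itself. This is only an illustration: JOIN, JOIN_1 to JOIN_4, PICK and join_demo are hypothetical names, the chain stops at four arguments, and it builds a string literal (rather than the bare tokens a:b:c:d) so the result can be checked at run time.

```c
#include <assert.h>
#include <string.h>

/* One macro per arity; each level calls the next lower level, never itself. */
#define JOIN_1(a)      #a
#define JOIN_2(a, ...) #a ":" JOIN_1(__VA_ARGS__)
#define JOIN_3(a, ...) #a ":" JOIN_2(__VA_ARGS__)
#define JOIN_4(a, ...) #a ":" JOIN_3(__VA_ARGS__)

/* Always yields its fifth argument; the caller's arguments shift the list. */
#define PICK(_1, _2, _3, _4, NAME, ...) NAME

/* The trailing 0 keeps the variadic part of PICK non-empty. */
#define JOIN(...) \
    PICK(__VA_ARGS__, JOIN_4, JOIN_3, JOIN_2, JOIN_1, 0)(__VA_ARGS__)

const char *join_demo(void)
{
    /* Adjacent string literals "a" ":" "b" ":" "c" ":" "d" are merged. */
    const char *joined = JOIN(a, b, c, d);
    assert(strcmp(joined, "a:b:c:d") == 0);
    return joined;
}
```

Supporting more arguments means adding more JOIN_n levels by hand, which is the limitation the answers above describe.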

Parameterized macros involving the ## operator in the replacement-list

In the book that I am reading "C Programming A Modern Approach", there is a section on Page 343 that discusses some tricks you can use to get around certain deficits in macros.
The example problem is depicted as follows:
#define CONCAT(x,y) x##y (Directive 1)
The author then explains that the following line of code will fail to function as intended if using the aforementioned directive:
CONCAT(a, CONCAT(b,c))
This line of code will result in aCONCAT(b,c) as opposed to the desired abc.
In order to address this shortcoming, the author proposes the following work-around:
#define CONCAT2(x,y) CONCAT(x,y) (Directive 2)
The author explains that the presence of Directive 1 and Directive 2 will ensure that the slightly different line of code CONCAT2(a, CONCAT2(b,c)) is correctly replaced with abc.
(notice that this line of code is different than the original line of code...CONCAT2 is used instead of CONCAT.)
Could someone please walk me through why this will successfully carry out the desired objective? From what I understand, the preprocessor will keep scanning the source code until all defined terms have been dealt with. For a given scan, how many defined words are updated per line?
I would think that the following flow of preprocessing replacements take place:
Given CONCAT2(a, CONCAT2(b,c))...
First pass over: CONCAT(a, CONCAT2(b,c))
However, for the second pass over, does CONCAT get expanded to its replacement list expression? Or does CONCAT2 get expanded to its replacement list expression? In either case, it seems like we once again arrive at a failed expression of either aCONCAT2(b,c) or CONCAT(a, CONCAT(b,c)), which would therefore still fail just like the very original case we presented.
Any help is greatly appreciated!
When the preprocessor detects a function-like macro invocation while scanning a source line, it completely expands the macro's arguments before substituting them into the macro's replacement text, except that where an argument appears as an operand of the stringification (#) or token-pasting (##) operator, its literal value is used for the operation. The resulting replacement text, with expanded arguments and the results of any # and ## operations substituted, is then rescanned for additional macros to expand.
Thus, with
CONCAT(a, CONCAT(b,c))
the literal values of both arguments are used as operands for the token-pasting operation. The result is
aCONCAT(b,c)
That is rescanned for further macros to expand, but aCONCAT is not defined as a macro name, so no further macro expansion occurs.
Now consider
CONCAT2(a, CONCAT2(b,c))
In CONCAT2, neither argument is an operand of # or ##, so both are fully macro-expanded before being substituted. Of course a is unchanged, but CONCAT2(b,c) expands to CONCAT(b,c), which upon rescan is expanded to bc. By substitution of the expanded argument values into its replacement text, the outer CONCAT2 invocation expands to
CONCAT(a, bc)
That expansion is then rescanned, in the context of the surrounding source text, for further macro expansion, yielding
abc
That is again rescanned, but there are no further macro expansions to perform, so that's the final result.
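The walk-through can be checked with a small program. This is a sketch: STR, STR_ and concat2_result are hypothetical helpers added only to turn the final token into a string that can be compared.

```c
#include <assert.h>
#include <string.h>

#define CONCAT(x, y)  x##y          /* Directive 1 */
#define CONCAT2(x, y) CONCAT(x, y)  /* Directive 2 */

/* Hypothetical helpers: STR expands its argument, then STR_ quotes it. */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* CONCAT2(a, CONCAT2(b,c))
     inner argument: CONCAT2(b,c) -> CONCAT(b,c) -> bc
     substitution:   CONCAT(a, bc)
     rescan:         abc                                  */
const char *concat2_result(void)
{
    const char *text = STR(CONCAT2(a, CONCAT2(b, c)));
    assert(strcmp(text, "abc") == 0);
    return text;
}
```

STR uses the same two-level trick as CONCAT2: its parameter is not an operand of #, so the argument is expanded before STR_ stringizes it.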

Can anybody please explain the behavour of C preprocessor in following examples?

I am implementing a C macro preprocessor (C99)...
I am surprised by the following behaviour....
Ex1:
#define PASTE(x) X_##x
#define EXPAND(x) PASTE(x)
#define TABSIZE 1024
#define BUFSIZE TABSIZE
PASTE(BUFSIZE)
EXPAND(BUFSIZE)
expands to:
X_BUFSIZE
X_1024
Ex2:
#define EXPAND(s) TO_STRING(s)
#define TO_STRING(s) #s
#define FOUR 4
TO_STRING(FOUR)
EXPAND(FOUR)
Expands to:
"FOUR"
"4"
I have gone through the "free" draft of the C standard but I couldn't find the following things...
Actually how many passes preprocessor performs?
Does it replace one macro first then other and so on
or does it store & replace them as #defines are encountered one by one?
Whether file inclusion is done first or the macro expansion?
You should read this page for starters. It contains gems such as:
The C standard states that, after any parameters have been replaced with their possibly-expanded arguments, the replacement list is scanned for nested macros. Further, any identifiers in the replacement list that are not expanded during this scan are never again eligible for expansion in the future, if the reason they were not expanded is that the macro in question was disabled.
I think one can infer from this that there is no fixed number of passes: each time a macro expansion happens (which generates a "replacement list"), the newly created text is scanned for further expansions. It's a recursive process.
Actually how many passes preprocessor performs?
It replaces all occurrences of # PARAMETER by the stringification of that parameter.
It joins all tokens that have a ## in between.
It replaces all remaining occurrences of the parameters by their (fully macro-expanded) values.
It recursively expands the replacement text for occurrences of other macros. (The macro itself is blocked in these recursive calls.)
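The four steps account for both examples in the question. The sketch below is one way to check that; EXPAND_PASTE and EXPAND_STR are renamed versions of the two EXPAND macros (they would otherwise clash), and the two variables exist only so the pasted identifiers have something to refer to.

```c
#include <assert.h>
#include <string.h>

#define PASTE(x)        X_##x
#define EXPAND_PASTE(x) PASTE(x)      /* Ex1's EXPAND, renamed */
#define TABSIZE 1024
#define BUFSIZE TABSIZE

#define TO_STRING(s)  #s
#define EXPAND_STR(s) TO_STRING(s)    /* Ex2's EXPAND, renamed */
#define FOUR 4

/* Variables named after the two identifiers the pastes can produce. */
static const int X_BUFSIZE = 1;
static const int X_1024 = 2;

void check_argument_expansion(void)
{
    assert(PASTE(BUFSIZE) == 1);         /* operand of ##: X_BUFSIZE */
    assert(EXPAND_PASTE(BUFSIZE) == 2);  /* expanded first: X_1024   */
    assert(strcmp(TO_STRING(FOUR), "FOUR") == 0);
    assert(strcmp(EXPAND_STR(FOUR), "4") == 0);
}
```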
Does it replace one macro first then other and so on or does it store
& replace them as #defines are encountered one by one?
It replaces macros in the order it encounters them in the program text, or during the recursive replacement as described above.
Whether file inclusion is done first or the macro expansion?
First, the argument of an #include is macro-expanded if it is not already enclosed in <> or "". The result of that expansion must then have exactly one of those two forms.
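A minimal sketch of such a macro-expanded #include follows. STDIO_HEADER and print_hello are hypothetical names, and how the tokens between < and > are recombined into a header name is implementation-defined, so treat this as something that works on common compilers rather than a guarantee.

```c
#include <assert.h>

/* Hypothetical macro; its expansion must end up as <...> or "...". */
#define STDIO_HEADER <stdio.h>
#include STDIO_HEADER

int print_hello(void)
{
    /* printf returns the number of characters written. */
    int written = printf("hello\n");
    assert(written == 6);
    return written;
}
```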

Using a #define-d list as input to a C preprocessor macro

In an example project, I defined the macro
#define FOO(x, y) x + y
This works perfectly well. For example, FOO(42, 1337) is evaluated to 1379.
However, I now want to use another #define:
#define SAMPLE 42, 1337
When I now call FOO(SAMPLE), this won't work. The compiler tells me that FOO takes two arguments, but is only called with one argument.
I guess the reason is that, although the arguments of a macro are expanded before the macro itself, the preprocessor does not split the argument list again after this expansion. This is similar to the fact that a macro cannot output additional preprocessor directives.
Is there any possibility to get the desired functionality?
Replacing the FOO macro with a C function is not a possibility. The original macro is located in third party code I cannot change, and it outputs a comma-separated list of values to be directly used in array initializers. Therefore, a C function cannot replicate the same behaviour.
If it is not possible to accomplish this task by simple means: How would you store the (x, y) pairs in a maintainable form? In my case, there are 8 arguments. Therefore, storing the individual parts in separate #define-s is also not easily maintainable.
You're running into a problem where the preprocessor is not matching and expanding macros in the order you want. Now you can generally get it to do what you want by inserting some extra macros to force it to get the order right, but in order to do that you need to understand what the normal order is.
When the compiler sees the name of a macro with arguments followed by a (, it first scans the argument list, breaking it into arguments WITHOUT recognizing or expanding any macros in the arguments.
After parsing and separating the arguments, it rescans each argument for macros and expands any it finds within the argument, UNLESS the argument is used with # or ## in the macro body.
It then replaces each instance of the argument in the body with the (now possibly expanded) argument.
Finally, it rescans the body for any OTHER macros that may exist within the body for expansion. In this scan, the original macro WILL NOT be recognized and re-expanded, so you can't have recursive macro expansions.
So you can get the effect you want by careful use of an EXPAND macro that takes a single argument and expands it, allowing you to force extra expansions at the right point in the process:
#define EXPAND(X) X
#define FOO(x,y) x + y
#define SAMPLE 42, 1337
EXPAND(FOO EXPAND((SAMPLE)))
In this case you first explicitly expand macros in the argument list, and then manually expand the resulting macro call afterwards.
Update by question poster
#define INVOKE(macro, ...) macro(__VA_ARGS__)
INVOKE(FOO, SAMPLE)
provides an extended solution that works without cluttering the code with EXPANDs.
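Both variants can be traced against the four steps and checked at run time. A sketch, with hypothetical function names:

```c
#include <assert.h>

#define FOO(x, y) x + y
#define SAMPLE 42, 1337

#define EXPAND(X) X
#define INVOKE(macro, ...) macro(__VA_ARGS__)

/* FOO(SAMPLE) is rejected: the argument list is split into arguments
   before SAMPLE is expanded, so FOO sees a single argument. */

int sum_via_expand(void)
{
    /* The inner EXPAND turns (SAMPLE) into (42, 1337) while the outer
       argument is expanded; the outer rescan then sees FOO (42, 1337). */
    return EXPAND(FOO EXPAND((SAMPLE)));
}

int sum_via_invoke(void)
{
    /* __VA_ARGS__ becomes 42, 1337 before macro(...) is rescanned. */
    return INVOKE(FOO, SAMPLE);
}

void check_sample_forwarding(void)
{
    assert(sum_via_expand() == 1379);
    assert(sum_via_invoke() == 1379);
}
```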

Can a C macro definition refer to other macros?

What I'm trying to figure out is if something such as this (written in C):
#define FOO 15
#define BAR 23
#define MEH (FOO / BAR)
is allowed? I would want the preprocessor to replace every instance of
MEH
with
(15 / 23)
but I'm not so sure that will work. Certainly if the preprocessor only goes through the code once then I don't think it'd work out the way I'd like.
I found several similar examples but all were really too complicated for me to understand. If someone could help me out with this simple one I'd be eternally grateful!
Short answer yes. You can nest defines and macros like that - as many levels as you want as long as it isn't recursive.
The answer is "yes", and two other people have correctly said so.
As for why the answer is yes, the gory details are in the C standard, section 6.10.3.4, "Rescanning and further replacement". The OP might not benefit from this, but others might be interested.
6.10.3.4 Rescanning and further replacement
After all parameters in the replacement list have been substituted and
# and ## processing has taken place, all placemarker preprocessing tokens are removed.
Then, the resulting preprocessing token sequence
is rescanned, along with all subsequent preprocessing tokens of the
source file, for more macro names to replace.
If the name of the macro being replaced is found during this scan of
the replacement list (not including the rest of the source file's
preprocessing tokens), it is not replaced. Furthermore, if any nested
replacements encounter the name of the macro being replaced, it is not
replaced. These nonreplaced macro name preprocessing tokens are no
longer available for further replacement even if they are later
(re)examined in contexts in which that macro name preprocessing token
would otherwise have been replaced.
The resulting completely macro-replaced preprocessing token sequence
is not processed as a preprocessing directive even if it resembles
one, but all pragma unary operator expressions within it are then
processed as specified in 6.10.9 below.
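Two consequences of that paragraph can be sketched in a few lines: a replacement list may mention macros that are only defined later (rescanning happens where the macro is used), and a macro's own name is left alone when it shows up in its own expansion. The variable and macro named limit are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* FOO and BAR are not defined yet; that is fine, because MEH is only
   expanded (and rescanned) where it is used. */
#define MEH (FOO / BAR)
#define FOO 15
#define BAR 23

/* Self-reference: the inner "limit" is not replaced again, so it
   refers to the variable declared before the #define. */
static int limit = 1;
#define limit (4 + limit)

void check_rescanning(void)
{
    assert(MEH == 0);     /* (15 / 23) in integer arithmetic */
    assert(limit == 5);   /* (4 + limit), with limit the variable */
}
```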
Yes, it's going to work.
But for your personal information, here are some simplified rules about macros that might help you (it's out of scope, but will probably help you in the future). I'll try to keep it as simple as possible.
The defines are "defined" in the order they are included/read. That means a macro must be defined before the point in the source where it is used.
Useful preprocessor directives: #define, #undef, #if, #ifdef, #ifndef, #elif, #else, #endif
You can use any other previously #define in your macro. They will be expanded. (like in your question)
Function macro definitions accept two special operators (# and ##)
The # operator stringizes its argument:
#define str(x) #x
str(test); // would translate to "test"
The ## operator concatenates two tokens:
#define concat(a,b) a ## b
concat(hello, world); // would translate to helloworld (a single identifier, not a string)
There are some predefined macros (from the language) as well that you can use:
__LINE__, __FILE__, __cplusplus, etc
See your compiler's documentation for the full list, since some of these macros are not portable across compilers.
Pay attention to the macro expansion
You'll see that people use a lot of round brackets "()" when defining macros. The reason is that when you call a macro, it's expanded "as is":
#define mult(a, b) a * b
mult(1+2, 3+4); // expands to 1+2 * 3+4, which is 11 instead of 21
#define mult_fix(a, b) ((a) * (b))
Yes, and there is one more advantage of this feature. You can leave some macro undefined and set its value as a name of another macro in the compilation command.
#include <stdio.h>
#define STR "string"
int main(void) { printf("value=%s\n", VALUE); return 0; }
In the command line you can say that the macro "VALUE" takes value from another macro "STR":
$ gcc -o test_macro -DVALUE=STR main.c
$ ./test_macro
Output:
value=string
This approach works as well for MSC compiler on Windows. I find it very flexible.
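One possible refinement, which is an addition and not part of the answer above, is to give VALUE a fallback so the file still compiles when no -DVALUE=... is passed. value_text is a hypothetical helper.

```c
#include <assert.h>
#include <string.h>

#define STR "string"

/* Fallback (an assumption of this sketch): used only when the build
   does not pass -DVALUE=... on the command line. */
#ifndef VALUE
#define VALUE STR
#endif

const char *value_text(void)
{
    /* VALUE -> STR -> "string" through ordinary rescanning. */
    const char *text = VALUE;
    assert(text != NULL);
    return text;
}
```

With -DVALUE=STR on the command line the #ifndef branch is skipped and the result is the same.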
I'd like to add a gotcha that tripped me up.
Function-style macros cannot do this.
Example that doesn't compile when used:
#define FOO 1
#define FSMACRO(x) FOO + x
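For what it's worth, the rescanning rule quoted from 6.10.3.4 applies to function-like macros too, so these two definitions would be expected to expand normally when FSMACRO is called with one argument. The post does not show the failing use, so the check below assumes an ordinary call such as FSMACRO(2); fsmacro_value is a hypothetical name.

```c
#include <assert.h>

#define FOO 1
#define FSMACRO(x) FOO + x

int fsmacro_value(void)
{
    /* FSMACRO(2) -> FOO + 2 -> 1 + 2 */
    int value = FSMACRO(2);
    assert(value == 3);
    return value;
}
```

Note that the replacement list is not parenthesized, so an expression like 2 * FSMACRO(2) would expand to 2 * 1 + 2.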
Yes, that is supported. And used quite a lot!
One important thing to note, though, is to make sure you parenthesize the expression, otherwise you might run into nasty issues!
#define MEH FOO/BAR
// vs
#define MEH (FOO / BAR)
// the first could be expanded in an expression like 5 * MEH to mean something
// completely different than the second
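To make the difference concrete, here is a small sketch using the values of FOO and BAR from the question. MEH_BARE and MEH_PAREN are hypothetical names for the two variants so that both can live in one file.

```c
#include <assert.h>

#define FOO 15
#define BAR 23

#define MEH_BARE  FOO/BAR       /* no parentheses */
#define MEH_PAREN (FOO / BAR)   /* parenthesized  */

void check_parentheses(void)
{
    /* 5 * 15/23 groups as (5 * 15) / 23, i.e. 75 / 23, which is 3. */
    assert(5 * MEH_BARE == 3);
    /* 5 * (15 / 23) is 5 * 0, which is 0. */
    assert(5 * MEH_PAREN == 0);
}
```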

Resources