I am trying to write a program where the names of some functions are dependent on the value of a certain macro variable with a macro like this:
#define VARIABLE 3
#define NAME(fun) fun ## _ ## VARIABLE
int NAME(some_function)(int a);
Unfortunately, the macro NAME() turns that into
int some_function_VARIABLE(int a);
rather than
int some_function_3(int a);
so this is clearly the wrong way to go about it. Fortunately, the number of different possible values for VARIABLE is small, so I can simply do an #if VARIABLE == n and list all the cases separately, but is there is a clever way to do it?
Standard C Preprocessor
$ cat xx.c
#define VARIABLE 3
#define PASTER(x,y) x ## _ ## y
#define EVALUATOR(x,y) PASTER(x,y)
#define NAME(fun) EVALUATOR(fun, VARIABLE)
extern void NAME(mine)(char *x);
$ gcc -E xx.c
# 1 "xx.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "xx.c"
extern void mine_3(char *x);
$
Two levels of indirection
In a comment to another answer, Cade Roux asked why this needs two levels of indirection. The flippant answer is because that's how the standard requires it to work; you tend to find you need the equivalent trick with the stringizing operator too.
Section 6.10.3 of the C99 standard covers 'macro replacement', and 6.10.3.1 covers 'argument substitution'.
After the arguments for the invocation of a function-like macro have been identified,
argument substitution takes place. A parameter in the replacement list, unless preceded
by a # or ## preprocessing token or followed by a ## preprocessing token (see below), is
replaced by the corresponding argument after all macros contained therein have been
expanded. Before being substituted, each argument’s preprocessing tokens are
completely macro replaced as if they formed the rest of the preprocessing file; no other
preprocessing tokens are available.
In the invocation NAME(mine), the argument is 'mine'; it is fully expanded to 'mine'; it is then substituted into the replacement string:
EVALUATOR(mine, VARIABLE)
Now the macro EVALUATOR is discovered, and the arguments are isolated as 'mine' and 'VARIABLE'; the latter is then fully expanded to '3', and substituted into the replacement string:
PASTER(mine, 3)
The operation of this is covered by other rules (6.10.3.3 'The ## operator'):
If, in the replacement list of a function-like macro, a parameter is immediately preceded
or followed by a ## preprocessing token, the parameter is replaced by the corresponding
argument’s preprocessing token sequence; [...]
For both object-like and function-like macro invocations, before the replacement list is
reexamined for more macro names to replace, each instance of a ## preprocessing token
in the replacement list (not from an argument) is deleted and the preceding preprocessing
token is concatenated with the following preprocessing token.
So, the replacement list contains x followed by ## and also ## followed by y; so we have:
mine ## _ ## 3
and eliminating the ## tokens and concatenating the tokens on either side combines 'mine' with '_' and '3' to yield:
mine_3
This is the desired result.
If we look at the original question, the code was (adapted to use 'mine' instead of 'some_function'):
#define VARIABLE 3
#define NAME(fun) fun ## _ ## VARIABLE
NAME(mine)
The argument to NAME is clearly 'mine' and that is fully expanded.
Following the rules of 6.10.3.3, we find:
mine ## _ ## VARIABLE
which, when the ## operators are eliminated, maps to:
mine_VARIABLE
exactly as reported in the question.
Traditional C Preprocessor
Robert Rüger asks:
Is there any way do to this with the traditional C preprocessor which does not have the token pasting operator ##?
Maybe, and maybe not — it depends on the preprocessor. One of the advantages of the standard preprocessor is that it has this facility which works reliably, whereas there were different implementations for pre-standard preprocessors. One requirement is that when the preprocessor replaces a comment, it does not generate a space as the ANSI preprocessor is required to do. The GCC (6.3.0) C Preprocessor meets this requirement; the Clang preprocessor from XCode 8.2.1 does not.
When it works, this does the job (x-paste.c):
#define VARIABLE 3
#define PASTE2(x,y) x/**/y
#define EVALUATOR(x,y) PASTE2(PASTE2(x,_),y)
#define NAME(fun) EVALUATOR(fun,VARIABLE)
extern void NAME(mine)(char *x);
Note that there isn't a space between fun, and VARIABLE — that is important because if present, it is copied to the output, and you end up with mine_ 3 as the name, which is not syntactically valid, of course. (Now, please can I have my hair back?)
With GCC 6.3.0 (running cpp -traditional x-paste.c), I get:
# 1 "x-paste.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "x-paste.c"
extern void mine_3(char *x);
With Clang from XCode 8.2.1, I get:
# 1 "x-paste.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 329 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "x-paste.c" 2
extern void mine _ 3(char *x);
Those spaces spoil everything. I note that both preprocessors are correct; different pre-standard preprocessors exhibited both behaviours, which made token pasting an extremely annoying and unreliable process when trying to port code. The standard with the ## notation radically simplifies that.
There might be other ways to do this. However, this does not work:
#define VARIABLE 3
#define PASTER(x,y) x/**/_/**/y
#define EVALUATOR(x,y) PASTER(x,y)
#define NAME(fun) EVALUATOR(fun,VARIABLE)
extern void NAME(mine)(char *x);
GCC generates:
# 1 "x-paste.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "x-paste.c"
extern void mine_VARIABLE(char *x);
Close, but no dice. YMMV, of course, depending on the pre-standard preprocessor that you're using. Frankly, if you're stuck with a preprocessor that is not cooperating, it would probably be simpler to arrange to use a standard C preprocessor in place of the pre-standard one (there is usually a way to configure the compiler appropriately) than to spend much time trying to work out a way to do the job.
Use:
#define VARIABLE 3
#define NAME2(fun,suffix) fun ## _ ## suffix
#define NAME1(fun,suffix) NAME2(fun,suffix)
#define NAME(fun) NAME1(fun,VARIABLE)
int NAME(some_function)(int a);
Honestly, you don't want to know why this works. If you know why it works, you'll become that guy at work who knows this sort of thing, and everyone will come ask you questions. =)
Plain-English explanation of the EVALUATOR two-step pattern
I haven't fully understood every word of the C standard, but I think this is a reasonable working model for how the solution shown in Jonathan Leffler's answer works, explained a little more verbosely. Let me know if my understanding is incorrect, hopefully with a minimal example that breaks my theory.
For our purposes, we can think of macro expansion as happening in three steps:
(prescan) Macro arguments are replaced:
if they are part of concatenation or stringification, they are replaced exactly as the string given on the macro call, without being expanded
otherwise, they are first fully expanded, and only then replaced
Stringification and concatenation happen
All defined macros are expanded, including macros generated in stringification
Step-by-step example without indirection
main.c
#define CAT(x) pref_ ## x
#define Y a
CAT(Y)
and expand it with:
gcc -E main.c
we get:
pref_Y
because:
Step 1: Y is a the macro argument of CAT.
x appears in a stringification pref_ ## x. Therefore, Y gets pasted as is without expansion giving:
pref_ ## Y
Step 2: concatenation happens and we are left with:
pref_Y
Step 3: any further macro replacement happens. But pref_Y is not any known macro, so it is left alone.
We can confirm this theory by actually adding a definition to pref_Y:
#define CAT(x) pref_ ## x
#define Y a
#define pref_Y asdf
CAT(Y)
and now the result would be:
asdf
because on Step 3 above pref_Y is now defined as a macro, and therefore expands.
Step-by-step example with indirection
If we use the two step pattern however:
#define CAT2(x) pref_ ## x
#define CAT(x) CAT2(x)
#define Y a
CAT(Y)
we get:
pref_a
Step 1: CAT is evaluated.
CAT(x) is defined as CAT2(x), so argument x of CAT at the definition does not appear in a stringification: the stringification only happens after CAT2 is expanded, which is not seen in this step.
Therefore, Y is fully expanded before being replaced, going through steps 1, 2, and 3, which we omit here because it trivially expands to a. So we put a in CAT2(x) giving:
CAT2(a)
Step 2: there is no stringification to be done
Step 3: expand all existing macros. We have the macro CAT2(a) and so we go on to expand that.
Step 3.1: the argument x of CAT2 appears in a stringification pref_ ## x. Therefore, paste the input string a as is, giving:
pref_ ## a
Step 3.2: stringify:
pref_a
Step 3: expand any further macros. pref_a is not any macro, so we are done.
GCC argument prescan documentation
GCC's documentation on the matter is also worth a read: https://gcc.gnu.org/onlinedocs/cpp/Argument-Prescan.html
Bonus: how those rules prevent nested calls from going infinite
Now consider:
#define f(x) (x + 1)
f(f(a))
which expands to:
((a + 1) + 1)
instead of going infinite.
Let's break it down:
Step 1: the outer f is called with argument x = f(a).
In the definition of f, the argument x is not part of a concatenation in the definition (x + 1) of f. Therefore it is first fully expanded before being replaced.
Step 1.1.: we fully expand the argument x = f(1) according to steps 1, 2, and 3, giving x = (a + 1).
Now back in Step 1, we take that fully expanded x argument equaling (a + 1), and put it inside the definition of f giving:
((a + 1) + 1)
Steps 2 and 3: not much happens, because we have no stringification and no more macros to expand.
Related
In C I have a define like:
#define VALUE 5
But the user can set a function too:
#define VALUE get_value()
For historical reasons
#define VALUE 0
means "the value is not set, use the default"
The question is how to write an #if that decides if VALUE is 0.
#if VALUE == 0
...
#else
...
#endif
gives "missing binary operator before token "("" error with GCC.
EDIT:
To make the use case clearer:
#if VALUE == 0
set_default_value();
#else
set_value(VALUE)
#endif
So I don't need VALUE to be evaluated in the #if just see if it's literally '0' or not.
You can use preprocessing pattern matching.
#define SECOND(...) SECOND_I(__VA_ARGS__,,)
#define SECOND_I(A,B,...) B
#define GLUE(A,B) GLUE_I(A, B)
#define GLUE_I(A,B) A##B
#define ZERO_TEST(X_) SECOND(GLUE(ZERO_TEST_AGAINST_,X_),0)
#define ZERO_TEST_AGAINST_0 ,1
The key construct here is the SECOND macro, which indirectly expands to its second argument. For pattern matching, you would use this by carefully constructing a first argument; since SECOND normally expands to its second argument, whatever you construct is normally ignored. But since SECOND expands to its second argument indirectly, you can cherry pick a particular pattern by having the first argument expand in particular cases with a comma, which would shove in a new second argument.
In this case we have an indirect paste at the end of ZERO_TEST_AGAINST_, and we're looking for the result of that to be ZERO_TEST_AGAINST_0.
To use this:
#if ZERO_TEST(VALUE)
set_default_value();
#else
set_value(VALUE)
#endif
Demo
http://coliru.stacked-crooked.com/a/2a6afc189637cfd3
Caveat
This fits your spec precisely as given; the indirect paste would not work with this form if you have parenthetical definitions:
#define VALUE (5)
...or:
#define VALUE (get_value() << 2) | 1
...since ZERO_TEST_AGAINST_ and ( do not join to make a valid token.
A generic solution is likely impossible, but you can hack something together:
#define CAT(x, ...) CAT_(x, __VA_ARGS__)
#define CAT_(x, ...) x##__VA_ARGS__
#define CHECK_VALUE_CHECK_0 )(
#define CHECK_VALUE_FALSE(...) CHECK_VALUE_TRUE
#define CHECK_VALUE_TRUE() 1
#define CHECK_VALUE CHECK_VALUE_(CAT(CHECK_VALUE_CHECK_, VALUE))
#define CHECK_VALUE_(...) CHECK_VALUE_FALSE(__VA_ARGS__)
#if CHECK_VALUE
#error Value is 0.
#else
#error Value is not 0.
#endif
Now, if VALUE is defined to 0, the macro CHECK_VALUE will expand to 1. Otherwise it will expand to CHECK_VALUE_TRUE, which as an unknown identifier, is considered falsey by #if.
This solution is hacky:
If VALUE starts with 0,, its causes a hard error.
If VALUE starts with something other than a letter or digit or _ (e.g. (), it causes a hard error.
...
A preprocessor cannot evaluate C-Functions, such as get_value(). It can only handle static data, as its replaced with code prior to compilation, so your statements are not executed at runtime.
if (VALUE == 0) in your C-Code would get replaced for example with if (get_value() == 0) prior to compilation. The Preprocessor is not able to evaluate the return value of get_value() (even if it would always return 0).
A #define VALUE 5 line makes the C preprocessor replace the string VALUE with the string 5 wherever it sees it (outside "strings", that is). The resulting program will not even contain VALUE anymore, and that is what the compiler proper sees(1).
Do an experiment: Get your program (cut it down to a few lines), and run:
cc -E proggie.c > proggie.i
Under Unixy systems you'll get a file proggie.i that contains preprocessed C, which is what the compiler proper sees. No VALUE in sight, and trying to set VALUE in the source ends up as trying to set 0, which obviously won't work.
(1) Historically, C compilers ran a chain of programs over the source (then memories where measured in KiB, not GiB), the first --preprocessor-- was usually called cpp; today's compilers usually don't have a separate preprocessor anymore. But conceptually the first step is still preprocessing the code (handle #include, #if and #defined macros).
I think you're misunderstand how the historical 'the value is not set' works.
In the C preprocessor, whenever you have an #if directive any identifier in the expression that is not an macro (and is not the special function defined) will be replaced by the constant 0 prior to evaluating the #if. So with your example
#define VALUE get_value()
#if VALUE == 0
what happens is the VALUE macro is expanded, then, since there is no get_value macro defined, it is replaced by 0, leaving you with
#if 0() == 0
which gives you the syntax error you see.
let's say I want to select the behaviour of a certain preprocessor directive evaluating at compile time the concatenation of a constant string and the result of another macro.
#define CASE1 text1
#define CASE2 text2
#define CASE3 text3
#define SCENARIO 3
/** the following won't work - for examplification purposes only**/
#define FUNCTION CASE##SCENARIO
/** whenever I write FUNCTION, I expect to see text3 **/
I am having an hard time thinking of a viable solution, as the preprocessor is a one-pass beast. Is that even feasible ?
It's possible, you just need to add some extra layers of macros. The key is that when you use the token-pasting operator ##, the preprocessor will not expand its operands. But, if you add another layer of macros, the preprocessor will expand those arguments. For example:
#define CASE1 text1
#define CASE2 text2
#define CASE3 text3
#define SCENARIO 3
#define TOKENPASTE_HELPER(x, y) x ## y
#define TOKENPASTE(x, y) TOKENPASTE_HELPER(x, y)
#define FUNCTION TOKENPASTE(CASE, SCENARIO)
When the preprocessor expands FUNCTION, it expands TOKENPASTE. When it expands TOKENPASTE, it expands its arugments (so SCENARIO gets replaced by 3), since neither of its arguments are operands of the token-pasting operator. Next, it expands TOKENPASTE_HELPER, which does the actual token pasting to make CASE3. Finally, it expands the CASE3 macro to get text3.
How many passes does the C preprocessor make over the code?
I tested following code on gcc 4.7.2
#define a 5
#define b a
#define c b
#define d c
#define e d
#define f e
#define g f
#define h g
#define j h
#define k j
#define l k
#define m l
int main(void) {return d;}
There is no error:
$ gcc -E 1.c
# 1 "1.c"
# 1 "<command-line>"
# 1 "1.c"
# 14 "1.c"
int main(void) {return 5;}
Is it standard behaviour?
The C preprocesor keeps going until there's nothing more to expand. It isn't a question of passes; it's a question of completeness.
It does avoid recursive expansion of macros. Once a macro has been expanded once, it is not re-expanded in the replacement text.
The only thing the standard says about limits on macro expansion is in §5.2.4.1 Translation limits, where it says:
The implementation shall be able to translate and execute at least one program that
contains at least one instance of every one of the following limits:18)
...
4095 macro identifiers simultaneously defined in one preprocessing translation unit
18) Implementations should avoid imposing fixed translation limits whenever possible.
So, the preprocessor must be able to handle at least 4095 macros, and if all but one of those macros expand to another macro sequentially, like in your example, the result must be correct.
For example:
#define FOO(x) (printf(x))
and
#define FOO(x) {printf(x)}
It seems that both are viable for preprocessing, but which is better?
If you're treating the macro as an expression, use the () form.
If you're treating it as a command (and never as an expression) then use the {} form. Or rather, use the do{}while(0) form as that has fewer substitution hazards when used near keywords like if:
#define FOO(x) do { \
printf(x); \
} while(0)
Parentheses () are used to enforce correct evaluation regardless of operator precedence, so that you hopefully won't get any nasty side effects when the macro is expanded.
Braces {} are used to make the macro a C block statement, although the canonical way to do this is:
#define FOO(x) \
do { \
... stuff ... \
} while (0)
Note that gcc provides an extension to the C language which makes it possible to return a value from a block - the last expression evaluated will be the value returned if the block is used as part of an expression.
The purpose of using parens in a macro is to control precedence when the macro is expanded. Consider:
#define X( a, b ) a * b
if the macro is used like this
X( 1 + 2, 3 )
we would presumably like the answer to be 9, but what we get on expansion is:
1 + 2 * 3
giving us 7. To avoid this kind of thing, we should have written the macro as:
#define X( a, b ) ((a) * (b))
If precedence is not an issue, brackets either type are not absolutely required, though braces may be needed depending on the ,macros semantics - if for example you want to create a local variable,
If you will ever need FOO(x) inside an expression, then you cannot use the {} form. For example:
result = FOO(some_variable);
or
if (FOO(some_variable))
I am trying to write a program where the names of some functions are dependent on the value of a certain macro variable with a macro like this:
#define VARIABLE 3
#define NAME(fun) fun ## _ ## VARIABLE
int NAME(some_function)(int a);
Unfortunately, the macro NAME() turns that into
int some_function_VARIABLE(int a);
rather than
int some_function_3(int a);
so this is clearly the wrong way to go about it. Fortunately, the number of different possible values for VARIABLE is small, so I can simply do an #if VARIABLE == n and list all the cases separately, but is there is a clever way to do it?
Standard C Preprocessor
$ cat xx.c
#define VARIABLE 3
#define PASTER(x,y) x ## _ ## y
#define EVALUATOR(x,y) PASTER(x,y)
#define NAME(fun) EVALUATOR(fun, VARIABLE)
extern void NAME(mine)(char *x);
$ gcc -E xx.c
# 1 "xx.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "xx.c"
extern void mine_3(char *x);
$
Two levels of indirection
In a comment to another answer, Cade Roux asked why this needs two levels of indirection. The flippant answer is because that's how the standard requires it to work; you tend to find you need the equivalent trick with the stringizing operator too.
Section 6.10.3 of the C99 standard covers 'macro replacement', and 6.10.3.1 covers 'argument substitution'.
After the arguments for the invocation of a function-like macro have been identified,
argument substitution takes place. A parameter in the replacement list, unless preceded
by a # or ## preprocessing token or followed by a ## preprocessing token (see below), is
replaced by the corresponding argument after all macros contained therein have been
expanded. Before being substituted, each argument’s preprocessing tokens are
completely macro replaced as if they formed the rest of the preprocessing file; no other
preprocessing tokens are available.
In the invocation NAME(mine), the argument is 'mine'; it is fully expanded to 'mine'; it is then substituted into the replacement string:
EVALUATOR(mine, VARIABLE)
Now the macro EVALUATOR is discovered, and the arguments are isolated as 'mine' and 'VARIABLE'; the latter is then fully expanded to '3', and substituted into the replacement string:
PASTER(mine, 3)
The operation of this is covered by other rules (6.10.3.3 'The ## operator'):
If, in the replacement list of a function-like macro, a parameter is immediately preceded
or followed by a ## preprocessing token, the parameter is replaced by the corresponding
argument’s preprocessing token sequence; [...]
For both object-like and function-like macro invocations, before the replacement list is
reexamined for more macro names to replace, each instance of a ## preprocessing token
in the replacement list (not from an argument) is deleted and the preceding preprocessing
token is concatenated with the following preprocessing token.
So, the replacement list contains x followed by ## and also ## followed by y; so we have:
mine ## _ ## 3
and eliminating the ## tokens and concatenating the tokens on either side combines 'mine' with '_' and '3' to yield:
mine_3
This is the desired result.
If we look at the original question, the code was (adapted to use 'mine' instead of 'some_function'):
#define VARIABLE 3
#define NAME(fun) fun ## _ ## VARIABLE
NAME(mine)
The argument to NAME is clearly 'mine' and that is fully expanded.
Following the rules of 6.10.3.3, we find:
mine ## _ ## VARIABLE
which, when the ## operators are eliminated, maps to:
mine_VARIABLE
exactly as reported in the question.
Traditional C Preprocessor
Robert Rüger asks:
Is there any way do to this with the traditional C preprocessor which does not have the token pasting operator ##?
Maybe, and maybe not — it depends on the preprocessor. One of the advantages of the standard preprocessor is that it has this facility which works reliably, whereas there were different implementations for pre-standard preprocessors. One requirement is that when the preprocessor replaces a comment, it does not generate a space as the ANSI preprocessor is required to do. The GCC (6.3.0) C Preprocessor meets this requirement; the Clang preprocessor from XCode 8.2.1 does not.
When it works, this does the job (x-paste.c):
#define VARIABLE 3
#define PASTE2(x,y) x/**/y
#define EVALUATOR(x,y) PASTE2(PASTE2(x,_),y)
#define NAME(fun) EVALUATOR(fun,VARIABLE)
extern void NAME(mine)(char *x);
Note that there isn't a space between fun, and VARIABLE — that is important because if present, it is copied to the output, and you end up with mine_ 3 as the name, which is not syntactically valid, of course. (Now, please can I have my hair back?)
With GCC 6.3.0 (running cpp -traditional x-paste.c), I get:
# 1 "x-paste.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "x-paste.c"
extern void mine_3(char *x);
With Clang from XCode 8.2.1, I get:
# 1 "x-paste.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 329 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "x-paste.c" 2
extern void mine _ 3(char *x);
Those spaces spoil everything. I note that both preprocessors are correct; different pre-standard preprocessors exhibited both behaviours, which made token pasting an extremely annoying and unreliable process when trying to port code. The standard with the ## notation radically simplifies that.
There might be other ways to do this. However, this does not work:
#define VARIABLE 3
#define PASTER(x,y) x/**/_/**/y
#define EVALUATOR(x,y) PASTER(x,y)
#define NAME(fun) EVALUATOR(fun,VARIABLE)
extern void NAME(mine)(char *x);
GCC generates:
# 1 "x-paste.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "x-paste.c"
extern void mine_VARIABLE(char *x);
Close, but no dice. YMMV, of course, depending on the pre-standard preprocessor that you're using. Frankly, if you're stuck with a preprocessor that is not cooperating, it would probably be simpler to arrange to use a standard C preprocessor in place of the pre-standard one (there is usually a way to configure the compiler appropriately) than to spend much time trying to work out a way to do the job.
Use:
#define VARIABLE 3
#define NAME2(fun,suffix) fun ## _ ## suffix
#define NAME1(fun,suffix) NAME2(fun,suffix)
#define NAME(fun) NAME1(fun,VARIABLE)
int NAME(some_function)(int a);
Honestly, you don't want to know why this works. If you know why it works, you'll become that guy at work who knows this sort of thing, and everyone will come ask you questions. =)
Plain-English explanation of the EVALUATOR two-step pattern
I haven't fully understood every word of the C standard, but I think this is a reasonable working model for how the solution shown in Jonathan Leffler's answer works, explained a little more verbosely. Let me know if my understanding is incorrect, hopefully with a minimal example that breaks my theory.
For our purposes, we can think of macro expansion as happening in three steps:
(prescan) Macro arguments are replaced:
if they are part of concatenation or stringification, they are replaced exactly as the string given on the macro call, without being expanded
otherwise, they are first fully expanded, and only then replaced
Stringification and concatenation happen
All defined macros are expanded, including macros generated in stringification
Step-by-step example without indirection
main.c
#define CAT(x) pref_ ## x
#define Y a
CAT(Y)
and expand it with:
gcc -E main.c
we get:
pref_Y
because:
Step 1: Y is a the macro argument of CAT.
x appears in a stringification pref_ ## x. Therefore, Y gets pasted as is without expansion giving:
pref_ ## Y
Step 2: concatenation happens and we are left with:
pref_Y
Step 3: any further macro replacement happens. But pref_Y is not any known macro, so it is left alone.
We can confirm this theory by actually adding a definition to pref_Y:
#define CAT(x) pref_ ## x
#define Y a
#define pref_Y asdf
CAT(Y)
and now the result would be:
asdf
because on Step 3 above pref_Y is now defined as a macro, and therefore expands.
Step-by-step example with indirection
If we use the two step pattern however:
#define CAT2(x) pref_ ## x
#define CAT(x) CAT2(x)
#define Y a
CAT(Y)
we get:
pref_a
Step 1: CAT is evaluated.
CAT(x) is defined as CAT2(x), so argument x of CAT at the definition does not appear in a stringification: the stringification only happens after CAT2 is expanded, which is not seen in this step.
Therefore, Y is fully expanded before being replaced, going through steps 1, 2, and 3, which we omit here because it trivially expands to a. So we put a in CAT2(x) giving:
CAT2(a)
Step 2: there is no stringification to be done
Step 3: expand all existing macros. We have the macro CAT2(a) and so we go on to expand that.
Step 3.1: the argument x of CAT2 appears in a stringification pref_ ## x. Therefore, paste the input string a as is, giving:
pref_ ## a
Step 3.2: stringify:
pref_a
Step 3: expand any further macros. pref_a is not any macro, so we are done.
GCC argument prescan documentation
GCC's documentation on the matter is also worth a read: https://gcc.gnu.org/onlinedocs/cpp/Argument-Prescan.html
Bonus: how those rules prevent nested calls from going infinite
Now consider:
#define f(x) (x + 1)
f(f(a))
which expands to:
((a + 1) + 1)
instead of going infinite.
Let's break it down:
Step 1: the outer f is called with argument x = f(a).
In the definition of f, the argument x is not part of a concatenation in the definition (x + 1) of f. Therefore it is first fully expanded before being replaced.
Step 1.1.: we fully expand the argument x = f(1) according to steps 1, 2, and 3, giving x = (a + 1).
Now back in Step 1, we take that fully expanded x argument equaling (a + 1), and put it inside the definition of f giving:
((a + 1) + 1)
Steps 2 and 3: not much happens, because we have no stringification and no more macros to expand.