Behavior of ## operator in nested call - c

I was reading a book on the C programming language, where I found:
#define cat(x,y) x##y
#define xcat(x,y) cat(x,y)
Calling cat(cat(1,2),3) produces an error, whereas calling xcat(xcat(1,2),3) produces the expected result 123.
Why do the two behave differently?

Macros whose replacement lists depend on ## usually can't be called in a nested fashion.
cat(cat(1,2),3) is not expanded in a normal fashion, with cat(1,2) yielding 12 and then cat(12, 3) yielding 123.
Macro parameters that are preceded or followed by ## in a replacement list aren't expanded at the time of substitution.
6.10.3.1 Argument substitution
1 After the arguments for the invocation of a function-like macro have been identified,
argument substitution takes place. A parameter in the replacement list, unless preceded
by a # or ## preprocessing token or followed by a ## preprocessing token (see below), is replaced by the corresponding argument after all macros contained therein have been
expanded. Before being substituted, each argument’s preprocessing tokens are
completely macro replaced as if they formed the rest of the preprocessing file; no other
preprocessing tokens are available.
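As a result, cat(cat(1,2),3) expands to cat(1,2)3: the preprocessor tries to paste the last token of the first argument, ), with 3. The result )3 is not a valid preprocessing token, so the behavior is undefined and compilers typically report an error. The inner cat is not expanded on rescanning either, because it is the name of the macro currently being replaced.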
With the extra level of indirection
#define xcat(x,y) cat(x,y)
writing xcat(xcat(1,2),3) will work. As the preprocessor expands the outer call of xcat, it will expand xcat(1,2) as well; the difference is that xcat's replacement list does not contain ## anymore.
xcat(xcat(1,2),3) ==> cat(12, 3) ==> 12##3 ==> 123

Related

Macro replacement list rescanning for replacement

I'm reading the Standard (N1570) on macro replacement and don't understand some wording in 6.10.3.4.
1 After all parameters in the replacement list have been substituted
and # and ## processing has taken place, all placemarker preprocessing
tokens are removed. The resulting preprocessing token sequence is then
rescanned, along with all subsequent preprocessing tokens of the
source file, for more macro names to replace
So after all # and ## are resolved, the replacement list is rescanned. But paragraph 2 specifies:
2 If the name of the macro being replaced is found during this scan of
the replacement list (not including the rest of the source file’s
preprocessing tokens), it is not replaced. Furthermore, if any nested
replacements encounter the name of the macro being replaced, it is not
replaced.
It looks contradictory to me. So what kind of replacement is possible in that rescan? I tried the following example:
#define FOOBAR(a, b) printf(#a #b)
#define INVOKE(a, b) a##b(a, b)
int main() {
INVOKE(FOO, BAR); //expands to printf("FOO" "BAR")
}
So INVOKE(FOO, BAR) expands to FOOBAR(FOO, BAR) after the ## is processed. Then the replacement list FOOBAR(FOO, BAR) is rescanned. But paragraph 2 says that if the name of the macro being replaced (FOOBAR) is found (and it is, since it is defined above), it is not replaced. Yet it actually is replaced, as can be seen in the demo.
Can you please clarify that wording? What did I miss?
The (original) macro being replaced is not FOOBAR, it's INVOKE. When you're expanding INVOKE and you find FOOBAR, you expand FOOBAR normally. However, if INVOKE had been found when expanding INVOKE, it would no longer be expanded.
Let's take the following code:
#define FOOBAR(a, b) printf(#a #b)
#define INVOKE(a, b) e1 a##b(a, b)
int main() {
INVOKE(INV, OKE);
}
I added the e1 to the expansion of INVOKE to be able to visualise how many expansions happen. The result of preprocessing main is:
e1 INVOKE(INV, OKE);
This proves that INVOKE was expanded once and then, upon rescanning, not expanded again.
Consider the following simple example:
#include<stdio.h>
const int FOO = 42;
#define FOO (42 + FOO)
int main()
{
printf("%d", FOO);
}
Here the output will be 84.
The printf call is expanded to:
printf("%d", (42 + FOO));
When the macro FOO is expanded, the FOO inside its own replacement list is not expanded again. It is left for the compiler, which resolves it to the const int variable FOO (42), hence 42 + 42 = 84. Otherwise, you would end up with endless recursion: 42 + (42 + (42 + (42 + ....)

Use of # in a macro [duplicate]

Please explain the code
#include <stdio.h>
#define A(a,b) a##b
#define B(a) #a
#define C(a) B(a)
int main(void)
{
printf("%s\n",C(A(1,2)));
printf("%s\n",B(A(1,2)));
}
Output
12
A(1,2)
I don't understand, how the first printf evaluates to 12?
Isn't it similar to the second, as C macro is simply a wrapper to B macro?
As mentioned in the Wikipedia article on the C preprocessor:
The ## operator (known as the "Token Pasting Operator") concatenates
two tokens into one token.
The # operator (known as the "Stringification Operator") converts a
token into a string, escaping any quotes or backslashes appropriately.
You cannot combine a macro argument with additional text and stringify
it all together. You can however write a series of adjacent string
constants and stringified arguments: the C compiler will then combine
all the adjacent string constants into one long string.
If you want to stringify the expansion of a macro argument, you have
to use two levels of macros:
#define xstr(s) str(s)
#define str(s) #s
#define foo 4
str (foo) // outputs "foo"
xstr (foo) // outputs "4"
Also, from C-FAQ Question 11.17 :
It turns out that the definition of # says that it's supposed to
stringize a macro argument immediately, without further expanding it
(if the argument happens to be the name of another macro).
So, along these lines:
you're doing C(A(1,2)),
which becomes C(12), // no # or ## next to the parameter, so the inner argument is expanded first
and then B(12)
// [since you've used two levels of macros in the code:
// 1. from C() to B(), and then, 2. B() to #a]
= "12".
Whereas in the second case, B(A(1,2)), only one level is involved: by the definition of B(a), the argument is stringified immediately because of the #, without being expanded first.
macro replacement of B(A(1,2))
= stringification of A(1,2)
= "A(1,2)".
The confusion here comes from a simple rule.
When evaluating a macro, the pre-processor first resolves the macros in the arguments passed to the macro. However, as a special case, if a parameter is preceded by # or is adjacent to ## in the macro's definition, macros within the corresponding argument are not resolved. Such are the rules.
Your first case
C(A(1,2))
The pre-processor first applies the C(a) macro, which is defined as B(a). There's no # or ## adjacent to the argument in the definition (none of them in B(a) at all), thus the pre-processor must resolve macros in the argument:
A(1,2)
The definition of A(a,b) is a##b which evaluates into 12.
After the macros in the arguments of the C(a) macro are evaluated, the C macro becomes:
C(12)
The pre-processor now resolves the C(a) macro, which according to its definition becomes
B(12)
Once this is done, the pre-processor evaluates macros inside the result once again and applies the B(a) macro, so the result becomes
"12"
Your second case
B(A(1,2))
Similar to the first case, the pre-processor first applies the B(a) macro. But this time, the definition of the macro is such that the argument is preceded by #. Therefore, the special rule applies and macros inside the argument are not evaluated. Therefore, the result immediately becomes:
"A(1,2)"
The preprocessor goes over the result again trying to find more macros to expand, but now everything is a part of the string, and macros don't get expanded within strings. So the final result is:
"A(1,2)"
The C preprocessor has two operators, # and ##. The # operator turns the argument of a function-like macro into a quoted string, whereas the ## operator concatenates two tokens.
#define A(a,b) a##b will paste its two arguments into a single token (a token, not a string).
so A(1,2) will yield 12
#define B(a) #a will return a as a string
#define C(a) B(a) will expand its argument first and then pass it to the previous macro, which returns it as a string.
so C(A(1,2)) = C(12) = B(12) = "12" (as a string)
B(A(1,2)) = "A(1,2)" because A(1,2) is taken as the argument unexpanded and returned as the string "A(1,2)"
There are two operators used in the function-like macros:
## causes a macro to concatenate two tokens.
# causes the input to be effectively turned into a string literal.
In A(a,b) ## causes a to be concatenated with b. In B(a), # effectively creates a string literal from the input. So the expansion runs as follows:
C(A(1,2)) -> C(12) -> B(12) -> "12"
B(A(1,2)) -> "A(1,2)"
Because for C(A(1,2)), the A(1,2) part is evaluated first to turn into 12, the two statements aren't equal like they would appear to be.
You can read more about these at cppreference.

Nested macros and ##

From The C Programming Language, by Kernighan and Ritchie (K&R)
After
#define cat(x, y) x ## y
the call cat(var, 123) yields var123. However, the call
cat(cat(1,2),3) is undefined: the presence of ## prevents
the arguments of the outer call from being expanded. Thus it
produces the token string cat ( 1 , 2 )3
and )3 (the catenation of the last token of the first argument with
the first token of the second) is not a legal token.
If a second level of macro definition is introduced,
#define xcat(x, y) cat(x,y)
things work more smoothly; xcat(xcat(1, 2), 3) does produce
123, because the expansion of xcat itself does not involve the
## operator.
What is the property of ## that makes the difference between the two examples?
Why is the inner cat(1,2) in the first example not expanded, while the inner xcat(1,2) in the second example is?
Thanks!
It is one of the (not-so-well-known) characteristics of the ## operator that it inhibits macro expansion of its operands: a parameter adjacent to ## is replaced by the argument's tokens exactly as written. An excerpt from the gcc pre-processor docs:
...As with stringification, the actual argument is not macro-expanded first...
That is, arguments to ## are not expanded.
The additional indirection through your xcat macro works around the problem: xcat's own replacement list contains no ##, so its arguments go through the argument prescan and are completely macro-expanded before being substituted into cat(x,y). The result is then rescanned, so the arguments are effectively scanned twice.

C language macro code - #define with 2 '##'

I recently came across this question and could not find supporting document or data in explanation. The question was asked to me and the person was not willing to share the answer.
#define BIT(A) BIT_##A
#define PIN_0 0
"Do we get BIT_0 by using macro BIT(PIN_0)? If no make necessary corrections?"
I don't know the answer to the above question.
The macro
#define BIT(A) BIT_##A
means to create a single token from what would otherwise be two separate tokens. Without using ## (the token concatenation operator), you might be tempted to do one of:
#define BIT(A) BIT_A
#define BIT(A) BIT_ A
The problem with the first is that, because BIT_A is already a single token, no attempt to match the A to the passed argument will succeed, and you'll get the literal expansion BIT_A no matter what you've used as an argument:
BIT(42) -> BIT_A
The problem with the second is that, even though A is a separate token and will therefore be subject to replacement, the final expansion will not be a single token:
BIT(42) -> BIT_ 42
The ## in your macro takes the value specified by A, and appends it to the literal BIT_ forming one token so, for example,
BIT(7) -> BIT_7
BIT(PIN_0) -> BIT_PIN_0, but see below if you want BIT_0
This is covered in C11 6.10.3.3 The ## operator:
... each instance of a ## preprocessing token in the replacement list (not from an argument) is deleted and the preceding preprocessing
token is concatenated with the following preprocessing token.
The resulting token is available for further macro replacement.
Now, if you want a macro that will concatenate together BIT_ and another already-evaluated macro into a single token, you have to use some trickery to get it to do the initial macro substitution before the concatenation.
That's because the standard states that the concatenation is performed before regular macro replacement, which is why this trickery is needed. The problem with what you have:
#define PIN_0 0
#define BIT(A) BIT_##A
is that the ## expansion of BIT(PIN_0) will initially result in the single token BIT_PIN_0. Now, although that's subject to further macro replacement, that single token doesn't actually have a macro replacement, so it's left as is.
To get around this, you have to use a level of indirection to coerce the preprocessor into doing regular macro replacement before ##:
#define CONCAT(x,y) x ## y
#define PIN_0 0
#define BIT(A) CONCAT(BIT_,A)
The series of macros shown above goes through a number of stages:
BIT(PIN_0)
-> CONCAT(BIT_,PIN_0)
-> CONCAT(BIT_,0)
-> BIT_0

C Preprocessor treating an identifier as object-like instead of function-like

This is a very simplified version of some code I just ran into at work:
#include <stdio.h>
#define F(G) G(1)
#define G(x) x+1
int main() {
printf("%d\n", F(G));
}
prints 2.
Now, I can see that F(G) expands to G(1) and then G(1) expands to 1+1, which evaluates to 2, but it's not clear to me why. I would have expected to get an error that G is not a function from the printf line.
How does the pre-processor parse code like this?
A function-like macro is only invoked if its name is followed by a (.
In F(G), G is not followed by a (, so the G there is not a macro invocation.
In #define F(G) G(1), G is a macro parameter and thus is not macro-replaced directly (this is a very confusing macro you've got :-O). In the replacement list G(1), G is replaced by the argument corresponding to the parameter G, which also happens to be G. That replacement is then rescanned and G(1) is evaluated to 1+1.
If we rewrite your macros so that you aren't using G in multiple different ways, it's far easier to understand:
#define F(x) x(1)
#define G(x) x + 1
Here, F(G) is replaced by G(1). This is then rescanned, and the invocation of G is evaluated, yielding 1 + 1.
Expanding on James McNellis' answer, the C99 standard prescribes:
6.10.3.4 Rescanning and further replacement
1 After all parameters in the replacement list have been substituted and # and ##
processing has taken place, all placemarker preprocessing tokens are removed. Then, the
resulting preprocessing token sequence is rescanned, along with all subsequent
preprocessing tokens of the source file, for more macro names to replace.
2 If the name of the macro being replaced is found during this scan of the replacement list
(not including the rest of the source file’s preprocessing tokens), it is not replaced.
Furthermore, if any nested replacements encounter the name of the macro being replaced,
it is not replaced. These nonreplaced macro name preprocessing tokens are no longer
available for further replacement even if they are later (re)examined in contexts in which
that macro name preprocessing token would otherwise have been replaced.
3 The resulting completely macro-replaced preprocessing token sequence is not processed
as a preprocessing directive even if it resembles one, but all pragma unary operator
expressions within it are then processed as specified in 6.10.9 below.
#defines do simple token replacement:
printf("%d\n", F(G));
goes to
printf("%d\n", G(1));
which goes to:
printf("%d\n", 1+1);
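The preprocessor does not make one pass per #define. It makes a single pass over the source: when it reaches F(G), the G in the argument is not followed by a ( and so is not expanded there. F(G) is replaced by G(1), and that result is rescanned, at which point G(1) is a macro invocation and is replaced by 1+1.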

Resources