This doesn't work as expected:
#define stringify(x) #x
printf("Error at line " stringify(__LINE__));
This works:
#define stringify1(x) #x
#define stringify(x) stringify1(x)
printf("Error at line " stringify(__LINE__));
What order does the preprocessor follow when expanding such macros?
When expanding a macro, the preprocessor expands the macro's arguments only if those arguments are not subjected to the stringizing (#) or token-pasting (##) operators. So, if you have this:
#define stringify(x) #x
stringify(__LINE__)
Then, the preprocessor does not expand __LINE__, because it's the argument of the stringizing operator. However, when you do this:
#define stringify1(x) #x
#define stringify(x) stringify1(x)
stringify(__LINE__)
Then, when expanding stringify, the preprocessor expands __LINE__ to the current line number, since x is not used with either the stringizing or token-pasting operators in the definition of stringify. It then expands stringify1, and we get what we wanted.
The relevant language from the C99 standard comes from §6.10.3.1/1:
After the arguments for the invocation of a function-like macro have been identified, argument substitution takes place. A parameter in the replacement list, unless preceded by a # or ## preprocessing token or followed by a ## preprocessing token (see below), is replaced by the corresponding argument after all macros contained therein have been expanded. Before being substituted, each argument’s preprocessing tokens are completely macro replaced as if they formed the rest of the preprocessing file; no other preprocessing tokens are available.
Clauses §6.10.3.2 and 6.10.3.3 go on to define the behavior of the # and ## operators respectively.
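To see the difference in a compiled program, here is a minimal sketch, assuming a hosted C99 (or later) compiler. The names STR_DIRECT, STR_INNER, STR_EXPANDED and the helper functions are made up for this illustration; they are not from the question.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names, chosen only for this illustration. */
#define STR_DIRECT(x)   #x            /* x is an operand of #: not expanded first */
#define STR_INNER(x)    #x
#define STR_EXPANDED(x) STR_INNER(x)  /* x is not an operand of # here: expanded first */

/* Returns the literal text "__LINE__": the argument is stringized as written. */
static const char *direct_text(void) { return STR_DIRECT(__LINE__); }

/* Returns the number of the line this function is written on, as a decimal string. */
static const char *expanded_text(void) { return STR_EXPANDED(__LINE__); }

/* Self-check: the one-level macro yields the macro name, the two-level one only digits. */
void check_stringify(void) {
    assert(strcmp(direct_text(), "__LINE__") == 0);
    assert(strspn(expanded_text(), "0123456789") == strlen(expanded_text()));
}
```

Calling check_stringify() from main should pass silently; swapping the two macros in the helpers should trip the assertions.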
My code looks something like this:
#define LIMB_SIZE 8
...
#define DATA_TYPE uint ## LIMB_SIZE ## _t
...
DATA_TYPE a;
...
Corresponding pre-processed output looks like this:
uintLIMB_SIZE_t a;
while I was expecting
uint8_t a;
Is this the expected behavior? (The result is the same even if the spaces around the token-pasting operator are removed.)
If yes, what is the alternative, so that I can define LIMB_SIZE from the command line? (Assume LIMB_SIZE will be one of 8, 16, or 32.)
If no, what am I missing?
I used gcc -E file.c to verify the pre-processed output
Processing of an object-like macro largely proceeds as follows:
1. Replace the macro with its replacement tokens.
2. Apply # and ## operators.
3. Process macros in the new tokens.
So, by the time processing gets to replacing the LIMB_SIZE in the replacement list, it is at step 3. However, step 2 already converted it to uintLIMB_SIZE_t, so the separate macro name is no longer present.
Function-like macros add another step:
For each parameter that appears in the replacement list not preceded by # or adjacent to ##, replace the parameter with the result of replacing macros in the corresponding argument. If the parameter is preceded by #, it is replaced by the “stringized” argument. If the parameter is adjacent to ##, it is replaced by the argument without replacing macros in it.
This means that, for LIMB_SIZE to be expanded before ## is processed, two conditions must hold:
LIMB_SIZE must appear in an argument of a function-like macro.
The corresponding parameter of that macro must not be adjacent to ##.
That requires using two function-like macros, so you can do this:
#define LIMB_SIZE 8
#define Helper0(x) uint ## x ## _t // Concatenate x.
#define Helper1(x) Helper0(x) // Replace macros in x.
#define DATA_TYPE Helper1(LIMB_SIZE)
DATA_TYPE a;
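Since the question asks about setting LIMB_SIZE from the command line, here is a sketch of the same two-level trick with an overridable default. It assumes a platform that provides uint8_t, uint16_t and uint32_t in <stdint.h>; the names PASTE_UINT, EXPAND_UINT, data_type_bits and check_data_type are hypothetical, and the default of 8 is just a placeholder.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#ifndef LIMB_SIZE
#define LIMB_SIZE 8   /* assumed default; override with e.g. gcc -DLIMB_SIZE=16 */
#endif

#define PASTE_UINT(x)  uint ## x ## _t       /* pastes the argument exactly as written */
#define EXPAND_UINT(x) PASTE_UINT(x)         /* x is macro-expanded here, then forwarded */
#define DATA_TYPE      EXPAND_UINT(LIMB_SIZE)

/* Returns the width in bits of the type that DATA_TYPE names. */
size_t data_type_bits(void) { return sizeof(DATA_TYPE) * CHAR_BIT; }

/* Self-check: the selected type is exactly LIMB_SIZE bits wide. */
void check_data_type(void) {
    DATA_TYPE a = 0;
    assert(sizeof a * CHAR_BIT == LIMB_SIZE);
}
```

With no -D option this should select uint8_t; compiling with -DLIMB_SIZE=32 should select uint32_t without touching the source.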
I often get confused about whether to use object-like or function-like macros. I have written about both here. As I understand it, an object-like macro's replacement list can be either a literal or a list of literals. So if we have an expression after the identifier, should we use a function-like macro instead?
#define FIRST 1 //object-like
#define INCREASE_A_AND_B() do{++a;++b;}while(0) //function-like
#define ORED (FIRST | 5) //func or object? ORED or ORED()?
It would be much appreciated if someone could shed some light on when to use one or the other.
#define ORED (FIRST | 5)
Here, ORED is an object-like macro, because it was defined without a parameter list. Try it:
ORED → (FIRST | 5)
ORED() → (FIRST | 5)() // Error: called object is not a function
How a macro is used depends on how it is defined. There's a special rule for macro definitions, though: the opening parenthesis of the parameter list must come immediately after the macro name. If there is whitespace between the macro name and the parenthesis, the definition is an object-like macro:
#define MACRO(X) // function-like macro that expands to nothing
#define MACRO (X) // object-like macro that expands to (X)
Most compilers let you see what the code looks like after preprocessing, i.e. after macro expansion. gcc has the -E flag and Microsoft's cl has /E.
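If you'd rather check this from inside a program than read -E output, a stringizing helper can show what each macro expands to. This is only a sketch: STR, STR_INNER, EMPTY_CALL, PAREN_OBJ and the functions are hypothetical names standing in for the MACRO examples above.

```c
#include <assert.h>
#include <string.h>

#define STR_INNER(x) #x
#define STR(x) STR_INNER(x)   /* expand x first, then stringize the result */

#define FIRST 1
#define ORED (FIRST | 5)      /* object-like: written as ORED, never ORED() */
#define EMPTY_CALL(X)         /* function-like: '(' directly follows the name */
#define PAREN_OBJ (X)         /* object-like: the space makes "(X)" the body */

int ored_value(void) { return ORED; }                         /* (1 | 5) */
const char *empty_call_text(void) { return STR(EMPTY_CALL(7)); }
const char *paren_obj_text(void) { return STR(PAREN_OBJ); }

/* Self-check of the three expansions. */
void check_macro_kinds(void) {
    assert(ored_value() == 5);
    assert(strcmp(empty_call_text(), "") == 0);     /* expands to nothing */
    assert(strcmp(paren_obj_text(), "(X)") == 0);   /* body is the text (X) */
}
```

Writing ORED() instead of ORED in ored_value would expand to (FIRST | 5)() and fail to compile.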
I want to create an object-like macro from the concatenation of a token and a macro. I have this code:
#define alfa vita
/* Code below is broken. */
#define gamma delta##alfa
gamma
Here gamma is replaced with deltaalfa. I want it to be replaced with deltavita. How can I do this?
I don't want gamma to be a function-like macro.
The question "What are the applications of the ## preprocessor operator and gotchas to consider?" is very broad. It isn't focused on my problem, and the first answer doesn't address it either.
You must perform a double macro expansion like so:
#define alfa vita
#define concat2(a,b) a ## b
#define concat(a,b) concat2(a,b)
#define gamma concat(delta, alfa)
gamma
The operands of the stringification (#) and token-pasting (##) operators are not macro-expanded before the operator is applied. Expansion of a function-like macro proceeds by first expanding the arguments, except where they are operands of the # or ## operator, then substituting them into the macro body, then rescanning the result for further macros to replace.
The double-expansion approach above works because the arguments of the concat() macro are not operands of ## (or #). They are therefore expanded before being substituted into that macro's body to yield
concat2(delta, vita)
Upon rescanning, that is expanded further to
delta ## vita
(regardless of any macro definition for the symbol vita), which is then pasted together into a single token to yield the result.
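A small compilable check of both variants, as a sketch. The variables deltavita and deltaalfa, the macro gamma_broken, and the functions are hypothetical additions; they exist only so that each pasted token names something observable.

```c
#include <assert.h>

/* Two variables, so each possible pasted identifier refers to something. */
static int deltavita = 1;
static int deltaalfa = 2;

#define alfa vita
#define concat2(a, b) a ## b           /* pastes its arguments as written */
#define concat(a, b)  concat2(a, b)    /* arguments are expanded before forwarding */
#define gamma         concat(delta, alfa)
#define gamma_broken  delta ## alfa    /* alfa is an operand of ##: not expanded */

int gamma_value(void)  { return gamma; }          /* names deltavita */
int broken_value(void) { return gamma_broken; }   /* names deltaalfa */

/* Self-check: only the two-level version picks up the definition of alfa. */
void check_concat(void) {
    assert(gamma_value() == 1);
    assert(broken_value() == 2);
}
```

Note that gamma itself stays object-like, as the question requires; only the helpers are function-like.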
Why does GNU cpp accept the following code, even when run with the flags -std=c99 -pedantic:
#define z()
#define w(x)
z()
w()
w(1)
The C99 Standard requires that the number of arguments in a function-like macro invocation shall match the number of parameters in the macro definition (and is happy with the idea that an argument may consist of an empty sequence of preprocessing tokens so presumably the first two invocations provide a single empty argument), but this cannot be true for all three invocations.
Indeed, surely z must only be invoked with zero arguments, which is syntactically impossible?
A few experiments show that the first w line is interpreted as one empty argument:
#define w(x) #x
w()
w(1)
when preprocessed gives:
""
"1"
Even better:
#define w(x, y) #x <-> #y
w(,)
w(1,)
w(, 2)
w(1, 2)
gives:
"" <-> ""
"1" <-> ""
"" <-> "2"
"1" <-> "2"
Nifty...
Dunno what the standard says about this. Must ask a lawyer...
Macro usage is different from function invocation in one important aspect: macro arguments may be empty. This is valid:
#define x(a,b) X a X b X
x(,) //-> X X X
x(1,) //-> X 1 X X
x(,2) //-> X X 2 X
Given that empty arguments are allowed, this macro usage:
#define x(a) X a X
x() //-> X X
must be interpreted as an empty argument. While this one:
#define x() XX
x() //-> XX
is a macro without arguments.
The only language in the standard that I can find bearing on this is as follows:
C 2011. Section 6.10.3 Macro Replacement. Part of Paragraph 10:
A preprocessing directive of the form:
# define identifier lparen identifier-list_opt ) replacement-list new-line
# define identifier lparen ... ) replacement-list new-line
# define identifier lparen identifier-list , ... ) replacement-list new-line
defines a function-like macro with parameters, whose use is
similar syntactically to a function call.
That being said, the standard does give examples of valid macro replacements. They include:
#define p() int
#define t(a) a
#define glue(a, b) a ## b
p() // Produces: int
t(x) // Produces: x
glue(1,2) // Produces: 12
There's a syntactic ambiguity here. Given a function-like macro invocation x(), the empty sequence of tokens between the parentheses can be interpreted as no arguments, or as one empty argument. GCC resolves the ambiguity by looking at the macro definition:
#define x() replacement
#define y(a) repl##a##cement
x() y()
will preprocess without error to
replacement replcement
I no longer have C99 section 6.10 memorized to the point where I can say off the top of my head why this is the correct behavior, but I'm certain that it is, because if it wasn't I would have made GCC do something else. ;-)
There is some discussion of these rules, in relatively informal terms, at the bottom of https://gcc.gnu.org/onlinedocs/gcc-4.8.1/cpp/Macro-Arguments.html . Note in particular that given
#define z(a,b) stuff with a and b
it is an error to write z(), but z(,) is acceptable.
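The cases discussed in this thread can be collected into one compilable sketch, assuming a C99 or later compiler (empty macro arguments are not valid C89). SHOW1, SHOW2, NOARGS and empty_arg_results are hypothetical names; the brackets just make empty arguments visible.

```c
#include <assert.h>
#include <string.h>

#define NOARGS()    "none"                 /* zero parameters */
#define SHOW1(x)    "[" #x "]"             /* one parameter, possibly empty */
#define SHOW2(x, y) "[" #x "|" #y "]"      /* two parameters, either may be empty */

/* Each entry is the string produced by one of the invocations discussed above. */
const char *const empty_arg_results[] = {
    NOARGS(),      /* no argument at all */
    SHOW1(),       /* one empty argument */
    SHOW1(1),
    SHOW2(,),      /* two empty arguments */
    SHOW2(1,),
    SHOW2(, 2),
    SHOW2(1, 2),
};

/* Self-check of the two invocations that look identical at the call site. */
void check_empty_args(void) {
    assert(strcmp(empty_arg_results[0], "none") == 0);
    assert(strcmp(empty_arg_results[1], "[]") == 0);
}
```

Adding SHOW2() to the table should be rejected by the compiler, matching the z() case above.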
The C standard gives the following example:
#define hash_hash # ## #
#define mkstr(a) # a
#define in_between(a) mkstr(a)
#define join(c, d) in_between(c hash_hash d)
char p[] = join(x, y); // equivalent to char p[] = "x ## y";
But it also says 'The order of evaluation of # and ## operators is unspecified.'
Why is the expansion of hash_hash guaranteed to be interpreted as the ## operator applied to #'s, instead of the # operator applied to ##?
Because '#' only acts as an operator if it appears in a function-like macro and is followed by a parameter name ... but hash_hash isn't a function-like macro and those '#' aren't followed by parameter names.
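For anyone who wants to confirm this with a compiler, the standard's example can be wrapped in a function and checked at run time. The macros are the ones quoted above; joined_text and check_hash_hash are hypothetical wrappers added for this sketch.

```c
#include <assert.h>
#include <string.h>

#define hash_hash # ## #          /* object-like: ## pastes two # tokens into one ## token */
#define mkstr(a) # a              /* here # is followed by a parameter, so it stringizes */
#define in_between(a) mkstr(a)
#define join(c, d) in_between(c hash_hash d)

/* Returns the string built by the standard's example invocation. */
const char *joined_text(void) {
    static const char p[] = join(x, y);
    return p;
}

/* Self-check: the ## produced by hash_hash is ordinary text, not a paste operator. */
void check_hash_hash(void) {
    assert(strcmp(joined_text(), "x ## y") == 0);
}
```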
Quote C99:
A punctuator is a symbol that has independent syntactic and semantic
significance. Depending on context, it may specify an operation to be
performed (which in turn may yield a value or a function designator,
produce a side effect, or some combination thereof) in which case it
is known as an operator (other forms of operator also exist in some
contexts). An operand is an entity on which an operator acts.
# and ## are punctuators.
Besides:
6.10.3.1 Argument substitution
After the arguments for the invocation of a function-like macro have been identified, argument
substitution takes place. A parameter in the replacement list, unless
preceded by a # or ## preprocessing token or followed by a ##
preprocessing token (see below), is replaced by the corresponding
argument after all macros contained therein have been expanded. Before
being substituted, each argument’s preprocessing tokens are completely
macro replaced as if they formed the rest of the preprocessing file;
no other preprocessing tokens are available.