Create object-like macro from concatenation of token and macro - c

I want to create an object-like macro from the concatenation of a token and a macro. I have this code:
#define alfa vita
/* Code below is broken. */
#define gamma delta##alfa
gamma
The gamma is replaced with deltaalfa. I want it replaced with deltavita. How can I do this?
I don't want the gamma to be a function-like macro.
This is not a duplicate of "What are the applications of the ## preprocessor operator and gotchas to consider?". That question is very broad. It isn't focused on my problem, and its first answer doesn't address it either.

You must perform a double macro expansion like so:
#define alfa vita
#define concat2(a,b) a ## b
#define concat(a,b) concat2(a,b)
#define gamma concat(delta, alfa)
gamma
The operands of the stringification (#) and token pasting (##) operators are not first expanded. As a special case, expansion of a function-style macro proceeds by first expanding the arguments except where they are operands of the # or ## operator, then substituting them into the macro body, then rescanning for substitutions.
The double-expansion approach above works because the arguments of the concat() macro are not operands of ## (or #). They are therefore expanded before being substituted into that macro's body to yield
concat2(delta, vita)
Upon rescanning, that is expanded further to
delta ## vita
(regardless of any macro definition for the symbol vita), which is then pasted together into a single token to yield the result.
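To check the result without reading preprocessor output, here is a minimal sketch. It assumes a variable named deltavita exists; the STR_/STR helpers and the function names are hypothetical and only there to make the expansion visible.

```c
#include <assert.h>
#include <string.h>

#define alfa vita
#define concat2(a, b) a ## b          /* pastes its arguments without expanding them */
#define concat(a, b) concat2(a, b)    /* expands a and b first, then hands them on */
#define gamma concat(delta, alfa)

/* Hypothetical helpers: turn a fully expanded macro into a string. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* Assumed for illustration: an object whose name is the pasted token. */
static int deltavita = 42;

/* gamma expands to the identifier deltavita, so this reads the variable. */
int read_gamma(void) { return gamma; }

/* The text of the expansion, for inspection. */
const char *gamma_text(void) { return STR(gamma); }

void check_gamma(void) {
    assert(read_gamma() == 42);
    assert(strcmp(gamma_text(), "deltavita") == 0);
}
```

Note that gamma stays an object-like macro; only the helpers it uses are function-like.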

Related

How to use value of a symbolic constant in token-pasting operator?

My code looks something like this:
#define LIMB_SIZE 8
...
#define DATA_TYPE uint ## LIMB_SIZE ## _t
...
DATA_TYPE a;
...
Corresponding pre-processed output looks like this:
uintLIMB_SIZE_t a;
while I was expecting
uint8_t a;
Is it the expected behavior? (It works the same way even if spaces across token-pasting operator are removed)
If yes, what's the alternative, so that I can define LIMB_SIZE from the command line? (Assume LIMB_SIZE will be one of 8, 16, 32.)
If no, what am I missing?
I used gcc -E file.c to verify the pre-processed output
Processing of an object-like macro is largely:
1. Replace the macro with its replacement tokens.
2. Apply # and ## operators.
3. Process macros in the new tokens.
So, by the time processing gets to replacing the LIMB_SIZE in the replacement list, it is at step 3. However, step 2 already converted it to uintLIMB_SIZE_t, so the separate macro name is no longer present.
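The effect of that ordering can be observed from within a program. This is only a sketch: the STR_/STR helpers and the function names are hypothetical, and the macro is stringified rather than used as a type, because uintLIMB_SIZE_t is not a real type.

```c
#include <assert.h>
#include <string.h>

#define LIMB_SIZE 8
/* The ## operators run before the result is rescanned, so LIMB_SIZE is
   glued into a longer identifier and never seen as a macro name. */
#define DATA_TYPE uint ## LIMB_SIZE ## _t

/* Hypothetical helpers: turn a fully expanded macro into a string. */
#define STR_(x) #x
#define STR(x) STR_(x)

const char *data_type_text(void) { return STR(DATA_TYPE); }

void check_data_type_text(void) {
    assert(strcmp(data_type_text(), "uintLIMB_SIZE_t") == 0);
}
```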
Function-like macros add another step:
For each parameter that appears in the replacement list not preceded by # or adjacent to ##, replace the parameter with the result of replacing macros in the corresponding argument. If the parameter is preceded by #, it is replaced by the “stringized” argument. If the parameter is adjacent to ##, it is replaced by the argument without replacing macros in it.
This means that, to have LIMB_SIZE expanded before ## is processed, two things are needed:
1. LIMB_SIZE must appear in an argument of a function-like macro.
2. The corresponding parameter of that macro must not be adjacent to ##.
That requires using two function-like macros, so you can do this:
#define LIMB_SIZE 8
#define Helper0(x) uint ## x ## _t // Concatenate x.
#define Helper1(x) Helper0(x) // Replace macros in x.
#define DATA_TYPE Helper1(LIMB_SIZE)
DATA_TYPE a;
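Since the question asks about setting LIMB_SIZE from the command line, here is a sketch of how that could look. The default of 8, the check function, and limb_bits are my own assumptions; with gcc the value would be overridden with something like -DLIMB_SIZE=16.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumed default; a -DLIMB_SIZE=16 on the command line takes precedence. */
#ifndef LIMB_SIZE
#define LIMB_SIZE 8
#endif

#define Helper0(x) uint ## x ## _t   /* Concatenate x. */
#define Helper1(x) Helper0(x)        /* Replace macros in x. */
#define DATA_TYPE Helper1(LIMB_SIZE)

/* Width in bits of the type that DATA_TYPE names. */
unsigned limb_bits(void) { return (unsigned)(sizeof(DATA_TYPE) * CHAR_BIT); }

void check_data_type(void) {
    DATA_TYPE a = 0;                 /* compiles only if DATA_TYPE is a real type */
    assert(a == 0);
    assert(limb_bits() == LIMB_SIZE);
}
```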

How to use the token pasting operator with a variable number of arguments?

I thought of having a generic version of #define concatenate(a, b, c) a ## b ## c
I tried it like this:
#include <stdio.h>
#define concatenate(arg1, ...) arg1 ## __VA_ARGS__
int main()
{
    int dob = 121201;
    printf("%d", concatenate(d, o, b));
    return 0;
}
I also tried many other ways:
#define concatenate(arg1, ...) arg1 ## ##__VA_ARGS__
#define concatenate(...) ## ##__VA_ARGS__
#define concatenate(...) ##__VA_ARGS__
#define concatenate(arg1, ...) arg1 ## ...
#define concatenate(arg1, ...) arg1 ## concatenate(##__VA_ARGS__)
Alas, all my attempts failed. I was wondering if it is even possible to do this in any way?
It's possible. Jens Gustedt's interesting P99 macro library includes the macro P99_PASTE, which has precisely the signature of your concatenate, as well as the same semantics.
The mechanics which P99 utilizes to implement that function are complex, to say the least. In particular, they rely on several hundred numbered macros which compensate for the fact that the C preprocessor does not allow recursive macro expansion.
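To give a feel for the numbered-macro technique, here is a much smaller sketch that handles one to four arguments. It is not P99's actual implementation; PASTE2, CONCAT_1 to CONCAT_4, COUNT_ARGS and SELECT are hypothetical names chosen for this example.

```c
#include <assert.h>

/* Two-level paste so that arguments are macro-expanded before ## applies. */
#define PASTE2_(a, b) a ## b
#define PASTE2(a, b) PASTE2_(a, b)

/* One numbered macro per supported argument count, pasting left to right. */
#define CONCAT_1(a) a
#define CONCAT_2(a, b) PASTE2(a, b)
#define CONCAT_3(a, b, c) PASTE2(CONCAT_2(a, b), c)
#define CONCAT_4(a, b, c, d) PASTE2(CONCAT_3(a, b, c), d)

/* Count 1 to 4 arguments: they shift a descending list of numbers so that
   the right count lands in parameter n. */
#define COUNT_ARGS_(p1, p2, p3, p4, n, ...) n
#define COUNT_ARGS(...) COUNT_ARGS_(__VA_ARGS__, 4, 3, 2, 1, 0)

/* Build the name CONCAT_<count> with a separate paste macro, then call it. */
#define SELECT_(name, n) name ## n
#define SELECT(name, n) SELECT_(name, n)
#define concatenate(...) SELECT(CONCAT_, COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__)

static int dob = 121201;

int read_dob(void) { return concatenate(d, o, b); }

void check_concatenate(void) {
    int ab = 2, abcd = 4;
    assert(concatenate(dob) == 121201);
    assert(concatenate(a, b) == ab);
    assert(concatenate(d, o, b) == 121201);
    assert(concatenate(a, b, c, d) == abcd);
}
```

Supporting more arguments means adding more CONCAT_n macros and extending the number list, which is where the several hundred macros in a real library come from.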
Another useful explanation of how to do iteration in the C preprocessor is found in the documentation for the Boost Preprocessor Library, particularly the topic on reentrancy.
Jens' documentation for P99_PASTE emphasizes the fact that the macro pastes left-to-right to avoid the ambiguity of ##. That might need a bit of explanation.
The token-paste (##) operator is a binary operator; if you want to paste more than two segments into a single token, you need to do it a pair at a time, which means that all intermediate results must be valid tokens. That can require a certain amount of caution. Consider, for example, this macro which attempts to add an exponent to the end of an integer:
#define EXPONENT(INT, EXP) INT ## E ## EXP
(This will only work if both macro arguments are literal integers. In order to allow the macro arguments to be macros, we would need to introduce another level of indirection in the macro expansion. But that's not the point here.)
What we will almost immediately discover is that EXPONENT(42,-3) doesn't work, because -3 is not a single token. It's two tokens, - and 3, and the paste operator will only paste the -. That will result in a two-token sequence 42E- 3, which will eventually lead to a compiler error.
42E and 42E- are valid tokens, by the way. They are pp-numbers (preprocessing numbers), which are any combination of dots, digits, letters and exponents, provided that the token starts with a digit or a dot followed by a digit. (Exponents are one of the letters E or P, possibly lower-case and possibly followed by a sign. Otherwise, sign characters cannot appear in a pp-number.)
So we could try to fix this by asking the user to separate the sign from the number:
#define EXPONENT(INT, SIGN, EXP) INT ## E ## SIGN ## EXP
EXPONENT(42,-,3)
That will work if the ## operators are evaluated from left-to-right. But the C standard does not impose any particular evaluation order of multiple ## operators. If we're using a preprocessor which works from right to left, then the first thing it will try to do is to paste - and 3, which won't work because -3 is not a single token, just as with the simpler definition.
Now, I can't offer an example of a compiler which will fail on this macro, since I don't have a right-to-left preprocessor handy. Both gcc and clang evaluate ## left-to-right, and I think that's far and away the most common evaluation order. But you can't rely on that; in order to write portable code, you need to ensure that the paste operators are evaluated in the expected order. And that's the guarantee offered by P99_PASTE.
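If you only need a fixed number of pieces, one way to pin down the order yourself is to nest two-argument pastes, so that each inner paste is complete before the enclosing one runs. This is a sketch of that idea and not P99's code; PASTE2 and the function names are hypothetical, and it assumes E is not defined as a macro.

```c
#include <assert.h>

/* Two-level paste: arguments are fully expanded before ## is applied. */
#define PASTE2_(a, b) a ## b
#define PASTE2(a, b) PASTE2_(a, b)

/* Every ## has exactly two operands, so no ordering question arises:
   42 and E give 42E, then 42E and - give 42E-, then 42E- and 3 give 42E-3. */
#define EXPONENT(INT, SIGN, EXP) PASTE2(PASTE2(PASTE2(INT, E), SIGN), EXP)

double small_value(void) { return EXPONENT(42, -, 3); }
double large_value(void) { return EXPONENT(42, +, 3); }

void check_exponent(void) {
    assert(small_value() == 42E-3);
    assert(large_value() == 42E+3);
}
```

Every intermediate result here (42E, 42E-) is a valid pp-number, which is what makes the left-to-right order work.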
Note: It's possible that there is an application in which right-to-left pasting is required, but after thinking about it for some time, the only example I could come up with of a token paste which would work right-to-left but not left-to-right is the following rather obscure corner case:
#define DOUBLE_HASH %: ## % ## :
and I can't think of any plausible context in which that might come up.

Function-like macro vs Object-like macro

I often get confused about whether to use an object-like or a function-like macro. I have written about both here. As I understand it, an object-like macro's replacement list can be either a literal or a list of literals. So if an expression follows the identifier, should we use a function-like macro?
#define FIRST 1 //object-like
#define INCREASE_A_AND_B() do{++a;++b;}while(0) //function-like
#define ORED (FIRST | 5) //func or object? ORED or ORED()?
It would be much appreciated if someone could shed some light on when to use one or the other.
#define ORED (FIRST | 5)
Here, ORED is an object-like macro, because it was defined without a parameter list. Try it:
ORED → (FIRST | 5)
ORED() → (FIRST | 5)() // Error: (1 | 5) is an integer, not a function, so it cannot be called
How a macro is used depends on how it is defined. There is a special rule for definitions, though: the opening parenthesis of a parameter list must come immediately after the macro name. If whitespace separates the macro name from the parenthesis, the definition is an object-like macro:
#define MACRO(X) // function-like macro that expands to nothing
#define MACRO (X) // object-like macro that expands to (X)
Most compilers let you see what the code looks like after preprocessing, i.e. after macro expansion. gcc has the -E flag and Microsoft's cl has /E.
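If you would rather check from inside a program, stringifying the expansion also works. The following is a sketch only: STR_/STR, TWICE and SPACED are hypothetical names that are not part of the question.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helpers: turn a fully expanded macro into a string. */
#define STR_(x) #x
#define STR(x) STR_(x)

#define FIRST 1
#define ORED (FIRST | 5)          /* object-like: written as ORED */
#define TWICE(X) ((X) + (X))      /* function-like: written as TWICE(arg) */
#define SPACED (X) ((X) + (X))    /* object-like: the space puts (X) into the replacement list */

void check_macro_kinds(void) {
    assert(ORED == 5);
    assert(TWICE(ORED) == 10);
    assert(strcmp(STR(ORED), "(1 | 5)") == 0);
    /* A function-like macro name without parentheses is left alone. */
    assert(strcmp(STR(TWICE), "TWICE") == 0);
    assert(strcmp(STR(SPACED), "(X) ((X) + (X))") == 0);
}
```

So the choice is about whether the macro needs arguments, not about whether its replacement list is a literal or an expression.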

# and ## order of expansion

The C standard gives the following example:
#define hash_hash # ## #
#define mkstr(a) # a
#define in_between(a) mkstr(a)
#define join(c, d) in_between(c hash_hash d)
char p[] = join(x, y); // equivalent to char p[] = "x ## y";
But it also says 'The order of evaluation of # and ## operators is unspecified.'
Why is the expansion of hash_hash guaranteed to be interpreted as the ## operator applied to #'s, instead of the # operator applied to ##?
Because '#' only acts as an operator if it appears in a function-like macro and is followed by a parameter name ... but hash_hash isn't a function-like macro and those '#' aren't followed by parameter names.
Quote C99:
A punctuator is a symbol that has independent syntactic and semantic
significance. Depending on context, it may specify an operation to be
performed (which in turn may yield a value or a function designator,
produce a side effect, or some combination thereof) in which case it
is known as an operator (other forms of operator also exist in some
contexts). An operand is an entity on which an operator acts.
# and ## are punctuators.
Besides:
6.10.3.1 Argument substitution
After the arguments for the invocation of a function-like macro have been identified, argument
substitution takes place. A parameter in the replacement list, unless
preceded by a # or ## preprocessing token or followed by a ##
preprocessing token (see below), is replaced by the corresponding
argument after all macros contained therein have been expanded. Before
being substituted, each argument’s preprocessing tokens are completely
macro replaced as if they formed the rest of the preprocessing file;
no other preprocessing tokens are available.
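The standard's example can be checked directly. In this sketch, lone_hash and the check function are hypothetical additions; lone_hash shows that a # in an object-like macro is an ordinary token and not the stringizing operator.

```c
#include <assert.h>
#include <string.h>

#define hash_hash # ## #          /* object-like: ## pastes two plain # tokens */
#define mkstr(a) # a              /* function-like: here # is the operator */
#define in_between(a) mkstr(a)
#define join(c, d) in_between(c hash_hash d)

/* Hypothetical: a single # token as an object-like replacement list. */
#define lone_hash #

static const char p[] = join(x, y);
static const char q[] = in_between(lone_hash);

void check_hash_hash(void) {
    assert(strcmp(p, "x ## y") == 0);
    assert(strcmp(q, "#") == 0);
}
```

The ## token produced by expanding hash_hash is not treated as an operator either, which is why it survives into the string.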

What's the exact step of macro expanding?

This doesn't work as expected:
#define stringify(x) #x
printf("Error at line " stringify(__LINE__));
This works:
#define stringify1(x) #x
#define stringify(x) stringify1(x)
printf("Error at line " stringify(__LINE__));
What order does the preprocessor follow when it expands such macros?
When expanding a macro, the preprocessor expands the macro's arguments only if those arguments are not subjected to the stringizing (#) or token-pasting (##) operators. So, if you have this:
#define stringify(x) #x
stringify(__LINE__)
Then, the preprocessor does not expand __LINE__, because it's the argument of the stringizing operator. However, when you do this:
#define stringify1(x) #x
#define stringify(x) stringify1(x)
stringify(__LINE__)
Then, when expanding stringify, the preprocessor expands __LINE__ to the current line number, since x is not used with either the stringizing or token-pasting operators in the definition of stringify. It then expands stringify1, and we get what we wanted.
The relevant language from the C99 standard comes from §6.10.3.1/1:
After the arguments for the invocation of a function-like macro have been identified, argument substitution takes place. A parameter in the replacement list, unless preceded by a # or ## preprocessing token or followed by a ## preprocessing token (see below), is replaced by the corresponding argument after all macros contained therein have been expanded. Before being substituted, each argument’s preprocessing tokens are completely macro replaced as if they formed the rest of the preprocessing file; no other preprocessing tokens are available.
Clauses §6.10.3.2 and 6.10.3.3 go on to define the behavior of the # and ## operators respectively.
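Both behaviours can be compared side by side. This is a small sketch; the function names are hypothetical, and the exact line number depends on where the code sits in the file, so the check only looks for a positive number.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define stringify1(x) #x
#define stringify(x) stringify1(x)

/* x is the operand of #, so __LINE__ is not expanded. */
const char *raw_line(void) { return stringify1(__LINE__); }

/* x is expanded while stringify is replaced, then stringify1 turns it into a string. */
const char *expanded_line(void) { return stringify(__LINE__); }

void check_stringify(void) {
    assert(strcmp(raw_line(), "__LINE__") == 0);
    assert(atoi(expanded_line()) > 0);
}
```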

Resources