Can a C macro definition refer to other macros? - c

What I'm trying to figure out is if something such as this (written in C):
#define FOO 15
#define BAR 23
#define MEH (FOO / BAR)
is allowed? I would want the preprocessor to replace every instance of
MEH
with
(15 / 23)
but I'm not so sure that will work. Certainly if the preprocessor only goes through the code once then I don't think it'd work out the way I'd like.
I found several similar examples but all were really too complicated for me to understand. If someone could help me out with this simple one I'd be eternally grateful!

Short answer: yes. You can nest defines and macros like that, as many levels deep as you want, as long as it isn't recursive.
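Here is a minimal sketch of the macros from the question, assuming a conforming C compiler. The helpers STR_ and STR and the three functions are names made up for this illustration; they are not part of the question.

```c
#include <stdio.h>

#define FOO 15
#define BAR 23
#define MEH (FOO / BAR)   /* FOO and BAR are replaced when MEH is expanded */

/* Hypothetical helpers (not from the question): the extra level makes the
   argument get macro-expanded before # turns it into a string. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* Returns the text that MEH expands to. */
const char *meh_text(void) { return STR(MEH); }

/* Returns the value of the expanded expression (integer division). */
int meh_value(void) { return MEH; }

/* Prints both; call it from main() to see the expansion. */
void print_meh(void) { printf("%s = %d\n", meh_text(), meh_value()); }
```

Running the file through gcc -E is another way to look at the expanded text directly.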

The answer is "yes", and two other people have correctly said so.
As for why the answer is yes, the gory details are in the C standard, section 6.10.3.4, "Rescanning and further replacement". The OP might not benefit from this, but others might be interested.
6.10.3.4 Rescanning and further replacement
After all parameters in the replacement list have been substituted and
# and ## processing has taken place, all placemarker preprocessing tokens are removed.
Then, the resulting preprocessing token sequence
is rescanned, along with all subsequent preprocessing tokens of the
source file, for more macro names to replace.
If the name of the macro being replaced is found during this scan of
the replacement list (not including the rest of the source file's
preprocessing tokens), it is not replaced. Furthermore, if any nested
replacements encounter the name of the macro being replaced, it is not
replaced. These nonreplaced macro name preprocessing tokens are no
longer available for further replacement even if they are later
(re)examined in contexts in which that macro name preprocessing token
would otherwise have been replaced.
The resulting completely macro-replaced preprocessing token sequence
is not processed as a preprocessing directive even if it resembles
one, but all pragma unary operator expressions within it are then
processed as specified in 6.10.9 below.
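To make the second paragraph of that quote concrete, here is a small sketch of self-referential macros. All names (limit, PING, PONG, NAME) are invented for the example.

```c
/* Hypothetical example: a variable and a macro that share a name. */
static int limit = 10;

/* While `limit` is being replaced, the `limit` inside its own replacement
   list is left alone, so this does not recurse forever. */
#define limit (limit + 1)

/* Expands once to (limit + 1), where the inner limit is the variable. */
int read_limit(void) { return limit; }

/* Indirect self-reference is cut off the same way. */
#define PING PONG
#define PONG PING

/* Two-level stringizing helper so the argument is expanded first. */
#define NAME_(x) #x
#define NAME(x) NAME_(x)

/* PING -> PONG -> PING, and that last PING is not replaced again. */
const char *ping_text(void) { return NAME(PING); }
```

With these definitions read_limit() evaluates the variable plus one, and ping_text() yields the string "PING" rather than looping.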

Yes, it's going to work.
But for your personal information, here are some simplified rules about macros that might help you (it's out of scope, but will probably help you in the future). I'll try to keep it as simple as possible.
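Macros are defined in the order they are included/read. That means a macro must be defined before the point where you use it. (A macro's replacement text may mention a macro that is only defined later, because names are looked up when the macro is expanded, not when it is defined.)
Useful preprocessor directives: #define, #undef, #if, #ifdef, #ifndef, #elif, #else, #endif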
You can use any other previously #define in your macro. They will be expanded. (like in your question)
Function macro definitions accept two special operators (# and ##)
operator # stringizes the argument:
#define str(x) #x
str(test); // would translate to "test"
operator ## concatenates two arguments
#define concat(a,b) a ## b
concat(hello, world); // would translate to "helloworld"
There are some predefined macros (from the language) as well that you can use:
__LINE__, __FILE__, __cplusplus, etc
See your compiler's documentation for the full list, since many predefined macros are compiler-specific rather than "cross platform".
Pay attention to the macro expansion
You'll see that people use a lot of parentheses "()" when defining macros. The reason is that when you call a macro, its arguments are pasted into the replacement text "as is", without any implicit grouping.
#define mult(a, b) a * b
mult(1+2, 3+4); // will be expanded like: 1 + 2 * 3 + 4 = 11 instead of 21.
#define mult_fix(a, b) ((a) * (b))
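The rules above can be tried out with a short sketch like the following. The variable helloworld and the *_demo functions are hypothetical, added only so that each expansion produces something that compiles.

```c
#define str(x) #x                    /* # turns the argument into a string literal */
#define concat(a, b) a ## b          /* ## glues two tokens into one */
#define mult(a, b) a * b             /* unparenthesized: precedence can bite */
#define mult_fix(a, b) ((a) * (b))   /* parenthesized arguments and result */

/* Hypothetical variable, only so that concat(hello, world) names something. */
static int helloworld = 7;

const char *str_demo(void) { return str(test); }              /* "test" */
int concat_demo(void)      { return concat(hello, world); }   /* reads helloworld */
int mult_demo(void)        { return mult(1 + 2, 3 + 4); }     /* 1 + 2 * 3 + 4 */
int mult_fix_demo(void)    { return mult_fix(1 + 2, 3 + 4); } /* ((1 + 2) * (3 + 4)) */
```

mult_demo() gives 11 and mult_fix_demo() gives 21, which is the difference described above.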

Yes, and there is one more advantage of this feature. You can leave some macro undefined and set its value as a name of another macro in the compilation command.
#include <stdio.h>
#define STR "string"
int main(void) { printf("value=%s\n", VALUE); return 0; }
In the command line you can say that the macro "VALUE" takes value from another macro "STR":
$ gcc -o test_macro -DVALUE=STR main.c
$ ./test_macro
Output:
value=string
This approach works as well for MSC compiler on Windows. I find it very flexible.
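A variation you might find handy: give VALUE a default so the file still builds when nothing is passed on the command line. This is only a sketch; the #ifndef fallback and the function names are my additions, not part of the approach described above.

```c
#include <stdio.h>

#define STR "string"

/* Assumption for this sketch: fall back to STR when the build
   does not pass -DVALUE=... on the command line. */
#ifndef VALUE
#define VALUE STR
#endif

/* VALUE -> STR -> "string" unless the command line overrides VALUE. */
const char *value_text(void) { return VALUE; }

/* Prints the same line as the program above; call it from main(). */
void print_value(void) { printf("value=%s\n", value_text()); }
```

Compiling with -DVALUE=STR or with no -D at all gives the same result here, while -DVALUE='"other"' would override it.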

I'd like to add a gotcha that tripped me up.
Function-style macros cannot do this.
Example that doesn't compile when used:
#define FOO 1
#define FSMACRO(x) FOO + x

Yes, that is supported. And used quite a lot!
One important thing to note, though, is to make sure you parenthesize the expression, otherwise you might run into nasty issues!
#define MEH FOO/BAR
// vs
#define MEH (FOO / BAR)
// the first could be expanded in an expression like 5 * MEH to mean something
// completely different than the second
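To see the difference with actual numbers, here is a sketch using the values from the question. MEH_BARE and MEH_PAREN are hypothetical names so both variants can live in one file.

```c
#define FOO 15
#define BAR 23
#define MEH_BARE  FOO / BAR      /* hypothetical name for the unparenthesized form */
#define MEH_PAREN (FOO / BAR)    /* hypothetical name for the parenthesized form */

/* Expands to 5 * 15 / 23, which groups as (5 * 15) / 23. */
int scaled_bare(void)  { return 5 * MEH_BARE; }

/* Expands to 5 * (15 / 23); the integer division happens first. */
int scaled_paren(void) { return 5 * MEH_PAREN; }
```

With integer arithmetic the first gives 3 and the second gives 0.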

Related

Parameterized macros involving the ## operator in the replacement-list

In the book that I am reading "C Programming A Modern Approach", there is a section on Page 343 that discusses some tricks you can use to get around certain deficits in macros.
The example problem is depicted as follows:
#define CONCAT(x,y) x##y (Directive 1)
The author then explains that the following line of code will fail to function as intended if using the aforementioned directive:
CONCAT(a, CONCAT(b,c))
This line of code will result in aCONCAT(b,c) as opposed to the desired abc.
In order to address this shortcoming, the author proposes the following work-around:
#define CONCAT2(x,y) CONCAT(x,y) (Directive 2)
The author explains that the presence of Directive 1 and Directive 2 will ensure that the slightly different line of code CONCAT2(a, CONCAT2(b,c)) is correctly replaced with abc.
(notice that this line of code is different than the original line of code...CONCAT2 is used instead of CONCAT.)
Could someone please walk me through why this will successfully carry out the desired objective? From what I understand, the preprocessor will keep scanning the precompiled code until all defined terms have been dealt with. For a given scan, how many defined words are updated per line?
I would think that the following flow of preprocessing replacements take place:
Given CONCAT2(a, CONCAT2(b,c))...
First pass over: CONCAT(a, CONCAT2(b,c))
However, for the second pass over, does CONCAT get expanded to its replacement list expression? Or does CONCAT2 get expanded to its replacement list expression? In either case, it seems like we once again arrive at a failed expression of either aCONCAT2(b,c) or CONCAT(a, CONCAT(b,c)), which would therefore still fail just like the very original case we presented.
Any help is greatly appreciated!
When the preprocessor detects a function-like macro invocation while scanning a source line, it completely expands the macro's arguments before substituting them into the macro's replacement text, except that where an argument appears as an operand of the stringification (#) or token-pasting (##) operator, its literal value is used for the operation. The resulting replacement text, with expanded arguments and the results of any # and ## operations substituted, is then rescanned for additional macros to expand.
Thus, with
CONCAT(a, CONCAT(b,c))
the literal values of both arguments are used as operands for the token-pasting operation. The result is
aCONCAT(b,c)
That is rescanned for further macros to expand, but aCONCAT is not defined as a macro name, so no further macro expansion occurs.
Now consider
CONCAT2(a, CONCAT2(b,c))
In CONCAT2, neither argument is an operand of # or ##, so both are fully macro-expanded before being substituted. Of course a is unchanged, but CONCAT2(b,c) expands to CONCAT(b,c), which upon rescan is expanded to bc. By substitution of the expanded argument values into its replacement text, the outer CONCAT2 invocation expands to
CONCAT(a, bc)
That expansion is then rescanned, in the context of the surrounding source text, for further macro expansion, yielding
abc
That is again rescanned, but there are no further macro expansions to perform, so that's the final result.
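The walk-through above can be checked with a small sketch. The variable abc and the STR_/STR helpers are hypothetical, added only to make the expansions observable.

```c
#define CONCAT(x, y) x##y
#define CONCAT2(x, y) CONCAT(x, y)

/* Hypothetical variable so that the pasted name abc refers to something. */
static int abc = 42;

/* Arguments of CONCAT2 are fully expanded first:
   CONCAT2(b, c) -> CONCAT(b, c) -> bc.
   The outer call then becomes CONCAT(a, bc), which rescans to abc. */
int read_abc(void) { return CONCAT2(a, CONCAT2(b, c)); }

/* The same rule applies to #: helper names below are made up for this sketch. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* Argument is an operand of #, so its literal spelling is used. */
const char *direct_text(void)   { return STR_(CONCAT(b, c)); }

/* Argument is expanded to bc before STR_ stringizes it. */
const char *indirect_text(void) { return STR(CONCAT(b, c)); }
```

direct_text() keeps the unexpanded spelling of its argument, while indirect_text() sees bc, which mirrors the CONCAT versus CONCAT2 difference.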

C variadic macro with two named parameters

I want to use a variadic macro but it appears to be designed to only treat the first parameter specially. I want the first two parameters to be named and the rest not, like so:
#define FOO(AA,BB,...) AA->BB(AA,##...)
FOO(mystruct,funcname,123)
However this is not working with LLVM. Am I doing something wrong, or is there a limitation to how the variadic macro works?
UPDATE
The correct answer is, use ##__VA_ARGS__ instead of ##...
There are some webpages that claim that "..." is valid but at least with the MacOS llvm it is not.
The macro arguments are not expanded with ... in the macro expansion - how could they, because then you couldn't have a macro that used ellipsis in the expansion. Instead it will be available as a special parameter __VA_ARGS__.
With this, the following program
#define FOO(AA,BB,...) AA->BB(AA, __VA_ARGS__)
FOO(mystruct,funcname,123)
FOO(mystruct,funcname,123,456)
will be preprocessed to
mystruct->funcname(mystruct, 123)
mystruct->funcname(mystruct, 123,456)
The ## is a token-pasting operator. It will make a single preprocessing token out of 2 parts. , ## ... attempts to make a preprocessing token ,.... It is not a valid C token, and that is why Clang will report
<source>:3:1: error: pasting formed ',...', an invalid preprocessing token
... macro arguments are pasted into macro bodies with __VA_ARGS__.
The problem is how to allow for it to be empty.
If it is empty, you'll usually want the comma before it erased, and
you can use the GNU ##__VA_ARGS__ extension to achieve that.
#define FOO(AA,BB,...) AA->BB(AA,##__VA_ARGS__) /*GNU extension*/
FOO(mystruct,funcname) //warning with -pedantic
FOO(mystruct,funcname,123)
The above, however, will trigger warnings if compiled with -pedantic.
If you want your macro usable without warnings at -pedantic, you could perhaps achieve that by swapping the first two arguments in the macro definition.
#define FIRST(...) FIRST_(__VA_ARGS__,)
#define FIRST_(X,...) X
#define BAR_(CallExpr,...) CallExpr(__VA_ARGS__)
#define BAR(BB,/*AA,*/...) BAR_(FIRST(__VA_ARGS__)->BB,__VA_ARGS__)
BAR(funcname,mystruct) //no warning
BAR(funcname,mystruct,123)
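For anyone who wants to try both forms against real code, here is a sketch. The struct Obj type, its members and the *_impl functions are hypothetical; only the macros come from the answers above.

```c
/* Hypothetical object type with function-pointer members. */
struct Obj {
    int base;
    int (*get)(struct Obj *self);
    int (*add)(struct Obj *self, int a, int b);
};

static int get_impl(struct Obj *self) { return self->base; }
static int add_impl(struct Obj *self, int a, int b) { return self->base + a + b; }

/* Standard form: needs at least one extra argument after BB. */
#define FOO(AA, BB, ...) AA->BB(AA, __VA_ARGS__)

/* Swapped-argument form: also works with zero extra arguments. */
#define FIRST(...) FIRST_(__VA_ARGS__,)
#define FIRST_(X, ...) X
#define BAR_(CallExpr, ...) CallExpr(__VA_ARGS__)
#define BAR(BB, /*AA,*/ ...) BAR_(FIRST(__VA_ARGS__)->BB, __VA_ARGS__)

int demo_foo(void) {
    struct Obj o = { 10, get_impl, add_impl };
    struct Obj *p = &o;
    return FOO(p, add, 1, 2);                 /* p->add(p, 1, 2) */
}

int demo_bar(void) {
    struct Obj o = { 10, get_impl, add_impl };
    struct Obj *p = &o;
    return BAR(get, p) + BAR(add, p, 1, 2);   /* p->get(p) + p->add(p, 1, 2) */
}
```

Note that the object comes second in the BAR calls, which is the price of avoiding the GNU extension.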

Macro Expansion: Argument with Commas

The code I'm working on uses some very convoluted macro voodoo in order to generate code, but in the end there is a construct that looks like this
#define ARGS 1,2,3
#define MACROFUNC_OUTER(PARAMS) MACROFUNC_INNER(PARAMS)
#define MACROFUNC_INNER(A,B,C) A + B + C
int a = MACROFUNC_OUTER(ARGS);
What is expected is to get
int a = 1 + 2 + 3;
This works well for the compiler it was originally written for (GHS) and also for GCC, but MSVC (2008) treats PARAMS as a single preprocessing token that it won't expand, so A is set to the whole of PARAMS and B and C to nothing. The result is this
int a = 1,2,3 + + ;
while MSVC warns that not enough actual parameters for macro 'MACROFUNC_INNER'.
Is it possible to get MSVC do the expansion with some tricks (another layer of macro to force a second expansion, some well placed ## or #, ...). Admitting that changing the way the construct work is not an option. (i.e.: can I solve the problem myself?)
What does the C standard say about such a corner case? I couldn't find anything in the C11 standard that explicitly says how to handle an argument that contains a list of arguments. (i.e.: can I argue with the author of the code that he has to write it again, or is MSVC just non-conformant?)
MSVC is non-conformant. The standard is actually clear on the point, although it does not feel the need to mention this particular case, which is not exceptional.
When a function-like macro invocation is encountered, the preprocessor:
§6.10.3/11 identifies the arguments, which are possibly empty sequences of tokens separated by non-protected commas , (a comma is protected if it is inside parentheses ()).
§6.10.3.1/1 does a first pass over the macro body, substituting each parameter which is not used in a # or ## operation with the corresponding fully macro-expanded argument. (It does no other substitutions in the macro body in this step.)
§6.10.3.4/1 rescans the substituted replacement token sequence, performing more macro replacements as necessary.
(The above mostly ignores stringification (#) and token concatenation (##), which are not relevant to this question.)
This order of operations unambiguously leads to the behaviour expected by whoever wrote the software.
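Applied to the macros from the question, the three steps look like this. It is a sketch that assumes a conforming preprocessor such as GCC's or Clang's; the function name sum_args is made up for the example.

```c
#define ARGS 1,2,3
#define MACROFUNC_OUTER(PARAMS) MACROFUNC_INNER(PARAMS)
#define MACROFUNC_INNER(A,B,C) A + B + C

/* Step 1: MACROFUNC_OUTER(ARGS) has one argument, the single token ARGS.
   Step 2: PARAMS is replaced by the fully expanded argument, 1,2,3,
           so the body becomes MACROFUNC_INNER(1,2,3).
   Step 3: the rescan finds MACROFUNC_INNER with three arguments
           and produces 1 + 2 + 3. */
int sum_args(void) { return MACROFUNC_OUTER(ARGS); }
```

The commas only start separating arguments in step 3, because in step 1 they are not yet visible: the argument is still the single token ARGS.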
Apparently (according to @dxiv, and verified here) the following standards-compliant workaround works on some versions of MS Visual Studio:
#define CALL(A,B) A B
#define OUTER(PARAM) CALL(INNER,(PARAM))
#define INNER(A,B,C) whatever
For reference, the actual language from the C11 standard, skipping over the references to # and ## handling:
§6.10.3 11 The sequence of preprocessing tokens bounded by the outside-most matching parentheses forms the list of arguments for the function-like macro. The individual arguments within the list are separated by comma preprocessing tokens, but comma preprocessing tokens between matching inner parentheses do not separate arguments.…
§6.10.3.1 1 After the arguments for the invocation of a function-like macro have been identified, argument substitution takes place. A parameter in the replacement list… is replaced by the corresponding argument after all macros contained therein have been expanded. Before being substituted, each argument’s preprocessing tokens are completely macro replaced as if they formed the rest of the preprocessing file…
§6.10.3.4 1 After all parameters in the replacement list have been substituted… [t]he resulting preprocessing token sequence is then rescanned, along with all subsequent preprocessing tokens of the source file, for more macro names to replace.
C11 says that each appearance of an object-like macro's name
[is] replaced by the replacement list of preprocessing tokens that constitute the remainder of the directive. The replacement list is then rescanned for more macro names as specified below.
[6.10.3/9]
Of function-like macros it says this:
If the identifier-list in the macro definition does not end with an ellipsis, the number of arguments [...] in an invocation of a function-like macro shall equal the number of parameters in the macro definition.
[6.10.3/4]
and this:
The sequence of preprocessing tokens bounded by the outside-most matching parentheses forms the list of arguments for the function-like macro.
[6.10.3/11]
and this:
After the arguments for the invocation of a function-like macro have been identified, argument substitution takes place. A parameter in the replacement list [...] is replaced by the corresponding argument after all macros contained therein have been expanded. Before being substituted, each argument’s preprocessing tokens are completely macro replaced as if they formed the rest of the preprocessing file; no other preprocessing tokens are available.
[6.10.3.1/1]
Of macros in general it also says this:
After all parameters in the replacement list have been substituted [... t]he resulting preprocessing token sequence is then rescanned, along with all subsequent preprocessing tokens of the source file, for more macro names to replace.
[6.10.3.4/1]
MSVC++ does not properly expand the arguments to function-like macros before rescanning the expansion of such macros. It seems unlikely that there is any easy workaround.
UPDATE:
In light of @dxiv's answer, however, it may be that there is a solution after all. The problem with his solution with respect to standard-conforming behavior is that there needs to be one more expansion than is actually performed. That can easily enough be supplied. This variation on his approach works with GCC, as it should, and inasmuch as it is based on code that dxiv claims works with MSVC++, it seems likely to work there, too:
#define EXPAND(x) x
#define PAREN(...) (__VA_ARGS__)
#define EXPAND_F(m, ...) EXPAND(m PAREN(__VA_ARGS__))
#define SUM3(a,b,c) a + b + c
#define ARGS 1,2,3
int sum = EXPAND_F(SUM3, ARGS);
I have of course made it a little more generic than perhaps it needs to be, but that may serve you well if you have a lot of these to deal with.
Curiously enough, the following appears to work in MSVC (tested with 2010 and 2015).
#define ARGS 1,2,3
#define OUTER(...) INNER PARAN(__VA_ARGS__)
#define PARAN(...) (__VA_ARGS__)
#define INNER(A,B,C) A + B + C
int a = OUTER(ARGS);
I don't know that it's supposed to work by the letter of the standard, in fact I have a hunch it's not. Could still be conditionally compiled just for MSVC, as a workaround.
[EDIT] P.S. As pointed out in the comments, the above is (another) non-standard MSVC behavior. Instead, the alternative workarounds posted by @rici and @JohnBollinger in the respective replies are compliant, thus recommended.

Understanding recursive Macro Expansions

I came across this question in an Embedded interview question set.
#define cat(x,y) x##y
concatenates x to y. But cat(cat(1,2),3) does not expand as expected and instead gives a preprocessor warning. Why?
Does C not encourage recursive macro expansion? My assumption is the expression should display 1##2##3. Am I wrong?
The problem is that cat(cat(1,2),3) isn't expanded the way you might expect, with cat(1,2) giving 12 and then cat(12,3) giving 123.
Macro parameters that are preceded or followed by ## in a replacement list aren't expanded at the time of substitution. As a result, cat(cat(1,2),3) expands to cat(1,2)3: the ) that ends the unexpanded first argument is pasted onto 3, and )3 is not a valid preprocessing token, which is what the preprocessor complains about. The inner cat(1,2) can't be expanded any further.
So the simple rule is that macros whose replacement lists depend on ## usually can't be called in a nested fashion.
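The usual way around this is an extra level of indirection, so that the arguments get expanded before ## sees them. A minimal sketch, where xcat and nested_value are hypothetical names of my own:

```c
#define cat(x, y) x##y

/* Hypothetical helper: its parameters are not operands of ##,
   so the arguments are fully expanded before cat pastes them. */
#define xcat(x, y) cat(x, y)

/* xcat(1, 2) -> cat(1, 2) -> 12, then cat(12, 3) -> 123. */
int nested_value(void) { return xcat(xcat(1, 2), 3); }
```

Here the nested call works because the inner xcat(1, 2) is already the single token 12 by the time the outer paste happens.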

During C macro expansion, is there a special case for macros that would expand to "/*"?

Here's a relevant example. It's obviously not valid C, but I'm just dealing with the preprocessor here, so the code doesn't actually have to compile.
#define IDENTITY(x) x
#define PREPEND_ASTERISK(x) *x
#define PREPEND_SLASH(x) /x
IDENTITY(literal)
PREPEND_ASTERISK(literal)
PREPEND_SLASH(literal)
IDENTITY(*pointer)
PREPEND_ASTERISK(*pointer)
PREPEND_SLASH(*pointer)
Running gcc's preprocessor on it:
gcc -std=c99 -E macrotest.c
This yields:
(...)
literal
*literal
/literal
*pointer
**pointer
/ *pointer
Please note the extra space in the last line.
This looks like a feature to prevent macros from expanding to "/*" to me, which I'm sure is well-intentioned. But at a glance, I couldn't find anything pertaining to this behaviour in the C99 standard. Then again, I'm inexperienced at C. Can someone shed some light on this? Where is this specified? I would guess that a compiler adhering to C99 should not just insert extra spaces during macro expansion just because it would probably prevent programming mistakes.
The source code is already tokenized before being processed by CPP.
So what you have is a / token and a * token, which are not implicitly combined into a /* "token" (in quotes, since /* is not really a preprocessing token).
If you use -E to output preprocessed source, CPP needs to insert a space so that a subsequent compiler pass does not read the two tokens as /*, the start of a comment.
The same mechanism prevents, for example, two + signs that come from different macros from being combined into a ++ token on output.
The only way to really paste two preprocessor tokens together is with the ## operator:
#define P(x,y) x##y
...
P(foo,bar)
results in the token foobar
P(+,+)
results in the token ++, but
P(/,*)
is not valid since /* is not a valid preprocessor token.
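The difference between adjacent tokens and pasted tokens can also be observed in compiled code. This sketch uses - rather than +, because the effect is easier to see in the result; MINUS and the two function names are hypothetical.

```c
#define MINUS -
#define P(x, y) x##y

/* The - written in the source and the - from MINUS stay two separate
   tokens, so this is -(-x), not a decrement. */
int negate_twice(int x) { return -MINUS x; }

/* Here ## really forms the single token --, so this is --x. */
int predecrement(int x) { return P(-, -) x; }
```

If macro expansion worked on raw text, the first function would also decrement x; because it works on tokens, it returns x unchanged.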
The behavior of the pre-processor is standardized. In the summary at http://en.wikipedia.org/wiki/C_preprocessor , the results you are observing are the effect of:
"3: Tokenization - The preprocessor breaks the result into preprocessing tokens and whitespace. It replaces comments with whitespace".
This takes place before:
"4: Macro Expansion and Directive Handling".

Resources