C macro expansion within a macro

While testing some C software, I'd like to use macros to generate function calls. I'm stuck with the CodeWarrior 5.2 compiler (very old), so I don't know whether what I'm seeing is standard behavior.
in macros.h
#define RUNTEST(i) \
testcase_index = i; \
__PREFIX__##_testCase_##i()
in test_foo.c
#include "macros.h"
#define __PREFIX__ foo
RUNTEST(10);
Apparently __PREFIX__ is not expanded, and the preprocessor generates a call to __PREFIX___testCase_10(), which of course fails at link time.
Copying everything in the same file doesn't seem to change anything.
Is there a simple way out?
Alternative
I also tried #define __PREFIX__() foo to force macro expansion. In that case, it almost works, and now generates foo _testCase_10() (with a space), which of course won't compile.

I've done a simplified version of your question here (without assigning to the testcase index):
.h:
#define PASTER(x, y) x##_testCase_##y
#define EVAL(x, y) PASTER(x, y)
#define RUNTEST(i) EVAL(__PREFIX__, i)
.c
#define __PREFIX__ foo
// whatever
RUNTEST(1);
Explanation:
From the C standard:
6.10.3.1 Argument substitution
After the arguments for the invocation of a function-like macro have been identified, argument substitution takes place. A parameter in the replacement list, unless preceded by a # or ## preprocessing token or followed by a ## preprocessing token (see below), is replaced by the corresponding argument after all macros contained therein have been expanded. Before being substituted, each argument’s preprocessing tokens are completely macro replaced as if they formed the rest of the preprocessing file; no other preprocessing tokens are available.
So now that we have this, I'll walk through the expansion of RUNTEST(1):
EVAL(__PREFIX__, 1)
PASTER(foo, 1)
foo##_testCase_##1
foo_testCase_1
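As a minimal sketch of how the walk-through above extends to the original RUNTEST (with the testcase_index assignment and the actual call restored), something like the following should work. The names TEST_PREFIX, foo_testCase_10 and run_demo are assumptions made for illustration; TEST_PREFIX stands in for __PREFIX__ only because identifiers beginning with two underscores are reserved for the implementation.

```c
#include <assert.h>

int testcase_index;

/* Hypothetical test case; its name is TEST_PREFIX + _testCase_ + index. */
int foo_testCase_10(void) { return testcase_index * 2; }

#define TEST_PREFIX foo

/* PASTER does the pasting; its operands are NOT macro-expanded first. */
#define PASTER(x, y) x##_testCase_##y
/* EVAL's parameters are not operands of ##, so TEST_PREFIX is expanded here. */
#define EVAL(x, y) PASTER(x, y)
/* The comma operator keeps the assignment and the call in one expression. */
#define RUNTEST(i) (testcase_index = (i), EVAL(TEST_PREFIX, i)())

int run_demo(void)
{
    int result = RUNTEST(10);      /* calls foo_testCase_10() */
    assert(testcase_index == 10);  /* the assignment ran before the call */
    return result;
}
```

Wrapping the two steps in one parenthesized expression is a design choice of this sketch, not part of the original macro; it lets RUNTEST(10) be used safely after an if without braces.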

Related

In what situations does C preprocessor ## work and not work?

It seems that sometimes concatenation with ## does work and sometimes it does not work.
It is an unreliable feature even though it is clearly vitally necessary for some uses.
Is there a clear set of rules for using ##?
An example is:
file1.h
#define concat(a,b) a##b
#define BAR bar
extern int concat(fu,BAR) ();
Here concat produces fuBAR not fubar.
Another example is:
file2.h
#define BAR bar
extern int fu##BAR ();
Here ## produces an error about a stray ## in the code.
## obeys rules which are different than the rules you were expecting. That doesn't mean it "doesn't work" or "is an unreliable feature", it only means you have to use it differently than you thought.
When you write
#define concat(a,b) a##b
the C standard says that the arguments a and b are NOT macro expanded before the concatenation happens. (N1570 §6.10.3.1.) This is an intentional difference from the behavior when you don't apply ## or # to a macro argument.
You can get the behavior you wanted with double expansion:
#define concat(a,b) concat_(a,b)
#define concat_(a,b) a##b
With this definition, the arguments to concat are macro expanded before substitution into the macro body, since they are not being used as operands to ##. Then each argument becomes an argument to concat_, and isn't expanded again, but that's fine, because the expansion is done already.
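A minimal sketch of the difference follows. The single-level macro paste_now and the helper call_both are hypothetical names added only for contrast, and the function bodies are made up for illustration.

```c
#include <assert.h>

#define BAR bar

#define paste_now(a, b) a##b        /* single level: operands pasted unexpanded */
#define concat(a, b) concat_(a, b)  /* outer level: arguments are expanded here */
#define concat_(a, b) a##b          /* inner level: pastes the expanded tokens */

/* Defines a function named fuBAR, because BAR is pasted before it can expand. */
int paste_now(fu, BAR)(void) { return 1; }

/* Defines a function named fubar, because BAR became bar before the paste. */
int concat(fu, BAR)(void) { return 2; }

int call_both(void)
{
    assert(fuBAR() == 1);
    assert(fubar() == 2);
    return fuBAR() + fubar();
}
```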
And when you write
#define BAR bar
extern int fu##BAR ();
the C standard says that the ## operator is not recognized at all by the preprocessor (and therefore passes on to translation phase 7, where it is a valid token that isn't accepted by any grammar rule, and therefore causes a syntax error) because it's not part of a macro definition. (N1570 §6.10.3.3p2,3 — by implication only; it says that ## is recognized when it appears in the replacement list of a macro being expanded. This section doesn't say that it is recognized at any other time, and no other part of the standard gives ## a meaning in any other context.)
Is there a clear set of rules for using ##?
The C standard 6.10.3.1:
After the arguments for the invocation of a function-like macro have been identified,
argument substitution takes place. A parameter in the replacement list, unless preceded
by a # or ## preprocessing token or followed by a ## preprocessing token (see below), is
replaced by the corresponding argument after all macros contained therein have been
expanded.
Basically, the ## concatenation happens before its operands are macro-expanded, because the result of ## could itself form a new preprocessing token that names a macro. In case we have this:
#define concat(a,b) a##b
#define BAR bar
#define fuBAR "hello world"
Then puts(concat(fu,BAR)); will print "hello world".
In order to fix this, you need a helper macro which expands the preprocessor tokens before passing them to the macro where ## (or #) resides:
#define cc(a,b) a##b
#define concat(a,b) cc(a,b)
#define BAR bar
#define fuBAR "hello world"
#define fubar "fubar"
In this example a and b are expanded to fu and bar before cc is invoked. So now puts(concat(fu,BAR)); will print "fubar".
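To check the two outcomes described above side by side, here is a small sketch. The macro concat_direct and the helper functions are hypothetical names introduced only for this comparison.

```c
#include <assert.h>
#include <string.h>

#define BAR bar
#define fuBAR "hello world"
#define fubar "fubar"

#define concat_direct(a, b) a##b   /* pastes fu and BAR, giving fuBAR */
#define cc(a, b) a##b
#define concat(a, b) cc(a, b)      /* expands BAR to bar before cc pastes */

const char *direct_result(void)   { return concat_direct(fu, BAR); }
const char *indirect_result(void) { return concat(fu, BAR); }

void check_results(void)
{
    /* The pasted token is rescanned, so fuBAR and fubar are expanded in turn. */
    assert(strcmp(direct_result(), "hello world") == 0);
    assert(strcmp(indirect_result(), "fubar") == 0);
}
```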
Here ## produces an error about a stray ## in the code.
Because you can only use # and ## inside macros, simple as that. extern int fu##BAR (); is not a macro.
It seems that sometimes concatenation with ## does work and sometimes it does not work.
That's like saying "sometimes multiplication with * does work, and sometimes it doesn't", and then presenting examples such as
int six = 5;
int product = six * 2; // results in 10 instead of 12
and
void f(int a, int b, int a * b); // rejected by the compiler
The specifications for the token-pasting operator are presented in 6.10.3.3 of the C17 language specification, and your first clue should be that this is a subsection of 6.10.3, which describes macro replacement. Not only the placement of the section but also its explicit text make it clear that token concatenation occurs (only) in the context of macro replacement:
For both object-like and function-like macro invocations, before the replacement list is reexamined for more macro names to replace, each instance of a ## preprocessing token in the replacement list (not from an argument) is deleted and the preceding preprocessing token is concatenated with the following preprocessing token.
(C17 6.10.3.3/3)
There are no other specifications for special treatment of ##, so that's its full scope.
Additionally,
After the arguments for the invocation of a function-like macro have been identified, argument substitution takes place. A parameter in the replacement list, unless preceded by a # or ## preprocessing token or followed by a ## preprocessing token (see below), is replaced by the corresponding argument after all macros contained therein have been expanded.
(C17 6.10.3.1/1; emphasis added)
Thus, where function-like macro arguments are operands of a ## (or #) operator, they are not themselves macro-expanded, though the concatenated result might be expanded as a macro when the result is re-scanned.

Why do preprocessor macros ignore statements in parentheses

Following up on my (duplicate) question (and as suggested by StoryTeller):
Why do preprocessor macros ignore function names in parentheses?
#include <stdio.h>
#include <stdlib.h>
#define abs(x) ((x))
int main(void)
{
    printf("%d\n", abs(-1)); // output: -1
    printf("%d\n", (abs)(-1)); // output: 1
    return 0;
}
Is this defined in the standard?
The preprocessor's macro substitution is specified as follows:
6.10.3 Macro replacement / p10 - Emphasis mine:
A preprocessing directive of the form
# define identifier lparen identifier-list<opt> ) replacement-list new-line
# define identifier lparen ... ) replacement-list new-line
# define identifier lparen identifier-list , ... ) replacement-list new-line
defines a function-like macro with parameters, whose use is similar
syntactically to a function call. The parameters are specified by the
optional list of identifiers, whose scope extends from their
declaration in the identifier list until the new-line character that
terminates the #define preprocessing directive. Each subsequent
instance of the function-like macro name followed by a ( as the next
preprocessing token introduces the sequence of preprocessing tokens
that is replaced by the replacement list in the definition (an
invocation of the macro). The replaced sequence of preprocessing
tokens is terminated by the matching ) preprocessing token, skipping
intervening matched pairs of left and right parenthesis preprocessing
tokens. Within the sequence of preprocessing tokens making up an
invocation of a function-like macro, new-line is considered a normal
white-space character.
It says it right there: the macro name must be "followed by a ( as the next preprocessing token". For substitution to occur, the very next preprocessing token after the macro name must be a (. When it's a ), such as when the macro name is in parentheses, no substitution can occur.
So that leaves us only with the function name in parentheses, an expression that is identical to the function's designator.
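A minimal sketch of this rule, using a made-up function and a same-named macro rather than abs (the names negate, via_macro, via_function and check_negate are assumptions for illustration only):

```c
#include <assert.h>

/* A real function, defined before the macro so its definition is untouched. */
int negate(int x) { return -x; }

/* A function-like macro with the same name that deliberately does nothing. */
#define negate(x) (x)

int via_macro(void)    { return negate(5); }    /* name followed by ( : macro */
int via_function(void) { return (negate)(5); }  /* name followed by ) : function */

void check_negate(void)
{
    assert(via_macro() == 5);
    assert(via_function() == -5);
}
```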
Why? Because the C preprocessor is agnostic of the C language!
The C preprocessor has been conceived as a pure text replacement tool, barely powerful enough to provide C programmers with a simple means to:
- define constants that are pasted literally into the program
- literally paste the contents of files (i.e. headers) into the program
- create simple templates of code that are again pasted literally into the program
None of this includes any awareness of the C syntax! It's pure text manipulation.
On the contrary, you can do stuff like this
#define CLASS(name, base) typedef struct name name;\
struct name {\
base super;
#define END_CLASS };
CLASS(foo, bar)
int baz;
END_CLASS
Note how the preprocessor will generate an unmatched { token when expanding the CLASS macro, and an unmatched } token when expanding the END_CLASS macro.
Thus, the syntax for using a macro has nothing to do with the syntax for calling a function. You can call a C function in many different ways (foo(), foo (), (foo)(), (**(*foo) ) (), etc.) because the C language handles expressions of functions, and defines what happens when you place its name in parentheses (it's implicitly converted to a pointer to it), dereference it, or call it. All this does not exist in the preprocessor, so there is exactly one way to use a function-like macro: the macro name with ( as the very next preprocessing token, as in foo() or foo (). Whitespace in between is allowed at the point of use, but any other token, such as a closing parenthesis, suppresses the expansion.
Side note:
The C preprocessor is so agnostic of the C language, that it's not just used with C, it's also commonly used with languages like FORTRAN. (Which is a horrible fit, actually. Nevertheless, the C preprocessor is the best, commonly supported thing that can be used to make FORTRAN a bit less painful.)

Do function-like macros need mandatory parentheses? I am confused after reading the GCC cpp manual

Here is what confuses me:
To define a function-like macro, you use the same '#define' directive, but you put a pair of parentheses immediately after the macro name.
I believe this is to make the code stand out for people other than the author of the program. Like other rules of CAPS for macro names. But the following is where I get confused:
A function-like macro is only expanded if its name appears with a pair of parentheses after it. If you write just the name, it is left alone.
I disagreed instantly after reading it. And gcc -E verified that in the following code
#define FUNC display()
void display()
{
    printf("Display\n");
}
int main()
{
    FUNC;
    return 0;
}
The pre-processed output shows the content of the main() function as expected:
int main()
{
    display();
    return 0;
}
So what am I missing here? The pre-processor is for tokenizing the source, the macro expansion is a token, and the above code was processed that way. The pre-processor isn't supposed to check or verify anything; it just dumps tokens. In that case, what is the GCC manual trying to convey?
I am learning C programming, so I might be misunderstanding it a great deal as it frequently happens, I searched for a proper explanation and finally resorted to asking here. Please help me with this.
When you define:
#define FUNC display()
FUNC is not a function-like macro; it is an object-like macro that expands to a function call.
A function-like macro looks like:
#define FUNC() display()
Now you must write FUNC() to invoke it. Or, more frequently, it will have arguments:
#define MIN(x, y) ((x) < (y) ? (x) : (y))
and that can be invoked with:
int min = MIN(sin(p), cos(q));
with cautions about the number of times the arguments are evaluated.
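The caution can be made concrete with a small sketch, assuming a MIN that selects the smaller value and a hypothetical counting function next_value (neither the counter nor the helper names come from the answer):

```c
#include <assert.h>

#define MIN(x, y) ((x) < (y) ? (x) : (y))

int calls;  /* counts how often next_value() runs */

int next_value(void) { return ++calls; }

int min_demo(void)
{
    calls = 0;
    /* Expands to ((next_value()) < (10) ? (next_value()) : (10)). */
    int m = MIN(next_value(), 10);
    assert(calls == 2);  /* the argument expression ran twice */
    return m;            /* the value of the second call, not the first */
}
```

Because the winning argument is evaluated a second time, arguments with side effects (function calls, i++) are best avoided with macros of this shape.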
See also getc() as macro and C standard library function definition. It includes the standard's explanation of why it is important that the simple name of a function-like macro without a following open parenthesis is not expanded, which is what the quote from the GCC manual is telling you.
When a function-like macro is defined, the open parenthesis must 'touch' the macro name:
#define function_like(a) …
#define object_like (…)
Because there's a space after object_like, the open parenthesis is part of the replacement text, not the start of an argument list. When the function-like macro is invoked, there may be spaces between the macro name and the argument list:
function_like (x) // Valid invocation of function_like macro.
However, if you wrote:
int (function_like)(double a) { return asin(a) + 2 * atanh(a); }
this is not an invocation of the function-like macro because the token after function_like is not an open parenthesis.
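The following sketch puts these cases together. The macro bodies and the helper functions are made up for illustration, since the definitions above elide them.

```c
#include <assert.h>

#define function_like(a) ((a) + 1)  /* ( touches the name: takes one argument */
#define object_like (7)             /* space before ( : replacement text is (7) */

/* The name is followed by ), so this is no macro invocation; it defines a function. */
int (function_like)(int a) { return a * 100; }

int spaced_call(void)  { return function_like (2); }   /* macro, despite the space */
int object_value(void) { return object_like; }         /* expands to (7) */
int real_call(void)    { return (function_like)(2); }  /* calls the function */

void check_macro_kinds(void)
{
    assert(spaced_call() == 3);
    assert(object_value() == 7);
    assert(real_call() == 200);
}
```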
There are two kinds of macros. They differ mostly in what they look like when they are used. Object-like macros resemble data objects when used, function-like macros resemble function calls.
You may define any valid identifier as a macro, even if it is a C keyword. The preprocessor does not know anything about keywords. This can be useful if you wish to hide a keyword such as const from an older compiler that does not understand it. However, the preprocessor operator defined can never be defined as a macro, and C++'s named operators cannot be macros when you are compiling C++.

Will #ifndef always work on an object-like macro defined with an empty replacement-list?

Include guards in header files are often used to protect sections of code from double inclusion:
#ifndef FOOBAR_H
#define FOOBAR_H
extern void myfoofunc(void);
#endif
Include guards typically rely on the expectation that if an object-like macro was already defined, the lines within the #ifndef block will not be included -- thus avoiding double inclusion.
I've noticed that the #define lines in many headers have empty replacement-lists. Does the C99 standard guarantee that object-like macros defined with an empty replacement-list will be considered "defined" by #ifndef?
When describing the syntax of #define, the C99 standard seems to imply that a replacement-list is required in section 6.10.3 paragraph 9:
A preprocessing directive of the form
# define identifier replacement-list new-line
defines an object-like macro that causes each subsequent instance of the macro name to be
replaced by the replacement list of preprocessing tokens that
constitute the remainder of the directive. The replacement list is
then rescanned for more macro names as specified below.
Does this mean include guards should instead be of the form: #define FOOBAR_H 1?
The standard syntax production for a replacement-list is: pp-tokens [opt]. So no tokens are necessary for a replacement-list to be valid.
So #ifdef will work fine and as expected for macros that are defined 'empty'. A ton of code depends on this.
No, it doesn't. The replacement list may well be empty. #define FOO means that defined FOO is true, but FOO is replaced with nothing.
Example:
#define FOO
#define BAR 1
#if defined FOO && defined BAR
int a = FOO + BAR ;
#endif
Result of preprocessing:
int a = + 1 ;
No; the macro substitution will not take place in the #ifndef line. If it did, then the whole statement would be a syntax error (since there wouldn't be anything following the #ifndef).
I think I've actually seen some instances of #define FOOBAR_H 1 but that's more of a personal taste.
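As a quick check of the answers above, here is a minimal sketch. GUARD_TAKEN and check_empty_macro are hypothetical names used only to make the outcome of the #ifndef visible.

```c
#include <assert.h>

#define FOOBAR_H            /* defined, with an empty replacement list */

#ifndef FOOBAR_H
#define GUARD_TAKEN 1       /* would mean FOOBAR_H counted as undefined */
#else
#define GUARD_TAKEN 0       /* FOOBAR_H counts as defined */
#endif

#define FOO
#define BAR 1

int a = FOO + BAR;          /* becomes: int a = + 1; */

void check_empty_macro(void)
{
    assert(GUARD_TAKEN == 0);
    assert(a == 1);
}
```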

How, exactly, does the double-stringize trick work?

At least some C preprocessors let you stringize the value of a macro, rather than its name, by passing it through one function-like macro to another that stringizes it:
#define STR1(x) #x
#define STR2(x) STR1(x)
#define THE_ANSWER 42
#define THE_ANSWER_STR STR2(THE_ANSWER) /* "42" */
Example use cases here.
This does work, at least in GCC and Clang (both with -std=c99), but I'm not sure how it works in C-standard terms.
Is this behavior guaranteed by C99?
If so, how does C99 guarantee it?
If not, at what point does the behavior go from C-defined to GCC-defined?
Yes, it's guaranteed.
It works because arguments to macros are themselves macro-expanded, except where the macro argument name appears in the macro body with the stringifier # or the token-paster ##.
6.10.3.1/1:
... After the arguments for the
invocation of a function-like macro
have been identified, argument
substitution takes place. A parameter
in the replacement list, unless
preceded by a # or ## preprocessing
token or followed by a ##
preprocessing token (see below), is
replaced by the corresponding argument
after all macros contained therein
have been expanded...
So, if you do STR1(THE_ANSWER) then you get "THE_ANSWER", because the argument of STR1 is not macro-expanded. However, the argument of STR2 is macro-expanded when it's substituted into the definition of STR2, which therefore gives STR1 an argument of 42, with the result of "42".
As Steve notes, this is guaranteed, and it has been guaranteed since the C89 standard -- that was the standard that codified the # and ## operators in macros and mandated recursively expanding macros in args before substituting them into the body if and only if the body does not apply a # or ## to the argument. C99 is unchanged from C89 in this respect.
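A small sketch that checks both behaviors described above; the helper functions name_string, value_string and check_stringize are assumptions added for illustration.

```c
#include <assert.h>
#include <string.h>

#define STR1(x) #x
#define STR2(x) STR1(x)
#define THE_ANSWER 42

/* x is an operand of #, so THE_ANSWER is stringized without being expanded. */
const char *name_string(void)  { return STR1(THE_ANSWER); }

/* STR2's x is not an operand of #, so it is expanded to 42 before STR1 sees it. */
const char *value_string(void) { return STR2(THE_ANSWER); }

void check_stringize(void)
{
    assert(strcmp(name_string(), "THE_ANSWER") == 0);
    assert(strcmp(value_string(), "42") == 0);
}
```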

Resources