I have difficulty understanding how rewriting rules are applied by the C preprocessor in the following context. I have the following macros:
#define _A(x) "A" _##x
#define _B(x) "B" _##x
#define X(x) _##x
The idea is that each of these macros uses concatenation to create a new expression, which can itself be a macro; if it's a macro, I'd like it to be expanded.
Now, the following expands just like I expect:
X(x) expands to _x
X(A(x)) expands to "A" _x
X(A(B(x))) expands to "A" "B" _x
However, once the same macro is used more than once, the expansion stops:
X(A(A(x))) expands to "A" _A(x), expected "A" "A" _x
X(B(B(x))) expands to "B" _B(x), expected "B" "B" _x
X(A(B(A(x)))) expands to "A" "B" _A(x), expected "A" "B" "A" _x
X(A(B(A(B(x))))) expands to "A" "B" _A(B(x)), expected "A" "B" "A" "B" _x
I guess that there is some sort of "can expand same-named macro only once" rule at play here? Is there something I can do to get the macros to expand the way I want?
When I want to work out macro expansion I generally use this diagram, which I constructed using section 6.10.3 of the C standard. Hope it helps...
As Toby has already mentioned, nested macros will not be expanded recursively.
C99 draft says that there's no recursion permitted in macro expansion:
6.10.3.4 Rescanning and further replacement
After all parameters in the replacement list have been substituted and # and ## processing has taken place, all placemarker preprocessing tokens are removed. The resulting preprocessing token sequence is then rescanned, along with all subsequent preprocessing tokens of the source file, for more macro names to replace.
If the name of the macro being replaced is found during this scan of the replacement list (not including the rest of the source file’s preprocessing tokens), it is not replaced. Furthermore, if any nested replacements encounter the name of the macro being replaced, it is not replaced. These nonreplaced macro name preprocessing tokens are no longer available for further replacement even if they are later (re)examined in contexts in which that macro name preprocessing token would otherwise have been replaced.
So X(A(A(x))) expands to "A" _A(x), but that expansion is not itself expanded, as you've seen.
I have the following scenario...
Header file:
#define TRIG_INDEX 200
#define PATH(target_p) some.path.to.target##target_p
Source file:
read_from_target(PATH(TRIG_INDEX));
As the PATH macro appends the target_p to the text at the end, compilation fails as
some.path.to.targetTRIG_INDEX is not a valid path.
I was expecting to get read_from_target(some.path.to.target200) in the above scenario.
How can I (if at all) define the macros to support such a scenario?
The argument must be expanded by the macro:
#define TRIG_INDEX 200
#define PATH_TARGET(x) some.path.to.target##x
#define PATH(target_p) PATH_TARGET(target_p)
The pre-processor macro expansion/replacement rules for function-like macros are rather intricate. Roughly, replacement proceeds in the following order:
1. The stuff after the macro name is known as the replacement list: the things the macro should be replaced with. For the original PATH, that is some.path.to.target##target_p.
2. A macro parameter that is an operand of # or ## gets replaced with its argument's pre-processor token sequence as written, without macro-expanding it first. In this case, the parameter target_p gets replaced with TRIG_INDEX, so we get some.path.to.target##TRIG_INDEX, and ## pastes target and TRIG_INDEX into the single identifier targetTRIG_INDEX, which is not as intended.
3. Other macro parameters get replaced with their fully macro-expanded arguments. Doesn't apply here.
4. The replacement list is then rescanned for macro names to expand, but it's too late since pre-processor token concatenation with ## has already occurred.
The part "other macro parameters get expanded" above is useful to fix the problem. So macro parameters in the replacement list take precedence over macro names. We can take advantage of this by adding a helper macro:
#define TRIG_INDEX 200
#define EXPAND(tgt) some.path.to.target##tgt
#define PATH(target_p) EXPAND(target_p)
Now in the expansion of the replacement list of PATH, we first get macro parameter substitution. target_p is not an operand of # or ##, so its argument TRIG_INDEX is fully macro-expanded before it is substituted, and we get EXPAND(200). The replacement list is then rescanned for macro names to replace, and the pre-processor finds EXPAND, which is expanded according to the same rules: tgt is an operand of ##, so 200 is substituted as is and pasted onto target, giving some.path.to.target200.
For details, see the C17 standard chapter 6.10.3 - this chapter is also the one which specifies the behavior of the # and ## operators.
I want to use a macro parameter like this:
#define D(cond,...) do{ \
#if cond \
#define YYY 1 \
#else \
#define YYY 0 \
} while(0)
Is it possible?
UPD
Maybe if the sources are preprocessed twice (gcc -E source.c | gcc -xc -), the following will work:
#define D(cond,...) #define YYY cond&DEBUG
#if YYY
#define D(...) printf( __VA_ARGS__ )
#else
#define D(...)
#endif
No, because C 2011 [N1570] 6.10.3.4 3 says, about macro replacement, “The resulting completely macro-replaced preprocessing token sequence is not processed as a preprocessing directive even if it resembles one,…”
This is not possible. Read about the GNU cpp preprocessor and the C11 standard (i.e. n1570). The C preprocessor is (conceptually at least) run before the rest of the compiler (which gets the preprocessed form of your translation unit). BTW, for a file foo.c you could use gcc -C -E foo.c > foo.i (using GCC) to get its preprocessed form in foo.i, and you can inspect that foo.i, since it is a textual file, with a pager or an editor.
However, a .c file can be generated (generating C code is a common practice, at least since the 1980s; for example with yacc, bison, rpcgen, swig, ....; many large software projects use specialized generators of C or C++ code...). You might consider using some other tool, perhaps the GPP preprocessor (or GNU m4) or some other program or script, to generate your C file (from something else). Look also into autoconf (it might have goals similar to yours).
You may want to configure your build automation tool for such purpose, e.g. edit your Makefile for GNU make.
No, this is not possible.
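Preprocessing directives (#define, #include, etc.) are recognized only when a # appears as the first token on a source line, before any macro expansion of that line. The result of a macro expansion is never rescanned for directives, so if a macro expands into something that looks like a preprocessing directive, it won't be interpreted as such - it will be passed on as (invalid) source code.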
As pointed out by others, this is not possible, but there is a workaround:
int YYY;
/* global scope variables are sometimes considered bad practice... */
#define D(cond,...) do{ \
if (cond) { \
YYY = 1; \
} \
else { \
YYY = 0; \
} \
} while(0)
Use an optimizing flag (ex: gcc/clang -O3) and, when cond is a compile-time constant, the compiler will remove the dead branch, much as if the choice had been made by a macro. Obviously you may want to change the type of YYY, but you seem to use it like a boolean.
No, you cannot. The C preprocessor cannot know what is going to occur during runtime.
The preprocessor goes through the program before it is even compiled and replaces every macro defined with its assigned value.
This is some poor man's code generation, for when integrating another tool to the project is overkill.
Define a macro like this, expanding for your needs:
#define NESTED /* Comment out instead of backslash new lines.
*/ /*
*/ UNDEF REPLACED /*
*/ /*
*/ IFDEF CONDITION /*
*/ DEFINE REPLACED 1 /*
*/ ELSE /*
*/ DEFINE REPLACED 0 /*
*/ ENDIF
Your version of NESTED can be a function-like macro, and REPLACED can have a more elaborated body.
Leave CONDITION and the directive named macros without a definition.
DEFINE CONDITION to control which value NESTED gets on compilation, similarly to normal #ifdef usage:
DEFINE CONDITION
NESTED
int i = REPLACED; //i == 1
UNDEF CONDITION
NESTED
int z = REPLACED; //z == 0
Source code that uses NESTED and the other macros will not compile. To generate a .c or .cpp file that you can compile with your chosen options, do this:
gcc -E -CC source.c -o temporary.c
gcc -E \
-DDEFINE=\#define -DUNDEF=\#undef \
-DIFDEF=\#ifdef -DELSE=\#else -DENDIF=\#endif \
temporary.c -o usableFile.c
rm temporary.c #remove the temporary file
-E means preprocess only, not compile. The first gcc command expands NESTED and all normally defined macros from the source. As DEFINE, IFDEF, etc. are not defined, they and their future arguments remain as literal text in the temporary.c file.
-CC makes the comments be preserved in the output file. After the preprocessor replaces NESTED by its body, temporary.c contains the directive macros in separate lines, with the comments. When the comments are removed on the next gcc command, the line breaks remain by the standard.
# is accepted in the body of a macro that takes no arguments. However, unlike macros, directives are not rescanned and executed on expansion, so you need another preprocessor pass to make nested defines work. All preprocessing related to the delayed defines needs to be delayed too, and made available to the preprocessor at once. Otherwise, directives and arguments needed in a later pass are consumed and removed from the code in a previous one.
The second gcc command replaces the -D macros by the delayed directives, making all of them available to the preprocessor starting on the next pass. The directives and their arguments are not rescanned in the same gcc command, and remain as literal text in usableFile.c.
When you compile usableFile.c, the preprocessor executes the delayed directives.
I am trying to understand one of the VPP sample functions, named 'sample_plugin_api_hookup'. What is the purpose of the underscore ('_') in the macro (#define _(N,n))?
#define _(N,n) \
vl_msg_api_set_handlers((VL_API_##N + sm->msg_id_base), \
#n, \
vl_api_##n##_t_handler, \
vl_noop_handler, \
vl_api_##n##_t_endian, \
vl_api_##n##_t_print, \
sizeof(vl_api_##n##_t), 1);
foreach_sample_plugin_api_msg;
#undef _
_ is a valid C identifier, though it's reserved for use at file scope. A C identifier consists of a letter or underscore, followed by zero or more letters, underscores, or digits (I'm ignoring universal character names).
Apparently the author of that code wanted a name that's short, easy to type, and unobtrusive. I believe the GNU gettext package, or code that uses it, follows this convention, with a macro call like
_("This is a message")
being replaced by a localized version of the message. (Which means that a program that uses GNU gettext likely would have to pick a different name).
foreach_sample_plugin_api_msg is another macro that makes use of the _ macro, which is why the _ macro is undefined immediately after that line.
Perhaps the author was influenced by the Go language, which uses _ as a blank identifier.
Opinions are likely to differ on the question of whether this is a neat trick or a crime against good taste.
Given a macro that has been defined previously:
#define FILENAME somefile.h
I want to concatenate this with another macro-string that defines the (relative) path of this file. My current approach is to do this like so:
#define DIRECTORY ../somedir/
#define STRINGIFY_(x) #x
#define FILE2_(dir, file) STRINGIFY_(dir ## file)
#define FILE_(dir, file) FILE2_(dir, file)
#include FILE_(DIRECTORY, FILENAME)
This however results in an error (GCC4.9):
error: pasting "/" and "file" does not give a valid preprocessing token
Removing the final forward slash from the DIRECTORY definition removes this error, but obviously does not yield the desired result. Similar errors appear when I try to smuggle the / in otherwise. For example:
#define FILE2_(dir, file) STRINGIFY_(dir ## / ## file)
does not work for the same reason.
I would like to know what is going wrong here and, obviously, how to circumvent this.
EDIT: Changed double underscores to singles on Columbo's advice. Apparently, identifiers containing double underscores are reserved to the implementation, regardless of where they appear (I was under the impression that this only held true for double underscores at the beginning of an ID).
[cpp.include]/4:
A preprocessing directive of the form
# include pp-tokens new-line
(that does not match one of the two previous forms) is permitted. The preprocessing tokens after include in the directive are processed just as in normal text (i.e., each identifier currently defined as a macro name is replaced by its replacement list of preprocessing tokens). If the directive resulting after all replacements does not match one of the two previous forms, the behavior is undefined. [152]
[152] Note that adjacent string literals are not concatenated into a single string literal (see the translation phases in 2.2); thus, an expansion that results in two string literals is an invalid directive.
So though #include MACRO is valid, MACRO must directly expand to an otherwise valid argument to #include. The concatenation of string literals happens two translation phases after preprocessing.
Also, in the definition of the ## operator, [cpp.concat]/3:
For both object-like and function-like macro invocations, before the replacement list is reexamined for more macro names to replace, each instance of a ## preprocessing token in the replacement list (not from an argument) is deleted and the preceding preprocessing token is concatenated with the following preprocessing token. [..] If the result is not a valid preprocessing token, the behavior is undefined.
Hence the result of A##B must be one valid preprocessing token. / is a preprocessing token of its own, and so are the names of the directories and files.
You can't concatenate "abc and /xyz" either, since abc/ is not a valid preprocessing token. Note that "abc is not one preprocessing token but two (a lone " followed by abc), though "abc" is one.
On the other hand, if you concatenate <abc/ and xyz>, then / and xyz are concatenated, examined, and we have a problem again.
Thus it appears to be impossible to concat the paths using ##. Your approach looks quite impossible to me, too.
With GCC, this is fine though:
#define PATH <foo/bar/
#define FILE boo>
#define ARG PATH FILE
#include ARG
It works because GCC's preprocessor removes the white space (for some reason). It does not work on VC++ or Clang and isn't covered by the standard anyway, so it is definitely not recommended.
I have this bit of code (part of an interpreter for a garbage-collected Forth system, actually):
#define PRIMITIVE(name) \
do \
{ \
VocabEntry* entry = (VocabEntry*)gc_alloc(sizeof(VocabEntry)); \
entry->code = name; \
entry->name = cstr_to_pstr(#name); \
entry->prev = latest_vocab_entry; \
latest_vocab_entry = entry; \
} \
while (false)
PRIMITIVE(dup);
PRIMITIVE(drop);
PRIMITIVE(swap);
// and a lot more
but there's a problem: in the line
entry->name = cstr_to_pstr(#name);
the field named name is replaced with dup, drop, swap, and the rest, because it is spelled the same as the macro parameter. I want the field name not to be substituted.
So, is there any way to solve this, other than simply renaming the macro argument?
For an answer, please explain if there is, in general, a way to suppress the substitution of a macro argument name in the macro body. Don't answer "just do it this way" (please).
You can define a different macro to expand to name, like this:
#define Name name
and change the name field in the PRIMITIVE macro to use the new macro, like this:
#define PRIMITIVE(name) \
do \
{ \
VocabEntry* entry = (VocabEntry*)gc_alloc(sizeof(VocabEntry)); \
entry->code = name; \
entry->Name = cstr_to_pstr(#name); \
entry->prev = latest_vocab_entry; \
latest_vocab_entry = entry; \
} \
while (false)
Other than using something different from the parameter name in the macro body or changing the parameter name, there is no other way to do this in the C language. Per C 2011 (N1570) 6.10.3.1 1, when a function-like macro is recognized, parameter names are immediately substituted except when # or ## is present, and there are no other exceptions:
After the arguments for the invocation of a function-like macro have been identified, argument substitution takes place. A parameter in the replacement list, unless preceded by a # or ## preprocessing token or followed by a ## preprocessing token (see below), is replaced by the corresponding argument after all macros contained therein have been expanded.
The # token changes the parameter name to a string, which is no use in this situation. The ## token expands the parameter name and pastes it together with an adjacent token, which is also no use in this situation.
No, there is not.
To see why, you need to consider the way macro expansion actually happens. Expanding a function-like macro requires three main steps:
1. the arguments to the macro are fully expanded, unless the macro uses the # or ## operators on them (not relevant in the example since they're single tokens)
2. the entire replacement list is scanned, and any occurrence of a parameter name is replaced by the corresponding argument
3. after step 2 is complete, the expanded replacement list is itself rescanned, and any macros appearing are expanded at this point
This is outlined in standard section 6.10.3 (C11 and C99).
The upshot of this is that it is impossible to write some kind of macro that can take name and abuse the suppression-rules of '##' or anything like that, because the replacement step in the body of PRIMITIVE must run completely before any of the macros within the body are allowed their turn to be recognised. There is nothing you can do to mark a token within the replacement list for suppression, because any marks you could place upon it will only be examined after the replacement step has already run. Since the order is specified in the standard, any exploit you find that would let you mark a token in this way is a compiler bug.
Best I can suggest if you're really set on not renaming the macro argument is to pass na and me as separate arguments to a concatenation macro; the token name will only be formed after replacement is done and the list is no longer being examined for parameter names.
EDIT wish I typed faster.
No, there is no way to suppress the replacement within a macro's body of a token identical to that of a declared argument to said macro. Every possible solution short of jumping into the preprocessor code will require you to rename something, either the argument name or the name of the field (potentially just for purposes of that macro, as Eric's answer does).