C preprocessor: building a path string

Given a macro that has been defined previously:
#define FILENAME somefile.h
I want to concatenate this with another macro-string that defines the (relative) path of this file. My current approach is to do this like so:
#define DIRECTORY ../somedir/
#define STRINGIFY_(x) #x
#define FILE2_(dir, file) STRINGIFY_(dir ## file)
#define FILE_(dir, file) FILE2_(dir, file)
#include FILE_(DIRECTORY, FILENAME)
This, however, results in an error (GCC 4.9):
error: pasting "/" and "file" does not give a valid preprocessing token
Removing the final forward slash from the DIRECTORY definition removes this error, but obviously does not yield the desired result. Similar errors appear when I try to smuggle the / in otherwise. For example:
#define FILE2_(dir, file) STRINGIFY_(dir ## / ## file)
does not work for the same reason.
I would like to know what is going wrong here and, obviously, how to circumvent this.
EDIT: Changed double underscores to singles on Columbo's advice. Apparently, identifiers containing double underscores are reserved to the implementation, regardless of where they appear (I was under the impression that this only held true for double underscores at the beginning of an ID).

[cpp.include]/4:
A preprocessing directive of the form
# include pp-tokens new-line
(that does not match one of the two previous forms) is permitted. The
preprocessing tokens after include in the directive are processed
just as in normal text (i.e., each identifier currently defined as a
macro name is replaced by its replacement list of preprocessing
tokens). If the directive resulting after all replacements does not
match one of the two previous forms, the behavior is
undefined. (footnote 152)
Footnote 152: Note that adjacent string literals are not
concatenated into a single string literal (see the translation phases
in 2.2); thus, an expansion that results in two string literals is an
invalid directive.
So though #include MACRO is valid, MACRO must directly expand to an otherwise valid argument to #include. The concatenation of string literals happens two translation phases after preprocessing.
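A minimal sketch of the computed-include form described above, assuming a hosted implementation that provides the standard header stdio.h. The names STRINGIFY, HEADER_NAME, format_answer and computed_include_works are hypothetical and chosen only for illustration. The macro expands to exactly one string literal, so the directive matches the quoted form after replacement:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two-level stringification: the outer macro lets its argument be
   macro-expanded before the inner macro applies the # operator. */
#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)

/* Hypothetical header name, used only in this example. */
#define HEADER_NAME stdio.h

/* After macro replacement the directive reads: #include "stdio.h"
   That is a single string literal, so it matches the quoted form. */
#include STRINGIFY(HEADER_NAME)

/* Formats 42 into buf. snprintf is declared only because the computed
   include above pulled in the standard I/O header. */
int format_answer(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%d", 42);
}

/* Returns 1 when the formatted text is "42". */
int computed_include_works(void)
{
    char buf[8];
    format_answer(buf, sizeof buf);
    return strcmp(buf, "42") == 0;
}
```

Had the macro expanded to two adjacent string literals instead, the directive would not match either form, because literal concatenation has not happened yet at that point.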
Also, in the definition of the ## operator, [cpp.concat]/3:
For both object-like and function-like macro invocations, before the replacement list is reexamined for more
macro names to replace, each instance of a ## preprocessing token in the replacement list (not from an
argument) is deleted and the preceding preprocessing token is concatenated with the following preprocessing
token.
[..] If the result is not a valid preprocessing token, the behavior is undefined.
Hence the result of A##B must be one valid preprocessing token. / is a preprocessing token of its own, and so are the names of the directories and files.
You can't concatenate "abc and /xyz" either, since abc/ is not a valid preprocessing token: "abc is not one preprocessing token but two, though "abc" is one.
On the other hand, if you concatenate <abc/ and xyz>, then / and xyz are concatenated, examined, and we have a problem again.
Thus it appears to be impossible to concat the paths using ##. Your approach looks quite impossible to me, too.
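To make the "one valid preprocessing token" rule concrete, here is a small sketch of pastes that are well-formed. PASTE, foobar and the function names are hypothetical, introduced only for this illustration:

```c
#include <assert.h>

/* Plain token pasting: the two operands must form ONE valid token. */
#define PASTE(a, b) a##b

/* Hypothetical variable whose name is assembled from two identifiers. */
static int foobar = 7;

/* foo ## bar gives the identifier foobar. */
int paste_identifiers(void) { return PASTE(foo, bar); }

/* 1 ## 2 gives the preprocessing number 12. */
int paste_digits(void) { return PASTE(1, 2); }

/* < ## < gives the punctuator <<, so this evaluates 1 << 3. */
int paste_punctuators(void) { return 1 PASTE(<, <) 3; }

/* By contrast, PASTE(/, somefile) would have to form the single token
   "/somefile", which does not exist in the language. That is the case
   GCC diagnoses in the question. */
int paste_self_check(void)
{
    assert(paste_identifiers() == 7);
    return paste_digits() == 12 && paste_punctuators() == 8;
}
```

A path such as ../somedir/somefile.h is a sequence of several tokens, which is why no combination of ## operations can produce it as one token.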
With GCC, this is fine though:
#define PATH <foo/bar/
#define FILE boo>
#define ARG PATH FILE
#include ARG
It works because GCC's preprocessor removes the white space (for some reason). It does not work on VC++ or Clang and isn't covered by the standard anyway, so it is definitely not recommended.

Related

How to define a C macro that uses another macro?

I have the following scenario...
Header file:
#define TRIG_INDEX 200
#define PATH(target_p) some.path.to.target##target_p
Source file:
read_from_target(PATH(TRIG_INDEX));
As the PATH macro appends the target_p to the text at the end, compilation fails as
some.path.to.targetTRIG_INDEX is not a valid path.
I was expecting to get read_from_target(some.path.to.target200) in the above scenario.
How can I (if at all) define the macros to accept such scenario?
The argument must be expanded by the macro:
#define TRIG_INDEX 200
#define PATH_TARGET(x) some.path.to.target##x
#define PATH(target_p) PATH_TARGET(target_p)
The pre-processor macro expansion/replacement rules for function-like macros are rather intricate. Basically it does so in the following order:
1. The stuff after the macro name is known as the replacement list: the things that PATH should be replaced with. In the question's original macro that is some.path.to.target##target_p.
2. The occurrence of ## or # together with a macro parameter means that the parameter gets replaced with its corresponding pre-processor token sequence (if applicable), without that sequence being macro-expanded first. In this case, the parameter target_p gets replaced with TRIG_INDEX, so we get some.path.to.target##TRIG_INDEX, creating the token sequence some.path.to.targetTRIG_INDEX, which is not as intended.
3. Other macro parameters get expanded. Doesn't apply here.
4. The replacement list is then rescanned for macro names to expand, but it's too late since pre-processor token concatenation with ## has already occurred.
The part "other macro parameters get expanded" above is useful to fix the problem. So macro parameters in the replacement list take precedence over macro names. We can take advantage of this by adding a helper macro:
#define TRIG_INDEX 200
#define EXPAND(tgt) some.path.to.target##tgt
#define PATH(target_p) EXPAND(target_p)
Now in the expansion of the replacement list of PATH, we first get macro parameter expansion: target_p is not an operand of # or ##, so its argument TRIG_INDEX is fully macro-expanded to 200 before it is substituted, and we get EXPAND(200). The replacement list is then rescanned for macro names to replace, and the pre-processor finds EXPAND. It is expanded according to the same rules: ## pastes target and the already expanded 200, giving some.path.to.target200.
For details, see the C17 standard chapter 6.10.3 - this chapter is also the one which specifies the behavior of the # and ## operators.
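The difference between the direct and the two-level macro can be observed by stringifying both results. This is only a sketch: STR, PASTE_DIRECT, PASTE_EXPANDED and the function names are hypothetical helpers, and the pasted prefix is shortened to target to keep the example small.

```c
#include <assert.h>
#include <string.h>

#define TRIG_INDEX 200

/* Two-level stringification so the argument is expanded first. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* One level: x is an operand of ##, so it is pasted unexpanded. */
#define PASTE_DIRECT(x) target##x

/* Two levels: the outer macro expands x, the inner one pastes it. */
#define PASTE_EXPANDED_(x) target##x
#define PASTE_EXPANDED(x) PASTE_EXPANDED_(x)

/* Yields the text targetTRIG_INDEX. */
const char *direct_result(void) { return STR(PASTE_DIRECT(TRIG_INDEX)); }

/* Yields the text target200. */
const char *expanded_result(void) { return STR(PASTE_EXPANDED(TRIG_INDEX)); }

/* Returns 1 when both observations hold. */
int paste_levels_ok(void)
{
    assert(direct_result() != NULL && expanded_result() != NULL);
    return strcmp(direct_result(), "targetTRIG_INDEX") == 0
        && strcmp(expanded_result(), "target200") == 0;
}
```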

Why doesn't writing anything after the fileName of a #include directive give any errors in a C program?

Why doesn't writing anything after the fileName of a #include directive give any errors in a C program?
#include <fileName.h> we can write anything in here and it will not give an error after program compilation
main() {
printf("Hello World");
}
Here's another example:
#include "fileName.h" we can write here anything and its fine this will not give an error after compilation
main() {
printf("Hello World");
}
Reading the documentation didn't help. If you find anything about this behavior in the C specification, please let me know. Here's the link: C Documentation
There shouldn't be any text after the included file.
Section 6.10.2 of the C standard regarding #include states:
2 A preprocessing directive of the form
# include <h-char-sequence> new-line
searches a sequence of implementation-defined places for a header
identified uniquely by the specified sequence between the < and >
delimiters, and causes the replacement of that directive by the entire
contents of the header. How the places are specified or the header
identified is implementation-defined.
3 A preprocessing directive of the form
# include "q-char-sequence" new-line
causes the replacement of that directive by the entire contents of the
source file identified by the specified sequence between the "
delimiters. The named source file is searched for in an
implementation-defined manner. If this search is not supported, or if
the search fails, the directive is reprocessed as if it read
# include <h-char-sequence> new-line
with the identical contained sequence
(including > characters, if any) from the original directive.
4 A preprocessing directive of the form
# include pp-tokens new-line
(that does not match one of the two previous forms) is
permitted. The preprocessing tokens after include in the
directive are processed just as in normal text. (Each
identifier currently defined as a macro name is replaced by
its replacement list of preprocessing tokens.) The directive
resulting after all replacements shall match one of the two
previous forms. The method by which a sequence of
preprocessing tokens between a < and a > preprocessing token pair
or a pair of " characters is combined into a single header name
preprocessing token is implementation-defined.
None of these forms allows for text after the included filename. In fact, both gcc and MSVC issue warnings in this case.
Given this code:
#include <stdio.h> bogus text
int main()
{
return 0;
}
gcc 4.8.5 outputs:
x1.c:1:20: warning: extra tokens at end of #include directive [enabled by default]
#include <stdio.h> bogus text
^
And MSVC 2015 outputs:
x1.c
x1.c(1): warning C4067: unexpected tokens following preprocessor directive - expected a newline

Can a C #error directive display a multi-line message?

I tried to use the #error directive with GCC compiler like this:
#error "The charging pins aren't differing! One pin cannot be used for multiple purposes!"
This says I should use double quotes, so that the argument is a single string constant and I can use an apostrophe inside it. However, I want this string to appear in the source code on multiple lines, like:
#error "The charging pins aren't differing!
One pin cannot be used for multiple purposes!"
Then, I got some error messages:
warning: missing terminating " character
#error "The charging pins aren't differing! One pin
error: missing terminating " character
cannot be used for multiple purposes!"
If I insert a backslash at the end of the first line, the diagnostic message contains all the whitespace between the beginning of the second line and the first word (One). If both lines are strings, the diagnostic message shows the inner double quotes.
So the question: How can I achieve this output? (or a similar one without double quotes, but with the apostrophe included)
#error "The charging pins aren't differing! One pin cannot be used for multiple purposes!"
Unfortunately, you can't have it all.
Either you have to get rid of the apostrophe so that the message only contains what's regarded as valid pre-processing tokens.
Or you can write it as a string on a single line.
Or you can write it as two string literals and break the line with \. You can't do this in the middle of a string literal, because then it wouldn't be a valid pre-processing token. This will make the output look weird though, like: error: "hello" "world".
Relying on pre-processor concatenation of two string literals won't work, since the error directive only looks until it spots a newline character in the source. And the error directive translates everything you type into strings anyway.
The relevant translation phases are (from C17 5.1.1.2) executed in this order:
2) Each instance of a backslash character (\) immediately followed by a new-line
character is deleted, splicing physical source lines to form logical source lines.
3) The source file is decomposed into preprocessing tokens and sequences of
white-space characters (including comments).
4) Preprocessing directives are executed, ...
6) Adjacent string literal tokens are concatenated.
#error is executed in step 4, earlier than string literal concatenation in step 6.
I personally think the best solution is to skip the apostrophe:
#error The charging pins are not differing! \
One pin cannot be used for multiple purposes!
Slight fix of English and you get the best compromise between readable source and a readable error message.
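The phase ordering can be observed from ordinary code. The sketch below uses hypothetical helper names (STR, spliced_text and so on) and a hypothetical guard macro HYPOTHETICAL_PIN_CONFLICT. It shows line splicing inside a literal (phase 2), what the preprocessor sees during phase 4, and literal concatenation (phase 6):

```c
#include <assert.h>
#include <string.h>

#define STR_(x) #x
#define STR(x) STR_(x)

/* Phase 2: the backslash-newline is deleted before tokenization. The
   continuation line starts in the first column, so no extra spaces
   end up inside the literal. */
static const char spliced[] = "These aren't \
working!";

/* Phase 6: adjacent literals are joined, long after directives ran. */
static const char concatenated[] = "These aren't " "working!";

/* Phase 4: the preprocessor still sees two separate literal tokens,
   which is also what an #error line would print. */
static const char preprocessor_view[] = STR("a" "b");

/* The form recommended above: no quotes, no apostrophe, one logical
   line. The guard macro is hypothetical and left undefined here. */
#ifdef HYPOTHETICAL_PIN_CONFLICT
#error The charging pins are not differing! \
One pin cannot be used for multiple purposes!
#endif

const char *spliced_text(void) { return spliced; }
const char *concatenated_text(void) { return concatenated; }
const char *preprocessor_view_text(void) { return preprocessor_view; }

/* Returns 1 when splicing and concatenation give the same text. */
int phases_agree(void)
{
    assert(sizeof spliced == sizeof concatenated);
    return strcmp(spliced, concatenated) == 0;
}
```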
As stated in the GCC preprocessor documentation:
Neither ‘#error’ nor ‘#warning’ macro-expands its argument.
Internal whitespace sequences are each replaced with a single space. The line must consist of complete tokens. It is wisest to make the argument of these directives be a single string constant; this
avoids problems with apostrophes and the like.
So you can use it on a single line only.
#include <stdio.h>
//#define var 10
#ifdef var
#error "The charging pins aren't differing! One pin cannot be used for multiple purposes!"
#endif
int main(void){
printf("in main() \n");
return 0;
}
Use double-quotes and \ to split the string on multiple lines:
#error "These aren't \
working!"
MSVC (v19.14) outputs:
<source>(2): fatal error C1189: #error: "These aren't working!"
GCC (v5.5) outputs:
<source>:1:2: error: #error "These aren't working!"
#error "These aren't \
^

C preprocessor macro expansion

I have difficulty understanding how rewriting rules are applied by the C preprocessor in the following context. I have the following macros:
#define _A(x) "A" _##x
#define _B(x) "B" _##x
#define X(x) _##x
The idea is that each of these macros uses concatenation to create a new expression, which can itself be a macro; if it's a macro, I'd like it to be expanded:
Now, the following expands just like I expect:
X(x) expands to _x
X(A(x)) expands to "A" _x
X(A(B(x))) expands to "A" "B" _x
However, once the same macro is used more than once, the expansion stops:
X(A(A(x))) expands to "A" _A(x), expected "A" "A" _x
X(B(B(x))) expands to "B" _B(x), expected "B" "B" _x
X(A(B(A(x)))) expands to "A" "B" _A(x), expected "A" "B" "A" _x
X(A(B(A(B(x))))) expands to "A" "B" _A(B(x)), expected "A" "B" "A" "B" _x
I guess that there is some sort of "can expand same-named macro only once" rule at play here? Is there something I can do to get the macros to expand the way I want?
When I want to work out macro expansion I generally use this diagram, which I constructed using section 6.10.3 from the standard. Hope it helps...
As Toby has already mentioned, nested macros will not be expanded recursively.
C99 draft says that there's no recursion permitted in macro expansion:
6.10.3.4 Rescanning and further replacement
After all parameters in the replacement list have been substituted and # and ## processing has taken place, all placemarker preprocessing tokens are removed. The resulting preprocessing token sequence is then rescanned, along with all subsequent preprocessing tokens of the source file, for more macro names to replace.
If the name of the macro being replaced is found during this scan of the replacement list (not including the rest of the source file’s preprocessing tokens), it is not replaced. Furthermore, if any nested replacements encounter the name of the macro being replaced, it is not replaced. These nonreplaced macro name preprocessing tokens are no longer available for further replacement even if they are later (re)examined in contexts in which that macro name preprocessing token would otherwise have been replaced.
So X(A(A(x))) expands to "A" _A(x), but that expansion is not itself expanded, as you've seen.
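A small sketch that reproduces both outcomes. It assumes renamed macros with an M_ prefix (identifiers such as _A that begin with an underscore and an uppercase letter are reserved), plus hypothetical helpers STR and M_x so the results can be inspected as strings:

```c
#include <assert.h>
#include <string.h>

#define STR_(x) #x
#define STR(x) STR_(x)

/* Same shape as the macros in the question, with an M_ prefix. */
#define M_A(x) "A" M_##x
#define M_B(x) "B" M_##x
#define X(x) M_##x

/* Hypothetical terminator so the fully expanded case is valid C. */
#define M_x "x"

/* Distinct macros: every step expands, giving "A" "B" "x". */
const char *distinct_macros(void) { return X(A(B(x))); }

/* Repeated macro: the inner M_A appears during the expansion of M_A,
   so it is left alone. Stringifying keeps the leftover visible. */
const char *repeated_macro(void) { return STR(X(A(A(x)))); }

/* Returns 1 when the unexpanded name M_A survives in the result. */
int inner_macro_left_unexpanded(void)
{
    assert(repeated_macro() != NULL);
    return strstr(repeated_macro(), "M_A") != NULL;
}
```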

Suppress C Macro Variable Substitution

I have this bit of code (part of an interpreter for a garbage-collected Forth system, actually):
#define PRIMITIVE(name) \
do \
{ \
VocabEntry* entry = (VocabEntry*)gc_alloc(sizeof(VocabEntry)); \
entry->code = name; \
entry->name = cstr_to_pstr(#name); \
entry->prev = latest_vocab_entry; \
latest_vocab_entry = entry; \
} \
while (false)
PRIMITIVE(dup);
PRIMITIVE(drop);
PRIMITIVE(swap);
// and a lot more
but there's a problem: in the line
entry->name = cstr_to_pstr(#name);
the name field is substituted for dup, drop, swap, and the rest. I want the field name to not be substituted.
So, is there any way to solve this, other than simply renaming the macro argument?
For an answer, please explain if there is, in general, a way to suppress the substitution of a macro argument name in the macro body. Don't answer "just do it this way" (please).
You can define a different macro to expand to name, like this:
#define Name name
and change the name field in the PRIMITIVE macro to use the new macro, like this:
#define PRIMITIVE(name) \
do \
{ \
VocabEntry* entry = (VocabEntry*)gc_alloc(sizeof(VocabEntry)); \
entry->code = name; \
entry->Name = cstr_to_pstr(#name); \
entry->prev = latest_vocab_entry; \
latest_vocab_entry = entry; \
} \
while (false)
Other than using something different from the parameter name in the macro body or changing the parameter name, there is no other way to do this in the C language. Per C 2011 (N1570) 6.10.3.1 1, when a function-like macro is recognized, parameter names are immediately substituted except when # or ## is present, and there are no other exceptions:
After the arguments for the invocation of a function-like macro have been identified, argument substitution takes place. A parameter in the replacement list, unless preceded by a # or ## preprocessing token or followed by a ## preprocessing token (see below), is replaced by the corresponding argument after all macros contained therein have been expanded.
The # token changes the parameter name to a string, which is no use in this situation. The ## token expands the parameter name and pastes it together with an adjacent token, which is also no use in this situation.
No, there is not.
To see why, you need to consider the way macro expansion actually happens. Expanding a function-like macro requires three main steps:
1. the arguments to the macro are fully expanded, unless the macro uses the # or ## operators on them (not relevant in the example since they're single tokens)
2. the entire replacement list is scanned, and any occurrence of a parameter name is replaced by the corresponding argument
3. after step 2 is complete, the expanded replacement list is itself rescanned, and any macros appearing are expanded at this point
This is outlined in standard section 6.10.3 (C11 and C99).
The upshot of this is that it is impossible to write some kind of macro that can take name and abuse the suppression-rules of '##' or anything like that, because the replacement step in the body of PRIMITIVE must run completely before any of the macros within the body are allowed their turn to be recognised. There is nothing you can do to mark a token within the replacement list for suppression, because any marks you could place upon it will only be examined after the replacement step has already run. Since the order is specified in the standard, any exploit you find that would let you mark a token in this way is a compiler bug.
Best I can suggest if you're really set on not renaming the macro argument is to pass na and me as separate arguments to a concatenation macro; the token name will only be formed after replacement is done and the list is no longer being examined for parameter names.
EDIT wish I typed faster.
No, there is no way to suppress the replacement within a macro's body of a token identical to that of a declared argument to said macro. Every possible solution short of jumping into the preprocessor code will require you to rename something, either the argument name or the name of the field (potentially just for purposes of that macro, as Eric's answer does).
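Both workarounds mentioned in the answers (an alias macro for the field, and pasting na and me together) can be sketched as follows. The VocabEntry here is a simplified, hypothetical stand-in for the question's struct, and FIELD_NAME, CAT and the SET_NAME_ macros are names invented for this illustration:

```c
#include <assert.h>
#include <string.h>

/* Simplified, hypothetical stand-in for the question's VocabEntry. */
typedef struct VocabEntry {
    const char *name;
} VocabEntry;

/* Workaround 1: spell the field through an alias that is not the
   parameter name. It turns into name only during the rescan, after
   parameter substitution has finished. */
#define FIELD_NAME name

/* Workaround 2: build the token name by pasting two halves. */
#define CAT_(a, b) a##b
#define CAT(a, b) CAT_(a, b)

/* In both macros the parameter is still called name. */
#define SET_NAME_ALIAS(entry, name) ((entry)->FIELD_NAME = #name)
#define SET_NAME_PASTED(entry, name) ((entry)->CAT(na, me) = #name)

/* Stores "dup" in the name field through the alias macro. */
const char *name_via_alias(void)
{
    VocabEntry e = { 0 };
    SET_NAME_ALIAS(&e, dup);
    assert(e.name != NULL);
    return e.name;
}

/* Stores "drop" in the name field through the pasted token. */
const char *name_via_paste(void)
{
    VocabEntry e = { 0 };
    SET_NAME_PASTED(&e, drop);
    assert(e.name != NULL);
    return e.name;
}
```

In both cases something other than the bare parameter name is written in the macro body, which is consistent with the answers: the substitution itself cannot be switched off.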

Resources