Given the following code that prints a string which is a stringification of two words:
#define PORT_INFO_MAC_2(portNum) port: portNum
#define PORT_INFO_MAC(portNum) PORT_INFO_MAC_2(portNum)
/* Stringify macro expansion instead of the macro itself */
#define INVOKE_MACRO(...) #__VA_ARGS__
printf(" %s " , INVOKE_MACRO(PORT_INFO_MAC(1)) ); /* In a more general way, I'll be using it like follows: INVOKE_MACRO(PORT_INFO_MAC(2), PORT_INFO_MAC(1), ...) */
The output is always " port: 1 " with a single space between the "port" and the "1". Why is there always a single space there and is there a way to control the amount of spaces?
Changing the number of spaces between port: and portNum in the PORT_INFO_MAC_2 macro doesn't change the number of spaces in the output.
EDIT
It seems that there are two cases. In the first case, port: and portNum are adjacent (PORT_INFO_MAC_2(portNum) port:portNum), and no space appears between them in the output. In the second case, any number of spaces separates them in the macro, and the output always contains exactly one space.
Is there any formal explanation for that? Is there any control over that?
Why is there always a single space there and is there a way to control the amount of spaces?
Because that's what the stringification operator is specified to do:
If, in the replacement list, a parameter is immediately preceded by a # preprocessing token, both are replaced by a single character string literal preprocessing token that contains the spelling of the preprocessing token sequence for the corresponding argument. Each occurrence of white space between the argument’s preprocessing tokens becomes a single space character in the character string literal.
(C2011 6.10.3.2/2; emphasis added)
Of course, if there is no whitespace at all between the preprocessing tokens, then none appears in the stringification.
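Here is a minimal sketch of both cases, plus one way to control the spacing. Since # normalizes white space between tokens, any padding that has to survive must live inside a string literal, which is then joined to the stringified argument by string literal concatenation. The names STR, STR_, TIGHT, WIDE, PADDED and the helper functions are illustrative and do not come from the question.

```c
#include <stdio.h>

/* Two-level stringification so the argument is macro-expanded first. */
#define STR_(...) #__VA_ARGS__
#define STR(...)  STR_(__VA_ARGS__)

#define TIGHT(n)  port:n          /* no white space between the tokens  */
#define WIDE(n)   port:        n  /* a run of white space between them  */
#define PADDED(n) "port:   " #n   /* three spaces kept inside a literal */

/* No white space between tokens, so none appears in the string. */
const char *tight_label(void)  { return STR(TIGHT(1)); }

/* The run of white space collapses to a single space. */
const char *wide_label(void)   { return STR(WIDE(1)); }

/* The literal's spaces are untouched; "1" is appended by concatenation. */
const char *padded_label(void) { return PADDED(1); }

void print_labels(void) {
    printf("[%s] [%s] [%s]\n", tight_label(), wide_label(), padded_label());
}
```

Calling print_labels() from main shows the three spellings side by side. The PADDED form gives up the "stringify the whole expansion" property, so it only fits when the macro itself can produce string literals.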
Related
I have a #define'd value named HEIGHT with a value of 20.
I want to use the ANSI escape code "\033[HA" (where H is the number of lines to move the cursor up).
However, when my code reads "\033[HEIGHTA", the 'H' is read as a different escape code (return cursor home). How can I include a #define'd value within an escape code?
Thanks
There are several alternatives, among them
1. Use a function instead of a macro to generate the escape code as needed. For example,
const char *cursor_up_seq(void) {
    static char sequence[12];
    if (sequence[0] == '\0') {
        // one-time initialization
        sprintf(sequence, "\033[%dA", HEIGHT);
    }
    return sequence;
}
2. As a variation on (1), do not produce the escape sequence as a standalone entity at all. Instead, embed it in whatever else you are printing, where it is natural to use (say) printf() to print the value of the HEIGHT macro.
3. But if you really want to produce a macro for a string literal containing the whole escape sequence, then you can do so by combining two C features:
the stringification (#) macro operator, and
automatic concatenation of adjacent string literals
Another answer, now deleted, attempted to demonstrate that, but floundered on one of the gotchas in that area. Here is a variation that works:
#define HEIGHT 20
#define STRINGIFY(x) #x
#define STRINGIFY_VALUE(x) STRINGIFY(x)
#define SEQUENCE "\033[" STRINGIFY_VALUE(HEIGHT) "A"
The resulting SEQUENCE macro expands to "\033[" "20" "A", which is 100% equivalent to "\033[20A" because of string literal concatenation. The gotcha here is that you cannot use STRINGIFY() directly for this purpose, because that does not macro-expand its argument before converting it to a string (per the standard behavior of #). Wrapping it in another macro layer (STRINGIFY_VALUE) results in that outer layer expanding the argument before presenting the result for stringification.
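To make the gotcha concrete, here is a small sketch that exposes what each macro layer produces. The helper function names (name_only, value_text, cursor_up, move_cursor_up) are made up for illustration; the macros are the ones shown above.

```c
#include <stdio.h>

#define HEIGHT 20
#define STRINGIFY(x) #x
#define STRINGIFY_VALUE(x) STRINGIFY(x)

/* # applies to the argument as written, so HEIGHT is not expanded here. */
const char *name_only(void) { return STRINGIFY(HEIGHT); }

/* The extra layer expands HEIGHT to 20 before # sees it. */
const char *value_text(void) { return STRINGIFY_VALUE(HEIGHT); }

/* Adjacent string literals are concatenated into one escape sequence. */
const char *cursor_up(void) { return "\033[" STRINGIFY_VALUE(HEIGHT) "A"; }

/* Writes the sequence to stdout, moving the cursor up HEIGHT lines. */
void move_cursor_up(void) { fputs(cursor_up(), stdout); }
```

Here name_only() yields the macro's name, while value_text() yields its value, which is the difference the extra layer exists to make.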
As far as I know, the C preprocessor replaces a macro with the text of its #define exactly as written. But now I see that it seems to add spaces before and after the replacement.
Is my explanation correct, or am I doing something that results in undefined behavior?
Consider the following C code:
#include <stdio.h>
#define k +-6+-
#define kk xx+k-x
int main()
{
int x = 1029, xx = 4,t;
printf("x=%d,xx=%d\n",x,xx);
t=(35*kk*2)*4;
printf("t=%d,x=%d,xx=%d\n",t,x,xx);
return 0;
}
The initial values are x = 1029 and xx = 4. Let's calculate the value of t now.
t = (35*kk*2)*4;
t = (35*xx+k-x*2)*4; // replacing the literal kk
t = (35*xx++-6+--x*2)*4; // replacing the literal k
Now, xx = 4 is used here and would be incremented only afterwards, while x is decremented first and becomes 1028. So the calculation for the current statement is:
t = (35*4-6+1028*2)*4;
t = (140-6+2056)*4;
t = 2190*4;
t = 8760;
But the output of the above code is:
x=1029,xx=4
t=8768,x=1029,xx=4
From the second line of the output, it is clear that the increment and decrement did not take place.
That means that after replacing k and kk, the statement becomes:
t = (35*xx+ +-6+- -x*2)*4;
(If it is, then the calculation is clear.)
My concern: is this behavior specified by the C standard, is it undefined behavior, or am I doing something wrong?
The C standard specifies that the source file is analyzed and parsed into preprocessor tokens. When macro replacement occurs, a macro that is replaced is replaced with those tokens. The replacement is not literal text replacement.
C 2018 5.1.1.2 specifies translation phases (rephrasing and summarizing, not exact quotes):
1. Physical source file multibyte characters are mapped to the source character set. Trigraph sequences are replaced by single-character representations.
2. Lines continued with backslashes are merged.
3. The source file is converted from characters into preprocessing tokens and white-space characters: each sequence of characters that can be a preprocessing token is converted to a preprocessing token, and each comment becomes one space.
4. Preprocessing is performed (directives are executed and macros are expanded).
5. Source characters in character constants and string literals are converted to members of the execution character set.
6. Adjacent string literals are concatenated.
7. White-space characters are discarded. “Each preprocessing token is converted into a token. The resulting tokens are syntactically and semantically analyzed and translated as a translation unit.” (That quoted text is the main part of C compilation as we think of it!)
8. The program is linked to become an executable file.
So, in phase 3, the compiler recognizes that #define kk xx+k-x consists of the tokens #, define, kk, xx, +, k, -, and x. The compiler also knows there is white space between define and kk and between kk and xx, but this white space is not itself a preprocessor token.
In phase 4, when the compiler replaces kk in the source, it is doing so with these tokens. kk gets replaced by the tokens xx, +, k, -, and x, and k is replaced by the tokens +, -, 6, +, and -. Combined, those form xx, +, +, -, 6, +, -, -, -, and x.
The tokens remain that way. They are not reanalyzed to put + and + together to form ++.
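As a small illustration of that last point, the sketch below uses a made-up macro NEG alongside the k and kk macros from the question. If replacement were textual, -NEG x would become --x and modify x; because it is token-based, it stays as two separate unary minus operators. The function names are hypothetical.

```c
#include <stdio.h>

#define NEG -

/* Expands to the tokens '-', '-', 'x', that is -(-x).
 * Textual pasting would have produced --x, a pre-decrement. */
int negate_twice(int x) {
    int result = -NEG x;
    return result;
}

/* The macros from the question: k supplies separate + and - tokens. */
#define k +-6+-
#define kk xx+k-x

/* Parsed as (35*xx + (+(-6)) + (-(-x))*2) * 4; x and xx are not modified. */
int question_value(int x, int xx) {
    return (35*kk*2)*4;
}

#undef kk
#undef k
```

With the question's values, question_value(1029, 4) works out to (140 - 6 + 2058) * 4, which agrees with the t=8768 that the program printed.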
As @EricPostpischil says in a comprehensive answer, the C pre-processor works on tokens, not character strings, and once the input is tokenised, whitespace is no longer needed to separate adjacent tokens.
If you ask a C preprocessor to print out the processed program text, it will probably add whitespace characters where needed to separate the tokens. But that's just for your convenience; the whitespace might or might not be present, and it makes almost no difference because it has no semantic value and will be discarded before the token sequence is handed over to the compiler.
But there is a brief moment during preprocessing when you can see some whitespace, or at least an indication as to whether there was whitespace inside a token sequence, if you can pass the token sequence as an argument to a function-like macro.
Most of the time, the preprocessor does not modify tokens. The tokens it receives are what it outputs, although not necessarily in the same order and not necessarily all of them. But there are two exceptions, involving the two preprocessor operators # (stringify) and ## (token concatenation). The first of these transforms a macro argument -- a possibly empty sequence of tokens -- into a string literal, and when it does so it needs to consider the presence or absence of whitespace in the token sequence.
(The token concatenation operator combines two tokens into a single token if possible; when it does so, intervening whitespace is ignored. That operator is not relevant here.)
The C standard actually specifies precisely how whitespace in a macro argument is handled if the argument is stringified, in paragraph 2 of §6.10.3.2:
Each occurrence of white space between the argument’s preprocessing tokens becomes a single space character in the character string literal. White space before the first preprocessing token and after the last preprocessing token composing the argument is deleted.
We can see this effect in action:
/* I is just used to eliminate whitespace between two macro invocations.
 * The indirection of `STRING/STRING_` is explained in many SO answers;
 * it's necessary in order that the stringify operator apply to the expanded
 * macro argument, rather than the literal argument.
 */
#include <stdio.h>

#define I(x) x
#define STRING_(x) #x
#define STRING(x) STRING_(x)
#define PLUS +
int main(void) {
    printf("%s\n", STRING(I(PLUS)I(PLUS)));
    printf("%s\n", STRING(I(PLUS) I(PLUS)));
}
The output of this program is:
++
+ +
showing that the whitespace in the second invocation was preserved.
Contrast the above with gcc's -E output for ordinary use of the macro:
int main(void) {
(void) I(PLUS)I(PLUS)3;
(void) I(PLUS) I(PLUS)3;
}
The macro expansion is
int main(void) {
(void) + +3;
(void) + +3;
}
showing that the preprocessor was forced to insert a cosmetic space into the first expansion, in order to preserve the semantics of the macro expansion. (Again, I emphasize that the -E output is not what the preprocessor module passes to the compiler, in normal GCC operation. Internally, it passes a token sequence. All of the whitespace in the -E output above is a courtesy which makes the generated file more useful.)
I was working with macros and wrote one like this:
#define STR(name) #name
I meant STR() to stringise whatever was given to it as an argument, and it seemed to work.
printf( STR(Hello) );
gave the output as expected:
Hello
So did
printf( STR(Hello world) );
printf( STR(String) STR(ise) );
which gave
Hello world
Stringise
But when I tried to use STR() to print only a space, it just didn’t work.
printf( STR(Hello) STR( ) STR(World) ); //There’s a space between the parenthesis of the second STR
Gave the output:
HelloWorld
Here the STR( ) is ignored.
Why is this? Is there a way around it while still sticking to macros with only a space as the argument?
I was just wondering if this was possible.
It is not possible for stringification to result in a single space. The semantics of the # operator are detailed in C11 6.10.3.2p2:
If, in the replacement list, a parameter is immediately preceded by a # preprocessing token, both are replaced by a single character string literal preprocessing token that contains the spelling of the preprocessing token sequence for the corresponding argument. Each occurrence of white space between the argument's preprocessing tokens becomes a single space character in the character string literal. White space before the first preprocessing token and after the last preprocessing token composing the argument is deleted. [...] The character string literal corresponding to an empty argument is "". [...].
Thus, as the space is not a preprocessing token, and leading and trailing space is deleted, it is impossible for the stringification operator to create a resulting string literal that just contains a single space. As you've noticed, STR( ) would pass an empty argument to the macro, and this would be stringified into ""; likewise
STR( Hello
World
)
would be expanded into "Hello World"; i.e. each occurrence of white space would become a single space character, and the preceding and trailing whitespace would be deleted.
However, while it is not possible to stringify a single space, it is possible to achieve the required output. Adjacent string literals are concatenated into one after preprocessing (translation phase 6), so "Hello" " " "World" is converted to "Hello World"; therefore
printf(STR(Hello) " " STR(World));
would after macro expansion be expanded to
printf("Hello" " " "World");
and thereafter to
printf("Hello World");
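The three behaviours described above can be checked with a short sketch. STR is the macro from the question; the function names are made up for illustration.

```c
#include <stdio.h>

#define STR(name) #name

/* An argument consisting only of white space is an empty argument: "". */
const char *empty_text(void) { return STR( ); }

/* Interior runs of white space collapse to one space;
 * leading and trailing white space is deleted. */
const char *collapsed_text(void) { return STR(   Hello      World   ); }

/* The space comes from an ordinary literal, joined by concatenation. */
const char *greeting(void) { return STR(Hello) " " STR(World); }

void print_greeting(void) {
    printf("%s\n", greeting());
}
```

Only the third form gives control over the separator, because the separator never passes through the # operator.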
ATOMIC_JOIN(prefix, detail_platform) is a macro that outputs something like the following:
base/atomic/gcc_gnu_x64
Another macro, ATOMIC_DETAIL_HEADER, is expected to output:
"base/atomic/gcc_gnu_x64.hpp" // notice: double quotes included in the output
I tried to write ATOMIC_DETAIL_HEADER in several ways, such as:
#define ATOMIC_DETAIL_HEADER(prefix) "ATOMIC_JOIN(prefix, ATOMIC_DETAIL_PLATFORM).hpp"
#define ATOMIC_DETAIL_HEADER(prefix) \"ATOMIC_JOIN(prefix, ATOMIC_DETAIL_PLATFORM).hpp\"
#define ATOMIC_DETAIL_HEADER(prefix) "##ATOMIC_JOIN(prefix, ATOMIC_DETAIL_PLATFORM).hpp##"
... all failed!
But if the desired output is:
<base/atomic/gcc_gnu_x64.hpp>
then the following macro definition does the right thing:
#define ATOMIC_DETAIL_HEADER(prefix) <ATOMIC_JOIN(prefix, ATOMIC_DETAIL_PLATFORM).hpp>
A cpp macro cannot build strings this way. It can join tokens to form new tokens, but the result at every stage must be a valid token. Your example with angle brackets works because the bracket characters are distinct tokens, whereas a lone double quote cannot exist as a token on its own, and you cannot apply ## to it.
In most contexts, the compiler will concatenate adjacent string literals, so it may be sufficient to stringify (#) each piece and let the compiler do that.
While luser droog correctly stated why your use of quotes didn't work, he didn't show exactly how the goal can be accomplished. Indeed, the # operator replaces a parameter with a string literal, i.e. it puts quotation marks around the argument. This is slightly complicated by the fact that your token sequence has to be expanded first, so an additional level of macro substitution is needed:
#define QUOTED(a) #a
#define QUOTE(a) QUOTED(a)
#define ATOMIC_DETAIL_HEADER(prefix) QUOTE(ATOMIC_JOIN(prefix, ATOMIC_DETAIL_PLATFORM).hpp)
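The question does not show how ATOMIC_JOIN or ATOMIC_DETAIL_PLATFORM are defined, so the sketch below assumes a conventional two-level token-pasting JOIN and a platform name of gnu_x64, purely for illustration. Under those assumptions, it shows the QUOTE/QUOTED pair turning the expanded token sequence into a quoted header name.

```c
#include <stdio.h>

/* Assumed for illustration; the question does not show these definitions. */
#define ATOMIC_DETAIL_PLATFORM gnu_x64
#define ATOMIC_JOIN_(a, b) a##b
#define ATOMIC_JOIN(a, b) ATOMIC_JOIN_(a, b)

/* From the answer: QUOTE expands its argument, QUOTED stringifies it. */
#define QUOTED(a) #a
#define QUOTE(a) QUOTED(a)
#define ATOMIC_DETAIL_HEADER(prefix) QUOTE(ATOMIC_JOIN(prefix, ATOMIC_DETAIL_PLATFORM).hpp)

/* Returns the header name as a string so that it can be inspected. */
const char *header_name(void) {
    return ATOMIC_DETAIL_HEADER(base/atomic/gcc_);
}

void print_header_name(void) {
    printf("%s\n", header_name());
}
```

In real use the macro would appear directly after #include rather than in an expression. The prefix is written without spaces because white space between its tokens would show up as a space in the resulting string.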
Is it possible to define a macro off of the content of a macro?
For example:
#define SET(key,value) #define key value
SET(myKey,"value")
int main(){
char str[] = myKey;
printf("%s",str);
}
would result in
int main(){
char str[] = "value";
printf("%s",str);
}
after being preprocessed.
Why would I do this? Because I'm curious ;)
No, it's not possible to define a macro within another macro.
The preprocessor makes only one pass before the compiler runs. What you're suggesting would require an undetermined number of passes.
No, you can't: a # in the replacement list of a function-like macro means "quote the next token", which must be a macro parameter. It's more of a spelling issue than any logical puzzle :)
(If you require this kind of solution in your code, then there are ways and tricks of using macros, but you need to be specific about the use cases you need, as your example can be achieved by defining: #define myKey "value")
Here it is from the C99 standard:
6.10.3.2 The # operator
Constraints
1 Each # preprocessing token in the replacement list for a function-like macro shall be followed by a parameter as the next preprocessing token in the replacement list.
Semantics
2 If, in the replacement list, a parameter is immediately preceded by a # preprocessing token, both are replaced by a single character string literal preprocessing token that contains the spelling of the preprocessing token sequence for the corresponding argument. Each occurrence of white space between the argument’s preprocessing tokens becomes a single space character in the character string literal. White space before the first preprocessing token and after the last preprocessing token composing the argument is deleted. Otherwise, the original spelling of each preprocessing token in the argument is retained in the character string literal, except for special handling for producing the spelling of string literals and character constants: a \ character is inserted before each " and \ character of a character constant or string literal (including the delimiting " characters), except that it is implementation-defined whether a \ character is inserted before the \ character beginning a universal character name. If the replacement that results is not a valid character string literal, the behavior is undefined. The character string literal corresponding to an empty argument is "". The order of evaluation of # and ## operators is unspecified.
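Given that constraint, one workaround is to have the macro expand to an ordinary declaration instead of a directive. The sketch below is only an approximation of what the question wanted: here SET is a stand-in that defines a constant array, not a macro, and my_key is a made-up accessor.

```c
#include <stdio.h>

/* Stand-in for the question's SET: expands to a definition, not a directive. */
#define SET(key, value) static const char key[] = value;

SET(myKey, "value")

/* Accessor so that the generated constant can be inspected. */
const char *my_key(void) { return myKey; }

void print_my_key(void) {
    char str[] = "value";   /* what the question hoped myKey would expand to */
    printf("%s %s\n", str, my_key());
}
```

Unlike a real macro, myKey here is an object with an address and a scope, so it cannot be used where a string literal token is required, for example in string literal concatenation.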
Macros are a simple text substitution. Generating new preprocessor directives from a macro would require the preprocessor to resume preprocessing from the beginning of the substitution. However, the standard defines preprocessing to continue after the substitution; the expanded text is never processed as a preprocessing directive, even if it resembles one (C99 6.10.3.4/3).
This makes sense from a streaming point of view, viewing the unprocessed code as the input stream and the processed (and substituted) code as the output stream. Macro substitutions can have arbitrary length, so restarting preprocessing from the beginning of a substitution would mean inserting an arbitrary number of characters at the front of the input stream to be processed again.
When processing continues after the substitution, the input is handled in a single pass without any insertion or buffering, because everything goes directly to the output.
Whilst it is not possible to use a macro to define another macro, depending on what you are seeking to achieve, you can use macros to achieve effectively the same thing by having them define constants. For example, I have an extensive library of C macros that I use to define Objective-C constant strings and key values.
Here are some snippets of code from some of my headers.
// use defineStringsIn_X_File to define a NSString constant to a literal value.
// usage (direct) : defineStringsIn_X_File(constname,value);
#define defineStringsIn_h_File(constname,value) extern NSString * const constname;
#define defineStringsIn_m_File(constname,value) NSString * const constname = value;
// use defineKeysIn_X_File when the value is the same as the key.
// eg myKeyname has the value @"myKeyname"
// usage (direct) : defineKeysIn_X_File(keyname);
// usage (indirect) : myKeyDefiner(defineKeysIn_X_File);
#define defineKeysIn_h_File(key) defineStringsIn_h_File(key,key)
#define defineKeysIn_m_File(key) defineStringsIn_m_File(key,@#key)
// use defineKeyValuesIn_X_File when the value is completely unrelated to the key - ie you supply a quoted value.
// eg myKeyname has the value @"keyvalue"
// usage: defineKeyValuesIn_X_File(keyname,@"keyvalue");
// usage (indirect) : myKeyDefiner(defineKeyValuesIn_X_File);
#define defineKeyValuesIn_h_File(key,value) defineStringsIn_h_File(key,value)
#define defineKeyValuesIn_m_File(key,value) defineStringsIn_m_File(key,value)
// use definePrefixedKeys_in_X_File when the last part of the keyname is the same as the value.
// eg myPrefixed_keyname has the value @"keyname"
// usage (direct) : definePrefixedKeys_in_X_File(prefix_,keyname);
// usage (indirect) : myKeyDefiner(definePrefixedKeys_in_X_File);
#define definePrefixedKeys_in_h_File_2(prefix,key) defineKeyValuesIn_h_File(prefix##key,@#key)
#define definePrefixedKeys_in_m_File_2(prefix,key) defineKeyValuesIn_m_File(prefix##key,@#key)
#define definePrefixedKeys_in_h_File_3(prefix,key,NSObject) definePrefixedKeys_in_h_File_2(prefix,key)
#define definePrefixedKeys_in_m_File_3(prefix,key,NSObject) definePrefixedKeys_in_m_File_2(prefix,key)
#define definePrefixedKeys_in_h_File(...) VARARG(definePrefixedKeys_in_h_File_, __VA_ARGS__)
#define definePrefixedKeys_in_m_File(...) VARARG(definePrefixedKeys_in_m_File_, __VA_ARGS__)
// use definePrefixedKeyValues_in_X_File when the value has no relation to the keyname, but the keyname has a common prefix
// eg myPrefixed_keyname has the value @"bollocks"
// usage: definePrefixedKeyValues_in_X_File(prefix_,keyname,@"bollocks");
// usage (indirect) : myKeyDefiner(definePrefixedKeyValues_in_X_File);
#define definePrefixedKeyValues_in_h_File(prefix,key,value) defineKeyValuesIn_h_File(prefix##key,value)
#define definePrefixedKeyValues_in_m_File(prefix,key,value) defineKeyValuesIn_m_File(prefix##key,value)
#define VA_NARGS_IMPL(_1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11, _12, N, ...) N
#define VA_NARGS(...) VA_NARGS_IMPL(X,##__VA_ARGS__, 11, 10,9, 8, 7, 6, 5, 4, 3, 2, 1, 0)
#define VARARG_IMPL2(base, count, ...) base##count(__VA_ARGS__)
#define VARARG_IMPL(base, count, ...) VARARG_IMPL2(base, count, __VA_ARGS__)
#define VARARG(base, ...) VARARG_IMPL(base, VA_NARGS(__VA_ARGS__), __VA_ARGS__)
and a usage example that invokes it:
#define sw_Logging_defineKeys(defineKeyValue) \
/** start of key list for sw_Logging_ **/\
/**/defineKeyValue(sw_Logging_,log)\
/**/defineKeyValue(sw_Logging_,time)\
/**/defineKeyValue(sw_Logging_,message)\
/**/defineKeyValue(sw_Logging_,object)\
/**/defineKeyValue(sw_Logging_,findCallStack)\
/**/defineKeyValue(sw_Logging_,debugging)\
/**/defineKeyValue(sw_Logging_,callStackSymbols)\
/**/defineKeyValue(sw_Logging_,callStackReturnAddresses)\
/** end of key list for sw_Logging_ **/
sw_Logging_defineKeys(definePrefixedKeys_in_h_File);
The last part may be a little difficult to get your head around.
The sw_Logging_defineKeys() macro defines a list and takes the name of a macro as its parameter (defineKeyValue); this is then used to invoke the macro that does the actual definition. That is, for each item in the list, the macro name passed in determines the context ("header" or "implementation", i.e. the "h" or "m" file, if you know the Objective-C file extensions). Whilst this is used for Objective-C, it is simply plain old C macros, used for a "higher purpose" than Kernighan and Ritchie possibly ever envisaged. :-)
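For readers who don't use Objective-C, the same list-macro pattern can be sketched in plain C. All names below (LOGGING_KEYS, DECLARE_KEY, DEFINE_KEY, the logging_ prefix) are invented for this illustration and are not part of the headers above.

```c
#include <stdio.h>

/* The list: each entry is handed to whichever macro is passed in as X. */
#define LOGGING_KEYS(X)  \
    X(logging_, log)     \
    X(logging_, time)    \
    X(logging_, message)

/* "Header" role: declare each constant. */
#define DECLARE_KEY(prefix, key) extern const char *const prefix##key;

/* "Implementation" role: define it, using the key's spelling as its value. */
#define DEFINE_KEY(prefix, key) const char *const prefix##key = #key;

LOGGING_KEYS(DECLARE_KEY)   /* would normally live in the .h file */
LOGGING_KEYS(DEFINE_KEY)    /* would normally live in one .c file */

void print_logging_keys(void) {
    printf("%s %s %s\n", logging_log, logging_time, logging_message);
}
```

The list is written once, and the macro passed in decides whether each entry becomes a declaration or a definition, which is the role the "h" and "m" variants play above.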