How, exactly, does the double-stringize trick work? - c

At least some C preprocessors let you stringize the value of a macro, rather than its name, by passing it through one function-like macro to another that stringizes it:
#define STR1(x) #x
#define STR2(x) STR1(x)
#define THE_ANSWER 42
#define THE_ANSWER_STR STR2(THE_ANSWER) /* "42" */
Example use cases here.
This does work, at least in GCC and Clang (both with -std=c99), but I'm not sure how it works in C-standard terms.
Is this behavior guaranteed by C99?
If so, how does C99 guarantee it?
If not, at what point does the behavior go from C-defined to GCC-defined?

Yes, it's guaranteed.
It works because arguments to macros are themselves macro-expanded, except where the macro argument name appears in the macro body with the stringifier # or the token-paster ##.
6.10.3.1/1:
... After the arguments for the
invocation of a function-like macro
have been identified, argument
substitution takes place. A parameter
in the replacement list, unless
preceded by a # or ## preprocessing
token or followed by a ##
preprocessing token (see below), is
replaced by the corresponding argument
after all macros contained therein
have been expanded...
So, if you do STR1(THE_ANSWER) then you get "THE_ANSWER", because the argument of STR1 is not macro-expanded. However, the argument of STR2 is macro-expanded when it's substituted into the definition of STR2, which therefore gives STR1 an argument of 42, with the result of "42".
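A minimal sketch of the difference, reusing the macros from the question. The function name `check_stringize` is only an illustration and is not part of the original code.

```c
#include <assert.h>
#include <string.h>

#define STR1(x) #x        /* stringizes the argument exactly as written */
#define STR2(x) STR1(x)   /* x is macro-expanded first, then handed to STR1 */
#define THE_ANSWER 42

/* Hypothetical helper: aborts if either macro behaves differently
   from the description above. */
static void check_stringize(void)
{
    /* Operand of #: THE_ANSWER is not expanded. */
    assert(strcmp(STR1(THE_ANSWER), "THE_ANSWER") == 0);
    /* No # in STR2's body: THE_ANSWER becomes 42 before STR1 sees it. */
    assert(strcmp(STR2(THE_ANSWER), "42") == 0);
}
```

Calling `check_stringize()` from `main` should return silently on a conforming preprocessor.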

As Steve notes, this is guaranteed, and it has been guaranteed since the C89 standard. That was the standard that codified the # and ## operators in macros, and it mandates recursively expanding macros in arguments before substituting them into the body if and only if the body does not apply # or ## to the argument. C99 is unchanged from C89 in this respect.

Related

In what situations does C preprocessor ## work and not work?

It seems that sometimes concatenation with ## does work and sometimes it does not work.
It is an unreliable feature even though it is clearly vitally necessary for some uses.
Is there a clear set of rules for using ##?
An example is:
file1.h
#define concat(a,b) a##b
#define BAR bar
extern int concat(fu,BAR) ();
Here concat produces fuBAR not fubar.
Another example is:
file2.h
#define BAR bar
extern int fu##BAR ();
Here ## produces an error about a stray ## in the code.
## obeys rules which are different than the rules you were expecting. That doesn't mean it "doesn't work" or "is an unreliable feature", it only means you have to use it differently than you thought.
When you write
#define concat(a,b) a##b
the C standard says that the arguments a and b are NOT macro expanded before the concatenation happens. (N1570 §6.10.3.1.) This is an intentional difference from the behavior when you don't apply ## or # to a macro argument.
You can get the behavior you wanted with double expansion:
#define concat(a,b) concat_(a,b)
#define concat_(a,b) a##b
With this definition, the arguments to concat are macro expanded before substitution into the macro body, since they are not being used as operands to ##. Then each argument becomes an argument to concat_, and isn't expanded again, but that's fine, because the expansion is done already.
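As a rough sketch of the two behaviours side by side, the macros below are used to name two functions. The function bodies and the name `check_concat` are assumptions made for illustration only.

```c
#include <assert.h>

#define concat_(a, b) a##b            /* operands of ## are pasted unexpanded */
#define concat(a, b)  concat_(a, b)   /* a and b are expanded here, then forwarded */
#define BAR bar

/* concat(fu, BAR) yields fubar, so this defines a function named fubar. */
static int concat(fu, BAR)(void) { return 1; }

/* concat_(fu, BAR) pastes the raw tokens, so this one is named fuBAR. */
static int concat_(fu, BAR)(void) { return 2; }

/* Hypothetical helper: both names exist and refer to distinct functions. */
static void check_concat(void)
{
    assert(fubar() == 1);
    assert(fuBAR() == 2);
}
```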
And when you write
#define BAR bar
extern int fu##BAR ();
the C standard says that the ## operator is not recognized at all by the preprocessor (and therefore passes on to translation phase 7, where it is a valid token that isn't accepted by any grammar rule, and therefore causes a syntax error) because it's not part of a macro definition. (N1570 §6.10.3.3p2,3 — by implication only; it says that ## is recognized when it appears in the replacement list of a macro being expanded. This section doesn't say that it is recognized at any other time, and no other part of the standard gives ## a meaning in any other context.)
Is there a clear set of rules for using ##?
The C standard 6.10.3.1:
After the arguments for the invocation of a function-like macro have been identified,
argument substitution takes place. A parameter in the replacement list, unless preceded
by a # or ## preprocessing token or followed by a ## preprocessing token (see below), is
replaced by the corresponding argument after all macros contained therein have been
expanded.
Basically, the ## concatenation happens before its operands are macro-expanded, because the pasted result may itself form a new preprocessing token that is then expanded in turn. In case we have this:
#define concat(a,b) a##b
#define BAR bar
#define fuBAR "hello world"
Then puts(concat(fu,BAR)); will print "hello world".
In order to fix this, you need a helper macro which expands the preprocessor tokens before passing them to the macro where ## (or #) resides:
#define cc(a,b) a##b
#define concat(a,b) cc(a,b)
#define BAR bar
#define fuBAR "hello world"
#define fubar "fubar"
In this example a and b are expanded to fu and bar before cc is invoked. So now puts(concat(fu,BAR)); will print "fubar".
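Both cases can be put into one file to compare them. This is only a sketch: it calls `cc` directly to get the unexpanded paste, and `check_rescan` is a made-up helper name.

```c
#include <assert.h>
#include <string.h>

#define cc(a, b)     a##b       /* pastes its arguments without expanding them */
#define concat(a, b) cc(a, b)   /* expands a and b, then forwards them to cc */
#define BAR   bar
#define fuBAR "hello world"
#define fubar "fubar"

/* Hypothetical helper: aborts if the expansions differ from the text above. */
static void check_rescan(void)
{
    /* cc pastes fu and BAR into fuBAR; the rescan then finds the fuBAR macro. */
    assert(strcmp(cc(fu, BAR), "hello world") == 0);
    /* concat expands BAR to bar first, so the pasted name is fubar. */
    assert(strcmp(concat(fu, BAR), "fubar") == 0);
}
```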
Here ## produces an error about a stray ## in the code.
Because you can only use # and ## inside macros, simple as that. extern int fu##BAR (); is not a macro.
It seems that sometimes concatenation with ## does work and sometimes it does not work.
That's like saying "sometimes multiplication with * does work, and sometimes it doesn't", and then presenting examples such as
int six = 5;
int product = six * 2; // results in 10 instead of 12
and
void f(int a, int b, int a * b); // rejected by the compiler
The specifications for the token-pasting operator are presented in 6.10.3.3 of the C17 language specification, and your first clue should be that this is a subsection of 6.10.3, which describes macro replacement. Not only the placement of the section but also its explicit text make it clear that token concatenation occurs (only) in the context of macro replacement:
For both object-like and function-like macro invocations, before the replacement list is reexamined for more macro names to replace, each instance of a ## preprocessing token in the replacement list (not from an argument) is deleted and the preceding preprocessing token is concatenated with the following preprocessing token.
(C17 6.10.3.3/3)
There are no other specifications for special treatment of ##, so that's its full scope.
Additionally,
After the arguments for the invocation of a function-like macro have been identified, argument substitution takes place. A parameter in the replacement list, unless preceded by a # or ## preprocessing token or followed by a ## preprocessing token (see below), is replaced by the corresponding argument after all macros contained therein have been expanded.
(C17 6.10.3.1/1; emphasis added)
Thus, where function-like macro arguments are operands of a ## (or #) operator, they are not themselves macro-expanded, though the concatenated result might be expanded as a macro when the result is re-scanned.

Operator to convert to string

Why does code snippet 1 work but NOT code snippet 2?
Code Snippet 1:
#define mkstr(x) #x
int main(void)
{
printf(mkstr(abc));
return (0);
}
Code Snippet 2:
int main(void)
{
printf(#abc);
return(0);
}
The first snippet works because the # appears inside the replacement list of a function-like macro, where the preprocessor turns whatever argument you pass into a string constant.
OTOH, the second one has a syntax error because the compiler doesn't expect a # in the argument passed to printf(). Thus, the # is meaningless there.
Lines starting with the # symbol are preprocessing directives in C and C++, and the #define directive defines a macro. A macro is a named fragment of code that the preprocessor substitutes wherever the name is used.
There are 2 popular types of macros: object-like and function-like. The one you're using is a function-like macro.
The preprocessor is responsible for replacing macro invocations with their replacement text.
The statement in Snippet 1
#define mkstr(x) #x
The above macro uses a special feature called stringizing. The # before the x specifies that the argument should be taken as written, i.e. converted to a string constant, so the macro yields a string literal containing the text of whatever is passed.
On the contrary, the code below from Snippet 2
printf(#abc);
doesn't mean anything. It's a compile error because # is not allowed in the middle or at the end of a statement (please note that # is allowed as part of a string such as "#1", or as a character literal such as '#'). A # at the start of a line introduces a preprocessing directive instead.
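A small sketch of the working form, assuming the only goal is to see what the stringized text looks like. `print_abc` and `check_mkstr` are illustrative names, not part of the question.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define mkstr(x) #x   /* # is valid here: it is inside a macro's replacement list */

/* Prints abc; a "%s" format is used so the text is never read as a format string. */
static void print_abc(void)
{
    printf("%s\n", mkstr(abc));
}

/* Hypothetical helper: aborts if stringizing differs from the description above. */
static void check_mkstr(void)
{
    assert(strcmp(mkstr(abc), "abc") == 0);
    /* Runs of white space between the argument's tokens collapse to one space. */
    assert(strcmp(mkstr(a   +   b), "a + b") == 0);
}
```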
Caution: Use of macros is discouraged. You can refer to this answer on StackOverflow on why not to use them.
Refer to the resources below to learn more about macros:
Macro (The C Preprocessor) - GNU GCC
C Preprocessors and Macros - Programiz

C Macro expansion within a macro

While testing C software, I'd like to use macros to generate function calls. I'm using the CodeWarrior 5.2 compiler (very old) because I don't have a choice, and I don't know if this is standard behavior.
in macros.h
#define RUNTEST(i) \
testcase_index = i; \
__PREFIX__##_testCase_##i()
in test_foo.c
#include "macros.h"
#define __PREFIX__ foo
RUNTEST(10);
Apparently __PREFIX__ is not expanded, and the preprocessor generates a call to __PREFIX___testCase_10(), which of course will fail at link time.
Copying everything in the same file doesn't seem to change anything.
Is there a simple way out?
Alternative
I also tried #define __PREFIX__() foo to force macro expansion. In that case, it almost works, and now generates foo _testCase_10() (with a space), which of course won't compile.
I've done a simplified version of your question here (without assigning to the testcase index):
.h:
#define PASTER(x, y) x##_testCase_##y
#define EVAL(x, y) PASTER(x, y)
#define RUNTEST(i) EVAL(__PREFIX__, i)
.c
#define __PREFIX__ foo
// whatever
RUNTEST(1);
Explanation:
From the C standard:
6.10.3.1 Argument substitution
After the arguments for the invocation of a function-like macro have been identified, argument substitution takes place. A parameter in the replacement list, unless preceded by a # or ## preprocessing token or followed by a ## preprocessing token (see below), is replaced by the corresponding argument after all macros contained therein have been expanded. Before being substituted, each argument’s preprocessing tokens are completely macro replaced as if they formed the rest of the preprocessing file; no other preprocessing tokens are available.
So now that we have this, I'll walk through the expansion of RUNTEST(1):
EVAL(__PREFIX__, 1)
PASTER(foo, 1)
foo##_testCase_##1
foo_testCase_1
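The walk-through can be checked with a sketch like the one below, which also restores the assignment to testcase_index. It uses TEST_PREFIX in place of __PREFIX__ (an assumption made here, since identifiers containing a double underscore are reserved for the implementation), joins the two steps with a comma operator instead of a semicolon so the macro forms one expression, and adds a stub test function. All of these are illustrative choices rather than the original code.

```c
#include <assert.h>

#define PASTER(x, y) x##_testCase_##y
#define EVAL(x, y)   PASTER(x, y)       /* x and y are expanded before PASTER runs */
/* TEST_PREFIX is a hypothetical stand-in for __PREFIX__. */
#define RUNTEST(i)   (testcase_index = (i), EVAL(TEST_PREFIX, i)())

#define TEST_PREFIX foo

static int testcase_index;

/* Stub test case; RUNTEST(10) should resolve to a call of this function. */
static int foo_testCase_10(void) { return 100; }

/* Hypothetical helper: aborts if the generated call is not foo_testCase_10(). */
static void check_runtest(void)
{
    int result = RUNTEST(10);
    assert(result == 100);
    assert(testcase_index == 10);
}
```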

A #define in C with three dots

#define LOGI(...) ((void)__android_log_print(ANDROID_LOG_INFO, "native-activity", __VA_ARGS__))
#define LOGW(...) ((void)__android_log_print(ANDROID_LOG_WARN, "native-activity", __VA_ARGS__))
This is the definition of these 2 macros; later in the code LOGI and LOGW are used this way
LOGI("accelerometer: x=%f y=%f z=%f",
event.acceleration.x, event.acceleration.y,
event.acceleration.z);
and this way
LOGW("Unable to eglMakeCurrent");
Since I try to avoid complex macros and #define in general, I can't get what this macro actually means. What is the role for the 3 dots notation here? What does this #define change later in the code?
Obviously I know that the 3 dots are used to indicate an indefinite number of arguments, but I don't know how to read this situation.
The C99 standard introduced variadic macros, i.e., function-like macros that can take a variable number of arguments.
Quoting the latest draft of the C standard, section 6.10.3:
If the identifier-list in the macro definition does not end with an
ellipsis, the number of arguments (including those arguments
consisting of no preprocessing tokens) in an invocation of a
function-like macro shall equal the number of parameters in the macro
definition. Otherwise, there shall be more arguments in the invocation
than there are parameters in the macro definition (excluding the ...).
There shall exist a ) preprocessing token that terminates the
invocation.
The identifier __VA_ARGS__ shall occur only in the replacement-list of
a function-like macro that uses the ellipsis notation in the
parameters.
...
If there is a ... in the identifier-list in the macro definition,
then the trailing arguments, including any separating comma
preprocessing tokens, are merged to form a single item: the variable
arguments. The number of arguments so combined is such that,
following merger, the number of arguments is one more than the number
of parameters in the macro definition (excluding the ...).
And in the next subsection:
An identifier __VA_ARGS__ that occurs in the replacement list shall
be treated as if it were a parameter, and the variable arguments shall
form the preprocessing tokens used to replace it.
So you can invoke LOGI or LOGW with as many arguments as you like, and they'll all be expanded at the place specified in the definition by the reference to __VA_ARGS__.
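To see the substitution without the Android headers, here is a sketch in which `log_print` is a hypothetical stand-in for __android_log_print that formats into a buffer. The buffer, the level strings and `check_logging` are likewise assumptions for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

static char log_line[160];

/* Hypothetical stand-in for __android_log_print: writes into log_line. */
static int log_print(const char *level, const char *tag, const char *fmt, ...)
{
    char message[96];
    va_list args;

    va_start(args, fmt);
    vsnprintf(message, sizeof message, fmt, args);
    va_end(args);
    return snprintf(log_line, sizeof log_line, "[%s] %s: %s", level, tag, message);
}

/* Everything passed to the macro lands where __VA_ARGS__ appears. */
#define LOGI(...) ((void)log_print("INFO", "native-activity", __VA_ARGS__))
#define LOGW(...) ((void)log_print("WARN", "native-activity", __VA_ARGS__))

/* Hypothetical helper: one call with only a format, one with extra arguments. */
static void check_logging(void)
{
    LOGW("Unable to eglMakeCurrent");
    assert(strcmp(log_line, "[WARN] native-activity: Unable to eglMakeCurrent") == 0);

    LOGI("x=%d y=%d", 1, 2);
    assert(strcmp(log_line, "[INFO] native-activity: x=1 y=2") == 0);
}
```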

what is ## in c?

I have seen this snippet:
#define kthread_create(threadfn, data, namefmt, arg...) \
kthread_create_on_node(threadfn, data, -1, namefmt, ##arg)
what does ## stand for ?
what is the meaning of ## when it appears out of a macro ?
Contrary to the other answers, this is actually a GCC extension. When variable arguments are pasted directly into the expansion, a problem occurs if no extra arguments were passed, because a trailing comma is left behind. Thus, when ## is used with __VA_ARGS__ or a named variadic parameter (declared as argname...), GCC pastes the arguments in if there are any and removes the preceding comma if there are none.
The documentation for this extension is here:
Second, the '##' token paste operator has a special meaning when placed between a comma and a variable argument. If you write
#define eprintf(format, ...) fprintf (stderr, format, ##__VA_ARGS__)
and the variable argument is left out when the eprintf macro is used, then the comma before the '##' will be deleted. This does not happen if you pass an empty argument, nor does it happen if the token preceding '##' is anything other than a comma.
eprintf ("success!\n")
==> fprintf(stderr, "success!\n");
The above explanation is ambiguous about the case where the only macro parameter is a variable arguments parameter, as it is meaningless to try to distinguish whether no argument at all is an empty argument or a missing argument. In this case the C99 standard is clear that the comma must remain, however the existing GCC extension used to swallow the comma. So CPP retains the comma when conforming to a specific C standard, and drops it otherwise.
This "pastes" whatever is passed in arg into the macro expansion.
Example:
kthread_create(threadfn, data, namefmt, foo, bar, doo);
Expands to:
kthread_create_on_node(threadfn, data, -1, namefmt, foo, bar, doo);
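The comma deletion can be sketched as follows. This relies on the GNU extension described above, so it is expected to build with GCC or Clang but is not standard C. `format_out`, the buffer and `check_comma_deletion` are made-up names, and the sketch uses __VA_ARGS__ rather than the named arg... form from the question.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static char out[64];

/* GNU extension: with no variable arguments, the comma before ## is removed. */
#define format_out(format, ...) \
    ((void)snprintf(out, sizeof out, format, ##__VA_ARGS__))

/* Hypothetical helper: exercises the macro with and without extra arguments. */
static void check_comma_deletion(void)
{
    format_out("success!");        /* snprintf(out, sizeof out, "success!") */
    assert(strcmp(out, "success!") == 0);

    format_out("%d + %d", 1, 2);   /* snprintf(out, sizeof out, "%d + %d", 1, 2) */
    assert(strcmp(out, "1 + 2") == 0);
}
```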
