I have found two contradictory statements in two well-known books on C.
The first one is
"Formal parameters are not replaced in quoted string in macro expansion" - by K&R c language page 76
The second one is a code sample:
#define PRINT(var,format) printf("variable is %format\n",var)
PRINT(x_var,f);
According to the book, the macro invocation would then be expanded as
printf("x_var is %f\n",x_var);
This is from Programming in ANSI C by E. Balagurusamy, page 448.
Surely the two citations contradict each other. As far as I know the first one is true, and my compiler's results agree with it. But the second book is also well known and popular. I want to know whether earlier versions of C behaved as the second book describes, or whether the second citation is simply wrong.
The second book is wrong: it is easy to check that the macro will not be expanded like that. However, you can get the effect that they describe by stringizing tokens using the # preprocessor operator:
#define PRINT(var,format) printf(#var" is %"#format"\n",var)
Now you can print your variable as follows:
int xyz = 123;
PRINT(xyz, d);
Here is a link to a working sample on ideone.
Note the addition of the double quotes before and after the '#format', and the '#' before 'var' and 'format'. The '#' operator turns the macro argument, exactly as it is spelled at the call site (not the value of any variable), into a quoted string with double quotes of its own. After replacement there are four string literals in a row, which the C compiler concatenates into one string. Thus the strings "xyz", " is %", "d" and "\n" are concatenated into "xyz is %d\n".
(Note this example is different from the example in the original question in that the original example had "variable is..." where the answer replaced 'variable' with an instance of the 'var' macro argument)
The book is correct. For its time. I wrote a little test program to verify it (you don't appreciate text editors until you have programmed in ed):
#define PRINT(fmt,val) printf("val = %fmt\n", (val))
main()
{
int x;
x = 5;
PRINT(d, x);
}
I compiled it on a PDP-11 running Unix V6. Running the program produces this output:
x = 5
This is even pre-K&R C. The "feature" was removed in one of the later iterations of C, and its removal was made official in ISO C90.
I have a (possibly faulty) school assignment regarding the C-preprocessor, in which I should essentially define a macro which allows
Today is the 9.
to compile to
int a = 9;
Note the "." after the 9. The rest of the program is similar, I have no problem with that.
Now I replaced "Today" by int (#define Today int), "is" by a, "the" by = but I don't know what to do with the ".", given if I just blindly replace it by doing
#define . ;
I get a compile-time error.
Is it even possible to do something with the dot?
Is it possible to redefine "." using macros in C?
No.
given if I just blindly replace it by doing
#define . ;
I get a compile-time error. Is it even possible to do something with
the dot?
No, it is not possible.
In the first place, the . in the text presented is not a separate token according to C's rules. It is part of 9., a floating-point constant. Macro replacement operates only on complete tokens.
In the second place, macro replacement is not a general search / replace. Macro names must be C identifiers, which start with either an underscore or a Latin letter, and contain only underscores, Latin letters, and decimal digits. Thus, it is not possible to define either . by itself or the full 9. as a macro name.
#include <stdio.h>
#define Today int
#define is a
#define the = (int)
void foo(void)
{
Today is the 9.; /* expands to: int a = (int) 9.; */
printf("%d\n", is);
}
So when looking into getting my define macro to work, I found the # and ## macro helpers, and used them to simplify my macro. The key part of the macro sets a variable to a string containing the name of the variable (but not the variable name alone). As a simplified example, let's take a macro called SET(X) that should expand SET(something) into something = "pre_something".
The only way I've found to do it so far is with two macros like #define QUOTE(X) #X and #define SET(X) X = QUOTE(pre_##X). However, using multiple macros seems excessive, and may cause problems with further macro expansion (I think). Is there a cleaner, one-line way of doing the same thing?
#define SET(x) x = "pre_"#x
C does string concatenation at compile time, so two string literals next to each other are concatenated.
"hello " "world" -> "hello world"
I want to know how the C preprocessor handles circular dependencies (of #defines). This is my program:
#define ONE TWO
#define TWO THREE
#define THREE ONE
int main()
{
int ONE, TWO, THREE;
ONE = 1;
TWO = 2;
THREE = 3;
printf ("ONE, TWO, THREE = %d, %d, %d \n",ONE, TWO, THREE);
}
Here is the preprocessor output. I'm unable to figure out why the output is as such. I would like to know the various steps a preprocessor takes in this case to give the following output.
# 1 "check_macro.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "check_macro.c"
int main()
{
int ONE, TWO, THREE;
ONE = 1;
TWO = 2;
THREE = 3;
printf ("ONE, TWO, THREE = %d, %d, %d \n",ONE, TWO, THREE);
}
I'm running this program on Linux 3.2.0-49-generic-pae and compiling with gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5).
While a preprocessor macro is being expanded, that macro's name is not expanded. So all three of your symbols are defined as themselves:
ONE -> TWO -> THREE -> ONE (not expanded because expansion of ONE is in progress)
TWO -> THREE -> ONE -> TWO ( " TWO " )
THREE -> ONE -> TWO -> THREE ( " THREE " )
This behaviour is set by §6.10.3.4 of the C standard (section number from the C11 draft; as far as I know the wording is essentially unchanged since C89, although C90 numbered the section 6.8.3.4). When a macro name is encountered, it is replaced with its definition (and # and ## preprocessor operators are dealt with, as well as parameters to function-like macros). Then the result is rescanned for more macros (in the context of the rest of the file):
2/ If the name of the macro being replaced is found during this scan of the replacement list (not including the rest of the source file’s preprocessing tokens), it is not replaced. Furthermore, if any nested replacements encounter the name of the macro being replaced, it is not replaced…
The clause goes on to say that any token which is not replaced because of a recursive call is effectively "frozen": it will never be replaced:
… These nonreplaced macro name preprocessing tokens are no longer available for further replacement even if they are later (re)examined in contexts in which that macro name preprocessing token would otherwise have been replaced.
The situation to which the last sentence refers rarely comes up in practice, but here is the simplest case I could think of:
#define two one,two
#define a(x) b(x)
#define b(x,y) x,y
a(two)
The result is one, two. two is expanded to one,two during the replacement of a, and the expanded two is marked as completely expanded. Subsequently, b(one,two) is expanded. This is no longer in the context of the replacement of two, but the two which is the second argument of b has been frozen, so it is not expanded again.
Your question is answered by publication ISO/IEC 9899:TC2 section 6.10.3.4 "Rescanning and further replacement", paragraph 2, which I quote here for your convenience; in the future, please consider reading the specification when you have a question about the specification.
If the name of the macro being replaced is found during this scan of the replacement list
(not including the rest of the source file’s preprocessing tokens), it is not replaced.
Furthermore, if any nested replacements encounter the name of the macro being replaced,
it is not replaced. These nonreplaced macro name preprocessing tokens are no longer
available for further replacement even if they are later (re)examined in contexts in which
that macro name preprocessing token would otherwise have been replaced.
https://gcc.gnu.org/onlinedocs/cpp/Self-Referential-Macros.html#Self-Referential-Macros answers the question about self referential macros.
The crux of the answer is that when the preprocessor encounters a macro's own name while expanding that macro, it leaves that occurrence unexpanded.
I suspect the same logic is used to prevent expansion of circularly defined macros. Otherwise, the preprocessor would keep expanding forever.
In your example you do the macro processing before defining
variables of the same name, so regardless of what the result
of the macro processing is, you always print 1, 2, 3!
Here is an example where the variables are defined first:
#include <stdio.h>
int main()
{
int A = 1, B = 2, C = 3;
#define A B
#define B C
//#define C A
printf("%d\n", A);
printf("%d\n", B);
printf("%d\n", C);
}
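This prints 3 3 3. Somewhat insidiously, un-commenting #define C A changes the behaviour of the line printf("%d\n", B); (and of the line printing A): with all three definitions in place, each name expands back to itself, so the program prints 1 2 3.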
Here's a nice demonstration of the behavior described in rici's and Eric Lippert's answers, i.e. that a macro name is not re-expanded if it is encountered again while already expanding the same macro.
Content of test.c:
#define ONE 1, TWO
#define TWO 2, THREE
#define THREE 3, ONE
int foo[] = {
ONE,
TWO,
THREE
};
Output of gcc -E test.c (excluding initial # 1 ... lines):
int foo[] = {
1, 2, 3, ONE,
2, 3, 1, TWO,
3, 1, 2, THREE
};
(I would post this as a comment, but including substantial code blocks in comments is kind of awkward, so I'm making this a Community Wiki answer instead. If you feel it would be better included as part of an existing answer, feel free to copy it and ask me to delete this CW version.)
When using C preprocessor one can stringify macro argument like this:
#define TO_STRING(x) "a string with " #x
and so when used, the result is as follows:
TO_STRING(test) will expand to: "a string with test"
Is there any way to do the opposite? Get a string literal as an input argument and produce a C identifier? For example:
TO_IDENTIFIER("some_identifier") would expand to: some_identifier
Thank you for your answers.
EDIT: For those wondering what I need it for:
I wanted to refer to nodes in a scene graph of my 3D engine by string identifiers but at the same time avoid comparing strings in tight loops. So I figured I'll write a simple tool that will run in a pre-build step of compilation and search for a predefined pattern, for example ID("something"). Then for every such token it would calculate the CRC32 of the string between the parentheses and generate a header file with #defines containing those numerical identifiers. For example, for the string "something" it would be:
#define __CRC32ID_something 0x09DA31FB
Then the generated header file would be included by each cpp file using ID(x) macros. The ID("something") would of course expand to __CRC32ID_something, so in effect what the compiler would see are simple integer identifiers instead of human-friendly strings. Of course now I'll simply settle for ID(something), but I thought that using quotes would make more sense: a programmer who doesn't know how the ID macro works can think that something without quotes is a C identifier when in reality such an identifier doesn't exist at all.
No, you can't unstringify something.
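// unstringify test (C++): map a wide string back to an enumerator
#include <string>
using std::wstring;
enum fruits{apple,pear};
// L#x leaves L and "apple" as two separate tokens on conforming
// preprocessors, so paste them together explicitly.
#define WIDEN(s) L##s
#define IF_WS_COMPARE_SET_ENUM(x) if(ws.compare(WIDEN(#x))==0)f_ret=x;
fruits enum_from_string(wstring ws)
{
fruits f_ret = apple; // default when nothing matches
IF_WS_COMPARE_SET_ENUM(apple)
IF_WS_COMPARE_SET_ENUM(pear)
return f_ret;
}
int main()
{
fruits f;
f=enum_from_string(L"apple");
f=enum_from_string(L"pear");
return 0;
}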
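You cannot turn a string literal into an identifier, but you can build an identifier from macro arguments. This operation is called token pasting in C:
#include <stdio.h>
#define paste(n) x##n
int main(void){
int paste(5) = 5; /* declares x5 */
printf("%d" , x5);
}
output : 5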
In C99, we have compound literals, and they can be passed to functions as in:
f((int[2]){ 1, 2 });
However, if f is not a function but rather a function-like macro, gcc barfs on this due to the preprocessor parsing it not as one argument but as two arguments, "(int[2]){ 1" and "2 }".
Is this a bug in gcc or in the C standard? If it's the latter, that pretty much rules out all transparent use of function-like macros, which seems like a huge defect...
Edit: As an example, one would expect the following to be a conforming program fragment:
fgetc((FILE *[2]){ f1, f2 }[i]);
But since fgetc could be implemented as a macro (albeit being required to protect its argument and not evaluate it more than once), this code would actually be incorrect. That seems surprising to me.
This "bug" has existed in the standard since C89:
#include <stdio.h>
void function(int a) {
printf("%d\n", a);
}
#define macro(a) do { printf("%d\n", a); } while (0)
int main() {
function(1 ? 1, 2: 3); /* comma operator */
macro(1 ? 1, 2: 3); /* macro argument separator - invalid code */
return 0;
}
I haven't actually looked through the standard to check this parse, I've taken gcc's word for it, but informally the need for a matching : to each ? trumps both operator precedence and argument list syntax to make the first statement work. No such luck with the second.
This is per the C Standard, similar to how in C++, the following is a problem:
f(ClassTemplate<X, Y>) // f gets two arguments: 'ClassTemplate<X' and 'Y>'
If it is legal to add some extra parentheses there in C99, you can use:
f(((int[2]){ 1, 2 }));
  ^                ^
The rule specifying this behavior, from C99 §6.10.3/11, is as follows:
The sequence of preprocessing tokens bounded by the outside-most matching parentheses
forms the list of arguments for the function-like macro.
The individual arguments within the list are separated by comma preprocessing tokens, but comma preprocessing tokens between matching inner parentheses do not separate arguments.
To the extent that it's a bug at all, it's with the standard, not gcc (i.e., in this respect I believe gcc is doing what the standard requires).