Generating initializer-lists using the C preprocessor - c-preprocessor

I was wondering if it is possible to construct a C99 macro that can consume either this syntax
MAGIC(a,b,(c,d),(e,(f,g))) // Expands to {{a}, {b}, {{c, d}}, {{e, {f,g}}}}
Or this more functional syntax
MAGIC( (a)(b)(c,d)(e,(f,g)) ) // Again, should expand to {{a}, {b}, {{c, d}}, {{e, {f,g}}}}
If need be, I can assume a maximum nesting depth w.r.t. parenthesis of, say four levels.
I have played around with the solution presented here, but so far I haven't come very far. I am trying to avoid having to create these generation-like macros to simulate recursion / iteration, but it is probably not possible to get around that.

Part of what makes this difficult for the C preprocessor is the fact that it requires recursing into parenthetical groupings. This in practice means that either we need to apply delays/evaluations at the outer levels or we need different macros at the inner ones. There's also an asymmetry in the specification at the recursion level which, while not hard per se, has to be accounted for.
If need be, I can assume a maximum nesting depth w.r.t. parenthesis of, say four levels.
I'll commit to this... and support four levels.
Definitions
Definitions (using boost preprocessor compatible terms): A tuple is a preprocessing data structure comprised of a single parenthetical grouping whose elements are comma delimited. So (a,(b,c),d) is a tuple with three elements... a, (b,c), and d. A sequence is a preprocessing data structure comprised of a series of contiguous elements each of which is surrounded by a parenthetical group. So the same elements listed above are represented by the sequence (a)((b,c))(d).
Design
To process tuples using variadics, we need a counter, a glue macro, and a macro for each of the tuple sizes we want to support. If say we support tuples from 1 to 10 elements, and want to change it to support tuples from 1 to 20, we may have to change the counter and add 10 more macros. But if we support 4 levels of recursion using tuples alone, we need 4 times this many macros; that makes scaling more burdensome.
You show two forms of macros; both use tuples, the latter just uses a sequence at the top level. For consistency I'll just pick the first form. But for scalability, I'll add tuple processing just to convert a tuple to a sequence, showing 9 element tuple support... then, to support n element tuples, you only need to change the count and add n-9 macros.
Approach
We start with a basic glue macro, a count, and a sequence converter at a single level (PARN):
#define GLUE(A,B) GLUE_I(A,B)
#define GLUE_I(A,B) A##B
#define PARN(...) GLUE(PARN_, COUNT (__VA_ARGS__)) (__VA_ARGS__)
#define PARN_9(A,B,C,D,E,F,G,H,I) (A)(B)(C)(D)(E)(F)(G)(H)(I)
...
#define PARN_2(A,B) (A)(B)
#define PARN_1(A) (A)
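COUNT is used by PARN but is not defined in this excerpt. One common way to write it, shown here as an assumption rather than as the answer's own definition, is to append a descending list of numbers and pick out a fixed argument position:

```c
/* Assumed definition of COUNT for 1 to 9 arguments (not taken from the answer).
   The caller's arguments push the descending list to the right, so the
   number that lands in slot N equals the argument count. The trailing comma
   keeps the variadic part non-empty even when 9 arguments are passed. */
#define COUNT(...) COUNT_I(__VA_ARGS__, 9, 8, 7, 6, 5, 4, 3, 2, 1,)
#define COUNT_I(_1, _2, _3, _4, _5, _6, _7, _8, _9, N, ...) N

/* Example use: a parenthesized group counts as a single argument. */
enum { EXAMPLE_COUNT = COUNT(a, (b, c), d) };  /* 3 */
```

Supporting longer tuples with this sketch means extending both the number list and the parameter list of COUNT_I, which matches the "change the count" step mentioned above.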
We want to either iterate another level or just process an element depending on if it's parenthetical. To do that I'll apply a pattern matcher based on an "indirect second macro". The idea is that you put a pattern in argument 1 and set up macros such that, if a pattern matches, it shifts tokens into argument 2; otherwise, your pattern just produces a bunch of tokens in argument 1 that get ignored. Here's the pattern matcher construct with a parenthetical detector (note that my second expands to empty if you pass it 1 argument):
#define SECOND(...) SECOND_I(__VA_ARGS__,,)
#define SECOND_I(A,B,...) B
#define CALLDETECT(...) ,
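To see the detector in isolation, here is a small sketch that turns it into a 0/1 test. IS_PAREN is a hypothetical helper introduced only for this illustration; the three macros above it are copied from the answer.

```c
#define SECOND(...) SECOND_I(__VA_ARGS__,,)
#define SECOND_I(A,B,...) B
#define CALLDETECT(...) ,

/* IS_PAREN is a hypothetical helper, not part of the answer.
   If X is parenthesized, CALLDETECT X expands to a comma, which shifts the
   1 into argument 2 and SECOND returns it. Otherwise "CALLDETECT X 1" stays
   in argument 1, SECOND returns nothing, and only "+ 0" remains. */
#define IS_PAREN(X) (SECOND(CALLDETECT X 1) + 0)
```

With these definitions IS_PAREN((a, b)) becomes (1 + 0) and IS_PAREN(a) becomes (+ 0). The answer uses the same shift to select a macro name (MAGIC1, BRACIFY) instead of a number.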
Given the link you provided you should already know how to apply sequences. I have a specific macro for sequence application:
#define PASTE_E(...) PASTE_E_I(__VA_ARGS__)
#define PASTE_E_I(...) __VA_ARGS__ ## E
...typical "simple" sequence processing toggles between two macros, call them A and B, until they hit the terminal... so they go A,B,A,B,...,E. But we want to leave a trail of comma delimited results with no extra commas. So we'll prepend the comma in each of these macros, but we'll start with another A that doesn't prepend; i.e., we're going to go A,B,C,B,C,...,E. We'll need four of these sets to (a) avoid blue paint, and (b) process differently at the top level. Here's the initial set:
#define BRACP0(X) {SECOND(CALLDETECT X MAGIC1)X}
#define BRAC0_A(X) BRACP0(X)BRAC0_B
#define BRAC0_B(X) ,BRACP0(X)BRAC0_C
#define BRAC0_C(X) ,BRACP0(X)BRAC0_B
#define BRAC0_BE
#define BRAC0_CE
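The A,B,C,B,C,...,E walk may be easier to follow without the recursion. The sketch below uses the same shape to turn a sequence into a plain comma-delimited list; the LIST_* and SEQ_TO_LIST names are made up for this illustration.

```c
#define PASTE_E(...) PASTE_E_I(__VA_ARGS__)
#define PASTE_E_I(...) __VA_ARGS__ ## E

/* LIST_A emits the first element without a comma; LIST_B and LIST_C
   alternate and each prepends one. Whichever name is left at the end gets
   an E pasted onto it and then expands to nothing. */
#define LIST_A(X) X LIST_B
#define LIST_B(X) , X LIST_C
#define LIST_C(X) , X LIST_B
#define LIST_BE
#define LIST_CE
#define SEQ_TO_LIST(SEQ) PASTE_E(LIST_A SEQ)

/* (10)(20)(30) becomes 10 , 20 , 30 */
static const int listed[] = { SEQ_TO_LIST((10)(20)(30)) };
```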
This will chain the outer level MAGIC to an inner level MAGIC1; this set specifically adds an additional {} pair per spec (BRACP0). The rest of the sets look similar except for that {}. But for the last level we want to call a simple bracify instead of the "next magic" (this too is scalable; you need one of these sets per level recursion depth supported). The bracify and level 4's "BRACP" just look like this:
#define BRACIFY(...) {__VA_ARGS__}
#define BRACP4(X) SECOND(CALLDETECT X BRACIFY)X
I have an "unwrap" per magic macro level:
#define MAGIC(...) {PASTE_E(MAGIC_U(BRAC0_A PARN(__VA_ARGS__)))}
#define MAGIC_U(...) __VA_ARGS__
#define MAGIC1(...) {PASTE_E(MAGIC1_U(BRAC1_A PARN(__VA_ARGS__)))}
#define MAGIC1_U(...) __VA_ARGS__
...
#define MAGIC4(...) {PASTE_E(MAGIC4_U(BRAC3_A PARN(__VA_ARGS__)))}
#define MAGIC4_U(...) __VA_ARGS__
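As a reading aid, here is a reduced assembly of the pieces above that supports only two levels and tuples of up to four elements. It is a sketch, not the linked full solution: COUNT is an assumed definition, BRACP1 plays the role that BRACP4 plays in the answer, and the struct types exist only so the result can be used as an initializer. Like the original, it relies on the rescanning behaviour of GCC and Clang.

```c
#define GLUE(A,B) GLUE_I(A,B)
#define GLUE_I(A,B) A##B
/* Assumed counter for 1 to 4 arguments; the answer does not show COUNT. */
#define COUNT(...) COUNT_I(__VA_ARGS__, 4, 3, 2, 1,)
#define COUNT_I(_1,_2,_3,_4,N,...) N
#define PARN(...) GLUE(PARN_, COUNT(__VA_ARGS__))(__VA_ARGS__)
#define PARN_4(A,B,C,D) (A)(B)(C)(D)
#define PARN_3(A,B,C) (A)(B)(C)
#define PARN_2(A,B) (A)(B)
#define PARN_1(A) (A)
#define SECOND(...) SECOND_I(__VA_ARGS__,,)
#define SECOND_I(A,B,...) B
#define CALLDETECT(...) ,
#define PASTE_E(...) PASTE_E_I(__VA_ARGS__)
#define PASTE_E_I(...) __VA_ARGS__ ## E
#define BRACIFY(...) {__VA_ARGS__}

/* Top level: wrap each element in {}, recursing via MAGIC1 if parenthesized. */
#define BRACP0(X) {SECOND(CALLDETECT X MAGIC1)X}
#define BRAC0_A(X) BRACP0(X)BRAC0_B
#define BRAC0_B(X) ,BRACP0(X)BRAC0_C
#define BRAC0_C(X) ,BRACP0(X)BRAC0_B
#define BRAC0_BE
#define BRAC0_CE

/* Last level: no extra {}, and parenthesized elements are only bracified. */
#define BRACP1(X) SECOND(CALLDETECT X BRACIFY)X
#define BRAC1_A(X) BRACP1(X)BRAC1_B
#define BRAC1_B(X) ,BRACP1(X)BRAC1_C
#define BRAC1_C(X) ,BRACP1(X)BRAC1_B
#define BRAC1_BE
#define BRAC1_CE

#define MAGIC(...) {PASTE_E(MAGIC_U(BRAC0_A PARN(__VA_ARGS__)))}
#define MAGIC_U(...) __VA_ARGS__
#define MAGIC1(...) {PASTE_E(MAGIC1_U(BRAC1_A PARN(__VA_ARGS__)))}
#define MAGIC1_U(...) __VA_ARGS__

/* Hypothetical types whose brace structure matches {{a}, {b}, {{c, d}}}. */
struct one  { int v; };
struct pair { int c, d; };
struct wrap { struct pair p; };
struct demo { struct one a, b; struct wrap cd; };

/* MAGIC(1, 2, (3, 4)) produces {{1}, {2}, {{3, 4}}} */
static const struct demo example = MAGIC(1, 2, (3, 4));
```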
Full solution and demo
Link to coliru.stacked-crooked demo.
Final comments
I interpreted C99 as meaning standard here and this question as just an "is-it-possible"; as such, I haven't bothered to convert this to work with MSVS's preprocessor. It probably won't work with that as-is. Let me know if that bugs you at all.

Related

How to define a macro that takes an arbitrary number of arguments and expands to give only the even arguments? [duplicate]

I am working on a recursive macro. However, it seems that it is not expanded recursively. Here is a minimal working example to show what I mean:
// ignore input, do nothing
#define ignore(...)
// choose between 6 names, depending on arity
#define choose(_1,_2,_3,_4,_5,_6,NAME,...) NAME
// if more than one parameter is given to this macro, then execute f, otherwise ignore
#define ifMore(f,...) choose(__VA_ARGS__,f,f,f,f,f,ignore)(__VA_ARGS__)
// call recursively if there are more parameters
#define recursive(first,args...) first:ifMore(recursive,args)
recursive(a,b,c,d)
// should print: a:b:c:d
// prints: a:recursive(b,c,d)
The recursive macro should expand itself recursively and always concatenate the result, separated with a colon. However, it doesn't work. The recursive macro is generated correctly (as can be seen in the result a:recursive(b,c,d), which includes a well-formed call to the macro again), but the generated recursive call is not expanded.
Why is this the case and how can I get the behaviour I want?
You can't get the behaviour you want. The C preprocessor is, by design, not Turing complete.
You can use multiple macros to get multiple replacements, but you will not achieve true recursion with an arbitrary number of replacements.
As others have mentioned, pure recursion is impossible with C macros. It is, however, possible to simulate recursion-like effects.
The Boost Pre-Processor tools do this well for both C and C++ and are a stand-alone library:
http://www.boost.org/doc/libs/1_60_0/libs/preprocessor/doc/index.html
The compiler's preprocessor will not re-expand the macro that you define. That is, it replaces the macro invocation with the text found in the definition and does not expand the same macro name again inside that result. See, for example, "Can we have recursive macros?", "Macro recursive expansion to a sequence" and "C preprocessor, recursive macros".
That is, recursive(a,b,c,d) will be expanded to a:recursive(b,c,d) and the pre-processor will then continue to the next line in the base code. It will not loop around to try to continue to expand the string (see the links that I cited).
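To make the "multiple macros" suggestion concrete, here is a minimal sketch that handles one to four arguments with one macro per length. All names (JOIN, JOIN_n, GLUE, COUNT) are illustrative, and it builds the string "a:b:c:d" from adjacent string literals rather than bare tokens, which is a deliberate change from the question so that the result is a usable C expression.

```c
#define GLUE(A, B) GLUE_I(A, B)
#define GLUE_I(A, B) A##B
/* Argument counter for 1 to 4 arguments (assumed helper). */
#define COUNT(...) COUNT_I(__VA_ARGS__, 4, 3, 2, 1,)
#define COUNT_I(_1, _2, _3, _4, N, ...) N

/* One macro per supported length; each hands the tail to the next one down,
   so no macro ever has to expand itself. */
#define JOIN_1(a)      #a
#define JOIN_2(a, ...) #a ":" JOIN_1(__VA_ARGS__)
#define JOIN_3(a, ...) #a ":" JOIN_2(__VA_ARGS__)
#define JOIN_4(a, ...) #a ":" JOIN_3(__VA_ARGS__)
#define JOIN(...) GLUE(JOIN_, COUNT(__VA_ARGS__))(__VA_ARGS__)

/* Expands to "a" ":" "b" ":" "c" ":" "d", which the compiler concatenates. */
static const char joined[] = JOIN(a, b, c, d);
```

The depth is fixed by how many JOIN_n macros exist; supporting a fifth argument means adding JOIN_5 and extending COUNT.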

Use macro expansion to create multiple #define constants

I am working on an OpenGL driver and need to define a set of constants using #define. The names of these constants iterate along with the value they represent. They are also bounded by a max value set by another #define which is hardware specific. I would like to define these constants using the max value if possible.
Currently I have defined them as follows:
#define GL_MAX_TEXTURE_UNITS 24
#define GL_TEXTURE0 0
#define GL_TEXTURE1 1
...
#define GL_TEXTURE24 24
I would like to have something along the lines of the following:
#define GL_MAX_TEXTURE_UNITS 24
#define GL_TEXTURE(SOMETRICKYMACRO)
Where the macro is defined in such a way that at compilation I end up with an expansion equivalent to the first case but if I wanted to change the number of constants I would only need to modify GL_MAX_TEXTURE_UNITS.
The C preprocessor cannot produce new preprocessing directives itself. If you want to do something like this, you would need to generate the header file with a separate utility (perhaps a shell or awk script) as part of your build process.
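The "separate utility" could itself be a few lines of C run as a build step. The sketch below is one possible shape, with a made-up function name; it mirrors the question's listing, where the constants run from GL_TEXTURE0 up to and including the maximum.

```c
#include <stdio.h>

/* Hypothetical generator: writes GL_MAX_TEXTURE_UNITS followed by
   GL_TEXTURE0 .. GL_TEXTURE<max_units> to `out`.
   Returns 0 on success, -1 if a write fails. */
static int write_texture_defines(FILE *out, int max_units)
{
    if (fprintf(out, "#define GL_MAX_TEXTURE_UNITS %d\n", max_units) < 0)
        return -1;
    for (int i = 0; i <= max_units; ++i) {
        if (fprintf(out, "#define GL_TEXTURE%d %d\n", i, i) < 0)
            return -1;
    }
    return 0;
}
```

A small main that calls write_texture_defines(stdout, 24) and redirects the output into a header would reproduce the hand-written list from the question.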
If you find yourself really needing something like this (programmatically variable range of values), it might be an indication that referring to them symbolically via macro names is a bad design choice.

Can anybody please explain the behaviour of the C preprocessor in the following examples?

I am implementing a C macro preprocessor (C99)...
I am surprised by the following behaviour....
Ex1:
#define PASTE(x) X_##x
#define EXPAND(x) PASTE(x)
#define TABSIZE 1024
#define BUFSIZE TABSIZE
PASTE(BUFSIZE)
EXPAND(BUFSIZE)
expands to:
X_BUFSIZE
X_1024
Ex2:
#define EXPAND(s) TO_STRING(s)
#define TO_STRING(s) #s
#define FOUR 4
TO_STRING(FOUR)
EXPAND(FOUR)
Expands to:
"FOUR"
"4"
I have gone through the "free" standard of C but I couldn’t find following things...
Actually how many passes preprocessor performs?
Does it replace one macro first then other and so on
or does it store & replace them as #defines are encountered one by one?
Whether file inclusion is done first or the macro expansion?
You should read this page for starters. It contains gems such as:
The C standard states that, after any parameters have been replaced with their possibly-expanded arguments, the replacement list is scanned for nested macros. Further, any identifiers in the replacement list that are not expanded during this scan are never again eligible for expansion in the future, if the reason they were not expanded is that the macro in question was disabled.
I think one can infer from this that there is no fixed number of passes: each time a macro expansion happens (which generates a "replacement list"), the newly created text is scanned for further expansions. It's a recursive process.
Actually how many passes preprocessor performs?
It replaces all occurrences of # PARAMETER by the stringification of that parameter.
It joins all tokens that have a ## in between.
It replaces all remaining occurrences of the parameters by their value.
It recursively expands the replacement text for occurrences of other macros. (The macro itself is blocked in these recursive calls.)
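The steps above explain both examples from the question. The sketch below restates them in a form that can be checked by the compiler; the two EXPAND macros are renamed so that they can coexist in one file, and the enum exists only to give the two pasted identifiers distinct values.

```c
#define TABSIZE 1024
#define BUFSIZE TABSIZE
#define FOUR 4

/* Direct use: an operand of ## or # is NOT macro-expanded first. */
#define PASTE(x) X_##x
#define TO_STRING(s) #s

/* One extra level: the argument is fully expanded when it is substituted
   here, so the inner macro sees 1024 and 4 instead of the macro names.
   EXPAND_PASTE and EXPAND_STRING are renamed versions of the question's
   two EXPAND macros. */
#define EXPAND_PASTE(x) PASTE(x)
#define EXPAND_STRING(s) TO_STRING(s)

/* Distinct values so the two pasted identifiers can be told apart. */
enum { X_BUFSIZE = 1, X_1024 = 2 };
```

Here PASTE(BUFSIZE) names X_BUFSIZE and EXPAND_PASTE(BUFSIZE) names X_1024, while TO_STRING(FOUR) is "FOUR" and EXPAND_STRING(FOUR) is "4".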
Does it replace one macro first then other and so on or does it store
& replace them as #defines are encountered one by one?
It replaces macros in the order it encounters them in the program text, or during the recursive replacement as described above.
Whether file inclusion is done first or the macro expansion?
First, the argument of an #include is macro-expanded if it is not already something enclosed in <> or "". The result of that expansion must then be exactly such a <> or "" form.

Using a #define-d list as input to a C preprocessor macro

In an example project, I defined the macro
#define FOO(x, y) x + y
This works perfectly well. For example, FOO(42, 1337) is evaluated to 1379.
However, I now want to use another #define:
#define SAMPLE 42, 1337
When I now call FOO(SAMPLE), this won't work. The compiler tells me that FOO takes two arguments, but is only called with one argument.
I guess the reason for this is that, although the arguments of a macro are evaluated before the macro itself, the preprocessor does not parse the whole invocation again after this evaluation. This is similar to the fact that it is not possible to output additional preprocessor directives from a macro.
Is there any possibility to get the desired functionality?
Replacing the FOO macro with a C function is not a possibility. The original macro is located in third party code I cannot change, and it outputs a comma-separated list of values to be directly used in array initializers. Therefore, a C function cannot replicate the same behaviour.
If it is not possible to accomplish this task by simple means: how would you store the (x, y) pairs in a maintainable form? In my case, there are 8 arguments, so storing the individual parts in separate #defines is not easily maintainable either.
You're running into a problem where the preprocessor is not matching and expanding macros in the order you want. You can generally get it to do what you want by inserting some extra macros to force it to get the order right, but in order to do that you need to understand what the normal order is.
When the compiler sees the name of a macro with arguments followed by a (, it first scans in that argument list, breaking it into arguments WITHOUT recognizing or expanding any macros in the arguments.
After parsing and separating the arguments, it then rescans each argument for macros, and expands any it finds within the argument UNLESS the argument is used with # or ## in the macro body.
It then replaces each instance of the argument in the body with the (now possibly expanded) argument.
Finally, it rescans the body for any OTHER macros that may exist within the body for expansion. In this one scan, the original macro WILL NOT be recognized and re-expanded, so you can't have recursive macro expansions.
So you can get the effect you want by careful use of an EXPAND macro that takes a single argument and expands it, allowing you to force extra expansions at the right point in the process:
#define EXPAND(X) X
#define FOO(x,y) x + y
#define SAMPLE 42, 1337
EXPAND(FOO EXPAND((SAMPLE)))
In this case you first explicitly expand macros in the argument list, and then manually expand the resulting macro call afterwards.
Update by question poster
#define INVOKE(macro, ...) macro(__VA_ARGS__)
INVOKE(FOO, SAMPLE)
provides an extended solution that works without cluttering the code with EXPANDs.
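Both variants can be checked side by side. In this sketch FOO stands in for the third-party macro, and the two constants are illustrative names that hold the results.

```c
#define FOO(x, y) x + y          /* stands in for the third-party macro */
#define SAMPLE 42, 1337

/* Variant 1: the inner EXPAND turns (SAMPLE) into (42, 1337) while the
   outer argument is being expanded; the outer EXPAND then rescans
   "FOO (42, 1337)" and sees a normal two-argument call. */
#define EXPAND(X) X

/* Variant 2: SAMPLE is expanded as part of __VA_ARGS__, so by the time
   macro(__VA_ARGS__) is rescanned it already contains two arguments. */
#define INVOKE(macro, ...) macro(__VA_ARGS__)

static const int via_expand = EXPAND(FOO EXPAND((SAMPLE)));  /* 42 + 1337 */
static const int via_invoke = INVOKE(FOO, SAMPLE);           /* 42 + 1337 */
```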

Can you capitalize a pasted token in a macro?

In a C macro, is it possible to capitalize a pasted-in token? For example, I currently have the following macro:
#define TEST(name, keyword) \
test_##name: \
TEST_##keyword##_KEYWORD
I would invoke this as follows:
TEST(test1, TEST1)
which would yield the following:
test_test1:
TEST_TEST1_KEYWORD
Now, instead of having to type the same name twice (once with all lower case characters, and again with all upper case characters), is there any way that I could do either of the following, and either change the token into all uppercase letters or all lowercase letters?
TEST(test1) or TEST(TEST1)
Thanks,
Ryan
As far as I'm aware, the only operations that can be done on tokens in the C preprocessor (at least ISO/ANSI standard) are to replace, 'stringify' or concatenate them. I'm also unaware of any GCC or MSVC extensions that will let you do what you want.
However, people have been coming up with clever (or oddball) ways to do magical (or horrible) things with macros, so I wouldn't be surprised if someone surprises me.
You could do something like the following (untested, probably typos...)
#define NORMALIZE(TOK) NORMALIZE_ ## TOK
and then for each of the writings that may occur do
#define NORMALIZE_test1 test1
#define NORMALIZE_TEST1 test1
then use the NORMALIZE macro inside your real macro something like
#define TEST(name, keyword) \
test_ ## NORMALIZE(name): \
TEST_ ## NORMALIZE(keyword) ##_KEYWORD
(but maybe you'd have to do some intermediate helper macros until you
get all concatenations right)
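One way the "intermediate helper macros" might look is sketched below. The names CAT, TO_LOWER, TO_UPPER, TEST_LABEL and TEST_KEYWORD are invented for this illustration, and the label and keyword are split into two macros only so that each result can be checked separately. The key point is that ## must be applied in a second macro, after the lookup has been expanded.

```c
/* Two-step paste so that arguments are macro-expanded before ## is applied. */
#define CAT(a, b) CAT_I(a, b)
#define CAT_I(a, b) a##b

/* Lookup tables: one entry per spelling that may occur (hypothetical names). */
#define TO_LOWER(tok) TO_LOWER_##tok
#define TO_LOWER_test1 test1
#define TO_LOWER_TEST1 test1
#define TO_UPPER(tok) TO_UPPER_##tok
#define TO_UPPER_test1 TEST1
#define TO_UPPER_TEST1 TEST1

#define TEST_LABEL(name)   CAT(test_, TO_LOWER(name))
#define TEST_KEYWORD(name) CAT(CAT(TEST_, TO_UPPER(name)), _KEYWORD)

/* Stand-ins for the generated names, so the expansions can be verified. */
enum { test_test1 = 7, TEST_TEST1_KEYWORD = 9 };
```

Either spelling now yields the same pair of names: TEST_LABEL(TEST1) is test_test1 and TEST_KEYWORD(test1) is TEST_TEST1_KEYWORD. The cost is one table entry per name and per spelling.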
It's not possible because the preprocessor works on an input stream of pp-token and has no construct that allows you to decompose these in a meaningful manner.
What the preprocessor has is constructs to replace pp-tokens with macro expansions, concatenate them, remove them (entirely) etc.
This means that your only hope for uppercasing is to start with individual characters, uppercase these, and then glue everything together. Uppercasing individual characters is quite straightforward as you only have a finite set to work with. Gluing them together, on the other hand, is possible at least if you limit yourself to a fixed maximal length. You would end up with a macro that would be used like this:
TEST(t,e,s,t,1)
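A minimal sketch of that idea for exactly five characters follows. The UP_* table, CAT5 and UPPER5 are hypothetical names, only the characters needed for the example are listed, and it assumes that no macros named T, E or S are defined elsewhere.

```c
/* Per-character uppercase table (only the characters used below). */
#define UP_t T
#define UP_e E
#define UP_s S
#define UP_1 1

/* Fixed-length glue: the extra level lets the UP_* names expand first. */
#define CAT5(a, b, c, d, e) CAT5_I(a, b, c, d, e)
#define CAT5_I(a, b, c, d, e) a##b##c##d##e

/* UPPER5(t,e,s,t,1) looks up each character, then pastes the results. */
#define UPPER5(a, b, c, d, e) CAT5(UP_##a, UP_##b, UP_##c, UP_##d, UP_##e)

/* Stand-in for the generated identifier, so the expansion can be checked. */
enum { TEST1 = 5 };
```

Longer or shorter names need their own CATn and UPPERn, or a counter-based dispatch.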

Resources