Can you capitalize a pasted token in a macro?

In a C macro, is it possible to capitalize a pasted-in token? For example, I currently have the following macro:
#define TEST(name, keyword) \
test_##name: \
TEST_##keyword##_KEYWORD
I would invoke this as follows:
TEST(test1, TEST1)
which would yield the following:
test_test1:
TEST_TEST1_KEYWORD
Now, instead of having to type the same name twice (once with all lower case characters, and again with all upper case characters), is there any way that I could do either of the following, and either change the token into all uppercase letters or all lowercase letters?
TEST(test1) or TEST(TEST1)
Thanks,
Ryan

As far as I'm aware, the only operations that can be done on tokens in the C preprocessor (at least in ISO/ANSI standard C) are to replace, 'stringify' or concatenate them. I'm also unaware of any GCC or MSVC extensions that will let you do what you want.
However, people have been coming up with clever (or oddball) ways to do magical (or horrible) things with macros, so I wouldn't be surprised if someone surprises me.

You could do something like the following (untested, probably typos...)
#define NORMALIZE(TOK) NORMALIZE_ ## TOK
and then for each of the writings that may occur do
#define NORMALIZE_test1 test1
#define NORMALIZE_TEST1 test1
then use the NORMALIZE macro inside your real macro something like
#define TEST(name, keyword) \
test_ ## NORMALIZE(name): \
TEST_ ## NORMALIZE(keyword) ##_KEYWORD
(but maybe you'd have to do some intermediate helper macros until you
get all concatenations right)
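The intermediate helper macros hinted at above could look like the following. This is only a sketch: the names CAT, LOWER, UPPER, TEST_LABEL and TEST_KEYWORD are made up for illustration, and the enum merely gives the generated identifiers something to refer to so the expansion can be checked at compile time (C11 static_assert). The key point is that ## does not macro-expand its operands, so the paste has to go through an extra level.

```c
#include <assert.h>

/* Two-level paste: the arguments are macro-expanded before ## is applied. */
#define CAT_(a, b) a##b
#define CAT(a, b)  CAT_(a, b)

/* One table entry per spelling that may occur (hypothetical names). */
#define LOWER_test1 test1
#define LOWER_TEST1 test1
#define UPPER_test1 TEST1
#define UPPER_TEST1 TEST1

#define LOWER(tok) CAT(LOWER_, tok)
#define UPPER(tok) CAT(UPPER_, tok)

/* Either spelling of the argument yields the same generated names. */
#define TEST_LABEL(name)   CAT(test_, LOWER(name))
#define TEST_KEYWORD(name) CAT(CAT(TEST_, UPPER(name)), _KEYWORD)

/* Stand-ins so that the generated identifiers refer to something. */
enum { test_test1 = 1, TEST_TEST1_KEYWORD = 2 };

static_assert(TEST_LABEL(TEST1) == 1, "expands to test_test1");
static_assert(TEST_KEYWORD(test1) == 2, "expands to TEST_TEST1_KEYWORD");
```

The cost is one pair of table entries per name, so this only pays off when the set of names is small and known in advance.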

It's not possible, because the preprocessor works on an input stream of pp-tokens and has no construct that lets you decompose them in a meaningful manner.
What the preprocessor does have are constructs to replace pp-tokens with macro expansions, concatenate them, remove them (entirely), etc.
This means that your only hope for uppercasing is to start with individual characters, uppercase each of them, and then glue everything together. Uppercasing individual characters is quite straightforward, as you only have a finite set to work with. Gluing them together is also possible, at least if you limit yourself to a fixed maximum length. You would end up with a macro that is used like this:
TEST(t,e,s,t,1)
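A rough sketch of that per-character idea, fixed at five characters, might look like this. The names UP_x, CAT5 and UPPER5 are invented here, only the characters actually needed are given table entries, and the enum exists purely so the result can be checked at compile time.

```c
#include <assert.h>

/* Per-character lookup table (only the characters used below). */
#define UP_t T
#define UP_e E
#define UP_s S
#define UP_1 1

/* Two-level paste so that the UP_x entries are expanded before gluing. */
#define CAT5_(a, b, c, d, e) a##b##c##d##e
#define CAT5(a, b, c, d, e)  CAT5_(a, b, c, d, e)

/* Look up each character, then glue the five results into one token. */
#define UPPER5(a, b, c, d, e) \
    CAT5(UP_##a, UP_##b, UP_##c, UP_##d, UP_##e)

/* Stand-in so that the generated identifier refers to something. */
enum { TEST1 = 5 };

static_assert(UPPER5(t, e, s, t, 1) == 5, "expands to TEST1");
```

A full version would need one table entry per letter and digit, and one gluing macro per supported length.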

Related

Can the stringize macro be used here?

Using Pelles C I would like to show or log an unsigned char array.
Is it possible to use the stringize macro to show the whole array as hex values instead of iterating through the array with printf(%x)?

No. Stringification only encloses the specified parameters within double quotes.

Unfortunately, you can't. Stringification can convert your identifiers to strings, but not vice versa. It also cannot do anything with the value a variable contains (neither read nor write it), because values may only be known at run time, whereas macros are expanded at compile time.
In fact, I think preprocessor tricks such as # and ## should be avoided where possible, because they greatly reduce readability. #include, #define and #if are enough most of the time.
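To make the difference concrete, here is a small sketch. STRINGIZE and hex_dump are hypothetical names, not part of Pelles C or any library. Stringizing an array name only yields the name as text, so the bytes themselves still have to be formatted in a loop at run time.

```c
#include <stddef.h>
#include <stdio.h>

/* Turns the spelling of its argument into a string literal. */
#define STRINGIZE(x) #x

/* Writes len bytes of buf into out as two lowercase hex digits per byte.
   out must have room for 2 * len + 1 characters. */
static void hex_dump(const unsigned char *buf, size_t len, char *out)
{
    for (size_t i = 0; i < len; i++)
        snprintf(out + 2 * i, 3, "%02x", (unsigned)buf[i]);
    out[2 * len] = '\0';
}
```

For an array named data, STRINGIZE(data) is just the four-character string "data", whatever the array holds, while hex_dump(data, sizeof data, out) produces the actual contents.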

What exactly is redefinition of a macro in C? What's the point, given so many restrictions on it?

This is the first time I ran into the "redefinition of macro" concept while reading the C book by Mike Banahan (Section 7.3.2). But from what I can gauge from the following paragraph given there, redefinition won't be of any use at all other than repeating the same thing, given the tight restrictions. Of course my understanding is wrong and the author must be having a point. So can you please explain in simple terms what exactly redefinition of a macro in C is, and what exactly can we do to redefine it after we comply with the restrictions and rules given for that. A sample code will be very helpful. Thank you.
Extracted text follows:
The Standard allows either type of macro to be redefined at any time,
using another # define, provided that there isn't any attempt to
change the type of the macro and that the tokens making up both the
original definition and the redefinition are identical in number,
ordering, spelling and use of white space. In this context all white
space is considered equal, so this would be correct:
#define XXX abc/*comment*/def hij
#define XXX abc def hij
because comment is a form of white space. The token sequence for both cases (w-s stands for a white-space token) is:
# w-s define w-s XXX w-s abc w-s def w-s hij w-s

In practice, you generally do not want to redefine macros. Most of the time it happens due to name collision (two pieces of code defining a macro with the same name that may or may not do the same thing). The rule you cite says redefinition is allowed in the case where the only difference between the two definitions is white space. In that case, both definitions will do the same thing. In any other case, all bets are off.
For example, a common thing to want is the maximum of two numbers. If you write a MAX macro, one way to do it would be:
// ASSUME: multiple references to macro parameters do not cause problems
#define MAX(a, b) ((a) > (b) ? (a) : (b))
Since MAX is the obvious name for a macro that returns the maximum of two numbers, there is a pretty good chance that someone else might have the same idea and also define a MAX macro. If they happen to define it exactly the same way you did, the compiler will accept the multiple definitions, because they do the same thing (though some compilers will still warn about it).
If someone defines MAX differently, the compiler has to diagnose the redefinition (some compilers report an error; GCC, for example, reports it as a warning by default). Being told about it is a good thing. Had the compiler silently picked either the first or the last definition, the programmer would most likely not be aware that a different macro than they expected is being used.
If you need to work around multiple definitions of macros (e.g., two different 3rd party libraries choose the same name), you can use #ifdef to check if the macro is already defined and #undef to "undefine" the first definition if you would rather have the second. Such solutions are generally fragile. If you have a choice, avoiding name conflicts is a better solution.
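As a sketch of both cases: the first redefinition below differs only in white space and is accepted, while the last one changes the tokens and therefore needs #undef first. The helper max_int and the function largest_of_three are made up for the example.

```c
/* Hypothetical helper: evaluates each argument exactly once. */
static int max_int(int a, int b) { return a > b ? a : b; }

/* First definition, as if it came from a third-party header. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Same tokens, different white space (a comment counts as white space):
   this is a permitted redefinition. */
#define MAX(a, b) ((a) > (b) ?   /* pick the larger */   (a) : (b))

/* A genuinely different definition has to be preceded by #undef. */
#ifdef MAX
#undef MAX
#endif
#define MAX(a, b) max_int((a), (b))

/* From here on, MAX refers to the last definition. */
static int largest_of_three(int x, int y, int z)
{
    return MAX(MAX(x, y), z);
}
```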

Do preprocessor macro definitions need to be in CAPS in a header file?

In the code I'm writing, I have been told to define a variable in a header file in the following way:
#define CLR_BLACK 0x0000
and since this is the only example I've been given, I was wondering whether all variables defined in a header file with the #define command need to be in caps. For example, would the following be valid?
#define videoBuffer (u16*)0x6000000

No. You can use any combination of alphanumeric characters and underscores. Don't start with a number.
However a variable name like videoBuffer would be difficult to distinguish from regular variables (without syntax coloring). That's why most people either use all caps for preprocessor macros or start them with a lower case k, like this: kMyPreprocessorMacro
EDIT: Those are not "global variables" by the way (as you tagged). They're preprocessor macros. Basically an automatic find and replace mechanism that is run at compile time.

No.
#define is a preprocessor directive that defines a macro. The preprocessor replaces every later occurrence of the macro name (the first identifier after #define) with whatever follows that name on the line. The name does not need to be in caps.

No, but it's a common and useful convention so if you're reading the code you can see what's a macro and what isn't. See C++ #ifndef for include files, why is all caps used for the header file?

Is it possible to convert a C string literal to uppercase using the preprocessor (macros)?

Ignoring that there are sometimes better non-macro ways to do this (I have good reasons, sadly), I need to write a big bunch of generic code using macros. Essentially a macro library that will generate a large number of functions for some pre-specified types.
To avoid breaking a large number of pre-existing unit tests, one of the things the library must do is, for every type, generate the name of that type in all caps for printing. E.g. a type "flag" must be printed as "FLAG".
I could just manually write out constants for each type, e.g.
#define flag_ALLCAPSNAME FLAG
but this is not ideal. I'd like to be able to do this programatically.
At present, I've hacked this together:
char capname_buf[BUFSIZ];
#define __MACRO_TO_UPPERCASE(arg) strcpy(capname_buf, arg); \
for(char *c=capname_buf;*c;c++)*c = (*c >= 'a' && *c <= 'z')? *c - 'a' + 'A': *c;
__MACRO_TO_UPPERCASE(#flag)
which does what I want to some extent (i.e. after this bit of code, capname_buf has "FLAG" as its contents), but I would prefer a solution that would allow me to define a string literal using macros instead, avoiding the need for this silly buffer.
I can't see how to do this, but perhaps I'm missing something obvious?
I have a variadic foreach loop macro written (like this one), but I can't mutate the contents of the string literal produced by #flag, and in any case, my loop macro would need a list of character pointers to iterate over (i.e. it iterates over lists, not over indices or the like).
Thoughts?

It is not possible in portable C99 to have a macro which converts a constant string to all uppercase letters (in particular because the notion of a letter depends on the character encoding: a UTF-8 letter is not the same as an ASCII one).
However, you might consider some other solutions.
customize your editor to do that. For example, you could write some emacs code which would update each C source file as you require.
use some preprocessor on your C source code (perhaps a simple C code generator script which would emit a bunch of #define in some #include-d file).
use GCC extensions to have perhaps
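#include <ctype.h>
#include <stddef.h>
/* GCC only: relies on statement expressions and __COUNTER__ */
#define TO_UPPERCASE(Str) TO_UPPERCASE_EXPAND(Str,__COUNTER__)
/* extra level so that __COUNTER__ is expanded before ## pastes it */
#define TO_UPPERCASE_EXPAND(Str,Cnt) TO_UPPERCASE_COUNTED(Str,Cnt)
#define TO_UPPERCASE_COUNTED(Str,Cnt) ({ \
static char buf_##Cnt[sizeof(Str)+4]; \
const char *str_##Cnt = Str; \
size_t ix_##Cnt = 0; \
for (; *str_##Cnt; str_##Cnt++, ix_##Cnt++) \
if (ix_##Cnt < sizeof(buf_##Cnt)-1) \
buf_##Cnt[ix_##Cnt] = (char)toupper((unsigned char)*str_##Cnt); \
buf_##Cnt; })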
customize GCC, perhaps using MELT (a domain specific language to extend GCC), to provide your __builtin_capitalize_constant to do the job (edit: MELT is now an inactive project). Or code in C++ your own GCC plugin doing that (caveat, it will work with only one given GCC version).

It's not possible to do this entirely using the C preprocessor. The reason is that the preprocessor reads its input as (atomic) pp-tokens, from which it composes the output. There's no construct that lets the preprocessor decompose a pp-token into individual characters (none that would help you here, anyway).
In your example, when the preprocessor reads the string literal "flag", it is basically an atomic chunk of text to the preprocessor. It has constructs to conditionally remove such chunks or glue them together into larger chunks.
The only construct that in some sense allows you to decompose a pp-token is via some expressions. However, these expressions can only work on arithmetic types, which is why they won't help you here.
Your approach circumvents this problem by using C language constructs, i.e. you do the conversion at runtime. The only thing the preprocessor does then is insert the C code that converts the string.
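If the run-time conversion is kept, one way to avoid the shared global buffer is to let the caller supply it. The following is only a sketch under that assumption. The names upper_copy and TYPE_NAME_UPPER are invented here, and the buffer passed to the macro must be a real array, not a pointer, so that sizeof gives its capacity.

```c
#include <ctype.h>
#include <stddef.h>

/* Copies src into dst in upper case, truncating to fit dst_size,
   and always null-terminates when dst_size is nonzero. */
static char *upper_copy(char *dst, size_t dst_size, const char *src)
{
    size_t i = 0;
    if (dst_size == 0)
        return dst;
    for (; src[i] != '\0' && i + 1 < dst_size; i++)
        dst[i] = (char)toupper((unsigned char)src[i]);
    dst[i] = '\0';
    return dst;
}

/* The preprocessor only stringizes the type name; the uppercasing
   itself still happens at run time. buf must be an array. */
#define TYPE_NAME_UPPER(type, buf) upper_copy((buf), sizeof(buf), #type)
```

With char name[16]; the call TYPE_NAME_UPPER(flag, name) leaves the text FLAG in name.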

Can a C macro definition refer to other macros?

What I'm trying to figure out is if something such as this (written in C):
#define FOO 15
#define BAR 23
#define MEH (FOO / BAR)
is allowed? I would want the preprocessor to replace every instance of
MEH
with
(15 / 23)
but I'm not so sure that will work. Certainly if the preprocessor only goes through the code once then I don't think it'd work out the way I'd like.
I found several similar examples but all were really too complicated for me to understand. If someone could help me out with this simple one I'd be eternally grateful!

Short answer yes. You can nest defines and macros like that - as many levels as you want as long as it isn't recursive.
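A quick way to convince yourself is to check the expansion at compile time. This sketch uses the macros from the question plus a made-up helper, meh_as_double. Note that (15 / 23) is an integer division, so its value is 0.

```c
#include <assert.h>

#define FOO 15
#define BAR 23
#define MEH (FOO / BAR)

/* MEH expands to (15 / 23), which truncates to 0 in integer arithmetic. */
static_assert(MEH == 0, "MEH expands to (15 / 23)");

/* Hypothetical helper: the same macros in a floating-point division. */
static double meh_as_double(void)
{
    return (double)FOO / BAR;
}
```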

The answer is "yes", and two other people have correctly said so.
As for why the answer is yes, the gory details are in the C standard, section 6.10.3.4, "Rescanning and further replacement". The OP might not benefit from this, but others might be interested.
6.10.3.4 Rescanning and further replacement
After all parameters in the replacement list have been substituted and
# and ## processing has taken place, all placemarker preprocessing tokens are removed.
Then, the resulting preprocessing token sequence
is rescanned, along with all subsequent preprocessing tokens of the
source file, for more macro names to replace.
If the name of the macro being replaced is found during this scan of
the replacement list (not including the rest of the source file's
preprocessing tokens), it is not replaced. Furthermore, if any nested
replacements encounter the name of the macro being replaced, it is not
replaced. These nonreplaced macro name preprocessing tokens are no
longer available for further replacement even if they are later
(re)examined in contexts in which that macro name preprocessing token
would otherwise have been replaced.
The resulting completely macro-replaced preprocessing token sequence
is not processed as a preprocessing directive even if it resembles
one, but all pragma unary operator expressions within it are then
processed as specified in 6.10.9 below.
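The second paragraph of that rule is what keeps a self-referential macro from expanding forever. A small illustration, with names chosen only for this example:

```c
/* An ordinary variable named foo. */
static int foo = 1;

/* The macro mentions its own name. While foo is being replaced, the
   inner foo is not replaced again, so it refers to the variable. */
#define foo (4 + foo)

static int read_foo(void)
{
    return foo; /* expands to (4 + foo), where foo is the variable */
}
```

Here read_foo() returns 5: the expansion happens once, and the remaining foo is left alone.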

Yes, it's going to work.
But for your personal information, here are some simplified rules about macros that might help you (it's out of scope, but will probably help you in the future). I'll try to keep it as simple as possible.
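The defines take effect in the order they are included/read. That means that a macro must be defined before the point in the code where it is used. (A replacement list may mention a macro that is only defined later, as long as it is defined by the time the outer macro is actually expanded.)
Useful preprocessor directives: #define, #undef, #else, #elif, #ifdef, #ifndef, #if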
You can use any other previously #define in your macro. They will be expanded. (like in your question)
Function macro definitions accept two special operators (# and ##)
operator # stringizes the argument:
#define str(x) #x
str(test); // would translate to "test"
operator ## concatenates two arguments
#define concat(a,b) a ## b
concat(hello, world); // would translate to helloworld (one identifier, not a string)
There are some predefined macros (from the language) as well that you can use:
__LINE__, __FILE__, __cplusplus, etc
See your compiler's documentation for the full list, since many predefined macros are compiler-specific rather than portable.
Pay attention to the macro expansion
You'll see that people use a lot of round brackets "()" when defining macros. The reason is that when you call a macro, its arguments are pasted into the replacement text "as is":
#define mult(a, b) a * b
mult(1+2, 3+4); // will be expanded like: 1 + 2 * 3 + 4 = 11 instead of 21.
#define mult_fix(a, b) ((a) * (b))
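The rules above can be checked at compile time with a few static assertions. In this sketch, AREA, WIDTH and HEIGHT are made-up names, and the enum exists only so that the pasted identifier helloworld refers to something. What matters for ordering is that every macro is defined by the time it is expanded.

```c
#include <assert.h>

/* WIDTH and HEIGHT are defined after AREA, but before AREA is used. */
#define AREA (WIDTH * HEIGHT)
#define WIDTH 3
#define HEIGHT 4

#define str(x) #x
#define concat(a, b) a##b

#define mult(a, b) a * b
#define mult_fix(a, b) ((a) * (b))

/* Stand-in so that the pasted identifier refers to something. */
enum { helloworld = 99 };

static_assert(AREA == 12, "AREA expands to (3 * 4)");
static_assert(sizeof(str(test)) == 5, "four characters plus the terminator");
static_assert(concat(hello, world) == 99, "pastes into helloworld");
static_assert(mult(1 + 2, 3 + 4) == 11, "1 + 2 * 3 + 4");
static_assert(mult_fix(1 + 2, 3 + 4) == 21, "((1 + 2) * (3 + 4))");
```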

Yes, and there is one more advantage of this feature. You can leave some macro undefined and set its value as a name of another macro in the compilation command.
#include <stdio.h>
#define STR "string"
int main(void) { printf("value=%s\n", VALUE); return 0; }
In the command line you can say that the macro "VALUE" takes value from another macro "STR":
$ gcc -o test_macro -DVALUE=STR main.c
$ ./test_macro
Output:
value=string
This approach works as well for MSC compiler on Windows. I find it very flexible.

I'd like to add a gotcha that tripped me up.
Function-style macros cannot do this.
Example that doesn't compile when used:
#define FOO 1
#define FSMACRO(x) FOO + x
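For what it's worth, under the rescanning rule quoted earlier a function-like macro of this shape should expand FOO just like an object-like one, so the failure most likely came from something else. One plausible candidate, though this is only a guess, is the missing parentheses. In the sketch below, FSMACRO_SAFE is a made-up name for a parenthesized variant.

```c
#include <assert.h>

#define FOO 1
#define FSMACRO(x) FOO + x
#define FSMACRO_SAFE(x) (FOO + (x))

/* FOO inside the function-like macro is expanded as usual. */
static_assert(FSMACRO(2) == 3, "expands to 1 + 2");

/* Without parentheses the surrounding expression changes the meaning. */
static_assert(2 * FSMACRO(2) == 4, "expands to 2 * 1 + 2");
static_assert(2 * FSMACRO_SAFE(2) == 6, "expands to 2 * (1 + (2))");
```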

Yes, that is supported. And used quite a lot!
One important thing to note, though, is to make sure you parenthesize the expression, otherwise you might run into nasty issues!
#define MEH FOO/BAR
// vs
#define MEH (FOO / BAR)
// the first could be expanded in an expression like 5 * MEH to mean something
// completely different than the second
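With the values from the question, the difference can be made visible at compile time. MEH_BARE and MEH_PAREN are names invented for this sketch so that both variants can coexist.

```c
#include <assert.h>

#define FOO 15
#define BAR 23
#define MEH_BARE  FOO / BAR
#define MEH_PAREN (FOO / BAR)

/* 5 * 15 / 23 is evaluated left to right: 75 / 23 == 3. */
static_assert(5 * MEH_BARE == 3, "unparenthesized expansion");

/* 5 * (15 / 23): the division happens first and truncates to 0. */
static_assert(5 * MEH_PAREN == 0, "parenthesized expansion");
```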