CPP how to escape a quotation mark - c-preprocessor

I'm in the process of making code from 1991 work on Ubuntu 19.
I've got this file I need to run through CPP where I am forced to use the -traditional option.
#define ITEM_WEAPON 5
#define ITEM_FIREWEAPON 6
Trade types = "+ITEM_WEAPON+ITEM_FIREWEAPON+"
I want the line to become
Trade types = "+5+6+"
This worked just fine in 1991-1997 ;-) It seems cpp, for obvious reasons, no longer parses between quotation marks.
I've tried to escape the quotes using the backslash character e.g.
Trade types = \""+ITEM_WEAPON+ITEM_FIREWEAPON+\""
But still haven't found a good solution.
For clarity this is not a C program, instead we simply used cpp to expand various macros into a structured text file which was later run through a parser.
The closest I have come (with -traditional flag) to something that almost works is this:
#define WI 1
#define WJ 2
#define T(a,b) Trade types = "+a+b+"
T(1,2)
T(WI,WJ)
Which outputs:
Trade types = "+1+2+"
Trade types = "+WI+WJ+"
So the pre-processor does substitute the arguments between the quotes, but it does not expand macros passed as arguments to the parameterized macro.

#include <iostream>
#include <string>
#define ITEM_WEAPON 5
#define ITEM_FIREWEAPON 6
#define STRINGIFY_HELPER(x) #x
#define STRINGIFY(x) STRINGIFY_HELPER(x)
int main()
{
std::string types = "+" STRINGIFY(ITEM_WEAPON) "+" STRINGIFY(ITEM_FIREWEAPON) "+";
std::cout << types << '\n';
return 0;
}
https://wandbox.org/permlink/R5sLKvGhcnL9Jgyj

With more macros you can do that:
#define ITEM_WEAPON 5
#define ITEM_FIREWEAPON 6
#define MAKESTRING2(s) #s
#define MAKESTRING(s) MAKESTRING2(s)
Trade types = "+" MAKESTRING(ITEM_WEAPON) "+" MAKESTRING(ITEM_FIREWEAPON) "+";
But I also would try avoiding macros:
#define ITEM_WEAPON 5
#define ITEM_FIREWEAPON 6
const std::string types = "+" + std::to_string(ITEM_WEAPON) + "+" + std::to_string(ITEM_FIREWEAPON) + "+";
Or even better:
constexpr int ITEM_WEAPON = 5;
constexpr int ITEM_FIREWEAPON = 6;
const std::string types = "+" + std::to_string(ITEM_WEAPON) + "+" + std::to_string(ITEM_FIREWEAPON) + "+";
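The two-level stringification used in these answers can also be checked in plain C. This is only a minimal sketch: STRINGIFY_RAW, trade_types_raw, trade_types and print_trade_types are hypothetical names chosen for illustration, and it assumes the text ends up inside a C program rather than a file passed through cpp -traditional.

```c
#include <stdio.h>

#define ITEM_WEAPON 5
#define ITEM_FIREWEAPON 6

/* One level: '#' stringifies the argument exactly as written. */
#define STRINGIFY_RAW(x) #x
/* Two levels: the outer macro expands its argument first,
   then the inner macro stringifies the result. */
#define STRINGIFY(x) STRINGIFY_RAW(x)

/* Adjacent string literals are concatenated by the compiler. */
static const char *const trade_types_raw =
    "+" STRINGIFY_RAW(ITEM_WEAPON) "+";
static const char *const trade_types =
    "+" STRINGIFY(ITEM_WEAPON) "+" STRINGIFY(ITEM_FIREWEAPON) "+";

/* Prints the unexpanded and the expanded variant, one per line. */
void print_trade_types(void)
{
    printf("%s\n%s\n", trade_types_raw, trade_types);
}
```

Calling print_trade_types() from main shows the difference: the one-level macro keeps the macro name, while the two-level macro yields the digits.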

Running
cpp -traditional
on this file
#define WI 1
#define WJ 2
#define T(a) Trade types = \"+a+"
T(WI+WJ)
will produce the following output:
Trade types = \"+1+2+"
Which is the closest I could come to replicating the behavior of old times past :)
Given all the comments, which I agree with, I think this will have to suffice.

Related

How to fix the macro expansion problem in C

How to fix the macro expansion issue below ?
#define GET_VAL 3,2
#define ADD_VAL(val0, val1) ((val0) + (val1))
void foo()
{
int res = ADD_VAL(GET_VAL);
}
The macro is getting expanded as below and resulting in an error. I am using MSVC 2019
res = 3,2 + ;
I even tried using a helper macro as below, but still getting the same error.
#define GET_VAL 3,2
#define ADD_VAL1(val0, val1) (val0 + val1)
#define ADD_VAL(val) ADD_VAL1(val)
Expecting expansion:
ADD_VAL(GET_VAL); --> ADD_VAL(3, 2); --> 3 + 2
By default MSVC doesn't use a standard-conforming preprocessor implementation; make sure to enable it with /Zc:preprocessor.
Macros fully expand their arguments in isolation before pasting them into the replacement text, but the resulting tokens aren't separated into a new argument list. The way to fix this is to create an intermediate macro that expands the arguments and passes the expanded arguments on to your macro:
#define GET_VAL 1,2
#define ADD_VAL(...) ADD_VAL_(__VA_ARGS__)
#define ADD_VAL_(a,b) ((a)+(b))
ADD_VAL(GET_VAL) // should work now
Another option is to write a fx macro that evaluates arguments and applies a function to them:
#define FX(f,...) f(__VA_ARGS__)
#define ADD_VAL(a,b) ((a)+(b))
FX(ADD_VAL,GET_VAL) // should work now
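Both approaches can be tried in a small translation unit. This is a sketch that assumes a standard-conforming preprocessor (on MSVC, that means building with /Zc:preprocessor as noted above); add_via_helper and add_via_fx are hypothetical function names used only to make the expansions observable.

```c
#define GET_VAL 3,2

/* The worker macro takes two separate arguments. */
#define ADD_VAL_(a, b) ((a) + (b))
/* The front-end macro takes whatever it is given; GET_VAL is expanded
   to "3,2" before ADD_VAL_ is rescanned, so it arrives as two arguments. */
#define ADD_VAL(...) ADD_VAL_(__VA_ARGS__)

/* Generic variant: apply macro f to the already expanded arguments. */
#define FX(f, ...) f(__VA_ARGS__)

int add_via_helper(void)
{
    return ADD_VAL(GET_VAL);      /* becomes ((3) + (2)) */
}

int add_via_fx(void)
{
    return FX(ADD_VAL_, GET_VAL); /* becomes ((3) + (2)) */
}
```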
C preprocessor can be abused in horrible ways
#include <stdio.h>
#define GET_VAL 3,2
// #define ADD_VAL(val0, val1) ((val0) + (val1))
// Each use builds a two-element compound literal from val and indexes it.
#define ADD_VAL(val) ((int [2]){val}[0] + (int [2]){val}[1])
int main(void) {
printf("%d\n",ADD_VAL(GET_VAL));
}
Output
5

How to pass string as prefix of defined macro

Is there any way to pass a C string as part of a defined macro, as in the code below?
#define AAA_NUM 10
#define BBB_NUM 20
#define PREFIX_NUM(string) string##_NUM
int main()
{
char *name_a = "AAA";
char *name_b = "BBB";
printf("AAA_NUM: %d\n", PREFIX_NUM(name_a));
printf("BBB_NUM: %d\n", PREFIX_NUM(name_b));
return 0;
}
Expected output
AAA_NUM: 10
BBB_NUM: 20
As mentioned in other posts, you can't use run-time variables in the pre-processor. You could, however, create an enum that way. It is usually not a good idea to generate identifiers with macros either, save for special cases such as maintaining an existing code base where you are limited in how much of the existing code you can or want to change. So it should be used as a last resort only.
The least bad way to write such macros would be by using a common design pattern called "X macros". These are used when it is important that code repetition should be reduced to a single place in the project. They tend to make the code look rather alien though... Example:
#define PREFIX_LIST(X) \
/* pre val */ \
X(AAA, 10) \
X(BBB, 20) \
X(CCC, 30)
enum // used to generate constants like AAA_NUM = 10,
{
#define PREFIX_ENUMS(pre, val) pre##_NUM = (val),
PREFIX_LIST(PREFIX_ENUMS)
};
#include <stdio.h>
int main (void)
{
// one way to print
#define prefix_to_val(pre) pre##_NUM
printf("AAA_NUM: %d\n", prefix_to_val(AAA));
printf("BBB_NUM: %d\n", prefix_to_val(BBB));
// another alternative
#define STR(s) #s
#define print_all_prefixes(pre, val) printf("%s: %d\n", STR(pre##_NUM), val);
PREFIX_LIST(print_all_prefixes)
return 0;
}
A macro is only processed before compilation and not at runtime. Your code example does not work as you can see here.
Good practice (for example the MISRA coding rules) recommends using macros as little as possible, since they are error-prone.
The preprocessor runs before compilation, and here name_a and name_b are not constants. Even if they were (const char *str is a real constant in C++ but not in C), the substitution is purely textual and the preprocessor does not know the contents of variables.
This works (notice that the parameter should be expanded by another macro in order to get a valid token):
#include <stdio.h>
#define AAA_NUM 10
#define BBB_NUM 20
#define _PREFIX_NUM(string) string##_NUM
#define PREFIX_NUM(string) _PREFIX_NUM(string)
int main(void)
{
#define name_a AAA
#define name_b BBB
printf("AAA_NUM: %d\n", PREFIX_NUM(name_a));
printf("BBB_NUM: %d\n", PREFIX_NUM(name_b));
return 0;
}
There is no way in C to create symbols at runtime and use them. C is a compiled language, and all symbols have to be known before compilation.
The preprocessor (which makes changes at the text level before compilation) does not know anything about the C language.
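If the name really is only known at runtime, one alternative is an ordinary lookup table searched with strcmp. The following is a minimal sketch, not something taken from the answers above: prefix_entry, prefix_table, AS_TABLE_ROW and prefix_num are hypothetical names, and the table is generated with the same X-macro idea shown earlier so that each name is written only once.

```c
#include <string.h>

/* Single list of prefixes and values (X-macro style). */
#define PREFIX_LIST(X) \
    X(AAA, 10)         \
    X(BBB, 20)

struct prefix_entry {
    const char *name;
    int value;
};

/* Turn each list entry into a table row: the prefix becomes a string. */
#define AS_TABLE_ROW(pre, val) { #pre, (val) },
static const struct prefix_entry prefix_table[] = {
    PREFIX_LIST(AS_TABLE_ROW)
};

/* Runtime lookup: returns 1 and stores the value in *out if name is
   found, returns 0 otherwise. */
int prefix_num(const char *name, int *out)
{
    size_t i;
    for (i = 0; i < sizeof prefix_table / sizeof prefix_table[0]; ++i) {
        if (strcmp(prefix_table[i].name, name) == 0) {
            *out = prefix_table[i].value;
            return 1;
        }
    }
    return 0;
}
```

With this, a runtime string such as name_a = "AAA" can be passed to prefix_num(name_a, &value), which is the part the preprocessor cannot do.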

In Brian Gladman's AES implementation, how is aes_encrypt_key128 being mapped to aes_xi?

I'm able to follow the code path up to a certain point. Briefly:
The program accepts an ASCII hexadecimal string and converts it to binary. https://github.com/BrianGladman/aes/blob/master/aesxam.c#L366-L382
If arg[3] is an “E”, it defines an aes_encrypt_ctx struct and passes the key, the calculated key_len value, and the aes_encrypt_ctx struct to aes_encrypt_key. https://github.com/BrianGladman/aes/blob/master/aesxam.c#L409-L412
aes_encrypt_key is defined in aeskey.c. Depending on key_len, the function aes_encrypt_key<NNN> is called. The key and the struct are passed to the function. https://github.com/BrianGladman/aes/blob/master/aeskey.c#L545-L547
But where is the aes_encrypt_key128 function?
This line appears to be my huckleberry:
# define aes_xi(x) aes_ ## x
So hopefully I'm onto something. It's mapping aes_encrypt_key128 to aes_xi(encrypt_key128), right?
AES_RETURN aes_xi(encrypt_key128)(const unsigned char *key, aes_encrypt_ctx cx[1])
{ uint32_t ss[4];
cx->ks[0] = ss[0] = word_in(key, 0);
cx->ks[1] = ss[1] = word_in(key, 1);
cx->ks[2] = ss[2] = word_in(key, 2);
cx->ks[3] = ss[3] = word_in(key, 3);
#ifdef ENC_KS_UNROLL
ke4(cx->ks, 0); ke4(cx->ks, 1);
ke4(cx->ks, 2); ke4(cx->ks, 3);
ke4(cx->ks, 4); ke4(cx->ks, 5);
ke4(cx->ks, 6); ke4(cx->ks, 7);
ke4(cx->ks, 8);
#else
{ uint32_t i;
for(i = 0; i < 9; ++i)
ke4(cx->ks, i);
}
#endif
ke4(cx->ks, 9);
cx->inf.l = 0;
cx->inf.b[0] = 10 * AES_BLOCK_SIZE;
#ifdef USE_VIA_ACE_IF_PRESENT
if(VIA_ACE_AVAILABLE)
cx->inf.b[1] = 0xff;
#endif
MARK_AS_ENCRYPTION_CTX(cx);
return EXIT_SUCCESS;
}
I see some pattern replacement happening here. I guess at this point I was wondering if you could point me to the docs that explain this feature of #define?
Here are some docs which explain token concatenation. You can also take this as a suggestion about where to search systematically for reliable docs:
The C standard. At this website you can download WG14 N1570, which is quite similar to the C11 standard (it's a pre-standard draft, but it's basically the same as the standard except you don't have to pay for it.) There's an HTML version of this document at http://port70.net/~nsz/c/c11/n1570.html, which is handy for constructing links. With that in mind, I can point you at the actual standard definition of ## in §6.10.3.3 of the standard.
The C standard can be a bit rough going if you're not already an expert in C. It makes very few concessions for learners. A more readable document is GCC's C Preprocessor (CPP) manual, although it does not always distinguish between standard features and GCC extensions. Still, it's quite readable and there's lots of useful information. The ## operator is explained in Chapter 3.5.
cppreference.com is better known as a C++ reference site, but it also contains documentation about C. Its language is almost as telegraphic as the C++/C standards, and it is not always 100% accurate (although it is very good), but it has several advantages. For one thing, it combines documentation for different standard versions, so it is really useful for knowing when a feature entered the language (and consequently which compiler version you will need to use the feature). Also, it is well cross-linked, so it's very easy to navigate. Here's what it has to say about the preprocessor; you'll find documentation about ## here.
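To see the mechanism in isolation, here is a small sketch of the same shape as `# define aes_xi(x) aes_ ## x`. The names demo_xi, demo_twice and call_demo are hypothetical and are not part of Brian Gladman's code; the point is only that the pasted name is an ordinary function name once preprocessing is done.

```c
/* Same shape as aes_xi: paste a fixed prefix onto the argument. */
#define demo_xi(x) demo_ ## x

/* After preprocessing, this line defines a function named demo_twice. */
int demo_xi(twice)(int n)
{
    return 2 * n;
}

/* Callers can use the pasted name directly; no macro is needed here. */
int call_demo(void)
{
    return demo_twice(21);
}
```

By the same reasoning, aes_xi(encrypt_key128) in the question's snippet becomes aes_encrypt_key128 under the quoted definition, which would be why no literal definition with that spelling shows up in a text search.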
I've been at this a while but it started to become clear to me that there is pattern matching going on in the pre-processing macros of the aeskey.c file. The only doc I've been able to find is this one.
Pattern Matching
The ## operator is used to concatenate two tokens into one token. This
provides a very powerful way to do pattern matching. Say we want to
write an IIF macro; we could write it like this:
#define IIF(cond) IIF_ ## cond
#define IIF_0(t, f) f
#define IIF_1(t, f) t
However, there is one problem with this approach. A subtle side effect
of the ## operator is that it inhibits expansion. Here's an example:
#define A() 1
//This correctly expands to true
IIF(1)(true, false)
// This will however expand to
IIF_A()(true, false)
// This is because A() doesn't expand to 1,
// because it's inhibited by the ## operator
IIF(A())(true, false)
The way to work around this is to use another indirection. Since this
is commonly done we can write a macro called CAT that will concatenate
without inhibition.
#define CAT(a, ...) PRIMITIVE_CAT(a, __VA_ARGS__)
#define PRIMITIVE_CAT(a, ...) a ## __VA_ARGS__
So now we can write the IIF macro (it's called IIF right now, later we
will show how to define a more generalized way of defining an IF
macro):
#define IIF(c) PRIMITIVE_CAT(IIF_, c)
#define IIF_0(t, ...) __VA_ARGS__
#define IIF_1(t, ...) t
#define A() 1
//This correctly expands to true
IIF(1)(true, false)
// And this will also now correctly expand to true
IIF(A())(true, false)
With pattern matching we can define other operations, such as COMPL
which takes the complement:
#define COMPL(b) PRIMITIVE_CAT(COMPL_, b)
#define COMPL_0 1
#define COMPL_1 0
or BITAND:
#define BITAND(x) PRIMITIVE_CAT(BITAND_, x)
#define BITAND_0(y) 0
#define BITAND_1(y) y
We can define increment and decrement operators as macros:
#define INC(x) PRIMITIVE_CAT(INC_, x)
#define INC_0 1
#define INC_1 2
#define INC_2 3
#define INC_3 4
#define INC_4 5
#define INC_5 6
#define INC_6 7
#define INC_7 8
#define INC_8 9
#define INC_9 9
#define DEC(x) PRIMITIVE_CAT(DEC_, x)
#define DEC_0 0
#define DEC_1 0
#define DEC_2 1
#define DEC_3 2
#define DEC_4 3
#define DEC_5 4
#define DEC_6 5
#define DEC_7 6
#define DEC_8 7
#define DEC_9 8
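The macros quoted above can be exercised by letting the compiler evaluate their expansions as enum constants. This is a sketch under the assumption of a standard-conforming preprocessor; only a few of the INC_ and DEC_ cases are repeated, and the enum constant names (PICK_TRUE, PICK_FALSE, FOUR, FIVE) are hypothetical.

```c
#define PRIMITIVE_CAT(a, ...) a ## __VA_ARGS__

/* The argument c is expanded before PRIMITIVE_CAT pastes it. */
#define IIF(c) PRIMITIVE_CAT(IIF_, c)
#define IIF_0(t, ...) __VA_ARGS__
#define IIF_1(t, ...) t

#define COMPL(b) PRIMITIVE_CAT(COMPL_, b)
#define COMPL_0 1
#define COMPL_1 0

/* Only the cases needed for this sketch are defined here. */
#define INC(x) PRIMITIVE_CAT(INC_, x)
#define INC_3 4
#define INC_5 6
#define DEC(x) PRIMITIVE_CAT(DEC_, x)
#define DEC_6 5

#define A() 1

enum {
    PICK_TRUE  = IIF(A())(10, 20),      /* A() -> 1, selects IIF_1 */
    PICK_FALSE = IIF(COMPL(1))(10, 20), /* COMPL(1) -> 0, selects IIF_0 */
    FOUR       = INC(3),                /* INC_3 */
    FIVE       = DEC(INC(5))            /* INC(5) -> 6, then DEC_6 */
};
```

Running the file through the preprocessor alone (for example with gcc -E) is another way to inspect the expansions without compiling.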

C Preprocessor: Stringify int with leading zeros?

I've seen this topic which describes the "stringify" operation by doing:
#define STR_HELPER(x) #x
#define STR(x) STR_HELPER(x)
#define MAJOR_VER 2
#define MINOR_VER 6
#define MY_FILE "/home/user/.myapp" STR(MAJOR_VER) STR(MINOR_VER)
Is it possible to stringify with leading zeros? Let's say my MAJOR_VER needs to be two characters ("02" in this case) and MINOR_VER four characters ("0006").
If I do:
#define MAJOR_VER 02
#define MINOR_VER 0006
The values will be treated as octal elsewhere in the application, which I don't want.
No clean nor handy way to do it. Just as a challenge, here a possible "solution":
1) create a header file (e.g. "smartver.h") containing:
#undef SMARTVER_HELPER_
#undef RESVER
#if VER < 10
#define SMARTVER_HELPER_(x) 000 ## x
#elif VER < 100
#define SMARTVER_HELPER_(x) 00 ## x
#elif VER < 1000
#define SMARTVER_HELPER_(x) 0 ## x
#else
#define SMARTVER_HELPER_(x) x
#endif
#define RESVER(x) SMARTVER_HELPER_(x)
2) In your source code, wherever you need a version number with leading zeroes:
#undef VER
#define VER ...your version number...
#include "smartver.h"
At this point, the expression RESVER(VER) is expanded as a four-digit sequence of characters, and the expression STR(RESVER(VER)) is the equivalent string (NOTE: I have used the STR macro you posted in your question).
The previous code matches the case of the minor version in your example; it's trivial to modify it to match the "major version" case. But in truth I would use a simple external tool to produce the required strings.
I believe in the example provided by the question sprintf is the correct answer.
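A runtime version along those lines might look like the sketch below. It uses snprintf rather than sprintf so the buffer size is checked; format_my_file is a hypothetical helper name, and the path and version numbers are the ones from the question.

```c
#include <stdio.h>

#define MAJOR_VER 2
#define MINOR_VER 6

/* Builds the file name with a 2-digit major and 4-digit minor version,
   both zero padded. Returns snprintf's result: the number of characters
   that the full string needs, excluding the terminating null. */
int format_my_file(char *buf, size_t size)
{
    return snprintf(buf, size, "/home/user/.myapp%02d%04d",
                    MAJOR_VER, MINOR_VER);
}
```

The macros stay plain decimal integers, so nothing elsewhere in the application sees an octal literal; the cost is that the string is no longer a compile-time literal.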
That said, there are a few instances where you really want to do this, and with the C preprocessor, if there is a will and somebody stupid enough to write the code, there is typically a way.
I wrote the macro FORMAT_3_ZERO(a) which creates a three digit zero padded number using brute force. It is in the file preprocessor_format_zero.h found at https://gist.github.com/lod/cd4c710053e0aeb67281158bfe85aeef as it is too large and ugly to inline.
Example usage
#include "preprocessor_format_zero.h"
#define CONCAT_(a,b) a#b
#define CONCAT(a,b) CONCAT_(a,b)
#define CUSTOM_PACK(a) cp_ ## a __attribute__( \
(section(CONCAT(".cpack.", FORMAT_3_ZERO(a))), \
aligned(1), used))
const int CUSTOM_PACK(23);

Is there a way to get the value of __LINE__ on one line and use that value on other lines?

Essentially, I want to do this:
#include "foo.h"
#include "bar.h"
static const unsigned line_after_includes = __LINE__;
int main()
{
foo(line_after_includes);
bar(line_after_includes);
return 0;
}
Except like this:
#include "foo.h"
#include "bar.h"
#define LINE_AFTER_INCLUDES __LINE__
int main()
{
FOO(LINE_AFTER_INCLUDES);
BAR(LINE_AFTER_INCLUDES);
return 0;
}
Is it possible to make LINE_AFTER_INCLUDES expand to the value of __LINE__ on the line on which LINE_AFTER_INCLUDES was defined, so I can use it with other macros later? I would just use a variable like in the first code snippet, but I need the value as an integer literal, because I will be using it in a switch statement. The alternative is to do
#define LINE_AFTER_INCLUDES 3 // MAKE SURE THIS IS ON LINE IT'S DEFINED AS!!!
which is ugly and harder to maintain. I would settle for that, but I will be doing something a bit more complex than this example...
The best way to do this is given in @Bathsheba's answer here: https://stackoverflow.com/a/24551912/1366431
typedef char LINE_AFTER_INCLUDES[__LINE__];
You can then access the value of __LINE__ at that declaration by writing sizeof(LINE_AFTER_INCLUDES), because its value has been fixed into part of the newly-declared array type. sizeof applied to such a type is a C-level compile-time constant, which works for switch and related things (no division is necessary: the size of char is guaranteed to be 1, as it's the unit of measurement). The only disadvantage of this is that it's C-level rather than preprocessor-level, so it can't be used with #if or for token-pasting.
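The typedef trick can be sketched as follows, including its use as a case label. The function is_marker_line is a hypothetical name. The enum on the same line is an addition of this sketch rather than part of the cited answer: an enumeration constant is also an integer constant expression, so it captures the same line number and serves here as a cross-check.

```c
#include <stddef.h>

/* Both declarations sit on one source line, so both capture the same
   value of __LINE__. */
typedef char LINE_AFTER_INCLUDES[__LINE__]; enum { LINE_AFTER_INCLUDES_ENUM = __LINE__ };

/* sizeof of a fixed-size array type is an integer constant expression,
   so it is allowed as a case label. Returns 1 only for the marker line. */
int is_marker_line(size_t line)
{
    switch (line) {
    case sizeof(LINE_AFTER_INCLUDES):
        return 1;
    default:
        return 0;
    }
}
```

Neither form is visible to #if or usable for token-pasting, which matches the limitation described above.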
(original, which I spent ages typing:)
So the problem here is that macro definitions are lazy; i.e. that macro invocations aren't expanded until they're actually needed to be inserted into some kind of output. Because of this, __LINE__'s expansion is delayed until it's too late.
Luckily, there's a way to force eager evaluation in the preprocessor - the macro only needs to be demanded by some kind of output - not necessarily program body text. Remember that macros can also be expanded to form the arguments to preprocessor directives - and that preprocessor directives are then - having been controlled by the forced expansion - able to create further definitions. This handy observation is the basis of Boost's "evaluated slots" functionality, which uses it to provide things like eager evaluation and mutable preprocessor variable slots.
Unfortunately you can't use Boost slots with __LINE__, as it suffers from the same problem - but we can steal the idea to produce two rather inelegant solutions. They're each ugly in their own special ways.
Option 1:
Use a shell script to produce a large number (a few hundred?) of include-able files with the following naming scheme and contents:
// linedefs/_42_.h
#define LINE_AFTER_INCLUDES 42
...then use as follows:
#define GEN_LINE_INC(L) _GEN_LINE_S(_GEN_LINE_(L))
#define _GEN_LINE_(L) linedefs/_##L##_.h
#define _GEN_LINE_S(S) _GEN_LINE_S2(S)
#define _GEN_LINE_S2(S) #S
#include "foo.h"
#include "bar.h"
#include GEN_LINE_INC(__LINE__)
int main()
{
FOO(LINE_AFTER_INCLUDES);
BAR(LINE_AFTER_INCLUDES);
return 0;
}
In other words, use the #include directive to force expansion of a macro which converts __LINE__ into a filename; including that filename produces the right value for the constant. This is inelegant because it requires a load of extra files and an external tool to generate them, but it's very simple.
Option 2:
Insert a large and ugly block into your main file below the #include section:
#include "foo.h"
#include "bar.h"
// <- This line is the one we generate the number for
#define LINE_BIT_0 0
#define LINE_BIT_1 0
#define LINE_BIT_2 0
#define LINE_BIT_3 0
#define LINE_BIT_4 0
#define LINE_BIT_5 0
#if (__LINE__ - 7) & 1
# undef LINE_BIT_0
# define LINE_BIT_0 1
#endif
#if (__LINE__ - 11) >> 1 & 1
# undef LINE_BIT_1
# define LINE_BIT_1 (1 << 1)
#endif
#if (__LINE__ - 15) >> 2 & 1
# undef LINE_BIT_2
# define LINE_BIT_2 (1 << 2)
#endif
#if (__LINE__ - 19) >> 3 & 1
# undef LINE_BIT_3
# define LINE_BIT_3 (1 << 3)
#endif
#if (__LINE__ - 23) >> 4 & 1
# undef LINE_BIT_4
# define LINE_BIT_4 (1 << 4)
#endif
#if (__LINE__ - 27) >> 5 & 1
# undef LINE_BIT_5
# define LINE_BIT_5 (1 << 5)
#endif
#define LINE_AFTER_INCLUDES (LINE_BIT_0 | LINE_BIT_1 | LINE_BIT_2 | LINE_BIT_3 | LINE_BIT_4 | LINE_BIT_5)
int main() ...
This version uses the #if directive to force expansion of the __LINE__ macro and convert it into bitflags, which are then recombined at the end. This is highly inelegant because it relies on precomputing the distance between each #if and the top of the block, since __LINE__ evaluates to different values in the course of the block; and it can't be factored out and hidden in a separate file, or else __LINE__ wouldn't work. Still, it works, and it doesn't require an external tool.
(In the event you have a huge number of #include lines, extending it to more than 6 bits should be straightforward.)
On the other hand, this sounds like an X/Y problem to me. There has to be an alternative to __LINE__ that would work better for this. If you're counting the number of #included files, perhaps you could use something like a line incrementing Boost.Counter at the end of each one? That way, you also wouldn't be vulnerable to formatting changes (e.g. blank lines in the #include section).

Resources