C Preprocessor: Evaluate macro early - c

Consider the following setup:
a.h
#define A 5
#define B A
#undef A
#define A 3
a.c
#include "a.h"
#include <stdio.h>
int main()
{
printf("%d\n", B);
return 0;
}
While this very reasonably prints 3, is there a way to make it print 5, i.e. do the substitution of 5 for A already at line two of a.h?

No, there's no way to do that. Unless you know all the possible values of A, and they are always integers, in which case you can laboriously test each one in turn:
#if A == 0
# define B 0
#elif A == 1
# define B 1
#elif A == 2
# define B 2
/* ... and a very long etc. */
#endif
If your use case only involves integers, you have more options. You could, for example, declare B to be a static const int or an enum (depending on language) instead of a macro, which would obviously use the current value of the macro. If you really, really want macros, the Boost preprocessing library has an implementation of the laborious sequence of #ifs above (with some cleverness to reduce the number of preprocessor statements needed from N to log(N)).
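As a minimal sketch of the enum alternative (the helper function names are made up for illustration): the enumerator's initializer is ordinary program text, so A is expanded at the point of the declaration, and B keeps that value even after A is redefined.

```c
#define A 5

/* The initializer is ordinary C text, so A is expanded right here. */
enum { B = A };

#undef A
#define A 3

/* Hypothetical helpers, only here to make the two values observable. */
static int current_a(void) { return A; }  /* uses the later definition of A */
static int captured_b(void) { return B; } /* still the value A had at the enum */
```

Calling captured_b() gives 5 and current_a() gives 3. The price is that B is no longer a macro, so it cannot be tested in an #if.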
There is no macro substitution in the #define preprocessor directive; this fact is covered by §6.10 para. 7 of the C standard (§16 para. 6 of the C++ standard, with identical wording):
The preprocessing tokens within a preprocessing directive are not subject to macro expansion unless otherwise stated.
In the description of the #if and #include directives, the standard specifies that macro replacement does occur, which is why the #if solution above works (and the Boost implementation, which also uses a computed #include).
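To make that concrete, here is a small sketch of the #if technique from the start of this answer, restricted to two candidate values (an assumption made only to keep it short; the #error arm is my addition). Because #if expands A while #define does not, B ends up defined as a literal rather than as the token A.

```c
#define A 5

/* #if expands A, so the test sees the current value, 5. */
#if A == 3
# define B 3
#elif A == 5
# define B 5
#else
# error "A has a value this sketch does not handle"
#endif

#undef A
#define A 3

/* B's replacement list is the literal 5, so it no longer follows A. */
_Static_assert(B == 5, "B captured the earlier value of A");
_Static_assert(A == 3, "A itself has the new value");
```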

Yes. Boost's preprocessor library (a set of portable includes, not an extended preprocessor) includes support for "mutable" macro definitions. Instead of defining a macro to expand to a value directly, you can define it to expand to a reference to a mutable slot, which can be changed because it expands the value "assigned" to it early. In this case you're less interested in the ability to change the value than in the fact that this early expansion means it can grab the value out of A at a point ahead of both the use of B and the redefinition of A.
a.h
#include <boost/preprocessor/slot/slot.hpp>
#define A 5
#define B BOOST_PP_SLOT(1)
// "assign" A to B
#define BOOST_PP_VALUE A
#include BOOST_PP_ASSIGN_SLOT(1)
#undef A
#define A 3
a.c
#include "a.h"
#include <stdio.h>
int main()
{
printf("%d\n", B); // 5
return 0;
}
Support is limited to integers. It takes advantage of the fact that #if directives force expansion of any contained macros (so do #include and #line, although #line is not very useful for this purpose), and uses them to build up an equivalent integer value for the slot being assigned to, stored in hidden backend macros. This way it can "extract" a value from A, and B can then refer to the value itself even if A changes or is removed.

Related

Restricting preprocessing-numbers in a C preprocessor to only handle valid floating and integer constants

I'm currently implementing a C11 compiler and I'm aiming to integrate the preprocessor into the rest of the compiler and not have it as a stand-alone component. As such, the preprocessor can safely assume that its output will be valid in the following stages.
Reading about the preprocessing number token, it seems like it only exists to simplify the implementation of a stand-alone preprocessor. With a simplified format for numbers, the preprocessor doesn't have to handle the full complexity of numeric constants. Quoting the GCC docs:
The purpose of this unusual definition is to isolate the preprocessor from the full complexity of numeric constants. It does not have to distinguish between lexically valid and invalid floating-point numbers, which is complicated.
As the preprocessor will be integrated into the rest of the compiler framework, this is not an issue for me.
Section 6.4.8, paragraph 4 [Preprocessing numbers; Semantics] of the C11 standard says
A preprocessing number does not have type or a value; it acquires both after a successful conversion (as part of translation phase 7) to a floating constant token or an integer constant token.
So it seems like every preprocessing-number will be converted into a floating or integer constant later on in the compilation process. I cannot find any other references to preprocessing-numbers in the standard, so it seems like this is their only purpose, but I may be wrong.
My question is, would it be valid for the preprocessor to restrict preprocessing-numbers to only valid integer and floating point constants? Or are there cases where having such a restriction would cause otherwise valid programs to fail?
There are certainly valid programs which include pp-numbers not convertible to an integer or float. The common case is a preprocessing token which does not become a token.
For example, it might be stringified:
#include <stdio.h>
#define STRINGIFY_(X) #X
#define STRINGIFY(V) STRINGIFY_(V)
#define VERSION 3.4.6a
#define PROGNAME foo
int main(void) {
printf("%s-%s\n", STRINGIFY(PROGNAME), STRINGIFY(VERSION));
}
Moreover, the version number in the above example could have been produced with token concatenation, another way preprocessing tokens never become program tokens:
#include <stdio.h>
#define STRINGIFY_(X) #X
#define STRINGIFY(V) STRINGIFY_(V)
#define CONCAT3_(x,y,z) x##y##z
#define CONCAT3(x,y,z) CONCAT3_(x,y,z)
#define CONCAT_V(mj, mn, pl) CONCAT3(mj, ., CONCAT3(mn, ., pl))
#define MAJOR 3
#define MINOR 4
#define PATCH 6a
#define VERSION CONCAT_V(MAJOR, MINOR, PATCH)
#define PROGNAME foo
int main(void) {
printf("%s-%s\n", STRINGIFY(PROGNAME), STRINGIFY(VERSION));
}
There are other ways for a pp-number (or any other preprocessing token) to never be converted to a token:
As the argument to a macro which does not use the corresponding parameter in its replacement text.
In program text in a preprocessor conditional whose controlling expression is false.
This is often used "in the wild" to hide not-completely written code inside an #if 0 … #endif block; the excluded code may have almost arbitrary syntax errors, as long as comments and strings are terminated, including invalid pp-numbers and even stray punctuation. (# is a valid preprocessing token which cannot be converted to a token.)
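A short sketch of both cases, assuming a C11 compiler; IGNORE_ARG and ignored_value are names invented for this example. Neither malformed pp-number below survives to translation phase 7, so neither is ever converted to a token.

```c
/* The parameter x does not appear in the replacement list. */
#define IGNORE_ARG(x) 42

/* 1.2.3e+xyz is a single pp-number, but not a valid constant.
   It is discarded during macro replacement. */
static int ignored_value(void) { return IGNORE_ARG(1.2.3e+xyz); }

#if 0
/* Skipped group: only scanned for directives, never compiled. */
int broken = 12abc..3 + 0xZZ;
#endif
```

A preprocessor that rejected every pp-number that is not a valid integer or floating constant would refuse this translation unit, even though it is a valid program.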

What does #define (integer) do?

This is certainly a duplicate, and I'll remove it as soon as I run into an answer. I just can't find what I'm looking for.
What do these two lines in C mean?
#define NN_DIGITS ( 4)
#define MM_MARKS_DONE (255)
I know what #define and #define () do, where #define () executes the macro in (), but I don't know this particular case (with an integer).
Is it actually redundant to write down () to define an integer value? Shall these values be interpreted bitwise? What will happen if we shan't write ()? Will 4 and 255 be taken as a string?
Keyword: "execute". This is the root of your misunderstanding.
Macros aren't executed. They are substituted. The preprocessor replaces the token NN_DIGITS with the token sequence ( 4). As a matter of fact, it would replace it with practically any token sequence. Even #define NN_DIGITS ( %34 (DDd ] is a valid macro definition (courtesy of my cat), despite the fact we most certainly don't want to try and expand it.
Is it actually redundant to write down () to define an integer value?
From a practical standpoint, yes, it's redundant. But some would probably do it to maintain consistency with other macros where the resulting expressions can depend on the presence of parentheses.
Shall these values be interpreted bitwise?
Everything is bitwise to a computer.
What will happen if we shan't write ()? Will 4 and 255 be taken as a string?
No, it will just be the tokens 4 and 255 as opposed to the sequences ( 4) and (255) respectively. The preprocessor deals only in tokens; it knows practically nothing about the type system. If the macro appears in a program, say:
int a = NN_DIGITS;
It will be turned by the preprocessor into:
int a = ( 4);
And then compiled further by the other steps in the pipeline of turning a program into an executable.
The parentheses do absolutely nothing in this case - they're just noise.
There's a general rule of survival saying that function-like macros should always:
Wrap each occurrence of a macro parameter in parentheses, and
Wrap the whole macro in outer parentheses
That is:
#define ADD(x,y) x + y // BAD
#define ADD(x,y) (x) + (y) // BAD
#define ADD(x,y) ((x) + (y)) // correct
This is to dodge issues of operator precedence and will be addressed by any decent beginner-level learning material.
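A sketch of what goes wrong with each of the BAD variants (the _V1/_V2 names and the helper functions are made up so that all three versions can coexist in one file):

```c
#define ADD_V1(x, y) x + y         /* BAD: nothing parenthesized */
#define ADD_V2(x, y) (x) + (y)     /* BAD: no outer parentheses */
#define ADD(x, y)    ((x) + (y))   /* correct */

/* Intended result: (1 << 1) + 1, i.e. 3. */
static int v1_shift(void) { return ADD_V1(1 << 1, 1); } /* 1 << 1 + 1, i.e. 1 << 2 */
static int v2_shift(void) { return ADD_V2(1 << 1, 1); } /* (1 << 1) + (1) */

/* Intended result: 2 * (1 + 2), i.e. 6. */
static int v2_scaled(void) { return 2 * ADD_V2(1, 2); } /* 2 * (1) + (2) */
static int ok_scaled(void) { return 2 * ADD(1, 2); }    /* 2 * ((1) + (2)) */
```

Here v1_shift() returns 4 instead of 3, and v2_scaled() returns 4 instead of 6. A compiler with warnings enabled may flag the first one, which is rather the point.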
Overly pedantic people who've learned the above rules tend to apply them to all macros, not just function-like macros. But if the macro contains nothing but a single integer constant (a single preprocessor token), then the parentheses achieve absolutely nothing.
Is it actually redundant to write down () to define an integer value?
Yes, it just adds noise.
Shall these values be interpreted bitwise?
Macros are best regarded as mostly just text replacement. What you do with the value in the calling code is no business of the macro.
What will happen if we shan't write ()?
The code will get slightly easier to read.
Will 4 and 255 be taken as a string?
No, why would they?
There is a specific case where the parentheses cause harm though, and that is when you use macros to convert a preprocessor constant to a string. Suppose I have this program:
#include <stdio.h>
#define STR(x) #x
#define AGE(x) STR(x)
#define DOG_AGE 5
int main(void)
{
puts("My dog is " AGE(DOG_AGE) " years old.");
}
AGE expands the macro DOG_AGE to 5 and then the next macro, STR, converts it to a string. So this prints My dog is 5 years old. because the # operator converts the preprocessor token exactly as it is given. If I add "useless noise parentheses" to the macro:
#define DOG_AGE (5)
Then the output becomes My dog is (5) years old. Not what I intended.

C Preprocessor #if handling of non-integer constant

I have the following code snippet to let me flip easily between double and float representations of floating point values:
#define FLOATINGPOINTSIZE 64
#if FLOATINGPOINTSIZE == 64
typedef double FP_TYPE;
#define FP_LIT_SUFFIX
#else
typedef float FP_TYPE;
#define FP_LIT_SUFFIX f
#endif
At another location I had written the following:
/* set floating point limits used when initialising values that will be subject
* to the MIN() MAX() functions
*
* uses values from float.h */
#if FP_TYPE == double
#define FPTYPE_MAX DBL_MAX
#define FPTYPE_MIN DBL_MIN
#else
#define FPTYPE_MAX FLT_MAX
#define FPTYPE_MIN FLT_MIN
#endif
whereas I think I should have written:
#if FLOATINGPOINTSIZE == 64
I have -Wall compiler setting to give me plenty of warnings but this didn't get flagged up as an issue. Possibly -Wall is completely independent of the preprocessor though?
My question is how is the preprocessor interpreting:
#if FP_TYPE == double
The meaning is obvious to the programmer, but I'm not sure what the preprocessor makes of it?
It's got to be a bug, right?
I have -Wall compiler setting to give me plenty of warnings but this didn't get flagged up as an issue.
The code is valid, but you are right to be concerned.
My question is how is the preprocessor interpreting:
#if FP_TYPE == double
Good question.
The meaning is obvious to the programmer, but I'm not sure what the preprocessor makes of it?
The intended meaning seems obvious, and as the code's author, you know what you meant. But what you appear to have intended indeed is not how the preprocessor interprets that conditional.
The expression in a preprocessor conditional is interpreted as an integer constant expression. Just like in a C if statement, if the expression evaluates to 0 then the condition is considered false, and otherwise it is considered true. All macros in the expression are expanded before it is evaluated, and any remaining identifiers are replaced with 0. Details are presented in section 6.10.1 of the standard.
Supposing that there is no in-scope definition of a macro named either FP_TYPE or double (and a typedef is not a macro definition), your conditional is equivalent to
#if 0 == 0
which is always true.
It's got to be a bug, right?
The preprocessing result will not be what you intended, so it's a bug in your code. The compiler, on the other hand, is correct to accept it.
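You can watch this happen with a small sketch (BRANCH_TAKEN is a name I made up; the typedef is deliberately float to show that it has no influence on the conditional):

```c
typedef float FP_TYPE;  /* a typedef, not a macro: invisible to #if */

/* Neither FP_TYPE nor double is a macro, so this is evaluated as 0 == 0. */
#if FP_TYPE == double
#define BRANCH_TAKEN 1
#else
#define BRANCH_TAKEN 0
#endif

/* The "double" branch is selected even though FP_TYPE is float. */
_Static_assert(BRANCH_TAKEN == 1, "identifiers in #if evaluate to 0");
```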
The meaning is obvious to the programmer, but I'm not sure what the preprocessor makes of it?
It's got to be a bug, right?
It's a bug from a user's point of view but it is not a bug in the preprocessor.
#if FP_TYPE == double
is interpreted as
#if 0 == 0
since neither FP_TYPE nor double is a known symbol for the pre-processor.
From https://gcc.gnu.org/onlinedocs/cpp/If.html#If:
Identifiers that are not macros, which are all considered to be the number zero. This allows you to write #if MACRO instead of #ifdef MACRO, if you know that MACRO, when defined, will always have a nonzero value. Function-like macros used without their function call parentheses are also treated as zero.
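For completeness, a sketch of the limits block keyed off the macro instead of the typedef, as the question suggests. The #elif and #error arms are my addition, not part of the original code. GCC's -Wundef option also warns when an identifier that is not a macro is evaluated in an #if, which would have flagged the original line.

```c
#include <float.h>

#define FLOATINGPOINTSIZE 64

#if FLOATINGPOINTSIZE == 64
typedef double FP_TYPE;
#define FPTYPE_MAX DBL_MAX
#define FPTYPE_MIN DBL_MIN  /* smallest positive normalized double */
#elif FLOATINGPOINTSIZE == 32
typedef float FP_TYPE;
#define FPTYPE_MAX FLT_MAX
#define FPTYPE_MIN FLT_MIN  /* smallest positive normalized float */
#else
#error "FLOATINGPOINTSIZE must be 32 or 64"
#endif
```

Note that DBL_MIN and FLT_MIN are small positive values, not the most negative ones; if the goal is a starting value for a running maximum, -FPTYPE_MAX is probably what you want.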

"Type" of symbolic constants?

When is it appropriate to include a type conversion in a symbolic constant/macro, like this:
#define MIN_BUF_SIZE ((size_t) 256)
Is it a good way to make it behave more like a real variable, with type checking?
When is it appropriate to use the L or U (or LL) suffixes:
#define NBULLETS 8U
#define SEEK_TO 150L
You need to do it any time the default type isn't appropriate. That's it.
Typing a constant can be important in places where the automatic conversions are not applied, in particular in calls to functions with a variable argument list:
printf("my size is %zu\n", MIN_BUF_SIZE);
This could easily misbehave (it is undefined behavior) when the widths of int and size_t differ and you omit the cast.
But your macro leaves room for improvement. I'd do that as
#define MIN_BUF_SIZE ((size_t)+256U)
(see the little + sign, there?)
When given like that the macro can still be used in preprocessor expressions (with #if). This is because in the preprocessor the identifier size_t is not a macro and is replaced by 0, so the expression becomes ((0)+256U) and the result is an unsigned 256 there, too.
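A minimal sketch of that dual use; BUF_IS_LARGE and min_buf_size are hypothetical names used only for this illustration:

```c
#include <stddef.h>

#define MIN_BUF_SIZE ((size_t)+256U)

/* In #if, size_t is not a macro, so this is evaluated as ((0)+256U) >= 128. */
#if MIN_BUF_SIZE >= 128
#define BUF_IS_LARGE 1
#else
#define BUF_IS_LARGE 0
#endif

/* In ordinary C code the same text is a cast applied to +256U. */
static size_t min_buf_size(void) { return MIN_BUF_SIZE; }
```

Without the + sign, the #if would see ((0) 256), which is a syntax error in a preprocessor expression.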
#define just tells the preprocessor to do token replacement.
Whatever you write in a #define, the macro name is replaced with the replacement text before compilation.
So either way is correct.
#define A a
int main(void)
{
int A; // A will be replaced by a
return 0;
}
There are many variations of #define, such as variadic macros or multiline macros.
But the main purpose of #define is the one explained above.
Explicitly indicating the types in a constant was more relevant in Kernighan and Ritchie C (before ANSI/Standard C and its function prototypes came along).
Function prototypes like double fabs(double value); now allow the compiler to generate proper type conversions when needed.
You still want to explicitly indicate the constant sizes in some cases. The examples that come to my mind right now are bit masks:
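#define VALUE_1 ((unsigned short) -1) might be 16 bits long while #define VALUE_2 ((unsigned char) -1) might be 8. Therefore, given a long x, x & VALUE_1 and x & VALUE_2 would give very different results. (The unsigned types matter here: with short and signed char, both constants are -1 and sign-extend to all ones when converted to long, so the two masks would be identical.)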
This would also be the case for the L or LL suffixes: the constants would use different numbers of bits.
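A sketch of the bit-mask point, with made-up names. I use unsigned types here on purpose: converting -1 to an unsigned type gives all ones in that width, and the result zero-extends when combined with a long. The hex values in the comments assume 8-bit char and 16-bit short, which is common but not guaranteed.

```c
#define MASK_SHORT ((unsigned short) -1) /* 0xFFFF with a 16-bit short */
#define MASK_CHAR  ((unsigned char) -1)  /* 0xFF with an 8-bit char */

/* Each mask keeps only as many low bits of x as the type is wide. */
static long low_short_bits(long x) { return x & MASK_SHORT; }
static long low_char_bits(long x)  { return x & MASK_CHAR; }
```

Under those assumptions, low_short_bits(0x12345678L) keeps 0x5678 while low_char_bits(0x12345678L) keeps only 0x78.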

C preprocessor #if expression

I am a bit confused about the type of expression we can use with the #if preprocessor directive in the C language. I tried the following code, and it isn't working. Please explain and provide examples of expressions that can be used with the preprocessor.
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
int c=1;
#if c==1
#define check(a) (a==1)?a:5
#define TABLE_SIZE 100
#endif
int main()
{
int a = 0, b;
printf("a = %d\n", a);
b = check(a);
printf("a = %d %d\n", a, TABLE_SIZE);
system("PAUSE");
return 0;
}
The preprocessor cannot use variables from the C program in expressions - it can only act on preprocessor macros. So when you try to use c in the preprocessor you don't get what you might expect.
However, you also don't get an error because when the preprocessor tries to evaluate an identifier that isn't defined as a macro, it treats the identifier as having a value of zero.
So when you hit this snippet:
#if c==1
#define check(a) (a==1)?a:5
#define TABLE_SIZE 100
#endif
The c used by the preprocessor has nothing to do with the variable c from the C program. The preprocessor looks to see if there's a macro defined for c. Since there isn't, it evaluates the following expression:
#if 0==1
which is false of course.
Since you don't appear to use the variable c in your program, you can do the following to get behavior in line with what you're trying to do:
#define C 1
#if C==1
#define check(a) (a==1)?a:5
#define TABLE_SIZE 100
#endif
(Note that I also made the macro name uppercase in keeping with convention for macro names.)
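Putting both cases side by side, as a sketch (SAW_VARIABLE is a made-up name used to record which branch the preprocessor took):

```c
int c = 1;  /* a C variable; the preprocessor never sees its value */

/* c is not a macro, so this is evaluated as 0 == 1. */
#if c == 1
#define SAW_VARIABLE 1
#else
#define SAW_VARIABLE 0
#endif

#define C 1

/* C is a macro, so this is evaluated as 1 == 1. */
#if C == 1
#define TABLE_SIZE 100
#endif
```

SAW_VARIABLE ends up as 0 even though the variable c holds 1 at run time, while TABLE_SIZE is defined because the macro C expands to 1.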
The preprocessor is run on the text, before any compilation is done. It doesn't know how to parse C. What you probably wanted instead of int c=1; was
#define C 1
and the test works the way you had it:
#if C == 1
The key here is that this is all defined before compile time. The preprocessor doesn't care about C variables, and certainly doesn't care what their values are.
Note that the convention is to have preprocessor macro names defined in ALL_CAPS.
In your example c is a symbol that only the compiler knows about, and it has no value until run time, whereas preprocessor expressions are evaluated at build time (in fact, as the name suggests, before the compiler processes the code), so they can only operate on preprocessor symbols, which do exist at build time.
Moreover, such expressions must be compile-time constants, or more exactly preprocessing-time constants, since compiler constant expressions such as sizeof(...) are also not available during preprocessing.
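As a sketch of that distinction, assuming a C11 compiler (INT_AT_LEAST_32_BITS is a hypothetical name): limits that exist as macros in <limits.h> can be tested with #if, while anything involving sizeof has to wait for the compiler, for example in a _Static_assert.

```c
#include <limits.h>

/* INT_MAX is a macro, so the preprocessor can evaluate this. */
#if INT_MAX >= 2147483647
#define INT_AT_LEAST_32_BITS 1
#else
#define INT_AT_LEAST_32_BITS 0
#endif

/* sizeof is only understood by the compiler proper, so this check
   is done in the language rather than in an #if. */
_Static_assert(sizeof(int) * CHAR_BIT >= 16, "int must have at least 16 bits");
```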
The preprocessor does not evaluate C variables. It "preprocesses" the source code before it is compiled and thus has its own language. Instead do this:
#define c 1
#if c==1
#define check(a) (a==1)?a:5
#define TABLE_SIZE 100
#endif
...

Resources