What does #define (integer) do?

Certainly a dup, and I shall remove it as soon as I run into an answer; I just can't find what I'm looking for.
What do these two lines in C mean?
#define NN_DIGITS ( 4)
#define MM_MARKS_DONE (255)
I know what #define and #define () do, where #define () executes the macro in (), but I don't know this particular case (with an integer).
Is it actually redundant to write () when defining an integer value? Should these values be interpreted bitwise? What will happen if we don't write ()? Will 4 and 255 be taken as a string?

Keyword: "execute". This is the root of your misunderstanding.
Macros aren't executed. They are substituted. The preprocessor replaces the token NN_DIGITS with the token sequence ( 4). As a matter of fact, it would replace it with practically any token sequence. Even #define NN_DIGITS ( %34 (DDd ] is a valid macro definition (courtesy of my cat), despite the fact that we most certainly don't want to try and expand it.
Is it actually redundant to write () when defining an integer value?
From a practical standpoint, yes, it's redundant. But some would probably do it to maintain consistency with other macros, where the resulting expressions can depend on the presence of parentheses.
Should these values be interpreted bitwise?
Everything is bitwise to a computer.
What will happen if we don't write ()? Will 4 and 255 be taken as a string?
No, it will just be the tokens 4 and 255, as opposed to the sequences ( 4) and (255) respectively. The preprocessor deals only in tokens; it knows practically nothing about the type system. If the macro appears in a program, say:
int a = NN_DIGITS;
It will be turned by the preprocessor into:
int a = ( 4);
And then compiled further by the other steps in the pipeline of turning a program into an executable.
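If you want to watch the substitution happen, most compilers can stop after the preprocessing stage; a minimal check, assuming gcc (any compiler with a -E flag works the same way):
/* demo.c -- print the preprocessed output with: gcc -E demo.c */
#define NN_DIGITS ( 4)
int a = NN_DIGITS;   /* shows up in the output as: int a = ( 4); */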

The parentheses do absolutely nothing in this case - they're just noise.
There's a general rule of survival saying that function-like macros should always:
Wrap each occurrence of a macro parameter in parentheses, and
Wrap the whole macro body in an outer pair of parentheses
That is:
#define ADD(x,y) x + y // BAD
#define ADD(x,y) (x) + (y) // BAD
#define ADD(x,y) ((x) + (y)) // correct
This is to dodge issues of operator precedence and will be addressed by any decent beginner-level learning material.
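For instance, with the BAD definition the substitution silently changes the arithmetic:
#define ADD(x,y) x + y       // BAD
int r = 2 * ADD(1, 2);       // expands to 2 * 1 + 2 == 4, not 6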
Overly pedantic people who've learned the above rules tend to apply them to all macros, not just function-like macros. But when the macro contains nothing but a single integer constant (a single preprocessing token), the parentheses achieve absolutely nothing.
Is it actually redundant to write () when defining an integer value?
Yes, it just adds noise.
Should these values be interpreted bitwise?
Macros are mostly to be regarded as text replacement. What you do with the value in the calling code is no business of the macro.
What will happen if we don't write ()?
The code will get slightly easier to read.
Will 4 and 255 be taken as a string?
No, why would they be?
There is a specific case where the parentheses cause harm though, and that is when you use macros to convert a preprocessor constant to a string. Suppose I have this program:
#include <stdio.h>

#define STR(x) #x
#define AGE(x) STR(x)
#define DOG_AGE 5

int main(void)
{
    puts("My dog is " AGE(DOG_AGE) " years old.");
}
AGE expands the macro DOG_AGE to 5, and then the next macro converts it to a string. So this prints My dog is 5 years old., because the # operator converts the preprocessing token exactly as it is given. If I add "useless noise parentheses" to the macro:
#define DOG_AGE (5)
Then the output becomes My dog is (5) years old. Not what I intended.
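The two levels of macros (STR and AGE) are also no accident: the # operator stringizes its argument exactly as written, without expanding it first, so the extra level is what forces DOG_AGE to be expanded before stringizing:
puts(STR(DOG_AGE));   /* prints DOG_AGE -- the argument is not expanded */
puts(AGE(DOG_AGE));   /* prints 5 -- the argument is expanded, then stringized */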

Related

Restricting preprocessing-numbers in a C preprocessor to only handle valid floating and integer constants

I'm currently implementing a C11 compiler, and I'm aiming to integrate the preprocessor into the rest of the compiler rather than have it as a stand-alone component. As such, the preprocessor can safely assume that its output will be valid in the following stages.
Reading about the preprocessing-number token, it seems like it only exists to simplify the implementation of a stand-alone preprocessor: by simplifying the format of numbers, the preprocessor doesn't have to handle the full complexity of numeric constants. Quoting the GCC docs:
The purpose of this unusual definition is to isolate the preprocessor from the full complexity of numeric constants. It does not have to distinguish between lexically valid and invalid floating-point numbers, which is complicated.
As the preprocessor will be integrated to the rest of the compiler framework, this is not an issue for me.
In section 6.4.8.4 [Preprocessing numbers; Semantics] of the C11 standard, it claims
A preprocessing number does not have type or a value; it acquires both after a successful conversion (as part of translation phase 7) to a floating constant token or an integer constant token.
So it seems like every preprocessing-number will be converted into a floating or integer constant later on in the compilation process. I cannot find any other references to preprocessing-numbers in the standard, so it seems like this is their only purpose, but I may be wrong.
My question is, would it be valid for the preprocessor to restrict preprocessing-numbers to only valid integer and floating point constants? Or are there cases where having such a restriction would cause otherwise valid programs to fail?
There are certainly valid programs which include pp-numbers that are not convertible to an integer or floating constant. The common case is a preprocessing token which never becomes a token.
For example, it might be stringified:
#include <stdio.h>

#define STRINGIFY_(X) #X
#define STRINGIFY(V) STRINGIFY_(V)
#define VERSION 3.4.6a
#define PROGNAME foo

int main(void) {
    printf("%s-%s\n", STRINGIFY(PROGNAME), STRINGIFY(VERSION));
}
Moreover, the version number in the above example could have been produced with token concatenation, another way preprocessing tokens never become program tokens:
#include <stdio.h>
#define STRINGIFY_(X) #X
#define STRINGIFY(V) STRINGIFY_(V)
#define CONCAT3_(x,y,z) x##y##z
#define CONCAT3(x,y,z) CONCAT3_(x,y,z)
#define CONCAT_V(mj, mn, pl) CONCAT3(mj, ., CONCAT3(mn, ., pl))
#define MAJOR 3
#define MINOR 4
#define PATCH 6a
#define VERSION CONCAT_V(MAJOR, MINOR, PATCH)
#define PROGNAME foo
int main(void) {
    printf("%s-%s\n", STRINGIFY(PROGNAME), STRINGIFY(VERSION));
}
There are other ways for a pp-number (or any other preprocessing token) to never be converted to a token:
As the argument to a macro which does not use the corresponding parameter in its replacement text.
In program text in a preprocessor conditional whose controlling expression is false.
This is often used "in the wild" to hide not-completely-written code inside an #if 0 … #endif block; the excluded code may have almost arbitrary syntax errors, as long as comments and strings are terminated, including invalid pp-numbers and even stray punctuation. (# is a valid preprocessing token which cannot be converted to a token.)
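A minimal sketch of that last case:
#if 0
int version = 0x5..2z;   /* 0x5..2z is a valid pp-number but not a valid constant */
stray punctuation like # is tolerated here as well
#endif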

How to use the token pasting operator with a variable number of arguments?

I thought of having a generic version of #define concatenate(a, b, c) a ## b ## c
I tried it like this:
#include <stdio.h>

#define concatenate(arg1, ...) arg1 ## __VA_ARGS__

int main()
{
    int dob = 121201;
    printf("%d", concatenate(d, o, b));
    return 0;
}
I also tried many other ways:
#define concatenate(arg1, ...) arg1 ## ##__VA_ARGS__
#define concatenate(...) ## ##__VA_ARGS__
#define concatenate(...) ##__VA_ARGS__
#define concatenate(arg1, ...) arg1 ## ...
#define concatenate(arg1, ...) arg1 ## concatenate(##__VA_ARGS__)
Alas, all my attempts failed. I was wondering if it is even possible to do this in any way?
It's possible. Jens Gustedt's interesting P99 macro library includes the macro P99_PASTE, which has precisely the signature of your concatenate, as well as the same semantics.
The mechanics which P99 utilizes to implement that function are complex, to say the least. In particular, they rely on several hundred numbered macros which compensate for the fact that the C preprocessor does not allow recursive macro expansion.
Another useful explanation of how to do iteration in the C preprocessor is found in the documentation for the Boost Preprocessor Library, particularly the topic on reentrancy.
Jens' documentation for P99_PASTE emphasizes the fact that the macro pastes left-to-right to avoid the ambiguity of ##. That might need a bit of explanation.
The token-paste (##) operator is a binary operator; if you want to paste more than two segments into a single token, you need to do it a pair at a time, which means that all intermediate results must be valid tokens. That can require a certain amount of caution. Consider, for example, this macro which attempts to add an exponent to the end of an integer:
#define EXPONENT(INT, EXP) INT ## E ## EXP
(This will only work if both macro arguments are literal integers. In order to allow the macro arguments to be macros, we would need to introduce another level of indirection in the macro expansion. But that's not the point here.)
What we will almost immediately discover is that EXPONENT(42,-3) doesn't work, because -3 is not a single token. It's two tokens, - and 3, and the paste operator will only paste the -. That will result in a two-token sequence 42E- 3, which will eventually lead to a compiler error.
42E and 42E- are valid tokens, by the way. They are pp-numbers (preprocessing numbers), which are any combination of dots, digits, letters and exponents, provided that the token starts with a digit or a dot followed by a digit. (Exponents are one of the letters E or P, possibly lower-case and possibly followed by a sign. Otherwise, sign characters cannot appear in a pp-number.)
So we could try to fix this by asking the user to separate the sign from the number:
#define EXPONENT(INT, SIGN, EXP) INT ## E ## SIGN ## EXP
EXPONENT(42,-,3)
That will work if the ## operators are evaluated from left-to-right. But the C standard does not impose any particular evaluation order of multiple ## operators. If we're using a preprocessor which works from right to left, then the first thing it will try to do is to paste - and 3, which won't work because -3 is not a single token, just as with the simpler definition.
Now, I can't offer an example of a compiler which will fail on this macro, since I don't have a right-to-left preprocessor handy. Both gcc and clang evaluate ## left-to-right, and I think that's far and away the most common evaluation order. But you can't rely on that; in order to write portable code, you need to ensure that the paste operators are evaluated in the expected order. And that's the guarantee offered by P99_PASTE.
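If you don't want to pull in a library, a minimal sketch of forcing the order yourself is to nest pairwise pastes, since the nesting makes the grouping explicit (PASTE2 is the usual expand-then-paste idiom; EXPONENT as in the example above):
#define PASTE2_(a, b) a ## b
#define PASTE2(a, b) PASTE2_(a, b)   /* expand the arguments, then paste */

/* groups as ((INT ## E) ## SIGN) ## EXP on any conforming preprocessor */
#define EXPONENT(INT, SIGN, EXP) PASTE2(PASTE2(PASTE2(INT, E), SIGN), EXP)

double d = EXPONENT(42, -, 3);       /* pastes to the single pp-number 42E-3 */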
Note: It's possible that there is an application in which right-to-left pasting is required, but after thinking about it for some time, the only example I could come up with of a token paste which would work right-to-left but not left-to-right is the following rather obscure corner case:
#define DOUBLE_HASH %: ## % ## :
and I can't think of any plausible context in which that might come up.

Is a repeated macro invocation via token concatenation unspecified behavior?

The C11 standard admits vagueness with regard to at least one situation that can arise in macro expansion: when a function-like macro expands to its own un-invoked name, which is then invoked by the next preprocessing token. The example given in the standard is this:
#define f(a) a*g
#define g(a) f(a)
f(2)(9)   // may produce either 2*f(9) or 2*9*g
That example does not clarify what happens when a macro, M, is expanded, and all or part of the result contributes via token concatenation to a second preprocessing token, M, which is invoked.
Question: Is such an invocation blocked?
Here is an example of such an invocation. This issue tends to only come up when using a fairly complicated set of macros, so this example is contrived for the sake of simplicity.
// arity gives the arity of its args as a decimal integer (good up to 4 args)
#define arity(...) arity_help(__VA_ARGS__,4,3,2,1,)
#define arity_help(_1,_2,_3,_4,_5,...) _5
// define 'test' to mimic 'arity' by calling it twice
#define test(...) test_help_A( arity(__VA_ARGS__) )
#define test_help_A(k) test_help_B(k)
#define test_help_B(k) test_help_##k
#define test_help_1 arity(1)
#define test_help_2 arity(1,2)
#define test_help_3 arity(1,2,3)
#define test_help_4 arity(1,2,3,4)
// does this expand to '1' or 'arity(1)'?
test(X)
test(X) expands to test_help_A( arity(X) ), which invokes test_help_A on rescanning, which expands its arg before substitution, and so is identical to test_help_A(1), which produces test_help_B(1), which produces test_help_1. This much is clear.
So, the question comes in here. test_help_1 is produced using a character, 1, that came from an expansion of arity. So can the expansion of test_help_1 invoke arity again? My versions of gcc and clang each think so.
Can anyone argue that the interpretations made by gcc and clang are required by something in the standard?
Is anyone aware of an implementation that interprets this situation differently?
I think that gcc's and clang's interpretations are correct. The two expansions of arity are not in the same call path: the first descends from the expansion of test_help_A's argument, the second from the expansion of test_help_A itself.
The idea of these rules is to guarantee that there can't be infinite recursion, and that is guaranteed here: there is progress in the evaluation of the macro between the two calls.
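Compare the classic self-reference case, where re-expansion is blocked because the second occurrence sits inside the expansion of the same macro:
#define foo foo bar
foo   /* expands exactly once, to: foo bar -- the inner foo is not re-expanded */
In the question's example there is no such self-reference on a single path, so nothing blocks the second arity.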

Struct vars size defined by macro returns "this operator is not allowed in a constant expression"

Any solution to this compiler ERROR?
#define TYPE_TOTAL 10
#define MAX_SIZE 20
#define NBITS2(n) ((n&2)?1:0)
#define NBITS4(n) ((n&(0xC))?(2+NBITS2(n>>2)):(NBITS2(n)))
#define NBITS8(n) ((n&0xF0)?(4+NBITS4(n>>4)):(NBITS4(n)))
#define NBITS16(n) ((n&0xFF00)?(8+NBITS8(n>>8)):(NBITS8(n)))
#define NBITS32(n) ((n&0xFFFF0000)?(16+NBITS16(n>>16)):(NBITS16(n)))
#define NBITS(n) (n==0?0:NBITS32(n)+1)
typedef struct StatsEntry_s
{
    uint32 type:NBITS(TYPE_TOTAL);
    uint32 subtype:NBITS(MAX_SIZE);
} StatsEntry_t;
The compiler flags the line uint32 type:NBITS(TYPE_TOTAL); with:
"this operator is not allowed in a constant expression".
Edit: John Bollinger correctly commented that your macros are actually valid constant expressions; indeed, the code compiles and runs with gcc 4.9.3, VS C++ (http://webcompiler.cloudapp.net/) as well as clang 3.5.1. We are not sure why you have problems -- which compiler are you using? But anyway, if you are stuck with a compiler that can't handle it:
I think you can substitute the conditions with arithmetic, the way I suggested in my comment for the simplest macro. For example for NBITS4(n):
The original is
#define NBITS4(n) ((n&(0xC))?(2+NBITS2(n>>2)):(NBITS2(n)))
which makes two things dependent on the 0xC bits: the bit shift count and the added 2. Let's see. If there is a match we want to add 2: ((n & 0xC) != 0)*2 is 2 if one of those bits is set (note the parentheses: != binds tighter than &, so n & 0xC != 0 would mean n & (0xC != 0)). For the bit shift we consider that n>>0 is n, so that we can again compute 0 or 2 depending on n&0xC the same way as before. That should yield
#define NBITS4(n) ((((n)&0xC) != 0)*2 + NBITS2((n) >> (((n)&0xC) != 0)*2))
I'm not sure whether that's the easiest way, and I haven't tested it exhaustively, but it should be a start.
A technicality: always parenthesize your macro arguments; defines are text replacement and can expand to surprising expressions.
On a general note: Computations are much faster than jumps on modern CPUs. Sometimes using booleans as numbers instead of as conditions for branching gives surprising performance gains. The downside is that it can be absolutely unreadable, like here.
You could redefine your macros without the ternary operator. This is only good up to 255 but could be extended easily:
#define NBITS2(n) (((!(n&2))*(n&1))|(n&2))
#define NBITS4(n) (((!(n&4))*NBITS2(n))|(((n&4)>>2)*3))
#define NBITS8(n) (((!(n&8))*NBITS4(n))|(((n&8)>>3)*4))
#define NBITS16(n) (((!(n&16))*NBITS8(n))|(((n&16)>>4)*5))
#define NBITS32(n) (((!(n&32))*NBITS16(n))|(((n&32)>>5)*6))
#define NBITS64(n) (((!(n&64))*NBITS32(n))|(((n&64)>>6)*7))
#define NBITS(n) (((!(n&128))*NBITS64(n))|(((n&128)>>7)*8))
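A quick sanity check of the branch-free variant (a sketch; _Static_assert is C11, and note these macros yield the bit count directly, without the separate +1 of the original NBITS):
_Static_assert(NBITS(10) == 4, "10 fits in 4 bits");
_Static_assert(NBITS(20) == 5, "20 fits in 5 bits");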

"Type" of symbolic constants?

When is it appropriate to include a type conversion in a symbolic constant/macro, like this:
#define MIN_BUF_SIZE ((size_t) 256)
Is it a good way to make it behave more like a real variable, with type checking?
When is it appropriate to use the L or U (or LL) suffixes:
#define NBULLETS 8U
#define SEEK_TO 150L
You need to do it any time the default type isn't appropriate. That's it.
Typing a constant can be important in places where the automatic conversions are not applied, in particular functions with a variable argument list:
printf("my size is %zu\n", MIN_BUF_SIZE);
could easily crash when the widths of int and size_t differ and the cast is missing.
But your macro leaves room for improvement. I'd do that as
#define MIN_BUF_SIZE ((size_t)+256U)
(see the little + sign, there?)
When given like that, the macro can still be used in preprocessor expressions (with #if). This is because the preprocessor replaces any remaining identifiers in an #if expression with 0, so (size_t) evaluates to (0) and the result is an unsigned 256 there, too; in ordinary code, it parses as the cast (size_t) applied to +256U.
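A minimal sketch of the same macro serving both the preprocessor and the compiler:
#include <stddef.h>

#define MIN_BUF_SIZE ((size_t)+256U)

#if MIN_BUF_SIZE > 128      /* the preprocessor evaluates ((0)+256U) > 128 */
char buf[MIN_BUF_SIZE];     /* the compiler sees a size_t-typed 256 */
#endif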
#define is just preprocessor token substitution.
Whatever you write in a #define will be replaced with the replacement text before compilation.
So either way is correct:
#define A a

int main(void)
{
    int A; // A will be replaced by a
}
There are many variations of #define, such as variadic macros or multi-line macros (see the sketch below), but the main purpose of #define is the single one explained above.
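For illustration, a sketch of those two variations (the names are my own):
#include <stdio.h>

/* variadic macro: __VA_ARGS__ stands in for the trailing arguments
   (strict C requires at least one argument after fmt) */
#define LOG(fmt, ...) fprintf(stderr, fmt "\n", __VA_ARGS__)

/* multi-line macro: backslash continuations, wrapped in do/while(0)
   so it behaves like a single statement */
#define SWAP_INT(a, b) do { \
    int tmp_ = (a);         \
    (a) = (b);              \
    (b) = tmp_;             \
} while (0)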
Explicitly indicating the types in a constant was more relevant in Kernighan and Ritchie C (before ANSI/Standard C and its function prototypes came along).
Function prototypes like double fabs(double value); now allow the compiler to generate proper type conversions when needed.
You still want to explicitly indicate the constant sizes in some cases. The example that comes to my mind right now is bit masks:
#define VALUE_1 ((unsigned short) -1) might be 16 bits of ones while #define VALUE_2 ((unsigned char) -1) might be 8 (the unsigned types matter: the signed versions would both sign-extend to -1 under integer promotion). Therefore, given a long x, x & VALUE_1 and x & VALUE_2 would give very different results.
This would also be the case for the L or LL suffixes: the constants would use different numbers of bits.
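A small sketch of that masking difference (assuming 16-bit short and 8-bit char; the names are illustrative):
#define MASK_LOW16 ((unsigned short) -1)   /* promotes to 0x0000FFFF */
#define MASK_LOW8  ((unsigned char) -1)    /* promotes to 0x000000FF */

long x = 0x12345678L;
long low16 = x & MASK_LOW16;   /* 0x5678 */
long low8  = x & MASK_LOW8;    /* 0x78 */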
