Confused about C macro expansion and integer arithmetic [duplicate] - c

I have a couple of questions regarding the following snippet:
#include<stdio.h>
#define TOTAL_ELEMENTS (sizeof(array) / sizeof(array[0]))
int array[] = {23,34,12,17,204,99,16};
int main()
{
int d;
for(d=-1;d <= (TOTAL_ELEMENTS-2);d++)
printf("%d\n",array[d+1]);
return 0;
}
Here the output of the code does not print the array elements as expected. But when I add a typecast of (int) to the macro definition of TOTAL_ELEMENTS, as
#define TOTAL_ELEMENTS (int) (sizeof(array) / sizeof(array[0]))
It displays all array elements as expected.
How does this typecast work?
Based on this I have a few questions:
Does it mean that if I have some macro definition such as:
#define AA (-64)
then by default in C, all constants defined as macros are equivalent to signed int?
If yes, is there any constant suffix that I can use to forcibly make a constant defined in a macro behave as an unsigned int (I tried UL and UD; neither worked)?
How can I define a constant in a macro definition so that it behaves as an unsigned int?

Look at this line:
for(d=-1;d <= (TOTAL_ELEMENTS-2);d++)
In the first iteration, you are checking whether
-1 <= (TOTAL_ELEMENTS-2)
The sizeof operator yields an unsigned value (of type size_t), so -1 is converted to unsigned for the comparison and the check fails (-1 signed becomes 0xFFFFFFFF unsigned on 32-bit machines).
A simple change in the loop fixes the problem:
for(d=0;d <= (TOTAL_ELEMENTS-1);d++)
printf("%d\n",array[d]);
To answer your other questions: C macros are expanded text-wise, there is no notion of types. The C compiler sees your loop as this:
for(d=-1;d <= ((sizeof(array) / sizeof(array[0]))-2);d++)
If you want to define an unsigned constant in a macro, use the usual suffix (u for unsigned, ul for unsigned long).
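For completeness, here is a minimal sketch (reusing the array from the question) that shows both comparisons side by side:
#include <stdio.h>

int array[] = {23,34,12,17,204,99,16};

int main(void)
{
    /* sizeof yields a size_t (unsigned), so -1 is converted to a huge
       unsigned value and the comparison is false. */
    if (-1 <= (sizeof(array) / sizeof(array[0])) - 2)
        printf("unsigned comparison: the loop body would run\n");
    else
        printf("unsigned comparison: the loop body never runs\n");

    /* Casting the element count to int keeps the comparison signed. */
    if (-1 <= (int)(sizeof(array) / sizeof(array[0])) - 2)
        printf("signed comparison: the loop body runs as expected\n");

    return 0;
}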

sizeof yields the number of bytes as an unsigned value (its result has type size_t). That's why you need the cast.

Regarding your question about
#define AA (-64)
See Macro definition and expansion in the C preprocessor:
Object-like macros were conventionally used as part of good programming practice to create symbolic names for constants, e.g.
#define PI 3.14159
... instead of hard-coding those numbers throughout one's code. However, both C and C++ provide the const qualifier, which gives another way to avoid hard-coding constants throughout the code.
Constants defined as macros have no associated type. Use const where possible.

Answering just one of your sub-questions:
To "define a constant in a macro" (this is a bit sloppy, you're not defining a "constant", merely doing some text-replacement trickery) that is unsigned, you should use the 'u' suffix:
#define UNSIGNED_FORTYTWO 42u
This will insert an unsigned int literal wherever you type UNSIGNED_FORTYTWO.
Likewise, you often see (in <math.h> implementations, for instance) suffixes used to select the exact floating-point type:
#define FLOAT_PI 3.14f
This inserts a float (i.e. "single precision") floating-point literal wherever you type FLOAT_PI in the code.
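For example (AA is the macro from the question; BB is just a hypothetical unsigned counterpart for this sketch):
#include <stdio.h>

#define AA (-64)   /* plain decimal constant: type int (signed) */
#define BB (64u)   /* 'u' suffix: type unsigned int             */

int main(void)
{
    if (AA < 0)
        printf("AA behaves as a signed int\n");

    /* Unsigned arithmetic wraps around instead of going negative. */
    printf("BB - 100u = %u\n", BB - 100u);
    return 0;
}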

Related

Maximum/minimum value of #define values [duplicate]

When using the #define command in C, what is the maximum or minimum amount the variable can be? For example, is
#define INT_MIN (pow(-2,31))
#define INT_MAX (pow(2,31))
an acceptable definition? I suppose a better way to ask is what is the datatype of the defined value?
#define performs token substitution. If you don't know what tokens are, you can think of this as text substitution on complete words, much like your editor's "search and replace" function could do. Therefore,
#define FOO 123456789123456789123456789123456789123456789
is perfectly valid so far — that just means that the preprocessor will replace every instance of FOO with that long number. It would also be perfectly legal (as far as preprocessing goes) to do
#define FOO this is some text that does not make sense
because the preprocessor doesn't know anything about C, and just replaces FOO with whatever it is defined as.
But this is not the answer you're probably looking for.
After the preprocessor has replaced the macro, the compiler will have to compile whatever was left in its place. And the compiler will almost certainly be unable to compile either of the examples I posted here, and will error out.
Integer constants can be as large as the largest integer type defined by your compiler, which is equivalent to uintmax_t (defined in <stdint.h>). For instance, if this type is 64 bits wide (very common case), the maximum valid integer constant is 18446744073709551615, i.e., 2 to the power of 64 minus 1.
This is independent of how this constant is written or constructed — whether it is done via a #define, written directly in the code, written in hexadecimal, it doesn't matter. The limit is the same, because it is given by the compiler, and the compiler runs after preprocessing is finished.
EDIT: as pointed out by @chux in comments, in recent versions of C (starting with C99), decimal constants will be signed by default unless they carry a suffix indicating otherwise (such as U/u, or a combined type/signedness suffix like ULL). In this case, the maximum valid unsuffixed constant would be whatever fits in an intmax_t value (typically half the max of uintmax_t rounded down); constants with unsigned suffixes can grow as large as an uintmax_t value can. (Note that C integer constants, signed or not, are never negative.)
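A tiny sketch, assuming a typical platform where intmax_t/uintmax_t are 64 bits wide:
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* 2^64 - 1 needs an unsigned suffix; no signed type is required to hold it. */
    printf("%ju\n", (uintmax_t)18446744073709551615u);

    /* An unsuffixed decimal constant is signed and must fit a signed type. */
    printf("%jd\n", (intmax_t)9223372036854775807);
    return 0;
}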
#define INT_MIN (pow(-2,31)) is not acceptable, as it produces a value of the wrong type.
pow() returns a double.
Consider this: INT_MIN % 2 leads to invalid code, as % cannot be done on a double.
Your definition is ill-advised for a number of reasons:
These macro names are used in the standard library header limits.h where they are correctly defined for the toolchain's target platform.
Macros are not part of the C language proper; rather they cause replacement text to be inserted into the code for evaluation by the compiler; as such your definition will cause the function pow() to be called everywhere these macros are used - evaluated at run-time (repeatedly) rather than being a compile-time constant.
The maximum value of a 32 bit two's complement integer is not 2^31 but 2^31 - 1.
The pow() function returns a double not an integer - your macro expressions therefore have type double.
Your macros assume the integer size of the platform to be 32 bit, which need not be the case - the definitions are not portable. This is possibly true also of those in limits.h, but there the entire library is platform specific, and you'd use a different library/toolchain with each platform.
If you must (and you really shouldn't) define your own macros for this purpose, you should:
define them using distinct macro names,
without assumptions regarding the target platform integer width,
use a constant-expression,
use an expression having int type.
For example:
#define PLATFORM_INDEPENDENT_INT_MAX ((int)(~0u >> 1u))
#define PLATFORM_INDEPENDENT_INT_MIN ((int)~(~0u >> 1u))
Using these the following code:
#include <stdio.h>
#include <limits.h>
#define PLATFORM_INDEPENDENT_INT_MAX ((int)(~0u >> 1u))
#define PLATFORM_INDEPENDENT_INT_MIN ((int)~(~0u >> 1u))
int main()
{
printf( "Standard: %d\t%d\n", INT_MIN, INT_MAX);
printf( "Mine: %d\t%d\n", PLATFORM_INDEPENDENT_INT_MIN, PLATFORM_INDEPENDENT_INT_MAX);
return 0;
}
Outputs:
Standard: -2147483648 2147483647
Mine: -2147483648 2147483647

does the c preprocessor handle floating point math constants

Say that I have
#define A 23.9318;
#define B 0.330043;
#define C 5.220628;
I want to do
const unsigned result = (unsigned)(0x01000000 * ( A * B / C )); // unsigned is 32 bit
What I hope for is to have result with fixed decimal representation of the floating point calculations.
I cannot pre-combine A, B and C, as their definitions are not part of my code, and I need it to work if they are changed.
No, the standard C preprocessor operations do not perform floating-point arithmetic.
A C implementation is permitted, but not required, by the C standard to perform these operations at compile-time.
Although it is not required, some C implementations do perform compile-time computation of floating-point expressions.
The following code was compiled using a C99 implementation and produced the indicated result (see the commented value in main()):
#include <ansi_c.h>
#define A 23.9318
#define B 0.330043
#define C 5.220628
#define result A*B/C //1.512945007267325
const unsigned resultB = (unsigned)result*(0x01000000);
int main(void)
{
resultB; //24394701
return 0;
}
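Note that, after macro expansion, the cast in the snippet above binds only to A (a cast has higher precedence than *), so resultB is not quite the value the question asks for. A sketch of the parenthesised version, reusing the macro names from the question (without the stray semicolons), might look like this:
#include <stdio.h>

#define A 23.9318
#define B 0.330043
#define C 5.220628

/* Cast the whole expression: 0x01000000 (2^24) turns the ratio into a
   Q8.24-style fixed-point value. Typical compilers fold this at compile
   time, although the standard does not require them to. */
static const unsigned result = (unsigned)(0x01000000 * (A * B / C));

int main(void)
{
    printf("%u\n", result);
    return 0;
}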

Using multiplied macro in array declaration

I know the following is valid code:
#define SOMEMACRO 10
int arr[SOMEMACRO];
which would result as int arr[10].
If I wanted to make an array 2x size of that (and still need the original macro elsewhere), is this valid code?
#define SOMEMACRO 10
int arr[2 * SOMEMACRO];
which would be int arr[2 * 10] after precompilation. Is this still considered as constant expression by the compiler?
After a quick look it seems to work, but is this defined behavior?
Yes, it will work. The macro is replaced as-is during preprocessing, so arr[2*SOMEMACRO] will become arr[2*10], which is perfectly valid.
To check what is preprocessed, you can use the cc -E foo.c option.
Is this still considered as constant expression by the compiler?
Yes. That's the difference between a constant expression and a literal: a constant expression need not be a single literal, but it can be any expression whose value can be computed at compile time (i.e. a combination of literals or other constant expressions).
(Just for the sake of clarity: of course literals are still considered constant expressions.)
However, in C, the size of the array need not be a compile-time constant. C99 and C11 support variable-length arrays (VLAs), so
size_t sz = // some size calculated at runtime;
int arr[sz];
is valid C as well.
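A small self-contained sketch of both cases (a macro-sized array at file scope and a VLA inside a function):
#include <stdio.h>

#define SOMEMACRO 10

int arr[2 * SOMEMACRO];   /* file scope: the size must be a constant expression */

int main(void)
{
    size_t sz = 5;        /* value only known at run time */
    int vla[sz];          /* C99/C11 variable-length array, block scope only */

    printf("arr has %zu elements\n", sizeof arr / sizeof arr[0]);
    printf("vla has %zu elements\n", sizeof vla / sizeof vla[0]);
    return 0;
}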
Yes you can use this expression. It will not result in UB.
Note that an array dimension may be an integer expression, not just a single constant:
#define i 5
#define j 4
int a[i+j*10] = {0};
The value of the size expression i+j*10 will be calculated during compilation.
Yes, as long as it expands to a valid constant value it's a constant expression.
And if you say it worked, then you know the compiler handled it just fine.
As you know, we can't do
int x;
scanf("%d", &x);
int arr[2 * x];
because that's not a compile-time constant (outside of the C99-style VLAs noted above). But what you've written is a constant expression, so you're good to go.

will ~ operator change the data type?

When I read someone's code I find that he bothered to write an explicit type cast.
#define ULONG_MAX ((unsigned long int) ~(unsigned long int) 0)
When I write code
#include<stdio.h>
int main(void)
{
    unsigned long int max;
    max = ~(unsigned long int)0;
    printf("%lx",max);
    return 0;
}
it works as well. Is it just a meaningless coding style?
The code you read is very bad, for several reasons.
First of all user code should never define ULONG_MAX. This is a reserved identifier and must be provided by the compiler implementation.
That definition is not suitable for use in a preprocessor #if. The _MAX macros for the basic integer types must be usable there.
(unsigned long)0 is just crap. Everybody should just use 0UL, unless you know that you have a compiler that is not compliant with all the recent C standards with that respect. (I don't know of any.)
Even ~0UL should not be used for that value, since unsigned long may (theoretically) have padding bits. -1UL is more appropriate, because it doesn't deal with the bit pattern of the value. It uses the guaranteed arithmetic properties of unsigned integer types. -1 will always be the maximum value of an unsigned type. So ~ may only be used in a context where you are absolutely certain that unsigned long has no padding bits. But as such using it makes no sense. -1 serves better.
"recasting" an expression that is known to be unsigned long is just superfluous, as you observed. I can't imagine any compiler that bugs on that.
Recasting expressions may make sense when they are used in the preprocessor, but only under very restricted circumstances, and they are interpreted differently there.
#if ((uintmax_t)-1UL) == SOMETHING
..
#endif
Here the value on the left evaluates to UINTMAX_MAX in the preprocessor and in later compiler phases. So
#define UINTMAX_MAX ((uintmax_t)-1UL)
would be an appropriate definition for a compiler implementation.
To see the value for the preprocessor, observe that there (uintmax_t) is not a cast but an unknown identifier token inside () and that it evaluates to 0. The minus sign is then interpreted as binary minus and so we have 0-1UL which is unsigned and thus the max value of the type. But that trick only works if the cast contains a single identifier token, not if it has three as in your example, and if the integer constant has a - or + sign.
They are trying to ensure that the type of the value 0 is unsigned long. When you assign zero to a variable, it gets cast to the appropriate type.
In this case, if 0 doesn't happen to be an unsigned long then the ~ operator will be applied to whatever other type it happens to be and the result of that will be cast.
This would be a problem if the compiler decided that 0 is a short or char.
However, the type after the ~ operator should remain the same. So they are being overly cautious with the outer cast, but perhaps the inner cast is justified.
They could of course have specified the correct zero type to begin with by writing ~0UL.
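As a quick sanity check, all three spellings should give the same value (the ~ forms assume unsigned long has no padding bits, as discussed above):
#include <stdio.h>
#include <limits.h>

int main(void)
{
    unsigned long a = (unsigned long)~(unsigned long)0; /* style from the question        */
    unsigned long b = ~0UL;                             /* same value, shorter            */
    unsigned long c = -1UL;                             /* relies only on unsigned
                                                           wrap-around                    */
    printf("%lx %lx %lx %lx\n", a, b, c, ULONG_MAX);
    return 0;
}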

"Type" of symbolic constants?

When is it appropriate to include a type conversion in a symbolic constant/macro, like this:
#define MIN_BUF_SIZE ((size_t) 256)
Is it a good way to make it behave more like a real variable, with type checking?
When is it appropriate to use the L or U (or LL) suffixes:
#define NBULLETS 8U
#define SEEK_TO 150L
You need to do it any time the default type isn't appropriate. That's it.
Giving a constant an explicit type can be important in places where the automatic conversions are not applied, in particular for functions with a variable argument list:
printf("my size is %zu\n", MIN_BUF_SIZE);
could easily crash if the widths of int and size_t were different and you didn't do the cast.
But your macro leaves room for improvement. I'd do that as
#define MIN_BUF_SIZE ((size_t)+256U)
(see the little + sign, there?)
Written like that, the macro can still be used in preprocessor expressions (with #if). This is because in the preprocessor (size_t) evaluates to 0, and thus the result is an unsigned 256 there, too.
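For example (USE_LARGE_BUFFER is just an illustrative name for this sketch):
#include <stdio.h>

#define MIN_BUF_SIZE ((size_t)+256U)   /* the unary + keeps it usable in #if */

#if MIN_BUF_SIZE > 128                 /* in the preprocessor (size_t) collapses to 0 */
#define USE_LARGE_BUFFER 1
#else
#define USE_LARGE_BUFFER 0
#endif

int main(void)
{
    /* %zu matches size_t, so no cast is needed in the variadic call. */
    printf("buffer size: %zu, large: %d\n", MIN_BUF_SIZE, USE_LARGE_BUFFER);
    return 0;
}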
#define just performs token replacement in the preprocessor.
Whatever you write in a #define will be replaced with the replacement text before compilation.
So either way is correct
#define A a
int main()
{
    int A; // A will be replaced by a
}
There are many variations of #define, such as variadic macros or multi-line macros.
But the main purpose of #define is the one explained above.
Explicitly indicating the types in a constant was more relevant in Kernighan and Ritchie C (before ANSI/Standard C and its function prototypes came along).
Function prototypes like double fabs(double value); now allow the compiler to generate proper type conversions when needed.
You still want to explicitly indicate the constant sizes in some cases. The examples that come to my mind right now are bit masks:
#define VALUE_1 ((short) -1) might be 16 bits long while #define VALUE_2 ((char) -1) might be 8. Therefore, given a long x, x & VALUE_1 and x & VALUE_2 would give very different results.
This would also be the case for the L or LL suffixes: the constants would use different numbers of bits.
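A small sketch of the mask idea, using explicit unsigned suffixes instead of the casts above (values chosen only for illustration):
#include <stdio.h>

int main(void)
{
    unsigned long x = 0xAABBCCDDUL;

    printf("%lx\n", x & 0xFFFFu);        /* 16-bit mask -> ccdd     */
    printf("%lx\n", x & 0xFFFFFFFFUL);   /* 32-bit mask -> aabbccdd */
    return 0;
}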
