Suffix for an intmax_t literal - c

There doesn't seem to be a 'J' suffix (a la printf's %jd).
So, is it guaranteed that the LL and ULL suffixes are going to work with intmax_t and uintmax_t types?
#include <stdint.h>
intmax_t yuuge = 123456789101112131411516LL;
or is it possible that there are literals that are too big for the LL suffix? Say, a (hypothetical) system with 32 bit int, 32 bit long, 64 bit long long, 128 bit intmax_t.

No suffix is needed if you just want the value to be faithfully represented. The C language automatically gives integer literals the right type. Suffixes are only needed if you want to force a literal to have higher-rank type than it would naturally have due to its value (e.g. 1UL to get the value 1 as unsigned long rather than int, or -1UL as an alternate expression for ULONG_MAX).
If you do want to force a literal to have type intmax_t, use the INTMAX_C() macro from stdint.h.
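For illustration, a minimal sketch of both points (assuming a 64-bit intmax_t; %jd is the matching printf conversion):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    intmax_t a = 9223372036854775807;      /* fits, so no suffix is needed */
    intmax_t b = INTMAX_C(1000000000000);  /* forced to have type intmax_t */
    printf("%jd %jd\n", a, b);
    return 0;
}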

Is it possible that there are literals that are too big for the LL suffix?
Yes, if the integer constant exceeds the range of (u)intmax_t, it is too big, with or without the LL.
See Assigning 128 bit integer in C for a similar problem.
LL and LLU are not for types. They are for integer constants.
An L or LL ensures the minimum type of a constant. There is no suffix for intmax_t.
123 is an `int`
123L is a `long`
123LL is a `long long`
123456789012345 is a `long long` on OP's hypothetical system even without LL
intmax_t may have the same range as long long - or it may be wider. Both intmax_t and long long are at least 64-bit.
With warnings enabled, a compiler will diagnose a constant that exceeds the intmax_t range. Examples:
// warning: integer overflow in expression
intmax_t yuuge1 = (intmax_t)123456*1000000000000000000 + 789101112131411516;
// warning: overflow in implicit constant conversion [-Woverflow]
intmax_t yuuge2 = 123456789101112131411516;
C provides macros for greatest-width integer constants
The following macro expands to an integer constant expression having the value specified by its argument and the type intmax_t (C11 §7.20.4.2/1):
INTMAX_C(value)
INTMAX_C(value) does have a limitation:
The argument in any instance of these macros shall be an unsuffixed integer constant ... with a value that does not exceed the limits for the corresponding type.
The following does not meet that requirement on machines with 64-bit intmax_t.
// Not so portable code
intmax_t yuuge = INTMAX_C(123456789101112131411516);
Note that preprocessor arithmetic (#if expressions) is also limited to the range of intmax_t/uintmax_t.
Code that attempts to create a constant outside the (u)int64_t range can easily have portability problems. For portability, another coding approach is advised (Avoid such large constants).
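One hedged sketch of such an approach, if a wider-than-64-bit value truly must be carried around: split it into explicit 64-bit halves (the type name wide128, its fields, and the values shown are illustrative, not a standard API):

#include <stdint.h>

/* Illustrative only: carry a 128-bit quantity as two portable 64-bit halves. */
typedef struct {
    uint64_t hi;  /* upper 64 bits */
    uint64_t lo;  /* lower 64 bits */
} wide128;

static const wide128 big = { UINT64_C(0x1A24), UINT64_C(0x0123456789ABCDEF) };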

Related

What is the meaning of _BIG_ENUM=0xFFFFFFFF as the last value of the enum?

I am reading some code and I found _BIG_ENUM=0xFFFFFFFF as the last value of the enum. What is the correct meaning of this line? Also, _BIG_ENUM is not used anywhere in the code.
#define CLS_COMM_CONFIG PORT_0,BAUD_9600,DATA_BIT_8,ONE_STOP,NO_FLOW_CONTROL
#define PLS_COMM_CONFIG PORT_1,BAUD_115200,DATA_BIT_8,ONE_STOP,NO_FLOW_CONTROL
typedef enum _comm_config
{
    _Zero = 0,
    PORT_0 = _Zero,
    BAUD_2400 = _Zero,
    NONE = _Zero,
    HALF_STOP = _Zero,
    DATA_BIT_5 = _Zero,
    NO_FLOW_CONTROL = _Zero,
    _One = 1,
    PORT_1 = _One,
    BAUD_4800 = _One,
    ODD = _One,
    ONE_STOP = _One,
    DATA_BIT_6 = _One,
    _Two = 2,
    PORT_2 = _Two,
    BAUD_9600 = _Two,
    EVEN = _Two,
    TWO_STOP = _Two,
    DATA_BIT_7 = _Two,
    _Three = 3,
    PORT_3 = _Three,
    BAUD_19200 = _Three,
    DATA_BIT_8 = _Three,
    _Four = 5,
    PORT_5 = _Four,
    BAUD_115200 = _Four,
    DATA_BIT_9 = _Four,
    _BIG_ENUM = 0xFFFFFFFF,
} COMMConfig;
It doesn't make any sense and is a bug.
I suppose the programmer didn't quite know how enums work and thought they could enforce enums to become 32 bit by assigning a large integer constant to one of the enumeration constants. This is true, but they picked a bad value which won't work as they thought it would.
The problem is that while enumeration variables may have implementation-defined sizes, enumeration constants such as _BIG_ENUM are always of type int 1).
But 0xFFFFFFFF won't fit in a 32 bit int, so this is a bug. The hex constant 0xFFFFFFFF is actually of type unsigned int (assuming 32 bit int), and since its value won't fit in an int, there is an implementation-defined conversion from unsigned to signed, meaning we end up with the value -1 on 2's complement systems. gcc and clang with strict standard settings even refuse to compile the code when an enumeration constant is given an integer constant larger than INT_MAX.
When faced with an enumeration constant of value -1, the compiler is free to pick any signed type2) for enumeration variables of that type, not necessarily a 32 bit one.
The code can be fixed by changing it to _BIG_ENUM=INT_MAX, (limits.h). Then the enumerated type will either become int or unsigned int.
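A minimal sketch of the fix (only the relevant constant shown):

#include <limits.h>

typedef enum _comm_config
{
    /* ... enumeration constants as before ... */
    _BIG_ENUM = INT_MAX,  /* always representable as an int, unlike 0xFFFFFFFF */
} COMMConfig;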
1) C17 6.7.2.2/1
The expression that defines the value of an enumeration constant shall be an integer constant expression that has a value representable as an int.
2) C17 6.7.2.2/4
Each enumerated type shall be compatible with char, a signed integer type, or an unsigned integer type. The choice of type is implementation-defined,128) but shall be capable of representing the values of all the members of the enumeration.
128) An implementation may delay the choice of which integer type until all enumeration constants have been seen.
Some compilers can optimize enum to smaller word-sizes, but this could cause problem with interoperability when not all of the code is compiled with the same compiler.
If an enum is assigned a 32-bit value this optimization is prevented, forcing this enum to be encoded as a 32 bit integer.
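When the enum must match a 32-bit wire or ABI layout, a compile-time check is more robust than a sentinel constant; a sketch using C11 _Static_assert:

_Static_assert(sizeof(COMMConfig) == 4, "COMMConfig must be 32 bits for interoperability");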

What's the difference between 1024 and 1024L while assigning variable?

What's the difference? Both of them give the same output when used with printf("%ld"):
long x = 1024;
long y = 1024L;
In C source code, 1024 is an int, and 1024L is a long int. During an assignment, the value on the right is converted to the type of the left operand. As long as the rules about which combinations of operands are allowed are obeyed, and the value on the right is in the range of the left operand, there is no difference: the value remains unchanged.
In general, a decimal constant without a suffix is an int, and a decimal constant with an L is a long int. However, if its value is too big to be represented in the usual type, it will automatically be the next larger type. For example, in a C implementation where the maximum int is 2147483647, the constant 3000000000 in source code will be a long int even though it has no suffix. (Note that this rule means the same constant in source code can have different types in different C implementations.) If a long int is not big enough, it will be long long int. If that is not big enough, it can be a signed extended integer type, if the implementation supports one.
The rules above are for decimal constants. There are also hexadecimal constants (which begin with 0x or 0X) and octal constants (which begin with 0; for example, 020 is octal for sixteen, unlike 20, which is decimal for twenty), and these may have signed or unsigned types. The different integer types are important because overflow and conversions behave differently depending on type. It is easy to take integer operations as a matter of course and assume they work, but it is important to learn the details to avoid problems.
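A small sketch of these rules (assuming a 32-bit int and a 64-bit long):

#include <stdio.h>

int main(void) {
    printf("%d %d\n", 20, 020);  /* prints "20 16": 020 is octal */
    /* 3000000000 (decimal)    -> long: decimal constants stay signed           */
    /* 0xB2D05E00 (same value) -> unsigned int: hex constants may go unsigned   */
    return 0;
}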

Why do we put suffixes following numeric literals? [duplicate]

I am pretty sure this question has been answered, though I didn't manage to find it. I know the rules for type conversions like this: even if we assign 1 (which is by default of type signed int) to an unsigned int variable, the variable will have the value 1 in either case. In other words, why would I want to put a U suffix, other than to avoid a type conversion (if I intend to assign that value mostly to unsigned ints)?
Literal value suffixes are most important when you need to precisely control the type. For example 4 billion fits in an unsigned 32-bit int but not a signed one. So if you do this, your compiler may complain:
printf("%u", 4000000000);
warning: format specifies type 'unsigned int' but the argument has type 'long'
Also one may use the float suffix f to ensure a value is used that way in arithmetic, such as 1.0f / x (which could also be written 1. / x or 1.0 / x). This is important if x might be an integral type but the result is meant to be floating point.
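A small sketch of why the floating-point form matters:

#include <stdio.h>

int main(void) {
    int x = 4;
    double a = 1 / x;     /* 0.0  : both operands are integers, so integer division */
    double b = 1.0f / x;  /* 0.25 : x is converted to floating point first          */
    printf("%g %g\n", a, b);
    return 0;
}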
An integer constant does not need a suffix to have a given value (aside from decimal values representable as some unsigned type but not as any signed one). The trick is what type that constant has and how it is used.
A suffix can suppress a warning when a decimal integer constant cannot be represented as a signed long long:
// pow(2,64) - 1
unsigned long long x1 = 18446744073709551615; // warning
unsigned long long x2 = 18446744073709551615u;// no warning
Consider @Eugene Sh.'s example:
-1 >> 1 // Right shifting negative values is implementation-defined behavior
// versus
-1U >> 1 // Well defined. -1U has the positive value of `UINT_MAX`
Sometimes simple constants like 1u are used for a gentle type conversion:
// The product is the wider of the type of `x` or `unsigned`
x*1u
@John Zwinck provides a good printf() example.
The suffixes u and U ensure the type is some unsigned integer like unsigned or wider.
There is no suffix to ensure the type is signed. Use decimal constants.
The suffixes l and L ensure the type is at least long/unsigned long without changing its signedness.
The suffixes ll and LL ensure the type is at least long long/unsigned long long without changing its signedness.
There is no suffix to ensure the type is narrower than int/unsigned.
There is no standard suffix to ensure the type is intmax_t/uintmax_t.
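The resulting types can be observed directly with C11 _Generic; a small sketch:

#include <stdio.h>

#define TYPE_NAME(x) _Generic((x),            \
    int: "int",                               \
    unsigned int: "unsigned int",             \
    long: "long",                             \
    unsigned long: "unsigned long",           \
    long long: "long long",                   \
    unsigned long long: "unsigned long long", \
    default: "other")

int main(void) {
    puts(TYPE_NAME(1));    /* int                */
    puts(TYPE_NAME(1U));   /* unsigned int       */
    puts(TYPE_NAME(1L));   /* long               */
    puts(TYPE_NAME(1LL));  /* long long          */
    puts(TYPE_NAME(1ULL)); /* unsigned long long */
    return 0;
}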

Are the L and LL integer suffixes ever needed? [duplicate]

From an example:
unsigned long x = 12345678UL;
We have always learnt that the compiler needs to see only "long" in the above example to set 4 bytes (in 32 bit) of memory. The question is why should we use L/UL in long constants even after declaring the variable to be a long.
When a suffix L or UL is not used, the compiler uses the first type that can contain the constant from a list (see details in the C99 standard, clause 6.4.4.1:5; for a decimal constant, the list is int, long int, long long int).
As a consequence, most of the times, it is not necessary to use the suffix. It does not change the meaning of the program. It does not change the meaning of your example initialization of x for most architectures, although it would if you had chosen a number that could not be represented as a long long. See also codebauer's answer for an example where the U part of the suffix is necessary.
There are a couple of circumstances when the programmer may want to set the type of the constant explicitly. One example is when using a variadic function:
printf("%lld", 1LL); // correct, because 1LL has type long long
printf("%lld", 1); // undefined behavior, because 1 has type int
A common reason to use a suffix is ensuring that the result of a computation doesn't overflow. Two examples are:
long x = 10000L * 4096L;
unsigned long long y = 1ULL << 36;
In both examples, without suffixes, the constants would have type int and the computation would be made as int. In each example this incurs a risk of overflow. Using the suffixes means that the computation will be done in a larger type instead, which has sufficient range for the result.
As Lightness Races in Orbit puts it, the literal's suffix comes before the assignment. In the two examples above, simply declaring x as long and y as unsigned long long is not enough to prevent the overflow in the computation of the expressions assigned to them.
Another example is the comparison x < 12U where variable x has type int. Without the U suffix, the compiler types the constant 12 as an int, and the comparison is therefore a comparison of signed ints.
int x = -3;
printf("%d\n", x < 12); // prints 1 because it's true that -3 < 12
With the U suffix, the comparison becomes a comparison of unsigned ints. “Usual arithmetic conversions” mean that -3 is converted to a large unsigned int:
printf("%d\n", x < 12U); // prints 0 because (unsigned int)-3 is large
In fact, the type of a constant may even change the result of an arithmetic computation, again because of the way “usual arithmetic conversions” work.
Note that, for decimal constants, the list of types suggested by C99 does not contain unsigned long long. In C90, the list ended with the largest standardized unsigned integer type at the time (which was unsigned long). A consequence was that the meaning of some programs was changed by adding the standard type long long to C99: the same constant that was typed as unsigned long in C90 could now be typed as a signed long long instead. I believe this is the reason why in C99, it was decided not to have unsigned long long in the list of types for decimal constants.
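A sketch of how that change can alter a program's meaning (assuming a 32-bit long and a 64-bit long long):

#include <stdio.h>

int main(void) {
    /* C90: 4294967295 is unsigned long, so -1 converts to ULONG_MAX and this prints 0.
       C99: 4294967295 is (signed) long long, both operands stay signed, and this prints 1. */
    printf("%d\n", 4294967295 > -1);
    return 0;
}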
Because numeric literals are typically of type int. The UL/L suffix tells the compiler that the literal is not of type int. Note that an unsuffixed hexadecimal constant may take an unsigned type, so the suffix can change the result of an expression, e.g. assuming 32-bit int and 64-bit long:
long i = -0x80000000;   // 0x80000000 is an unsigned int; unary minus wraps modulo 2^32
long j = -0x80000000L;  // 0x80000000L is a long; unary minus negates normally
Here the values on the right are converted to a long (32-bit -> 64-bit).
"-0x80000000", whose operand is an unsigned int, wraps around, so i receives the positive value 2147483648.
"-0x80000000L", whose operand is a 64-bit long, negates normally, so j receives the negative value -2147483648.
The question is why should we use L/UL in long constants even after declaring it to be a long.
Because it's not "after"; it's "before".
First you have the literal, then it is converted to whatever the type is of the variable you're trying to squeeze it into.
They are two objects. The type of the target is designated by the unsigned long keywords, as you've said. The type of the source is designated by this suffix because that's the only way to specify the type of a literal.
Related to this post is the question of why a u suffix is needed at all.
A reason for u is to allow an integer constant greater than LLONG_MAX in decimal form.
// Likely to generate a warning.
unsigned long long limit63bit = 18446744073709551615; // 2^64 - 1
// OK
unsigned long long limit63bit = 18446744073709551615u;

Why do we cast constant literals

I am wondering why we cast constants when using the #define preprocessor directive.
For example:
#define WIDTH 128
Would it be equal to -128 on an 8-bit platform?
What if I change it like this:
#define WIDTH 128U
What would it be equal to on an 8-bit platform?
What is the default sizeof for constant integers like the above? Does their length/type depend on the platform architecture, or on the type of the literal value they hold?
Sorry about my bad English.
Defining WIDTH as 128 poses no problems; int is at least 16 bits wide on all conforming platforms.
Defining WIDTH as 128U would make it an unsigned integer constant literal. Since the value fits in an unsigned int (mandated to be at least 16 bit wide), it has type unsigned int. sizeof(WIDTH) evaluates to sizeof(unsigned int), which is entirely platform specific.
Using this suffix is not recommended. It would have surprising side effects:
if (WIDTH > -1) {
    printf("This will never print\n");
}
Since WIDTH expands to 128U, an unsigned constant, the comparison is performed as an unsigned comparison, -1 is converted to unsigned and becomes UINT_MAX, a value much larger than 128. Don't do this.
If you subsequently store WIDTH into a char variable, you may have a problem. It would actually not make a difference whether you define it as 128 or 128U; if the char type is 8 bit and signed, the value is out of range either way, and the result of the conversion is implementation-defined. On most platforms the value stored would indeed be -128, but you cannot rely on that.
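A minimal sketch of that problem (assuming plain char is signed and 8 bits wide):

#define WIDTH 128
char c = WIDTH;  /* 128 does not fit in a signed 8-bit char: the result
                    is implementation-defined, commonly -128 */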
More importantly, you should use all the help the compiler can give you by enabling all compiler warnings and making them errors:
gcc -Wall -Wextra -Werror
or
clang -Weverything -Werror
Very few of these warnings are annoying, most of them are very useful and point to silly mistakes, typos and oversights.
First of all, 128 is not equal to -128 on an 8-bit platform.
Second, this has nothing to do with the preprocessor. What the preprocessor does is replace WIDTH with whatever it's defined as. That is, the question is why you write 128 or 128u in your source.
The suffix u is not about type casting; it indicates the type of the literal. In this example 128 is a literal with value 128 of type int, while 128u is a literal with value 128 of type unsigned int. It's not a problem immediately here, but if you start using values larger than 32767 you could run into problems. For example:
#define WIDTH 256u
#define HEIGHT 192u
unsigned npixels = WIDTH * HEIGHT;
It should be noted that the suffixes are required to make this portable (the platform could use 16-bit ints, in which case the multiplication as int would overflow, which means undefined behavior).
Also note that newer C standards (but not the antique ones) will extend the type of the literal to become as large as necessary, if possible. For example, the literal 32768 means a signed integral type with value 32768; if int isn't large enough to hold that signed number, then a larger type is used.
The sizeof these literals is the same as sizeof(int), as the types of the literals are int and unsigned int. The actual value of sizeof(int) is implementation-defined.
Giving C99 chapters because I don't have the C11 document at hand.
ISO/IEC 9899:1999, 6.4.4.1 Integer constants
The type of an integer constant is the first of the corresponding list in which its value can be represented.
For decimal constants without suffix:
int
long int
long long int
For decimal constants with u or U suffix:
unsigned int
unsigned long int
unsigned long long int
ISO/IEC 9899:1999, 5.2.4.2.1 Sizes of integer types
The width of integer types is implementation-defined, as is the binary representation.
INT_MAX -- the largest value an int can take -- is guaranteed to be at least +32767.
UINT_MAX -- the largest value an unsigned int can take -- is guaranteed to be at least 65535.
ISO/IEC 9899:1999, 6.3.1.8 Usual arithmetic conversions
If you compare an int with an unsigned int, the int will be implicitly converted to unsigned int for the comparison. Similar for the short / long / long long types. As @chqrlie pointed out, this can be a problem; your compiler should give you a warning if this happens (you are always compiling with -Wall -Wextra / /W3 enabled, aren't you?).
Summary
Your constants will fit into an int / unsigned int even on an 8-bit machine. If they did not, the compiler would use the next larger type for them (instead of casting the value).
As for why we do it...
If you, for example, intend to use WIDTH for comparisons with the result of sizeof(), or the return code of strlen(), or anything else that is unsigned by nature, you would want WIDTH to have the same value domain, i.e. being able to hold all possible values.
That is why you would want WIDTH to be unsigned as well.
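A small sketch of that use (the fits() helper is illustrative):

#include <string.h>

#define WIDTH 128U

int fits(const char *s) {
    /* strlen() returns size_t (unsigned); an unsigned WIDTH keeps
       the comparison unsigned throughout. */
    return strlen(s) < WIDTH;
}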

Resources