Terminal L in C

In C, are the following equivalent:
long int x = 3L; (notice the L)
and
long int x = 3;
They seem to be the same. In case they are, which one should be used? Should the L be specified explicitly?
If they are different, what is the difference?

3.14L is a long double literal, while 3.14 is a double literal. It won't make much difference in this case, since both are being used to initialize a long int. The result will be 3.
EDIT:
Ok, 3L is a long literal, while 3 is an int literal. It still won't make much difference, since the int will be "promoted" to a long. The result will be the same in both cases.
EDIT 2:
One place it might make a difference is something like this:
printf("%ld\n", 123);
This is undefined behavior, since the format string specifies a long and only an int is being passed. This would be correct:
printf("%ld\n", 123L);

A decimal integer constant without suffix has - depending on its value - the type int, long, long long, or possibly an implementation-defined extended signed integer type with range greater than long long.
Adding the L suffix means the type will be at least long, the LL suffix means that the type will be at least long long.
If you use the constant to initialize a variable, adding a suffix makes no difference, as the value will be converted to the target-type anyway. However, the type of the constant may well be relevant in more complex expressions as it affects operator semantics, argument promotion and possibly other things I didn't think of right now. For example, assuming a 16-bit int type,
long foo = 42 << 20;
invokes undefined behaviour, whereas
long bar = 42L << 20;
is well-defined.

Related

Declaration of Long (Modifier) in c

Original question
I have a piece of code here:
unsigned long int a =100000;
int a =100000UL;
Do the above two lines represent the same thing?
Revised question
#include <stdio.h>
int main(void)
{
long int x=50000*1024000;
printf("%ld\n",x);
return 0;
}
For a long int, my compiler uses 8 bytes, so the max range is (2^63-1). So here 50000*1024000 results in something which is definitely less than the max range of long int. So why does my compiler warn of overflow and give the wrong output?
Original question
The two definitions are not the same.
The types of the variables are different — unsigned long versus (signed) int. The behaviour of these types is quite different because of the difference in signedness. They also may have quite different ranges of valid values.
Technically, the numeric constants are different too; the first is a (signed) int unless int cannot hold the value 100,000, in which case it is a (signed) long instead. That value is converted to unsigned long and assigned to the first a. The other constant is an unsigned long because of the UL integer suffix, and it is converted to int using the normal rules; if int cannot hold the value 100,000, the result of that conversion is implementation-defined. It is legitimate, though very unusual these days, for int to be a 16-bit signed type (sizeof(int) == 2 with CHAR_BIT == 8), which could not hold 100,000; that is normally the size of a short, and int is normally a 32-bit signed type, but the standard does not rule out the alternative.
Most likely, the two variants of a both end up holding the value 100,000, but they are not the same because of the difference in signedness.
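A minimal sketch of where the signedness difference shows up (hypothetical code, assuming a 32-bit int and a 64-bit unsigned long; the second variable is renamed b here so that both definitions can coexist):
#include <stdio.h>
int main(void)
{
unsigned long int a = 100000;
int b = 100000UL;
printf("%lu\n", a - 200000);  /* unsigned arithmetic wraps to a huge positive value */
printf("%d\n", b - 200000);   /* signed arithmetic gives -100000 */
return 0;
}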
Revised question
The arithmetic is done in terms of the two operands of the * operator, and those are 50000 and 1024000. Each of those fits in a 32-bit int, so the calculation is done as int — and the result would be 51200000000, but that requires at least 36 bits to represent the value, so you have 32-bit arithmetic overflow, and the result is undefined behaviour.
After the arithmetic is complete, the int result is converted to 64-bit long — not before.
The compiler is correct to warn, and because you invoked undefined behaviour, anything that is printed is 'correct'.
To fix the code, you can write:
#include <stdio.h>
int main(void)
{
long x = 50000L * 1024000L;
printf("%ld\n", x);
return 0;
}
Strictly, you only need one of the two L suffixes, but symmetry suggests using both. You could use one or two (long) casts instead if you prefer. You can save on spaces too, if you wish, but they help the readability of the code.
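For instance, the cast variant could look like this (a sketch, equivalent under the same assumptions; one cast is enough because the other operand is then converted to long before the multiplication):
long x = (long)50000 * 1024000;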
The long int and int are not necessarily the same, but they might be. Unsigned and signed are not the same thing. Numerical constants can represent the same value without being the same thing, as in 100000 and 100000UL (the former being a signed int, the latter being an unsigned long).

Are literal suffixes needed in standard C?

There is a question already answering the particular case of variable declaration, but what about other literal constant uses?
For example:
uint64_t a;
...
int32_t b = a / 1000000000;
Is the last piece of code equivalent to the next one in any standard C compiler?
uint64_t a;
...
int32_t b = (int32_t)(a / UINT64_C(1000000000));
In other words, are the xINTn_C macros needed at all (supposing we use explicit casts in the cases where the implicit conversion would be wrong)?
EDIT
When the compiler reads 1000000000, is it allowed to store it as an int in its internal representation (dropping all overflowing bits), or must it keep it at the highest possible precision (long long) until it resolves the type of the whole expression? Is this implementation-defined behavior, or is it mandated by the standard?
Your second example isn't valid C99 and looks like C++. Perhaps what you want is a cast, i.e. (int32_t)(a / UINT64_C(1000000000))?
Is there a difference between a / UINT64_C(1000000000) and a / 1000000000? No, they'll end up with the same operation. But I don't think that's really your question.
I think your question boils down to what will the type of the integer literal "1000000000" be? Will it be an int32_t or an int64_t? The answer in C99 comes from §6.4.4.1 paragraph 5:
The type of an integer constant is the first of the corresponding list in which its value can be represented.
For decimal constants with no suffix, the list is int, long int, long long int. So the first literal will almost certainly be an int (depending on the size of an int, which will likely be 32 bits and therefore large enough to hold one billion). The second literal with the UINT64_C macro will likely be either an unsigned long or an unsigned long long, depending on the platform. It will be whatever type corresponds to uint64_t.
So the types of the constants are not the same. The first will be signed while the second is unsigned. And the second will most likely have more "longs", depending on the compiler's sizes of the basic int types.
In your example, it makes no difference that the literals have different types because the / operator will need to promote the literal to the type of a (because a will be of equal or greater rank than the literal in any case). Which is why I didn't think that was really your question.
For an example of why UINT64_C() would matter, consider an expression where the result changes if the literals are promoted to a larger type. I.e., overflow will occur in the literals' native types.
int32_t a = 10;
uint64_t b = 1000000000 * a; // overflows 32-bits
uint64_t c = UINT64_C(1000000000) * a; // constant is 64-bit, no overflow
To compute c, the compiler will need to promote a to uint64_t and perform a 64-bit multiplication. But to compute b the compiler will use 32-bit multiplication since both values are 32-bits.
In the last example, one could use a cast instead of the macro:
uint64_t c = (uint_least64_t)(1000000000) * a;
That would also force the multiplication to be at least 64 bits.
Why would you ever use the macro instead of casting a literal? One possibility is because decimal literals are signed. Suppose you want a constant that isn't representable as a signed value? For example:
uint64_t x = (uint64_t)9888777666555444333; // warning, literal is too large
uint64_t y = UINT64_C(9888777666555444333); // works
uint64_t z = (uint64_t)(9888777666555444333U); // also works
Another possibility is for preprocessor expressions. A cast isn't legal syntax for use in the expression of a #if directive. But the UINTxx_C() macros are.
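A sketch of that use, with WIDE_SHIFT_OK as a hypothetical macro name for illustration: the UINT64_C() macro expands to an ordinary suffixed constant, so it is valid inside a #if expression, whereas a cast is not.
#include <stdint.h>
#include <stdio.h>
/* Legal: UINT64_C(1) expands to something like 1ULL, a plain integer constant. */
#if UINT64_C(1) << 40 > 0xFFFFFFFF
#define WIDE_SHIFT_OK 1
#else
#define WIDE_SHIFT_OK 0
#endif
/* Not legal: a cast cannot appear in a preprocessor expression.
#if ((uint64_t)1 << 40) > 0xFFFFFFFF
#endif
*/
int main(void)
{
printf("%d\n", WIDE_SHIFT_OK);
return 0;
}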
Since the macros use suffixes pasted onto literals and there is no suffix for a short, one will likely find that UINT16_C(x) and UINT32_C(x) are identical. This gives the result that (uint_least16_t)(65537) != UINT16_C(65537). Not what one might expect. In fact, I have a hard time seeing how this complies with C99 §7.18.4.1:
The macro UINTN_C(value) shall expand to an integer constant expression corresponding to the type uint_leastN_t.
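A sketch of the discrepancy described above, assuming a typical implementation where UINT16_C(x) simply expands to x unchanged and uint_least16_t is 16 bits wide:
#include <stdint.h>
#include <stdio.h>
int main(void)
{
/* (uint_least16_t)65537 truncates to 1, while UINT16_C(65537) is
   typically just the int constant 65537, so the two compare unequal. */
printf("%d\n", (uint_least16_t)65537 == UINT16_C(65537));  /* often prints 0 */
return 0;
}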

Are the L and LL integer suffixes ever needed? [duplicate]

From an Example
unsigned long x = 12345678UL;
We have always learnt that the compiler needs to see only "long" in the above example to set aside 4 bytes (on a 32-bit system) of memory. The question is why should we use L/UL on long constants even after declaring the variable to be a long.
When a suffix L or UL is not used, the compiler uses the first type that can contain the constant from a list (see the C99 standard, clause 6.4.4:5; for a decimal constant, the list is int, long int, long long int).
As a consequence, most of the times, it is not necessary to use the suffix. It does not change the meaning of the program. It does not change the meaning of your example initialization of x for most architectures, although it would if you had chosen a number that could not be represented as a long long. See also codebauer's answer for an example where the U part of the suffix is necessary.
There are a couple of circumstances when the programmer may want to set the type of the constant explicitly. One example is when using a variadic function:
printf("%lld", 1LL); // correct, because 1LL has type long long
printf("%lld", 1); // undefined behavior, because 1 has type int
A common reason to use a suffix is ensuring that the result of a computation doesn't overflow. Two examples are:
long x = 10000L * 4096L;
unsigned long long y = 1ULL << 36;
In both examples, without suffixes, the constants would have type int and the computation would be made as int. In each example this incurs a risk of overflow. Using the suffixes means that the computation will be done in a larger type instead, which has sufficient range for the result.
As Lightness Races in Orbit puts it, the literal's suffix comes before the assignment. In the two examples above, simply declaring x as long and y as unsigned long long is not enough to prevent the overflow in the computation of the expressions assigned to them.
Another example is the comparison x < 12U where variable x has type int. Without the U suffix, the compiler types the constant 12 as an int, and the comparison is therefore a comparison of signed ints.
int x = -3;
printf("%d\n", x < 12); // prints 1 because it's true that -3 < 12
With the U suffix, the comparison becomes a comparison of unsigned ints. “Usual arithmetic conversions” mean that -3 is converted to a large unsigned int:
printf("%d\n", x < 12U); // prints 0 because (unsigned int)-3 is large
In fact, the type of a constant may even change the result of an arithmetic computation, again because of the way “usual arithmetic conversions” work.
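For instance, a sketch of how the conversions can change an arithmetic result (assuming a 32-bit int):
#include <stdio.h>
int main(void)
{
int x = -1;
printf("%d\n", x / 2);   /* signed division: prints 0 */
printf("%u\n", x / 2u);  /* x is first converted to unsigned int, so this prints 2147483647 */
return 0;
}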
Note that, for decimal constants, the list of types suggested by C99 does not contain unsigned long long. In C90, the list ended with the largest standardized unsigned integer type at the time (which was unsigned long). A consequence was that the meaning of some programs was changed by adding the standard type long long to C99: the same constant that was typed as unsigned long in C90 could now be typed as a signed long long instead. I believe this is the reason why in C99, it was decided not to have unsigned long long in the list of types for decimal constants.
See this and this blog posts for an example.
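One classic illustration (a sketch, assuming a 32-bit int and a 32-bit long): under C90 rules the constant 2147483648 is typed unsigned long, while under C99 rules it is long long, and that changes the result of negating it.
#include <stdio.h>
int main(void)
{
/* C90 (32-bit long): 2147483648 is unsigned long, so -2147483648 wraps back
   to 2147483648 and the comparison is false (prints 0).
   C99: 2147483648 is long long, so the comparison is true (prints 1). */
printf("%d\n", -2147483648 < 0);
return 0;
}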
Because numerical literals are typically of type int. The UL/L suffix tells the compiler that they are not of type int, e.g. assuming a 32-bit int and a 64-bit long
long i = 0xffffffff + 1;
long j = 0xffffffffUL + 1;
Here the arithmetic on the right-hand side is done before the conversion to (64-bit) long.
The "0xffffffff" does not fit in an int, so it has type unsigned int; adding 1 wraps around to 0, and i ends up as 0.
The "0xffffffffUL", an unsigned long, does not wrap, so j ends up as 4294967296 (0x100000000).
The question is why should we use L/UL on long constants even after declaring the variable to be a long.
Because it's not "after"; it's "before".
First you have the literal, then it is converted to whatever the type is of the variable you're trying to squeeze it into.
They are two objects. The type of the target is designated by the unsigned long keywords, as you've said. The type of the source is designated by this suffix because that's the only way to specify the type of a literal.
Related to this post is the question of why to use a u.
A reason for u is to allow an integer constant greater than LLONG_MAX in decimal form.
// Likely to generate a warning.
unsigned long long limit63bit = 18446744073709551615; // 2^64 - 1
// OK
unsigned long long limit63bit = 18446744073709551615u;

C and number suffixes

I've got a question about using suffix for numbers in C.
Example:
long long c;
The variable c is of long long type. To initiate its value, I do (usually)
c = 12;
When done like that, the compiler recognizes c as a long long type.
Then, if I do
printf("%d",sizeof(c));
the result is 8 - which of course is 64 bit. So the compiler remembers that c is of long long type.
But I've seen some examples where I need to force the type to be long long, by doing
c = 12LL
Why is that?
You're declaring the variable c as a long long, so it's a long long int. The type of the variable is not dependent on its value; rather, the range of possible values for c is dependent on the type of c.
On the other hand: For an integer constant/literal, the type is determined by its value and suffix (if any). 12 has no prefix, so it's a decimal constant. And it has no suffix, meaning it has type int, since 12 is guaranteed to fit in the range of an int. 12LL has no prefix, so it's also a decimal constant. It has a suffix of LL, meaning it has type long long int. It's safe to assign 12 to the variable c, because an int can safely be converted to a long long int.
Hope that helps.
long long c;
c = 12;
c is of type long long but 12 is of type int. When 12 is assigned to long long object c it is first converted to long long and then assigned to c.
c = 12LL;
does exactly the same assignment, only there is no need to convert the value implicitly first. Both assignments are equivalent, and no sane compiler will treat them differently.
Note that some coding guides, for example MISRA (for automotive embedded code), require constants assigned to unsigned types to be suffixed with U:
For example, in C both assignments (here unsigned int x;) are equivalent:
x = 0; /* non-MISRA compliant */
x = 0U;
but MISRA requires the second form (MISRA-C:2004, rule 10.6).

What does 'u' mean after a number?

Can you tell me what exactly the u after a number means, for example:
#define NAME_DEFINE 1u
Integer literals like 1 in C code are always of the type int. int is the same thing as signed int. One adds u or U (equivalent) to the literal to ensure it is unsigned int, to prevent various unexpected bugs and strange behavior.
One example of such a bug:
On a 16-bit machine where int is 16 bits, this expression will result in a negative value:
long x = 30000 + 30000;
Both 30000 literals are int, and since both operands are int, the result will be int. A 16-bit signed int can only contain values up to 32767, so it will overflow. x will get a strange, negative value because of this, rather than 60000 as expected.
The code
long x = 30000u + 30000u;
will however behave as expected.
It is a way to define unsigned literal integer constants.
It is a way of telling the compiler that the constant 1 is meant to be used as an unsigned integer. Without a suffix like 'u', a decimal constant has a signed type (int by default). To avoid this confusion, it is recommended to use a suffix like 'u' when using a constant as an unsigned integer. Other similar suffixes also exist; for example, 'f' is used for float.
it means "unsigned int", basically it functions like a cast to make sure that numeric constants are converted to the appropriate type at compile-time.
A decimal literal in the code (rules for octal and hexadecimal literals are different, see https://en.cppreference.com/w/c/language/integer_constant) has one of the types int, long or long long. From these, the compiler has to choose the smallest type that is large enough to hold the value. Note that the types char, signed char and short are not considered. For example:
0 // this is a zero of type int
32767 // type int
32768 // could be int or long: On systems with 16 bit integers
// the type will be long, because the value does not fit in an int there.
If you add a u suffix to such a number (a capital U will also do), the compiler will instead have to choose the smallest type from unsigned int, unsigned long and unsigned long long. For example:
0u // a zero of type unsigned int
32768u // type unsigned int: always fits into an unsigned int
100000u // unsigned int or unsigned long
The last example can be used to show the difference to a cast:
100000u // always 100000, but may be unsigned int or unsigned long
(unsigned int)100000 // always unsigned int, but not always 100000
// (e.g. if int has only 16 bit)
On a side note: There are situations, where adding a u suffix is the right thing to ensure correctness of computations, as Lundin's answer demonstrates. However, there are also coding guidelines that strictly forbid mixing of signed and unsigned types, even to the extent that the following statement
unsigned int x = 0;
is classified as non-conforming and has to be written as
unsigned int x = 0u;
This can lead to a situation where developers that deal a lot with unsigned values develop the habit of adding u suffixes to literals everywhere. But, be aware that changing signedness can lead to different behavior in various contexts, for example:
(x > 0)
can (depending on the type of x) mean something different than
(x > 0u)
Luckily, the compiler / code checker will typically warn you about suspicious cases. Nevertheless, adding a u suffix should be done with consideration.
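A short sketch of how that difference can play out (assuming a 32-bit int):
#include <stdio.h>
int main(void)
{
int x = -1;
/* With a signed, negative x the two conditions disagree:
   in x > 0u, x is converted to unsigned int (UINT_MAX) first. */
printf("%d %d\n", x > 0, x > 0u);  /* prints 0 1 */
return 0;
}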

Resources