C interpretation of hexadecimal long integer literal "L" - c

How does a C compiler interpret the "L" which denotes a long integer literal, in light of automatic conversion? The following code, when run on a 32-bit platform (32-bit long, 64-bit long long), seems to cast the expression "(0xffffffffL)" into the 64-bit integer 4294967295, not 32-bit -1.
Sample code:
#include <stdio.h>
int main(void)
{
long long x = 10;
long long y = (0xffffffffL);
long long z = (long)(0xffffffffL);
printf("long long x == %lld\n", x);
printf("long long y == %lld\n", y);
printf("long long z == %lld\n", z);
printf("0xffffffffL == %ld\n", 0xffffffffL);
if (x > (long)(0xffffffffL))
printf("x > (long)(0xffffffffL)\n");
else
printf("x <= (long)(0xffffffffL)\n");
if (x > (0xffffffffL))
printf("x > (0xffffffffL)\n");
else
printf("x <= (0xffffffffL)\n");
return 0;
}
Output (compiled with GCC 4.5.3 on a 32-bit Debian):
long long x == 10
long long y == 4294967295
long long z == -1
0xffffffffL == -1
x > (long)(0xffffffffL)
x <= (0xffffffffL)

It's a hexadecimal literal, so its type can be unsigned. It fits in unsigned long, so that's the type it gets. See section 6.4.4.1 of the standard:
The type of an integer constant is the first of the corresponding list in which its value can
be represented.
where the list for hexadecimal literals with a suffix L is
long
unsigned long
long long
unsigned long long
Since it doesn't fit in a 32-bit signed long but does fit in a 32-bit unsigned long, that's the type it gets.

The thing is that the rules for determining the type of an integer literal differ depending on whether it is a decimal number or a hexadecimal (or octal) number. A decimal literal is always signed unless it is suffixed with U. A hexadecimal or octal literal can also be unsigned if the signed type cannot contain the value.
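One way to check which kind of type a compiler gives a literal is C11's _Generic selection. The following is only a sketch, assuming a C11 compiler; the macro IS_UNSIGNED and the function names are made up for illustration and are not part of any library.

```c
#include <limits.h>
#include <stdio.h>

/* Evaluates to 1 if the expression has an unsigned integer type, else 0.
   Requires C11 for _Generic; the macro name is hypothetical. */
#define IS_UNSIGNED(x) _Generic((x), \
    unsigned int: 1, unsigned long: 1, unsigned long long: 1, \
    default: 0)

/* The same value written two ways. Only the hexadecimal form is allowed
   to fall back to an unsigned type when long is too narrow. */
int hex_literal_is_unsigned(void) { return IS_UNSIGNED(0xffffffffL); }
int dec_literal_is_unsigned(void) { return IS_UNSIGNED(4294967295L); }

void report_literal_types(void)
{
    printf("0xffffffffL is %s\n",
           hex_literal_is_unsigned() ? "unsigned" : "signed");
    printf("4294967295L is %s\n",
           dec_literal_is_unsigned() ? "unsigned" : "signed");
}
```

On a platform with a 32-bit long, the hexadecimal literal should report unsigned while the decimal one reports signed (it becomes long long). With a 64-bit long, both fit in long and both report signed.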

Related

How can I confirm the range of unsigned long integer in C?

unsigned long has 8 bytes on my Linux gcc.
unsigned long long has 8 bytes on my Linux gcc, too.
So I think the range of integers they can show is from 0 min to (2^64 - 1)max.
Now I want to confirm if I'm correct.
Here is my code:
#include <stdio.h>
int main(void)
{
printf("long takes up %d bytes:\n", sizeof(long));
printf("long long takes up %d bytes:\n", sizeof(long long));
unsigned long a = 18446744073709551615;
a++;
printf("a + 1 = %lu\n", a);
unsigned long long b = 18446744073709551615;
b++;
printf("b + 1 = %llu\n", b);
return 0;
}
However, the code cannot be compiled and I get the following warning:
warning: integer constant is so large that it is unsigned
Where did I do wrong? How can I modify the code ?
When you initialize the variables, you can append the suffix UL for unsigned long and ULL for unsigned long long.
For example:
unsigned long a = 18446744073709551615UL;
unsigned long long b = 18446744073709551615ULL;
Also, use %zu instead of %d because sizeof returns a size_t.
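Putting the two suggestions together, a minimal sketch might look like this. It assumes unsigned long long is exactly 64 bits wide, which the standard does not guarantee; the function names are invented for the example.

```c
#include <limits.h>
#include <stdio.h>

/* Unsigned arithmetic wraps modulo (maximum + 1), so max + 1 yields 0. */
unsigned long long increment(unsigned long long v)
{
    return v + 1;
}

void print_sizes_and_wrap(void)
{
    /* sizeof yields a size_t, which printf prints with %zu (C99). */
    printf("unsigned long takes up %zu bytes\n", sizeof(unsigned long));
    printf("unsigned long long takes up %zu bytes\n",
           sizeof(unsigned long long));

    /* The ULL suffix gives the constant an unsigned type. This value is
       the maximum only if unsigned long long is exactly 64 bits wide. */
    unsigned long long b = 18446744073709551615ULL;
    printf("b + 1 = %llu\n", increment(b));
}
```

If the assumption holds, the last line prints 0, which confirms that the constant was the largest representable value.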
According to cppreference:
integer-suffix, if provided, may contain one or both of the following (if both are provided, they may appear in any order):
unsigned-suffix (the character u or the character U)
long-suffix (the character l or the character L) or the long-long-suffix (the character sequence ll or the character sequence LL) (since C99)
C standard 5.2.4.2.1 Sizes of integer types <limits.h> :
1 The values given below shall be replaced by constant expressions suitable for use in #if preprocessing directives. Moreover, except for
CHAR_BIT and MB_LEN_MAX, the following shall be replaced by
expressions that have the same type as would an expression that is an
object of the corresponding type converted according to the integer
promotions. Their implementation-defined values shall be equal or
greater in magnitude (absolute value) to those shown, with the same
sign.
You find some useful definitions in <limits.h>.
Initialize unsigned numbers with -1. This will automatically be MAX value in C.
#include <stdio.h>
int main(void)
{
printf("long takes up %zu bytes:\n", sizeof(long));
printf("long long takes up %zu bytes:\n", sizeof(long long));
unsigned long a = -1;
printf("a = %lu\n", a);
unsigned long long b = -1;
printf("b = %llu\n", b);
return 0;
}
Update: Changed the code based on comments :)
How can I confirm the range of unsigned long integer in C?
Best, just use the macros from <limits.h>. They better self-document the code's intent.
unsigned long long b_max = ULLONG_MAX;
Alternatively, assign -1 to the unsigned type. As -1 is not in the range of an unsigned type, it will get converted to the target type by adding the MAX value of that type plus 1. This works even on rare machines that have padding.
... if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. C11dr §6.3.1.3 2
The min value is of course 0 for an unsigned type.
unsigned long long b_min = 0;
unsigned long long b_max = -1;
printf("unsigned long long range [%llu %llu]\n", b_min, b_max);
Note that picky compilers will complain about assigning an out-of-range value with b_max = -1;. Use ULLONG_MAX.
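The two approaches can be compared directly. This sketch uses an explicit cast for the -1 conversion, which is one way to keep such compilers quiet; the function names are hypothetical.

```c
#include <limits.h>
#include <stdio.h>

unsigned long max_ulong_by_conversion(void)
{
    /* -1 is converted by adding ULONG_MAX + 1, which gives ULONG_MAX. */
    return (unsigned long)-1;
}

unsigned long long max_ullong_by_conversion(void)
{
    /* Same rule, applied to the wider type. */
    return (unsigned long long)-1;
}

void print_unsigned_ranges(void)
{
    /* The <limits.h> macros state the intent most clearly. */
    printf("unsigned long range      [0, %lu]\n", ULONG_MAX);
    printf("unsigned long long range [0, %llu]\n", ULLONG_MAX);
}
```

Both conversion functions return the same values as the macros on any conforming implementation, whatever the width of the types.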
Where did I do wrong?
The warning "warning: integer constant is so large that it is unsigned" appears because 18446744073709551615 is a decimal integer constant outside the long long range on your platform. Unadorned decimal constants are limited to that range. Append a U or u; the compiler will then consider unsigned long long.
unsigned long long b = 18446744073709551615u;
Further, there is no C spec that says 18446744073709551615 is the max value of unsigned long long. It must be at least that. It could be larger. So assigning b = 18446744073709551615u may not assign the max value.
How can I modify the code ?
Shown above
As rsp stated you can specify the type of the literal with UL and ULL.
But this won't lead to a conclusive result from the arithmetic in your code.
The value you print will always be 0 because
2^64 mod 2^64 = 0 // 64 bits = 8 bytes
2^64 mod 2^32 = 0 // 32 bits = 4 bytes
2^64 mod 2^16 = 0 // 16 bits = 2 bytes
As you can see, each type is twice as wide as the next smaller one, so a value that wraps exactly once at 8 bytes wraps a whole number of times at the smaller sizes and yields the same result.
The sizeof will show you the right values.
But generally you want to check these things in code rather than in the output, so you could use limits.h as suggested by Arndt Jonasson,
or you can use static_assert to check at compile time.
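A compile-time check could be sketched as follows, assuming a C11 compiler (static_assert comes from <assert.h>). The particular limits tested and the helper function name are chosen only for illustration.

```c
#include <assert.h>   /* provides the static_assert macro in C11 */
#include <limits.h>

/* Compilation fails here if the assumptions do not hold on the target. */
static_assert(ULLONG_MAX >= 18446744073709551615ULL,
              "unsigned long long must hold at least 64 bits");
static_assert(ULONG_MAX >= 4294967295UL,
              "unsigned long must hold at least 32 bits");

/* Runtime companion: 1 if unsigned long long is exactly 64 bits wide. */
int ullong_is_64_bits(void)
{
    return ULLONG_MAX == 18446744073709551615ULL;
}
```

Both assertions restate minimums that the standard already guarantees, so they should pass everywhere. A stricter project-specific requirement, such as an exact width, would be written the same way.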

The modulo operation doesn't seem to work on a 64-bit value of all ones

So... the modulo operation doesn't seem to work on a 64-bit value of all ones.
Here is my C code to set up the edge case:
#include <stdio.h>
int main(int argc, char *argv[]) {
long long max_ll = 0xFFFFFFFFFFFFFFFF;
long long large_ll = 0x0FFFFFFFFFFFFFFF;
long long mask_ll = 0x00000F0000000000;
printf("\n64-bit numbers:\n");
printf("0x%016llX\n", max_ll % mask_ll);
printf("0x%016llX\n", large_ll % mask_ll);
long max_l = 0xFFFFFFFF;
long large_l = 0x0FFFFFFF;
long mask_l = 0x00000F00;
printf("\n32-bit numbers:\n");
printf("0x%08lX\n", max_l % mask_l);
printf("0x%08lX\n", large_l % mask_l);
return 0;
}
The output shows this:
64-bit numbers:
0xFFFFFFFFFFFFFFFF
0x000000FFFFFFFFFF
32-bit numbers:
0xFFFFFFFF
0x000000FF
What is going on here?
Why doesn't modulo work on a 64-bit value of all ones, but it will on a 32-bit value of all ones?
Is this a bug with the Intel CPU? Or with C somehow? Or is it something else?
More Info
I'm on a Windows 10 machine with an Intel i5-4570S CPU. I used the cl compiler from Visual Studio 2015.
I also verified this result using the Windows Calculator app (Version 10.1601.49020.0) by going into the Programmer mode. If you try to modulus 0xFFFF FFFF FFFF FFFF with anything, it just returns itself.
Specifying unsigned vs signed didn't seem to make any difference.
Please enlighten me :) I actually did have a use case for this operation... so it's not purely academic.
Your program causes undefined behaviour by using the wrong format specifier.
%llX may only be used for unsigned long long. If you use the right specifier, %lld, then the apparent mystery will go away:
#include <stdio.h>
int main(int argc, char* argv[])
{
long long max_ll = 0xFFFFFFFFFFFFFFFF;
long long mask_ll = 0x00000F0000000000;
printf("%lld %% %lld = %lld\n", max_ll, mask_ll, max_ll % mask_ll);
}
Output:
-1 % 16492674416640 = -1
In ISO C the definition of the % operator is such that (a/b)*b + a%b == a. Also, for negative numbers, / follows "truncation towards zero".
So -1 / 16492674416640 is 0, therefore -1 % 16492674416640 must be -1 to make the above formula work.
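The identity can be checked directly. This is a small sketch with invented function names; it assumes b is nonzero and that the pair is not LLONG_MIN and -1.

```c
#include <stdio.h>

/* For b != 0 (and excluding LLONG_MIN / -1), C99 guarantees that
   (a / b) * b + a % b == a, with a / b truncated toward zero. */
int division_identity_holds(long long a, long long b)
{
    return (a / b) * b + a % b == a;
}

long long remainder_of(long long a, long long b)
{
    return a % b;
}

void show_example(void)
{
    long long a = -1, b = 16492674416640LL;
    printf("%lld / %lld = %lld\n", a, b, a / b);               /* quotient 0 */
    printf("%lld %% %lld = %lld\n", a, b, remainder_of(a, b)); /* remainder -1 */
}
```

Because the quotient truncates toward zero, the remainder takes the sign of the dividend: -7 % 3 is -1 and 7 % -3 is 1.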
As discussed in comments, the following line:
long long max_ll = 0xFFFFFFFFFFFFFFFF;
causes implementation-defined behaviour (assuming that your system has long long as a 64-bit type). The constant 0xFFFFFFFFFFFFFFFF has type unsigned long long, and it is out of range for long long whose maximum permitted value is 0x7FFFFFFFFFFFFFFF.
When an out-of-range assignment is made to a signed type, the behaviour is implementation-defined, which means the compiler documentation must say what happens.
Typically, this will be defined as generating the value which is in range of long long and has the same representation as the unsigned long long constant has. In 2's complement, (long long)-1 has the same representation as the unsigned long long value 0xFFFFFFFFFFFFFFFF, which explains why you ended up with max_ll holding the value -1.
Actually it does make a difference whether the values are defined as signed or unsigned:
#include <stdio.h>
#include <limits.h>
int main(void) {
#if ULLONG_MAX == 0xFFFFFFFFFFFFFFFF
long long max_ll = 0xFFFFFFFFFFFFFFFF; // converts to -1LL
long long large_ll = 0x0FFFFFFFFFFFFFFF;
long long mask_ll = 0x00000F0000000000;
printf("\n" "signed 64-bit numbers:\n");
printf("0x%016llX\n", max_ll % mask_ll);
printf("0x%016llX\n", large_ll % mask_ll);
unsigned long long max_ull = 0xFFFFFFFFFFFFFFFF;
unsigned long long large_ull = 0x0FFFFFFFFFFFFFFF;
unsigned long long mask_ull = 0x00000F0000000000;
printf("\n" "unsigned 64-bit numbers:\n");
printf("0x%016llX\n", max_ull % mask_ull);
printf("0x%016llX\n", large_ull % mask_ull);
#endif
#if UINT_MAX == 0xFFFFFFFF
int max_l = 0xFFFFFFFF; // converts to -1;
int large_l = 0x0FFFFFFF;
int mask_l = 0x00000F00;
printf("\n" "signed 32-bit numbers:\n");
printf("0x%08X\n", max_l % mask_l);
printf("0x%08X\n", large_l % mask_l);
unsigned int max_ul = 0xFFFFFFFF;
unsigned int large_ul = 0x0FFFFFFF;
unsigned int mask_ul = 0x00000F00;
printf("\n" "unsigned 32-bit numbers:\n");
printf("0x%08X\n", max_ul % mask_ul);
printf("0x%08X\n", large_ul % mask_ul);
#endif
return 0;
}
Produces this output:
signed 64-bit numbers:
0xFFFFFFFFFFFFFFFF
0x000000FFFFFFFFFF
unsigned 64-bit numbers:
0x000000FFFFFFFFFF
0x000000FFFFFFFFFF
signed 32-bit numbers:
0xFFFFFFFF
0x000000FF
unsigned 32-bit numbers:
0x000000FF
0x000000FF
64 bit hex constant 0xFFFFFFFFFFFFFFFF has value -1 when stored into a long long. This is actually implementation defined because of out of range conversion into a signed type, but on Intel processors, with current compilers, the conversion just keeps the same bit pattern.
Note that you are not using the fixed size integers defined in <stdint.h>: int64_t, uint64_t, int32_t and uint32_t. long long is specified in the standard as having at least 64 bits, and on Intel x86_64 it has exactly 64. long has at least 32 bits, but on the same processor its size differs between environments: 32 bits on Windows 10 (even in 64-bit mode) and 64 bits on macOS and 64-bit Linux. This is why the long case can be surprising: the signed and unsigned results differ on Windows, but they are the same on Linux and macOS, because there the computation is done in 64 bits and these values are just positive numbers.
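With the fixed-width types the widths no longer depend on the platform. The sketch below assumes the implementation provides uint64_t and uint32_t (they are optional in the standard but present on mainstream targets); the function names are made up.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Unsigned remainder at a fixed width, independent of what long is. */
uint64_t umod64(uint64_t a, uint64_t b) { return a % b; }
uint32_t umod32(uint32_t a, uint32_t b) { return a % b; }

void print_masked(void)
{
    /* PRIX64 and PRIX32 expand to the right printf length modifiers. */
    printf("0x%016" PRIX64 "\n",
           umod64(UINT64_C(0xFFFFFFFFFFFFFFFF), UINT64_C(0x00000F0000000000)));
    printf("0x%08" PRIX32 "\n",
           umod32(UINT32_C(0xFFFFFFFF), UINT32_C(0x00000F00)));
}
```

Calling print_masked() should print 0x000000FFFFFFFFFF and 0x000000FF, matching the unsigned results in the output above.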
Also note that LLONG_MIN / -1 and LLONG_MIN % -1 both invoke undefined behavior because of signed arithmetic overflow, and this one is not ignored on Intel PCs, it usually fires an uncaught exception and exits the program, just like 1 / 0 and 1 % 0.
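A guarded remainder that refuses those undefined cases could be sketched like this. The function name and the bool/out-parameter interface are just one possible design, not an established API.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores a % b in *out and returns true, or returns false when the
   operation would be undefined (b == 0, or LLONG_MIN % -1). */
bool checked_rem(long long a, long long b, long long *out)
{
    if (b == 0)
        return false;               /* division by zero */
    if (a == LLONG_MIN && b == -1)
        return false;               /* quotient not representable */
    *out = a % b;
    return true;
}
```

The caller tests the return value before using the result, so neither undefined case ever reaches the % operator.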
Try putting unsigned before your long long. As a signed number, your 0xFF...FF is actually -1 on most platforms.
Also, in your code, your 32-bit numbers are still 64-bits (you have them declared as long long as well).

long long division in c

I am trying to pass along a "long long" number. The problem is that when I try to divide this number by 10, the answer is incorrect.
#include <stdio.h>
int main(void)
{
int n = 0 ;
long long s = 4111111111111111;
n = s % 10 ;
printf("n after modulos %i\n",n );
s = s / 10 ;
printf("this is s after division %llo \n",s );
return 0;
}
Output :
n after modulos 1
this is s after division 13536350357330707
printf("this is s after division %llo \n",s );
The %llo specifier prints the (correct) value in octal representation.
Use the specifier %lld to get the value in decimal.
man 3 printf
o, u, x, X
The unsigned int argument is converted to unsigned octal (o), unsigned decimal (u), or unsigned hexadecimal (x and X) notation. The letters abcdef are used for x conversions; the letters ABCDEF are
used for X conversions. The precision, if any, gives the minimum number of digits that must appear; if the converted value requires fewer digits, it is padded on the left with zeros. The default
precision is 1. When 0 is printed with an explicit precision 0, the output is empty.
and some more on manual
man 3p printf
ll (ell-ell)
Specifies that a following d, i, o, u, x, or X conversion specifier applies to a long long or unsigned long long argument; or that a following n conversion specifier applies to a pointer to a long
long argument.
To print a long long, you should use %lld not %llo
The %llo format is used for representing a long long value in octal.
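To see that the octal digits denote the same number, one can format the value with %llo and parse the digits back in base 8. This is only an illustrative sketch; the function names are invented, and the value is cast to unsigned long long because that is the type %llo expects.

```c
#include <stdio.h>
#include <stdlib.h>

/* Formats v in octal with %llo, then parses the digits back in base 8. */
long long octal_round_trip(long long v)
{
    char buf[32];   /* 22 octal digits are enough for 64 bits */
    snprintf(buf, sizeof buf, "%llo", (unsigned long long)v);
    return (long long)strtoull(buf, NULL, 8);
}

void show_both(long long s)
{
    printf("decimal: %lld\n", s);
    printf("octal:   %llo\n", (unsigned long long)s);
}
```

For s = 4111111111111111 / 10, show_both prints 411111111111111 in decimal and 13536350357330707 in octal, and the round trip returns the original value.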

C - copy characters ASCII value to 64bits integer

As we all know, each printable character has its ASCII value. I'm trying to copy 8 characters' ASCII values into a 64-bit integer, but it only copies 32 bits.
char * ch = "AAAABBBB";
unsigned long int i;
//copy characters' ascii values to 64 bits int
memcpy(&i, ch, 8);
printf("integer hold: 0x%x\n", i);
Is there something wrong with this code?
Output I expect was:
integer hold: 0x4141414142424242
but output was:
integer hold: 0x41414141
If unsigned long is indeed a 64-bit type (you can output sizeof(unsigned long) to check this), you still need to use %lx format string to print it.
If unsigned long is 32 bits, you'll probably have to resort to unsigned long long and use the %llx format string.
From C11 7.21.6.1 The fprintf function:
o,u,x,X The unsigned int argument is converted to unsigned octal (o), unsigned decimal (u), or unsigned hexadecimal notation (x or X) in the style dddd; the letters abcdef are used for x conversion and the letters ABCDEF for X conversion. The precision specifies the minimum number of digits to appear; if the value being converted can be represented in fewer digits, it is expanded with leading zeros. The default precision is 1. The result of converting a zero value with a precision of zero is no characters.
l (ell): Specifies that a following d, i, o, u, x, or X conversion specifier applies to a long int or unsigned long int argument.
ll (ell-ell): Specifies that a following d, i, o, u, x, or X conversion specifier applies to a long long int or unsigned long long int argument.
long versus long long.
In VC++ the long datatype is still only 32 bits.
And of course, the printf format %x is used for unsigned int, which is 32 bits on most platforms. You want %llx (or possibly %lx if your long already is 64 bits).
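A width-independent variant could use uint64_t with the PRIx64 macro. The sketch below also shows that memcpy gives a result that depends on the host byte order, while packing the bytes by shifting does not. The function names are hypothetical, and the expected value assumes an ASCII character set.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Packs the first 8 bytes of s into a 64-bit value, first byte most
   significant, so the result does not depend on host byte order. */
uint64_t pack_be64(const char *s)
{
    uint64_t v = 0;
    for (int k = 0; k < 8; k++)
        v = (v << 8) | (unsigned char)s[k];
    return v;
}

/* Copies the bytes as they lie in memory; the numeric result depends on
   whether the host is little- or big-endian. */
uint64_t copy_raw64(const char *s)
{
    uint64_t v;
    memcpy(&v, s, sizeof v);
    return v;
}

void print_both(const char *s)
{
    printf("packed: 0x%016" PRIx64 "\n", pack_be64(s));
    printf("memcpy: 0x%016" PRIx64 "\n", copy_raw64(s));
}
```

For "AAAABBBB", pack_be64 yields 0x4141414142424242 on any host. On a little-endian machine such as x86, the memcpy version yields 0x4242424241414141 instead.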
%x is used to print an unsigned int value, and it is 32 bits wide; that's why you are getting this result.
%d %i Decimal signed integer.
%o Octal integer.
%x %X Hex integer.
%u Unsigned integer.
%c Character.
%s String.
%f double
%e %E double.
%g %G double.
%p pointer.
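& if you want to print the remaining data, try ...
unsigned int first, second; /* assumes 32-bit unsigned int, 64-bit unsigned long */
memcpy(&first, (char *)&i, 4); /* first four bytes of i */
memcpy(&second, (char *)&i + 4, 4); /* remaining four bytes of i */
printf("integer hold: 0x%x\n", first);
printf("integer hold: 0x%x\n", second);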

What type-conversions are happening?

#include "stdio.h"
int main()
{
int x = -13701;
unsigned int y = 3;
signed short z = x / y;
printf("z = %d\n", z);
return 0;
}
I would expect the answer to be -4567. I am getting "z = 17278".
Why does a promotion of these numbers result in 17278?
I executed this in Code Pad.
The hidden type conversions are:
signed short z = (signed short) (((unsigned int) x) / y);
When you mix signed and unsigned types the unsigned ones win. x is converted to unsigned int, divided by 3, and then that result is down-converted to (signed) short. With 32-bit integers:
(unsigned) -13701 == (unsigned) 0xFFFFCA7B // Bit pattern
(unsigned) 0xFFFFCA7B == (unsigned) 4294953595 // Re-interpret as unsigned
(unsigned) 4294953595 / 3 == (unsigned) 1431651198 // Divide by 3
(unsigned) 1431651198 == (unsigned) 0x5555437E // Bit pattern of that result
(short) 0x5555437E == (short) 0x437E // Strip high 16 bits
(short) 0x437E == (short) 17278 // Re-interpret as short
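The same steps can be traced in code. This sketch uses fixed-width types so that the 32-bit and 16-bit assumptions are explicit; the function names are invented, and the last step masks the low 16 bits by hand because the implicit narrowing to a signed short is implementation-defined.

```c
#include <stdint.h>
#include <stdio.h>

/* Step 1: the int operand is converted to unsigned (modulo 2^32 here). */
uint32_t as_unsigned(int32_t x) { return (uint32_t)x; }

/* Step 2: the division is carried out in unsigned arithmetic. */
uint32_t unsigned_quotient(int32_t x, uint32_t y) { return as_unsigned(x) / y; }

/* Step 3: keep only the low 16 bits, which is what typical two's
   complement compilers do when narrowing to a 16-bit short. */
uint16_t low_16_bits(uint32_t v) { return (uint16_t)(v & 0xFFFFu); }

void trace_conversion(void)
{
    uint32_t q = unsigned_quotient(-13701, 3u);
    printf("as unsigned: %lu\n", (unsigned long)as_unsigned(-13701));
    printf("quotient:    %lu\n", (unsigned long)q);
    printf("low 16 bits: %u\n", (unsigned)low_16_bits(q));
}
```

trace_conversion() prints 4294953595, 1431651198 and 17278, the same intermediate values as in the list above.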
By the way, the signed keyword is unnecessary. signed short is a longer way of saying short. The only type that needs an explicit signed is char. char can be signed or unsigned depending on the platform; all other types are always signed by default.
Short answer: the division first converts x to unsigned. Only then is the result converted back to a signed short.
Long answer: read this SO thread.
The problem comes from the unsigned int y. Indeed, x/y becomes unsigned. It works with:
#include "stdio.h"
int main()
{
int x = -13701;
signed int y = 3;
signed short z = x / y;
printf("z = %d\n", z);
return 0;
}
Every time you mix "large" signed and unsigned values in additive and multiplicative arithmetic operations, unsigned type "wins" and the evaluation is performed in the domain of the unsigned type ("large" means int and larger). If your original signed value was negative, it first will be converted to positive unsigned value in accordance with the rules of signed-to-unsigned conversions. In your case -13701 will turn into UINT_MAX + 1 - 13701 and the result will be used as the dividend.
Note that the result of signed-to-unsigned conversion on a typical 32-bit int platform will result in unsigned value 4294953595. After division by 3 you'll get 1431651198. This value is too large to be forced into a short object on a platform with 16-bit short type. An attempt to do that results in implementation-defined behavior. So, if the properties of your platform are the same as in my assumptions, then your code produces implementation-defined behavior. Formally speaking, the "meaningless" 17278 value you are getting is nothing more than a specific manifestation of that implementation-defined behavior. It is possible that if you compiled your code with overflow checking enabled (if your compiler supports it), it would trap on the assignment.
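One way to avoid both pitfalls is to convert the divisor to a signed type before dividing and to range-check before narrowing. The following is a sketch under stated assumptions, with hypothetical function names; signed_divide assumes y is nonzero and no larger than INT_MAX.

```c
#include <limits.h>
#include <stdbool.h>

/* Divides in signed arithmetic by converting the unsigned divisor first.
   Assumes y is nonzero and no larger than INT_MAX. */
int signed_divide(int x, unsigned int y)
{
    return x / (int)y;
}

/* Narrows to short only when the value is representable. */
bool to_short_checked(long v, short *out)
{
    if (v < SHRT_MIN || v > SHRT_MAX)
        return false;
    *out = (short)v;
    return true;
}
```

With these helpers, signed_divide(-13701, 3u) gives the expected -4567, and to_short_checked rejects 1431651198 on a platform with a 16-bit short instead of silently truncating it.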

Resources