Printing unsigned long long using %d in C

Why do I get -1 when I print the following?
unsigned long long int largestIntegerInC = 18446744073709551615LL;
printf ("largestIntegerInC = %d\n", largestIntegerInC);
I know I should use llu instead of d, but why do I get -1 instead of 18446744073709551615LL?
Is it because of overflow?

In C99, LLONG_MAX, the maximum value of the long long int type, is guaranteed to be at least 9223372036854775807. The maximum value of an unsigned long long int is guaranteed to be at least 18446744073709551615, which is 2^64 − 1 (0xffffffffffffffff).
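As a quick check, a minimal C99 sketch can print the actual limits on your platform using the macros from limits.h:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* C99 guarantees these macros exist; the actual values are at least
       the minimums quoted above. */
    printf("LLONG_MAX  = %lld\n", LLONG_MAX);
    printf("ULLONG_MAX = %llu\n", ULLONG_MAX);
    return 0;
}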
So, initialization should be:
unsigned long long int largestIntegerInC = 18446744073709551615ULL;
(Note the ULL.) Since largestIntegerInC is of type unsigned long long int, you should print it with the right format specifier, which is "%llu":
$ cat test.c
#include <stdio.h>

int main(void)
{
    unsigned long long int largestIntegerInC = 18446744073709551615ULL;
    /* good */
    printf("%llu\n", largestIntegerInC);
    /* bad */
    printf("%d\n", largestIntegerInC);
    return 0;
}
$ gcc -std=c99 -pedantic test.c
test.c: In function ‘main’:
test.c:9: warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘long long unsigned int’
The second printf() above is wrong; it can print anything. You are using "%d", which means printf() is expecting an int, but it gets an unsigned long long int, which is (most likely) not the same size as int. The reason you are getting -1 as your output is due to (bad) luck, and the fact that on your machine numbers are represented using two's complement representation.
To see how this can be bad, let's run the following program:
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

int main(int argc, char *argv[])
{
    const char *fmt;
    unsigned long long int x = ULLONG_MAX;
    unsigned long long int y = 42;
    int i = -1;

    if (argc != 2) {
        fprintf(stderr, "Need format string\n");
        return EXIT_FAILURE;
    }
    fmt = argv[1];
    printf(fmt, x, y, i);
    putchar('\n');
    return 0;
}
On my Macbook, running the program with "%d %d %d" gives me -1 -1 42, and on a Linux machine, the same program with the same format gives me -1 42 -1. Oops.
In fact, if you are trying to store the largest unsigned long long int number in your largestIntegerInC variable, you should include limits.h and use ULLONG_MAX. Or you can assign -1 to your variable:
#include <limits.h>
#include <stdio.h>

int main(void)
{
    unsigned long long int largestIntegerInC = ULLONG_MAX;
    unsigned long long int next = -1;

    if (next == largestIntegerInC) puts("OK");
    return 0;
}
In the above program, both largestIntegerInC and next contain the largest possible value for unsigned long long int type.

It's because you're passing a number with all the bits set to 1. When interpreted as a two's complement signed number, that works out to -1. In this case, it's probably only looking at 32 of those one bits instead of all 64, but that doesn't make any real difference.

In two's complement arithmetic, the signed value -1 is the same as the largest unsigned value.
Consider the bit patterns for negative numbers in two's complement (I'm using 8-bit integers, but the pattern applies regardless of the size):
0 - 0x00
-1 - 0xFF
-2 - 0xFE
-3 - 0xFD
So, you can see that negative 1 has the bit pattern of all 1's which is also the bit pattern for the largest unsigned value.
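You can check this on your own machine in a couple of lines (a minimal sketch, assuming two's complement representation, which virtually all current hardware uses):

#include <stdio.h>

int main(void)
{
    /* The same 8-bit pattern, viewed as signed and as unsigned. */
    signed char   s = -1;
    unsigned char u = (unsigned char)s;  /* conversion wraps to UCHAR_MAX */

    printf("signed: %d  unsigned: %d  bits: 0x%02X\n", s, u, (unsigned)u);
    /* prints: signed: -1  unsigned: 255  bits: 0xFF */
    return 0;
}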

You used a format for a signed 32-bit number, so you got -1. printf() can't tell internally how big the number you passed in is, so it just pulls the first 32 bits from the varargs list and uses them as the value to be printed out. Since you gave a signed format, it prints it that way, and 0xffffffff is the two's complement representation of -1.
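If you genuinely want only the low 32 bits, the well-defined way is to convert the value yourself so that each argument matches its format (a sketch, assuming a 32-bit unsigned int):

#include <stdio.h>

int main(void)
{
    unsigned long long big = 18446744073709551615ULL;

    /* Convert explicitly; now each format specifier matches its argument. */
    printf("low 32 bits: %u\n", (unsigned int)big);  /* 4294967295 */
    printf("full value : %llu\n", big);              /* 18446744073709551615 */
    return 0;
}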

You can (and should) see why in the compiler warnings. If you don't get any, try setting the highest warning level. With VS I get this warning: warning C4245: 'initializing' : conversion from '__int64' to 'unsigned __int64', signed/unsigned mismatch.

No, there is no overflow. It's because the entire value isn't being printed:
18446744073709551615 is the same as 0xFFFFFFFFFFFFFFFF. When printf processes that with %d, it grabs only an int's worth of bits (32 on most platforms) for the conversion, and those 32 one-bits are the signed value -1.
If the conversion had been %u instead, it would show 4294967295 (the same low 32 bits read as an unsigned value).
An overflow is when a value increases to the point where it won't fit in the storage allocated. In this case, the value is stored just fine, but isn't being completely retrieved.

Related

Why does unsigned short (0xffff) print 65,535 and unsigned int (0xffffffff) print -1 in C?

I think the title explains pretty well what I'm asking, so here is my code.
#include <stdio.h>

unsigned short u_short = 0xffff;
unsigned int u_int = 0xffffffff;

int main() {
    printf("unsigned short = %d\n", u_short);
    printf("unsigned int = %d\n", u_int);
    return 0;
}
Here is my printout:
unsigned short = 65535
unsigned int = -1
printf("unsigned int = %d\n", u_int); is undefined behavior (UB) when u_int is out of the positive int range. Do not used "%d" to print unsigned.
Use printf("unsigned int = %u\n", u_int);
This is likely what happened in your C implementation:
In printf("unsigned short = %d\n", u_short);, the unsigned short value 65,535 is automatically converted to an int with the same value.1,2
The int value 65,535 is passed to printf, which formats it as “65535” due to the %d conversion specification.
In printf("unsigned int = %d\n", u_int);, the unsigned int value 4,294,967,295 is passed to printf; it is not converted to an int. As an unsigned int, 4,294,967,295 is represented with 32 one bits.
Because of the %d conversion specification, printf seeks an int value that was passed as an argument. For this, it finds the bits passed for your unsigned int, because an unsigned int and an int are passed in the same place in your C implementation, so the printf looking for an int finds the bits in the same place the calling routine put the unsigned int bits.3
When interpreted as an int type, these bits, 32 ones, represent the value −1.3 Given the −1 value, printf formats it as “-1” due to the %d conversion specification.
Footnotes
1 In many places in expressions, including arguments corresponding to ... of a function declaration, values of types narrower than int are automatically promoted to int, as part of the integer promotions.
2 A C implementation could have an unsigned short as wide as an int, in which case this conversion would not occur. That is rare these days.
3 This is a description of what likely happened in your C implementation. The behavior is not defined by the C standard and may vary in other C implementations or even in different programs in your C implementation.
printf has some anomalies due to the usual argument promotions. In particular, arguments of type char and short are promoted to int when passing them to printf. Usually this is fine, but sometimes it results in surprises like these. What you get when you promote an unsigned 16-bit 0xffff to 32 bits is not 0xffffffff.
printf has some relatively little-known and relatively rarely-used modifiers to, in effect, undo those promotions and print char and short arguments as what they "really were". So you'll see more-consistent results if you tell printf that you were actually passing a short, like this:
printf("unsigned short = %hd\n", u_short);
printf("unsigned int = %d\n", u_int);
Now printf knows that the argument in the first call was really a short, so it treats it as such. On my machine, this now prints
unsigned short = -1
unsigned int = -1
(Now, with that said, it's arguably a bad idea to print unsigned integers with %d, as the other answers and comments have explained.)
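Putting the pieces together, here is a self-contained sketch of that experiment (the commented results assume a 16-bit short and a 32-bit int):

#include <stdio.h>

int main(void)
{
    unsigned short u_short = 0xffff;

    printf("%hd\n", u_short);  /* read back as short:          -1    */
    printf("%hu\n", u_short);  /* read back as unsigned short: 65535 */
    printf("%d\n",  u_short);  /* promoted to int 65535:       65535 */
    return 0;
}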

How does printf know when it's being passed a signed int

I'm trying to figure out how variables really work in C, and I find it strange how the printf function seems to know the difference between different variables of the same size; I'm assuming they're both 16 bits.
#include <stdio.h>

int main(void) {
    unsigned short int positive = 0xffff;
    short int negative = 0xffff;

    printf("%d\n%d\n", positive, negative);
    return 0;
}
Output:
65535
-1
I think we have to distinguish more carefully between the type conversions of the different integer types on the one hand, and the printf format specifiers (which force printf to interpret the data in a given way) on the other.
Systematically:
printf("%hd %hd\n", positive, negative);
// gives: -1 -1
Both values are interpreted as signed short int by printf, regardless of the declaration.
printf("%hu %hu\n", positive, negative);
// gives: 65535 65535
Both values are interpreted as unsigned short int by printf, regardless of the declaration.
However,
printf("%d %d\n", positive, negative);
// gives: 65535 -1
Both values are implicitly converted to (a longer) int, while the sign is kept.
Finally,
printf("%u %u\n", positive, negative);
// gives 65535 4294967295
Again, both values are implicitly converted to int, while the sign is kept, but then the negative value is interpreted as unsigned. As we can see here, plain int is actually 32-bit (on this system).
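For convenience, the snippets above combine into one self-contained program (the commented results assume a 16-bit short and a 32-bit int; the last line is strictly a format/type mismatch, kept only to reproduce the observation):

#include <stdio.h>

int main(void)
{
    unsigned short int positive = 0xffff;
    short int          negative = 0xffff;  /* implementation-defined conversion */

    printf("%hd %hd\n", positive, negative);  /* -1 -1 */
    printf("%hu %hu\n", positive, negative);  /* 65535 65535 */
    printf("%d %d\n",   positive, negative);  /* 65535 -1 */
    printf("%u %u\n",   positive, negative);  /* 65535 4294967295 */
    return 0;
}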
Curiously, gcc only gives me a warning for the assignment short int negative = 0xffff; if I compile with -Wpedantic.

What is the result of 'wrapping' a multiplication overflow?

The bcache source here contains the following line:
schedule_delayed_work(&dc->writeback_rate_update,
                      dc->writeback_rate_update_seconds * HZ);
writeback_rate_update_seconds is defined as unsigned int, which appears to be 32 bits on x86_64. I am not sure what type HZ has, but I believe the value is 1000, and I assume it is 32 bits or narrower.
If I set writeback_rate_update_seconds to 2147483647, what value actually gets passed to schedule_delayed_work? The second parameter of schedule_delayed_work appears to be a long, but that won't mean the operands are promoted to long prior to the multiplication overflow, will it?
Given:
#include <stdio.h>
#include <stdlib.h>

int schedule_delayed_work(unsigned long param)
{
    printf("value: %lu\n", param);
    return 0;
}

int main(int argc, char **argv)
{
    unsigned int writeback_rate_update_seconds;
    unsigned int HZ;

    writeback_rate_update_seconds = 2147483647;
    HZ = 1000;
    schedule_delayed_work(writeback_rate_update_seconds * HZ);
    return 0;
}
You will get 4294966296 passed to the function.
If you change the function call to cast:
schedule_delayed_work( (unsigned long) writeback_rate_update_seconds * HZ );
... you will get 2147483647000 passed to the function.
I've not looked in the C standard to see what the standard behaviour is, but this was tested with:
Apple LLVM version 8.1.0 (clang-802.0.38)
Target: x86_64-apple-darwin16.7.0
If both operands fit in unsigned int (if HZ is the constant 1000, it has type int and fits in unsigned int), they're converted to unsigned int. With unsigned integers the wraparound is well-defined: the resulting value is the mathematical result reduced modulo UINT_MAX + 1. That is, the maximum result is UINT_MAX; UINT_MAX + 1 wraps to 0, UINT_MAX + 2 wraps to 1, and so on.
The type of the receiver (here, the type of the parameter that receives the result) doesn't matter at all. To avoid wraparound, cast one of the operands to a wider integer type (for example, unsigned long is 64 bits on 64-bit Linux; or, even better, use a fixed-width type such as uint64_t).
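Here is a sketch of both behaviours side by side, using a fixed-width type for the fix (the commented values assume a 32-bit unsigned int):

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void)
{
    unsigned int seconds = 2147483647u;
    unsigned int hz = 1000u;

    /* Both operands are unsigned int, so the product wraps modulo UINT_MAX + 1
       before it is ever converted to the wider type. */
    unsigned long wrapped = seconds * hz;
    printf("wrapped: %lu\n", wrapped);       /* 4294966296 */

    /* Widening one operand first makes the whole multiplication 64-bit. */
    uint64_t full = (uint64_t)seconds * hz;
    printf("full   : %" PRIu64 "\n", full);  /* 2147483647000 */
    return 0;
}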

The modulo operation doesn't seem to work on a 64-bit value of all ones

So... the modulo operation doesn't seem to work on a 64-bit value of all ones.
Here is my C code to set up the edge case:
#include <stdio.h>

int main(int argc, char *argv[]) {
    long long max_ll = 0xFFFFFFFFFFFFFFFF;
    long long large_ll = 0x0FFFFFFFFFFFFFFF;
    long long mask_ll = 0x00000F0000000000;

    printf("\n64-bit numbers:\n");
    printf("0x%016llX\n", max_ll % mask_ll);
    printf("0x%016llX\n", large_ll % mask_ll);

    long max_l = 0xFFFFFFFF;
    long large_l = 0x0FFFFFFF;
    long mask_l = 0x00000F00;

    printf("\n32-bit numbers:\n");
    printf("0x%08lX\n", max_l % mask_l);
    printf("0x%08lX\n", large_l % mask_l);
    return 0;
}
The output shows this:
64-bit numbers:
0xFFFFFFFFFFFFFFFF
0x000000FFFFFFFFFF
32-bit numbers:
0xFFFFFFFF
0x000000FF
What is going on here?
Why doesn't modulo work on a 64-bit value of all ones, but it will on a 32-bit value of all ones?
Is this a bug in the Intel CPU? Or in C somehow? Or is it something else?
More Info
I'm on a Windows 10 machine with an Intel i5-4570S CPU. I used the cl compiler from Visual Studio 2015.
I also verified this result using the Windows Calculator app (Version 10.1601.49020.0) by going into Programmer mode. If you take 0xFFFF FFFF FFFF FFFF modulo anything, it just returns itself.
Specifying unsigned vs signed didn't seem to make any difference.
Please enlighten me :) I actually did have a use case for this operation... so it's not purely academic.
Your program causes undefined behaviour by using the wrong format specifier.
%llX may only be used for unsigned long long. If you use the right specifier, %lld, then the apparent mystery goes away:
#include <stdio.h>

int main(int argc, char* argv[])
{
    long long max_ll = 0xFFFFFFFFFFFFFFFF;
    long long mask_ll = 0x00000F0000000000;

    printf("%lld %% %lld = %lld\n", max_ll, mask_ll, max_ll % mask_ll);
}
Output:
-1 % 16492674416640 = -1
In ISO C the definition of the % operator is such that (a/b)*b + a%b == a. Also, for negative numbers, / follows "truncation towards zero".
So -1 / 16492674416640 is 0, therefore -1 % 16492674416640 must be -1 to make the above formula work.
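You can verify both the truncation rule and the identity directly (a minimal sketch):

#include <stdio.h>

int main(void)
{
    long long a = -1;
    long long b = 16492674416640LL;  /* 0x00000F0000000000 */

    /* Division truncates towards zero: -1 / b == 0, so -1 % b must be -1. */
    printf("a / b = %lld, a %% b = %lld\n", a / b, a % b);
    printf("identity holds: %d\n", (a / b) * b + a % b == a);  /* prints 1 */
    return 0;
}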
As discussed in comments, the following line:
long long max_ll = 0xFFFFFFFFFFFFFFFF;
causes implementation-defined behaviour (assuming that your system has long long as a 64-bit type). The constant 0xFFFFFFFFFFFFFFFF has type unsigned long long, and it is out of range for long long whose maximum permitted value is 0x7FFFFFFFFFFFFFFF.
When an out-of-range assignment is made to a signed type, the behaviour is implementation-defined, which means the compiler documentation must say what happens.
Typically, this will be defined as generating the value which is in the range of long long and has the same representation as the unsigned long long constant. In 2's complement, (long long)-1 has the same representation as the unsigned long long value 0xFFFFFFFFFFFFFFFF, which explains why you ended up with max_ll holding the value -1.
Actually it does make a difference whether the values are defined as signed or unsigned:
#include <stdio.h>
#include <limits.h>

int main(void) {
#if ULLONG_MAX == 0xFFFFFFFFFFFFFFFF
    long long max_ll = 0xFFFFFFFFFFFFFFFF;  // converts to -1LL
    long long large_ll = 0x0FFFFFFFFFFFFFFF;
    long long mask_ll = 0x00000F0000000000;

    printf("\n" "signed 64-bit numbers:\n");
    printf("0x%016llX\n", max_ll % mask_ll);
    printf("0x%016llX\n", large_ll % mask_ll);

    unsigned long long max_ull = 0xFFFFFFFFFFFFFFFF;
    unsigned long long large_ull = 0x0FFFFFFFFFFFFFFF;
    unsigned long long mask_ull = 0x00000F0000000000;

    printf("\n" "unsigned 64-bit numbers:\n");
    printf("0x%016llX\n", max_ull % mask_ull);
    printf("0x%016llX\n", large_ull % mask_ull);
#endif
#if UINT_MAX == 0xFFFFFFFF
    int max_l = 0xFFFFFFFF;  // converts to -1
    int large_l = 0x0FFFFFFF;
    int mask_l = 0x00000F00;

    printf("\n" "signed 32-bit numbers:\n");
    printf("0x%08X\n", max_l % mask_l);
    printf("0x%08X\n", large_l % mask_l);

    unsigned int max_ul = 0xFFFFFFFF;
    unsigned int large_ul = 0x0FFFFFFF;
    unsigned int mask_ul = 0x00000F00;

    printf("\n" "unsigned 32-bit numbers:\n");
    printf("0x%08X\n", max_ul % mask_ul);
    printf("0x%08X\n", large_ul % mask_ul);
#endif
    return 0;
}
Produces this output:
signed 64-bit numbers:
0xFFFFFFFFFFFFFFFF
0x000000FFFFFFFFFF
unsigned 64-bit numbers:
0x000000FFFFFFFFFF
0x000000FFFFFFFFFF
signed 32-bit numbers:
0xFFFFFFFF
0x000000FF
unsigned 32-bit numbers:
0x000000FF
0x000000FF
The 64-bit hex constant 0xFFFFFFFFFFFFFFFF has the value -1 when stored into a long long. This is actually implementation-defined, because it is an out-of-range conversion into a signed type, but on Intel processors with current compilers the conversion just keeps the same bit pattern.
Note that you are not using the fixed-width integers defined in <stdint.h>: int64_t, uint64_t, int32_t and uint32_t. The long long types are specified in the standard as having at least 64 bits, and on Intel x86_64 they do; long has at least 32 bits, but for the same processor its size differs between environments: 32 bits on Windows 10 (even in 64-bit mode) and 64 bits on macOS and 64-bit Linux. This is the reason why you observe surprising behavior for the long case, where unsigned and signed may produce the same result: they don't on Windows, but they do on Linux and macOS, because there the computation is done in 64 bits and these values are just positive numbers.
Also note that LLONG_MIN / -1 and LLONG_MIN % -1 both invoke undefined behavior because of signed arithmetic overflow, and on Intel PCs this one is not silently ignored: it usually raises an uncaught exception and exits the program, just like 1 / 0 and 1 % 0.
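If those operands can occur at run time, it is worth guarding against them before dividing. A defensive sketch (safe_divmod is a hypothetical helper, not a standard function):

#include <limits.h>
#include <stdio.h>

/* Stores the quotient and remainder and returns 1 only when a / b
   and a % b are well-defined; returns 0 otherwise. */
int safe_divmod(long long a, long long b, long long *q, long long *r)
{
    if (b == 0) return 0;                     /* division by zero */
    if (a == LLONG_MIN && b == -1) return 0;  /* quotient would overflow */
    *q = a / b;
    *r = a % b;
    return 1;
}

int main(void)
{
    long long q, r;

    if (safe_divmod(LLONG_MIN, -1, &q, &r))
        printf("%lld %lld\n", q, r);
    else
        puts("undefined: refusing to divide");  /* this branch is taken */
    return 0;
}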
Try putting unsigned before your long long. As a signed number, your 0xFF...FF is actually -1 on most platforms.
Also, in your code, your "32-bit" numbers may not really be 32 bits: they are declared as long, which is 64 bits on 64-bit Linux and macOS (though 32 bits on Windows).

When storing a large positive integer, it converts to a different negative number

When I store a large positive integer in an unsigned long int in C, it unexpectedly turns into a negative number.
For example:
a = 2075000020 and b = 100000000; here a + b prints as -2119967276.
Please help me understand.
You cannot have been printing an unsigned integer, because it has printed a sign. Even if you declare the variable as unsigned, once it is on the stack for printf() to use, it is interpreted as a binary value to be used as specified by the format in printf().
Note the difference between these formats, and the results. In the third example, you can see that bit 31 is set, which is the sign bit for a signed long int.
#include <stdio.h>

int main(void) {
    unsigned long int a = 2075000020, b = 100000000, c;

    c = a + b;
    printf("Signed %ld\n", c);
    printf("Unsigned %lu\n", c);
    printf("Hexadecimal 0x%lX\n", c);
    return 0;
}
Program output:
Signed -2119967276
Unsigned 2175000020
Hexadecimal 0x81A3DDD4
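If you want the arithmetic width to be unambiguous across platforms, the fixed-width types from <stdint.h> with their matching format macros from <inttypes.h> remove the guesswork (a sketch):

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void)
{
    /* uint32_t is exactly 32 bits everywhere; PRIu32/PRIX32 match it. */
    uint32_t a = 2075000020u, b = 100000000u;
    uint32_t c = a + b;  /* would wrap modulo 2^32; 2175000020 still fits */

    printf("Unsigned %" PRIu32 "\n", c);      /* 2175000020 */
    printf("Hexadecimal 0x%" PRIX32 "\n", c); /* 0x81A3DDD4 */
    return 0;
}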
