I am using gcc to test some simple casts between float to unsigned int.
The following piece of code gives the result 0.
const float maxFloat = 4294967295.0;
unsigned int a = (unsigned int) maxFloat;
printf("%u\n", a);
0 is printed (which I belive is very strange).
On the other hand the following piece of code:
const float maxFloat = 4294967295.0;
unsigned int a = (unsigned int) (signed int) maxFloat;
printf("%u\n", a);
prints 2147483648 which I belive is the correct results.
What happens that I get 2 different results?
If you first do this:
printf("%f\n", maxFloat);
The output you'll get is this:
4294967296.000000
Assuming a float is implemented as an IEEE754 single precision floating point type, the value 4294967295.0 cannot be represented exactly by this type because there's aren't enough bits of precision. The closest value it can store is 4294967296.0.
Assuming an int (and likewise unsigned int) is 32 bits, the value 4294967296.0 is outside the range of both of these types. Converting a floating point type to an integer type when the value cannot be represented in the given integer type invokes undefined behavior.
This is detailed in section 6.3.1.4 of the C standard which dictates conversion from floating point types to integer types:
1 When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e.,
the value is truncated toward zero). If the value of the integral part
cannot be represented by the integer type, the behavior is undefined.61)
...
61) The remaindering operation performed when a value of integer type
is converted to unsigned type need not be performed when a value of
real floating type is converted to unsigned type. Thus, the range of
portable real floating values is (−1, Utype_MAX+1).
The footnote in the above passage is referencing section 6.3.1.3, which details integer to integer conversions:
1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new
type, it is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an
implementation-defined signal is raised.
The behavior you see in the first code snippet is consistent with an out-of-range conversion to an unsigned type when the value in question is an integer, however because the value being converted has a floating point type it is undefined behavior.
Just because one implementation does this doesn't mean that all will. In fact, gcc gives a different result if you change the optimization settings.
For example, on my machine using gcc 5.4.0, given this code:
float n = 4294967296;
printf("n=%f\n", n);
unsigned int a = (unsigned int) n;
int b = (signed int) n;
unsigned int c = (unsigned int) (signed int) n;
printf("a=%u\n", a);
printf("b=%d\n", b);
printf("c=%u\n", c);
I get the following results with -O0:
n=4294967296.000000
a=0
b=-2147483648
c=2147483648
And this with -O1:
n=4294967296.000000
a=4294967295
b=2147483647
c=2147483647
If on the other hand n is defined as long or long long, you would always get this output:
n=4294967296
a=0
b=0
c=0
The conversion to unsigned is well defined by the C standard as sited above, and the conversion to signed is implementation defined, which gcc defines as follows:
The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object
of that type (C90 6.2.1.2, C99 and C11 6.3.1.3).
For conversion to a type of width N, the value is reduced modulo 2^N
to be within range of the type; no signal is raised.
Assuming IEEE 754 floating point numbers, the number 4294967295.0 can't be stored exactly in a float. It will be stored as 4294967296.0 instead (which is 232).
Further assuming your unsigned int has 32 value bits, this is just by one too large to fit in an unsigned int, so the result of the conversion is undefined according to the C standard -- 0 is a "reasonable" outcome.
In your second case, you have undefined behavior as well, and I have no theory what's happening here on the representation level. Fact is, the number is much too large for a 32 bit signed int (still assuming this is what your machine uses).
From this remark in your question:
prints 2147483648 which I belive is the correct results.
I assume you wanted to see the representation of your float in memory. Casting will convert the value, so that's not the way to see the representation. The following code would do:
int main(void) {
const float maxFloat = 4294967295.0;
unsigned char *floatBytes = &maxFloat;
for (int i=0; i < sizeof maxFloat; ++i)
{
printf("0x%02x ", floatBytes[i]);
}
puts("");
}
online example
Related
I have been reading KnR lately and have come across a statement:
"Type conversions of Expressions involving unsigned type values are
complicated"
So to understand this i have written a very small code:
#include<stdio.h>
int main()
{
signed char x = -1;
printf("Signed : %d\n",x);
printf("Unsigned : %d\n",(unsigned)x);
}
signed x bit form will be : 1000 0001 = -1
when converted to unsigned its value has to be 1000 0001 = 129.
But even after type conversion it prints the -1.
Note: I'm using gcc compiler.
C11 6.3.1.3, paragraph 2:
Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
−1 cannot be represented as an unsigned int value, the −1 is converted to UINT_MAX. Thus -1 becomes a very large positive integer.
Also, use %u for unsigned int, otherwise result invoked undefined behaviour.
C semantics regarding conversion of integers is defined in terms of the mathematical values, not in terms of the bits. It is good to understand how values are represented as bits, but, when analyzing expressions, you should think about values.
In this statement:
printf("Signed : %d\n",x);
the signed char x is automatically promoted to int. Since x is −1, the new int also has the value −1, and printf prints “-1”.
In this statement:
printf("Unsigned : %d\n",(unsigned)x);
the signed char x is automatically promoted to int. The value is still −1. Then the cast converts this to unsigned. The rule for conversion to unsigned is that UINT_MAX+1 is added to or subtracted from the value as needed to bring it into the range of unsigned. In this case, adding UINT_MAX+1 to −1 once brings the value to UINT_MAX, which is within range. So the result of the conversion is UINT_MAX.
However, this unsigned value is then passed to printf to be printed with the %d conversion. That violates C 2018 7.21.6.1 9, which says the behavior is undefined if the types do not match.
This means a C implementation is allowed to do anything. In your case, it seems what happened is:
The unsigned value UINT_MAX is represented with all one bits.
The printf interpreted the all-one-bits value as an int.
Your implementation uses two’s complement for int types.
In two’s complement, an object with all one bits represents −1.
So printf printed “-1”.
If you had used this correct code instead:
printf("Unsigned : %u\n",(unsigned)x);
then printf would print the value of UINT_MAX, which is likely 4,294,967,295, so printf would print “4294967295”.
Why does casting a double 728.3 to an unsigned char produce zero? 728 is 0x2D8, so shouldn't w be 0xD8 (216)?
int w = (unsigned char)728.3;
int x = (int)728.3;
int y = (int)(unsigned char)728.3;
int z = (unsigned char)(int)728.3;
printf( "%i %i %i %i", w, x, y, z );
// prints 0 728 0 216
From the C standard 6.3.1.4p1:
When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.
So, unless you have >=10 bit unsigned char, your code invokes undefined behaviour.
Note that the cast explicitly tells the compiler you know what you are doing, thus suppresses a warning.
Supposing that unsigned char has 8 value bits, as is nearly (but not completely) certain for your implementation, the behavior of converting the double value 728.3 to type unsigned char is undefined, as specified by paragraph 6.3.1.4/1 of the standard:
When a finite value of real floating type is converted to an integer
type other than _Bool, the fractional part is discarded (i.e., the
value is truncated toward zero). If the value of the integral part
cannot be represented by the integer type, the behavior is undefined.
This applies to both your w and your y. It does not apply to your x, and the rules covering conversions between integer values (i.e. your z) are different.
Basically, then, there is no answer at the C level for why you see the specific results you do, nor for why I see different ones when I run your code. The behavior is undefined; I can be thankful that it did not turn out to be an outpouring of nasal demons.
Suppose I have the following C code.
unsigned int u = 1234;
int i = -5678;
unsigned int result = u + i;
What implicit conversions are going on here, and is this code safe for all values of u and i? (Safe, in the sense that even though result in this example will overflow to some huge positive number, I could cast it back to an int and get the real result.)
Short Answer
Your i will be converted to an unsigned integer by adding UINT_MAX + 1, then the addition will be carried out with the unsigned values, resulting in a large result (depending on the values of u and i).
Long Answer
According to the C99 Standard:
6.3.1.8 Usual arithmetic conversions
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
In your case, we have one unsigned int (u) and signed int (i). Referring to (3) above, since both operands have the same rank, your i will need to be converted to an unsigned integer.
6.3.1.3 Signed and unsigned integers
When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
Now we need to refer to (2) above. Your i will be converted to an unsigned value by adding UINT_MAX + 1. So the result will depend on how UINT_MAX is defined on your implementation. It will be large, but it will not overflow, because:
6.2.5 (9)
A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
Bonus: Arithmetic Conversion Semi-WTF
#include <stdio.h>
int main(void)
{
unsigned int plus_one = 1;
int minus_one = -1;
if(plus_one < minus_one)
printf("1 < -1");
else
printf("boring");
return 0;
}
You can use this link to try this online: https://repl.it/repls/QuickWhimsicalBytes
Bonus: Arithmetic Conversion Side Effect
Arithmetic conversion rules can be used to get the value of UINT_MAX by initializing an unsigned value to -1, ie:
unsigned int umax = -1; // umax set to UINT_MAX
This is guaranteed to be portable regardless of the signed number representation of the system because of the conversion rules described above. See this SO question for more information: Is it safe to use -1 to set all bits to true?
Conversion from signed to unsigned does not necessarily just copy or reinterpret the representation of the signed value. Quoting the C standard (C99 6.3.1.3):
When a value with integer type is converted to another integer type other than _Bool, if
the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
For the two's complement representation that's nearly universal these days, the rules do correspond to reinterpreting the bits. But for other representations (sign-and-magnitude or ones' complement), the C implementation must still arrange for the same result, which means that the conversion can't just copy the bits. For example, (unsigned)-1 == UINT_MAX, regardless of the representation.
In general, conversions in C are defined to operate on values, not on representations.
To answer the original question:
unsigned int u = 1234;
int i = -5678;
unsigned int result = u + i;
The value of i is converted to unsigned int, yielding UINT_MAX + 1 - 5678. This value is then added to the unsigned value 1234, yielding UINT_MAX + 1 - 4444.
(Unlike unsigned overflow, signed overflow invokes undefined behavior. Wraparound is common, but is not guaranteed by the C standard -- and compiler optimizations can wreak havoc on code that makes unwarranted assumptions.)
Referring to The C Programming Language, Second Edition (ISBN 0131103628),
Your addition operation causes the int to be converted to an unsigned int.
Assuming two's complement representation and equally sized types, the bit pattern does not change.
Conversion from unsigned int to signed int is implementation dependent. (But it probably works the way you expect on most platforms these days.)
The rules are a little more complicated in the case of combining signed and unsigned of differing sizes.
When converting from signed to unsigned there are two possibilities. Numbers that were originally positive remain (or are interpreted as) the same value. Number that were originally negative will now be interpreted as larger positive numbers.
When one unsigned and one signed variable are added (or any binary operation) both are implicitly converted to unsigned, which would in this case result in a huge result.
So it is safe in the sense of that the result might be huge and wrong, but it will never crash.
As was previously answered, you can cast back and forth between signed and unsigned without a problem. The border case for signed integers is -1 (0xFFFFFFFF). Try adding and subtracting from that and you'll find that you can cast back and have it be correct.
However, if you are going to be casting back and forth, I would strongly advise naming your variables such that it is clear what type they are, eg:
int iValue, iResult;
unsigned int uValue, uResult;
It is far too easy to get distracted by more important issues and forget which variable is what type if they are named without a hint. You don't want to cast to an unsigned and then use that as an array index.
What implicit conversions are going on here,
i will be converted to an unsigned integer.
and is this code safe for all values of u and i?
Safe in the sense of being well-defined yes (see https://stackoverflow.com/a/50632/5083516 ).
The rules are written in typically hard to read standards-speak but essentially whatever representation was used in the signed integer the unsigned integer will contain a 2's complement representation of the number.
Addition, subtraction and multiplication will work correctly on these numbers resulting in another unsigned integer containing a twos complement number representing the "real result".
division and casting to larger unsigned integer types will have well-defined results but those results will not be 2's complement representations of the "real result".
(Safe, in the sense that even though result in this example will overflow to some huge positive number, I could cast it back to an int and get the real result.)
While conversions from signed to unsigned are defined by the standard the reverse is implementation-defined both gcc and msvc define the conversion such that you will get the "real result" when converting a 2's complement number stored in an unsigned integer back to a signed integer. I expect you will only find any other behaviour on obscure systems that don't use 2's complement for signed integers.
https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation
https://msdn.microsoft.com/en-us/library/0eex498h.aspx
Horrible Answers Galore
Ozgur Ozcitak
When you cast from signed to unsigned
(and vice versa) the internal
representation of the number does not
change. What changes is how the
compiler interprets the sign bit.
This is completely wrong.
Mats Fredriksson
When one unsigned and one signed
variable are added (or any binary
operation) both are implicitly
converted to unsigned, which would in
this case result in a huge result.
This is also wrong. Unsigned ints may be promoted to ints should they have equal precision due to padding bits in the unsigned type.
smh
Your addition operation causes the int
to be converted to an unsigned int.
Wrong. Maybe it does and maybe it doesn't.
Conversion from unsigned int to signed
int is implementation dependent. (But
it probably works the way you expect
on most platforms these days.)
Wrong. It is either undefined behavior if it causes overflow or the value is preserved.
Anonymous
The value of i is converted to
unsigned int ...
Wrong. Depends on the precision of an int relative to an unsigned int.
Taylor Price
As was previously answered, you can
cast back and forth between signed and
unsigned without a problem.
Wrong. Trying to store a value outside the range of a signed integer results in undefined behavior.
Now I can finally answer the question.
Should the precision of int be equal to unsigned int, u will be promoted to a signed int and you will get the value -4444 from the expression (u+i). Now, should u and i have other values, you may get overflow and undefined behavior but with those exact numbers you will get -4444 [1]. This value will have type int. But you are trying to store that value into an unsigned int so that will then be cast to an unsigned int and the value that result will end up having would be (UINT_MAX+1) - 4444.
Should the precision of unsigned int be greater than that of an int, the signed int will be promoted to an unsigned int yielding the value (UINT_MAX+1) - 5678 which will be added to the other unsigned int 1234. Should u and i have other values, which make the expression fall outside the range {0..UINT_MAX} the value (UINT_MAX+1) will either be added or subtracted until the result DOES fall inside the range {0..UINT_MAX) and no undefined behavior will occur.
What is precision?
Integers have padding bits, sign bits, and value bits. Unsigned integers do not have a sign bit obviously. Unsigned char is further guaranteed to not have padding bits. The number of values bits an integer has is how much precision it has.
[Gotchas]
The macro sizeof macro alone cannot be used to determine precision of an integer if padding bits are present. And the size of a byte does not have to be an octet (eight bits) as defined by C99.
[1] The overflow may occur at one of two points. Either before the addition (during promotion) - when you have an unsigned int which is too large to fit inside an int. The overflow may also occur after the addition even if the unsigned int was within the range of an int, after the addition the result may still overflow.
Suppose I have the following C code.
unsigned int u = 1234;
int i = -5678;
unsigned int result = u + i;
What implicit conversions are going on here, and is this code safe for all values of u and i? (Safe, in the sense that even though result in this example will overflow to some huge positive number, I could cast it back to an int and get the real result.)
Short Answer
Your i will be converted to an unsigned integer by adding UINT_MAX + 1, then the addition will be carried out with the unsigned values, resulting in a large result (depending on the values of u and i).
Long Answer
According to the C99 Standard:
6.3.1.8 Usual arithmetic conversions
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
In your case, we have one unsigned int (u) and signed int (i). Referring to (3) above, since both operands have the same rank, your i will need to be converted to an unsigned integer.
6.3.1.3 Signed and unsigned integers
When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
Now we need to refer to (2) above. Your i will be converted to an unsigned value by adding UINT_MAX + 1. So the result will depend on how UINT_MAX is defined on your implementation. It will be large, but it will not overflow, because:
6.2.5 (9)
A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
Bonus: Arithmetic Conversion Semi-WTF
#include <stdio.h>
int main(void)
{
unsigned int plus_one = 1;
int minus_one = -1;
if(plus_one < minus_one)
printf("1 < -1");
else
printf("boring");
return 0;
}
You can use this link to try this online: https://repl.it/repls/QuickWhimsicalBytes
Bonus: Arithmetic Conversion Side Effect
Arithmetic conversion rules can be used to get the value of UINT_MAX by initializing an unsigned value to -1, ie:
unsigned int umax = -1; // umax set to UINT_MAX
This is guaranteed to be portable regardless of the signed number representation of the system because of the conversion rules described above. See this SO question for more information: Is it safe to use -1 to set all bits to true?
Conversion from signed to unsigned does not necessarily just copy or reinterpret the representation of the signed value. Quoting the C standard (C99 6.3.1.3):
When a value with integer type is converted to another integer type other than _Bool, if
the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
For the two's complement representation that's nearly universal these days, the rules do correspond to reinterpreting the bits. But for other representations (sign-and-magnitude or ones' complement), the C implementation must still arrange for the same result, which means that the conversion can't just copy the bits. For example, (unsigned)-1 == UINT_MAX, regardless of the representation.
In general, conversions in C are defined to operate on values, not on representations.
To answer the original question:
unsigned int u = 1234;
int i = -5678;
unsigned int result = u + i;
The value of i is converted to unsigned int, yielding UINT_MAX + 1 - 5678. This value is then added to the unsigned value 1234, yielding UINT_MAX + 1 - 4444.
(Unlike unsigned overflow, signed overflow invokes undefined behavior. Wraparound is common, but is not guaranteed by the C standard -- and compiler optimizations can wreak havoc on code that makes unwarranted assumptions.)
Referring to The C Programming Language, Second Edition (ISBN 0131103628),
Your addition operation causes the int to be converted to an unsigned int.
Assuming two's complement representation and equally sized types, the bit pattern does not change.
Conversion from unsigned int to signed int is implementation dependent. (But it probably works the way you expect on most platforms these days.)
The rules are a little more complicated in the case of combining signed and unsigned of differing sizes.
When converting from signed to unsigned there are two possibilities. Numbers that were originally positive remain (or are interpreted as) the same value. Number that were originally negative will now be interpreted as larger positive numbers.
When one unsigned and one signed variable are added (or any binary operation) both are implicitly converted to unsigned, which would in this case result in a huge result.
So it is safe in the sense of that the result might be huge and wrong, but it will never crash.
As was previously answered, you can cast back and forth between signed and unsigned without a problem. The border case for signed integers is -1 (0xFFFFFFFF). Try adding and subtracting from that and you'll find that you can cast back and have it be correct.
However, if you are going to be casting back and forth, I would strongly advise naming your variables such that it is clear what type they are, eg:
int iValue, iResult;
unsigned int uValue, uResult;
It is far too easy to get distracted by more important issues and forget which variable is what type if they are named without a hint. You don't want to cast to an unsigned and then use that as an array index.
What implicit conversions are going on here,
i will be converted to an unsigned integer.
and is this code safe for all values of u and i?
Safe in the sense of being well-defined yes (see https://stackoverflow.com/a/50632/5083516 ).
The rules are written in typically hard to read standards-speak but essentially whatever representation was used in the signed integer the unsigned integer will contain a 2's complement representation of the number.
Addition, subtraction and multiplication will work correctly on these numbers resulting in another unsigned integer containing a twos complement number representing the "real result".
division and casting to larger unsigned integer types will have well-defined results but those results will not be 2's complement representations of the "real result".
(Safe, in the sense that even though result in this example will overflow to some huge positive number, I could cast it back to an int and get the real result.)
While conversions from signed to unsigned are defined by the standard the reverse is implementation-defined both gcc and msvc define the conversion such that you will get the "real result" when converting a 2's complement number stored in an unsigned integer back to a signed integer. I expect you will only find any other behaviour on obscure systems that don't use 2's complement for signed integers.
https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation
https://msdn.microsoft.com/en-us/library/0eex498h.aspx
Horrible Answers Galore
Ozgur Ozcitak
When you cast from signed to unsigned
(and vice versa) the internal
representation of the number does not
change. What changes is how the
compiler interprets the sign bit.
This is completely wrong.
Mats Fredriksson
When one unsigned and one signed
variable are added (or any binary
operation) both are implicitly
converted to unsigned, which would in
this case result in a huge result.
This is also wrong. Unsigned ints may be promoted to ints should they have equal precision due to padding bits in the unsigned type.
smh
Your addition operation causes the int
to be converted to an unsigned int.
Wrong. Maybe it does and maybe it doesn't.
Conversion from unsigned int to signed
int is implementation dependent. (But
it probably works the way you expect
on most platforms these days.)
Wrong. It is either undefined behavior if it causes overflow or the value is preserved.
Anonymous
The value of i is converted to
unsigned int ...
Wrong. Depends on the precision of an int relative to an unsigned int.
Taylor Price
As was previously answered, you can
cast back and forth between signed and
unsigned without a problem.
Wrong. Trying to store a value outside the range of a signed integer results in undefined behavior.
Now I can finally answer the question.
Should the precision of int be equal to unsigned int, u will be promoted to a signed int and you will get the value -4444 from the expression (u+i). Now, should u and i have other values, you may get overflow and undefined behavior but with those exact numbers you will get -4444 [1]. This value will have type int. But you are trying to store that value into an unsigned int so that will then be cast to an unsigned int and the value that result will end up having would be (UINT_MAX+1) - 4444.
Should the precision of unsigned int be greater than that of an int, the signed int will be promoted to an unsigned int yielding the value (UINT_MAX+1) - 5678 which will be added to the other unsigned int 1234. Should u and i have other values, which make the expression fall outside the range {0..UINT_MAX} the value (UINT_MAX+1) will either be added or subtracted until the result DOES fall inside the range {0..UINT_MAX) and no undefined behavior will occur.
What is precision?
Integers have padding bits, sign bits, and value bits. Unsigned integers do not have a sign bit obviously. Unsigned char is further guaranteed to not have padding bits. The number of values bits an integer has is how much precision it has.
[Gotchas]
The macro sizeof macro alone cannot be used to determine precision of an integer if padding bits are present. And the size of a byte does not have to be an octet (eight bits) as defined by C99.
[1] The overflow may occur at one of two points. Either before the addition (during promotion) - when you have an unsigned int which is too large to fit inside an int. The overflow may also occur after the addition even if the unsigned int was within the range of an int, after the addition the result may still overflow.
Suppose I have the following C code.
unsigned int u = 1234;
int i = -5678;
unsigned int result = u + i;
What implicit conversions are going on here, and is this code safe for all values of u and i? (Safe, in the sense that even though result in this example will overflow to some huge positive number, I could cast it back to an int and get the real result.)
Short Answer
Your i will be converted to an unsigned integer by adding UINT_MAX + 1, then the addition will be carried out with the unsigned values, resulting in a large result (depending on the values of u and i).
Long Answer
According to the C99 Standard:
6.3.1.8 Usual arithmetic conversions
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
In your case, we have one unsigned int (u) and signed int (i). Referring to (3) above, since both operands have the same rank, your i will need to be converted to an unsigned integer.
6.3.1.3 Signed and unsigned integers
When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
Now we need to refer to (2) above. Your i will be converted to an unsigned value by adding UINT_MAX + 1. So the result will depend on how UINT_MAX is defined on your implementation. It will be large, but it will not overflow, because:
6.2.5 (9)
A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
Bonus: Arithmetic Conversion Semi-WTF
#include <stdio.h>
int main(void)
{
unsigned int plus_one = 1;
int minus_one = -1;
if(plus_one < minus_one)
printf("1 < -1");
else
printf("boring");
return 0;
}
You can use this link to try this online: https://repl.it/repls/QuickWhimsicalBytes
Bonus: Arithmetic Conversion Side Effect
Arithmetic conversion rules can be used to get the value of UINT_MAX by initializing an unsigned value to -1, ie:
unsigned int umax = -1; // umax set to UINT_MAX
This is guaranteed to be portable regardless of the signed number representation of the system because of the conversion rules described above. See this SO question for more information: Is it safe to use -1 to set all bits to true?
Conversion from signed to unsigned does not necessarily just copy or reinterpret the representation of the signed value. Quoting the C standard (C99 6.3.1.3):
When a value with integer type is converted to another integer type other than _Bool, if
the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
For the two's complement representation that's nearly universal these days, the rules do correspond to reinterpreting the bits. But for other representations (sign-and-magnitude or ones' complement), the C implementation must still arrange for the same result, which means that the conversion can't just copy the bits. For example, (unsigned)-1 == UINT_MAX, regardless of the representation.
In general, conversions in C are defined to operate on values, not on representations.
To answer the original question:
unsigned int u = 1234;
int i = -5678;
unsigned int result = u + i;
The value of i is converted to unsigned int, yielding UINT_MAX + 1 - 5678. This value is then added to the unsigned value 1234, yielding UINT_MAX + 1 - 4444.
(Unlike unsigned overflow, signed overflow invokes undefined behavior. Wraparound is common, but is not guaranteed by the C standard -- and compiler optimizations can wreak havoc on code that makes unwarranted assumptions.)
Referring to The C Programming Language, Second Edition (ISBN 0131103628),
Your addition operation causes the int to be converted to an unsigned int.
Assuming two's complement representation and equally sized types, the bit pattern does not change.
Conversion from unsigned int to signed int is implementation dependent. (But it probably works the way you expect on most platforms these days.)
The rules are a little more complicated in the case of combining signed and unsigned of differing sizes.
When converting from signed to unsigned there are two possibilities. Numbers that were originally positive remain (or are interpreted as) the same value. Number that were originally negative will now be interpreted as larger positive numbers.
When one unsigned and one signed variable are added (or any binary operation) both are implicitly converted to unsigned, which would in this case result in a huge result.
So it is safe in the sense of that the result might be huge and wrong, but it will never crash.
As was previously answered, you can cast back and forth between signed and unsigned without a problem. The border case for signed integers is -1 (0xFFFFFFFF). Try adding and subtracting from that and you'll find that you can cast back and have it be correct.
However, if you are going to be casting back and forth, I would strongly advise naming your variables such that it is clear what type they are, eg:
int iValue, iResult;
unsigned int uValue, uResult;
It is far too easy to get distracted by more important issues and forget which variable is what type if they are named without a hint. You don't want to cast to an unsigned and then use that as an array index.
What implicit conversions are going on here,
i will be converted to an unsigned integer.
and is this code safe for all values of u and i?
Safe in the sense of being well-defined yes (see https://stackoverflow.com/a/50632/5083516 ).
The rules are written in typically hard to read standards-speak but essentially whatever representation was used in the signed integer the unsigned integer will contain a 2's complement representation of the number.
Addition, subtraction and multiplication will work correctly on these numbers resulting in another unsigned integer containing a twos complement number representing the "real result".
division and casting to larger unsigned integer types will have well-defined results but those results will not be 2's complement representations of the "real result".
(Safe, in the sense that even though result in this example will overflow to some huge positive number, I could cast it back to an int and get the real result.)
While conversions from signed to unsigned are defined by the standard the reverse is implementation-defined both gcc and msvc define the conversion such that you will get the "real result" when converting a 2's complement number stored in an unsigned integer back to a signed integer. I expect you will only find any other behaviour on obscure systems that don't use 2's complement for signed integers.
https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation
https://msdn.microsoft.com/en-us/library/0eex498h.aspx
Horrible Answers Galore
Ozgur Ozcitak
When you cast from signed to unsigned
(and vice versa) the internal
representation of the number does not
change. What changes is how the
compiler interprets the sign bit.
This is completely wrong.
Mats Fredriksson
When one unsigned and one signed
variable are added (or any binary
operation) both are implicitly
converted to unsigned, which would in
this case result in a huge result.
This is also wrong. Unsigned ints may be promoted to ints should they have equal precision due to padding bits in the unsigned type.
smh
Your addition operation causes the int
to be converted to an unsigned int.
Wrong. Maybe it does and maybe it doesn't.
Conversion from unsigned int to signed
int is implementation dependent. (But
it probably works the way you expect
on most platforms these days.)
Wrong. It is either undefined behavior if it causes overflow or the value is preserved.
Anonymous
The value of i is converted to
unsigned int ...
Wrong. Depends on the precision of an int relative to an unsigned int.
Taylor Price
As was previously answered, you can
cast back and forth between signed and
unsigned without a problem.
Wrong. Trying to store a value outside the range of a signed integer results in undefined behavior.
Now I can finally answer the question.
Should the precision of int be equal to unsigned int, u will be promoted to a signed int and you will get the value -4444 from the expression (u+i). Now, should u and i have other values, you may get overflow and undefined behavior but with those exact numbers you will get -4444 [1]. This value will have type int. But you are trying to store that value into an unsigned int so that will then be cast to an unsigned int and the value that result will end up having would be (UINT_MAX+1) - 4444.
Should the precision of unsigned int be greater than that of an int, the signed int will be promoted to an unsigned int yielding the value (UINT_MAX+1) - 5678 which will be added to the other unsigned int 1234. Should u and i have other values, which make the expression fall outside the range {0..UINT_MAX} the value (UINT_MAX+1) will either be added or subtracted until the result DOES fall inside the range {0..UINT_MAX) and no undefined behavior will occur.
What is precision?
Integers have padding bits, sign bits, and value bits. Unsigned integers do not have a sign bit obviously. Unsigned char is further guaranteed to not have padding bits. The number of values bits an integer has is how much precision it has.
[Gotchas]
The macro sizeof macro alone cannot be used to determine precision of an integer if padding bits are present. And the size of a byte does not have to be an octet (eight bits) as defined by C99.
[1] The overflow may occur at one of two points. Either before the addition (during promotion) - when you have an unsigned int which is too large to fit inside an int. The overflow may also occur after the addition even if the unsigned int was within the range of an int, after the addition the result may still overflow.