Signed Short (Signed Int16) Multiplication Explanation - c

Signed Short (Signed Int16) Multiplication Explanation?
short ss = -32768; // 0x8000 SHRT_MIN
ss *= (short) -1;
printf ("%d", (int)ss); // Prints -32768
What are mechanics of how an unsigned short with a value of -32768 times -1 can be itself? My guess is that (int)32768 ---> overflows and wraps back around to -32768 but nowhere here is asking for promotion to integer or any larger datatype.
Looking for the part of the C specification that defines this behavior.

There is no undefined behavior here -- assuming that int is wider than 16 bits.
This:
ss *= (short) -1;
is equivalent to:
ss = ss * (short)-1;
Both operands of * are promoted from short to int (by the integer promotions), and the multiplication is done in type int. It yields the int value 32768 (again, unless INT_MAX == 32767, which is legal but rare in modern non-embedded systems). (C has no arithmetic operations on integer types narrower than int and unsigned int.)
That int value is converted back to short. Unlike arithmetic operations, a conversion of an integer value to a signed integer result, when the value does not fit in the target type, yields an implementation-defined result (or raises an implementation-defined signal, but I don't think any implementations do that).
Converting 32768 to type short will probably yield -32768.
The behavior of signed conversion is specified in N1570 6.3.1.3p3:
Otherwise, the new type is signed and the value cannot be represented
in it; either the result is implementation-defined or an
implementation-defined signal is raised.
The integer promotions are described in 6.3.1.1p2:
If an int can represent all values of the original type (as restricted
by the width, for a bit-field), the value is converted to an int;
otherwise, it is converted to an unsigned int. These are called the
integer promotions. All other types are unchanged by the integer promotions.

Related

How you avoid implicit conversion from short to integer during addition?

I'm doing a few integer for myself, where I'm trying to fully understand integer overflow.
I kept reading about how it can be dangerous to mix integer types of different sizes. For that reason i wanted to have an example where a short would overflow much faster than a int.
Here is the snippet:
unsigned int longt;
longt = 65530;
unsigned short shortt;
shortt = 65530;
if (longt > (shortt+10)){
printf("it is bigger");
}
But the if-statement here is not being run, which must mean that the short is not overflowing. Thus I conclude that in the expression shortt+10 a conversion happens from short to integer.
This is a bit strange to me, when the if statement evaluates expressions, does it then have the freedom to assign a new integer type as it pleases?
I then thought that if I was adding two short's then that would surely evaluate to a short:
unsigned int longt;
longt = 65530;
unsigned short shortt;
shortt = 65530;
shortt = shortt;
short tmp = 10;
if (longt > (shortt+tmp)){
printf("Ez bigger");
}
But alas, the proporsition still evaluates to false.
I then try do do something where I am completely explicit, where I actually do the addition into a short type, this time forcing it to overflow:
unsigned int longt;
longt = 65530;
unsigned short shortt;
shortt = 65530;
shortt = shortt;
short tmp = shortt + 10;
if (longt > tmp){
printf("Ez bigger");
}
Finally this worked, which also would be really annoying if it did'nt.
This flusters me a little bit though, and it reminds me of a ctf exercise that I did a while back, where I had to exploit this code snippet:
#include <stdio.h>
int main() {
int impossible_number;
FILE *flag;
char c;
if (scanf("%d", &impossible_number)) {
if (impossible_number > 0 && impossible_number > (impossible_number + 1)) {
flag = fopen("flag.txt","r");
while((c = getc(flag)) != EOF) {
printf("%c",c);
}
}
}
return 0;
}
Here, youre supposed to trigger a overflow of the "impossible_number" variable which was actually possible on the server that it was deployed upon, but would make issues when run locally.
int impossible_number;
FILE *flag;
char c;
if (scanf("%d", &impossible_number)) {
if (impossible_number > 0 && impossible_number > (impossible_number + 1)) {
flag = fopen("flag.txt","r");
while((c = getc(flag)) != EOF) {
printf("%c",c);
}
}
}
return 0;
You should be able to give "2147483647" as input, and then overflow and hit the if statement. However this does not happen when run locally, or running at an online compiler.
I don't get it, how do you get an expression to actually overflow the way that is is actually supossed to do, like in this example from 247ctf?
I hope someone has a answer for this
How you avoid implicit conversion from short to integer during addition?
You don't.
C has no arithmetic operations on integer types narrower than int and unsigned int. There is no + operator for type short.
Whenever an expression of type short is used as the operand of an arithmetic operator, it is implicitly converted to int.
For example:
short s = 1;
s = s + s;
In s + s, s is promoted from short to int and the addition is done in type int. The assignment then implicitly converts the result of the addition from int to short.
Some compilers might have an option to enable a warning for the narrowing conversion from int to short, but there's no way to avoid it.
What you're seeing is a result of integer promotions. What this basically means it that anytime an integer type smaller than int is used in an expression it is converted to int.
This is detailed in section 6.3.1.1p2 of the C standard:
The following may be used in an expression wherever an int or unsigned int may be used:
An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to
the rank of int and unsigned int.
A bit-field of type _Bool, int, signed int, or unsigned int.
If an int can represent all values of the original type (as restricted
by the width, for a bit-field), the value is converted to an int;
otherwise, it is converted to an unsigned int. These are called the
integer promotions. All other types are unchanged by the integer
promotions
That is what's happening here. So let's look at the first expression:
if (longt > (shortt+10)){
Here we have a unsigned short with value 65530 being added to the constant 10 which has type int. The unsigned short value is converted to an int value, so now we have the int value 65530 being added to the int value 10 which results in the int value 65540. We now have 65530 > 65540 which is false.
The same happens in the second case where both operands of the + operator are first promoted from unsigned short to int.
In the third case, the difference happens here:
short tmp = shortt + 10;
On the right side of the assignment, we still have the int value 65540 as before, but now this value needs to be assigned back to a short. This undergoes an implementation defined conversion to short, which is detailed in section 6.3.1.3:
1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new
type, it is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an
implementation-defined signal is raised.
Paragraph 3 takes effect in this particular case. In most implementations you're likely to come across, this will typically mean "wraparound" of the value.
So how do you work with this? The closest thing you can do is either what you did, i.e. assign the intermediate result to a variable of the desired type, or cast the intermediate result:
if (longt > (short)(shortt+10)) {
As for the "impossible" input in the CTF example, that actually causes signed integer overflow as a result of the the addition, and that triggers undefined behavior. For example, when I ran it on my machine, I got into the if block if I compiled with -O0 or -O1 but not with -O2.
How you avoid implicit conversion from short to integer during addition?
Not really avoidable.
On 16-bit and wider machines, the conversion short to int and unsigned short to unsigned does not affect the value. But addition overflow and the implicit conversion from int to unsigned renders a different result in 16-but vs. 32-bit for OP's values. For in 16-bit land, unsigned short to int does not implicitly occur. Instead, code does unsigned short to unsigned.
int/unsigned as 16-bit
If int/unsigned were 16-bit -common on many embedded processors, then shortt would not convert to an int, but to unsigned.
// Given 16-bit int/unsigned
unsigned int longt;
longt = 65530; // 32-bit long constant assigned to 16-bit unsigned - no value change as value in range.
unsigned short shortt;
shortt = 65530; // 32-bit long constant assigned to 16-bit unsigned short - no value change as value in range.
// (shortt+10)
// shortt+10 is a unsigned short + int
// unsigned short promotes to unsigned - no value change.
// Then since unsigned + int, the int 10 converts to unsigned 10 - no value change.
// unsigned 65530 + unsigned 10 exceeds unsigned range so 65536 subtracted.
// Sum is 4.
// Statment is true.
if (longt > (shortt+10)){
printf("it is bigger");
}
It is called an implicit conversion.
From C standard:
Several operators convert operand values from one type to another
automatically. This subclause specifies the result required from such
an implicit conversion, as well as those that result from a cast
operation (an explicit conversion ). The list in 6.3.1.8 summarizes
the conversions performed by most ordinary operators; it is
supplemented as required by the discussion of each operator in 6.5
Every integer type has an integer conversion rank defined as follows:
No two signed integer types shall have the same rank, even if they
have the same representation.
The rank of a signed integer type
shall be greater than the rank of any signed integer type with less
precision.
The rank of long long int shall be greater than the rank
of long int, which shall be greater than the rank of int, which shall
be greater than the rank of short int, which shall be greater than the
rank of signed char.
The rank of any unsigned integer type shall
equal the rank of the corresponding signed integer type, if any.
The
rank of any standard integer type shall be greater than the rank of
any extended integer type with the same width.
The rank of char
shall equal the rank of signed char and unsigned char.
The rank of
_Bool shall be less than the rank of all other standard integer types.
The rank of any enumerated type shall equal the rank of the
compatible integer type (see 6.7.2.2).
The rank of any extended
signed integer type relative to another extended signed integer type
with the same precision is implementation-defined, but still subject
to the other rules for determining the integer conversion rank.
For
all integer types T1, T2, and T3, if T1 has greater rank than T2 and
T2 has greater rank than T3, then T1 has greater rank than T3.
The
following may be used in an expression wherever an int or unsigned int
may be used:
— An object or expression with an integer type (other than int or unsigned
int) whose integer conversion rank is less than or equal to the rank
of int and unsigned int.
A bit-field of type _Bool, int, signed int,
or unsigned int. If an int can represent all v alues of the original
type (as restricted by the width, for a bit-field), the value is
converted to an int; otherwise, it is converted to an unsigned int.
These are called the integer promotions.58) All other types are
unchanged by the integer promotions.
The integer promotions preserve
value including sign. As discussed earlier, whether a ‘‘plain’’ char
is treated as signed is implementation-defined.
You cant avoid implicit conversion but you can cast the result of the operation to the required type
if (longt > (short)(shortt+tmp))
{
printf("Ez bigger");
}
https://godbolt.org/z/39Exa8E7K
But this conversion invokes Undefined Behaviour as your short integer overflows. You have to be very careful doing it as it can be a source of very hard to find and debug errors.

What happens when when casting an int to unsigned int in C? [duplicate]

Suppose I have the following C code.
unsigned int u = 1234;
int i = -5678;
unsigned int result = u + i;
What implicit conversions are going on here, and is this code safe for all values of u and i? (Safe, in the sense that even though result in this example will overflow to some huge positive number, I could cast it back to an int and get the real result.)
Short Answer
Your i will be converted to an unsigned integer by adding UINT_MAX + 1, then the addition will be carried out with the unsigned values, resulting in a large result (depending on the values of u and i).
Long Answer
According to the C99 Standard:
6.3.1.8 Usual arithmetic conversions
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
In your case, we have one unsigned int (u) and signed int (i). Referring to (3) above, since both operands have the same rank, your i will need to be converted to an unsigned integer.
6.3.1.3 Signed and unsigned integers
When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
Now we need to refer to (2) above. Your i will be converted to an unsigned value by adding UINT_MAX + 1. So the result will depend on how UINT_MAX is defined on your implementation. It will be large, but it will not overflow, because:
6.2.5 (9)
A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
Bonus: Arithmetic Conversion Semi-WTF
#include <stdio.h>
int main(void)
{
unsigned int plus_one = 1;
int minus_one = -1;
if(plus_one < minus_one)
printf("1 < -1");
else
printf("boring");
return 0;
}
You can use this link to try this online: https://repl.it/repls/QuickWhimsicalBytes
Bonus: Arithmetic Conversion Side Effect
Arithmetic conversion rules can be used to get the value of UINT_MAX by initializing an unsigned value to -1, ie:
unsigned int umax = -1; // umax set to UINT_MAX
This is guaranteed to be portable regardless of the signed number representation of the system because of the conversion rules described above. See this SO question for more information: Is it safe to use -1 to set all bits to true?
Conversion from signed to unsigned does not necessarily just copy or reinterpret the representation of the signed value. Quoting the C standard (C99 6.3.1.3):
When a value with integer type is converted to another integer type other than _Bool, if
the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
For the two's complement representation that's nearly universal these days, the rules do correspond to reinterpreting the bits. But for other representations (sign-and-magnitude or ones' complement), the C implementation must still arrange for the same result, which means that the conversion can't just copy the bits. For example, (unsigned)-1 == UINT_MAX, regardless of the representation.
In general, conversions in C are defined to operate on values, not on representations.
To answer the original question:
unsigned int u = 1234;
int i = -5678;
unsigned int result = u + i;
The value of i is converted to unsigned int, yielding UINT_MAX + 1 - 5678. This value is then added to the unsigned value 1234, yielding UINT_MAX + 1 - 4444.
(Unlike unsigned overflow, signed overflow invokes undefined behavior. Wraparound is common, but is not guaranteed by the C standard -- and compiler optimizations can wreak havoc on code that makes unwarranted assumptions.)
Referring to The C Programming Language, Second Edition (ISBN 0131103628),
Your addition operation causes the int to be converted to an unsigned int.
Assuming two's complement representation and equally sized types, the bit pattern does not change.
Conversion from unsigned int to signed int is implementation dependent. (But it probably works the way you expect on most platforms these days.)
The rules are a little more complicated in the case of combining signed and unsigned of differing sizes.
When converting from signed to unsigned there are two possibilities. Numbers that were originally positive remain (or are interpreted as) the same value. Number that were originally negative will now be interpreted as larger positive numbers.
When one unsigned and one signed variable are added (or any binary operation) both are implicitly converted to unsigned, which would in this case result in a huge result.
So it is safe in the sense of that the result might be huge and wrong, but it will never crash.
As was previously answered, you can cast back and forth between signed and unsigned without a problem. The border case for signed integers is -1 (0xFFFFFFFF). Try adding and subtracting from that and you'll find that you can cast back and have it be correct.
However, if you are going to be casting back and forth, I would strongly advise naming your variables such that it is clear what type they are, eg:
int iValue, iResult;
unsigned int uValue, uResult;
It is far too easy to get distracted by more important issues and forget which variable is what type if they are named without a hint. You don't want to cast to an unsigned and then use that as an array index.
What implicit conversions are going on here,
i will be converted to an unsigned integer.
and is this code safe for all values of u and i?
Safe in the sense of being well-defined yes (see https://stackoverflow.com/a/50632/5083516 ).
The rules are written in typically hard to read standards-speak but essentially whatever representation was used in the signed integer the unsigned integer will contain a 2's complement representation of the number.
Addition, subtraction and multiplication will work correctly on these numbers resulting in another unsigned integer containing a twos complement number representing the "real result".
division and casting to larger unsigned integer types will have well-defined results but those results will not be 2's complement representations of the "real result".
(Safe, in the sense that even though result in this example will overflow to some huge positive number, I could cast it back to an int and get the real result.)
While conversions from signed to unsigned are defined by the standard the reverse is implementation-defined both gcc and msvc define the conversion such that you will get the "real result" when converting a 2's complement number stored in an unsigned integer back to a signed integer. I expect you will only find any other behaviour on obscure systems that don't use 2's complement for signed integers.
https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation
https://msdn.microsoft.com/en-us/library/0eex498h.aspx
Horrible Answers Galore
Ozgur Ozcitak
When you cast from signed to unsigned
(and vice versa) the internal
representation of the number does not
change. What changes is how the
compiler interprets the sign bit.
This is completely wrong.
Mats Fredriksson
When one unsigned and one signed
variable are added (or any binary
operation) both are implicitly
converted to unsigned, which would in
this case result in a huge result.
This is also wrong. Unsigned ints may be promoted to ints should they have equal precision due to padding bits in the unsigned type.
smh
Your addition operation causes the int
to be converted to an unsigned int.
Wrong. Maybe it does and maybe it doesn't.
Conversion from unsigned int to signed
int is implementation dependent. (But
it probably works the way you expect
on most platforms these days.)
Wrong. It is either undefined behavior if it causes overflow or the value is preserved.
Anonymous
The value of i is converted to
unsigned int ...
Wrong. Depends on the precision of an int relative to an unsigned int.
Taylor Price
As was previously answered, you can
cast back and forth between signed and
unsigned without a problem.
Wrong. Trying to store a value outside the range of a signed integer results in undefined behavior.
Now I can finally answer the question.
Should the precision of int be equal to unsigned int, u will be promoted to a signed int and you will get the value -4444 from the expression (u+i). Now, should u and i have other values, you may get overflow and undefined behavior but with those exact numbers you will get -4444 [1]. This value will have type int. But you are trying to store that value into an unsigned int so that will then be cast to an unsigned int and the value that result will end up having would be (UINT_MAX+1) - 4444.
Should the precision of unsigned int be greater than that of an int, the signed int will be promoted to an unsigned int yielding the value (UINT_MAX+1) - 5678 which will be added to the other unsigned int 1234. Should u and i have other values, which make the expression fall outside the range {0..UINT_MAX} the value (UINT_MAX+1) will either be added or subtracted until the result DOES fall inside the range {0..UINT_MAX) and no undefined behavior will occur.
What is precision?
Integers have padding bits, sign bits, and value bits. Unsigned integers do not have a sign bit obviously. Unsigned char is further guaranteed to not have padding bits. The number of values bits an integer has is how much precision it has.
[Gotchas]
The macro sizeof macro alone cannot be used to determine precision of an integer if padding bits are present. And the size of a byte does not have to be an octet (eight bits) as defined by C99.
[1] The overflow may occur at one of two points. Either before the addition (during promotion) - when you have an unsigned int which is too large to fit inside an int. The overflow may also occur after the addition even if the unsigned int was within the range of an int, after the addition the result may still overflow.

safe to assign uint32_t variable to a signed value [duplicate]

Suppose I have the following C code.
unsigned int u = 1234;
int i = -5678;
unsigned int result = u + i;
What implicit conversions are going on here, and is this code safe for all values of u and i? (Safe, in the sense that even though result in this example will overflow to some huge positive number, I could cast it back to an int and get the real result.)
Short Answer
Your i will be converted to an unsigned integer by adding UINT_MAX + 1, then the addition will be carried out with the unsigned values, resulting in a large result (depending on the values of u and i).
Long Answer
According to the C99 Standard:
6.3.1.8 Usual arithmetic conversions
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
In your case, we have one unsigned int (u) and signed int (i). Referring to (3) above, since both operands have the same rank, your i will need to be converted to an unsigned integer.
6.3.1.3 Signed and unsigned integers
When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
Now we need to refer to (2) above. Your i will be converted to an unsigned value by adding UINT_MAX + 1. So the result will depend on how UINT_MAX is defined on your implementation. It will be large, but it will not overflow, because:
6.2.5 (9)
A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
Bonus: Arithmetic Conversion Semi-WTF
#include <stdio.h>
int main(void)
{
unsigned int plus_one = 1;
int minus_one = -1;
if(plus_one < minus_one)
printf("1 < -1");
else
printf("boring");
return 0;
}
You can use this link to try this online: https://repl.it/repls/QuickWhimsicalBytes
Bonus: Arithmetic Conversion Side Effect
Arithmetic conversion rules can be used to get the value of UINT_MAX by initializing an unsigned value to -1, ie:
unsigned int umax = -1; // umax set to UINT_MAX
This is guaranteed to be portable regardless of the signed number representation of the system because of the conversion rules described above. See this SO question for more information: Is it safe to use -1 to set all bits to true?
Conversion from signed to unsigned does not necessarily just copy or reinterpret the representation of the signed value. Quoting the C standard (C99 6.3.1.3):
When a value with integer type is converted to another integer type other than _Bool, if
the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
For the two's complement representation that's nearly universal these days, the rules do correspond to reinterpreting the bits. But for other representations (sign-and-magnitude or ones' complement), the C implementation must still arrange for the same result, which means that the conversion can't just copy the bits. For example, (unsigned)-1 == UINT_MAX, regardless of the representation.
In general, conversions in C are defined to operate on values, not on representations.
To answer the original question:
unsigned int u = 1234;
int i = -5678;
unsigned int result = u + i;
The value of i is converted to unsigned int, yielding UINT_MAX + 1 - 5678. This value is then added to the unsigned value 1234, yielding UINT_MAX + 1 - 4444.
(Unlike unsigned overflow, signed overflow invokes undefined behavior. Wraparound is common, but is not guaranteed by the C standard -- and compiler optimizations can wreak havoc on code that makes unwarranted assumptions.)
Referring to The C Programming Language, Second Edition (ISBN 0131103628),
Your addition operation causes the int to be converted to an unsigned int.
Assuming two's complement representation and equally sized types, the bit pattern does not change.
Conversion from unsigned int to signed int is implementation dependent. (But it probably works the way you expect on most platforms these days.)
The rules are a little more complicated in the case of combining signed and unsigned of differing sizes.
When converting from signed to unsigned there are two possibilities. Numbers that were originally positive remain (or are interpreted as) the same value. Number that were originally negative will now be interpreted as larger positive numbers.
When one unsigned and one signed variable are added (or any binary operation) both are implicitly converted to unsigned, which would in this case result in a huge result.
So it is safe in the sense of that the result might be huge and wrong, but it will never crash.
As was previously answered, you can cast back and forth between signed and unsigned without a problem. The border case for signed integers is -1 (0xFFFFFFFF). Try adding and subtracting from that and you'll find that you can cast back and have it be correct.
However, if you are going to be casting back and forth, I would strongly advise naming your variables such that it is clear what type they are, eg:
int iValue, iResult;
unsigned int uValue, uResult;
It is far too easy to get distracted by more important issues and forget which variable is what type if they are named without a hint. You don't want to cast to an unsigned and then use that as an array index.
What implicit conversions are going on here,
i will be converted to an unsigned integer.
and is this code safe for all values of u and i?
Safe in the sense of being well-defined yes (see https://stackoverflow.com/a/50632/5083516 ).
The rules are written in typically hard to read standards-speak but essentially whatever representation was used in the signed integer the unsigned integer will contain a 2's complement representation of the number.
Addition, subtraction and multiplication will work correctly on these numbers resulting in another unsigned integer containing a twos complement number representing the "real result".
division and casting to larger unsigned integer types will have well-defined results but those results will not be 2's complement representations of the "real result".
(Safe, in the sense that even though result in this example will overflow to some huge positive number, I could cast it back to an int and get the real result.)
While conversions from signed to unsigned are defined by the standard the reverse is implementation-defined both gcc and msvc define the conversion such that you will get the "real result" when converting a 2's complement number stored in an unsigned integer back to a signed integer. I expect you will only find any other behaviour on obscure systems that don't use 2's complement for signed integers.
https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation
https://msdn.microsoft.com/en-us/library/0eex498h.aspx
Horrible Answers Galore
Ozgur Ozcitak
When you cast from signed to unsigned
(and vice versa) the internal
representation of the number does not
change. What changes is how the
compiler interprets the sign bit.
This is completely wrong.
Mats Fredriksson
When one unsigned and one signed
variable are added (or any binary
operation) both are implicitly
converted to unsigned, which would in
this case result in a huge result.
This is also wrong. Unsigned ints may be promoted to ints should they have equal precision due to padding bits in the unsigned type.
smh
Your addition operation causes the int
to be converted to an unsigned int.
Wrong. Maybe it does and maybe it doesn't.
Conversion from unsigned int to signed
int is implementation dependent. (But
it probably works the way you expect
on most platforms these days.)
Wrong. It is either undefined behavior if it causes overflow or the value is preserved.
Anonymous
The value of i is converted to
unsigned int ...
Wrong. Depends on the precision of an int relative to an unsigned int.
Taylor Price
As was previously answered, you can
cast back and forth between signed and
unsigned without a problem.
Wrong. Trying to store a value outside the range of a signed integer results in undefined behavior.
Now I can finally answer the question.
Should the precision of int be equal to unsigned int, u will be promoted to a signed int and you will get the value -4444 from the expression (u+i). Now, should u and i have other values, you may get overflow and undefined behavior but with those exact numbers you will get -4444 [1]. This value will have type int. But you are trying to store that value into an unsigned int so that will then be cast to an unsigned int and the value that result will end up having would be (UINT_MAX+1) - 4444.
Should the precision of unsigned int be greater than that of an int, the signed int will be promoted to an unsigned int yielding the value (UINT_MAX+1) - 5678 which will be added to the other unsigned int 1234. Should u and i have other values, which make the expression fall outside the range {0..UINT_MAX} the value (UINT_MAX+1) will either be added or subtracted until the result DOES fall inside the range {0..UINT_MAX) and no undefined behavior will occur.
What is precision?
Integers have padding bits, sign bits, and value bits. Unsigned integers do not have a sign bit obviously. Unsigned char is further guaranteed to not have padding bits. The number of values bits an integer has is how much precision it has.
[Gotchas]
The macro sizeof macro alone cannot be used to determine precision of an integer if padding bits are present. And the size of a byte does not have to be an octet (eight bits) as defined by C99.
[1] The overflow may occur at one of two points. Either before the addition (during promotion) - when you have an unsigned int which is too large to fit inside an int. The overflow may also occur after the addition even if the unsigned int was within the range of an int, after the addition the result may still overflow.

Is conversion from unsigned to signed undefined?

void fun(){
signed int a=-5;
unsigned int b=-5;
printf("the value of b is %u\n",b);
if(a==b)
printf("same\n");
else
printf("diff");
}
It is printing :
4294967291
same
In the 2nd line signed value is converted to unsigned value. So b has the value UINTMAX + 1 - 5 = 4294967291.
My question is what is happening in the comparison operation .
1) Is a again converted to unsigned and compared with b ?
2) Will b(ie unsigned ) be ever casted to signed value and compared automatically?
3) Is conversion from unsigned to signed undefined due to int overflow ?
I have read other posts on the topic. I just want clarification on questions 2 and 3 .
1) Is a again converted to unsigned and compared with b ?
Yes. In the expression (a==b), the implicit type conversion called "balancing" takes place (the formal name is "the usual arithmetic conversions"). Balancing rules specify that if a signed and a unsigned operand of the same size and type are compared, the signed operand is converted to a unsigned.
2) Will b(ie unsigned ) be ever casted to signed value and compared automatically?
No, it will never be converted to signed in your example.
3) Is conversion from unsigned to signed undefined due to int overflow ?
This is what the standard says: (C11)
6.3.1.3 Signed and unsigned integers
1 When a value with integer type is converted to another integer type
other than _Bool, if the value can be represented by the new type, it
is unchanged.
2 Otherwise, if the new type is unsigned, the value is
converted by repeatedly adding or subtracting one more than the
maximum value that can be represented in the new type until the value
is in the range of the new type.
3 Otherwise, the new type is
signed and the value cannot be represented in it; either the result is
implementation-defined or an implementation-defined signal is raised.
In other words, if the compiler can manage to do the conversion in 2) above, then the behavior is well-defined. If it cannot, then the result depends on the compiler implementation.
It is not undefined behavior.
Answers:
a is converted to unsigned int.
If a had a wider range than the signed counterpart of b (I can imagine long long a would do), b would be converted to a signed type.
If an unsigned value can't be correctly represented after conversion to a signed type, you'll have implementation-defined behavior. If it can, no problem.
b = -5;
That counts as overflow and is undefined behavior. Comparing signed with unsigned automatically promotes the operands to unsigned. -In case of overflow, you'll find yourself again in a case of undefined behavior.- plain wrong, see [edit] below
See also http://c-faq.com/expr/preservingrules.html.
[edit] Correction - the standard does state that the 2 complement should be used when converting from negative signed to unsigned.

Signed to unsigned conversion in C - is it always safe?

Suppose I have the following C code.
unsigned int u = 1234;
int i = -5678;
unsigned int result = u + i;
What implicit conversions are going on here, and is this code safe for all values of u and i? (Safe, in the sense that even though result in this example will overflow to some huge positive number, I could cast it back to an int and get the real result.)
Short Answer
Your i will be converted to an unsigned integer by adding UINT_MAX + 1, then the addition will be carried out with the unsigned values, resulting in a large result (depending on the values of u and i).
Long Answer
According to the C99 Standard:
6.3.1.8 Usual arithmetic conversions
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
In your case, we have one unsigned int (u) and signed int (i). Referring to (3) above, since both operands have the same rank, your i will need to be converted to an unsigned integer.
6.3.1.3 Signed and unsigned integers
When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
Now we need to refer to (2) above. Your i will be converted to an unsigned value by adding UINT_MAX + 1. So the result will depend on how UINT_MAX is defined on your implementation. It will be large, but it will not overflow, because:
6.2.5 (9)
A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
Bonus: Arithmetic Conversion Semi-WTF
#include <stdio.h>
int main(void)
{
unsigned int plus_one = 1;
int minus_one = -1;
if(plus_one < minus_one)
printf("1 < -1");
else
printf("boring");
return 0;
}
You can use this link to try this online: https://repl.it/repls/QuickWhimsicalBytes
Bonus: Arithmetic Conversion Side Effect
Arithmetic conversion rules can be used to get the value of UINT_MAX by initializing an unsigned value to -1, ie:
unsigned int umax = -1; // umax set to UINT_MAX
This is guaranteed to be portable regardless of the signed number representation of the system because of the conversion rules described above. See this SO question for more information: Is it safe to use -1 to set all bits to true?
Conversion from signed to unsigned does not necessarily just copy or reinterpret the representation of the signed value. Quoting the C standard (C99 6.3.1.3):
When a value with integer type is converted to another integer type other than _Bool, if
the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
For the two's complement representation that's nearly universal these days, the rules do correspond to reinterpreting the bits. But for other representations (sign-and-magnitude or ones' complement), the C implementation must still arrange for the same result, which means that the conversion can't just copy the bits. For example, (unsigned)-1 == UINT_MAX, regardless of the representation.
In general, conversions in C are defined to operate on values, not on representations.
To answer the original question:
unsigned int u = 1234;
int i = -5678;
unsigned int result = u + i;
The value of i is converted to unsigned int, yielding UINT_MAX + 1 - 5678. This value is then added to the unsigned value 1234, yielding UINT_MAX + 1 - 4444.
(Unlike unsigned overflow, signed overflow invokes undefined behavior. Wraparound is common, but is not guaranteed by the C standard -- and compiler optimizations can wreak havoc on code that makes unwarranted assumptions.)
Referring to The C Programming Language, Second Edition (ISBN 0131103628),
Your addition operation causes the int to be converted to an unsigned int.
Assuming two's complement representation and equally sized types, the bit pattern does not change.
Conversion from unsigned int to signed int is implementation dependent. (But it probably works the way you expect on most platforms these days.)
The rules are a little more complicated in the case of combining signed and unsigned of differing sizes.
When converting from signed to unsigned there are two possibilities. Numbers that were originally positive remain (or are interpreted as) the same value. Number that were originally negative will now be interpreted as larger positive numbers.
When one unsigned and one signed variable are added (or any binary operation) both are implicitly converted to unsigned, which would in this case result in a huge result.
So it is safe in the sense of that the result might be huge and wrong, but it will never crash.
As was previously answered, you can cast back and forth between signed and unsigned without a problem. The border case for signed integers is -1 (0xFFFFFFFF). Try adding and subtracting from that and you'll find that you can cast back and have it be correct.
However, if you are going to be casting back and forth, I would strongly advise naming your variables such that it is clear what type they are, eg:
int iValue, iResult;
unsigned int uValue, uResult;
It is far too easy to get distracted by more important issues and forget which variable is what type if they are named without a hint. You don't want to cast to an unsigned and then use that as an array index.
What implicit conversions are going on here,
i will be converted to an unsigned integer.
and is this code safe for all values of u and i?
Safe in the sense of being well-defined yes (see https://stackoverflow.com/a/50632/5083516 ).
The rules are written in typically hard to read standards-speak but essentially whatever representation was used in the signed integer the unsigned integer will contain a 2's complement representation of the number.
Addition, subtraction and multiplication will work correctly on these numbers resulting in another unsigned integer containing a twos complement number representing the "real result".
division and casting to larger unsigned integer types will have well-defined results but those results will not be 2's complement representations of the "real result".
(Safe, in the sense that even though result in this example will overflow to some huge positive number, I could cast it back to an int and get the real result.)
While conversions from signed to unsigned are defined by the standard the reverse is implementation-defined both gcc and msvc define the conversion such that you will get the "real result" when converting a 2's complement number stored in an unsigned integer back to a signed integer. I expect you will only find any other behaviour on obscure systems that don't use 2's complement for signed integers.
https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation
https://msdn.microsoft.com/en-us/library/0eex498h.aspx
Horrible Answers Galore
Ozgur Ozcitak
When you cast from signed to unsigned
(and vice versa) the internal
representation of the number does not
change. What changes is how the
compiler interprets the sign bit.
This is completely wrong.
Mats Fredriksson
When one unsigned and one signed
variable are added (or any binary
operation) both are implicitly
converted to unsigned, which would in
this case result in a huge result.
This is also wrong. Unsigned ints may be promoted to ints should they have equal precision due to padding bits in the unsigned type.
smh
Your addition operation causes the int
to be converted to an unsigned int.
Wrong. Maybe it does and maybe it doesn't.
Conversion from unsigned int to signed
int is implementation dependent. (But
it probably works the way you expect
on most platforms these days.)
Wrong. It is either undefined behavior if it causes overflow or the value is preserved.
Anonymous
The value of i is converted to
unsigned int ...
Wrong. Depends on the precision of an int relative to an unsigned int.
Taylor Price
As was previously answered, you can
cast back and forth between signed and
unsigned without a problem.
Wrong. Trying to store a value outside the range of a signed integer results in undefined behavior.
Now I can finally answer the question.
Should the precision of int be equal to unsigned int, u will be promoted to a signed int and you will get the value -4444 from the expression (u+i). Now, should u and i have other values, you may get overflow and undefined behavior but with those exact numbers you will get -4444 [1]. This value will have type int. But you are trying to store that value into an unsigned int so that will then be cast to an unsigned int and the value that result will end up having would be (UINT_MAX+1) - 4444.
Should the precision of unsigned int be greater than that of an int, the signed int will be promoted to an unsigned int yielding the value (UINT_MAX+1) - 5678 which will be added to the other unsigned int 1234. Should u and i have other values, which make the expression fall outside the range {0..UINT_MAX} the value (UINT_MAX+1) will either be added or subtracted until the result DOES fall inside the range {0..UINT_MAX) and no undefined behavior will occur.
What is precision?
Integers have padding bits, sign bits, and value bits. Unsigned integers do not have a sign bit obviously. Unsigned char is further guaranteed to not have padding bits. The number of values bits an integer has is how much precision it has.
[Gotchas]
The macro sizeof macro alone cannot be used to determine precision of an integer if padding bits are present. And the size of a byte does not have to be an octet (eight bits) as defined by C99.
[1] The overflow may occur at one of two points. Either before the addition (during promotion) - when you have an unsigned int which is too large to fit inside an int. The overflow may also occur after the addition even if the unsigned int was within the range of an int, after the addition the result may still overflow.

Resources