integer promotion and unsigned interpretation - c

int_8 int8 = ~0;
uint_16 uInt16 = (uint_16) int8;
Regarding the typecast above; where in C standard can I find reference to an indication for the following behaviour?
- sign extension to the larger type before the unsigned interpretation (uInt16=0xFFFF) rather than unsigned interpretation followed by 0 extension to the larger type (uInt16=0xFF).
From C99 6.3.1.8
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Above statement is clear about which variable needs to be converted however it is not very clear about how the conversation should actually be performed hence my question asking for a reference from the standard.
Thanks

As per the standard:
6.3.1.3 Signed and unsigned integers
......
2. Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
And the footnote to avoid the confusion when interpreting the above:
The rules describe arithmetic on the mathematical value, not the value of a given type of expression.
I.e. if your int8 has a value of -1 (assuming the negatives representations is 2's complement, it does in your example), when converted into uint16_t, the value (0xFFFF + 1) will be added to it (which one more than the max value that can be represented by uint16_t), which yields the result of 0xFFFF + 1 - 1 = 0xFFFF.

Answer I believe is actually part of 6.3.1.8 as well:
Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:
....
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand.....
meaning that integer promotions are performed first before the conversion to unsigned using the rule 6.3.1.3.

Related

Does x ^= x & -x; where x is an unsigned integer invoke UB?

Does this function invoke undefined behavior due to the - operator being applied to x which is unsigned? I searched the standard and couldn't find an explanation.
unsigned foo(unsigned x)
{
return x ^= x & -x;
}
IMO yes.
edit
void func(unsigned x)
{
printf("%x", -x);
}
int main(void)
{
func(INT_MIN);
}
IMO The only explanation is that it was promoted to larger signed integer size then converted to unsigned.
If it is promoted to larger integer size, what will happen if there is no larger signed integer type?
The behavior of this expression is well defined.
Constructs similar to x = x + 1 are allowed because x isn't assigned a value until all other subexpressions are evaulated. The same applies in this case.
There is also no problem with -x because the expression has unsigned type and thus has well defined wraparound behavior as opposed to overflowing.
Section 6.5.3.3p3 of the C standard regarding the unary - operator states:
The result of the unary - operator is the negative of its (promoted) operand. The integer promotions are performed on the operand, and the result has the promoted type.
So since no promotion occurs the type remains unsigned throughout the expression. Though not explicitly stated in the standard, -x is effectively the same as 0 - x.
For the specific case of INT_MIN being passed to this function, it has type int and is outside of the range of unsigned, so it is converted when passed to the function. This results in the signed value -2,147,483,648 being converted to the unsigned value 2,147,483,648 (which in two's complement happen to have the same representation, i.e. 0x80000000). Then when -x is evaluated, it wraps around resulting in 2,147,483,648.
6.2.5 Types
...
9 The range of nonnegative values of a signed integer type is a subrange of the
corresponding unsigned integer type, and the representation of the same value in each
type is the same.41) A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting unsigned integer type is
reduced modulo the number that is one greater than the largest value that can be
represented by the resulting type.
41) The same representation and alignment requirements are meant to imply interchangeability as
arguments to functions, return values from functions, and members of unions.
...
6.3 Conversions
...
6.3.1.3 Signed and unsigned integers
1 When a value with integer type is converted to another integer type other than _Bool, if
the value can be represented by the new type, it is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.60)
3 Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
60) The rules describe arithmetic on the mathematical value, not the value of a given type of expression
...
6.3.1.8 Usual arithmetic conversions
...
Otherwise, if the operand that has unsigned integer type has rank greater or
equal to the rank of the type of the other operand, then the operand with
signed integer type is converted to the type of the operand with unsigned
integer type.
In short - the -x does not lead to undefined behavior. The result of the expression is still unsigned, it just maps to a well-defined, non-negative value.

Usual Arithmetic Conversions: unexpected output

signed short Temp;
Temp = 0xF2C9;
Temp2 = 0x100;
unsigned char a;
a = (unsigned char)(Temp/((unsigned short)Temp2));
What will be the expected output?
As per my understanding due to "Usual Arithmetic Conversions" first Temp should be converted into unsigned shortand result in a should be 0xF2, but I am getting the response 0xF3 which means operation is performed with signed value of Temp. Please explain the behavior.
Is endianess also relevant in this scenario?
No, first all arguments to arithmetic operators are promoted, that is narrow types, such as your shorts are converted to int. (at least on all common architectures).
Assuming short is 16 bit wide on your system, the initialization of Temp is implementation defined because the value of 0xF2C9 doesn't fit to the type. Most probably it is a negative value. Then, for the computation, that negative signed short value is promoted to int. The result of the division is a negative value, which then in turn is converted to unsigned char.
It depends if INT_MAX >= USHRT_MAX
The "Usual Arithmetic Conversions" convert Temp into (int) Temp. This will only "extend the sign" if int is wider than short.
((unsigned short)Temp2) is promoted to (int)((unsigned short)Temp2) or (unsigned)((unsigned short)Temp2).
If INT_MAX >= USHRT_MAX, then the division is done as (int)/(int).
Otherwise, like on a 16-bit system, the division is done as (int)/(unsigned), which is done as (unsigned)/(unsigned).
[Edit]
Temp, initialized with 0xF2C9 (see note), likely has the value of -3383 (or has the value of 62153 should short unlikely be wider than 16 bits.)
With (int)/(int), -3383/256 --> -13.21... --> -13. -13 converted to unsigned char --> 256 - 13 --> 243 or 0xF3.
(Assuming 16-bit int/unsigned) With (unsigned)/(unsigned), -3383 is converted to unsigned 65536 - 3383 --> 62153. 62153/256 --> 242.78... --> 242. 242 converted to unsigned char --> 242 or 0xF2.
Endian-ness in not relevant in this scenario.
Note: As pointed out by #Jens Gustedt, the value in Temp is implementation defined when Temp is 16-bit.
integer division works with "truncation toward zero", so the result of this division is F3, which is -13, instead of F2, which is -14. if you calculate this in decimal represenation the result would be -13.21 and then you would cut the .21.
"The question asked why the integers are signed"
"As per my understanding due to "Usual Arithmetic Conversions" first Temp should be converted into unsigned short..." NO. You are wrong here. The signed short Temp has its high bit set, hence is negative. The bit gets extended to the left when converted to int. The signed short Temp2 does not have its high bt set; the cast to (unsigned short) has no effect; it is converted to a positive int. The negative int is now divided by the positive int, resulting in a negtive value.
In the conversion of Temp2 to int, you don't want the sign bit extended. Use:
a = (unsigned char)(((unsigned short)Temp)/Temp2);
(I didn't test it; just theory)
Jens ,Paul and Mch,
Thanks for your clarification. But as per "ISO/IEC 9899 section 6.3.1.1
The rank of any unsigned integer type shall equal the rank of the corresponding
signed integer type, if any.
and as per 6.3.1.8 Usual arithmetic conversions" following rules should be applicable.
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned
integer types, the operand with the type of lesser integer conversion rank is
converted to the type of the operand with greater rank.
**Otherwise, if the operand that has unsigned integer type has rank greater or
equal to the rank of the type of the other operand, then the operand with
signed integer type is converted to the type of the operand with unsigned
integer type.**
Otherwise, if the type of the operand with signed integer type can represent
all of the values of the type of the operand with unsigned integer type, then
the operand with unsigned integer type is converted to the type of the
operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type
so according to the above mentioned rule 3 1st signed short (integer type) should be converted in to unsigned short (integer type) and then arithmatic operation should be performed and result should be 0xF2.

unsigned and int gotcha [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Arithmetic operations on unsigned and signed integers
unsigned int b=2;
int a=-2;
if(a>b)
printf("a>b");
else
printf("b>a");
OUTPUT: a>b
int b=2;
int a=-2;
if(a>b)
printf("a>b");
else
printf("b>a");
OUTPUT: b>a
PLEASE, someone explain the output
In the first case both operands are converted to unsigned int, the converted a will be UINT_MAX-1, which is much larger than b and hence the output.
Don't compare signed and unsigned integers unless you understand the semantics of arithematic conversions, the results might surprise you.
When signed and unsigned values are compared, and when the unsigned values can't all be represented in the signed type, then the signed operand is promoted to unsigned. This is done with a formula that amounts to a reinterpretation of the 2-s complement bit pattern.
Well, negative numbers have lots of high bits set...
Since your operands are all of the same rank, it's just a matter of unsigned bit patterns being compared.
And so -2 is represented with 111111..110, one less than the largest possible, and it easily beats 2 when interpreted as unsigned.
The following is taken from The C Programming Language by Kernighan and Ritchie - 2.7 Type Conversions - page 44; the second half of the page explains the same scenario in detail. A small portion is below for your reference.
Conversion rules are complicated when unsigned operands are involved. The problem is that comparison between signed and unsigned values are machine dependent, because they depend on the sizes of the various integer types. For example, suppose that int is 16 bits long and long is 32 bits. Then -1L < 1U, because 1U, which is an int, is promoted to a signed long. But -1L > 1UL, because -1L is promoted to unsigned long and thus appears to be a larger positive number.
You need to learn the operation of the operators in C and the C promotion and conversion rules. They are explained in the C standard. Some excerpts from it plus my comments:
6.5.8 Relational operators
Syntax
1 relational-expression:
shift-expression
relational-expression < shift-expression
relational-expression > shift-expression
relational-expression <= shift-expression
relational-expression >= shift-expression
Semantics
3 If both of the operands have arithmetic type, the usual arithmetic conversions are
performed.
Most operators include this "usual arithmetic conversions" step before the actual operation (addition, multiplication, comparison, etc etc). - Alex
6.3.1.8 Usual arithmetic conversions
1 Many operators that expect operands of arithmetic type cause conversions and yield result
types in a similar way. The purpose is to determine a common real type for the operands
and result. For the specified operands, each operand is converted, without change of type
domain, to a type whose corresponding real type is the common real type. Unless
explicitly stated otherwise, the common real type is also the corresponding real type of
the result, whose type domain is the type domain of the operands if they are the same,
and complex otherwise. This pattern is called the usual arithmetic conversions:
First, if the corresponding real type of either operand is long double, the other
operand is converted, without change of type domain, to a type whose
corresponding real type is long double.
Otherwise, if the corresponding real type of either operand is double, the other
operand is converted, without change of type domain, to a type whose
corresponding real type is double.
Otherwise, if the corresponding real type of either operand is float, the other
operand is converted, without change of type domain, to a type whose
corresponding real type is float.
Otherwise, the integer promotions are performed on both operands. Then the
following rules are applied to the promoted operands:
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned
integer types, the operand with the type of lesser integer conversion rank is
converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or
equal to the rank of the type of the other operand, then the operand with
signed integer type is converted to the type of the operand with unsigned
integer type.
Otherwise, if the type of the operand with signed integer type can represent
all of the values of the type of the operand with unsigned integer type, then
the operand with unsigned integer type is converted to the type of the
operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type.
6.3.1.3 Signed and unsigned integers
When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type. (The rules describe arithmetic on the mathematical value, not the value of a given type of expression.)
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
So, in your a>b (with a being an int and b being an unsigned int), per the above rules you get a converted to unsigned int before the comparison. Since a is negative (-2), the unsigned value becomes UINT_MAX+1+a (this is the repeatedly adding or
subtracting one more than the maximum value bit). And UINT_MAX+1+a in your case is UINT_MAX+1-2 = UINT_MAX-1, which is a huge positive number compared to the value of b (2). And so a>b yields "true".
Forget the math you learned at school. Learn how C does it.
In this first case you get the unsigned a converted into a signed int. Then these two are compared.
Type conversion ranks between signed and unsigned types could have the same rank in C99. This is when a unsigned and a signed type have the corresponding types, when this happens the result is up to the compiler.
Here is a summary of the rules.

K&R chapter 2: Difference between u and l integer suffix in C

I am a student, going through the book by Kerningham and Ritchie for C.
A line in the book says that -1l is less than 1u because in that case unsigned int is promoted to signed long. But -1l > 1ul because in this case -1l is promoted to unsigned long.
I can't really understand the promotion properly. What will be the value of -1l when it is promoted to unsigned long? It'll be great if anyone can help.
Thanks.
In -1l > 1ul the -1l is promoted to unsigned long, and by definition and Standard, -1 cast to an unsigned type will be the largets value representable by that unsigned type.
I got my inspiration from memory of this answer here to a quite relevant question.
And after looking at the C99 draft I have lingering around, see for example 6.3.1.3(2), where it says the maximum value representable by the type will be added or subtracted from the original value until it fits in the new type. I must warn you that char, although it is an integer type, is treated special: it is implementation defined if char is signed or unsigned. But that is, strictly, beside the question at hand.
Implicit promotions are one of the most difficult things in the C language. If you have a C code expression looking like
if(-1l > 1ul)
then no "integer promotions" take place. Both types are of the same size, but different signedness. -1l will then be converted to unsigned long with a very large value. This is one of the rules in the "usual arithmetic conversions".
This is actually a conversion. Promotions go from types with less rank than an integer to integer.
The rules for integer conversions in C are somewhat complex. They are, as per ISO C99 §6.3.1.8 ¶1:
Otherwise, the integer promotions are
performed on both operands. Then the
following rules are applied to the
promoted operands:
If both operands have the same type, then no further conversion is
needed.
Otherwise, if both operands have signed integer types or both have
unsigned
integer types, the operand with the type of lesser integer conversion
rank is
converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank
greater or
equal to the rank of the type of the other operand, then the operand
with
signed integer type is converted to the type of the operand with
unsigned
integer type.
Otherwise, if the type of the operand with signed integer type can
represent
all of the values of the type of the operand with unsigned integer
type, then
the operand with unsigned integer type is converted to the type
of the
operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type.
I'll try to explain them:
Try to convert to the larger type. When there is conflict between signed and unsigned, if the larger (including the case where the two types have the same rank) type is unsigned, go with unsigned. Otherwise, go with signed only in the case it can represent all the values of both types.
When you're learning C, if you have a question, just write yourself a simple program:
#include <stdio.h>
main() {
int si = -1;
unsigned int ui = 1;
if ( si > ui ) printf("-1l > 1u\n");
else printf("-1l <= 1u\n");
}
You'll see that -1l > 1u is shown for the output.
Because both si and ui have the same rank (they're both ints), the rule says that the negative value will be promoted to unsigned at set to UINT_MAX which is the largest possible unsigned value.

In a C expression where unsigned int and signed int are present, which type will be promoted to what type?

I have a query about data type promotion rules in C language standard.
The C99 says that:
C integer promotions also require that "if an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int."
My questions is in case of a C language expression where unsigned int and signed int are present, which type will be promoted to what type?
E.g. int cannot represent all the values of the unsigned int (values larger than MAX_INT values) whereas unsigned int cannot represent the -ve values, so what type is promoted to what in such cases?
I think you are confusing two things. Promotion is the process by which values of integer type "smaller" that int/unsigned int are converted either to int or unsigned int. The rules are expressed somewhat strangely (mostly for the benefit of handling adequately char) but ensure that value and sign are conserved.
Then there is the different concept of usual arithmetic conversion by which operands of arithmetic operators are converted to a common type. It begins by promoting the operand (to either int or unsigned) if they are of a type smaller than int and then choosing a target type by the following process (for integer types, 6.3.1.8/1)
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned
integer types, the operand with the type of lesser integer conversion rank is
converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or
equal to the rank of the type of the other operand, then the operand with
signed integer type is converted to the type of the operand with unsigned
integer type.
Otherwise, if the type of the operand with signed integer type can represent
all of the values of the type of the operand with unsigned integer type, then
the operand with unsigned integer type is converted to the type of the
operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type.
(Note that ISTR that those rules have changed slightly between C89 and C99)
I think the following answers your question:
6.3.1.3 Signed and unsigned integers
1 When a value with integer type is
converted to another integer type
other than _Bool, if the value can be
represented by the new type, it is
unchanged.
2 Otherwise, if the new
type is unsigned, the value is
converted by repeatedly adding or
subtracting one more than the maximum
value that can be represented in the
new type until the value is in the
range of the new type.
3 Otherwise,
the new type is signed and the value
cannot be represented in it; either
the result is implementation-defined
or an implementation-defined signal is
raised.

Resources