Extension on shifting or arithmetic operations in standard C - c

Sorry for bad English.
uint16_t a, c;
uint8_t b = 0xff;
a = b<<8;
c = b*10;
What values do a and c get? What is the situation with arbitrary integer types?

uint16_t a, c;
uint8_t b = 0xff;
a = b<<8;
First, the integer promotions are performed on the arguments of <<. The constant 8 is an int and thus is not converted. Since the conversion rank of uint8_t is smaller than that of int, and all values of uint8_t are representable as ints, b is converted - preserving its value - to int. The resulting int value is then shifted left by eight bits.
If int is only 16 bits wide, the value 0xff * 2^8 is not representable as an int, and then the shift invokes undefined behaviour - 6.5.7 (4) in n1570 and C99:
If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
Otherwise, the result is 255*256 = 65280 = 0xFF00. Since that value is representable in the type of a, the conversion of the int result of the shift to uint16_t preserves the value; if the result were out of range (e.g. if the shift distance were 9 [and int wide enough]), it would be reduced modulo 2^16 to obtain a value in the range 0 to 2^16 - 1 of uint16_t.
c = b*10;
The usual arithmetic conversions are performed on the operands of *. Both operands have an integer type, thus first the integer promotions are performed. Since 10 is an int and all values of b's type are representable as an int, the integer promotions give both operands the same type, int, and the usual arithmetic conversions don't require any further conversions. The multiplication is done at type int, its result, 2550, is again representable in the type of c, so the conversion to uint16_t that is done before storing the value in c preserves the value.
What is the situation with arbitrary integer types?
For <<:
1. Integer promotions: values/expressions of an integer type whose conversion rank is less than or equal to that of int (and unsigned int) [integer types with width <= that of (unsigned) int], and bitfields of type _Bool, int, signed int and unsigned int are converted to int or unsigned int (int if that can represent all values of the original type, unsigned int otherwise).
2. If the (promoted) right operand (shift distance) is negative or greater than or equal to the width (number of value bits plus sign bits; there is either one sign bit or none) of the (promoted) left operand, the behaviour is undefined. If the value of the (promoted) left operand is negative, the behaviour is undefined. If the type of the (promoted) left operand is unsigned, the result is value * 2^distance, reduced modulo 2^width. If the type of the (promoted) left operand is signed and the value nonnegative, the result is value * 2^distance if that is representable in the type, the behaviour is undefined otherwise.
3. If no undefined behaviour occurred in step 2, the result is converted to the type of the variable it is stored in.
   - If the target type is _Bool (or an alias thereof), a nonzero result is converted to 1, a zero result to 0, otherwise
   - If the result can be represented in the target type, its value is preserved, otherwise
   - If the target type is unsigned, the result is reduced modulo 2^width, otherwise
   - the result is converted in an implementation-defined manner or an implementation-defined signal is raised.
For *:
1. The usual arithmetic conversions are performed, so that both (converted) operands have the same type.
2. The multiplication is performed at the resulting type; if that is a signed integer type and the multiplication overflows, the behaviour is undefined.
3. The result is converted to the target type in the same manner as above.
That's how the abstract machine is defined; if the implementation can achieve the same results (where the behaviour is defined) in another manner, it can do as it pleases under the as-if rule.

Related

Does x ^= x & -x; where x is an unsigned integer invoke UB?

Does this function invoke undefined behavior due to the - operator being applied to x which is unsigned? I searched the standard and couldn't find an explanation.
unsigned foo(unsigned x)
{
return x ^= x & -x;
}
IMO yes.
edit
void func(unsigned x)
{
printf("%x", -x);
}
int main(void)
{
func(INT_MIN);
}
IMO The only explanation is that it was promoted to larger signed integer size then converted to unsigned.
If it is promoted to larger integer size, what will happen if there is no larger signed integer type?
The behavior of this expression is well defined.
Constructs similar to x = x + 1 are allowed because x isn't assigned a value until all other subexpressions are evaluated. The same applies in this case.
There is also no problem with -x because the expression has unsigned type and thus has well defined wraparound behavior as opposed to overflowing.
Section 6.5.3.3p3 of the C standard regarding the unary - operator states:
The result of the unary - operator is the negative of its (promoted) operand. The integer promotions are performed on the operand, and the result has the promoted type.
So since no promotion occurs the type remains unsigned throughout the expression. Though not explicitly stated in the standard, -x is effectively the same as 0 - x.
For the specific case of INT_MIN being passed to this function, it has type int and is outside of the range of unsigned, so it is converted when passed to the function. This results in the signed value -2,147,483,648 being converted to the unsigned value 2,147,483,648 (which in two's complement happens to have the same representation, i.e. 0x80000000). Then when -x is evaluated, it wraps around, resulting in 2,147,483,648.
6.2.5 Types
...
9 The range of nonnegative values of a signed integer type is a subrange of the
corresponding unsigned integer type, and the representation of the same value in each
type is the same.41) A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting unsigned integer type is
reduced modulo the number that is one greater than the largest value that can be
represented by the resulting type.
41) The same representation and alignment requirements are meant to imply interchangeability as
arguments to functions, return values from functions, and members of unions.
...
6.3 Conversions
...
6.3.1.3 Signed and unsigned integers
1 When a value with integer type is converted to another integer type other than _Bool, if
the value can be represented by the new type, it is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.60)
3 Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
60) The rules describe arithmetic on the mathematical value, not the value of a given type of expression
...
6.3.1.8 Usual arithmetic conversions
...
Otherwise, if the operand that has unsigned integer type has rank greater or
equal to the rank of the type of the other operand, then the operand with
signed integer type is converted to the type of the operand with unsigned
integer type.
In short - the -x does not lead to undefined behavior. The result of the expression is still unsigned, it just maps to a well-defined, non-negative value.

Reverse precedence of shift and bitwise complement in an expression

In this code:
unsigned short int i = 3;
unsigned short int x = 30;
unsigned short int z = (~x) >> i;
On the third line it seems to do the shift first and THEN the complement (~), even though I use parentheses.
However, the strange result doesn't occur if I replace short with long.
It happens both in Windows and in Unix. Why is that?
It performs the operations exactly in the order you prescribed.
However, the operands are not unsigned short ints. Integer promotion turns x and i into good old regular signed integers before performing the operation. To quote the C standard on this:
6.3.1 Arithmetic operands / paragraph 2
The following may be used in an expression wherever an int or unsigned
int may be used:
An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to
the rank of int and unsigned int.
If an int can represent all values of the original type (as restricted
by the width, for a bit-field), the value is converted to an int;
otherwise, it is converted to an unsigned int.
And unsigned shorts can fit snugly in a signed integer on the machines you tried.
Furthermore, right shifting a signed integer has implementation defined results for negative values:
6.5.7 Bitwise shift operators / paragraph 5
The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1
has an unsigned type or if E1 has a signed type and a nonnegative
value, the value of the result is the integral part of the quotient of
E1 / 2^E2. If E1 has a signed type and a negative value, the
resulting value is implementation-defined.
And ~x is some negative integer (which one precisely depends on the value representation of signed integers).
All of the above more than likely accounts for you not getting the expected result when converting it back to an unsigned short integer.

integer promotion and unsigned interpretation

int_8 int8 = ~0;
uint_16 uInt16 = (uint_16) int8;
Regarding the typecast above; where in C standard can I find reference to an indication for the following behaviour?
- sign extension to the larger type before the unsigned interpretation (uInt16=0xFFFF) rather than unsigned interpretation followed by 0 extension to the larger type (uInt16=0xFF).
From C99 6.3.1.8
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
The above statement is clear about which variable needs to be converted; however, it is not very clear about how the conversion should actually be performed, hence my question asking for a reference from the standard.
Thanks
As per the standard:
6.3.1.3 Signed and unsigned integers
......
2. Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
And the footnote to avoid the confusion when interpreting the above:
The rules describe arithmetic on the mathematical value, not the value of a given type of expression.
I.e. if your int8 has the value -1 (assuming a two's complement representation of negative values, as in your example), then on conversion to uint16_t the value 0xFFFF + 1 (one more than the maximum value that can be represented by uint16_t) is added to it, which yields 0xFFFF + 1 - 1 = 0xFFFF.
Answer I believe is actually part of 6.3.1.8 as well:
Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:
....
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand.....
meaning that integer promotions are performed first before the conversion to unsigned using the rule 6.3.1.3.

Usual Arithmetic Conversions: unexpected output

signed short Temp;
Temp = 0xF2C9;
Temp2 = 0x100;
unsigned char a;
a = (unsigned char)(Temp/((unsigned short)Temp2));
What will be the expected output?
As per my understanding, due to the "Usual Arithmetic Conversions", Temp should first be converted to unsigned short and the result in a should be 0xF2, but I am getting 0xF3, which means the operation is performed with the signed value of Temp. Please explain this behavior.
Is endianness also relevant in this scenario?
No. First, all arguments to arithmetic operators are promoted; that is, narrow types such as your shorts are converted to int (at least on all common architectures).
Assuming short is 16 bits wide on your system, the initialization of Temp is implementation-defined because the value 0xF2C9 doesn't fit in the type. Most probably it is a negative value. Then, for the computation, that negative signed short value is promoted to int. The result of the division is a negative value, which in turn is converted to unsigned char.
It depends if INT_MAX >= USHRT_MAX
The "Usual Arithmetic Conversions" convert Temp into (int) Temp. This will only "extend the sign" if int is wider than short.
((unsigned short)Temp2) is promoted to (int)((unsigned short)Temp2) or (unsigned)((unsigned short)Temp2).
If INT_MAX >= USHRT_MAX, then the division is done as (int)/(int).
Otherwise, like on a 16-bit system, the division is done as (int)/(unsigned), which is done as (unsigned)/(unsigned).
[Edit]
Temp, initialized with 0xF2C9 (see note), likely has the value -3383 (or the value 62153 in the unlikely case that short is wider than 16 bits).
With (int)/(int), -3383/256 --> -13.21... --> -13. -13 converted to unsigned char --> 256 - 13 --> 243 or 0xF3.
(Assuming 16-bit int/unsigned) With (unsigned)/(unsigned), -3383 is converted to unsigned 65536 - 3383 --> 62153. 62153/256 --> 242.78... --> 242. 242 converted to unsigned char --> 242 or 0xF2.
Endianness is not relevant in this scenario.
Note: As pointed out by @Jens Gustedt, the value in Temp is implementation-defined when Temp is 16-bit.
Integer division works with "truncation toward zero", so the result of this division is F3, which is -13, instead of F2, which is -14. If you calculate this in decimal representation, the result would be -13.21, and then you would cut off the .21.
"The question asked why the integers are signed"
"As per my understanding due to "Usual Arithmetic Conversions" first Temp should be converted into unsigned short..." NO. You are wrong here. The signed short Temp has its high bit set, hence is negative. The bit gets extended to the left when converted to int. The signed short Temp2 does not have its high bit set; the cast to (unsigned short) has no effect; it is converted to a positive int. The negative int is now divided by the positive int, resulting in a negative value.
In the conversion of Temp to int, you don't want the sign bit extended. Use:
a = (unsigned char)(((unsigned short)Temp)/Temp2);
(I didn't test it; just theory)
Jens, Paul and Mch,
Thanks for your clarification. But as per "ISO/IEC 9899 section 6.3.1.1
The rank of any unsigned integer type shall equal the rank of the corresponding
signed integer type, if any.
and as per 6.3.1.8 Usual arithmetic conversions" following rules should be applicable.
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned
integer types, the operand with the type of lesser integer conversion rank is
converted to the type of the operand with greater rank.
**Otherwise, if the operand that has unsigned integer type has rank greater or
equal to the rank of the type of the other operand, then the operand with
signed integer type is converted to the type of the operand with unsigned
integer type.**
Otherwise, if the type of the operand with signed integer type can represent
all of the values of the type of the operand with unsigned integer type, then
the operand with unsigned integer type is converted to the type of the
operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type
So according to rule 3 above, the signed short (integer type) should first be converted to unsigned short (integer type), then the arithmetic operation should be performed, and the result should be 0xF2.

C usual arithmetic conversions

I was reading in the C99 standard about the usual arithmetic conversions.
If both operands have the same type, then no further conversion is
needed.
Otherwise, if both operands have signed integer types or both have
unsigned integer types, the operand with the type of lesser integer
conversion rank is converted to the type of the operand with greater
rank.
Otherwise, if the operand that has unsigned integer type has rank
greater or equal to the rank of the type of the other operand, then
the operand with signed integer type is converted to the type of the
operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can
represent all of the values of the type of the operand with unsigned
integer type, then the operand with unsigned integer type is converted
to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type.
So let's say I have the following code:
#include <stdio.h>
int main()
{
unsigned int a = 10;
signed int b = -5;
printf("%d\n", a + b); /* 5 */
printf("%u\n", a + b); /* 5 */
return 0;
}
I thought the third rule applies (since unsigned int and signed int have the same rank). Why isn't b converted to unsigned? Or perhaps it is converted to unsigned but there is something I don't understand?
Thank you for your time :-)
Indeed b is converted to unsigned. However what you observed is that b converted to unsigned and then added to 10 gives as value 5.
On x86 32bit this is what happens
b, converted to unsigned, becomes 4294967291 (i.e. 2**32 - 5)
adding 10 becomes 5 because of wrap-around at 2**32 (2**32 - 5 + 10 = 2**32 + 5 = 5)
0x0000000a plus 0xfffffffb will always be 0x00000005 regardless of whether you are dealing with signed or unsigned types, as long as only 32 bits are used.
Repeating the relevant portion of the code from the question:
unsigned int a = 10;
signed int b = -5;
printf("%d\n", a + b); /* 5 */
printf("%u\n", a + b); /* 5 */
In a + b, b is converted to unsigned int (yielding UINT_MAX + 1 - 5 by the rule for signed-to-unsigned conversion). The result of adding 10 to this value is 5, by the rules of unsigned arithmetic, and its type is unsigned int. In most cases, the type of a C expression is independent of the context in which it appears. (Note that none of this depends on the representation; conversion and arithmetic are defined purely in terms of numeric values.)
For the second printf call, the result is straightforward: "%u" expects an argument of type unsigned int, and you've given it one. It prints "5\n".
The first printf is a little more complicated. "%d" expects an argument of type int, but you're giving it an argument of type unsigned int. In most cases, a type mismatch like this results in undefined behavior, but there's a special-case rule that corresponding signed and unsigned types are interchangeable as function arguments -- as long as the value is representable in both types (as it is here). So the first printf also prints "5\n".
Again, all this behavior is defined in terms of values, not representations (except for the requirement that a given value has the same representation in corresponding signed and unsigned types). You'd get the same result on a system where signed int and unsigned int are both 37 bits, signed int has 7 padding bits, unsigned int has 11 padding bits, and signed int uses a 1s'-complement or sign-and-magnitude representation. (No such system exists in real life, as far as I know.)
It is converted to unsigned; the unsigned arithmetic just happens to give the result you see.
The result of unsigned arithmetic is equivalent to doing signed arithmetic with two's complement and no out of range exception.

Resources