kbuild C: ~ Operator Converts Unsigned to Signed? [duplicate] - c

Let's say I have a 32-bit machine.
I know during integer promotion the expressions are converted to:
int if all values of the original type can be represented in int
unsigned otherwise
Could you please explain what will happen for the following expressions? And in general, how does ranking work here?
First snippet:
int16_t x, pt;
int32_t speed;
uint16_t length;
x = (speed*pt)/length;
Second one:
x = pt + length;
#EDIT:
I found the following link that has described the issue very clearly:
Implicit type conversion.
Concretely, read the answer of Lundin, very helpful!

The integer promotion rule, correctly cited C11 6.3.1.1:
If an int can represent all values of the original type (as restricted
by the width, for a bit-field), the value is converted to an int;
otherwise, it is converted to an unsigned int. These are called the
integer promotions. All other types are unchanged by the integer
promotions.
Where "otherwise, it is converted to an unsigned int" is in practice only used in one particular special case, namely where the smaller integer type unsigned short has the same size as unsigned int. In that case it will remain unsigned.
Apart from that special case, all small integer types will always get promoted to (signed) int regardless of their signedness.
Assuming 32 bit int, then:
x = (speed*pt)/length;
speed is signed 32, it will not get promoted. pt will get integer promoted to int (signed 32). The result of speed*pt will have type int.
length will get integer promoted to int. The division will get carried out with operands of type int and the resulting type will be int.
The result will get converted to signed 16 as it is assigned to x (during assignment, the right operand is converted to the type of the left operand).
x = pt + length; is similar, here both operands of + will get promoted to int before addition and the result will afterwards get converted to signed 16.
For details see Implicit type promotion rules.

The integer promotions are defined in 6.3.1.1; the usual arithmetic conversions, which apply those promotions to both operands of a binary operator, are defined in 6.3.1.8.
1. int16_t x, pt;
int32_t speed;
uint16_t length;
x = (speed*pt)/length;
2. x = pt + length;
Ranking effectively follows the number of bits of the type, as described in limits.h for the C abstract machine (CAM). The standard requires that a signed integer type with less precision has a lower rank than one with more precision; rank itself is defined in 6.3.1.1.
For your code,
speed * pt
is multiplication between int32_t and int16_t, which means, it is transformed in
speed * (int16_t => int32_t) pt
and the result tmp1 will be int32_t.
Next, it will continue
tmp1_int32 / length
Length will be converted from uint16_t to int32_t, so it will compute tmp2 so:
tmp1_int32 / (uint16_t => int32_t) length
and the result tmp2 will be of type int32_t.
Next it will evaluate an assignment expression, left side of 16 bits and the right side of 32, so it will cut the result so:
x = (int32_t => int16_t) tmp2_int32
Your second case will be evaluated as
x = (int32_t => int16_t) ( (int16_t => int32_t) pt + (uint16_t => int32_t) length )
When both operands of an operator have a rank lower than that of int, the abstract machine still promotes both to int. An implementation may perform the operation directly in the narrower type, provided the observable result is the same as if the promotions had taken place.
In other words, it is possible to compile INT16+INT16 either as
INT16+INT16
or in
(int32_t => int16_t) ((int16_t => int32_t) INT16 + (int16_t => int32_t) INT16)
provided the addition can be done without overflow.

Related

C arithmetic conversion multiplying unsigned with signed and result in float

#include <stdio.h>
int main()
{
printf("Hello World\n");
int x = -10;
unsigned y = 25;
float z = x*y;
printf("x=%d,y=%u,z=%f\n",x,y,z);
return 0;
}
When I run the above code, I get the following output:
Hello World
x=-10,y=25,z=4294967046.000000
My question is:
For the second printf, I would have expected z=(float) ( (unsigned)(-10)*25 ) = (float) (4294967286 x 25) = (float) 107374182150, what am I missing here?
Here's what's happening. As per C11 6.3.1.8 Usual arithmetic conversions (the "otherwise" comes into play here since previous paragraphs discuss what happens when either type is already floating point):
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
This means your signed value of -10 becomes an unsigned value of 0xffff'fff6, or 4,294,967,286. Multiplying that by 25 gives 107,374,182,150 or 0x18'ffff'ff06 (which is the result you want).
However, at this point, no float calculations have been done, the multiplication is a pure integer calculation and the resultant value will be an integer. And that, combined with the fact your unsigned integers are 32 bits long, means it gets truncated to 0xffff'ff06, or 4,294,967,046.
Then you put that into the float.
To fix this to match your expected results, you should change the expression to force this:
float z = 1.0f * (unsigned)x * y;
This changes the int * unsigned-int calculation into a float * unsigned-int * unsigned-int one. The unsigned cast first ensures x will be converted to the equivalent unsigned value, and the multiplication by 1.0f ensures the multiplications are done in the float arena to avoid integer truncation.
Following on from the correct answer from @paxdiablo, the starting point for the result is that unsigned has a rank equal to the rank of int, e.g.
The rank of any unsigned integer type shall equal the rank of the
corresponding signed integer type, if any. C11 Standard - 6.3.1 Arithmetic operands (p1)
This comes into play with the integer conversion cited in @paxdiablo's answer:
6.3.1.8 Usual arithmetic conversions
Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
The problem is that negative values such as -10 are stored (on almost all computers) in two's complement. In two's complement, the representation of -10 is obtained by taking the bitwise NOT of 10 and adding 1 (so in binary 00001010 becomes 11110110, sign-extended to 32 bits). That is:
11111111111111111111111111110110
For which the unsigned value is 4294967286. When multiplied by 25, the result exceeds the range of unsigned, so it is reduced modulo UINT_MAX + 1 (2^32 here) to fit, resulting in 4294967046. See 6.2.5 Types (p9).
What Am I Missing?
The part that is missing is understanding that the result of an unsigned multiplication is being assigned to a float. The intermediate result from x * y is unsigned. float f = x * y; is just an assignment of that result to a float.
What you want is for the intermediate calculation to be done as a float, so cast one of the operands (not the result) to float, e.g.
float f = (float)x * y;
It does not matter which of the two values is cast to float, the following would be just fine:
float f = x * (float)y;
Now the result will be -250.

Data size definition in C programming while using mathematical expressions

When using mathematical operators in c programming, it is very important to use casts or define size of variable properly. I need help in it.
#include <stdio.h>
#include <stdint.h>
int main(void)
{
uint32_t a;
uint8_t b;
uint8_t d;
uint64_t c;
float cd;
a = 4294967295;
b = 2;
d = 2;
c = a * b * d;
cd = c;
printf("%f\n", cd);
return 0;
}
The result variable is large enough to store 2 * 2 * uint32_max. However, I noticed that the b or d variable needs to be 64 bits wide (or a cast is needed) to get the proper result. Until now I thought the mathematical operations take place in the result variable, but it looks like that is not true. Can somebody explain to me which variable needs to be widened (b or d) and what the theoretical background is?
What is the situation with division? If I divide a 32-bit number by an 8-bit one, will the result be only 8 bits wide? Is there any rule about the type of the denominator?
When you perform the multiplication a * b * d, what will happen is that b and d will get converted to uint32_t (or int if int is wider than uint32_t) to match the type of a. However, this operation might overflow. So what you need to do is to cast at least one of them to uint64_t to prevent this from happening.
Do note that (uint64_t)(a * b * d) will NOT work: the parenthesized expression is evaluated first, in the narrower type, and the cast is applied only to its already-wrapped result.
In C, the evaluation of an expression is determined by the operator and its operands, not by where the result will eventually be stored.
The expression a * b * d is structured as (a * b) * d. So a * b is evaluated, and then the result is multiplied by d.
One of the rules for * is in C 2018 6.5.5 3:
The usual arithmetic conversions are performed on the operands.
The usual arithmetic conversions are defined in 6.3.1.8 1. They are a bit complicated, and I give most of the details below. Applying them to your example:
In a * b, a is a uint32_t , and b is a uint8_t.
The integer promotions convert b to an int—essentially all arithmetic in C is done in a width of at least int.
If int is 32 bits or narrower, a remains uint32_t. Otherwise, a is converted to int.
If converted types of a and b are both int, the conversions are done, and the multiplication is performed.
If the converted type of a is uint32_t, b is converted to uint32_t, and the multiplication is performed.
Then the multiplication with d is performed the same way.
So, if int is 32 bits or narrower, the multiplications are performed with uint32_t, and the result is uint32_t. If int is wider, the multiplications are performed with int, and the result is int.
Casting either operand to uint64_t would cause the arithmetic to be done with uint64_t. (Except it is theoretically possible that int is wider than uint64_t, in which case the arithmetic would be done with int, but that is still satisfactory—performing a cast guarantees the arithmetic will be done with at least that width.)
For real numbers, the usual arithmetic conversions are largely:
If either operand is long double, the other is converted to long double.
Otherwise, if either is double, the other is converted to double.
Otherwise, if either is float, the other is converted to float.
Otherwise, the integer promotions are performed on both operands.
Then, if both have the same type, no further conversion is performed.
Otherwise, if both are signed or both are unsigned, the narrower (actually “lesser rank”) operand is converted to the type of the other.
Otherwise, if the unsigned operand is the same width or wider (greater or equal rank), the signed operand is converted to the type of the unsigned operand.
Otherwise, if the type of the signed operand can represent all the values of the type of the unsigned operand, the unsigned operand is converted to the type of the signed operand.
Otherwise, both operands are converted to the unsigned type that has the same width as the signed operand.
The integer promotions are defined in 6.3.1.1 2. They apply to all integer types as wide as or narrower than int or unsigned int (technically of rank less than or equal to the rank of int and unsigned int), including bit-fields of type _Bool, int, signed int, or unsigned int.
If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.
a * b * d is an expression of type uint32_t, or int if int is wider than uint32_t (due to the conversion rule for uint8_t).
The fact that this expression is assigned to a wider type is not a factor. That's the crux of the issue.
Writing c = 1ULL * a * b * d is a tractable fix.

Does the order of multiplication variables of different data type cause different results?

Let's say I have 3 variables: a long, an int and a short.
long l;
int i;
short s;
long lsum;
In pure math, since multiplication is commutative, the order of these variables doesn't matter.
l * i * s = i * s * l = s * i * l.
Let lsum be the container of the multiplication of these 3 variables.
In C, would there be a case where a particular order of these variables cause different result?
If there is a case where the order does matter, not necessarily from this example, what would that be?
The order does matter due to integer promotions.
When applying an arithmetic operator, each of its operands is first promoted to int if its rank is less than that of int (such as char or short). If one of the operands then still has a higher rank (such as long), then the operand of lower rank is converted to that type.
From section 6.3.1 of the C standard:
2 The following may be used in an expression wherever an int or unsigned int may be used:
An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to
the rank of int and unsigned int.
A bit-field of type _Bool, int, signed int, or unsigned int.
If an int can represent all values of the original type (as restricted
by the width, for a bit-field), the value is converted to an int;
otherwise, it is converted to an unsigned int. These are called the
integer promotions. All other types are unchanged by the integer
promotions.
From section 6.3.1.8:
If both operands have the same type, then no further conversion is
needed.
Otherwise, if both operands have signed integer types or both have
unsigned integer types, the operand with the type of lesser integer
conversion rank is converted to the type of the operand with greater
rank.
Otherwise, if the operand that has unsigned integer type has rank
greater or equal to the rank of the type of the other operand, then
the operand with signed integer type is converted to the type of the
operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can
represent all of the values of the type of the operand with unsigned
integer type, then the operand with unsigned integer type is converted
to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type.
As an example (assuming sizeof(int) is 4 and sizeof(long) is 8):
int i;
short s;
long l, result;
i = 0x10000000;
s = 0x10;
l = 0x10000000;
result = s * i * l;
printf("s * i * l=%lx\n", result);
result = l * i * s;
printf("l * i * s=%lx\n", result);
Output:
s * i * l=0
l * i * s=1000000000000000
In this example, s * i is evaluated first. The value of s is promoted to int, then the two int values are multiplied. At this point an overflow occurs, invoking undefined behavior. The result is then converted to long and multiplied by l, with the result being of type long.
In the latter case, l * i is evaluated first. The value of i is promoted to long, then the two long values are multiplied and an overflow does not occur. The result is then multiplied by s, which is first promoted to long. Again, the result does not overflow.
In a situation like this, I'd recommend casting the leftmost operand to long so that all other operands are promoted to that type. If you have parenthesized subexpressions you may need to apply a cast there as well, depending on the result you want.
Yes, see "Type conversion" and "Type promotion" on http://www.cplusplus.com/articles/DE18T05o/
unsigned a = INT_MAX;
unsigned b = INT_MAX;
unsigned long c = 255;
unsigned long r1 = a * b * c;
unsigned long r2 = c * a * b;
r1=255
r2=13835056960065503487
r1 reflects that (a*b) is done first, in a type at least as wide as int. The result has the type of the higher-ranked operand, which here is unsigned, so the product wraps around before it is multiplied by c.

Why are the results of integer promotion different?

Please look at my test code:
#include <stdlib.h>
#include <stdio.h>
#define PRINT_COMPARE_RESULT(a, b) \
if (a > b) { \
printf( #a " > " #b "\n"); \
} \
else if (a < b) { \
printf( #a " < " #b "\n"); \
} \
else { \
printf( #a " = " #b "\n" ); \
}
int main()
{
signed int a = -1;
unsigned int b = 2;
signed short c = -1;
unsigned short d = 2;
PRINT_COMPARE_RESULT(a,b);
PRINT_COMPARE_RESULT(c,d);
return 0;
}
The result is the following:
a > b
c < d
My platform is Linux, and my gcc version is 4.4.2.
I am surprised by the second line of output.
The first line of output is caused by integer promotion. But why is the result of the second line different?
The following rules are from C99 standard:
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned
integer types, the operand with the type of lesser integer conversion rank is
converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or
equal to the rank of the type of the other operand, then the operand with
signed integer type is converted to the type of the operand with unsigned
integer type.
Otherwise, if the type of the operand with signed integer type can represent
all of the values of the type of the operand with unsigned integer type, then
the operand with unsigned integer type is converted to the type of the
operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type.
I think both of the two comparisons should belong to the same case, the second case of integer promotion.
When you use an arithmetic operator, the operands go through two conversions.
Integer promotions: If int can represent all values of the type, then the operand is promoted to int. This applies to both short and unsigned short on most platforms. The conversion performed on this stage is done on each operand individually, without regard for the other operand. (There are more rules, but this is the one that applies.)
Usual arithmetic conversions: If you compare an unsigned int against a signed int, since neither includes the entire range of the other, and both have the same rank, then both are converted to the unsigned type. This conversion is done after examining the type of both operands.
Obviously, the "usual arithmetic conversions" don't always apply, if there are not two operands. This is why there are two sets of rules. One gotcha, for example, is that shift operators << and >> don't do usual arithmetic conversions, since the type of the result should only depend on the left operand (so if you see someone type x << 5U, then the U stands for "unnecessary").
Breakdown: Let's assume a typical system with 32-bit int and 16-bit short.
int a = -1; // "signed" is implied
unsigned b = 2; // "int" is implied
if (a < b)
puts("a < b"); // not printed
else
puts("a >= b"); // printed
First the two operands are promoted. Since both are int or unsigned int, no promotions are done.
Next, the two operands are converted to the same type. Since int can't represent all possible values of unsigned, and unsigned can't represent all possible values of int, there is no obvious choice. In this case, both are converted to unsigned.
When converting from signed to unsigned, 2^32 is repeatedly added to the signed value until it is in the range of the unsigned type. This is actually a no-op as far as the processor is concerned.
So the comparison becomes if (4294967295u < 2u), which is false.
Now let's try it with short:
short c = -1; // "signed" is implied
unsigned short d = 2;
if (c < d)
puts("c < d"); // printed
else
puts("c >= d"); // not printed
First, the two operands are promoted. Since both can be represented faithfully by int, both are promoted to int.
Next, they are converted to the same type. But they already are the same type, int, so nothing is done.
So the comparison becomes if (-1 < 2), which is true.
Writing good code: There's an easy way to catch these "gotchas" in your code. Just always compile with warnings turned on, and fix the warnings. I tend to write code like this:
int x = ...;
unsigned y = ...;
if (x < 0 || (unsigned) x < y)
...;
You have to watch out that any code you do write doesn't run into the other signed vs. unsigned gotcha: signed overflow. For example, the following code:
int x = ..., y = ...;
if (x + 100 < y + 100)
...;
unsigned a = ..., b = ...;
if (a + 100 < b + 100)
...;
Some popular compilers will optimize (x + 100 < y + 100) to (x < y), but that is a story for another day. Just don't overflow your signed numbers.
Footnote: Note that while signed is implied for int, short, long, and long long, it is NOT implied for char. Instead, it depends on the platform.
Taken from the C++ standard:
4.5 Integral promotions [conv.prom] 1 An rvalue of type char, signed char, unsigned char, short int, or unsigned short int can be
converted to an rvalue of type int if int can represent all the values of the
source type; otherwise, the source rvalue can be converted to an
rvalue of type unsigned int.
In practice it means, that all operations (on the types in the list) are actually evaluated on the type int if it can cover the whole value set you are dealing with, otherwise it is carried out on unsigned int.
In the first case the values are compared as unsigned int because one of them was unsigned int, and this is why -1 is "greater" than 2. In the second case the values are compared as signed integers, as int covers the whole domain of both short and unsigned short, and so -1 is smaller than 2.
(Background: one consequence of defining all of these cases this way is that a compiler can largely ignore the original narrow type and only care about the data size.)
The conversion process for C++ is described as the usual arithmetic conversions. However, I think the most relevant rule is at the sub-referenced section conv.prom: Integral promotions 4.6.1:
A prvalue of an integer type other than bool, char16_t, char32_t, or
wchar_t whose integer conversion rank ([conv.rank]) is less than the
rank of int can be converted to a prvalue of type int if int can
represent all the values of the source type; otherwise, the source
prvalue can be converted to a prvalue of type unsigned int.
The funny thing there is the use of the word "can", which I think suggests that this promotion is performed at the discretion of the compiler.
I also found this C-spec snippet that hints at the omission of promotion:
11 EXAMPLE 2 In executing the fragment
char c1, c2;
/* ... */
c1 = c1 + c2;
the ``integer promotions'' require that the abstract machine promote the value of each variable to int size
and then add the two ints and truncate the sum. Provided the addition of two chars can be done without
overflow, or with overflow wrapping silently to produce the correct result, the actual execution need only
produce the same result, possibly omitting the promotions.
There is also the definition of "rank" to be considered. The list of rules is pretty long, but as it applies to this question "rank" is straightforward:
The rank of any unsigned integer type shall equal the rank of the
corresponding signed integer type.

C usual arithmetic conversions

I was reading in the C99 standard about the usual arithmetic conversions.
If both operands have the same type, then no further conversion is
needed.
Otherwise, if both operands have signed integer types or both have
unsigned integer types, the operand with the type of lesser integer
conversion rank is converted to the type of the operand with greater
rank.
Otherwise, if the operand that has unsigned integer type has rank
greater or equal to the rank of the type of the other operand, then
the operand with signed integer type is converted to the type of the
operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can
represent all of the values of the type of the operand with unsigned
integer type, then the operand with unsigned integer type is converted
to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type.
So let's say I have the following code:
#include <stdio.h>
int main()
{
unsigned int a = 10;
signed int b = -5;
printf("%d\n", a + b); /* 5 */
printf("%u\n", a + b); /* 5 */
return 0;
}
I thought the third rule (the "rank greater or equal" one) applies, since unsigned int and signed int have the same rank. Why isn't b converted to unsigned? Or perhaps it is converted to unsigned but there is something I don't understand?
Thank you for your time :-)
Indeed b is converted to unsigned. However what you observed is that b converted to unsigned and then added to 10 gives as value 5.
On x86 32bit this is what happens
b, converted to unsigned, becomes 4294967291 (i.e. 2**32 - 5)
adding 10 becomes 5 because of wrap-around at 2**32 (2**32 - 5 + 10 = 2**32 + 5 = 5)
0x0000000a plus 0xfffffffb will always be 0x00000005 regardless of whether you are dealing with signed or unsigned types, as long as only 32 bits are used.
Repeating the relevant portion of the code from the question:
unsigned int a = 10;
signed int b = -5;
printf("%d\n", a + b); /* 5 */
printf("%u\n", a + b); /* 5 */
In a + b, b is converted to unsigned int (yielding UINT_MAX + 1 - 5 by the rule for signed-to-unsigned conversion). The result of adding 10 to this value is 5, by the rules of unsigned arithmetic, and its type is unsigned int. In most cases, the type of a C expression is independent of the context in which it appears. (Note that none of this depends on the representation; conversion and arithmetic are defined purely in terms of numeric values.)
For the second printf call, the result is straightforward: "%u" expects an argument of type unsigned int, and you've given it one. It prints "5\n".
The first printf is a little more complicated. "%d" expects an argument of type int, but you're giving it an argument of type unsigned int. In most cases, a type mismatch like this results in undefined behavior, but there's a special-case rule that corresponding signed and unsigned types are interchangeable as function arguments -- as long as the value is representable in both types (as it is here). So the first printf also prints "5\n".
Again, all this behavior is defined in terms of values, not representations (except for the requirement that a given value has the same representation in corresponding signed and unsigned types). You'd get the same result on a system where signed int and unsigned int are both 37 bits, signed int has 7 padding bits, unsigned int has 11 padding bits, and signed int uses a 1s'-complement or sign-and-magnitude representation. (No such system exists in real life, as far as I know.)
It is converted to unsigned, the unsigned arithmetic just happens to give the result you see.
The result of unsigned arithmetic is equivalent to doing signed arithmetic with two's complement and no out of range exception.
