Unexpected results on difference of unsigned ints

I was surprised that this function produces different values for dif1 and dif2
void test()
{
unsigned int x = 0, y = 1;
long long dif1 = x - y;
long long dif2 = (int)(x - y);
printf("dif = %lld %lld",dif1,dif2);
}
Is that correct behavior? In the dif1 computation it first promotes the 32-bit unsigned difference to a 64-bit unsigned value, then adds the sign. Is that standard behavior, not specified by the language, or a compiler bug? Is the second form guaranteed to produce -1, or up to the compiler implementation? I guess the safest construction is:
long long dif3 = (long long)x - (long long)y;

The first one is definitely defined, if we assume that long long is wider than unsigned int. If it isn't, then the assignment gives the same problem as the second part of the answer.
long long dif1 = x - y;
Unsigned arithmetic wraps, so x - y yields the maximum value that can be stored in an unsigned int.
6.2.5 p9: A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting unsigned integer type is
reduced modulo the number that is one greater than the largest value that can be
represented by the resulting type.
As for the second
long long dif2 = (int)(x - y);
it is implementation defined:
6.3.1.3 p3: Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
In this case the maximum value of unsigned int cannot be represented in an int, and the above rule is in effect.
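A minimal sketch of the first form next to the widened form from the question, assuming long long is wider than unsigned int (true on common platforms, but not guaranteed by the standard). The function names are made up for illustration.

```c
#include <limits.h>

/* The assumption this sketch relies on, checked at compile time (C11). */
_Static_assert(LLONG_MAX > UINT_MAX,
               "assumes long long is wider than unsigned int");

/* x - y is computed in unsigned int, so it wraps modulo UINT_MAX + 1
   before the result is widened to long long. */
long long diff_wrapped(unsigned int x, unsigned int y)
{
    return x - y;
}

/* Widening both operands first gives the mathematical difference. */
long long diff_widened(unsigned int x, unsigned int y)
{
    return (long long)x - (long long)y;
}
```

With x = 0 and y = 1, diff_wrapped returns UINT_MAX and diff_widened returns -1, and neither involves an implementation-defined conversion.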

There's nothing surprising about it.
unsigned int x = 0, y = 1;
long long dif1 = x - y;
long long dif2 = (int)(x - y);
The second has one difference to the first:
A cast to signed.
The cast is defined to be value-preserving if possible (Not possible as UINT_MAX is bigger than INT_MAX), and otherwise implementation-defined (though it is allowed to trap).
If we have 2s-complement on cast (likely), the result of the cast is -1.
Next, we have an assignment to a wider signed type, which is always value-preserving.

Related

16-bit unsigned variable casting to generate correct result [duplicate]

I'm trying to subtract two unsigned ints and compare the result to a signed int (or a literal). When using unsigned int types the behavior is as expected. When using uint16_t (from stdint.h) types the behavior is not what I would expect. The comparison was done using gcc 4.5.
Given the following code:
unsigned int a;
unsigned int b;
a = 5;
b = 20;
printf("%u\n", (a-b) < 10);
The output is 0, which is what I expected. Both a and b are unsigned, and b is larger than a, so the result is a large unsigned number which is greater than 10. Now if I change a and b to type uint16_t:
uint16_t a;
uint16_t b;
a = 5;
b = 20;
printf("%u\n", (a-b) < 10);
The output is 1. Why is this? Is the result of subtraction between two uint16_t types stored in an int in gcc? If I change the 10 to 10U the output is again 0, which seems to support this (if the subtraction result is stored as an int and the comparison is made against an unsigned int, then the subtraction result will be converted to an unsigned int).

Because calculations are not done with types below int / unsigned int (char, short, unsigned short etc; but not long, unsigned long etc), but they are first promoted to one of int or unsigned int. "uint16_t" is possibly "unsigned short" on your implementation, which is promoted to "int" on your implementation. So the result of that calculation then is "-15", which is smaller than 10.
On older implementations that calculate with 16bit, "int" may not be able to represent all values of "unsigned short" because both have the same bitwidth. Such implementations must promote "unsigned short" to "unsigned int". On such implementations, your comparison results in "0".

Before both the - and < operations are performed, a set of conversions called the usual arithmetic conversions are applied to convert the operands to a common type. As part of this process, the integer promotions are applied, which promote types narrower than int or unsigned int to one of those two types.
In the first case, the types of a and b are unsigned int, so no change of types occurs due to the - operator - the result is an unsigned int with the large positive value UINT_MAX - 14. Then, because int and unsigned int have the same rank, the value 10 with type int is converted to unsigned int, and the comparison is then performed resulting in the value 0.
In the second case, it is apparent that on your implementation the type int can hold all the values of the type uint16_t. This means that when the integer promotions are applied, the values of a and b are promoted to type int. The subtraction is performed, resulting in the value -15 with type int. Both operands to the < are already int, so no conversions are performed; the result of the < is 1.
When you use a 10U in the latter case, the result of a - b is still -15 with type int. Now, however, the usual arithmetic conversions cause this value to be converted to unsigned int (just as the 10 was in the first example), which results in the value UINT_MAX - 14; the result of the < is 0.
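The types involved can be checked with the C11 _Generic operator. This is only a sketch: the TYPE_CODE macro and the function names are hypothetical, and the uint16_t results assume an implementation where int is wider than 16 bits.

```c
#include <stdint.h>

/* Maps the type of an expression to a small code (C11 _Generic). */
#define TYPE_CODE(e) _Generic((e), int: 1, unsigned int: 2, default: 0)

int type_of_u16_difference(void)
{
    uint16_t a = 5, b = 20;
    return TYPE_CODE(a - b);   /* 1 (int) where int is wider than 16 bits */
}

int type_of_uint_difference(void)
{
    unsigned int a = 5, b = 20;
    return TYPE_CODE(a - b);   /* 2 (unsigned int): no promotion happens */
}

int u16_difference_below_ten(void)
{
    uint16_t a = 5, b = 20;
    return (a - b) < 10;       /* -15 < 10 once both operands are int */
}
```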

[...] Otherwise, the integer promotions are performed on both operands. (6.3.1.8)

When uint16_t is a sub-range of int, (a-b) < 10 is performed using int math.
Use an unsigned constant to gently nudge the left side to unsigned math.
// printf("%u\n", (a-b) < 10);
printf("%d\n", (0u + a - b) < 10); // Using %d as the result of a compare is int.
// or to quiet some picky warnings
printf("%d\n", (0u + a - b) < 10u);
(a - b) < 10u also works with this simple code, yet the idea I am suggesting is to perform both sides as unsigned math, as that may be needed with more complex code.
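The same idea wrapped in a function, as a sketch. The helper name and its parameters are hypothetical. Note that the difference wraps modulo UINT_MAX + 1, not modulo 65536.

```c
#include <stdint.h>

/* Compares the wrapped difference of two 16-bit values against a limit,
   entirely in unsigned int arithmetic. */
int wrapped_diff_below(uint16_t a, uint16_t b, unsigned int limit)
{
    /* 0u + a has type unsigned int; b is then converted to unsigned int
       as well, so the subtraction cannot produce a negative int. */
    return (0u + a - b) < limit;
}
```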

How you avoid implicit conversion from short to integer during addition?

I'm doing a few integer experiments for myself, where I'm trying to fully understand integer overflow.
I kept reading about how it can be dangerous to mix integer types of different sizes. For that reason I wanted to have an example where a short would overflow much faster than an int.
Here is the snippet:
unsigned int longt;
longt = 65530;
unsigned short shortt;
shortt = 65530;
if (longt > (shortt+10)){
printf("it is bigger");
}
But the if-statement here is not being run, which must mean that the short is not overflowing. Thus I conclude that in the expression shortt+10 a conversion happens from short to integer.
This is a bit strange to me, when the if statement evaluates expressions, does it then have the freedom to assign a new integer type as it pleases?
I then thought that if I was adding two short's then that would surely evaluate to a short:
unsigned int longt;
longt = 65530;
unsigned short shortt;
shortt = 65530;
shortt = shortt;
short tmp = 10;
if (longt > (shortt+tmp)){
printf("Ez bigger");
}
But alas, the proposition still evaluates to false.
I then tried to do something where I am completely explicit, where I actually do the addition into a short type, this time forcing it to overflow:
unsigned int longt;
longt = 65530;
unsigned short shortt;
shortt = 65530;
shortt = shortt;
short tmp = shortt + 10;
if (longt > tmp){
printf("Ez bigger");
}
Finally this worked, which also would have been really annoying if it didn't.
This flusters me a little bit though, and it reminds me of a ctf exercise that I did a while back, where I had to exploit this code snippet:
#include <stdio.h>
int main() {
int impossible_number;
FILE *flag;
char c;
if (scanf("%d", &impossible_number)) {
if (impossible_number > 0 && impossible_number > (impossible_number + 1)) {
flag = fopen("flag.txt","r");
while((c = getc(flag)) != EOF) {
printf("%c",c);
}
}
}
return 0;
}
Here, you're supposed to trigger an overflow of the "impossible_number" variable, which was actually possible on the server it was deployed on but causes issues when run locally.
You should be able to give "2147483647" as input, and then overflow and hit the if statement. However this does not happen when run locally, or running at an online compiler.
I don't get it. How do you get an expression to actually overflow the way it is supposed to, like in this example from 247ctf?
I hope someone has an answer for this.

How you avoid implicit conversion from short to integer during addition?
You don't.
C has no arithmetic operations on integer types narrower than int and unsigned int. There is no + operator for type short.
Whenever an expression of type short is used as the operand of an arithmetic operator, it is implicitly converted to int.
For example:
short s = 1;
s = s + s;
In s + s, s is promoted from short to int and the addition is done in type int. The assignment then implicitly converts the result of the addition from int to short.
Some compilers might have an option to enable a warning for the narrowing conversion from int to short, but there's no way to avoid it.
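One way to observe the promotion is the C11 _Generic operator, sketched below. The macro and function names are hypothetical.

```c
/* 1 if the expression has type int, 2 if it has type short. */
#define INT_OR_SHORT(e) _Generic((e), int: 1, short: 2, default: 0)

int type_of_short_sum(void)
{
    short s = 1;
    return INT_OR_SHORT(s + s);   /* int: both operands were promoted */
}

int type_of_short_object(void)
{
    short s = 1;
    return INT_OR_SHORT(s);       /* short: no arithmetic operator applied */
}
```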

What you're seeing is a result of integer promotions. What this basically means is that anytime an integer type smaller than int is used in an expression it is converted to int.
This is detailed in section 6.3.1.1p2 of the C standard:
The following may be used in an expression wherever an int or unsigned int may be used:
An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to
the rank of int and unsigned int.
A bit-field of type _Bool, int, signed int, or unsigned int.
If an int can represent all values of the original type (as restricted
by the width, for a bit-field), the value is converted to an int;
otherwise, it is converted to an unsigned int. These are called the
integer promotions. All other types are unchanged by the integer
promotions
That is what's happening here. So let's look at the first expression:
if (longt > (shortt+10)){
Here we have an unsigned short with value 65530 being added to the constant 10 which has type int. The unsigned short value is converted to an int value, so now we have the int value 65530 being added to the int value 10 which results in the int value 65540. We now have 65530 > 65540 which is false.
The same happens in the second case, where both operands of the + operator (an unsigned short and a short) are first promoted to int.
In the third case, the difference happens here:
short tmp = shortt + 10;
On the right side of the assignment, we still have the int value 65540 as before, but now this value needs to be assigned back to a short. This undergoes an implementation defined conversion to short, which is detailed in section 6.3.1.3:
1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new
type, it is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an
implementation-defined signal is raised.
Paragraph 3 takes effect in this particular case. In most implementations you're likely to come across, this will typically mean "wraparound" of the value.
So how do you work with this? The closest thing you can do is either what you did, i.e. assign the intermediate result to a variable of the desired type, or cast the intermediate result:
if (longt > (short)(shortt+10)) {
As for the "impossible" input in the CTF example, that actually causes signed integer overflow as a result of the addition, and that triggers undefined behavior. For example, when I ran it on my machine, I got into the if block if I compiled with -O0 or -O1 but not with -O2.
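Because the overflowing addition itself is undefined, code that needs to detect the overflow has to test before adding. A minimal sketch, with hypothetical function names:

```c
#include <limits.h>

/* Returns 1 if a + 1 would overflow int. The check never performs the
   overflowing addition, so it does not rely on undefined behavior. */
int increment_would_overflow(int a)
{
    return a == INT_MAX;
}

/* General form for a + b, again without computing the sum. */
int add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* INT_MAX - b cannot overflow for b > 0 */
    return a < INT_MIN - b;       /* INT_MIN - b cannot overflow for b <= 0 */
}
```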

How you avoid implicit conversion from short to integer during addition?
Not really avoidable.
On 16-bit and wider machines, the conversion short to int and unsigned short to unsigned does not affect the value. But addition overflow and the implicit conversion from int to unsigned render a different result in 16-bit vs. 32-bit for OP's values. For in 16-bit land, unsigned short to int does not implicitly occur. Instead, code does unsigned short to unsigned.
int/unsigned as 16-bit
If int/unsigned were 16-bit (common on many embedded processors), then shortt would not convert to an int, but to unsigned.
// Given 16-bit int/unsigned
unsigned int longt;
longt = 65530; // 32-bit long constant assigned to 16-bit unsigned - no value change as value in range.
unsigned short shortt;
shortt = 65530; // 32-bit long constant assigned to 16-bit unsigned short - no value change as value in range.
// (shortt+10)
// shortt+10 is an unsigned short + int
// unsigned short promotes to unsigned - no value change.
// Then since unsigned + int, the int 10 converts to unsigned 10 - no value change.
// unsigned 65530 + unsigned 10 exceeds unsigned range so 65536 subtracted.
// Sum is 4.
// Statement is true.
if (longt > (shortt+10)){
printf("it is bigger");
}
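On a platform with a wider int, the 16-bit outcome can be imitated by reducing the sum to uint16_t explicitly. This is only a simulation under that assumption, and the function names are hypothetical.

```c
#include <stdint.h>

/* Imitates shortt + 10 on a platform with 16-bit unsigned int by
   converting the sum to uint16_t, which reduces it modulo 65536. */
uint16_t sum_with_16bit_unsigned(uint16_t shortt)
{
    return (uint16_t)(shortt + 10u);
}

/* The comparison from the question, evaluated with that simulated sum. */
int longt_is_bigger_16bit(uint16_t longt, uint16_t shortt)
{
    return longt > sum_with_16bit_unsigned(shortt);
}
```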

It is called an implicit conversion.
From C standard:
Several operators convert operand values from one type to another
automatically. This subclause specifies the result required from such
an implicit conversion, as well as those that result from a cast
operation (an explicit conversion ). The list in 6.3.1.8 summarizes
the conversions performed by most ordinary operators; it is
supplemented as required by the discussion of each operator in 6.5
Every integer type has an integer conversion rank defined as follows:
- No two signed integer types shall have the same rank, even if they have the same representation.
- The rank of a signed integer type shall be greater than the rank of any signed integer type with less precision.
- The rank of long long int shall be greater than the rank of long int, which shall be greater than the rank of int, which shall be greater than the rank of short int, which shall be greater than the rank of signed char.
- The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type, if any.
- The rank of any standard integer type shall be greater than the rank of any extended integer type with the same width.
- The rank of char shall equal the rank of signed char and unsigned char.
- The rank of _Bool shall be less than the rank of all other standard integer types.
- The rank of any enumerated type shall equal the rank of the compatible integer type (see 6.7.2.2).
- The rank of any extended signed integer type relative to another extended signed integer type with the same precision is implementation-defined, but still subject to the other rules for determining the integer conversion rank.
- For all integer types T1, T2, and T3, if T1 has greater rank than T2 and T2 has greater rank than T3, then T1 has greater rank than T3.
The following may be used in an expression wherever an int or unsigned int may be used:
- An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.
- A bit-field of type _Bool, int, signed int, or unsigned int.
If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.
The integer promotions preserve value including sign. As discussed earlier, whether a "plain" char is treated as signed is implementation-defined.
You can't avoid implicit conversion, but you can cast the result of the operation to the required type:
if (longt > (short)(shortt+tmp))
{
printf("Ez bigger");
}
https://godbolt.org/z/39Exa8E7K
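Be aware that when the int result does not fit in a short, this conversion is implementation-defined (6.3.1.3p3) rather than undefined behaviour: most implementations wrap the value, but the standard does not guarantee it. You have to be very careful doing it as it can be a source of very hard to find and debug errors.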
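If a wrapped sum is what you are after, converting to unsigned short instead of short is one option, because conversion to an unsigned type is defined to reduce the value modulo one more than the type's maximum (6.3.1.3p2). A sketch with a hypothetical function name; the values in the note below assume a 16-bit unsigned short.

```c
/* Compares longt against the sum reduced to unsigned short. */
int bigger_than_wrapped_sum(unsigned int longt, unsigned short shortt, short tmp)
{
    /* shortt + tmp is computed in int; the cast then wraps it into the
       range of unsigned short. */
    return longt > (unsigned short)(shortt + tmp);
}
```

With longt = 65530, shortt = 65530 and tmp = 10, the sum 65540 is reduced to 4 and the comparison is true.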

Why is signed char not getting upcasted to unsigned int here?

#include <stdio.h>
int main()
{
unsigned int x =1;
signed char y = -1;
unsigned int sum = x + y;
printf("%u", sum);
}
In the above program I expected signed char to be upcasted to unsigned int and hence sum to be x + y = 1 + 2^32 -1 = 2^32. But surprisingly it prints 0.
Previously, I had tried printing (x>y) and got false (0) as the output. I can't figure out what's going on here. Could someone explain how one goes about casting in such cases?

It shouldn't be surprising that computing 2^32 as an unsigned int results in 0. On a machine with 32-bit ints, UINT_MAX is 2^32 − 1 and 2^32 is out of range. As with any other unsigned arithmetic, the out-of-range value is reduced modulo UINT_MAX + 1 (i.e., 2^32), resulting in 0.
Specifically, in the evaluation of x + y:
First, y is converted to a (signed) int, as per the "integer promotions". This doesn't change the value of y; it is still -1.
Then, as per the "usual arithmetic conversions", since unsigned int and int have the same rank, y is converted to unsigned int, making its value 2^32 − 1.
Finally, the addition is computed using unsigned int arithmetic. That results in 0, as above.
This exact same sequence is followed for the evaluation of x > y. Since y has been converted to an unsigned int before the comparison is evaluated, the result is (perhaps unexpectedly) false. That's why some compilers will warn about comparison between signed and unsigned values.
Also, the type of the variable being assigned to does not alter the computation. Only after the result is computed is any consideration given to what will be done with it. If, for example, sum had been declared unsigned long long int, the computation would be done identically and sum would still be 0. For the extra precision to be useful, you would have to first convert y to unsigned int, and then ensure that the addition was computed with extra precision by casting one of the operands of + to the wider type:
unsigned int y_as_int = y;
unsigned long long sum = x + (unsigned long long)y_as_int;
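The conversion steps described in this answer can be put into small functions and checked. This is a sketch; the function names are hypothetical.

```c
/* y is promoted to int (-1), then converted to unsigned int (UINT_MAX)
   because the other operand is unsigned int; the sum wraps to 0. */
unsigned int sum_uint_schar(unsigned int x, signed char y)
{
    return x + y;
}

/* The same conversions apply to the comparison, so 1u > -1 is false. */
int uint_greater_than_schar(unsigned int x, signed char y)
{
    return x > y;
}

/* The value of y after conversion to unsigned int. */
unsigned int schar_as_uint(signed char y)
{
    return (unsigned int)y;
}
```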

Is unsigned char always promoted to int?

Suppose the following:
unsigned char foo = 3;
unsigned char bar = 5;
unsigned int shmoo = foo + bar;
Are foo and bar values guaranteed to be promoted to int values for the evaluation of the expression foo + bar -- or are implementations allowed to promote them to unsigned int?
In section 6.2.5 paragraph 8:
For any two integer types with the same signedness and different integer conversion rank
(see 6.3.1.1), the range of values of the type with smaller integer conversion rank is a
subrange of the values of the other type.
In section 6.3.1.1 paragraph 2:
If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int.
The guarantee that an integer type with smaller integer conversion rank has a range of values that is a subrange of the values of the other type seems dependent on the signedness of the integer type.
signed char corresponds to signed int
unsigned char corresponds to unsigned int
Does this mean that the value of an unsigned char is only guaranteed to be in the subrange of unsigned int and not necessarily int? If so, does that imply that an implementation could theoretically have an unsigned char value which is not in the subrange of an int?

are implementations allowed to promote them to unsigned int?
Implementations will promote to unsigned int if not all unsigned char values are representable in an int (as ruled by 6.3.1.1p2 in C99). See below for implementation examples.
If so, does that imply that an implementation could theoretically have an unsigned char value which is not in the subrange of an int?
Yes, example: DSP cpu with CHAR_BIT 16 or 32.
For example, TI C compiler for TMS320C55x: CHAR_BIT is 16 and UCHAR_MAX 65535, UINT_MAX 65535 but INT_MAX 32767.
http://focus.ti.com/lit/ug/spru281f/spru281f.pdf
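Which of the two promotions a given implementation applies can be checked in code. A sketch with hypothetical function names; the second function needs C11 for _Generic.

```c
#include <limits.h>

/* 1 if every unsigned char value fits in an int, in which case the
   integer promotions convert unsigned char to int; 0 otherwise. */
int uchar_promotes_to_int(void)
{
    return UCHAR_MAX <= INT_MAX;
}

/* The same question answered by the compiler: unary + applies the
   integer promotions to its operand. */
int uchar_promoted_type_is_int(void)
{
    unsigned char foo = 3;
    return _Generic(+foo, int: 1, unsigned int: 0);
}
```

On the TMS320C55x configuration described above, both functions would be expected to return 0; on common desktop platforms they return 1.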

I ran across this yesterday - hope that my answer is on topic.
uint8_t x = 10;
uint8_t y = 250;
if (x - y > 0) {
// never happens
}
if (x - y < 0) {
// always happens
}
To my eyes at least it was appearing as though values x and y were being unexpectedly promoted, when in fact it was their result that was promoted.
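A sketch of what the subtraction produces, with hypothetical function names: the operands are promoted to int before the subtraction, so the difference has type int and can be negative, and converting it back to uint8_t gives the 8-bit wrapped value.

```c
#include <stdint.h>

/* x and y are promoted to int, so 10 - 250 is the int value -240. */
int diff_promoted(uint8_t x, uint8_t y)
{
    return x - y;
}

/* Converting back to uint8_t reduces the difference modulo 256. */
uint8_t diff_wrapped8(uint8_t x, uint8_t y)
{
    return (uint8_t)(x - y);
}
```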

What type-conversions are happening?

#include <stdio.h>
int main()
{
int x = -13701;
unsigned int y = 3;
signed short z = x / y;
printf("z = %d\n", z);
return 0;
}
I would expect the answer to be -4567. I am getting "z = 17278".
Why does a promotion of these numbers result in 17278?
I executed this in Code Pad.

The hidden type conversions are:
signed short z = (signed short) (((unsigned int) x) / y);
When you mix signed and unsigned types the unsigned ones win. x is converted to unsigned int, divided by 3, and then that result is down-converted to (signed) short. With 32-bit integers:
(unsigned) -13701 == (unsigned) 0xFFFFCA7B // Bit pattern
(unsigned) 0xFFFFCA7B == (unsigned) 4294953595 // Re-interpret as unsigned
(unsigned) 4294953595 / 3 == (unsigned) 1431651198 // Divide by 3
(unsigned) 1431651198 == (unsigned) 0x5555437E // Bit pattern of that result
(short) 0x5555437E == (short) 0x437E // Strip high 16 bits
(short) 0x437E == (short) 17278 // Re-interpret as short
By the way, the signed keyword is unnecessary. signed short is a longer way of saying short. The only type that needs an explicit signed is char. char can be signed or unsigned depending on the platform; all other types are always signed by default.

Short answer: the division first promotes x to unsigned. Only then the result is cast back to a signed short.
Long answer: read this SO thread.

The problem comes from the unsigned int y. Indeed, x/y becomes unsigned. It works with:
#include <stdio.h>
int main()
{
int x = -13701;
signed int y = 3;
signed short z = x / y;
printf("z = %d\n", z);
return 0;
}

Every time you mix "large" signed and unsigned values in additive and multiplicative arithmetic operations, unsigned type "wins" and the evaluation is performed in the domain of the unsigned type ("large" means int and larger). If your original signed value was negative, it first will be converted to positive unsigned value in accordance with the rules of signed-to-unsigned conversions. In your case -13701 will turn into UINT_MAX + 1 - 13701 and the result will be used as the dividend.
Note that the signed-to-unsigned conversion on a typical 32-bit int platform will produce the unsigned value 4294953595. After division by 3 you'll get 1431651198. This value is too large to fit into a short object on a platform with 16-bit short type. An attempt to do that results in implementation-defined behavior. So, if the properties of your platform are the same as in my assumptions, then your code produces implementation-defined behavior. Formally speaking, the "meaningless" 17278 value you are getting is nothing more than a specific manifestation of that implementation-defined behavior. It is possible that if you compiled your code with overflow checking enabled (if your compiler supports it), it would trap on the assignment.
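A sketch of the two divisions side by side, assuming a 32-bit int as in the discussion above. The function names are hypothetical.

```c
/* The division as written in the question: x is converted to unsigned int
   before dividing, so the quotient is a large unsigned value. */
unsigned int quotient_mixed(int x, unsigned int y)
{
    return x / y;
}

/* Converting y to int first keeps the division signed. This assumes the
   value of y fits in an int. */
int quotient_signed(int x, unsigned int y)
{
    return x / (int)y;
}
```

For x = -13701 and y = 3, quotient_signed gives -4567, while quotient_mixed gives 1431651198, whose low 16 bits are 0x437E (17278).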