I ran into a problem and it's driving me nuts.
I have code like this:
float a;
unsigned short b;
b += a;
When a is negative, b goes bananas.
I even tried a cast:
b += (unsigned short) a;
but it doesn't work.
What did I do wrong? How can I add a float to an unsigned short?
FYI:
When a is -1 and b is 0, I see that b += a gives b = 65535.
The way to add a float to an unsigned short is simply to add it, exactly as you've done. The operands of the addition will undergo conversions, as I'll describe below.
A simple example, based on your code, is:
#include <stdio.h>

int main(void) {
    float a = 7.5;
    unsigned short b = 42;
    b += a;
    printf("b = %hu\n", b);
    return 0;
}
The output, unsurprisingly, is:
b = 49
The statement
b += a;
is equivalent to:
b = b + a;
(except that b is only evaluated once). When operands of different types are added (or subtracted, or ...), they're converted to a common type based on a set of rules you can find in the C standard section 6.3.1.8. In this case, b is converted from unsigned short to float. The addition is equivalent to 42.0f + 7.5f, which yields 49.5f. The assignment then converts this result from float to unsigned short, and the result, 49, is stored in b.
If the mathematical result of the addition is outside the range of float (which is unlikely), or if it's outside the range of unsigned short (which is much more likely), then the program will have undefined behavior. You might see some garbage value stored in b, your program might crash, or in principle quite literally anything else could happen. When you convert a signed or unsigned integer to an unsigned integer type, the result is wrapped around; this does not happen when converting a floating-point value to an unsigned type.
Without more information, it's impossible to tell what problem you're actually having or how to fix it.
But it does seem that adding an unsigned short and a float and storing the result in an unsigned short is an unusual thing to do. There could be situations where it's exactly what you need (if so you need to avoid overflow), but it's possible that you'd be better off storing the result in something other than an unsigned short, perhaps in a float or double. (Incidentally, double is used more often than float for floating-point data; float is useful mostly for saving space when you have a lot of data.)
If you're doing numeric conversions, even implicit ones, it's often (but by no means always) an indication that you should have used a variable of a different type in the first place.
Your question would be improved by showing actual values you have trouble with, and explaining what value you expected to get.
But in the meantime, the definition of floating to integer conversion in C11 6.3.1.4/1 is:
When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.
This comes into play at the point where the result of b + a, which is a float, is assigned back to b. Recall that b += a is equivalent to b = b + a.
If b + a is a negative number with magnitude 1 or greater, then its integral part is out of range for unsigned short, so the code causes undefined behaviour, which means anything can happen, including but not limited to going bananas.
A footnote repeats the point that the float is not first converted to a signed integer and then to unsigned short:
The remaindering operation performed when a value of integer type is converted to unsigned type need not be performed when a value of real floating type is converted to unsigned type. Thus, the range of portable real floating values is (−1, Utype_MAX+1)
As an improvement you could write:
b += (long long)a;
which will at least not cause UB as long as the truncated value of a fits in a long long (roughly, LLONG_MIN <= a <= LLONG_MAX).
You want b to be positive (it is unsigned), but a can be negative. It is OK as long as the magnitude of a is not larger than b. That is the first point.
Second, when you cast a negative value to unsigned, what is the result actually supposed to be? The sign of a number is stored in the most significant bit, and for negative values it is 1. When the value is interpreted as unsigned and the most significant bit is 1, the value is very large and has nothing in common with the negative one.
Maybe try b -= fabs(a) (from <math.h>) for negative a. Isn't that what you are looking for?
You are observing the combination of the float being converted to an integer, and unsigned integer wrap-around ( https://stackoverflow.com/a/9052112/1149664 ).
Consider
b += a
for example, with a = -100.67 you add a negative value to an unsigned type, and depending on the initial value of b the mathematical result ought to be negative, which an unsigned short cannot hold. How did you come to use an unsigned short rather than a float or double for this task?
Related
So can I cast the values to unsigned, do the operation, and cast back, and get the same result? I want to do this because unsigned integers can wrap around safely, while signed ones can't.
Unsigned integer arithmetic does not overflow in C terminology because it is defined to wrap modulo 2N, where N is the number of bits in the unsigned type being operated on, per C 2018 6.2.5 9:
… A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
For other types, if an overflow occurs, the behavior is not defined by the C standard, per 6.5 5:
If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.

Note that not just the result is undefined; the entire behavior of the program is undefined. It could give a result you do not expect, it could trap, or it could execute entirely different code from what you expect.
Regarding your question:
So can I cast the values to unsigned values, do the operation and cast back, and get the same result?
we have two problems. First, consider a + b given int a, b;. If a + b overflows, then the behavior is not defined by the C standard. So we cannot say whether converting to unsigned, adding, and converting back to int will produce the same result because there is no defined result for a + b to start with.
Second, the conversion back is partly implementation-defined, per C 6.3.1.3. Consider int c = (unsigned) a + (unsigned) b;, which implicitly converts the unsigned sum to an int to store in c. Paragraph 1 tells us that, if the value of the sum is representable in int, it is the result of the conversion. But paragraph 3 tells us what happens if the value is not representable in int:
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
GCC, for example, defines the result to be the result of wrapping modulo 2N. So, for int c = (unsigned) a + (unsigned) b;, GCC will produce the same result as int c = a + b; would if a + b wrapped modulo 2N. However, GCC does not guarantee the latter. When optimizing, GCC expects overflow will not occur, which can result in it eliminating any code branches where the program does allow overflow to occur. (GCC may have some options regarding its treatment of overflow.)
Additionally, even if both signed arithmetic and unsigned arithmetic wrap, performing an operation using unsigned values and converting back does not mathematically produce the same result as doing the operation with signed values. For example, consider -3/2. The int result is −1. But if -3 is converted to 32-bit unsigned, the resulting value is 2^32 − 3, and then (int) ((unsigned) -3 / (unsigned) 2) is 2^31 − 2 = 2,147,483,646.
I'm new to C and I was reading a textbook which shows this piece of buggy code for a function that determines whether one string is longer than another:
int strlonger(char *s, char *t) {
    return strlen(s) - strlen(t) > 0;
}
The reason it is buggy is that the return type of strlen is unsigned (size_t), so when the left operand's mathematical result would be negative, it wraps to a huge unsigned value and therefore produces an incorrect result; e.g. -1 becomes the maximum unsigned value, which is of course greater than 0.
It seems that the result of strlen(s) - strlen(t) is also an unsigned integer, but why does it have to be this way? I mean, for example, 0u-1u is -1, and -1 is a signed integer, so C should keep this value -1 without casting it back to unsigned, because I'm not coding like:
...
unsigned int result = strlen(s) - strlen(t);
return result > 0;
or C has some special rule that the result type of two operand should match the type of the operands?
First of all, the result of strlen is size_t, which can be different from unsigned int. It is unsigned, but it can have a different size: https://godbolt.org/z/GrB5_z
The result of the operation has to fit in the resulting type, so the most natural choice is the same type as the operands (assuming they are the same).
What to do - just change the logic of your function.
int strlonger(const char *s, const char *t)
{
    return strlen(s) > strlen(t);
}
even the function name suggests this approach
I mean for example, 0u-1u is -1, -1 is an signed integer
No, it is not. When you subtract 1u from 0u, the unsigned integer wraps around. So the resulting integer will have all bits set (for a 32-bit unsigned integer it will be 0xffffffff).
… why it has to be in this way?
Theoretically, it does not have to be this way, but that is how it is designed in C.
The specific rule is in C 2018 6.3.1.8, “Usual arithmetic conversions”:
Many operators that expect operands of arithmetic type cause conversions and yield result types in a similar way. The purpose is to determine a common real type for the operands and result. For the specified operands, each operand is converted, without change of type domain, to a type whose corresponding real type is the common real type. Unless explicitly stated otherwise, the common real type is also the corresponding real type of the result, whose type domain is the type domain of the operands if they are the same, and complex otherwise… [Emphasis added.]
This passage tells us that, if x and y have type unsigned int, the result of x - y is also unsigned int. (It goes on to present further rules about cases where the operands have mixed types, such as one double and one int, or have narrow integer types, such as char, but the situation is simple for unsigned int operands.)
Why is this a good design? Consider what the result of subtracting two unsigned int values can be. Let M be the maximum value of an unsigned int. The minimum value is of course 0. If x is M and y is 0, x - y is M. If x is 0 and y is M, x - y is −M. So the result of x - y can be anything from −M to +M. If unsigned int has N bits, we would need a signed integer with N+1 bits in order to hold any potential result.
If we defined the result type of an operation to be a type that could hold any mathematical result, this design would be unworkable. Subtracting two 32-bit unsigned int would require a 33-bit integer type. And adding or subtracting two of those would require a 34-bit type. And further operations would require wider and wider types. Not only does hardware generally not support the necessary widths, but doing calculations in a loop would require dynamic types that grow with each iteration of the loop. (And this is considering only addition and subtraction. With multiplication, the required sizes would grow even faster.)
So, our design has to use fixed sizes for the result types. What should they be? Whether we define the result of subtracting two unsigned int values to be unsigned int or int, only some of the possible mathematical results can be represented in the result type. For the most part, it is simpler to say the result type is the same as the operand types. It is up to the programmer to ensure they stay within the bounds of the type or, if they want something different, to write the code to get the result they want.
As an example, if you know size_t is unsigned int in your C implementation, and you know long int is wider, then you can write (long int) x - y. This explicitly converts x to the wider (and signed) type. y is also implicitly converted, by the rule cited above, and the result is produced in the type long int. Then there will be no overflow regardless of the values of x and y.
In summary, it is not feasible for the compiler to manage types to avoid overflows, so it is left to the programmer to do this.
I am not very good at C and just ran into a problem I don't understand. The code is:
#include <stdio.h>

int main(void)
{
    unsigned int a = 100;
    unsigned int b = 200;
    float c = 2;
    int result_i;
    unsigned int result_u;
    float result_f;
    result_i = (a - b) * 2;
    result_u = (a - b);
    result_f = (a - b) * c;
    printf("%d\n", result_i);
    printf("%d\n", result_u);
    printf("%f\n", result_f);
    return 0;
}
And the output is:
-200
-100
8589934592.000000
Program ended with exit code: 0
Since a and b are of unsigned int type and a - b is mathematically negative, I expected (a - b) to misbehave. Yet after multiplying by the float number c, the result is 8589934592.000000. I have two questions:
First, why does the result look correct (-200) after multiplying by the int 2 and assigning to an int?
Second, why does result_u print as -100 even though (a - b) should wrap and result_u has unsigned int type?
I am using Xcode to test this code, and the compiler is the default APPLE LLVM 6.0.
Thanks!
Your assumption that a - b is negative is completely incorrect.
Since a and b have unsigned int type all arithmetic operations with these two variables are performed in the domain of unsigned int type. The same applies to mixed "unsigned int with int" arithmetic as well. Such operations implement modulo arithmetic, with the modulo being equal to UINT_MAX + 1.
This means that expression a - b produces a result of type unsigned int. It is a large positive value equal to UINT_MAX + 1 - 100. On a typical platform with 32-bit int it is 4294967296 - 100 = 4294967196.
Expression (a - b) * 2 also produces a result of type unsigned int. It is also a large positive value (UINT_MAX + 1 - 100 multiplied by 2 and taken modulo UINT_MAX + 1). On a typical platform it is 4294967096.
This latter value is too large for type int. Which means that when you force it into the variable result_i, an out-of-range conversion to a signed type occurs. The result of such a conversion is implementation-defined (C11 6.3.1.3p3). In your case result_i ended up being -200. It looks "correct", but this is not guaranteed by the language. (Albeit it might be guaranteed by your implementation.)
Variable result_u receives the correct unsigned result - a positive value UINT_MAX + 1 - 100. But you print that result using %d format specifier in printf, instead of the proper %u. It is illegal to print unsigned int values that do not fit into the range of int using %d specifier. The behavior of your code is undefined for that reason. The -100 value you see in the output is just a manifestation of that undefined behavior. This output is formally meaningless, even though it appears "correct" at the first sight.
Finally, variable result_f receives the "proper" result of (a-b)*c expression, calculated without overflows, since the multiplication is performed in the float domain. What you see is that large positive value I mentioned above, multiplied by 2. It is likely rounded to the precision of float type though, which is implementation-defined. The exact value would be 4294967196 * 2 = 8589934392.
One can argue that the last value you printed is the only one that properly reflects the properties of unsigned arithmetic, i.e. it is "naturally" derived from the actual result of a - b.
You get negative numbers in the printf because you've asked it to print a signed integer with %d. Use %u if you want to see the actual value you ended up with. That will also show you how you ended up with the output for the float multiplication.
The following C code displays the result, -1, seemingly correctly.
#include <stdio.h>

int main(void)
{
    unsigned x = 1;
    unsigned y = x - 2;
    printf("%d", y);
}
But in general, is it always safe to do subtraction involving
unsigned integers?
The reason I ask the question is that I want to do some conditioning
as follows:
unsigned x = 1; // x was defined by someone else as unsigned,
// which I had better not to change.
for (int i = -5; i < 5; i++) {
    if (x + i < 0) continue;
    f(x + i);   // f is a function
}
Is it safe to do so?
How are unsigned integers and signed integers different in
representing integers? Thanks!
1: Yes, it is safe to subtract unsigned integers. The definition of arithmetic on unsigned integers includes that if an out-of-range value would be generated, then that value should be adjusted modulo the maximum value for the type, plus one. (This definition is equivalent to truncating high bits).
Your posted code has a bug though: printf("%d", y); causes undefined behaviour because %d expects an int, but you supplied unsigned int. Use %u to correct this.
2: When you write x+i, the i is converted to unsigned. The result of the whole expression is a well-defined unsigned value. Since an unsigned can never be negative, your test will always fail.
You also need to be careful using relational operators because the same implicit conversion will occur. Before I give you a fix for the code in section 2, what do you want to pass to f when x is UINT_MAX or close to it? What is the prototype of f ?
3: Unsigned integers use a "pure binary" representation.
Signed integers have three options. Two can be considered obsolete; the most common one is two's complement. All options require that a positive signed integer value has the same representation as the equivalent unsigned integer value. In two's complement, a negative signed integer is represented the same as the unsigned integer generated by adding UINT_MAX+1, etc.
If you want to inspect the representation, then do unsigned char *p = (unsigned char *)&x; printf("%02X%02X%02X%02X", p[0], p[1], p[2], p[3]);, depending on how many bytes are needed on your system.
It's always safe to subtract unsigned values, as in
unsigned x = 1;
unsigned y=x-2;
y will take on the value of -1 mod (UINT_MAX + 1), which is UINT_MAX.
Is it always safe to do subtraction, addition, and multiplication involving unsigned integers? Yes: there is no UB. The answer will always be the expected mathematical result reduced modulo UINT_MAX + 1.
But do not do printf("%d", y ); - that is UB. Instead printf("%u", y);
C11 §6.2.5 9 "A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type."
When unsigned and int are used in +, the int is converted to an unsigned. So x+i has an unsigned result and never is that sum < 0. Safe, but now if (x+i<0) continue is pointless. f(x+i); is safe, but need to see f() prototype to best explain what may happen.
Unsigned integers are always 0 to power(2,N)-1 and have well defined "overflow" results. Signed integers are 2's complement, 1's complement, or sign-magnitude and have UB on overflow. Some compilers take advantage of that and assume it never occurs when making optimized code.
Rather than really answering your questions directly, which has already been done, I'll make some broader observations that really go to the heart of your questions.
The first is that using unsigned in loop bounds where there's any chance that a signed value might crop up will eventually bite you. I've done it a bunch of times over 20 years and it has ultimately bit me every time. I'm now generally opposed to using unsigned for values that will be used for arithmetic (as opposed to being used as bitmasks and such) without an excellent justification. I have seen it cause too many problems when used, usually with the simple and appealing rationale that “in theory, this value is non-negative and I should use the most restrictive type possible”.
I understand that x, in your example, was decided to be unsigned by someone else, and you can't change it, but you want to do something involving x over an interval potentially involving negative numbers.
The “right” way to do this, in my opinion, is first to assess the range of values that x may take. Suppose that the length of an int is 32 bits. Then the length of an unsigned int is the same. If it is guaranteed to be the case that x can never be larger than 2^31-1 (as it often is), then it is safe in principle to cast x to a signed equivalent and use that, i.e. do this:
int y = (int)x;
// Do your stuff with *y*
x = (unsigned)y;
If you have a long that is longer than unsigned, then even if x uses the full unsigned range, you can do this:
long y = (long)x;
// Do your stuff with *y*
x = (unsigned)y;
Now, the problem with either of these approaches is that before assigning back to x (e.g. x=(unsigned)y; in the immediately preceding example), you really must check that y is non-negative. However, these are exactly the cases where working with the unsigned x would have bitten you anyway, so there's no harm at all in something like:
long y = (long)x;
// Do your stuff with *y*
assert( y >= 0L );
x = (unsigned)y;
At least this way, you'll catch the problems and find a solution, rather than having a strange bug that takes hours to find because a loop bound is four billion unexpectedly.
No, it's not safe.
Integers are usually 4 bytes (32 bits) long. Signed and unsigned types differ in representation:
As far as signed integers is concerned, the most significant bit is used for sign, so they can represent values between -2^31 and 2^31 - 1
Unsigned integers don't use any bit for sign, so they represent values from 0 to 2^32 - 1.
Part 2 isn't safe either, though for a different reason: x + i is computed in unsigned arithmetic, so when i is negative the sum wraps around instead of becoming negative, and the test x + i < 0 can never be true.
No, it's not safe. Trying to represent negative numbers with unsigned ints smells like bug. Also, you should use %u to print unsigned ints.
If we slightly modify your code to put %u in printf:
#include <stdio.h>
int main(void)
{
    unsigned x = 1;
    unsigned y = x - 2;
    printf("%u", y);
}
The number printed is 4294967295
The reason the original result looked correct is that C doesn't do any overflow checks and you printed the value as a signed int (%d). That does not make it safe practice, however. If you print the value as what it really is (%u), you won't get -1.
An Unsigned integer type should be thought of not as representing a number, but as a member of something called an "abstract algebraic ring", specifically the equivalence class of integers congruent modulo (MAX_VALUE+1). For purposes of examples, I'll assume "unsigned int" is 16 bits for numerical brevity; the principles would be the same with 32 bits, but all the numbers would be bigger.
Without getting too deep into the abstract-algebraic nitty-gritty, when assigning a number to an unsigned type [abstract algebraic ring], zero maps to the ring's additive identity (so adding zero to a value yields that value), and one maps to the ring's multiplicative identity (so multiplying a value by one yields that value). Adding a positive integer N to a value is equivalent to adding the multiplicative identity, N times; adding a negative integer -N, or subtracting a positive integer N, will yield the value which, when added to +N, would yield the original value.
Thus, assigning -1 to a 16-bit unsigned integer yields 65535, precisely because adding 1 to 65535 will yield 0. Likewise -2 yields 65534, etc.
Note that in an abstract algebraic sense, every integer maps uniquely into each algebraic ring of the indicated form, and a ring member maps uniquely into a smaller ring whose modulus is a factor of its own [e.g. a 16-bit unsigned integer maps uniquely to an 8-bit unsigned integer], but ring members are not uniquely convertible to larger rings or to integers. Unfortunately, C sometimes pretends that ring members are integers, and implicitly converts them; that can lead to some surprising behavior.
Subtracting a value, signed or unsigned, from an unsigned value which is no smaller than int, and no smaller than the value being subtracted, will yield a result according to the rules of algebraic rings, rather than the rules of integer arithmetic. Testing whether the result of such computation is less than zero will be meaningless, because ring values are never less than zero. If you want to operate on unsigned values as though they are numbers, you must first convert them to a type which can represent numbers (i.e. a signed integer type). If the unsigned type can be outside the range that is representable with the same-sized signed type, it will need to be upcast to a larger type.
See this code snippet:
#include <stdio.h>

int main(void)
{
    unsigned int a = 1000;
    int b = -1;
    if (a > b) printf("A is BIG! %d\n", a - b);
    else printf("a is SMALL! %d\n", a - b);
    return 0;
}
This gives the output: a is SMALL! 1001
I don't understand what's happening here. How does the > operator work in this case? Why is a "smaller" than b? And if it is indeed smaller, why do I get a positive number (1001) as the difference?
Binary operations between different integral types are performed within a "common" type defined by so called usual arithmetic conversions (see the language specification, 6.3.1.8). In your case the "common" type is unsigned int. This means that int operand (your b) will get converted to unsigned int before the comparison, as well as for the purpose of performing subtraction.
When -1 is converted to unsigned int the result is the maximal possible unsigned int value (same as UINT_MAX). Needless to say, it is going to be greater than your unsigned 1000 value, meaning that a > b is indeed false and a is indeed small compared to (unsigned) b. The if in your code should resolve to else branch, which is what you observed in your experiment.
The same conversion rules apply to subtraction. Your a-b is really interpreted as a - (unsigned) b and the result has type unsigned int. Such value cannot be printed with %d format specifier, since %d only works with signed values. Your attempt to print it with %d results in undefined behavior, so the value that you see printed (even though it has a logical deterministic explanation in practice) is completely meaningless from the point of view of C language.
Edit: Actually, I could be wrong about the undefined behavior part. According to C language specification, the common part of the range of the corresponding signed and unsigned integer type shall have identical representation (implying, according to the footnote 31, "interchangeability as arguments to functions"). So, the result of a - b expression is unsigned 1001 as described above, and unless I'm missing something, it is legal to print this specific unsigned value with %d specifier, since it falls within the positive range of int. Printing (unsigned) INT_MAX + 1 with %d would be undefined, but 1001u is fine.
On a typical implementation where int is 32-bit, -1 when converted to an unsigned int is 4,294,967,295 which is indeed ≥ 1000.
Even if you treat the subtraction in an unsigned world, 1000 - (4,294,967,295) = -4,294,966,295 = 1,001 which is what you get.
That's why gcc will spit a warning when you compare unsigned with signed. (If you don't see a warning, pass the -Wsign-compare flag.)
You are doing unsigned comparison, i.e. comparing 1000 to 2^32 - 1.
The output is signed because of %d in printf.
N.B. sometimes the behavior when you mix signed and unsigned operands is compiler-specific. I think it's best to avoid them and do casts when in doubt.
#include <stdio.h>

int main(void)
{
    int a = 1000;
    signed int b = -1, c = -2;
    printf("%u\n", (unsigned int)b);
    printf("%u\n", (unsigned int)c);
    printf("%u\n", (unsigned int)a);
    if (1000 > -1) {
        printf("\ntrue");
    }
    else
        printf("\nfalse");
    return 0;
}
For this you need to understand the usual arithmetic conversions.
When a signed operand and an unsigned operand of the same rank meet in a comparison, the signed one is converted to unsigned.
So when a comparison like
a > b
involves an unsigned a and b == -1, then -1 is first converted to unsigned int, whose positive range is larger than that of signed int.
-1 converted to unsigned becomes a very big number (UINT_MAX), so the comparison does not behave the way the written values suggest. (In if (1000 > -1) itself, both operands are plain int, so the comparison is signed and is true.)
An easy way to compare, maybe useful when you cannot get rid of the unsigned declaration (for example, [NSArray count]), is to just force the unsigned int to an int.
Please correct me if I am wrong.
if (((int)a)>b) {
....
}
The hardware is designed to compare signed to signed and unsigned to unsigned.
If you want the arithmetic result, convert the unsigned value to a larger signed type first. Otherwise the compiler will assume that the comparison is really between unsigned values.
And -1 is represented as 1111...1111, so it is a very big quantity, the biggest, when interpreted as unsigned.
When comparing a > b, where a has unsigned int type and b has int type, b is converted to unsigned int, so the signed value -1 becomes the maximum unsigned value (range: 0 to (2^32)-1).
Thus a > b, i.e. (1000 > 4294967295), is false. Hence the else branch, printf("a is SMALL! %d\n", a-b);, is executed.