Signed division with unsigned numerator - C

I'm trying to calculate a rolling average, and to optimize a bit, I've simplified the calculation so there is only one division. When the value is decreasing, there is a point where the current value drops below the average. At that point the average jumps. I imagine this is because the division is unsigned, and my numerator's sign bit is interpreted as a massive unsigned number. I'm just not sure where I need to cast to ensure this problem doesn't reappear.
unsigned int AverageUsage;
unsigned int TotalUsage;
unsigned int incCount;
AverageUsage = (TotalUsage - AverageUsage)/++incCount + AverageUsage;
AverageUsage will always be positive, but when TotalUsage drops below AverageUsage, I'm not sure what to expect with the division
AverageUsage = (signed int)(TotalUsage - AverageUsage)/++incCount + AverageUsage;
Will set the numerator to signed, but I am not sure how the division will occur.
AverageUsage = (signed int)((signed int)(TotalUsage - AverageUsage)/++incCount) + AverageUsage;
Should work (I can guarantee the result of this full operation will never be negative), but I am worried about cases when incCount reaches a value that 'looks' negative.
Is there a simple solution to this that hopefully:
Doesn't need an if statement
Doesn't require QWORDs
Thanks!

The general rule for C binary operators (including division) is that the operands will both be converted to the same type, which is one of: int, unsigned int, long, unsigned long, intmax_t, uintmax_t, float, double, long double. If both operands have types in that list, they'll both be converted to whichever type appears later in the list. If neither does, they'll both be converted to int.
So in your example:
AverageUsage = (signed int)(TotalUsage - AverageUsage)/++incCount + AverageUsage
if incCount is unsigned int, then your cast has no effect -- the result of the subtraction will be converted to signed int and then right back to unsigned int, and an unsigned division will be done. If you want a signed division, you'll need:
AverageUsage = (int)(TotalUsage - AverageUsage)/(int)++incCount + AverageUsage
which as you note may get you into trouble if incCount exceeds INT_MAX.
In general, processor instructions for division only specify one type, which is used for both operands. When there is a special instruction for division with differing types, it's usually for a larger (double-width) dividend, not a different signedness.
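To make the conversion rules concrete, here is a small demonstration; it is a sketch assuming a 32-bit int, with a divisor of 3 standing in for ++incCount:

#include <stdio.h>

int main(void) {
    unsigned int TotalUsage = 10, AverageUsage = 20;
    // TotalUsage - AverageUsage wraps to 4294967286 (UINT_MAX - 9)
    printf("%u\n", (TotalUsage - AverageUsage) / 3u);     // 1431655762
    // casting first gives a signed division (two's complement assumed)
    printf("%d\n", (int)(TotalUsage - AverageUsage) / 3); // -3
    return 0;
}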

You have 2 options.
Use Floating Point Math
I think you want to do this to get a proper average anyway.
There is no such thing as a mixed floating/integer divide, so both the numerator and denominator will be converted to floating point.
It then doesn't matter whether the numerator or denominator is signed or unsigned; there is no such thing as unsigned floating point. The denominator incCount will be converted to floating point and a full floating-point division will be done.
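For instance, a minimal sketch of this option, keeping the running average in a double (AverageUsageF and update_average are made-up names standing in for the question's variables):

double AverageUsageF = 0.0;
unsigned int TotalUsage = 0;
unsigned int incCount = 0;

void update_average(void) {
    // The subtraction happens in double, so a negative difference is fine.
    AverageUsageF += ((double)TotalUsage - AverageUsageF) / ++incCount;
}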
Use Integer division and handle the special cases
If for some reason you want to stay with integer division, then both the numerator and denominator have to be the same signed/unsigned type.
Both Numerator/Denominator are signed
incCount will be converted to a signed number. If it is too large then it will look like a negative number and your answer will be wrong. You have to test for this overflow.
Both Numerator/Denominator are unsigned
You have to make the numerator unsigned and use an if () statement to handle the two cases: TotalUsage < AverageUsage and TotalUsage >= AverageUsage. Here incCount can use the full range of integer bits, since it will be treated as an unsigned number.
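For example, a minimal sketch of the all-unsigned approach, using the question's variable names:

// Each subtraction is arranged so it cannot go negative.
++incCount;
if (TotalUsage >= AverageUsage)
    AverageUsage += (TotalUsage - AverageUsage) / incCount;
else
    AverageUsage -= (AverageUsage - TotalUsage) / incCount;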

Note of course that this is not a standard average. A standard average would be:
AverageUsage = TotalUsage / ++incCount
Assuming (ideally) that incCount is some useful periodically increasing value (like seconds).
A decaying average is typically implemented more like: http://donlehmanjr.com/Science/03%20Decay%20Ave/032.htm which if I have translated correctly is:
AverageUsage = TotalUsage / (incCount+1) + incCount/(incCount+1) * AverageUsage;
incCount++;
As Himadri mentioned, these should probably be done in floating point arithmetic.
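Note, incidentally, that with unsigned ints the factor incCount/(incCount+1) above truncates to zero, which is one more reason to use floating point. A hedged sketch in double (decay_update is a made-up helper name):

double AverageUsage = 0.0;
unsigned int incCount = 0;

void decay_update(double TotalUsage) {
    AverageUsage = TotalUsage / (incCount + 1)
                 + (double)incCount / (incCount + 1) * AverageUsage;
    incCount++;
}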

If it is foreseeable and valid for TotalUsage < AverageUsage, then it is entirely inappropriate for these variables to be of unsigned type. TotalUsage < AverageUsage would imply that AverageUsage could then become negative (which is the arithmetic result when TotalUsage < AverageUsage). If the data being 'averaged' is never negative, then it is arithmetically impossible for TotalUsage < AverageUsage to be true.
If TotalUsage < AverageUsage is not valid, then for it to be true would indicate an error in your code or an arithmetic overflow. You might guard against that possibility with an assert; perhaps one implemented as a macro that is removed in a release build. If the assert fires, then either the input data was invalid or an overflow occurred; in the latter case the data type is too small, and a long long, unsigned long long, or double would be appropriate.
Even with casting, if TotalUsage < AverageUsage is true then the result of the expression is arithmetically negative, but ultimately assigned to an unsigned type, so the result will still be incorrect.
The ultimate conclusion then is either that TotalUsage < AverageUsage can never be true, or your data has inappropriate type. The solution is almost certainly not any kind of type cast.
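A sketch of the assert guard mentioned above (assert() compiles away in a release build when NDEBUG is defined; update_average is a made-up wrapper):

#include <assert.h>

extern unsigned int AverageUsage, TotalUsage, incCount; // from the question

void update_average(void) {
    assert(TotalUsage >= AverageUsage); // fires on invalid input or overflow
    AverageUsage = (TotalUsage - AverageUsage) / ++incCount + AverageUsage;
}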
My advice is generally to always use a signed type for variables on which arithmetic will be performed. This is because the language semantics of mixed signed/unsigned arithmetic are somewhat arcane and easily misunderstood, and because intermediate operations may generate otherwise-negative values. Even if a negative value for the variable is semantically meaningless, I would still advocate the use of signed types in all cases where the positive range of such a type remains sufficient to avoid overflow, and where it is not sufficient, the use of a larger type where possible rather than resorting to an unsigned type of the same size. Further, where arithmetic operations on unsigned types are required, all operands should be unsigned (including literals), and no intermediate operation should underflow or overflow.

Do you truly /need/ a rolling average, or can you use some other low-pass filter? A single-pole (sometimes called an "alpha") filter might suit you:
new_output = alpha * previous_output + (1-alpha)*new_input;
previous_output = new_output;
where alpha is between 0 and 0.9999....
The closer alpha is to 1, the "slower" the filter is.
You can do this in floating point for ease, or in integers quite straightforwardly, as in the sketch below.
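For instance, a common integer sketch with alpha = 15/16: keep the accumulator scaled by 16 so the fractional part is not lost (acc and alpha_filter are made-up names):

static unsigned int acc; // holds 16 * previous_output

unsigned int alpha_filter(unsigned int new_input) {
    acc += new_input - (acc >> 4); // acc converges toward 16 * input
    return acc >> 4;               // current filter output
}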

Related

How does the Integer addition result arrive at its value in the case of overflow in C [duplicate]

Unsigned integer overflow is well defined by both the C and C++ standards. For example, the C99 standard (§6.2.5/9) states
A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting unsigned integer type is
reduced modulo the number that is one greater than the largest value that can be
represented by the resulting type.
However, both standards state that signed integer overflow is undefined behavior. Again, from the C99 standard (§3.4.3/1)
An example of undefined behavior is the behavior on integer overflow
Is there an historical or (even better!) a technical reason for this discrepancy?
The historical reason is that most C implementations (compilers) just used whatever overflow behaviour was easiest to implement with the integer representation it used. C implementations usually used the same representation used by the CPU - so the overflow behavior followed from the integer representation used by the CPU.
In practice, it is only the representations for signed values that may differ according to the implementation: one's complement, two's complement, sign-magnitude. For an unsigned type there is no reason for the standard to allow variation because there is only one obvious binary representation (the standard only allows binary representation).
Relevant quotes:
C99 6.2.6.1:3:
Values stored in unsigned bit-fields and objects of type unsigned char shall be represented using a pure binary notation.
C99 6.2.6.2:2:
If the sign bit is one, the value shall be modified in one of the following ways:
— the corresponding value with sign bit 0 is negated (sign and magnitude);
— the sign bit has the value −(2N) (two’s complement);
— the sign bit has the value −(2N − 1) (one’s complement).
Nowadays, all processors use two's complement representation, but signed arithmetic overflow remains undefined and compiler makers want it to remain undefined because they use this undefinedness to help with optimization. See for instance this blog post by Ian Lance Taylor or this complaint by Agner Fog, and the answers to his bug report.
Aside from Pascal's good answer (which I'm sure is the main motivation), it is also possible that some processors cause an exception on signed integer overflow, which of course would cause problems if the compiler had to "arrange for another behaviour" (e.g. use extra instructions to check for potential overflow and calculate differently in that case).
It is also worth noting that "undefined behaviour" doesn't mean "doesn't work". It means that the implementation is allowed to do whatever it likes in that situation. This includes doing "the right thing" as well as "calling the police" or "crashing". Most compilers, when possible, will choose "do the right thing", assuming that is relatively easy to define (in this case, it is). However, if you are having overflows in the calculations, it is important to understand what that actually results in, and that the compiler MAY do something other than what you expect (and that this may vary depending on compiler version, optimisation settings, etc).
First of all, please note that C11 3.4.3, like all examples and footnotes, is not normative text and is therefore not relevant to cite!
The relevant text that states that overflow of integers and floats is undefined behavior is this:
C11 6.5/5
If an exceptional condition occurs during the evaluation of an
expression (that is, if the result is not mathematically defined or
not in the range of representable values for its type), the behavior
is undefined.
A clarification regarding the behavior of unsigned integer types specifically can be found here:
C11 6.2.5/9
The range of nonnegative values of a signed integer type is a subrange
of the corresponding unsigned integer type, and the representation of
the same value in each type is the same. A computation involving
unsigned operands can never overflow, because a result that cannot be
represented by the resulting unsigned integer type is reduced modulo
the number that is one greater than the largest value that can be
represented by the resulting type.
This makes unsigned integer types a special case.
Also note that there is an exception if any type is converted to a signed type and the old value can no longer be represented. The behavior is then merely implementation-defined, although a signal may be raised.
C11 6.3.1.3
6.3.1.3 Signed and unsigned integers
When a value with integer
type is converted to another integer type other than _Bool, if the
value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
Otherwise, the new type is signed and the value
cannot be represented in it; either the result is
implementation-defined or an implementation-defined signal is raised.
In addition to the other issues mentioned, having unsigned math wrap makes the unsigned integer types behave as abstract algebraic groups (meaning that, among other things, for any pair of values X and Y, there will exist some other value Z such that X+Z will, if properly cast, equal Y and Y-Z will, if properly cast, equal X). If unsigned values were merely storage-location types and not intermediate-expression types (e.g. if there were no unsigned equivalent of the largest integer type, and arithmetic operations on unsigned types behaved as though they were first converted to larger signed types), then there wouldn't be as much need for defined wrapping behavior; but it's difficult to do calculations in a type which doesn't have e.g. an additive inverse.
This helps in situations where wrap-around behavior is actually useful - for example with TCP sequence numbers or certain algorithms, such as hash calculation. It may also help in situations where it's necessary to detect overflow, since performing calculations and checking whether they overflowed is often easier than checking in advance whether they would overflow, especially if the calculations involve the largest available integer type.
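For instance, defined wraparound permits this post-hoc overflow check (add_would_overflow is a made-up helper name):

#include <stdbool.h>

// The unsigned sum wrapped if and only if it is smaller than an operand.
bool add_would_overflow(unsigned int a, unsigned int b, unsigned int *sum) {
    *sum = a + b;    // well defined: reduced modulo UINT_MAX + 1
    return *sum < a; // true exactly when a + b exceeded UINT_MAX
}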
Perhaps another reason for why unsigned arithmetic is defined is because unsigned numbers form integers modulo 2^n, where n is the width of the unsigned number. Unsigned numbers are simply integers represented using binary digits instead of decimal digits. Performing the standard operations in a modulus system is well understood.
The OP's quote refers to this fact, but also highlights that there is only one unambiguous, logical way to represent unsigned integers in binary. By contrast, signed numbers are most often represented using two's complement, but other choices are possible, as described in the standard (section 6.2.6.2).
Two's complement representation allows certain operations to make more sense in binary format. E.g., incrementing negative numbers works the same way as for positive numbers (except under overflow conditions). Some operations at the machine level can be the same for signed and unsigned numbers. However, when interpreting the results of those operations, some cases don't make sense: positive and negative overflow. Furthermore, the overflow results differ depending on the underlying signed representation.
The most technical reason of all is simply that trying to capture overflow in an unsigned integer requires more moving parts from you (exception handling) and the processor (exception throwing).
C and C++ won't make you pay for that unless you ask for it by using a signed integer. This isn't a hard-and-fast rule, as you'll see near the end, but it is just how they proceed for unsigned integers. In my opinion, this makes signed integers the odd one out, not unsigned, but it's fine that they offer this fundamental difference, as the programmer can still perform well-defined signed operations with overflow. But to do so, you must cast for it.
Because:
unsigned integers have well defined overflow and underflow
casts from signed -> unsigned int are well defined: [uint's name]_MAX + 1 is conceptually added to negative values, to map them into the upper end of the unsigned range
casts from unsigned -> signed int work the other way: [uint's name]_MAX + 1 is conceptually subtracted from positive values beyond the signed type's max, to map them to negative numbers (strictly, the standard leaves this direction implementation-defined, but two's complement platforms behave this way)
You can always perform arithmetic operations with well-defined overflow and underflow behavior, where signed integers are your starting point, albeit in a round-about way, by casting to unsigned integer first then back once finished.
#include <stdint.h>

int32_t x = 10;
int32_t y = -50;
// writes -60 into z: the unsigned subtraction wraps, and converting
// back to int32_t recovers the signed result
int32_t z = (int32_t)((uint32_t)y - (uint32_t)x);
Casts between signed and unsigned integer types of the same width are free if the CPU uses 2's complement (nearly all do). If for some reason the platform you're targeting doesn't use 2's complement for signed integers, you will pay a small conversion price when casting between uint32 and int32.
But be wary when using bit widths smaller than int
Usually if you are relying on unsigned overflow, you are using a smaller word width, 8-bit or 16-bit. These will promote to signed int at the drop of a hat (C has absolutely insane implicit integer conversion rules; this is one of C's biggest hidden gotchas). Consider:
#include <stdio.h>
int main(void) {
    unsigned char a = 0;
    unsigned char b = 1;
    printf("%i\n", a - b); // outputs -1, not 255 as you'd expect
}
To avoid this, you should always cast to the type you want when you are relying on that type's width, even in the middle of an operation where you think it's unnecessary. This will cast the temporary and get you the signedness AND truncate the value so you get what you expected. It's almost always free to cast, and in fact, your compiler might thank you for doing so as it can then optimize on your intentions more aggressively.
#include <stdio.h>
int main(void) {
    unsigned char a = 0;
    unsigned char b = 1;
    printf("%i\n", (unsigned char)(a - b)); // cast turns -1 to 255, outputs 255
}

Does signed to unsigned casting in C change the bit values

I've done some quick tests that a signed int to unsigned int cast in C does not change the bit values (on an online debugger).
What I want to know is whether this is guaranteed by the C standard, or is just common (but not guaranteed) behaviour?
Conversion from signed int to unsigned int does not change the bit representation on two's-complement C implementations, which are the most common, but it will change the bit representation of negative numbers (including possible negative zeroes) on one's complement or sign-and-magnitude systems.
This is because the cast (unsigned int)a is not defined to retain the bits; rather, the result is the nonnegative remainder of dividing a by UINT_MAX + 1. Or, as the C standard (C11 6.3.1.3p2) says,
the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
The two’s complement representation for negative numbers is the most commonly used representation for signed numbers exactly because it has this property of negative value n mapping to the same bit pattern as the mathematical value n + UINT_MAX + 1 – it makes it possible to use the same machine instruction for signed and unsigned addition, and the negative numbers will work because of wraparound.
Casting from a signed to an unsigned integer is required to generate the correct arithmetic result (the same number), modulo the size of the unsigned integer, so to speak. That is, after
int i = anything;
unsigned int u = (unsigned int)i;
and on a machine with 32-bit ints, the requirement is that u is equal to i, modulo 2^32.
(We could also try to say that u receives the value i % 0x100000000, except it turns out that's not quite right, because the C rules say that when you divide a negative integer by a positive integer, you get a quotient rounded towards 0 and a negative remainder, which isn't the kind of modulus we want here.)
If i is 0 or positive, it's not hard to see that u will have the same bit pattern.
If i is negative, and if you're on a 2's complement machine, it turns out the result is also guaranteed to have the same bit pattern. (I'd love to present a nice proof of that result here, but I don't have time just now to try to construct it.)
The vast majority of today's machines use 2's complement. But if you were on a 1's complement or sign/magnitude machine, I'm pretty sure the bit patterns would not always be the same.
So, bottom line, the sameness of the bit patterns is not guaranteed by the C Standard, but arises due to a combination of the C Standard's requirements, and the particulars of 2's complement arithmetic.
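As a quick illustration of the value-based rule (assuming a 32-bit unsigned int, so UINT_MAX is 4294967295):

#include <stdio.h>

int main(void) {
    int i = -1;
    unsigned int u = (unsigned int)i;
    // -1 + (UINT_MAX + 1) == UINT_MAX on every conforming implementation,
    // whatever bit pattern the int used.
    printf("%u\n", u); // 4294967295
    return 0;
}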

Is it always possible to convert an `int` to a `float`

Is a conversion from an int to a float always possible in C without the float becoming one of the special values like +Inf or -Inf?
AFAIK there is no upper limit on the range of int.
I think a 128-bit int would cause an issue for a platform with an IEEE 754 float, as that has an upper value just below 2^128 (the maximum binary exponent is 127).
Short answer to your question: no, it is not always possible.
But it is worthwhile to go into a little more detail. The following paragraph shows what the standard says about integer to floating-point conversions (online C11 standard draft):
6.3.1.4 Real floating and integer
2) When a value of integer type is converted to a real floating type,
if the value being converted can be represented exactly in the new
type, it is unchanged. If the value being converted is in the range of
values that can be represented but cannot be represented exactly, the
result is either the nearest higher or nearest lower representable
value, chosen in an implementation-defined manner. If the value being
converted is outside the range of values that can be represented, the
behavior is undefined. ...
So many integer values can be converted exactly. Some integer values may lose precision, yet a conversion is at least possible. For some values, however, the behaviour is undefined (if, for example, an integer value could not be represented even with the maximum exponent of the float type). Offhand, though, I cannot think of a case where this would actually happen.
Is it always possible to convert an int to a float?
Reasonably - yes. An int will always convert to a finite float. The conversion may lose some precision for large int values.
Yet for the pedantic, an odd compiler could have trouble.
C allows for excessively wide int types, not just 16-, 32- or 64-bit ones, and float is only required to have a range reaching about 1e37.
It is not the upper range of int or INT_MAX that should be of concern; it is the lower end: INT_MIN, which often has a magnitude one greater than INT_MAX.
A 124-bit int's minimum value could be about -1.06e37, so that does exceed the minimal float range.
With the common binary32 float, an int would need to be more than 128 bits to cause a float infinity.
So what test is needed to detect this rare situation?
Form an exact power-of-2 limit and perform careful math to avoid overflow or imprecision.
#if -INT_MAX == INT_MIN
// rare non 2's complement machine
#define INT_MAX_P1_HALF (INT_MAX/2 + 1)
_Static_assert(FLT_MAX/2 >= INT_MAX_P1_HALF, "non-2's comp.`int` range exceeds `float`");
#else
_Static_assert(-FLT_MAX <= INT_MIN, "2's complement `int` range exceeds `float`");
#endif
The standard only requires floating point representations to include a finite number as large as 10^37 (§5.2.4.2.2/12) and does not put any limit on the maximum size of an integer. So if your implementation has 128-bit integers (or even 124-bit integers), it is possible for an integer-to-float conversion to exceed the range of finite representable floating point numbers.
No, it is not always possible to convert an int to a float exactly, due to how floats work. 32-bit floats greater than 16777216 (or less than -16777216) need to be even; those greater than 33554432 (or less than -33554432) need to be evenly divisible by 4; those greater than 67108864 (or less than -67108864) need to be evenly divisible by 8; etc. The IEEE 754 float standard defines round-to-nearest-even as the default mode, but other modes exist depending upon implementation.
Also, the largest 128-bit int, 2^128 − 1, is greater than the largest 32-bit float, which is 2^127 × 1.11111111111111111111111 (binary) = 2^127 × (2 − 2^−23) = 2^128 − 2^104.
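A small demonstration of the precision loss (assuming a 32-bit int and IEEE 754 binary32 float):

#include <stdio.h>

int main(void) {
    int i = 16777217;   // 2^24 + 1: the first int binary32 cannot hold exactly
    float f = (float)i; // rounds to nearest even: 16777216.0f
    printf("%d -> %.1f\n", i, f);
    return 0;
}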

adding and subtracting float from unsigned short in C

I ran into a problem and it is driving me nuts.
I have code like this:
float a;
unsigned short b;
b += a;
When a is negative, b is going bananas.
I even did a cast
b += (unsigned short) a;
but it doesn't work.
What did I do wrong? How can I add float to a unsigned short?
FYI:
When a is -1 and b is 0, I see that b += a gives b = 65535.
The way to add a float to an unsigned short is simply to add it, exactly as you've done. The operands of the addition will undergo conversions, as I'll describe below.
A simple example, based on your code, is:
#include <stdio.h>

int main(void) {
    float a = 7.5;
    unsigned short b = 42;
    b += a;
    printf("b = %hu\n", b);
    return 0;
}
The output, unsurprisingly, is:
b = 49
The statement
b += a;
is equivalent to:
b = b + a;
(except that b is only evaluated once). When operands of different types are added (or subtracted, or ...), they're converted to a common type based on a set of rules you can find in the C standard section 6.3.1.8. In this case, b is converted from unsigned short to float. The addition is equivalent to 42.0f + 7.5f, which yields 49.5f. The assignment then converts this result from float to unsigned short, and the result, 49, is stored in b.
If the mathematical result of the addition is outside the range of float (which is unlikely), or if it's outside the range of unsigned short (which is much more likely), then the program will have undefined behavior. You might see some garbage value stored in b, your program might crash, or in principle quite literally anything else could happen. When you convert a signed or unsigned integer to an unsigned integer type, the result is wrapped around; this does not happen when converting a floating-point value to an unsigned type.
Without more information, it's impossible to tell what problem you're actually having or how to fix it.
But it does seem that adding an unsigned short and a float and storing the result in an unsigned short is an unusual thing to do. There could be situations where it's exactly what you need (if so you need to avoid overflow), but it's possible that you'd be better off storing the result in something other than an unsigned short, perhaps in a float or double. (Incidentally, double is used more often than float for floating-point data; float is useful mostly for saving space when you have a lot of data.)
If you're doing numeric conversions, even implicit ones, it's often (but by no means always) an indication that you should have used a variable of a different type in the first place.
Your question would be improved by showing actual values you have trouble with, and explaining what value you expected to get.
But in the meantime, the definition of floating to integer conversion in C11 6.3.1.4/1 is:
When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.
This comes into play at the point where the result of b + a, which is a float, is assigned back to b. Recall that b += a is equivalent to b = b + a.
If b + a is a negative number of magnitude 1 or greater, then its integral part is out of range for unsigned short, so the code causes undefined behaviour, which means anything can happen, including but not limited to going bananas.
A footnote repeats the point that the float is not first converted to a signed integer and then to unsigned short:
The remaindering operation performed when a value of integer type is converted to unsigned type need not be performed when a value of real floating type is converted to unsigned type. Thus, the range of portable real floating values is (−1, Utype_MAX+1)
As an improvement you could write:
b += (long long)a;
which will at least not cause UB so long as a > LLONG_MIN.
You want b to be positive (it is unsigned), but a can be negative. It is OK as long as a is not larger in magnitude than b. That is the first point.
Second: when you cast a negative value to unsigned, what is the result actually supposed to be? The sign is stored in the most significant bit, and for negative values it is 1. When a value with its most significant bit set is read as unsigned, it is simply a very large number that has nothing in common with the negative one.
Maybe try b -= fabs(a) for negative a. Isn't that what you are looking for?
You are observing the combination of the float being converted to an integer, and unsigned integer wrap-around ( https://stackoverflow.com/a/9052112/1149664 ).
Consider
b += a
for example with a = -100.67 you add a negative value to an unsigned data type, and depending on the initial value of b, the mathematical result ought to be negative. How did you come to use an unsigned short rather than just a float or double for this task?
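If you do have to store the sum back into the unsigned short, one defensive sketch is to clamp in floating point before converting, so the float-to-unsigned conversion stays in range (add_clamped is a made-up helper; a 16-bit unsigned short is assumed):

#include <limits.h>

unsigned short add_clamped(unsigned short b, float a) {
    float r = (float)b + a;
    if (r < 0.0f) return 0;                      // converting would be UB
    if (r >= (float)USHRT_MAX) return USHRT_MAX; // likewise past the top
    return (unsigned short)r;                    // in range: truncates toward zero
}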

is it safe to subtract between unsigned integers?

Following C code displays the result correctly, -1.
#include <stdio.h>

int main(void)
{
    unsigned x = 1;
    unsigned y = x - 2;
    printf("%d", y);
}
But in general, is it always safe to do subtraction involving unsigned integers?
The reason I ask the question is that I want to do some conditioning as follows:
unsigned x = 1; // x was defined by someone else as unsigned,
                // which I had better not change.
for (int i = -5; i < 5; i++) {
    if (x + i < 0) continue;
    f(x + i); // f is a function
}
Is it safe to do so?
How are unsigned integers and signed integers different in representing integers? Thanks!
1: Yes, it is safe to subtract unsigned integers. The definition of arithmetic on unsigned integers includes that if an out-of-range value would be generated, then that value should be adjusted modulo the maximum value for the type, plus one. (This definition is equivalent to truncating high bits).
Your posted code has a bug though: printf("%d", y); causes undefined behaviour because %d expects an int, but you supplied unsigned int. Use %u to correct this.
2: When you write x+i, the i is converted to unsigned. The result of the whole expression is a well-defined unsigned value. Since an unsigned can never be negative, your test will always fail.
You also need to be careful using relational operators because the same implicit conversion will occur. Before I give you a fix for the code in section 2, what do you want to pass to f when x is UINT_MAX or close to it? What is the prototype of f ?
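To see why the test always fails, a short sketch (assuming a 32-bit unsigned int):

unsigned x = 1;
int i = -5;
// i converts to UINT_MAX - 4 = 4294967291, so x + i evaluates to
// 4294967292u; an unsigned value is never < 0, so the branch never runs.
// A well-defined guard for small negative i: if (i < 0 && (unsigned)-i > x) continue;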
3: Unsigned integers use a "pure binary" representation.
Signed integers have three options. Two can be considered obsolete; the most common one is two's complement. All options require that a positive signed integer value has the same representation as the equivalent unsigned integer value. In two's complement, a negative signed integer is represented the same as the unsigned integer generated by adding UINT_MAX+1, etc.
If you want to inspect the representation, then do unsigned char *p = (unsigned char *)&x; printf("%02X%02X%02X%02X", p[0], p[1], p[2], p[3]);, depending on how many bytes are needed on your system.
It's always safe to subtract unsigned values, as in
unsigned x = 1;
unsigned y=x-2;
y will take on the value of -1 mod (UINT_MAX + 1), i.e. UINT_MAX.
Is it always safe to do subtraction, addition, or multiplication involving unsigned integers? Yes: there is no UB. The answer will always be the expected mathematical result reduced modulo UINT_MAX + 1.
But do not do printf("%d", y ); - that is UB. Instead use printf("%u", y);
C11 §6.2.5 9 "A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type."
When unsigned and int are used with +, the int is converted to unsigned. So x+i has an unsigned result, and that sum is never < 0. Safe, but if (x+i<0) continue is now pointless. f(x+i); is safe, but we would need to see the f() prototype to best explain what may happen.
Unsigned integers always range from 0 to 2^N - 1 and have well-defined "overflow" results. Signed integers are 2's complement, 1's complement, or sign-magnitude and have UB on overflow. Some compilers take advantage of that and assume it never occurs when generating optimized code.
Rather than really answering your questions directly, which has already been done, I'll make some broader observations that really go to the heart of your questions.
The first is that using unsigned in loop bounds where there's any chance that a signed value might crop up will eventually bite you. I've done it a bunch of times over 20 years and it has ultimately bit me every time. I'm now generally opposed to using unsigned for values that will be used for arithmetic (as opposed to being used as bitmasks and such) without an excellent justification. I have seen it cause too many problems when used, usually with the simple and appealing rationale that “in theory, this value is non-negative and I should use the most restrictive type possible”.
I understand that x, in your example, was decided to be unsigned by someone else, and you can't change it, but you want to do something involving x over an interval potentially involving negative numbers.
The “right” way to do this, in my opinion, is first to assess the range of values that x may take. Suppose that the length of an int is 32 bits. Then the length of an unsigned int is the same. If it is guaranteed to be the case that x can never be larger than 2^31-1 (as it often is), then it is safe in principle to cast x to a signed equivalent and use that, i.e. do this:
int y = (int)x;
// Do your stuff with *y*
x = (unsigned)y;
If you have a long that is longer than unsigned, then even if x uses the full unsigned range, you can do this:
long y = (long)x;
// Do your stuff with *y*
x = (unsigned)y;
Now, the problem with either of these approaches is that before assigning back to x (e.g. x=(unsigned)y; in the immediately preceding example), you really must check that y is non-negative. However, these are exactly the cases where working with the unsigned x would have bitten you anyway, so there's no harm at all in something like:
long y = (long)x;
// Do your stuff with *y*
assert( y >= 0L );
x = (unsigned)y;
At least this way, you'll catch the problems and find a solution, rather than having a strange bug that takes hours to find because a loop bound unexpectedly turned out to be four billion.
No, it's not safe.
Integers are usually 4 bytes long, which equals 32 bits. The difference in representation is:
As far as signed integers are concerned, the most significant bit is used for the sign, so they can represent values between -2^31 and 2^31 - 1.
Unsigned integers don't use any bit for the sign, so they represent values from 0 to 2^32 - 1.
Part 2 isn't safe either, for the same reason as part 1. As int and unsigned types represent integers in different ways, in this case, where negative values enter the calculations, you can't be sure what the result of x + i will be.
No, it's not safe. Trying to represent negative numbers with unsigned ints smells like a bug. Also, you should use %u to print unsigned ints.
If we slightly modify your code to put %u in printf:
#include <stdio.h>

int main(void)
{
    unsigned x = 1;
    unsigned y = x - 2;
    printf("%u", y);
}
The number printed is 4294967295
The reason the earlier result looked correct is that C doesn't do any overflow checks and you printed the value as a signed int (%d). This, however, does not make it safe practice. If you print it as what it really is (%u) you won't get the 'correct' answer.
An Unsigned integer type should be thought of not as representing a number, but as a member of something called an "abstract algebraic ring", specifically the equivalence class of integers congruent modulo (MAX_VALUE+1). For purposes of examples, I'll assume "unsigned int" is 16 bits for numerical brevity; the principles would be the same with 32 bits, but all the numbers would be bigger.
Without getting too deep into the abstract-algebraic nitty-gritty: when assigning a number to an unsigned type [abstract algebraic ring], zero maps to the ring's additive identity (so adding zero to a value yields that value), and one maps to the ring's multiplicative identity (so multiplying a value by one yields that value). Adding a positive integer N to a value is equivalent to adding the multiplicative identity N times; adding a negative integer -N, or subtracting a positive integer N, will yield the value which, when added to +N, would yield the original value.
Thus, assigning -1 to a 16-bit unsigned integer yields 65535, precisely because adding 1 to 65535 will yield 0. Likewise -2 yields 65534, etc.
Note that in an abstract algebraic sense, every integer maps uniquely into an algebraic ring of the indicated form, and a ring member maps uniquely into a smaller ring whose modulus is a factor of its own [e.g. a 16-bit unsigned integer maps uniquely to an 8-bit unsigned integer], but ring members are not uniquely convertible to larger rings or to integers. Unfortunately, C sometimes pretends that ring members are integers and implicitly converts them; that can lead to some surprising behavior.
Subtracting a value, signed or unsigned, from an unsigned value which is no smaller than int, and no smaller than the value being subtracted, will yield a result according to the rules of algebraic rings, rather than the rules of integer arithmetic. Testing whether the result of such computation is less than zero will be meaningless, because ring values are never less than zero. If you want to operate on unsigned values as though they are numbers, you must first convert them to a type which can represent numbers (i.e. a signed integer type). If the unsigned type can be outside the range that is representable with the same-sized signed type, it will need to be upcast to a larger type.
