Expression for detecting unsigned integer overflow - c

The program generates factorials until an overflow occurs. The expression r <= UINT_MAX / j detects the overflow. If I rewrite the expression as r * j <= UINT_MAX, the program enters an infinite loop. My question is NOT about how to detect integer overflow. It is about why the rewritten expression makes the program enter an infinite loop.
for (unsigned i = 0;;++i) {
unsigned r = 1;
for (unsigned j = 2; j <= i; ++j) {
if (r <= UINT_MAX / j)
r *= j;
else {
printf("Overflow!\n");
}
}
}

Unsigned integer types use arithmetic modulo their maximum value plus 1.
In your case, the arithmetic is modulo (UINT_MAX+1).
So, the operation r*j always gives a result <= UINT_MAX, and the comparison is always true.
However, when r is compared against UINT_MAX / j, the real (mathematical) arithmetic coincides with the unsigned type arithmetic, because UINT_MAX / j is mathematically in the range of unsigned.
This explains why, in the C code, that condition is true or false exactly when the mathematical comparison is.
Thus, the r <= UINT_MAX/j approach is the correct one.
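A minimal sketch of the difference between the two checks. The helper names naive_fits and product_fits are made up for illustration and do not appear in the question's code; j is assumed to be nonzero.

```c
#include <limits.h>
#include <stdbool.h>

/* Naive check: r * j has already been reduced modulo UINT_MAX + 1,
   so it can never exceed UINT_MAX and this always returns true. */
bool naive_fits(unsigned r, unsigned j) {
    return r * j <= UINT_MAX;
}

/* Division-based check: UINT_MAX / j never wraps (j must be nonzero),
   so the comparison matches the mathematical one. */
bool product_fits(unsigned r, unsigned j) {
    return r <= UINT_MAX / j;
}
```

With r = UINT_MAX and j = 2, naive_fits still reports true although the mathematical product does not fit, while product_fits reports false.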

Related

How are these two methods to find what power of 2 a number is, different?

So, let's say I have a number N that's guaranteed to be a power of 2 and is always greater than 0. Now, I wrote two C methods to find what power of 2 N is, based on bitwise operators -
Method A -
int whichPowerOf2(long long num) {
int ret = -1;
while (num) {
num >>= 1;
ret += 1;
}
return ret;
}
Method B -
int whichPowerOf2(long long num) {
int idx = 0;
while (!(num & (1<<idx))) idx += 1;
return idx;
}
Intuitively, the two methods seem one and the same and also return the same values for different (smaller) values of N. However, Method B doesn't work for me when I try to submit my solution to a coding problem.
Can anyone tell me what's going on here? Why is Method A right and Method B wrong?
The problem is with this subexpression:
1<<idx
The constant 1 has type int. If idx becomes greater than or equal to the bit width of an int, you invoke undefined behavior. This is specified in section 6.5.7p3 of the C standard regarding bitwise shift operators:
The integer promotions are performed on each of the operands. The type
of the result is that of the promoted left operand. If the value of
the right operand is negative or is greater than or equal to the width
of the promoted left operand, the behavior is undefined.
Change the constant to 1LL to give it type long long, matching the type of num.
while (!(num & (1LL<<idx))) idx += 1;
In your Method B, the following line can cause undefined behaviour:
while (!(num & (1<<idx))) idx += 1;
Why? Well, the expression 1<<idx is evaluated as an int because the constant 1 is an int. Further, since num is a long long (which we'll assume has more bits than an int), you could end up left-shifting by at least as many bits as an int has.
To fix the issue, use the LL suffix on the constant:
while (!(num & (1LL<<idx))) idx += 1;
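Putting the fix together, a small sketch of Method B with the 1LL constant. The name which_power_of_2 is chosen here for illustration, and the function assumes, as the question does, that num is a positive power of 2.

```c
/* Returns k such that num == 2^k.
   Assumes num > 0 and num is a power of 2; otherwise the loop could
   shift past the width of long long, which is undefined behavior. */
int which_power_of_2(long long num) {
    int idx = 0;
    /* 1LL makes the shift happen in long long, not int. */
    while (!(num & (1LL << idx))) {
        idx += 1;
    }
    return idx;
}
```

For inputs such as 1LL << 40 the shift count goes well beyond the width of a 32-bit int, which is the case the original 1<<idx version gets wrong.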

Proper way to count down with unsigned

I am reading Carnegie Mellon slides on computer systems for my quiz. In the slide page 49 :
Counting Down with Unsigned
Proper way to use unsigned as loop index
unsigned i;
for (i = cnt-2; i < cnt; i--)
a[i] += a[i+1];
Even better
size_t i;
for (i = cnt-2; i < cnt; i--)
a[i] += a[i+1];
I don't get why it's not going to be an infinite loop. I am decrementing i and it is unsigned, so it should always be less than cnt. Please explain.
The best option for down counting loops I have found so far is to use
for(unsigned i=N; i-->0; ) { }
This invokes the loop body with i=N-1 ... 0. This works the same way for both signed and unsigned data types and does not rely on any overflows.
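As a rough sketch of that idiom, the hypothetical helper below (visit_down is not from the answer) records the index values the loop body sees, so the N-1 ... 0 order can be checked.

```c
#include <stddef.h>

/* Stores the indices visited by the `i-- > 0` idiom into out, which must
   have room for n entries, and returns how many were visited. */
size_t visit_down(unsigned n, unsigned *out) {
    size_t visited = 0;
    for (unsigned i = n; i-- > 0; ) {
        out[visited++] = i;   /* i runs n-1, n-2, ..., 0 */
    }
    return visited;
}
```

The condition compares the value of i before the decrement, so the body never sees a wrapped value, and n == 0 simply yields zero iterations.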
This loop simply relies on the fact that i will be decremented past 0, which makes it the maximum unsigned value. That ends the loop, because i < cnt is now false.
Per Overflowing of Unsigned Int:
unsigned numbers can't overflow, but instead wrap around using the
properties of modulo.
Both the C and C++ standard guarantee this uint wrapping behavior, but it's undefined for signed integers.
The goal of the loops is to loop from cnt-2 down to 0. It achieves the effect of writing i >= 0.
The previous slide correctly talks about why a loop condition of i >= 0 doesn't work. Unsigned numbers are always greater than or equal to 0, so such a condition would be vacuously true. A loop condition of i < cnt ends up looping until i goes past 0 and wraps around. When you decrement an unsigned 0 it becomes UINT_MAX (2^32 - 1 for a 32-bit integer). When that happens, i < cnt is guaranteed to be false, and the loop terminates.
I would not write loops like this. It is technically correct but very poor style. Good code is not just correct, it is readable, so others can easily figure out what it's doing.
It's taking advantage of what happens when you decrement unsigned integer 0. Here's a simple example.
unsigned cnt = 2;
for (int i = 0; i < 5; i++) {
printf("%u\n", cnt);
cnt--;
}
That produces...
2
1
0
4294967295
4294967294
Unsigned integer 0 - 1 becomes UINT_MAX. So instead of looking for -1, you watch for when your counter becomes bigger than its initial state.
Simplifying the example a bit, here's how you can count down to 0 from 5 (exclusive).
unsigned i;
unsigned cnt = 5;
for (i = cnt-1; i < cnt; i--) {
printf("%u\n", i);
}
That prints:
4
3
2
1
0
On the final iteration i = UINT_MAX which is guaranteed to be larger than cnt so i < cnt is false.
size_t is "better" because it's unsigned and it's wide enough to hold the size of the largest possible object in C, so any valid array index or element count fits in it.
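To make the slide's loop concrete, here is a sketch that wraps it in a function. The name suffix_sums is an assumption made for this example; the loop itself is the size_t version from the slide.

```c
#include <stddef.h>

/* Turns a[] into suffix sums in place: afterwards a[i] holds the sum of
   the original a[i], a[i+1], ..., a[cnt-1]. */
void suffix_sums(int *a, size_t cnt) {
    /* When i is decremented past 0 it wraps to SIZE_MAX, so i < cnt is
       false and the loop stops. For cnt < 2 the start value cnt - 2 has
       already wrapped, so the body never runs. */
    for (size_t i = cnt - 2; i < cnt; i--) {
        a[i] += a[i + 1];
    }
}
```

Called on {1, 2, 3, 4} with cnt = 4, this leaves {10, 9, 7, 4} in the array.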
This appears to be an alternative expression of the established idiom for implementing the same thing
for (unsigned i = N; i != -1; --i)
...;
They simply replaced the more readable condition of i != -1 with a slightly more cryptic i < cnt. When 0 is decremented in the unsigned domain it actually wraps around to the UINT_MAX value, which compares equal to -1 (in the unsigned domain) and which is greater than or equal to cnt. So, either i != -1 or i < cnt works as a condition for continuing iterations.
Why would they do it that way specifically? Apparently because they start from cnt - 2 and the value of cnt can be smaller than 2, in which case their condition does indeed work properly (and i != -1 doesn't). Aside from such situations there's no reason to involve cnt into the termination condition. One might say that an even better idea would be to pre-check the value of cnt and then use the i != -1 idiom
if (cnt >= 2)
for (unsigned i = cnt - 2; i != -1; --i)
...;
Note, BTW, that as long as the starting value of i is known to be non-negative, the implementation based on the i != -1 condition works regardless of whether i is signed or unsigned.
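A small sketch comparing the two variants by counting iterations. Both function names are hypothetical, and the cast (unsigned)-1 is written out only to make the conversion explicit.

```c
/* Iterations of the slide's loop; the condition i < cnt also rejects
   cnt < 2, because cnt - 2 has then wrapped to a huge value. */
unsigned iterations_lt_cnt(unsigned cnt) {
    unsigned n = 0;
    for (unsigned i = cnt - 2; i < cnt; i--) {
        n++;
    }
    return n;
}

/* Iterations of the pre-checked i != -1 variant. */
unsigned iterations_guarded(unsigned cnt) {
    unsigned n = 0;
    if (cnt >= 2) {
        for (unsigned i = cnt - 2; i != (unsigned)-1; --i) {
            n++;
        }
    }
    return n;
}
```

Both run cnt - 1 times for cnt >= 2 and not at all for cnt of 0 or 1. Without the cnt >= 2 guard, the i != -1 loop would start from a wrapped value and run for a very long time.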
I think you are confusing the int and unsigned int data types. These two are different. With a 2-byte storage size, int ranges from -32,768 to 32,767, whereas unsigned int ranges from 0 to 65,535.
In the example above, the variable i is of type unsigned int. It decrements down to i=0; the next decrement wraps it around to the maximum unsigned value, the condition i < cnt becomes false, and the for loop ends.

INT_MAX does not behave right in an If-Statement

My C program performs a "Turmrechnung": a predefined number ("init_num") is multiplied by a predefined range of numbers (init_num*2*3*4*5*6*7*8*9 in my case, with the upper bound defined by the variable "h"), and is then divided by those same numbers, so the result should be the initial value of "init_num". My task is to integrate a way to stop the calculation if the value of init_num becomes larger than INT_MAX (from limits.h).
But the if-statement is always true, even when a larger initial value of "init_num" produces values bigger than INT_MAX along the way.
It only works if I replace INT_MAX with a smaller number, like 200000000, in my if-statement. Why?
#include <limits.h>
#include <stdio.h>
int main() {
int init_num = 1000000;
int h = 9;
for (int i = 2; i < h+1; ++i)
{
if (init_num * i < INT_MAX)
{
printf("%10i %s %i\n", init_num, "*", i);
init_num *= i;
}
else
{
printf("%s\n","An overflow has occurred!");
break;
}
}
for (int i = 2; i < h+1; ++i)
{
printf("%10i %s %i\n", init_num, ":", i);
init_num /= i;
}
printf("%10i\n", init_num);
}
if (init_num * i < INT_MAX)
INT_MAX is the maximum value of int, so this condition can never be false (in other words, 0), except when the product is exactly equal to INT_MAX.
If you want you can write your condition like this -
if (init_num < INT_MAX/i)
init_num * i < INT_MAX will only be 0 if init_num * i is INT_MAX. This is not particularly likely. Note that signed integer overflow is undefined behaviour in C, so do be particularly careful here.
You can rewrite your statement to init_num < INT_MAX / i in your particular case. Do note that integer division truncates though.
The problem is that signed integer overflow is undefined behaviour. Concentrate on the "undefined" part and think about it. Briefly: avoid it under all circumstances.
To avoid this, you can either use a definitely wider type which is guaranteed to hold the result of the multiplication and then test:
// ensure the type we use for cast is large enough
_Static_assert(LLONG_MAX > INT_MAX, "LLONG too small.");
if ( (long long)init_num * i < (long long)INT_MAX )
This obviously does not work if you are already at the limit (i.e. using the largest data type). So you have to check in advance:
if ( init_num < (INT_MAX / i) ) {
init_num *= i;
Although more time-consuming due to the extra division, this is in general the better approach, as it does not require a larger data type (where multiplication might be also more expensive).
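Both approaches can be sketched side by side. The function names are made up for this example, the operands are assumed to satisfy a >= 0 and b > 0 as in the question, and <= is used instead of < so that a product of exactly INT_MAX is still accepted.

```c
#include <limits.h>
#include <stdbool.h>

/* Check in advance: true if a * b fits in int.
   Assumes a >= 0 and b > 0; integer division truncates, which is
   exactly what makes this comparison equivalent to a * b <= INT_MAX. */
bool mul_fits_int(int a, int b) {
    return a <= INT_MAX / b;
}

/* Same test via a wider type; assumes long long is wider than int. */
bool mul_fits_int_wide(int a, int b) {
    return (long long)a * b <= (long long)INT_MAX;
}
```

In the loop from the question, the check would run before init_num *= i, so the multiplication is only performed when it is known not to overflow.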
init_num * int
results in an int and thus cannot grow beyond the maximum possible int (INT_MAX) by definition.
So provide "room" for calculations "larger" than an int by replacing
if (init_num * i < INT_MAX)
with
if ((long) init_num * i < (long) INT_MAX)
The casting to long leads to a long result.
(The above approach assumes long being wider than int.)

Sum of positive values in an array gives negative result in a c program

I have a problem that is, when I sum the values of an array (that are all positive, I verified by printing the values of the array), I end up with a negative value. My code for the sum is:
int summcp = 0;
for (k = 0; k < SIMUL; k++)
{
summcp += mcp[k];
}
printf("summcp: %d.\n", summcp);
Any hint about this problem would be appreciated.
You are invoking undefined behaviour.
As integer variables can only hold a limited range of values, going beyond this range is undefined by the standard. Basically anything can happen. In your (and the most common) case, it simply wraps around its binary representation. As this is used for negative values, you will read this as such.
To circumvent this, use a type for summcp which can hold all possible values. An alternative would be to check if the next addition will overflow by:
if ( summcp > INT_MAX - mcp[k] ) {
// handle positive overflow
}
Note that the above only works for mcp[k] >= 0 and positive overflow. The other 3 cases have to be handled differently. It is in general best, faster and much easier to use a large enough type for the sum and test later on for overflow, if required.
Do not feel tempted to add and test the result! As integer overflow is undefined behaviour, this will not work for all architectures.
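A sketch of a sum that tests before each addition instead of afterwards. The function checked_sum and its signature are assumptions made for this example; it also covers negative elements, which the snippet above leaves out.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Adds the n ints in v. Returns false, leaving *out untouched, if the
   running sum would leave the range of int at any step. */
bool checked_sum(const int *v, size_t n, int *out) {
    int sum = 0;
    for (size_t k = 0; k < n; k++) {
        /* Both tests are evaluated without overflowing themselves. */
        if (v[k] > 0 && sum > INT_MAX - v[k]) return false; /* too large */
        if (v[k] < 0 && sum < INT_MIN - v[k]) return false; /* too small */
        sum += v[k];
    }
    *out = sum;
    return true;
}
```

A caller would check the return value and only use the sum when it is true, for example to fall back to a wider accumulator type otherwise.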
This smells like an overflow issue; you might want to add a check against that like
for (k = 0; k < SIMUL && (INT_MAX - summcp > mcp[k]); k++ )
{
summcp += mcp[k];
}
if (k < SIMUL)
{
// sum of all mcp values is larger than what an int can represent
}
else
{
// use summcp
}
If you are running into overflow issues, you might want to use a wider type for summcp like long or long long.
EDIT
The problem is that the behavior of signed integer overflow is not well-defined; you'll get a different result for INT_MAX + 1 based on whether your platform uses one's complement, two's complement, sign magnitude, or some other representation for signed integers.
Like I said in my comment below, if mcp can contain negative values, you should add a check for underflow as well.
Regardless of the type you use (int, long, long long), you should keep the over- and underflow checks.
If mcp only ever contains non-negative values, then consider using an unsigned type for your sum. The advantage of this is that the behavior of unsigned integer overflow is well-defined, and the result will be the sum of all elements of mcp modulo UINT_MAX + 1 (or ULONG_MAX + 1, or ULLONG_MAX + 1, etc.).
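For illustration, a sketch of an unsigned accumulator. The name wrapping_sum is hypothetical, and the element type is assumed to be unsigned as well.

```c
#include <stddef.h>

/* Sums n unsigned values. If the true sum does not fit, the result is
   the true sum reduced modulo UINT_MAX + 1; this wrapping is
   well-defined, unlike signed overflow. */
unsigned wrapping_sum(const unsigned *v, size_t n) {
    unsigned sum = 0;
    for (size_t k = 0; k < n; k++) {
        sum += v[k];
    }
    return sum;
}
```

Summing the maximum unsigned value and 5, for instance, gives 4. The wrapped result is well-defined, but it is still not the mathematical sum, so a range check or a wider type remains necessary if the true total matters.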
Declare variable summcp like
long long int summcp = 0;
I think your final output is going out of range. For this you will have to declare summcp as long long int
long long int summcp = 0;
for (k = 0; k < SIMUL; k++)
{
summcp += mcp[k];
}
printf("summcp: %lld.\n", summcp);
Exceeding the range of an int variable is undefined behaviour; in practice you typically end up with a seemingly garbage value.

For loop condition to stop at 0 when using unsigned integers?

I have a loop that has to go from N to 0 (inclusively). My i variable is of type size_t which is usually unsigned. I am currently using the following code:
for (size_t i = N; i != (size_t) -1; --i) {
...
}
Is that correct? Is there a better way to handle the condition?
Thanks,
Vincent.
Yes, it's correct and it is a very common approach. I wouldn't consider changing it.
Arithmetic on unsigned integer types is guaranteed to use modulo 2^N arithmetic (where N is the number of value bits in the type) and behaviour on overflow is well defined. The result is converted into the range 0 to 2^N - 1 by adding or subtracting multiples of 2^N (i.e. modulo 2^N arithmetic).
-1 converted to an unsigned integer type (of which size_t is one) converts to 2^N - 1. -- also uses modulo 2^N arithmetic for unsigned types so an unsigned type with value 0 will be decremented to 2^N - 1. Your loop termination condition is correct.
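A brief sketch to check both claims. The helper names are invented for this example; SIZE_MAX comes from <stdint.h>.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* -1 converted to size_t is the largest size_t value. */
bool minus_one_is_size_max(void) {
    return (size_t)-1 == SIZE_MAX;
}

/* Counts the iterations of the loop from the question, which visits
   i = n, n-1, ..., 0 and stops once i has wrapped to (size_t)-1. */
size_t count_down_inclusive(size_t n) {
    size_t iterations = 0;
    for (size_t i = n; i != (size_t)-1; --i) {
        iterations++;
    }
    return iterations;
}
```

The count is n + 1 because both endpoints are visited. The one input this loop cannot handle is n == SIZE_MAX, where the start value already equals (size_t)-1.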
Just because for has a convenient place to put a test at the beginning of each iteration doesn't mean you have to use it. To handle N to 0 inclusive, the test should be at the end, at least if you care about handling the maximum value. Don't let the convenience suck you in to putting the test in the wrong place.
for (size_t i = N;; --i) {
...
if (i == 0) break;
}
A do-while loop would also work but then you'd additionally give up i being scoped to the loop.
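To show that the test-at-the-end form also copes with the maximum value, the sketch below uses uint8_t instead of size_t, purely so that the worst case stays cheap to run. The function name is made up for this example.

```c
#include <stdint.h>

/* Counts iterations from n down to 0 inclusive, testing at the end of
   the body so that even n == UINT8_MAX is handled. */
unsigned count_down_u8(uint8_t n) {
    unsigned iterations = 0;
    for (uint8_t i = n;; --i) {
        iterations++;
        if (i == 0) break;   /* leave before i would wrap around */
    }
    return iterations;
}
```

Starting from UINT8_MAX this runs 256 times, whereas a condition such as i != (uint8_t)-1 would end the loop before the first iteration for that input.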
You can use this:
for (size_t i = n + 1; i-- > 0;)
{
}
Hope that helps.
Personally, I would just use a different loop construct, but to each their own:
size_t i = N;
do {
...
} while (i --> 0);
(you could just use (i--) as the loop condition, but one should never pass up a chance to use the --> "operator").
for ( size_t i = N ; i <= N ; i-- ) { .... }
This would do it because size_t is an unsigned integer type. Its width is implementation-defined; 32 bits is assumed here for illustration. When the variable i has a value of 0, you still want your loop to execute the body. If you then perform i--, the computer does
00000000000000000000000000000000
-00000000000000000000000000000001
This wraps around, giving a value of 111111111...1. For a signed two's complement integer, this bit pattern would be negative. However, the type of i is unsigned, so the computer will interpret 111111...1 as a very large positive value.
So you have a few options:
1) Do as above and make the loop terminate when overflow occurs.
2) Make the loop run from i = 0 to i <= N but use (N-i) instead of i in everywhere in your loop. For example, myArray[i] would become myArray[N-i] (off by one depending on what the value of N actually represents).
3) Make the condition of your for loop exploit the semantics of the postfix -- operator. As another user posted,
for ( size_t i = N + 1 ; i-- > 0 ; ) { ... }
This will set i to N+1 and check whether the condition N+1 > 0 holds. It does, but i-- has a side effect, so the value of i is decremented to i = N. Keep going until you get to i = 1. The condition is tested, 1 > 0 is true, the side effect occurs, and the body then executes with i = 0.
You can use a second variable as the loop counter to make the range of iteration clear to a future reviewer.
for (size_t j=0, i=N; j<=N; ++j, --i) {
// code here ignores j and uses i which runs from N to 0
...
}
for (i=N; i+1; i--)
Since an unsigned integer will roll over to its max value when decremented from zero, you can try the following, provided N is less than that maximum value (someone please correct me if this is UB):
for ( size_t i = N; i <= N; i-- ) { /* ... */ }
