Given this simple random generator:
int i, r = 0;
for (i = 0; i < 50; i++) {
r = (1234 * r + 101) % (11000000);
printf("%d\n", r);
}
Surprisingly, I get negative values!
101
124735
10923091
192507
6553739
-7620565
-10842517
-10763989
-1860437
8188139
Aren't the values supposed to be positive? Can someone explain this?
You get negative values because your program has integer arithmetic overflows. The behavior is actually undefined for the signed type int. You should use a larger type to avoid this. Type unsigned long long is guaranteed to have at least 64 value bits, which is enough for the maximum intermediary result 1234 * 10999999 + 101 = 13573998867.
int i;
unsigned long long r = 0;
for (i = 0; i < 50; i++) {
r = (1234 * r + 101) % 11000000;
printf("%llu\n", r);
}
rici commented that r does not need to be a larger type since its value is in the range 0..10999999. This is not completely true, as type int may be too small to hold such values: the range of int can be as small as -32767..32767.
Nevertheless, the intermediary computation must be performed with a larger type to avoid arithmetic overflow. Here is the corresponding code:
int i, r = 0; // assuming 32-bit ints
for (i = 0; i < 50; i++) {
r = (1234ULL * r + 101) % 11000000;
printf("%d\n", r);
}
As you've seen in other answers, this behavior is due to overflow.
If you want to be able to detect stuff like this earlier, use gcc or clang's Undefined Behavior Sanitizer (UBSan).
$ /opt/clang+llvm-4.0.0-armv7a-linux-gnueabihf/bin/clang -fsanitize=undefined don.c
$ ./a.out
don.c:8:18: runtime error: signed integer overflow: 1234 * 10923091 cannot be represented in type 'int'
don.c, line 8, column 18 is the multiplication in this line: r = (1234*r +101) % (11000000);.
You have to be careful, as your code produces overflows even if you do unsigned arithmetic with a 32-bit type.
Your int variable is probably a 32-bit integer, which overflows past 2147483647. In the worst case of your computation you get 1234 * 10999999 + 101 = 13573998867 before the modulus operation is applied, and that leads to the error.
The best thing you can do is use a 64-bit type for this kind of calculation so that it cannot overflow, as in this sample code (you'll see different results from the fourth value onward, even where your original output was positive):
$ cat pru.c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
int main(void)
{
    uint64_t r = 0;
    for (int i = 0; i < 50; i++) {
        r = (1234 * r + 101) % 11000000;
        printf("%" PRIu64 "\n", r); /* PRIu64 is the matching format for uint64_t */
    }
    return 0;
}
which results in:
$ pru
101
124735
10923091
4094395
3483531
8677355
4856171
8515115
2652011
5581675
1787051
5221035
7757291
2497195
1538731
6794155
1987371
10415915
5239211
8186475
4110251
1049835
8496491
1669995
3773931
4030955
2198571
7036715
4306411
1111275
7313451
4798635
3515691
4362795
4689131
387755
5489771
9377515
10853611
6356075
396651
5467435
3814891
10575595
4284331
6864555
860971
6438315
2880811
1920875
This is correct, as the maximum intermediate result, 1234 * 10999999 + 101 = 13573998867, never overflows a uint64_t, so the program produces correct results.
So I have this factorial function written in C:
unsigned int factorial(unsigned int n){
int fn = 1;
if(n == 0 || n == 1){
return 1;
} else{
for(int i = 1; i <= n; i++){
fn *= i;
}
}
return fn;
}
I tested it out with smaller numbers like 5 and it worked. Then I put it into this loop:
for(int i = 0; i < 100; i++){
printf("\n%d! = %d", i, factorial(i));
}
When i reaches 17, the factorial is apparently -288522240 which is obviously wrong. These kinds of answers continue until i reaches 34 and it says that the factorial is 0. It then does this for the rest of the numbers.
I don't understand what's wrong with my code. I see no reason for the number to become negative or 0. What's happened here?
100! or 9.3326...E+157 or
9332621544394415268169923885626670049071596826438162146859296389521759999322991560894146397615651828625369792082722375825118521091686400000000000000000000000, a 525 bit number, is outside the range of int - likely 32-bit [-2147483648 ... 2147483647]
Signed integer math that overflows is undefined behavior (UB). In OP's case, it appears that the lower 32-bits of the product fn * i, as 2's complement, was the result. Eventually enough multiplication of even numbers kept shifting the non-zero portion of the product "to the left" and resulted in that lower 32 bits becoming 0.
To calculate large factorials one needs another approach.
I'm implementing my own decrease-and-conquer method for computing a^n.
Here's the program:
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <time.h>
double dncpow(int a, int n)
{
double p = 1.0;
if(n != 0)
{
p = dncpow(a, n / 2);
p = p * p;
if(n % 2)
{
p = p * (double)a;
}
}
return p;
}
int main()
{
int a;
int n;
int a_upper = 10;
int n_upper = 50;
int times = 5;
time_t t;
srand(time(&t));
for(int i = 0; i < times; ++i)
{
a = rand() % a_upper;
n = rand() % n_upper;
printf("a = %d, n = %d\n", a, n);
printf("pow = %.0f\ndnc = %.0f\n\n", pow(a, n), dncpow(a, n));
}
return 0;
}
My code works for small values of a and n, but a mismatch in the output of pow() and dncpow() is observed for inputs such as:
a = 7, n = 39
pow = 909543680129861204865300750663680
dnc = 909543680129861348980488826519552
I'm pretty sure that the algorithm is correct, but dncpow() is giving me wrong answers.
Can someone please help me rectify this? Thanks in advance!
Simple as that, these numbers are too large for what your computer can represent exactly in a single variable. With a floating point type, there's an exponent stored separately and therefore it's still possible to represent a number near the real number, dropping the lowest bits of the mantissa.
Regarding this comment:
I'm getting similar outputs upon replacing 'double' with 'long long'. The latter is supposed to be stored exactly, isn't it?
If you call a function taking double, it won't magically operate on long long instead. Your value is simply converted to double and you'll just get the same result.
Even with a function handling long long (which has 64 bits on today's typical platforms), you can't deal with such large numbers. 64 bits aren't enough to store them. With an unsigned integer type, the result just "wraps around" on overflow (it is reduced modulo 2^N for an N-bit type). With a signed integer type, the behavior on overflow is undefined (though a wrap-around is still somewhat likely in practice). So you'll get some number that has absolutely nothing to do with your expected result. That's arguably worse than the result with a floating point type, which is merely imprecise.
For exact calculations on large numbers, the only way is to store them in an array (typically of unsigned integers like uintmax_t) and implement all the arithmetic yourself. That's a nice exercise, and a lot of work, especially when performance is of interest (the "naive" arithmetic algorithms are typically very inefficient).
For some real-life program, you won't reinvent the wheel here, as there are libraries for handling large numbers. The arguably best known is libgmp. Read the manuals there and use it.
I was working on Exercise 2-1 of K&R. The goal is to calculate the range of different variable types; below is my function to calculate the maximum value a short int can contain:
short int max_short(void) {
short int i = 1, j = 0, k = 0;
while (i > k) {
k = i;
if (((short int)2 * i) > (short int)0)
i *= 2;
else {
j = i;
while (i + j <= (short int)0)
j /= 2;
i += j;
}
}
return i;
}
My problem is that the value returned by this function is -32768, which is obviously wrong since I'm expecting a positive value. I can't figure out where the problem is. I used the same function (with changes in the variable types) to calculate the maximum value an int can contain, and it worked...
I thought the problem could be caused by the comparisons inside the if and while statements, hence the typecasting, but that didn't help...
Any ideas what is causing this? Thanks in advance!
EDIT: Thanks to Antti Haapala for his explanations, the overflow to the sign bit results in undefined behavior NOT in negative values.
You can't use calculations like this to deduce the range of signed integers, because signed integer overflow has undefined behaviour, and narrowing conversion at best results in an implementation-defined value, or a signal being raised. The proper solution is to just use SHRT_MAX, INT_MAX ... of <limits.h>. Deducing the maximum value of signed integers via arithmetic is a trick question in standardized C language, and has been so ever since the first standard was published in 1989.
Note that the original edition of K&R predates the standardization of C by 11 years, and even the 2nd one - the "ANSI-C" version predates the finalized standard and differs from it somewhat - they were written for a language that wasn't almost, but not quite, entirely unlike the C language of this day.
You can do it easily for unsigned integers though:
unsigned int i = -1;
// i now holds the maximum value of `unsigned int`.
By definition, you cannot calculate the maximum value of a type in C by using variables of that very same type. It simply doesn't make any sense. The type will overflow when it goes "over the top". In case of signed integer overflow, the behavior is undefined, meaning you will get a major bug if you attempt it.
The correct way to do this is to simply check SHRT_MAX from limits.h.
An alternative, somewhat more questionable way would be to create the maximum of an unsigned short and then divide that by 2. We can create the maximum by taking the bitwise inversion of the value 0.
#include <stdio.h>
#include <limits.h>
int main()
{
printf("%hd\n", SHRT_MAX); // best way
unsigned short ushort_max = ~0u;
short short_max = ushort_max / 2;
printf("%hd\n", short_max);
return 0;
}
One note about your code:
Casts such as ((short int)2*i)>(short int)0 are completely superfluous. Most binary operators in C such as * and > implement something called "the usual arithmetic conversions", which is a way to implicitly convert and balance types of an expression. These implicit conversion rules will silently make both of the operands type int despite your casts.
You forgot to cast to short int during comparison
OK, here I assume that the computer would handle integer overflow behavior by changing into negative integers, as I believe that you have assumed in writing this program.
code that outputs 32767:
#include <stdlib.h>
#include <stdio.h>
short int max_short(void)
{
short int i = 1, j = 0, k = 0;
while (i>k)
{
k = i;
if (((short int)(2 * i))>(short int)0)
i *= 2;
else
{
j = i;
while ((short int)(i + j) <= (short int)0)
j /= 2;
i += j;
}
}
return i;
}
int main() {
printf("%d", max_short());
while (1);
}
added 2 casts
I am new to C (and programming in general, minus a few weeks with Python). I am interested in learning how information is handled on a machine level, therefore I moved to C. Currently, I am working through some simple coding challenges and am having trouble finding information to resolve my current issue.
The challenge is to take N large integers into an array from input and print the sum of the numbers. The transition from Python to C has actually been more difficult than I expected due to the simplified nature of Python code.
Example input for the code below:
5
1000000001 1000000002 1000000003 1000000004 1000000005
Expected output:
5000000015
Code:
int main() {
long long unsigned int sum = 0;
int nums[200], n, i;
scanf("%i", &n);
for (i = 0; i =! n; i++) {
scanf("%i", &nums[i]);
sum = sum + nums[i];
}
printf("%llu", sum);
return 0;
}
The program seems to accept input for N, but it stops there.
One last question, in simple terms, what is the difference between a signed and unsigned variable?
Change your for loop like this
for (i = 0; i != n; i++) {
scanf("%i", &nums[i]);
sum = sum + nums[i];
}
If you write i =! n, that is the same as i = !n: it assigns the logical negation of n to i. Since you gave n a non-zero value, the result is zero, so the loop condition is false and the loop terminates immediately.
Welcome to C!
Regarding the signed vs unsigned question: signed types can hold negative values and unsigned types can't, but both take up the same space (number of bits) in memory. For instance, assuming two's complement representation and a 32-bit integer, the ranges of values are
signed : -2^31 to 2^31 - 1 or -2147483648 to 2147483647
unsigned : 0 to 2^32 - 1 or 0 to 4294967295
My C program performs a "Turmrechnung" (a "tower calculation"): a predefined number (init_num) is multiplied by a predefined range of numbers (init_num*2*3*4*5*6*7*8*9 in my case, with the upper bound defined by the variable h), and after that it is divided by those same numbers, so the result should be the initial value of init_num. My task is to integrate a way to stop the calculation if the value of init_num becomes larger than INT_MAX (from limits.h).
But the if statement is always true, even when it should not be: with a larger initial value of init_num, intermediate values exceed INT_MAX along the way and the check still passes.
It only works if I replace INT_MAX in my if statement with a smaller number, like 200000000. Why?
#include <limits.h>
#include <stdio.h>
int main() {
int init_num = 1000000;
int h = 9;
for (int i = 2; i < h+1; ++i)
{
if (init_num * i < INT_MAX)
{
printf("%10i %s %i\n", init_num, "*", i);
init_num *= i;
}
else
{
printf("%s\n","An overflow has occurred!");
break;
}
}
for (int i = 2; i < h+1; ++i)
{
printf("%10i %s %i\n", init_num, ":", i);
init_num /= i;
}
printf("%10i\n", init_num);
}
if (init_num * i < INT_MAX)
INT_MAX is the maximum value of int; therefore, this condition can never be false (that is, evaluate to 0) unless the product is exactly INT_MAX.
Instead, you can write your condition like this:
if (init_num < INT_MAX/i)
init_num * i < INT_MAX will only be 0 if init_num * i is INT_MAX. This is not particularly likely. Note that signed integer overflow is undefined behaviour in C, so do be particularly careful here.
You can rewrite your statement to init_num < INT_MAX / i in your particular case. Do note that integer division truncates though.
The problem is that signed integer overflow is undefined behaviour. Concentrate on the "undefined" part and think about it. Briefly: avoid it under all circumstances.
To avoid this, you can either use a definitively wider type which is guaranteed to hold the result of the multiplication and then test:
// ensure the type we use for cast is large enough
_Static_assert(LLONG_MAX > INT_MAX, "LLONG too small.");
if ( (long long)init_num * i < (long long)INT_MAX )
This obviously does not work if you are already at the limit (i.e. using the largest data type). In that case you have to check in advance:
if ( init_num < (INT_MAX / i) ) {
init_num *= i;
Although more time-consuming due to the extra division, this is in general the better approach, as it does not require a larger data type (where multiplication might be also more expensive).
init_num * i
results in an int and thus cannot grow beyond the maximum possible int (INT_MAX) by definition.
So provide "room" for calculations "larger" than an int by replacing
if (init_num * i < INT_MAX)
with
if ((long) init_num * i < (long) INT_MAX)
The casting to long leads to a long result.
(The above approach assumes long is wider than int.)