Calculating max long value using exponent overflows - c

I'm trying to figure out the maximum value of type long by raising a base of 2 to the power of the bit count.
Unfortunately the calculation overflows at step 61 and I don't understand why.
long exponential(int base, int exponent)
{
    long result = (long)base;
    for (int i = 0; i < exponent; i++) {
        result *= base;
    }
    return result;
}
unsigned int sLong = sizeof(long);
long lResult = exponential(2, (sLong * 8) - 1);
lResult is 0 after running the function.
What's odd is that when I do this for char, short and int it works fine.

The code here has an off-by-one error.
Consider the following: what is the result of exponential(10, 2)? Empirical debugging (use a printf statement) shows that it's 1000. So exponential calculates the mathematical expression base^(exponent+1).
The long type usually has 64 bits, which seems to be your case (the overflow happens close to step 64). Since it is a signed type, its range is typically -2^63 to 2^63 - 1. That is, the largest power of 2 the type can represent is 2^62; if the code tries to compute 2^63, it overflows (and signed overflow is undefined behaviour in C, so you want to avoid it).
So, because of the off-by-one error, the code overflows for any exponent greater than or equal to 62.
To fix the off-by-one error, start multiplying from 1:
long power_of(int base, int exponent)
{
    long result = 1L; // the 0th power of base is 1
    for (int i = 0; i < exponent; i++) {
        result *= base;
    }
    return result;
}
However, this will not get rid of the overflow, because the long data type cannot represent the number 2^63. Fortunately, you can use unsigned long long, which is guaranteed to be at least 64 bits wide and can therefore represent 2^63:
unsigned long long power_of(int base, int exponent)
{
    unsigned long long result = 1ULL; // the 0th power of base is 1
    for (int i = 0; i < exponent; i++) {
        result *= base;
    }
    return result;
}
Print it with the %llu format specifier:
printf("%llu\n", power_of(2, 63));

Related

How would you interpret the behaviour of my C hash function (a type of Fowler–Noll–Vo hash function)?

I don't understand why the integer value "hash" gets lower in/after the third loop iteration.
I would guess this happens because the uint limit is 2,147,483,647.
BUT... when I step through it, the value is equal to 2,146,134,658.
I'm not that good at math, but that should be lower than the limit.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define FNV_PRIME_32 16777619
#define FNV_OFFSET_32 2166136261U

unsigned int hash_function(const char *string, unsigned int size)
{
    unsigned int str_len = strlen(string);
    if (str_len == 0) exit(0);
    unsigned int hash = FNV_OFFSET_32;
    for (unsigned int i = 0; i < str_len; i++)
    {
        hash = hash ^ string[i];
        // Multiply by a prime number found to work well
        hash = hash * FNV_PRIME_32;
        if (hash > 765010506)
            printf("YO!\n");
        else
            printf("NOO!\n");
    }
    return hash % size;
}
In case you are wondering, this if statement is only there for my own debugging:
if (hash > 765010506)
    printf("YO!\n");
else
    printf("NOO!\n");
765010506 is the value of hash after the next pass through the loop.

I don't understand why the integer value "hash" gets lower in/after the third loop iteration.
All unsigned integer arithmetic in C is modular arithmetic. For unsigned int, it is modulo UINT_MAX + 1; for unsigned long, modulo ULONG_MAX + 1, and so on.
(a modulo m means the remainder of a divided by m; in C, a % m if both a and m are unsigned integer types.)
On many current architectures, unsigned int is a 32-bit unsigned integer type, with UINT_MAX == 4294967295.
Let's look at what this means in practice, for multiplication (by 65520, which happens to be an interesting value: 2^16 - 16):
unsigned int x = 1;
int i;
for (i = 0; i < 10; i++) {
    printf("%u\n", x);
    x = x * 65520;
}
The output is
1
65520
4292870400
50327552
3221291008
4293918720
16777216
4026531840
0
0
What? How? How come the result ends up zero? That cannot happen!
Sure it can. In fact, you can show mathematically that it happens eventually whenever the multiplier is even and the modulo is with respect to a power of two (2^32, here).
Your particular multiplier is odd, however; so, it does not suffer from the above. However, it still wraps around due to the modulo operation. If we retry the same with your multiplier, 16777619, and a bit longer sequence,
unsigned int x = 1;
int i;
for (i = 0; i < 20; i++) {
    printf("%u\n", x);
    x = x * 16777619;
}
we get
1
16777619
637696617
1055306571
1345077009
1185368003
4233492473
878009595
1566662433
558416115
1485291145
3870355883
3549196337
924097827
3631439385
3600621915
878412353
2903379027
3223152297
390634507
In fact, it turns out that this sequence is 1,073,741,824 iterations long (before it repeats itself), and will never yield 0, 2, 4, 5, 6, 7, 8, 10, 12, 13, 14, or 15, for example -- that is, if it starts from 1. It even takes 380 iterations to get a result smaller than 16,777,619 (16,689,137).
For a hash function, that is okay. Each new nonzero input changes the state, so the sequence is not "locked". But, there is no reason to expect the hash value increases monotonically as the length of the hashed data increases; it is much better to assume it is "roughly random" instead: not really random, as it depends on the input only, but also not obviously regular-looking.
I would guess this happens because the uint limit is 2,147,483,647.
The maximum value of a 32-bit unsigned integer is roughly 4 billion (2^32 - 1 = 4,294,967,295). The number you're thinking of is the maximum value of a signed 32-bit integer (2^31 - 1).
2,146,134,658 is slightly less than 2^31 (so it fits even in a signed 32-bit integer), but it's still very close to that limit. Multiplying it by FNV_PRIME_32 -- which is roughly 2^24 -- gives a result of roughly 2^55, which wraps around modulo 2^32.
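To make that wrap-around concrete, here is a small sketch (mine, using the value observed in the question) that computes the full product in 64 bits and then keeps only the low 32 bits, which is exactly what the 32-bit multiplication does on a typical platform where unsigned int is 32 bits:
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t hash  = 2146134658u; // value observed in the question
    uint32_t prime = 16777619u;   // FNV_PRIME_32

    uint64_t full    = (uint64_t)hash * prime; // exact 64-bit product
    uint32_t wrapped = (uint32_t)full;         // keep only the low 32 bits (mod 2^32)

    printf("full product    : %llu\n", (unsigned long long)full);
    printf("reduced mod 2^32: %u\n", wrapped);
    printf("32-bit multiply : %u\n", hash * prime); // same as the reduced value
    return 0;
}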

pow() function weird behavior when used in brackets and with long long integer

I found something weird.
This function puts a digit in a number at the given spot and returns the modified number.
Now we want to do put_digit(123456789123456784, 0, 9);
That will put 9 at the end of the number, replacing the last digit (4).
This is the code that WORKS:
long long int put_digit(long long int number, char place, char digit)
{
    long long int result = number;
    long long int power = number;
    power = pow(10, (place));
    result -= get_digit(number, place) * power;
    result += digit * pow(10, (place));
    return result;
}
The code returns 123456789123456789
This is the code that DOES NOT WORK:
long long int put_digit(long long int number, char place, char digit)
{
    long long int result = number;
    result -= get_digit(number, place) * pow(10, (place));
    result += digit * pow(10, (place));
    return result;
}
This code returns 123456789123456800 as the result.
The function get_digit() returns the digit of the number at the given place.
This is its code:
char get_digit(long long int number, char place)
{
    long long int target = number;
    char digit = 0;
    target /= pow(10, place);
    digit = target % 10;
    return digit;
}
• This does not happen with lower numbers.
• get_digit() always returns the correct value (4 in this case).
• get_digit() returns a char because this is not a counter function, so I preferred saving memory over using a faster type like int.
• I've tried using brackets to avoid troublesome operator precedence, but to no avail.
• A weird behavior is also observed when doing put_digit(123456789123456000, 2, 7), which for some reason returns 123456789123456704. This is solved by replacing the pow function in the second result calculation with the variable "power".
I just don't understand why this is happening.
Am I getting some kind of an overflow? Is it my system's fault or my own? Am I using pow() in a bad way?
The declaration of pow() is: double pow( double base, double exponent );
In the first case:
long long int power = number;
power = pow(10, (place));
the value returned by pow() is converted to long long int when it is assigned power. The rest of the computation is processed using integer numbers and the result is the one you expect.
In the second case:
result -= get_digit(number, place)*pow(10, (place));
the value returned by get_digit(number, place) is converted to double because it needs to be multiplied with a floating point number (returned by pow()). Also, the value of result is converted to double before subtracting the result of the multiplication. In the end, the computed value is converted from double to long long int to be stored in result.
But starting on some magnitude, the floating point numbers lose the precision of their least significant digit(s).
Try this simple piece of code to see for yourself:
long long int i = 123456789123456785;
for (; i <= 123456789123456795; i++) {
    printf("long long int: %lld; double: %f\n", i, (double)i);
}
It outputs:
long long int: 123456789123456785; double: 123456789123456784.000000
long long int: 123456789123456786; double: 123456789123456784.000000
long long int: 123456789123456787; double: 123456789123456784.000000
long long int: 123456789123456788; double: 123456789123456784.000000
long long int: 123456789123456789; double: 123456789123456784.000000
long long int: 123456789123456790; double: 123456789123456784.000000
long long int: 123456789123456791; double: 123456789123456784.000000
long long int: 123456789123456792; double: 123456789123456800.000000
long long int: 123456789123456793; double: 123456789123456800.000000
long long int: 123456789123456794; double: 123456789123456800.000000
long long int: 123456789123456795; double: 123456789123456800.000000
This behaviour is not a bug but a limitation of the floating point numbers.
The solution for your code is to convert the value returned by pow(10, place) to long long int as soon as it returns:
result -= get_digit(number, place)*(long long int)pow(10, place);
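Another option (not from the original answer, just a common way to sidestep the issue entirely) is to compute the power of ten in integer arithmetic so that no double ever enters the calculation. A minimal sketch; ipow10 is a hypothetical helper name, and the char parameters are widened to int here for simplicity:
#include <stdio.h>

// Hypothetical helper: 10^place computed purely in integer arithmetic.
// Valid for place in the range 0..18, which is all that fits in a long long anyway.
static long long int ipow10(int place)
{
    long long int p = 1;
    while (place-- > 0)
        p *= 10;
    return p;
}

long long int put_digit(long long int number, int place, int digit)
{
    long long int power = ipow10(place);
    long long int old = (number / power) % 10; // digit currently at 'place'
    return number + (digit - old) * power;     // replace it, all in integers
}

int main(void)
{
    printf("%lld\n", put_digit(123456789123456784LL, 0, 9)); // prints 123456789123456789
    return 0;
}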

ProjectEuler#3: Why is it telling me that this division breaks even when it doesn't?

For smaller numbers, my code seems to find the highest prime number OK. However, for this particular larger number it thinks 600851475143 is divisible by 7 (it is not a factor, which I verified using Wolfram).
My algorithm to solve the problem is as follows:
1) Let p=value for which we want to find the immediately smaller prime number.
2) Check if p is prime using isComposite(p)
3) If p is composite, decrement p and try again
4) Stop when found a prime number
isComposite(p) works as follows:
1) Let i=2
2) Given some number 'limit', check to see if i divides 'limit'
3) If it does, then return i (originally it returned 1, but I wanted to see what it thinks divides 'limit')
4) If i does not divide limit, increment i and repeat from step 2, stopping when i >= sqrt(limit). This is because factors of a number, if they exist, occur in pairs in which one value is at most sqrt(number).
Here is the call tree:
printf("P3:%lu\n",p3v2(600851475143)); //Print the highest prime
Here is the p3v2() function:
unsigned long p3v2(unsigned long limit)
{
    unsigned long i = limit;
    while (i > 1)
    {
        printf("Checking %lu\n", i);
        if (!(isComposite(i) != 0))
        {
            printf("%lu is prime!\n", isComposite(i));
            return i;
        }
        else
        {
            printf("%lu is composite!\n", isComposite(i));
            i--;
        }
    }
    return -1;
}
And isComposite()
unsigned long isComposite(unsigned int limit)
{
    unsigned long i = 2;
    unsigned long searchUpperBound = (unsigned long)sqrt(limit); // Only need to search up to sqrt(limit) to see if there is a factor
    while (i <= searchUpperBound) // See if any number up through searchUpperBound divides limit
    {
        if (limit % i == 0) // If factor found
            return i;
        else
            i++;
    }
    return 0;
}
unsigned long is probably 32-bit on your machine. 600851475143 (hex 0x8BE589EAC7) doesn't fit in an unsigned long; it's actually 0xE589EAC7 (decimal 3851020999, which is divisible by 7) that is used in the calculation.
The solution is to use unsigned long long instead.
I'm going to assume that unsigned longs are 64-bit on your machine and unsigned ints are 32-bit. Otherwise your printfs would probably be printing weird numbers already. If this is all true, the culprit is this line:
unsigned long isComposite(unsigned int limit)
Here, you should make limit an unsigned long:
unsigned long isComposite(unsigned long limit)
However, I would suggest you remove all your unsigned longs and replace them with uint64_t:
#include <stdint.h>
uint64_t isComposite(uint64_t limit)
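Here is a minimal self-contained sketch of that suggestion (my own assembly of the pieces, with fixed-width types throughout and the matching PRIu64 format macro from <inttypes.h>); the algorithm itself is unchanged from the question. Link with -lm for sqrt:
#include <inttypes.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>

static uint64_t isComposite(uint64_t limit)
{
    uint64_t searchUpperBound = (uint64_t)sqrt((double)limit);
    for (uint64_t i = 2; i <= searchUpperBound; i++)
        if (limit % i == 0)
            return i;   // smallest factor found
    return 0;           // no factor up to sqrt(limit): prime
}

static uint64_t p3v2(uint64_t limit)
{
    for (uint64_t i = limit; i > 1; i--)
        if (isComposite(i) == 0)
            return i;   // first prime at or below limit
    return 0;
}

int main(void)
{
    printf("P3: %" PRIu64 "\n", p3v2(UINT64_C(600851475143)));
    return 0;
}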

Precision of double numbers with 17 or more digits

I am getting precision loss when converting a big double (17+ digits) number to integer.
#include <stdio.h>

int main() {
    int n = 20;
    double acum = 1;
    while (n--) acum *= 9;
    printf("%.0f\n", acum);
    printf("%llu\n", (unsigned long long)acum);
    return 0;
}
The output of this code is:
12157665459056929000
12157665459056928768
I can't use unsigned long long for the calculations because this is just a pseudo code and I need the precision on the real code, where divisions are included.
If I increase the number of decimals, the first output becomes, e.g., 12157665459056929000.0000000000.
I've tried round(acum) and trunc(acum), and in both cases the result was the same as the second output. Shouldn't they be equal to the first?
I know float has only about 6 decimal digits of precision and double has about 17. But what's wrong with the digits?!?
Actually, when I change acum's type to unsigned long long, like this:
unsigned long long acum = 1;
the result is:
12157665459056928801
When I use Python to calculate the accurate answer:
>>> 9**20
12157665459056928801L
You see?
12157665459056929000 is not an accurate answer at all; it is actually an approximation of the exact value.
Then I change the code like this:
printf("%llu\n", (unsigned long long)1.2157665459056929e+019);
printf("%llu\n", (unsigned long long)1.2157665459056928e+019);
printf("%llu\n", (unsigned long long)1.2157665459056927e+019);
printf("%llu\n", (unsigned long long)1.2157665459056926e+019);
And result is:
12157665459056928768
12157665459056928768
12157665459056926720
12157665459056926720
In fact, 19 significant digits exceed what a double can hold: a double has a 53-bit significand, so it represents integers exactly only up to 2^53 (about 9 * 10^15, i.e. roughly 15-17 decimal digits). Converting a number as large as 9^20 (about 1.2 * 10^19) through a double is therefore inherently inexact.
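A quick way to see where exactness ends (a sketch of mine, not from the original answers): 2^53 is the last point at which every integer still round-trips through a double; 2^53 + 1 already does not.
#include <stdio.h>

int main(void)
{
    unsigned long long exact = 1ULL << 53; // 9007199254740992

    // 2^53 converts to double and back exactly; 2^53 + 1 rounds back down to 2^53.
    printf("2^53     -> %llu\n", (unsigned long long)(double)exact);
    printf("2^53 + 1 -> %llu\n", (unsigned long long)(double)(exact + 1));
    return 0;
}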

miscalculation with long long integers

In the last iteration of the loop the result is wrong. I know that before the subtraction the numbers can be bigger than a long; that is why I made power a long long. The result in the last iteration should be 17888888888888888889. Why isn't it?
const int NR_LEVELS = 18;
unsigned long levels[NR_LEVELS];
unsigned long long power = 10;
for (unsigned int i = 0; i < NR_LEVELS; i++) {
    levels[i] = ((i + 1) * 10 * power - (i + 2) * power + 1) / 9;
    cout << levels[i] << endl;
    power *= 10;
}
levels[17] = 17888888888888888889lu;
for (unsigned int i = 0; i < NR_LEVELS; i++) {
    cout << levels[i] << endl;
}
The intermediate value (before dividing by 9) overflows a 64-bit integer. That is the reason why you don't get the expected result.
To be more precise, the maximum value of 64-bit integer is:
18446744073709551615
Compared to the intermediate value of the last iteration, before the division:
161000000000000000001
This answer assumes that the long type in your code translates to a 64-bit integral type (the standard mandates that long is at least 32 bits wide, so you might also get a 32-bit integral type depending on the environment). Depending on the OS, computer architecture and the compiler, the upper limit of the long type may vary.
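One way to avoid the oversized intermediate (a sketch of my own, not from the original answer) is to rearrange the formula algebraically before coding it: ((i+1)*10*P - (i+2)*P + 1)/9 simplifies to (i+1)*P - (P-1)/9, where P = 10^(i+1). Both terms fit in 64 bits (the largest is 18 * 10^18), and (P-1)/9 is exact because 10^k - 1 is always divisible by 9:
#include <stdio.h>

#define NR_LEVELS 18

int main(void)
{
    unsigned long long levels[NR_LEVELS];
    unsigned long long power = 10; // 10^(i+1)

    for (int i = 0; i < NR_LEVELS; i++) {
        // Equivalent to ((i+1)*10*power - (i+2)*power + 1) / 9,
        // but the largest intermediate here is 18 * 10^18, which fits in 64 bits.
        levels[i] = (unsigned long long)(i + 1) * power - (power - 1) / 9;
        printf("%llu\n", levels[i]); // last line printed: 17888888888888888889
        power *= 10;
    }
    return 0;
}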
