How to find the largest prime factor of 600851475143? - c

#include <stdio.h>
main()
{
long n=600851475143;
int i,j,flag;
for(i=2;i<=n/2;i++)
{
flag=1;
if(n%i==0)//finds factors backwards
{
for(j=2;j<=(n/i)/2;j++)//checks if factor is prime
{
if((n/i)%j==0)
flag=0;
}
if(flag==1)
{
printf("%d\n",n/i);//displays largest prime factor and exits
exit(0);
}
}
}
}
The code above works for n = 6008514751. However, it doesn't work for n = 600851475143, even though that number still is within the range of a long.
What can I do to make it work?

One potential problem is that i and j are int, and could overflow for large n (assuming int is narrower than long, which it often is).
Another issue is that for n=600,851,475,143 your program does quite a lot of work (the largest factor is 6857). It is not unreasonable to expect it to take a long time to complete.

Use longs in place of ints. Better still, use uint64_t which has been defined since C99 (acknowledge Zaibis). It is a 64 bit unsigned integral type on all platforms. (The code as you have it will overflow on some platforms).
And now we need to get your algorithm working more quickly:
Your test for prime is inefficient; you don't need to iterate over all the even numbers. Just iterate over primes; up to and equal to the square root of the number you're testing (not half way which you currently do).
Where do you get the primes from? Well, call your function recursively. Although in reality I'd be tempted to cache the primes up to, say, 65536.

From ISO/IEC 9899:TC3
5.2.4.2.1 Sizes of integer types
[...]
Their implementation-defined values shall be equal or greater in magnitude(absolute value) to those shown, with the same sign.
[...]
— minimum value for an object of type long int
LONG_MIN -2147483647 // -(2^31 - 1)
— maximum value for an object of type long int
LONG_MAX +2147483647 // 2^31 - 1
EDIT:
Sorry I forgot to add what this should tell you.
The point is long doesn't even need to be able to hold the value you mentioned, as the standard says it has to be able to hold at least 4 Bytes with sign so it could be possible that your machine is just able to hold values up to 2147483647 in a variable of type long.

On 32-bit machine long range from -2,147,483,648 to 2,147,483,647 and On 64-bit machine its range is from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (NOTE: This is not mandated by C standard and may vary from one compiler to another).
As OP said in comment he is on 32-bit, 600851475143 goes out of range as it is not fit in the range of long.

Try changing n to long long int .. and change i,j to long
EDIT: define n like this :
long long int n = 600851475143LL;
LL - is a suffix to enforce long long type ...

Related

What's the role of L in modulo 1000000007L?

When you're a total noob programming, a lot of things looks like magic,
so solving some classic problems at SPOJ using C language, I found one called DIAGONAL.
After some attempts I gave up and went searching for solutions and I found this one:
#include <stdio.h>
int main() {
int num_cases, i;
long long mod_by = 24*1000000007L;
scanf("%d", &num_cases);
long long n;
long long answer;
for(i = 0; i < num_cases; i++) {
scanf("%lld", &n);
long long x = (n*(n-1)) % (mod_by);
long long y = (x*(n-2)) % (mod_by);
long long z = (y*(n-3)) % (mod_by);
answer = z / 24;
printf("%lld\n", answer);
}
return 0;
}
At first glance I thought that L with the modulo was some kind of mistake made by the user who posted the code (haha who would mix numbers with letters this way?! nonsense! -thought the noob-), but when I fixed the (many) wrongs in my code and used this modulo, it didn't work without the magic L (I got Wrong Answer).
Then I substituted the L with the ASCII code number (because well, maybe it was that!) and it also didn't work.
Since then I'm trying to understand what is the logic behind this. How can I get the same results removing this L?
It's not like one just woke up in the morning and "uhm, maybe if I add this L it will work", but I just couldn't find other examples of this (random letter added to a large number for calculations) googling.
long long mod_by = 24*1000000007L; is a problem.
The L suffix insures the constant 1000000007 is at least of type long.1 Without the L, the type may have been int or long.
Yet since long may only be 32-bit, the ~36-bit product can readily overflow long math leading to undefined behavior (UB).
The assignment to a long long happens after and does not affect the type nor range of the multiplication.
Code should use LL to form the product and protect against overflow.
long long mod_by = 24LL*1000000007;
// or
long long mod_by = 24*1000000007LL;
In general, make certain the calculation occurs at least with the width of the destination.
See also Why write 1,000,000,000 as 1000*1000*1000 in C?.
1 In OP's case, apparently int is 32-bit and long is 64-bit so code "worked" with a L and failed without it. Yet porting this code to a 32-bit long implementation, code fails with or without an L. Insure long long math with an LL constant.
The L suffix when applied to an integer constant means the constant has type long.
This is done to prevent overflow when the value is multiplied by 24. Without the suffix, two constants of type int are multiplied giving an int result. Assuming 32 is 32 bits, the result will overflow the range of an int (231-1) causing undefined behavior.
By making the constant have type long, assuming a long is 64 bits, it allows the multiplication to be done using and type and therefore not cause overflow and give you the correct value.
L is a suffix that means that the number is of type long, otherwise it will default to type int that means the result of the operation 24 * 1000000007 will also be of type int.
As an int is usually 4 bytes in size the result will overflow, this happens before it being assigned to mod_by, and for that reason it invokes undefined behavior.
As the result of the arithmetic operation is converted to the larger type, e.g:
int * int = int
int * long = long
For the result of the operation to be of type long one of the operands must also be of type long.
Note that long is not guaranteed to have 8 bytes in size, the minimum allowed size for a long is 4 bytes, so you can invoke the same undefined behavior deppending on the platform where you compile your program.
Using LL for long long will be more portable.

Integer overflow for looping over large vector in C?

We can use a for loop to iterate over a vector in C, for example:
int len = length(x);
for (int i=0; i<len; i++) {
double val = REAL(x)[i];
}
This seems to work fine, but I don' understand why. According to wikipedia the default int type ranges from −32767 to +32767. So why does this still work for vectors longer than that?
Does R somehow override int to always be long int? Is there a maximum length of the vector that this code will support?
You are misreading Wikipedia. Here is a more complete quote (emphasis mine):
At least in the [−32767,+32767] range
On most modern platforms (at least those powerful enough to run R), int is at least 32 bits wide, which gives a range of [−2147483647,+2147483647] or more.
Additionally, R's ?integer has the following to say:
Note that current implementations of R use 32-bit integers for integer vectors, so the range of representable integers is restricted to about +/-2*10^9: doubles can hold much larger integers exactly.
Finally, ?.Machine say:
integer.max - the largest integer which can be represented. Always 2147483647.

What's wrong with my C code? (Prime factors of a big number)

Could you please help me?
I'm a beginner at C and my code doesn't work.
I'm trying to determine largest prime factor of 600851475143 and when I run the code, it just does nothing. Trying with smaller number however works.
long i;
for (i = 600851475143; i > 1; i--)
{
if (600851475143 % i == 0)
{
printf("%d\n", i);
}
};
It's probably a 32-bit system. The number 600851475143 is bigger than 32 bits.
Instead of long i try:
long long i;
And instead of printf("%d\n", i); try:
printf("%lld\n", i);
And use 600851475143LL in place of 600851475143.
First of all, the correct way to print a long is not %d but %ld (d = decimal, ld = long decimal). If long and int have different sizes on your system (which is not unusual), the results would not print correctly to begin with.
Next possible problem is that 600851475143 is more than fits into a 32 bit variable, yet long is only guaranteed to be at least 32 bit. It may be bigger than that, but only 32 bits are guaranteed. So are you sure that long is big enough on your system?
Try
printf("%zu\n", sizeof(long));
if it says 8, everything is fine, if it says only 4, long is not sufficient and you need to use long long instead (and %lld for printf, lld = long long decimal).
Last but not least, you are aware that your loop needs to do 600 billion iterations, aren't you? Even if you have a very fast system with a very fast CPU this will take quite some time to complete. So you will see 600851475143 printed to the screen immediately, but it will take quite some time before your code terminates (or finds another divisor, in case this is not a prime number).
Optionally:
Instead of writing 600851475143, you may write 600851475143LL to let the compiler know you want this number to be of type long long. It seems like the C 2011 standard doesn't require this any longer (numbers are automatically treated as long or long long if required), yet I know that pior to C 2011 some compilers least issued a warning for numbers being bigger than int (or bigger than long).
You can start your loop with the greatest integer less than or equal to the square root of your large number. Then you can find factor pairs working down through the loop. Write a separate function to check whether a given number is prime. If the larger factor of the pair is prime, return it. If the larger is not prime, check if the smaller is prime and if so return it.

Why am I getting a negative value?

There is some mistake in power function of my code it returns correct answer for small values but gives wrong answer for large values.
#include<stdio.h>
long long MOD= 1000000007;
long long power(long long i,long long j)
{
if(j==0)
return 1;
long long d;
d=power(i,j/(long long)2);
if(j%2==0)
return (d*d)%MOD;
else
return (d*d*i)%MOD;
}
int main()
{
long long inv=1;
inv=power(25,MOD-2)%MOD;
printf("%lld\n",inv);
}
Your arithmetic overflowed the values representable in a long long in your C implementation.
Also, you are working on a challenge problem or homework which has been discussed numerous times on Stack Overflow, as you can see by searching for “1000000007”.
You need to be careful with (d*d*i)%MOD;, because more than one operation without modulo is incorrect, taking precision and risk of overflowing in account. To be strictly correct you need to write
return ((d*d)%MOD * i)%MOD;
Even in this case I assumed i == i%MOD
EDIT: I take an example :
log2((MOD-1)x(MOD-1)x25) = log2(1000000006x1000000006x25) = 64.438
Which means you will be overflowing during your computation with this input.
range of signed long long is -2^63 to (2^63 - 1) and the last bit is used to store sign . In your case when multiplication of large numbers is evaluated than your result will overflow and 64th bit of number(which stores sign) will get value 1 (which means -ve value).
That's the reason you are getting negative value as output.
Read this

What is the difference between #define and declaration

#include <stdio.h>
#include <math.h>
// #define LIMIT 600851475143
int isP(long i);
void run();
// 6857
int main()
{
//int i = 6857;
//printf("%d\n", isP(i));
run();
}
void run()
{
long LIMIT = 600851475143;
// 3, 5
// under 1000
long i, largest =1, temp=0;
for(i=3; i<=775147; i+=2)
{
temp = ((LIMIT/i)*i);
if(LIMIT == temp)
if(isP(i)==1)
largest = i;
}
printf("%d\n",largest);
}
int isP(long i)
{
long j;
for(j=3; j<= i/2; j+=2)
if(i == (i/j)*j)
return 0;
return 1;
}
I just met an interesting issue. As above shows, this piece of code is designed to calculate the largest prime number of LIMIT. The program as shows above gave me an answer of 29, which is incorrect.
While, miraculously, when I defined the LIMIT value (instead of declaring it as long), it could give me the correct value: 6857.
Could someone help me to figure out the reason? Thanks a lot!
A long on many platforms is a 4 byte integer, and will overflow at 2,147,483,647. For example, see Visual C++'s Data Type Ranges page.
When you use a #define, the compiler is free to choose a more appropriate type which can hold your very large number. This can cause it to behave correctly, and give you the answer you expect.
In general, however, I would recommend being explicit about the data type, and choosing a data type that will represent the number correctly without requiring compiler and platform specific behavior, if possible.
I would suspect a numeric type issue here.
#define is a preprocessor directive, so it would replace LIMIT with that number in the code before running the compiler. This leaves the door open for the compiler to interpret that number how it wants, which may not be as a long.
In your case, long probably isn't big enough, so the compiler chooses something else when you use #define. For consistent behavior, you should specify a type that you know has an appropriate range and not rely on the compiler to guess correctly.
You should also turn on full warnings on your compiler, it might be able to detect this sort of problem for you.
When you enter an expression like:
(600851475143 + 1)
everything is fine, as the compiler automatically promotes both of those constants to an appropriate type (like long long in your case) large enough to perform the calculation. You can do as many expressions as you want in this way. But when you write:
long n = 600851475143;
the compiler tries to assign a long long (or whatever the constant is implicitly converted to) to a long, which results in a problem in your case. Your compiler should warn you about this, gcc for example says:
warning: overflow in implicit constant conversion [-Woverflow]
Of course, if a long is big enough to hold that value, there's no problem, since the constant will be a type either the same size as long or smaller.
It is probably because 600851475143 is larger than LONG_MAX (2147483647 according to this)
Try replacing the type of limit with 'long long'. The way it stands, it wraps around (try printing limit as a long). The preprocessor knows that and uses the right type.
Your code basically reduce to these two possibilities:
long LIMIT = 600851475143;
x = LIMIT / i;
vs.
#define LIMIT 600851475143
x = LIMIT / i;
The first one is equivalent to casting the constant into long:
x = (long)600851475143 / i;
while the second one will be precompiled into:
x = 600851475143 / i;
And here lies the difference: 600851475143 is too big for your compiler's long type so if it is cast into long it overflows and goes crazy. But if it is used directly in the division, the compiler knows that it doesn't fit into a long, automatically interprets it as a long long literal, the i is promoted and the division is done as a long long.
Note, however, that even it the algorithm seems to work most of the time, you still have overflows elsewhere and so the code is incorrect. You should declare any variable that may hold these big values as long long.

Resources