I am trying to find the largest prime factor of a huge number in C. For small numbers like 100 or even 10000 it works fine, but it fails (by fail I mean it keeps running for tens of minutes on my Core 2 Duo and i5) for very big target numbers (see the code for the target number).
Is my algorithm correct?
I am new to C and really struggling with big numbers. What I want is correction or guidance, not a solution. I could do this in Python with bignum bindings (I have not tried yet, but I am pretty sure), but not in C. Or I might have made some tiny mistake that I am too tired to notice. Anyway, here is the code I wrote:
#include <stdio.h>
// To find largest prime factor of target
int is_prime(unsigned long long int num);
long int main(void) {
unsigned long long int target = 600851475143;
unsigned long long int current_factor = 1;
register unsigned long long int i = 2;
while (i < target) {
if ( (target % i) == 0 && is_prime(i) && (i > current_factor) ) { //verify i as a prime factor and greater than last factor
current_factor = i;
}
i++;
}
printf("The greates is: %llu \n",current_factor);
return(0);
}
int is_prime (unsigned long long int num) { //if num is prime 1 else 0
unsigned long long int z = 2;
while (num > z && z !=num) {
if ((num % z) == 0) {return 0;}
z++;
}
return 1;
}
600 billion iterations of anything will take some non-trivial amount of time. You need to substantially reduce this.
Here's a hint: Given an arbitrary integer value x, if we discover that y is a factor, then we've implicitly discovered that x / y is also a factor. In other words, factors always come in pairs. So there's a limit to how far we need to iterate before we're doing redundant work.
What is that limit? Well, what's the crossover point where y will be greater than x / y?
Once you've applied this optimisation to the outer loop, you'll find that your code's runtime will be limited by the is_prime function. But of course, you may apply a similar technique to that too.
By iterating up to the square root of the number, we can get all of its factors (factor and N/factor, with factor <= sqrt(N)). The solution rests on this small idea: any factor less than sqrt(N) that we check has a corresponding factor larger than sqrt(N). So we only need to check up to sqrt(N), and then we can derive the remaining factors.
Here you don't need to use any explicit prime-finding algorithm. The factorization logic itself will deduce whether the target is prime or not. So all that is left is to check the pairwise factors.
unsigned long long ans ;
for(unsigned long long i = 2; i<=target/i; i++)
while(target % i == 0){
ans = i;
target/=i;
}
if( target > 1 ) ans = target; // that means target is a prime.
//print ans
Edit: A point to be added (chux): i*i in the earlier code may overflow, which can be avoided by using i<=target/i.
Another choice would be to have
unsigned long long square_root = isqrt(target); // isqrt is not standard C; supply your own integer square root
for(unsigned long long i = 2; i<=square_root; i++){
...
}
Note that using sqrt here is not a wise choice, since
mixing double math with an integer operation is prone to round-off errors.
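Since isqrt is not part of standard C, a helper along the following lines would be needed. This is only a sketch: the name isqrt_ull is made up for illustration, and it uses a binary search with the same quotient comparison as above so that no intermediate product can overflow.

```c
/* Hypothetical helper: floor of the square root of n, integer math only. */
unsigned long long isqrt_ull(unsigned long long n) {
    unsigned long long lo = 0;
    unsigned long long hi = 4294967295ULL; /* floor(sqrt(2^64 - 1)) */
    while (lo < hi) {
        /* Round the midpoint up so that mid >= 1 and the loop terminates. */
        unsigned long long mid = lo + (hi - lo + 1) / 2;
        if (mid <= n / mid) {
            lo = mid;       /* mid * mid <= n, tested without overflow */
        } else {
            hi = mid - 1;
        }
    }
    return lo;
}
```

For the target in the question, isqrt_ull(600851475143ULL) should give 775146, so the outer loop would never need to go beyond that value.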
For target given the answer will be 6857.
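For reference, the loop above can be wrapped into a function. This is a minimal sketch under the assumption that the input is at least 2; the function name largest_prime_factor is not from the original post.

```c
/* Returns the largest prime factor of n. Assumes n >= 2. */
unsigned long long largest_prime_factor(unsigned long long n) {
    unsigned long long largest = 1;
    /* i <= n / i is the overflow-safe form of i * i <= n. */
    for (unsigned long long i = 2; i <= n / i; i++) {
        /* Divide out every copy of i, so only primes can divide n here. */
        while (n % i == 0) {
            largest = i;
            n /= i;
        }
    }
    /* Whatever remains above 1 is a prime larger than every factor found. */
    if (n > 1) {
        largest = n;
    }
    return largest;
}
```

Calling largest_prime_factor(600851475143ULL) returns 6857 after only about 1500 iterations of the outer loop, because the target shrinks each time a factor is divided out.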
Code has 2 major problems:
The while (i < target) loop is very inefficient. Upon finding a factor, target could be reduced with target = target / i;. Further, a factor i could occur multiple times. Fix not shown.
is_prime(n) is very inefficient. Its while (num > z && z !=num) could loop n times. Here too, use the quotient to limit the iterations to about sqrt(n).
int is_prime (unsigned long long int num) {
unsigned long long int z = 2;
while (z <= num/z) {
if ((num % z) == 0) return 0;
z++;
}
return num > 1;
}
Nothing is wrong, it just needs optimization, for example:
int is_prime(unsigned long long int num) {
if (num == 2) {
return (1); /* Special case */
}
if (num % 2 == 0 || num <= 1) {
return (0);
}
unsigned long long int z = 3; /* We skipped the all even numbers */
while (z < num) { /* Do a single test instead of your redundant ones */
if ((num % z) == 0) {
return 0;
}
z += 2; /* Here we go twice as fast */
}
return 1;
}
The other big problem is still while (z < num), but since you don't want the solution, I will let you find how to optimize that. Similarly, look over the first function yourself.
EDIT: Someone else posted the array-list-of-primes solution 50 seconds before me, which is the best, but I chose to give an easy solution since you are just a beginner and manipulating arrays may not be easy at first (you need to learn pointers and such).
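For readers who do want to see where the hints lead, here is one possible sketch that combines skipping even numbers with the quotient-based limit from the earlier answer. It is an illustration, not the only way to write it.

```c
/* Returns 1 if num is prime, 0 otherwise. */
int is_prime(unsigned long long num) {
    if (num < 2) {
        return 0;
    }
    if (num % 2 == 0) {
        return num == 2;    /* 2 is the only even prime */
    }
    /* Test odd divisors only, stopping once z exceeds sqrt(num). */
    for (unsigned long long z = 3; z <= num / z; z += 2) {
        if (num % z == 0) {
            return 0;
        }
    }
    return 1;
}
```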
is_prime has a chicken-and-egg problem in that you need to test num only against other primes. So you don't need to check against 9, because that is a multiple of 3.
is_prime could maintain an array of primes, and each time a new num is tested that is a prime, it can be added to the array. num is tested against each prime in the array, and if it is not divisible by any of the primes in the array, it is itself a prime and is added to the array. The array needs to be malloc'd and realloc'd unless there is a formula to calculate the number of primes up to your target (I believe such a formula does not exist).
EDIT: the number of primes to test for the target 600,851,475,143 will be approximately 7,500,000,000 and the table could run out of memory.
The approach can be adapted as follows:
to use unsigned int up until primes of UINT_MAX
to use unsigned long long int for primes above that
to use brute force above a certain memory consumption.
UINT_MAX is defined as 4,294,967,295 and would cover the primes up to around 100,000,000,000 and would cost 7.5*4= 30Gb
See also The Prime Pages.
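A rough sketch of the growing prime table described above might look like this. The type prime_table and the functions table_push and build_primes are hypothetical names; the sketch assumes the table starts empty and stops testing each candidate once the stored prime exceeds its square root.

```c
#include <stdlib.h>

/* Hypothetical growable table of the primes found so far. */
typedef struct {
    unsigned long long *data;
    size_t len;
    size_t cap;
} prime_table;

/* Appends p, doubling the capacity when full. Returns 0 on allocation failure. */
static int table_push(prime_table *t, unsigned long long p) {
    if (t->len == t->cap) {
        size_t new_cap = t->cap ? t->cap * 2 : 16;
        unsigned long long *tmp = realloc(t->data, new_cap * sizeof *tmp);
        if (tmp == NULL) {
            return 0;
        }
        t->data = tmp;
        t->cap = new_cap;
    }
    t->data[t->len++] = p;
    return 1;
}

/* Fills an empty table with all primes <= limit by testing each candidate
   against the primes already stored. Returns the number of primes found,
   or (size_t)-1 if memory runs out. */
size_t build_primes(prime_table *t, unsigned long long limit) {
    for (unsigned long long n = 2; n <= limit; n++) {
        int prime = 1;
        for (size_t k = 0; k < t->len; k++) {
            unsigned long long p = t->data[k];
            if (p > n / p) {
                break;      /* no stored prime up to sqrt(n) divides n */
            }
            if (n % p == 0) {
                prime = 0;
                break;
            }
        }
        if (prime && !table_push(t, n)) {
            return (size_t)-1;
        }
    }
    return t->len;
}
```

Starting from prime_table t = {0}; a call such as build_primes(&t, 100) leaves the 25 primes up to 97 in t.data, which the caller later releases with free(t.data). For factoring, the table only needs to reach the square root of the target, which keeps the memory use far smaller than a table reaching the target itself.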
Related
I am writing an article about the importance of prime numbers in today's cryptography. I want to develop a small application showing how long a program written in C (a low-level language, at least to me) would take to factorize a composite number into its prime factors. I came up with a simple algorithm to do so, but I ran into a problem:
I would like the user to be able to type gigantic numbers, for example: 7777777777777777777777777772
So the computer would take some hours to process that, showing how good our cryptography based upon primes is.
But in C the largest data type I could find was LONG which goes up to 2147483646.
Do you guys know how I could be able to type and process a big number in C?
Thanks in advance
Factorization of really big numbers
I would like the user to be able to type gigantic numbers, for example: 7777777777777777777777777772
That is a 93 bit number, not that gigantic, so one could simplistically brute force it.
Something like the code below works if you have access to an unsigned __int128. C does specify 64-bit types, yet beyond that, you are on your own.
I'd estimate this modest factorization could take some minutes.
https://www.dcode.fr/prime-factors-decomposition reports the answer in seconds.
Of course, many improvements can be had.
unsigned __int128 factor(unsigned __int128 x) {
if (x <= 3) {
return x;
}
if (x %2 == 0) return 2;
for (unsigned __int128 i = 3; i <= x/i; i += 2) {
static unsigned long n = 0;
if (++n >= 100000000) {
n = 0;
printf(" %llu approx %.0f\n", (unsigned long long) i, (double)(x/i));
fflush(stdout);
}
if (x%i == 0) {
return i;
}
}
return x;
}
void factors(unsigned __int128 x) {
    do {
        unsigned __int128 f = factor(x);
        // printf has no conversion for unsigned __int128, so print double approximations
        printf("approx %.0f approx %.0f\n", (double) f, (double)x);
        fflush(stdout);
        x /= f;
    } while (x > 1);
}
Output
approx 2 approx 7777777777777778308713283584
approx 2 approx 3888888888888889154356641792
approx 487 approx 1944444444444444577178320896
approx 2687 approx 3992699064567647864619008
99996829 approx 14859790387308
199996829 approx 7429777390798
299996829 approx 4953158749339
399996829 approx 3714859245385
499996829 approx 2971882684351
...
38399996829 approx 38696146902
38499996829 approx 38595637421
approx 1485931918335559335936 approx 1485931918335559335936
The right answer though is to use more efficient algorithms and then consider the types needed.
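Because printf and scanf have no conversion for unsigned __int128, reading the number the user types and printing exact factors takes a little extra code. The sketch below assumes a compiler that provides unsigned __int128 (GCC and Clang do, as an extension); the names u128_from_str and u128_to_str are made up for illustration.

```c
#include <stddef.h>

/* Parses a decimal string into *out. Returns 0 on success, -1 on empty
   input, a non-digit character, or a value above 2^128 - 1. */
int u128_from_str(const char *s, unsigned __int128 *out) {
    const unsigned __int128 max = ~(unsigned __int128)0;
    unsigned __int128 x = 0;
    if (*s == '\0') {
        return -1;
    }
    for (; *s != '\0'; s++) {
        if (*s < '0' || *s > '9') {
            return -1;
        }
        unsigned digit = (unsigned)(*s - '0');
        if (x > (max - digit) / 10) {
            return -1;      /* x * 10 + digit would overflow */
        }
        x = x * 10 + digit;
    }
    *out = x;
    return 0;
}

/* Writes x in decimal into buf. Returns buf, or NULL if size is too small
   (40 bytes always suffice: 39 digits plus the terminator). */
char *u128_to_str(unsigned __int128 x, char *buf, size_t size) {
    char tmp[40];
    size_t n = 0;
    do {                    /* collect digits, least significant first */
        tmp[n++] = (char)('0' + (int)(x % 10));
        x /= 10;
    } while (x != 0);
    if (size < n + 1) {
        return NULL;
    }
    for (size_t k = 0; k < n; k++) {
        buf[k] = tmp[n - 1 - k];
    }
    buf[n] = '\0';
    return buf;
}
```

With these two helpers, the value typed by the user can be parsed into x, passed to factors, and each factor printed exactly with puts(u128_to_str(f, buf, sizeof buf)) instead of as a double approximation.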
The same way you do it on paper. You break the number into pieces and use long division, long addition, and long multiplication.
Perhaps the simplest way is to store the number as a base 10 string and write code to do all the operations you need on those strings. You would do addition with carries the same way you do it on paper. Multiplication would be done with single-digit multiplication combined with addition (which you'd have already done). And so on.
There are plenty of libraries available to do this for you such as libgmp's MPZ library and OpenSSL's BN library.
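To make the paper-and-pencil idea concrete, here is a small sketch of addition on base 10 strings. The function name add_decimal and its interface are assumptions for illustration; it expects non-negative numbers written with digits only and does no input validation.

```c
#include <stddef.h>
#include <string.h>

/* Adds the decimal strings a and b and writes the sum into out.
   Returns 0 on success, -1 if out_size is too small. */
int add_decimal(const char *a, const char *b, char *out, size_t out_size) {
    size_t la = strlen(a);
    size_t lb = strlen(b);
    size_t n = (la > lb ? la : lb) + 1;     /* room for a final carry */
    if (out_size < n + 1) {
        return -1;
    }
    int carry = 0;
    out[n] = '\0';
    /* Walk from the least significant digit, as on paper. */
    for (size_t k = 0; k < n; k++) {
        int da = k < la ? a[la - 1 - k] - '0' : 0;
        int db = k < lb ? b[lb - 1 - k] - '0' : 0;
        int s = da + db + carry;
        out[n - 1 - k] = (char)('0' + s % 10);
        carry = s / 10;
    }
    /* Drop the leading zero when there was no final carry. */
    if (out[0] == '0' && n > 1) {
        memmove(out, out + 1, n);
    }
    return 0;
}
```

Multiplication, subtraction, and long division can be built in the same style, although a library such as GMP will be far faster for serious work.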
You can use a struct and just set the numbers you want. The code below is not tested but should give you some direction.
With each field limited to 999 as below, x fields cover values up to 1000 to the power of x, x being the number of places you define in the struct. Letting each field use its full range would raise that to roughly 4294967295 (UINT_MAX) to the power of x.
typedef struct big_number {
    int thousands;
    int millions;
    int billions;
} big_number;
// Then do some math
big_number add(big_number n1, big_number n2) {
    big_number sum;
    sum.thousands = n1.thousands + n2.thousands;
    sum.millions = n1.millions + n2.millions;
    sum.billions = n1.billions + n2.billions;
    // each part of the struct has a maximum value of 999, so carry any excess upward
    if (sum.thousands > 999) {
        sum.millions += sum.thousands / 1000; // move the carry up
        sum.thousands %= 1000;
    }
    if (sum.millions > 999) {
        sum.billions += sum.millions / 1000;
        sum.millions %= 1000;
    }
    return sum;
}
Here is my code:
I don't understand why it gives me the wrong answer above 50.
#include<stdio.h>
int main()
{
long long int i, sum=0;
long long int a[50];
a[0] = 1;
a[1] = 1;
for(i=2;i<50;i++)
{
a[i] = a[i-1] + a[i-2];
if(a[i]%2==0 && a[i]<4000000)
sum = sum + a[i];
}
printf("%lld", sum);
return 0;
}
Your first mistake was not breaking out of the loop when a term
exceeded 4,000,000. You don’t need to consider terms beyond that for the
stated problem; you don’t need to deal with integer overflow if you stop
there; and you don’t need anywhere near 50 terms to get that far.
Nor, for that matter, do you need to store all of the terms, unless you
want to look at them to check correctness (and simply printing them
would work just as well for that).
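A minimal sketch of that advice, assuming the task is to sum the even Fibonacci terms below a limit (the function name sum_even_fib is made up): it keeps only the last two terms and stops as soon as a term reaches the limit, so neither an array nor overflow handling is needed.

```c
/* Sums the even Fibonacci numbers that are strictly below limit. */
unsigned long long sum_even_fib(unsigned long long limit) {
    unsigned long long prev = 1;
    unsigned long long curr = 1;
    unsigned long long sum = 0;
    while (curr < limit) {          /* stop at the first term >= limit */
        if (curr % 2 == 0) {
            sum += curr;
        }
        unsigned long long next = prev + curr;
        prev = curr;
        curr = next;
    }
    return sum;
}
```

With a limit of 4000000 this returns 4613732 after only a few dozen iterations.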
You have an integer overflow. Fibonacci numbers get really big. F(93) is already out of the range of signed 64-bit integers (like long long), and F(94) is out of the range of unsigned 64-bit integers as well.
F(90) = 2880067194370816120 >= 2^61
F(91) = 4660046610375530309 >= 2^62
F(92) = 7540113804746346429 >= 2^62
F(93) = 12200160415121876738 >= 2^63
F(94) = 19740274219868223167 >= 2^64
F(95) = 31940434634990099905 >= 2^64
F(96) = 51680708854858323072 >= 2^65
When the overflow happens, you will get smaller, or even negative, numbers in a instead of the real Fibonacci numbers. You need to work around this overflow.
#include <stdio.h>
int main()
{
int i,j,k,t;
long int n;
int count;
int a,b;
float c;
scanf("%d",&t);
for(k=0;k<t;k++)
{
count=0;
scanf("%d",&n);
for(i=1;i<n;i++)
{
a=pow(i,2);
for(j=i;j<n;j++)
{
b=pow(j,2);
c=sqrt(a+b);
if((c-floor(c)==0)&&c<=n)
++count;
}
}
printf("%d\n",count);
}
return 0;
}
The above is C code that counts the number of Pythagorean triplets within the range 1..n.
How do I optimize it? It times out for large input.
1<=T<=100
1<=N<=10^6
Your inner two loops are O(n*n) so there's not too much that can be done without changing algorithms. Just looking at the inner loop the best I could come up with in a short time was the following:
unsigned long long int i,j,k,t;
unsigned long long int n = 30000; //Example for testing
unsigned long long int count = 0;
unsigned long long int a, b;
unsigned long long int c;
unsigned long long int n2 = n * n;
for(i=1; i<n; i++)
{
a = i*i;
for(j=i; j<n; j++)
{
b = j*j;
unsigned long long int sum = a + b;
if (sum > n2) break;
// Quick bit test: skip sums whose low bits rule out a perfect square
if ( (sum & 2) || ((sum & 7) == 5) || ((sum & 11) == 8) ) continue;
c = sqrt((double)sum);
if (c*c == sum) ++count;
}
}
A few comments:
For the case of n=30000 this is roughly twice as fast as your original.
If you don't mind n being limited to 65535 you can switch to unsigned int to get a x2 speed increase (or roughly x4 faster than your original).
The quick bit test before the sqrt call (it rejects sums whose low bits rule out a perfect square) increases the speed by a factor of two. You may be able to increase this by looking at the answers to this question.
Your original code has integer overflows when i > 65535 which is the reason I switched to 64-bit integers for everything.
I think your method of checking for a perfect square doesn't always work due to the inherent imprecision of floating point numbers. The method in my example should get around that and is slightly faster anyway.
You are still bound to the O(n*n) algorithm. On my machine the code for n=30000 runs in about 6 seconds which means the n=1000000 case will take close to 2 hours. Looking at Wikipedia shows a host of other algorithms you could explore.
It really depends on what the benchmark is that you are expecting.
But for now, the power function could be a bottleneck in this. I think you can do either of two things:
a) precalculate all the squared values, save them in a file, and then load them into a dictionary. Depending on the input size, that might strain your memory.
b) memoize previously calculated squared values so that when asked again, you can reuse them, thereby saving CPU time. This, again, would eventually strain your memory.
You can define your indexes as (unsigned) long or even (unsigned) long long, but you may have to use bignum libraries to solve your problem for huge numbers. Using unsigned raises your maximum value but restricts you to non-negative numbers. I doubt you'll need anything bigger than long long though.
It seems your question is about optimising your code to make it faster. If you read up on Pythagorean triplets you will see there is a way to calculate them using integer parameters. If 3 4 5 are triplets then we know that 2*3 2*4 2*5 are also triplets and k*3 k*4 k*5 are also triplets. Your algorithm is checking all of those triplets. There are better algorithms to use, but I'm afraid you will have to search on Google to study about Pythagorean triplets.
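One such parametrization is Euclid's formula: every primitive triplet has hypotenuse m*m + k*k for coprime m > k of opposite parity, and every other triplet is a multiple of a primitive one. The sketch below uses it to count triplets with hypotenuse at most a given limit; the function names are made up, and this swaps the brute-force search for a different algorithm rather than tuning the original one.

```c
/* Greatest common divisor by the Euclidean algorithm. */
static unsigned long long gcd_ull(unsigned long long a, unsigned long long b) {
    while (b != 0) {
        unsigned long long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Counts Pythagorean triplets (a <= b < c) with c <= limit. */
unsigned long long count_triples(unsigned long long limit) {
    unsigned long long count = 0;
    for (unsigned long long m = 2; m * m + 1 <= limit; m++) {
        for (unsigned long long k = 1; k < m; k++) {
            unsigned long long c = m * m + k * k;   /* primitive hypotenuse */
            if (c > limit) {
                break;
            }
            /* Coprime and of opposite parity: (m, k) gives a primitive triplet. */
            if ((m - k) % 2 == 1 && gcd_ull(m, k) == 1) {
                count += limit / c;     /* the triplet and all its multiples */
            }
        }
    }
    return count;
}
```

The two loops together visit only about limit pairs (m, k), so a limit of 10^6 is handled in a fraction of a second instead of hours.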
Optimized way to handle the value of n^n (1 ≤ n ≤ 10^9)
I used long long int but it's not good enough as the value might be (1000^1000)
Searched and found the GMP library http://gmplib.org/ and BigInt class but don't wanna use them. I am looking for some numerical method to handle this.
I need to print the first and last k (1 ≤ k ≤ 9) digits of n^n
For the first k digits I am getting it like shown below (it's bit ugly way of doing it)
num = pow(n,n);
while(num){
arr[i++] = num%10;
num /= 10;
digit++;
}
while(digit > 0){
j=digit;
j--;
if(count<k){
printf("%lld",arr[j]);
count++;
}
digit--;
}
and for last k digits am using num % 10^k like below.
findk=pow(10,k);
lastDigits = num % findk;
The maximum value of k is 9, so I need only 18 digits at most.
I am thinking of getting those 18 digits without really evaluating the complete n^n expression.
Any ideas or suggestions?
// note: Scope of use is limited.
#include <stdio.h>
long long powerMod(long long a, long long d, long long n){
// a ^ d mod n
long long result = 1;
while(d > 0){
if(d & 1)
result = result * a % n;
a = (a * a) % n;
d >>=1;
}
return result;
}
int main(void){
long long result = powerMod(999, 999, 1000000000);//999^999 mod 10^9
printf("%lld\n", result);//499998999
return 0;
}
Finding the least significant digits (last k digits) is easy because of a property of modular arithmetic, which says: (n*n)%m == ((n%m) * (n%m))%m. So the code shown by BLUEPIXY, which follows the exponentiation-by-squaring method, will work well for finding the k LSDs.
Now, the most significant digits (first k digits) of N^N can be found in this way:
We know,
N^N = 10^(N log10 N)
So if you calculate N log10(N), you will get a number of the form xxxx.yyyy. When this number is used as a power of 10, the integer part xxxx only multiplies the result by 10^xxxx, which shifts the decimal point and does not change the leading digits. That means that if you calculate 10^0.yyyy, you will get the significant digits you are looking for.
So the solution will be something like this:
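double R = N * log10(N);
R = R - (long long) R; // keep only the fractional part
double V = pow(10, R); // 1 <= V < 10
long long powerK = 1;
for (int i = 1; i < k; i++) powerK *= 10; // 10^(k-1), so V ends up with k digits before the point
V *= powerK;
// Now print the integer part of V: it holds the first k digits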
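As a small worked check of the idea, take N = 5, with values rounded to five decimal places:

```latex
\begin{align*}
N \log_{10} N &= 5 \times 0.69897 = 3.49485 \\
10^{0.49485} &\approx 3.125 \\
5^{5} &= 3125 = 3.125 \times 10^{3}
\end{align*}
```

The fractional part 0.49485 alone determines the leading digits 3, 1, 2, 5, while the integer part 3 only sets the position of the decimal point. Note that a double carries only about 15 to 16 significant digits, so when N is close to 10^9 most of them are spent on the integer part of N log10(N), and fewer than 9 leading digits may come out reliably.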
Why don't you want to use bigint libraries?
bignum arithmetic is very hard to do right and efficiently. You could still get a PhD by working on that subject.
First, bigint arithmetic involves non-trivial algorithms.
Then, bigint implementations usually need some machine instructions (like add with carry) which are not easily accessible in plain C.
For your specific problem (first and last few digits of N^N) you had better also reason on paper (using arithmetic theorems) to lower the complexity. I am not an expert, but I guess that it still remains intractable, perhaps with a complexity worse than O(N).
I started working on Project Euler problems today to keep myself busy over break. One of the problems asks for the sum of all prime numbers below 2 million, so I threw together a Sieve of Eratosthenes to find all those numbers.
unsigned long i, j, sum = 0, limit = 2000000;
// Allocate an array to store the state of numbers (1 is prime, 0 is not).
int* primes = malloc(limit * sizeof(int));
// Initialize every number as prime.
for (i = 0; i < limit; i++)
primes[i] = 1;
// Use the sieve to check off values that are not prime.
for (i = 2; i*i < limit; i++)
if (primes[i] != 0)
for (j = 2; i*j < limit; j++)
primes[i*j] = 0;
// Get the sum of all numbers still marked prime.
for (i = 2; i < limit; i++)
if (primes[i] != 0)
sum += i;
printf("%d", sum);
This works perfectly up to limit around half a million. After this, it returns random values (for example, 1000000 returns -1104303641). I've tried declaring all the unsigned long variables as unsigned long long to no avail. The error seems to be happening in the last 4 lines, because primes[] contains nothing but 1's and 0's at that point. I figure this has something to do with the size of the values being worked with, can anyone offer any guidance?
Wolfram Alpha tells me that the sum of the primes less than 500000 is 9914236195.
That number doesn't fit in a 32 bit integer, so you're overflowing an int during your sum loop. You could try to use a uint64_t, but that problem will eventually occur again with a high enough limit (although I suspect that a limit of 2000000 will fit).
Change the %d to %lu (sum is an unsigned long) and you should get:
142913828922
which seems like it should be (close to or) the right answer.
..assuming your longs aren't 32-bit.
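To avoid depending on the width of long, the sum can be kept in a fixed-width 64-bit type. The following is a sketch rather than a drop-in replacement: the function name sum_primes_below is made up, it uses one byte per flag instead of an int, and it returns 0 when the allocation fails.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns the sum of all primes below limit using a Sieve of Eratosthenes.
   Returns 0 if limit < 3 or if the allocation fails. */
uint64_t sum_primes_below(size_t limit) {
    if (limit < 3) {
        return 0;
    }
    unsigned char *composite = calloc(limit, 1);    /* 0 = still prime */
    if (composite == NULL) {
        return 0;
    }
    uint64_t sum = 0;
    for (size_t i = 2; i < limit; i++) {
        if (composite[i]) {
            continue;
        }
        sum += i;
        /* Cross off multiples from i*i upward, only when i*i < limit. */
        if (i <= (limit - 1) / i) {
            for (size_t j = i * i; j < limit; j += i) {
                composite[j] = 1;
            }
        }
    }
    free(composite);
    return sum;
}
```

The result should then be printed with a matching format, for example printf("%" PRIu64 "\n", sum_primes_below(2000000)); after including <inttypes.h>, which gives 142913828922 regardless of whether long is 32 or 64 bits wide.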
If you're on a 32-bit platform, you're going to need some sort of third-party big int library.
BigInteger in C?
recommends: http://gmplib.org/