Why does this prime number algorithm work?

#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main() {
int anz;
scanf("%d", &anz);
time_t start = time(0);
int *primZ = malloc(anz * sizeof(int));
primZ[0] = 2;
int Num = 0;
for (int i = 1, num = 3; i < anz; num += 2) {
for (int j = 1; j < i; j++) {
if (num % primZ[j] == 0) {
num += 2;
j = 0;
}
//this part
if (primZ[j] > i / 2)
break;
}
primZ[i] = num;
i++;
printf("%d ,",num);
}
time_t delta = time(0) - start;
printf("%d", delta);
getchar();
getchar();
return 0;
}
The code works perfectly fine; the question is why. The check if (primZ[j] > i/2) makes the program 2 to 3 times faster. It was actually meant to be if (primZ[j] > num/3), which makes perfect sense because num can only be an odd number. But i is the number of primes found so far, not the candidate, so the check as written makes no sense to me. Please explain.

You check whether a number is composite by testing whether it is divisible by the primes already found. In doing so you only have to check primes up to and including the square root of the number, because if a divisor larger than the square root divides the number, the quotient is a divisor smaller than the square root.
For example, 33 is composite, but you only have to check primes up to 5 to see that. You don't need to test divisibility by 11, because the quotient is 3 (33/11 = 3), which was already checked.
This means that you could improve your algorithm like this:
for (int j = 1; j < i; j++) {
if( primZ[j]*primZ[j] > num )
break;
if (num % primZ[j] == 0) {
num += 2;
j = 0;
}
}
The reason you can get away with cutting off at i/2 is the distribution of the prime numbers. By the prime number theorem, the prime counting function is approximately i = num/log(num), so i/2 soon exceeds sqrt(num) as num grows.
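To see this empirically, the two cutoffs can be run side by side and their output compared. The following is only a sketch: the function names primes_half_index, primes_sqrt and cutoffs_agree are invented for this illustration, and the loop bodies are copied from the question and from the improved version above.

```c
#include <stdbool.h>
#include <stdlib.h>

/* The question's loop: stop after testing a prime larger than i / 2. */
void primes_half_index(int *out, int count)
{
    out[0] = 2;
    for (int i = 1, num = 3; i < count; num += 2) {
        for (int j = 1; j < i; j++) {
            if (num % out[j] == 0) {
                num += 2;   /* composite: move on to the next odd candidate */
                j = 0;      /* restart; the j++ brings us back to out[1] */
            }
            if (out[j] > i / 2)
                break;
        }
        out[i++] = num;
    }
}

/* The same loop with the square-root cutoff. */
void primes_sqrt(int *out, int count)
{
    out[0] = 2;
    for (int i = 1, num = 3; i < count; num += 2) {
        for (int j = 1; j < i; j++) {
            if (out[j] * out[j] > num)
                break;
            if (num % out[j] == 0) {
                num += 2;
                j = 0;
            }
        }
        out[i++] = num;
    }
}

/* Returns true if both variants produce the same first `count` primes.
   Assumes count >= 1 and primes small enough that out[j] * out[j] fits in int. */
bool cutoffs_agree(int count)
{
    int *a = malloc(count * sizeof *a);
    int *b = malloc(count * sizeof *b);
    bool same = (a != NULL && b != NULL);
    if (same) {
        primes_half_index(a, count);
        primes_sqrt(b, count);
        for (int k = 0; k < count; k++)
            if (a[k] != b[k])
                same = false;
    }
    free(a);
    free(b);
    return same;
}
```

Calling cutoffs_agree(1000) compares the first 1000 primes produced by each variant.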

The reason is that the actual bound is much tighter than num/3 - you could use:
if (primZ[j] > sqrt(num))
The reason is that if a prime larger than the square root of num divides num, there must also be a smaller prime that does, since the quotient of that division must be smaller than the square root.
This means that as long as i/2 is at least sqrt(num), the code will work. The number of primes below a number grows faster than the square root of that number, so (completely by accident) i/2 is a safe bound to use.
You can look up how your i value behaves under the name pi(x), the prime-counting function: the number of primes less than or equal to x.
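The growth claim can be checked numerically. The sketch below counts primes by plain trial division; is_prime_small, prime_count and half_count_exceeds_sqrt are hypothetical helper names introduced here, not part of the question's code.

```c
#include <stdbool.h>

/* Trial-division primality test, adequate for small n. */
bool is_prime_small(int n)
{
    if (n < 2)
        return false;
    for (int d = 2; d * d <= n; d++)
        if (n % d == 0)
            return false;
    return true;
}

/* pi(x): the number of primes less than or equal to x. */
int prime_count(int x)
{
    int count = 0;
    for (int n = 2; n <= x; n++)
        if (is_prime_small(n))
            count++;
    return count;
}

/* True if half the prime count exceeds sqrt(x), tested as (pi(x)/2)^2 > x
   so that no floating point is needed. */
bool half_count_exceeds_sqrt(int x)
{
    int half = prime_count(x) / 2;
    return half * half > x;
}
```

For x = 100 there are 25 primes, and 12 * 12 = 144 > 100, so the inequality holds. For a very small x such as 10 (4 primes) it does not hold yet, so the growth argument only covers larger values.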

It makes sense: if n has two factors, one of them is surely less than or equal to n/2. Since the program found no factors of i in primZ that are less than or equal to i/2, there are no factors of i (except 1, of course).
Since primZ is sorted in ascending order and j only increases, primZ[j] > i/2 indicates that there are no factors of i in primZ that are less than i/2.
P.S. The starting point of the search is given in the first part of the for statement, num = 3, and the increment num += 2 ensures that you only test odd numbers.

Related

I want to optimize this program

I have recently started to learn C and, as a programming exercise, I've written a program that computes and lists the prime numbers from 0 up to a maximum entered by the user. It's a rather short program, so I'll post the source code here.
// playground.c
#include <stdio.h>
#include <stdbool.h>
#include <math.h>
int main ()
{
int max;
printf("Please enter the maximum number up to which you would like to see all primes listed: ");
scanf("%i", &max);
printf("All prime numbers in the range 0 to %i:\nPrime number: 2\n", max);
bool isComposite;
int primesSoFar[(max >> 1) + 1];
primesSoFar[0] = 2;
int nextIdx = 1;
for (int i = 2; i <= max; i++)
{
isComposite = false;
for (int k = 2; k <= (int)sqrt(i) + 1; k++)
{
if (k - 2 < nextIdx)
{
if (i % primesSoFar[k - 2] == 0)
{
isComposite = true;
k = primesSoFar[k - 2];
}
}else
{
if (i % k == 0) isComposite = true;
}
}
if (!isComposite)
{
printf("Prime number: %i\n", i);
primesSoFar[nextIdx] = i;
nextIdx++;
}
}
double primeRatio = (double)(nextIdx + 1) / (double)(max);
printf("The ratio of prime numbers to composites in range 0 to %d is %lf", max, primeRatio);
return 0;
}
I have become strangely fascinated with optimizing this program but I've hit a wall. The array primesSoFar is allocated based on a computed maximum size which ideally would be no larger than the number of prime numbers from 0 to max. Even if it were just slightly larger, that would be fine; as long as it's not smaller. Is there a way to compute the size the array needs to be that doesn't depend on first computing the primes up to max?
I've updated the code both applying suggested optimizations and adding internal documentation wherever it seemed helpful.
// can compute all the primes up to 0x3FE977 (4_188_535). Largest prime 4_188_533
#include <stdio.h>
#include <stdbool.h>
#include <math.h>
int main ()
{
int max;
printf("Please enter the maximum number up to which you would like to see all primes listed: ");
scanf("%i", &max);
// The algorithm proper doesn't print 2.
printf("All prime numbers in the range 0 to %i:\nPrime number: 2\n", max);
bool isComposite;
// primesSoFar is a memory hog. It'd be nice to reduce its size in proportion to max. The frequency
// of primes diminishes at higher numerical ranges. A formula for calculating the number of primes for
// a given numerical range would be nice. Sadly, it's not linear.
int PRIMES_MAX_SIZE = (max >> 1) + 1;
int primesSoFar[PRIMES_MAX_SIZE];
primesSoFar[0] = 2;
int nextIdx = 1;
int startConsecCount = 0;
for (int i = 2; i <= max; i++)
{
isComposite = false; // Assume the current number isn't composite.
for (int k = 2; k <= (int)sqrt(i) + 1; k++)
{
if (k - 2 < nextIdx) // Check it against all primes found so far.
{
if (i % primesSoFar[k - 2] == 0)
{
// If i is divisible by a previous prime number, break.
isComposite = true;
break;
}else
{
// Prepare to start counting consecutive integers at the largest prime + 1. if i
// isn't divisible by any of the primes found so far.
startConsecCount = primesSoFar[k - 2] + 1;
}
}else
{
if (startConsecCount != 0) // Begin counting consecutively at the largest prime + 1.
{
k = startConsecCount;
startConsecCount = 0;
}
if (i % k == 0)
{
// If i is divisible by some value of k, break.
isComposite = true;
break;
}
}
}
if (!isComposite)
{
printf("Prime number: %i\n", i);
if (nextIdx < PRIMES_MAX_SIZE)
{
// If the memory allocated for the array is sufficient to store an additional prime, do so.
primesSoFar[nextIdx] = i;
nextIdx++;
}
}
}
// I'm using this to get data with which I can find a way to compute a smaller size for primesSoFar.
double primeRatio = (double)(nextIdx + 1) / (double)(max);
printf("The ratio of prime numbers to composites in range 0 to %d is %lf\n", max, primeRatio);
return 0;
}
edit: primesSoFar should be half the size of the range 0 to max. No doubt that's caused some confusion.

I can give you two main ideas as I have worked on a project discussing this problem.
A prime number bigger than 3 is either 6k-1 or 6k+1, so for example 183 can't be prime because 183 = 6x30+3, and you don't even have to check it. (Be careful: this condition is necessary but not sufficient. 25, for example, is 6x4+1 but is not prime.)
A number is prime if it can't be divided by any prime number smaller than or equal to its square root, so it pays to make use of the smaller primes you have already found.
Thus, you can start with a primesList containing 2 and 3 and iterate over k to test all the 6k-1 and 6k+1 numbers (5, 7, 11, 13, 17, 19, 23, 25...) using the second rule: divide the candidate by the elements of primesList that are smaller than or equal to its square root. As soon as one element divides it, stop and move on to the next candidate, because this one is not prime. Otherwise (if none divides it), add the new prime to primesList.
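The two rules can be combined roughly as follows. This is a minimal sketch, not the asker's program: the function name primes_6k and the caller-supplied buffer out with capacity cap are assumptions made for the example, and limit is assumed to be well below INT_MAX.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stores the primes <= limit in out (at most cap of them) and returns
   how many were stored. Candidates above 3 are only 6k - 1 and 6k + 1. */
size_t primes_6k(int limit, int *out, size_t cap)
{
    size_t count = 0;
    if (limit >= 2 && count < cap) out[count++] = 2;
    if (limit >= 3 && count < cap) out[count++] = 3;

    /* step alternates 2, 4, 2, 4, ... giving 5, 7, 11, 13, 17, 19, 23, 25, ... */
    for (int n = 5, step = 2; n <= limit && count < cap; n += step, step = 6 - step) {
        bool composite = false;
        /* Start at index 2: the candidates are never divisible by 2 or 3.
           out[j] <= n / out[j] is the test out[j]^2 <= n without overflow. */
        for (size_t j = 2; j < count && out[j] <= n / out[j]; j++) {
            if (n % out[j] == 0) {
                composite = true;
                break;
            }
        }
        if (!composite)
            out[count++] = n;
    }
    return count;
}
```

With a buffer of 64 ints, primes_6k(100, buf, 64) stores the 25 primes below 100, ending with 97.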

There is some debugging to be done first.
When I saw that the loop test was <=, my brain said BUG, since arrays are indexed from 0 to max - 1.
for (int i = 2; i <= max; i++)
So I went to look at the array.
int primesSoFar[(max >> 1) + 1];
Oh, one is added to the size, so that should be OK.
Wait, why is that shift in there? (max >> 1) is a divide by two.
I compiled the code and ran it, and MSVC reported a memory error.
I removed the shift, and the memory error report went away. The program worked as expected.
With that out of the way, PiNaKa30 and II Saggio Vecchino have very good advice. The choice of algorithm is going to affect the performance dramatically.
Mat gives very good advice. Read the Wikipedia entry. It is filled with wonderful information.
Picking the correct algorithm is key.
How you represent the data you are checking is a factor. int has a maximum value it can hold.
A performance profiler can tell you lots of useful information about where the Hot Spots are in your program.
Congratulations on your efforts in learning C. You picked a very good learning path.

The source code that follows is basically a rewrite. It's running now as I write this. I entered 0x7FFF_FFFF, the maximum positive 32-bit signed integer. In mere minutes on my Acer Aspire laptop with an AMD Ryzen 3 running Linux Mint, it's already in the hundreds of millions! The memory usage of the old version was half of max, making anything larger than 0x3FE977 impossible with my 4 GB of RAM. Now it only uses 370728 bytes of memory for its array data when computing primes from 0 to 2_147_483_647.
/*
A super optimized prime number generator using my own implementation of the sieve of Eratosthenes.
*/
#include <stdio.h>
#include <stdbool.h>
#include <math.h>
int main ()
{
int max;
printf("Please enter the maximum to which you would like to see all primes listed: ");
scanf("%i", &max);
/*
Primes and their multiples will be stored until the next multiple of the prime is larger than max.
That prime and its corresponding multiple will then be replaced with a new prime and its corresponding
multiple.
*/
int PRIMES_MAX_SIZE = (int)sqrt(max) + 1;
int primes[PRIMES_MAX_SIZE];
int multiples[PRIMES_MAX_SIZE];
primes[0] = 2;
multiples[0] = 2;
int nextIdx = 1;
int const NO_DISPOSE_SENTINAL_VALUE = -1;
int nextDispose = NO_DISPOSE_SENTINAL_VALUE;
int startConsecCount = 0;
int updateFactor;
bool isComposite;
printf("All prime numbers in the range 0 to %i:\n\n", max);
// Iterate from i = 2 to i = max and test each i for primality.
for (int i = 2; i <= max; i++)
{
isComposite = false;
/*
Check whether the current i is prime by comparing it with the current multiples of
prime numbers, updating them when they are less than the current i and then proceeding
to check whether any consecutive integers up to sqrt(i) divide the current i evenly.
*/
for (int k = 2; k < (int)sqrt(i) + 1; k++)
{
if (k < nextIdx)
{
// Update the multiple of a prime if it's smaller than the current i.
if (multiples[k] < i)
{
updateFactor = (int)(i / primes[k]);
multiples[k] = updateFactor * primes[k] + primes[k];
// Mark the value for disposal if it's greater than sqrt(max).
if (multiples[k] > (int)sqrt(max)) nextDispose = k;
}
if (i == multiples[k])
{
isComposite = true;
break;
}else
{
startConsecCount = multiples[k] + 1;
}
} else
{
if (startConsecCount != 0)
{
k = startConsecCount;
startConsecCount = 0;
}
if (i % k == 0)
{
isComposite = true;
break;
}
}
}
/*
Print the prime numbers and either insert them at indices occupied by disposed primes or at
the next array index if available.
*/
if (!isComposite)
{
printf("Prime number: %i\n", i);
if (nextDispose != NO_DISPOSE_SENTINAL_VALUE)
{
primes[nextDispose] = i;
// This will trigger the update code before the comparison in the inner loop.
multiples[nextDispose] = 0;
nextDispose = NO_DISPOSE_SENTINAL_VALUE;
}else
{
if (nextIdx < PRIMES_MAX_SIZE)
{
primes[nextIdx] = i;
multiples[nextIdx] = 0;
}
}
}
}
return 0;
}
This thing will do the old 0 to 0x3FE977 in the blink of an eye. The old version couldn't do the 32-bit maximum on my system. It's at 201 million and counting already. I am super chuffed with the results. Thank you for your advice. I wouldn't have made it this far without help.

How can I make this very small C program faster?

Is there any simple way to make this small program faster? I've made it for an assignment, and it's correct but too slow. The aim of the program is to print the nth pair of primes where the difference between the two is two, given n.
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
bool isPrime(int number) {
for (int i = 3; i <= number/2; i += 2) {
if (!(number%i)) {
return 0;
}
}
return 1;
}
int findNumber(int n) {
int prevPrime, currentNumber = 3;
for (int i = 0; i < n; i++) {
do {
prevPrime = currentNumber;
do {
currentNumber+=2;
} while (!isPrime(currentNumber));
} while (!(currentNumber - 2 == prevPrime));
}
return currentNumber;
}
int main(int argc, char *argv[]) {
int numberin, numberout;
scanf ("%d", &numberin);
numberout = findNumber(numberin);
printf("%d %d\n", numberout - 2, numberout);
return 0;
}
I considered using some kind of array or list that would contain all primes found up until the current number and divide each number by this list instead of all numbers, but we haven't really covered these different data structures yet so I feel I should be able to solve this problem without. I'm just starting with C, but I have some experience in Python and Java.

To find pairs of primes which differ by 2, you only need to find one prime, then add 2 and test whether that is also prime.
if (isPrime(x) && isPrime(x+2)) { /* found pair */ }
To find primes, the usual choice is the Sieve of Eratosthenes. You build a lookup table up to N, where N is the largest number you may need to test. With the sieve you can then tell in O(1) whether a number is prime. While building the sieve you can also build a sorted list of primes.
If N is big, you can also profit from the fact that a number P is prime iff it has no prime factor <= SQRT(P) (because if it has a factor > SQRT(P), it must also have one < SQRT(P)). You can build a Sieve of Eratosthenes of size SQRT(N) to get a list of primes and then test whether any of those primes divides P. If none divides P, P is prime.
With this approach you can test numbers up to 1 billion or so relatively fast and with little memory.
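A minimal sketch of the second idea, a small sieve up to SQRT(N) followed by trial division by the sieved primes, might look like this. The names small_primes and is_prime_with_table are invented for the example, and limit is assumed to be far below UINT_MAX.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Sieve of Eratosthenes: returns a malloc'd array holding every prime <= limit
   and stores how many there are in *count. Returns NULL if allocation fails.
   The caller frees the result. */
unsigned *small_primes(unsigned limit, size_t *count)
{
    *count = 0;
    bool *composite = calloc((size_t)limit + 1, sizeof *composite);
    unsigned *primes = malloc(((size_t)limit + 1) * sizeof *primes);
    if (composite == NULL || primes == NULL) {
        free(composite);
        free(primes);
        return NULL;
    }
    for (unsigned p = 2; p <= limit; p++) {
        if (composite[p])
            continue;
        primes[(*count)++] = p;
        /* Cross out multiples of p, starting at p * p. */
        for (unsigned long long m = (unsigned long long)p * p; m <= limit; m += p)
            composite[m] = true;
    }
    free(composite);
    return primes;
}

/* P is prime iff no prime q with q * q <= P divides it.
   Assumes the table holds every prime up to sqrt(P). */
bool is_prime_with_table(unsigned P, const unsigned *primes, size_t count)
{
    if (P < 2)
        return false;
    for (size_t k = 0; k < count && primes[k] <= P / primes[k]; k++)
        if (P % primes[k] == 0)
            return false;
    return true;
}
```

For numbers up to 1 billion, a table built with small_primes(31623, &n) is enough, since 31623 squared is just above 1,000,000,000.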

Here is an improvement to speed up the loop in isPrime:
bool isPrime(int number) {
for (int i = 3; i * i <= number; i += 2) { // Changed the loop condition
if (!(number%i)) {
return 0;
}
}
return 1;
}

You are calling isPrime more often than necessary. You wrote
currentNumber = 3;
/* ... */
do {
currentNumber+=2;
} while (!isPrime(currentNumber));
...which means that isPrime is called for every odd number. However, when you identified that e.g. 5 is prime, you can already tell that 10, 15, 20 etc. are not going to be prime, so you don't need to test them.
This approach of 'crossing-out' multiples of primes is done when using a sieve filter, see e.g. Sieve of Eratosthenes algorithm in C for an implementation of a sieve filter for primes in C.
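As an illustration of crossing out, here is a sketch that sieves up to a chosen limit and then scans the table for pairs. The function name nth_twin_by_sieve and the limit parameter are assumptions of this example; the original assignment does not say how large n can get, so the caller has to pick a limit that is large enough.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Returns the larger member of the n-th pair of primes differing by 2,
   counting (3, 5) as the first pair and looking only at numbers <= limit.
   Returns 0 if there are fewer than n such pairs, or if allocation fails. */
int nth_twin_by_sieve(int n, int limit)
{
    if (n < 1 || limit < 5)
        return 0;
    bool *crossed = calloc((size_t)limit + 1, sizeof *crossed);
    if (crossed == NULL)
        return 0;

    /* Once p is known to be prime, cross out p*p, p*p + p, p*p + 2p, ... */
    for (int p = 2; (long long)p * p <= limit; p++) {
        if (crossed[p])
            continue;
        for (long long m = (long long)p * p; m <= limit; m += p)
            crossed[m] = true;
    }

    int result = 0;
    for (int a = 3; a <= limit - 2; a += 2) {
        if (!crossed[a] && !crossed[a + 2] && --n == 0) {
            result = a + 2;
            break;
        }
    }
    free(crossed);
    return result;
}
```

No call to isPrime is needed at all here: each lookup in the table is a single array access.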

Avoid testing every 3rd candidate
Pairs of primes a, a+2 can only be found at a = 6*n + 5 (except for the pair 3, 5).
Why?
a + 0 = 6*n + 5 Maybe a prime
a + 2 = 6*n + 7 Maybe a prime
a + 4 = 6*n + 9 Not a prime, as 6*n + 9 is a multiple of 3 greater than 3
So rather than testing every other integer with + 2, test with
a = 5;
loop {
if (isPrime(a) && isPrime(a+2)) PairCount++;
a += 6;
}
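The loop above can be made concrete, for example as follows. This is a sketch under the assumption that (3, 5) counts as the first pair, as in the question's program; is_prime_odd and nth_twin_prime are names chosen for this example, and overflow for very large n is not handled.

```c
#include <stdbool.h>

/* Trial division by odd numbers. Only called with odd x >= 5 below,
   so divisibility by 2 never needs to be checked. */
bool is_prime_odd(unsigned x)
{
    for (unsigned d = 3; d <= x / d; d += 2)
        if (x % d == 0)
            return false;
    return true;
}

/* Larger member of the n-th pair of primes differing by 2.
   Returns 0 for n == 0. */
unsigned nth_twin_prime(unsigned n)
{
    if (n == 0)
        return 0;
    if (n == 1)
        return 5;               /* (3, 5): the one pair not of the form 6n + 5 */
    unsigned found = 1;
    for (unsigned a = 5; ; a += 6) {
        if (is_prime_odd(a) && is_prime_odd(a + 2) && ++found == n)
            return a + 2;
    }
}
```

Stepping by 6 tests only one starting candidate in every three odd numbers.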
Improve loop exit test
Many processors/compilers, when calculating the remainder, will also have available, for nearly "free" CPU time cost, the quotient. YMMV. Use the quotient rather than i <= number/2 or i*i <= number to limit the test loop.
Use of sqrt() has a number of problems: range of double vs. int, exactness, conversion to/from integer. I recommend avoiding sqrt() for this task.
Use unsigned for additional range.
bool isPrime(unsigned x) {
// With OP's selective use, the following line is not needed.
// Yet needed for a general purpose `isPrime()`
if (x%2 == 0) return x == 2;
if (x <= 3) return x == 3;
unsigned p = 1;
unsigned quotient, remainder;
do {
p += 2;
remainder = x%p;
if (remainder == 0) return false;
quotient = x/p; // quotient for "free"
} while (p < quotient); // Low cost compare
return true;
}

My C code doesn't output anything, or end running

I'm trying to answer this question:
The prime factors of 13195 are 5, 7, 13 and 29.
What is the largest prime factor of the number 600851475143 ?
Here is my code:
#include <stdio.h>
int isPrime(long long num)
{
for (long long k = 1; k < num; k++)
{
if (num%k == 0)
return 0;
}
return 1;
}
long long num = 600851475143;
int main(void)
{
long long prime_factor = 0;
for (long long j = 2; j < num; j++)
{
if (num % j == 0 && isPrime(j) == 1)
prime_factor = j;
}
printf("%lli\n", prime_factor);
}
But for some reason it doesn't print anything, or even end. What is causing this?

That's a terribly inefficient way of finding the prime factors of a number.
To find the prime factors, you should:
Create a static list of primes less than or equal to (number / 2).
Iterate over each element in the list of primes and see whether it evenly divides the number.
Divide the original number by the prime, and test the quotient against the same prime again.
Why? Well, the smallest prime is 2. If the number isn't even, then the first prime that can divide any number > 2 is 3.
Every number can be uniquely identified by its prime factors. This is called the Fundamental Theorem of Arithmetic. That means that there is only one way to represent e.g. 15 as a product of primes, in this case { 3, 5 }.
However, any prime can divide a number more than once. For example, 49 is the product of two primes { 7, 7 }. Dividing the original number by one of its factors makes subsequent divisions quicker. And you can logically stop checking when the number == 1.
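The divide-and-retry idea can be sketched without a precomputed list of primes. Note that this is a simplification of the steps above, not the same procedure: it tries every integer d in turn, which still only ever divides by primes, because by the time a composite d is reached its prime factors have already been divided out. The function name largest_prime_factor is made up for the example.

```c
/* Largest prime factor of n by repeated division. Returns 0 for n < 2. */
long long largest_prime_factor(long long n)
{
    if (n < 2)
        return 0;
    long long largest = 0;
    /* d <= n / d is the test d * d <= n, written so it cannot overflow. */
    for (long long d = 2; d <= n / d; d++) {
        while (n % d == 0) {    /* a prime can divide n more than once */
            largest = d;
            n /= d;             /* shrinking n makes later divisions cheaper */
        }
    }
    if (n > 1)
        largest = n;            /* what is left is a prime larger than every d tried */
    return largest;
}
```

For 13195 this returns 29, and for 600851475143 it returns 6857 after only a few thousand loop iterations.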

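It doesn't print anything because it never leaves the loop, and the only line that prints anything comes after the loop. Try a smaller number and see whether it ends. Note also that isPrime starts k at 1, and every number is divisible by 1, so it returns 0 for every input; the loop should start at 2.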

Determine Prime Numbers using SINGLE do-while Loop

I wrote this program per my professor's instruction. Turns out he wanted us to use a SINGLE do-while loop. While I did technically do that... this won't fly. I can't figure out how to do it without using a for-loop or at least another loop of some other type. He said it could use continue or break statements--but that it might not be necessary.
I would appreciate not just re-writing my code--while this is handy, I don't learn from it well.
I appreciate any and all help.
int main() {
int max, x, n = 2; //init variables
//start n at 2 because 1 isn't prime ever
//asks user for max value
printf("Enter max number: ");
scanf("%i", &max);
/*prints prime numbers while the max value
is greater than the number being checked*/
do {
x = 0; //using x as a flag
for (int i = 2; i <= (n / 2); i++) {
if ((n % i) == 0) {
x = 1;
break;
}
}
if (x == 0) //if n is prime, print it!
printf("%i\n", n);
n++; //increase number to check for prime-ness
} while (n < max);
return 0;
}

This is definitely doable. The trick is to have a test variable and, on each iteration of your while loop, check the test variable against your current number. Always start the test variable at 2 (every natural number is divisible by 1).
Cases to consider:
Our current number is divisible by the test variable: the number is NOT prime. Increase the current number and reset the test variable.
Our test variable is greater than the square root of the current number. We have then tried every number up to the square root and none of them divides the current number, so no larger number can divide it either, and the current number has to be prime. Increase the current number and reset the test variable.
Lastly, if neither of the above cases holds, we have to try the next higher number. Increment the test variable.
I have not provided the code, as you asked not to have it rewritten, but I can provide it if you would like.
EDIT
#include <stdio.h>
#include <math.h>
int main(void)
{
int max = 20;
int current = 4;
int checker = 2;
do{
if(checker > sqrt((double)current))
{
checker = 2;
printf("%d is prime\n",current);
current++;
}
else if(current % checker == 0)
{
checker = 2;
printf("%d is NOT prime\n",current);
current++;
}
else
checker++;
}while(current < max);
}
Output:
4 is NOT prime
5 is prime
6 is NOT prime
7 is prime
8 is NOT prime
9 is NOT prime
10 is NOT prime
11 is prime
12 is NOT prime
13 is prime
14 is NOT prime
15 is NOT prime
16 is NOT prime
17 is prime
18 is NOT prime
19 is prime

I won't give you the exact code, but two pointers that should help you:
First, a for loop can be written as a while loop (and, vice versa)
for (int i=0; i< 100; ++i)
...
would become:
int i=0;
while (i < 100)
{
...
++i;
}
Second, two nested loops can become a single one, in any number of ways:
for (int i=0; i< 100; ++i)
for (int j=0; j< 100; ++j)
...
Becomes
for (int z=0; z< 100*100; ++z)
{
int i = z / 100;
int j = z % 100;
...
}
The above shows two for loops, but you can perform similar transforms on other loops.

Think of the sieve of Eratosthenes. In this method we strike composite numbers out of a table, so that in the end only primes remain. For simplicity, the table contains only odd numbers. You start by pointing at 3, which is a prime. Strike out 3*3, 3*5... Finish your run over the table (it's finite), then point at 5. It's not struck out, thus a prime. Strike out 15, 25... Check 7: prime, strike out 21, 35... Check 9: already struck out, move on to 11...
Questions:
You have just checked a number; what is the next number to check?
How do you know you've run out of numbers to check?
Write down the answers to these questions, and you have a one-loop prime-finding algorithm.

Getting one too few divisors

This is a program to count the number of divisors for a number, but it is giving one less divisor than there actually is for that number.
#include <stdio.h>
int i = 20;
int divisor;
int total;
int main()
{
for (divisor = 1; divisor <= i; divisor++)
{
if ((i % divisor == 0) && (i != divisor))
{
total = total++;
}
}
printf("%d %d\n", i, total);
return 0;
}
The number 20 has 6 divisors, but the program says that there are 5 divisors.

&& (i != divisor)
means that 20 won't be considered a divisor. If you want it to be considered, ditch that bit of code, and you'll get the whole set, {1, 2, 4, 5, 10, 20}.
Even if you didn't want the number counted as a divisor, you could still ditch that code and just use < instead of <= in the for statement.
And:
total = total++;
is totally unnecessary. It is in fact undefined behavior in C, because total is modified twice without an intervening sequence point, and nobody writes code like that for long :-)
Use either:
total = total + 1;
or (better):
total++;

Divisor counting is perhaps simpler and certainly faster than any of these. The key fact to note is that if p is a divisor of n, then so is n/p. Whenever p is not the square root of n, then you get TWO divisors per division test, not one.
int divcount(int n)
{
int i, j, count=0;
for (i=1, j=n; i<j; j = n/++i)
{
if (i*j == n)
count += 2;
}
if (i == j && i*j == n)
++count;
return count;
}
That gets the job done with sqrt(n) divisions and sqrt(n) multiplications. I chose that because, while j = n/i and a separate n%i can be done with a single division instruction on most CPUs, I haven't seen compilers pick up on that optimization. Since multiplication is single-clock on modern desktop processors, the i*j == n test is much cheaper than a second division.
PS: If you need a list of divisors, they come up in the loop as i and j values, and perhaps as the i==j==sqrt(n) value at the end, if n is a square.

You have added an extra check && (i != divisor), as explained in the other answer.
Here is the same program written using prime factorisation, which is a quick way to find the number of divisors of a large number.
// Returns the number of divisors of n.
// If n = (p^a) * (q^b) * ... where p, q, ... are the prime factors of n,
// then the number of divisors is d(n) = (a+1) * (b+1) * ...
int divisorcount(int n){
    int divider = 2;
    int limit = n/2;
    int divisorCount = 1;
    int power = 0;
    // loop through divider = 2...n/2
    while(divider<=limit){
        if(n%divider==0){
            // divide n by its smallest remaining divisor (which is
            // always prime) and increase the power of that prime factor.
            power++;
            n/=divider;
        }
        else{
            if(power != 0){
                // use the power of the prime factor to update
                // the divisor count.
                divisorCount*=(power+1);
            }
            power = 0;
            divider++;
            // if n has become 1, the prime factorization is complete.
            if(n==1){
                break;
            }
        }
    }
    // if n is still greater than 1 here, the original n was prime
    // (the loop never reached it) and contributes a factor of (1+1).
    if(n>1){
        divisorCount*=2;
    }
    return divisorCount;
}
