My professor says this isn't an efficient algorithm for checking whether a number is divisible by some number in the range 100,000 to 150,000. I'm having trouble finding a better way. Any help would be appreciated.
unsigned short divisibility_check(unsigned long n) {
    unsigned long i;
    for (i = 100000; i <= 150000; i++) {
        if (n % i == 0) {
            return 0;
        }
    }
    return 1;
}
Let's say you need to find whether a positive integer K is divisible by a number between 100,000 and 150,000, and it is such a rare operation that doing precalculations is just not worth the processor time or memory used.
If K < 100,000, it cannot be divisible by a number between 100,000 and 150,000.
If 100,000 ≤ K ≤ 150,000, it is divisible by itself. It is up to you to decide whether this counts or not.
For a K > 150,000 to be divisible by M, with 100,000 ≤ M ≤ 150,000, K must also be divisible by L = K / M. This is because K = L × M, and all three are positive integers. So, you only need to test the divisibility of K by a set of L, where ⌊ K / 150,000 ⌋ ≤ L ≤ ⌊ K / 100,000 ⌋.
However, that set of Ls becomes larger than the set of possible Ms when K ≥ 15,000,000,000. Then it is again less work to just test K for divisibility against each M, much like OP's code does now.
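Before the complete program further down, here is the core of that quotient inversion in isolation; a minimal sketch with the divisor range hard-coded, not the full solution (which also falls back to the direct loop when the quotient range is the larger one):

#include <stdint.h>

/* Sketch only: test a K larger than 150,000 by looping over the candidate
   quotients L = K / M instead of over the 50,001 possible divisors M. */
static int divisible_via_quotients(uint64_t k)
{
    uint64_t lmin = k / 150000;              /* smallest possible quotient */
    uint64_t lmax = k / 100000;              /* largest possible quotient */
    if (lmin < 2)
        lmin = 2;                            /* L = 1 would mean M = K, which is > 150,000 here */
    for (uint64_t l = lmin; l <= lmax; l++) {
        if (k % l == 0) {
            uint64_t m = k / l;
            if (m >= 100000 && m <= 150000)  /* found a divisor M in range */
                return 1;
        }
    }
    return 0;
}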
When implementing this as a program, the most important thing in practice is, surprisingly, the comments you add. Do not write comments that describe what the code does; write comments that explain the model or algorithm you are trying to implement (say, at the function level), and your intent of what each small block of code should accomplish.
In this particular case, you should probably add a comment to each if clause, explaining your reasoning, much like I did above.
Beginner programmers often omit comments completely. It is unfortunate, because writing good comments is a hard habit to pick up afterwards. It is definitely a good idea to learn to comment your code (as I described above -- the comments that describe what the code does are less than useful; more noise than help), and keep honing your skill on that.
A programmer whose code is maintainable is worth ten geniuses who produce write-only code. This is because all code has bugs, because humans make errors. To be an efficient developer, your code must be maintainable. Otherwise you're forced to rewrite each buggy part from scratch, wasting a lot of time. And, as you can see above, "optimization" at the algorithmic level, i.e. thinking about how to avoid having to do work, yields much better results than trying to optimize your loops or something like that. (You'll find in real life that surprisingly often, optimizing a loop in the proper way removes the loop completely.)
Even in exercises, proper comments may be the difference between "no points, this doesn't work" and "okay, I'll give you partial credit for this one, because you had a typo/off-by-one bug/thinko on line N, but otherwise your solution would have worked".
As bolov did not understand how the above leads to a "naive_with_checks" function, I'll show it implemented here.
For ease of testing, I'll show a complete test program. Supply the range of integers to test, and the range of divisors accepted, as parameters to the program (i.e. thisprogram 1 500000 100000 150000 to duplicate bolov's tests).
#include <stdlib.h>
#include <inttypes.h>
#include <limits.h>
#include <locale.h>
#include <ctype.h>
#include <stdio.h>
#include <errno.h>
int is_divisible(const uint64_t number,
                 const uint64_t minimum_divisor,
                 const uint64_t maximum_divisor)
{
    uint64_t divisor, minimum_result, maximum_result, result;

    if (number < minimum_divisor) {
        return 0;
    }
    if (number <= maximum_divisor) {
        /* Number itself is a valid divisor. */
        return 1;
    }
    minimum_result = number / maximum_divisor;
    if (minimum_result < 2) {
        minimum_result = 2;
    }
    maximum_result = number / minimum_divisor;
    if (maximum_result < minimum_result) {
        maximum_result = minimum_result;
    }
    if (maximum_result - minimum_result > maximum_divisor - minimum_divisor) {
        /* The number is so large that it is the least amount of work
           to check each possible divisor. */
        for (divisor = minimum_divisor; divisor <= maximum_divisor; divisor++) {
            if (number % divisor == 0) {
                return 1;
            }
        }
        return 0;
    } else {
        /* There are fewer possible results than divisors,
           so we check the results instead. */
        for (result = minimum_result; result <= maximum_result; result++) {
            if (number % result == 0) {
                divisor = number / result;
                if (divisor >= minimum_divisor && divisor <= maximum_divisor) {
                    return 1;
                }
            }
        }
        return 0;
    }
}
int parse_u64(const char *s, uint64_t *to)
{
    unsigned long long value;
    const char *end;

    /* Empty strings are not valid. */
    if (s == NULL || *s == '\0')
        return -1;

    /* Parse as unsigned long long. */
    end = s;
    errno = 0;
    value = strtoull(s, (char **)(&end), 0);
    if (errno == ERANGE)
        return -1;
    if (end == s)
        return -1;

    /* Overflow? */
    if (value > UINT64_MAX)
        return -1;

    /* Skip trailing whitespace. */
    while (isspace((unsigned char)(*end)))
        end++;

    /* If the string does not end here, it has garbage in it. */
    if (*end != '\0')
        return -1;

    if (to)
        *to = (uint64_t)value;
    return 0;
}
int main(int argc, char *argv[])
{
    uint64_t kmin, kmax, dmin, dmax, k, count;

    if (argc != 5) {
        fprintf(stderr, "\n");
        fprintf(stderr, "Usage: %s [ -h | --help | help ]\n", argv[0]);
        fprintf(stderr, "       %s MIN MAX MIN_DIVISOR MAX_DIVISOR\n", argv[0]);
        fprintf(stderr, "\n");
        fprintf(stderr, "This program counts which positive integers between MIN and MAX,\n");
        fprintf(stderr, "inclusive, are divisible by MIN_DIVISOR to MAX_DIVISOR, inclusive.\n");
        fprintf(stderr, "\n");
        return EXIT_SUCCESS;
    }

    /* Use current locale. This may change which codes isspace() considers whitespace. */
    if (setlocale(LC_ALL, "") == NULL)
        fprintf(stderr, "Warning: Your C library does not support your current locale.\n");

    if (parse_u64(argv[1], &kmin) || kmin < 1) {
        fprintf(stderr, "%s: Invalid minimum positive integer to test.\n", argv[1]);
        return EXIT_FAILURE;
    }
    if (parse_u64(argv[2], &kmax) || kmax < kmin || kmax >= UINT64_MAX) {
        fprintf(stderr, "%s: Invalid maximum positive integer to test.\n", argv[2]);
        return EXIT_FAILURE;
    }
    if (parse_u64(argv[3], &dmin) || dmin < 2) {
        fprintf(stderr, "%s: Invalid minimum divisor to test for.\n", argv[3]);
        return EXIT_FAILURE;
    }
    if (parse_u64(argv[4], &dmax) || dmax < dmin) {
        fprintf(stderr, "%s: Invalid maximum divisor to test for.\n", argv[4]);
        return EXIT_FAILURE;
    }

    count = 0;
    for (k = kmin; k <= kmax; k++)
        count += is_divisible(k, dmin, dmax);

    printf("%" PRIu64 "\n", count);
    return EXIT_SUCCESS;
}
It is useful to note that the above, running bolov's test (i.e. thisprogram 1 500000 100000 150000), only takes about 15 ms of wall clock time (13 ms CPU time), median, on a much slower Core i5-7200U processor. For really large numbers, like 280,000,000,000 to 280,000,010,000, the test does the maximum amount of work, and takes about 3.5 seconds per 10,000 numbers on this machine.
In other words, I wouldn't trust bolov's numbers to have any relation to timings for properly written test cases.
It is important to note that for any K between 1 and 500,000 (the same range bolov says their code measures), the above code does at most three divisibility tests to find out whether K is divisible by an integer between 100,000 and 150,000.
This solution is therefore quite efficient. It is definitely acceptable and near-optimal, when the tested K are relatively small (say, 32 bit unsigned integers or smaller), or when precomputed tables cannot be used.
Even when precomputed tables can be used, it is unclear if or when prime factorization becomes faster than the direct checks. There is certainly a tradeoff in the size and content of the precomputed tables. bolov claims that the factorization approach is clearly superior to other methods, but hasn't implemented a proper "naive" divisibility test as shown above, and bases that opinion on experiments with quite small integers (1 to 500,000) that have simple prime decompositions.
As an example, a table of integers 1 to 500,000 pre-checked for divisibility takes only 62,500 bytes (43,750 bytes if only 150,000 to 500,000 is covered). With that table, each test takes a small, near-constant time (that only depends on memory and cache effects). Extending it to all 32-bit unsigned integers would require 512 MiB (536,870,912 bytes); the table can be stored in a memory-mapped read-only file, to let the OS kernel manage how much of it is mapped to RAM at any time.
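To make the table idea concrete, here is a minimal sketch (with hypothetical helper names build_table and is_divisible_table, one bit per integer, covering 1 to 500,000 as in the 62,500-byte figure above):

#include <stdint.h>
#include <string.h>
#include <stdio.h>

#define LIMIT 500000UL                    /* highest number covered by the table */

static uint8_t table[LIMIT / 8 + 1];      /* one bit per integer: about 62,500 bytes */

/* Mark every multiple of every divisor in [100000, 150000] up to LIMIT. */
static void build_table(void)
{
    memset(table, 0, sizeof table);
    for (unsigned long m = 100000; m <= 150000; m++)
        for (unsigned long k = m; k <= LIMIT; k += m)
            table[k >> 3] |= (uint8_t)(1u << (k & 7));
}

/* Near-constant-time lookup (ignoring cache effects) for 1 <= n <= LIMIT. */
static int is_divisible_table(unsigned long n)
{
    return (table[n >> 3] >> (n & 7)) & 1;
}

int main(void)
{
    build_table();
    unsigned long count = 0;
    for (unsigned long n = 1; n <= LIMIT; n++)
        count += is_divisible_table(n);
    /* Should print 170836, matching the count reported elsewhere in this thread. */
    printf("%lu\n", count);
    return 0;
}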
Prime decomposition itself, especially using trial division, becomes more expensive than the naive approach when the number of trial divisions exceeds the range of possible divisors (50,000 divisors in this particular case). As there are 13848 primes (if one counts 1 and 2 as primes) between 1 and 150,000, the number of trial divisions can easily approach the number of divisors for sufficiently large input values.
For numbers with many prime factors, the combinatoric phase (finding whether any subset of the prime factors multiplies to a number between 100,000 and 150,000) is even more problematic. The number of possible combinations grows exponentially in the number of prime factors. Without careful checks, this phase alone can do far more work per large input number than trial division by each possible divisor would.
(As an example, if you have 16 different prime factors, you already have 65,535 different combinations; more than the number of direct trial divisions. However, all such numbers are larger than 64-bit; the smallest being 2·3·5·7·11·13·17·19·23·29·31·37·41·43·47·53 = 32,589,158,477,190,044,730 which is a 65-bit number.)
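If you want to verify that 65-bit claim, a quick check using the unsigned __int128 extension available in GCC and Clang (not standard C) is enough:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* The 16 smallest primes. */
    const unsigned p[16] = { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53 };
    unsigned __int128 product = 1;
    for (int i = 0; i < 16; i++)
        product *= p[i];

    /* Print the 20-digit result in two chunks, since printf has no __int128 format. */
    unsigned long long hi = (unsigned long long)(product / 1000000000000000000ULL);
    unsigned long long lo = (unsigned long long)(product % 1000000000000000000ULL);
    printf("product = %llu%018llu\n", hi, lo);          /* 32589158477190044730 */
    printf("fits in 64 bits: %s\n",
           product > UINT64_MAX ? "no" : "yes");        /* no: it needs 65 bits */
    return 0;
}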
There is also the problem of code complexity. The more complex the code, the harder it is to debug and maintain.
Ok, so I've implemented the version with sieved primes and factorization mentioned in the comments by m69, and it is ... way faster than the naive approach. I must admit I didn't expect this at all.
My notation: left = 100'000 and right = 150'000.
naive                  your version
naive_with_checks      your version with simple checks:
                           if (n < left)                                => no divisor
                           else if (n <= right)                         => divisor
                           else if (left * 2 >= right && n < left * 2)  => no divisor
factorization          (above checks implemented)
                       precompute the Sieve of Eratosthenes for all primes up to right
                           (this time is not measured)
                       factorize n (only with the primes from the previous step)
                       generate all subsets (backtracking, depth first: i.e. generate
                           p1^0 * p2^0 * p3^0 first, instead of p1^5 first) with the
                           product < left, or until the product is in [left, right]
                           (found divisor)
factorization_opt      optimization of the previous algorithm where the subsets are not
                           stored (no vector of subsets is created); I just pass the
                           current product from one backtracking iteration to the next
Nominal Animal's version   I also ran his version on my system with the same range.
I have written the program in C++ so I won't share it here.
I used std::uint64_t as the data type and checked all numbers from 1 to 500'000 to see if each is divisible by a number in the interval [100'000, 150'000]. All versions reached the same solution: 170'836 numbers with positive results.
The setup:
Hardware: Intel Core i7-920, 4 cores with HT (all algorithm versions are single threaded), 2.66 GHz (boost 2.93 GHz),
8 MB SmartCache; memory: 6 GB DDR3 triple channel.
Compiler: Visual Studio 2017 (v141), Release x64 mode.
I must also add that I haven't profiled the programs, so there is definitely room to improve the implementations. However, that is enough here, as the idea is to find a better algorithm.
version                 | elapsed time (milliseconds)
------------------------+----------------------------
naive                   | 167'378 ms (yes, that's a thousands separator: about 167 seconds)
naive_with_checks       |  97'197 ms
factorization           |   7'906 ms
factorization_opt       |   7'320 ms
                        |
Nominal Animal version  |      14 ms
Some analysis:
For naive vs naive_with_checks: all the numbers in [1, 200'000] can be resolved with just the simple checks. As these represent 40% of all the numbers checked, the naive_with_checks version does roughly 60% of the work naive does. The execution time reflects this, as the naive_with_checks runtime is ≅ 58% of the naive version's.
The factorization version is a whopping 12.3 times faster than naive_with_checks. That is indeed impressive. I haven't analyzed the time complexity of the algorithm.
And the final optimization brings a further 1.08x speedup. This is basically the time gained by removing the creation and copy of the small vectors of subset factors.
For those interested, the sieve precomputation (which is not included above) takes about 1 ms. And this is the naive implementation from Wikipedia, with no optimizations whatsoever.
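bolov's C++ program is not shown, so purely as an illustration of the factorization + backtracking idea described above, here is a rough single-file C sketch of my own (it factorizes by plain trial division instead of pre-sieved primes, so it is not representative of bolov's timings):

#include <stdio.h>
#include <stdint.h>

#define LEFT  100000ULL
#define RIGHT 150000ULL
#define MAXF  64                 /* more than enough distinct prime factors for 64-bit inputs */

/* Depth-first search over the exponent of each distinct prime factor:
   succeed as soon as the running product lands in [LEFT, RIGHT], and
   stop extending a branch once the product would exceed RIGHT. */
static int dfs(const uint64_t p[], const int e[], int nf, int i, uint64_t prod)
{
    if (prod >= LEFT && prod <= RIGHT)
        return 1;
    if (prod > RIGHT || i == nf)
        return 0;
    uint64_t cur = prod;
    for (int k = 0; k <= e[i]; k++) {        /* use prime i with exponent 0 .. e[i] */
        if (dfs(p, e, nf, i + 1, cur))
            return 1;
        if (cur > RIGHT / p[i])              /* the next multiplication would leave the range for good */
            break;
        cur *= p[i];
    }
    return 0;
}

static int divisible_by_factorization(uint64_t n)
{
    if (n < LEFT)   return 0;
    if (n <= RIGHT) return 1;

    uint64_t p[MAXF];
    int e[MAXF], nf = 0;
    for (uint64_t d = 2; d <= RIGHT && d * d <= n; d++) {   /* trial division; bolov uses sieved primes here */
        if (n % d == 0) {
            p[nf] = d;
            e[nf] = 0;
            while (n % d == 0) { e[nf]++; n /= d; }
            nf++;
        }
    }
    if (n > 1 && n <= RIGHT) {               /* leftover prime factor small enough to matter */
        p[nf] = n;
        e[nf] = 1;
        nf++;
    }
    return dfs(p, e, nf, 0, 1);
}

int main(void)
{
    uint64_t count = 0;
    for (uint64_t k = 1; k <= 500000; k++)
        count += (uint64_t)divisible_by_factorization(k);
    printf("%llu\n", (unsigned long long)count);   /* should print 170836, as reported above */
    return 0;
}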
For comparison, here's what I had in mind when I posted my comment about using prime factorization. Compiled with gcc -std=c99 -O3 -m64 -march=haswell this is slightly faster than the naive method with checks and inversion when tested with the last 10,000 integers in the 64-bit range (3.469 vs 3.624 seconds).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <stdbool.h>
void eratosthenes(bool *ptr, uint64_t size) {
    memset(ptr, true, size);
    for (uint64_t i = 2; i * i < size; i++) {
        if (ptr[i]) {
            for (uint64_t j = i * i; j < size; j += i) {
                ptr[j] = false;
            }
        }
    }
}
bool divisible(uint64_t n, uint64_t a, uint64_t b) {
    /* check for trivial cases first */
    if (n < a) {
        return false;
    }
    if (n <= b) {
        return true;
    }
    if (n < 2 * a) {
        return false;
    }

    /* Inversion: use range n/b ~ n/a; see Nominal Animal's answer */
    if (n < a * b) {
        uint64_t c = a;
        a = (n + b - 1) / b; // n/b rounded up
        b = n / c;
    }

    /* Create prime sieve when first called, or re-calculate it when */
    /* called with a higher value of b; place before inversion in case */
    /* of a large sequential test, to avoid repeated re-calculation. */
    static bool *prime = NULL;
    static uint64_t prime_size = 0;
    if (prime_size <= b) {
        prime_size = b + 1;
        prime = realloc(prime, prime_size * sizeof(bool));
        if (!prime) {
            printf("Out of memory!\n");
            return false;
        }
        eratosthenes(prime, prime_size);
    }

    /* Factorize n into prime factors up to b, using trial division; */
    /* there are more efficient but also more complex ways to do this. */
    /* You could return here, if a factor in the range a~b is found. */
    static uint64_t factor[63];
    uint8_t factors = 0;
    for (uint64_t i = 2; i <= n && i <= b; i++) {
        if (prime[i]) {
            while (n % i == 0) {
                factor[factors++] = i;
                n /= i;
            }
        }
    }

    /* Prepare divisor sieve when first called, or re-allocate it when */
    /* called with a higher value of b; in a higher-level language, you */
    /* would probably use a different data structure for this, because */
    /* this method iterates repeatedly over a potentially sparse array. */
    static bool *divisor = NULL;
    static uint64_t div_size = 0;
    if (div_size <= b / 2) {
        div_size = b / 2 + 1;
        divisor = realloc(divisor, div_size * sizeof(bool));
        if (!divisor) {
            printf("Out of memory!\n");
            return false;
        }
    }
    memset(divisor, false, div_size);
    divisor[1] = true;
    uint64_t max = 1;

    /* Iterate over each prime factor, and for every divisor already in */
    /* the sieve, add the product of the divisor and the factor, up to */
    /* the value b/2. If the product is in the range a~b, return true. */
    for (uint8_t i = 0; i < factors; i++) {
        for (uint64_t j = max; j > 0; j--) {
            if (divisor[j]) {
                uint64_t product = factor[i] * j;
                if (product >= a && product <= b) {
                    return true;
                }
                if (product < div_size) {
                    divisor[product] = true;
                    if (product > max) {
                        max = product;
                    }
                }
            }
        }
    }
    return false;
}
int main() {
    uint64_t count = 0;
    for (uint64_t n = 18446744073709541615LLU; n <= 18446744073709551614LLU; n++) {
        if (divisible(n, 100000, 150000)) ++count;
    }
    printf("%llu", count);
    return 0;
}
And this is the naive + checks + inversion implementation I compared it with:
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
bool divisible(uint64_t n, uint64_t a, uint64_t b) {
    if (n < a) {
        return false;
    }
    if (n <= b) {
        return true;
    }
    if (n < 2 * a) {
        return false;
    }
    if (n < a * b) {
        uint64_t c = a;
        a = (n + b - 1) / b;
        b = n / c;
    }
    while (a <= b) {
        if (n % a++ == 0) return true;
    }
    return false;
}

int main() {
    uint64_t count = 0;
    for (uint64_t n = 18446744073709541615LLU; n <= 18446744073709551614LLU; n++) {
        if (divisible(n, 100000, 150000)) ++count;
    }
    printf("%llu", count);
    return 0;
}
Here's a recursive method with primes. The idea is that if a number is divisible by a number between 100000 and 150000, then starting from the product of its small ("relevant") prime factors and repeatedly dividing by prime powers, some path of reductions passes through a value inside the target range. (Note: the code below is meant for numbers greater than 100000 * 150000.) In my testing, I could not find an instance where the stack performed over 600 iterations.
# Euler sieve
def getPrimes():
    n = 150000
    a = (n + 1) * [None]
    ps = ([], [])
    s = []
    p = 1
    while p < n:
        p = p + 1
        if not a[p]:
            s.append(p)
            # Save primes less than half of 150000, the only
            # ones needed to construct our composite candidates.
            if p < 75000:
                ps[0].append(p)
            # Save primes between 100000 and 150000
            # in case our candidate is prime.
            elif p > 100000:
                ps[1].append(p)
            limit = n // p
            new_s = []
            for i in s:
                j = i
                while j <= limit:
                    new_s.append(j)
                    a[j * p] = True
                    j = j * p
            s = new_s
    return ps
ps1, ps2 = getPrimes()
def f(n):
    # Prime candidate
    for p in ps2:
        if not n % p:
            return True

    # (primes, prime_counts)
    ds = ([], [])
    prod = 1
    # Prepare only the prime factors that could
    # construct a composite candidate.
    for p in ps1:
        while not n % p:
            prod *= p
            if not ds[0] or ds[0][-1] != p:
                ds[0].append(p)
                ds[1].append(1)
            else:
                ds[1][-1] += 1
            n //= p

    # Reduce the product of those primes to
    # a state where it's inside the target range.
    stack = [(prod, 0)]
    while stack:
        prod, i = stack.pop()
        # No point in reducing further
        if prod < 100000:
            continue
        # Exit early
        elif prod <= 150000:
            return True
        # Try reducing the product by different
        # prime powers, one prime at a time
        if i < len(ds[0]):
            for p in range(ds[1][i] + 1):
                stack.append((prod // ds[0][i] ** p, i + 1))
    return False
Output:
c = 0
for ii in range(1099511627776, 1099511628776):
    f_i = f(ii)
    if f_i:
        c += 1
print(c)  # 239
Here is a very simple solution with a sieve cache. If you call the divisibility_check function for many numbers in a sequence, this should be very efficient:
#include <string.h>
int divisibility_check_sieve(unsigned long n) {
    static unsigned long sieve_min = 1, sieve_max;
    static unsigned char sieve[1 << 19];  /* 1/2 megabyte */

    if (n < sieve_min || n > sieve_max) {
        sieve_min = n & ~(sizeof(sieve) - 1);
        sieve_max = sieve_min + sizeof(sieve) - 1;
        memset(sieve, 1, sizeof sieve);
        for (unsigned long m = 100000; m <= 150000; m++) {
            unsigned long i = sieve_min % m;
            if (i != 0)
                i = m - i;
            for (; i < sizeof sieve; i += m) {
                sieve[i] = 0;
            }
        }
    }
    return sieve[n - sieve_min];
}
Here is a comparative benchmark:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
int divisibility_check_naive(unsigned long n) {
    for (unsigned long i = 100000; i <= 150000; i++) {
        if (n % i == 0) {
            return 0;
        }
    }
    return 1;
}

int divisibility_check_small(unsigned long n) {
    unsigned long i, min = n / 150000, max = n / 100000;
    min += (min == 0);
    max += (max == 0);
    if (max - min > 150000 - 100000) {
        for (i = 100000; i <= 150000; i++) {
            if (n % i == 0) {
                return 0;
            }
        }
        return 1;
    } else {
        for (i = min; i <= max; i++) {
            if (n % i == 0) {
                unsigned long div = n / i;
                if (div >= 100000 && div <= 150000)
                    return 0;
            }
        }
        return 1;
    }
}

int divisibility_check_sieve(unsigned long n) {
    static unsigned long sieve_min = 1, sieve_max;
    static unsigned char sieve[1 << 19];  /* 1/2 megabyte */

    if (n < sieve_min || n > sieve_max) {
        sieve_min = n & ~(sizeof(sieve) - 1);
        sieve_max = sieve_min + sizeof(sieve) - 1;
        memset(sieve, 1, sizeof sieve);
        for (unsigned long m = 100000; m <= 150000; m++) {
            unsigned long i = sieve_min % m;
            if (i != 0)
                i = m - i;
            for (; i < sizeof sieve; i += m) {
                sieve[i] = 0;
            }
        }
    }
    return sieve[n - sieve_min];
}
int main(int argc, char *argv[]) {
    unsigned long n, count = 0, lmin, lmax, range[2] = { 1, 500000 };
    int pos = 0, naive = 0, small = 0, sieve = 1;
    clock_t t;
    char *p;

    for (int i = 1; i < argc; i++) {
        n = strtoul(argv[i], &p, 0);
        if (*p == '\0' && pos < 2)
            range[pos++] = n;
        else if (!strcmp(argv[i], "naive"))
            naive = 1;
        else if (!strcmp(argv[i], "small"))
            small = 1;
        else if (!strcmp(argv[i], "sieve"))
            sieve = 1;
        else
            printf("invalid argument: %s\n", argv[i]);
    }
    lmin = range[0];
    lmax = range[1] + 1;

    if (naive) {
        t = clock();
        for (count = 0, n = lmin; n != lmax; n++) {
            count += divisibility_check_naive(n);
        }
        t = clock() - t;
        printf("naive: [%lu..%lu] -> %lu non-divisible numbers, %10.2fms\n",
               lmin, lmax - 1, count, t * 1000.0 / CLOCKS_PER_SEC);
    }
    if (small) {
        t = clock();
        for (count = 0, n = lmin; n != lmax; n++) {
            count += divisibility_check_small(n);
        }
        t = clock() - t;
        printf("small: [%lu..%lu] -> %lu non-divisible numbers, %10.2fms\n",
               lmin, lmax - 1, count, t * 1000.0 / CLOCKS_PER_SEC);
    }
    if (sieve) {
        t = clock();
        for (count = 0, n = lmin; n != lmax; n++) {
            count += divisibility_check_sieve(n);
        }
        t = clock() - t;
        printf("sieve: [%lu..%lu] -> %lu non-divisible numbers, %10.2fms\n",
               lmin, lmax - 1, count, t * 1000.0 / CLOCKS_PER_SEC);
    }
    return 0;
}
Here are some run times:
naive: [1..500000] -> 329164 non-divisible numbers, 158174.52ms
small: [1..500000] -> 329164 non-divisible numbers, 12.62ms
sieve: [1..500000] -> 329164 non-divisible numbers, 1.35ms
sieve: [0..4294967295] -> 3279784841 non-divisible numbers, 8787.23ms
sieve: [10000000000000000000..10000000001000000000] -> 765978176 non-divisible numbers, 2205.36ms
int prime(unsigned long long n){
    unsigned val=1, divisor=7;
    if(n==2 || n==3) return 1;                          //n=2, n=3 (special cases).
    if(n<2 || !(n%2 && n%3)) return 0;                  //if(n<2 || n%2==0 || n%3==0) return 0;
    for(; divisor<=n/divisor; val++, divisor=6*val+1)   //all primes take the form 6*k(+ or -)1, k[1, n).
        if(!(n%divisor && n%(divisor-2))) return 0;     //if(n%divisor==0 || n%(divisor-2)==0) return 0;
    return 1;
}
The code above is something a friend wrote up for checking whether a number is prime. It seems to be using some sort of sieving, but I'm not sure how exactly it works. The code below is my less awesome version. I would use sqrt for my loop, but I saw him doing something else (probably sieving related), so I didn't bother.
int prime( unsigned long long n ){
    unsigned i=5;
    if(n < 4 && n > 0)
        return 1;
    if(n<=0 || !(n%2 || n%3))
        return 0;
    for(;i<n; i+=2)
        if(!(n%i)) return 0;
    return 1;
}
My question is: what exactly is he doing?
Your friend's code is making use of the fact that for N > 3, all prime numbers take the form (6×M±1) for M = 1, 2, ... (so for M = 1, the prime candidates are N = 5 and N = 7, and both of those are primes). Also, every twin prime pair above (3, 5) straddles a multiple of 6, like 5 and 7. This only checks 2 out of every 3 odd numbers, whereas your solution checks 3 out of 3 odd numbers.
Your friend's code is using division to achieve something akin to the square root. That is, the condition divisor <= n / divisor is more or less equivalent to, but slower and safer from overflow than, divisor * divisor <= n. It might be better to use unsigned long long max = sqrt(n); outside the loop. This reduces the amount of checking considerably compared with your proposed solution which searches through many more possible values. The square root check relies on the fact that if N is composite, then for a given pair of factors F and G (such that F×G = N), one of them will be less than or equal to the square root of N and the other will be greater than or equal to the square root of N.
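For illustration, here is roughly what that suggestion looks like with the bound hoisted out of the loop (a sketch only; the corrected divisor <= n/divisor version appears under "Fixed code" below):

#include <math.h>

/* Sketch: same 6k±1 stepping, but with the square-root bound computed once
   outside the loop. Beware that sqrt() on very large 64-bit values can be
   off by one, which is why the divisor <= n / divisor form is the safer one. */
int prime_sqrt_bound(unsigned long long n)
{
    if (n == 2 || n == 3)
        return 1;
    if (n < 2 || n % 2 == 0 || n % 3 == 0)
        return 0;
    unsigned long long max = (unsigned long long)sqrt((double)n);
    for (unsigned long long d = 5; d <= max; d += 6)
        if (n % d == 0 || n % (d + 2) == 0)   /* check the 6k-1 and 6k+1 candidates */
            return 0;
    return 1;
}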
As Michael Burr points out, the friend's prime function identifies 25 (5×5) and 35 (5×7) as prime, and generates 177 numbers under 1000 as prime whereas there are just 168 primes in that range. The other misidentified composites are 121 (11×11), 143 (13×11), 289 (17×17), 323 (17×19), 529 (23×23), 841 (29×29), 899 (29×31).
Test code:
#include <stdio.h>
int main(void)
{
    unsigned long long c;
    if (prime(2ULL))
        printf("2\n");
    if (prime(3ULL))
        printf("3\n");
    for (c = 5; c < 1000; c += 2)
        if (prime(c))
            printf("%llu\n", c);
    return 0;
}
Fixed code.
The trouble with the original code is that it stops checking too soon because divisor is set to the larger, rather than the smaller, of the two numbers to be checked.
static int prime(unsigned long long n)
{
    unsigned long long val = 1;
    unsigned long long divisor = 5;

    if (n == 2 || n == 3)
        return 1;
    if (n < 2 || n%2 == 0 || n%3 == 0)
        return 0;
    for ( ; divisor <= n/divisor; val++, divisor = 6*val - 1)
    {
        if (n%divisor == 0 || n%(divisor+2) == 0)
            return 0;
    }
    return 1;
}
Note that the revision is simpler to understand because it doesn't need to explain the shorthand negated conditions in tail comments. Note also the +2 instead of -2 in the body of the loop.
He's checking against the basis 6k+1 / 6k-1, as all primes greater than 3 can be expressed in that form (and all integers can be expressed in the form 6k+n where -1 <= n <= 4). So yes, it is a form of sieving... but not in the strict sense.
For more:
http://en.wikipedia.org/wiki/Primality_test
In case the 6k±1 portion is confusing, note that you can factor most of the forms 6k+n: some are obviously composite, and the rest need to be tested.
Consider numbers:
6k + 0 -> composite
6k + 1 -> not obviously composite
6k + 2 -> 2(3k+1) --> composite
6k + 3 -> 3(2k+1) --> composite
6k + 4 -> 2(3k+2) --> composite
6k + 5 -> not obviously composite
I've not seen this little trick before, so it's neat, but of limited utility since a sieve of Eratosthenes is more efficient for finding many small prime numbers, and larger prime numbers benefit from faster, more intelligent tests.
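For completeness, here is a minimal Sieve of Eratosthenes along those lines (a sketch added for illustration, not part of the discussion above):

#include <stdio.h>
#include <string.h>

#define N 1000

int main(void)
{
    /* is_prime[i] starts as 1 ("maybe prime") and is cleared for every composite. */
    unsigned char is_prime[N + 1];
    memset(is_prime, 1, sizeof is_prime);
    is_prime[0] = is_prime[1] = 0;

    for (int i = 2; i * i <= N; i++)
        if (is_prime[i])
            for (int j = i * i; j <= N; j += i)   /* strike out multiples of each prime */
                is_prime[j] = 0;

    int count = 0;
    for (int i = 2; i <= N; i++)
        if (is_prime[i])
            count++;
    printf("%d primes below %d\n", count, N);     /* prints 168, the count mentioned above */
    return 0;
}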
#include<stdio.h>
int main()
{
    int i, j;
    printf("enter the value :");
    scanf("%d", &i);
    for (j = 2; j < i; j++)
    {
        if (i%2==0 || i%j==0)
        {
            printf("%d is not a prime number", i);
            return 0;
        }
        else
        {
            if (j == i-1)
            {
                printf("%d is a prime number", i);
            }
            else
            {
                continue;
            }
        }
    }
}
#include<stdio.h>
int main()
{
    int n, i = 3, count, c;

    printf("Enter the number of prime numbers required\n");
    scanf("%d", &n);
    if (n >= 1)
    {
        printf("First %d prime numbers are :\n", n);
        printf("2\n");
    }
    for (count = 2; count <= n; )
    {
        for (c = 2; c <= i - 1; c++)
        {
            if (i%c == 0)
                break;
        }
        if (c == i)
        {
            printf("%d\n", i);
            count++;
        }
        i++;
    }
    return 0;
}
I am trying to generate all the prime factors of a number n. When I give it the number 126 it gives me 2, 3 and 7 but when I give it say 8 it gives me 2, 4 and 8. Any ideas as to what I am doing wrong?
int findPrime(unsigned long n)
{
    int testDivisor, i;
    i = 0;
    testDivisor = 2;
    while (testDivisor < n + 1)
    {
        if ((testDivisor * testDivisor) > n)
        {
            //If the test divisor squared is greater than the current n, then
            //the current n is either 1 or prime. Save it if prime and return
        }
        if (((n % testDivisor) == 0))
        {
            prime[i] = testDivisor;
            if (DEBUG == 1) printf("prime[%d] = %d\n", i, prime[i]);
            i++;
            n = n / testDivisor;
        }
        testDivisor++;
    }
    return i;
}
You are incrementing testDivisor even when you were able to divide n by it. Only increase it when it is not divisible anymore. This will result in 2,2,2, so you have to modify it a bit further so you do not store duplicates, but since this is a homework assignment I think you should figure that one out yourself :)
Is this based on an algorithm your professor told you to implement or is it your own heuristic? In case it helps, some known algorithms for prime factorization are the Quadratic Sieve and the General Number Field Sieve.
Right now, you aren't checking if any divisors you find are prime. As long as n % testDivisor == 0 you are counting testDivisor as a prime factor. Also, you are only dividing through by testDivisor once. You could fix this a number of ways, one of which would be to replace the statement if (((n % testDivisor) == 0)) with while (((n % testDivisor) == 0)).
Fixing this by adding the while loop also ensures that you won't get composite numbers as divisors: if a composite still divided n at that point, its smaller prime factors would already have been divided out completely by the earlier while loops, which is impossible. A sketch of the resulting loop follows.
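Here is one possible shape of that fix (recording each distinct prime factor once, as the expected output for 126 suggests; this is a sketch, not the only way to do it):

/* Store each distinct prime factor of n once (2, 3, 7 for 126; just 2 for 8).
   Dividing n down as factors are found keeps composite test divisors from
   ever dividing the remaining n. */
int findPrimeFixed(unsigned long n, unsigned long prime[])
{
    int i = 0;
    for (unsigned long d = 2; d * d <= n; d++) {
        if (n % d == 0) {
            prime[i++] = d;            /* record the prime factor once ... */
            while (n % d == 0)         /* ... but divide it out completely */
                n /= d;
        }
    }
    if (n > 1)                         /* whatever is left is itself prime */
        prime[i++] = n;
    return i;
}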
Here is code to find the Prime Factor:
long GetPrimeFactors(long num, long *arrResult)
{
    long count = 0;
    long arr[MAX_SIZE];
    long i = 0;
    long idx = 0;

    for (i = 2; i <= num; i++)
    {
        if (IsPrimeNumber(i) == true)
            arr[count++] = i;
    }
    while (1)
    {
        if (IsPrimeNumber(num) == true)
        {
            arrResult[idx++] = num;
            break;
        }
        for (i = count - 1; i >= 0; i--)
        {
            if ((num % arr[i]) == 0)
            {
                arrResult[idx++] = arr[i];
                num = num / arr[i];
                break;
            }
        }
    }
    return idx;
}
Reference: http://www.softwareandfinance.com/Turbo_C/Prime_Factor.html
You can use the quadratic sieve algorithm, which factors 170-bit integers in about a second and 220-bit integers in about a minute. There is a pure C implementation here that does not require GMP or any external library: https://github.com/michel-leonard/C-Quadratic-Sieve. It is able to provide you with a list of the prime factors of N.
I am accepting a composite number as input. I want to print all its factors and also the largest prime factor of that number. I have written the following code. It works perfectly OK up to the number 51, but if any number greater than 51 is entered, wrong output is shown. How can I correct my code?
#include<stdio.h>
void main()
{
    int i, j, b=2, c;
    printf("\nEnter a composite number: ");
    scanf("%d", &c);
    printf("Factors: ");
    for(i=1; i<=c/2; i++)
    {
        if(c%i==0)
        {
            printf("%d ", i);
            for(j=1; j<=i; j++)
            {
                if(i%j > 0)
                {
                    b = i;
                }
                if(b%3==0)
                    b = 3;
                else if(b%2==0)
                    b = 2;
                else if(b%5==0)
                    b = 5;
            }
        }
    }
    printf("%d\nLargest prime factor: %d\n", c, b);
}
This is a bit of a spoiler, so if you want to solve this yourself, don't read this yet :). I'll try to provide hints in order of succession, so you can read each hint in order, and if you need more hints, move to the next hint, etc.
Hint #1:
If divisor is a divisor of n, then n / divisor is also a divisor of n. For example, 100 / 2 = 50 with remainder 0, so 2 is a divisor of 100. But this also means that 50 is a divisor of 100.
Hint #2
Given Hint #1, what this means is that we can loop from i = 2 to i*i <= n when checking for prime factors. For example, if we are checking the number 100, then we only have to loop to 10 (10*10 is <= 100) because by using hint #1, we will get all the factors. That is:
100 / 2 = 50, so 2 and 50 are factors
100 / 5 = 20, so 5 and 20 are factors
100 / 10 = 10, so 10 is a factor
Hint #3
Since we only care about prime factors for n, it's sufficient to just find the first factor of n, call it divisor, and then we can recursively find the other factors for n / divisor. We can use a sieve approach and mark off the factors as we find them.
Hint #4
Sample solution in C:
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

bool factors[100000];

void getprimefactors(int n) {
    // 0 and 1 are not prime
    if (n == 0 || n == 1) return;

    // find smallest number >= 2 that is a divisor of n (it will be a prime number)
    int divisor = 0;
    for (int i = 2; i*i <= n; ++i) {
        if (n % i == 0) {
            divisor = i;
            break;
        }
    }
    if (divisor == 0) {
        // we didn't find a divisor, so n is prime
        factors[n] = true;
        return;
    }
    // we found a divisor
    factors[divisor] = true;
    getprimefactors(n / divisor);
}

int main() {
    memset(factors, false, sizeof factors);
    int f = 1234;
    getprimefactors(f);
    int largest = 0;
    printf("prime factors for %d:\n", f);
    for (int i = 2; i <= f/2; ++i) {
        if (factors[i]) {
            printf("%d\n", i);
            largest = i;
        }
    }
    printf("largest prime factor is %d\n", largest);
    return 0;
}
Output:
prime factors for 1234:
2
617
largest prime factor is 617
I presume you're doing this to learn, so I hope you don't mind a hint.
I'd start by stepping through your algorithm on a number that fails. Does this show you where the error is?
You need to rework the code so that it finds all the prime factors of a given number, instead of only handling the primes 2, 3, and 5. In other words, your code only works when the number being tested is prime or is a multiple of 2, 3, or 5. But 7, 11, 13, 17, 19, and so on are also prime numbers, so your code should find all factors of the number and return the largest factor that is not further divisible.
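As a sketch of that approach (trial division, dividing each factor out completely; not a drop-in replacement for the code in the question):

#include <stdio.h>

/* Returns the largest prime factor of n (assumes n > 1). Dividing each factor
   out completely means every recorded factor is prime, and the last one kept,
   or the leftover part, is the largest. */
unsigned long largest_prime_factor(unsigned long n)
{
    unsigned long largest = 1;
    for (unsigned long d = 2; d * d <= n; d++) {
        while (n % d == 0) {
            largest = d;
            n /= d;
        }
    }
    if (n > 1)           /* leftover part is a prime bigger than any factor found so far */
        largest = n;
    return largest;
}

int main(void)
{
    printf("%lu\n", largest_prime_factor(52));   /* 52 = 2 * 2 * 13 -> prints 13 */
    return 0;
}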
Really, the approach in the question is very slow for all but the smallest numbers (below, say, 100,000). Try finding just the prime factors of the number:
#include <cmath>
#include <cstdio>

void addfactor(int n) {
    printf("%d\n", n);
}

int main()
{
    int d;
    int s;
    int c = 1234567;
    while (!(c & 1)) {
        addfactor(2);
        c >>= 1;
    }
    while (c % 3 == 0) {
        addfactor(3);
        c /= 3;
    }
    s = (int)sqrt(c + 0.5);
    for (d = 5; d <= s;) {
        while (c % d == 0) {
            addfactor(d);
            c /= d;
            s = (int)sqrt(c + 0.5);
        }
        d += 2;
        while (c % d == 0) {
            addfactor(d);
            c /= d;
            s = (int)sqrt(c + 0.5);
        }
        d += 4;
    }
    if (c > 1)
        addfactor(c);
    return 0;
}
where addfactor is a function or macro that adds the factor to a list of prime factors. Once you have these, you can construct the list of all the factors of the number.
This is dramatically faster than the other code snippets posted here. For a random input like 10597959011, my code would take something like 2000 bit operations plus 1000 more to re-constitute the divisors, while the others would take billions of operations. It's the difference between 'instant' and a minute in that case.
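For the last step mentioned above (building the full list of divisors from the prime factors), here is one possible sketch, with the factorization of 1234567 = 127 * 9721 hard-coded purely as an example:

#include <stdio.h>

int main(void)
{
    /* (prime, exponent) pairs, here hard-coded for 1234567 = 127 * 9721;
       in practice they would come from a factorization like the one above. */
    unsigned long primes[]    = { 127, 9721 };
    unsigned int  exponents[] = { 1, 1 };
    int nfactors = 2;

    /* Every divisor is a product of each prime raised to 0 .. exponent. */
    unsigned long divisors[64] = { 1 };
    int ndivisors = 1;

    for (int i = 0; i < nfactors; i++) {
        int old = ndivisors;
        unsigned long power = 1;
        for (unsigned int e = 1; e <= exponents[i]; e++) {
            power *= primes[i];
            for (int j = 0; j < old; j++)
                divisors[ndivisors++] = divisors[j] * power;
        }
    }
    for (int i = 0; i < ndivisors; i++)
        printf("%lu\n", divisors[i]);   /* 1, 127, 9721, 1234567 (unsorted in general) */
    return 0;
}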
Simplification of dcp's answer (in an iterative way):
#include <stdio.h>
#include <stdlib.h>

void factorize_and_print(unsigned long number) {
    unsigned long factor;
    for (factor = 2; number > 1; factor++) {
        while (number % factor == 0) {
            number = number / factor;
            printf("%lu\n", factor);
        }
    }
}

/* example main */
int main(int argc, char **argv) {
    if (argc >= 2) {
        unsigned long number = strtoul(argv[1], NULL, 10);
        factorize_and_print(number);
    } else {
        printf("Usage: %s <number>, where <number> is an unsigned long\n", argv[0]);
    }
    return 0;
}
Note: the originally posted code parsed the argument with atol into a signed long, without including <stdlib.h>, and did not read the number from argv correctly; the version above parses it with strtoul instead.