exceeding 500000 with the method of Eratosthenes - c

I have a problem which I can't solve.
I want to find all prime numbers below a given limit x: the program should let me enter x, calculate the primes using the sieve of Eratosthenes, display the result on the screen and save it to a text file.
Calculating the prime numbers below x, printing them and saving them to a text file all work; the only problem is that x can't exceed 500000.
Could you help me?
#include <stdio.h>
#include <math.h>
void sieve(long x, int primes[]);
main()
{
long i;
long x=500000;
int v[x];
printf("give a x\n");
scanf("%d",&x);
FILE *fp;
fp = fopen("primes.txt", "w");
sieve(x, v);
for (i=0;i<x;i++)
{
if (v[i] == 1)
{
printf("\n%d",i);
fprintf(fp, "%d\n",i);
}
}
fclose(fp);
}
void sieve(long x, int primes[])
{
int i;
int j;
for (i=0;i<x;i++)
{
primes[i]=1; // we initialize the sieve list to all 1's (True)
primes[0]=0,primes[1]=0; // Set the first two numbers (0 and 1) to 0 (False)
}
for (i=2;i<sqrt(x);i++) // loop through all the numbers up to the sqrt(n)
{
for (j=i*i;j<x;j+=i) // mark off each factor of i by setting it to 0 (False)
{
primes[j] = 0;
}
}
}

You will be able to handle four times as many values in the same amount of memory by declaring v as an array of char instead of int (on most platforms an int takes 4 bytes and a char takes 1).
You can handle eight times more again by declaring v as an array of unsigned char and using only a single bit for each number. This makes the code a bit more complicated.
You can handle twice as many values by having a sieve for odd numbers only. Since 2 is the only even prime number, there is no point keeping the even numbers in the sieve.
Since memory for local variables in a function is often quite limited, you can handle many more values by using a static array.
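The bit-per-number and odd-only ideas above can be combined. The following is only a minimal sketch under my own assumptions: the function name count_primes_below is hypothetical (it is not from the question), it counts the primes instead of printing them, and it allocates the sieve on the heap with calloc rather than using a static array.

```c
#include <stdlib.h>

/* Hypothetical helper: count the primes strictly below limit using one bit
 * per odd number. Bit k of the sieve stands for the odd number 2*k + 1.
 * Returns -1 if the allocation fails. */
long count_primes_below(long limit)
{
    if (limit <= 2)
        return 0;                       /* there are no primes below 2 */

    long nbits = limit / 2;             /* how many odd numbers lie below limit */
    unsigned char *bits = calloc((size_t)(nbits + 7) / 8, 1); /* 0 = candidate */
    if (bits == NULL)
        return -1;

    long count = 1;                     /* account for the prime 2 */
    for (long k = 1; k < nbits; k++) {  /* k = 0 is the number 1, not a prime */
        if (bits[k >> 3] & (1u << (k & 7)))
            continue;                   /* already marked composite */
        count++;
        long p = 2 * k + 1;
        if (p <= (limit - 1) / p) {     /* p*p < limit, tested without overflow */
            /* odd multiples of p are 2*p apart, i.e. p apart in index space */
            for (long j = (p * p - 1) / 2; j < nbits; j += p)
                bits[j >> 3] |= (unsigned char)(1u << (j & 7));
        }
    }
    free(bits);
    return count;
}
```

With this layout a limit of n needs roughly n/16 bytes, so count_primes_below(500000) uses about 31 KB where the original int array needed about 2 MB.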

Allocating v as an array of int is wasteful, and making it a local array is risky, stack space being limited. If the array becomes large enough to exceed available stack space, the program will invoke undefined behaviour and likely crash.
While there are ways to improve the efficiency of the sieve by changing the sieve array to an array of bits containing only odd numbers or fewer numbers (keeping only 6n-1 and 6n+1 is a good trick), you can still improve the efficiency of your simplistic approach by a factor of 10 with easy changes:
set primes[0] and primes[1] once, outside the loop,
clear the even entries of primes (except 2) up front and only scan odd numbers,
use integer arithmetic for the outer loop limit,
skip numbers that are already known to be composite,
only cross off odd multiples of i.
Here is an improved version:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
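void sieve(long x, unsigned char primes[]) {
long i, j;
/* mark odd numbers as candidates and even numbers as composite */
for (i = 0; i < x; i++) {
primes[i] = i & 1;
}
/* 1 is not prime and 2 is; guard the writes so that small x stays in bounds */
if (x > 1)
primes[1] = 0;
if (x > 2)
primes[2] = 1;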
/* loop through all odd numbers up to the sqrt(x) */
for (i = 3; (j = i * i) < x; i += 2) {
/* skip composite numbers */
if (primes[i] == 0)
continue;
/* mark each odd multiple of i as composite */
for (; j < x; j += i + i) {
primes[j] = 0;
}
}
}
int main(int argc, char *argv[]) {
long i, x, count;
int do_count = 0;
unsigned char *v;
if (argc > 1) {
x = strtol(argv[1], NULL, 0);
} else {
printf("enter x: ");
if (scanf("%ld", &x) != 1)
return 1;
}
if (x < 0) {
x = -x;
do_count = 1;
}
v = malloc(x);
if (v == NULL) {
printf("Not enough memory\n");
return 1;
}
sieve(x, v);
if (do_count) {
for (count = i = 0; i < x; i++) {
count += v[i];
}
printf("%ld\n", count);
} else {
for (i = 0; i < x; i++) {
if (v[i] == 1) {
printf("%ld\n", i);
}
}
}
free(v);
return 0;
}

I believe the problem you are having is allocating an array of int of more than 500000 elements on the stack. Using an array where the index is the number and the value indicates whether it is prime is not a memory-efficient approach. If you want to do this, at least use bool rather than int, as a bool is typically only 1 byte instead of 4.
Also notice this
for (i=0;i<x;i++)
{
primes[i]=1; // we initialize the sieve list to all 1's (True)
primes[0]=0,primes[1]=0; // Set the first two numbers (0 and 1) to 0 (False)
}
You are reassigning the first two elements on every iteration of the loop. Move that line out of the loop.

You are initializing x to be 500000, then creating an array with x elements, thus it will have 500000 elements. You are then reading in x. The array will not change size when the value of x changes: it is fixed at 500000 elements, the value of x when you created the array. You want something like this (with #include <stdlib.h> for malloc):
long x;
printf("give a x\n");
scanf("%ld",&x);
int *v = malloc(x * sizeof *v);
This fixes your fixed size array issue, and also gets it off the stack and into the heap, which will allow you to allocate more space. It should work up to the limit of the memory you have available. Check that v is not NULL before using it, and call free(v) when you are done.

Related

How to Improve this piece of C Code- Code Wars question on Prime Gaps

I'm currently learning C and have been practicing on Codewars recently. I came across this question on prime gaps and was curious about how to improve my solution. I was initially fooled into thinking this wouldn't be too bad, but I realized that finding primes is difficult (especially for large numbers, where I believe it can be at least an NP-hard problem). I know my code right now has multiple nested for-loops, which is terrible in terms of performance. I also don't fully know the clean ways of writing C, so there might be some no-nos in what I did (e.g. I know it's my responsibility to free dynamically allocated memory, but I freed it in the calling function main() by passing the address of the first element of the allocated block, and I'm not sure if this is the appropriate way of freeing a block of memory).
In general, the main function calls the gap function several times. I know this code works because it was submitted successfully, but are there any tips on writing this better (algorithmically, in C)?
/* A prime gap of length "n" indicates that n-1 consecutive composite numbers exist between two primes.
* For example, the gap between (2,3) is 1, the gap between (5,7) is 2 and the gap between (7,11) is 4.
* Our function should return the first pair of primes that satisfies the gap that we're looking for in a search between two numbers.
* There should also be no primes that exist within the gap of the first two primes that are found.
* gap(g, n, m) -> where g = gap length, n = start of search space, m = end of search space
*/
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <math.h>
long long *gap(int g, int n, int m);
bool check_prime(int, bool);
int main(int argc, const char *argv[]){
long long *check3 = gap(2,100,110);
for (int i = 0; i < 2; i++){
printf("%lld ", check3[i]);
}
free(&check3[0]);
printf("\n");
long long *check = gap(2,3,50);
for (int i = 0; i< 2; i++){
printf("%lld ", check[i]);
}
printf("\n");
free(&check[0]);
long long *check1 = gap(2,5,5);
for (int i = 0; i < 2; i++){
printf("%lld ", check1[i]);
}
free(&check1[0]);
printf("\n");
long long *check2 = gap(4,130,200);
for (int i = 0; i < 2; i++){
printf("%lld ", check2[i]);
}
free(&check2[0]);
printf("\n");
long long *check4 = gap(6,100,110);
for (int i = 0; i < 2; i++){
printf("%lld ", check4[i]);
}
free(&check4[0]);
printf("\n");
return 0;
}
long long *gap(int g, int n, int m) {
long long *result = (long long*) malloc(sizeof(long long) *2); // dynamically allocate 2 long longs for the integer array
if (result == NULL){
perror("Not enough memory");
}
int test = 0;
static bool prime;
for (int i = n; i < m; i++) { // traverse search space
prime = true;
prime = check_prime(i, prime);
if (prime == true) { // identifies prime number
test = i + g; // add the gap value to identified prime
prime = false; // set bool to false to now check for any primes that exist between i and i+gap
for (int z = i+1; z < test; z++ ) { // check there is no prime in between the first and second (test) primes
prime = check_prime(z, prime);
if (prime == true) break;
}
if (prime != true) { // found no primes between i and i+gap
prime = true; // set bool to true to then toggle off in the check right below if i+gap is not actually prime
prime = check_prime(test, prime); // now need to check whether i+gap itself is a prime
if (prime == true) {
result[0] = i; result[1] = test;
return result;
}
}
}
}
result[0] = result[1] = 0;
return result;
}
bool check_prime(int i, bool prime){
for (int j = 2; j <= sqrt(i); j++){
if (i % j == 0) {
return false;
}
}
return true;
}
Reading your code, the following comments come to mind:
free(&check[0]) does release the block, since &check[0] is the same pointer that malloc returned, but free(check) is the usual way to write it
since gap always returns exactly two values, I wonder whether you need malloc at all; a simple global variable (or an array supplied by the caller) would have been sufficient for what you are doing with it
your check_prime function has a second parameter, prime, that is never used
in function gap, the variable prime is declared static; this is not required and could also lead to errors
from the algorithmic point of view:
your logic goes like
for i in range to check:
if i is prime
check if all the number between i and i+gap are not prime
if i+gap is prime then return the tuple(i, i+gap)
globally, you are checking whether the same number is prime several times; since this is by far the most "expensive" operation, you should try to avoid it
specifically, you should start by checking test before iterating over all the numbers in the range i..test.
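One way to avoid testing the same number repeatedly goes a step further than checking test first: walk the range once and remember the previous prime, so that each number is tested exactly once. This is only a sketch under my own assumptions: the names is_prime and find_gap are hypothetical, the result is written into an array supplied by the caller instead of a malloc'd block, and both primes are required to lie in [n, m].

```c
#include <stdbool.h>

/* Trial division by odd numbers up to the square root of n. */
static bool is_prime(long long n)
{
    if (n < 2)
        return false;
    if (n % 2 == 0)
        return n == 2;
    for (long long d = 3; d <= n / d; d += 2)   /* d <= n / d means d*d <= n */
        if (n % d == 0)
            return false;
    return true;
}

/* Hypothetical replacement for gap(): finds the first pair of consecutive
 * primes in [n, m] that are exactly g apart. On success it stores the pair
 * in out[0], out[1] and returns true; otherwise out is zeroed. */
bool find_gap(int g, long long n, long long m, long long out[2])
{
    long long prev = 0;                 /* last prime seen, 0 = none yet */
    out[0] = out[1] = 0;
    for (long long p = n; p <= m; p++) {
        if (!is_prime(p))
            continue;
        if (prev != 0 && p - prev == g) {
            out[0] = prev;
            out[1] = p;
            return true;
        }
        prev = p;                       /* consecutive by construction */
    }
    return false;
}
```

Because prev always holds the most recent prime, there can be no prime between the two values returned, so the inner "check the numbers in between" loop disappears. The caller declares long long pair[2]; and passes it in, which also removes the need for free.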

My code functions with a printf statement, but not without it

Adding the printf("Hi!\n") statement allows the code to work. It also works if the initial bound is improper and the user enters a new one. When I ran some tests, calculateDivsors sometimes returned a character instead of an integer. I'm thinking it has something to do with my memory allocation. I also noticed that ./a.out 6 10 "|" would work but ./a.out 6 25 "|" would not, causing an infinite loop when printing the lines of "|".
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
// Structs
typedef struct data_struct {
int lineNumber;
int divisorSum;
char type[10];
}data;
// Prototypes
int calculateDivsors(int integer);
// Functions
int main (int argc, char *argv[]) {
int lowerBound;
int upperBound;
char character;
// Gets the values from command-line
sscanf(argv[1], "%d", &lowerBound);
sscanf(argv[2], "%d", &upperBound);
sscanf(argv[3], "%c", &character);
// Check to see if bound is proper
while (upperBound <= lowerBound || lowerBound < 2) {
printf("Error, please enter a new range (positive increasing).\n");
scanf("%d %d", &lowerBound, &upperBound);
}
// Structure calls
data* info = NULL;
int totalData = upperBound - lowerBound;
// Allocate the memory
info = (data*)malloc(totalData * sizeof(data));
printf("Hi!\n");
if (info != NULL) {
// Iterate through all the digits between the two bounds
for (int i = lowerBound; i <= upperBound; i++) {
int sum = calculateDivsors(i);
// Write data to indiviual structures
info[i].lineNumber = i;
info[i].divisorSum = sum;
// Check to see if the sum is greater than, less than, or equal to the original
if (sum == i) {
strcpy(info[i].type, "Perfect");
}
else if (sum > i) {
strcpy(info[i].type, "Abundant");
}
else if (sum < i) {
strcpy(info[i].type, "Deficient");
}
// Line n# has a column width of 4, string of 10
printf("%4d is %-10s\t", info[i].lineNumber, info[i].type);
// Generate Pictogram
for (int j = 0; j < info[i].divisorSum; j++) {
printf("%c", character);
}
printf("\n");
}
}
}
// Adds up the sum of diviors
int calculateDivsors(int integer) {
int sum = 0;
for (int i = 1; i < integer; i++) {
// Add to sum if perfectly i is a sum of integer
if (integer % i == 0) {
sum += i;
}
}
return sum; // Returns the sum of diviors
}
You are accessing data outside its allocated buffer whenever lowerBound is not 0.
info[i].lineNumber = i;
Ideally, this should become...
info[i - lowerBound].lineNumber = i;
so that the indexing starts at 0 (the same applies to every other info[i] access in the loop). Further, your window between lowerBound and upperBound is inclusive: it includes both ending boundaries. Therefore, totalData is undersized by one element. Even if you fix the indexing problem, your code will still be wrong without this:
int totalData = (upperBound - lowerBound) + 1;
Failing to do both of the above causes your code to invoke undefined behavior (UB), and thus unpredictable results thereafter. It may even appear to work. That, however, is a red herring when your code has UB. Don't confuse defined behavior with observed behavior. You can trust the latter only once you have the former; the two are not synonymous.
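Both fixes together might look like the sketch below. The helper names classify_range and divisor_sum are hypothetical (they are not in the question), the struct is repeated so the block stands alone, and the printing is left to the caller.

```c
#include <stdlib.h>
#include <string.h>

typedef struct data_struct {
    int lineNumber;
    int divisorSum;
    char type[10];                      /* "Deficient" plus the terminator */
} data;

/* Sum of the proper divisors of n. */
static int divisor_sum(int n)
{
    int sum = 0;
    for (int i = 1; i < n; i++)
        if (n % i == 0)
            sum += i;
    return sum;
}

/* Hypothetical helper: one record per integer in [lower, upper], inclusive.
 * Returns a malloc'd array (caller frees) and stores its length in *count,
 * or returns NULL on a bad range or allocation failure. */
data *classify_range(int lower, int upper, size_t *count)
{
    if (lower < 1 || upper < lower)
        return NULL;
    size_t n = (size_t)(upper - lower) + 1;     /* inclusive range: +1 */
    data *info = malloc(n * sizeof *info);
    if (info == NULL)
        return NULL;
    for (int i = lower; i <= upper; i++) {
        data *d = &info[i - lower];             /* index starts at 0 */
        d->lineNumber = i;
        d->divisorSum = divisor_sum(i);
        strcpy(d->type, d->divisorSum == i ? "Perfect"
                      : d->divisorSum > i  ? "Abundant" : "Deficient");
    }
    *count = n;
    return info;
}
```

Taking a pointer to info[i - lower] once per iteration keeps the offset in a single place, so it cannot be forgotten on one of the later accesses.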

Prime Finder in C

My prime finder is based on the fact that to check if a number is prime, we only need to check the prime numbers up to its square root. So, to find every prime number between 0 and x, knowing all the prime numbers between 0 and x's square root will allow us to compute things very quickly. We find this initial list of primes using the brute force method, then we pass this list into the quick prime finder.
This code compiles and works correctly, but for some reason I'm getting segmentation fault 11 when I try an upper bound of 5 million or more. It seems to be "all good" until I try to make the "finalPrimes" list. Any thoughts as to why this might be, or general feedback, are greatly appreciated.
PS: I'm new to C, so general commentary on my design is appreciated as well.
#include<stdio.h>
#include<math.h>
#include<stdbool.h>
void fill_array_with_primes_brute(int *array, int upper);
void fill_array_with_primes_quick(int *initial, int *final, int lower, int upper);
int find_end(int *array, int length);
bool is_prime_brute(int num);
bool is_prime_quick(int *primes, int num);
int main(void)
{
int upperBound;
printf("Enter upper bound: \n");
scanf("%i", &upperBound); /* get user input for upper bound */
int boundRoot = (int) sqrtf((float) upperBound) + 1; /* get the root of this upper bound for later use */
printf("%i is root\n", boundRoot);
printf("All good\n");
int initialPrimes[boundRoot / 2]; /* we can safely assume that the number of primes between 0 and x is less than x / 2 for larger numbers */
printf("All good\n");
int finalPrimes[upperBound / 2];
printf("All good\n");
fill_array_with_primes_brute(initialPrimes, boundRoot);
printf("All good\n");
int initialPrimesSize = find_end(initialPrimes, sizeof initialPrimes / sizeof initialPrimes[0]);
printf("All good\n");
printf("%i primes between 0 and %i\n", initialPrimesSize, boundRoot);
printf("All good\n");
initialPrimes[initialPrimesSize] = 50000;
printf("All good\n");
printf("\nHere they are: \n"); /* This will act as a barrier between the primes and the trailing 0's so that is_prime_quick works properly */
for (int x = 0; x < initialPrimesSize; x++)
{
printf("%i\n", initialPrimes[x]);
}
fill_array_with_primes_quick(initialPrimes, finalPrimes, boundRoot, upperBound);
printf("\nHere are the other ones: \n");
int pos = 0;
while (finalPrimes[pos] != 0)
{
printf("%i\n", finalPrimes[pos]);
pos++;
}
}
void fill_array_with_primes_brute(int *array, int upper) /* upper is the number up to which we want primes */
{
array[0] = 2;
array[1] = 3; /* fill array with 2 & 3 cos yolo */
int arrayCount = 2; /* start this counter cos C doesn't have ArrayLists */
for (int pote = 4; pote < upper; pote++) /* every number in range is potentially a prime */
{
if (is_prime_brute(pote))
{
array[arrayCount] = pote;
arrayCount++;
}
}
}
bool is_prime_brute(int num)
{
for (int x = 2; x < (int) sqrtf((float) num) + 1; x++) /* go through numbers up to the number's square root looking for a factor */
{
if (num % x == 0)
{
return false; /* has a factor, so not a prime */
}
}
return true; /* if we've made it this far it's a prime */
}
void fill_array_with_primes_quick(int *initial, int *final, int lower, int upper)
{
int arrayCount = 0;
for (int pote = lower; pote < upper; pote++)
{
if (is_prime_quick(initial, pote))
{
final[arrayCount] = pote;
arrayCount++;
}
}
}
bool is_prime_quick(int *primes, int num)
{
int pos = 0;
while (primes[pos] < (int) sqrtf((float) num) + 1) /* while the number we're at in the array is less than the number's square root */
{
if (num % primes[pos] == 0)
{
return false;
}
pos++;
}
return true;
}
int find_end(int *array, int length) /* Find the true end of the array, as it will contain a few trailing 0's */
{
for(int x = 0; x < length; x++)
{
if (array[x] == 0)
{
return x;
}
}
return 0;
}
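This happens because you allocate too much memory in the automatic memory area (also known as "on the stack").
Replace these declarations with heap allocations:
int initialPrimes[boundRoot / 2];
int finalPrimes[upperBound / 2];
become
int *initialPrimes = calloc(boundRoot / 2, sizeof(int));
int *finalPrimes = calloc(upperBound / 2, sizeof(int));
calloc is used rather than malloc because the rest of your code relies on unused entries being 0, and malloc leaves the memory uninitialized. Include <stdlib.h> and check both pointers for NULL. Also replace the sizeof initialPrimes / sizeof initialPrimes[0] expression with boundRoot / 2, since sizeof applied to a pointer no longer gives the array size. Finally, add calls to free for both allocated arrays: after the final while loop in main, add
free(initialPrimes);
free(finalPrimes);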
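The square root of 5 million is about 2236, so initialPrimes is tiny; it is finalPrimes, with 2.5 million ints (roughly 10 MB), that exceeds the stack, which commonly defaults to 8 MB on Linux. So it is a stack overflow, and the sanitizer reports no other problem before that point: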
Enter upper bound:
5000000
2237 is root
All good
All good
ASAN:DEADLYSIGNAL
=================================================================
==24998==ERROR: AddressSanitizer: stack-overflow on address 0x7ffe01f4fb28 (pc 0x55d6add011dd bp 0x7ffe028da410 sp 0x7ffe01f4fb30 T0)
#0 0x55d6add011dc in main (/tmp/a.out+0x11dc)
#1 0x7fbb442fb4c9 in __libc_start_main (/usr/lib/libc.so.6+0x204c9)
#2 0x55d6add00d19 in _start (/tmp/a.out+0xd19)
SUMMARY: AddressSanitizer: stack-overflow (/tmp/a.out+0x11dc) in main
==24998==ABORTING
As @dasblinkenlight mentioned, you may fix it using heap allocation. However, also consider one of the primality test algorithms; they are much faster and more scalable, although some are probabilistic rather than proven 100% correct (such tests are actually used in cryptography).
The crash happens here: int finalPrimes[upperBound / 2]; when you declare and define a variable-length array (VLA) with automatic storage.
A VLA resides on the stack, and stack space is relatively small.
To solve the problem, you should manually allocate space on the heap using malloc instead.
int* initialPrimes = malloc(sizeof(int)*(upperBound / 2));
int* finalPrimes = malloc(sizeof(int)*(upperBound / 2));
and when you are done with them, don't forget to free the memory.
Note that if you declare the arrays as global variables (with some constant yet big size), they get static storage instead of stack space.
For instance, the following declarations make the crash vanish:
int finalPrimes[5000001];
int initialPrimes[5000001];
int main(void){
....
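As a hedged sketch of the heap-based approach, the function below keeps the question's idea of dividing only by primes already found, but tracks the number of primes explicitly instead of relying on trailing zeros or a sentinel value. The name primes_below and the capacity bound of upper / 2 + 1 entries are my own assumptions, not code from the question.

```c
#include <stdlib.h>

/* Hypothetical helper: collect every prime below upper into a heap array.
 * Stores the number of primes found in *count and returns the array
 * (caller frees), or returns NULL if the allocation fails. */
int *primes_below(int upper, size_t *count)
{
    *count = 0;
    /* 2 plus the odd numbers below upper never exceed upper / 2 + 1 entries */
    size_t cap = upper > 2 ? (size_t)upper / 2 + 1 : 1;
    int *primes = malloc(cap * sizeof *primes);
    if (primes == NULL)
        return NULL;

    for (int n = 2; n < upper; n++) {
        int is_prime = 1;
        /* only divide by known primes p with p*p <= n */
        for (size_t k = 0; k < *count && primes[k] <= n / primes[k]; k++) {
            if (n % primes[k] == 0) {
                is_prime = 0;
                break;
            }
        }
        if (is_prime)
            primes[(*count)++] = n;
    }
    return primes;
}
```

Since the array lives on the heap, an upper bound of 5 million needs about 10 MB of ordinary memory and no longer depends on the stack limit. The caller loops from 0 to count - 1 to print the primes and then calls free on the returned pointer.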

What is wrong with my hash function?

I'm trying to create a hash table. Here is my code:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#define N 19
#define c1 3
#define c2 5
#define m 3000
int efort;
int h_table[N];
int h(int k, int i)
{
return (k + i*c1 + i*i*c2) % N;
}
void init()
{
for (int i = 0; i < N; i++)
h_table[i] = -1;
}
void insert(int k)
{
int position, i;
i = 0;
do
{
position = h(k, i);
printf("\n Position %d \n", position);
if (h_table[position] == -1)
{
h_table[position] = k;
printf("Inserted :elem %d at %d \n", h_table[position], position);
break;
}
else
{
i += 1;
}
} while (i != N);
}
void print(int n)
{
printf("\nTable content: \n");
for (int i = 0; i < n; i++)
{
printf("%d ", h_table[i]);
}
}
void test()
{
int a[100];
int b[100];
init();
memset(b, -1, 100);
srand(time(NULL));
for (int i = 0; i < N; i++)
{
a[i] = rand() % (3000 + 1 - 2000) + 2000;
}
for (int i = 0; i < N ; i++)
{
insert(a[i]);
}
print(N);
}
int main()
{
test();
return 0;
}
The hash function ("h") and the "insert" function are taken from the book "Introduction to Algorithms" (Cormen). I don't know what is happening with the h function or the insert function. Sometimes it fills my array completely, but sometimes it doesn't, which means it doesn't work properly. What am I doing wrong?
In short, you are producing repeating values for position often enough to prevent h_table[] from being populated after only N attempts...
The pseudo-random number generator is not guaranteed to produce a set of unique numbers, nor is your h(...) function guaranteed to produce a mutually exclusive set of position values. It is likely that you are generating the same position enough times that you run out of loop iterations before all 19 positions have been generated. The question to answer is: how many times must h(...) be called, on average, before you are likely to get the value of an unused position? This may help to direct you to the problem.
As an experiment, I increased the looping indexes from N to 100 in all but the h(...) function (so as not to overrun h_table[] ). And as expected the first 5 positions filled immediately. The next one filled after 3 more tries. The next one 10 tries later, and so on, until by the end of 100 tries, there were still some unwritten positions.
On the next run, all table positions were filled.
2 possible solutions:
1) Modify hash to improve probability of unique values.
2) Increase iterations to populate h_table
A good_hash_function() % N may repeat itself in N re-hashes. A good hash looks nearly random in its output even though it is deterministic. So in N tries it might not loop through all the array elements.
After failing to find a free array element after a number of tries, say N/3 tries, I recommend a different approach: just look for the next free element.
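The "give up on the hash after N/3 tries and take the next free element" idea might be sketched as follows. The names TABLE_SIZE, EMPTY, table_init, probe and table_insert are hypothetical; the probe formula reuses the constants 3 and 5 from the question, and keys are assumed to be non-negative.

```c
#define TABLE_SIZE 19
#define EMPTY (-1)

static int table[TABLE_SIZE];

/* Mark every slot as unused. */
void table_init(void)
{
    for (int i = 0; i < TABLE_SIZE; i++)
        table[i] = EMPTY;
}

/* Quadratic probe with the question's constants; k must be non-negative. */
static int probe(int k, int i)
{
    return (k + 3 * i + 5 * i * i) % TABLE_SIZE;
}

/* Insert k and return the slot used, or -1 if the table is full. */
int table_insert(int k)
{
    int pos = 0;
    /* Phase 1: a limited number of quadratic probes. */
    for (int i = 0; i < TABLE_SIZE / 3; i++) {
        pos = probe(k, i);
        if (table[pos] == EMPTY) {
            table[pos] = k;
            return pos;
        }
    }
    /* Phase 2: linear scan from the last probed slot; this visits every
     * slot once, so it only fails when the table is really full. */
    for (int step = 1; step <= TABLE_SIZE; step++) {
        int p = (pos + step) % TABLE_SIZE;
        if (table[p] == EMPTY) {
            table[p] = k;
            return p;
        }
    }
    return -1;
}
```

A matching lookup would have to follow the same two phases, otherwise keys placed by the linear fallback would not be found.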

error with array size

I am trying to make a program that calculates the number of primes that don't exceed a given integer using the sieve of Eratosthenes. While my program works fine (and fast) for small numbers, after a certain number (46337) I get a "command terminated by signal 11" error, which I suppose has to do with array size. I tried to use malloc() but I didn't get it quite right. What should I do for big numbers (up to 5 billion)?
#include <stdio.h>
#include<stdlib.h>
int main(){
signed long int x,i, j, prime = 0;
scanf("%ld", &x);
int num[x];
for(i=2; i<=x;i++){
num[i]=1;
}
for(i=2; i<=x;i++){
if(num[i] == 1){
for(j=i*i; j<=x; j = j + i){
num[j] = 0;
}
//printf("num[%d]\n", i);
prime++;
}
}
printf("%ld", prime);
return 0;
}
Your array
int num[x];
is on the stack, where only small arrays can be accommodated. For large array size you'll have to allocate memory. You can save on memory bloat by using char type, because you only need a status.
char *num = malloc(x+1); // allow for indexing by [x]
if(num == NULL) {
// deal with allocation error
}
//... the sieve code
free(num);
I also suggest checking that i*i does not overflow the integer type, by using
if(num[i] == 1){
if (x / i >= i){ // make sure i*i won't break
for(j=i*i; j<=x; j = j + i){
num[j] = 0;
}
}
}
Lastly, you want to go to 5 billion, which is outside the range of uint32_t (which unsigned long int is on my system), whose maximum is about 4.29 billion. If that limit is acceptable, change the int definitions to unsigned, watching out that your loop controls don't wrap; for example, cap x at UINT_MAX - 1 so that the condition i <= x can still become false.
If you don't have 5 GB of memory available, use one bit per number as suggested by @BoPersson.
The following code checks for allocation errors, was tested with values up to 5000000000 (which assumes a 64-bit unsigned long), outputs the final count of primes below x, and uses malloc so as to avoid overrunning the available stack space.
#include <stdio.h>
#include <stdlib.h>
int main()
{
unsigned long int x,i, j;
unsigned prime = 0;
scanf("%lu", &x);
char *num = malloc( x);
if( NULL == num)
{
perror( "malloc failed");
exit(EXIT_FAILURE);
}
for(i=0; i<x;i++)
{
num[i]=1;
}
for(i=2; i<x;i++)
{
if(num[i] == 1)
{
if (i <= (x - 1) / i) // only cross off when i*i < x; this also keeps i*i from overflowing
{
for(j=i*i; j<x; j = j + i)
{
num[j] = 0;
}
}
//printf("num[%lu]\n", i);
prime++;
}
}
printf("%u\n", prime);
return 0;
}

Resources