mpz_add() causes segmentation Fault at large program - c

My problem is the following. I have to write a program which calculates a really large element of the Fibonacci numbers (lowest it has to calculate is the pow(2,10)th member, largest is pow(2,20)th member). For this I'm using GMP's mpz_t and it's functions for calculations.
I use a tail recursive algorithm for this (later on I have to make it run parallel). The issue is that it runs for a while, then suddenly: Segmentation fault (core dumped).
I show you my code, explain it, so you don't have to waste your time figuring it out and tell you what I got to know.
int main(int argc, char** argv){
char result[1000000]; char *r; r = result;
long int n;
mpz_t num;
mpz_init(num);
double start_t, end_t, total_t;
start_t = omp_get_wtime();
for(int i = 0; i < 11; i++){
n = pow(2,i+10);
fibo(num,n);
char *d = mpz_get_str(NULL,10,num);
strcpy(r,d);
printf("The %ld. element of Fibonacci is: %s\n",n,result);
fflush(stdout);
memset(result, 0, sizeof result);
}
end_t = omp_get_wtime();
total_t = end_t - start_t;
printf("Time of running: %.6f\n",total_t);
return 0;
}
The main() function basically creates (and initializes) the variables, sets up the time measurement and in a for loop calls the fibo() function, getting the result and printing it. When everything is done, the program writes out the time of running and quits.
void fibo(mpz_t res, long int n){
if(n == 0){
mpz_set_str(res,"0",10);
return;
}else{
mpz_t temp1;
mpz_t temp2;
mpz_init_set_si(temp1,0);
mpz_init_set_si(temp2,1);
fiboTail(res,n,1,temp1,temp2);
mpz_clear(temp1);
mpz_clear(temp2);
}
}
fibo() gets 2 arguments, first one is mpz_t (for the ones who don't know, this is a pointer and it's going to belong to the one that got created in the main() so the final value is going to land back there for further usage) and the second one is the number of the element we need to calculate. If the element number is 0, we simply give back "0", otherwise we make two mpz_t variables, set one two "0", the other to "1" and hand them to the fiboTail() along with some other arguments.
void fiboTail(mpz_t res, long int n, long int m, mpz_t fibPrev, mpz_t fibCurrent){
if(n == m){
mpz_set(res,fibCurrent);
}else{
mpz_add(fibPrev,fibPrev,fibCurrent);
fiboTail(res,n, m + 1, fibCurrent, fibPrev);
}
}
So this one basically is the core. m counts how many additions we have done, on which element we are at, n is the number of element we need, fibCurrent and fibPrev is the current and previous Fibonacci number respectively.
Sorry for the dumb explanation, I figure most of you knew this without me trying to explain.
So, this program is really fast. The problem (Segmentation fault) happens when it's counting the 131072th element (sometimes on a smaller one, its...random(?)). Then the program stops about the same number of addition/m value (not always on the same one, but close to there) and the previously mentioned error message appears. I use gcc to compile (actually using Makefiles), so I added the -g switch and used gdb to get more info. Here is what I found:
I ran the program in gdb and used backtrace which produce this.
Here is the detailed stack info using info frame on frame #0-5. The error occurs at the mpz_add call, but I don't know why.
If you need any more information, I can give them, but for now I don't know what else would be useful.
Sorry for the long post, thanks for the answers in advance!
Edit:
As it seems that mpz_add dies at a point, I got out the info of the call, you can see it: i.imgur.com/XOpTve1.png (Sorry, can't post more than 2 links :/ )

Not sure if this will help, but here is the Lucas sequence method for finding fibonacci(n), which is fast and could be used to confirm your results (such as size of the result which may be too big). It's similar to implementing fibnoacci(n) in matrix form and using repeated squaring to raise the matrix to the nth power where fib(n) = M^n x fib(0), where M is a 2 by 2 matrix, and fib() is a 2 element vector. The Lucas sequence function takes 1 + log2(n) loops to run, so for n=2^20, it would take 21 loops.
uint64_t fibl(uint64_t n) {
uint64_t a, b, p, q, qq, aq;
a = q = 1;
b = p = 0;
while(1) {
if(n & 1) {
aq = a*q;
a = b*q + aq + a*p;
b = b*p + aq;
}
n >>= 1;
if(n == 0)
break;
qq = q*q;
q = 2*p*q + qq;
p = p*p + qq;
}
return b;
}

Try this: instead of repeatedly making tail calls which might not be optimized, iteratively calculate the values or use memoization. The program is probably running out of memory somewhere.

What you got is just the plain old stack overflow due to the recursive nature of your fiboTail function.
First, note that you don't need a debbuger to check for this, just place some printfs around fiboTail and you'll see that mpz_add executes fine and the fault happens just at the fiboTail call, before the entry to the next fiboTail with (n,m) values like (131072,115265), (131072,115224), and (131072,115218).
You can get rid of recursion rewritting fiboTail as
void fiboTail(mpz_t res, long int n, long int m, mpz_t fibPrev, mpz_t fibCurrent) {
mpz_t tmp;
mpz_init(tmp);
for (long int i = m; i < n; i++) {
mpz_add(fibPrev, fibPrev, fibCurrent);
mpz_set(tmp, fibPrev);
mpz_set(fibPrev, fibCurrent);
mpz_set(fibCurrent, tmp);
}
mpz_set(res, fibCurrent);
}
Of course, you can simplify the call by removing some parameters, or just place its code inside fibo.
In my (very old) PC, Core I5 2th gen 4Gb RAM, I get (replaced omp_get_wtime with clock from time.h, though):
Time of running: 11.068110
and it seems all elements check, up to the 1048576th which has 219140 chars, and first digits check with the answers from WolframAlpha.
If recursion is so important (note that you don't necessarily need it for parallel programming), you can either increase stack size (at compile or runtime), or you can check your compiler setting to reuse the fiboTail stack frame, since you are using tail recursion.

Related

why does it show segmentation fault when I tried to use recursion here?

I tried to write a code to calculate how many 1 are there in a number's binary form. This is my code:
#include <stdio.h>
#include <math.h>
static int num = 0;
void binary(int target){
int n = 0;
int a = 1;
if(target != 0){
while(target >= a ){
n++;
a = pow(2, n);
}
a = pow(2, n-1);
num++;
binary(target - a);
}
}
int main() {
int target = 0;
scanf("%d", &target);
binary(target);
printf("%d",num);
return 0;
}
However, this shows segmentation fault when I run it. I don't know where has the code tried to access memories that are not allowed. I figured it might have something to do with the recursion in the binary function. Can anyone tell me what have caused the segmentation fault here? Thank you so much. I really can't understand segfaults :(
As mentioned in the comments, pow is not a good candiate here due to its signature:
double pow(double x, double y);
so any place that you are using pow, you are implicitly using floating point numbers. I was able to cause a segfault with the input 1<<31 which is the value -2147483648. This will cause your loop to terminate with n=0, a=1. You then set a = pow(2, -1), but since a is an integer, this gets floored down to just 0. You then recurse with binary(target - 0) which might as well just be binary(target) again, hence you have an infinite call with no termination.
I'll also leave as a note that recursion for this type of problem is probably not the right tool, unless your goal is to learn about recursion. There is a much more concise and reliable method via a loop and the & operator. I would also suggest using unsigned values to avoid issues like this with negative terms.

Bus Error in C for Loop

I have a toy cipher program which is encountering a bus error when given a very long key (I'm using 961168601842738797 to reproduce it), which perplexes me. When I commented out sections to isolate the error, I found it was being caused by this innocent-looking for loop in my Sieve of Eratosthenes.
unsigned long i;
int candidatePrimes[CANDIDATE_PRIMES];
// CANDIDATE_PRIMES is a macro which sets the length of the array to
// two less than the upper bound of the sieve. (2 being the first prime
// and the lower bound.)
for (i=0;i<CANDIDATE_PRIMES;i++)
{
printf("i: %d\n", i); // does not print; bus error occurs first
//candidatePrimes[i] = PRIME;
}
At times this has been a segmentation fault rather than a bus error.
Can anyone help me to understand what is happening and how I can fix it/avoid it in the future?
Thanks in advance!
PS
The full code is available here:
http://pastebin.com/GNEsg8eb
I would say your VLA is too large for your stack, leading to undefined behaviour.
Better to allocate the array dynamically:
int *candidatePrimes = malloc(CANDIDATE_PRIMES * sizeof(int));
And don't forget to free before returning.
If this is Eratosthenes Sieve, then the array is really just flags. It's wasteful to use int if it's just going to hold 0 or 1. At least use char (for speed), or condense to a bit array (for minimal storage).
The problem is that you're blowing the stack away.
unsigned long i;
int candidatePrimes[CANDIDATE_PRIMES];
If CANDIDATE_PRIMES is large, this alters the stack pointer by a massive amount. But it doesn't touch the memory, it just adjusts the stack pointer by a very large amount.
for (i=0;i<CANDIDATE_PRIMES;i++)
{
This adjusts "i" which is way back in the good area of the stack, and sets it to zero. Checks that it's < CANDIDATE_PRIMES, which it is, and so performs the first iteration.
printf("i: %d\n", i); // does not print; bus error occurs first
This attempts to put the parameters for "printf" onto the bottom of the stack. BOOM. Invalid memory location.
What value does CANDIDATE_PRIMES have?
And, do you actually want to store all the primes you're testing or only those that pass? What is the purpose of storing the values 0 thru CANDIDATE_PRIMES sequentially in an array???
If what you just wanted to store the primes, you should use a dynamic allocation and grow it as needed.
size_t g_numSlots = 0;
size_t g_numPrimes = 0;
unsigned long* g_primes = NULL;
void addPrime(unsigned long prime) {
unsigned long* newPrimes;
if (g_numPrimes >= g_numSlots) {
g_numSlots += 256;
newPrimes = realloc(g_primes, g_numSlots * sizeof(unsigned long));
if (newPrimes == NULL) {
die(gracefully);
}
g_primes = newPrimes;
}
g_primes[g_numPrimes++] = prime;
}

why runtime error on online judge?

I am unable to understand why i am getting runtime error with this code. Problem is every number >=6 can be represented as sum of two prime numbers.
My code is ...... Thanks in advance problem link is http://poj.org/problem?id=2262
#include "stdio.h"
#include "stdlib.h"
#define N 1000000
int main()
{
long int i,j,k;
long int *cp = malloc(1000000*sizeof(long int));
long int *isprime = malloc(1000000*sizeof(long int));
//long int *isprime;
long int num,flag;
//isprime = malloc(2*sizeof(long int));
for(i=0;i<N;i++)
{
isprime[i]=1;
}
j=0;
for(i=2;i<N;i++)
{
if(isprime[i])
{
cp[j] = i;
j++;
for(k=i*i;k<N;k+=i)
{
isprime[k] = 0;
}
}
}
//for(i=0;i<j;i++)
//{
// printf("%d ",cp[i]);
//}
//printf("\n");
while(1)
{
scanf("%ld",&num);
if(num==0) break;
flag = 0;
for(i=0;i<j&&num>cp[i];i++)
{
//printf("%d ",cp[i]);
if(isprime[num-cp[i]])
{
printf("%ld = %ld + %ld\n",num,cp[i],num-cp[i]);
flag = 1;
break;
}
}
if(flag==0)
{
printf("Goldbach's conjecture is wrong.\n");
}
}
free(cp);
free(isprime);
return 0;
}
Two possibilities immediately spring to mind. The first is that the user input may be failing if whatever test harness is being used does not provide any input. Without knowing more detail on the harness, this is a guess at best.
You could check that by hard-coding a value rather than accepting one from standard input.
The other possibility is the rather large memory allocations being done. It may be that you're in a constrained environment which doesn't allow that.
A simple test for that is to drop the value of N (and, by the way, use it rather than the multiple hardcoded 1000000 figures in your malloc calls). A better way would be to check the return value from malloc to ensure it's not NULL. That should be done anyway.
And, aside from that, you may want to check your Eratosthenes Sieve code. The first item that should be marked non-prime for the prime i is i + i rather than i * i as you have. I think it should be:
for (k = i + i; k < N; k += i)
The mathematical algorithm is actually okay since any multiple of N less than N * N will already have been marked non-prime by virtue of the fact it's a multiple of one of the primes previously checked.
Your problem lies with integer overflow. At the point where N becomes 46_349, N * N is 2_148_229_801 which, if you have a 32-bit two's complement integer (maximum value of 2_147_483_647), will wrap around to -2_146_737_495.
When that happens, the loop keeps going since that negative number is still less than your limit, but using it as an array index is, shall we say, inadvisable :-)
The reason it works with i + i is because your limit is well short of INT_MAX / 2 so no overflow happens there.
If you want to make sure that this won't be a problem if you get up near INT_MAX / 2, you can use something like:
for (k = i + i; (k < N) && (k > i); k += i)
That extra check on k should catch the wraparound event, provided your wrapping follows the "normal" behaviour - technically, I think it's undefined behaviour to wrap but most implementations simply wrap two positives back to a negative due to the two's complement nature. Be aware then that this is actually non-portable, but what that means in practice is that it will only work on 99.999% of machines out there :-)
But, if you're a stickler for portability, there are better ways to prevent overflow in the first place. I won't go into them here but to say they involve subtracting one of the terms being summed from MAX_INT and comparing it to the other term being summed.
The only way I can get this to give an error is if I enter a value greater than 1000000 or less than 1 to the scanf().
Like this:
ubuntu#amrith:/tmp$ ./x
183475666
Segmentation fault (core dumped)
ubuntu#amrith:/tmp$
But the reason for that should be obvious. Other than that, this code looks good.
Just trying to find what went wrong!
If the sizeof(long int) is 4 bytes for the OS that you are using, then it makes this problem.
In the code:
for(k=i*i;k<N;k+=i)
{
isprime[k] = 0;
}
Here, when you do k = i*i, for large values if i, the value of k goes beyond 4 bytesand get truncated which may result in negative numbers and so, the condition k<N is satisfied but with a negative number :). So you get a segmentation fault there.
It's good that you need only i+i, but if you need to increase the limit, take care of this problem.

finding greatest prime factor using recursion in c

have wrote the code for what i see to be a good algorithm for finding the greatest prime factor for a large number using recursion. My program crashes with any number greater than 4 assigned to the variable huge_number though. I am not good with recursion and the assignment does not allow any sort of loop.
#include <stdio.h>
long long prime_factor(int n, long long huge_number);
int main (void)
{
int n = 2;
long long huge_number = 60085147514;
long long largest_prime = 0;
largest_prime = prime_factor(n, huge_number);
printf("%ld\n", largest_prime);
return 0;
}
long long prime_factor (int n, long long huge_number)
{
if (huge_number / n == 1)
return huge_number;
else if (huge_number % n == 0)
return prime_factor (n, huge_number / n);
else
return prime_factor (n++, huge_number);
}
any info as to why it is crashing and how i could improve it would be greatly appreciated.
Even fixing the problem of using post-increment so that the recursion continues forever, this is not a good fit for a recursive solution - see here for why, but it boils down to how fast you can reduce the search space.
While your division of huge_number whittles it down pretty fast, the vast majority of recursive calls are done by simply incrementing n. That means you're going to use a lot of stack space.
You would be better off either:
using an iterative solution where you won't blow out the stack (if you just want to solve the problem) (a); or
finding a more suitable problem for recursion if you're just trying to learn recursion.
(a) An example of such a beast, modeled on your recursive solution, is:
#include <stdio.h>
long long prime_factor_i (int n, long long huge_number) {
while (n < huge_number) {
if (huge_number % n == 0) {
huge_number /= n;
continue;
}
n++;
}
return huge_number;
}
int main (void) {
int n = 2;
long long huge_number = 60085147514LL;
long long largest_prime = 0;
largest_prime = prime_factor_i (n, huge_number);
printf ("%lld\n", largest_prime);
return 0;
}
As can be seen from the output of that iterative solution, the largest factor is 10976461. That means the final batch of recursions in your recursive solution would require a stack depth of ten million stack frames, not something most environments will contend with easily.
If you really must use a recursive solution, you can reduce the stack space to the square root of that by using the fact that you don't have to check all the way up to the number, but only up to its square root.
In addition, other than 2, every other prime number is odd, so you can further halve the search space by only checking two plus the odd numbers.
A recursive solution taking those two things into consideration would be:
long long prime_factor_r (int n, long long huge_number) {
// Debug code for level checking.
// static int i = 0;
// printf ("recursion level = %d\n", ++i);
// Only check up to square root.
if (n * n >= huge_number)
return huge_number;
// If it's a factor, reduce the number and try again.
if (huge_number % n == 0)
return prime_factor_r (n, huge_number / n);
// Select next "candidate" prime to check against, 2 -> 3,
// 2n+1 -> 2n+3 for all n >= 1.
if (n == 2)
return prime_factor_r (3, huge_number);
return prime_factor_r (n + 2, huge_number);
}
You can see I've also removed the (awkward, in my opinion) construct:
if something then
return something
else
return something else
I much prefer the less massively indented code that comes from:
if something then
return something
return something else
But that's just personal preference. In any case, that gets your recursion level down to 1662 (uncomment the debug code to verify) rather than ten million, a rather sizable reduction but still not perfect. That runs okay in my environment.
You meant n+1 instead of n++. n++ increments n after using it, so the recursive call gets the original value of n.
You are overflowing stack, because n++ post-increments the value, making a recursive call with the same values as in the current invocation.
the crash reason is stack overflow. I add a counter to your program and execute it(on ubuntu 10.04 gcc 4.4.3) the counter stop at "218287" before core dump. the better solution is using loop instead of recursion.

storing known sequences in c

I'm working on Project Euler #14 in C and have figured out the basic algorithm; however, it runs insufferably slow for large numbers, e.g. 2,000,000 as wanted; I presume because it has to generate the sequence over and over again, even though there should be a way to store known sequences (e.g., once we get to a 16, we know from previous experience that the next numbers are 8, 4, 2, then 1).
I'm not exactly sure how to do this with C's fixed-length array, but there must be a good way (that's amazingly efficient, I'm sure). Thanks in advance.
Here's what I currently have, if it helps.
#include <stdio.h>
#define UPTO 2000000
int collatzlen(int n);
int main(){
int i, l=-1, li=-1, c=0;
for(i=1; i<=UPTO; i++){
if( (c=collatzlen(i)) > l) l=c, li=i;
}
printf("Greatest length:\t\t%7d\nGreatest starting point:\t%7d\n", l, li);
return 1;
}
/* n != 0 */
int collatzlen(int n){
int len = 0;
while(n>1) n = (n%2==0 ? n/2 : 3*n+1), len+=1;
return len;
}
Your original program needs 3.5 seconds on my machine. Is it insufferably slow for you?
My dirty and ugly version needs 0.3 seconds. It uses a global array to store the values already calculated. And use them in future calculations.
int collatzlen2(unsigned long n);
static unsigned long array[2000000 + 1];//to store those already calculated
int main()
{
int i, l=-1, li=-1, c=0;
int x;
for(x = 0; x < 2000000 + 1; x++) {
array[x] = -1;//use -1 to denote not-calculated yet
}
for(i=1; i<=UPTO; i++){
if( (c=collatzlen2(i)) > l) l=c, li=i;
}
printf("Greatest length:\t\t%7d\nGreatest starting point:\t%7d\n", l, li);
return 1;
}
int collatzlen2(unsigned long n){
unsigned long len = 0;
unsigned long m = n;
while(n > 1){
if(n > 2000000 || array[n] == -1){ // outside range or not-calculated yet
n = (n%2 == 0 ? n/2 : 3*n+1);
len+=1;
}
else{ // if already calculated, use the value
len += array[n];
n = 1; // to get out of the while-loop
}
}
array[m] = len;
return len;
}
Given that this is essentially a throw-away program (i.e. once you've run it and got the answer, you're not going to be supporting it for years :), I would suggest having a global variable to hold the lengths of sequences already calculated:
int lengthfrom[UPTO] = {};
If your maximum size is a few million, then we're talking megabytes of memory, which should easily fit in RAM at once.
The above will initialise the array to zeros at startup. In your program - for each iteration, check whether the array contains zero. If it does - you'll have to keep going with the computation. If not - then you know that carrying on would go on for that many more iterations, so just add that to the number you've done so far and you're done. And then store the new result in the array, of course.
Don't be tempted to use a local variable for an array of this size: that will try to allocate it on the stack, which won't be big enough and will likely crash.
Also - remember that with this sequence the values go up as well as down, so you'll need to cope with that in your program (probably by having the array longer than UPTO values, and using an assert() to guard against indices greater than the size of the array).
If I recall correctly, your problem isn't a slow algorithm: the algorithm you have now is fast enough for what PE asks you to do. The problem is overflow: you sometimes end up multiplying your number by 3 so many times that it will eventually exceed the maximum value that can be stored in a signed int. Use unsigned ints, and if that still doesn't work (but I'm pretty sure it does), use 64 bit ints (long long).
This should run very fast, but if you want to do it even faster, the other answers already addressed that.

Resources