So I recently sat a university exam, and one of the questions asked us to write a program that prints the nth number in the tribonacci sequence (1, 1, 1, 3, 5, 9, 17, 31, ...). These numbers were said to grow as large as 1500 digits. I wrote a recursive function that worked for the first 37 tribonacci numbers, but a stack overflow occurred at the 38th. The question had warned us about this and said we would somehow need to overcome it, but I have no idea how. Were we meant to create our own data type?
double tribonacci(int n){
if(n < 4){
return 1;
}else{
return tribonacci(n-3) + tribonacci(n-2) + tribonacci(n-1);
}
}
int main(int argc, char *argv[]){
double value = tribonacci(atoi(argv[1]));
printf("%lf\n", value);
}
This is the solution I wrote under exam conditions, within 15 minutes.
The program took the value of n from a command-line argument. We were not allowed to use any libraries except stdlib.h and stdio.h. So with all that said, how might one create a data type large enough to print numbers with 1500 digits (since the double data type only holds enough for up to the 37th tribonacci number)? Or is there another approach to this question?
You should use some arbitrary-precision arithmetic library (a.k.a. Bigints or bignums) if your teacher allows them. I recommend GMPlib, but there are others.
See also this answer (notably if your teacher wants you to write some crude arbitrary precision addition).
For a development time limited exam solution, I'd definitely go for the quick & dirty approach, but I wouldn't exactly complete it within 15 minutes.
The problem size is restricted to 1500 characters. Computing tribonacci numbers means you always need to carry the subresults N-3, N-2 and N-1 in order to compute subresult N. So let's define a suitable static data structure with the right starting values (it's 1;1;1 in your question, but I think it should be 0;1;1):
char characterLines[4][1501] = { { '0', 0 }, { '1', 0 }, { '1', 0 } };
Then define an add function that operates on character arrays, expecting '\0' as end of array and the character numbers '0' to '9' as digits in a way that the least significant digit comes first.
void addBigIntegerCharacters(const char* i1, const char* i2, char* outArray)
{
int carry = 0;
while(*i1 && *i2)
{
int partResult = carry + (*i1 - '0') + (*i2 - '0');
carry = partResult / 10;
*outArray = (partResult % 10) + '0';
++i1; ++i2; ++outArray;
}
while(*i1)
{
int partResult = carry + (*i1 - '0');
carry = partResult / 10;
*outArray = (partResult % 10) + '0';
++i1; ++outArray;
}
while(*i2)
{
int partResult = carry + (*i2 - '0');
carry = partResult / 10;
*outArray = (partResult % 10) + '0';
++i2; ++outArray;
}
if (carry > 0)
{
*outArray = carry + '0';
++outArray;
}
*outArray = 0;
}
Compute the tribonacci with the necessary number of additions:
// n as 1-based tribonacci index.
char* computeTribonacci(int n)
{
// initialize at index - 1 since it will be updated before first computation
int srcIndex1 = -1;
int srcIndex2 = 0;
int srcIndex3 = 1;
int targetIndex = 2;
if (n < 4)
{
return characterLines[n - 1];
}
n -= 3;
while (n > 0)
{
// update source and target indices
srcIndex1 = (srcIndex1 + 1) % 4;
srcIndex2 = (srcIndex2 + 1) % 4;
srcIndex3 = (srcIndex3 + 1) % 4;
targetIndex = (targetIndex + 1) % 4;
addBigIntegerCharacters(characterLines[srcIndex1], characterLines[srcIndex2], characterLines[targetIndex]);
addBigIntegerCharacters(characterLines[targetIndex], characterLines[srcIndex3], characterLines[targetIndex]);
--n;
}
return characterLines[targetIndex];
}
And remember that your least significant digit comes first when printing the result
void printReverse(const char* start)
{
const char* printIterator = start;
while (*printIterator)
{
++printIterator;
}
do
{
putchar(*(--printIterator));
} while (printIterator != start);
}
int main()
{
char* c = computeTribonacci(50); // the real result is the array right-to-left
printReverse(c);
}
As said, this is kind of quick & dirty code, but still not something I would finish within 15 minutes.
The reason why I use a separate char per decimal digit is mainly readability and conformity to the way how decimal math works on pen&paper, which is an important factor when development time is limited. With focus on runtime constraints rather than development time, I'd probably group the numbers in an array of unsigned long long, each representing 18 decimal digits. I would still focus on decimal digit groupings, because this is a lot easier to print as characters using the standard library functions. 18 because I need one digit for math overflow and 19 is the limit of fully available decimal digits for unsigned long long. This would result in a few more changes... 0 couldn't be used as termination character anymore, so it would probably be worth saving the valid length of each array. The principle of add and computeTribonacci would stay the same with some minor technical changes, printing would need some tweaks to ensure a length 18 output for each group of numbers other than the most significant one.
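The grouped-digit variant described above might look roughly like the sketch below. The names BigDec, bigdec_set, bigdec_add and bigdec_to_string are hypothetical and not part of the answer's code; 84 groups of 18 digits are assumed, which covers 1512 decimal digits.

```c
#include <stdio.h>

#define LIMB_BASE 1000000000000000000ULL  /* 10^18: 18 decimal digits per group */
#define MAX_LIMBS 84                      /* 84 * 18 = 1512 digits */

/* Hypothetical type: groups are stored least significant first. */
typedef struct {
    unsigned long long limb[MAX_LIMBS];
    int length;                           /* number of groups in use */
} BigDec;

/* Set x to a small starting value (must be below LIMB_BASE). */
void bigdec_set(BigDec *x, unsigned long long value)
{
    x->limb[0] = value;
    x->length = 1;
}

/* out = a + b; out may alias a or b. Growth past MAX_LIMBS is not handled. */
void bigdec_add(const BigDec *a, const BigDec *b, BigDec *out)
{
    unsigned long long carry = 0;
    int n = a->length > b->length ? a->length : b->length;
    for (int i = 0; i < n; ++i) {
        unsigned long long sum = carry;   /* at most 2 * 10^18 - 1, fits in 64 bits */
        if (i < a->length) sum += a->limb[i];
        if (i < b->length) sum += b->limb[i];
        carry = sum >= LIMB_BASE;
        out->limb[i] = carry ? sum - LIMB_BASE : sum;
    }
    if (carry && n < MAX_LIMBS)
        out->limb[n++] = 1;
    out->length = n;
}

/* Write x as decimal text into buf (needs at least 18 * length + 1 bytes). */
void bigdec_to_string(const BigDec *x, char *buf)
{
    int i = x->length - 1;
    buf += sprintf(buf, "%llu", x->limb[i]);         /* top group: no padding */
    while (--i >= 0)
        buf += sprintf(buf, "%018llu", x->limb[i]);  /* others: exactly 18 digits */
}
```

The stored length replaces the '\0' terminator of the character version, and the %018llu padding is the printing tweak mentioned above for every group except the most significant one.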
You require a different algorithm. The code posted cannot suffer from an integer overflow, as it does all its calculations in doubles. It does take exponential time, because every call spawns three more calls, but its recursion depth, and therefore its stack use, only grows linearly with N. So what looks like a stack overflow at N=38 is more likely the program taking a very long time to finish. Some alternatives, in increasing order of efficiency and complexity:
Use the "memoization" technique to optimize the algorithm you have.
Build up the answer starting by calculating N=4, and iterating upwards. No recursion is then needed.
Do the mathematics (or find someone who can) to get the "closed form solution" that allows direct calculation of the answer. See https://en.wikipedia.org/wiki/Fibonacci_number#Closed-form_expression for how this works for regular fibonacci numbers.
You will also need a "big number" data structure - see other answers.
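As a minimal sketch of the second alternative (building the answer upwards without recursion), assuming the 1, 1, 1 starting values from the question. The function name tribonacci_iterative is hypothetical, and unsigned long long is used only to keep the example short; it overflows long before 1500 digits, so the real solution would swap in a big-number addition.

```c
/* n is 1-based; the first three terms are 1, as in the question.
 * unsigned long long only holds the first several dozen terms. */
unsigned long long tribonacci_iterative(int n)
{
    unsigned long long a = 1, b = 1, c = 1;   /* the three most recent terms */
    for (int k = 4; k <= n; ++k) {
        unsigned long long next = a + b + c;
        a = b;
        b = c;
        c = next;
    }
    return c;
}
```

Each term is computed exactly once, so the running time is linear in n and the stack depth stays constant.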
You need to replace the + operation with an operator ADD made by yourself and encode BigIntegers as you wish -- there are lots of ways to encode BigIntegers.
So you need to define yourself a datatype BigInteger and the following operations
ADD : BigInteger, BigInteger -> BigInteger
1+ : BigInteger -> BigInteger
2- : BigInteger -> BigInteger
<4 : BigInteger -> boolean
The constants 1,2,4 as BigInteger
and after having replaced these things, write a standard function that computes the sequence in linear time and space.
I am extremely new to C and trying to make a 6-bit binary counter, where each return has all 6 digits listed (i.e 000000, 000001,...). Currently, my solution compiles but does not execute once compiled (I get a warning that says something to the effect of "A problem caused Windows to stop working" and then no output is displayed). If anyone could help figure out why this happens, or suggest a better way to do this since I know my approach is extremely convoluted, I'd appreciate the help!
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdint.h>
long * convert(long dec){
if(dec == 0){
return 0;
}else{
return(long *)(dec % 2 + (10 * *convert(dec / 2)));
}
}
char* long_enough(char* num){
char* have_one = "0000";
char* have_two = "0000";
char* have_three = "000";
char* have_four = "00";
char* have_five = "0";
if(strlen(num) == 2){
strcat(have_one, num);
}else if(strlen(num) == 3){
strcat(have_two, num);
}else if(strlen(num) == 4){
strcat(have_three, num);
}else if(strlen(num) == 5){
strcat(have_four, num);
}else if(strlen(num) == 6){
strcat(have_five, num);
}
}
char main(){
int i;
int count = -1;
printf("\n");
for(i = 0; i < 5; i++){
count++;
long* binNum = (long *)(convert(count));
char* new;
char done = sprintf(new, "%d", binNum);
long_enough((char *)(intptr_t)done);
printf("%s\n", long_enough((char *)(intptr_t)done));
}
}
I think it has to do with your handling of pointers. @Pete Becker's suggestion should get you started, but what also jumped out at me was this line:
return(long *)(dec % 2 + (10 * *convert(dec / 2)));
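Here the second * dereferences the pointer returned by convert, so you are multiplying 10 by whatever happens to be stored at that address (and since convert(0) returns a null pointer, that dereference will typically crash). If your intent is to raise to a power, note that there is no exponent operator in C. To raise to a power, you'll need to #include <math.h> and use the pow(x,e) function.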
You'll definitely want to read up on the pointer and value semantics in C. I'd recommend the book The C Programming Language by Brian W. Kernighan & Dennis M. Ritchie (the creators of the language). It is concise and will likely get you further, faster than a lot of the other books out there.
Are you using a long* to point nowhere?
It makes no sense. A pointer is meant to point to a memory region, and this memory region must have been allocated (either on the stack or on the heap).
It makes no sense that you are multiplying a binary number by 10 either.
You need to clarify your thought.
When you say you are making a "binary counter", think first about what it means. Given the code you posted, it looks like you should split your problem in two parts:
Count in binary.
Show the value in a human readable manner.
Once you've split the problem in two, I'll help you with one part and leave the other to you.
Convert a number into binary:
Well, computer numbers are already binary by construction. Let's suppose you need, for any didactic reason, to reinvent the wheel in such a way that you can address individual values.
You have basically two options: use an array or use a bit-mask.
By using an array you'll waste more memory but printing the result will be easier.
By using a bit-mask you basically will have to allocate a single integer (or even char since you just need 6 bits) and shift it left while testing the original number.
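// Example using bit-masks / bit-shifts into an unsigned char.
unsigned char to_bin6( unsigned int number )
{
    const unsigned int top_bit = 8 * sizeof( unsigned int ) - 1;
    unsigned char bin6 = 0;
    // Left-align the number, discarding everything above the lowest 6 bits.
    number <<= ( 8 * sizeof( unsigned int ) ) - 6;
    for ( int count = 6; count; --count )
    {
        // Move the current top bit of number into the bottom of bin6.
        bin6 = (unsigned char)( ( bin6 << 1 ) | ( number >> top_bit ) );
        number <<= 1;
    }
    return bin6;
}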
Now with arrays.
// Example using char array.
// Array needs to have an additional element for the EoS (end of string) marker.
void to_bin6( unsigned int number, unsigned char bin6[ 7 ] )
{
// Fills the output buffer in the same direction you'd expect to read.
for ( int count = 5; count >= 0; --count )
bin6[ 5 - count ] = ( number & ( 1u << count ) ) ?
'1' : '0'; // Feeds character '1' or '0' according to bit value.
bin6[ 6 ] = '\0'; // EoS: Note this is NOT the same as '0'.
}
Show a number as binary for human reading:
That's the part I'll leave for you.
Hint: using array it is very easy.
Tell me if you need more.
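For readers who want to see the two parts combined, here is one possible sketch of the counter itself. The helper names format_bin6 and print_counter are hypothetical; the formatting loop follows the array approach above, writing the most significant bit first.

```c
#include <stdio.h>

/* Hypothetical helper: writes the lowest 6 bits of number into out,
 * most significant bit first, followed by the terminating '\0'. */
void format_bin6(unsigned int number, char out[7])
{
    for (int bit = 5; bit >= 0; --bit)
        out[5 - bit] = (number & (1u << bit)) ? '1' : '0';
    out[6] = '\0';
}

/* Prints 000000 through 111111, one value per line. */
void print_counter(void)
{
    char text[7];
    for (unsigned int value = 0; value < 64; ++value) {
        format_bin6(value, text);
        puts(text);
    }
}
```

Calling print_counter() from main is all that is needed; no pointers to unallocated memory and no string concatenation are involved.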
I am trying to write C code which will print the first 1 million Fibonacci numbers.
UPDATE: The actual problem is I want to get the last 10 digits of F(1,000,000)
I understand how the sequence works and how to write the code to achieve that however as F(1,000,000) is very large I am struggling to find a way to represent it.
This is code I am using:
#include<stdio.h>
int main()
{
unsigned long long n, first = 0, second = 1, next, c;
printf("Enter the number of terms\n");
scanf("%d",&n);
printf("First %d terms of Fibonacci series are :-\n",n);
for ( c = 0 ; c < n ; c++ )
{
if ( c <= 1 )
next = c;
else
{
next = first + second;
first = second;
second = next;
}
printf("%d\n",next);
}
return 0;
}
I am using long long to try and make sure there are enough bits to store the number.
This is the output for the first 100 numbers:
First 100 terms of Fibonacci series are :-
0
1
1
2
3
5
8
13
21
34
55
89
144
233
377
610
987
1597
2584
4181
6765
10946
17711
28657
46368
75025
121393
196418
317811
514229
832040
1346269
2178309
3524578
5702887
9227465
14930352
24157817
39088169
63245986
102334155
165580141
267914296
433494437
701408733
1134903170
1836311903
-1323752223
512559680
-811192543
-298632863
-1109825406
-1408458269
...
Truncated the output but you can see the problem, I believe the size of the number generated is causing the value to overflow to negative. I don't understand how to stop it in all honesty.
Can anybody point me in the right direction to how to actually handle numbers of this size?
I haven't tried to print the first million because if it fails on printing F(100) there isn't much hope of it printing F(1,000,000).
You want the last 10 digits of Fib(1000000). Read much more about Fibonacci numbers (and read twice).
Without thinking much, you could use some bignum library like GMPlib. You would loop to compute Fib(1000000) using a few mpz_t bigint variables (you certainly don't need an array of a million mpz_t, just fewer mpz_t variables than you have fingers on one hand). Of course, you won't print all the Fibonacci numbers, only the 1000000th one (a cheap laptop today has enough memory, and would spit that number out in less than an hour). As John Coleman answered, it has about 200K digits (i.e. 2500 lines of 80 digits each).
(BTW, when thinking of a program producing some big output, you'll better guess-estimate the typical size of that output and the typical time to get it; if it does not fit in your desktop room -or your desktop computer-, you have a problem, perhaps an economical one: you need to buy more computing resources)
Notice that efficient bignum arithmetic is a hard subject. Clever algorithms exist for bignum arithmetic which are much more efficient than the naive one you would imagine.
Actually, you don't need any bigints. Read some math textbook about modular arithmetic. The remainder of a sum (or a product) is congruent to the sum (resp. the product) of the remainders. Use that property. A 10-digit integer fits in a 64-bit int64_t, so with some thinking you don't need any bignum library.
(I guess that with slightly more thinking, you don't need any computer or any C program to compute that. A cheap calculator, a pencil and a paper should be enough, and probably the calculator is not needed at all.)
The lesson to learn when programming (or when solving math exercises) is to think about the problem and try to reformulate the question before starting coding. J.Pitrat (an Artificial Intelligence pioneer in France, now retired, but still working on his computer) has several interesting blog entries related to that: Is it possible to define a problem?, When Donald and Gerald meet Robert, etc.
Understanding and thinking about the problem (and sub-problems too!) is an interesting part of software development. If you work in software development, you'll first be asked to solve real-world problems (e.g. make a selling website, or an autonomous vacuum cleaner), and you'll need to think in order to transform that problem into something which is codable on a computer. Be patient, you'll need ten years to learn programming.
To "get the last 10 digits of F(1,000,000)", simply apply the remainder function % when calculating next and use the correct format specifier: "%llu".
There is no need to sum digits more significant than the 10 least significant digits.
// scanf("%d",&n);
scanf("%llu",&n);
...
{
// next = first + second;
next = (first + second) % 10000000000;
first = second;
second = next;
}
// printf("%d\n",next);
printf("%010llu\n",next);
My output (x'ed the last 5 digits to not give-away the final answer)
66843xxxxx
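Put together as a complete function, the change above might look like this sketch. The name fib_last10 and the indexing (F(0) = 0, F(1) = 1) are assumptions made for the example; the question's loop prints F(0) first, so its millionth line corresponds to F(999999).

```c
/* Returns F(n) modulo 10^10, with F(0) = 0 and F(1) = 1. */
unsigned long long fib_last10(unsigned long long n)
{
    const unsigned long long MOD = 10000000000ULL;   /* 10^10 */
    unsigned long long first = 0, second = 1;        /* F(k) and F(k+1) mod 10^10 */
    for (unsigned long long k = 0; k < n; ++k) {
        /* Both operands are below 10^10, so the sum cannot overflow. */
        unsigned long long next = (first + second) % MOD;
        first = second;
        second = next;
    }
    return first;
}
```

Printing the result with printf("%010llu\n", fib_last10(999999)) keeps any leading zeros among the last ten digits.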
By Binet's Formula the nth Fibonacci Number is approximately the golden ratio (roughly 1.618) raised to the power n and then divided by the square root of 5. A simple use of logarithms shows that the millionth Fibonacci number thus has over 200,000 digits. The average length of one of the first million Fibonacci numbers is thus over 100,000 = 10^5. You are thus trying to print 10^11 = 100 billion digits. I think that you will need more than a big int library to do that.
On the other hand -- if you want to simply compute the millionth number, you can do so -- though it would be better to use a method which doesn't compute all of the intermediate numbers (as simply computing rather than printing them all would still be infeasible for large enough n). It is well known (see this) that the nth Fibonacci number is one of the 4 entries of the nth power of the matrix [[1,1],[1,0]]. If you use exponentiation by squaring (which works for matrix powers as well since matrix multiplication is associative) together with a good big int library -- it becomes perfectly feasible to compute the millionth Fibonacci number.
[On Further Edit]: Here is a Python program to compute very large Fibonacci numbers, modified to now accept an optional modulus. Under the hood it is using a good C bignum library.
def mmult(A,B,m = False):
    #assumes A,B are 2x2 matrices
    #m is an optional modulus
    a = A[0][0]*B[0][0] + A[0][1]*B[1][0]
    b = A[0][0]*B[0][1] + A[0][1]*B[1][1]
    c = A[1][0]*B[0][0] + A[1][1]*B[1][0]
    d = A[1][0]*B[0][1] + A[1][1]*B[1][1]
    if m:
        return [[a%m,b%m],[c%m,d%m]]
    else:
        return [[a,b],[c,d]]
def mpow(A,n,m = False):
    #assumes A is 2x2
    if n == 0:
        return [[1,0],[0,1]]
    elif n == 1: return [row[:] for row in A] #copy A
    else:
        d,r = divmod(n,2)
        B = mpow(A,d,m)
        B = mmult(B,B,m)
        if r > 0:
            B = mmult(B,A,m)
        return B
def Fib(n,m = False):
    Q = [[1,1],[1,0]]
    return mpow(Q,n,m)[0][1]
n = Fib(999999)
print(len(str(n)))
print(n % 10**10)
googol = 10**100
print(Fib(googol, googol))
Output (with added whitespace):
208988
6684390626
3239047153240982923932796604356740872797698500591032259930505954326207529447856359183788299560546875
Note that what you call the millionth Fibonacci number, I call the 999,999th -- since it is more standard to start with 1 as the first Fibonacci number (and call 0 the 0th if you want to count it as a Fibonacci number). The first output number confirms that there are over 200,000 digits in the number and the second gives the last 10 digits (which is no longer a mystery). The final number is the last 100 digits of the googolth Fibonacci number -- computed in a small fraction of a second. I haven't been able to do a googolplex yet :)
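Since the question asked for C, the modular version of the same matrix idea can be sketched there too. This is only an illustration under stated assumptions: the names Mat2, mulmod, mat_mul and fib_mod are hypothetical, the modulus must be positive and below 2^63, and the multiplication is done by repeated doubling because the product of two 10-digit numbers does not fit in 64 bits.

```c
#include <stdint.h>

/* 2x2 matrix [[a, b], [c, d]] with entries already reduced mod m. */
typedef struct { uint64_t a, b, c, d; } Mat2;

/* (x * y) mod m without 128-bit arithmetic; requires 0 < m < 2^63. */
uint64_t mulmod(uint64_t x, uint64_t y, uint64_t m)
{
    uint64_t result = 0;
    x %= m;
    while (y != 0) {
        if (y & 1)
            result = (result + x) % m;   /* result + x < 2m < 2^64 */
        x = (x * 2) % m;
        y >>= 1;
    }
    return result;
}

/* Matrix product modulo m. */
Mat2 mat_mul(Mat2 x, Mat2 y, uint64_t m)
{
    Mat2 r;
    r.a = (mulmod(x.a, y.a, m) + mulmod(x.b, y.c, m)) % m;
    r.b = (mulmod(x.a, y.b, m) + mulmod(x.b, y.d, m)) % m;
    r.c = (mulmod(x.c, y.a, m) + mulmod(x.d, y.c, m)) % m;
    r.d = (mulmod(x.c, y.b, m) + mulmod(x.d, y.d, m)) % m;
    return r;
}

/* F(n) mod m via exponentiation by squaring of [[1,1],[1,0]]. */
uint64_t fib_mod(uint64_t n, uint64_t m)
{
    Mat2 result = { 1 % m, 0, 0, 1 % m };     /* identity matrix */
    Mat2 base   = { 1 % m, 1 % m, 1 % m, 0 };
    while (n != 0) {
        if (n & 1)
            result = mat_mul(result, base, m);
        base = mat_mul(base, base, m);
        n >>= 1;
    }
    return result.b;                          /* off-diagonal entry is F(n) */
}
```

A call such as fib_mod(999999, 10000000000ULL) needs only about 20 squarings, and its result can be compared against the Python output above.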
This question comes without doubt from some programming competition, and you have to read these questions carefully.
The 1 millionth Fibonacci number is HUGE. Probably about 200,000 digits or so. Printing the first 1,000,000 Fibonacci number will kill a whole forest of trees. But read carefully: Nobody asks you for the 1 millionth Fibonacci number. You are asked for the last ten digits of that number.
So if you have the last 10 digits of Fib(n-2) and of Fib(n-1), how can you find the last 10 digits of Fib(n)? How do you calculate the last ten digits of a Fibonacci number without calculating the number itself?
PS. You can't print long long numbers with %d. Use %lld, or %llu for the unsigned long long variables in your code.
Your algorithm is actually correct for as long as the values fit: unsigned long long is wide enough for every Fibonacci number up to F(93). Beyond that, unsigned arithmetic wraps around modulo 2^64, which does not preserve the last 10 decimal digits, so for F(1,000,000) you would additionally have to reduce modulo 10^10 at every step.
The problem is in the format specifier you're using for the output:
printf("%d\n",next);
The %d format specifier expects an int, but you're passing an unsigned long long. Using the wrong format specifier invokes undefined behavior.
What's most likely happening in this particular case is that printf is picking up the low-order 4 bytes of next (as your system seems to be little endian) and interpreting them as a signed int. This ends up displaying the correct values for the first 47 numbers (through 1836311903, the last value below 2^31), but incorrect ones after that.
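The effect can be reproduced deliberately with a small sketch. The helper name low32_as_int is hypothetical, and the conversion assumes a two's complement platform where an out-of-range conversion to a signed type wraps (implementation-defined before C23).

```c
#include <stdint.h>

/* Keeps only the low 32 bits of value and reinterprets them as signed.
 * Assumption: two's complement with wrapping conversion. */
int32_t low32_as_int(uint64_t value)
{
    return (int32_t)(uint32_t)value;
}
```

On such a platform, low32_as_int(2971215073) yields -1323752223 and low32_as_int(4807526976) yields 512559680, which are exactly the first two wrong lines in the output posted in the question.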
Use the correct format specifier, and you'll get the correct results:
printf("%llu\n",next);
You also need to do the same when reading / printing n:
scanf("%llu",&n);
printf("First %llu terms of Fibonacci series are :-\n",n);
Here's the output of numbers 45-60:
701408733
1134903170
1836311903
2971215073
4807526976
7778742049
12586269025
20365011074
32951280099
53316291173
86267571272
139583862445
225851433717
365435296162
591286729879
956722026041
You can print Fibonacci(1,000,000) in C, it takes about 50 lines, a minute and no library :
Some headers are required :
#include <stdio.h>
#include <stdlib.h>
#define BUFFER_SIZE (16 * 3 * 263)
#define BUFFERED_BASE (1LL << 55)
struct buffer {
size_t index;
long long int data[BUFFER_SIZE];
};
Some functions too :
void init_buffer(struct buffer * buffer, long long int n){
buffer->index = BUFFER_SIZE ;
for(;n; buffer->data[--buffer->index] = n % BUFFERED_BASE, n /= BUFFERED_BASE);
}
void fly_add_buffer(struct buffer *buffer, const struct buffer *client) {
long long int a = 0;
size_t i = (BUFFER_SIZE - 1);
for (; i >= client->index; --i)
(a = (buffer->data[i] = (buffer->data[i] + client->data[i] + a)) > (BUFFERED_BASE - 1)) && (buffer->data[i] -= BUFFERED_BASE);
for (; a; buffer->data[i] = (buffer->data[i] + a), (a = buffer->data[i] > (BUFFERED_BASE - 1)) ? buffer->data[i] -= BUFFERED_BASE : 0, --i);
if (++i < buffer->index) buffer->index = i;
}
A base converter is used to format the output in base 10 :
#include <string.h>
// you must free the returned string after usage
static char *to_string_buffer(const struct buffer * buffer, const int base_out) {
static const char *alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
size_t a, b, c = 1, d;
char *s = malloc(c + 1);
strcpy(s, "0");
for (size_t i = buffer->index; i < BUFFER_SIZE; ++i) {
for (a = buffer->data[i], b = c; b;) {
d = ((char *) memchr(alphabet, s[--b], base_out) - alphabet) * BUFFERED_BASE + a;
s[b] = alphabet[d % base_out];
a = d / base_out;
}
while (a) {
s = realloc(s, ++c + 1);
memmove(s + 1, s, c);
*s = alphabet[a % base_out];
a /= base_out;
}
}
return s;
}
Example usage :
#include <sys/time.h>
double microtime() {
struct timeval time;
gettimeofday(&time, 0);
return (double) time.tv_sec + (double) time.tv_usec / 1e6;
}
int main(void){
double a = microtime();
// memory for the 3 numbers is allocated on the stack.
struct buffer number_1 = {0}, number_2 = {0}, number_3 = {0};
init_buffer(&number_1, 0);
init_buffer(&number_2, 1);
for (int i = 0; i < 1000000; ++i) {
number_3 = number_1;
fly_add_buffer(&number_1, &number_2);
number_2 = number_3;
}
char * str = to_string_buffer(&number_1, 10); // output in base 10
puts(str);
free(str);
printf("took %gs\n", microtime() - a);
}
Example output :
The 1000000th Fibonacci number is :
19532821287077577316320149475 ... 03368468430171989341156899652
took 30s including 15s of base 2^55 to base 10 conversion.
Also it's using a nice but slow base converter.
Thank You.
I'm looking for a C function like the following that parses a length-terminated char array that expresses a floating point value and returns that value as a float.
float convert_carray_to_float( char const * inchars, int incharslen ) {
...
}
Constraints:
The character at inchars[incharslen] might be a digit or other character that might confuse the commonly used standard conversion routines.
The routine is not allowed to invoke inchars[incharslen] = 0 to create a z terminated string in place and then use the typical library routines. Even patching up the z-overwritten character before returning is not allowed.
Obviously one could copy the char array in to a new writable char array and append a null at the end, but I am hoping to avoid copying. My concern here is performance.
This will be called often so I'd like this to be as efficient as possible. I'd be happy to write my own routine that parses and builds up the float, but if that's the best solution, I'd be interested in the most efficient way to do this in C.
If you think removing constraint 3 really is the way to go to achieve high performance, please explain why and provide a sample that you think will perform better than solutions that maintain constraint 3.
David Gay's implementation, used in the *BSD libcs, can be found here: https://svnweb.freebsd.org/base/head/contrib/gdtoa/ The most important file is strtod.c, but it requires some of the headers and utilities. Modifying that to check the termination every time the string pointer is updated would be a bit of work but not awful.
However, you might afterwards think that the cost of the extra checks is comparable to the cost of copying the string to a temporary buffer of known length, particularly if the strings are short and of a known length, as in your example of a buffer packed with 3-byte undelimited numbers. On most architectures, if the numbers are no more than 8 bytes long and you were careful to ensure that the buffer had a bit of tail room, you could do the copy with a single 8-byte unaligned memory access at very little cost.
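For comparison, the copy-based alternative mentioned above can be sketched in a few lines. The name convert_via_copy and the 32-byte buffer size are assumptions for the example, and it uses a plain memcpy rather than the single 8-byte access described in the answer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant: copies the characters into a small stack buffer,
 * terminates the copy, and lets strtof do the parsing. The input is never
 * modified. Returns 0.0f if the input is empty or too long for the buffer. */
float convert_via_copy(const char *inchars, int incharslen)
{
    char buf[32];
    if (incharslen <= 0 || incharslen >= (int)sizeof buf)
        return 0.0f;
    memcpy(buf, inchars, (size_t)incharslen);
    buf[incharslen] = '\0';
    return strtof(buf, NULL);
}
```

Whether this is slower than a hand-written parser with per-character bounds checks is something only a measurement on the target machine can settle.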
Here's a pretty good outline.
Not sure it covers all cases, but it shows most of the flow:
#include <ctype.h>   /* isdigit */
#include <stdbool.h> /* bool, true, false */
float convert_carray_to_float(char const * inchars, int incharslen)
{
int Sign = +1;
int IntegerPart = 0;
int DecimalPart = 0;
int Denominator = 1;
bool beforeDecimal = true;
if (incharslen == 0)
{
return 0.0f;
}
int i=0;
if (inchars[0] == '-')
{
Sign = -1;
i++;
}
else if (inchars[0] == '+')
{
Sign = +1;
i++;
}
for( ; i<incharslen; ++i)
{
if (inchars[i] == '.')
{
beforeDecimal = false;
continue;
}
if (!isdigit((unsigned char)inchars[i]))
{
return 0.0f;
}
if (beforeDecimal)
{
IntegerPart = 10 * IntegerPart + (inchars[i] - '0');
}
else
{
DecimalPart = 10 * DecimalPart + (inchars[i] - '0');
Denominator *= 10;
}
}
return Sign * (IntegerPart + ((float)DecimalPart / Denominator));
}
I don't know what I am doing wrong in trying to calculate prime factorizations using Pollard's rho algorithm.
#include<stdio.h>
#define f(x) x*x-1
int pollard( int );
int gcd( int, int);
int main( void ) {
int n;
scanf( "%d",&n );
pollard( n );
return 0;
}
int pollard( int n ) {
int i=1,x,y,k=2,d;
x = rand()%n;
y = x;
while(1) {
i++;
x = f( x ) % n;
d = gcd( y-x, n);
if(d!=1 && d!=n)
printf( "%d\n", d);
if(i == k) {
y = x;
k = 2 * k;
}
}
}
int gcd( int a, int b ) {
if( b == 0)
return a;
else
return gcd( b, a % b);
}
One immediate problem is, as Peter de Rivaz suspected, the macro
#define f(x) x*x-1
Thus the line
x = f(x)%n;
becomes
x = x*x-1%n;
and the precedence of % is higher than that of -, hence the expression is implicitly parenthesised as
x = (x*x) - (1%n);
which is equivalent to x = x*x - 1; (I assume n > 1, anyway it's x = x*x - constant;) and if you start with a value x >= 2, you have overflow before you had a realistic chance of finding a factor:
2 -> 2*2-1 = 3 -> 3*3 - 1 = 8 -> 8*8 - 1 = 63 -> 3968 -> 15745023 -> overflow if int is 32 bits
That doesn't immediately make it impossible that gcd(y-x,n) is a factor, though. It just makes it likely that at a stage where theoretically, you would have found a factor, the overflow destroys the common factor that mathematically would exist - more likely than a common factor introduced by overflow.
Overflow of signed integers is undefined behaviour, so there are no guarantees how the programme behaves, but usually it behaves consistently so the iteration of f still produces a well-defined sequence for which the algorithm in principle works.
Another problem is that y-x will frequently be negative, and then the computed gcd can also be negative - often -1. In that case, you print -1.
And then, it is a not too rare occurrence that iterating f from a starting value doesn't detect a common factor because the cycles modulo both prime factors (for the example of n a product of two distinct primes) have equal length and are entered at the same time. You make no attempt at detecting such a case; whenever gcd(|y-x|, n) == n, any further work in that sequence is pointless, so you should break out of the loop when d == n.
Also, you never check whether n is a prime, in which case trying to find a factor is a futile undertaking from the start.
Furthermore, after fixing f(x) so that the % n applies to the complete result of f(x), you have the problem that x*x still overflows for relatively small x (with the standard signed 32-bit ints, for x >= 46341), so factoring larger n may fail due to overflow. At least, you should use unsigned long long for the computations, so that overflow is avoided for n < 2^32. However, factorising such small numbers is typically done more efficiently with trial division. Pollard's Rho method and other advanced factoring algorithms are meant for larger numbers, where trial division is no longer efficient or even feasible.
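A sketch that applies the fixes listed above (a function instead of the macro, unsigned 64-bit arithmetic, the absolute difference, and stopping when d == n) could look as follows. The function names are hypothetical, n is assumed to be a composite number greater than 1 and below 2^32, and a fixed start value of 2 stands in for the rand() call so that runs are repeatable.

```c
#include <stdint.h>

/* Euclid's algorithm on unsigned values, so the result is never negative. */
uint64_t gcd_u64(uint64_t a, uint64_t b)
{
    while (b != 0) {
        uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* f(x) = x*x - 1 (mod n). With x < n < 2^32 the product fits in 64 bits;
 * adding n - 1 instead of subtracting 1 avoids wrapping below zero. */
uint64_t rho_step(uint64_t x, uint64_t n)
{
    return (x * x + n - 1) % n;
}

/* Returns a non-trivial factor of n, or 0 if this sequence fails. */
uint64_t pollard_rho(uint32_t n)
{
    uint64_t x = 2 % n, y = x;
    uint64_t i = 1, k = 2;
    for (;;) {
        ++i;
        x = rho_step(x, n);
        uint64_t diff = x > y ? x - y : y - x;   /* |x - y| */
        uint64_t d = gcd_u64(diff, n);
        if (d == n)
            return 0;        /* cycle closed without a factor: give up */
        if (d != 1)
            return d;
        if (i == k) {        /* save x at i = 2, 4, 8, ... as in the question */
            y = x;
            k *= 2;
        }
    }
}
```

A return value of 0 does not prove that n is prime; it only means this start value and this f found nothing, so a caller would retry with a different constant in f or fall back to trial division.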
I'm just a novice at C++, and I am new to Stack Overflow, so some of what I have written is going to look sloppy, but this should get you going in the right direction. The program posted here should generally find and return one non-trivial factor of the number you enter at the prompt, or it will apologize if it cannot find such a factor.
I tested it with a few semiprime numbers, and it worked for me. For 371156167103, it finds 607619 without any detectable delay after I hit the enter key. I didn't check it with larger numbers than this. I used unsigned long long variables, but if possible, you should get and use a library that provides even larger integer types.
Editing to add, the single call to the method f for X and 2 such calls for Y is intentional and is in accordance with the way the algorithm works. I thought to nest the call for Y inside another such call to keep it on one line, but I decided to do it this way so it's easier to follow.
#include "stdafx.h"
#include <stdio.h>
#include <iostream>
typedef unsigned long long ULL;
ULL pollard(ULL numberToFactor);
ULL gcd(ULL differenceBetweenCongruentFunctions, ULL numberToFactor);
ULL f(ULL x, ULL numberToFactor);
int main(void)
{
ULL factor;
ULL n;
std::cout<<"Enter the number for which you want a prime factor: ";
std::cin>>n;
factor = pollard(n);
if (factor == 0) std::cout<<"No factor found. Your number may be prime, but it is not certain.\n\n";
else std::cout<<"One factor is: "<<factor<<"\n\n";
}
ULL pollard(ULL n)
{
ULL x = 2ULL;
ULL y = 2ULL;
ULL d = 1ULL;
while(d==1||d==n)
{
x = f(x,n);
y = f(y,n);
y = f(y,n);
if (y>x)
{
d = gcd(y-x, n);
}
else
{
d = gcd(x-y, n);
}
}
return d;
}
ULL gcd(ULL a, ULL b)
{
if (a==b||a==0)
return 0; // If x==y or if the absolute value of (x-y) == the number to be factored, then we have failed to find
// a factor. I think this is not proof of primality, so the process could be repeated with a new function.
// For example, by replacing x*x+1 with x*x+2, and so on. If many such functions fail, primality is likely.
ULL currentGCD = 1;
while (currentGCD!=0) // This while loop is based on Euclid's algorithm
{
currentGCD = b % a;
b=a;
a=currentGCD;
}
return b;
}
ULL f(ULL x, ULL n)
{
return (x * x + 1) % n;
}
Sorry for the long delay getting back to this. As I mentioned in my first answer, I am a novice at C++, which will be evident in my excessive use of global variables, excessive use of BigIntegers and BigUnsigned where other types might be better, lack of error checking, and other programming habits on display which a more skilled person might not exhibit. That being said, let me explain what I did, then will post the code.
I am doing this in a second answer because the first answer is useful as a very simple demo of how easy Pollard's Rho algorithm is to implement once you understand what it does. What it does is first take 2 variables, call them x and y, and assign them the starting value 2. Then it runs x through a function, usually (x^2+1)%n, where n is the number you want to factor, and it runs y through the same function twice each cycle. Then the difference between x and y is calculated, and finally the greatest common divisor of this difference and n is found. If that number is 1, you run x and y through the function again.
Continue this process until the GCD is not 1 or until x and y are equal again. If the GCD is found which is not 1, then that GCD is a non-trivial factor of n. If x and y become equal, then the (x^2+1)%n function has failed. In that case, you should try again with another function, maybe (x^2+2)%n, and so on.
Here is an example. Take 35, for which we know the prime factors are 5 and 7. I'll walk through Pollard Rho and show you how it finds a non-trivial factor.
Cycle #1: X starts at 2. Then using the function (x^2+1)%n, (2^2+1)%35, we get 5 for x. Y starts at 2 also, and after one run through the function, it also has a value of 5. But y always goes through the function twice, so the second run is (5^2+1)%35, or 26. The difference between x and y is 21. The GCD of 21 (the difference) and 35 (n) is 7. We have already found a prime factor of 35! Note that the GCD of any 2 numbers, even extremely large ones, can be found very quickly using Euclid's algorithm, and that's what the program I will post here does.
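The cycle above can be sketched in a few lines on plain 64-bit integers. This is only an illustration, not the BigUnsigned program posted below: the names euclid_gcd, step and rho_sketch are made up for the example, and it assumes n is below 2^32 so that x*x cannot overflow.

```cpp
#include <cstdint>

// Euclid's algorithm on 64-bit integers; euclid_gcd(0, n) returns n.
std::uint64_t euclid_gcd(std::uint64_t a, std::uint64_t b) {
    while (b != 0) {
        std::uint64_t r = a % b;
        a = b;
        b = r;
    }
    return a;
}

// One application of the function (x^2 + c) % n.
// Assumption: n < 2^32, so x * x + c fits in 64 bits.
std::uint64_t step(std::uint64_t x, std::uint64_t c, std::uint64_t n) {
    return (x * x + c) % n;
}

// Returns a non-trivial factor of n, or 0 if x and y became equal
// again without finding one (this value of c has failed).
std::uint64_t rho_sketch(std::uint64_t n, std::uint64_t c = 1) {
    std::uint64_t x = 2, y = 2, d = 1;
    while (d == 1) {
        x = step(x, c, n);               // x goes through the function once
        y = step(step(y, c, n), c, n);   // y goes through it twice
        std::uint64_t diff = (x > y) ? x - y : y - x;
        d = euclid_gcd(diff, n);         // diff == 0 gives d == n
    }
    return (d == n) ? 0 : d;
}
```

With these definitions, rho_sketch(35) returns 7 after a single cycle (x = 5, y = 26, GCD of 21 and 35), matching the walkthrough.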
On the subject of the GCD function, I am using one library I downloaded for this program, a library that allows me to use BigIntegers and BigUnsigned. That library also has a GCD function built in, and I could have used it. But I decided to stay with the hand-written GCD function for instructional purposes. If you want to improve the program's execution time, it might be a good idea to use the library's GCD function because there are faster methods than Euclid, and the library may be written to use one of those faster methods.
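One commonly cited alternative to Euclid's remainder loop is the binary GCD (Stein's algorithm), which replaces division with shifts and subtraction. Whether the BigInteger library uses it is not something I can confirm; the sketch below is only meant to show what such a method looks like on 64-bit integers, and the name binary_gcd is made up for the example.

```cpp
#include <cstdint>
#include <utility>

// Binary GCD (Stein's algorithm) on 64-bit unsigned integers.
std::uint64_t binary_gcd(std::uint64_t a, std::uint64_t b) {
    if (a == 0) return b;
    if (b == 0) return a;
    int shift = 0;
    // Strip factors of 2 common to both numbers and remember how many.
    while (((a | b) & 1) == 0) {
        a >>= 1;
        b >>= 1;
        ++shift;
    }
    while ((a & 1) == 0) a >>= 1;        // make a odd
    while (b != 0) {
        while ((b & 1) == 0) b >>= 1;    // make b odd
        if (a > b) std::swap(a, b);      // keep a <= b
        b -= a;                          // odd minus odd is even (or zero)
    }
    return a << shift;                   // restore the common factors of 2
}
```

For the example in the walkthrough, binary_gcd(21, 35) gives 7, the same result as the Euclid-based function.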
Another side note. The .Net 4.5 library supports the use of BigIntegers and BigUnsigned also. I decided not to use that for this program because I wanted to write the whole thing in C++, not C++/CLI. You could get better performance from the .Net library, or you might not. I don't know, but I wanted to share that that is also an option.
I am jumping around a bit here, so let me start now by explaining in broad strokes what the program does, and lastly I will explain how to set it up on your computer if you use Visual Studio 11 (also called Visual Studio 2012).
The program allocates 3 arrays for storing the factors of any number you give it to process. These arrays are 1000 elements wide, which is excessive, maybe, but it ensures any number with 1000 prime factors or less will fit.
When you enter the number at the prompt, it assumes the number is composite and puts it in the first element of the compositeFactors array. Then it goes through some admittedly inefficient while loops, which use Miller-Rabin to check if the number is composite. Note this test can either say a number is composite with 100% confidence, or it can say the number is prime with extremely high (but not 100%) confidence. The confidence is adjustable by a variable confidenceFactor in the program. The program will make one check for every value between 2 and confidenceFactor, inclusive, so one less total check than the value of confidenceFactor itself.
The setting I have for confidenceFactor is 101, which does 100 checks. If it says a number is prime, the odds that it is really composite are at most about 1 in 4^100 (the usual bound, which strictly speaking assumes randomly chosen test values), or the same as the odds of correctly calling the flip of a fair coin 200 consecutive times. In short, if it says the number is prime, it probably is, but the confidenceFactor number can be increased to get greater confidence at the cost of speed.
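As a rough illustration of the test described above, here is a small Miller-Rabin sketch on 64-bit integers. It is not the BigUnsigned code from the program: the names pow_mod and proven_composite are made up, the parameter max_base plays the role of confidenceFactor, and it assumes n is below 2^32 so the products cannot overflow.

```cpp
#include <cstdint>

// Modular exponentiation by repeated squaring.
// Assumption: mod < 2^32, so every product fits in 64 bits.
std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp, std::uint64_t mod) {
    std::uint64_t result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = (result * base) % mod;
        base = (base * base) % mod;
        exp >>= 1;
    }
    return result;
}

// Returns true if n is proven not prime, false if n is probably prime.
// Every base from 2 through max_base is tried.
bool proven_composite(std::uint64_t n, std::uint64_t max_base = 101) {
    if (n < 2) return true;        // 0 and 1 are treated as "not prime" here
    if (n < 4) return false;       // 2 and 3 are prime
    if (n % 2 == 0) return true;   // even numbers above 2 are composite

    // Write n - 1 as d * 2^s with d odd.
    std::uint64_t d = n - 1;
    int s = 0;
    while (d % 2 == 0) {
        d /= 2;
        ++s;
    }

    if (max_base >= n) max_base = n - 1;
    for (std::uint64_t a = 2; a <= max_base; ++a) {
        std::uint64_t x = pow_mod(a, d, n);
        if (x == 1 || x == n - 1) continue;   // this base proves nothing
        bool witness = true;                  // assume a proves n composite
        for (int r = 1; r < s; ++r) {
            x = (x * x) % n;
            if (x == n - 1) {                 // a cannot prove n composite
                witness = false;
                break;
            }
        }
        if (witness) return true;             // proven composite
    }
    return false;                             // no base proved n composite
}
```

For example, proven_composite(2047, 2) returns false because 2047 = 23 * 89 passes the test for base 2 alone, while proven_composite(2047) returns true once base 3 is tried. That is the reason for checking many bases.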
Here might be as good a place as any to mention that, while Pollard's Rho algorithm can be pretty effective factoring smaller numbers of type long long, the Miller-Rabin test to see if a number is composite would be more or less useless without the BigInteger and BigUnsigned types. A BigInteger library is pretty much a requirement to be able to reliably factor large numbers all the way to their prime factors like this.
When Miller-Rabin says a number is composite, Pollard's Rho finds one factor of it; that factor is stored in a temp array, and the original number in the composites array is divided by the same factor. When numbers are identified as likely prime, they are moved into the prime factors array and output to screen. This process continues until there are no composite factors left. The factors tend to be found in ascending order, but this is coincidental. The program makes no effort to list them in ascending order, but only lists them as they are found.
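The split-until-prime process can also be written with a growable work list instead of three fixed arrays. The following is a hedged sketch on 64-bit integers, not the posted program: all names are made up, trial division stands in for Miller-Rabin as the primality check, inputs are assumed to be below 2^32, and the result is sorted at the end (which the posted program does not do).

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Stand-in primality check by trial division (the program uses Miller-Rabin).
bool is_prime_small(std::uint64_t n) {
    if (n < 2) return false;
    for (std::uint64_t p = 2; p * p <= n; ++p)
        if (n % p == 0) return false;
    return true;
}

std::uint64_t gcd_u64(std::uint64_t a, std::uint64_t b) {
    while (b != 0) {
        std::uint64_t r = a % b;
        a = b;
        b = r;
    }
    return a;
}

// Finds some non-trivial factor of a composite n < 2^32.
// Tries (x^2 + c) % n for c = 1..20, then falls back to trial division
// if every c fails (as happens for n = 4).
std::uint64_t find_factor(std::uint64_t n) {
    for (std::uint64_t c = 1; c <= 20; ++c) {
        std::uint64_t x = 2, y = 2, d = 1;
        while (d == 1) {
            x = (x * x + c) % n;
            y = (y * y + c) % n;
            y = (y * y + c) % n;
            d = gcd_u64(x > y ? x - y : y - x, n);
        }
        if (d != n) return d;
    }
    for (std::uint64_t p = 2; p * p <= n; ++p)
        if (n % p == 0) return p;
    return n;   // not reached when n is composite
}

// Splits n into prime factors using a work list of numbers still to process.
std::vector<std::uint64_t> prime_factors(std::uint64_t n) {
    std::vector<std::uint64_t> primes;
    std::vector<std::uint64_t> pending;
    if (n > 1) pending.push_back(n);
    while (!pending.empty()) {
        std::uint64_t m = pending.back();
        pending.pop_back();
        if (is_prime_small(m)) {
            primes.push_back(m);
        } else {
            std::uint64_t f = find_factor(m);   // split m into f and m / f
            pending.push_back(f);
            pending.push_back(m / f);
        }
    }
    std::sort(primes.begin(), primes.end());
    return primes;
}
```

A vector grows as needed, so there is no 1000-element limit to choose and no need to scan for non-zero slots.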
Note that I could not find any function (x^2+c)%n which will factor the number 4, no matter what value I gave c. Pollard Rho seems to have a very hard time with all perfect squares, but 4 is the only composite number I found which is totally impervious to it using functions in the format described. Therefore I added a check for an n of 4 inside the pollard method, returning 2 instantly if so.
So to set this program up, here is what you should do. Go to https://mattmccutchen.net/bigint/ and download bigint-2010.04.30.zip. Unzip this and put all of the .hh files and all of the C++ source files in your ~\Program Files\Microsoft Visual Studio 11.0\VC\include directory, excluding the Sample and C++ Testsuite source files. Then in Visual Studio, create an empty project. In Solution Explorer, right-click on the Resource Files folder and select Add...existing item. Add all of the C++ source files in the directory I just mentioned. Then, also in Solution Explorer, right-click the Source Files folder and add a new item, select C++ file, name it, and paste the source code below into it, and it should work for you.
Not to flatter overly much, but there are folks here on Stack Overflow who know a great deal more about C++ than I do, and if they modify my code below to make it better, that's fantastic. But even if not, the code is functional as-is, and it should help illustrate the principles involved in programmatically finding prime factors of medium sized numbers. It will not threaten the general number field sieve, but it can factor numbers with 12 - 14 digit prime factors in a reasonably short time, even on an old Core2 Duo computer like the one I am using.
The code follows. Good luck.
#include <string>
#include <stdio.h>
#include <iostream>
#include "BigIntegerLibrary.hh"
typedef BigInteger BI;
typedef BigUnsigned BU;
using std::string;
using std::cin;
using std::cout;
BU pollard(BU numberToFactor);
BU gcda(BU differenceBetweenCongruentFunctions, BU numberToFactor);
BU f(BU x, BU numberToFactor, int increment);
void initializeArrays();
BU getNumberToFactor ();
void factorComposites();
bool testForComposite (BU num);
BU primeFactors[1000];
BU compositeFactors[1000];
BU tempFactors [1000];
int primeIndex;
int compositeIndex;
int tempIndex;
int numberOfCompositeFactors;
bool allJTestsShowComposite;
int main ()
{
while(1)
{
primeIndex=0;
compositeIndex=0;
tempIndex=0;
initializeArrays();
compositeFactors[0] = getNumberToFactor();
cout<<"\n\n";
if (compositeFactors[0] == 0) return 0;
numberOfCompositeFactors = 1;
factorComposites();
}
}
void initializeArrays()
{
for (int i = 0; i<1000;i++)
{
primeFactors[i] = 0;
compositeFactors[i]=0;
tempFactors[i]=0;
}
}
BU getNumberToFactor ()
{
std::string s;
std::cout<<"Enter the number for which you want a prime factor, or 0 to quit: ";
std::cin>>s;
return stringToBigUnsigned(s);
}
void factorComposites()
{
while (numberOfCompositeFactors!=0)
{
compositeIndex = 0;
tempIndex = 0;
// This while loop finds non-zero values in compositeFactors.
// If they are composite, it factors them and puts one factor in tempFactors,
// then divides the element in compositeFactors by the same amount.
// If the element is prime, it moves it into tempFactors (zeros the element in compositeFactors)
while (compositeIndex < 1000)
{
if(compositeFactors[compositeIndex] == 0)
{
compositeIndex++;
continue;
}
if(testForComposite(compositeFactors[compositeIndex]) == false)
{
tempFactors[tempIndex] = compositeFactors[compositeIndex];
compositeFactors[compositeIndex] = 0;
tempIndex++;
compositeIndex++;
}
else
{
tempFactors[tempIndex] = pollard (compositeFactors[compositeIndex]);
compositeFactors[compositeIndex] /= tempFactors[tempIndex];
tempIndex++;
compositeIndex++;
}
}
compositeIndex = 0;
// This while loop moves all remaining non-zero values from compositeFactors into tempFactors
// When it is done, compositeFactors should be all 0 value elements
while (compositeIndex < 1000)
{
if (compositeFactors[compositeIndex] != 0)
{
tempFactors[tempIndex] = compositeFactors[compositeIndex];
compositeFactors[compositeIndex] = 0;
tempIndex++;
compositeIndex++;
}
else compositeIndex++;
}
compositeIndex = 0;
tempIndex = 0;
// This while loop checks all non-zero elements in tempFactors.
// Those that are prime are shown on screen and moved to primeFactors
// Those that are composite are moved to compositeFactors
// When this is done, all elements in tempFactors should be 0
while (tempIndex<1000)
{
if(tempFactors[tempIndex] == 0)
{
tempIndex++;
continue;
}
if(testForComposite(tempFactors[tempIndex]) == false)
{
primeFactors[primeIndex] = tempFactors[tempIndex];
cout<<primeFactors[primeIndex]<<"\n";
tempFactors[tempIndex]=0;
primeIndex++;
tempIndex++;
}
else
{
compositeFactors[compositeIndex] = tempFactors[tempIndex];
tempFactors[tempIndex]=0;
compositeIndex++;
tempIndex++;
}
}
compositeIndex=0;
numberOfCompositeFactors=0;
// This while loop just checks to be sure there are still one or more composite factors.
// As long as there are, the outer while loop will repeat
while(compositeIndex<1000)
{
if(compositeFactors[compositeIndex]!=0) numberOfCompositeFactors++;
compositeIndex ++;
}
}
return;
}
// The following method uses the Miller-Rabin primality test to prove with 100% confidence a given number is composite,
// or to establish with a high level of confidence -- but not 100% -- that it is prime
bool testForComposite (BU num)
{
BU confidenceFactor = 101;
if (confidenceFactor >= num) confidenceFactor = num-1;
BU a,d,s, nMinusOne;
nMinusOne=num-1;
d=nMinusOne;
s=0;
while(modexp(d,1,2)==0)
{
d /= 2;
s++;
}
for (BI i = 2 ; i<=confidenceFactor;i++)
{
allJTestsShowComposite = true; // reset for every i: assume composite until this value of i shows otherwise
if (modexp(i,d,num) == 1)
continue; // if this modulus is 1, then we cannot prove that num is composite with this value of i, so continue
if (modexp(i,d,num) == nMinusOne)
{
allJTestsShowComposite = false;
continue;
}
BU exponent(1);
for (BU j(0); j.toInt()<=s.toInt()-1;j++)
{
exponent *= 2;
if (modexp(i,exponent*d,num) == nMinusOne)
{
// if the modulus is num-1 for even a single j, this i cannot prove num composite, so break and increment i.
allJTestsShowComposite = false;
break;
}
}
if (allJTestsShowComposite == true) return true; // proven composite with 100% certainty, no need to continue testing
}
return false;
/* not proven composite in any test, so assume prime with a possibility of error of at most
(1/4)^(number of different values of i tested). The number of values tested is one less than the value of the
confidenceFactor variable, and the "witnesses" to the primality of the number being tested will be all integers from
2 through the value of confidenceFactor.
Note that this makes this primality test cryptographically less secure than it could be. It is theoretically possible,
if difficult, for a malicious party to pass a known composite number for which all of the lowest n integers fail to
detect that it is composite. A safer way is to generate random integers in the outer "for" loop and use those in place of
the variable i. Better still if those random numbers are checked to ensure no duplicates are generated.
*/
}
BU pollard(BU n)
{
if (n == 4) return 2;
BU x = 2;
BU y = 2;
BU d = 1;
int increment = 1;
while(d==1||d==n||d==0)
{
x = f(x,n, increment);
y = f(y,n, increment);
y = f(y,n, increment);
if (y>x)
{
d = gcda(y-x, n);
}
else
{
d = gcda(x-y, n);
}
if (d==0)
{
x = 2;
y = 2;
d = 1;
increment++; // This changes the pseudorandom function we use to increment x and y
}
}
return d;
}
BU gcda(BU a, BU b)
{
if (a==b||a==0)
return 0; // If x==y or if the absolute value of (x-y) == the number to be factored, then we have failed to find
// a factor. I think this is not proof of primality, so the process could be repeated with a new function.
// For example, by replacing x*x+1 with x*x+2, and so on. If many such functions fail, primality is likely.
BU currentGCD = 1;
while (currentGCD!=0) // This while loop is based on Euclid's algorithm
{
currentGCD = b % a;
b=a;
a=currentGCD;
}
return b;
}
BU f(BU x, BU n, int increment)
{
return (x * x + increment) % n;
}
As far as I can see, Pollard Rho normally uses f(x) as (x*x+1) (e.g. in these lecture notes).
Your choice of x*x-1 does not appear to be as good, because it often seems to get stuck in a loop:
x=0
f(x)=-1
f(f(x))=0
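To see the same two-step loop with unsigned arithmetic, where -1 shows up as n - 1, here is a tiny sketch. The function name f_minus_one is made up for the example, and it assumes 2 <= n < 2^32 so that x*x cannot overflow.

```cpp
#include <cstdint>

// f(x) = (x^2 - 1) mod n, written as x^2 + (n - 1) so the
// unsigned arithmetic never goes below zero.
std::uint64_t f_minus_one(std::uint64_t x, std::uint64_t n) {
    return (x * x + (n - 1)) % n;
}
```

For n = 35, f_minus_one(0, 35) is 34 and f_minus_one(34, 35) is 0 again, because (n-1)^2 - 1 = n^2 - 2n is a multiple of n. Once the sequence reaches 0 it only ever alternates between 0 and n - 1.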
Possible Duplicate:
Generating Random Numbers in Objective-C
How do I generate a random number which is within a range?
This is actually a bit harder to get really correct than most people realize:
int rand_lim(int limit) {
/* return a random number between 0 and limit inclusive.
*/
int divisor = RAND_MAX/(limit+1);
int retval;
do {
retval = rand() / divisor;
} while (retval > limit);
return retval;
}
Attempts that just use % (or, equivalently, /) to get the numbers in a range almost inevitably introduce skew (i.e., some numbers will be generated more often than others).
As to why using % produces skewed results: unless the size of the range you want divides evenly into the number of values rand() can produce (RAND_MAX + 1), skew is inevitable. If you start with small numbers, it's pretty easy to see why. Consider taking 10 pieces of candy (that we'll assume you can't cut, break, etc. into smaller pieces) and trying to divide them evenly between three children. Clearly it can't be done--if you hand out all the candy, the closest you can get is for two kids to get three pieces of candy, and one to get four.
There's only one way for all the kids to get the same number of pieces of candy: make sure you don't hand out the last piece of candy at all.
To relate this to the code above, let's start by numbering the candies from 0 to 9 and the kids from 0 to 2. The initial division says that since the highest candy number is 9 and there are three kids, our divisor is 9/3, or three. We then pull a random candy from the bucket, look at its number, divide by three (discarding any remainder) and hand it to the kid with that number -- but if the result is greater than 2 (i.e. we've picked out candy number 9) we just don't hand it out at all -- we discard it and pick out another candy.
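The candy example is small enough to enumerate exhaustively. The sketch below numbers the candies 0 to 9 and the kids 0 to 2 (zero-based, like the output of rand()), and counts where each equally likely candy ends up under the two rules. The function names are made up for the example.

```cpp
#include <array>

// Maps every candy number 0..9 to a kid with %, discarding nothing.
std::array<int, 3> counts_with_modulo() {
    std::array<int, 3> counts = {{0, 0, 0}};
    for (int candy = 0; candy < 10; ++candy)
        ++counts[candy % 3];
    return counts;
}

// Same enumeration using the divide-and-reject rule from rand_lim:
// divisor = 9 / 3 = 3, and any quotient above 2 is thrown away.
std::array<int, 3> counts_with_rejection(int &discarded) {
    std::array<int, 3> counts = {{0, 0, 0}};
    const int max_candy = 9;   // plays the role of RAND_MAX
    const int limit = 2;       // kids are numbered 0..limit
    const int divisor = max_candy / (limit + 1);
    discarded = 0;
    for (int candy = 0; candy <= max_candy; ++candy) {
        int kid = candy / divisor;
        if (kid > limit)
            ++discarded;       // this draw would be retried
        else
            ++counts[kid];
    }
    return counts;
}
```

With % the counts come out as 4, 3, 3, so kid 0 is favoured. With rejection they are 3, 3, 3 and one candy is discarded, which is the price paid for an even split.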
Of course, if you're using a modern implementation of C++ (i.e., one that supports C++11 or newer), you should usually use one of the distribution classes from the standard library. The code above corresponds most closely with std::uniform_int_distribution, but the standard library also includes uniform_real_distribution as well as classes for a number of non-uniform distributions (Bernoulli, Poisson, normal, maybe a couple others I don't remember at the moment).
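A minimal sketch of the standard-library route might look like the following. The function name random_in_range and the choice of std::mt19937 as the engine are assumptions made for this example, and it assumes min_n <= max_n.

```cpp
#include <random>

// Returns a uniformly distributed integer in [min_n, max_n], both ends inclusive.
// Assumption: min_n <= max_n.
int random_in_range(int min_n, int max_n) {
    // The engine is seeded once, the first time the function is called.
    static std::mt19937 engine{std::random_device{}()};
    std::uniform_int_distribution<int> dist(min_n, max_n);
    return dist(engine);
}
```

Calling random_in_range(1, 100) gives a value from 1 to 100. The distribution class takes care of avoiding skew, so no manual rejection loop is needed.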
int rand_range(int min_n, int max_n)
{
return rand() % (max_n - min_n + 1) + min_n;
}
For fractions:
double rand_range(double min_n, double max_n)
{
return (double)rand()/RAND_MAX * (max_n - min_n) + min_n;
}
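For an integer value in the range [min,max):
double scale = (double) (max - min) / ((double) RAND_MAX + 1.0); // dividing by RAND_MAX + 1 keeps max itself out of the range
int val = min + (int) floor(rand() * scale); // floor() needs <math.h>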
I wrote this specifically in Obj-C for an iPhone project:
- (int) intInRangeMinimum:(int)min andMaximum:(int)max {
if (min > max) { return -1; }
int adjustedMax = (max + 1) - min; // arc4random() % adjustedMax yields a value in {0, ..., adjustedMax - 1}
int random = arc4random() % adjustedMax;
int result = random + min;
return result;
}
To use:
int newNumber = [aClass intInRangeMinimum:1 andMaximum:100];
Add salt to taste
+(NSInteger)randomNumberWithMin:(NSInteger)min WithMax:(NSInteger)max {
if (min>max) {
int tempMax=max;
max=min;
min=tempMax;
}
int randomy=arc4random() % (max-min+1);
randomy=randomy+min;
return randomy;
}
I use this method in a random number related class I made. Works well for my non-demanding needs, but may well be biased in some way.