This programming problem is #85 from a page of Microsoft interview questions. The complete problem description and my solution are posted below, but I wanted to ask my question first.
The rules say that you can loop for a fixed number of times. That is, if 'x' is a variable, you can loop over a block of code based on the value of 'x' at the time that you enter the loop. If 'x' changes during the loop, that won't change how many times you loop. Also, that is the only way to loop. You can't, for instance, loop until some condition is met.
In my solution to the problem, I have a loop which will be set to execute zero or more times. The problem is, in reality, it only ever executes 0 times or 1 time because the only statement in my loop is a return statement. So if we enter the loop, it only has a chance to run once. I am using this tactic instead of using an if-else block, because logical comparisons and if statements are not allowed. The rules don't explicitly say that you can't do this, but I am wondering if you would consider my solution invalid. I couldn't really figure out another way to do it.
So here are my questions:
Do you think my solution is invalid?
If so, did you think of another way to solve the problem?
Problem description:
85) You have an abstract computer, so just forget everything you know
about computers, this one only does what I'm about to tell you it
does. You can use as many variables as you need, there are no negative
numbers, all numbers are integers. You do not know the size of the
integers, they could be infinitely large, so you can't count on
truncating at any point. There are NO comparisons allowed, no if
statements or anything like that. There are only four operations you
can do on a variable.
You can set a variable to 0.
You can set a variable = another variable.
You can increment a variable (only by 1), and it's a post increment.
You can loop. So, if you were to say loop(v1) and v1 = 10, your loop would execute 10 times, but the value in v1 wouldn't change so
the first line in the loop can change value of v1 without changing the
number of times you loop.
You need to do 3 things.
Write a function that decrements by 1.
Write a function that subtracts one variable from another.
Write a function that divides one variable by another.
See if you can implement all 3 using at most 4 variables. Meaning, you're not making function calls now, you're making macros. And at
most you can have 4 variables. The restriction really only applies to
divide, the other 2 are easy to do with 4 vars or less. Division on
the other hand is dependent on the other 2 functions, so, if subtract
requires 3 variables, then divide only has 1 variable left unchanged
after a call to subtract. Basically, just make your function calls to
decrement and subtract so you pass your vars in by reference, and you
can't declare any new variables in a function, what you pass in is all
it gets.
My psuedocode solution (loop(x) means loop through this block of code x times):
// returns number - 1
int decrement(int number)
{
int previous = 0;
int i = 0;
loop(number)
{
previous = i;
i++;
}
return previous;
}
// returns number1 - number2
int subtract(int number1, int number2)
{
loop(number2)
{
number1= decrement(number1);
}
return number1;
}
//returns numerator/denominator
divide(int numerator, int denominator)
{
loop(subtract(numerator+1, denominator))
{
return (1 + divide(subtract(numerator, denominator), denominator));
}
return 0;
}
Here are C# methods that you can build and run. I had to make an artificial way for me to
satisfy the looping rules.
public int decrement(int num)
{
int previous = 0;
int LOOP = 0;
while (LOOP < num)
{
previous = LOOP;
LOOP++;
}
return previous;
}
public int subtract(int number1, int number2)
{
int LOOP = 0;
while (LOOP < number2)
{
number1 = decrement(number1);
LOOP++;
}
return number1;
}
public int divide(int numerator, int denominator)
{
int LOOP = 0;
while (LOOP < subtract(numerator+1, denominator))
{
return (1 + divide(subtract(numerator, denominator), denominator));
}
return 0;
}
Ok so the reason I think your answer might be invalid is because of how you use return. In fact I think just using the return is too much of an assumption. in a few places you use the return value of a function call as an extra variable. Now if the return statement is ok, then your answer is valid. The reason i think its not valid is because the problem hints at the fact that you need to think of these as macros, not function calls. Everything should be done by reference, you should change the variables and the return values are how you left those variables. Here is my solution:
int x, y, v, z;
//This function leaves x decremented by one and v equal to x
def decrement(x,v):
v=0;
loop(x):
x=v;
v++;
//this function leaves x decremented by y (x-y) and v equal to x
def subtract(x,y,v):
loop(y):
decrement(x,v);
//this function leaves x and z as the result of x divided by y, leaves y as is, and v as 0
//input of v and z dont matter, x should be greater than or equal to y or be 0, y should be greater than 0.
def divide(x,y,v,z):
z=0;
loop(x):
//these loops give us z+1 if x is >0 or leave z alone otherwise
loop(x):
z++;
decrement(x,v);
loop(x):
decrement(z,v)
//restore x
x++;
//reduce x by y until, when x is zero z will no longer increment
subtract(x,y,v);
//leave x as z so x is the result
x=z;
Related
I am trying to take an integer (X) and use recursion to find the sum of digits that apply to a particular condition up to X. For example, given 10 and using conditions divisible by 2 or 3, the sum would be 5.
I have already used a different loops to solve the problem and now trying to practice with recursion.
int sum(int n) {
int totalSum;
if (n==0)
return 0;
else {
if ((n%2==0)||(n%3==0)){
totalSum+= sum(n-1);
return totalSum;
}
else {
totalSum=sum(n-1);
return totalSum;
}
}
}
I keep either receiving a zero or an incredibly high number.
totalSum+= sum(n-1);
can't possibly be correct. You never initialized totalSum, so how could it be correct to add something to it? Even if C automatically initialized variables, it would presumably initialize it to 0, so this would be equivalent to totalSum = sum(n-1);, which is the same thing you do when n is not a multiple of 2 or 3.
Notice that neither of your conditions adds the current iteration variable to anything. You should be adding n, not totalSum:
totalSum = n + sum(n-1);
I just started learning C and I faced the following problem:
in the first /recursive step/ I do not understand why we cannot simply return multiply(x, y)? Why do we need to add a value to y and only then return it?
The code is below.
Thank you!
#include <stdio.h>
unsigned int multiply(unsigned int x, unsigned int y)
{
if (x == 1)
{
/* Terminating case */
return y;
}
else if (x > 1)
{
/* Recursive step */
return y + multiply(x-1, y);
}
/* Catch scenario when x is zero */
return 0;
}
int main() {
printf("3 times 5 is %d", multiply(3, 5));
return 0;
}
If you return multiply(x, y), you will loop forever on the same call parameters. To have proper recursion, you have to reduce the problem to a simpler case. That simpler case is to reduce the multiplier by 1.
Recursion is simply doing the same operations with smaller inputs. Can we express multiplication of two numbers with smaller numbers ? Sure ! Finding the way to do this is finding a recursive definition of multiplication. First we will try to express x*y with a smaller x, like x-1.
From the simple fact that :
(x-1)*y = x*y - y
we find :
x*y=(x-1)*y + y
Remember that we will always have to find a stoping step, here we know that
0*y=0
and we're done. This gives us even a simpler form of a recursive mul function as the one given by #RobertEagle.
But let us take the thing further. x-1 is smaller than x, as y-1 is smaller than y. Exploring this fact gives us :
(x-1)*(y-1)=x*y-x-y+1
x*y = (x-1)*(y-1) + x + y - 1
This time the stopping step would be "if x or y is 0 then the result is 0". Translating this into code :
unsigned multiply(unsigned x, unsigned y)
{
if ( (x==0) || (y==0) )
return 0;
else
return x+y-1+multiply(x-1, y-1);
}
This sort of recursion is not quite efficient because we do not express the recursion with something really smaller. What if we try to halve one of the parameters ?
If x is even we can write :
x*y=2*(x/2)*y=(x/2)*(2*y)
If x is odd we can write :
x*y=(x-1+1)*y=(x-1)*y+y=((x-1)/2)*(2*y)+y
Multiplying (resp. dividing) by 2 can be achieved with left (resp. right) shifting :
unsigned multiply(unsigned x, unsigned y)
{
if (x==0)
return 0;
else if (x%2==0)
return multiply(x>>1, y<<1);
else
return y+multiply((x-1)>>1, y<<1);
}
This method will take less steps than the first one. The "more" you're getting smaller, the "more" you're getting faster, in general.
If you didn't add the value, it would simply return y as the final result of the top most recursive call.
X * Y = X, Y times
e.g. 4 * 5 = 4 + 4 + 4 + 4 + 4
This logic is expanding that evaluation, and just returning the last value would only evaluate to Y one time.
Every recursive function is made up of 2 elements:
The repetitive condition
The stopping condition
Without a stopping condition, the program would enter an infinite loop.
In this case, it would mean the program would continually call beyond x == 1.
Instead of stopping calling itself at x == 1, the program would also call multiply(-1,y), multiply(-2,y) ... to infinite.
Ultimately, you need to sum the y for x times. And to do this you need to return x times the value of y. And in this case, the repetitive condition adds y for x-1 times and for the stopping condition you only need one more time.
You also could have done it this way:
#include <stdio.h>
unsigned int multiply(unsigned int x, unsigned int y)
{
if (x == 0)
{
/* Terminating case */
return 0;
}
else if (x > 0)
{
/* Recursive step */
return y + multiply(x-1, y);
}
}
int main() {
printf("3 times 5 is %d", multiply(3, 5));
return 0;
}
Here you can see the repetitive is concerted with adding the y for exactly x times. Logically, the stopping condition would require to return 0, because we don't need to add anymore the y value. We already have added it for x times.
I recently started reading Hacking: The Art of Exploitation by Jon Erickson. In the book he has the following function, which I will refer to as (A):
int factorial (int x) // .......... 1
{ // .......... 2
int i; // .......... 3
for (i = 1; i < x; i++) // .......... 4
x *= i; // .......... 5
return x; // .......... 6
} // .......... 7
This particular function is on pg. 17. Up until this function, I have understood everything he has described. To be fair, he has explained all of the elements within (A) in detail, with the exception of the return concept. However, I just don't see how (A) is suppose to describe the process of
x! = x * (x-1) * (x-2) * (x-3)
etc which I will refer to as (B). If someone could take the time to break this down in detail I would really appreciate it. Since I am asking for your help, I will go through the elements I believe I understand in order to potentially expose elements I believe I understand but actually do not but also to help you help me make the leap from how (A) is suppose to be a representation of (B).
Ok, so here is what I believe I understand.
In line 1, in (int x), x is being assigned the type integer. What I am less sure about is whether in factorial (int x), int x is being assigned the type factorial, or if even factorial is a type.
Line 3 is simple; i is being assigned the type integer.
Line 4 I am less confident on but I think I have a decent grasp of it. I'm assuming line 4 is a while-control structure with a counter. In the first segment, the counter is referred to as i and it's initial value is established as 1. I believe the second segment of line 4, i < x, dictates that while counter i is less than x, keep looping. The third segment, i++, communicates that for every valid loop/iteration of this "while a, then b" situation, you add 1 to i.
In line 5 I believe that x *= i is suppose to be shorthand for i * x but if I didn't know that this function is suppose to explain the process of calculating a factorial, I wouldn't be able to organically explain how lines 4 and 5 are suppose to interact.
'I humbly ask for your help. For any one who helps me get over this hump, I thank you in advance. '
I think the program given in the book is wrong. Theoretically, the for-loop will never terminate, as x is growing in every iteration, and much faster than i.
In practice, x will overflow after some time, thus terminating the loop.
Forget about why this calculates the factorial for a moment. First let's figure out what it does:
int factorial (int x) // .......... 1
{ // .......... 2
int i; // .......... 3
for (i = 1; i < x; i++) // .......... 4
x *= i; // .......... 5
return x; // .......... 6
} // .......... 7
Line 1:
Ignoring the part in the parenthesis for now, the line says int factorial (...) - that means this is a function called factorial and it has a type of int. The bit inside the parenthesis says int x - that means the function takes a parameter that will be called x and is of type int. A function can take multiple parameters separated by commas but this function takes only one.
Line 3:
We are creating a variable which we will call i of type int. The variable will only exist inside these curly braces so when the function is finished i will not exist any more.
Line 4:
This is indeed a looping control that uses the variable i created on line 3 to keep the count. At the start of the loop, the i=1 makes sure the count starts at 1. The i<x means it will keep looping as long as i is less than x. The i++ means each time the loop finishes the stuff in the curly braces the variable i will be incremented. The important part of this line is that the loops stops when i gets to x - which, as we will see, never happens.
Line 5:
The x*=i means the value of x will be updated by multiplying it by i. This will happen each time the loop iterates. So, for example, if x was equal to 5 the loop will make i equal to the values 1, 2, 3 and 4 (the numbers from 1 up but less than x) and the value of x will be updated by multiplying it by 1, 2, 3 and 4, making it larger and larger. But now that x is larger, the loop doesn't end here as expected - in fact, it contines looping making x larger and larger until the value of x no longer fits into an int. At that point, the program has undefined behaviour. Because compilers assume no one would want undefined behaviour, the program can do absolutely anything at that point including crashing or looping infinitely or even rewriting the whole program so it doesn't do any calculations at all.
Line 6:
We need to get that value of x back to the outside world and that is what the return command does - if the program gets to here the return statement gives the factorial function the value of x (which is the value of the factorial of 5 in the example we just used).
Then somewhere else in your program you might do this:
int f;
f = factorial(5);
Which will make the parameter called x have an initial value of 5 and will make f have the final value of the function.
So what does it return? Well, there is undefined behaviour so anything could happen - but because x gets larger and larger it definitely will NOT return the factorial. Anything could happen, but in my tests factorial(5) returns a huge negative number.
Try it online!
So how do we fix it? Well, as #JonathanLeffler said, we can't change the value of x so we need a new variable to hold the result. We will call that variable r:
int factorial (int x)
{
int r = 1;
for (int i = 1; i < x; i++)
r *= i;
return r;
}
So this program changes the value of r and doesn't change the value of x. So the loop works properly now. But it still doesn't calculate the factorial of the value passed in - if x is 5 it multiplies all the values up to but not including x.
Try it online!
So how do we fix it? The factorial has to include all the values including x, so this actually calculates the factorial:
int factorial (int x)
{
int r = 1;
for (int i = 1; i <= x; i++)
r *= i;
return r;
}
And this works as long as the value of x you pass in is small enough that the factorial can fit into a signed int.
Try it online!
You could get more range by using an unsigned long long but there is still a maximum value that can be calculated.
The program loops through integers 1 thru N-1. (Assume the input value is 'N')
Before loop starts, x=N.
After one iteration, x=N*1.
After 2 iterations, x=N*1*2.
After N-1 iterations, x=N*1*2*....(N-1).
Which is N factorial.
So N! is returned.
The fifth line is : x *= i;
You should understand here : x = x * i;
The loop will execute while i < x. Which means until reaches x - 1;
Knowing that i begins at 1 you will get this sum : x * 1 * 2 * 3 * ... * ( x - 1)
You can rearrange this so you get : 1 * 2 * 3 * ... * (x - 1) * x
I don't know where I am doing wrong in trying to calculate prime factorizations using Pollard's rho algorithm.
#include<stdio.h>
#define f(x) x*x-1
int pollard( int );
int gcd( int, int);
int main( void ) {
int n;
scanf( "%d",&n );
pollard( n );
return 0;
}
int pollard( int n ) {
int i=1,x,y,k=2,d;
x = rand()%n;
y = x;
while(1) {
i++;
x = f( x ) % n;
d = gcd( y-x, n);
if(d!=1 && d!=n)
printf( "%d\n", d);
if(i == k) {
y = x;
k = 2 * k;
}
}
}
int gcd( int a, int b ) {
if( b == 0)
return a;
else
return gcd( b, a % b);
}
One immediate problem is, as Peter de Rivaz suspected the
#define f(x) x*x-1
Thus the line
x = f(x)%n;
becomes
x = x*x-1%n;
and the precedence of % is higher than that of -, hence the expression is implicitly parenthesised as
x = (x*x) - (1%n);
which is equivalent to x = x*x - 1; (I assume n > 1, anyway it's x = x*x - constant;) and if you start with a value x >= 2, you have overflow before you had a realistic chance of finding a factor:
2 -> 2*2-1 = 3 -> 3*3 - 1 = 8 -> 8*8 - 1 = 63 -> 3968 -> 15745023 -> overflow if int is 32 bits
That doesn't immediately make it impossible that gcd(y-x,n) is a factor, though. It just makes it likely that at a stage where theoretically, you would have found a factor, the overflow destroys the common factor that mathematically would exist - more likely than a common factor introduced by overflow.
Overflow of signed integers is undefined behaviour, so there are no guarantees how the programme behaves, but usually it behaves consistently so the iteration of f still produces a well-defined sequence for which the algorithm in principle works.
Another problem is that y-x will frequently be negative, and then the computed gcd can also be negative - often -1. In that case, you print -1.
And then, it is a not too rare occurrence that iterating f from a starting value doesn't detect a common factor because the cycles modulo both prime factors (for the example of n a product of two distinct primes) have equal length and are entered at the same time. You make no attempt at detecting such a case; whenever gcd(|y-x|, n) == n, any further work in that sequence is pointless, so you should break out of the loop when d == n.
Also, you never check whether n is a prime, in which case trying to find a factor is a futile undertaking from the start.
Furthermore, after fixing f(x) so that the % n applies to the complete result of f(x), you have the problem that x*x still overflows for relatively small x (with the standard signed 32-bit ints, for x >= 46341), so factoring larger n may fail due to overflow. At least, you should use unsigned long long for the computations, so that overflow is avoided for n < 2^32. However, factorising such small numbers is typically done more efficiently with trial division. Pollard's Rho method and other advanced factoring algorithms are meant for larger numbers, where trial division is no longer efficient or even feasible.
I'm just a novice at C++, and I am new to Stack Overflow, so some of what I have written is going to look sloppy, but this should get you going in the right direction. The program posted here should generally find and return one non-trivial factor of the number you enter at the prompt, or it will apologize if it cannot find such a factor.
I tested it with a few semiprime numbers, and it worked for me. For 371156167103, it finds 607619 without any detectable delay after I hit the enter key. I didn't check it with larger numbers than this. I used unsigned long long variables, but if possible, you should get and use a library that provides even larger integer types.
Editing to add, the single call to the method f for X and 2 such calls for Y is intentional and is in accordance with the way the algorithm works. I thought to nest the call for Y inside another such call to keep it on one line, but I decided to do it this way so it's easier to follow.
#include "stdafx.h"
#include <stdio.h>
#include <iostream>
typedef unsigned long long ULL;
ULL pollard(ULL numberToFactor);
ULL gcd(ULL differenceBetweenCongruentFunctions, ULL numberToFactor);
ULL f(ULL x, ULL numberToFactor);
int main(void)
{
ULL factor;
ULL n;
std::cout<<"Enter the number for which you want a prime factor: ";
std::cin>>n;
factor = pollard(n);
if (factor == 0) std::cout<<"No factor found. Your number may be prime, but it is not certain.\n\n";
else std::cout<<"One factor is: "<<factor<<"\n\n";
}
ULL pollard(ULL n)
{
ULL x = 2ULL;
ULL y = 2ULL;
ULL d = 1ULL;
while(d==1||d==n)
{
x = f(x,n);
y = f(y,n);
y = f(y,n);
if (y>x)
{
d = gcd(y-x, n);
}
else
{
d = gcd(x-y, n);
}
}
return d;
}
ULL gcd(ULL a, ULL b)
{
if (a==b||a==0)
return 0; // If x==y or if the absolute value of (x-y) == the number to be factored, then we have failed to find
// a factor. I think this is not proof of primality, so the process could be repeated with a new function.
// For example, by replacing x*x+1 with x*x+2, and so on. If many such functions fail, primality is likely.
ULL currentGCD = 1;
while (currentGCD!=0) // This while loop is based on Euclid's algorithm
{
currentGCD = b % a;
b=a;
a=currentGCD;
}
return b;
}
ULL f(ULL x, ULL n)
{
return (x * x + 1) % n;
}
Sorry for the long delay getting back to this. As I mentioned in my first answer, I am a novice at C++, which will be evident in my excessive use of global variables, excessive use of BigIntegers and BigUnsigned where other types might be better, lack of error checking, and other programming habits on display which a more skilled person might not exhibit. That being said, let me explain what I did, then will post the code.
I am doing this in a second answer because the first answer is useful as a very simple demo of how a Pollard's Rho algorithm is to implement once you understand what it does. And what it does is to first take 2 variables, call them x and y, assign them the starting values of 2. Then it runs x through a function, usually (x^2+1)%n, where n is the number you want to factor. And it runs y through the same function twice each cycle. Then the difference between x and y is calculated, and finally the greatest common divisor is found for this difference and n. If that number is 1, then you run x and y through the function again.
Continue this process until the GCD is not 1 or until x and y are equal again. If the GCD is found which is not 1, then that GCD is a non-trivial factor of n. If x and y become equal, then the (x^2+1)%n function has failed. In that case, you should try again with another function, maybe (x^2+2)%n, and so on.
Here is an example. Take 35, for which we know the prime factors are 5 and 7. I'll walk through Pollard Rho and show you how it finds a non-trivial factor.
Cycle #1: X starts at 2. Then using the function (x^2+1)%n, (2^2+1)%35, we get 5 for x. Y starts at 2 also, and after one run through the function, it also has a value of 5. But y always goes through the function twice, so the second run is (5^2+1)%35, or 26. The difference between x and y is 21. The GCD of 21 (the difference) and 35 (n) is 7. We have already found a prime factor of 35! Note that the GCD for any 2 numbers, even extremely large exponents, can be found very quickly by formula using Euclid's algorithm, and that's what the program I will post here does.
On the subject of the GCD function, I am using one library I downloaded for this program, a library that allows me to use BigIntegers and BigUnsigned. That library also has a GCD function built in, and I could have used it. But I decided to stay with the hand-written GCD function for instructional purposes. If you want to improve the program's execution time, it might be a good idea to use the library's GCD function because there are faster methods than Euclid, and the library may be written to use one of those faster methods.
Another side note. The .Net 4.5 library supports the use of BigIntegers and BigUnsigned also. I decided not to use that for this program because I wanted to write the whole thing in C++, not C++/CLI. You could get better performance from the .Net library, or you might not. I don't know, but I wanted to share that that is also an option.
I am jumping around a bit here, so let me start now by explaining in broad strokes what the program does, and lastly I will explain how to set it up on your computer if you use Visual Studio 11 (also called Visual Studio 2012).
The program allocates 3 arrays for storing the factors of any number you give it to process. These arrays are 1000 elements wide, which is excessive, maybe, but it ensures any number with 1000 prime factors or less will fit.
When you enter the number at the prompt, it assumes the number is composite and puts it in the first element of the compositeFactors array. Then it goes through some admittedly inefficient while loops, which use Miller-Rabin to check if the number is composite. Note this test can either say a number is composite with 100% confidence, or it can say the number is prime with extremely high (but not 100%) confidence. The confidence is adjustable by a variable confidenceFactor in the program. The program will make one check for every value between 2 and confidenceFactor, inclusive, so one less total check than the value of confidenceFactor itself.
The setting I have for confidenceFactor is 101, which does 100 checks. If it says a number is prime, the odds that it is really composite are 1 in 4^100, or the same as the odds of correctly calling the flip of a fair coin 200 consecutive times. In short, if it says the number is prime, it probably is, but the confidenceFactor number can be increased to get greater confidence at the cost of speed.
Here might be as good a place as any to mention that, while Pollard's Rho algorithm can be pretty effective factoring smaller numbers of type long long, the Miller-Rabin test to see if a number is composite would be more or less useless without the BigInteger and BigUnsigned types. A BigInteger library is pretty much a requirement to be able to reliably factor large numbers all the way to their prime factors like this.
When Miller Rabin says the factor is composite, it is factored, the factor stored in a temp array, and the original factor in the composites array divided by the same factor. When numbers are identified as likely prime, they are moved into the prime factors array and output to screen. This process continues until there are no composite factors left. The factors tend to be found in ascending order, but this is coincidental. The program makes no effort to list them in ascending order, but only lists them as they are found.
Note that I could not find any function (x^2+c)%n which will factor the number 4, no matter what value I gave c. Pollard Rho seems to have a very hard time with all perfect squares, but 4 is the only composite number I found which is totally impervious to it using functions in the format described. Therefore I added a check for an n of 4 inside the pollard method, returning 2 instantly if so.
So to set this program up, here is what you should do. Go to https://mattmccutchen.net/bigint/ and download bigint-2010.04.30.zip. Unzip this and put all of the .hh files and all of the C++ source files in your ~\Program Files\Microsoft Visual Studio 11.0\VC\include directory, excluding the Sample and C++ Testsuite source files. Then in Visual Studio, create an empty project. In the solution explorer, right click on the resource files folder and select Add...existing item. Add all of the C++ source files in the directory I just mentioned. Then also in solution expolorer, right click the Source Files folder and add a new item, select C++ file, name it, and paste the below source code into it, and it should work for you.
Not to flatter overly much, but there are folks here on Stack Overflow who know a great deal more about C++ than I do, and if they modify my code below to make it better, that's fantastic. But even if not, the code is functional as-is, and it should help illustrate the principles involved in programmatically finding prime factors of medium sized numbers. It will not threaten the general number field sieve, but it can factor numbers with 12 - 14 digit prime factors in a reasonably short time, even on an old Core2 Duo computer like the one I am using.
The code follows. Good luck.
#include <string>
#include <stdio.h>
#include <iostream>
#include "BigIntegerLibrary.hh"
typedef BigInteger BI;
typedef BigUnsigned BU;
using std::string;
using std::cin;
using std::cout;
BU pollard(BU numberToFactor);
BU gcda(BU differenceBetweenCongruentFunctions, BU numberToFactor);
BU f(BU x, BU numberToFactor, int increment);
void initializeArrays();
BU getNumberToFactor ();
void factorComposites();
bool testForComposite (BU num);
BU primeFactors[1000];
BU compositeFactors[1000];
BU tempFactors [1000];
int primeIndex;
int compositeIndex;
int tempIndex;
int numberOfCompositeFactors;
bool allJTestsShowComposite;
int main ()
{
while(1)
{
primeIndex=0;
compositeIndex=0;
tempIndex=0;
initializeArrays();
compositeFactors[0] = getNumberToFactor();
cout<<"\n\n";
if (compositeFactors[0] == 0) return 0;
numberOfCompositeFactors = 1;
factorComposites();
}
}
void initializeArrays()
{
for (int i = 0; i<1000;i++)
{
primeFactors[i] = 0;
compositeFactors[i]=0;
tempFactors[i]=0;
}
}
BU getNumberToFactor ()
{
std::string s;
std::cout<<"Enter the number for which you want a prime factor, or 0 to quit: ";
std::cin>>s;
return stringToBigUnsigned(s);
}
void factorComposites()
{
while (numberOfCompositeFactors!=0)
{
compositeIndex = 0;
tempIndex = 0;
// This while loop finds non-zero values in compositeFactors.
// If they are composite, it factors them and puts one factor in tempFactors,
// then divides the element in compositeFactors by the same amount.
// If the element is prime, it moves it into tempFactors (zeros the element in compositeFactors)
while (compositeIndex < 1000)
{
if(compositeFactors[compositeIndex] == 0)
{
compositeIndex++;
continue;
}
if(testForComposite(compositeFactors[compositeIndex]) == false)
{
tempFactors[tempIndex] = compositeFactors[compositeIndex];
compositeFactors[compositeIndex] = 0;
tempIndex++;
compositeIndex++;
}
else
{
tempFactors[tempIndex] = pollard (compositeFactors[compositeIndex]);
compositeFactors[compositeIndex] /= tempFactors[tempIndex];
tempIndex++;
compositeIndex++;
}
}
compositeIndex = 0;
// This while loop moves all remaining non-zero values from compositeFactors into tempFactors
// When it is done, compositeFactors should be all 0 value elements
while (compositeIndex < 1000)
{
if (compositeFactors[compositeIndex] != 0)
{
tempFactors[tempIndex] = compositeFactors[compositeIndex];
compositeFactors[compositeIndex] = 0;
tempIndex++;
compositeIndex++;
}
else compositeIndex++;
}
compositeIndex = 0;
tempIndex = 0;
// This while loop checks all non-zero elements in tempIndex.
// Those that are prime are shown on screen and moved to primeFactors
// Those that are composite are moved to compositeFactors
// When this is done, all elements in tempFactors should be 0
while (tempIndex<1000)
{
if(tempFactors[tempIndex] == 0)
{
tempIndex++;
continue;
}
if(testForComposite(tempFactors[tempIndex]) == false)
{
primeFactors[primeIndex] = tempFactors[tempIndex];
cout<<primeFactors[primeIndex]<<"\n";
tempFactors[tempIndex]=0;
primeIndex++;
tempIndex++;
}
else
{
compositeFactors[compositeIndex] = tempFactors[tempIndex];
tempFactors[tempIndex]=0;
compositeIndex++;
tempIndex++;
}
}
compositeIndex=0;
numberOfCompositeFactors=0;
// This while loop just checks to be sure there are still one or more composite factors.
// As long as there are, the outer while loop will repeat
while(compositeIndex<1000)
{
if(compositeFactors[compositeIndex]!=0) numberOfCompositeFactors++;
compositeIndex ++;
}
}
return;
}
// The following method uses the Miller-Rabin primality test to prove with 100% confidence a given number is composite,
// or to establish with a high level of confidence -- but not 100% -- that it is prime
bool testForComposite (BU num)
{
BU confidenceFactor = 101;
if (confidenceFactor >= num) confidenceFactor = num-1;
BU a,d,s, nMinusOne;
nMinusOne=num-1;
d=nMinusOne;
s=0;
while(modexp(d,1,2)==0)
{
d /= 2;
s++;
}
allJTestsShowComposite = true; // assume composite here until we can prove otherwise
for (BI i = 2 ; i<=confidenceFactor;i++)
{
if (modexp(i,d,num) == 1)
continue; // if this modulus is 1, then we cannot prove that num is composite with this value of i, so continue
if (modexp(i,d,num) == nMinusOne)
{
allJTestsShowComposite = false;
continue;
}
BU exponent(1);
for (BU j(0); j.toInt()<=s.toInt()-1;j++)
{
exponent *= 2;
if (modexp(i,exponent*d,num) == nMinusOne)
{
// if the modulus is not right for even a single j, then break and increment i.
allJTestsShowComposite = false;
continue;
}
}
if (allJTestsShowComposite == true) return true; // proven composite with 100% certainty, no need to continue testing
}
return false;
/* not proven composite in any test, so assume prime with a possibility of error =
(1/4)^(number of different values of i tested). This will be equal to the value of the
confidenceFactor variable, and the "witnesses" to the primality of the number being tested will be all integers from
2 through the value of confidenceFactor.
Note that this makes this primality test cryptographically less secure than it could be. It is theoretically possible,
if difficult, for a malicious party to pass a known composite number for which all of the lowest n integers fail to
detect that it is composite. A safer way is to generate random integers in the outer "for" loop and use those in place of
the variable i. Better still if those random numbers are checked to ensure no duplicates are generated.
*/
}
BU pollard(BU n)
{
if (n == 4) return 2;
BU x = 2;
BU y = 2;
BU d = 1;
int increment = 1;
while(d==1||d==n||d==0)
{
x = f(x,n, increment);
y = f(y,n, increment);
y = f(y,n, increment);
if (y>x)
{
d = gcda(y-x, n);
}
else
{
d = gcda(x-y, n);
}
if (d==0)
{
x = 2;
y = 2;
d = 1;
increment++; // This changes the pseudorandom function we use to increment x and y
}
}
return d;
}
BU gcda(BU a, BU b)
{
if (a==b||a==0)
return 0; // If x==y or if the absolute value of (x-y) == the number to be factored, then we have failed to find
// a factor. I think this is not proof of primality, so the process could be repeated with a new function.
// For example, by replacing x*x+1 with x*x+2, and so on. If many such functions fail, primality is likely.
BU currentGCD = 1;
while (currentGCD!=0) // This while loop is based on Euclid's algorithm
{
currentGCD = b % a;
b=a;
a=currentGCD;
}
return b;
}
BU f(BU x, BU n, int increment)
{
return (x * x + increment) % n;
}
As far as I can see, Pollard Rho normally uses f(x) as (x*x+1) (e.g. in these lecture notes ).
Your choice of x*x-1 appears not as good as it often seems to get stuck in a loop:
x=0
f(x)=-1
f(f(x))=0
I have the following exercise program from a book. The book states that for values x=10 and y=100, functions; min, max, incr and square are called 1, 91, 90 and 90 respectively. However, to me it looks like they are being called the following number of times, 1, 1, 1 and 0. Can someone explain to me the book numbers. Thanks.
#include <stdio.h>
int min(int x, int y){
return x < y ? x : y;
}
int max(int x, int y){
return x > y ? y : x;
}
void incr(int *xp, int v) {
*xp += v;
}
int square (int x){
return x*x;
}
int main(void){
int i;
int x = 10;
int y = 100;
int t = 0;
for (i = min(x, y); i < max(x, y); incr(&i, 1)){
t += square(i);
printf("test %i", t);
}
}
It really depends on the compiler and optimisation settings.
Compiling this program (after fixing the max() function to return the maximum) with -O3 shows that none of the functions are actually called. The compiler can see that the loop goes from 10 to 100, and that the variable is incremented by 1.
Never assume that a function is called. You tell the compiler what you want the program to do, but the compiler can choose to do it any way it wants to.
By the way, without fixing max(), the compiler could see that this is an empty loop, and produced a main() function that simply returned without setting any variable or doing anything (again, with -O3).
Well the loop is really (as x and y don't change):
for (i = 10; i < 100; incr(&i, 1))
The the first statement executes only once - that's why min is executed 1 time.
The stop condition is executed once at the beginning and then after each iteration - so 91 times. The third statemnt is executed at the end of each iteration - so 90 times.
So the book is correct.
Feels like homework so I don't want to provide a full solution, but here's a hint... The key lies in understanding that max is evaluated (called) at the end of each loop iteration because the result of max can change after each loop iteration.
The loop will continue as long as i is less than max(x, y) for a given loop iteration.
In a for loop, there are 3 parts, initialization, continue condition, increment.
for(initialization; continue condition; increment) {
body;
}
The loop does this:
Do initialization part
Check continue condition (exit if true)
Execute code in body of for loop
Execute increment condition
Go back to 2
So if we walk through it, min is called once (initialization), and until the condition is met, max is called each time (continue condition). This will happen from i = 10 to i = y, which is 91 times (once at the beginning, and once at each iteration).
The increment part is called exactly once for each iteration, but not called initially, so it would get called 90 times (100 - 10).
The square function will happen the same number of times that increment is called (because it is called before increment, but once per iteration).
The second two expressions in a for loop are evaluated every time through the loop. The second expression is run every time to check if the loop should continue, and the last expression is intended to change the state of the loop. For example:
for (int x = 0; x < 100; ++x) { /* ... */ }
The x < 100 expression has to be evaluated each time through the loop to see if you changed it. The ++x needs to be evaluated every time because it's what's incrementing x.
The for (expression 1; expression 2; expression 3) loop is really just a shortcut for the common pattern:
{
expression 1;
if (expression 2)
{
do //Do-while + if to demonstrate how `break` and `continue` affect things
{
//loop body
expression 3;
}
while(expression 2)
}
}