Matchmaking program in C? - c

The problem I am given is the following:
Write a program to discover the answer to this puzzle:"Let's say men and women are paid equally (from the same uniform distribution). If women date randomly and marry the first man with a higher salary, what fraction of the population will get married?"
From this site
My issue is that it seems that the percent married figure I am getting is wrong. Another poster asked this same question on the programmers exchange before, and the percentage getting married should be ~68%. However, I am getting closer to 75% (with a lot of variance). If anyone can take a look and let me know where I went wrong, I would be very grateful.
I realize, looking at the other question that was on the programmers exchange, that this is not the most efficient way to solve the problem. However, I would like to solve the problem in this manner before using more efficient approaches.
My code is below, the bulk of the problem is "solved" in the test function:
#include <cs50.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define ARRAY_SIZE 100
#define MARRIED 1
#define SINGLE 0
#define MAX_SALARY 1000000
bool arrayContains(int* array, int val);
int test();
int main()
{
printf("Trial count: ");
int trials = GetInt();
int sum = 0;
for(int i = 0; i < trials; i++)
{
sum += test();
}
int average = (sum/trials) * 100;
printf("Approximately %d %% of the population will get married\n", average / ARRAY_SIZE);
}
int test()
{
srand(time(NULL));
int femArray[ARRAY_SIZE][2];
int maleArray[ARRAY_SIZE][2];
// load up random numbers
for (int i = 0; i < ARRAY_SIZE; i++)
{
femArray[i][0] = (rand() % MAX_SALARY);
femArray[i][1] = SINGLE;
maleArray[i][0] = (rand() % MAX_SALARY);
maleArray[i][1] = SINGLE;
}
srand(time(NULL));
int singleFemales = 0;
for (int k = 0; k < ARRAY_SIZE; k++)
{
int searches = 0; // count the unsuccessful matches
int checkedMates[ARRAY_SIZE] = {[0 ... ARRAY_SIZE - 1] = ARRAY_SIZE + 1};
while(true)
{
// ARRAY_SIZE - k is number of available people, subtract searches for people left
// checked all possible mates
if(((ARRAY_SIZE - k) - searches) == 0)
{
singleFemales++;
break;
}
int randMale = rand() % ARRAY_SIZE; // find a random male
while(arrayContains(checkedMates, randMale)) // ensure that the male was not checked earlier
{
randMale = rand() % ARRAY_SIZE;
}
checkedMates[searches] = randMale;
// male has a greater income and is single
if((femArray[k][0] < maleArray[randMale][0]) && (maleArray[randMale][1] == SINGLE))
{
femArray[k][1] = MARRIED;
maleArray[randMale][1] = MARRIED;
break;
}
else
{
searches++;
continue;
}
}
}
return ARRAY_SIZE - singleFemales;
}
bool arrayContains(int* array, int val)
{
for(int i = 0; i < ARRAY_SIZE; i++)
{
if (array[i] == val)
return true;
}
return false;
}

In the first place, there is some ambiguity in the problem as to what it means for the women to "date randomly". There are at least two plausible interpretations:
You cycle through the unmarried women, with each one randomly drawing one of the unmarried men and deciding, based on salary, whether to marry. On each pass through the available women, this probably results in some available men being dated by multiple women, and others being dated by none.
You divide each trial into rounds. In each round, you randomly shuffle the unmarried men among the unmarried women, so that each unmarried man dates exactly one unmarried woman.
In either case, you must repeat the matching until there are no more matches possible, which occurs when the maximum salary among eligible men is less than or equal to the minimum salary among eligible women.
In my tests, the two interpretations produced slightly different statistics: about 69.5% married using interpretation 1, and about 67.6% using interpretation 2. 100 trials of 100 potential couples each was enough to produce fairly low variance between runs. In the common (non-statistical) sense of the term, for example, the results from one set of 10 runs varied between 67.13% and 68.27%.
You appear not to take either of those interpretations, however. If I'm reading your code correctly, you go through the women exactly once, and for each one you keep drawing random men until either you find one that that woman can marry or you have tested every one. It should be clear that this yields a greater chance for women early in the list to be married, and that order-based bias will at minimum increase the variance of your results. I find it plausible that it also exerts a net bias toward more marriages, but I don't have a good argument in support.
Additionally, as I wrote in comments, you introduce some bias through the way you select random integers. The rand() function returns an int between 0 and RAND_MAX, inclusive, for RAND_MAX + 1 possible values. For the sake of argument, let's suppose those values are uniformly distributed over that range. If you use the % operator to shrink the range of the result to N possible values, then that result is still uniformly distributed only if N evenly divides RAND_MAX + 1, because otherwise more rand() results map to some values than map to others. In fact, this applies to any strictly mathematical transformation you might think of to narrow the range of the rand() results.
For the salaries, I don't see why you even bother to map them to a restricted range. RAND_MAX is as good a maximum salary as any other; the statistics gleaned from the simulation don't depend on the range of salaries; but only on their uniform distribution.
For selecting random indices into your arrays, however, either for drawing men or for shuffling, you do need a restricted range, so you do need to take care. The best way to reduce bias in this case is to force the random numbers drawn to come from a range that is evenly divisible by the number of options by re-drawing as many times as necessary to ensure it:
/*
* Returns a random `int` in the half-open interval [0, upper_bound).
* upper_bound must be positive, and should not exceed RAND_MAX + 1.
*/
int random_draw(int upper_bound) {
/* integer division truncates the remainder: */
int rand_bound = (RAND_MAX / upper_bound) * upper_bound;
for (;;) {
int r = rand();
if (r < rand_bound) {
return r % upper_bound;
}
}
}

Related

How to optimize/ make this c code even faster?

I need to optimize this c code in order for it to run as fast as possible. I am quite new to code optimization in general. What should I begin with?
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char*argv[]) {
int n, i, flag;
int sumOfPrimeNumbers; //sum of prime numbers
sumOfPrimeNumbers = 0;
do {
flag = 0;
scanf("%d", &n);
for(i=2;i < n;i++)
{
if(n%i==0) {
flag=1; // flag all non-prime numbers
break;
}
}
if(flag==0) {
sumOfPrimeNumbers = sumOfPrimeNumbers + n; // sum prime numbers
}
} while (n != 0);
printf("%d\n", sumOfPrimeNumbers);
return 0;
}
For small values of n (maybe values less than 66536?) you can use a table of precomputed answers, like "printf("%d\n", table[n]);".
For larger values you can split n into "zone" and "offset in zone", like "zone = n / zone_size; offset = n % zone_size;" and then use "zone" as an index into a precomputed table to determine an initial starting point (and skip a huge amount of work, like "sumOfPrimeNumbers = zoneStartTable[n / zone_size;"). The "offset in zone" part can be used with Sieve of Eratosthenes; which means that it's nicer for "zone_size" to be the product of the smallest primes (e.g. maybe like "zone_size = 2 * 3 * 5 * 7 * 11 * 13 * 17;") because that makes it a little easier to create a Sieve of Eratosthenes from a non-zero starting point.
For this approach to work you will actually need 2 sieves - one to find primes from 1 to "sqrt(n)" so that you can mark multiples of those primes as "not prime" in the second sieve (which will contain values from "zone * zone_size" to n). This process can be accelerated by recognizing that the sieve for the smallest primes (that you used to determine "zone_size") create a pattern that repeats every "zone_size" numbers, and that pattern can be predetermined and then copied into both of the sieves to initialize the sieves, allowing you to skip marking the smallest primes in both sieves.
Improve the algorithm. Avoid premature optimizations
Rather than test up to n, search to the square root of n
// for(i=2;i < n;i++)
for (i=2; i <= n/i; i++)
Sieve of Eratosthenes
Form a list of found primes {2,3,5} and only test against those. As a new prime is found, append it to the list.
Many other optimizations possible.

Why is my C code only generating every third random number?

I am trying to simulate the propagation of a worm across a network made of 100,000 computers. The simulation itself is very simple and I don't need any help except that for some reason, I am only getting every third random number.
Only the computers whose index modulo 1000 is less than 10 can be infected so when 1000 computers are infected, the program should be done. For some reason, my program only gets 329. When I lower the goal number and check the contents of the array, only every third computer has been changed and it is a consistent pattern. For example at the end of the array, only computers 98001, 98004, 98007, 99002, 99005, 99008 are changed even though the computers in between (98002, 98003, etc.) should be changed as well. The pattern holds all the way to the beginning of the array. When I try to get all 1000 changed, the program goes into an infinite loop and is stuck at 329.
Edit: I just discovered that if I lower the NETSIZE to 10,000 and the goal in the while loop to 100, it doesn't skip anything. Does that mean the problem has something to do with a rounding error? Someone who knows more about C than me must know the answer.
Thanks.
#include <stdio.h>
#include <stdlib.h>
#define NETSIZE 100000
double rand01();
void initNetwork();
unsigned char network[NETSIZE];
int scanrate = 3;
int infectedCount;
int scans;
int ind;
int time;
int main(void) {
initNetwork();
time = 0;
infectedCount = 1;
while (infectedCount < 1000) { //changing 1000 to 329 stops the infinite loop
scans = infectedCount * scanrate;
for (int j = 0; j < scans; j++) {
ind = (int) (rand01() * NETSIZE);
if (network[ind] == 0) {
network[ind] = 1;
infectedCount++;
}
}
time++;
}
for (int k = 0; k < NETSIZE; k++) {
if (network[k] == 1) printf("%d at %d\n", network[k], k);
}
}
double rand01() {
double temp;
temp = (rand() + 0.1) / (RAND_MAX + 1.0);
return temp;
}
void initNetwork() {
for (int i = 0; i < NETSIZE; i++) {
if (i % 1000 < 10) {
network[i] = 0;
} else {
network[i] = 2;
}
}
network[1000] = 1;
}
In the above code, I expect the code to run until the 1000 vulnerable indexes are changed from 0 to 1.
Converting comments into an answer.
What is RAND_MAX on your system? If it is a 15-bit or 16-bit value, you probably aren't getting good enough quantization when converted to double. If it is a 31-bit or bigger number, that (probably) won't be the issue. You need to investigate what values are generated by just the rand01() function with different seeds, plus the multiplication and cast to integer — simply print the results and sort -n | uniq -c to see how uniform the results are.
On my system RAND_MAX is only 32767. Do you think that might be why my results might not be granular enough? Now that you've made me think about it, there would only be 32,767 possible values and my network array is 100,000 possible values. Which corresponds about about the 1/3 results I am getting.
Yes, I think that is very probably the problem. You want 100,000 different values, but your random number generator can only generate about 33,000 different values, which is awfully close to your 1:3 metric. It also explains immediately why you got good results when you reduced the multiplier from 100,000 to 10,000.
You could try:
double rand01(void)
{
assert(RAND_MAX == 32767);
return ((rand() << 15) + rand()) / ((RAND_MAX + 1.0) * (RAND_MAX + 1.0));
}
Or you could use an alternative random number generator — for example, POSIX defines both the drand48() family of functions and
random(), with corresponding seed-setting functions where needed.
Yeah, the problem I am having is that the RAND_MAX value on my system is only 32767 and I am trying to effectively spread that out over 100,000 values which results in about only every third number ever showing up.
In my defense, the person who suggested the rand01() function has a PhD in Computer Science, but I think he ran this code on our school's main computer which probably has a much bigger RAND_MAX value.
#JonathanLeffler deserves credit for this solution.

Not getting proper output from Pollard's rho algorithm implementation

I don't know where I am doing wrong in trying to calculate prime factorizations using Pollard's rho algorithm.
#include<stdio.h>
#define f(x) x*x-1
int pollard( int );
int gcd( int, int);
int main( void ) {
int n;
scanf( "%d",&n );
pollard( n );
return 0;
}
int pollard( int n ) {
int i=1,x,y,k=2,d;
x = rand()%n;
y = x;
while(1) {
i++;
x = f( x ) % n;
d = gcd( y-x, n);
if(d!=1 && d!=n)
printf( "%d\n", d);
if(i == k) {
y = x;
k = 2 * k;
}
}
}
int gcd( int a, int b ) {
if( b == 0)
return a;
else
return gcd( b, a % b);
}
One immediate problem is, as Peter de Rivaz suspected the
#define f(x) x*x-1
Thus the line
x = f(x)%n;
becomes
x = x*x-1%n;
and the precedence of % is higher than that of -, hence the expression is implicitly parenthesised as
x = (x*x) - (1%n);
which is equivalent to x = x*x - 1; (I assume n > 1, anyway it's x = x*x - constant;) and if you start with a value x >= 2, you have overflow before you had a realistic chance of finding a factor:
2 -> 2*2-1 = 3 -> 3*3 - 1 = 8 -> 8*8 - 1 = 63 -> 3968 -> 15745023 -> overflow if int is 32 bits
That doesn't immediately make it impossible that gcd(y-x,n) is a factor, though. It just makes it likely that at a stage where theoretically, you would have found a factor, the overflow destroys the common factor that mathematically would exist - more likely than a common factor introduced by overflow.
Overflow of signed integers is undefined behaviour, so there are no guarantees how the programme behaves, but usually it behaves consistently so the iteration of f still produces a well-defined sequence for which the algorithm in principle works.
Another problem is that y-x will frequently be negative, and then the computed gcd can also be negative - often -1. In that case, you print -1.
And then, it is a not too rare occurrence that iterating f from a starting value doesn't detect a common factor because the cycles modulo both prime factors (for the example of n a product of two distinct primes) have equal length and are entered at the same time. You make no attempt at detecting such a case; whenever gcd(|y-x|, n) == n, any further work in that sequence is pointless, so you should break out of the loop when d == n.
Also, you never check whether n is a prime, in which case trying to find a factor is a futile undertaking from the start.
Furthermore, after fixing f(x) so that the % n applies to the complete result of f(x), you have the problem that x*x still overflows for relatively small x (with the standard signed 32-bit ints, for x >= 46341), so factoring larger n may fail due to overflow. At least, you should use unsigned long long for the computations, so that overflow is avoided for n < 2^32. However, factorising such small numbers is typically done more efficiently with trial division. Pollard's Rho method and other advanced factoring algorithms are meant for larger numbers, where trial division is no longer efficient or even feasible.
I'm just a novice at C++, and I am new to Stack Overflow, so some of what I have written is going to look sloppy, but this should get you going in the right direction. The program posted here should generally find and return one non-trivial factor of the number you enter at the prompt, or it will apologize if it cannot find such a factor.
I tested it with a few semiprime numbers, and it worked for me. For 371156167103, it finds 607619 without any detectable delay after I hit the enter key. I didn't check it with larger numbers than this. I used unsigned long long variables, but if possible, you should get and use a library that provides even larger integer types.
Editing to add, the single call to the method f for X and 2 such calls for Y is intentional and is in accordance with the way the algorithm works. I thought to nest the call for Y inside another such call to keep it on one line, but I decided to do it this way so it's easier to follow.
#include "stdafx.h"
#include <stdio.h>
#include <iostream>
typedef unsigned long long ULL;
ULL pollard(ULL numberToFactor);
ULL gcd(ULL differenceBetweenCongruentFunctions, ULL numberToFactor);
ULL f(ULL x, ULL numberToFactor);
int main(void)
{
ULL factor;
ULL n;
std::cout<<"Enter the number for which you want a prime factor: ";
std::cin>>n;
factor = pollard(n);
if (factor == 0) std::cout<<"No factor found. Your number may be prime, but it is not certain.\n\n";
else std::cout<<"One factor is: "<<factor<<"\n\n";
}
ULL pollard(ULL n)
{
ULL x = 2ULL;
ULL y = 2ULL;
ULL d = 1ULL;
while(d==1||d==n)
{
x = f(x,n);
y = f(y,n);
y = f(y,n);
if (y>x)
{
d = gcd(y-x, n);
}
else
{
d = gcd(x-y, n);
}
}
return d;
}
ULL gcd(ULL a, ULL b)
{
if (a==b||a==0)
return 0; // If x==y or if the absolute value of (x-y) == the number to be factored, then we have failed to find
// a factor. I think this is not proof of primality, so the process could be repeated with a new function.
// For example, by replacing x*x+1 with x*x+2, and so on. If many such functions fail, primality is likely.
ULL currentGCD = 1;
while (currentGCD!=0) // This while loop is based on Euclid's algorithm
{
currentGCD = b % a;
b=a;
a=currentGCD;
}
return b;
}
ULL f(ULL x, ULL n)
{
return (x * x + 1) % n;
}
Sorry for the long delay getting back to this. As I mentioned in my first answer, I am a novice at C++, which will be evident in my excessive use of global variables, excessive use of BigIntegers and BigUnsigned where other types might be better, lack of error checking, and other programming habits on display which a more skilled person might not exhibit. That being said, let me explain what I did, then will post the code.
I am doing this in a second answer because the first answer is useful as a very simple demo of how a Pollard's Rho algorithm is to implement once you understand what it does. And what it does is to first take 2 variables, call them x and y, assign them the starting values of 2. Then it runs x through a function, usually (x^2+1)%n, where n is the number you want to factor. And it runs y through the same function twice each cycle. Then the difference between x and y is calculated, and finally the greatest common divisor is found for this difference and n. If that number is 1, then you run x and y through the function again.
Continue this process until the GCD is not 1 or until x and y are equal again. If the GCD is found which is not 1, then that GCD is a non-trivial factor of n. If x and y become equal, then the (x^2+1)%n function has failed. In that case, you should try again with another function, maybe (x^2+2)%n, and so on.
Here is an example. Take 35, for which we know the prime factors are 5 and 7. I'll walk through Pollard Rho and show you how it finds a non-trivial factor.
Cycle #1: X starts at 2. Then using the function (x^2+1)%n, (2^2+1)%35, we get 5 for x. Y starts at 2 also, and after one run through the function, it also has a value of 5. But y always goes through the function twice, so the second run is (5^2+1)%35, or 26. The difference between x and y is 21. The GCD of 21 (the difference) and 35 (n) is 7. We have already found a prime factor of 35! Note that the GCD for any 2 numbers, even extremely large exponents, can be found very quickly by formula using Euclid's algorithm, and that's what the program I will post here does.
On the subject of the GCD function, I am using one library I downloaded for this program, a library that allows me to use BigIntegers and BigUnsigned. That library also has a GCD function built in, and I could have used it. But I decided to stay with the hand-written GCD function for instructional purposes. If you want to improve the program's execution time, it might be a good idea to use the library's GCD function because there are faster methods than Euclid, and the library may be written to use one of those faster methods.
Another side note. The .Net 4.5 library supports the use of BigIntegers and BigUnsigned also. I decided not to use that for this program because I wanted to write the whole thing in C++, not C++/CLI. You could get better performance from the .Net library, or you might not. I don't know, but I wanted to share that that is also an option.
I am jumping around a bit here, so let me start now by explaining in broad strokes what the program does, and lastly I will explain how to set it up on your computer if you use Visual Studio 11 (also called Visual Studio 2012).
The program allocates 3 arrays for storing the factors of any number you give it to process. These arrays are 1000 elements wide, which is excessive, maybe, but it ensures any number with 1000 prime factors or less will fit.
When you enter the number at the prompt, it assumes the number is composite and puts it in the first element of the compositeFactors array. Then it goes through some admittedly inefficient while loops, which use Miller-Rabin to check if the number is composite. Note this test can either say a number is composite with 100% confidence, or it can say the number is prime with extremely high (but not 100%) confidence. The confidence is adjustable by a variable confidenceFactor in the program. The program will make one check for every value between 2 and confidenceFactor, inclusive, so one less total check than the value of confidenceFactor itself.
The setting I have for confidenceFactor is 101, which does 100 checks. If it says a number is prime, the odds that it is really composite are 1 in 4^100, or the same as the odds of correctly calling the flip of a fair coin 200 consecutive times. In short, if it says the number is prime, it probably is, but the confidenceFactor number can be increased to get greater confidence at the cost of speed.
Here might be as good a place as any to mention that, while Pollard's Rho algorithm can be pretty effective factoring smaller numbers of type long long, the Miller-Rabin test to see if a number is composite would be more or less useless without the BigInteger and BigUnsigned types. A BigInteger library is pretty much a requirement to be able to reliably factor large numbers all the way to their prime factors like this.
When Miller Rabin says the factor is composite, it is factored, the factor stored in a temp array, and the original factor in the composites array divided by the same factor. When numbers are identified as likely prime, they are moved into the prime factors array and output to screen. This process continues until there are no composite factors left. The factors tend to be found in ascending order, but this is coincidental. The program makes no effort to list them in ascending order, but only lists them as they are found.
Note that I could not find any function (x^2+c)%n which will factor the number 4, no matter what value I gave c. Pollard Rho seems to have a very hard time with all perfect squares, but 4 is the only composite number I found which is totally impervious to it using functions in the format described. Therefore I added a check for an n of 4 inside the pollard method, returning 2 instantly if so.
So to set this program up, here is what you should do. Go to https://mattmccutchen.net/bigint/ and download bigint-2010.04.30.zip. Unzip this and put all of the .hh files and all of the C++ source files in your ~\Program Files\Microsoft Visual Studio 11.0\VC\include directory, excluding the Sample and C++ Testsuite source files. Then in Visual Studio, create an empty project. In the solution explorer, right click on the resource files folder and select Add...existing item. Add all of the C++ source files in the directory I just mentioned. Then also in solution expolorer, right click the Source Files folder and add a new item, select C++ file, name it, and paste the below source code into it, and it should work for you.
Not to flatter overly much, but there are folks here on Stack Overflow who know a great deal more about C++ than I do, and if they modify my code below to make it better, that's fantastic. But even if not, the code is functional as-is, and it should help illustrate the principles involved in programmatically finding prime factors of medium sized numbers. It will not threaten the general number field sieve, but it can factor numbers with 12 - 14 digit prime factors in a reasonably short time, even on an old Core2 Duo computer like the one I am using.
The code follows. Good luck.
#include <string>
#include <stdio.h>
#include <iostream>
#include "BigIntegerLibrary.hh"
typedef BigInteger BI;
typedef BigUnsigned BU;
using std::string;
using std::cin;
using std::cout;
BU pollard(BU numberToFactor);
BU gcda(BU differenceBetweenCongruentFunctions, BU numberToFactor);
BU f(BU x, BU numberToFactor, int increment);
void initializeArrays();
BU getNumberToFactor ();
void factorComposites();
bool testForComposite (BU num);
BU primeFactors[1000];
BU compositeFactors[1000];
BU tempFactors [1000];
int primeIndex;
int compositeIndex;
int tempIndex;
int numberOfCompositeFactors;
bool allJTestsShowComposite;
int main ()
{
while(1)
{
primeIndex=0;
compositeIndex=0;
tempIndex=0;
initializeArrays();
compositeFactors[0] = getNumberToFactor();
cout<<"\n\n";
if (compositeFactors[0] == 0) return 0;
numberOfCompositeFactors = 1;
factorComposites();
}
}
void initializeArrays()
{
for (int i = 0; i<1000;i++)
{
primeFactors[i] = 0;
compositeFactors[i]=0;
tempFactors[i]=0;
}
}
BU getNumberToFactor ()
{
std::string s;
std::cout<<"Enter the number for which you want a prime factor, or 0 to quit: ";
std::cin>>s;
return stringToBigUnsigned(s);
}
void factorComposites()
{
while (numberOfCompositeFactors!=0)
{
compositeIndex = 0;
tempIndex = 0;
// This while loop finds non-zero values in compositeFactors.
// If they are composite, it factors them and puts one factor in tempFactors,
// then divides the element in compositeFactors by the same amount.
// If the element is prime, it moves it into tempFactors (zeros the element in compositeFactors)
while (compositeIndex < 1000)
{
if(compositeFactors[compositeIndex] == 0)
{
compositeIndex++;
continue;
}
if(testForComposite(compositeFactors[compositeIndex]) == false)
{
tempFactors[tempIndex] = compositeFactors[compositeIndex];
compositeFactors[compositeIndex] = 0;
tempIndex++;
compositeIndex++;
}
else
{
tempFactors[tempIndex] = pollard (compositeFactors[compositeIndex]);
compositeFactors[compositeIndex] /= tempFactors[tempIndex];
tempIndex++;
compositeIndex++;
}
}
compositeIndex = 0;
// This while loop moves all remaining non-zero values from compositeFactors into tempFactors
// When it is done, compositeFactors should be all 0 value elements
while (compositeIndex < 1000)
{
if (compositeFactors[compositeIndex] != 0)
{
tempFactors[tempIndex] = compositeFactors[compositeIndex];
compositeFactors[compositeIndex] = 0;
tempIndex++;
compositeIndex++;
}
else compositeIndex++;
}
compositeIndex = 0;
tempIndex = 0;
// This while loop checks all non-zero elements in tempIndex.
// Those that are prime are shown on screen and moved to primeFactors
// Those that are composite are moved to compositeFactors
// When this is done, all elements in tempFactors should be 0
while (tempIndex<1000)
{
if(tempFactors[tempIndex] == 0)
{
tempIndex++;
continue;
}
if(testForComposite(tempFactors[tempIndex]) == false)
{
primeFactors[primeIndex] = tempFactors[tempIndex];
cout<<primeFactors[primeIndex]<<"\n";
tempFactors[tempIndex]=0;
primeIndex++;
tempIndex++;
}
else
{
compositeFactors[compositeIndex] = tempFactors[tempIndex];
tempFactors[tempIndex]=0;
compositeIndex++;
tempIndex++;
}
}
compositeIndex=0;
numberOfCompositeFactors=0;
// This while loop just checks to be sure there are still one or more composite factors.
// As long as there are, the outer while loop will repeat
while(compositeIndex<1000)
{
if(compositeFactors[compositeIndex]!=0) numberOfCompositeFactors++;
compositeIndex ++;
}
}
return;
}
// The following method uses the Miller-Rabin primality test to prove with 100% confidence a given number is composite,
// or to establish with a high level of confidence -- but not 100% -- that it is prime
bool testForComposite (BU num)
{
BU confidenceFactor = 101;
if (confidenceFactor >= num) confidenceFactor = num-1;
BU a,d,s, nMinusOne;
nMinusOne=num-1;
d=nMinusOne;
s=0;
while(modexp(d,1,2)==0)
{
d /= 2;
s++;
}
allJTestsShowComposite = true; // assume composite here until we can prove otherwise
for (BI i = 2 ; i<=confidenceFactor;i++)
{
if (modexp(i,d,num) == 1)
continue; // if this modulus is 1, then we cannot prove that num is composite with this value of i, so continue
if (modexp(i,d,num) == nMinusOne)
{
allJTestsShowComposite = false;
continue;
}
BU exponent(1);
for (BU j(0); j.toInt()<=s.toInt()-1;j++)
{
exponent *= 2;
if (modexp(i,exponent*d,num) == nMinusOne)
{
// if the modulus is not right for even a single j, then break and increment i.
allJTestsShowComposite = false;
continue;
}
}
if (allJTestsShowComposite == true) return true; // proven composite with 100% certainty, no need to continue testing
}
return false;
/* not proven composite in any test, so assume prime with a possibility of error =
(1/4)^(number of different values of i tested). This will be equal to the value of the
confidenceFactor variable, and the "witnesses" to the primality of the number being tested will be all integers from
2 through the value of confidenceFactor.
Note that this makes this primality test cryptographically less secure than it could be. It is theoretically possible,
if difficult, for a malicious party to pass a known composite number for which all of the lowest n integers fail to
detect that it is composite. A safer way is to generate random integers in the outer "for" loop and use those in place of
the variable i. Better still if those random numbers are checked to ensure no duplicates are generated.
*/
}
BU pollard(BU n)
{
if (n == 4) return 2;
BU x = 2;
BU y = 2;
BU d = 1;
int increment = 1;
while(d==1||d==n||d==0)
{
x = f(x,n, increment);
y = f(y,n, increment);
y = f(y,n, increment);
if (y>x)
{
d = gcda(y-x, n);
}
else
{
d = gcda(x-y, n);
}
if (d==0)
{
x = 2;
y = 2;
d = 1;
increment++; // This changes the pseudorandom function we use to increment x and y
}
}
return d;
}
BU gcda(BU a, BU b)
{
if (a==b||a==0)
return 0; // If x==y or if the absolute value of (x-y) == the number to be factored, then we have failed to find
// a factor. I think this is not proof of primality, so the process could be repeated with a new function.
// For example, by replacing x*x+1 with x*x+2, and so on. If many such functions fail, primality is likely.
BU currentGCD = 1;
while (currentGCD!=0) // This while loop is based on Euclid's algorithm
{
currentGCD = b % a;
b=a;
a=currentGCD;
}
return b;
}
BU f(BU x, BU n, int increment)
{
return (x * x + increment) % n;
}
As far as I can see, Pollard Rho normally uses f(x) as (x*x+1) (e.g. in these lecture notes ).
Your choice of x*x-1 appears not as good as it often seems to get stuck in a loop:
x=0
f(x)=-1
f(f(x))=0

Minimize the sum of errors of representative integers

Given n integers between [0,10000] as D1,D2...,Dn, where there may be duplicates, and n can be huge:
I want to find k distinct representative integers (for example k=5) between [0,10000] as R1,R2,...,Rk, so the sum of errors of all the representative integers is minimized.
The error of a representative integer is defined below:
Assuming we have k representative integers in ascending order as {R1,R2...,Rk}, the error of Ri
is :
and i want to minimize the sum of errors of the k representative integers:
how can this be done efficiently?
EDIT1: The smallest of the k representative integers must be the smallest number in
{D1,D2...,Dn}
EDIT2: The largest of the k representative integers must be the largest number in {D1,D2...,Dn} plus 1. For example when the largest number in {D1,D2...,Dn} is 9787 then Rk is 9788.
EDIT3: A concrete example is here:
D={1,3,3,7,8,14,14,14,30} and if k=5 and R is picked as {1,6,10,17,31} then the sum of errors is :
sum of errors=(1-1)+(3-1)*2+(7-6)+(8-6)+(14-10)*3+(30-17)=32
this is because 1<=1,3,3<6 , 6<=7,8<10, 10<=14,14,14<17, 17<=30<31
Although with help from the community you have managed to state your
problem in a form which is understandable mathematically, you still do
not supply enough information to enable me (or anyone else) to give
you a definitive answer (I would have posted this as a comment, but
for some reason I don't see the "add comment" option is available to
me). In order to give a good answer to this problem, we need to know
the relative sizes of n and k versus each other and 10000 (or the
expected maximum of the Di if it isn't 10000), and whether
it is critical that you attain the exact minimum (even if this
requires an exorbitant amount of time for the calculation) or if a
close approximation would be OK, also (and if so, how close do you
need to get). In addition, in order to know what algorithm runs in
minimum time, we need to have an understanding about what kind of
hardware is going to run the algorithm (i.e., do we have m CPU
cores to run on in parallel and what is the size of m relative to k).
Another important piece of information is whether this problem will be
solved only once, or it must be solved many times but there exists
some connection between the distributions of the integers Di
from one problem to the next (e.g., the integers
Di are all random samples from a particular, unchanging
probability distribution, or perhaps each successive problem has as
its input an ever-increasing set which is the set from the previous
problem plus an extra s integers).
No reasonable algorithm for your problem should run in time which
depends in a way greater than linear in n, since building a histogram
of the n integers Di requires O(n) time, and the answer to
the optimization problem itself is only dependent on the histogram of
the integers and not on their ordering. No algorithm can run in time
less than O(n), since that is the size of the input of the problem.
A brute force search over all of the possibilities requires, (assuming
that at least one of the Di is 0 and another is 10000), for
small k, say k < 10, approximately O(10000k-2) time, so if
log10(n) >> 4(k-2), this is the optimal algorithm (since in
this case the time for the brute force search is insignificant compared to the
time to read the input). It is also interesting to note that if k is
very close to 10000, then a brute force search only requires
O(1000010002-k) (because we can search instead over the
integers which are not used as representative integers).
If you update the definition of the problem with more information, I
will attempt to edit my answer in turn.
Now the question is clarified, we observe the Ri divide the Dx into k-1 intervals, [R1,R2), [R2,R3), ... [Rk-1,Rk). Every Dx belongs to exactly one of those intervals. Let qi be the number of Dx in the interval [Ri,Ri+1), and let si be the sum of those Dx. Then each error(Ri) expression is the sum of qi terms and evaluates to si - qiRi.
Summing that over all i, we get a total error of S - sum(qiRi), where S is the sum of all the Dx. So the problem is to choose the Ri to maximize sum(qiRi). Remember each qi is the number of original data at least as large as Ri, but smaller than the next one.
Any global maximum must be a local maximum; so we imagine increasing or decreasing one of the Ri. If Ri is not one of the original data values, then we can increase it without changing any of the qi and improve our target function. So an optimal solution has each Ri (except the limiting last one) as one of the data values. I got a bit bogged down in math after that, but it seems a sensible approach is to pick the initial Ri as every (n/k)th data value (simple percentiles), then iteratively seeing if moving the R_i to the previous or next value improves the score and thus decreases the error. (The qiRi seems easier to work with, since you can read the data and count repetitions and update qi, Ri by only looking at a single data/count point. You only need to store an array of 10,000 data counts, no matter how huge the data).
data: 1 3 7 8 14 30
count: 1 2 1 1 3 1 sum(data) = 94
initial R: 1 3 8 14 31
initial Q: 1 3 1 4 sum(QR) = 74 (hence error = 20)
In this example, we could try changing the 3 or the 8 to a 7, For example if we increase the 3 to 7, then we see there are 2 3's in the initial data, so the first two Q's become 1+2, 3-2 - it turns out this decreases sum(QR)). I'm sure there are smarter patterns to detect what changes in the QR table are viable, but this seems workable.
If the distribution is near random and the selection (n) is large enough, you are wasting time, generally, trying to optimize for what will amount to real costs in time calculating to gain decreasing improvements in % from expected averages. The fastest average solution is to set the lower k-1 at the low end of intervals M/(k-1), where M is the lowest upper bound - the greatest lower bound (ie, M = max number possible - 0) and the last k at M+1. It would take order k (the best we can do with the information presented in this problem) to figure those values out. Stating what I just did is not a proof of course.
My point is this. The above discussion is one simplification that I think is very practical for one large class of sets. At the other end, it's straightforward to compute every error possible for all permutations and then select the smallest one. The running time for this makes that solution intractable in many cases. That the person asking this question expects more than the most direct and exact (intractable) answer leaves much that is open-ended. We can trim at the edges from here to eternity trying to quantify all sorts of properties along the infinite solution space for all possible permutations (or combinations) of n numbers and all k values.
The integrity of this program was partly confirmed by a modified version of it used here to produce data that matched results obtained independently by #mhum.
It finds the exact minimal error value(s) and corresponding R values for a given data set and k value(s).
/************************************************************
This program can be compiled and run (eg, on Linux):
$ gcc -std=c99 minimize-sum-errors.c -o minimize-sum-errors
$ ./minimize-sum-errors
************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//data: Data set of values. Add extra large number at the end
int data[]={
10,40,90,160,250,360,490,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240,99999,1,2,3,4,5,6,7,8,9,10,24,24,14,12,41,51,21,41,41,848,21, 547,3,2,888,4,1,66,5,4,2,11,742,95,25,365,52,441,874,51,2,145,254,24,245,54,21,87,98,65,32,25,14,25,36,25,14,47,58,58,69,85,74,71,82,82,93,93,93,12,12,45,78,78,985,412,74,3,62,125,458,658,147,432,124,328,952,635,368,634,637,874,124,35,23,65,896,21,41,745,49,2,7,8,4,8,7,2,6,5,6,9,8,9,6,5,9,5,9,5,9,11,41,5,24,98,78,45,54,65,32,21,12,18,38,48,68,78,75,72,95,92,65,55,5,87,412,158,654,219,943,218,952,357,753,159,951,485,862,1,741,22,444,452,487,478,478,495,456,444,141,238,9,445,421,441,444,436,478,51,24,24,24,24,24,24,247,741,98,99,999,111,444,323,33,5,5,5,85,85,85,85,654,456,5,4,566,445,5664,45,4556,45,5,6,5,4,56,66,5,456,5,45,6,68,7653,434,4,6,7,689,78,8,99,8700,344,65,45,8,899,86,65,3,422,3,4,3,4,7,68,544,454,545,65,4,6,878,423,64,97,8778,5456,5486,5485,545,5455,4548,81,999,8233,5223,8741,7747,7756,54,7884,5477,89,332,5999,9861,12545,9852,11452,5482,9358,9845,577,884,5589,5412,3669,32,6699,396,9629,953,321,45,5484,588,5872,85,872,87,1122,884,2458,471,22685,955,2845,6852,589,5896,2521,35696,5236,32633,56665,6633,326,5486,5487,8541,5495,2155,3,8523,65896,3852,5685,69536,1,1,1,1,1,2,3,4,5,6,
};
//N: size of data set
int N=sizeof(data)/sizeof(int);
//SavedBundle: data type to "hold" memoized values needed (minimized error sums and corresponding "list" of R values for a given round)
typedef struct _SavedBundle {
long e;
int head_index_value;
int tail_offset;
} SavedBundle;
//sb: (pts to) lookup table of all calculated values memoized
SavedBundle *sb; //holds winning values being memoized
//Sort in increasing order.
int sortfunc (const void *a, const void *b) {
return (*(int *)a - *(int *)b);
}
/****************************
Most interesting code in here
****************************/
long full_memh(int l, int n) {
long e;
long e_min=-1;
int ti;
if (sb[l*N+n].e) {
return sb[l*N+n].e; //convenience passing
}
for (int i=l+1; i<N-1; i++) {
e=0;
//sum first part
for (int j=l+1; j<i; j++) {
e+=data[j]-data[l];
}
//sum second part
if (n!=1) //general case, recursively
e+=full_memh(i, n-1);
else //base case, iteratively
for (int j=i+1; j<N-1; j++) {
e+=data[j]-data[i];
}
if (e_min==-1) {
e_min=e;
ti=i;
}
if (e<e_min) {
e_min=e;
ti=i;
}
}
sb[l*N+n].e=e_min;
sb[l*N+n].head_index_value=ti;
sb[l*N+n].tail_offset=ti*N+(n-1);
return e_min;
}
/**************************************************
Call to calculate and print results for the given k
**************************************************/
int full_memoization(int k) {
char *str;
long errorsum; //for convenience
int idx;
//Call recursive workhorse
errorsum=full_memh(0, k-2);
//Now print
str=(char *) malloc(k*20+100);
sprintf (str,"\n%6d %6d {%d,",k,errorsum,data[0]);
idx=0*N+(k-2);
for (int i=0; i<k-2; i++) {
sprintf (str+strlen(str),"%d,",data[sb[idx].head_index_value]);
idx=sb[idx].tail_offset;
}
sprintf (str+strlen(str),"%d}",data[N-1]);
printf ("%s",str);
free(str);
return 0;
}
/******************************************
Initialize and seek result for all k values
******************************************/
int main (int x, char **y) {
int t;
int i2;
qsort(data,N,sizeof(int),sortfunc);
sb = (SavedBundle *)calloc(sizeof(SavedBundle),N*N);
printf("\n Total data size: %d",N);
printf("\n k errSUM R values",N);
for (int i=3; i<=N; i++) {
full_memoization(i);
}
free(sb);
return 0;
}
Some sample results obtained:
Total data size: 375
k errSUM R values
3 475179 {1,5223,99999}
4 320853 {1,5223,56665,99999}
5 260103 {1,5223,7653,56665,99999}
6 210143 {1,5223,7653,32633,56665,99999}
7 171503 {1,421,5223,7653,32633,56665,99999}
8 142865 {1,412,2458,5223,7653,32633,56665,99999}
9 124403 {1,412,2458,5223,7653,32633,56665,65896,99999}
10 106790 {1,412,2458,5223,7653,9610,32633,56665,65896,99999}
11 93715 {1,412,2458,5223,7653,9610,22685,32633,56665,65896,99999}
12 81507 {1,412,848,2458,5223,7653,9610,22685,32633,56665,65896,99999}
13 71495 {1,412,848,2155,3610,5223,7653,9610,22685,32633,56665,65896,99999}
14 64243 {1,412,848,2155,3610,5223,6633,7747,9610,22685,32633,56665,65896,99999}
15 58355 {1,412,848,2155,3610,5223,6633,7653,8523,9610,22685,32633,56665,65896,99999}
16 53363 {1,65,412,848,2155,3610,5223,6633,7653,8523,9610,22685,32633,56665,65896,99999}
17 48983 {1,65,412,848,2155,3610,4548,5412,6633,7653,8523,9610,22685,32633,56665,65896,99999}
18 45299 {1,65,412,848,2155,3610,4548,5412,6633,7653,8523,9610,11452,22685,32633,56665,65896,99999}
19 41659 {1,65,412,848,2155,3610,4548,5412,6633,7653,8523,9610,11452,22685,32633,56665,65896,69536,99999}
20 38295 {1,65,321,441,848,2155,3610,4548,5412,6633,7653,8523,9610,11452,22685,32633,56665,65896,69536,99999}
21 35232 {1,65,321,441,848,2155,3610,4548,5412,6633,7653,8523,9610,11452,22685,32633,35696,56665,65896,69536,99999}
22 32236 {1,65,321,441,848,2155,3610,4410,5223,5455,6633,7653,8523,9610,11452,22685,32633,35696,56665,65896,69536,99999}
23 29323 {1,65,321,432,634,872,2155,3610,4410,5223,5455,6633,7653,8523,9610,11452,22685,32633,35696,56665,65896,69536,99999}
24 26791 {1,65,321,432,634,862,1690,2458,3610,4410,5223,5455,6633,7653,8523,9610,11452,22685,32633,35696,56665,65896,69536,99999}
25 25123 {1,65,321,432,634,862,1690,2458,3610,4410,5223,5455,5872,6633,7653,8523,9610,11452,22685,32633,35696,56665,65896,69536,99999}
26 23658 {1,65,321,432,634,862,1690,2458,3610,4410,5223,5455,5872,6633,7653,8233,8700,9610,11452,22685,32633,35696,56665,65896,69536,99999}
27 22333 {1,41,78,321,432,634,862,1690,2458,3610,4410,5223,5455,5872,6633,7653,8233,8700,9610,11452,22685,32633,35696,56665,65896,69536,99999}
28 21073 {1,41,78,321,432,634,862,1440,2155,2845,3610,4410,5223,5455,5872,6633,7653,8233,8700,9610,11452,22685,32633,35696,56665,65896,69536,99999}
29 19973 {1,41,78,321,432,634,848,951,1960,2458,2845,3610,4410,5223,5455,5872,6633,7653,8233,8700,9610,11452,22685,32633,35696,56665,65896,69536,99999}
30 18879 {1,41,78,321,432,634,848,951,1960,2458,2845,3610,4410,5223,5455,5872,6633,7653,8233,8700,9358,9845,11452,22685,32633,35696,56665,65896,69536,99999}
31 17786 {1,41,78,321,432,634,848,951,1960,2458,2845,3610,4410,5223,5455,5872,6633,7653,8233,8700,9358,9845,11452,12545,22685,32633,35696,56665,65896,69536,99999}
32 16801 {1,41,78,321,432,634,848,943,1440,1960,2458,2845,3610,4410,5223,5455,5872,6633,7653,8233,8700,9358,9845,11452,12545,22685,32633,35696,56665,65896,69536,99999}
33 15821 {1,41,78,218,321,432,634,848,943,1440,1960,2458,2845,3610,4410,5223,5455,5872,6633,7653,8233,8700,9358,9845,11452,12545,22685,32633,35696,56665,65896,69536,99999}
34 14900 {1,41,78,218,321,421,544,634,848,943,1440,1960,2458,2845,3610,4410,5223,5455,5872,6633,7653,8233,8700,9358,9845,11452,12545,22685,32633,35696,56665,65896,69536,99999}
35 14185 {1,41,78,218,321,421,544,634,848,943,1440,1960,2458,2845,3610,4410,5223,5455,5872,6633,7290,7747,8410,8700,9358,9845,11452,12545,22685,32633,35696,56665,65896,69536,99999}
36 13503 {1,41,78,218,321,421,544,634,741,862,951,1440,1960,2458,2845,3610,4410,5223,5455,5872,6633,7290,7747,8410,8700,9358,9845,11452,12545,22685,32633,35696,56665,65896,69536,99999}
37 12859 {1,21,45,78,218,321,421,544,634,741,862,951,1440,1960,2458,2845,3610,4410,5223,5455,5872,6633,7290,7747,8410,8700,9358,9845,11452,12545,22685,32633,35696,56665,65896,69536,99999}
38 12232 {1,21,45,78,218,321,421,544,634,741,862,951,1440,1960,2458,2845,3610,4410,5223,5455,5664,5872,6633,7290,7747,8410,8700,9358,9845,11452,12545,22685,32633,35696,56665,65896,69536,99999}
39 11662 {1,21,45,78,218,321,412,444,544,634,741,862,951,1440,1960,2458,2845,3610,4410,5223,5455,5664,5872,6633,7290,7747,8410,8700,9358,9845,11452,12545,22685,32633,35696,56665,65896,69536,99999}
40 11127 {1,21,45,78,218,321,412,444,544,634,741,862,951,1440,1960,2458,2845,3610,4410,5223,5455,5664,5872,6633,7290,7747,8233,8523,8700,9358,9845,11452,12545,22685,32633,35696,56665,65896,69536,99999}
41 10623 {1,21,45,78,218,321,412,444,544,634,741,862,951,1440,1960,2458,2845,3610,4410,5223,5455,5664,5872,6633,7290,7747,8233,8523,8700,9358,9610,9845,11452,12545,22685,32633,35696,56665,65896,69536,99999}
42 10121 {1,21,41,65,85,218,321,412,444,544,634,741,862,951,1440,1960,2458,2845,3610,4410,5223,5455,5664,5872,6633,7290,7747,8233,8523,8700,9358,9610,9845,11452,12545,22685,32633,35696,56665,65896,69536,99999}
43 9637 {1,21,41,65,85,218,321,412,444,544,634,741,862,951,1440,1960,2458,2845,3610,3852,4410,5223,5455,5664,5872,6633,7290,7747,8233,8523,8700,9358,9610,9845,11452,12545,22685,32633,35696,56665,65896,69536,99999}
44 9207 {1,21,41,65,85,218,321,412,444,544,634,741,862,951,1440,1960,2458,2845,3610,3852,4410,4840,5223,5455,5664,5872,6633,7290,7747,8233,8523,8700,9358,9610,9845,11452,12545,22685,32633,35696,56665,65896,69536,99999}
45 8804 {1,21,41,65,85,218,321,412,444,544,634,741,862,943,1122,1690,2155,2458,2845,3610,3852,4410,4840,5223,5455,5664,5872,6633,7290,7747,8233,8523,8700,9358,9610,9845,11452,12545,22685,32633,35696,56665,65896,69536,99999}
46 8409 {1,21,41,65,85,218,321,412,444,544,634,741,862,943,1122,1690,2155,2458,2845,3240,3610,3852,4410,4840,5223,5455,5664,5872,6633,7290,7747,8233,8523,8700,9358,9610,9845,11452,12545,22685,32633,35696,56665,65896,69536,99999}
47 8014 {1,21,41,65,85,218,321,412,444,544,634,741,862,943,1122,1690,2155,2458,2845,3240,3610,3852,4410,4840,5223,5455,5664,5872,6633,7290,7747,8233,8523,8700,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
48 7636 {1,21,41,65,85,218,321,412,444,544,634,741,862,943,1122,1690,2155,2458,2845,3240,3610,3852,4410,4840,5223,5455,5664,5872,6250,6633,7290,7747,8233,8523,8700,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
49 7273 {1,21,41,65,85,218,321,412,444,544,634,741,862,943,1122,1690,2155,2458,2845,3240,3610,3852,4410,4840,5223,5455,5664,5872,6250,6633,7290,7653,7747,8233,8523,8700,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
50 6922 {1,21,41,65,85,124,218,321,412,444,544,634,741,862,943,1122,1690,2155,2458,2845,3240,3610,3852,4410,4840,5223,5455,5664,5872,6250,6633,7290,7653,7747,8233,8523,8700,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
51 6584 {1,21,41,65,85,124,218,321,412,444,544,634,741,862,943,1122,1440,1960,2155,2458,2845,3240,3610,3852,4410,4840,5223,5455,5664,5872,6250,6633,7290,7653,7747,8233,8523,8700,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
52 6283 {1,21,41,65,85,124,218,321,412,444,544,634,741,862,943,1122,1440,1960,2155,2458,2845,3240,3610,3852,4410,4840,5223,5412,5477,5664,5872,6250,6633,7290,7653,7747,8233,8523,8700,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
53 5983 {1,21,41,65,85,124,218,321,412,444,544,634,741,862,943,1122,1440,1960,2155,2458,2845,3240,3610,3852,4410,4840,5223,5412,5477,5664,5872,6250,6633,7290,7653,7747,8233,8523,8700,9000,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
54 5707 {1,21,41,65,85,124,218,321,412,444,544,634,741,862,943,1122,1440,1960,2155,2458,2845,3240,3610,3852,4410,4548,4840,5223,5412,5477,5664,5872,6250,6633,7290,7653,7747,8233,8523,8700,9000,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
55 5450 {1,21,41,65,85,124,218,321,412,441,478,544,634,741,862,943,1122,1440,1960,2155,2458,2845,3240,3610,3852,4410,4548,4840,5223,5412,5477,5664,5872,6250,6633,7290,7653,7747,8233,8523,8700,9000,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
56 5196 {1,21,41,65,85,124,218,321,412,441,478,544,634,741,862,943,1122,1440,1960,2155,2458,2845,3240,3610,3852,4410,4548,4840,5223,5412,5477,5664,5872,6250,6633,6760,7290,7653,7747,8233,8523,8700,9000,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
57 4946 {1,21,41,65,85,124,218,321,412,441,478,544,634,741,862,943,1122,1440,1690,1960,2155,2458,2845,3240,3610,3852,4410,4548,4840,5223,5412,5477,5664,5872,6250,6633,6760,7290,7653,7747,8233,8523,8700,9000,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
58 4722 {1,5,21,41,65,85,124,218,321,412,441,478,544,634,741,862,943,1122,1440,1690,1960,2155,2458,2845,3240,3610,3852,4410,4548,4840,5223,5412,5477,5664,5872,6250,6633,6760,7290,7653,7747,8233,8523,8700,9000,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
59 4536 {1,5,21,41,65,85,124,218,321,412,441,478,544,634,741,862,943,1122,1440,1690,1960,2155,2458,2845,3240,3610,3852,4410,4548,4840,5223,5412,5477,5664,5872,6250,6633,6760,7290,7653,7747,7840,8233,8523,8700,9000,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
60 4352 {1,5,21,41,65,85,124,218,321,412,441,478,544,634,741,848,872,951,1122,1440,1690,1960,2155,2458,2845,3240,3610,3852,4410,4548,4840,5223,5412,5477,5664,5872,6250,6633,6760,7290,7653,7747,7840,8233,8523,8700,9000,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
61 4172 {1,5,21,41,65,85,124,218,321,357,412,441,478,544,634,741,848,872,951,1122,1440,1690,1960,2155,2458,2845,3240,3610,3852,4410,4548,4840,5223,5412,5477,5664,5872,6250,6633,6760,7290,7653,7747,7840,8233,8523,8700,9000,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
62 3995 {1,5,21,41,65,85,124,218,321,357,412,441,478,544,634,741,848,872,951,1122,1440,1690,1960,2155,2458,2845,3240,3610,3852,4410,4548,4840,5223,5412,5477,5664,5872,6250,6633,6760,7290,7653,7747,7840,8233,8410,8523,8700,9000,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
63 3828 {1,5,21,41,65,85,124,218,321,357,412,441,478,544,634,741,848,872,943,985,1122,1440,1690,1960,2155,2458,2845,3240,3610,3852,4410,4548,4840,5223,5412,5477,5664,5872,6250,6633,6760,7290,7653,7747,7840,8233,8410,8523,8700,9000,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
64 3680 {1,5,21,41,65,85,124,218,321,357,412,441,478,544,634,741,848,872,943,985,1122,1440,1690,1960,2155,2458,2845,3240,3610,3852,4000,4410,4548,4840,5223,5412,5477,5664,5872,6250,6633,6760,7290,7653,7747,7840,8233,8410,8523,8700,9000,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
65 3545 {1,5,21,32,45,65,85,124,218,321,357,412,441,478,544,634,741,848,872,943,985,1122,1440,1690,1960,2155,2458,2845,3240,3610,3852,4000,4410,4548,4840,5223,5412,5477,5664,5872,6250,6633,6760,7290,7653,7747,7840,8233,8410,8523,8700,9000,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
66 3418 {1,5,21,32,45,65,85,124,218,321,357,412,441,478,544,634,741,848,872,943,985,1122,1440,1690,1960,2155,2458,2845,3240,3610,3852,4000,4410,4548,4840,5223,5412,5477,5664,5872,5999,6250,6633,6760,7290,7653,7747,7840,8233,8410,8523,8700,9000,9358,9610,9845,10240,11452,12545,22685,32633,35696,56665,65896,69536,99999}
The output started slowing down on the PC too much before it reached 100 rows (of the possible 375).
This algorithm does not produce exact answers, but it allows very large data sets to be handled much faster than the memoized algorithm and with results that tends to be very close to the exact answer. Algorithm still needs improvement/debugging to prevent potential infinite loops in some cases, but that is not a deal-breaker since code can be added to stop that. Meanwhile, the alg excels with data sets too large to be managed by some other generally preferable algorithms (like the memoized alg answer submitted earlier). For example, on nearly 10,000 samples from the range [1-100000], k=500 is calculated in seconds on an old PC where the memo version appeared would take over an hour to do a much smaller k=90 on a much much smaller data set of size 375. For this kind of added performance, not getting the absolute lowest error sum is a very small price to pay. [I have not derived the quality of the results, but all comparisons made on data values where memo could keep up yielded not much over 10% worse, if that.]
/************************************************************
This program can be compiled and run (eg, on Linux):
$ gcc -std=c99 fast-inexact.c -o fast-inexact
$ .fast-inexact
************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
//a: Data set of values. Add extra large number at the end
int a[]={
10,40,90,160,250,360,490,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,
4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240,99999,
1,2,3,4,5,6,7,8,9,10,
24,24,14,12,41,51,21,41,41,848,21, 547,3,2,888,4,1,66,5,4,2,11,742,
95,25,365,52,441,874,51,2,145,254,24,245,54,21,87,98,65,32,25,14,
25,36,25,14,47,58,58,69,85,74,71,82,82,93,93,93,12,12,45,78,78,985,
412,74,3,62,125,458,658,147,432,124,328,952,635,368,634,637,874,
124,35,23,65,896,21,41,745,49,2,7,8,4,8,7,2,6,5,6,9,8,9,6,5,9,5,9,
5,9,11,41,5,24,98,78,45,54,65,32,21,12,18,38,48,68,78,75,72,95,92,65,
55,5,87,412,158,654,219,943,218,952,357,753,159,951,485,862,1,741,22,
444,452,487,478,478,495,456,444,141,238,9,445,421,441,444,436,478,51,
24,24,24,24,24,24,247,741,98,99,999,111,444,323,33,5,5,5,85,85,85,85,
654,456,5,4,566,445,5664,45,4556,45,5,6,5,4,56,66,5,456,5,45,6,68,7653,
434,4,6,7,689,78,8,99,8700,344,65,45,8,899,86,65,3,422,3,4,3,4,7,68,
544,454,545,65,4,6,878,423,64,97,8778,5456,5486,5485,545,5455,4548,81,
999,8233,5223,8741,7747,7756,54,7884,5477,89,332,5999,9861,12545,9852,
11452,5482,9358,9845,577,884,5589,5412,3669,32,6699,396,9629,953,321,
45,5484,588,5872,85,872,87,1122,884,2458,471,22685,955,2845,6852,589,
5896,2521,35696,5236,32633,56665,6633,326,5486,5487,8541,5495,2155,3,
8523,65896,3852,5685,69536,
1,1,1,1,1,2,3,4,5,6,
//375
54,5451,545,54,885,855,8621,5,23,7,54,89,3,8545,196,35338,6412,5338,35512,8545,55483,3548,34878,37846,1545,2489,24534,84234,56465,8643,454,8,548,78,454,85,44,54564,87,85,45,48,54,564,67564,8945,864,54564,864,5453,554,7894,65456,45,5489,8424,84248,543,5454,82,54548,44,54654,8454,54,684,54,34,8,454,87,84,4,548,45456,48454,86465,4,454,45,4445,4564,484,4564,64654,56456,54,45,121,2851,15,248,24853,845,8485,384,3484,3484,3853,183,4835,83545,82,1851,6851,854,83,48434,87,34,854,943,849,468,4654,97,35,494,6549,878,65,2184,4845,4564,64,8,44,84,5,4454,4845,484,8513,897,47,8789,764,54,454,54894,454,842,181,54,81348,4518,548,51,813,1851,1841,5484,51,8431,8484,5487,79,4,31,31,84,87,74111,1,7272,7814,18,781,1,7,823,27872,8,8178,4156,485,184,84,45,18,75,18,715,48,78174,6541,8,54,8,41,8,4564,187,154,841,9,4194,53,4194,15,48,941,48,941,5,489,7415,41,49,41,54,54545,494,15,98,4189,5641,841,5145,41,416,48,414,4,841,5414,61,41,9891,61,169,19,1989,173,48154,56116,187,191,61,61418,8719,8187,51842,815,4815,4984,5,484,15,4897,18,4151,81,8941,549,1,5498,15,89,12,4,97,97,1,591,519,1,51,9,15,1655,65,2,3214,2365,8,77899,6565,6589,586,5,66,5,23669,5,9,59,9,8,569,3,3,6,96,99,955,5,96,9595,95,629,8971,81,5715,45,141,4819,84,518,81,87,2,41,5,98,41,54,9415,49,841,54,591,54,918,781,794,1221,2891,5,19878,154,9,4154,94,1518,41,49,415,49,15,4541,954,78,219,45,4515,49,9187,1549,15,985,14984,1597,91978,1541,41,5491,54197,815,914,91,78195,4179,1984,971,54,91,5198,71914,97,194,914,59419,49,4194,941,94191,41,9419,1941,914,9149,4191,1,19149,4949,454,141,1,9,489,415,4941,9841,24,8941,54,5915,198,419,24949,194,8545,4591,5498,714,54,984,5491,54,978,154,978,154,91495,41945,49,41954,8,154,94,149,4594,54,98,154,594,984,815,45,9148,4191,19,84,15,1,948,7897,184,5419,71,4194,8419,41984,954,54,1941,81798,789,459,45,4198,787,184,941,921,987,181,541,48,971,894,9145,594,19,78,48,4984,184,945,4,194,19849,454,978,4154,944,84154,9871,8489,4154,841,8945,198,710,45,4,51,541,984,982,1954,81,491,2465,498,5419,7481,5497,8515,498,4154,979,871,41,148,11971,184,94145,498,15,48,154,9418,41894,815,494,145,419,8,151,25,18940,5415,64348,74851,541,9481,24,9841,2,498,124,91,594,34614,64,8491,456,4164,81,3496,494,324,16498,15,4917,9841,546,4841,546,484,54541,654,81,246,518,4841,65,486,4165,4654,8415,4646,846,41654,864,165,45,4188,165,481,31,354,9415,491,549,484,87131,828,284,842,2,434,8434,64835,4313,143,48,35,498,7,154,8,7897,7154,654,987,564,3546,8789,715,4684,864,234,864,615,467,89,135,4198,7,654,64,189,7817,56,4,654,98,465,46,48,4,354,96,8413,54,8,768,45,165,46,81,654,3,48,7,41,54,6,71654,5618,745,4687,56461,8415,46841,654,18415,4641,684,8641,654,6848,
//1030 data points where 0 sum was reached around k=700
91971,84841,7108,25538,61927,311,13293,49323,82575,42047,42621,4528,33492,40233,8207,19313,17418,20046,97930,91319,21352,75522,80884,92887,3172,3402,30154,53295,45129,64875,76120,95241,75935,50600,14969,24058,64668,10739,74264,82103,95766,81604,31825,48253,98824,53223,50979,74839,22673,6901,6628,40582,11625,16851,74329,34832,99379,67076,64535,32430,87878,39846,87266,94771,68911,30598,78570,11443,96418,82912,14659,57422,88738,73430,37122,14757,65752,64413,55350,47566,40052,5269,63245,91024,62122,96172,73761,32491,9914,22246,56477,72743,31766,539,8060,51233,94746,38936,82773,4027,49755,75621,27878,36503,63731,96923,93088,71466,7829,91854,96506,5351,9372,792,8237,29526,10003,45061,84050,64869,44551,18686,31130,92931,77843,257,81465,33118,41736,19277,23252,95069,38862,84583,1510,78924,52875,83591,88760,51204,55668,31803,28820,72180,85375,31097,61709,65438,76378,50339,69786,24471,37894,62870,61760,37134,19589,41610,54127,65701,23447,47115,77960,13598,42731,76482,51722,53980,83969,68876,28247,64097,74556,89852,32215,28318,66235,62950,5848,45470,40770,50000,20546,47738,5013,56026,69247,9403,14276,78600,52114,49300,57225,70920,41405,25704,72529,85561,95069,24490,22578,66416,10333,42579,7541,34835,89226,88650,29651,87181,47493,73420,73326,86056,96184,881,3074,34043,12385,62809,32617,30558,47161,95675,18317,95487,1691,30156,70901,86281,29738,59373,94311,11038,62245,98438,48944,35946,67426,98144,37638,39288,90091,2419,74368,5501,53487,4721,45268,92114,77645,92420,55346,24469,10418,80834,72980,57352,54643,47955,28398,59555,4432,64450,94353,18022,7363,88904,18304,75731,28145,77099,37077,51892,77769,78618,58440,76279,93078,66569,40061,11341,95239,42097,34627,455,45190,15006,70919,28975,52242,94947,81103,98508,18289,49883,93925,10329,28593,59948,62807,53107,82485,46257,99603,81315,69200,57179,81100,70139,56208,71697,58216,18287,56682,80797,74856,68581,24932,56111,79553,44985,1078,33601,5052,46698,58454,21591,22216,75724,51901,19814,34312,93688,12404,1472,74226,42104,88751,74560,27770,52677,1257,3921,14543,38065,62154,80166,41952,83753,27875,96367,46870,56989,70061,29349,30417,14600,15638,9381,12672,32427,52193,63465,21644,79884,1788,84165,86538,32588,14481,62895,18922,17814,52043,27770,90651,30220,54177,684,12877,79534,9521,9151,62696,49504,17889,92016,34501,79437,49929,35694,79281,81751,61146,37207,14690,88139,71934,37867,42414,14138,68956,66459,78179,98301,41906,28393,66701,39038,98593,78928,19123,89097,7903,86555,7229,72289,30837,26828,75810,85795,52580,23946,52315,75066,6195,6247,10422,36205,85037,85639,37868,40653,13242,14990,17400,87468,33841,76043,15413,52200,15840,43988,4222,1163,97877,5894,27907,49478,82287,62434,88319,1326,96296,19314,63080,94678,65175,46033,18353,721,50185,87762,48604,70941,57076,15778,83744,24345,72384,93133,60848,51265,34558,58951,16594,45325,19575,41243,4129,36254,47318,68398,85336,10464,81489,49839,17483,40148,36113,1869,68571,90880,26744,26872,80029,40512,50642,85233,39595,61899,73401,33864,32744,45026,35147,62806,66004,75647,32795,25836,22709,46475,18975,89237,63503,37520,62019,72519,66694,78254,11971,26555,38208,51235,82437,73811,30071,60979,42083,59457,17922,53300,69295,14213,79140,14106,93565,39018,98767,12898,19065,99290,55406,96661,81503,17804,17835,45522,36121,15560,90373,29672,82686,95100,85898,30209,39965,18232,96036,83814,67533,31902,91084,43548,81247,34779,24890,45285,10364,29152,94940,1995,28647,63798,74587,51510,61728,52559,95367,41582,56753,92546,45668,73055,76292,80820,71398,87558,10149,25260,95802,56610,94918,65816,83004,32247,89064,94486,43603,91064,9278,44821,43852,46724,55095,8366,4778,36327,75601,71599,3061,64696,56375,58868,1881,13519,7193,3729,55724,98000,686,20422,84697,6823,99729,51581,9345,3230,33531,62041,75483,78380,13008,92322,73680,95761,3407,73779,1497,25348,4410,4715,97954,27151,96981,33027,16691,4754,50716,26714,15603,3877,63828,2177,78364,78663,90410,32799,40001,88635,31521,62240,71126,88550,45596,35836,30578,14734,48055,78423,99670,66613,25034,16271,95578,39832,59491,64164,90110,24612,94666,98316,36945,84526,23957,35914,74261,10148,89869,7362,96525,28747,19389,54348,30954,83866,24346,62858,96355,25336,89159,17438,19877,26213,67260,19395,50133,17429,80909,44168,77546,44149,40791,21306,59121,22933,97532,24283,47625,60143,50324,31150,79093,34412,15694,57816,56400,30645,44351,91535,47481,71120,45186,25358,96844,2731,37108,15691,10876,85188,81006,56378,7416,80928,73845,50342,33962,45379,14001,62637,45,66328,28684,75003,63335,81237,31773,34202,32170,51647,64902,65287,23594,39435,72560,25085,34321,96756,31878,39290,10456,313,58353,87017,45851,46863,88919,38035,94970,67059,21063,13281,91385,94599,5249,34230,96221,35681,18889,64631,49931,51949,23519,37007,59540,76583,70018,97867,98583,94493,13835,55055,56230,57409,48797,81045,97777,38919,4967,75806,36522,29159,64195,58832,51397,5911,47348,72203,31621,61132,32046,47295,81259,92105,61855,46985,8173,15735,16105,14233,36084,27771,77334,39122,50253,25481,17826,63048,64197,80649,172,94257,41669,22848,45634,72586,11604,36415,75842,95214,81968,86722,8491,5522,778,68350,83144,72919,27675,98142,63391,76649,1091,61181,77909,10498,4311,1144,73887,86234,49497,2192,89204,27685,19088,12111,74087,63381,72931,39497,22860,73816,96460,82602,26617,90907,742,77501,54128,6263,58682,81642,54077,13337,55144,73541,30715,98031,19841,26379,51787,48035,81621,81003,63135,71207,857,53082,49846,33006,69020,32600,28809,93781,27697,28789,84895,40154,42393,46255,83968,38531,59098,23078,2388,31081,47343,91678,12450,54226,9212,68542,55477,6778,75148,15625,15970,58963,76847,7532,43793,56065,6579,15151,54887,15814,80796,62039,38595,82848,30052,22450,42599,2606,11555,17245,8693,90166,98322,3856,43958,78906,24069,91181,74155,11157,26701,75147,79735,83698,59368,99053,27406,35721,38162,72535,90580,98451,36614,74207,57638,43118,31493,54616,3525,14593,70458,65804,2371,29952,95822,16967,46585,85324,44495,40046,40188,28571,49601,87926,46314,89084,54871,51785,30464,40750,88002,46775,99857,41941,70369,49355,82416,67822,88126,72305,68090,42573,6664,50620,8171,54154,64323,71018,70255,49214,19102,13961,38126,13767,75255,89885,24285,6784,45907,76710,69512,96761,36343,9178,43610,26232,20416,35417,79808,48812,1442,12738,14060,27780,73339,72251,22224,984,99484,3129,95242,5406,45172,93152,17698,79263,91020,2372,96955,93000,16632,24974,80075,22770,78679,52026,87169,97389,60924,95753,22470,73104,14341,89258,27802,15165,44009,16116,65558,26768,84349,42048,38158,69626,54520,6232,89607,70649,89678,64649,61427,73712,23429,60767,97914,19092,55872,67273,72611,17408,58426,45902,1158,3151,12460,6843,12175,39110,13795,48488,598,88102,62734,30051,85108,83685,4614,16221,89546,22251,33607,22389,28056,97714,97847,69668,14514,25876,47436,98820,80096,38333,73919,10210,53350,38424,71994,95426,16011,63218,46060,88059,54803,17782,4764,89636,75816,20450,71524,51424,66346,38996,51636,65503,35668,16180,35424,16688,71067,19510,20900,81505,44392,88822,66810,54956,22721,4020,55164,7768,53816,29369,97493,22693,50851,53883,40911,91519,8328,3488,89357,265,68837,37347,68925,53993,39617,21956,81340,90625,17603,82990,55479,96397,54300,90079,19013,58286,80248,93752,48825,87804,38548,82925,79145,68161,26215,27595,28166,84134,53883,72828,14699,57729,9756,21219,71888,58735,27888,77657,9862,29308,5713,10369,5132,16637,36379,74924,73424,7622,67815,44654,61976,37575,67544,41394,765,60364,48627,28929,35016,65876,16879,35727,58510,44848,68747,63314,45271,38285,19974,31022,46601,79594,28293,41943,93783,73472,25540,42352,12406,76008,60580,97316,35941,23328,63611,42353,32625,86073,50162,36848,48968,26200,44694,79594,18595,96664,3781,66827,18775,76745,23087,49444,9680,44804,1139,95993,35979,73285,78351,51555,55892,72987,919,6576,58724,74645,55748,15929,5263,9385,31276,81207,26297,13715,36839,23698,80161,78030,22099,23672,18946,52224,63113,52239,62193,69534,6218,9437,72951,26121,99384,89559,22585,41905,50820,17497,62181,95609,11040,48450,4672,19090,74922,25774,27276,59533,82413,90879,54002,1927,80107,48775,3426,13476,8,56421,57833,11826,84730,13248,44960,22347,77712,38664,51496,10446,26342,92613,2537,50518,15298,50077,95138,59074,38391,25995,51757,68041,45079,73503,44795,73578,73714,86823,85676,86652,24529,65036,8034,84053,27255,51289,41191,54733,32906,43556,99492,36724,54294,58745,21879,39742,9760,84910,89882,22040,74779,87799,6733,85094,51541,2561,42018,50703,77647,48273,51943,25317,76893,43156,11204,23775,31223,90188,61128,28268,90392,44615,8268,29091,49155,85684,99848,45164,39604,29565,7404,45835,49245,63540,86092,99420,29373,37648,26577,49013,58660,15547,88827,60902,28092,21057,57412,65051,95942,67104,62021,81178,64070,25825,74358,939,257,2738,85713,59958,34270,45916,48325,86031,68099,13435,96,26509,74600,67943,48462,59415,62719,63542,46788,33966,56201,74797,49645,34663,70653,13993,75123,64971,56299,24290,55613,17019,16469,10228,41530,65684,20980,5657,34485,74200,65094,45106,82980,23313,89679,78589,76463,44692,92419,28224,67769,79223,10358,12046,64508,5664,84873,98898,65128,22810,40006,78824,85890,28783,24195,88541,17804,77888,98005,3163,40585,37812,83540,65091,2368,10153,10123,67755,4906,26772,91241,61110,81443,96255,55859,82616,50686,39787,2824,67188,75051,72755,88297,70263,10886,52465,90967,67740,38048,41855,57897,36010,98311,39800,34970,61919,53605,35960,34556,75179,62847,99582,53397,64903,97681,21363,89165,18184,63832,93788,41877,73907,40862,72219,55696,34322,70520,86524,58044,96077,51867,68264,68773,42545,23291,87772,43135,35583,34601,78101,33345,33146,17877,55345,47317,93056,55095,65630,83641,32938,20800,94894,86836,18949,33463,96351,90210,15136,93306,57484,4970,44591,98902,99319,64390,71429,9453,10288,25967,27543,35135,10277,22490,95493,40130,69518,34173,68057,99926,85945,64916,85557,25386,77470,46018,32982,8214,45880,16878,64891,93415,13796,54925,65390,91184,73091,97527,20617,56281,20972,61849,6063,22834,23121,50132,23168,65229,36501,7732,61662,97901,80938,5449,4324,17177,98702,90374,47638,78220,8025,631,14868,7109,99196,81197,61445,13447,88088,61439,81253,11602,79082,91878,78818,23302,98516,17627,35548,90154,93396,36910,78363,42642,78977,42405,77288,36169,14039,62197,8267,34957,14627,80164,49675,44941,46282,38844,80471,92596,57433,129,85146,35443,63604,8925,2513,16232,75329,65547,12865,65564,40061,37036,2476,18330,67599,77393,27915,21717,39812,67331,39831,91457,82078,86552,84524,33187,78388,93499,31834,51610,48374,9599,62442,58001,79427,25427,10503,79516,447,67385,67312,58505,44860,57318,34878,5784,24556,75323,42995,80230,66273,70899,98707,35982,28881,77311,13652,56025,39197,99183,60848,78778,12611,55865,32717,30740,98863,41178,61939,7501,15980,32740,35422,11145,47642,95924,26318,2168,1409,43150,76210,83168,74739,80236,49366,56557,97408,79222,60346,18130,37973,34753,29330,6319,90128,81966,20999,93645,6116,58983,64143,64283,28777,72347,88576,31292,32646,60348,32715,27780,57558,71558,62441,13266,17590,44902,70808,47002,35206,76340,61551,21960,23786,90629,38566,37365,31258,7344,89896,68418,32815,88560,1462,20635,98402,15486,87818,87832,36665,62422,19706,36102,59077,73157,88408,62456,94333,70225,5552,24127,22076,11571,53084,13717,63016,69793,18229,59206,5782,57379,89482,91636,32108,36221,34750,75596,45483,42593,34406,5029,18772,57724,84624,39233,86922,77772,79550,46276,51367,81047,3458,57390,26930,10328,59690,9573,24463,84201,38847,23755,11125,21505,48379,61368,21367,57139,84731,655,79392,8614,574,39349,68808,91270,98549,33131,12817,91148,95310,90468,67711,69769,23201,54384,94404,70804,1585,8258,46100,64826,22817,47627,40503,7975,62228,2710,71282,16290,73345,9938,11950,57669,2767,47660,95717,4262,87231,23613,3132,5303,35138,15020,78273,55237,44423,43500,60718,8549,35593,34035,87978,94337,43352,66386,24438,4444,99250,15905,7122,12769,9057,38677,31656,91946,36395,6718,17485,60118,28635,16146,62946,37135,31640,81708,28255,76004,97645,67914,42872,64750,74295,54037,32473,77988,34434,87058,97884,71812,99386,59090,54572,94425,88778,26076,81966,74456,58589,75413,95109,35565,47090,62943,21610,15980,51616,83195,68938,77421,95579,43972,33475,80601,75068,71451,46172,53235,10594,229,14538,15779,3535,18792,74958,52376,39650,78112,30435,94447,23664,53290,14297,36564,57415,42577,62973,60178,54379,23471,86489,62865,63912,73881,27530,83805,88588,50955,79897,13092,25428,21032,55713,59695,44564,91593,38587,88118,40408,3458,52854,31271,82336,65550,75963,6379,94470,95815,57597,93874,49015,2623,78929,3542,96688,85728,93123,63370,93923,76560,81218,49577,40321,83277,75622,40868,20647,46612,53977,72510,27750,35045,37285,27490,11203,31005,67918,16305,66145,40471,68625,23301,84038,20189,36517,98936,39855,69536,39650,69495,4564,22889,54425,52056,70852,2908,50233,74545,75399,24963,43468,48828,24219,51897,9941,2651,36619,17298,43353,88712,82010,31618,63076,36919,84544,79994,83762,88616,25184,25358,90200,24379,1593,4588,99452,57834,24314,49316,55537,35259,88574,38989,461,34754,4295,90437,72079,88103,58951,94197,6143,66948,69639,73061,73794,18524,36274,36020,97571,22422,94371,55337,70290,23719,53295,26714,29618,44899,26361,6649,77299,54330,43848,2630,5462,43877,47405,17621,71980,48556,69593,10270,46270,96883,82522,97178,9692,38037,86735,12843,18547,90318,86142,36102,54725,67779,33960,70407,88337,68586,7416,32155,69809,76500,70306,61447,80616,15737,87763,27999,73308,44107,53341,12657,44004,61361,90129,11435,7085,83481,28657,86953,13639,35151,52055,98933,42279,8555,15128,98397,55654,21379,32574,97085,84159,36599,94225,9609,46510,85077,48658,2029,91337,35690,89749,37788,70610,10988,40687,76143,75614,96737,58393,53231,6038,72579,98172,68388,22202,75978,76105,62423,49984,89596,86908,52089,45258,42748,81535,12974,16453,27924,36973,55034,86639,71309,45107,17565,24195,39749,79635,82267,38262,24281,66895,22641,63707,26083,77311,11325,69871,66911,58230,47237,86189,86797,39368,87093,27872,48833,92246,72298,38752,73197,52211,18861,26297,48056,20065,65745,77341,96440,93208,79154,17581,91398,79828,59172,11980,31520,93006,90766,66947,25505,25662,51023,65611,68719,63162,26980,62111,33503,1486,79944,60541,97670,31407,32663,2175,36753,80740,98049,11417,70007,45707,20439,23306,81475,18876,79912,99032,38020,70540,25705,47679,54785,65712,60704,17881,63622,70029,11128,12249,58989,86318,73313,81666,73660,35613,13506,18101,38049,72504,24305,72834,59070,86071,40740,89168,78606,20938,68498,4058,52276,17717,47543,69021,7429,70137,18260,99292,15612,36939,26462,98679,74311,89545,57406,22151,89872,75116,77927,708,45972,38956,82904,59329,66204,82768,20511,72448,62524,87094,72666,36882,35918,2861,47804,24993,4741,22743,21102,23629,94258,42562,64222,95958,20392,86930,90823,49038,70593,4626,44967,17031,30479,24644,445,39185,42883,78137,45683,17756,54827,56304,89580,68790,85188,79520,52051,50933,47318,99246,22670,38591,17499,28045,35897,10805,97972,59912,69796,47591,87583,71044,9546,60276,47273,54530,64316,78565,55598,85509,68537,15822,53298,6913,78820,34806,23461,284,67945,80858,85813,9375,38770,83815,85870,58022,85829,52337,86324,79505,83821,11453,90476,36126,3399,87985,3568,71839,500,4500,88954,76271,29306,7336,56562,80863,6939,11635,25426,90038,43766,11098,1442,14002,76769,11714,51146,37615,88888,44783,94305,70233,19140,88140,54117,69619,93691,63860,64113,38961,85274,83126,16814,59257,35115,47597,69520,86248,82673,58660,20880,41290,12433,47215,34085,89592,19558,40045,76312,14199,74436,86458,70215,91487,71433,842,52231,56386,50702,87931,25493,45389,77371,50364,58718,16481,47618,59975,34313,10748,46240,47191,67951,40516,61168,65880,38343,23913,43060,82008,62035,75518,40273,83224,93027,21877,77435,88307,71632,1290,632,50128,9736,88768,37080,45712,4640,24781,93644,30882,33452,46119,42674,37702,89906,3306,6481,4953,12024,59540,87372,48001,2808,86196,55303,4471,18788,16408,59537,96648,92719,33499,6722,48955,33862,37879,13396,51670,43929,3939,48079,7524,30503,53480,5113,51939,94524,21433,17288,99724,16560,84499,11638,87050,60679,30010,88386,23716,28451,49719,22809,4109,25706,42582,55513,37422,47324,48847,53170,43576,84234,70617,40713,83624,15968,10641,63638,32104,65516,91641,5415,11173,5659,26470,28816,5998,33061,37595,42178,89808,43363,5269,23750,61805,51709,85293,88466,97116,4958,53628,55383,7265,38854,41885,40104,76385,44247,11543,30538,2550,62201,6462,67803,58608,73861,24575,92339,66125,11831,69331,66295,92341,93284,77231,44467,36331,47688,61093,39930,11186,17004,50702,88419,87123,4120,75947,73915,56934,53118,9829,22476,31866,85915,78365,50359,87479,80483,43982,24880,11468,69964,23984,20071,95228,63841,65270,25149,25133,58416,90652,74823,57118,96185,80077,2593,71310,39144,50045,38367,99790,79818,24103,43742,15546,60521,37955,28865,40358,85080,82134,23407,26816,91597,55285,34872,40824,81081,96452,96514,19764,63241,8287,7009,94878,93697,19786,50498,812,16033,1477,5958,72682,24719,2862,24640,75466,72096,62615,59825,51657,88987,31905,7150,1124,89274,93786,93492,12603,77640,93879,50029,38429,46416,14540,60096,99854,53701,21337,14776,23149,88425,87612,14118,73275,95677,39736,3861,38363,43532,86976,15608,57283,51214,31728,86864,3893,27587,60312,59186,75516,88981,54955,51563,28305,65703,39850,31114,26250,3584,46705,77618,3806,65896,68972,86255,44699,33004,84586,3957,78670,42706,9693,82971,71501,40885,48798,242,43439,36694,26933,51184,3285,42256,95213,33537,49816,36308,90626,951,40183,83542,11529,28352,57885,13323,48035,12880,7877,91954,93393,52484,52896,23149,50824,21749,49257,22391,26697,86114,57421,78006,81506,93889,24402,6114,92508,14331,62855,31552,80177,74401,2284,59442,18156,14048,16802,43234,64081,10157,79349,27857,18695,43181,37814,81532,19176,12963,37409,62303,51145,82240,82460,24582,65136,51998,26422,3929,13113,49884,43757,5622,55423,97518,40118,86731,92631,27205,90926,96293,52068,23709,65996,26947,96154,90337,7128,61897,4087,1991,23993,72488,86899,79374,11324,65148,88418,54336,12508,45647,16415,24710,13391,49148,95397,52338,72425,10023,68818,69355,49133,6885,23866,96638,15108,4489,91949,27222,5337,23033,81313,53666,77066,51356,802,88481,73212,23401,9683,58821,34532,78650,75983,37081,45008,45518,28358,73269,62559,78852,33061,22244,40088,55471,93262,69180,69656,21966,99573,56305,78275,32777,50891,53474,49154,20607,2148,90109,39213,57031,23313,61937,36049,86539,44626,86424,72189,79772,35617,2010,10973,83124,61716,63266,76741,70735,42465,12545,50465,55141,56235,27373,11852,54391,62949,35903,11676,42076,94570,98170,21346,17823,34684,94320,5241,17876,22221,86826,60898,6228,79471,82826,96582,25270,22077,78881,45064,40919,62442,72087,63729,34985,31142,15176,70720,52435,18755,39650,40171,90443,81261,36559,81823,67630,78532,56197,16870,77809,5654,18834,96386,3089,20120,95531,2744,78053,81987,16283,43645,86248,80595,11559,18234,59452,81379,53882,15148,5439,15665,98296,52359,40524,34081,
//+8000 rand ... cut to about 3000 to fit stackoverflow posting limits
};
//numofa: size of data set
int numofa=sizeof(a)/sizeof(int);
//Sort in increasing order. Used by slow algo to be not nearly as slow.
int sortfunc (const void *a, const void *b) {
return (*(int *)a - *(int *)b);
}
// Given 3 adjacent k values. Re-calculates the middle k value where (changing only this middle k value) the sum of the error on its left and error on its right is minimized (ie, ke[left] and ke[middle] are minimized).
int minimize_error_3(int *k, int *kai, int64_t *ke, const int left, const int middle, const int right) {
int64_t minerr=-1;
int64_t tmperr;
int64_t l_e=0, e=0; //not necessary to save errors by parts in general but
long minidx=kai[left]+1;
//printf ("%d %d %d %d ",k[middle], left, middle, right);
for (int i=kai[left]; i<kai[right]; i++) {
tmperr=0;
for (int j=kai[left]+1; j<i; j++) { //int j=kai[left]
tmperr+=a[j]-a[kai[left]];
}
e=tmperr;
for (int j=i+1; j<kai[right]; j++) {
tmperr+=a[j]-a[i];
}
if (minerr==-1)
minerr=tmperr;
if (tmperr<minerr) {
minerr=tmperr;
minidx=i;
l_e=e;
}
}
ke[left]=l_e;
ke[middle]=minerr>-1?minerr-l_e:0;
kai[middle]=minidx;
k[middle]=a[minidx];
//printf ("%d %d %d.%d ",ke[left], ke[middle], k[middle], minidx);
return 0;
}
int evenstartitercycles (int numofk) {
char *str, *str2;
int i, idx, err;
int done, moved;
int *k, *kai, *k_old;
int64_t *ke;
qsort(a,numofa,sizeof(int),sortfunc);
k=(int *) calloc(numofk,sizeof(int));
kai=(int *) malloc(numofk*sizeof(int));
k_old=(int *) malloc(numofk*sizeof(int));
ke=(int64_t *) calloc(numofk,sizeof(int64_t));
k[0]=a[0];
kai[0]=0;
k_old[0]=k[0];
for (int i=1; i<numofk-1; i++) {
k[i]=a[(numofa*i)/(numofk-1)];
kai[i]=(numofa*i)/(numofk-1);
k_old[i]=k[i];
}
k[numofk-1]=a[numofa-1];
kai[numofk-1]=numofa-1;
k_old[numofk-1]=k[numofk-1];
ke[numofk-1]=0; //already 0
i=0;
moved=1;
int at_end=0;
int min_x=k[2]-k[1]; //1 doing infin loop 0 ok but violates rule
int min_xi=1; //1 doing infin loop 0 ok but violates rule
int max_e=-1;
int max_ei=0;
while (!at_end || moved) {
if (i==0) {
moved=0;
at_end=0;
min_x=k[2]-k[1]; //?
min_xi=1;
max_e=-1; //?
max_ei=0;
}
minimize_error_3(k, kai, ke, i, i+1, i+2);
if (i>0) {
if (k[i+1]-k[i]<min_x) {
min_x=k[i+1]-k[i];
min_xi=i;
}
if (ke[i]>max_e && i>min_xi+1) {
max_e=ke[i];
max_ei=i;
}
//later do going to left version
}
if (k[i+1]!=k_old[i+1]) {
moved=1;
k_old[i+1]=k[i+1];
}
if (i<numofk-3) {
i++;
} else {
if (ke[i+1]>max_e && i+1>min_xi) {
max_e=ke[i+1];
max_ei=i+1;
}
//here see if can gain from shifting around some
if (max_ei>min_xi+3 && .1*ke[min_xi]*(k[min_xi]-k[min_xi-1])<max_e) { //fix the +3 to make it unnec??? .3??
//printf("1:%d %d %d %d ",min_x,min_xi,max_e,max_ei);
moved=1;
for (int i=min_xi; i<max_ei; i++) {
k[i]=a[kai[i+1]];
kai[i]=kai[i+1];
}
k[max_ei]=a[++kai[max_ei]];
}
i=0;
at_end=1;
}
}
err=0;
for (int i=0; i<numofk; i++)
err+=ke[i];
str=(char *) calloc(numofk,20);
for (int i=0; i<numofk; i++)
sprintf (str+strlen(str),"%d,",k[i]);
str2=(char *) calloc(numofk,20);
for (int i=0; i<numofk; i++)
sprintf (str2+strlen(str2),"%d,",(int)ke[i]);
printf ("\nevenstartitercycles(%d): The mininum error was %d, found at, k={%s} with error parts={%s} ",numofk,err,str,str2);
free(str);
free(str2);
return 0;
}
int main (int x, char **y) {
int t; //to track unique num of data if want this feature
int kmax;
qsort(a,numofa,sizeof(int),sortfunc);
t=1;
for (int i=1; i<numofa; i++)
if (a[i]!=a[i-1]) {
t++;
}
kmax=t; //t is value where we can reach 0 err sum for first time
kmax=numofa; //this will give many cases of 0 sum error for data sets that have many repeated data points.
for (int i=3; i<=kmax; i++) {
evenstartitercycles(i);
}
return 0;
}
This is is similar to one-dimensional k-medians clustering.
The DP I suggested previously won't work; I think we need a table from (n', k', i) to the optimal solution on D1 ≤ … ≤ Dn' with k' representatives of which the greatest is i. Given the bounds on D, the running time is on the order of n2 k with a very large constant, so you should probably adapt one of the heuristics that people use for k-means.
You may want to look into Ward's method using your particular distance function.
one first obvious point is that your Rs must be chosen amongst your Ds. (if you choose an R that is any D - 1 (but not another D), you'll improve the answer by incrementing your R by 1)
a second point is that your last R is useless and its error will always be 0

Generate a random number within range? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Generating Random Numbers in Objective-C
How do I generate a random number which is within a range?
This is actually a bit harder to get really correct than most people realize:
int rand_lim(int limit) {
/* return a random number between 0 and limit inclusive.
*/
int divisor = RAND_MAX/(limit+1);
int retval;
do {
retval = rand() / divisor;
} while (retval > limit);
return retval;
}
Attempts that just use % (or, equivalently, /) to get the numbers in a range almost inevitably introduce skew (i.e., some numbers will be generated more often than others).
As to why using % produces skewed results: unless the range you want is a divisor of RAND_MAX, skew is inevitable. If you start with small numbers, it's pretty easy to see why. Consider taking 10 pieces of candy (that we'll assume you can't cut, break, etc. into smaller pieces) and trying to divide it evenly between three children. Clearly it can't be done--if you hand out all the candy, the closest you can get is for two kids to get three pieces of candy, and one of them getting four.
There's only one way for all the kids to get the same number of pieces of candy: make sure you don't hand out the last piece of candy at all.
To relate this to the code above, let's start by numbering the candies from 1 to 10 and the kids from 1 to 3. The initial division says since there are three kids, our divisor is three. We then pull a random candy from the bucket, look at its number and divide by three and hand it to that kid -- but if the result is greater than 3 (i.e. we've picked out candy number 10) we just don't hand it out at all -- we discard it and pick out another candy.
Of course, if you're using a modern implementation of C++ (i.e., one that supports C++11 or newer), you should usually use one the distribution classes from the standard library. The code above corresponds most closely with std::uniform_int_distribution, but the standard library also includes uniform_real_distribution as well as classes for a number of non-uniform distributions (Bernoulli, Poisson, normal, maybe a couple others I don't remember at the moment).
int rand_range(int min_n, int max_n)
{
return rand() % (max_n - min_n + 1) + min_n;
}
For fractions:
double rand_range(double min_n, double max_n)
{
return (double)rand()/RAND_MAX * (max_n - min_n) + min_n;
}
For an integer value in the range [min,max):
double scale = (double) (max - min) / RAND_MAX;
int val = min + floor(rand() * scale)
I wrote this specifically in Obj-C for an iPhone project:
- (int) intInRangeMinimum:(int)min andMaximum:(int)max {
if (min > max) { return -1; }
int adjustedMax = (max + 1) - min; // arc4random returns within the set {min, (max - 1)}
int random = arc4random() % adjustedMax;
int result = random + min;
return result;
}
To use:
int newNumber = [aClass intInRangeMinimum:1 andMaximum:100];
Add salt to taste
+(NSInteger)randomNumberWithMin:(NSInteger)min WithMax:(NSInteger)max {
if (min>max) {
int tempMax=max;
max=min;
min=tempMax;
}
int randomy=arc4random() % (max-min+1);
randomy=randomy+min;
return randomy;
}
I use this method in a random number related class I made. Works well for my non-demanding needs, but may well be biased in some way.

Resources