Find k out of n subset with maximal area - c

I have n points and have to find the maximum united area between k points (k <= n). So, it's the sum of those points' areas minus the common area between them.
Suppose we have n=4, k=2. As illustrated in the image above, the areas are calculated from each point to the origin, and the final area is the sum of the B area and the D area (only counting the area of their intersection once). No point is dominated.
I have implemented a bottom-up dynamic programming algorithm, but it has an error somewhere. Here is the code, that prints out the best result:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct point {
double x, y;
} point;
struct point *point_ptr;
int n, k;
point points_array[1201];
point result_points[1201];
void qsort(void *base, size_t nitems, size_t size,
int (*compar)(const void *, const void *));
int cmpfunc(const void *a, const void *b) {
point *order_a = (point *)a;
point *order_b = (point *)b;
if (order_a->x > order_b->x) {
return 1;
}
return -1;
}
double max(double a, double b) {
if (a > b) {
return a;
}
return b;
}
double getSingleArea(point p) {
return p.x * p.y;
}
double getCommonAreaX(point biggest_x, point new_point) {
double new_x;
new_x = new_point.x - biggest_x.x;
return new_x * new_point.y;
}
double algo() {
double T[k][n], value;
int i, j, d;
for (i = 0; i < n; i++) {
T[0][i] = getSingleArea(points_array[i]);
}
for (j = 0; j < k; j++) {
T[j][0] = getSingleArea(points_array[0]);
}
for (i = 1; i < k; i++) {
for (j = 1; j < n; j++) {
for (d = 0; d < j; d++) {
value = getCommonAreaX(points_array[j - 1], points_array[j]);
T[i][j] = max(T[i - 1][j], value + T[i - 1][d]);
}
}
}
return T[k - 1][n - 1];
}
void read_input() {
int i;
fscanf(stdin, "%d %d\n", &n, &k);
for (i = 0; i < n; i++) {
fscanf(stdin, "%lf %lf\n", &points_array[i].x, &points_array[i].y);
}
}
int main() {
read_input();
qsort(points_array, n, sizeof(point), cmpfunc);
printf("%.12lf\n", algo());
return 0;
}
with the input:
5 3
0.376508963445 0.437693410334
0.948798695015 0.352125307881
0.176318878234 0.493630156084
0.029394902328 0.951299438575
0.235041868262 0.438197791997
where the first number equals n, the second k and the following lines the x and y coordinates of every point respectively, the result should be: 0.381410589193,
whereas mine is 0.366431740966. So I am missing a point?

This is a neat little problem, thanks for posting! In the remainder, I'm going to assume no point is dominated, that is, there are no points c such that there exists a point d with c.x < d.x and c.y < d.y. If there are, then it is never optimal to use c (why?), so we can safely ignore any dominated points. None of your example points are dominated.
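If your input might contain dominated points, they can be filtered out up front. The following is only a sketch: the names `pt` and `removeDominated` are mine, it assumes non-negative coordinates, and it also drops points that merely tie with a better one, since those add no area either.

```c
#include <stdlib.h>

typedef struct { double x, y; } pt;  /* hypothetical point type for this sketch */

/* Orders points by decreasing x; for equal x the larger y comes first. */
static int byXDescending(const void *a, const void *b)
{
    const pt *p = a, *q = b;
    if (p->x != q->x)
        return (p->x < q->x) ? 1 : -1;
    if (p->y != q->y)
        return (p->y < q->y) ? 1 : -1;
    return 0;
}

/* Removes dominated points in place and returns how many are kept.
   The survivors end up sorted by increasing x. */
size_t removeDominated(pt *pts, size_t count)
{
    size_t kept = 0;
    double bestY = -1.0;  /* assumes all y values are non-negative */

    qsort(pts, count, sizeof *pts, byXDescending);
    for (size_t i = 0; i < count; ++i) {
        if (pts[i].y > bestY) {  /* no point further right is as high */
            bestY = pts[i].y;
            pts[kept++] = pts[i];
        }
    }
    /* reverse the survivors so that x increases */
    for (size_t i = 0; i < kept / 2; ++i) {
        pt t = pts[i];
        pts[i] = pts[kept - 1 - i];
        pts[kept - 1 - i] = t;
    }
    return kept;
}
```

After this step the remaining points have strictly increasing x and strictly decreasing y, which is the shape the rest of the answer relies on.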
Your problem exhibits optimal substructure: once we have decided which item is to be included in the first iteration, we have the same problem again with k - 1, and n - 1 (we remove the selected item from the set of allowed points). Of course the pay-off depends on the set we choose - we do not want to count areas twice.
I propose we pre-sort all points by their x-value, in increasing order. This ensures the value of a selection of points can be computed as piece-wise areas. I'll illustrate with an example: suppose we have three points, (x1, y1), ..., (x3, y3) with values (2, 3), (3, 1), (4, .5). Then the total area covered by these points is (4 - 3) * .5 + (3 - 2) * 1 + (2 - 0) * 3. I hope it makes sense in a graph:
By our assumption that there are no dominated points, we will always have such a weakly decreasing figure. Thus, pre-sorting solves the entire problem of "counting areas twice"!
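To make the piece-wise computation concrete, here is a small sketch. The helper name `staircaseArea` and the type `pt` are my own, and the function assumes the points are already sorted by increasing x with no dominated points among them.

```c
#include <stddef.h>

typedef struct { double x, y; } pt;  /* hypothetical point type for this sketch */

/* Area of the union of the origin-anchored rectangles of pts[0..count-1].
   Assumes increasing x and no dominated points (so y never increases). */
double staircaseArea(const pt *pts, size_t count)
{
    double area = 0.0;
    double prevX = 0.0;
    for (size_t i = 0; i < count; ++i) {
        area += (pts[i].x - prevX) * pts[i].y;  /* vertical strip from prevX to x */
        prevX = pts[i].x;
    }
    return area;
}
```

For the three points (2, 3), (3, 1), (4, .5) this sums (2 - 0) * 3 + (3 - 2) * 1 + (4 - 3) * .5 = 7.5, the same strips as in the example above.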
Let us turn this into a dynamic programming algorithm. Consider a set of n points, labelled {p_1, p_2, ..., p_n}. Let d[k][m] be the maximum area of a subset of size k + 1 where the (k + 1)-th point in the subset is point p_m. Clearly, m cannot be chosen as the (k + 1)-th point if m < k + 1, since then we would have a subset of size less than k + 1, which is never optimal. We have the following recursion,
d[k][m] = max {d[k - 1][l] + (p_m.x - p_l.x) * p_m.y, for all k <= l < m}.
The initial cases, where k = 0 (subsets consisting of a single point), are the rectangular areas of each point. The initial cases together with the updating equation suffice to solve the problem. I estimate the following code as O(n^2 * k). The term squared in n can probably be lowered as well, as we have an ordered collection and might be able to apply a binary search to find the best subset in log n time, reducing n^2 to n log n. I leave this to you.
In the code, I have re-used my notation above where possible. It is a bit terse, but hopefully clear with the explanation given.
#include <stdio.h>
typedef struct point
{
    double x;
    double y;
} point_t;
double maxAreaSubset(point_t const *points, size_t numPoints, size_t subsetSize)
{
    // This should probably be heap allocated in your program.
    double d[subsetSize][numPoints];
    for (size_t m = 0; m != numPoints; ++m)
        d[0][m] = points[m].x * points[m].y;
    for (size_t k = 1; k != subsetSize; ++k)
        for (size_t m = k; m != numPoints; ++m)
        {
            d[k][m] = 0.; // must be initialised before we take the maximum
            for (size_t l = k - 1; l != m; ++l)
            {
                point_t const curr = points[m];
                point_t const prev = points[l];
                double const area = d[k - 1][l] + (curr.x - prev.x) * curr.y;
                if (area > d[k][m]) // is a better subset
                    d[k][m] = area;
            }
        }
    // The maximum area subset is now one of the subsets on the last row.
    // Entries with m < subsetSize - 1 are never filled, so start there.
    double result = 0.;
    for (size_t m = subsetSize - 1; m != numPoints; ++m)
        if (d[subsetSize - 1][m] > result)
            result = d[subsetSize - 1][m];
    return result;
}
int main()
{
    // I assume these are entered in sorted order, as explained in the answer.
    point_t const points[5] = {
        {0.029394902328, 0.951299438575},
        {0.176318878234, 0.493630156084},
        {0.235041868262, 0.438197791997},
        {0.376508963445, 0.437693410334},
        {0.948798695015, 0.352125307881},
    };
    printf("%f\n", maxAreaSubset(points, 5, 3));
}
Using the example data you've provided, I find an optimal result of 0.381411, as desired.

From what I can tell, you and I both use the same method to calculate the area, as well as the overall concept, but my code seems to be returning a correct result. Perhaps reviewing it can help you find a discrepancy.
JavaScript code:
function f(pts, k){
// Sort the points by x
pts.sort(([a1, b1], [a2, b2]) => a1 - a2);
const n = pts.length;
let best = 0;
// m[k][j] represents the optimal
// value if the jth point is chosen
// as rightmost for k points
let m = new Array(k + 1);
// Initialise m
for (let i=1; i<=k; i++)
m[i] = new Array(n);
for (let i=0; i<n; i++)
m[1][i] = pts[i][0] * pts[i][1];
// Build the table
for (let i=2; i<=k; i++){
for (let j=i-1; j<n; j++){
m[i][j] = 0;
for (let jj=j-1; jj>=i-2; jj--){
const area = (pts[j][0] - pts[jj][0]) * pts[j][1];
m[i][j] = Math.max(m[i][j], area + m[i-1][jj]);
}
best = Math.max(best, m[i][j]);
}
}
return best;
}
var pts = [
[0.376508963445, 0.437693410334],
[0.948798695015, 0.352125307881],
[0.176318878234, 0.493630156084],
[0.029394902328, 0.951299438575],
[0.235041868262, 0.438197791997]
];
var k = 3;
console.log(f(pts, k));

Related

How to replace a recursive function to using stack or iteration?

I have a recursive function that I wrote in C that looks like this:
void findSolutions(int** B, int n, int i) {
if (i > n) {
printBoard(B, n);
} else {
for (int x = 1; x <= n; x++) {
if (B[i][x] == 0) {
placeQueen(B, n, i, x);
findSolutions(B, n, i + 1);
removeQueen(B, n, i, x);
}
}
}
}
The initial call is (size is an integer given by user and B is a 2D array):
findSolutions(B, size, 1);
I tried to convert it into an iterative function, but there is another function call, removeQueen, after findSolutions. I got stuck on where to put this function call. How can I solve this problem? Using a stack is also fine, but I'm having trouble doing that too.
I'm going to assume that placeQueen(B, n, i, x) makes a change to B and that removeQueen(B, n, i, x) undoes that change.
This answer shows how to approach the problem generically. It doesn't modify the algorithm like Aconcagua has.
Let's start by defining a state structure.
typedef struct {
int **B;
int n;
int i;
} State;
The original code is equivalent to the following:
void _findSolutions(State *state) {
if (state->i > state->n) {
printBoard(state->B, state->n);
} else {
for (int x = 1; x <= state->n; ++x) {
if (state->B[state->i][x] == 0) {
State *state2 = State_clone(state); // Deep clone.
placeQueen(state2->B, state2->n, state2->i, x);
++state2->i;
_findSolutions(state2);
}
}
}
State_free(state); // Frees the board too.
}
void findSolutions(int** B, int n, int i) {
State *state = State_new(B, n, i); // Deep clones B.
_findSolutions(state);
}
Now, we're in position to eliminate the recursion.
void _findSolutions(State *state) {
StateStack *S = StateStack_new();
do {
if (state->i > state->n) {
printBoard(state->B, state->n);
} else {
for (int x = state->n; x>=1; --x) { // Reversed the loop to maintain order.
if (state->B[state->i][x] == 0) {
State *state2 = State_clone(state); // Deep clone.
placeQueen(state2->B, state2->n, state2->i, x);
++state2->i;
StateStack_push(S, state2);
}
}
}
State_free(state); // Frees the board too.
} while (StateStack_pop(S, &state));
StateStack_free(S);
}
void findSolutions(int** B, int n, int i) {
State *state = State_new(B, n, i); // Deep clones B.
_findSolutions(state);
}
We can eliminate the helper we no longer need.
void findSolutions(int** B, int n, int i) {
StateStack *S = StateStack_new();
State *state = State_new(B, n, i); // Deep clones B.
do {
if (state->i > state->n) {
printBoard(state->B, state->n);
} else {
for (int x = state->n; x>=1; --x) { // Reversed the loop to maintain order.
if (state->B[state->i][x] == 0) {
State *state2 = State_clone(state); // Deep clone.
placeQueen(state2->B, state2->n, state2->i, x);
++state2->i;
StateStack_push(S, state2);
}
}
}
State_free(state); // Frees the board too.
} while (StateStack_pop(S, &state));
StateStack_free(S);
}
Functions you need to implement:
StateStack *StateStack_new(void)
void StateStack_free(StateStack *S)
void StateStack_push(StateStack *S, State *state)
int StateStack_pop(StateStack *S, State **p)
State *State_new(int **B, int n, int i) (Note: Clones B)
State *State_clone(const State *state) (Note: Clones state->B)
void State_free(State *state) (Note: Frees state->B)
Structures you need to implement:
StateStack
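For the stack itself, a growable array of pointers is enough. This is just one possible sketch of the four StateStack functions listed above: the field names are my own choice, and it aborts on out-of-memory rather than reporting the error.

```c
#include <stdlib.h>

typedef struct {
    int **B;
    int n;
    int i;
} State;

/* A growable array of State pointers used as a LIFO stack. */
typedef struct {
    State **items;
    size_t size;
    size_t capacity;
} StateStack;

StateStack *StateStack_new(void)
{
    StateStack *S = calloc(1, sizeof *S);  /* empty stack: no items yet */
    if (!S)
        abort();                           /* this sketch gives up on OOM */
    return S;
}

void StateStack_free(StateStack *S)
{
    if (S) {
        free(S->items);  /* does not free States that are still stacked */
        free(S);
    }
}

void StateStack_push(StateStack *S, State *state)
{
    if (S->size == S->capacity) {
        size_t newCap = S->capacity ? 2 * S->capacity : 16;
        State **p = realloc(S->items, newCap * sizeof *p);
        if (!p)
            abort();
        S->items = p;
        S->capacity = newCap;
    }
    S->items[S->size++] = state;
}

/* Pops the top element into *p and returns 1, or returns 0 if empty. */
int StateStack_pop(StateStack *S, State **p)
{
    if (S->size == 0)
        return 0;
    *p = S->items[--S->size];
    return 1;
}
```

Because StateStack_pop returns 0 on an empty stack, it can be used directly as the loop condition of the do/while loop.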
Tip:
It would be best if you replaced
int **B = malloc((n+1)*sizeof(int*));
for (int i=1; i<=n; ++i)
B[i] = calloc(n+1, sizeof(int));
...
for (int x = 1; x <= n; ++x)
...
B[i][x]
with
char *B = calloc(n*n, 1);
...
for (int x = 0; x < n; ++x)
...
B[i*n+x] // with i zero-based as well
What the recursive call gives you is that the location of the queen in the current row is stored before you advance to the next row. You will have to reproduce this in the non-recursive version of your function.
You might use another array storing these positions:
unsigned int* positions = calloc(n + 1, sizeof(unsigned int));
// need to initialise all positions to 1 yet:
for(unsigned int i = 1; i <= n; ++i)
{
positions[i] = 1;
}
I reserved a dummy element so that we can use the same indices...
You can now count up last position from 1 to n, and when reaching n there, you increment next position, restarting with current from 1 – just the same way as you increment numbers in decimal, hexadecimal or octal system: 1999 + 1 = 2000 (zero based in this case...).
for(;;)
{
for(unsigned int i = 1; i <= n; ++i)
{
placeQueen(B, n, i, positions[i]);
}
printBoard(B, n);
for(unsigned int i = 1; i <= n; ++i)
{
removeQueen(B, n, i, positions[i]);
}
for(unsigned int i = 1; i <= n; ++i)
{
if(++positions[i] <= n)
// break incrementing if we are in between the numbers:
// 1424 will get 1431 (with last position updated already before)
goto CONTINUE;
positions[i] = 1;
}
// we completed the entire positions list, i. e. we reset very
// last position to 1 again (comparable to an overflow: 4444 got 1111)
// so we are done -> exit main loop:
break;
CONTINUE: (void)0;
}
It's untested code, so you might find a bug in it, but it should clearly illustrate the idea. It's the naive approach, always placing the queens and removing them again.
You can do it a bit cleverer, though: place all queens at positions 1 initially and only move the queens if you really need:
for(unsigned int i = 1; i <= n; ++i)
{
positions[i] = 1;
placeQueen(B, n, i, 1);
}
for(;;)
{
printBoard(B, n);
for(unsigned int i = 1; i <= n; ++i)
{
removeQueen(B, n, i, positions[i]);
if(++positions[i] <= n)
{
placeQueen(B, n, i, positions[i]);
goto CONTINUE;
}
placeQueen(B, n, i, 1);
positions[i] = 1;
}
break;
CONTINUE: (void)0;
}
// cleaning up the board again:
for(unsigned int i = 1; i <= n; ++i)
{
removeQueen(B, n, i, 1);
}
Again, untested...
You might discover that now the queens move within first row first, different to your recursive approach before. If that disturbs you, you can count down from n to 1 while incrementing the positions and you get original order back...
At the very end (after exiting the loop), don't forget to free the array again to avoid memory leak:
free(positions);
If n doesn't get too large (eight for a typical chess board?), you might use a VLA to prevent that problem.
Edit:
The solutions above will print every possible combination of placing eight queens on a chess board, one per row. For an 8x8 board, you get 8^8 possible combinations, which is more than 16 million. You will surely want to filter out some of these combinations, as you did in your original solution as well (if(B[i][x] == 0)), e. g.:
unsigned char* checks = malloc(n + 1);
for(;;)
{
memset(checks, 0, (n + 1));
for(unsigned int i = 1; i <= n; ++i)
{
if(checks[positions[i]] != 0)
goto SKIP;
checks[positions[i]] = 1;
}
// place queens and print board
SKIP:
// increment positions
}
(Trivial approach! Including the filter in the more elaborate approach will get more tricky!)
This will even be a bit more strict than your test, which would have allowed
_ Q _
Q _ _
_ Q _
on a 3x3 board, as you only compare against previous column, whereas my filter wouldn't (leaving a bit more than 40 000 boards to be printed for an 8x8 board).
Edit 2: The diagonals
To filter out those boards where the queens attack each other on the diagonals you'll need additional checks. For these, you'll have to find out what the common criterion is for the fields on the same diagonal. At first, we have to distinguish two types of diagonals, those starting at B[1][1], B[1][2], ... as well as B[2][1], B[3][1], ... – all these run from top left to bottom right direction. On the main diagonal, you'll discover that the difference between row and column index is always zero; on the neighbouring diagonals it is 1 and -1 respectively, and so on. So we'll have differences in the range [-(n-1); n-1].
If we make the checks array twice as large and shift all differences by n, we can re-use exactly the same checks as we did already for the columns:
unsigned char* checks = (unsigned char*)malloc(2*n + 1);
and after we checked the columns:
memset(checks, 0, (2 * n + 1));
for(unsigned int i = 1; i <= n; ++i)
{
if(checks[n + i - positions[i]] != 0)
goto SKIP;
checks[n + i - positions[i]] = 1;
}
Side note: Even if the array is larger, you still can just memset(checks, 0, n + 1); for the columns as we don't use the additional entries...
Next, we are interested in the diagonals going from bottom left to top right. Similarly to the other direction, you'll discover that the difference between n - i and positions[i] remains constant for fields on the same diagonal. Again we shift by n and end up with:
memset(checks, 0, (2 * n + 1));
for(unsigned int i = 1; i <= n; ++i)
{
if(checks[2 * n - i - positions[i]] != 0)
goto SKIP;
checks[2 * n - i - positions[i]] = 1;
}
Et voilà, only boards on which queens cannot attack each other.
You might discover that some boards are symmetries (rotational or reflection) of others. Filtering these, though, is much more complicated...
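If you combine the positions array with the three kinds of checks and test them while placing each queen, instead of after building a whole board, you get an iterative backtracking search that never visits most of the n^n combinations. The sketch below is my own variation on the idea, not the code above: it uses zero-based indices, only counts the solutions instead of calling printBoard, and the name `countQueens` is made up.

```c
#include <stdlib.h>

/* Counts the ways to place n non-attacking queens, without recursion.
   positions[row] is the next column to try in that row (zero-based). */
unsigned long countQueens(int n)
{
    if (n <= 0)
        return 0;
    int *positions = malloc(n * sizeof *positions);
    unsigned char *cols  = calloc(n, 1);      /* column already used?      */
    unsigned char *diag1 = calloc(2 * n, 1);  /* index row - col + n       */
    unsigned char *diag2 = calloc(2 * n, 1);  /* index row + col           */
    unsigned long count = 0;
    int row = 0;

    if (positions && cols && diag1 && diag2) {
        positions[0] = 0;
        while (row >= 0) {
            int c = positions[row];
            /* skip columns that are attacked by queens in earlier rows */
            while (c < n && (cols[c] || diag1[row - c + n] || diag2[row + c]))
                ++c;
            if (c == n) {
                /* row exhausted: go back and lift the previous queen */
                if (--row >= 0) {
                    int pc = positions[row];
                    cols[pc] = diag1[row - pc + n] = diag2[row + pc] = 0;
                    positions[row] = pc + 1;
                }
            } else if (row == n - 1) {
                ++count;                 /* a complete board was found */
                positions[row] = c + 1;  /* keep scanning the last row */
            } else {
                /* place the queen and descend to the next row */
                positions[row] = c;
                cols[c] = diag1[row - c + n] = diag2[row + c] = 1;
                positions[++row] = 0;
            }
        }
    }
    free(positions);
    free(cols);
    free(diag1);
    free(diag2);
    return count;
}
```

The "lift the previous queen" branch is where the removeQueen call from the recursive version ends up: it runs when control comes back to a row after the rows below it have been exhausted.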

Finding minimum of a function with N variables

I'm trying to code an algorithm to locate the minimum of Rosenbrock function that may have N variables. When N = 2, I can easily figure it out. The code that I'm using for N = 2 is below:
double y,z,x, aux1, aux2;
double menor = INT_MAX;
y = INT_MIN;
x = INT_MIN;
while(x < INT_MAX)
{
while(y < INT_MAX)
{
z = (1-x)*(1-x) + 100*(y - (x*x))*(y - (x*x));
if(menor > z)
{
menor = z;
aux1 = x;
aux2 = y;
}
y = y + 0.1;
}
y = 0.1;
x = x + 0.1;
}
printf("(x,y) : (%.2lf, %.2lf) Minimum value of z: %.2lf\n", aux1, aux2, menor);
This code works fine, and I'm incrementing y and x by 0.1 only because I already know where the minimum of that function is (it's at (1,1)). It takes a little while to run, but it works. My problem is the case of N variables. When I think about this, what comes to mind is that I will need N nested loops. Here is the code as it stands. It's not working, but it may give some idea of what I'm trying to do:
//Calculates the value of the Rosenbrock function given n(the number of variables)
double rosen(double *x, int n){
double y;
for(int i = 0; i < n-1; i++)
{
y = y + 100*((x[i+1] - x[i]*x[i])*(x[i+1] - x[i]*x[i])) + (1 - x[i])*(1 - x[i]);
}
return y;
}
int main(void){
double *x;
//n is the number of variables and it may change
int n = 3;
x = (double*)malloc(n * sizeof(double));
double rosen(double *x, int n);
for(int i = 0; i < n; i++)
{
x[i] = INT_MIN;
}
//That's the part where I can't figure out how to compute all the possibilities, changing the value of the last variable between INT_MIN AND INT_MAX. Then this variable gets the value of INT_MIN again and I will sum 0.1 to the variable antecedent, and then do all the process again to the last variable. And so on for all the N variables.
for(int i = n - 1; i >= 0; i--)
{
while(x[i] < INT_MAX)
{
x[i] = x[i] + 0.1;
}
x[i] = INT_MIN;
}
The code above probably contains some errors. But the only thing I need help with is varying the values of all N variables. So, what I want to do is take the last variable and vary its value between INT_MIN and INT_MAX in steps of 0.1 (I know it's a really long journey). After that, this variable will receive the INT_MIN value again and the preceding variable will be increased by 0.1. Then, the last variable will vary from INT_MIN to INT_MAX again. And this will happen for all the N variables.
This is a problem that I'm trying to solve: brute-forcing the values of a function to get its minimum. If you have some tips for me or know a library that may help, I would be very grateful.
You can have a recursive function like the following (rough C):
// Needs <limits.h>, <float.h> and <stdlib.h>.
void rosenMin(int maxDims, int currDim, double *values, double *currMin)
{
    if (currDim == maxDims) {
        // rosen() as in the question, with y initialised to 0
        double rosenVal = rosen(values, maxDims);
        if (rosenVal < *currMin) {
            *currMin = rosenVal;
        }
    } else {
        for (double c = INT_MIN; c <= INT_MAX; c += 0.1) {
            values[currDim] = c;
            rosenMin(maxDims, currDim + 1, values, currMin);
        }
    }
}
// Inside the calling function:
double *values = calloc(N, sizeof *values);
double min = DBL_MAX;
rosenMin(N, 0, values, &min);
free(values);
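If you would rather avoid recursion, the "increase the last variable, carry over into the previous one" scheme from the question can be written as an odometer over integer step indices. This is a sketch under my own assumptions: the function `gridMin` and its parameters (`lo`, `step`, `steps`) are invented for illustration, and it searches a small bounded grid, because the INT_MIN to INT_MAX range would never finish.

```c
#include <stdlib.h>
#include <float.h>

/* Rosenbrock function of n variables. */
double rosen(const double *x, int n)
{
    double y = 0.0;
    for (int i = 0; i < n - 1; i++) {
        double t = x[i + 1] - x[i] * x[i];
        y += 100.0 * t * t + (1.0 - x[i]) * (1.0 - x[i]);
    }
    return y;
}

/* Evaluates rosen() on every grid point lo + step * k (k = 0 .. steps - 1)
   in each of the n dimensions. Stores the best point in best[] and returns
   the smallest value found (DBL_MAX if memory could not be allocated). */
double gridMin(int n, double lo, double step, int steps, double *best)
{
    int *idx = calloc(n, sizeof *idx);   /* the odometer digits, all zero */
    double *x = malloc(n * sizeof *x);
    double min = DBL_MAX;
    int d;

    if (!idx || !x) {
        free(idx);
        free(x);
        return min;
    }
    for (;;) {
        /* rebuild the point from the indices: no accumulated rounding */
        for (d = 0; d < n; d++)
            x[d] = lo + step * idx[d];
        double v = rosen(x, n);
        if (v < min) {
            min = v;
            for (d = 0; d < n; d++)
                best[d] = x[d];
        }
        /* advance the last digit; on overflow reset it and carry left */
        for (d = n - 1; d >= 0 && ++idx[d] == steps; d--)
            idx[d] = 0;
        if (d < 0)
            break;  /* every digit wrapped around: all points visited */
    }
    free(idx);
    free(x);
    return min;
}
```

For example, gridMin(3, -2.0, 0.5, 9, best) visits the 9^3 points with coordinates from -2 to 2 and finds the minimum 0 at (1, 1, 1). The cost still grows as steps^n, so for larger N a proper optimisation method will be needed.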

Finding the nth fib number, in O(logn)

I am trying to solve this: SPOJ problem.
And after some research I found out that it comes down to a simple calculation of the nth fib number, however n can get really large so an O(n) solution won't do any good. Googling around, I found that you can calculate the nth fib number in O(logn) and also a code sample that does exactly that:
long long fibonacci(int n) {
long long fib[2][2] = {{1,1},{1,0}}, ret[2][2] = {{1,0},{0,1}}, tmp[2][2] = {{0,0},{0,0}};
int i, j, k;
while (n) {
if (n & 1) {
memset(tmp, 0, sizeof tmp);
for (i = 0; i < 2; i++)
for (j = 0; j < 2; j++)
for (k = 0; k < 2; k++)
tmp[i][j] = (tmp[i][j] + ret[i][k] * fib[k][j]);
for (i = 0; i < 2; i++)
for (j = 0; j < 2; j++)
ret[i][j] = tmp[i][j];
}
memset(tmp, 0, sizeof tmp);
for (i = 0; i < 2; i++)
for (j = 0; j < 2; j++)
for (k = 0; k < 2; k++)
tmp[i][j] = (tmp[i][j] + fib[i][k] * fib[k][j]);
for (i = 0; i < 2; i++)
for (j = 0; j < 2; j++)
fib[i][j] = tmp[i][j];
n /= 2;
}
return (ret[0][1]);
}
I tried to modify it for the problem and am still getting WA: http://ideone.com/3TtE5m
Am I calculating the modular arithmetic wrong? Or is something else the issue?
You mean the nth Fibonacci number I hope.
In order to do it you need a matrix decomposition of Fibonacci numbers described here.
The basic idea is to take Donald E. Knuth's matrix identity for the Fibonacci numbers, which is:
{{1, 1}, {1, 0}}^n = {{F(n+1), F(n)}, {F(n), F(n-1)}}
And instead of calculating the Fibonacci numbers in the traditional way you will try and find the matrix to the power of (k) where k is the given number.
So this is solving the problem in k matrix multiplications, not really helpful since we can do it in much easier way.
But wait! We can optimise the matrix multiplication. Instead of doing the k multiplications we can square the matrix first and then do half of the multiplications. And we can keep on doing it. So if the given number is 2^a, then we can do it in a steps by repeatedly squaring the matrix.
If the number is not a power of 2 we can do the binary decomposition of a number and see whether to take the given squared matrix into final product or not.
In your case after each multiplication you also need to apply modulo operator 123456 to each matrix element.
Hope my explanation helps if not see the link for a clearer and longer one.
There is actually one more caveat of the task: as you are asked to provide some Fibonacci number modulo a given number, you should also prove that taking the remainder of each matrix element doesn't change the result. In other words if we multiply matrices and take remainder that we are actually still getting the Fibonacci number remainders. But since the remainder operation is distributive in addition and multiplication it actually does produce the correct results.
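Putting the squaring and the per-element modulo together, a minimal sketch could look like this. The names `mulMod` and `fibMod` are mine, the modulus is passed in as a parameter rather than hard-coded, and it must be below 2^32 so that the intermediate products fit into 64 bits.

```c
#include <stdint.h>

/* a = a * b for 2x2 matrices, every entry reduced modulo m.
   Assumes m < 2^32 so that no product overflows a uint64_t. */
static void mulMod(uint64_t a[2][2], uint64_t b[2][2], uint64_t m)
{
    uint64_t r[2][2];
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            r[i][j] = (a[i][0] * b[0][j] % m + a[i][1] * b[1][j] % m) % m;
    /* copy back afterwards, so a and b may be the same matrix */
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            a[i][j] = r[i][j];
}

/* Returns F(n) mod m, with F(0) = 0 and F(1) = 1. */
uint64_t fibMod(uint64_t n, uint64_t m)
{
    uint64_t result[2][2] = {{1, 0}, {0, 1}};  /* identity matrix */
    uint64_t base[2][2]   = {{1, 1}, {1, 0}};
    while (n > 0) {
        if (n & 1)
            mulMod(result, base, m);  /* this bit of n is set: use the power */
        mulMod(base, base, m);        /* square for the next bit */
        n >>= 1;
    }
    return result[0][1] % m;
}
```

For instance, fibMod(10, 1000) gives 55 and fibMod(50, 1000) gives 25, the last three digits of F(50) = 12586269025.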
The Fibonacci numbers occur as the ratio of successive convergents of the continued fraction for the golden ratio φ, and the matrix formed from successive convergents of any continued fraction has a determinant of +1 or −1.
The matrix representation gives the following closed-form expression for the Fibonacci numbers, i.e. {{1, 1}, {1, 0}}^n = {{F(n+1), F(n)}, {F(n), F(n-1)}}.
The matrix is multiplied n times because only then do we get the (n+1)th Fibonacci number as the element at row 0 and column 0 of the resulting matrix.
If we apply the above method without using recursive matrix multiplication, then the Time Complexity: O(n) and Space Complexity: O(1).
But we want Time Complexity: O(log n), so we have to optimize the above method, and this can be done by recursive multiplication of matrix to get the nth power.
Implementation of the above rule can be found below.
#include <stdio.h>
void multiply(int F[2][2], int M[2][2]);
void power(int F[2][2], int n);
/*
The function that returns nth Fibonacci number.
*/
int fib(int n) {
int F[2][2] = {{1, 1}, {1, 0}};
if (n == 0)
return 0;
power(F, n - 1);
return F[0][0];
}
/*
Optimized using recursive multiplication.
*/
void power(int F[2][2], int n) {
if ( n == 0 || n == 1)
return;
int M[2][2] = {{1, 1}, {1, 0}};
power(F, n / 2);
multiply(F, F);
if (n % 2 != 0)
multiply(F, M);
}
void multiply(int F[2][2], int M[2][2]) {
int x = F[0][0] * M[0][0] + F[0][1] * M[1][0];
int y = F[0][0] * M[0][1] + F[0][1] * M[1][1];
int z = F[1][0] * M[0][0] + F[1][1] * M[1][0];
int w = F[1][0] * M[0][1] + F[1][1] * M[1][1];
F[0][0] = x;
F[0][1] = y;
F[1][0] = z;
F[1][1] = w;
}
int main() {
printf("%d\n", fib(15));
/*
15th Fibonacci number is 610.
*/
return 0;
}
There is a very simple algorithm, using only integers:
long long fib(int n) {
long long a, b, p, q;
a = q = 1;
b = p = 0;
while (n > 0) {
if (n % 2 == 0) {
long long qq = q*q;
q = 2*p*q + qq;
p = p*p + qq;
n /= 2;
} else {
long long aq = a*q;
a = b*q + aq + a*p;
b = b*p + aq;
n -= 1;
}
}
return b;
}
This is based on the identities of the Lucas sequence.

Which is better way to calculate nCr

Approach 1:
C(n,r) = n! / ((n-r)! * r!)
Approach 2:
In the book Combinatorial Algorithms by Wilf, I have found this:
C(n,r) can be written as C(n-1,r) + C(n-1,r-1).
e.g.
C(7,4) = C(6,4) + C(6,3)
= C(5,4) + C(5,3) + C(5,3) + C(5,2)
. .
. .
. .
. .
After solving
= C(4,4) + C(4,1) + 3*C(3,3) + 3*C(3,1) + 6*C(2,1) + 6*C(2,2)
As you can see, the final solution doesn't need any multiplication. In every form C(n,r), either n==r or r==1.
Here is the sample code I have implemented:
int foo(int n,int r)
{
if(n==r) return 1;
if(r==1) return n;
return foo(n-1,r) + foo(n-1,r-1);
}
See output here.
In the approach 2, there are overlapping sub-problems where we are calling recursion to solve the same sub-problems again. We can avoid it by using Dynamic Programming.
I want to know which is the better way to calculate C(n,r)?.
Both approaches will save time, but the first one is very prone to integer overflow.
Approach 1:
This approach will generate result in shortest time (in at most n/2 iterations), and the possibility of overflow can be reduced by doing the multiplications carefully:
long long C(int n, int r) {
if(r > n - r) r = n - r; // because C(n, r) == C(n, n - r)
long long ans = 1;
int i;
for(i = 1; i <= r; i++) {
ans *= n - r + i;
ans /= i;
}
return ans;
}
This code will start multiplication of the numerator from the smaller end, and as the product of any k consecutive integers is divisible by k!, there will be no divisibility problem. But the possibility of overflow is still there, another useful trick may be dividing n - r + i and i by their GCD before doing the multiplication and division (and still overflow may occur).
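The GCD trick could be sketched as follows; `gcdULL` and `binomGcd` are names I made up for this example. After cancelling the common factor, the remaining divisor always divides the running result exactly, so we can divide first and multiply afterwards.

```c
/* Greatest common divisor by Euclid's algorithm. */
static unsigned long long gcdULL(unsigned long long a, unsigned long long b)
{
    while (b != 0) {
        unsigned long long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* C(n, r), cancelling common factors before every multiplication. */
unsigned long long binomGcd(unsigned n, unsigned r)
{
    if (r > n)
        return 0;
    if (r > n - r)
        r = n - r;  /* because C(n, r) == C(n, n - r) */
    unsigned long long ans = 1;
    for (unsigned i = 1; i <= r; i++) {
        unsigned long long num = n - r + i;
        unsigned long long den = i;
        unsigned long long g = gcdULL(num, den);
        num /= g;
        den /= g;
        /* ans * num / den is an integer and gcd(num, den) == 1,
           so den divides ans without remainder */
        ans = ans / den * num;
    }
    return ans;
}
```

With this ordering no intermediate value is larger than the partial result it produces, so the function overflows only if one of the partial binomials itself does not fit into an unsigned long long.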
Approach 2:
In this approach, you'll be actually building up the Pascal's Triangle. The dynamic approach is much faster than the recursive one (the first one is O(n^2) while the other is exponential). However, you'll need to use O(n^2) memory too.
# define MAX 66 // every C(n, r) with n <= 66 fits in a signed 64-bit long long; C(67, 33) does not
long long triangle[MAX + 1][MAX + 1];
void makeTriangle() {
int i, j;
// initialize the first row
triangle[0][0] = 1; // C(0, 0) = 1
for(i = 1; i <= MAX; i++) {
triangle[i][0] = 1; // C(i, 0) = 1
for(j = 1; j <= i; j++) {
triangle[i][j] = triangle[i - 1][j - 1] + triangle[i - 1][j];
}
}
}
long long C(int n, int r) {
return triangle[n][r];
}
Then you can look up any C(n, r) in O(1) time.
If you need a particular C(n, r) (i.e. the full triangle is not needed), then the memory consumption can be made O(n) by overwriting the same row of the triangle, top to bottom.
# define MAX 66 // every C(n, r) with n <= 66 fits in a signed 64-bit long long
long long C(int n, int r) {
    long long row[MAX + 1] = {0}; // local, so repeated calls start from a clean row
    int i, j;
    // initialize by the first row
    row[0] = 1; // this is the value of C(0, 0)
    for(i = 1; i <= n; i++) {
        for(j = i; j > 0; j--) {
            // from the recurrence C(n, r) = C(n - 1, r - 1) + C(n - 1, r)
            row[j] += row[j - 1];
        }
    }
    return row[r];
}
The inner loop is started from the end to simplify the calculations. If you start it from index 0, you'll need another variable to store the value being overwritten.
I think your recursive approach should work efficiently with DP. But it will start giving problems once the constraints increase. See http://www.spoj.pl/problems/MARBLES/
Here is the function which I use in online judges and coding contests, so it works quite fast.
long combi(int n,int k)
{
long ans=1;
k=k>n-k?n-k:k;
int j=1;
for(;j<=k;j++,n--)
{
if(n%j==0)
{
ans*=n/j;
}else
if(ans%j==0)
{
ans=ans/j*n;
}else
{
ans=(ans*n)/j;
}
}
return ans;
}
It is an efficient implementation for your Approach #1
Your recursive approach is fine, but using DP with it will remove the overhead of solving the same subproblems again. We already have two conditions:
nCr(n,r) = nCr(n-1,r-1) + nCr(n-1,r);
nCr(n,0)=nCr(n,n)=1;
Now we can easily build a DP solution by storing our subresults in a 2-D array-
int dp[max][max];
//Initialise array elements with zero
int nCr(int n, int r)
{
if(n==r) return dp[n][r] = 1; //Base Case
if(r==0) return dp[n][r] = 1; //Base Case
if(r==1) return dp[n][r] = n;
if(dp[n][r]) return dp[n][r]; // Using Subproblem Result
return dp[n][r] = nCr(n-1,r) + nCr(n-1,r-1);
}
Now if you want to optimise further, getting the prime factorization of the binomial coefficient is probably the most efficient way to calculate it, especially if multiplication is expensive.
The fastest method I know is Vladimir's method. One avoids division altogether by decomposing nCr into prime factors. As Vladimir says, you can do this pretty efficiently using the sieve of Eratosthenes. Also, use Fermat's little theorem to calculate nCr mod MOD (where MOD is a prime number).
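For the modular case, a possible sketch of the Fermat approach is shown below. The names `powMod` and `binomModPrime` are made up, and the code assumes a prime modulus p with n < p < 2^32 (for example 1000000007), so that the denominator is invertible and the products fit into 64 bits.

```c
#include <stdint.h>

/* base^exp mod m by repeated squaring; assumes m < 2^32. */
static uint64_t powMod(uint64_t base, uint64_t exp, uint64_t m)
{
    uint64_t result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = result * base % m;
        base = base * base % m;
        exp >>= 1;
    }
    return result;
}

/* C(n, r) mod p for a prime p with n < p < 2^32. */
uint64_t binomModPrime(uint64_t n, uint64_t r, uint64_t p)
{
    if (r > n)
        return 0;
    if (r > n - r)
        r = n - r;
    uint64_t num = 1, den = 1;
    for (uint64_t i = 1; i <= r; i++) {
        num = num * ((n - r + i) % p) % p;  /* running numerator   */
        den = den * (i % p) % p;            /* running denominator */
    }
    /* Fermat: den^(p-1) == 1 (mod p), so den^(p-2) is the inverse of den */
    return num * powMod(den, p - 2, p) % p;
}
```

Here the division is replaced by one modular exponentiation at the end, so the whole computation takes O(r + log p) multiplications.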
Using dynamic programming you can easily find nCr. Here is the solution:
package com.practice.competitive.maths;
import java.util.Scanner;
public class NCR1 {
public static void main(String[] args) {
try (Scanner scanner = new Scanner(System.in)) {
int testCase = scanner.nextInt();
while (testCase-- > 0) {
int n = scanner.nextInt();
int r = scanner.nextInt();
int[][] combination = combination();
System.out.println(combination[n][r]%1000000007);
}
} catch (Exception e) {
e.printStackTrace();
}
}
public static int[][] combination() {
int combination[][] = new int[1001][1001];
for (int i = 0; i < 1001; i++)
for (int j = 0; j <= i; j++) {
if (j == 0 || j == i)
combination[i][j] = 1;
else
combination[i][j] = combination[i - 1][j - 1] % 1000000007 + combination[i - 1][j] % 1000000007;
}
return combination;
}
}
unsigned long long ans = 1,a=1,b=1;
int k = r,i=0;
if (r > (n-r))
k = n-r;
for (i = n ; k >=1 ; k--,i--)
{
a *= i;
b *= k;
if (a%b == 0)
{
a = (a/b);
b=1;
}
}
ans = a/b;

LU Decomposition from Numerical Recipes not working; what am I doing wrong?

I've literally copied and pasted from the supplied source code for Numerical Recipes for C for in-place LU Matrix Decomposition; the problem is it's not working.
I'm sure I'm doing something stupid but would appreciate anyone being able to point me in the right direction on this; I've been working on it all day and can't see what I'm doing wrong.
POST-ANSWER UPDATE: The project is finished and working. Thanks to everyone for their guidance.
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#define MAT1 3
#define TINY 1e-20
int h_NR_LU_decomp(float *a, int *indx){
//Taken from Numerical Recipies for C
int i,imax,j,k;
float big,dum,sum,temp;
int n=MAT1;
float vv[MAT1];
int d=1.0;
//Loop over rows to get implicit scaling info
for (i=0;i<n;i++) {
big=0.0;
for (j=0;j<n;j++)
if ((temp=fabs(a[i*MAT1+j])) > big)
big=temp;
if (big == 0.0) return -1; //Singular Matrix
vv[i]=1.0/big;
}
//Outer kij loop
for (j=0;j<n;j++) {
for (i=0;i<j;i++) {
sum=a[i*MAT1+j];
for (k=0;k<i;k++)
sum -= a[i*MAT1+k]*a[k*MAT1+j];
a[i*MAT1+j]=sum;
}
big=0.0;
//search for largest pivot
for (i=j;i<n;i++) {
sum=a[i*MAT1+j];
for (k=0;k<j;k++) sum -= a[i*MAT1+k]*a[k*MAT1+j];
a[i*MAT1+j]=sum;
if ((dum=vv[i]*fabs(sum)) >= big) {
big=dum;
imax=i;
}
}
//Do we need to swap any rows?
if (j != imax) {
for (k=0;k<n;k++) {
dum=a[imax*MAT1+k];
a[imax*MAT1+k]=a[j*MAT1+k];
a[j*MAT1+k]=dum;
}
d = -d;
vv[imax]=vv[j];
}
indx[j]=imax;
if (a[j*MAT1+j] == 0.0) a[j*MAT1+j]=TINY;
for (k=j+1;k<n;k++) {
dum=1.0/(a[j*MAT1+j]);
for (i=j+1;i<n;i++) a[i*MAT1+j] *= dum;
}
}
return 0;
}
void main(){
//3x3 Matrix
float exampleA[]={1,3,-2,3,5,6,2,4,3};
//pivot array (not used currently)
int* h_pivot = (int *)malloc(sizeof(int)*MAT1);
int retval = h_NR_LU_decomp(&exampleA[0],h_pivot);
for (unsigned int i=0; i<3; i++){
printf("\n%d:",h_pivot[i]);
for (unsigned int j=0;j<3; j++){
printf("%.1lf,",exampleA[i*3+j]);
}
}
}
WolframAlpha says the answer should be
1,3,-2
2,-2,7
3,2,-2
I'm getting:
2,4,3
0.2,2,-2.8
0.8,1,6.5
And so far I have found at least 3 different versions of the 'same' algorithm, so I'm completely confused.
PS yes I know there are at least a dozen different libraries to do this, but I'm more interested in understanding what I'm doing wrong than the right answer.
PPS since in LU Decomposition the lower resultant matrix is unity, and using Crouts algorithm as (i think) implemented, array index access is still safe, both L and U can be superimposed on each other in-place; hence the single resultant matrix for this.
I think there's something inherently wrong with your indices. They sometimes have unusual start and end values, and the outer loop over j instead of i makes me suspicious.
Before you ask anyone to examine your code, here are a few suggestions:
double-check your indices
get rid of those obfuscation attempts using sum
use a macro a(i,j) instead of a[i*MAT1+j]
write sub-functions instead of comments
remove unnecessary parts, isolating the erroneous code
Here's a version that follows these suggestions:
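#define MAT1 3
#define a(i,j) a[(i)*MAT1+(j)]
int h_NR_LU_decomp(float *a, int *indx)
{
    int i, j, k;
    int n = MAT1;
    (void)indx; // no pivoting here, so the pivot array is unused
    for (i = 0; i < n; i++) {
        // compute R (the upper factor U): row i, columns i..n-1
        for (j = i; j < n; j++)
            for (k = 0; k < i; k++)
                a(i,j) -= a(i,k) * a(k,j);
        // compute L (unit diagonal): column i, rows i+1..n-1
        for (j = i+1; j < n; j++) {
            for (k = 0; k < i; k++)
                a(j,i) -= a(j,k) * a(k,i);
            a(j,i) /= a(i,i); // divide by the pivot; assumes a(i,i) != 0
        }
    }
    return 0;
}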
Its main advantages are:
it's readable
it works
It lacks pivoting, though. Add sub-functions as needed.
My advice: don't copy someone else's code without understanding it.
Most programmers are bad programmers.
For the love of all that is holy, don't use Numerical Recipes code for anything except as a toy implementation for teaching purposes of the algorithms described in the text -- and, really, the text isn't that great. And, as you're learning, neither is the code.
Certainly don't put any Numerical Recipes routine in your own code -- the license is insanely restrictive, particularly given the code quality. You won't be able to distribute your own code if you have NR stuff in there.
See if your system already has a LAPACK library installed. It's the standard interface to linear algebra routines in computational science and engineering, and while it's not perfect, you'll be able to find lapack libraries for any machine you ever move your code to, and you can just compile, link, and run. If it's not already installed on your system, your package manager (rpm, apt-get, fink, port, whatever) probably knows about lapack and can install it for you. If not, as long as you have a Fortran compiler on your system, you can download and compile it from here, and the standard C bindings can be found just below on the same page.
The reason it's so handy to have a standard API to linear algebra routines is that they are so common, but their performance is so system-dependent. So for instance, Goto BLAS is an insanely fast implementation for x86 systems of the low-level operations which are needed for linear algebra; once you have LAPACK working, you can install that library to make everything as fast as possible.
Once you have any sort of LAPACK installed, the routine for doing an LU factorization of a general matrix is SGETRF for floats, or DGETRF for doubles. There are other, faster routines if you know something about the structure of the matrix - that it's symmetric positive definite, say (SPOTRF), or that it's tridiagonal (SGTTRF). It's a big library, but once you learn your way around it you'll have a very powerful piece of gear in your numerical toolbox.
The thing that looks most suspicious to me is the part marked "search for largest pivot". This does not only search but it also changes the matrix A. I find it hard to believe that is correct.
The different versions of the LU algorithm differ in pivoting, so make sure you understand that. You cannot compare the results of different algorithms directly. A better check is to see whether L times U equals your original matrix, or a row permutation thereof if your algorithm does pivoting. That being said, your result is wrong because the determinant is wrong (pivoting does not change the determinant, except for the sign).
Apart from that, @Philip has good advice. If you want to understand the code, start by understanding LU decomposition without pivoting.
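The "L times U equals the original matrix" check suggested above can be sketched as follows. The function name `lu_residual` and the `perm` convention are assumptions made for this illustration: `lu` is a packed row-major result with a unit-diagonal L, and `perm[i]` names the row of the original matrix that should appear as row i of L*U (the identity if there was no pivoting).

```c
/* Hypothetical helper: returns max |(L*U)(i,j) - A(perm[i],j)|.
 * lu packs L (strictly below the diagonal, unit diagonal implied)
 * and U (on and above the diagonal) in one row-major n*n array. */
double lu_residual(const double *a, const double *lu, const int *perm, int n)
{
    double worst = 0.0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            int kmax = (i < j) ? i : j;   /* L(i,k)*U(k,j) is zero beyond min(i,j) */
            for (int k = 0; k <= kmax; k++) {
                double l = (k == i) ? 1.0 : lu[i * n + k];
                sum += l * lu[k * n + j];
            }
            double diff = sum - a[perm[i] * n + j];
            if (diff < 0.0) diff = -diff;
            if (diff > worst) worst = diff;
        }
    }
    return worst;
}
```

For the matrix in the question, the WolframAlpha result 1,3,-2 / 2,-2,7 / 3,2,-2 gives a residual of zero with `perm = {0, 2, 1}` (second and third rows exchanged), but not with the identity permutation. A correct factorization should produce a residual near rounding error for some row permutation.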
To badly paraphrase Albert Einstein:
... a man with a watch always knows the
exact time, but a man with two is
never sure ....
Your code is definitely not producing the correct result, but even if it were, the result with pivoting will not directly correspond to the result without pivoting. In the context of a pivoting solution, what Alpha has really given you is probably the equivalent of this:
    1 0 0        1 0 0        1  3 -2
P = 0 0 1    L = 2 1 0    U = 0 -2  7
    0 1 0        3 2 1        0  0 -2
which will then satisfy the condition A = P.L.U (where . denotes the matrix product). If I compute the (notionally) same decomposition operation another way (using the LAPACK routine dgetrf via numpy in this case):
In [27]: A
Out[27]:
array([[ 1, 3, -2],
[ 3, 5, 6],
[ 2, 4, 3]])
In [28]: import scipy.linalg as la
In [29]: LU,ipivot = la.lu_factor(A)
In [30]: print LU
[[ 3. 5. 6. ]
[ 0.33333333 1.33333333 -4. ]
[ 0.66666667 0.5 1. ]]
In [31]: print ipivot
[1 1 2]
After a little bit of black magic with ipivot we get
    0 1 0        1       0   0        3 5      6
P = 1 0 0    L = 0.33333 1   0    U = 0 1.3333 -4
    0 0 1        0.66667 0.5 1        0 0      1
which also satisfies A = P.L.U . Both of these factorizations are correct, but they are different and they won't correspond to a correctly functioning version of the NR code.
So before you can go deciding whether you have the "right" answer, you really should spend a bit of time understanding the actual algorithm that the code you copied implements.
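The "black magic" with ipivot mentioned above amounts to replaying the row swaps in order. Here is a minimal sketch, assuming a 0-based pivot vector as printed by scipy (the Fortran LAPACK routines return 1-based indices); the helper names `pivots_to_perm` and `perm_to_matrix` are hypothetical.

```c
/* Hypothetical helper: expand a 0-based, LAPACK-style pivot vector
 * (row i was swapped with row ipiv[i] at step i) into perm[], where
 * perm[i] is the row of A that ends up as row i of L*U. */
void pivots_to_perm(const int *ipiv, int *perm, int n)
{
    for (int i = 0; i < n; i++) perm[i] = i;
    for (int i = 0; i < n; i++) {      /* replay the swaps in order */
        int tmp = perm[i];
        perm[i] = perm[ipiv[i]];
        perm[ipiv[i]] = tmp;
    }
}

/* Fill the row-major n*n matrix p so that A = P * (L*U). */
void perm_to_matrix(const int *perm, double *p, int n)
{
    for (int i = 0; i < n * n; i++) p[i] = 0.0;
    for (int i = 0; i < n; i++) p[perm[i] * n + i] = 1.0;
}
```

For `ipivot = [1 1 2]` this yields `perm = {1, 0, 2}`, that is, only the first two rows are exchanged. The `indx` array filled by the NR routine records its row interchanges in the same step-by-step way, so the same replay should apply to it.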
This thread has been viewed 6k times in the past 10 years. I had used NR Fortran and C for many years, and do not share the low opinions expressed here.
I explored the issue you encountered, and I believe the problem in your code is here:
for (k=j+1;k<n;k++) {
dum=1.0/(a[j*MAT1+j]);
for (i=j+1;i<n;i++) a[i*MAT1+j] *= dum;
}
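while in the original if (j != n-1) { ... } is used. I think the two are not equivalent: the loop over k repeats the scaling once for every remaining row, so the elements below the pivot are divided by it several times instead of once.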
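A small sketch of the two variants side by side, using hypothetical function names and a runtime size n instead of MAT1, shows the effect on one column:

```c
/* Scaling step as written in the question: the k loop repeats the
 * division n-j-1 times. */
void scale_column_question(double *a, int n, int j)
{
    for (int k = j + 1; k < n; k++) {
        double dum = 1.0 / a[j * n + j];
        for (int i = j + 1; i < n; i++) a[i * n + j] *= dum;
    }
}

/* Scaling step as in the NR original: each element below the pivot
 * is divided exactly once. */
void scale_column_nr(double *a, int n, int j)
{
    if (j != n - 1) {
        double dum = 1.0 / a[j * n + j];
        for (int i = j + 1; i < n; i++) a[i * n + j] *= dum;
    }
}
```

With the first column 2, 3, 1 (the question's matrix after the first row swap) and j = 0, the question's version leaves 0.75 and 0.25 below the pivot, which matches the 0.8 and 0.2 in the printed output, while the NR version leaves 1.5 and 0.5.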
NR's lubksb() does have a small issue in the way they set up finding the first non-zero element, but this can be skipped at very low cost, even for a large matrix. With that, both ludcmp() and lubksb(), entered as published, work just fine, and as far as I can tell perform well.
Here's a complete test program, mostly preserving the notation of NR, with minor simplifications (tested under Ubuntu Linux/gcc):
/* A sample program to demonstrate matrix inversion using the
* Crout's algorithm from Teukolsky and Press (Numerical Recipes):
* LU decomposition + back-substitution, with partial pivoting
* 2022.06 edward.sternin at brocku.ca
*/
#define N 7
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#define a(i,j) a[(i)*n+(j)]
/* implied 1D layout is a(0,0), a(0,1), ... a(0,n-1), a(1,0), a(1,1), ... */
void matrixPrint (double *M, int nrow, int ncol) {
int i,j;
for (i=0;i<nrow;i++) {
for (j=0;j<ncol;j++) { fprintf(stderr," %+.3f\t",M[i*ncol+j]); }
fprintf(stderr,"\n");
}
}
void die(char msg[]) {
fprintf(stderr,"ERROR in %s, aborting\n",msg);
exit(1);
}
void ludcmp(double *a, int n, int *indx) {
int i, imax, j, k;
double big, dum, sum, temp;
double *vv;
/* i=row index, i=0..(n-1); j=col index, j=0..(n-1) */
vv=(double *)malloc((size_t)(n * sizeof(double)));
if (!vv) die("ludcmp: allocation failure");
for (i = 0; i < n; i++) { /* loop over rows */
big = 0.0;
for (j = 0; j < n; j++) {
if ((temp=fabs(a(i,j))) > big) big=temp;
}
if (big == 0.0) die("ludcmp: a singular matrix provided");
vv[i] = 1.0 / big; /* vv stores the scaling factor for each row */
}
for (j = 0; j < n; j++) { /* Crout's method: loop over columns */
for (i = 0; i < j; i++) { /* except for i=j */
sum = a(i,j);
for (k = 0; k < i; k++) { sum -= a(i,k) * a(k,j); }
a(i,j) = sum; /* Eq. 2.3.12, in situ */
}
big = 0.0; /* searching for the largest pivot element */
for (i = j; i < n; i++) {
sum = a(i,j);
for (k = 0; k < j; k++) { sum -= a(i,k) * a(k,j); }
a(i,j) = sum;
if ((dum = vv[i] * fabs(sum)) >= big) {
big = dum;
imax = i;
}
}
if (j != imax) { /* if needed, interchange rows */
for (k = 0; k < n; k++){
dum = a(imax,k);
a(imax,k) = a(j,k);
a(j,k) = dum;
}
vv[imax] = vv[j]; /* keep the scale factor with the new row location */
}
indx[j] = imax;
if (j != n-1) { /* divide by the pivot element */
dum = 1.0 / a(j,j);
for (i = j + 1; i < n; i++) a(i,j) *= dum;
}
}
free(vv);
}
void lubksb(double *a, int n, int *indx, double *b) {
int i, ip, j;
double sum;
for (i = 0; i < n; i++) {
/* Forward substitution, Eq.2.3.6, unscrambling permutations from indx[] */
ip = indx[i];
sum = b[ip];
b[ip] = b[i];
for (j = 0; j < i; j++) sum -= a(i,j) * b[j];
b[i] = sum;
}
for (i = n-1; i >= 0; i--) { /* backsubstitution, Eq. 2.3.7 */
sum = b[i];
for (j = i + 1; j < n; j++) sum -= a(i,j) * b[j];
b[i] = sum / a(i,i);
}
}
int main() {
double *a,*y,*col,*aa,*res,sum;
int i,j,k,*indx;
a=(double *)malloc((size_t)(N*N * sizeof(double)));
y=(double *)malloc((size_t)(N*N * sizeof(double)));
col=(double *)malloc((size_t)(N * sizeof(double)));
indx=(int *)malloc((size_t)(N * sizeof(int)));
aa=(double *)malloc((size_t)(N*N * sizeof(double)));
res=(double *)malloc((size_t)(N*N * sizeof(double)));
if (!a || !y || !col || !indx || !aa || !res) die("main: memory allocation failure");
srand48((long int) N);
for (i=0;i<N;i++) {
for (j=0;j<N;j++) { aa[i*N+j] = a[i*N+j] = drand48(); }
}
fprintf(stderr,"\nRandomly generated matrix A = \n");
matrixPrint(a,N,N);
ludcmp(a,N,indx);
for(j=0;j<N;j++) {
for(i=0;i<N;i++) { col[i]=0.0; }
col[j]=1.0;
lubksb(a,N,indx,col);
for(i=0;i<N;i++) { y[i*N+j]=col[i]; }
}
fprintf(stderr,"\nResult of LU/BackSub is inv(A) :\n");
matrixPrint(y,N,N);
for (i=0; i<N; i++) {
for (j=0;j<N;j++) {
sum = 0;
for (k=0; k<N; k++) { sum += y[i*N+k] * aa[k*N+j]; }
res[i*N+j] = sum;
}
}
fprintf(stderr,"\nResult of inv(A).A = (should be 1):\n");
matrixPrint(res,N,N);
return(0);
}

Resources