The greatest sum in a part of a matrix - arrays

I am stuck on a simple problem and I am looking for a better solution than mine.
I have an integer matrix (tab[N][M]) and an integer (k), and I have to find the smallest rectangle (sub-matrix) whose elements sum to more than k.
So, my current attempt of a solution is:
Make an additional matrix (sum[N][M]) and an integer solution = infinity
For each 1 <= i <= N and 1 <= j <= M
sum[ i ][ j ] = sum[ i - 1 ][ j ] + sum[ i ][ j - 1 ] + tab[ i ][ j ] - sum[ i - 1 ][ j - 1 ]
Then look at each rectangle, e.g. the rectangle that starts at (x, y) and ends at (a, b):
Rectangle_(x,y)_(a,b) = sum[ a ][ b ] - sum[ x - 1 ][ b ] - sum[ a ][ y - 1 ] + sum[ x - 1 ][ y - 1 ]
and if Rectangle_(x,y)_(a,b) > k then solution = minimum of the current solution and (a - x + 1) * (b - y + 1)
But this solution is quite slow (quartic time). Is there any way to make it faster? Ideally I am looking for something far below quartic. I managed to reduce my running time, but not substantially.
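For concreteness, here is a Python sketch of the summed-area-table approach described above (the table is 1-based so the border terms work out; `tab` and `k` are placeholder inputs):

```python
def smallest_submatrix_area_bruteforce(tab, k):
    # O(N^2 * M^2) scan of all rectangles using a summed-area table;
    # returns the smallest area with sum > k, or None if none exists
    n, m = len(tab), len(tab[0])
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            S[i][j] = S[i-1][j] + S[i][j-1] - S[i-1][j-1] + tab[i-1][j-1]
    best = None
    for x in range(n):
        for y in range(m):
            for a in range(x, n):
                for b in range(y, m):
                    # sum of the rectangle with corners (x, y) and (a, b), inclusive
                    s = S[a+1][b+1] - S[x][b+1] - S[a+1][y] + S[x][y]
                    if s > k:
                        area = (a - x + 1) * (b - y + 1)
                        if best is None or area < best:
                            best = area
    return best
```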

If the matrix only contains values >= 0, then there is a linear time solution in the 1D case that can be extended to a cubic time solution in the 2D case.
For the 1D case, you do a single pass from left to right, sliding a window across the array, stretching or shrinking it as you go so that the numbers contained in the interval always sum to at least k (or breaking out of the loop if this is not possible).
Initially, set the left index bound of the interval to the first element, and the right index bound to -1, then in a loop:
Increment the right bound by 1, and then keep incrementing it until either the values inside the interval sum to more than k, or the end of the array is reached.
Increment the left bound to shrink the interval as small as possible without letting the values sum to less than or equal to k.
If the result is a valid interval (meaning the first step did not reach the end of the array without finding a valid interval) then compare it to the smallest so far and update if necessary.
This doesn't work if negative values are allowed, because in the second step you need to be able to assume that shrinking the interval always leads to a smaller sum, so when the sum dips below k you know that's the smallest possible for a given interval endpoint.
For the 2D case, you can iterate over all possible sub-matrix heights, and over each possible starting row for a given height, and perform this horizontal sweep for each row.
In pseudo-code:
Assume you have a function rectangle_sum(x, y, a, b) that returns the sum of the values from (x, y) to (a, b) inclusive and runs in O(1) time using a summed-area table.
for(height = 1; height <= M; height++) // iterate over submatrix heights
{
    for(row = 0; row <= (M - height); row++) // iterate over all starting rows
    {
        start = 0; end = -1; // initialize interval
        while(end < N) // iterate across the row
        {
            valid_interval = false;
            // increment end until the interval sums to > k:
            while(end < (N-1))
            {
                end = end + 1;
                if(rectangle_sum(start, row, end, row + height - 1) > k)
                {
                    valid_interval = true;
                    break;
                }
            }
            if(!valid_interval)
                break;
            // shrink interval by incrementing start:
            while((start < end) &&
                  (rectangle_sum(start + 1, row, end, row + height - 1) > k))
                start = start + 1;
            compare (start, row), (end, row + height - 1) with the current smallest
            submatrix and make it the new current if it is smaller
        }
    }
}
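In Python, the pseudo-code might look like this (a sketch assuming non-negative entries, with rectangle_sum backed by a summed-area table):

```python
def smallest_submatrix_area(tab, k):
    # cubic-time sliding-window algorithm; assumes all entries are >= 0;
    # returns the smallest area of a submatrix whose sum exceeds k, or None
    n, m = len(tab), len(tab[0])
    # 1-based summed-area table so rect_sum is O(1)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            S[i][j] = S[i-1][j] + S[i][j-1] - S[i-1][j-1] + tab[i-1][j-1]

    def rect_sum(r1, c1, r2, c2):  # inclusive 0-based corners
        return S[r2+1][c2+1] - S[r1][c2+1] - S[r2+1][c1] + S[r1][c1]

    best = None
    for top in range(n):                 # iterate over starting rows
        for bottom in range(top, n):     # ...and over submatrix heights
            left = 0
            for right in range(m):       # horizontal sliding window
                if rect_sum(top, left, bottom, right) <= k:
                    continue             # window not yet heavy enough
                # shrink from the left while the sum stays above k
                while left < right and rect_sum(top, left + 1, bottom, right) > k:
                    left += 1
                area = (bottom - top + 1) * (right - left + 1)
                if best is None or area < best:
                    best = area
    return best
```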

I have seen a number of answers to matrix rectangle problems here which worked by solving a similar 1-dimensional problem and then applying this to every row of the matrix, every row formed by taking the sum of two adjacent rows, every sum from three adjacent rows, and so on. So here's an attempt at finding the smallest interval in a line which has at least a given sum. (Clearly, if your matrix is tall and thin instead of short and fat you would work with columns instead of rows)
Work from left to right, maintaining the sums of all prefixes of the values seen so far, up to the current position. The value of an interval ending in a position is the sum up to and including that position, minus the sum of a prefix which ends just before the interval starts. So if you keep a list of the prefix sums up to just before the current position you can find, at each point, the shortest interval ending at that point which passes your threshold. I'll explain how to search for this efficiently in the next paragraph.
In fact, you probably don't need a list of all prefix sums. Smaller prefix sums are more valuable, and prefix sums which end further along are more valuable. So any prefix sum which ends before another prefix sum and is also larger than that other prefix sum is pointless. So the prefix sums you want can be arranged into a list which retains the order in which they were calculated but also has the property that each prefix sum is smaller than the prefix sum to the right of it. This means that when you want to find the closest prefix sum which is at most a given value you can do this by binary search. It also means that when you calculate a new prefix sum you can put it into its place in the list by just discarding all prefix sums at the right hand end of the list which are larger than it, or equal to it.
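Here is one way this idea could look in Python for the 1D problem (a sketch; `mono_vals`/`mono_idx` are my names for the pruned, increasing list of prefix sums and their end positions, and the target is the shortest interval with sum >= k):

```python
import bisect

def shortest_subarray_at_least(nums, k):
    # shortest contiguous subarray with sum >= k; works with negative values.
    # Keeps a monotone list of prefix sums (increasing values, increasing indices).
    mono_vals = [0]   # pruned prefix-sum values, strictly increasing
    mono_idx = [0]    # positions where those prefixes end
    total = 0
    best = None
    for j, x in enumerate(nums, 1):
        total += x
        # the interval (i, j] has sum >= k iff prefix_i <= total - k;
        # the rightmost such kept prefix gives the shortest interval ending at j
        pos = bisect.bisect_right(mono_vals, total - k) - 1
        if pos >= 0:
            length = j - mono_idx[pos]
            if best is None or length < best:
                best = length
        # discard prefixes >= total: they end earlier AND are no smaller
        while mono_vals and mono_vals[-1] >= total:
            mono_vals.pop()
            mono_idx.pop()
        mono_vals.append(total)
        mono_idx.append(j)
    return best
```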

Related

Max path sum in a 2D array

Consider the following question:
Given a 2D array of unsigned integers and a maximum length n, find a path in that matrix that is not longer than n and which maximises the sum. The output should consist of both the path and the sum.
A path consists of neighbouring integers that are either all in the same row, or in the same column, or down a diagonal in the down-right direction.
For example, consider the following matrix and a given path length limit of 3:
1 2 3 4 5
2 1 2 2 1
3 4 5* 6 5
3 3 5 10* 5
1 2 5 7 15*
The optimal path would be 5 + 10 + 15 = 30 (the nodes are marked with *).
Now, upon seeing this problem, immediately a Dynamic Programming solution seems to be most appropriate here, given this problem's similarity to other problems like Min Cost Path or Maximum Sum Rectangular Submatrix. The issue is that in order to correctly solve this problem, you need to start building up the paths from every integer (node) in the matrix and not just start the path from the top left and end on the bottom right.
I was initially thinking of an approach similar to that of the solution for Maximum Sum Rectangular Submatrix in which I could store each possible path from every node (with path length less than n, only going right/down), but the only way I can envision that approach is by making recursive calls for down and right from each node which would seem to defeat the purpose of DP. Also, I need to be able to store the max path.
Another possible solution I was thinking about was somehow adapting a longest path search and running it from each int in the graph where each int is like an edge weight.
What would be the most efficient way to find the max path?
The challenge here is to avoid summing the same nodes more than once. For that you could apply the following algorithm:
Algorithm
For each of the 3 directions (down, down+right, right) perform steps 2 and 3:
Determine the number of lines that exist in this direction. For the downward direction, this is the number of columns. For the rightward direction, this is the number of rows. For the diagonal direction, this is the number of diagonal lines, i.e. the sum of the number of rows and columns minus 1.
For each line do:
Determine the first node on that line (call it the "head"), and also set the "tail" to that same node. These two references refer to the end points of the "current" path. Also set both the sum and path-length to zero.
For each head node on the current line perform the following bullet points:
Add the head node's value to the sum and increase the path length
If the path length is larger than the allowed maximum, subtract the tail's value from the sum, and set the tail to the node that follows it on the current line
Whenever the sum is greater than the greatest sum found so far, remember it together with the path's location.
Set the head to the node that follows it on the current line
At the end return the greatest sum and the path that generated this sum.
Code
Here is an implementation in basic JavaScript:
function maxPathSum(matrix, maxLen) {
    var row, rows, col, cols, line, lines, dir, dirs, len,
        headRow, headCol, tailRow, tailCol, sum, maxSum, path;
    rows = matrix.length;
    cols = matrix[0].length;
    maxSum = -1;
    dirs = 3; // Number of directions that paths can follow
    if (maxLen == 1 || cols == 1)
        dirs = 1; // Only need to check the downward direction
    for (dir = 1; dir <= dirs; dir++) {
        // Number of lines in this direction to try paths on
        lines = [cols, rows, rows + cols - 1][dir-1];
        for (line = 0; line < lines; line++) {
            sum = 0;
            len = 0;
            // Set starting point depending on the direction
            headRow = [0, line, line >= rows ? 0 : line][dir-1];
            headCol = [line, 0, line >= rows ? line - rows : 0][dir-1];
            tailRow = headRow;
            tailCol = headCol;
            // Traverse this line
            while (headRow < rows && headCol < cols) {
                // Lengthen the path at the head
                sum += matrix[headRow][headCol];
                len++;
                if (len > maxLen) {
                    // Shorten the path at the tail
                    sum -= matrix[tailRow][tailCol];
                    tailRow += dir % 2;
                    tailCol += dir >> 1;
                }
                if (sum > maxSum) {
                    // Found a better path
                    maxSum = sum;
                    path = '(' + tailRow + ',' + tailCol + ') - '
                         + '(' + headRow + ',' + headCol + ')';
                }
                headRow += dir % 2;
                headCol += dir >> 1;
            }
        }
    }
    // Return the maximum sum and the string representation of
    // the path that has this sum
    return { maxSum, path };
}
// Sample input
var matrix = [
[1, 2, 3, 4, 5],
[2, 1, 2, 2, 1],
[3, 4, 5, 5, 5],
[3, 3, 5, 10, 5],
[1, 2, 5, 5, 15],
];
var best = maxPathSum(matrix, 3);
console.log(best);
Some details about the code
Be aware that row/column indexes start at 0.
The way the head and tail coordinates are incremented is based on the binary representation of the dir variable: it takes these three values (binary notation): 01, 10, 11
You can then take the first bit to indicate whether the next step in the direction is on the next column (1) or not (0), and the second bit to indicate whether it is on the next row (1) or not (0). You can depict it like this, where 00 represents the "current" node:
00 10
01 11
So we have this meaning to the values of dir:
01: walk along the column
10: walk along the row
11: walk diagonally
The code uses >>1 for extracting the first bit, and % 2 for extracting the last bit. That operation will result in a 0 or 1 in both cases, and is the value that needs to be added to either the column or the row.
The following expression creates a 1D array and takes one of its values on-the-fly:
headRow = [0, line, line >= rows ? 0 : line][dir-1];
It is short for:
switch (dir) {
case 1:
headRow = 0;
break;
case 2:
headRow = line;
break;
case 3:
if (line >= rows)
headRow = 0
else
headRow = line;
break;
}
Time and space complexity
The head will visit each node exactly once per direction. The tail will visit fewer nodes. The number of directions is constant, and the max path length value does not influence the number of head visits, so the time complexity is:
Θ(rows * columns)
There are no additional arrays used in this algorithm, just a few primitive variables. So the additional space complexity is:
Θ(1)
both of which are the best you could hope for.
Is it Dynamic Programming?
In a DP solution you would typically use some kind of tabulation or memoization, possibly in the form of a matrix, where each sub-result found for a particular node is input for determining the result for neighbouring nodes.
Such solutions could need Θ(rows*columns) extra space. But this problem can be solved without such (extensive) space usage. When looking at one line at a time (a row, a column or a diagonal), the algorithm has some similarities with Kadane's algorithm:
One difference is that here the choice to extend or shorten the path/subarray is not dependent on the matrix data itself, but on the given maximum length. This is also related to the fact that here all values are guaranteed to be non-negative, while Kadane's algorithm is suitable for signed numbers.
Just like with Kadane's algorithm the best solution so far is maintained in a separate variable.
Another difference is that here we need to look in three directions. But that just means repeating the same algorithm in those three directions, while carrying over the best solution found so far.
This is a very basic use of Dynamic Programming, since you don't need the tabulation or memoization techniques here. We only keep the best results in the variables sum and maxSum. That cannot be viewed as tabulation or memoization, which typically keep track of several competing results that must be compared at some point. See this interesting answer on the subject.
Use F[i][j][k] as the max path sum where the path has length k and ends at position (i, j).
F[i][j][k] can be computed from F[i-1][j][k-1] and F[i][j-1][k-1].
The answer would be the maximum value of F.
To retrieve the max path, use another table G[i][j][k] to store the last step of F[i][j][k], i.e. it comes from (i-1,j) or (i,j-1).
The constraints are that the path can only be created by going down or to the right in the matrix.
Solution complexity O(N * M * L) where:
N: number of rows
M: number of columns
L: max length of the path
int solve(int x, int y, int l) {
    if(x > N || y > M) { return -INF; }
    if(l == 1) { return matrix[x][y]; }
    if(dp[x][y][l] != -INF) { return dp[x][y][l]; } // if cached before, return the answer
    int option1 = solve(x+1, y, l-1); // take a step down
    int option2 = solve(x, y+1, l-1); // take a step right
    maxPath[x][y][l] = (option1 > option2) ? DOWN : RIGHT; // to trace the path
    return dp[x][y][l] = max(option1, option2) + matrix[x][y];
}
example: solve(3,3,3): the max path sum starting from (3,3) with length 3 (i.e. 2 steps)
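The recurrence above can be sketched in Python with top-down memoization (the function names are mine, not from the answer; paths move only down or right, per the answer's constraint, and lengths 1..max_len are all tried):

```python
from functools import lru_cache

def max_path_sum(matrix, max_len):
    # solve(x, y, l): best sum of a down/right path of exactly l cells from (x, y)
    n, m = len(matrix), len(matrix[0])

    @lru_cache(maxsize=None)
    def solve(x, y, l):
        if x >= n or y >= m:
            return float('-inf')  # fell off the matrix
        if l == 1:
            return matrix[x][y]
        return matrix[x][y] + max(solve(x + 1, y, l - 1),
                                  solve(x, y + 1, l - 1))

    # the answer is the maximum over all starting cells and all lengths <= max_len
    return max(solve(x, y, l)
               for x in range(n) for y in range(m)
               for l in range(1, max_len + 1)
               if solve(x, y, l) != float('-inf'))
```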

Find largest cuboid containing only 1's in an NxNxN binary array

Given an NxNxN binary array (containing only 0's or 1's), how can we obtain the largest cuboid of 1's with a non-trivial algorithm, i.e. one running in O(N^3)?
--
It is the same problem as Find largest rectangle containing only zeros in an N×N binary matrix, but in a higher dimension.
Also, in my case, the largest rectangle can "cross the edge" of the array i.e. the space is like a torus for a 2D matrix.
For a 2D array, if the entry is :
00111
00111
11000
00000
00111
the solution depicted by 'X' is
00XXX
00XXX
11000
00000
00XXX
I've done the computation for an NxN binary array and found a solution for the largest rectangle problem in O(N^2) by following the idea in http://tech-queries.blogspot.de/2011/03/maximum-area-rectangle-in-histogram.html.
But I don't know how to apply it for a 3D array.
--
Example for a 3x3x3 array where the solution "cross the edge":
111
100
011
111
001
111
011
110
011
the solution should be:
1XX
100
0XX
1XX
001
1XX
0XX
110
0XX
This solution has O(N^3 log^2 N) complexity (may be optimized to O(N^3 log N)). An additional integer array of size 2*8*N^3 will be needed.
Compute r(i,j,k): for each of the N^2 rows, compute the cumulative sum of all non-zero elements, resetting it when a zero element is found.
Perform the following steps for various values of K, using Golden section search (or Fibonacci search) to find the maximum result.
Compute c(i,j,k): for each of the N^2 columns, compute the cumulative sum of all elements with r(i,j,k) >= K, resetting it when an element with r(i,j,k) < K is found. For a good visualization of steps 1 and 2, see this answer.
Perform the last step for various values of M, using Golden section search to find the maximum result.
Compute sum: for each of the N^2 values of the 3rd coordinate, compute the cumulative sum of all elements with c(i,j,k) >= M, resetting it when an element with c(i,j,k) < M is found. Calculate sum*K*M and update the best result so far if necessary.
"cross the edge" property of the array is handled in obvious way: iterate every index twice and keep all cumulative sums not larger than N.
For the multidimensional case, this algorithm has O(N^D log^(D-1) N) time complexity and O(D*N^D) space complexity.
Optimization to O(N^3 log N)
Step 4 of the algorithm sets a global value for M. This step may be excluded (and complexity decreased by log N) if value for M is determined locally.
To do this, step 5 should be improved. It should maintain a double-ended queue (head of which contains local value of M) and a stack (keeping starting positions for all values of M, evicted from the queue).
While c(i,j,k) increases, it is appended to the tail of the queue.
If c(i,j,k) decreases, all larger values are removed from the queue's tail. If it decreases further (queue is empty), stack is used to restore 'sum' value and put corresponding 'M' value to the queue.
Then several elements may be removed from the head of the queue (and pushed to the stack) if this allows to increase local solution's value.
For the multidimensional case, this optimization gives O(N^D log^(D-2) N) complexity.
Here is an O(N^4) solution.
Let's assume you are storing the cuboid in bool cuboid[N][N][N];
bool array2d[N][N];
for(int x_min = 0; x_min < N; x_min++) {
    // initialize array2d
    for(int y = 0; y < N; y++) {
        for(int z = 0; z < N; z++) {
            array2d[y][z] = true;
        }
    }
    // computation
    for(int x_max = x_min; x_max < N; x_max++) {
        // now we want to find the largest cuboid whose
        // X coordinates span exactly x_min to x_max.
        // A cell at (y, z) can be used in the cuboid if and only if
        // there are only 1's in cuboid[x][y][z] for x_min <= x <= x_max,
        // so compute that for each cell in array2d:
        for(int y = 0; y < N; y++) {
            for(int z = 0; z < N; z++) {
                array2d[y][z] &= cuboid[x_max][y][z];
            }
        }
        // you already know how to find the largest rectangle in 2D in O(N^2)
        local_volume = (x_max - x_min + 1) * find_largest_area(array2d);
        largest_volume = max(largest_volume, local_volume);
    }
}
You can use the same trick to compute the best solution in X dimensions: just reduce the problem to X-1 dimensions. Complexity: O(N^(2*X-2)).
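A Python sketch of this reduction (the inner `largest_rect_area` uses the histogram-stack method from the question's link; note this does not handle the torus/wrap-around variant from the question):

```python
def largest_rect_area(grid):
    # largest all-ones rectangle in a 0/1 grid, via the histogram technique:
    # per row, column heights of consecutive 1's, then a stack sweep
    cols = len(grid[0])
    heights = [0] * cols
    best = 0
    for row in grid:
        for c in range(cols):
            heights[c] = heights[c] + 1 if row[c] else 0
        stack = []  # (start_index, height) pairs with increasing heights
        for c, h in enumerate(heights + [0]):  # sentinel 0 flushes the stack
            start = c
            while stack and stack[-1][1] >= h:
                start, hh = stack.pop()
                best = max(best, hh * (c - start))
            stack.append((start, h))
    return best

def largest_cuboid(cube):
    # O(N^4): fix x_min, AND the layers together as x_max grows, and
    # solve the 2D largest-rectangle problem on the combined layer
    n = len(cube)
    best = 0
    for x_min in range(n):
        layer = [[1] * n for _ in range(n)]
        for x_max in range(x_min, n):
            for y in range(n):
                for z in range(n):
                    layer[y][z] &= cube[x_max][y][z]
            best = max(best, (x_max - x_min + 1) * largest_rect_area(layer))
    return best
```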

Minimum value of maximum values in sub-segments ... in O(n) complexity

I interviewed with Amazon a few days ago. I could not answer one of the questions they asked me to their satisfaction. I have tried to work out the answer since the interview, but I have not been successful so far. Here is the question:
You have an array of integers of size n. You are given parameter k where k < n. For each segment of consecutive elements of size k in the array you need to calculate the maximum value. You only need to return the minimum value of these maximum values.
For instance given 1 2 3 1 1 2 1 1 1 and k = 3 the answer is 1.
The segments would be 1 2 3, 2 3 1, 3 1 1, 1 1 2, 1 2 1, 2 1 1, 1 1 1.
The maximum values in each segment are 3, 3, 3, 2, 2, 2, 1.
The minimum of these values is 1, thus the answer is 1.
The best answer I came up with is of complexity O(n log k). What I do is to create a binary search tree with the first k elements, get the maximum value in the tree and save it in variable minOfMax, then loop one element at a time with the remaining elements in the array, remove the first element in the previous segment from the binary search tree, insert the last element of the new segment in the tree, get the maximum element in the tree and compare it with minOfMax leaving in minOfMax the min value of the two.
The ideal answer needs to be of complexity O(n).
Thank you.
There is a very clever way to do this that's related to this earlier question. The idea is that it's possible to build a queue data structure that supports enqueue, dequeue, and find-max in amortized O(1) time (there are many ways to do this; two are explained in the original question). Once you have this data structure, begin by adding the first k elements from the array into the queue in O(k) time. Since the queue supports O(1) find-max, you can find the maximum of these k elements in O(1) time. Then, continuously dequeue an element from the queue and enqueue (in O(1) time) the next array element. You can then query in O(1) what the maximum of each of these k-element subarrays are. If you track the minimum of these values that you see over the course of the array, then you have an O(n)-time, O(k)-space algorithm for finding the minimum maximum of the k-element subarrays.
Hope this helps!
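The approach above can be sketched in Python using collections.deque as the max-queue (one standard realization of an O(1)-find-max queue: keep indices whose values are in decreasing order):

```python
from collections import deque

def min_of_window_maxima(arr, k):
    # minimum over all k-length windows of the window maximum, in O(n) time
    dq = deque()   # indices; corresponding arr values are decreasing front-to-back
    result = None
    for i, x in enumerate(arr):
        while dq and arr[dq[-1]] <= x:
            dq.pop()           # dominated elements can never be a window max again
        dq.append(i)
        if dq[0] <= i - k:
            dq.popleft()       # the front fell out of the current window
        if i >= k - 1:         # window [i-k+1, i] is complete
            wmax = arr[dq[0]]
            if result is None or wmax < result:
                result = wmax
    return result
```

On the interview example [1, 2, 3, 1, 1, 2, 1, 1, 1] with k = 3 this yields 1, matching the walkthrough above.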
#templatetypedef's answer works, but I think I have a more direct approach.
Start by computing the max for the following (closed) intervals:
[k-1, k-1]
[k-2, k-1]
[k-3, k-1]
...
[0, k-1]
Note that each of these can be computed in constant time from the preceding one.
Next, compute the max for these intervals:
[k, k]
[k, k+1]
[k, k+2]
...
[k, 2k-1]
Now these intervals:
[2k-1, 2k-1]
[2k-2, 2k-1]
[2k-3, 2k-1]
...
[k+1, 2k-1]
Next you do the intervals from 2k to 3k-1 ("forwards intervals"), then from 3k-1 down to 2k+1 ("backwards intervals"). And so on until you reach the end of the array.
Put all of these into a big table. Note that each entry in this table took constant time to compute. Observe that there are at most 2*n intervals in the table (because each element appears once on the right side of a "forwards interval" and once on the left side of a "backwards interval").
Now, if [a,b] is any interval of width k, it must contain exactly one of 0, k, 2k, ...
Say it contains m*k.
Observe that the intervals [a, m*k-1] and [m*k ,b] are both somewhere in our table. So we can simply look up the max for each, and the max of those two values is the max of the interval [a,b].
So for any interval of width k, we can use our table to get its maximum in constant time. We can generate the table in O(n) time. Result follows.
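Sketched in Python (`left`/`right` are my names for the "forwards" and "backwards" interval maxima described above; each width-k window is the union of a backwards interval and a forwards interval in adjacent blocks):

```python
def min_of_window_maxima_table(arr, k):
    # left[i]:  max over the "forwards" interval from i's block start to i
    # right[i]: max over the "backwards" interval from i to i's block end
    n = len(arr)
    left = [0] * n
    right = [0] * n
    for i in range(n):
        left[i] = arr[i] if i % k == 0 else max(left[i - 1], arr[i])
    for i in range(n - 1, -1, -1):
        if i == n - 1 or (i + 1) % k == 0:
            right[i] = arr[i]
        else:
            right[i] = max(right[i + 1], arr[i])
    # window [a, a+k-1] straddles at most one block boundary, so its max
    # is the max of one backwards interval and one forwards interval
    return min(max(right[a], left[a + k - 1]) for a in range(n - k + 1))
```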
I implemented (and commented) templatetypedef's answer in C#.
n is array length, k is window size.
public static void printKMax(int[] arr, int n, int k)
{
    Deque<int> qi = new Deque<int>();
    int i;
    for (i = 0; i < k; i++) // The first window of the array
    {
        while ((qi.Count > 0) && (arr[i] >= arr[qi.PeekBack()]))
        {
            qi.PopBack();
        }
        qi.PushBack(i);
    }
    for (i = k; i < n; ++i)
    {
        // The front of the deque holds the index of the largest
        // element in the previous window
        Console.WriteLine(arr[qi.PeekFront()]);
        while (qi.Count > 0 && qi.PeekFront() <= i - k)
        {
            qi.PopFront(); // it has fallen out of the current window
        }
        while (qi.Count > 0 && arr[i] >= arr[qi.PeekBack()])
        {
            qi.PopBack();
        }
        qi.PushBack(i);
    }
    Console.WriteLine(arr[qi.PeekFront()]);
}

Total number of possible triangles from n numbers

If n numbers are given, how would I find the total number of possible triangles? Is there any method that does this in less than O(n^3) time?
I am considering a+b>c, b+c>a and a+c>b conditions for being a triangle.
Assume there are no equal numbers among the given n, and that it's allowed to use one number more than once. For example, given the numbers {1,2,3}, we can create 7 triangles:
1 1 1
1 2 2
1 3 3
2 2 2
2 2 3
2 3 3
3 3 3
If any of those assumptions isn't true, it's easy to modify the algorithm.
Here I present an algorithm which takes O(n^2) time in the worst case:
Sort numbers (ascending order).
We will take triples ai <= aj <= ak, such that i <= j <= k.
For each i, j you need to find the largest k that satisfies ak <= ai + aj. Then every triple (ai, aj, al) with j <= l <= k is a triangle (because ak >= aj >= ai, the only inequality that can be violated is ak < ai + aj).
Consider two pairs (i, j1) and (i, j2) with j1 <= j2. It's easy to see that k2 (found on step 2 for (i, j2)) >= k1 (found on step 2 for (i, j1)). It means that as you iterate over j, you only need to check numbers starting from the previous k. This gives O(n) time for each particular i, which implies O(n^2) for the whole algorithm.
C++ source code:
int Solve(int* a, int n)
{
    int answer = 0;
    std::sort(a, a + n);
    for (int i = 0; i < n; ++i)
    {
        int k = i;
        for (int j = i; j < n; ++j)
        {
            while (n > k && a[i] + a[j] > a[k])
                ++k;
            answer += k - j;
        }
    }
    return answer;
}
Update for downvoters:
This definitely is O(n^2)! Please read carefully the chapter about Amortized Analysis in "Introduction to Algorithms" by Thomas H. Cormen et al. (17.2 in the second edition).
Finding the complexity by counting nested loops is sometimes completely wrong.
Here I'll try to explain it as simply as I can. Fix the variable i. Then for that i we iterate j from i to n (an O(n) operation), and the internal while loop moves k from i to n (also O(n) in total). Note: I don't restart the while loop from the beginning for each j. We also need to do this for each i from 0 to n. So it gives us n * (O(n) + O(n)) = O(n^2).
There is a simple algorithm in O(n^2*logn).
Assume you want all triangles as triples (a, b, c) where a <= b <= c.
There are 3 triangle inequalities, but only a + b > c suffices (the others then hold trivially).
And now:
Sort the sequence in O(n * logn), e.g. by merge-sort.
For each pair (a, b), a <= b, the remaining value c needs to be at least b and less than a + b.
So you need to count the number of items in the interval [b, a+b).
This can simply be done by binary-searching for a+b (O(logn)); the count is the difference between that position and the position of b.
All together that is O(n * logn + n^2 * logn), which is O(n^2 * logn). Hope this helps.
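This counting scheme can be sketched in Python with the bisect module (assuming positive side lengths, and allowing a value to be reused, to match the question's example):

```python
import bisect

def count_triangles(nums):
    # counts triples (a, b, c) with a <= b <= c and a + b > c
    a = sorted(nums)
    n = len(a)
    total = 0
    for i in range(n):
        for j in range(i, n):
            # valid third sides lie in the value range [a[j], a[i] + a[j]),
            # i.e. at positions j .. hi-1 of the sorted array
            hi = bisect.bisect_left(a, a[i] + a[j])
            total += hi - j
    return total
```

For {1, 2, 3} this counts the 7 triangles listed earlier.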
If you use a binary sort, that's O(n*log(n)), right? Keep your binary tree handy, and for each pair (a, b) with a <= b, count the elements c with c >= b and c < (a + b).
Let a, b and c be three sides. The below condition must hold for a triangle (Sum of two sides is greater than the third side)
i) a + b > c
ii) b + c > a
iii) a + c > b
Following are the steps to count triangles.
1. Sort the array in non-decreasing order.
2. Initialize two pointers 'i' and 'j' to the first and second elements respectively, and initialize the count of triangles as 0.
3. Fix 'i' and 'j' and find the rightmost index 'k' (or largest 'arr[k]') such that 'arr[i] + arr[j] > arr[k]'. The number of triangles that can be formed with 'arr[i]' and 'arr[j]' as two sides is 'k - j'. Add 'k - j' to the count of triangles. Let us consider 'arr[i]' as 'a', 'arr[j]' as 'b' and all elements between 'arr[j+1]' and 'arr[k]' as 'c'. The above-mentioned conditions (ii) and (iii) are satisfied because 'arr[i] < arr[j] < arr[k]', and we check condition (i) when we pick 'k'.
4. Increment 'j' to fix the second element again. Note that in step 3, we can use the previous value of 'k'. The reason is simple: if we know that 'arr[i] + arr[j-1]' is greater than 'arr[k]', then 'arr[i] + arr[j]' will also be greater than 'arr[k]', because the array is sorted in increasing order.
5. If 'j' has reached the end, then increment 'i'. Initialize 'j' as 'i + 1', 'k' as 'i + 2' and repeat steps 3 and 4.
Time Complexity: O(n^2).
The time complexity looks higher because of the 3 nested loops. If we take a closer look at the algorithm, we observe that 'k' is initialized only once in the outermost loop. The innermost loop executes at most O(n) times in total for every iteration of the outermost loop, because 'k' starts from 'i+2' and only goes up to n for all values of 'j'. Therefore, the time complexity is O(n^2).
I have worked out an algorithm that runs in O(n^2 lgn) time. I think it's correct...
The code is written in C++...
int Search_Closest(A, p, q, n) /* Returns the index of the element closest to n in
                                  array A[p..q] */
{
    if(p < q)
    {
        int r = (p + q) / 2;
        if(n == A[r])
            return r;
        if(p == r)
            return r;
        if(n < A[r])
            return Search_Closest(A, p, r, n);
        else
            return Search_Closest(A, r, q, n);
    }
    else
        return p;
}

int no_of_triangles(A, p, q) /* Returns the no of triangles possible in A[p..q] */
{
    int sum = 0;
    Quicksort(A, p, q); // Sorts the array A[p..q] in O(nlgn) expected case time
    for(int i = p; i <= q; i++)
        for(int j = i + 1; j <= q; j++)
        {
            int c = A[i] + A[j];
            int k = Search_Closest(A, j, q, c);
            /* The no of triangles formed with A[i] and A[j] as two sides is
               (k+1)-2 if A[k] is smaller than or equal to c, else it is (k+1)-3.
               As indexing starts from zero we need to add 1 to the value. */
            if(A[k] > c)
                sum += k - 2;
            else
                sum += k - 1;
        }
    return sum;
}
Hope it helps........
A possible answer: we could also use binary search to find the value of 'k' and hence improve the time complexity.
Given N0, N1, N2, ..., Nn-1, sort them into X0, X1, X2, ..., Xn-1 such that X0 >= X1 >= X2 >= ... >= Xn-1.
Choose X0 (down to Xn-3) as the largest side and choose the remaining two items from the rest, e.g. the case (X0, X1, X2):
check X0 < X1 + X2;
if OK, it's a find, and continue;
if NG, skip the rest of the choices.
It seems there is no algorithm better than O(n^3) if you must enumerate the triangles: in the worst case, the result set itself has O(n^3) elements.
For example, if n equal numbers are given, the algorithm has to return on the order of n*(n-1)*(n-2) results. (Merely counting the triangles can be done faster, as the other answers show.)

Suggest an Efficient Algorithm

Given an Array arr of size 100000, each element 0 <= arr[i] < 100. (not sorted, contains duplicates)
Find out how many triplets (i,j,k) are present such that arr[i] ^ arr[j] ^ arr[k] == 0
Note : ^ is the Xor operator. also 0 <= i <= j <= k <= 100000
I have a feeling I have to calculate the frequencies and do some calculation using them, but I just can't seem to get started.
Any algorithm better than the obvious O(n^3) is welcome. :)
It's not homework. :)
I think the key is you don't need to identify the i,j,k, just count how many.
Initialise an array of size 100.
Loop through arr, counting how many of each value there are - O(n).
Loop through the non-zero elements of the small array, working out which triples meet the condition - assuming the counts of the three numbers involved are A, B, C - the number of combinations in the original arr is (A+B+C)!/(A!B!C!) - 100**3 operations, but that's still O(1) assuming the 100 is a fixed value.
So, O(n).
Possible O(n^2) solution, if it works: maintain a variable count and two arrays, single[100] and pair[100]. Iterate over arr, and for each element of value n:
update count: count += pair[n]
update pair: iterate over the array single, and for each index x with single[x] != 0, do pair[x ^ n] += single[x]
update single: single[n]++
In the end count holds the result.
Possible O(100 * n) = O(n) solution.
It solves the problem with the ordering i <= j <= k.
As you know, A ^ B == 0 <=> A == B, so:
long long calcTripletsCount( const vector<int>& sourceArray )
{
    long long res = 0;
    vector<int> count(128);
    vector<int> countPairs(128);
    for(int i = 0; i < sourceArray.size(); i++)
    {
        // count[t] contains the count of element t in sourceArray[0..i]
        count[sourceArray[i]]++;
        // countPairs[t] contains the count of pairs (p1, p2), p1 <= p2 <= i,
        // with sourceArray[p1] ^ sourceArray[p2] == t
        for(int j = 0; j < count.size(); j++)
            countPairs[j ^ sourceArray[i]] += count[j];
        // a ^ b ^ c == 0 iff a ^ b == c, so add the count of pairs (p1, p2)
        // where sourceArray[p1] ^ sourceArray[p2] == sourceArray[i];
        // it is easy to see that this keeps the order p1 <= p2 <= i
        res += countPairs[sourceArray[i]];
    }
    return res;
}
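For reference, a direct Python port of the function above (same logic; values assumed to lie below 128, as in the C++ version):

```python
def calc_triplets_count(source):
    # counts triples (p1, p2, p3), p1 <= p2 <= p3, whose values XOR to 0
    res = 0
    count = [0] * 128        # count[t]: occurrences of value t seen so far
    count_pairs = [0] * 128  # count_pairs[t]: pairs (p1 <= p2) with xor == t
    for v in source:
        count[v] += 1
        for j in range(128):
            count_pairs[j ^ v] += count[j]  # new pairs ending at this element
        res += count_pairs[v]               # pairs whose xor equals v close a triple
    return res
```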
Sorry for my bad English...
I have a (simple) O(n^2 log n) solution which takes into account the fact that i, j and k refer to indices, not integers.
A simple first pass allows us to build an array A of 100 entries: value -> list of indices; we keep each list sorted for later use. O(n log n)
For each pair i,j such that i <= j, we compute X = arr[i]^arr[j]. We then perform a binary search in A[X] to locate the number of indices k such that k >= j. O(n^2 log n)
I could not find any way to leverage sorting / counting algorithm because they annihilate the index requirement.
Sort the array, keeping a map of new indices to originals. O(nlgn)
Loop over i,j:i<j. O(n^2)
Calculate x = arr[i] ^ arr[j]
Since x ^ arr[k] == 0, arr[k] = x, so binary search k>j for x. O(lgn)
For all found k, print mapped i,j,k
O(n^2 lgn)
Start with a frequency count of the number of occurrences of each number between 0 and 99, as Paul suggests. This produces an array freq[] of length 100.
Next, instead of looping over triples A,B,C from that array and testing the condition A^B^C=0,
loop over pairs A,B with A < B. For each A,B, calculate C=A^B (so that now A^B^C=0), and verify that A < B < C < 100. (Any triple will occur in some order, so this doesn't miss triples. But see below). The running total will look like:
Sum+=freq[A]*freq[B]*freq[C]
The work is O(n) for the frequency count, plus about 5000 for the loop over A < B.
Since every triple of three different numbers A,B,C must occur in some order, this finds each such triple exactly once. Next you'll have to look for triples in which two numbers are equal. But if two numbers are equal and the xor of three of them is 0, the third number must be zero. So this amounts to a secondary linear search for B over the frequency count array, counting occurrences of (A=0, B=C < 100). (Be very careful with this case, and especially careful with the case B=0. The count is not just freq[B] ** 2 or freq[0] ** 3. There is a little combinatorics problem hiding there.)
Hope this helps!