Find three elements in a sorted array which sum to a fourth element

A friend of mine recently got this interview question, which seems to us to be solvable but not within the asymptotic time bounds that the interviewer thought should be possible. Here is the problem:
You have an array of N integers, xs, sorted but possibly non-distinct. Your goal is to find four array indices(1) (a,b,c,d) such that the following two properties hold:
xs[a] + xs[b] + xs[c] = xs[d]
a < b < c < d
The goal is to do this in O(N²) time.
First, an O(N³ log N) solution is obvious: for each (a,b,c) ordered triple, use binary search to see if an appropriate d can be found. Now, how to do better?
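For concreteness, here is that baseline as a quick Python sketch (the function name is arbitrary):

    import bisect

    def brute_force(xs):
        # O(N^3 log N): for each a < b < c, binary-search the suffix
        # xs[c+1:] for a d with xs[d] == xs[a] + xs[b] + xs[c].
        n = len(xs)
        for a in range(n):
            for b in range(a + 1, n):
                for c in range(b + 1, n):
                    target = xs[a] + xs[b] + xs[c]
                    d = bisect.bisect_left(xs, target, c + 1)
                    if d < n and xs[d] == target:
                        return (a, b, c, d)
        return None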
One interesting suggestion from the interviewer is to rewrite the first condition as:
xs[a] + xs[b] = xs[d] - xs[c]
It's not clear what to do after this, but perhaps we could choose some pivot value P, and search for an (a,b) pair adding up to P, and a (d,c) pair subtracting to it. That search is easy enough to do in O(N) time for a given P, by searching inwards from both ends of the array. However, it seems to me that the problem with this is that there are N² such values P, not just N of them, so we haven't actually reduced the problem size at all: we're doing O(N) work, O(N²) times.
We found some related problems being discussed online elsewhere: Find 3 numbers in an array adding to a given sum is solvable in O(N²) time, but requires that the sum be fixed ahead of time; adapting the same algorithm but iterating through each possible sum leaves us at O(N³) as always.
Another related problem seems to be Find all triplets in array with sum less than or equal to given sum, but I'm not sure how much of the stuff there is relevant here: an inequality rather than an equality mixes things up quite a bit, and of course the target is fixed rather than varying.
So, what are we missing? Is the problem impossible after all, given the performance requirements? Or is there a clever algorithm we're unable to spot?
(1) Actually the problem as posed is to find all such (a,b,c,d) tuples, and return a count of how many there are. But I think even finding a single one of them in the required time constraints is hard enough.

If the algorithm had to list the solutions (i.e. the sets of a, b, c, and d that satisfy the condition), the worst-case time complexity would be O(n⁴):
1. There can be O(n⁴) solutions
The trivial example is an array with only 0 values in it. Then a, b, c and d have all the freedom as long as they stay in order. This represents O(n⁴) solutions.
But more generally, arrays which follow this pattern have O(n⁴) solutions:
w, w, w, ... x, x, x, ..., y, y, y, ... z, z, z, ....
With just as many occurrences of each, and:
w + x + y = z
However, to only produce the number of solutions, an algorithm can have a better time complexity.
2. Algorithm
This is a slight variation of the already posted algorithm, which does not involve the H factor. It also describes how to handle cases where different configurations lead to the same sums.
Retrieve all pairs and store them in an array X, where each element gets the following information:
a: the smallest index of the two
b: the other index
sum: the value of xs[a] + xs[b]
At the same time also store for each such pair in another array Y, the following:
c: the smallest index of the two
d: the other index
sum: the value of xs[d] - xs[c]
The above operation has a time complexity of O(n²)
Sort both arrays by their elements' sum attribute. In case of equal sum values, the sort order will be determined as follows: for the X array by increasing b; for the Y array by decreasing c. Sorting can be done in O(n² log n) time.
[Edit: I could not prove the earlier claim of O(n²) (unless some assumptions are made that allow for a radix/bucket sorting algorithm, which I will not assume). As noted in comments, in general an array with n² elements can be sorted in O(n² log n²) = O(n² log n) time, but not O(n²).]
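A minimal Python sketch of these two steps (the tuple layout is my own choice, assuming xs is the sorted input array):

    from itertools import combinations

    # Each X entry is (sum, a, b) with a < b;
    # each Y entry is (diff, c, d) with c < d.
    X = [(xs[a] + xs[b], a, b) for a, b in combinations(range(len(xs)), 2)]
    Y = [(xs[d] - xs[c], c, d) for c, d in combinations(range(len(xs)), 2)]

    # Sort by sum; ties broken by increasing b in X and by decreasing c in Y.
    X.sort(key=lambda t: (t[0], t[2]))
    Y.sort(key=lambda t: (t[0], -t[1]))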
Go through both arrays in "tandem" to find pairs of sums that are equal. If that is the case, it needs to be checked that X[i].b < Y[j].c. If so it represents a solution. But there could be many of them, and counting those in an acceptable time needs special care.
Let m = n(n-1)/2, i.e. the number of elements in array X (which is also the size of array Y):
count = 0
i = 0
j = 0
while i < m and j < m:
    if X[i].sum < Y[j].sum:
        i = i + 1
    elif X[i].sum > Y[j].sum:
        j = j + 1
    else:
        # We have a solution. Need to count all others that have same sums in X and Y.
        # Advance j to the end of this equal-sum range in Y, counting the
        # prefix of entries that satisfy X[i].b < Y[j].c (the c values are
        # decreasing within the range, so the valid entries form a prefix).
        # Set k as an index to the last valid entry:
        countY = 0
        j0 = j
        while j < m and X[i].sum == Y[j].sum:
            if X[i].b < Y[j].c:
                countY = countY + 1
            j = j + 1
        k = j0 + countY - 1
        # add chunks to `count`:
        while i < m and countY > 0 and X[i].sum == Y[k].sum:
            while countY > 0 and X[i].b >= Y[k].c:
                countY = countY - 1
                k = k - 1
            count = count + countY
            i = i + 1
Note that although there are nested loops, the variable i only ever increments, and so does j. The variable k always decrements in the innermost loop. Although it also gets higher values to start from, it can never address the same Y element more than a constant number of times via the k index, because while decrementing this index, it stays within the "same sum" range of Y.
So this means that this last part of the algorithm runs in O(m), which is O(n²). As my latest edit confirmed that the sorting step is not O(n²), that step determines the overall time complexity: O(n² log n).

So one solution can be:
List all possible x[a] + x[b] values such that a < b, and hash them in this fashion:
key = (x[a] + x[b]) and value = (a, b).
Complexity of this step: O(n²).
Now list all possible x[d] - x[c] values such that d > c. For each x[d] - x[c], query your hash map. We have a solution if there exists an entry such that b < c for any hit.
Complexity of this step: O(n²) · H,
where H is the search time in your hash map.
Total complexity: O(n²) · H. Now H may be O(1). This could be done if the range of values in the array is small. Also, the choice of hash function would depend on the properties of the elements in the array.
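A minimal Python sketch of this answer, restricted to finding a single quadruple (the question's footnote notes that counting all of them is harder). Keeping only the pair with the smallest b per sum is my own refinement to keep the lookups O(1):

    def find_quadruple(xs):
        # Map each pair sum xs[a] + xs[b] (a < b) to the pair with smallest b:
        # if any pair with this sum satisfies b < c, the minimal-b pair does.
        best = {}
        n = len(xs)
        for a in range(n):
            for b in range(a + 1, n):
                s = xs[a] + xs[b]
                if s not in best or b < best[s][1]:
                    best[s] = (a, b)
        # For each (c, d) with c < d, look up xs[d] - xs[c] and check b < c.
        for c in range(n):
            for d in range(c + 1, n):
                hit = best.get(xs[d] - xs[c])
                if hit is not None and hit[1] < c:
                    return hit + (c, d)
        return None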

Related

Number of ways such that sum of k elements equal to p

Given a series of integers with the relation that each number is the sum of the previous two numbers, and the starting integer is 1:
Series -> 1, 2, 3, 5, 8, 13, 21, 34, 55
Find the number of ways such that the sum of k elements equals p. We can use an element any number of times.
p = 8
k = 4
So the number of ways would be 4. Those are:
1,1,1,5
1,1,3,3
1,2,2,3
2,2,2,2
I am able to solve this question through recursion. I sense dynamic programming here but I am not getting how to do it. Can it be done in much less time?
EDIT: I forgot to mention that the sequence of the numbers does not matter and each combination will be counted once. For example, for p = 3 -> (1,2) and (2,1), the number of ways would be 1 only.
EDIT: Poster has changed the original problem since this was posted. My algorithm still works, but maybe can be improved upon. Original problem had n arbitrary input numbers (he has now modified it to be a Fibonacci series). To apply my algorithm to the modified post, truncate the series by taking only elements less than p (assume there are n of them).
Here's an O(n^(k/2)) algorithm. (n is the number of elements in the series.)
Use a table of length p, such that table[i] contains all combinations of k/2 elements that sum to i. For example, in the example data that you provided, table[4] contains {1,3} and {2,2}.
EDIT: If the space is prohibitive, this same algorithm can be done with an ordered linked lists, where you only store the non-empty table entries. The linked list has to be both directions: forward and backwards, which makes the final step of the algorithm cleaner.
Once this table is computed, then we get all solutions by combining every table[j] with every table[p-j], whenever both are non-empty.
To get the table, initialize the entire thing to empty. Then:
For i_1 = 0 to n-1:
    For i_2 = i_1 to n-1:
        ...
            For i_{k/2} = i_{k/2 - 1} to n-1:
                sum = series[i_1] + ... + series[i_{k/2}]
                if sum <= p:
                    store {i_1, i_2, ..., i_{k/2}} in table[sum]
This "variable number of loops" looks impossible to implement, but actually it can be done with an array of length k/2 that keeps track of where each i_j currently is.
Let's go back to your data and see how our table would look:
table[2] = {1,1}
table[3] = {1,2}
table[4] = {1,3} and {2,2}
table[5] = {2,3}
table[6] = {1,5} and {3,3}
table[7] = {2,5}
table[8] = {3,5}
Solutions are found by combining table[2] with table[6], table[3] with table[5], and table[4] with table[4]. After removing duplicate multisets (e.g. {1,3} with {2,2} gives the same solution as {1,2} with {2,3}), the solutions are: {1,1,1,5}, {1,1,3,3}, {1,2,2,3}, {2,2,2,2}.
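A hedged Python sketch of this table-based approach for even k (the names are mine; it stores value multisets rather than index tuples, for brevity):

    from collections import defaultdict
    from itertools import combinations_with_replacement

    def half_sum_solutions(series, p, k):
        # Build table[s]: all k/2-element multisets of series summing to s <= p.
        table = defaultdict(list)
        for combo in combinations_with_replacement(series, k // 2):
            s = sum(combo)
            if s <= p:
                table[s].append(combo)
        # Combine table[j] with table[p-j]; deduplicate equal multisets.
        solutions = set()
        for j in table:
            for left in table[j]:
                for right in table.get(p - j, []):
                    solutions.add(tuple(sorted(left + right)))
        return solutions

    print(half_sum_solutions([1, 2, 3, 5], 8, 4))
    # four multisets: (1,1,1,5), (1,1,3,3), (1,2,2,3), (2,2,2,2)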
You can use dynamic programming. Let C(p, k) be the number of ways that k elements sum to p, and let a be the array of elements. Then
C(p, k) = C(p - a[0], k - 1) + C(p - a[1], k - 1) + .... + C(p - a[n-1], k - 1)
Then, you can use memoization to speed up your code. Note that as written this counts ordered selections; to count each combination once (per the edit in the question), restrict each choice to elements at or after the previously chosen one.
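A memoized Python sketch along these lines, with the start-index restriction so that each multiset is counted once (the helper names are mine):

    from functools import lru_cache

    def count_ways(series, p, k):
        a = [x for x in series if x <= p]  # larger elements can never help

        @lru_cache(maxsize=None)
        def c(remaining, parts, start):
            # Ways to pick `parts` elements from a[start:] (repetition allowed,
            # indices non-decreasing, so each multiset is counted once)
            # summing to `remaining`.
            if parts == 0:
                return 1 if remaining == 0 else 0
            return sum(c(remaining - a[i], parts - 1, i)
                       for i in range(start, len(a)) if a[i] <= remaining)

        return c(p, k, 0)

    print(count_ways([1, 2, 3, 5, 8, 13, 21, 34, 55], 8, 4))  # 4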
Hint:
Your problem is well known. It is the sum-set problem, a variation of the knapsack problem. Check this pretty good explanation: sum-set problem

Find the median of the sum of the arrays

Two sorted arrays of length n are given and the question is to find, in O(n) time, the median of their sum array, which contains all the possible pairwise sums between every element of array A and every element of array B.
For instance: Let A[2,4,6] and B[1,3,5] be the two given arrays.
The sum array is [2+1,2+3,2+5,4+1,4+3,4+5,6+1,6+3,6+5]. Find the median of this array in O(n).
Solving the question in O(n²) is pretty straightforward, but is there any O(n) solution to this problem?
Note: This is an interview question asked to one of my friends and the interviewer was quite sure that it can be solved in O(n) time.
The correct O(n) solution is quite complicated, and it takes a significant amount of text, code and skill to explain and prove. More precisely, it takes 3 pages to do so convincingly, as can be seen in detail here: http://www.cse.yorku.ca/~andy/pubs/X+Y.pdf (found by simonzack in the comments).
It is basically a clever divide-and-conquer algorithm that, among other things, takes advantage of the fact that in a sorted n-by-n matrix, one can find in O(n) the number of elements that are smaller/greater than a given number k. It recursively breaks down the matrix into smaller submatrices (by taking only the odd rows and columns, resulting in a submatrix that has n/2 columns and n/2 rows), which, combined with the step above, results in a complexity of O(n) + O(n/2) + O(n/4) + ... = O(2n) = O(n). It is crazy!
I can't explain it better than the paper, which is why I'll explain a simpler, O(n log n) solution instead :).
O(n log n) solution:
It's an interview! You can't get that O(n) solution in time. So hey, why not provide a solution that, although not optimal, shows you can do better than the other obvious O(n²) candidates?
I'll make use of the O(n) algorithm mentioned above to count how many numbers are smaller/greater than a given number k in a sorted n-by-n matrix. Keep in mind that we don't need an actual matrix! The Cartesian sum of two sorted arrays of size n, as described by the OP, results in a sorted n-by-n matrix, which we can simulate by considering the elements of the arrays as follows:
a[3] = {1, 5, 9};
b[3] = {4, 6, 8};
//a + b:
{1+4, 1+6, 1+8,
5+4, 5+6, 5+8,
9+4, 9+6, 9+8}
Thus each row contains non-decreasing numbers, and so does each column. Now, pretend you're given a number k. We want to find in O(n) how many of the numbers in this matrix are smaller than k, and how many are greater. Clearly, if both values are less than (n²+1)/2, that means k is our median!
The algorithm is pretty simple:
int smaller_than_k(int k){
    int x = 0, j = n-1;
    for(int i = 0; i < n; ++i){
        while(j >= 0 && k <= a[i]+b[j]){
            --j;
        }
        x += j+1;
    }
    return x;
}
This basically counts how many elements fit the condition at each row. Since the rows and columns are already sorted as seen above, this will provide the correct result. And as both i and j iterate at most n times each, the algorithm is O(n) [Note that j does not get reset within the for loop]. The greater_than_k algorithm is similar.
Now, how do we choose k? That is the logn part. Binary Search! As has been mentioned in other answers/comments, the median must be a value contained within this array:
candidates[n] = {a[0]+b[n-1], a[1]+b[n-2],... a[n-1]+b[0]};.
Simply sort this array [also O(n log n)], and run the binary search on it. Since the array is now in non-decreasing order, it is straightforward to notice that the number of values smaller than each candidate[i] is also non-decreasing (a monotonic function), which makes it suitable for the binary search. The largest number k = candidate[i] for which smaller_than_k(k) returns less than (n²+1)/2 is the answer, and is obtained in log(n) iterations:
int b_search(){
    int lo = 0, hi = n, mid, n2 = (n*n+1)/2;
    while(hi-lo > 1){
        mid = (hi+lo)/2;
        if(smaller_than_k(candidate[mid]) < n2)
            lo = mid;
        else
            hi = mid;
    }
    return candidate[lo]; // the median
}
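Putting the pieces together, a runnable Python sketch of the whole O(n log n) approach (it assumes both arrays are sorted, have equal length, and that the median indeed lies on the candidate diagonal, as argued above):

    def median_of_sums(a, b):
        # Lower median: the ((n*n + 1) // 2)-th smallest pairwise sum.
        n = len(a)
        target = (n * n + 1) // 2

        def smaller_than(k):
            # O(n) count of pairs with a[i] + b[j] < k; j is never reset,
            # it only moves left as i moves down the implicit matrix.
            x, j = 0, n - 1
            for i in range(n):
                while j >= 0 and a[i] + b[j] >= k:
                    j -= 1
                x += j + 1
            return x

        candidates = sorted(a[i] + b[n - 1 - i] for i in range(n))
        lo, hi = 0, n
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if smaller_than(candidates[mid]) < target:
                lo = mid
            else:
                hi = mid
        return candidates[lo]

    print(median_of_sums([2, 4, 6], [1, 3, 5]))  # 7, per the question's example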
Let's say the arrays are A = {A[1] ... A[n]}, and B = {B[1] ... B[n]}, and the pairwise sum array is C = {A[i] + B[j], where 1 <= i <= n, 1 <= j <= n} which has n^2 elements and we need to find its median.
Median of C must be an element of the array D = {A[1] + B[n], A[2] + B[n - 1], ... A[n] + B[1]}: if you fix A[i], and consider all the sums A[i] + B[j], you would see that the only A[i] + B[j = n + 1 - i] (which is one of D) could be the median. That is, it may not be the median, but if it is not, then all other A[i] + B[j] are also not median.
This can be proved by considering all B[j] and counting the number of values that are lower and the number of values that are greater than A[i] + B[j] (we can do this quite accurately because the two arrays are sorted -- the calculation is a bit messy though). You'd see that for A[i] + B[n + 1 - i] these two counts are most "balanced".
The problem then reduces to finding median of D, which has only n elements. An algorithm such as Hoare's will work.
UPDATE: this answer is wrong. The real conclusion here is that the median is one of D's elements, but D's median is not the same as C's median.
Doesn't this work?:
You can compute the rank of a number in linear time as long as A and B are sorted. The technique you use for computing the rank can also be used to find all things in A+B that are between some lower bound and some upper bound, in time linear in the size of the output plus |A|+|B|.
Randomly sample n things from A+B. Take the median, say foo. Compute the rank of foo. With constant probability, foo's rank is within n of the median's rank. Keep doing this (an expected constant number of times) until you have lower and upper bounds on the median that are within 2n of each other. (This whole process takes expected linear time, but it's obviously slow.)
All you have to do now is enumerate everything between the bounds and do a linear-time selection on a linear-sized list.
(Unrelatedly, I wouldn't excuse the interviewer for asking such an obviously crappy interview question. Stuff like this in no way indicates your ability to code.)
EDIT: You can compute the rank of a number x by doing something like this:
Set i = j = 0.
While j < |B| and A[i] + B[j] <= x, j++. Then j--.
While i < |A| {
    While j >= 0 and A[i] + B[j] > x, j--.
    If j < 0, break.
    rank += j+1.
    i++.
}
FURTHER EDIT: Actually, the above trick only narrows down the candidate space to about n log(n) members of A+B. Then you have a general selection problem within a universe of size n log(n); you can do basically the same trick one more time and find a range of size proportional to sqrt(n) log(n) where you do selection.
Here's why: If you sample k things from an n-set and take the median, then the sample median's order is between the (1/2 - sqrt(log(n) / k))th and the (1/2 + sqrt(log(n) / k))th elements with at least constant probability. When n = |A+B|, we'll want to take k = sqrt(n) and we get a range of about sqrt(n log n) elements --- that's about |A| log |A|. But then you do it again and you get a range on the order of sqrt(n) polylog(n).
You should use a selection algorithm to find the median of an unsorted list in O(n). Look at this: http://en.wikipedia.org/wiki/Selection_algorithm#Linear_general_selection_algorithm_-_Median_of_Medians_algorithm

Need idea for solving this algorithm puzzle

I've come across some similar problems to this one in the past, and I still haven't got a good idea how to solve this problem. The problem goes like this:
You are given a positive integer array with size n <= 1000, and k <= n, which is the number of contiguous subarrays that you will have to split your array into. You have to output the minimum m, where m = max{s[1],..., s[k]}, and s[i] is the sum of the i-th subarray. All integers in the array are between 1 and 100. Example:
Input:
5 3        (n = 5, k = 3)
2 1 1 2 3
Output:
3
Splitting the array into 2+1 | 1+2 | 3 will minimize m.
My brute force idea was to make the first subarray end at position i (for all possible i) and then try to split the rest of the array into k-1 subarrays in the best way possible. However, this is an exponential solution and will never work.
So I'm looking for good ideas to solve it. If you have one please tell me.
Thanks for your help.
You can use dynamic programming to solve this problem, but you can actually solve it with a greedy approach plus binary search on the answer. This algorithm's complexity is O(n log d), where d is the output answer (an upper bound for d would be the sum of all the elements in the array); a linear search over the answer instead gives O(n·d), which is exponential in the number of bits of the output.
The idea is to binary search on what your m would be, and then greedily move forward on the array, adding the current element to the partition unless adding the current element pushes it over the current m, in which case you start a new partition. The current m is a success (and thus you adjust your upper bound) if the number of partitions used is less than or equal to your given input k. Otherwise, you used too many partitions, and raise your lower bound on m.
Some pseudocode:
// binary search
binary_search ( array, N, k ) {
    lower = max( array ), upper = sum( array )
    while lower < upper {
        mid = ( lower + upper ) / 2
        // if the greedy partitioning succeeds with m = mid
        if partitions( array, mid ) <= k
            upper = mid
        else
            lower = mid + 1
    }
    return lower
}

partitions( array, m ) {
    count = 0
    running_sum = 0
    for x in array {
        if running_sum + x > m {
            running_sum = 0
            count++
        }
        running_sum += x
    }
    if running_sum > 0
        count++
    return count
}
This should be easier to come up with conceptually. Also note that because of the monotonic nature of the partitions function, you can actually skip the binary search and do a linear search, if you are sure that the output d is not too big:
for i = 0 to infinity
    if partitions( array, i ) <= k
        return i
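For reference, a runnable Python version of the binary-search variant, checked against the question's example (the function names are mine):

    def partitions(arr, m):
        # Greedy: extend the current part while its sum stays <= m.
        count, running = 0, 0
        for x in arr:
            if running + x > m:
                running = 0
                count += 1
            running += x
        return count + (1 if running > 0 else 0)

    def min_max_sum(arr, k):
        lo, hi = max(arr), sum(arr)
        while lo < hi:
            mid = (lo + hi) // 2
            if partitions(arr, mid) <= k:
                hi = mid          # mid is feasible; try smaller
            else:
                lo = mid + 1      # too many parts; m must grow
        return lo

    print(min_max_sum([2, 1, 1, 2, 3], 3))  # 3, via 2+1 | 1+2 | 3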
Dynamic programming. Make an array
int best[k+1][n+1];
where best[i][j] is the best you can achieve splitting the first j elements of the array into i subarrays. best[1][j] is simply the sum of the first j array elements. Having row i, you calculate row i+1 as follows:
for(j = i+1; j <= n; ++j){
    temp = max(best[i][i], arraysum[i+1 .. j]);
    for(h = i+1; h < j; ++h){
        if (max(best[i][h], arraysum[h+1 .. j]) < temp){
            temp = max(best[i][h], arraysum[h+1 .. j]);
        }
    }
    best[i+1][j] = temp;
}
best[k][n] will contain the solution. The algorithm is O(n²·k); probably something better is possible.
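A runnable Python sketch of this DP, using prefix sums for the arraysum[i+1 .. j] terms (all names are my own):

    def min_max_partition(arr, k):
        n = len(arr)
        prefix = [0] * (n + 1)          # prefix[j] = arr[0] + ... + arr[j-1]
        for idx, x in enumerate(arr):
            prefix[idx + 1] = prefix[idx] + x

        INF = float("inf")
        # best[i][j]: minimal largest-part sum splitting the first j
        # elements into i contiguous parts.
        best = [[INF] * (n + 1) for _ in range(k + 1)]
        best[0][0] = 0
        for i in range(1, k + 1):
            for j in range(i, n + 1):
                for h in range(i - 1, j):
                    best[i][j] = min(best[i][j],
                                     max(best[i - 1][h], prefix[j] - prefix[h]))
        return best[k][n]

    print(min_max_partition([2, 1, 1, 2, 3], 3))  # 3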
Edit: a combination of the ideas of ChingPing, toto2, Coffee on Mars and rds (in the order they appear as I currently see this page).
Set A = ceiling(sum/k). This is a lower bound for the minimum. To find a good upper bound for the minimum, create a good partition by any of the mentioned methods, moving borders until you don't find any simple move that still decreases the maximum subsum. That gives you an upper bound B, not much larger than the lower bound (if it were much larger, you'd find an easy improvement by moving a border, I think).
Now proceed with ChingPing's algorithm, with the known upper bound reducing the number of possible branches. This last phase is O((B-A)*n), with B-A unknown in advance, but I guess better than O(n^2).
I have a sucky branch and bound algorithm (please don't downvote me).
First take the sum of the array and divide it by k, which gives you the best-case bound for your answer, i.e. the average A. Also, we will keep the best solution seen so far for any branch, GO (global optimal). Let's say we put a (logical) divider as a partition boundary after some array element, and we have to place k-1 such dividers. Now we place the dividers greedily this way:
Traverse the array elements, summing them up, until you see that at the next position we would exceed A. Now make two branches, one where you put the divider at this position and the other where you put it at the next position. Do this recursively and set GO = min(GO, answer for a branch).
If at any point in any branch we have a partition greater than GO, or the number of positions left is less than the number of partitions left to be placed, we bound. In the end you should have GO as your answer.
EDIT:
As suggested by Daniel, we could modify the divider-placing strategy a little, placing a divider once you reach a sum of elements equal to A, or when the remaining positions are fewer than the remaining dividers.
This is just a sketch of an idea... I'm not sure that it works, but it's very easy (and probably fast too).
You start say by putting the separations evenly distributed (it does not actually matter how you start).
Compute the sum of each subarray.
Find the subarray with the largest sum.
Look at the right and left neighbor subarrays and move the separation on the left by one if the subarray on the left has a lower sum than the one on the right (and vice-versa).
Redo for the subarray with the current largest sum.
You'll reach some situation where you'll keep bouncing the separation between the same two positions which will probably mean that you have the solution.
EDIT: see the comment by @rds. You'll have to think harder about bouncing solutions and the end condition.
My idea, which unfortunately does not work:
Split the array into N subarrays
Locate the two contiguous subarrays whose sum is the least
Merge the subarrays found in step 2 to form a new contiguous subarray
If the total number of subarrays is greater than k, iterate from step 2, else finish.
If your array has random numbers, you can hope that a partition where each subarray has n/k elements is a good starting point.
From there
Evaluate this candidate solution, by computing the sums
Store this candidate solution. For instance with:
an array of the indices of every sub-array
the corresponding maximum of the sums over the sub-arrays
Reduce the size of the max sub-array: create two new candidates, one with the sub-array starting at index+1, and one with the sub-array ending at index-1.
Evaluate the new candidates.
If their maximum is higher, discard them.
If their maximum is lower, iterate from step 2, unless this candidate was already evaluated, in which case it is the solution.

Find 3 numbers in 3 sorted arrays with sum equal to some value [duplicate]

Assume we have three arrays of length N which contain arbitrary numbers of type long. Then we are given a number M (of the same type) and our mission is to pick three numbers A, B and C one from each array (in other words A should be picked from first array, B from second one and C from third) so the sum A + B + C = M.
Question: could we pick all three numbers and end up with time complexity of O(N²)?
Illustration:
Arrays are:
1) 6 5 8 3 9 2
2) 1 9 0 4 6 4
3) 7 8 1 5 4 3
And M we've been given is 19.
Then our choice would be 8 from first, 4 from second and 7 from third.
This can be done in O(1) space and O(N²) time.
First let's solve a simpler problem: given two arrays A and B, pick one element from each so that their sum is equal to a given number K.
Sort both arrays, which takes O(N log N).
Take pointers i and j so that i points to the start of the array A and j points to the end of B.
Find the sum A[i] + B[j] and compare it with K
if A[i] + B[j] == K, we have found the pair A[i] and B[j].
if A[i] + B[j] < K, we need to increase the sum, so increment i.
if A[i] + B[j] > K, we need to decrease the sum, so decrement j.
This process of finding the pair after sorting takes O(N).
Now let's take the original problem. We've got a third array now; call it C.
So the algorithm now is :
foreach element x in C
    find a pair A[i], B[j] from A and B such that A[i] + B[j] = M - x
end for
The outer loop runs N times, and for each run we do an O(N) operation, making the entire algorithm O(N²).
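A small Python sketch of this approach (it returns the first valid triple it finds; the names are mine):

    def find_triple(A, B, C, M):
        A, B = sorted(A), sorted(B)
        for x in C:
            k, i, j = M - x, 0, len(B) - 1
            # Two-pointer scan for A[i] + B[j] == M - x.
            while i < len(A) and j >= 0:
                s = A[i] + B[j]
                if s == k:
                    return (A[i], B[j], x)
                elif s < k:
                    i += 1
                else:
                    j -= 1
        return None

    # The question's example (any valid triple is acceptable):
    print(find_triple([6, 5, 8, 3, 9, 2], [1, 9, 0, 4, 6, 4], [7, 8, 1, 5, 4, 3], 19))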
You can reduce it to the similar problem with two arrays, which is kinda famous and has a simple O(n) solution (involving iterating from both ends).
Sort all arrays.
Try each number A from the first array once.
Find if the last two arrays can give us numbers B and C, such that B + C = M - A.
Steps 2 and 3 multiplied give us O(n^2) complexity.
The other solutions are already better, but here's my O(n^2) time and O(n) memory solution anyway.
Insert all elements of array C into a hashtable. (time complexity O(n), space O(n))
Take all pairs (a,b), a from A and b from B (time complexity O(n^2)).
For each pair, check if M-(a+b) exists in the hashtable (complexity O(1) expected per query).
So, the overall time complexity is O(n^2), and a space complexity of O(n) for the hashtable.
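The same idea in a few lines of Python (hypothetical names; it hashes C and scans all (a, b) pairs):

    def find_triple_hash(A, B, C, M):
        c_set = set(C)                      # O(n) space, O(1) expected lookups
        for a in A:
            for b in B:
                if M - a - b in c_set:      # O(n^2) pairs in total
                    return (a, b, M - a - b)
        return None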
Hash the last list. The time taken to do this is O(N) on that particular list but this will be added to the next phase.
The next phase is to create a "matrix" of the first two rows of their sums. Then look up in the hash if their matching number is there. Creating the matrix is O(N*N) whilst looking up in the hash is constant time.
1. Store A[i] + B[j] for all pairs (i,j) in another array D, organized in the hash data structure. The complexity of this step is O(N*N).
construct a hash named D
for i = 1 to n
    for j = 1 to n
        insert A[i] + B[j] into D
2. For each C[i] in the array C, find if M - C[i] exists in D. The complexity of this step is O(N).
for i = 1 to n
    check if M - C[i] is in D
I have a solution. Insert all elements from one of the lists into a hash table. This will take O(n) time.
Once that is complete, you form all the pairs from the remaining 2 arrays and check whether M minus their sum is present in the hash table.
Because hash lookup is constant time, we get quadratic time in total.
Using this approach you save the time on sorting.
Another idea: if you know the max size of each element, you can use a variation of bucket sort and do it in O(n log n) time.
At the cost of O(N^2) space, but still using O(N^2) time, one could handle four arrays, by computing all possible sums from the first two arrays, and all possible residues from the last two, sort the lists (possible in linear time since they are all of type 'long', whose number of bits is independent of N), and then seeing if any sum equals any residue.
Sorting all 3 arrays and using binary search seems a better approach. Once the arrays are sorted, one should definitely go for binary search rather than linear search, which takes O(log n) rather than O(n) per lookup.
Hash table is also a viable option.
The combination of hash and sort can bring the time down, but at the cost of O(N²) space.
I have another O(N^2) time complexity, O(N) additional space complexity solution.
First, sort the three arrays; this step is O(N·log(N)). Then, for each element in A, create two arrays V = Ai + B and W = Ai + C (Ai is the current element). Ai + B means that each element of this new array V is the element in that position in B plus Ai (the current element in A). W = Ai + C is similar.
Now, merge V and W, as in merge sort. Since both are sorted, this is O(N). In the merged array with 2·N elements, look for two elements, one from V and one from W, summing to M + Ai (because Ai is counted twice, once in each). With the usual scan from both ends this is O(N).
Therefore, the total complexity is O(N²).
Sort the three arrays. Then initialize three indices:
i pointing to the first element of A,
j pointing to the last element of B, and
k pointing to the first element of C.
While i, j, k are within the limits of their respective arrays A, B, C:
If A[i]+B[j]+C[k] == M, return.
If A[i]+B[j]+C[k] < M, increment i if A[i] <= C[k], otherwise increment k.
If A[i]+B[j]+C[k] > M, decrement j.
This should run in O(n).
How about:
for a in A
    for b in B
        hash a+b, storing the pair (a, b)
for c in C
    if K-c is in hash
        print the stored (a, b) and c
The idea is to hash all possible pair sums from A and B. Then, for every element in C, see if the residual sum is present in the hash.

