Find 3 numbers in 3 sorted arrays with sum equal to some value [duplicate]

Assume we have three arrays of length N which contain arbitrary numbers of type long. Then we are given a number M (of the same type) and our mission is to pick three numbers A, B and C, one from each array (in other words, A should be picked from the first array, B from the second one and C from the third), so that A + B + C = M.
Question: could we pick all three numbers and end up with time complexity of O(N²)?
Illustration:
Arrays are:
1) 6 5 8 3 9 2
2) 1 9 0 4 6 4
3) 7 8 1 5 4 3
And M we've been given is 19.
Then our choice would be 8 from first, 4 from second and 7 from third.

This can be done in O(1) space and O(N²) time.
First let's solve a simpler problem: given two arrays A and B, pick one element from each so that their sum is equal to a given number K.
Sort both arrays, which takes O(N log N).
Take pointers i and j so that i points to the start of the array A and j points to the end of B.
Find the sum A[i] + B[j] and compare it with K:
if A[i] + B[j] == K, we have found the pair A[i] and B[j];
if A[i] + B[j] < K, we need to increase the sum, so increment i;
if A[i] + B[j] > K, we need to decrease the sum, so decrement j.
This process of finding the pair after sorting takes O(N).
Now let's return to the original problem. We've got a third array; call it C.
So the algorithm now is:
foreach element x in C
    find a pair A[i], B[j] from A and B such that A[i] + B[j] = M - x
end for
The outer loop runs N times, and for each run we do an O(N) operation, making the entire algorithm O(N²).
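A minimal Python sketch of this whole approach (function name and return convention are my own):

    def find_triple(A, B, C, M):
        """Return (a, b, c), one element from each array, with a + b + c == M."""
        A, B = sorted(A), sorted(B)            # O(N log N), done once
        for x in C:                            # N iterations ...
            k, i, j = M - x, 0, len(B) - 1
            while i < len(A) and j >= 0:       # ... of an O(N) two-pointer scan
                s = A[i] + B[j]
                if s == k:
                    return A[i], B[j], x
                if s < k:
                    i += 1                     # need a bigger sum
                else:
                    j -= 1                     # need a smaller sum
        return None

    # With the arrays from the question:
    # find_triple([6,5,8,3,9,2], [1,9,0,4,6,4], [7,8,1,5,4,3], 19) -> (3, 9, 7)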

You can reduce it to a similar problem with two arrays, which is well known and has a simple O(n) solution (involving iterating from both ends).
Sort all arrays.
Try each number A from the first array once.
Find if the last two arrays can give us numbers B and C, such that B + C = M - A.
Steps 2 and 3 multiplied give us O(n^2) complexity.

The other solutions are already better, but here's my O(n^2) time and O(n) memory solution anyway.
Insert all elements of array C into a hashtable. (time complexity O(n), space O(n))
Take all pairs (a,b), a from A and b from B (time complexity O(n^2)).
For each pair, check if M-(a+b) exists in the hashtable (complexity O(1) expected per query).
So, the overall time complexity is O(n^2), and a space complexity of O(n) for the hashtable.
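A minimal Python sketch, using a built-in set as the hash table:

    def find_triple_hash(A, B, C, M):
        """O(N^2) expected time, O(N) extra space; no sorting needed."""
        seen = set(C)                  # hash all of C's values, O(N)
        for a in A:                    # enumerate all N^2 pairs (a, b)
            for b in B:
                c = M - (a + b)
                if c in seen:          # O(1) expected lookup
                    return a, b, c
        return None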

Hash the last list. The time taken to do this is O(N) on that particular list but this will be added to the next phase.
The next phase is to create a "matrix" of all pairwise sums from the first two arrays. Then look up in the hash whether each sum's matching number is there. Creating the matrix is O(N*N) whilst each hash lookup is constant time.

1. Store A[i]+B[j] for all pairs (i,j) in another structure D, organized as a hash. The complexity of this step is O(N*N).
    construct a hash named D
    for i = 1 to n
        for j = 1 to n
            insert A[i]+B[j] into D
2. For each C[i] in the array C, find whether M-C[i] exists in D. The complexity of this step is O(N).
    for i = 1 to n
        check if M - C[i] is in D

I have a solution. Insert all elements from one of the lists into a hash table; this takes O(n) time.
Once that is complete, you form all the pairs from the remaining 2 arrays and see if M minus their sum is present in the hash table.
Because hash lookup is constant time, we get quadratic time in total.
Using this approach you save the time on sorting.
Another idea: if you know the maximum size of each element, you can use a variation of bucket sort and do it in O(n log n) time.

At the cost of O(N^2) space, but still using O(N^2) time, one could even handle four arrays: compute all possible sums from the first two arrays and all possible residues (M minus the sum) from the last two, sort both lists (possible in linear time via radix sort, since the elements are of type 'long', whose number of bits is independent of N), and then scan to see if any sum equals any residue.
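A Python sketch of the four-array idea (using the built-in comparison sort for brevity, which makes the sorting step O(N² log N) rather than the linear radix sort mentioned above):

    def find_quad(A, B, C, D, M):
        """True iff some a + b + c + d == M, one element per array.
        O(N^2) space for the two lists of pair sums."""
        sums = sorted(a + b for a in A for b in B)            # all a + b
        residues = sorted(M - (c + d) for c in C for d in D)  # all M - (c + d)
        i = j = 0
        while i < len(sums) and j < len(residues):            # merge-style scan
            if sums[i] == residues[j]:
                return True            # a + b == M - (c + d)
            if sums[i] < residues[j]:
                i += 1
            else:
                j += 1
        return False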

Sorting all 3 arrays and using binary search seems a better approach. Once the arrays are sorted, one should definitely go for binary search rather than linear search, which takes O(log n) rather than O(n) per query.
Hash table is also a viable option.
The combination of hash and sort can bring the time down, but at the cost of O(N²) space.
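One way to read this suggestion (my interpretation): sort the third array once, then binary-search the complement of every pair, which is O(N² log N) overall, a log factor above the two-pointer solution:

    import bisect

    def find_triple_bsearch(A, B, C, M):
        """Binary-search M - a - b in sorted C for every pair (a, b)."""
        C = sorted(C)                          # O(N log N)
        for a in A:                            # N^2 pairs ...
            for b in B:
                k = M - a - b
                pos = bisect.bisect_left(C, k) # ... each with an O(log N) search
                if pos < len(C) and C[pos] == k:
                    return a, b, k
        return None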

I have another O(N^2) time complexity, O(N) additional space complexity solution.
First, sort the three arrays; this step is O(N*log(N)). Then, for each element Ai in A, create two arrays V = Ai + B and W = Ai + C. V = Ai + B means that each element of this new array V is the element in that position in B plus Ai (the current element in A); W = Ai + C is similar.
Now, merge V and W, as in merge sort. Since both are sorted, this is O(N). In this new array with 2*N elements, search for M + Ai (because Ai is used twice). This can be done in O(log n) with binary search.
Therefore, total complexity is O(N^2).

Sort the three arrays. Then initialize three indices:
i pointing to the first element of A,
j pointing to the last element of B and
k pointing to the first element of C.
While i, j, k are within the limits of their respective arrays A, B, C:
If A[i]+B[j]+C[k] == M, return.
If A[i]+B[j]+C[k] < M, increment i if A[i] <= C[k], otherwise increment k.
If A[i]+B[j]+C[k] > M, decrement j.
This should run in O(n).

How about:
for a in A
    for b in B
        store a + b -> (a, b) in hash
for c in C
    if M - c is in hash
        print hash[M - c], c
The idea is to hash all possible pair sums from A and B. Then, for every element in C, see if the residual sum is present in the hash.

Related

The kth smallest number in two arrays, one sorted the other unsorted

There is already an answer for two sorted arrays. However, in my question one of the arrays is unsorted.
Suppose X[1..n] and Y[1..m] where n < m. X is sorted and Y is unsorted. What is an efficient algorithm to find the kth smallest number of X ∪ Y?
A MinHeap can be used to find the kth smallest number in an unsorted array. However, here one of the arrays is sorted. I can think of:
1. Build a `MinHeap` for `Y`
2. i = 1, j = 1
3. x1 = extract min from Y
4. x2 = X[i]
5. if j == k: return min(x1, x2)
6. if x1 < x2: j++; goto 3
7. else: j++; i++; goto 4
Is it efficient and correct?
There is no way around scanning all of Y. That takes O(m), so you can't do better than O(m).
However, quickselect has average performance O(m). Basically that algorithm is just a quicksort, except that you ignore every partition that can't contain your final answer.
Given that n < m we can simply join one array to the other and do quickselect.
Note that the average performance is good, but the worst case performance is quadratic. To fix that, if you are not making progress sufficiently quickly, you can switch over to the median-of-medians algorithm, which gives selection guaranteed linear performance (albeit with bad constants). If you're not familiar with it, that's the one where you divide the array into groups of 5, find the median of each group, and then recurse on those medians until you are down to 1 element. Then use that element as the pivot for the whole array.
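A compact Python sketch of the quickselect idea (with a random pivot rather than the median-of-medians safeguard):

    import random

    def quickselect(items, k):
        """k-th smallest (1-based) element; O(len(items)) on average."""
        pivot = random.choice(items)
        lows = [x for x in items if x < pivot]
        pivots = [x for x in items if x == pivot]
        highs = [x for x in items if x > pivot]
        if k <= len(lows):                  # answer lies in the low partition
            return quickselect(lows, k)
        if k <= len(lows) + len(pivots):    # answer is the pivot value itself
            return pivot
        return quickselect(highs, k - len(lows) - len(pivots))

    def kth_smallest(X, Y, k):
        # Joining the arrays ignores X's sortedness, but stays within
        # the O(m) average bound discussed above.
        return quickselect(list(X) + list(Y), k)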
Make a max-heap of the smallest k items from the sorted array (X). That takes O(k) time.
For each item in the unsorted array (Y), if it's smaller than the largest item in the heap (the root), then remove the root from the heap and add the new item. Worst case for this is O(m log k).
When you're done, the kth smallest number will be at the top of the heap.
Whereas the worst case for the second part is O(m log k), average case is much better because typically a small percentage of items have to be inserted in the heap.
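A sketch of this heap approach in Python (heapq is a min-heap, so the max-heap is simulated by negating values):

    import heapq

    def kth_smallest(X, Y, k):
        """X sorted, Y unsorted; k-th smallest (1-based) of the union."""
        heap = [-x for x in X[:k]]     # the k smallest of X are its first k
        heapq.heapify(heap)            # O(k)
        for y in Y:                    # O(m log k) worst case
            if len(heap) < k:
                heapq.heappush(heap, -y)
            elif y < -heap[0]:         # smaller than the current k-th smallest
                heapq.heapreplace(heap, -y)
        return -heap[0]                # heap root is the k-th smallest overall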

Given two arrays A & B, find the first numbers in both arrays such that swapping them makes the sum of array A equal to the sum of array B

Let's consider
A = [1,7,9,10], sumA = 27
B = [0,10,9,6], sumB = 25
Find the first elements (if present) from A and B such that swapping A[i] and B[j] gives sumA = sumB.
Here, if we swap the 1 from A with the 0 from B,
sumA = sumB = 26.
I know the brute force O(n²) solution. But a better solution, like O(n), is needed.
Thanks.
Yes, there is an O(n) solution that involves hashing the values of one of the arrays; since for each element, a, in array A there is only one possible element, b, in array B that would solve the question:
sumA - a + b = sumB - b + a
2*b = sumB - sumA + 2*a
b = (sumB - sumA) / 2 + a
Hash the values of B in O(n) and for each element, a, in A, determine in O(1) if the value (sumB - sumA) / 2 + a exists in B.
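A Python sketch of this (function name is my own; it returns indices of the first match):

    def find_swap(A, B):
        """Indices (i, j) such that swapping A[i] and B[j] equalizes the sums."""
        diff = sum(B) - sum(A)
        if diff % 2 != 0:
            return None                   # an odd gap can never be closed by one swap
        first_pos = {}                    # value -> first index in B, built in O(n)
        for j, b in enumerate(B):
            first_pos.setdefault(b, j)
        for i, a in enumerate(A):
            b = diff // 2 + a             # the only value that can work, derived above
            if b in first_pos:
                return i, first_pos[b]
        return None

    # find_swap([1,7,9,10], [0,10,9,6]) -> (0, 0): swap the 1 and the 0.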
O(n log n) complexity, O(n) memory solution:
Given the initial arrays A and B, sumA-sumB is twice the difference of the swapped elements. If sumA-sumB is odd, there is no solution. If sumA-sumB is even (in the example, it is equal to 2), we may proceed.
Now sort both arrays (if you can't do it in-place, construct sorted copies somewhere nearby, hence O(n) memory) in O(n log n) time. Once the arrays are sorted, you can scan them simultaneously in O(n) time, maintaining the invariant that the iterator into sortedB always points to an element at least (sumA-sumB)/2 less than the element under the sortedA iterator. Drive the sortedA iterator forward, bumping the sortedB iterator as needed; the traversal somewhat resembles merge sort. Once you find values in the two arrays that differ by exactly (sumA-sumB)/2, stop the traversal, fall back to the original arrays and find these values (and their indices): this is your solution. If the traversal does not find an exact match, there is no solution.
If the initial arrays are sorted, the complexity is reduced to O(n) time, O(1) memory.
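A sketch of that scan, assuming (as in the last remark) that the arrays are already sorted:

    def find_swap_sorted(A, B):
        """A, B sorted; find a in A, b in B with a - b == (sumA - sumB) / 2.
        O(n) time, O(1) extra space."""
        target = sum(A) - sum(B)          # equals 2 * (a - b) for a valid swap
        if target % 2 != 0:
            return None
        target //= 2
        i = j = 0
        while i < len(A) and j < len(B):
            d = A[i] - B[j]
            if d == target:
                return A[i], B[j]         # map back to original indices if needed
            if d < target:
                i += 1                    # need a larger difference
            else:
                j += 1                    # need a smaller one
        return None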

Find three elements in a sorted array which sum to a fourth element

A friend of mine recently got this interview question, which seems to us to be solvable but not within the asymptotic time bounds that the interviewer thought should be possible. Here is the problem:
You have an array of N integers, xs, sorted but possibly non-distinct. Your goal is to find four array indices(1) (a,b,c,d) such that the following two properties hold:
xs[a] + xs[b] + xs[c] = xs[d]
a < b < c < d
The goal is to do this in O(N²) time.
First, an O(N³ log N) solution is obvious: for each ordered triple (a,b,c), use binary search to see if an appropriate d can be found. Now, how to do better?
One interesting suggestion from the interviewer is to rewrite the first condition as:
xs[a] + xs[b] = xs[d] - xs[c]
It's not clear what to do after this, but perhaps we could choose some pivot value P, and search for an (a,b) pair adding up to P, and a (d,c) pair subtracting to it. That search is easy enough to do in O(n) time for a given P, by searching inwards from both ends of the array. However, it seems to me that the problem with this is that there are N² such values P, not just N of them, so we haven't actually reduced the problem size at all: we're doing O(N) work, O(N²) times.
We found some related problems being discussed online elsewhere: Find 3 numbers in an array adding to a given sum is solvable in N² time, but requires that the sum be fixed ahead of time; adapting the same algorithm but iterating through each possible sum leaves us at N³ as always.
Another related problem seems to be Find all triplets in array with sum less than or equal to given sum, but I'm not sure how much of the stuff there is relevant here: an inequality rather than an equality mixes things up quite a bit, and of course the target is fixed rather than varying.
So, what are we missing? Is the problem impossible after all, given the performance requirements? Or is there a clever algorithm we're unable to spot?
(1) Actually the problem as posed is to find all such (a,b,c,d) tuples, and return a count of how many there are. But I think even finding a single one of them in the required time constraints is hard enough.
If the algorithm would have to list the solutions (i.e. the sets of a, b, c, and d that satisfy the condition), the worst case time complexity is O(n⁴):
1. There can be O(n⁴) solutions
The trivial example is an array with only 0 values in it. Then a, b, c and d have all the freedom as long as they stay in order. This represents O(n⁴) solutions.
But more generally, arrays which follow the following pattern have O(n⁴) solutions:
w, w, w, ... x, x, x, ..., y, y, y, ..., z, z, z, ...
With just as many occurrences of each, and:
w + x + y = z
However, to only produce the number of solutions, an algorithm can have a better time complexity.
2. Algorithm
This is a slight variation of the already posted algorithm, which does not involve the H factor. It also describes how to handle cases where different configurations lead to the same sums.
Retrieve all pairs and store them in an array X, where each element gets the following information:
a: the smallest index of the two
b: the other index
sum: the value of xs[a] + xs[b]
At the same time also store for each such pair in another array Y, the following:
c: the smallest index of the two
d: the other index
sum: the value of xs[d] - xs[c]
The above operation has a time complexity of O(n²)
Sort both arrays by their elements' sum attribute. In case of equal sum values, the sort order will be determined as follows: for the X array by increasing b; for the Y array by decreasing c. Sorting can be done in O(n²logn) time.
[Edit: I could not prove the earlier claim of O(n²) (unless some assumptions are made that allow for a radix/bucket sorting algorithm, which I will not assume). As noted in comments, in general an array with n² elements can be sorted in O(n²logn²), which is O(n²logn), but not O(n²).]
Go through both arrays in "tandem" to find pairs of sums that are equal. If that is the case, it needs to be checked that X[i].b < Y[j].c. If so it represents a solution. But there could be many of them, and counting those in an acceptable time needs special care.
Let m = n(n-1)/2, i.e. the number of elements in array X (which is also the size of array Y):
count = 0
i = 0
j = 0
while i < m and j < m:
    if X[i].sum < Y[j].sum:
        i = i + 1
    elif X[i].sum > Y[j].sum:
        j = j + 1
    else:
        # We have a solution. Need to count all others that have same sums in X and Y.
        # Walk to the last match in Y, then set k as an index to it:
        countY = 0
        while j < m and X[i].sum == Y[j].sum and X[i].b < Y[j].c:
            countY = countY + 1
            j = j + 1
        k = j - 1
        # add chunks to `count`:
        while i < m and countY >= 0 and X[i].sum == Y[k].sum:
            while countY >= 0 and X[i].b >= Y[k].c:
                countY = countY - 1
                k = k - 1
            count = count + countY
            i = i + 1
Note that although there are nested loops, the variable i only ever increments, and so does j. The variable k always decrements in the innermost loop. Although it also gets higher values to start from, it can never address the same Y element more than a constant number of times via the k index, because while decrementing this index, it stays within the "same sum" range of Y.
So this means that this last part of the algorithm runs in O(m), which is O(n²). As my latest edit confirmed that the sorting step is not O(n²), that step determines the overall time complexity: O(n²logn).
So one solution can be:
List all x[a] + x[b] values possible such that a < b, and hash them in this fashion:
key = (x[a]+x[b]) and value = (a,b).
Complexity of this step: O(n^2).
Now list all x[d] - x[c] values possible such that d > c. For each x[d] - x[c], query your hash map for that value. We have a solution if there exists an entry such that c > b for any hit.
Complexity of this step: O(n^2) * H,
where H is the search time in your hashmap.
Total complexity: O(n^2) * H. Now H may be O(1); this could be the case if the range of values in the array is small. The choice of hash function would depend on the properties of the elements in the array.
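A Python sketch of this pairing idea. One detail the prose glosses over: for each sum it suffices to remember the pair with the smallest b, since any later hit that works with some b also works with that smaller one (names here are my own):

    def find_quadruple(xs):
        """Indices a < b < c < d with xs[a] + xs[b] == xs[d] - xs[c], or None."""
        n = len(xs)
        best = {}                            # pair sum -> (a, b) with smallest b
        for a in range(n):
            for b in range(a + 1, n):
                s = xs[a] + xs[b]
                if s not in best or b < best[s][1]:
                    best[s] = (a, b)
        for c in range(n):                   # all (c, d) with c < d
            for d in range(c + 1, n):
                hit = best.get(xs[d] - xs[c])
                if hit is not None and hit[1] < c:   # enforce b < c
                    return hit[0], hit[1], c, d
        return None

    # find_quadruple([0, 0, 0, 0]) -> (0, 1, 2, 3)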

Minimum set of numbers that sum to least K

Given a list of n objects, write a function that outputs the minimum set of numbers that sum to at least K. FOLLOW UP: can you beat O(n ln n)?
The minimum set could be a set with just 1 element. Don't we just have to traverse the array and find an element that is >= K?
Otherwise, for O(n lg n), I understand we have to first sort the array, and then we can find pairs or triplets whose sum is >= K.
What if we don't find such a combination and have to go for bigger sets? Won't this problem be the same as the N-sum problem?
Here's a linear algorithm that uses linear-time median finding as a subroutine:
Findsum(A, K) {
    Let n be the length of A.
    Let M be the median element of A, found in linear time.
    Let L be the elements of A less than M.
    Let U be the elements of A greater than M.
    Let E be the elements of A equal to M.
    If the sum of the elements in U is at least K,
        Return Findsum(U, K).
    Else, if the sum of the elements in U and E is at least K,
        Return U together with enough elements of E that the sum is at least K.
    Else,
        Return Findsum(L, K - sum(U) - sum(E)).
}
Each recursive call is done on a list at most half the size of A and all other steps take at most linear time, so this algorithm takes linear time overall.
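A Python rendering of Findsum. For brevity it uses statistics.median_low, which sorts internally, so this sketch is O(n log n) rather than linear; a true linear-time median-of-medians routine would restore the linear bound:

    import statistics

    def find_sum(A, K):
        """Smallest set of values from A summing to at least K.
        Assumes sum(A) >= K."""
        if K <= 0:
            return []
        M = statistics.median_low(A)       # always an actual element of A
        U = [x for x in A if x > M]
        E = [x for x in A if x == M]
        L = [x for x in A if x < M]
        if sum(U) >= K:
            return find_sum(U, K)          # U has at most half the elements
        if sum(U) + sum(E) >= K:
            picked, total = list(U), sum(U)
            for e in E:                    # add just enough copies of M
                if total >= K:
                    break
                picked.append(e)
                total += e
            return picked
        return U + E + find_sum(L, K - sum(U) - sum(E))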
This is very different from the N Sum problem because it requires the set add up to at least K instead of exactly K.
It can be done in O(n ln n) by sorting the list and progressing from the maximum element until the sum is greater than K. It can be optimized by scanning the list first to eliminate the case where a single number > K and the case where the sum of all members < K. You could also get the average value of the list, and, sometimes, only sort the "upper" half of the list. These optimizations don't improve the O(n ln n) time, though.
The sorting can be done using an index array (or list of integers), so the original values or objects don't need to be moved.
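For comparison, the O(n ln n) sort-based version is a few lines (here sorting values directly rather than through an index array):

    def min_set_sorted(nums, K):
        """Greedily take the largest values until the running sum reaches K."""
        picked, total = [], 0
        for x in sorted(nums, reverse=True):   # O(n log n)
            if total >= K:
                break
            picked.append(x)
            total += x
        return picked if total >= K else None  # None: all of nums sums to < K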
