Min-prefix-array using divide and conquer - arrays

I am struggling with a divide-and-conquer problem as I can't quite wrap my head around it.
Let's say we have some array X[1:n] of real numbers. How would we go about finding the min-prefix X[1:k], where 1 ≤ k ≤ n and the prefix product is defined as X[1]×X[2]×...×X[k]?
My approach so far has been:
function min_prefix(array[1:n], n)
begin
    if n == 1 then
        return 1, array[1], array[1]
    endif
    integer best_k, real b_total, total = min_prefix(array[1:n-1], n-1)
    new_total = total * array[n]
    if new_total < b_total then
        return n, new_total, new_total
    endif
    return best_k, b_total, new_total
end
I don't think this is a valid divide-and-conquer solution, as I still have to iterate over every element in the array.
Edit:
The best example I could think of:
Consider the array {-1,2,2,2}: the prefix products are -1, -2, -4, -8, so the min-prefix is k=4, with product -8.
However, if we then consider the array {-1,2,-2,2}, the min-prefix is k=2, as X[1]×X[2] = -2; multiplying in the 3rd element onwards would only make the product larger.

The algorithm to find the "minimal prefix product" is, basically, to calculate all possible prefixes and find the minimum among them. This can be done in linear time, and not faster.
Pseudocode:
min_pref_l = 1
min_pref_v = arr[0]
pref_v = arr[0]
for i in 1 until arr.length:
    pref_v *= arr[i]
    if pref_v < min_pref_v:
        min_pref_v = pref_v
        min_pref_l = i + 1
return min_pref_v, min_pref_l
The strangest part of the question is the "divide and conquer" requirement. If you squint hard enough at this algorithm, you can probably call it "divide and conquer", since calculating the prefix of length i uses the previously calculated prefix of length i-1.
To illustrate that, the algorithm could be rewritten as a recursive function:
# min_prefix returns a tuple of three values:
# First two define the minimal prefix among lengths 1..i+1, as the pair (length, value)
# Third is the product of the prefix of length i+1
fun min_prefix(i: int) -> (int, int, int):
    if i == 0:
        return 1, arr[0], arr[0]
    prev_min_l, prev_min_v, prev_v = min_prefix(i - 1)
    v = prev_v * arr[i]
    if v < prev_min_v:
        return i + 1, v, v
    else:
        return prev_min_l, prev_min_v, v

# program result
return min_prefix(arr.length - 1)
Note:
in the recursive variant the space complexity went from O(1) to O(n); the function can be rewritten to be tail-recursive to avoid that
corner cases, such as an empty array and product overflow, were intentionally not considered, to simplify the code
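For reference, here is a runnable Python version of the iterative scan above (the function name is mine):

```python
def min_prefix(arr):
    """Return (min_prefix_value, min_prefix_length) for a non-empty array."""
    min_pref_v = pref_v = arr[0]
    min_pref_l = 1
    for i in range(1, len(arr)):
        pref_v *= arr[i]  # product of the prefix of length i + 1
        if pref_v < min_pref_v:
            min_pref_v = pref_v
            min_pref_l = i + 1
    return min_pref_v, min_pref_l
```

On the question's second example, min_prefix([-1, 2, -2, 2]) reports the length-2 prefix with product -2.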

Related

Find the element occurring once in an array where all other elements occur twice (without using XOR)

I have been trying to solve this for a long time, but I can't seem to manage it.
The question is as follows:
Given an array of n numbers where all of the numbers in it occur twice except for one, which occurs only once, find the number that occurs only once.
Now, I have found many solutions online for this, but none of them satisfy the additional constraints of the question.
The solution should:
Run in linear time (aka O(n)).
Not use hash tables.
Assume that the computer supports only comparison and arithmetic (addition, subtraction, multiplication, division).
The number of bits in each number in the array is about O(log(n)).
Therefore, trying something like this https://stackoverflow.com/a/4772568/7774315 using the XOR operator isn't possible, since we don't have the XOR operator. Since the number of bits in each number is about O(log(n)), trying to implement the XOR operator using normal arithmetic (bit by bit) will take about O(log(n)) operations, which gives an overall solution of O(n log n).
The closest I have come to solving it: if I had a way to get the sum of all unique values in the array in linear time, I could subtract twice that sum from the overall sum to get (the negative of) the element that occurs only once, because if the numbers that appear twice are {a1,a2,...,ak} and the number that appears once is x, then the overall sum is
sum=2(a1+...+ak)+x
As far as I know, sets are implemented using hash tables, so using them to find the sum of all unique values is no good.
Let's imagine we had a way to find the exact median in linear time and partition the array so all greater elements are on one side and smaller elements on the other. By the parity of expected number of elements, we could identify which side the target element is in. Now perform this routine recursively in the section we identified. Since the section is halved in size each time, the total number of elements traversed cannot exceed O(2n) = O(n).
The key element in the question seems to be this one:
The number of bits in each number in the array is about O(log(n)).
The issue is that this clue is a little bit vague.
A first approach is to consider that the maximum value is O(n). Then a counting sort can be performed in O(n) operations and O(n) memory.
It consists of finding the maximum value MAX, setting up an integer array C[MAX], and directly performing a classical counting sort with it:
C[a[i]]++;
Looking for an odd count in array C[] will provide the solution.
A second approach, I guess more efficient, would be to set up an array of size n, each element consisting of an array of unknown size. Then, a kind of almost-counting sort would consist of:
C[a[i]%n].append (a[i]);
To find the unique element, we then have to find a sub-array of odd size, and then to examine the elements in this sub-array.
The maximum size k of each sub-array will be about 2*(MAX/n). According to the clue, this value should be very low. Dealing with such a sub-array has complexity O(k), for example by performing a counting sort on the values b[j]/n, all the elements b[j] of the sub-array being equal modulo n.
We can note that, practically, this is equivalent to performing a kind of ad-hoc hashing.
Global complexity is O(n + MAX/n).
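A sketch of this bucketing idea in Python (the function name is mine, and the inner scan assumes each bucket stays small, per the clue). A pair always lands in the same bucket, so exactly one bucket ends up with an odd number of elements:

```python
def find_unique(a):
    """Find the value occurring an odd number of times, via mod-n buckets."""
    n = len(a)
    C = [[] for _ in range(n)]
    for v in a:
        C[v % n].append(v)   # pairs of equal values share a bucket
    for bucket in C:
        if len(bucket) % 2:  # the singleton hides in the odd-sized bucket
            for v in bucket:
                if bucket.count(v) % 2:
                    return v
```

The scan inside the odd bucket is quadratic in the bucket size k, which is acceptable here since k is about 2*(MAX/n).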
This should do the trick as long as you're dealing with integers of size O(log n). It is a Python implementation of the algorithm sketched in גלעד ברקן's answer (including OneLyner's comments), where the median is replaced by a mean or mid-value.
def mean(items):
    # running mean, to avoid building a large intermediate sum
    result = 0
    for i, item in enumerate(items, 1):
        result = (result * (i - 1) + item) / i
    return result

def midval(items):
    min_val = max_val = items[0]
    for item in items:
        if item < min_val:
            min_val = item
        elif item > max_val:
            max_val = item
    return (max_val + min_val) / 2
def find_singleton(items, pivoting=mean):
    n = len(items)
    if n == 1:
        return items[0]
    else:
        # find pivot - O(n)
        pivot = pivoting(items)
        # partition the items - O(n)
        j = 0
        for i, item in enumerate(items):
            if item > pivot:
                items[j], items[i] = items[i], items[j]
                j += 1
        # recursion on the partition with odd number of elements
        if j % 2:
            return find_singleton(items[:j])
        else:
            return find_singleton(items[j:])
The following code is just for some sanity-checking on random inputs:
import random

def gen_input(n, randomize=True):
    """Generate an odd-sized input where every value is paired except one."""
    items = sorted(set(random.randint(-n, n) for _ in range(n)))[:n]
    singleton = items[-1]
    items = items + items[:-1]
    if randomize:
        random.shuffle(items)
    return items, singleton
items, singleton = gen_input(100)
print(singleton, len(items), items.index(singleton), items)
print(find_singleton(items, mean))
print(find_singleton(items, midval))
For a symmetric distribution the median and the mean or mid-value coincide.
With the log(n) requirement on the number of bits for the entries, one
can show that any arbitrary sub-sampling cannot be skewed enough to provide more than log(n) recursions.
For example, considering the case of k = log(n) bits with k = 4 and only positive numbers, the worst case is: [0, 1, 1, 2, 2, 4, 4, 8, 8, 16, 16]. Here pivoting by the mean will reduce the input by two elements at a time, resulting in k + 1 recursive calls, but adding any other couple to the input will not increase the number of recursive calls, while it will increase the input size.
(EDITED to provide a better explanation.)
Here is an (unoptimized) implementation of the idea sketched by גלעד ברקן.
I'm using Median_of_medians to get a value close enough to the median to ensure the linear time in the worst case.
NB: this in fact uses only comparisons, and is O(n) whatever the size of the integers as long as comparisons and copies are counted as O(1).
def median_small(L):
    return sorted(L)[len(L) // 2]

def median_of_medians(L):
    if len(L) < 20:
        return median_small(L)
    return median_of_medians([median_small(L[i:i + 5]) for i in range(0, len(L), 5)])

def find_single(L):
    if len(L) == 1:
        return L[0]
    pivot = median_of_medians(L)
    smaller = [i for i in L if i <= pivot]
    bigger = [i for i in L if i > pivot]
    if len(smaller) % 2:
        return find_single(smaller)
    else:
        return find_single(bigger)
This version needs O(n) additional space, but could be implemented with O(1).

Recursive function that returns the number of possible combinations

I had an interview and was asked a question that I'd like to understand the solution.
The Question
Create a recursive function that returns the number of possible combinations of arrays of a given length that could be made from an array of non-repeating consecutive integers.
f(array, length) = Combinations
Example 1
array = [0,1,2,3]
length = 2
Combinations = 10 (all combinations: [0,0] [0,1] [0,2] [0,3] [1,1] [1,2] [1,3] [2,2] [2,3] [3,3])
Note that [0,0] is allowed but [1,0] is not because [0,1] is defined
Example 2
array = [0,1]
length = 3
Combinations = 4 (all combinations: [0,0,0] [0,0,1] [0,1,1] [1,1,1])
One "hint" was offered. The interviewer said the array itself shouldn't matter; the length was all that was needed.
This algorithm can be expressed recursively because the solution can be expressed in terms of solutions for smaller inputs. "Smaller" here has two meanings:
A subset of the array; specifically the sub-array after the current element index
Solutions for smaller length; these can be added together to give the solution for length + 1
Stopping conditions:
When the array size A = 1 - only one combination can be generated
When the length L = 1 - number of combinations = number of elements in array
The fully recursive procedure is a surprisingly simple one-liner:
return [recursive call to rest of array, same length] +
[recursive call to same array, length - 1]
Memoizing the overlapping subproblems turns this into dynamic programming.
Code:
int F(int A, int L)
{
    if (A <= 1) return 1;
    if (L <= 1) return A;
    return F(A - 1, L) + F(A, L - 1);
}
Tests:
F(4, 2) = 10
F(2, 3) = 4
F(3, 5) = 21 (trace it with pen-and-paper to see for yourself)
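The recurrence above can also be memoized directly; here is a Python sketch (a translation of the C code, not the interviewer's reference solution):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def F(A, L):
    """Number of non-decreasing arrays of length L over A distinct values."""
    if A <= 1:
        return 1
    if L <= 1:
        return A
    # either never use one particular value (A - 1 values remain),
    # or use it at least once (one fewer slot to fill)
    return F(A - 1, L) + F(A, L - 1)
```

With memoization the running time drops from exponential to O(A * L) subproblems.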
EDIT: I've given an elegant and simple solution, but I perhaps haven't explained it as well as @RoryDaulton. Consider giving his answer credit too.
You do not give a target language and you do not say just how much help you want. So I'll give the overall idea of an algorithm that should be simple to code if you know recursion in a certain language. Ask if you want more code in Python, my current preferred language.
You know you need to do recursion, and you have two things you could recurse on: the length of the given array or the length of the desired arrays. Let's recurse on the second, and let's say the given array is [0, 1, ..., n-1] since you know that the actual contents are irrelevant.
If the desired length r is 1 you know there are only n desired arrays, namely [0], [1], ..., [n-1]. So there is the base case for your recursion.
If you have a "combination" of length r-1, how can that be expanded to length r and keep the requirements? Look at the last element in the array of length r-1; let's call it k. The next element cannot be less than that, so all the possible arrays extended to length r are the r-1 array appended with k, k+1, ..., n-1. Those are n-k arrays of length r.
Is it clear how to code that? Note that you do not need to keep all the arrays of length r-1, you only need the count of how many arrays there are that end with the element 0 or 1 or ... n-1. That makes it convenient to code--not much memory is needed. In fact, things can be reduced further--I'll leave that to you.
Note that the interviewer probably did not want the code, he wanted your thought-process leading to the code to see the way you think. This is one way to think the problem through.
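That counting idea can be sketched in Python (the function name is mine; `ends[k]` holds how many valid arrays of the current length end in value k):

```python
def count_combinations(n, r):
    """Count non-decreasing arrays of length r over the values 0..n-1."""
    ends = [1] * n  # length-1 arrays: one ending in each value
    for _ in range(r - 1):
        # an array ending in value k can be extended by any value >= k, so the
        # number of longer arrays ending in k is the prefix sum of `ends`
        prefix = 0
        new_ends = []
        for k in range(n):
            prefix += ends[k]
            new_ends.append(prefix)
        ends = new_ends
    return sum(ends)
```

Only O(n) memory is needed, as promised: the arrays themselves are never stored, just the counts.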

Find three elements in a sorted array which sum to a fourth element

A friend of mine recently got this interview question, which seems to us to be solvable but not within the asymptotic time bounds that the interviewer thought should be possible. Here is the problem:
You have an array of N integers, xs, sorted but possibly non-distinct. Your goal is to find four array indices(1) (a,b,c,d) such that the following two properties hold:
xs[a] + xs[b] + xs[c] = xs[d]
a < b < c < d
The goal is to do this in O(N²) time.
First, an O(N³ log N) solution is obvious: for each (a,b,c) ordered triple, use binary search to see if an appropriate d can be found. Now, how to do better?
One interesting suggestion from the interviewer is to rewrite the first condition as:
xs[a] + xs[b] = xs[d] - xs[c]
It's not clear what to do after this, but perhaps we could choose some pivot value P, and search for an (a,b) pair adding up to P, and a (d,c) pair subtracting to it. That search is easy enough to do in O(N) time for a given P, by searching inwards from both ends of the array. However, it seems to me that the problem with this is that there are N² such values P, not just N of them, so we haven't actually reduced the problem size at all: we're doing O(N) work, O(N²) times.
We found some related problems being discussed online elsewhere: Find 3 numbers in an array adding to a given sum is solvable in N² time, but requires that the sum be fixed ahead of time; adapting the same algorithm but iterating through each possible sum leaves us at N³ as always.
Another related problem seems to be Find all triplets in array with sum less than or equal to given sum, but I'm not sure how much of the stuff there is relevant here: an inequality rather than an equality mixes things up quite a bit, and of course the target is fixed rather than varying.
So, what are we missing? Is the problem impossible after all, given the performance requirements? Or is there a clever algorithm we're unable to spot?
(1) Actually the problem as posed is to find all such (a,b,c,d) tuples, and return a count of how many there are. But I think even finding a single one of them in the required time constraints is hard enough.
If the algorithm has to list the solutions (i.e. the sets of a, b, c, and d that satisfy the condition), the worst-case time complexity is O(n⁴):
1. There can be O(n⁴) solutions
The trivial example is an array with only 0 values in it. Then a, b, c and d have all the freedom as long as they stay in order. This represents O(n⁴) solutions.
But more generally, arrays which follow the following pattern have O(n⁴) solutions:
w, w, w, ... x, x, x, ..., y, y, y, ... z, z, z, ....
With just as many occurrences of each, and:
w + x + y = z
However, to only produce the number of solutions, an algorithm can have a better time complexity.
2. Algorithm
This is a slight variation of the already posted algorithm, which does not involve the H factor. It also describes how to handle cases where different configurations lead to the same sums.
Retrieve all pairs and store them in an array X, where each element gets the following information:
a: the smallest index of the two
b: the other index
sum: the value of xs[a] + xs[b]
At the same time also store for each such pair in another array Y, the following:
c: the smallest index of the two
d: the other index
sum: the value of xs[d] - xs[c]
The above operation has a time complexity of O(n²)
Sort both arrays by their elements' sum attribute. In case of equal sum values, the sort order will be determined as follows: for the X array by increasing b; for the Y array by decreasing c. Sorting can be done in O(n²logn) time.
[Edit: I could not prove the earlier claim of O(n²) (unless some assumptions are made that allow for a radix/bucket sorting algorithm, which I will not assume). As noted in comments, in general an array with n² elements can be sorted in O(n²logn²), which is O(n²logn), but not O(n²)]
Go through both arrays in "tandem" to find pairs of sums that are equal. If that is the case, it needs to be checked that X[i].b < Y[j].c. If so it represents a solution. But there could be many of them, and counting those in an acceptable time needs special care.
Let m = n(n-1)/2, i.e. the number of elements in array X (which is also the size of array Y):
count = 0
i = 0
j = 0
while i < m and j < m:
    if X[i].sum < Y[j].sum:
        i = i + 1
    elif X[i].sum > Y[j].sum:
        j = j + 1
    else:
        # We have a solution. Need to count all others that have same sums in X and Y.
        # Walk to the last match in Y, then set k as index to it:
        countY = 0
        while j < m and X[i].sum == Y[j].sum and X[i].b < Y[j].c:
            countY = countY + 1
            j = j + 1
        k = j - 1
        # add chunks to `count`:
        while i < m and countY >= 0 and X[i].sum == Y[k].sum:
            while countY >= 0 and X[i].b >= Y[k].c:
                countY = countY - 1
                k = k - 1
            count = count + countY
            i = i + 1
Note that although there are nested loops, the variable i only ever increments, and so does j. The variable k always decrements in the innermost loop. Although it also gets higher values to start from, it can never address the same Y element more than a constant number of times via the k index, because while decrementing this index, it stays within the "same sum" range of Y.
So this means that this last part of the algorithm runs in O(m), which is O(n²). As my latest edit confirmed that the sorting step is not O(n²), that step determines the overall time-complexity: O(n²logn).
So one solution can be:
List all x[a] + x[b] values possible such that a < b, and hash them in this fashion:
key = (x[a]+x[b]) and value = (a,b).
Complexity of this step: O(n^2)
Now list all x[d] - x[c] values possible such that d > c. For each x[d] - x[c], search for the entry in your hash map by querying. We have a solution if there exists an entry such that c > b for any hit.
Complexity of this step: O(n^2) * H,
where H is the search time in your hashmap.
Total complexity: O(n^2) * H. Now H may be O(1). This could be done if the range of values in the array is small. Also, the choice of hash function would depend on the properties of the elements in the array.
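A Python sketch of this hashing approach (a dict stands in for the hash map, and the function name is mine). Note that the scan over colliding pairs means the O(n²) bound only holds under the answer's assumption that lookups and match lists stay small:

```python
def count_quadruples(xs):
    """Count index quadruples a < b < c < d with xs[a] + xs[b] + xs[c] == xs[d]."""
    n = len(xs)
    # map each pairwise sum xs[a] + xs[b] (a < b) to its list of (a, b) pairs
    sums = {}
    for a in range(n):
        for b in range(a + 1, n):
            sums.setdefault(xs[a] + xs[b], []).append((a, b))
    count = 0
    for c in range(n):
        for d in range(c + 1, n):
            # a solution needs xs[a] + xs[b] == xs[d] - xs[c] with b < c
            for a, b in sums.get(xs[d] - xs[c], ()):
                if b < c:
                    count += 1
    return count
```

For example, on [0, 0, 0, 0] the only valid quadruple is (0, 1, 2, 3).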

Number of ways such that sum of k elements equal to p

Given a series of integers having the relation where a number is equal to the sum of the previous 2 numbers, and the starting integer is 1:
Series -> 1, 2, 3, 5, 8, 13, 21, 34, 55
Find the number of ways such that the sum of k elements equals p. We can use an element any number of times.
p = 8
k = 4
So the number of ways would be 4. Those are:
1,1,1,5
1,1,3,3
1,2,2,3
2,2,2,2
I am able to solve this question through recursion. I sense dynamic programming here, but I am not getting how to do it. Can it be done in less time?
EDIT: I forgot to mention that the sequence of the numbers does not matter and will be counted once. For example, for 3 -> (1,2) and (2,1), the number of ways would be 1 only.
EDIT: Poster has changed the original problem since this was posted. My algorithm still works, but maybe can be improved upon. Original problem had n arbitrary input numbers (he has now modified it to be a Fibonacci series). To apply my algorithm to the modified post, truncate the series by taking only elements less than p (assume there are n of them).
Here's an n^(k/2) algorithm. (n is the number of elements in the series)
Use a table of length p, such that table[i] contains all combinations of k/2 elements that sum to i. For example, in the example data that you provided, table[4] contains {1,3} and {2,2}.
EDIT: If the space is prohibitive, this same algorithm can be done with an ordered linked lists, where you only store the non-empty table entries. The linked list has to be both directions: forward and backwards, which makes the final step of the algorithm cleaner.
Once this table is computed, then we get all solutions by combining every table[j] with every table[p-j], whenever both are non-empty.
To get the table, initialize the entire thing to empty. Then:
For i_1 = 0 to n-1:
    For i_2 = i_1 to n-1:
        ...
            For i_(k/2) = i_(k/2 - 1) to n-1:
                sum = series[i_1] + ... + series[i_(k/2)]
                if sum <= p:
                    store {i_1, i_2, ..., i_(k/2)} in table[sum]
This "variable number of loops" looks impossible to implement, but actually it can be done with an array of length k/2 that keeps track of where each index i_j currently is.
Let's go back to your data and see how our table would look:
table[2] = {1,1}
table[3] = {1,2}
table[4] = {1,3} and {2,2}
table[5] = {2,3}
table[6] = {1,5}
table[7] = {2,5}
table[8] = {3,5}
Solutions are found by combining table[2] with table[6], table[3] with table[5], and table[4] with table[4]. Thus, the solutions are: {1,1,1,5}, {1,2,2,3}, {1,1,3,3}, {2,2,2,2}. (Note that combining table[4] with table[4] also yields {1,3,2,2}, which is the same multiset as {1,2,2,3}, so duplicates arising from different splits have to be removed.)
You can use dynamic programming. Let C(p, k) be the number of ways that the sum of k elements equals p, and let a be the array of elements. Then
C(p, k) = C(p - a[0], k - 1) + C(p - a[1], k - 1) + ... + C(p - a[n-1], k - 1)
Then, you can use memoization to speed up your code. Note that this recurrence counts ordered selections; since the edit says order does not matter, the recursion also has to be restricted so that elements are picked in non-decreasing order.
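A memoized sketch that counts each multiset once, by only allowing elements from index i onward (the series is truncated to terms below p, per the truncation note in the earlier answer; names are mine):

```python
from functools import lru_cache

series = (1, 2, 3, 5)  # series terms less than p = 8

@lru_cache(maxsize=None)
def ways(p, k, i=0):
    """Number of multisets of k elements from series[i:] summing to p."""
    if k == 0:
        return 1 if p == 0 else 0
    if i == len(series) or p <= 0:
        return 0
    # use series[i] once more, or move past it for good
    return ways(p - series[i], k - 1, i) + ways(p, k, i + 1)
```

ways(8, 4) counts exactly the four multisets from the question: {1,1,1,5}, {1,1,3,3}, {1,2,2,3}, {2,2,2,2}.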
Hint:
Your problem is well known: it is the subset-sum problem, a variation of the knapsack problem. Check this pretty good explanation: subset-sum problem

Median of 5 sorted arrays

I am trying to find the solution for median of 5 sorted arrays. This was an interview questions.
The solution I could think of was to merge the 5 arrays and then find the median [O(l+m+n+o+p)].
I know that for 2 sorted arrays of the same size we can do it in log(2n) [by comparing the medians of both arrays, throwing out one half of each array, and repeating the process]. Finding the median can be constant time in sorted arrays, so I think this is not log(n)? What is the time complexity of this approach?
1] Is there a similar solution for 5 arrays? What if the arrays are of the same size; is there a better solution then?
2] I assume that, since this was asked for 5, there would be some solution for N sorted arrays?
Thanks for any pointers.
Some clarification/questions I asked back to the interviewer:
Are the arrays of same length
=> No
I guess there would be an overlap in the values of arrays
=> Yes
As an exercise, I think the logic for 2 arrays doesn't extend. Here is a try:
Applying the above logic of 2 arrays to, say, 3 arrays:
[3,7,9] [4,8,15] [2,3,9] ... medians 7,8,3
throw elements: [3,7,9] [4,8] [3,9] ... medians 7,6,6
throw elements: [3,7] [8] [9] ... medians 5,8,9
throw elements: [7] [8] [9] ... median = 8. This doesn't seem to be correct?
The merge of sorted elements => [2,3,4,7,8,9,15] => expected median = 7
(This is a generalization of your idea for two arrays.)
If you start by looking at the five medians of the five arrays, obviously the overall median must be between the smallest and the largest of the five medians.
Proof goes something like this: If a is the min of the medians, and b is the max of the medians, then each array has less than half of its elements less than a and less than half of its elements greater than b. Result follows.
So in the array containing a, throw away numbers less than a; in the array containing b, throw away numbers greater than b... But only throw away the same number of elements from both arrays.
That is, if a is j elements from the start of its array, and b is k elements from the end of its array, you throw away the first min(j,k) elements from a's array and the last min(j,k) elements from b's array.
Iterate until you are down to 1 or 2 elements total.
Each of these operations (i.e., finding median of a sorted array and throwing away k elements from the start or end of an array) is constant time. So each iteration is constant time.
Each iteration throws away (more than) half the elements from at least one array, and you can only do that log(n) times for each of the five arrays... So the overall algorithm is log(n).
[Update]
As Himadri Choudhury points out in the comments, my solution is incomplete; there are a lot of details and corner cases to worry about. So, to flesh things out a bit...
For each of the five arrays R, define its "lower median" as R[n/2-1] and its "upper median" as R[n/2], where n is the number of elements in the array (and arrays are indexed from 0, and division by 2 rounds down).
Let "a" be the smallest of the lower medians, and "b" be the largest of the upper medians. If there are multiple arrays with the smallest lower median and/or multiple arrays with the largest upper median, choose a and b from different arrays (this is one of those corner cases).
Now, borrowing Himadri's suggestion: Erase all elements up to and including a from its array, and all elements down to and including b from its array, taking care to remove the same number of elements from both arrays. Note that a and b could be in the same array; but if so, they could not have the same value, because otherwise we would have been able to choose one of them from a different array. So it is OK if this step winds up throwing away elements from the start and end of the same array.
Iterate as long as you have three or more arrays. But once you are down to just one or two arrays, you have to change your strategy to be exclusive instead of inclusive; you only erase up to but not including a and down to but not including b. Continue like this as long as both of the remaining one or two arrays have at least three elements (guaranteeing you make progress).
Finally, you will reduce to a few cases, the trickiest of which is two arrays remaining, one of which has one or two elements. Now, if I asked you: "Given a sorted array plus one or two additional elements, find the median of all elements", I think you can do that in constant time. (Again, there are a bunch of details to hammer out, but the basic idea is that adding one or two elements to an array does not "push the median around" very much.)
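For the odd combined size, that constant-time step can be sketched like this (the function name is mine; the even combined size and a second extra element are handled analogously):

```python
def median_sorted_plus_one(A, x):
    """Median of a sorted list A plus one extra element x, in O(1).

    Assumes len(A) is even, so the combined size is odd."""
    j = len(A) // 2  # median index in the combined (odd-sized) list
    if x <= A[j - 1]:
        return A[j - 1]  # x lands in the lower half, pushing A[j-1] to the middle
    if x >= A[j]:
        return A[j]      # x lands in the upper half; A[j] becomes the middle
    return x             # x itself is the middle element
```

Only two comparisons are needed, because inserting one element can shift the median position by at most one.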
It should be pretty straightforward to apply the same idea to 5 arrays.
First, convert the question to a more general one: finding the Kth element in N sorted arrays.
Find the (K/N)th element in each sorted array with binary search, say K1, K2, ..., KN.
Kmin = min(K1 ... KN), Kmax = max(K1 ... KN)
Throw away all elements less than Kmin or larger than Kmax; say X elements have been thrown away.
Now repeat the process by finding the (K - X)th element in the sorted arrays with the remaining elements.
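The same Kth-element generalization can also be realized as a binary search on the value range (a variant sketch, assuming integer elements; `bisect_right` counts how many elements in each array are ≤ a candidate value):

```python
from bisect import bisect_right

def kth_of_arrays(arrays, k):
    """k-th smallest element (0-based) across non-empty sorted integer arrays."""
    lo = min(a[0] for a in arrays)
    hi = max(a[-1] for a in arrays)
    while lo < hi:
        mid = (lo + hi) // 2
        # if at most k elements are <= mid, the answer must be larger than mid
        if sum(bisect_right(a, mid) for a in arrays) <= k:
            lo = mid + 1
        else:
            hi = mid
    return lo
```

On the question's example [3,7,9], [4,8,15], [2,3,9] (merged: [2,3,3,4,7,8,9,9,15]), the median is the element at 0-based index 4, which this search returns as 7.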
You don't need to do a complete merge of the 5 arrays. You can merge until you have (l+m+n+o+p)/2 elements; then you have the median value.
Finding the kth element in a list of sorted lists can be done by binary search.
from bisect import bisect_left
from bisect import bisect_right
def kthOfPiles(givenPiles, k, count):
    '''
    Perform binary search for the kth element in multiple sorted lists

    parameters
    ==========
    givenPiles are list of sorted list
    count is the total number of elements
    k is the target index in range [0..count-1]
    '''
    begins = [0 for pile in givenPiles]
    ends = [len(pile) for pile in givenPiles]
    for pileidx, pivotpile in enumerate(givenPiles):
        while begins[pileidx] < ends[pileidx]:
            mid = (begins[pileidx] + ends[pileidx]) >> 1
            midval = pivotpile[mid]
            smaller_count = 0
            smaller_right_count = 0
            for pile in givenPiles:
                smaller_count += bisect_left(pile, midval)
                smaller_right_count += bisect_right(pile, midval)
            if smaller_count <= k and k < smaller_right_count:
                return midval
            elif smaller_count > k:
                ends[pileidx] = mid
            else:
                begins[pileidx] = mid + 1
    return -1
def medianOfPiles(givenPiles, count=None):
    '''
    Find the statistical median

    Parameters:
    givenPiles are list of sorted list
    '''
    if not givenPiles:
        return -1  # cannot find median
    if count is None:
        count = 0
        for pile in givenPiles:
            count += len(pile)
    # get mid floor
    target_mid = count >> 1
    midval = kthOfPiles(givenPiles, target_mid, count)
    if 0 == (count & 1):
        midval += kthOfPiles(givenPiles, target_mid - 1, count)
        midval /= 2
    return '%.1f' % round(midval, 1)
The code above gives the correct statistical median as well.
Coupling this binary search with patience sort gives a valuable technique.
The median-of-medians algorithm for selecting a pivot is worth mentioning; it gives an approximate value. I guess that is different from what we are asking here.
Use heapq to keep each list's minimum candidates.
Prerequisite: N sorted lists of length K
O(NK lg N)
import heapq

class Solution:
    def f1(self, AS):
        def f(A):
            n = len(A)
            m = n // 2
            if n % 2:
                return A[m]
            else:
                return (A[m - 1] + A[m]) / 2

        res = []
        q = []
        for i, A in enumerate(AS):
            q.append([A[0], i, 0])
        heapq.heapify(q)
        N, K = len(AS), len(AS[0])
        while len(res) < N * K:
            mn, i, ii = heapq.heappop(q)
            res.append(mn)
            if ii < K - 1:
                heapq.heappush(q, [AS[i][ii + 1], i, ii + 1])
        return f(res)

    def f2(self, AS):
        # same heap walk, but stop as soon as the median elements are seen
        q = []
        for i, A in enumerate(AS):
            q.append([A[0], i, 0])
        heapq.heapify(q)
        N, K = len(AS), len(AS[0])
        n = N * K
        m = n // 2
        m1 = m2 = float('-inf')
        k = 0
        while k < n:
            mn, i, ii = heapq.heappop(q)
            k += 1
            if k == m:        # 0-based index m - 1
                m1 = mn
            elif k == m + 1:  # 0-based index m
                m2 = mn
                return m2 if n % 2 else (m1 + m2) / 2
            if ii < K - 1:
                heapq.heappush(q, [AS[i][ii + 1], i, ii + 1])
        return 'should not go here'
