Smallest Lexicographic Subsequence of size k in an Array

Given an array of integers, find the lexicographically smallest subsequence of size k.
Example: array = [3,1,5,3,5,9,2], k = 4
Expected solution: 1 3 5 2

The problem can be solved in O(n) by maintaining a double-ended queue (deque). We iterate over the elements from left to right and ensure that the deque always holds the lexicographically smallest sequence up to that point. We pop an element off the back only if the current element is smaller than it and, after the pop, the elements left in the deque plus the elements still to be processed (including the current one) number at least k.
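#include <deque>
#include <vector>
using namespace std;

vector<int> smallestLexo(vector<int> s, int k) {
    deque<int> dq;
    for (int i = 0; i < (int)s.size(); i++) {
        // Pop larger elements while enough elements remain to still reach size k.
        while (!dq.empty() && s[i] < dq.back() && (dq.size() + (s.size() - i - 1)) >= k) {
            dq.pop_back();
        }
        dq.push_back(s[i]);
    }
    // Note: dq may hold more than k elements here; only the first k form the answer.
    return vector<int> (dq.begin(), dq.end());
}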

Here is a greedy algorithm that should work:
Choose Next Number ( lastChosenIndex, k ) {
    minNum = the smallest number after lastChosenIndex, up to index ArraySize-k
    //Now we know this number is the best possible candidate to be the next number.
    lastChosenIndex = earliest possible occurrence of minNum after lastChosenIndex
    //do the same process for k-1
    Choose Next Number ( lastChosenIndex, k-1 )
}
The algorithm above has high complexity (O(n*k) in the worst case, since each of the k steps scans a range of the array).
But we can pre-sort all the array elements paired with their array index and do the same process greedily using a single loop.
Since we use sorting, the complexity will still be O(n*log(n)).
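The direct (unsorted) version of the greedy choice above can be sketched in Python as follows. This is only an illustration under the assumption that 1 <= k <= len(arr); the function name `smallest_subsequence_greedy` is made up for this example, and it is the O(n*k) variant, not the pre-sorted one.

```python
def smallest_subsequence_greedy(arr, k):
    """Pick the k elements one by one, each time taking the leftmost
    minimum in the window that still leaves enough elements behind it."""
    n = len(arr)
    result = []
    start = 0  # first index that may still be chosen
    for remaining in range(k, 0, -1):
        end = n - remaining  # last index that leaves remaining-1 elements after it
        best = start
        for j in range(start + 1, end + 1):
            if arr[j] < arr[best]:  # strict '<' keeps the earliest occurrence
                best = j
        result.append(arr[best])
        start = best + 1
    return result


print(smallest_subsequence_greedy([3, 1, 5, 3, 5, 9, 2], 4))  # [1, 3, 5, 2]
```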

Ankit Joshi's answer works. However, I think it can be done with just a vector instead of a deque, since all the operations used (push_back, pop_back, back) are available on vector too. Also, in Ankit Joshi's answer the deque can end up containing extra elements; we have to pop those off manually before returning. Add these lines before returning:
while(dq.size() > k)
{
    dq.pop_back();
}
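Putting the two answers together, a minimal Python sketch of the stack-based approach with the final truncation might look like this. A plain list plays the role of the vector; the name `smallest_subsequence` is an assumption made for this example, and k is assumed to satisfy 1 <= k <= len(arr).

```python
def smallest_subsequence(arr, k):
    """Monotonic-stack version: O(n) because every element is pushed once
    and popped at most once."""
    n = len(arr)
    stack = []
    for i, x in enumerate(arr):
        # Pop a larger element only if, after the pop, the stack plus the
        # unprocessed elements (including x) can still supply k elements.
        while stack and x < stack[-1] and len(stack) - 1 + (n - i) >= k:
            stack.pop()
        stack.append(x)
    # The stack may be longer than k (e.g. for increasing input); trim it.
    return stack[:k]


print(smallest_subsequence([3, 1, 5, 3, 5, 9, 2], 4))  # [1, 3, 5, 2]
```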

It can be done with a range minimum query (RMQ) structure in O(n) + O(K log(n)).
Construct the RMQ structure in O(n).
Then build the sequence so that the i-th element is the smallest number in positions [x(i-1)+1, n-(K-i)], for i from 1 to K, where x(0) = 0 and x(i) is the (1-based) position of the i-th chosen element in the given array.
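As a rough illustration of the RMQ idea, the sketch below uses a sparse table. Note the swap: a sparse table takes O(n log n) to build (not the O(n) construction the answer refers to) but answers each query in O(1). All function names are invented for this example, indices are 0-based, and 1 <= k <= len(arr) is assumed.

```python
def build_sparse_table(arr):
    """table[j][i] = index of the leftmost minimum of arr[i : i + 2**j]."""
    n = len(arr)
    table = [list(range(n))]
    j = 1
    while (1 << j) <= n:
        prev, half = table[j - 1], 1 << (j - 1)
        row = []
        for i in range(n - (1 << j) + 1):
            a, b = prev[i], prev[i + half]
            row.append(a if arr[a] <= arr[b] else b)
        table.append(row)
        j += 1
    return table


def range_min_index(arr, table, lo, hi):
    """Index of the leftmost minimum in arr[lo..hi] (inclusive bounds)."""
    j = (hi - lo + 1).bit_length() - 1
    a, b = table[j][lo], table[j][hi - (1 << j) + 1]
    return a if arr[a] <= arr[b] else b


def smallest_subsequence_rmq(arr, k):
    n = len(arr)
    table = build_sparse_table(arr)
    result, start = [], 0
    for remaining in range(k, 0, -1):
        # The next pick must leave remaining-1 elements to its right.
        best = range_min_index(arr, table, start, n - remaining)
        result.append(arr[best])
        start = best + 1
    return result


print(smallest_subsequence_rmq([3, 1, 5, 3, 5, 9, 2], 4))  # [1, 3, 5, 2]
```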

If I've understood the question right, here's a DP algorithm that should work, but it takes O(NK) time. (Note that this recurrence minimizes the sum of the k chosen elements, which is not the same thing as the lexicographically smallest subsequence.)
//k is the given size and n is the size of the array
create an array dp[k+1][n+1]
initialize the first column with the maximum integer value (we'll need it later)
and the first row with 0's (keep element dp[0][0] = 0)
now run the loop while building the solution
for(int i=1; i<=k; i++) {
    for(int j=1; j<=n; j++) {
        //if the number of elements seen so far is less than the size required (i),
        //initialize it with the maximum integer value
        if( j < i ) {
            dp[i][j] = MAX_INT_VALUE;
        }else {
            //best of size i-1 plus the present element, or the previous best of size i
            dp[i][j] = min(dp[i-1][j-1] + arr[j-1], dp[i][j-1]);
        }
    }
}
//dp[k][n] contains the solution
return dp[k][n];
The last element of the dp array contains the solution.

I suggest trying a modified merge sort. The modification is in the merge step: discard duplicate values.
Then select the smallest four.
The complexity is O(n log n).
I am still thinking about whether the complexity can be O(n).

Related

Data Structure Question on Arrays - How to find the best of array given conditions

I am new to data structures and algorithms, and I need help solving this question.
The best of an array having N elements is defined as sum of best of all elements of Array. The best of element A[i] is defined in the following manner
a: The best of element A[i] is 1 if, A[i-1]<A[i]<A[i+1]
b: The best of element A[i] is 2 if, A[i]> A[j] for j ranging from 0 to n-1
and A[i]<A[h] for h ranging from i+1 to N-1
Write program to find best of array
Note- A[0] and A[N-1] are excluded to find best of array, all elements are unique
Input - 2,1,3,9,20,7,8
Output - 3
The best of element 3 is 2 and 9 is 1. For rest element it is 0. Hence 2+1 =3
This is what I tried so far -
public static void main (String [] args) {
int [] A = {2,1,3,9,20,7,8};
int result = 0;
for(int i=1; i<A.length-2; i++) {
if(A[i-1] < A[i] && A[i]< A[i+1] ) {
result += 1;
}else if(A[i]>A[j] && A[i]<A[h]){
result +=2;
}else {
result+=0;
}
}
}

Note how the phrase:
A[i]> A[j] for j ranging from 0 to n-1
simply means: If the current element is not the Minimum of the array. Hence, if you find the minimum at the beginning, this condition can be changed into a much simpler and lightweight condition:
Let m be the minimum of the array, then if A[i] > m
So you don't need to do a linear search every iteration --> Less time complexity.
Now you have the problem at a complexity of O(N^2), which can be reduced further.
Regarding
and A[i]<A[h] for h ranging from i+1 to N-1
Get the maximum element from index 2 to N-1. Then, at every iteration, check whether the current element is less than the maximum. If so, count it when composing the score; otherwise the current element is the maximum, and in that case recalculate the maximum over i+1 to N-1.
The worst case is when the maximum is always at index i, that is, when the array is already sorted in descending order.
The best case is when the maximum is always the last element, so the overall complexity is reduced to O(N).
Regarding
A[i-1]<A[i]<A[i+1]
This is straightforward: you simply compare the elements at those three indices at every iteration.
Implementation
Before anything, the following are important notes:
The result you've got in your example isn't correct as elements 3 and 9 both fulfill both conditions, so each should score either 1 or 2, but cannot be one with score of 1 and another with score of 2. Hence the overall score should be either 1+1 = 2 or 2 + 2 = 4.
I implemented this algorithm in Java (although I prefer Python), as I could guess it from your code snippet.
import java.util.Arrays;
public class ArrayBest {
private static int[] findMinMax(Integer [] B) {
// find minimum and the maximum: Time Complexity O(n log(n))
Integer[] b = Arrays.copyOf(B, B.length);
Arrays.sort(b);
return new int []{b[0], b[B.length-1]};
}
public static int find(Integer [] A) {
// Exclude the first and last elements
int N = A.length;
Integer [] B = Arrays.copyOfRange(A, 1, N-1);
N -= 2;
// find minimum and the maximum: Time Complexity O(n log(n))
// min at index 0, and max at index 1
int [] minmax = findMinMax(B);
int result = 0;
// start the search
for (int i=0; i<N-1; i++) {
// start with first condition : the easier
if (i!=0 && B[i-1]<B[i] && B[i]<B[i+1]) {
result += 1;
}else if (B[i] != minmax[0]) { // Equivalent to A[i]> A[j] : j in [0, N-1]
if (B[i] < minmax[1]) { // if it is less than the maximum
result += 2;
}else { // it is the maximum --> re-calculate the max over the range (i+1, N)
int [] minmax_ = findMinMax(Arrays.copyOfRange(B, i+1, N));
minmax[1] = minmax_[1];
}
}
}
return result;
}
public static void main(String[] args) {
Integer [] A = {2,1,3,9,7,20,8};
int res = ArrayBest.find(A);
System.out.println(res);
}
}
Ignoring the first sort, the best case scenario is when the last element is the maximum (i.e, at index N-1), hence time complexity is O(N).
The worst case scenario, is when the array is already sorted in a descending order, so the current element that is being processed is always the maximum, hence at each iteration the maximum should be found again. Consequently, the time complexity is O(N^2).
The average case scenario depends on the probability of how the elements are distributed in the array. In other words, the probability that the element being processed at the current iteration is the maximum.
Although it requires more study, my initial guess is as follows:
The probability of any i.i.d. element being the maximum is simply 1/N at the very beginning, but as we are searching over (i+1, N-1), N will be decreasing, so the probability goes like 1/N, 1/(N-1), 1/(N-2), ..., 1. Counting the outer loop, we can write the average complexity as O(N (1/N + 1/(N-1) + 1/(N-2) + ... + 1)) = O(N (1 + 1/2 + 1/3 + ... + 1/N)), whose asymptotic upper bound (according to the harmonic series) is approximately O(N log(N)).

Merge k sorted arrays using C

I need to merge k (1 <= k <= 16) sorted arrays into one sorted array. This is for a homework assignment and the Professor requires that this be done using an O(n) algorithm. Merging 2 arrays is no problem and I can do it easily using an O(n) algorithm. I feel that what my professor is asking is undoable for n arrays with an O(n) algorithm.
I am using the below algorithm to split the array indices and running InsertionSort on each partition. I could save these start and end indices into a 2D array. I just don't see how the merging can be done in O(n), because this is going to require more than one loop. If it is possible, does anyone have any hints? I'm not looking for actual code, just a hint as to where I should start.
int chunkSize = round(float(arraySize) / numThreads);
for (int i = 0; i < numThreads; i++) {
int start = i * chunkSize;
int end = start + chunkSize - 1;
if (i == numThreads - 1) {
end = arraySize - 1;
}
InsertionSort(&array[start], end - start + 1);
}
EDIT: The requirement is that the algorithm be O(n) where n is the number of elements in the array. Also, I need to solve this without using a min heap.
EDIT #2: Here is an algorithm I came up with. The problem here is that I'm not storing the result of each iteration back into the original array. I could just copy all of it back in a for loop, but that would be expensive. Is there any way I can do this, other than using something like memcpy? In the below code, indices is a 2D array [numThreads][2] where indices[i][0] is the start index and indices[i][1] is the end index of the ith array.
void mergeArrays(int array[], int indices[][2], int threads, int result[]) {
for (int i = 0; i < threads - 1; i++) {
int resPos = 0;
int lhsPos = 0;
int lhsEnd = indices[i][1];
int rhsPos = indices[i+1][0];
int rhsEnd = indices[i+1][1];
while (lhsPos <= lhsEnd && rhsPos <= rhsEnd) {
if (array[lhsPos] <= array[rhsPos]) {
result[resPos] = array[lhsPos];
lhsPos++;
} else {
result[resPos] = array[rhsPos];
rhsPos++;
}
resPos++;
}
while (lhsPos <= lhsEnd) {
result[resPos] = array[lhsPos];
lhsPos++;
resPos++;
}
while (rhsPos <= rhsEnd) {
result[resPos] = array[rhsPos];
rhsPos++;
resPos++;
}
}
}

You can merge K sorted arrays into one sorted array with an O(N*log(K)) algorithm, using a priority queue with K entries, where N is the overall number of elements in all arrays.
If K is considered a constant (it is limited to 16 in your case), then the complexity is O(N).
Note again: N is the number of elements in my post, not the number of arrays.
It is impossible to merge the arrays in O(K): a simple copy already takes O(N).

Using the facts you provided:
(1) n is the number of arrays to merge;
(2) the arrays to be merged are already sorted;
(3) the merge needs to be of order n, that is linear in the number of arrays
(and NOT linear in the number of elements in each array, as you might mistakenly think at first sight).
Use the analogy of merging 4 sorted piles of cards, low to high, face up. You would pick the card with the lowest face value from one of the piles and put it (face down) on the merged deck, until all piles are exhausted.
For your program: keep a counter for each array for the number of elements you have already transferred to the output. This is at the same time an index to the next element in each array NOT merged in the output. Pick the smallest element that you find at one of these locations. You have to lookup the first waiting element in all the arrays for that, so that is of order n.
Also, I don't understand why the answer from MoB got up-votes, it does not answer the question.

Here is one way to do it (pseudocode)
input array[k][n]
init indices[k] = { 0, 0, 0, ... }
init queue = { empty priority queue }
for i in 0..k-1:
    insert i into queue with priority (array[i][0])
while queue is not empty:
    let x = pop queue
    output array[x][indices[x]]
    increment indices[x]
    if indices[x] < n:
        insert x into queue with priority (array[x][indices[x]])
This can probably be simplified further in C. You would have to find a suitable queue implementation to use though as there are none in libc.
Complexity for this operation:
"while queue is not empty" => O(n)
"insert x into queue ..." => O(log k)
=> O(n log k)
Which, if you consider k = constant, is O(n).
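For readers who want to see the priority-queue merge run, here is a small Python sketch using the standard `heapq` module. It is an illustration rather than a C solution for the assignment; the function name `merge_sorted_arrays` is made up, and the arrays are allowed to have different lengths (empty ones are skipped).

```python
import heapq


def merge_sorted_arrays(arrays):
    """k-way merge: the heap holds one (value, array index, position)
    entry per non-exhausted array, so each step costs O(log k)."""
    heap = [(arr[0], idx, 0) for idx, arr in enumerate(arrays) if arr]
    heapq.heapify(heap)
    merged = []
    while heap:
        value, idx, pos = heapq.heappop(heap)
        merged.append(value)
        # Re-insert the array only if it still has an element left.
        if pos + 1 < len(arrays[idx]):
            heapq.heappush(heap, (arrays[idx][pos + 1], idx, pos + 1))
    return merged


print(merge_sorted_arrays([[1, 4, 7], [2, 5], [], [3, 6, 8]]))
# [1, 2, 3, 4, 5, 6, 7, 8]
```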

After sorting the k sub-arrays (the method doesn't matter), the code does a k-way merge. The simplest implementation does k-1 compares to determine the smallest leading element of the k arrays, then moves that element from its sub-array to the output array and gets the next element from that array. When the end of an array is reached, the algorithm drops down to a (k-1)-way merge, then a (k-2)-way merge, until finally there's just one sub-array left and it's copied. This will be O(n) time since k-1 is a constant.
The k-1 compares can be sped up by using a minimum heap (which is how some priority queues are implemented), but it's still O(n), with just a smaller constant. The heap needs to be initialized at the start, then updated each time an element is removed and a new one added.
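The heap-free variant described above (scan the leading element of every sub-array on each step) could be sketched like this in Python. The name `merge_by_scanning` and the list-of-lists input are assumptions for the example; the cost is O(n*k), which is O(n) when k is bounded by a constant such as 16.

```python
def merge_by_scanning(arrays):
    """Merge sorted arrays by repeatedly scanning the current head of
    each array and taking the smallest one."""
    positions = [0] * len(arrays)  # next unmerged index in each array
    total = sum(len(arr) for arr in arrays)
    merged = []
    for _ in range(total):
        best = -1
        for idx, arr in enumerate(arrays):
            if positions[idx] >= len(arr):
                continue  # this array is exhausted
            if best == -1 or arr[positions[idx]] < arrays[best][positions[best]]:
                best = idx
        merged.append(arrays[best][positions[best]])
        positions[best] += 1
    return merged


print(merge_by_scanning([[1, 9], [2, 3, 10], [0, 8]]))  # [0, 1, 2, 3, 8, 9, 10]
```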

find the largest ten numbers in an array in C

I have an array of int (the length of the array can go from 11 to 500) and I need to extract the ten largest numbers into another array.
So, my starting code could be this:
arrayNumbers[n]; //array in input with numbers, 11<n<500
int arrayMax[10];
for (int i=0; i<n; i++){
if(arrayNumbers[i] ....
//here, i need the code to save current int in arrayMax correctly
}
//at the end of the loop, I want arrayMax to hold the ten largest numbers (they don't have to be ordered)
What's the most efficient way to do this in C?

Study heaps. Maintain a min-heap of size 10 and discard every element that spills out of it (an element that is no larger than the heap's minimum can be ignored). If you face a difficulty, please ask.
EDIT:
If the number of elements is less than 20, find the n-10 smallest elements; the rest of the numbers are the top 10.
Visualize a heap here
EDIT2: Based on a comment from Sleepy head, I searched and found this (I have not tested it). You can find the k-th largest element (the 10th in this case) in O(n) time. Then, in O(n) time, you can collect the first 10 elements that are greater than or equal to this k-th largest number. The final complexity is linear.
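To make the bounded-heap idea concrete, here is a short Python sketch with the standard `heapq` module (a min-heap). It is an illustration only, not C code for the original question; `largest_k` is a hypothetical name and the input is assumed to hold at least k numbers.

```python
import heapq


def largest_k(numbers, k=10):
    """Keep a min-heap of the k largest values seen so far; the root is
    the smallest of them, so anything not above it can be skipped."""
    heap = list(numbers[:k])
    heapq.heapify(heap)
    for value in numbers[k:]:
        if value > heap[0]:
            heapq.heapreplace(heap, value)  # pop the minimum, push value
    return heap  # the k largest values, in heap order (not sorted)


print(sorted(largest_k(list(range(1, 24)))))
# [14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
```

Each of the n elements costs at most O(log k), so with k fixed at 10 the scan is linear in n.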

Here is an algorithm that solves it in linear time:
1. Use the selection algorithm, which finds the k-th element of an unsorted array in linear time. You can use either a variant of quicksort (quickselect) or a more robust algorithm.
2. Get the top k using the pivot obtained in step 1.
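The two steps above can be sketched in Python as follows. This uses randomized quickselect, which is linear only in expectation (the "more robust" median-of-medians variant is not shown); the names `kth_largest` and `top_k` are invented for the example, and 1 <= k <= len(values) is assumed.

```python
import random


def kth_largest(values, k):
    """Return the k-th largest value (1-based) via randomized quickselect."""
    items = list(values)
    while True:
        pivot = random.choice(items)
        greater = [v for v in items if v > pivot]
        equal_count = sum(1 for v in items if v == pivot)
        if k <= len(greater):
            items = greater  # the answer is among the larger values
        elif k <= len(greater) + equal_count:
            return pivot
        else:
            k -= len(greater) + equal_count
            items = [v for v in items if v < pivot]


def top_k(values, k):
    """Step 2: collect everything above the threshold, then pad with
    copies of the threshold itself to handle duplicates."""
    threshold = kth_largest(values, k)
    result = [v for v in values if v > threshold]
    result += [threshold] * (k - len(result))
    return result


print(sorted(top_k([5, 1, 9, 3, 9, 7, 2], 3)))  # [7, 9, 9]
```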

This is my idea:
Insert the first 10 elements of arrNum into arrMax.
Sort those 10 elements so that arrMax[0] = min and arrMax[9] = max.
Then check the remaining elements one by one and insert every possible candidate at its right position as follows (draft):
int r, p;
for (int k = 10; k < n; k++)
{
    r = 0;
    while(1)
    {
        if (r == 10) break; // don't exceed length of arrMax (check first to avoid reading arrMax[10])
        else if (arrMax[r] > arrNum[k]) break; // position to insert newcomer
        else r++; // iteration
    }
    if (r != 0) // no need to insert a number smaller than all members
    {
        for (p=0; p<r-1; p++) arrMax[p]=arrMax[p+1]; // shift arrMax to make space for newcomer
        arrMax[r-1] = arrNum[k]; // insert newcomer at its position
    }
} // done!

Sort the array and copy the 10 largest elements into another array.

You can use the "select" algorithm, which finds the i-th largest number (for any i you like), and then iterate over the array and collect the numbers that are bigger than that value. In your case i = 10, of course.

The following example can help you. It arranges the 10 biggest elements of the original array arrNum into arrMax, assuming all numbers in arrNum are positive. Based on this, you can handle negative numbers as well by initializing all elements of arrMax with the smallest possible number (for example INT_MIN).
Anyway, using a heap of 10 elements is a better solution than this one.
#include <stdio.h>

int main(void)
{
    int arrNum[500]={1,2,3,21,34,4,5,6,7,87,8,9,10,11,12,13,14,15,16,17,18,19,20};
    int arrMax[10]={0};
    int i,cur,j,nn=23,pos;
    for(cur=0;cur<nn;cur++)
    {
        /* find the rightmost slot that holds a smaller value */
        for(pos=9;pos>=0;pos--)
            if(arrMax[pos]<arrNum[cur])
                break;
        /* shift the smaller entries left, dropping the current minimum */
        for(j=1;j<=pos;j++)
            arrMax[j-1]=arrMax[j];
        if(pos>=0)
            arrMax[pos]=arrNum[cur];
    }
    for(i=0;i<10;i++)
        printf("%d ",arrMax[i]);
    return 0;
}

When improving efficiency of an algorithm, it is often best (and instructive) to start with a naive implementation and improve it. Since in your question you obviously don't even have that, efficiency is perhaps a moot point.
If you start with the simpler question of how to find the largest integer:
Initialise largest_found to INT_MIN
Iterate the array with :
IF value > largest_found THEN largest_found = value
To get the 10 largest, you perform the same algorithm 10 times, but retaining the last_largest and its index from the previous iteration, modify the largest_found test thus:
IF value > largest_found &&
value <= last_largest_found &&
index != last_largest_index
THEN
largest_found = last_largest_found = value
last_largest_index = index
Start with that, then ask yourself (or here) about efficiency.

max. distance of a number greater than a given number in array

I was going through an interview question and came up with logic that requires the following:
Find an index j for an element a[j] larger than a[i] (with j < i), such that (i-j) is the largest. I want to find this j for every index i in the array, in O(n) or O(n log n) time with O(n) extra space.
What I have done until now:
1) O(n^2) by using simple for loops
2) Build a balanced BST as we scan the elements from left to right and, for the i-th element, find the index of an element greater than it. But I realized that this can easily be O(n) for a single element, and therefore O(n^2) for the entire array.
I want to know if it is possible to do it in O(n) or O(n log n). If yes, please give some hints.
EDIT: I think I did not explain my question well. Let me explain it clearly:
I want arr[j] on the left of arr[i] such that (i-j) is the largest possible and arr[j] > arr[i], and I want to find this for every index i, i.e. for (i = 0 to n-1).
EDIT 2 :example - {2,3,1,6,0}
for 2 , ans=-1
for 3 , ans=-1
for 1 , ans=2 (i-j)==(2-0)
for 6 , ans=-1
for 0 , ans=4 (i-j)==(4-0)

Create an auxiliary array of maximums, call it maxs, which contains the maximum value of the array up to the current index.
Formally: maxs[i] = max { arr[0], arr[1], ..., arr[i] }
Note that this is a preprocessing step that can be done in O(n).
Now, for each element i, you are looking for the first element in maxs that is larger than arr[i]. Since maxs is non-decreasing, this can be done using binary search, and is O(log n) per operation.
This gives you a total of O(n log n) time and O(n) extra space.
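A minimal Python sketch of the prefix-maximum idea, using the standard `bisect` module, might look like this. The function name is hypothetical, and the output follows the question's example by reporting the distance i-j (or -1 when no larger element exists on the left).

```python
from bisect import bisect_right


def farthest_greater_on_left(arr):
    """For each i, return i - j for the leftmost j < i with arr[j] > arr[i],
    or -1 if there is no such j."""
    maxs = []    # maxs[p] = max(arr[0..p]); non-decreasing by construction
    result = []
    for i, x in enumerate(arr):
        # First position whose prefix maximum exceeds x. A prefix maximum
        # first exceeds x exactly where an element greater than x appears.
        j = bisect_right(maxs, x)
        result.append(i - j if j < i else -1)
        maxs.append(x if not maxs else max(maxs[-1], x))
    return result


print(farthest_greater_on_left([2, 3, 1, 6, 0]))  # [-1, -1, 2, -1, 4]
```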

You can do this in O(n) time using a stack data structure for array indexes for which you have yet to find a solution. It can be implemented as an array of at most n elements.
Iterate over the input array from right to left, starting with the last element:
Pop all indexes from the stack for which the array element is less than the current element. Mark the index of the current element as the solution for each index you pop.
Push the index of the current element on the stack.
Invariant: the array items corresponding to the indexes in the stack are always in ascending order, with the least item on top.
When you reach the beginning of the input, mark any items that still remain on the stack with -1; for them there is no answer.
Each array index is pushed into the stack exactly once and popped at most once, so this algorithm runs in O(n) time.
An example in Python:
def solution(arr):
    stack = []
    out = [-1]*len(arr)
    for i in range(len(arr)-1, -1, -1):
        while len(stack) > 0 and arr[stack[-1]] < arr[i]:
            out[stack.pop()] = i
        stack.append(i)
    return out
Note that the correct answer for input [2, 4, 1, 5, 3] is [-1, -1, 1, -1, 3]: for a fixed i, the difference j-i is greatest when j is greatest, so you are looking for the rightmost index j < i, which minimizes the distance. (When j < i, the difference j-i is negative.)

The fastest solution I can think of is allocating a second array and scanning the array left-to-right. As you traverse the array and scan each element, append the index of the element to your second array if arr[index] is greater than the right-most element of your second array. This is O(1) time per append, maximum of n appends, so O(n).
Finally, once your array is complete, take a second pass through your array. For each element, scan your second array using binary search (this is possible since it is implicitly sorted) and find the leftmost (earliest inserted) index j in your array such that arr[j] > arr[i].
To do this, you have to do a modification of binary search. If you find an index j such that arr[j] > arr[i], you still have to check to see if there are any indices k to the left such that arr[k] > arr[i]. You must do this until you find the left-most index.
I think this is O(log n) per binary search and you have to do the search for n elements. So the total time complexity would be close to O(n log n), but I am not sure of this. Any comments/suggestions to this answer would be much appreciated.

Here's my solution in C++.
We maintain a non-decreasing array of prefix maximums. Compare the current element with the element at the back of the array.
If it is larger than or equal to the largest element so far, append it to the array and return -1: there is no larger element on its left.
If not, we use a binary search to find the index and return the difference.
(We still need to append vec.back() to the array, so that positions in vec keep matching indices in arr.)
int findIdx(vector<int>& vec, int target){
auto it = upper_bound(vec.begin(), vec.end(), target);
int idx = int(it-vec.begin());
return idx;
}
vector<int> farestBig(vector<int>& arr){
vector<int> ans{-1};
vector<int> vec{arr[0]};
int n = (int)arr.size();
for(int i=1; i<n; i++){
if(arr[i] >= vec.back()){
ans.push_back(-1);
vec.push_back(arr[i]);
}
else{
int idx = findIdx(vec, arr[i]);
ans.push_back(i-idx);
vec.push_back(vec.back());
}
}
return ans;
}

Finding kth smallest number from n sorted arrays

So, you have n sorted arrays (not necessarily of equal length), and you are to return the kth smallest element in the combined array (i.e the combined array formed by merging all the n sorted arrays)
I have been trying it and its other variants for quite a while now, and till now I only feel comfortable in the case where there are two arrays of equal length, both sorted and one has to return the median of these two.
This has logarithmic time complexity.
After this I tried to generalize it to finding kth smallest among two sorted arrays. Here is the question on SO.
Even here the solution given is not obvious to me. But even if I somehow manage to convince myself of this solution, I am still curious as to how to solve the absolute general case (which is my question)
Can somebody explain a step-by-step solution to me (which, again in my opinion, should take logarithmic time, i.e. O(log(n1) + log(n2) + ... + log(nN)), where n1, n2, ..., nN are the lengths of the n arrays), starting from the more specific cases and moving on to the more general one?
I know similar questions for more specific cases are there all over the internet, but I haven't found a convincing and clear answer.
Here is a link to a question (and its answer) on SO which deals with 5 sorted arrays and finding the median of the combined array. The answer just gets too complicated for me to able to generalize it.
Even clean approaches for the more specific cases (as I mentioned during the post) are welcome.
PS: Do you think this can be further generalized to the case of unsorted arrays?
PPS: It's not a homework problem, I am just preparing for interviews.

This doesn't generalize the links, but does solve the problem:
1. Go through all the arrays and, if any have length > k, truncate them to length k (this is silly, but we'll mess with k later, so do it anyway).
2. Identify the largest remaining array A. If there is more than one, pick one.
3. Pick the middle element M of the largest array A.
4. Use a binary search on the remaining arrays to find the same element (or the largest element <= M).
5. Based on the indexes of the various elements, calculate the total number of elements <= M and > M. This should give you two numbers: L, the number <= M, and G, the number > M.
6. If k < L, truncate all the arrays at the split points you've found and iterate on the smaller arrays (use the bottom halves).
7. If k > L, truncate all the arrays at the split points you've found and iterate on the smaller arrays (use the top halves, and search for element (k-L)).
8. When you get to the point where you only have one element per array (or 0), make a new array of size n with those data, sort it, and pick the kth element.
Because you're always guaranteed to remove at least half of one array, in N iterations, you'll get rid of half the elements. That means there are N log k iterations. Each iteration is of order N log k (due to the binary searches), so the whole thing is N^2 (log k)^2 That's all, of course, worst case, based on the assumption that you only get rid of half of the largest array, not of the other arrays. In practice, I imagine the typical performance would be quite a bit better than the worst case.

It cannot be done in less than O(n) time. Proof sketch: if it could, the algorithm would have to skip at least one array entirely. Obviously, one array can arbitrarily change the value of the kth element.
I have a relatively simple O(n*log(n)*log(m)) where m is the length of the longest array. I'm sure it is possible to be slightly faster, but not a lot faster.
Consider the simple case where you have n arrays each of length 1. Obviously, this is isomorphic to finding the kth element in an unsorted list of length n. It is possible to find this in O(n), see Median of Medians algorithm, originally by Blum, Floyd, Pratt, Rivest and Tarjan, and no (asymptotically) faster algorithms are possible.
Now the problem is how to expand this to longer sorted arrays. Here is the algorithm: Find the median of each array. Sort the list of tuples (median, length of array/2) by median. Walk through it keeping a sum of the lengths, until you reach a sum greater than k. You now have a pair of medians such that you know the kth element is between them. Now, for each median, we know whether the kth element is greater or less than it, so we can throw away half of each array. Repeat. Once the arrays are all one element long (or less), we use the selection algorithm.
Implementing this will reveal additional complexities and edge conditions, but nothing that increases the asymptotic complexity. Each step
Finds the medians of the arrays, O(1) each, so O(n) total
Sorts the medians O(n log n)
Walks through the sorted list O(n)
Slices the arrays O(1) each so, O(n) total
That is O(n) + O(n log n) + O(n) + O(n) = O(n log n). We must repeat this until the longest array has length 1, which takes log m steps, for a total of O(n*log(n)*log(m)).
You ask if this can be generalized to the case of unsorted arrays. Sadly, the answer is no. Consider the case where we only have one array, then the best algorithm will have to compare at least once with each element for a total of O(m). If there were a faster solution for n unsorted arrays, then we could implement selection by splitting our single array into n parts. Since we just proved selection is O(m), we are stuck.

You could look at my recent answer on the related question here. The same idea can be generalized to multiple arrays instead of 2. In each iteration you could reject the second half of the array with the largest middle element if k is less than sum of mid indexes of all arrays. Alternately, you could reject the first half of the array with the smallest middle element if k is greater than sum of mid indexes of all arrays, adjust k. Keep doing this until you have all but one array reduced to 0 in length. The answer is kth element of the last array which wasn't stripped to 0 elements.
Run-time analysis:
You get rid of half of one array in each iteration. But to determine which array is going to be reduced, you spend time linear in the number of arrays. Assuming each array is of the same length, the run time is going to be c*c*log(n), where c is the number of arrays and n is the length of each array.

There exists a generalization that solves the problem in O(N log k) time; see the question here.

Old question, but none of the answers were good enough. So I am posting the solution using sliding window technique and heap:
class Node {
int elementIndex;
int arrayIndex;
public Node(int elementIndex, int arrayIndex) {
super();
this.elementIndex = elementIndex;
this.arrayIndex = arrayIndex;
}
}
public class KthSmallestInMSortedArrays {
public int findKthSmallest(List<Integer[]> lists, int k) {
int ans = 0;
PriorityQueue<Node> pq = new PriorityQueue<>((a, b) -> {
return lists.get(a.arrayIndex)[a.elementIndex] -
lists.get(b.arrayIndex)[b.elementIndex];
});
for (int i = 0; i < lists.size(); i++) {
Integer[] arr = lists.get(i);
if (arr != null && arr.length > 0) {
Node n = new Node(0, i);
pq.add(n);
}
}
int count = 0;
while (!pq.isEmpty()) {
Node curr = pq.poll();
ans = lists.get(curr.arrayIndex)[curr.elementIndex];
if (++count == k) {
break;
}
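curr.elementIndex++;
// Re-insert only if this array still has elements left.
if (curr.elementIndex < lists.get(curr.arrayIndex).length) {
pq.offer(curr);
}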
}
return ans;
}
}
The maximum number of elements that we need to access here is O(K) and there are M arrays. So the effective time complexity will be O(K*log(M)).
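For comparison, the same heap-based idea can be written compactly in Python with the standard `heapq` module. This is a sketch under the assumption that k is 1-based and does not exceed the total number of elements; the name `kth_smallest` is made up for the example.

```python
import heapq


def kth_smallest(arrays, k):
    """Pop the k-1 smallest elements from a heap of array heads; the
    k-th smallest is then at the root. Cost: O(m + k*log(m)) for m arrays."""
    heap = [(arr[0], idx, 0) for idx, arr in enumerate(arrays) if arr]
    heapq.heapify(heap)
    for _ in range(k - 1):
        _, idx, pos = heapq.heappop(heap)
        # Advance within the same array only if it has more elements.
        if pos + 1 < len(arrays[idx]):
            heapq.heappush(heap, (arrays[idx][pos + 1], idx, pos + 1))
    return heap[0][0]


print(kth_smallest([[1, 5, 9], [2, 6], [3, 4, 10]], 4))  # 4
```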

This would be the code. O(k*log(m))
public int findKSmallest(int[][] A, int k) {
PriorityQueue<int[]> queue = new PriorityQueue<>(Comparator.comparingInt(x -> A[x[0]][x[1]]));
for (int i = 0; i < A.length; i++)
queue.offer(new int[] { i, 0 });
int ans = 0;
while (!queue.isEmpty() && --k >= 0) {
int[] el = queue.poll();
ans = A[el[0]][el[1]];
if (el[1] < A[el[0]].length - 1) {
el[1]++;
queue.offer(el);
}
}
return ans;
}

If k is not that huge, we can maintain a min-priority queue. Loop over the heads of the sorted arrays, each time taking the smallest element and enqueueing the next one from the same array. Once k elements have been taken, we have the k smallest.
Maybe we can also regard the n sorted arrays as buckets and try the bucket sort method.

This could be considered the second half of a merge sort. We could simply merge all the sorted lists into a single list, but only keep k elements in the combined list from merge to merge. This has the advantage of using only O(k) space, with something slightly better than merge sort's O(n log n) complexity. That is, in practice it should run slightly faster than a merge sort. Choosing the kth smallest from the final combined list is O(1). This kind of complexity is not so bad.

It can be done by doing binary search in each array, while calculating the number of smaller elements.
I used the bisect_left and bisect_right to make it work for non-unique numbers as well,
from bisect import bisect_left
from bisect import bisect_right

def kthOfPiles(givenPiles, k, count):
    '''
    Perform binary search for kth element in multiple sorted list
    parameters
    ==========
    givenPiles are list of sorted list
    count is the total number of
    k is the target index in range [0..count-1]
    '''
    begins = [0 for pile in givenPiles]
    ends = [len(pile) for pile in givenPiles]
    #print('finding k=', k, 'count=', count)
    for pileidx,pivotpile in enumerate(givenPiles):
        while begins[pileidx] < ends[pileidx]:
            mid = (begins[pileidx]+ends[pileidx])>>1
            midval = pivotpile[mid]
            smaller_count = 0
            smaller_right_count = 0
            for pile in givenPiles:
                smaller_count += bisect_left(pile,midval)
                smaller_right_count += bisect_right(pile,midval)
            #print('check midval', midval,smaller_count,k,smaller_right_count)
            if smaller_count <= k and k < smaller_right_count:
                return midval
            elif smaller_count > k:
                ends[pileidx] = mid
            else:
                begins[pileidx] = mid+1
    return -1

Please find below C# code to find the k-th smallest element (k is 0-based) in the union of two sorted arrays. end1 and end2 are exclusive bounds. Time complexity: O(log k)
public int findKthElement(int k, int[] array1, int start1, int end1, int[] array2, int start2, int end2)
{
    // if (k >= m+n) exception
    // If one array is exhausted, the answer lies in the other one.
    if (start1 == end1)
    {
        return array2[start2 + k];
    }
    if (start2 == end2)
    {
        return array1[start1 + k];
    }
    if (k == 0)
    {
        return Math.Min(array1[start1], array2[start2]);
    }
    // Try to discard about half of the k elements that precede the answer.
    int half = (k + 1) / 2;
    int sub1 = Math.Min(half, end1 - start1);
    int sub2 = Math.Min(half, end2 - start2);
    if (array1[start1 + sub1 - 1] < array2[start2 + sub2 - 1])
    {
        // The first sub1 elements of array1 all come before the answer.
        return findKthElement(k - sub1, array1, start1 + sub1, end1, array2, start2, end2);
    }
    else
    {
        // The first sub2 elements of array2 all come before the answer.
        return findKthElement(k - sub2, array1, start1, end1, array2, start2 + sub2, end2);
    }
}
