HackerEarth (Basic I/O Question) Play With Numbers [subarray] - arrays

I have been trying to solve this problem; it works with small numbers but not with the large 10^9 values in the HackerEarth tests.
You are given an array of n numbers and q queries. For each query you have to print the floor of the expected value(mean) of the subarray from L to R.
INPUT:
First line contains two integers N and Q denoting number of array elements and number of queries.
Next line contains N space separated integers denoting array elements.
Next Q lines contain two integers L and R(indices of the array).
OUTPUT:
For each query, print a single integer denoting the answer.
Constraints:
1<= N ,Q,L,R <= 10^6
1<= Array elements <= 10^9
NOTE
Use Fast I/O
#include <iostream>
#include <vector>
using namespace std;

long int solvepb(int a, int b, long int *arr, int n){
    int result, count = 0;
    vector<long int> res;
    for(int i = 0; i < n; i++){
        if(i+1 >= a && i+1 <= b){
            res.push_back(arr[i]);
            count += arr[i];
        }
    }
    result = count / res.size();
    return result;
}

int main(){
    int n, q;
    cin >> n >> q;
    long int arr[n];
    for(int i = 0; i < n; i++){
        cin >> arr[i];
    }
    while(q--){
        int a, b;
        cin >> a >> b;
        cout << solvepb(a, b, arr, n) << endl;
    }
    return 0;
}

So currently, the issue with your algorithm is that each query recomputes the mean by scanning the array between the two indices. This means that if the queries are particularly bad, for each of the Q queries you might iterate through all N elements of the array.
How can one try to reduce this? Notice that because sums are additive, the sum up to an index i is the same as the sum up to an index j plus the sum of the numbers between i and j. Let me rewrite that as an equation -
sum[0:i] = sum[0:j] + sum[j+1:i]
It should be obvious now that by rearranging this equation, you can quickly get the sum between two indices by storing the sum of numbers up to an index. (i.e. sum[j+1:i] = sum[0:i] - sum[0:j]). This means that rather than having O(N*Q), you can have O(N + Q) runtime complexity. The O(N) part of the new complexity is from iterating the array once to get all the sums. The O(Q) part comes from answering the Q queries.
This kind of approach is called prefix sums. There are some optimized data structures like Fenwick trees made specifically for prefix sums that you can read about online or on Wikipedia. But for your question, a simple array should work just fine.
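To make that concrete, here is a minimal sketch of the prefix-sum approach (a sketch, not the only way to write it): it uses scanf/printf to satisfy the fast-I/O note, assumes 1-based L and R as in the problem statement, and keeps the sums in 64-bit integers, since a sum can reach 10^6 * 10^9 = 10^15 and overflow a 32-bit type.
#include <cstdio>
#include <vector>

int main(){
    int n, q;
    if(scanf("%d %d", &n, &q) != 2) return 1;
    std::vector<long long> prefix(n + 1, 0); // prefix[i] = arr[1] + ... + arr[i]
    for(int i = 1; i <= n; i++){
        long long x;
        scanf("%lld", &x);
        prefix[i] = prefix[i - 1] + x;
    }
    while(q--){
        int l, r;
        scanf("%d %d", &l, &r);
        long long sum = prefix[r] - prefix[l - 1]; // sum of elements l..r
        printf("%lld\n", sum / (r - l + 1));       // floor of the mean (elements are positive)
    }
    return 0;
}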
A few comments about your code:
In the for loop in the solvepb function, you always iterate from 0 to n, but you don't need to: you could iterate from a to b if a is smaller than b, and from b to a otherwise.
You also do not really use the vector. The vector in the solvepb function stores array elements, but these are never used again. You only seem to use it to find the number of elements from a to b, but you can get that directly from the difference between the two indices (i.e. b-a+1 if a < b, otherwise a-b+1).


Given an array of integers of size n+1 consisting of the elements [1,n]. All elements are unique except one which is duplicated k times

I have been attempting to solve the following problem:
You are given an array of n+1 integers where all the elements lie in [1,n]. You are also given that one of the elements is duplicated a certain number of times, whilst the others are distinct. Develop an algorithm to find both the duplicated number and the number of times it is duplicated.
Here is my solution where I let k = number of duplications:
#include <vector>

struct LatticePoint{ // to hold duplicate and k
    int a;
    int b;
    LatticePoint(int a_, int b_) : a(a_), b(b_) {}
};

LatticePoint findDuplicateAndK(const std::vector<int>& A){
    int n = A.size() - 1;
    std::vector<int> Numbers(n);
    for(int i = 0; i < n + 1; ++i){
        ++Numbers[A[i] - 1]; // A[i] in range [1,n] so no out-of-bounds access
    }
    int i = 0;
    while(i < n){
        if(Numbers[i] > 1) {
            int duplicate = i + 1;
            int k = Numbers[i] - 1;
            return LatticePoint{duplicate, k};
        }
        ++i;
    }
    return LatticePoint{-1, 0}; // unreachable for valid input
}
So, the basic idea is this: we go along the array and each time we see the number A[i] we increment the value of Numbers[A[i] - 1]. Since only the duplicate appears more than once, the index (plus one) of the entry of Numbers with value greater than 1 must be the duplicate number, and that entry's value minus one is the number of duplications. This algorithm is O(n) in time complexity and O(n) in space.
I was wondering if someone had a solution that is better in time and/or space? (or indeed if there are any errors in my solution...)
You can reduce the scratch space to n bits instead of n ints, provided you either have or are willing to write a bitset with run-time specified size (see boost::dynamic_bitset).
You don't need to collect duplicate counts until you know which element is duplicated, and then you only need to keep that count. So all you need to track is whether you have previously seen the value (hence, n bits). Once you find the duplicated value, set count to 2 and run through the rest of the vector, incrementing count each time you hit an instance of the value. (You initialise count to 2, since by the time you get there, you will have seen exactly two of them.)
That's still O(n) space, but the constant factor is a lot smaller.
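A minimal sketch of that idea, using std::vector<bool> as a stand-in for a run-time sized bitset (boost::dynamic_bitset would work the same way):
#include <cstddef>
#include <utility>
#include <vector>

std::pair<int, int> find_dup_bits(const std::vector<int>& A){
    int n = static_cast<int>(A.size()) - 1;
    std::vector<bool> seen(n + 1, false); // one bit per possible value
    for(std::size_t i = 0; i < A.size(); ++i){
        if(seen[A[i]]){
            int count = 2; // by this point we have seen exactly two of them
            for(std::size_t j = i + 1; j < A.size(); ++j)
                if(A[j] == A[i]) ++count;
            return {A[i], count};
        }
        seen[A[i]] = true;
    }
    return {-1, 0}; // unreachable if the input matches the problem statement
}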
The idea of your code works.
But, thanks to the n+1 elements, we can achieve other tradeoffs of time and space.
If we have some number of buckets we're dividing numbers between, putting n+1 numbers in means that some bucket has to wind up with more than expected. This is a variant on the well-known pigeonhole principle.
So we use 2 buckets, one for the range 1..floor(n/2) and one for floor(n/2)+1..n. After one pass through the array, we know which half the answer is in. We then divide that half into halves, make another pass, and so on. This leads to a binary search which will get the answer with O(1) data, and with ceil(log_2(n)) passes, each taking time O(n). Therefore we get the answer in time O(n log(n)).
Now we don't need to use 2 buckets. If we used 3, we'd take ceil(log_3(n)) passes. So as we increased the fixed number of buckets, we take more space and save time. Are there other tradeoffs?
Well, you showed how to do it in 1 pass with n buckets. How many buckets do you need to do it in 2 passes? The answer turns out to be at least sqrt(n) buckets. And 3 passes is possible with the cube root. And so on.
So you get a whole family of tradeoffs where the more buckets you have, the more space you need, but the fewer passes you make. And your solution is merely at the extreme end, taking the most space and the least time.
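For concreteness, here is a sketch of the 2-bucket version under the stated assumptions (n+1 elements with values in [1, n]). Each pass counts how many elements fall in the lower half of the current value range; the half holding more values than it has slots must contain the duplicate:
#include <vector>

int find_dup_pigeonhole(const std::vector<int>& A){
    int lo = 1, hi = static_cast<int>(A.size()) - 1; // current value range [lo, hi]
    while(lo < hi){
        int mid = lo + (hi - lo) / 2;
        int inLower = 0;
        for(int v : A)
            if(lo <= v && v <= mid) ++inLower;
        if(inLower > mid - lo + 1) hi = mid; // lower half is overfull
        else lo = mid + 1;
    }
    return lo; // the duplicated value; one more pass can count its occurrences
}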
Here's a cheekier algorithm, which requires only constant space but rearranges the input vector. (It only reorders; all the original elements are still present at the end.)
It's still O(n) time, although that might not be completely obvious.
The idea is to try to rearrange the array so that A[i] is i, until we find the duplicate. The duplicate will show up when we try to put an element at the right index and it turns out that that index already holds that element. With that, we've found the duplicate; we have a value we want to move to A[j] but the same value is already at A[j]. We then scan through the rest of the array, incrementing the count every time we find another instance.
#include <utility>
#include <vector>

std::pair<int, int> count_dup(std::vector<int> A) {
    /* Try to put each element in its "home" position (that is,
     * where the value is the same as the index). Since the
     * values start at 1, A[0] isn't home to anyone, so we start
     * the loop at 1.
     */
    int n = A.size();
    for (int i = 1; i < n; ++i) {
        while (A[i] != i) {
            int j = A[i];
            if (A[j] == j) {
                /* j is the duplicate. Now we need to count them.
                 * We have one at i. There's one at j, too, but we only
                 * need to add it if we're not going to run into it in
                 * the scan. And there might be one at position 0. After that,
                 * we just scan through the rest of the array.
                 */
                int count = 1;
                if (A[0] == j) ++count;
                if (j < i) ++count;
                for (++i; i < n; ++i) {
                    if (A[i] == j) ++count;
                }
                return std::make_pair(j, count);
            }
            /* This swap can only happen once per element. */
            std::swap(A[i], A[j]);
        }
    }
    /* If we get here, every element from 1 to n is at home.
     * So the duplicate must be A[0], and the duplicate count
     * must be 2.
     */
    return std::make_pair(A[0], 2);
}
A parallel solution with O(1) complexity is possible.
Introduce an array of atomic booleans and two atomic integers called duplicate and count. First set count to 1. Then access the array in parallel at the index positions of the numbers and perform a test-and-set operation on the boolean. If a boolean is set already, assign the number to duplicate and increment count.
This solution may not always perform better than the suggested sequential alternatives. Certainly not if all numbers are duplicates. Still, it has constant complexity in theory. Or maybe linear complexity in the number of duplicates. I am not quite sure. However, it should perform well when using many cores and especially if the test-and-set and increment operations are lock-free.
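A rough sketch of that scheme, using C++17 parallel algorithms and assuming C++20 (so the atomic flags are value-initialized to false); this is illustrative, not a tuned implementation:
#include <algorithm>
#include <atomic>
#include <execution>
#include <utility>
#include <vector>

std::pair<int, int> find_dup_parallel(const std::vector<int>& A){
    int n = static_cast<int>(A.size()) - 1;
    std::vector<std::atomic<bool>> seen(n + 1); // one flag per possible value
    std::atomic<int> duplicate{0};
    std::atomic<int> count{1}; // as described above, count starts at 1
    std::for_each(std::execution::par, A.begin(), A.end(), [&](int v){
        if(seen[v].exchange(true)){ // test-and-set; true means v was seen before
            duplicate.store(v);
            count.fetch_add(1);
        }
    });
    return {duplicate.load(), count.load()};
}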

Smallest sum of a sequence of numbers in the array. Do not need code [closed]

Write a function called smallest_sum_sequence() that accepts an array
of signed integers and the number of items in the array as arguments,
and returns the smallest sum of a sequence of numbers in the array. A
sequence is defined as a single item or multiple items that are in
adjacent memory locations.
This is obviously homework. I do not need anyone to write the code for me, just an explanation of what they are actually looking for, as it is worded weirdly in my opinion.
I think what they are wanting is:
Given an array and the total items in the array.
Have the user input a sequence of values for the array ( array[7] -> array[9] )
return smallest sum
Then determine the smallest sum? Is that supposed to be the smallest value or the smallest combination of items? The first sounds too easy and the second doesn't make sense even if you have negatives.
Am looking for any sort of enlightenment.
So a sequence is a set of any number of adjacent numbers in an array. In a set like
[A B C D E]
Any individual item could be an answer. Or [A B] could be an answer. Or [A B C]. Or [C D E]. Or even [A B C D E]. But definitely not [A D E], since A is not adjacent to D in the original set. Easy.
Now you have to write code that will compare the sum of the values in every possible adjacent sequence, in any set of numbers (given the size of that set beforehand).
Edited as the prior answer was wrong!
This is how I understand it. Assume you have an array of signed integers, called A, consisting of, say, <3, 4, 5>. So n = 3, the length of the array.
Your sequence is defined to be a single (or multiple) items in adjacent memory locations. So A[0] and A[1] would be a sequence as they are in adjacent memory locations, but A[0] and A[2] wouldn't be.
You call your function: smallest_sum_sequence(A, n) with A and n as above.
So your sequences are:
+ of length 1) <3>, <4>, <5>
+ of length 2) <3,4>, <4,5>
+ of length 3) <3, 4, 5>
Hence your function should return 3 in this case.
You have to sum each element with the ones that follow it and keep track of the minimum sum found.
You can walk the array like this:
#include <limits.h>
#define MIN(a, b) ((a) < (b) ? (a) : (b))

int i, j, sum;
int min = INT_MAX;
for (i = 0; i < len; i++) {
    sum = array[i];
    min = MIN(min, sum);
    for (j = i + 1; j < len; j++) {
        sum += array[j];
        min = MIN(min, sum);
    }
}
With an array of signed integers it is possible that a larger sequence produces a smaller sum than a single number or a pair.
To find out you need to produce all possible sequences:
Start with first number alone, then first and second, then first, second and third.
Then start with second number ...
Then compute the sum of each sequence.
Return smallest sum (and probably the matching sequence)
Let's look closely at the requirements:
Write a function ... smallest_sum_sequence() that accepts (1) an
array of signed integers and (2) the number of items in the array
as arguments, and (3) returns the smallest sum of a sequence of
numbers in the array.
A sequence is defined as a single item or multiple items ... in
adjacent memory locations.
Taking them one at a time, you know you will write a function that accepts an array of type int and the number of items (which won't be negative, so size_t is a good type). Since it must return a smallest sum, the return type of the function can be int as well:
int smallest_sum_sequence (int *a, size_t n)
{
...
return ssum;
}
That is the basic framework for your function. The next issue to address is the smallest sum. Since you are told you are accepting an array of signed values, you must presume that the values within the array can be both negative and positive. You are next told that the smallest sum can be derived from a single value or from multiple adjacent values.
What I interpret this to mean is that you must keep 2 running values: (1) the minimum value in the array; and (2) the sum of the smallest sequence.
In the arguments you get the number of elements in the array providing you with an easy means to iterate over the array itself:
int i = 0;
int min = INT_MAX;   /* largest possible minimum number */
int ssum = INT_MAX;  /* largest possible smallest sum */
for (i = 0; i < n; i++) {
    min = a[i] < min ? a[i] : min;
    if (i > 0) {
        int tssum = 0; /* temporary smallest sum */
        /* test adjacent values:
         *   if adjacent: tssum += a[i];
         *   if no longer adjacent:
         *     if tssum < ssum, then ssum = tssum
         */
    }
}
After your first iteration over the array, you have found min, the minimum single value, and ssum, the sum of the smallest sequence. Now all that is left is the return:
return min < ssum ? min : ssum;
That is my impression of what the logic asked for. You may have to adjust the logic of the pseudo-code and you need to figure out how to identify a sequence start/end, but this should at least give you an outline of one way to approach it. Good luck.
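For reference, once you work out the sequence start/end logic, the outline above collapses into a single running-minimum scan (the minimum-sum form of Kadane's algorithm). A compact sketch, assuming n > 0:
#include <algorithm>
#include <climits>
#include <cstddef>

int smallest_sum_sequence(const int *a, std::size_t n)
{
    int best = INT_MAX; /* smallest sequence sum seen so far */
    int cur = 0;        /* smallest sum of a sequence ending at the current element */
    for (std::size_t i = 0; i < n; i++) {
        /* either extend the previous sequence or start fresh at a[i] */
        cur = std::min(a[i], cur + a[i]);
        best = std::min(best, cur);
    }
    return best;
}
Because cur can always "start fresh" at a single element, this covers the one-item sequence case as well, so the separate min tracking folds into the same loop.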

Find the unique integer that appears m times in an array

where all other integers in this array appear n times; we have n > m.
All elements in this array are integers. Can you design an algorithm that works in O(N), where N is the number of elements in the array, while minimizing the space complexity? In the best case the space complexity can be limited to O(log(m)).
A special case is n=2 and m=1 (which is easy). Is there a generalized algorithm that can handle arbitrary m and n?
thanks
You can use a hashtable that maps numbers in the array to their number of occurrences: iterate through the array, incrementing the count for each number, then iterate through the hashtable looking for a key with m occurrences.
If the array has length > m, partition it around a random pivot element. Every value other than the sought one contributes a multiple of n to whichever side it falls on, so the side whose length is congruent to m (mod n) contains the element that appears m times; repeat on that side.
This has expected run-time O(N), and requires O(1) additional storage.
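A copy-based sketch of that idea (the O(1) additional storage would come from partitioning in place; this version trades space for clarity). The part whose size is not divisible by n must hold the answer:
#include <cstdlib>
#include <utility>
#include <vector>

int find_m_occurrences(std::vector<int> a, int n){
    while(true){
        int pivot = a[std::rand() % a.size()];
        std::vector<int> less, greater;
        int equal = 0;
        for(int v : a){
            if(v < pivot) less.push_back(v);
            else if(v > pivot) greater.push_back(v);
            else ++equal;
        }
        if(equal % n != 0) return pivot; // the pivot's own value is the answer
        // otherwise recurse into whichever side has a leftover count mod n
        a = (less.size() % n != 0) ? std::move(less) : std::move(greater);
    }
}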
The below algo might be what you're looking for. Basically, you create a map with the integer value as key and the number of occurrences as value. You loop through your array just once, and any time a number occurs again you increase its count:
import java.util.HashMap;
import java.util.Map;

public static void findCount(int[] array, int m, int n){
    if(m > n){
        throw new IllegalArgumentException("m is greater than n");
    }
    Map<Integer,Integer> intCount = new HashMap<Integer,Integer>();
    for(int i = 0; i < array.length; i++){
        if (!intCount.containsKey(array[i])) intCount.put(array[i], 0);
        intCount.put(array[i], intCount.get(array[i]) + 1);
    }
    for (Map.Entry<Integer,Integer> entry : intCount.entrySet()) {
        Integer key = entry.getKey();
        Integer value = entry.getValue();
        if(value == m){
            System.out.println("Value " + key + " occurs " + value + " times");
        }
    }
}
The case n=2, m=1 can be done by xoring all the numbers together in the array A.
This idea can be generalised by counting (modulo n) the number of elements in A with the i'th bit set. That count is non-zero if and only if the i'th bit of the answer is set.
This gives you an O(N·log(max(A))) way to compute the solution using O(log(n)) additional storage.
This doesn't achieve the O(N) run-time and O(log(m)) storage complexities given in the question, but it seems an interesting approach.
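A sketch of that bit-counting generalization, assuming non-negative 32-bit values (for each bit of the answer, the per-bit count mod n is m rather than 0):
#include <vector>

int find_m_times(const std::vector<int>& A, int n){
    int answer = 0;
    for(int bit = 0; bit < 31; ++bit){
        long long cnt = 0;
        for(int v : A)
            cnt += (v >> bit) & 1; // how many elements have this bit set
        if(cnt % n != 0)           // leftover m (0 < m < n) marks a bit of the answer
            answer |= 1 << bit;
    }
    return answer;
}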

How to sort an int array in linear time?

I was given a homework assignment to write a program that sorts an array in ascending order. I did this:
#include <stdio.h>
int main()
{
    int a[100], i, n, j, temp;
    printf("Enter the number of elements: ");
    scanf("%d", &n);
    for (i = 0; i < n; ++i)
    {
        printf("%d. Enter element: ", i + 1);
        scanf("%d", &a[i]);
    }
    for (j = 0; j < n; ++j)
        for (i = j + 1; i < n; ++i)
        {
            if (a[j] > a[i])
            {
                temp = a[j];
                a[j] = a[i];
                a[i] = temp;
            }
        }
    printf("Ascending order: ");
    for (i = 0; i < n; ++i)
        printf("%d ", a[i]);
    return 0;
}
The input will not be more than 10 numbers. Can this be done in less code than I have here? I want the code to be as short as possible. Any help will be appreciated. Thanks!
If you know the range of the array elements, one way is to use another array to store the frequency of each of the array elements (all elements should be int :) ) and print the sorted array. I am posting it for a large number of elements (10^6). You can reduce it according to your need:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void){
    int t, num, *freq = malloc(sizeof(int)*1000001);
    memset(freq, 0, sizeof(int)*1000001); // Set all elements of freq to 0
    scanf("%d", &t); // Ask for the number of elements to be scanned (upper limit is 1000000)
    for(int i = 0; i < t; i++){
        scanf("%d", &num);
        freq[num]++;
    }
    for(int i = 0; i < 1000001; i++){
        if(freq[i]){
            while(freq[i]--){
                printf("%d\n", i);
            }
        }
    }
    free(freq);
    return 0;
}
This algorithm can be modified further. The modified version is known as Counting sort and it sorts the array in Θ(n) time.
Counting sort:1
Counting sort assumes that each of the n input elements is an integer in the range
0 to k, for some integer k. When k = O(n), the sort runs in Θ(n) time.
Counting sort determines, for each input element x, the number of elements less
than x. It uses this information to place element x directly into its position in the
output array. For example, if 17 elements are less than x, then x belongs in output
position 18. We must modify this scheme slightly to handle the situation in which
several elements have the same value, since we do not want to put them all in the
same position.
In the code for counting sort, we assume that the input is an array A[1...n] and
thus A.length = n. We require two other arrays: the array B[1....n] holds the
sorted output, and the array C[0....k] provides temporary working storage.
The pseudo code for this algo:
for i ← 0 to k do
    c[i] ← 0
for j ← 1 to n do
    c[A[j]] ← c[A[j]] + 1
// c[i] now contains the number of elements equal to i
for i ← 1 to k do
    c[i] ← c[i] + c[i-1]
// c[i] now contains the number of elements ≤ i
for j ← n downto 1 do
    B[c[A[j]]] ← A[j]
    c[A[j]] ← c[A[j]] - 1
1. Content has been taken from Introduction to Algorithms by
Thomas H. Cormen and others.
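As a concrete companion to that pseudo-code, here is a minimal, 0-indexed C++ transcription (assuming all values lie in [0, k]):
#include <vector>

std::vector<int> counting_sort(const std::vector<int>& A, int k){
    std::vector<int> C(k + 1, 0), B(A.size());
    for(int x : A) ++C[x];                           // C[i] = number of elements equal to i
    for(int i = 1; i <= k; ++i) C[i] += C[i - 1];    // C[i] = number of elements <= i
    for(int j = static_cast<int>(A.size()) - 1; j >= 0; --j)
        B[--C[A[j]]] = A[j];                         // place A[j] in its final, stable slot
    return B;
}
Iterating the input backwards in the last loop is what makes the sort stable, which matters when counting sort is used as a subroutine (e.g. in radix sort).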
You have 10 lines doing the sorting. If you're allowed to use someone else's work (subsequent notes indicate that you can't do this), you can reduce that by writing a comparator function and calling the standard C library qsort() function:
static int compare_int(void const *v1, void const *v2)
{
int i1 = *(int *)v1;
int i2 = *(int *)v2;
if (i1 < i2)
return -1;
else if (i1 > i2)
return +1;
else
return 0;
}
And then the call is:
qsort(a, n, sizeof(a[0]), compare_int);
Now, I wrote the function the way I did for a reason. In particular, it avoids arithmetic overflow which writing this does not:
static int compare_int(void const *v1, void const *v2)
{
return *(int *)v1 - *(int *)v2;
}
Also, the original pattern generalizes to comparing structures, etc. You compare the first field for inequality returning the appropriate result; if the first fields are unequal, then you compare the second fields; then the third, then the Nth, only returning 0 if every comparison shows the values are equal.
Obviously, if you're supposed to write the sort algorithm, then you'll have to do a little more work than calling qsort(). Your algorithm is a Bubble Sort. It is one of the most inefficient sorting techniques: it is O(N^2). You can look up Insertion Sort (also O(N^2), but more efficient than Bubble Sort), or Selection Sort (also quadratic), or Shell Sort (very roughly O(N^(3/2))), or Heap Sort (O(N log N)), or Quick Sort (O(N log N) on average, but O(N^2) in the worst case), or Intro Sort. The only ones that might be shorter than what you wrote are Insertion and Selection sorts; the others will be longer but faster for large amounts of data. For small sets like 10 or 100 numbers, efficiency is immaterial: all sorts will do. But as you get towards 1,000 or 1,000,000 entries, the sorting algorithms really matter. You can find a lot of questions on Stack Overflow about different sorting algorithms, and you can easily find information on Wikipedia for any and all of the algorithms mentioned.
Incidentally, if the input won't be more than 10 numbers, you don't need an array of size 100.

Finding kth smallest number from n sorted arrays

So, you have n sorted arrays (not necessarily of equal length), and you are to return the kth smallest element in the combined array (i.e the combined array formed by merging all the n sorted arrays)
I have been trying it and its other variants for quite a while now, and till now I only feel comfortable in the case where there are two arrays of equal length, both sorted and one has to return the median of these two.
This has logarithmic time complexity.
After this I tried to generalize it to finding kth smallest among two sorted arrays. Here is the question on SO.
Even here the solution given is not obvious to me. But even if I somehow manage to convince myself of this solution, I am still curious as to how to solve the absolute general case (which is my question)
Can somebody explain to me a step-by-step solution (which, again, in my opinion should take logarithmic time, i.e. O(log(n1) + log(n2) + ... + log(nN)), where n1, n2, ..., nN are the lengths of the n arrays) which starts from the more specific cases and moves on to the more general one?
I know similar questions for more specific cases are there all over the internet, but I haven't found a convincing and clear answer.
Here is a link to a question (and its answer) on SO which deals with 5 sorted arrays and finding the median of the combined array. The answer just gets too complicated for me to able to generalize it.
Even clean approaches for the more specific cases (as I mentioned during the post) are welcome.
PS: Do you think this can be further generalized to the case of unsorted arrays?
PPS: It's not a homework problem, I am just preparing for interviews.
This doesn't generalize the links, but does solve the problem:
Go through all the arrays and if any have length > k, truncate to length k (this is silly, but we'll mess with k later, so do it anyway)
Identify the largest remaining array A. If more than one, pick one.
Pick the middle element M of the largest array A.
Use a binary search on the remaining arrays to find the same element (or the largest element <= M).
Based on the indexes of the various elements, calculate the total number of elements <= M and > M. This should give you two numbers: L, the number <= M and G, the number > M
If k < L, truncate all the arrays at the split points you've found and iterate on the smaller arrays (use the bottom halves).
If k > L, truncate all the arrays at the split points you've found and iterate on the smaller arrays (use the top halves, and search for element (k-L).
When you get to the point where you only have one element per array (or 0), make a new array of size n with those data, sort, and pick the kth element.
Because you're always guaranteed to remove at least half of one array, in N iterations, you'll get rid of half the elements. That means there are N log k iterations. Each iteration is of order N log k (due to the binary searches), so the whole thing is N^2 (log k)^2 That's all, of course, worst case, based on the assumption that you only get rid of half of the largest array, not of the other arrays. In practice, I imagine the typical performance would be quite a bit better than the worst case.
It cannot be done in less than O(n) time. Proof sketch: if it did, it would have to completely skip at least one array, and obviously one array can arbitrarily change the value of the kth element.
I have a relatively simple O(n*log(n)*log(m)) solution, where m is the length of the longest array. I'm sure it is possible to be slightly faster, but not a lot faster.
Consider the simple case where you have n arrays each of length 1. Obviously, this is isomorphic to finding the kth element in an unsorted list of length n. It is possible to find this in O(n), see Median of Medians algorithm, originally by Blum, Floyd, Pratt, Rivest and Tarjan, and no (asymptotically) faster algorithms are possible.
Now the problem is how to expand this to longer sorted arrays. Here is the algorithm: Find the median of each array. Build the list of tuples (median, length of array/2) and sort it by median. Walk through the list keeping a running sum of the lengths until you reach a sum greater than k. You now have a pair of medians such that you know the kth element is between them. Now, for each median, we know whether the kth element is greater or less than it, so we can throw away half of each array. Repeat. Once the arrays are all one element long (or less), we use the selection algorithm.
Implementing this will reveal additional complexities and edge conditions, but nothing that increases the asymptotic complexity. Each step
Finds the medians of the arrays, O(1) each, so O(n) total
Sorts the medians O(n log n)
Walks through the sorted list O(n)
Slices the arrays O(1) each so, O(n) total
that is O(n) + O(n log n) + O(n) + O(n) = O(n log n). And we must repeat this until the longest array has length 1, which takes log m steps, for a total of O(n*log(n)*log(m)).
You ask if this can be generalized to the case of unsorted arrays. Sadly, the answer is no. Consider the case where we only have one array, then the best algorithm will have to compare at least once with each element for a total of O(m). If there were a faster solution for n unsorted arrays, then we could implement selection by splitting our single array into n parts. Since we just proved selection is O(m), we are stuck.
You could look at my recent answer on the related question here. The same idea can be generalized to multiple arrays instead of 2. In each iteration you reject the second half of the array with the largest middle element if k is less than the sum of the mid indexes of all arrays; alternatively, you reject the first half of the array with the smallest middle element if k is greater than the sum of the mid indexes of all arrays, adjusting k accordingly. Keep doing this until all but one array have been reduced to 0 length. The answer is the kth element of the last array that wasn't stripped to 0 elements.
Run-time analysis:
You get rid of half of one array in each iteration, but to determine which array is going to be reduced, you spend time linear in the number of arrays. Assuming each array is of the same length, the run time is going to be O(c^2 * log(n)), where c is the number of arrays and n is the length of each array.
There exists a generalization that solves the problem in O(N log k) time; see the question here.
Old question, but none of the answers were good enough, so I am posting a solution using the sliding-window technique and a heap:
import java.util.List;
import java.util.PriorityQueue;

class Node {
    int elementIndex;
    int arrayIndex;

    public Node(int elementIndex, int arrayIndex) {
        this.elementIndex = elementIndex;
        this.arrayIndex = arrayIndex;
    }
}

public class KthSmallestInMSortedArrays {

    public int findKthSmallest(List<Integer[]> lists, int k) {
        int ans = 0;
        PriorityQueue<Node> pq = new PriorityQueue<>((a, b) -> {
            return lists.get(a.arrayIndex)[a.elementIndex] -
                   lists.get(b.arrayIndex)[b.elementIndex];
        });
        // Seed the heap with the first element of every non-empty array.
        for (int i = 0; i < lists.size(); i++) {
            Integer[] arr = lists.get(i);
            if (arr != null && arr.length > 0) {
                pq.add(new Node(0, i));
            }
        }
        int count = 0;
        while (!pq.isEmpty()) {
            Node curr = pq.poll();
            ans = lists.get(curr.arrayIndex)[curr.elementIndex];
            if (++count == k) {
                break;
            }
            // Push the next element from the same array, if there is one.
            if (curr.elementIndex + 1 < lists.get(curr.arrayIndex).length) {
                curr.elementIndex++;
                pq.offer(curr);
            }
        }
        return ans;
    }
}
The maximum number of elements that we need to access here is O(K) and there are M arrays. So the effective time complexity will be O(K*log(M)).
This would be the code; it is O(k*log(m)):
public int findKSmallest(int[][] A, int k) {
    PriorityQueue<int[]> queue = new PriorityQueue<>(Comparator.comparingInt(x -> A[x[0]][x[1]]));
    for (int i = 0; i < A.length; i++)
        queue.offer(new int[] { i, 0 });
    int ans = 0;
    while (!queue.isEmpty() && --k >= 0) {
        int[] el = queue.poll();
        ans = A[el[0]][el[1]];
        if (el[1] < A[el[0]].length - 1) {
            el[1]++;
            queue.offer(el);
        }
    }
    return ans;
}
If k is not that huge, we can maintain a priority min-queue: loop over the heads of the sorted arrays, repeatedly extracting the smallest element and enqueuing the next one from the same array; once we have extracted k elements, we have the first k smallest.
Alternatively, maybe we can regard the n sorted arrays as buckets and try the bucket-sort method.
This could be considered the second half of a merge sort. We could simply merge all the sorted lists into a single list...but only keep k elements in the combined lists from merge to merge. This has the advantage of using only O(k) space and something slightly better than merge sort's O(n log n) complexity; that is, it should in practice run slightly faster than a merge sort. Choosing the kth smallest from the final combined list is O(1). This kind of complexity is not so bad.
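A short sketch of that truncated merge (k is 1-based here): keeping only the first k elements of each intermediate merge can never discard one of the k smallest overall, since anything past position k already has k smaller elements ahead of it.
#include <algorithm>
#include <iterator>
#include <utility>
#include <vector>

int kth_smallest_by_merge(const std::vector<std::vector<int>>& lists, int k){
    std::vector<int> acc; // merged prefix, never longer than k
    for(const auto& l : lists){
        std::vector<int> merged;
        std::merge(acc.begin(), acc.end(), l.begin(), l.end(),
                   std::back_inserter(merged));
        if(static_cast<int>(merged.size()) > k) merged.resize(k);
        acc = std::move(merged);
    }
    return acc[k - 1]; // assumes k <= total number of elements
}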
It can be done by doing a binary search in each array while calculating the number of smaller elements.
I used bisect_left and bisect_right to make it work for non-unique numbers as well:
from bisect import bisect_left
from bisect import bisect_right

def kthOfPiles(givenPiles, k, count):
    '''
    Perform binary search for the kth element in multiple sorted lists

    parameters
    ==========
    givenPiles are list of sorted list
    count is the total number of elements
    k is the target index in range [0..count-1]
    '''
    begins = [0 for pile in givenPiles]
    ends = [len(pile) for pile in givenPiles]
    #print('finding k=', k, 'count=', count)
    for pileidx, pivotpile in enumerate(givenPiles):
        while begins[pileidx] < ends[pileidx]:
            mid = (begins[pileidx] + ends[pileidx]) >> 1
            midval = pivotpile[mid]
            smaller_count = 0
            smaller_right_count = 0
            for pile in givenPiles:
                smaller_count += bisect_left(pile, midval)
                smaller_right_count += bisect_right(pile, midval)
            #print('check midval', midval, smaller_count, k, smaller_right_count)
            if smaller_count <= k and k < smaller_right_count:
                return midval
            elif smaller_count > k:
                ends[pileidx] = mid
            else:
                begins[pileidx] = mid + 1
    return -1
Please find below the C# code to find the k-th smallest element in the union of two sorted arrays; k is 0-indexed here. Time complexity: O(log k).
// k is 0-indexed; [start1, end1) and [start2, end2) are the live ranges.
public int findKthElement(int k, int[] array1, int start1, int end1, int[] array2, int start2, int end2)
{
    // if (k > m + n) exception
    if (start1 == end1)
    {
        return array2[start2 + k];
    }
    if (start2 == end2)
    {
        return array1[start1 + k];
    }
    if (k == 0)
    {
        return Math.Min(array1[start1], array2[start2]);
    }
    // Discard up to half of the first k+1 elements from one of the arrays.
    int half = (k + 1) / 2;
    int sub1 = Math.Min(half, end1 - start1);
    int sub2 = Math.Min(half, end2 - start2);
    if (array1[start1 + sub1 - 1] < array2[start2 + sub2 - 1])
    {
        // The first sub1 elements of array1 cannot contain the kth element.
        return findKthElement(k - sub1, array1, start1 + sub1, end1, array2, start2, end2);
    }
    else
    {
        // Symmetrically, discard the first sub2 elements of array2.
        return findKthElement(k - sub2, array1, start1, end1, array2, start2 + sub2, end2);
    }
}
