finding 10 largest integers in an array in O(n) time - arrays

Let S be a set of n integers stored in an array (not necessarily sorted). Design an algorithm to find the 10 largest integers in S (by creating a separate array of length 10 storing those integers). Your algorithm must finish in O(n) time.
I thought I could maybe answer this by using counting sort and then adding the last 10 elements to the new array. But apparently this is wrong. Does anyone know a better way?

Method 1:
You can use a FindMax() routine that finds the maximum in O(N); run it 10 times:
10 * O(N) = O(N)
Each time you find the maximum, put it in the new array and ignore it the next time you run FindMax().
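A minimal sketch of Method 1 (the helper name `findMax10` is mine, not from the answer): each pass scans for the current maximum and swaps it past the end of the scanned range so later passes ignore it.

```cpp
#include <utility>
#include <vector>

// Sketch of Method 1: run a linear max-scan 10 times, moving each found
// maximum to the back of the array so the next pass ignores it.
std::vector<int> findMax10(std::vector<int> a) {         // assumes a.size() >= 10
    std::vector<int> result;
    int end = static_cast<int>(a.size());
    for (int pass = 0; pass < 10; ++pass) {
        int maxIdx = 0;
        for (int i = 1; i < end; ++i)                    // FindMax over a[0..end)
            if (a[i] > a[maxIdx]) maxIdx = i;
        result.push_back(a[maxIdx]);
        std::swap(a[maxIdx], a[--end]);                  // ignored on later passes
    }
    return result;                                       // 10 * O(N) = O(N)
}
```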
Method 2:
You can use bubble sort, limited to 10 passes:
1) Modify Bubble Sort to run the outer loop at most 10 times.
2) Save the last 10 elements of the array obtained in step 1 to the new array.
10 * O(N) = O(N)
Method 3:
You can use a max-heap:
1) Build a max-heap in O(n).
2) Use extract-max 10 times to get the 10 maximum elements from the max-heap: 10 * O(log n).
O(N) + 10 * O(log N) = O(N)
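A sketch of Method 3 using the standard library (the helper name `top10ViaHeap` is mine): `std::make_heap` builds a max-heap in O(n), and `std::pop_heap` plays the role of extract-max.

```cpp
#include <algorithm>
#include <vector>

// Sketch of Method 3: build a max-heap in O(n), then extract-max 10 times.
std::vector<int> top10ViaHeap(std::vector<int> a) {      // assumes a.size() >= 10
    std::vector<int> result;
    std::make_heap(a.begin(), a.end());                  // O(n) heapify
    for (int i = 0; i < 10; ++i) {
        std::pop_heap(a.begin(), a.end());               // max moves to a.back(), O(log n)
        result.push_back(a.back());
        a.pop_back();
    }
    return result;                                       // O(n) + 10 * O(log n) = O(n)
}
```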

Visit:
http://www.geeksforgeeks.org/k-largestor-smallest-elements-in-an-array/
They have mentioned six methods for this.

Use an order-statistic (selection) algorithm to find the 10th largest element.
Next, iterate over the array to find all elements which are greater than or equal to it (watching out for duplicates of the 10th largest so that you keep exactly 10).
Time complexity: O(n) for the order statistic + O(n) for iterating over the array once => O(n)
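A sketch of this approach with `std::nth_element` (expected linear time; the helper name is mine). Note that after the call, the elements from the partition point to the end are already the 10 largest, so a separate comparison scan is not even needed, although a scan against the pivot value would also work with care for duplicates.

```cpp
#include <algorithm>
#include <vector>

// Sketch: partition around the 10th largest with nth_element; everything at or
// after that position is then one of the 10 largest values.
std::vector<int> top10ViaSelect(std::vector<int> a) {    // assumes a.size() >= 10
    auto tenthLargest = a.end() - 10;
    std::nth_element(a.begin(), tenthLargest, a.end());  // expected O(n)
    return std::vector<int>(tenthLargest, a.end());      // copy the last 10 slots
}
```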

Insert them into a balanced binary tree, keeping only the 10 largest seen so far: N insertions at O(log 10) each, i.e. O(N) overall.
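A minimal sketch of that idea (my interpretation of the answer above; the helper name is mine), using `std::multiset` as the balanced tree and trimming it to 10 elements as it goes:

```cpp
#include <set>
#include <vector>

// Sketch: maintain a balanced tree (std::multiset) of the 10 largest seen so far.
std::vector<int> top10ViaTree(const std::vector<int>& a) {    // assumes a.size() >= 10
    std::multiset<int> best;                                   // red-black tree under the hood
    for (int x : a) {
        best.insert(x);                                        // O(log 10) once trimmed
        if (best.size() > 10)
            best.erase(best.begin());                          // drop the smallest of the 11
    }
    return std::vector<int>(best.begin(), best.end());         // the 10 largest, ascending
}
```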


Why is the complexity of merging M sorted arrays linear time?

Suppose we want to perform an external sort and have M sorted blocks, where each block contains k comparable items such that n = Mk. Here k also refers to the maximum number of items you can fit into memory for sorting, and n is the total number of items to sort.
Then, using the merge function of merge sort, each element will have to be compared against the current elements of all the other blocks, which gives O(M) comparisons per element. Since we have to do this for all elements, we get O(M * Mk) = O(M^2 * k) = O(nM) time complexity.
This seems to be linear at first, but suppose in the worst case we can only fit 1 item into memory. Then we have M = n blocks, and the time complexity is O(n^2) directly. How does the merging give you linear time in external sort?
Also, in the case where k = 1, how is the sorting even feasible when there cannot be any comparisons done?
Make it priority-queue based: for example, use a binary heap, fill it with the current item (or its index) from every block, and extract the top item at every step.
Extracting takes O(log(M)) per output element, so full merging is O(n*log(M))
For your artificial example: O(n*log(n))
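A sketch of that heap-based merge (the function name and the block representation are mine): the heap holds one cursor per block, so each of the n output elements costs O(log M).

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <tuple>
#include <vector>

// Sketch: k-way merge of M sorted blocks with a min-heap of (value, block, offset).
std::vector<int> mergeBlocks(const std::vector<std::vector<int>>& blocks) {
    using Entry = std::tuple<int, std::size_t, std::size_t>;   // value, block index, offset
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    for (std::size_t b = 0; b < blocks.size(); ++b)
        if (!blocks[b].empty())
            heap.emplace(blocks[b][0], b, 0);                  // current front of every block
    std::vector<int> out;
    while (!heap.empty()) {
        auto [value, b, i] = heap.top();                       // O(log M) per output element
        heap.pop();
        out.push_back(value);
        if (i + 1 < blocks[b].size())
            heap.emplace(blocks[b][i + 1], b, i + 1);          // advance that block's cursor
    }
    return out;                                                // total O(n log M)
}
```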

Range minimum queries when array is dynamic

I have an array, say A (0-indexed), of size n.
I want to find the minimum in A between indexes k1 (k1 >= 0) and A.size()-1 (i.e. the last element).
Then I insert the value (minimum element in the given range + some "random" constant) at the end of the array. Then I have another query to find the minimum between indexes k2 and A.size()-1. I find that, and insert the value (minimum in the given range + another "random" constant) at the end. I have to do many such queries.
Say I have N queries. The naive approach would take O(N^2).
I cannot use a segment tree directly, as the array is not static. But a clever workaround is to build a segment tree for a size N+1 array beforehand and fill the unknown slots with infinity. This would give me O(N log N) complexity.
Is there any other method with O(N log N) complexity, or even O(N)?
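For reference, a compact sketch of the pre-sized segment-tree workaround described above (the class name and layout are mine): reserve one slot per initial element plus one per future append, fill everything with "infinity", and then every suffix-min query and every append is a point operation in O(log capacity). Each query becomes `rangeMin(k, currentSize)` followed by `set(currentSize++, result + c)`.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Sketch: an iterative min segment tree over a fixed capacity, pre-filled with
// "infinity"; real values are written with point updates as they arrive.
struct MinSegTree {
    int n;
    std::vector<long long> t;
    explicit MinSegTree(int capacity)
        : n(capacity), t(2 * capacity, std::numeric_limits<long long>::max()) {}
    void set(int i, long long v) {                    // slot i gets value v, O(log n)
        for (t[i += n] = v; i > 1; i >>= 1)
            t[i >> 1] = std::min(t[i], t[i ^ 1]);
    }
    long long rangeMin(int l, int r) const {          // min over slots [l, r), O(log n)
        long long res = std::numeric_limits<long long>::max();
        for (int a = l + n, b = r + n; a < b; a >>= 1, b >>= 1) {
            if (a & 1) res = std::min(res, t[a++]);
            if (b & 1) res = std::min(res, t[--b]);
        }
        return res;
    }
};
```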
There is absolutely no need to use an advanced data structure like a tree here. A simple local variable and a list will do it all:
Create an empty list(say minList).
Starting from the end index and going down to the start index of the initially given array, push the minimum value seen so far (i.e. the minimum from the end up to that index) onto the front of the list (push_front).
Lets say the provided array is:
70 10 50 40 60 90 20 30
So the resultant minList will be:
10 10 20 20 20 20 20 30
After doing that, you only need to keep track of the minimum among the newly appended elements of the continuously growing array (say, minElemAppended).
Let's say you get k = 5 and randomConstant = -10 (with k taken 1-based here). Then the answer to the query is minimum(minList[k-1], minElemAppended); the value you append is that minimum + randomConstant, and minElemAppended is updated to the smaller of itself and the appended value.
By adopting this approach:
You don't need to traverse the appended part, or even the initially given array, for any query.
You even have the option of not appending the elements at all.
Time Complexity: O(N) to process N queries.
Space Complexity: O(N) to store the minList
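A sketch of this approach (the struct and variable names follow the answer where possible; indices are taken 0-based here, and, like the answer, a single running minimum stands in for the whole appended part, which is exact whenever k falls inside the initially given array):

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <limits>
#include <vector>

// Sketch: precompute the suffix minima of the initial array once, then answer
// every (k, randomConstant) query with two comparisons.
struct SuffixMinQueries {
    std::deque<long long> minList;        // minList[i] = min of A[i .. end of initial array]
    long long minElemAppended = std::numeric_limits<long long>::max();

    explicit SuffixMinQueries(const std::vector<long long>& a) {
        long long running = std::numeric_limits<long long>::max();
        for (auto it = a.rbegin(); it != a.rend(); ++it) {
            running = std::min(running, *it);
            minList.push_front(running);  // built back to front, O(N) total
        }
    }

    // One query: report min over [k, current end], conceptually append (min + c).
    long long query(std::size_t k, long long c) {
        long long m = minElemAppended;
        if (k < minList.size())           // k lies inside the initial array
            m = std::min(m, minList[k]);
        minElemAppended = std::min(minElemAppended, m + c);   // track the appended values
        return m;
    }
};
```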

An array contains 2 distinct values. One value appears exactly once and the other can repeat any number of times. Find the distinct element in < O(n) time

An O(n) solution is straightforward. I can think of something in terms of binary search, but that again will be O(n) in the worst case, since the input array need not be sorted. Is there any solution that runs in O(log n) time?
Maybe one of the solutions can be based on a divide-and-conquer technique.
e.g. Array -> [2,2,2,2,3,2,2]
Take three variables `First`, `Mid` and `Last`.
1. Find `First`, the first element -> 2 in our case.
2. Find the length of the array -> 7.
3. Find `Mid`, the midpoint -> 4.
4. Sum the first `Mid` elements -> 2+2+2+2 = 8, and compare it against `Mid` * `First`.
5. If the sum of the first `Mid` elements equals `Mid` * `First`, the unique element lies in the right half, so continue the search there; otherwise continue in the left half.
The above method halves the search range with each comparison, even in the worst case; see the sketch below.
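A sketch of that idea (helper names are mine; it assumes the array has at least three elements). Two caveats: the prefix-sum table below makes each "is the left part all equal to `First`?" check O(1), but building that table is itself an O(n) pass, and the up-front check covers the corner case where the very first element is the unique one.

```cpp
#include <iostream>
#include <vector>

// Sketch of the divide-and-conquer idea above: binary search for the first
// prefix whose sum deviates from "all elements equal a[0]".
int findDistinct(const std::vector<int>& a) {
    int n = static_cast<int>(a.size());
    // If a[0] disagrees with both a[1] and a[2], a[0] itself is the unique element.
    if (a[0] != a[1] && a[0] != a[2]) return a[0];

    std::vector<long long> prefix(n + 1, 0);               // prefix[i] = sum of a[0..i)
    for (int i = 0; i < n; ++i) prefix[i + 1] = prefix[i] + a[i];

    int lo = 1, hi = n;                                    // search over prefix lengths
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (prefix[mid] == static_cast<long long>(mid) * a[0])
            lo = mid + 1;                                  // first mid elements all equal a[0]: go right
        else
            hi = mid;                                      // the odd one out is among the first mid
    }
    return a[lo - 1];                                      // first element that differs from a[0]
}

int main() {
    std::vector<int> a = {2, 2, 2, 2, 3, 2, 2};            // example from the post
    std::cout << findDistinct(a) << '\n';                  // prints 3
}
```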

Time Complexity of Insertion and Selection sort When there are only two key values in an array

I have been reviewing Algorithms, 4th Edition by Sedgewick recently, and came across a problem I cannot solve.
The problem goes like this:
2.1.28 Equal keys. Formulate and validate hypotheses about the running time of insertion sort and selection sort for arrays that contain just two key values, assuming that the values are equally likely to occur.
Explanation: You have n elements, each can be 0 or 1 (without loss of generality), and for each element x: P(x=0)=P(x=1).
Any help will be welcomed.
Selection sort:
The time complexity is going to remain the same as it is without the two-key assumption: it is independent of the values in the array and depends only on the number of elements.
Time complexity for selection sort in this case is O(n^2)
However, this is true only for the original algorithm, which scans the entire tail of the array on each outer-loop iteration. If you optimize it to stop at the next "0": at iteration i, since you have already "cleared" the first i-1 zeros, the i-th zero's mean location is at index 2i. This means that each time, the inner loop will need to do about 2i-(i-1) = i+1 iterations.
Summing it up gives:
1 + 2 + ... + n = n(n+1)/2
Which is, unfortunately, still in O(n^2).
Another optimization could be to "remember" where you last stopped. This significantly improves the complexity to O(n), since you don't need to traverse the same element more than once - but that's really a different algorithm, not selection sort; one reading of it is sketched below.
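One way to read that "remember where you stopped" idea for a 0/1 array (names are mine; as noted above, this is essentially a single-pass partition rather than selection sort):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Single pass over a 0/1 array: remember where the next 0 should go and swap
// each 0 forward as it is encountered; every element is visited once, so O(n).
void sortBinary(std::vector<int>& a) {
    std::size_t nextZero = 0;                 // write position for the next 0
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] == 0)
            std::swap(a[i], a[nextZero++]);
}
```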
Insertion Sort:
Here, things are more complicated. Note that in the inner loop (taken from wikipedia), the number of operations depends on the values:
while j > 0 and A[j-1] > x
However, recall that in insertion sort, after the ith step, the first i elements are sorted. Since we are assuming P(x=0)=P(x=1), an average of i/2 elements are 0's and i/2 are 1's.
This means, the time complexity on average, for the inner loop is O(i/2).
Summing this up will get you:
1/2 + 2/2 + 3/2 + ... + n/2 = 1/2* (1+2+...+n) = 1/2*n(n+1)/2 = n(n+1)/4
The above is however, still in O(n^2).
The above is not a formal proof, however, because it implicitly uses E(f(x)) = f(E(x)), which is not true in general, but it can give you guidelines for how to formally build your proof.
Well, obviously you only need to search until you find the first 0 when looking for the next smallest. For example, in selection sort, you scan the array looking for the next smallest number to swap into the current position. Since there are only 0s and 1s, you can stop the scan when encountering the first 0 (since it is the next smallest number), so there is no need to continue scanning the rest of the array in this cycle. If no 0 is found, then the sorting is complete, since the "unsorted" portion is all 1s.
Insertion sort is basically the same. They are both O(N) in this case.

finding the maximum number in array

There is an unsorted array of numbers, and we should find the maximum number n such that at least n numbers in the array are bigger than n (this number may or may not itself be in the array).
For example, given 2 5 7 6 9, the number 4 is the maximum number such that at least 4 numbers (or more) are bigger than 4 (5, 6, 7 and 9 are bigger).
I solved this problem, but I think it will exceed the time limit on large arrays, so I want to solve it in another way.
My approach: I sort with merge sort, because it takes O(n log n), and then I use a counter that counts upwards from 1 as long as there are at least k numbers bigger than k. For example, we count from 1 to 4; then at 5 we don't have 5 numbers bigger than 5, so we return k-1 = 4, and this is our n.
Is this good enough, or might it still exceed the time limit? Does anybody have another idea?
Thanks
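For what it's worth, the sort-and-count approach described in the question might look like this (the function name is mine; it sorts descending and then grows k while the condition still holds):

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Sketch of the asker's approach: sort descending (O(n log n)), then grow k
// while the (k+1)-th largest element is still bigger than k+1.
int maxWithNBigger(std::vector<int> a) {
    std::sort(a.begin(), a.end(), std::greater<int>());   // a[k] is the (k+1)-th largest
    int k = 0;
    while (k < static_cast<int>(a.size()) && a[k] > k + 1)
        ++k;                                               // at least k+1 elements exceed k+1
    return k;                                              // e.g. {2,5,7,6,9} -> 4
}
```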
In C++ there is a function called std::nth_element, and it can find the n-th element of an array in linear time. Using this function you should find the (N - n)-th element (where N is the total number of elements in the array) and subtract 1 from it.
As you seek a solution in C you cannot make use of this function, but you can implement your solution similarly. nth_element does something quite similar to qsort, but it only recurses into the partition of the array where the n-th element lies.
Now let's assume you have nth_element implemented. We will perform something like a combination of binary search and nth_element. First we assume that the answer to the question is the middle element of the array (i.e. the N/2-th element). We use nth_element to find the N/2-th element. If it is more than N/2, we know the answer to your problem is at least N/2; otherwise it will be less. Either way, in order to find the answer we only continue with one of the two partitions created by the N/2-th element. If the right partition (elements bigger than the N/2-th element) is the relevant one, we continue solving the same problem there; otherwise we search, to the left of the N/2-th element, for the maximum element M that has at least x bigger elements such that x + N/2 > M. The two subproblems have the same structure. You continue performing this operation until the interval you are interested in has length 1.
Now let's prove that the complexity of the above algorithm is linear. The first nth_element call is linear, performing on the order of N operations; the second nth_element call, which only considers one half of the array, performs on the order of N/2 operations; the third, on the order of N/4; and so on. All in all you perform on the order of N + N/2 + N/4 + ... + 1 operations. This sum is less than 2 * N, thus your complexity is still linear.
Your solution is asymptotically slower than what I propose above, as it has complexity O(n log n), while my solution has complexity O(n).
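A sketch of that nth_element-based search (the variable names and the exact bookkeeping are mine): each probe partitions only the shrinking subrange that can still contain the element we need to inspect, so the total work is on the order of N + N/2 + N/4 + ... < 2N.

```cpp
#include <algorithm>
#include <vector>

// Sketch: binary search the answer n; each probe calls nth_element only on the
// part of the array that can still contain the element we need to look at.
int maxWithNBiggerLinear(std::vector<int> v) {
    const int N = static_cast<int>(v.size());
    int lo = 0, hi = N;            // candidate answers n lie in [lo, hi]
    int first = 0, last = N;       // subrange of v that still matters
    while (lo < hi) {
        int mid = (lo + hi + 1) / 2;                    // candidate n, biased upward
        int pos = N - mid;                              // index of the mid-th largest once partitioned
        std::nth_element(v.begin() + first, v.begin() + pos, v.begin() + last);
        if (v[pos] > mid) {                             // at least mid elements exceed mid
            lo = mid;                                   // answer is at least mid
            last = pos;                                 // larger candidates live in the left part
        } else {
            hi = mid - 1;                               // answer is smaller than mid
            first = pos + 1;                            // smaller candidates live in the right part
        }
    }
    return lo;                                          // e.g. {2,5,7,6,9} -> 4
}
```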
I would use a modified variant of a sorting algorithm that uses pivot values.
The reason is that you want to sort as few elements as possible.
So I would use qsort as my base algorithm and let the pivot element control which partition to sort (you will only need to sort one).
