Algorithm for sorting d sorted arrays

Please help me understand the running time of the following algorithm.
I have d already-sorted arrays (every array has more than one element) with n elements in total.
I want to produce one sorted array of size n.
If I am not mistaken, insertion sort runs in linear time on partially sorted arrays.
Suppose I concatenate these d arrays into one n-element array and sort it with insertion sort.
Isn't that a partially sorted array, so that insertion sort runs on it in O(n) time?

Insertion sort is O(n²) in the worst case, even when the original array is a concatenation of several presorted arrays. You probably need to use the merge step of mergesort to combine several sorted arrays into one sorted array. This will give you O(n·log(d)) performance.

No, this will take quadratic time. Insertion sort is only linear if each element is at most a constant distance c away from the point where it would be in a sorted array, in which case it takes O(nc) time; that's what's meant by partially sorted. (I write c here to avoid confusion with your d, the number of arrays.) You don't have that guarantee.
You can do this in linear time only under the assumption that the number of subarrays is guaranteed to be a small constant. In that case, you can use a k-way merge with k = d.
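To make the quadratic behaviour concrete, here is a minimal Python sketch that counts how many element shifts insertion sort performs. The function name insertion_sort_shifts is made up for this illustration. The input is the concatenation of two sorted arrays where every element of the second array is smaller than every element of the first, so each of the n/2 later elements has to move past n/2 others.

```python
def insertion_sort_shifts(a):
    """Return (sorted copy of a, number of element shifts performed)."""
    a = list(a)
    shifts = 0
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        # Shift every larger element one slot to the right.
        while j >= 0 and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
            shifts += 1
        a[j + 1] = x
    return a, shifts


# Two sorted arrays, concatenated: [4, 5, 6, 7] + [0, 1, 2, 3]
halves = list(range(4, 8)) + list(range(0, 4))
print(insertion_sort_shifts(halves)[1])  # 16, i.e. (8/2)^2
```

Doubling n roughly quadruples the shift count for this kind of input, which is what the O(n²) bound predicts.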

Insertion sort is fast in practice for small values of N, where its low overhead hides the quadratic growth. If N is large, your running time will more likely grow as N^2.
The fact that the sub-arrays are sorted won't, I believe, help much if N is sufficiently large.
Timsort is a good candidate for partially sorted arrays.

If the arrays are known to be sorted, it's a simple matter of treating each array as a queue, sorting the "heads", selecting the smallest of the heads to put into the new array, then "popping" the selected value from its array.
If D is small then a simple bubble sort works well for sorting the heads, otherwise you should use some sort of insertion sort, since only one element needs to be placed into the order.
This is basically a "merge sort", I believe. Very useful when the list to be sorted exceeds working storage, since you can sort smaller lists first, without thrashing, then combine using very little working storage.
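The "queue heads" idea above can be sketched in Python as follows. Note one substitution: instead of keeping the heads ordered with bubble or insertion sort, this sketch keeps them in a binary heap (the standard heapq module), which gives O(n log d) overall. The function name merge_sorted_arrays is hypothetical; Python's standard library also offers heapq.merge for the same job.

```python
import heapq


def merge_sorted_arrays(arrays):
    """Merge d already-sorted lists into one sorted list in O(n log d)."""
    # Each heap entry is (value, index of source array, position in that array).
    heap = [(arr[0], idx, 0) for idx, arr in enumerate(arrays) if arr]
    heapq.heapify(heap)
    out = []
    while heap:
        value, idx, pos = heapq.heappop(heap)  # smallest of the current heads
        out.append(value)
        if pos + 1 < len(arrays[idx]):
            # "Pop" the head of that array and expose its next element.
            heapq.heappush(heap, (arrays[idx][pos + 1], idx, pos + 1))
    return out


print(merge_sorted_arrays([[1, 4, 9], [2, 3], [0, 10]]))
```

The heap never holds more than d entries, so the working storage beyond the output is proportional to d rather than n.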

Related

Insertion Sort - Specific Array O(n^(7/4))

I'm looking for an array of n digits that insertion sort will sort in O(n^(7/4)) time.
What kind of array, as a function of n, will give me such a running time?
As an example, it could be an array that is half (or more) sorted and half not, or sorted backwards, or whatever.
Hope it's clear enough.
Thanks!

Sorting a partially sorted array in O(n)

Hey so I'm just really stuck on this question.
I need to devise an algorithm (no need for code) that sorts a certain partially sorted array into a fully sorted array. The array has N real numbers and the first N-[N\sqrt(N)] (the [] denotes the floor of this number) elements are sorted, while the rest are not. There are no special properties to the unsorted numbers at the end; in fact, I'm told nothing about them other than that they're obviously real numbers like the rest.
The kicker is time complexity for the algorithm needs to be O(n).
My first thought was to try and sort only the unsorted numbers and then use a merge algorithm, but I can't figure out any sorting algorithm that would work here in O(n). So I'm thinking about this all wrong, any ideas?
This is not possible in the general case using a comparison-based sorting algorithm. You are most likely missing something from the question.
Imagine the partially sorted array [1, 2, 3, 4564, 8481, 448788, 145, 86411, 23477]. It contains 9 elements, the first 3 of which are sorted (note that floor(N/sqrt(N)) = floor(sqrt(N)) assuming you meant N/sqrt(N), and floor(sqrt(9)) = 3). The problem is that the unsorted elements are all in a range that does not contain the sorted elements. It makes the sorted part of the array useless to any sorting algorithm, since they will stay there anyway (or be moved to the very end in the case where they are greater than the unsorted elements).
With this kind of input, you still need to sort, independently, N - floor(sqrt(N)) elements. And as far as I know, N - floor(sqrt(N)) ~ N (the ~ basically means "is the same complexity as"). So you are left with an array of approximately N elements to sort, which takes O(N log N) time in the general case.
Now, I specified "using a comparison-based sorting algorithm", because sorting real numbers (in some range, like the usual floating-point numbers stored in computers) can be done in amortized O(N) time using a hash sort (similar to a counting sort), or maybe even a modified radix sort if done properly. But the fact that a part of the array is already sorted doesn't help.
In other words, this means there are sqrt(N) unsorted elements at the end of the array. You can sort them with an O(n^2) algorithm which will give a time of O(sqrt(N)^2) = O(N); then do the merge you mentioned which will also run in O(N). Both steps together will therefore take just O(N).
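The two-step approach just described (quadratic sort on the short tail, then one linear merge) might look like this in Python. It is only a sketch under the assumption that the unsorted tail has exactly floor(sqrt(N)) elements and everything before it is sorted; the function name sort_mostly_sorted is invented for the example.

```python
import math


def sort_mostly_sorted(a):
    """Sort a list whose last floor(sqrt(n)) elements are the only unsorted ones."""
    n = len(a)
    t = math.isqrt(n)            # assumed size of the unsorted tail
    head = a[:n - t]             # already sorted by assumption
    tail = a[n - t:]

    # Insertion sort on the tail: O(t^2) = O(sqrt(n)^2) = O(n).
    for i in range(1, len(tail)):
        x = tail[i]
        j = i - 1
        while j >= 0 and tail[j] > x:
            tail[j + 1] = tail[j]
            j -= 1
        tail[j + 1] = x

    # Standard two-way merge of head and tail: O(n).
    out = []
    i = j = 0
    while i < len(head) and j < len(tail):
        if head[i] <= tail[j]:
            out.append(head[i])
            i += 1
        else:
            out.append(tail[j])
            j += 1
    out.extend(head[i:])
    out.extend(tail[j:])
    return out


print(sort_mostly_sorted([1, 2, 3, 4564, 8481, 448788, 145, 86411, 23477]))
```

This sketch uses an output list for clarity; it does not attempt to merge in place.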

How many comparisons does insertion sort do in an already-ordered 2-element array?

The best case scenario of insertion sort is meant to be O(n), however, if you have 2 elements in an array that are already sorted, such as 10 and 11, doesn't it only make one comparison rather than 2?
Time complexity of O(n) does not mean that the number of steps is exactly n, it means that the number of steps is dominated by a linear function. Basically, sorting twice as many elements should take at most twice as much time for large numbers.
The best-case scenario for insertion sort is when you can insert each new element after just one comparison. This can happen in only 2 cases:
You are inserting elements from a reverse-sorted list and you compare the new element with the first element of the target list.
You are inserting elements from a sorted list and you compare the new element with the last element of the target list.
In these 2 cases, each new element is inserted after just one comparison, including in the case you mention.
The time complexity is indeed O(n) for these very special cases. You do not need such a favorable case for this complexity, though: the time complexity will be O(n) whenever there is a constant upper bound, independent of the list length, on the number of comparisons per insertion.
Note that it is a common optimization to try to handle sorted lists efficiently. If the optimization mentioned in the second paragraph above is not implemented, sorting an already sorted list would be the worst-case scenario, with n comparisons for the insertion of the (n+1)th element.
In the general case, insertion sort on lists has a time complexity of O(n^2), but a careful implementation can produce an optimal solution for already sorted lists.
Note that this is true for lists, where inserting at any position has a constant cost; insertion sort on arrays does not have this property. It can still be optimized to handle these special cases, but not both at the same time.
Insertion sort does N - 1 comparisons if the input is already sorted.
This is because for every element it compares it with a previous element and does something if the order is not right (it is not important what it does now, because the order is always right). So you will do this N - 1 times.
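A quick way to check the N - 1 claim is to instrument an array-based insertion sort with a comparison counter. The following Python sketch does that; the function name insertion_sort_comparisons is made up for the example, and it assumes the usual variant that scans backwards from the element being inserted.

```python
def insertion_sort_comparisons(a):
    """Return (sorted copy of a, number of element comparisons made)."""
    a = list(a)
    comparisons = 0
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1          # one comparison of x against a[j]
            if a[j] <= x:
                break                 # x is already in the right place
            a[j + 1] = a[j]           # shift the larger element right
            j -= 1
        a[j + 1] = x
    return a, comparisons


print(insertion_sort_comparisons([10, 11])[1])            # 1
print(insertion_sort_comparisons(list(range(100)))[1])    # 99
print(insertion_sort_comparisons([5, 4, 3, 2, 1])[1])     # 10
```

The two-element case makes one comparison, which matches N - 1 and is still perfectly consistent with an O(n) best case.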
So it looks like you need to understand big O notation. O(n) does not mean n operations; it does not even mean close to n operations (n/10^9 is O(n) and it is not really close to n). All it means is that the function grows at most linearly (think of it as a limit where n -> inf).

Sorting an array which is already sorted, except one element which is out of order

I was preparing for a competition and came across this question, which I can't comprehend.
Consider a set of 'n' elements in an array, which is sorted except for one element that appears out of order. Which of the following sorting algorithms takes O(n) time?
Quick Sort
Heap Sort
Merge Sort
Bubble Sort
Now, I already know the best method would be insertion sort, which would take O(n) time in this case, but since the question asks about algorithms other than that, I'm not sure which to pick.
Quick sort will be real bad since the array is already sorted.
Heap sort will not exactly utilize the property that the array is sorted and will take O(nlogn) time.
Merge sort also takes O(nlogn) as it doesn't discriminate the input ordering.
Bubble sort would also take O(n^2).
I would really like some help here. Am I missing something?
The natural variant of merge sort will sort the described list in O(n) time.
It works the same as merge sort but begins by identifying natural runs in the data. So it will identify the two runs (sorted groups) around the unsorted element, then merge the unsorted element into one of the runs, then merge the two runs together. This only requires two O(n) merges (plus some O(n) run detection), no matter the size of the data, so it's O(n).
Insertion sort would still take O(n^2) because it won't check that the array is sorted. The best solution would be bubble sort, as it would scan the array twice: the first time it would move the element to its correct place and the second time it would realize the array is sorted, provided it keeps track of the number of swaps it makes on every pass.
Unfortunately it is not as simple as this; it depends on the location of the unsorted item with respect to its correct place. The solution provided by AndyG would make it O(n) in all cases.
If there is exactly one element out of order, you could find it and then insert it at the correct place -> O(n) effort.
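Here is one way the "find it, then insert it" idea could be written in Python. It is a sketch under the assumption that removing a single element leaves a sorted array. The misplaced element must be one of the two elements at the first descent, so the code tries both. The function name fix_one_misplaced is hypothetical; bisect.insort is from the standard library.

```python
from bisect import insort


def fix_one_misplaced(a):
    """Sort a list that is sorted except for one misplaced element, in O(n)."""
    n = len(a)
    # Locate the first position where the order breaks.
    i = next((k for k in range(n - 1) if a[k] > a[k + 1]), None)
    if i is None:
        return list(a)                      # already sorted
    # The culprit is either a[i] or a[i + 1]; try removing each in turn.
    for j in (i, i + 1):
        rest = a[:j] + a[j + 1:]
        if all(rest[k] <= rest[k + 1] for k in range(len(rest) - 1)):
            insort(rest, a[j])              # binary search, then O(n) insert
            return rest
    raise ValueError("more than one element is out of order")


print(fix_one_misplaced([1, 2, 10, 3, 4, 5]))
print(fix_one_misplaced([1, 5, 6, 2, 7]))
```

Every step is a constant number of linear scans, so the whole thing stays O(n) regardless of how far the element is from its correct place.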

Sorting and merging two arrays in efficient way?

We have two arrays (not sorted), of capacity n and n+m. The first array has n elements. The second array has m elements (and additionally n places reserved for more elements).
The goal is to merge the two arrays and store the result in the second array in a sorted manner, without using extra space.
Currently, I sort both arrays using quicksort and then merge them as in merge sort. Is there a more efficient way to achieve this?
You can explore merge sort.
https://www.google.com/search?q=mergesort&ie=UTF-8&oe=UTF-8&hl=en&client=safari#itp=open0
Or depending on the size, you can do quicksort on each array, and then merge them using the merge sort technique (or merge, then quicksort).
I would go with mergesort; it basically works by sorting each array individually and then putting them together in order.
You're looking at O(n log n) for mergesort and O(n log n) for quicksort, but possibly an O(n^2) worst case with quicksort.
Clearly the best thing to do is to copy the contents of N into the free space in the N+M array and quicksort the N+M array.
By doing 2 quicksorts and then a merge sort you are just making the entire operation less efficient.
Here is a mental exercise, if you had to sort an array of length M, would you split it into 2 arrays, M1 and M2, sort each and then merge sort them together? No. If you did that you would just be limiting information available to each call of quicksort, slowing down the process.
So why would you keep your two starting arrays separate?
If I also wanted to guarantee O(n*log(n)) behaviour, I would use a modified version of heapsort, which would use both arrays as a base for the heap and would store the sorted data in the additional part of the array.
This also might be faster than two quicksorts, because it does not require the additional merge operation. Also, quicksort is terribly slow when small arrays are being sorted (the size of the problem is not mentioned in the problem statement).
If you are storing in a second array, then you are using extra space, but you can minimize that by helping the GC like this:
join both arrays in a second array
set both previous variables to null so they are eligible to be garbage collected
sort the second array, Arrays.sort(...) - O(n log(n))
Look at the javadoc for this method:
/**
 * Sorts the specified array into ascending numerical order.
 *
 * <p>Implementation note: The sorting algorithm is a Dual-Pivot Quicksort
 * by Vladimir Yaroslavskiy, Jon Bentley, and Joshua Bloch. This algorithm
 * offers O(n log(n)) performance on many data sets that cause other
 * quicksorts to degrade to quadratic performance, and is typically
 * faster than traditional (one-pivot) Quicksort implementations.
 *
 * @param a the array to be sorted
 */
public static void sort(int[] a) {
Let's call the smaller array the N array and the other one the M array. I'm assuming the elements of the M array are initially in locations 0 through m-1. Sort both arrays using your favorite technique, which may depend on other criteria such as stability or limiting worst-case behavior.
if min(N) > max(M)
    copy N's elements over starting at location m [O(n) time]
else
    move M's elements to the end of the M array (last down to first) [O(m) time]
    if min(M) > max(N)
        copy N's elements over starting at location 0 [O(n) time for the copy]
    else
        perform classic merge: min of remaining m's and n's gets migrated
        to next available space in M [O(m+n) time]
Overall this is dominated by the initial sorting time; the merge phase is all linear. Migrating the m's to the end of the M array guarantees no space collisions, so you don't need extra side storage as per your specification.
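The merge phase of the procedure above can be sketched in Python like this. The function name merge_into_second and its parameters are invented for the illustration, the min/max shortcut branches are left out for brevity, and the sorting step simply uses Python's built-in sort (which is not guaranteed to avoid temporary storage); only the shift-and-merge part is meant to show that no side array is needed.

```python
def merge_into_second(first, second, m):
    """Merge `first` (n items) into `second` (capacity n + m, items in second[0:m])."""
    n = len(first)
    assert len(second) == n + m

    # Step 0: sort both inputs (any sorting method would do here).
    first.sort()
    second[:m] = sorted(second[:m])

    # Step 1: move M's elements to the end of the array, last down to first.
    for k in range(m - 1, -1, -1):
        second[k + n] = second[k]

    # Step 2: classic merge, writing from the front of `second`.
    # The write index w always equals i + (j - n), so w <= j and no
    # unread element of the shifted block is ever overwritten.
    i, j, w = 0, n, 0
    while i < n and j < n + m:
        if first[i] <= second[j]:
            second[w] = first[i]
            i += 1
        else:
            second[w] = second[j]
            j += 1
        w += 1
    while i < n:                      # leftovers from the first array
        second[w] = first[i]
        i += 1
        w += 1
    # If the first array ran out, w == j and the rest is already in place.
    return second


print(merge_into_second([5, 1, 9], [4, 8, 2, 7, 0, 0, 0], 4))
```

In the call above, the three trailing zeros are just placeholders for the reserved slots.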

Resources