Regarding in-place merge in an array

I came across the following question.
Given an array of n elements and an integer k where k < n. Elements {a0...ak} and
{ak+1...an} are already sorted. Give an algorithm to sort in O(n) time and O(1) space.
It does not seem to me that this can be done in O(n) time and O(1) space. The problem really seems to be asking how to do the merge step of mergesort, but in place. If that were possible, wouldn't mergesort be implemented that way? I am unable to convince myself, though, and would like some opinions.
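For reference, this is the ordinary merge step I mean, written out in Python purely to fix ideas (the function and variable names are mine). It runs in O(n) time but needs the O(n) auxiliary list that the exercise asks you to avoid:

```python
def merge(a, k):
    # a[:k+1] and a[k+1:] are each sorted; merge them into a.
    # O(n) time, but O(n) extra space for the temporary lists.
    left, right = a[:k + 1], a[k + 1:]
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    a[:] = out          # copy the merged result back in place
```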

This seems to indicate that it is possible to do it in O(lg^2 n) space. I cannot see how to prove that it is impossible to merge in constant space, but I cannot see how to do it either.
Edit:
Chasing references, Knuth Vol. 3, exercise 5.5.3, says: "A considerably more complicated algorithm of L. Trabb-Pardo provides the best possible answer to this problem: it is possible to do stable merging in O(n) time and stable sorting in O(n lg n) time, using only O(lg n) bits of auxiliary memory for a fixed number of index variables."
More references that I have not read. Thanks for an interesting problem.
Further edit:
This article claims that the article by Huang and Langston has an algorithm that merges two lists of size m and n in time O(m + n), so the answer to your question would seem to be yes. Unfortunately I do not have access to that article, so I must trust the second-hand information. I'm not sure how to reconcile this with Knuth's pronouncement that the Trabb-Pardo algorithm is optimal. If my life depended on it, I'd go with Knuth.
I now see that this has been asked as an earlier Stack Overflow question a number of times. I don't have the heart to flag it as a duplicate.
Huang B.-C. and Langston M. A., Practical in-place merging, Comm. ACM 31 (1988) 348-352

There are several algorithms for doing this, none of which are very easy to intuit. The key idea is to use part of the array to be merged as a buffer, then do a standard merge using this buffer as auxiliary space. If you can then reposition the buffer elements so that they end up in the right place, you're golden.
I have written up an implementation of one of these algorithms on my personal site if you're interested in looking at it. It's based on the paper "Practical In-Place Merging" by Huang and Langston. You probably will want to look over that paper for some insight.
I've also heard that there are good adaptive algorithms for this, which use some fixed-size buffer of your choosing (which could be O(1) if you wanted), but then scale elegantly with the buffer size. I don't know any of these off the top of my head, but I'm sure a quick search for "adaptive merge" might turn something up.
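To give a feel for the buffer idea, here is my own sketch, not the Huang-Langston algorithm itself; it assumes the scratch region is disjoint from the two runs, whereas the real algorithm has to carve that buffer out of the array and clean it up afterwards:

```python
def merge_with_buffer(a, lo, mid, hi, buf):
    # Merge the sorted runs a[lo:mid] and a[mid:hi] in O(hi - lo) time,
    # using a[buf:buf + (mid - lo)] as scratch space (assumed disjoint
    # from [lo, hi)). Only swaps are used, so the buffer's own elements
    # are permuted but never lost.
    n1 = mid - lo
    for i in range(n1):                       # park the left run in the buffer
        a[lo + i], a[buf + i] = a[buf + i], a[lo + i]
    i, j, k = buf, mid, lo
    while i < buf + n1 and j < hi:            # an ordinary merge, done with swaps
        if a[i] <= a[j]:
            a[k], a[i] = a[i], a[k]
            i += 1
        else:
            a[k], a[j] = a[j], a[k]
            j += 1
        k += 1
    while i < buf + n1:                       # flush the rest of the left run
        a[k], a[i] = a[i], a[k]
        i += 1
        k += 1
```

After the call, a[lo:hi] is sorted and the original buffer contents sit, permuted, back in the buffer region.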

No, it isn't possible, although my job would be much easier if it were :).
There is an O(log n) factor which you can't avoid. You can choose to take it as time or space, but the only way to avoid it is to not sort. With O(log n) space you can build a list of continuations that keep track of where you stashed the elements that didn't quite fit. With recursion this can be made to fit in O(1) heap, but only by using O(log n) stack frames instead.
Here is the progress of merge-sorting odds and evens from 1-9. Notice how you require log-space accounting to track the order inversions caused by the twin constraints of constant space and linear swaps.
135792468
135792468
125793468
123795468
123495768
123459768
123456798
123456789
123456789
There are some delicate boundary conditions, slightly harder than binary search to get right, even in this (possibly flawed) form, which makes it a bad homework problem but a really good mental exercise.
Update
Apparently I am mistaken and there is an algorithm that provides O(n) time and O(1) space. I have downloaded the papers to enlighten myself, and withdraw this answer as incorrect.

Related

Is O(cn) at least as fast as O(n) in a non-asymptotic way?

So first of all let me talk about the motivation for this question. Let's suppose you have to find the minimum and the maximum values in an array. In this case, you have two ways of doing so.
The first one consists in iterating over the array and finding the maximum value, then doing the same thing to find the minimum value. This solution is O(2n).
The second one consists in iterating over the array just one time and finding both the minimum and maximum value at the same time. This solution is O(n).
Even though the time complexity has been halved, for each iteration of the O(n) solution you now have twice as many instructions (ignoring how the compiler may optimize these instructions), so I believe they should take the same amount of time to execute.
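In code, the two versions look something like this (Python, just to make the comparison concrete; the names are mine and the array is assumed non-empty):

```python
def min_max_two_passes(a):
    # Two separate scans over the array: roughly 2n comparisons, O(2n).
    return min(a), max(a)

def min_max_one_pass(a):
    # One scan tracking both values: up to 2 comparisons per element, O(n).
    lo = hi = a[0]
    for x in a[1:]:
        if x < lo:
            lo = x
        elif x > hi:
            hi = x
    return lo, hi
```

Both do on the order of 2n comparisons; the real difference is one pass over memory versus two, not the asymptotic class.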
Let me give you a second example. Now you need to reverse an array. Again, you have two ways of doing so.
The first one is to create an empty array, iterate over the data array filling the empty array. This solution is O(n).
The second one is to iterate over the data array, swapping the 0th and (n-1)th elements, then the 1st and (n-2)th elements, and so on (using this strategy) until you reach the middle of the array. This solution is O((1/2)n).
Again, even though the time complexity has been cut in half, you have three times more instructions per iteration. You're iterating over (1/2)n elements, but for each iteration you have to perform three XOR instructions. If you were to use an auxiliary variable instead of XOR, you would still need 2 more instructions to perform the swap, so now I believe that O((1/2)n) should actually be worse than O(n).
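Again in code, roughly (Python; a tuple swap stands in for the XOR or temporary-variable swap):

```python
def reverse_copy(a):
    # O(n) iterations, O(n) extra space: build a new list back to front.
    return [a[len(a) - 1 - i] for i in range(len(a))]

def reverse_in_place(a):
    # (1/2)n iterations, O(1) extra space: swap symmetric pairs.
    i, j = 0, len(a) - 1
    while i < j:
        a[i], a[j] = a[j], a[i]
        i += 1
        j -= 1
```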
Having said these things, my question is the following:
Ignoring space complexity, garbage collection and possible compiler optimizations: if I have an O(c1*n) algorithm and an O(c2*n) algorithm with c1 > c2, can I be sure that the algorithm that gives me O(c1*n) is as fast as or faster than the one that gives me O(c2*n)?
This question is cool because it can make a difference in how I write code from here on. If the "more complex" (c1) way is as fast as the "less complex" (c2) way but more readable, I'm sticking with the "more complex" one.
c1 > c2, can I be sure that the algorithm that gives me O(c1*n) is as fast or faster than the one that gives me O(c2*n)?
The whole issue lies in the words "fast" or "faster". Computational complexity doesn't strictly measure what we intuitively understand as "fast". Without going into mathematical details (although it's a good idea: https://en.wikipedia.org/wiki/Big_O_notation), it answers the question "how much slower does it get when my input grows". So if you have O(n^2) complexity, you can roughly expect that doubling the size of the input will make your algorithm take about 4 times as long, whereas for linear complexity an input twice as big only doubles the time. As you can see, it's relative, so any constants cancel out.
To sum up: from the way you ask your question, it doesn't seem the big-O notation is the correct tool here.
By definition, if c1 and c2 are constants, O(c1*n) === O(c2*n) === O(n). That is, the number of operations per element of your array of length n is completely irrelevant in this kind of complexity analysis.
All that it will tell you is that "it's linear". That is, if you have 1 bazillion operations for an array of length n, then you'll have 2 bazillion operations for an array of length 2*n (plus or minus something that grows slower than linear).
can I assume that having O(c1*n) and O(c2*n) algorithms so that c1 > c2, can I be sure that the algorithm that gives me O(c1*n) is as fast or faster than the one that gives me O(c2*n)?
Nope, not at all.
First, because the constants there are meaningless in that analysis. There's no other way to put it: whatever restrictions you place on c1 and c2 are absolutely irrelevant for big-O analysis. The whole idea is that it discards those constants.
Second, because they don't tell you anything that would enable you to compare the two algorithms runtime for a specific value of n.
Such complexity analysis only enables you to compare the asymptotic behavior of algorithms. Real-world problems in general don't care about where the asymptotes are.
Assume that A1(n) is the number of operations Algorithm 1 needs for an input of length n, and A2(n) is the same for Algorithm 2. You could have:
A1(n) = 10n + 900
A2(n) = 100n
The complexity of both is O(A1) = O(A2) = O(n). For small inputs, A2 is faster; for large inputs, A1 is faster. The crossover point is n == 10.
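Plugging in a few values makes the crossover visible (a throwaway snippet, nothing from the question itself):

```python
A1 = lambda n: 10 * n + 900   # cheaper per element, fixed overhead
A2 = lambda n: 100 * n        # no overhead, more work per element

for n in (1, 5, 10, 20, 100):
    print(n, A1(n), A2(n))    # A2 is smaller below n == 10, A1 above it
```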
This question is cool because it can make a difference on how I start writing code from here and on. If the "more complex" (c1) way is as fast as the "less complex" (c2) but more readable, i'm sticking with the "more complex" one.
Not only that, but also there's the fact that when you have 2 different algorithms that are really of different complexity classes (e.g., linear vs quadratic), it might still make sense to use the one of higher complexity as it may still be faster.
For example:
A3(n) = n^2
A4(n) = n + 10^20.
E.g., Algorithm 3 is quadratic, while Algorithm 4 is linear but has a huge constant initialization cost.
For inputs of size of up to around n == 10^10, it will be faster to use the quadratic algorithm.
It may very well be the case that all relevant inputs for your specific problem fall within that range, meaning that the quadratic algorithm would be the better, faster choice.
The bottom line is: for analyzing the actual time it will take to run an algorithm on a given input (or a given bounded range of inputs, as nearly all real-world problems are) and compare it with another algorithm, big-O analysis is meaningless.
Another way to put it: you're asking a practical "engineering" question (i.e., which option is better / faster) but trying to answer the question with a tool that's only useful for "theoretical" analysis. That tool is important, yes. But it has no chance of giving you the answer you're looking for, by design.
By definition, time complexity ignores constants. So O((1/2)n) == O(n) == O(2n) == O(cn).
Your example of O((1/2)n) shows why this is the case: the constants depend on what you choose to count as one unit of work, so comparing them across algorithms is meaningless.
You can never tell which algorithm is faster based only on the time complexity. You can sometimes tell which one would be faster as n approaches infinity, when the complexity classes differ. But since constants are removed from the time complexity, O(c1n) and O(c2n) are considered equal, so you still would not be able to tell which one is faster, even as n approaches infinity.
(my theoretical computer science courses were a couple of decades ago)
O(cn) is O(n).
It's still a linear search over the array.

Cache Optimization - Hashmap vs QuickSort?

Suppose that I have N unsorted arrays of integers. I'd like to find the intersection of those arrays.
There are two good ways to approach this problem.
One, I can sort the arrays in place with an nlogn sort, like QuickSort or MergeSort. Then I can put a pointer at the start of each array, compare each array's current element to the one below it, and advance the pointer of whichever array[pointer] is smaller; if they're all equal, you've found an element of the intersection.
This is an O(nlogn) solution, with constant memory (since everything is done in-place).
The second solution is to use a hash map, putting in the values that appear in the first array as keys, and then incrementing those values as you traverse through the remaining arrays (and then grabbing everything that had a value of N). This is an O(n) solution, with O(n) memory, where n is the total size of all of the arrays.
Theoretically, the former solution is O(nlogn), and the latter is O(n). However, hash maps do not have great locality, since items can end up scattered throughout the map because of collisions. The other solution, although O(nlogn), traverses the arrays one element at a time, exhibiting excellent locality. Since a CPU will tend to pull the array values next to the current index into the cache, the O(nlogn) solution will be hitting the cache much more often than the hash map solution.
Therefore, given a significantly large array size (as the number of elements goes to infinity), is it feasible that the O(nlogn) solution is actually faster than the O(n) solution?
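For reference, the hash map version I have in mind looks roughly like this (a Python sketch; I'm using per-array sets so duplicates within one array aren't double-counted, which is an assumption about how the intersection is defined):

```python
def intersect_hashed(arrays):
    # Count, for each value, how many of the N arrays contain it.
    # Expected O(total number of elements) time, O(n) extra memory.
    counts = {}
    for arr in arrays:
        for v in set(arr):          # set() so duplicates count once per array
            counts[v] = counts.get(v, 0) + 1
    return [v for v, c in counts.items() if c == len(arrays)]
```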
For integers you can use a non-comparison sort (see counting sort, radix sort). A large set might be encoded, e.g. by collapsing sequential runs into ranges. That would compress the data set and allow skipping past large blocks (see RoaringBitmaps). This has the potential to be hardware friendly and have O(n) complexity.
Complexity theory does not account for constants. As you suspect, there is always the potential for an algorithm with a higher complexity to be faster than the alternative, due to the hidden constants. By exploiting the nature of the problem, e.g. limiting the solution to integers, there are potential optimizations not available to a general-purpose approach. Good algorithm design often requires understanding and leveraging those constraints.
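As an illustration of the non-comparison route, a minimal counting sort sketch (assuming non-negative integers with a known, reasonably small upper bound; the names are mine):

```python
def counting_sort(a, max_value):
    # O(n + max_value) time, O(max_value) extra space, no comparisons.
    counts = [0] * (max_value + 1)
    for v in a:
        counts[v] += 1
    out = []
    for v, c in enumerate(counts):
        out.extend([v] * c)
    return out
```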

Which sort takes O(n) in the given condition

I am preparing for a competition and stumbled upon this question: consider a set of n elements which is sorted except for one element that appears out of order. Which of the following takes O(n) time?
Quick Sort
Heap Sort
Merge Sort
Bubble Sort
My reasoning is as follows:
I know Merge sort takes O(nlogn) even in the best case, so it's not the answer.
Quick sort too will take O(n^2) since the array is almost sorted.
Bubble sort can be chosen but only if we modify it slightly to check whether a swap has been made in a pass or not.
Heap sort can be chosen because if we create the min-heap of a sorted array it takes O(n) time, and since only one element is not in place, it takes log n.
Hence I think it's Heap sort. Is this reasoning correct? I would like to know if I'm missing something.
Let's start with bubble sort. Most resources I have used define bubble sort with a stopping condition of not performing any swaps in a pass (see e.g. Wikipedia). In that case bubble sort will indeed stop after a linear number of steps. However, I have also stumbled upon descriptions that prescribe a fixed number of passes, which makes your case quadratic. Therefore, all I can say about this case is "probably yes": it depends on the definition used by the judges of the competition.
You are right regarding merge sort and quick sort: the classical versions of both algorithms take Ω(n log n) time on every input.
However, your reasoning regarding heap sort seems incorrect to me. In a typical implementation of heap sort, the heap is built in the order opposite to the desired final order. Therefore, if you decide to build a min-heap, the outcome of the algorithm will be the reversed order, which, I guess, is not the desired one. If, on the other hand, you decide to build a max-heap, heap sort will obviously spend lots of time sifting elements up and down.
Therefore, in this case I'd go with bubble sort.
This is a bad question because you can guess which answer is supposed to be right, but it takes so many assumptions to make it actually right that the question is meaningless.
If you code bubblesort as shown on the Wikipedia page, then it will stop in O(n) if the element that's out of order is "below" its proper place with respect to the sort iteration. If it's above, then it moves no more than one position toward its proper location on each pass.
To get the element unconditionally to its correct location in O(n), you'd need a variation of bubblesort that alternately makes passes in each direction.
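A sketch of that bidirectional variation (often called cocktail or shaker sort); this is my own illustration rather than a reference implementation:

```python
def cocktail_sort(a):
    # Bubble sort that alternates direction, with an early exit when a full
    # round makes no swaps. One element out of place is carried to its slot
    # within a round or two, so this finishes in O(n) on the input described.
    lo, hi = 0, len(a) - 1
    swapped = True
    while swapped and lo < hi:
        swapped = False
        for i in range(lo, hi):            # left-to-right: bubble the max up
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        hi -= 1
        for i in range(hi, lo, -1):        # right-to-left: bubble the min down
            if a[i - 1] > a[i]:
                a[i - 1], a[i] = a[i], a[i - 1]
                swapped = True
        lo += 1
```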
The conventional implementations of the other sorts are O(n log n) on nearly sorted input, though Quicksort can be O(n^2) if you're not careful about pivot selection; with many equal keys, a proper implementation with a Dutch National Flag (three-way) partition is also required to prevent bad behavior.
Heapsort takes only O(n) time to build the heap, but Theta(n log n) time to pull n items off the heap in sorted order, each in Theta(log n) time.

In which cases do we use heapsort?

In which cases can heap sort be used? As we know, heap sort has a complexity of n×lg(n). But it's used far less often than quicksort and merge sort. So when exactly do we use heap sort, and what are its drawbacks?
Characteristics of Heapsort
O(nlogn) time best, average, worst case performance
O(1) extra memory
Where to use it?
Guaranteed O(nlogn) performance. When you don't necessarily need very fast performance, but guaranteed O(nlogn) performance (e.g. in a game), because Quicksort's O(n^2) can be painfully slow. Why not use Mergesort then? Because it takes O(n) extra memory.
To avoid Quicksort's worst case. C++'s std::sort routine generally uses a variation of Quicksort called Introsort, which switches to Heapsort to sort the current partition if the Quicksort recursion goes too deep, indicating that a worst case has occurred.
Partially sorted array even if stopped abruptly. We get a partially sorted array if Heapsort is somehow stopped abruptly. Might be useful, who knows?
Disadvantages
Relatively slow as compared to Quicksort
Cache inefficient
Not stable
Not really adaptive (Doesn't get faster if given somewhat sorted array)
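For reference, a minimal textbook heapsort (my own sketch) that exhibits the characteristics listed above: in place, O(1) extra memory, O(nlogn) worst case, not stable:

```python
def heapsort(a):
    def sift_down(root, end):
        # Restore the max-heap property in a[root..end].
        while 2 * root + 1 <= end:
            child = 2 * root + 1
            if child + 1 <= end and a[child] < a[child + 1]:
                child += 1                       # pick the larger child
            if a[root] < a[child]:
                a[root], a[child] = a[child], a[root]
                root = child
            else:
                return

    n = len(a)
    for start in range(n // 2 - 1, -1, -1):      # build the max-heap, O(n)
        sift_down(start, n - 1)
    for end in range(n - 1, 0, -1):              # n extractions, O(log n) each
        a[0], a[end] = a[end], a[0]              # move the current max to the end
        sift_down(0, end - 1)
```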
Based on the Wikipedia article on sorting algorithms, it appears that Heapsort and Mergesort both have identical time complexity, O(n log n), for the best, average and worst cases.
Quicksort has a disadvantage there, as its worst case time complexity is O(n^2) (a).
Mergesort has the disadvantage that its memory complexity is O(n) whereas Heapsort is O(1). On the other hand, Mergesort is a stable sort and Heapsort is not.
So, based on that, I would choose Heapsort in preference to Mergesort if I didn't care about the stability of the sort, so as to minimise memory usage. If stability was required, I would choose MergeSort.
Or, more correctly, if I had huge amounts of data to sort, and I had to code my own algorithms to do it, I'd do that. For the vast majority of cases, the difference between the two is irrelevant, until your data sets get massive.
In fact, I've even used bubble sort in real production environments where no other sort was provided, because:
it's incredibly easy to write (even the optimised version);
it's more than efficient enough if the data has certain properties (either small datasets or datasets that were already mostly sorted before you added a couple of items).
Like goto and multiple return points, even seemingly bad algorithms have their place :-)
(a) And, before you wonder why C uses a less efficient algorithm, it doesn't (necessarily). Despite the qsort name, there's no mandate that it use Quicksort under the covers - that's a common misconception. It may well use one of the other algorithms.
Kindly note that the running time complexity of heap sort remains O(n log n) irrespective of whether the array is already partially sorted in ascending or descending order.
Kindly refer to the link below for further clarification on the big-O calculation:
https://ita.skanev.com/06/04/03.html

what is the running time of Extracting from a Max Heap?

I got this homework question:
"James claims that he succeeded in implementing an extraction from a max heap (ExtractMax) that takes O((log n)^0.5). Explain why James is wrong."
I know that extracting from a max heap takes O(log n), but how can I prove that James is wrong?
As can be seen here, building a heap can be done in O(n). Now if extracting the maximum could be done in O((log n)^0.5), then it would be possible to sort the entire set in O(n·(log n)^0.5) by repeatedly extracting the largest element. This, however, is impossible, because the lower bound for comparison-based sorting is Ω(n log n).
Therefore, James's algorithm cannot exist.
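To make the reduction concrete, here is a sketch using Python's heapq (where heappop is O(log n); the point is only to show how a faster ExtractMax would yield an impossibly fast comparison sort):

```python
import heapq

def sort_via_extract_max(values):
    # Build a max-heap in O(n), then extract the maximum n times.
    # If each extraction cost only O((log n)^0.5) comparisons, this whole
    # comparison sort would run in O(n * (log n)^0.5), beating the
    # Omega(n log n) lower bound, which is impossible.
    heap = [-v for v in values]     # heapq is a min-heap; negate to get a max-heap
    heapq.heapify(heap)             # O(n) heap construction
    return [-heapq.heappop(heap) for _ in range(len(heap))]  # descending order
```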
#Duh's solution of converting your extraction problem into a sorting problem is actually very creative. It shouldn't be too hard to find a proof that comparison-based sorting requires Ω(n log n) time, and it's very common in the study of algorithms to convert one problem into a different one (for example, all NP-complete problems are conversions of each other; that's how you prove they are NP-complete). That said, I think there's a much simpler solution.
You stated it directly in your question: extracting from a binary heap is O(log n). Think about why it is O(log n). What is the structure of a binary heap? What actions are required to extract from a binary heap? Why is the worst case log n operations? Are these limits influenced at all by implementation?
Now, remember that there are two parts to James' claim:
He can extract in O((log n)^0.5)
He is using a binary heap.
Given what you know about binary heaps, can both these claims be true? Why or why not? Is there a contradiction? If so, why is there a contradiction? Finally, think about what this means for James.
