Can I implement quicksort using a queue?
The only article I found on this is https://www.quora.com/Can-we-use-a-queue-in-quicksort-in-C.
Is this article correct?
If so, why do textbooks always implement quicksort with a stack or recursion only?
Information on this question is scarce, so I am asking here.
Bad cache performance.
With a stack we keep good temporal locality, while with a queue it is lost completely. With the queue method we are basically sorting the array in breadth-first order.
EDIT (from Will Ness' answer): For larger arrays (> RAM) the queue method won't even work, since it requires O(n) space to sort an array of size n, while the stack-based method requires only O(log n) space. The theoretical time complexity of both is the same.
why does the textbook always implement quick sort by stack or recursive method only
Because the essence of quicksort is that it is in-place, and sorting is achieved by repeated partitions of the same array that is being sorted.
The partitions are done from the top down, from bigger-sized portions of the array down to smaller and smaller ones.
If we use a stack to manage the work yet to be done, the size of that stack will be logarithmic in the array's size (provided the partition put on the stack to wait is always the bigger one, so the smaller one is processed first). This is equivalent to depth-first traversal.
But if we used a queue for that, it would be equivalent to breadth-first traversal, and the size of the queue would be linear in the array's size (which is exponentially worse than logarithmic).
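To make the stack-versus-queue difference concrete, here is a minimal sketch that runs the same partitioning loop with either a stack or a queue of pending ranges and reports the largest number of ranges that were waiting at once. The names `partition` and `quicksort_pending` and the middle-element pivot are illustrative assumptions, not taken from the linked article.

```python
from collections import deque


def partition(a, lo, hi):
    """Lomuto partition of a[lo..hi]; returns the pivot's final index.

    The middle element is used as the pivot (an assumption of this sketch).
    """
    mid = (lo + hi) // 2
    a[mid], a[hi] = a[hi], a[mid]
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i


def quicksort_pending(a, use_queue):
    """Sort a in place; return the peak number of pending ranges."""
    pending = deque()
    if len(a) > 1:
        pending.append((0, len(a) - 1))
    peak = 0
    while pending:
        peak = max(peak, len(pending))
        # Queue: take the oldest range (breadth-first).
        # Stack: take the newest range (depth-first).
        lo, hi = pending.popleft() if use_queue else pending.pop()
        p = partition(a, lo, hi)
        # Larger part first, so that in stack mode the smaller part
        # (appended last) is the next one to be processed.
        parts = sorted([(lo, p - 1), (p + 1, hi)],
                       key=lambda r: r[1] - r[0], reverse=True)
        for l, h in parts:
            if l < h:  # ranges of fewer than two elements need no work
                pending.append((l, h))
    return peak


if __name__ == "__main__":
    data = list(range(1023))
    print("stack peak:", quicksort_pending(data[:], use_queue=False))
    print("queue peak:", quicksort_pending(data[:], use_queue=True))
```

On evenly splitting input the stack's peak stays close to log2(n), while the queue's peak grows in proportion to n, because a whole level of the partition tree is waiting in the queue at the same time.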
Related
Suppose that I have N unsorted arrays of integers. I'd like to find the intersection of those arrays.
There are two good ways to approach this problem.
One, I can sort the arrays in place with an O(n log n) sort, like QuickSort or MergeSort. Then I can put a pointer at the start of each array and compare each array to the one below it, advancing the pointer of whichever array[pointer] is smaller; if they're all equal, I've found an element of the intersection.
This is an O(nlogn) solution, with constant memory (since everything is done in-place).
The second solution is to use a hash map, putting in the values that appear in the first array as keys, and then incrementing those values as you traverse through the remaining arrays (and then grabbing everything that had a value of N). This is an O(n) solution, with O(n) memory, where n is the total size of all of the arrays.
Theoretically, the former solution is O(n log n), and the latter is O(n). However, hash maps do not have great locality, because items can be scattered randomly through the map due to collisions. The other solution, although O(n log n), traverses each array one element at a time, exhibiting excellent locality. Since a CPU will tend to pull the array values that are next to the current index from memory into the cache, the O(n log n) solution will hit the cache much more often than the hash map solution.
Therefore, given a significantly large array size (as the number of elements goes to infinity), is it feasible that the O(n log n) solution is actually faster than the O(n) solution?
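For reference, the two approaches described in the question can be sketched as follows. The function names are made up for illustration. Python's `list.sort` is Timsort rather than quicksort, so this sketch only shows the access patterns and does not reproduce the constant-memory claim. Duplicates inside one array are handled by deduplicating each later array with `set`, which is an assumption the question does not spell out.

```python
def intersect_sorted(arrays):
    """Sort every array, then walk them in lockstep with one pointer each."""
    if not arrays:
        return []
    for arr in arrays:
        arr.sort()
    idx = [0] * len(arrays)
    result = []
    while all(i < len(arr) for i, arr in zip(idx, arrays)):
        heads = [arr[i] for i, arr in zip(idx, arrays)]
        target = max(heads)
        if all(h == target for h in heads):
            # Every pointer sits on the same value: it is in the intersection.
            if not result or result[-1] != target:
                result.append(target)
            idx = [i + 1 for i in idx]
        else:
            # Advance every pointer that lags behind the largest head.
            idx = [i + 1 if h < target else i for i, h in zip(idx, heads)]
    return result


def intersect_hashed(arrays):
    """Count, per value of the first array, how many arrays contain it."""
    if not arrays:
        return []
    counts = {v: 1 for v in arrays[0]}
    for arr in arrays[1:]:
        for v in set(arr):  # count each value at most once per array
            if v in counts:
                counts[v] += 1
    # Sorted only so that both functions return comparable results.
    return sorted(v for v, c in counts.items() if c == len(arrays))


if __name__ == "__main__":
    sample = [[3, 1, 4, 1, 5], [5, 3, 9, 1], [1, 3, 5, 7]]
    print(intersect_hashed(sample))
    print(intersect_sorted(sample))
```

Which of the two is faster on a given machine is an empirical question; timing both on realistic data sizes is the only reliable way to find out.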
For integers you can use a non-comparison sort (see counting, radix sort). A large set might be encoded, e.g. sequential runs into ranges. That would compress the data set and allow for skipping past large blocks (see RoaringBitmaps). There is the potential to be hardware friendly and have O(n) complexity.
Complexity theory does not account for constants. As you suspect, there is always the potential for an algorithm with a higher complexity to be faster than the alternative, due to the hidden constants. By exploiting the nature of the problem, e.g. limiting the solution to integers, there are potential optimizations not available to a general-purpose approach. Good algorithm design often requires understanding and leveraging those constraints.
I am preparing for a competition and stumbled upon this question: Consider a set of n elements which is sorted except for one element that appears out of order. Which of the following takes O(n) time?
Quick Sort
Heap Sort
Merge Sort
Bubble Sort
My reasoning is as follows:
I know merge sort takes O(n log n) even in the best case, so it's not the answer.
Quick sort too will take O(n^2), since the array is almost sorted.
Bubble sort can be chosen, but only if we modify it slightly to check whether a swap has been made in a pass or not.
Heap sort can be chosen, because creating the min-heap of a sorted array takes O(n) time: only one element is not in place, and fixing it takes O(log n).
Hence I think it's heap sort. Is this reasoning correct? I would like to know if I'm missing something.
Let's start with bubble sort. In my experience, most resources define bubble sort with a stopping condition of not performing any swaps in an iteration (see e.g. Wikipedia). In this case bubble sort will indeed stop after a linear number of steps. However, I remember stumbling upon descriptions that used a fixed number of iterations, which makes your case quadratic. Therefore, all I can say about this case is "probably yes": it depends on the definition used by the judges of the competition.
You are right that merge sort and quick sort are not the answer: the classical versions of both algorithms take Ω(n log n) time on every input, so neither can finish in linear time here.
However, your reasoning regarding heap sort seems incorrect to me. In a typical implementation of heap sort, the heap is built in the order opposite to the desired final order. Therefore, if you decide to build a min-heap, the outcome of the algorithm will be in reversed order, which, I guess, is not the desired one. If, on the other hand, you decide to build a max-heap, heap sort will obviously spend lots of time sifting elements up and down.
Therefore, in this case I'd go with bubble sort.
This is a bad question because you can guess which answer is supposed to be right, but it takes so many assumptions to make it actually right that the question is meaningless.
If you code bubblesort as shown on the Wikipedia page, then it will stop in O(n) if the element that's out of order is "below" its proper place with respect to the sort iteration. If it's above, then it moves no more than one position toward its proper location on each pass.
To get the element unconditionally to its correct location in O(n), you'd need a variation of bubblesort that alternately makes passes in each direction.
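The direction dependence described above can be checked with a small sketch that counts passes. Both function names are illustrative. The first is the early-exit bubble sort in the style of the Wikipedia pseudocode, and the second is the bidirectional variant (often called cocktail shaker sort).

```python
def bubble_sort_passes(a):
    """Early-exit bubble sort, left to right; returns the number of passes."""
    passes = 0
    swapped = True
    while swapped:
        swapped = False
        passes += 1
        for i in range(1, len(a)):
            if a[i - 1] > a[i]:
                a[i - 1], a[i] = a[i], a[i - 1]
                swapped = True
    return passes


def cocktail_sort_passes(a):
    """Bubble sort alternating direction each pass; returns the pass count."""
    passes = 0
    lo, hi = 0, len(a) - 1
    swapped = True
    while swapped and lo < hi:
        swapped = False
        passes += 1
        for i in range(lo, hi):  # forward pass: carry a large item right
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        hi -= 1
        if not swapped:
            break
        swapped = False
        passes += 1
        for i in range(hi, lo, -1):  # backward pass: carry a small item left
            if a[i - 1] > a[i]:
                a[i - 1], a[i] = a[i], a[i - 1]
                swapped = True
        lo += 1
    return passes


if __name__ == "__main__":
    big_first = [9, 1, 2, 3, 4, 5, 6, 7, 8]   # large element too far left
    small_last = [2, 3, 4, 5, 6, 7, 8, 9, 1]  # small element too far right
    print(bubble_sort_passes(big_first[:]))
    print(bubble_sort_passes(small_last[:]))
    print(cocktail_sort_passes(small_last[:]))
```

With left-to-right passes, a large element at the front is carried to the end in a single pass, while a small element at the end moves only one position per pass, so the pass count grows with n. The bidirectional variant needs only a constant number of passes for either case.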
The conventional implementations of the other sorts are O(n log n) on nearly sorted input, though Quicksort can be O(n^2) if you're not careful. A proper implementation with a Dutch National Flag partition is required to prevent bad behavior.
Heapsort takes only O(n) time to build the heap, but Theta(n log n) time to pull n items off the heap in sorted order, each in Theta(log n) time.
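The two phases mentioned here can be seen in a minimal in-place heapsort sketch. It uses a max-heap, so that repeatedly moving the root to the end of the array yields ascending order. The helper name `sift_down` is illustrative.

```python
def sift_down(a, root, end):
    """Restore the max-heap property for the subtree at root within a[:end]."""
    while True:
        child = 2 * root + 1
        if child >= end:
            return
        # Pick the larger of the two children.
        if child + 1 < end and a[child + 1] > a[child]:
            child += 1
        if a[root] >= a[child]:
            return
        a[root], a[child] = a[child], a[root]
        root = child


def heapsort(a):
    """Sort a in place in ascending order."""
    n = len(a)
    # Phase 1: bottom-up heap construction, O(n) in total.
    for root in range(n // 2 - 1, -1, -1):
        sift_down(a, root, n)
    # Phase 2: n - 1 extractions, each an O(log n) sift from the root.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)


if __name__ == "__main__":
    nearly_sorted = [1, 2, 3, 9, 4, 5, 6, 7, 8]
    heapsort(nearly_sorted)
    print(nearly_sorted)
```

The second loop runs n - 1 times regardless of how ordered the input was, which is why a nearly sorted input does not make heapsort linear.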
In which cases can heap sort be used? As we know, heap sort has a complexity of O(n lg n), but it's used far less often than quicksort and merge sort. So when exactly do we use heap sort, and what are its drawbacks?
Characteristics of Heapsort
O(nlogn) time best, average, worst case performance
O(1) extra memory
Where to use it?
Guaranteed O(nlogn) performance. When you don't necessarily need very fast performance, but guaranteed O(nlogn) performance (e.g. in a game), because Quicksort's O(n^2) can be painfully slow. Why not use Mergesort then? Because it takes O(n) extra memory.
To avoid Quicksort's worst case. C++'s std::sort routine generally uses a variation of Quicksort called Introsort, which uses Heapsort to sort the current partition if the Quicksort recursion goes too deep, indicating that a worst case has occurred.
Partially sorted array even if stopped abruptly. We get a partially sorted array if Heapsort is somehow stopped abruptly. Might be useful, who knows?
Disadvantages
Relatively slow as compared to Quicksort
Cache inefficient
Not stable
Not really adaptive (Doesn't get faster if given somewhat sorted array)
Based on the Wikipedia article on sorting algorithms, it appears that Heapsort and Mergesort both have identical time complexity, O(n log n), for the best, average and worst case.
Quicksort has a disadvantage there, as its worst-case time complexity is O(n^2) (a).
Mergesort has the disadvantage that its memory complexity is O(n) whereas Heapsort is O(1). On the other hand, Mergesort is a stable sort and Heapsort is not.
So, based on that, I would choose Heapsort in preference to Mergesort if I didn't care about the stability of the sort, so as to minimise memory usage. If stability was required, I would choose MergeSort.
Or, more correctly, if I had huge amounts of data to sort, and I had to code my own algorithms to do it, I'd do that. For the vast majority of cases, the difference between the two is irrelevant, until your data sets get massive.
In fact, I've even used bubble sort in real production environments where no other sort was provided, because:
it's incredibly easy to write (even the optimised version);
it's more than efficient enough if the data has certain properties (either small datasets or datasets that were already mostly sorted before you added a couple of items).
Like goto and multiple return points, even seemingly bad algorithms have their place :-)
(a) And, before you wonder why C uses a less efficient algorithm, it doesn't (necessarily). Despite the qsort name, there's no mandate that it use Quicksort under the covers - that's a common misconception. It may well use one of the other algorithms.
Note that the running time of heap sort is O(n log n) irrespective of whether the array is already partially sorted in ascending or descending order.
See the link below for further detail on the big-O calculation:
https://ita.skanev.com/06/04/03.html
I recently started on a project with some code that had already been written. I decided to look into the original author's implementation and found that he had implemented a priority queue with a singly linked list.
My understanding of SLLs is that since you may have to iterate over the entire list, it's inefficient to implement it as such, which is why Heaps are preferred. However, perhaps I am missing some sort of reasoning behind it and was wondering if anyone has ever chosen an SLL over a Heap for a Priority Queue?
There are situations where a SLL is better than a heap for implementing a priority queue. For example:
When removing from the queue needs to be as fast as possible. Removing from an SLL is O(1) (from the front of the list/queue) while removing from a heap is O(log n). I actually ran into this while writing a version of the alarm() syscall for a simple OS. I simply could not afford the O(log n) lookup time. Related to this is when you need to remove multiple elements at a time. Removing k elements from an SLL takes O(k) time while it takes O(k log n) time for a heap.
Memory issues. The traditional implementation of a min or max heap involves an array, which needs to be resized as the heap grows. If you can't afford the time it takes to do a large realloc then this strategy is out. If you implement the heap as a binary tree, then you need two pointers instead of one for an SLL.
When you have to maintain multiple priorities. It is relatively easy to keep track of the same nodes in different linked lists. Doing this with heaps is much more complicated.
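The trade-off in the first point can be sketched with a priority queue kept as a sorted singly linked list. The class and method names are hypothetical. Insertion walks the list in O(n), and removal just unlinks the head in O(1). Entries with equal priority are served in insertion order, which is a design choice of this sketch.

```python
class Node:
    __slots__ = ("priority", "value", "next")

    def __init__(self, priority, value, next=None):
        self.priority = priority
        self.value = value
        self.next = next


class SortedListPQ:
    """Min-priority queue backed by a singly linked list kept in sorted order."""

    def __init__(self):
        self.head = None

    def push(self, priority, value):
        # O(n): find the last node whose priority is <= the new one.
        if self.head is None or priority < self.head.priority:
            self.head = Node(priority, value, self.head)
            return
        cur = self.head
        while cur.next is not None and cur.next.priority <= priority:
            cur = cur.next
        cur.next = Node(priority, value, cur.next)

    def pop(self):
        # O(1): the smallest priority is always at the head.
        if self.head is None:
            raise IndexError("pop from an empty priority queue")
        node = self.head
        self.head = node.next
        return node.priority, node.value


if __name__ == "__main__":
    pq = SortedListPQ()
    for prio, name in [(5, "e"), (1, "a"), (3, "c"), (1, "b")]:
        pq.push(prio, name)
    while pq.head is not None:
        print(pq.pop())
```

A binary heap reverses the costs: insertion becomes O(log n), but so does every removal. The list therefore only pays off when fast removal matters more than fast insertion.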
In college I was told that the only reason we were made to use Linked Lists was to help us with our understanding of pointers.
I came across the following question.
Given an array of n elements and an integer k where k < n. Elements {a_0 ... a_k} and {a_(k+1) ... a_n} are already sorted. Give an algorithm to sort in O(n) time and O(1) space.
It does not seem to me like it can be done in O(n) time and O(1) space. The problem really seems to be asking how to do the merge step of mergesort but in-place. If it was possible, wouldn't mergesort be implemented that way? I am unable to convince myself though and need some opinion.
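For comparison, this is the textbook merge step the question is trying to improve on, sketched with a hypothetical `merge_runs` function. It runs in O(n) time but copies the left run into a buffer, so it needs O(k) extra space. It is not the constant-space algorithm being asked for.

```python
def merge_runs(a, k):
    """Merge the sorted runs a[0..k] and a[k+1..] back into a.

    Uses a copy of the left run as auxiliary space (O(k) extra memory).
    """
    left = a[:k + 1]
    i, j, out = 0, k + 1, 0
    # The write position never overtakes j, so unread items of the
    # right run are never overwritten.
    while i < len(left) and j < len(a):
        if left[i] <= a[j]:
            a[out] = left[i]
            i += 1
        else:
            a[out] = a[j]
            j += 1
        out += 1
    # Leftover items of the right run are already in place;
    # only leftovers of the copied left run must be written back.
    while i < len(left):
        a[out] = left[i]
        i += 1
        out += 1


if __name__ == "__main__":
    data = [1, 3, 5, 7, 9, 2, 4, 6, 8]
    merge_runs(data, 4)
    print(data)
```

The difficulty of the constant-space version is doing the same work without the `left` copy.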
This seems to indicate that it is possible to do in O(lg^2 n) space. I cannot see how to prove that it is impossible to merge in constant space, but I cannot see how to do it either.
Edit:
Chasing references, Knuth Vol 3 - Exercise 5.5.3 says "A considerably more complicated algorithm of L. Trabb-Pardo provides the best possible answer to this problem: It is possible to do stable merging in O(n) time and stable sorting in O(n lg n) time, using only O(lg n) bits of auxiliary memory for a fixed number of index variables."
More references that I have not read. Thanks for an interesting problem.
Further edit:
This article claims that the article by Huang and Langston has an algorithm that merges two lists of size m and n in time O(m + n), so the answer to your question would seem to be yes. Unfortunately I do not have access to the article, so I must trust the second-hand information. I'm not sure how to reconcile this with Knuth's pronouncement that the Trabb-Pardo algorithm is optimal. If my life depended on it, I'd go with Knuth.
I now see that this has been asked as an earlier Stack Overflow question a number of times. I don't have the heart to flag it as a duplicate.
Huang B.-C. and Langston M. A., Practical in-place merging, Comm. ACM 31 (1988) 348-352
There are several algorithms for doing this, none of which are very easy to intuit. The key idea is to use a part of the arrays to be merged as a buffer, then do a standard merge using this buffer as auxiliary space. If you can then reposition the elements so that the buffer elements are in the right place, you're golden.
I have written up an implementation of one of these algorithms on my personal site if you're interested in looking at it. It's based on the paper "Practical In-Place Merging" by Huang and Langston. You probably will want to look over that paper for some insight.
I've also heard that there are good adaptive algorithms for this, which use some fixed-size buffer of your choosing (which could be O(1) if you wanted), but then scale elegantly with the buffer size. I don't know any of these off the top of my head, but I'm sure a quick search for "adaptive merge" might turn something up.
No it isn't possible, although my job would be much easier if it was :).
You have a O(log n) factor which you can't avoid. You can choose to take it as time or space, but the only way to avoid it is to not sort. With O(log n) space you can build a list of continuations that keep track of where you stashed the elements that didn't quite fit. With recursion this can be made to fit in O(1) heap, but that's only by using O(log n) stack frames instead.
Here is the progress of merge-sorting odds and evens from 1-9. Notice how you require log-space accounting to track the order inversions caused by the twin constraints of constant space and linear swaps.
. -
135792468
. -
135792468
: .-
125793468
: .-
123795468
#.:-
123495768
:.-
123459768
.:-
123456798
.-
123456789
123456789
There are some delicate boundary conditions, slightly harder to get right than binary search, even in this (possible) form. That makes it a bad homework problem, but a really good mental exercise.
Update
Apparently I am mistaken and there is an algorithm that provides O(n) time and O(1) space. I have downloaded the papers to enlighten myself, and withdraw this answer as incorrect.