I find it hard to understand Skiena's quick sort. Specifically, what is he doing in the partition function, and in particular with the firsthigh variable?
quicksort(item_type s[], int l, int h) {
int p; /* index of partition */
if ((h - l) > 0) {
p = partition(s, l, h);
quicksort(s, l, p-1);
quicksort(s, p+1, h);
}
}
We can partition the array in one linear scan for a particular pivot element by maintaining three sections of the array: less than the pivot (to the left of firsthigh), greater than or equal to the pivot (between firsthigh and i), and unexplored (to the right of i), as implemented below:
int partition(item_type s[], int l, int h) {

int i; /* counter */
int p; /* pivot element index */
int firsthigh; /* divider position for pivot element */
p = h;
firsthigh = l;
for (i = l; i <h; i++) {
if (s[i] < s[p]) {
swap(&s[i],&s[firsthigh]);
firsthigh ++;
}
swap(&s[p],&s[firsthigh]);
return(firsthigh);
}
I recommend following the reasoning with pencil and paper while reading through this answer and its worked example.
Some braces are missing from the snippet. Here is the corrected version:
int partition(item_type s[], int l, int h)
{
int i;/* counter */
int p;/* pivot element index */
int firsthigh;/* divider position for pivot element */
p = h;
firsthigh = l;
for (i = l; i < h; i++) {
if (s[i] < s[p]) {
swap(&s[i], &s[firsthigh]);
firsthigh++;
}
}
swap(&s[p], &s[firsthigh]);
return(firsthigh);
}
void quicksort(item_type s[], int l, int h)
{
int p; /* index of partition */
if ((h - l)>0) {
p = partition(s, l, h);
quicksort(s, l, p - 1);
quicksort(s, p + 1, h);
}
}
Anyway the partition function works as follows: suppose we have the array { 2,4,5,1,3 } of size 5. The algorithm grabs the last element 3 as the pivot and starts exploring the items iteratively:
2 is encountered first. Since 2 is less than the pivot element 3, it is swapped with the element at position 0, which is where firsthigh points. This has no effect since 2 is already at position 0
2,4,5,1,3
^
firsthigh is incremented since 2 is now a stable value at that position.
Then 4 is encountered. This time 4 is greater than 3 (than the pivot) so no swap is necessary. Notice that firsthigh continues pointing to 4. The same happens for 5.
When 1 is encountered, this value should be put after 2, therefore it is swapped with the position pointed by firsthigh, i.e. with 4's position
2,4,5,1,3
  ^   ^ swap
2,1,5,4,3
    ^ now firsthigh points here
When the elements end, the pivot element is swapped with firsthigh's position and therefore we get
2,1,| 3,| 4,5
Notice how the values less than the pivot are put on the left while the values greater than the pivot remain on the right. This is exactly what is expected of a partition function.
The position of the pivot element is returned and the process is repeated on the subarrays to the left and right of the pivot until a subarray with fewer than two elements is reached (the if condition is the bottom of the recursion).
Therefore firsthigh means: the first element greater than the pivot that I know of. In the example above firsthigh is put on the first element since we still don't know if that element is greater or less than the pivot
2,4,5,1,3
^
as soon as we realize 2 is not the first element greater than the pivot or we swap a less-than-the-pivot element in that position, we try to keep our invariant valid: ok, advance firsthigh and consider 4 as the first element greater than the pivot. This gives us the three sections cited in the textbook.
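The invariant described above can be checked mechanically. The following is only a sketch: it assumes item_type is int, and the names partition_checked and swap_ints are made up for this illustration. It is the same partition with assertions added after every step.

```c
#include <assert.h>

/* Hypothetical helper: swap two ints through pointers. */
static void swap_ints(int *a, int *b) {
    int t = *a;
    *a = *b;
    *b = t;
}

/* Partition s[l..h] around the pivot s[h]; returns the pivot's final index. */
int partition_checked(int s[], int l, int h) {
    int p = h;          /* pivot element index */
    int firsthigh = l;  /* first position not known to hold a value < pivot */
    for (int i = l; i < h; i++) {
        if (s[i] < s[p]) {
            swap_ints(&s[i], &s[firsthigh]);
            firsthigh++;
        }
        /* Invariant after processing s[i]:
           s[l..firsthigh-1] < pivot and s[firsthigh..i] >= pivot. */
        for (int k = l; k < firsthigh; k++) assert(s[k] < s[p]);
        for (int k = firsthigh; k <= i; k++) assert(s[k] >= s[p]);
    }
    swap_ints(&s[p], &s[firsthigh]);
    return firsthigh;
}
```

Called on { 2,4,5,1,3 } with l = 0 and h = 4, it returns 2 and leaves the array as 2,1,3,4,5, matching the walkthrough.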
At all times, everything strictly to the left of firstHigh is known to be less than the pivot (notice that there are initially no elements in this set), and everything at or to the right of it is either unknown, or known to be >= the pivot. Think of firstHigh as the next place where we can put a value lower than the pivot.
This algorithm is very similar to the in-place algorithm you would use to delete all items that are >= the pivot while "compacting" the remaining items as far to the left as possible. For the latter, you would maintain two indices l and firstHigh (which you could think of as from and to, respectively) that both start at 0, and walk l through the array; whenever you encounter an s[l] that should not be removed, you shunt it as far left as possible: i.e., you copy it to s[firstHigh] and then you increment firstHigh. This is safe because we always have firstHigh <= l. The only difference here is that we can't afford to overwrite the deleted (possibly->=-to-pivot) item currently residing at s[firstHigh], so we swap the two items instead.
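The compaction algorithm mentioned above might look like the sketch below. The function name compact_less_than and the use of int elements are assumptions made for illustration; from and to play the roles of l and firstHigh.

```c
#include <assert.h>
#include <stddef.h>

/* Keep only the items < pivot, packed to the left in their original order.
   Returns how many items were kept; the contents of s[kept..n-1] are
   whatever happens to be left over. */
size_t compact_less_than(int s[], size_t n, int pivot) {
    size_t to = 0;                       /* plays the role of firstHigh */
    for (size_t from = 0; from < n; from++) {
        assert(to <= from);              /* the write index never passes the read index */
        if (s[from] < pivot) {
            s[to] = s[from];             /* overwrite: the old s[to] is discarded */
            to++;
        }
    }
    return to;
}
```

Replacing the overwrite with a swap of s[to] and s[from] turns this into the loop of the partition function, since nothing is lost any more.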
I have been studying quick sort for a few hours and am confused about choosing a pivot value. Does the pivot value need to exist in the array?
For example if the array is 1,2,5,6 , can we use the value 3 or 4 as a pivot?
We use the position of pivot for dividing the array into sub-arrays but I am a little confused about what will be the pivot position after we move values < 5 to left of the array and values > 5 to right?
7,1,5,3,3,5,8,9,2,1
I dry-ran the algorithm with the pivot 5 and came up with the following result:
1,1,5,3,3,5,8,9,2,7
1,1,5,3,3,5,8,9,2,7
1,1,5,3,3,5,8,2,9,7
We can see that the value 2 is still not in the correct position. What am I doing wrong? Sorry if it's a silly question.
I came up with the following code but it's only working when pivot = left, I can't use a random pivot.
template <class T>
void quickSort(vector <T> & arr, int p, int r, bool piv_flag) {
if (p < r) {
int q, piv(p); counter++;
//piv = ((p + r) / 2); doesn't work
q = partition(arr, p, r, piv);
quickSort(arr, p, q - 1, piv_flag); //Sort left half
quickSort(arr, q + 1, r, piv_flag); //Sort right half
}
return;
}
int partition(vector <T> & arr, int left, int right, int piv) {
int i{ left - 0 }, j{ right + 0 }, pivot{ arr[piv] };
while (i < j) {
while (arr[i] <= pivot) { i++; }
while (arr[j] > pivot) { j--; }
if (i < j) (swap(arr[i], arr[j]));
else {
swap(arr[j], arr[piv]);
return j;
}
}
}
Thank you.
In many applications the pivot is chosen as some element of the array, but it can also be any value that you use to separate the numbers in the array into two groups. If the pivot value you choose is a specific element of the array, you need to place it between those two groups after you partition the array. If not, you can just proceed with the recursive sorting by passing the indices properly (i.e. keeping in mind that there is no pivot element in the array, just the two groups of values).
See this response to a similar question for a concise explanation of some widely-used alternatives for selecting a pivot.
The most important function of the pivot is to serve as a boundary between the groups we are trying to create during the partitioning phase of quicksort. The goal, and the challenge, is to create those groups so that they are equal or almost equal in size, which lets quicksort work efficiently. That challenge is the reason so many pivot selection methods have been devised (i.e. so that, at least in most cases, the numbers are separated into groups of similar size).
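To illustrate the first point, here is a minimal sketch in C of partitioning around a value that need not occur in the array. The name partition_by_value is made up for this example. Note one caveat that applies when using it inside a quicksort: if every element falls on the same side of the chosen value, the subarray does not shrink, so the caller has to pick a different value in that case.

```c
#include <assert.h>

/* Rearranges a[lo..hi] so that all values < pivot_value come first.
   Returns the index of the first element >= pivot_value
   (hi + 1 if there is none). No pivot element is placed anywhere. */
int partition_by_value(int a[], int lo, int hi, int pivot_value) {
    assert(lo <= hi + 1);
    int boundary = lo;                   /* a[lo..boundary-1] < pivot_value */
    for (int i = lo; i <= hi; i++) {
        if (a[i] < pivot_value) {
            int t = a[i];
            a[i] = a[boundary];
            a[boundary] = t;
            boundary++;
        }
    }
    return boundary;
}
```

For the array 1,2,5,6 and the value 3, this returns 2: the two groups are a[0..1] and a[2..3], and the recursion would continue on exactly those index ranges.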
As to the second part of your question regarding how the position of the pivot will change once the partitioning is done, see below for a sample partitioning phase.
Say we have an array A with elements [4,1,5,3,3,5,8,9,2,1] and we chose pivot to be the first element, namely 4. The letter E used below indicates the end of the elements that are smaller than the pivot. (i.e. the last element that is smaller than the pivot)
   E
[4,1,5,3,3,5,8,9,2,1]
     E
[4,1,3,5,3,5,8,9,2,1]
       E
[4,1,3,3,5,5,8,9,2,1]
         E
[4,1,3,3,2,5,8,9,5,1]
           E
[4,1,3,3,2,1,8,9,5,5]
[1,1,3,3,2,4,8,9,5,5] // swap pivot with the rightmost element that is smaller than its value
After this partitioning, the elements are obviously still not sorted. But all the elements to the left of 4 are smaller than 4, and all the ones to its right are larger than 4. To sort them, we recursively use Quicksort on those groups.
Based on your code, below is a sample partitioning code based on the procedure I described above. You may also observe its execution here.
template <class T>
int partition(vector<T>& arr, int left, int right, int piv) {
int leftmostSmallerThanPivot = left;
if(piv != left)
swap(arr[piv], arr[left]);
for(int i=left+1; i <= right; ++i) {
if(arr[i] < arr[left])
swap(arr[++leftmostSmallerThanPivot], arr[i]);
}
swap(arr[left], arr[leftmostSmallerThanPivot]);
return leftmostSmallerThanPivot;
}
template <class T>
void quickSort(vector<T>& arr, int p, int r) {
if (p < r) {
int q, piv(p);
piv = ((p + r) / 2); // works
q = partition(arr, p, r, piv);
quickSort(arr, p, q - 1); //Sort left half
quickSort(arr, q + 1, r); //Sort right half
}
}
Suppose you are provided with the following function declaration in the C programming language.
int partition(int a[], int n);
The function treats the first element of a[] as a pivot and rearranges the array so that all elements less than or equal to the pivot are in the left part of the array, and all elements greater than the pivot are in the right part. In addition, it moves the pivot so that the pivot is the last element of the left part. The return value is the number of elements in the left part.
The following partially given function in the C programming language is used to find the kth smallest element in an array a[] of size n using the partition function. We assume k≤n.
int kth_smallest (int a[], int n, int k)
{
int left_end = partition (a, n);
if (left_end+1==k) {
return a[left_end];
}
if (left_end+1 > k) {
return kth_smallest (___________);
} else {
return kth_smallest (___________);
}
}
The missing argument lists are, respectively:
(1) (a, left_end, k) and (a+left_end+1, n-left_end-1, k-left_end-1)
(2) (a, left_end, k) and (a, n-left_end-1, k-left_end-1)
(3) (a+left_end+1, n-left_end-1, k-left_end-1) and (a, left_end, k)
(4) (a, n-left_end-1, k-left_end-1) and (a, left_end, k)
I found here a nice explanation of "How to find the kth largest element in an unsorted array of length n in O(n)?"
I've read about partition as used in quick sort. The given answer is option (1). I agree with the answer, but I need a formal explanation.
Could you explain a little bit, please?
Edit: AFAIK, the partition algorithm puts the chosen pivot in its correct position. We need to apply the partition algorithm recursively to find the kth smallest element in an array. Each time, the partition algorithm runs on a single side of the array, either to the left or to the right of the placed pivot. I got stuck here. I think which side depends on the index k?
It's simple. Say the pivot turns out to be the q th smallest element of the array. In that case, partition leaves q-1 elements in the left half and n-q elements in the right half, with the q th smallest element, the pivot, between them. Now there are 3 possibilities:
If q is k, you have the answer, which is your first return statement.
If q > k, then the k th smallest element is in the left half of the array, and within the left half it is still the k th smallest element. So we recurse on the left half of the array with the same k.
If q < k, then the k th smallest element is in the right half of the array. Also, since there are q elements smaller than the smallest element of this right part, the k th smallest element of the original array is the (k - q) th smallest in the right part. So we recurse on the right part with k - q.
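The three cases can be sketched in C as follows, with the blanks filled in according to option (1). The helper partition_first is not given in the question; it is an assumed implementation that returns the final index of the pivot, which is how the question's code uses left_end when it reads a[left_end].

```c
#include <assert.h>

/* Assumed helper: the pivot is a[0]. Afterwards a[0..idx-1] <= pivot,
   a[idx] == pivot and a[idx+1..n-1] > pivot. Returns idx. */
int partition_first(int a[], int n) {
    assert(n > 0);
    int last = 0;                        /* a[1..last] <= pivot so far */
    for (int i = 1; i < n; i++) {
        if (a[i] <= a[0]) {
            last++;
            int t = a[i]; a[i] = a[last]; a[last] = t;
        }
    }
    int t = a[0]; a[0] = a[last]; a[last] = t;   /* place the pivot */
    return last;
}

/* Returns the kth smallest element of a[0..n-1], 1 <= k <= n. */
int kth_smallest(int a[], int n, int k) {
    assert(1 <= k && k <= n);
    int left_end = partition_first(a, n);        /* q = left_end + 1 */
    if (left_end + 1 == k)                       /* q == k */
        return a[left_end];
    if (left_end + 1 > k)                        /* q > k: look left, same k */
        return kth_smallest(a, left_end, k);
    /* q < k: look right, skipping the q elements up to and including the pivot */
    return kth_smallest(a + left_end + 1, n - left_end - 1, k - left_end - 1);
}
```

Passing a + left_end + 1 makes the right part look like an array that starts at index 0, which is why k has to be reduced by q = left_end + 1.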
EDIT:
Adding comments to your code:
int partition(int a[], int n); //breaks array into 2 parts, according to pivot (1st element of array): left part is smaller than or equal to the pivot, right part is larger than the pivot.
Now, your recursive algorithm:
int kth_smallest (int a[], int n, int k)
{
int left_end = partition (a, n); //index of the pivot (originally a[0]) in its sorted position
if (left_end+1==k) { //kth smallest element found
return a[left_end];
}
if (left_end+1 > k) { //kth smallest element is in the left part of the array, and is the kth smallest there
return kth_smallest (___________);
} else { //kth smallest element is in the right part of the array, and is the (k - left_end - 1)th smallest there
return kth_smallest (___________);
}
}
I am reading ANSI C by K&R. I came across the qsort program and would like a little help. Suppose I have 9 elements with indices 0 to 8. Please read the comments to see whether I am understanding it correctly. Thanks a lot for your efforts.
void qsort(int v[] , int left, int right)
{
int i, j, last;
void swap(int v[], int i, int j);
if(left >= right) /*if the array has only one element return it*/
return;
swap(v,left, (left+right)/2); /* now, left=(left+right)/2= 0+8/2= 4 we have 4 as left*/
last= left; /* putting left = last of the first partition group i.e. last=4*/
for(i=left+1; i<=right,i++) /* now looping from 4+1=5 to 8 with increment of 1*/
if(v[i] < v[left]) /*if value at 5th is less than value at 4th */
swap(v, ++last, i);
I have a problem with this last swap step. With my values, ++last gives 4+1 = 5 and i is also 5, so it swaps position 5 with position 5? How should I understand this? Shouldn't the swap be between positions 4 and 5, not 5 and 5?
code continues
swap(v,left, last);
qsort(v,left,last-1);
qsort(v,last+1,right);
}
Firstly, you have a small misconception about the swap function. Let's say the prototype of the function is -
swap(int array[], int i, int j)
The swap function exchanges the elements at array[i] and array[j]; it swaps elements of the array, not the indices. So, the line -
swap(v, left, (left + right) / 2);
means that the middle element of the array is swapped with the leftmost element. Clearly, this quicksort takes the middle element as the pivot. The swap has no effect on the local variables or parameters. With your example input, 'left' = 0 and 'right' = 8, even after the swap. This is where you got confused: the elements of the array are swapped, not the values of the variables. So, now, the line -
last = left;
makes, 'last' point to the location of the pivot ('left'), so here the value of 'last' = 0 not 4. So, the loop,
for(i = left + 1; i <= right; i++)
Runs from i = 1 to 8. BTW, you forgot the semicolon..! Then, the line,
if(v[i] < v[left])
checks if the current element ('v[i]') is less than the pivot ('v[left]') or not. Then, accordingly swaps the lesser elements as in the line,
swap(v, ++last, i);
from the location (last + 1) up to wherever it increments to. So, after the loop, the elements from 'left + 1' through 'last' are less than the pivot and the elements to the right of 'last' are greater than or equal to it. I think you are missing another line, where we bring the pivot, which sat at 'v[left]' during the partitioning, back to its place between the two groups. Then the recursive calls play their roles. If you are looking for help with quicksort, this is a good place to start!
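For reference, a swap with the prototype discussed above could be written as in the sketch below. The body is a plausible version supplied for illustration, not quoted from the question. The indices i and j are passed by value, so whatever the function does, the caller's left and right keep their values.

```c
#include <assert.h>

/* Exchange the elements v[i] and v[j]. Only the array contents change;
   the caller's index variables are copies and stay as they were. */
void swap(int v[], int i, int j) {
    int temp = v[i];
    v[i] = v[j];
    v[j] = temp;
}
```

With 9 elements, left = 0 and right = 8, the call swap(v, left, (left + right) / 2) exchanges v[0] and v[4], and afterwards left is still 0 and right is still 8.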
I hope my answer has helped you, if it did, let me know..! ☺
I have a problem understanding quicksort algorithm (the simplified version without pointers) from K&R. There is already a thorough explanation provided by Dave Gamble here explanation.
However I noticed that by starting with a slightly changed string we can obtain no swaps during many loops of the for loop.
Firstly the code:
void qsort(int v[], int left, int right)
{
int i, last;
void swap(int v[], int i, int j);
if (left >= right) /* do nothing if array contains */
return; /* fewer than two elements */
swap(v, left, (left + right)/2); /* move partition elem */
last = left; /* to v[0] */
for (i = left + 1; i <= right; i++) /* partition */
if (v[i] < v[left])
swap(v, ++last, i);
swap(v, left, last); /* restore partition elem */
qsort(v, left, last-1);
qsort(v, last+1, right);
}
Walkthrough in my opinion:
we start with CADBE; left=0; right=4; D is the pivot
so according to algorithm we swap D with C obtaining DACBE
last = left =0
i = 1 if ( v[1] < v[0] ) it is true so we swap v[1] (because last is incremented before operation) with v[1] so nothing changes, last = 1, still having DACBE;
now i = 2 if ( v[2] < v[0] ) -> true so we swap v[2] with v[2] nothing changed again; last = 2
now i = 3 if ( v[3] < v[0] ) -> true so we swap v[3] with v[3] nothing changed AGAIN (!), last = 3
So apparently something is wrong, algorithm does nothing.
Your opinions appreciated very much. I must be wrong, authors are better than me ;D
Thanks in advance!
The loop goes from left + 1 up to and including right. When i=4, the test fails and last does not get incremented.
Then the recursive calls sort BACDE with left=0,right=2 and left=4,right=4. (Which is correct when D is the pivot.)
Well, it just so happened that your input sub-array ACBE is already partitioned by D (ACB is smaller than D and E is bigger than D), so it is not surprising the partitioning cycle does not physically swap any values.
In reality, it is not correct to say that it "does nothing". It does not reorder anything in the cycle, since your input data need no extra reordering. But it still does one thing: it finds the value of last that says where smaller elements end and bigger elements begin, i.e. it separates ACBE into ACB and E parts. The cycle ends with last == 3, which is the partitioning point for further recursive steps.
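To see that the loop still produces a result, the partitioning part can be pulled out into its own function that returns last. This is a sketch for illustration only; the names kr_partition and swap_at are made up, and the recursive calls are left out.

```c
#include <assert.h>

/* Hypothetical helper with the same behaviour as the swap used in the question. */
static void swap_at(int v[], int i, int j) {
    int t = v[i];
    v[i] = v[j];
    v[j] = t;
}

/* The partitioning part of the qsort above, without the recursion.
   Returns last, the final index of the pivot. */
int kr_partition(int v[], int left, int right) {
    int last;
    swap_at(v, left, (left + right) / 2);   /* move pivot to v[left] */
    last = left;
    for (int i = left + 1; i <= right; i++)
        if (v[i] < v[left])
            swap_at(v, ++last, i);          /* may swap an element with itself */
    swap_at(v, left, last);                 /* restore pivot */
    return last;
}
```

For CADBE with left = 0 and right = 4 it returns 3 and leaves the array as BACDE: every swap inside the loop exchanges an element with itself, yet last has moved from 0 to 3.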
I have a hard time translating QuickSort with Hoare partitioning into C code, and can't find out why. The code I'm using is shown below:
void QuickSort(int a[],int start,int end) {
int q=HoarePartition(a,start,end);
if (end<=start) return;
QuickSort(a,q+1,end);
QuickSort(a,start,q);
}
int HoarePartition (int a[],int p, int r) {
int x=a[p],i=p-1,j=r;
while (1) {
do j--; while (a[j] > x);
do i++; while (a[i] < x);
if (i < j)
swap(&a[i],&a[j]);
else
return j;
}
}
Also, I don't really get why HoarePartition works. Can someone explain why it works, or at least link me to an article that does?
I have seen a step-by-step work-through of the partitioning algorithm, but I don't have an intuitive feel for it. In my code, it doesn't even seem to work. For example, given the array
13 19 9 5 12 8 7 4 11 2 6 21
It will use pivot 13, but end up with the array
6 2 9 5 12 8 7 4 11 19 13 21
And will return j which is a[j] = 11. I thought it was supposed to be true that the array starting at that point and going forward should have values that are all larger than the pivot, but that isn't true here because 11 < 13.
Here's pseudocode for Hoare partitioning (from CLRS, second edition), in case this is useful:
Hoare-Partition (A, p, r)
    x ← A[p]
    i ← p − 1
    j ← r + 1
    while TRUE
        repeat j ← j − 1
        until A[j] ≤ x
        repeat i ← i + 1
        until A[i] ≥ x
        if i < j
            exchange A[i] ↔ A[j]
        else return j
Thanks!
EDIT:
The right C code for this problem will end up being:
void QuickSort(int a[],int start,int end) {
int q;
if (end-start<2) return;
q=HoarePartition(a,start,end);
QuickSort(a,start,q);
QuickSort(a,q,end);
}
int HoarePartition (int a[],int p, int r) {
int x=a[p],i=p-1,j=r;
while (1) {
do j--; while (a[j] > x);
do i++; while (a[i] < x);
if (i < j)
swap(&a[i],&a[j]);
else
return j+1;
}
}
To answer the question of "Why does Hoare partitioning work?":
Let's simplify the values in the array to just three kinds: L values (those less than the pivot value), E values (those equal to the pivot value), and G values (those larger than the pivot value).
We'll also give a special name to one location in the array; we'll call this location s, and it's the location where the j pointer is when the procedure finishes. Do we know ahead of time which location s is? No, but we know that some location will meet that description.
With these terms, we can express the goal of the partitioning procedure in slightly different terms: it is to split a single array into two smaller sub-arrays which are not mis-sorted with respect to each other. That "not mis-sorted" requirement is satisfied if the following conditions are true:
The "low" sub-array, that goes from the left end of the array up to and includes s, contains no G values.
The "high" sub-array, that starts immediately after s and continues to the right end, contains no L values.
That's really all we need to do. We don't even need to worry where the E values wind up on any given pass. As long as each pass gets the sub-arrays right with respect to each other, later passes will take care of any disorder that exists inside any sub-array.
So now let's address the question from the other side: how does the partitioning procedure ensure that there are no G values in s or to the left of it, and no L values to the right of s?
Well, "the set of values to the right of s" is the same as "the set of cells the j pointer moves over before it reaches s". And "the set of values to the left of and including s" is the same as "the set of values that the i pointer moves over before j reaches s".
That means that any values which are misplaced will, on some iteration of the loop, be under one of our two pointers. (For convenience, let's say it's the j pointer pointing at a L value, though it works exactly the same for the i pointer pointing at a G value.) Where will the i pointer be, when the j pointer is on a misplaced value? We know it will be:
at a location in the "low" subarray, where the L value can go with no problems;
pointing at a value that's either an E or a G value, which can easily replace the L value under the j pointer. (If it wasn't on an E or a G value, it wouldn't have stopped there.)
Note that sometimes the i and j pointer will actually both stop on E values. When this happens, the values will be switched, even though there's no need for it. This doesn't do any harm, though; we said before that the placement of the E values can't cause mis-sorting between the sub-arrays.
So, to sum up, Hoare partitioning works because:
It separates an array into smaller sub-arrays which are not mis-sorted relative to each other;
If you keep doing that and recursively sorting the sub-arrays, eventually there will be nothing left of the array that's unsorted.
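The two "not mis-sorted" conditions are easy to test in code. The sketch below assumes inclusive bounds p and r as in the CLRS pseudocode quoted in the question; split_is_valid is a made-up name for a checker of those two conditions.

```c
#include <assert.h>

/* Hoare partition of a[p..r] (inclusive bounds, p < r), pivot value a[p].
   Returns s, the position of the j pointer when the procedure finishes. */
int hoare_partition(int a[], int p, int r) {
    assert(p < r);
    int x = a[p], i = p - 1, j = r + 1;
    for (;;) {
        do j--; while (a[j] > x);        /* stop on an L or E value */
        do i++; while (a[i] < x);        /* stop on an E or G value */
        if (i >= j)
            return j;
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}

/* Returns 1 if the split at s meets both conditions, 0 otherwise. */
int split_is_valid(const int a[], int p, int s, int r, int pivot) {
    for (int k = p; k <= s; k++)
        if (a[k] > pivot) return 0;      /* a G value in the "low" sub-array */
    for (int k = s + 1; k <= r; k++)
        if (a[k] < pivot) return 0;      /* an L value in the "high" sub-array */
    return 1;
}
```

On the array 13 19 9 5 12 8 7 4 11 2 6 21 from the question, the partition returns s = 8 and the check passes: the low sub-array 6 2 9 5 12 8 7 4 11 has no G values and the high sub-array 19 13 21 has no L values, even though the pivot 13 is not at the boundary.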
I believe that there are two problems with this code. For starters, in your Quicksort function, I think you want to reorder the lines
int q=HoarePartition(a,start,end);
if (end<=start) return;
so that you have them like this:
if (end<=start) return;
int q=HoarePartition(a,start,end);
However, you should do even more than this; in particular this should read
if (end - start < 2) return;
int q=HoarePartition(a,start,end);
The reason for this is that the Hoare partition fails to work correctly if the range you're trying to partition has size zero or one. In my edition of CLRS this isn't mentioned anywhere; I had to go to the book's errata page to find this. This is almost certainly the cause of the problem you encountered with the "access out of range" error, since with that invariant broken you might run right off the array!
As for an analysis of Hoare partitioning, I would suggest starting off by just tracing through it by hand. There's also a more detailed analysis here. Intuitively, it works by growing two ranges from the ends of the range toward one another - one on the left-hand side containing elements smaller than the pivot and one on the right-hand side containing elements larger than the pivot. This can be slightly modified to produce the Bentley-McIlroy partitioning algorithm (referenced in the link) that scales nicely to handle equal keys.
Hope this helps!
Your final code is wrong, since the initial value of j should be r + 1 instead of r. Otherwise your partition function always ignores the last value.
Actually, HoarePartition works because for any array A[p...r] which contains at least 2 elements(i.e. p < r), every element of A[p...j] is <= every element of A[j+1...r] when it terminates.
So the next two segments that the main algorithm recurs on are [start...q] and [q+1...end]
So the right C code is as follows:
void QuickSort(int a[],int start,int end) {
if (end <= start) return;
int q=HoarePartition(a,start,end);
QuickSort(a,start,q);
QuickSort(a,q + 1,end);
}
int HoarePartition (int a[],int p, int r) {
int x=a[p],i=p-1,j=r+1;
while (1) {
do j--; while (a[j] > x);
do i++; while (a[i] < x);
if (i < j)
swap(&a[i],&a[j]);
else
return j;
}
}
More clarifications:
The partition part is just a translation of the pseudocode. (Note that the return value is j.)
For the recursive part, note the base case check: it is end <= start rather than end <= start + 1, otherwise you would skip two-element subarrays such as [2 1].
First of all, you misunderstood Hoare's partition algorithm, as can be seen from your C translation.
Since you took the leftmost element of the subarray as the pivot,
I'll explain it with the leftmost element as the pivot too.
int HoarePartition (int a[],int p, int r)
Here p and r represent the lower and upper bounds of the array to be partitioned, which can also be part of a larger array (a subarray).
So we start with the markers i and j initially pointing just before and just after the end points of the array (simply because we are using do-while loops). Therefore,
i=p-1,
j=r+1; //here you made the mistake
Now, as per partitioning, we want every element in the left part to be less than or equal to the pivot and every element in the right part to be greater than or equal to the pivot.
So we move the 'i' marker until we reach an element that is greater than or equal to the pivot, and similarly the 'j' marker until we find an element less than or equal to the pivot.
Now if i < j we swap the elements, because both elements are in the wrong part of the array. So the code will be
do j--; while (a[j] > x); //keep moving while a[j] is greater than the pivot
do i++; while (a[i] < x); //keep moving while a[i] is less than the pivot
if (i < j)
swap(&a[i],&a[j]);
Now if 'i' is not less than 'j', there are no elements left in between to swap, so we return the position 'j'.
So after partitioning, the lower half of the array is from 'start' to 'j'
and the upper half is from 'j+1' to 'end',
so the code will look like
void QuickSort(int a[],int start,int end) {
if (end<=start) return;
int q=HoarePartition(a,start,end);
QuickSort(a,start,q);
QuickSort(a,q+1,end);
}
Straightforward implementation in Java.
public class QuickSortWithHoarePartition {
public static void sort(int[] array) {
sortHelper(array, 0, array.length - 1);
}
private static void sortHelper(int[] array, int p, int r) {
if (p < r) {
int q = doHoarePartitioning(array, p, r);
sortHelper(array, p, q);
sortHelper(array, q + 1, r);
}
}
private static int doHoarePartitioning(int[] array, int p, int r) {
int pivot = array[p];
int i = p - 1;
int j = r + 1;
while (true) {
do {
i++;
}
while (array[i] < pivot);
do {
j--;
}
while (array[j] > pivot);
if (i < j) {
swap(array, i, j);
} else {
return j;
}
}
}
private static void swap(int[] array, int i, int j) {
int temp = array[i];
array[i] = array[j];
array[j] = temp;
}
}
Your last C code works, but it's not intuitive.
Luckily, I'm studying CLRS right now.
In my opinion, the pseudocode in CLRS (2nd edition) is wrong.
In the end, I found that it would be right after changing one place:
Hoare-Partition (A, p, r)
    x ← A[p]
    i ← p − 1
    j ← r + 1
    while TRUE
        repeat j ← j − 1
        until A[j] ≤ x
        repeat i ← i + 1
        until A[i] ≥ x
        if i < j
            exchange A[i] ↔ A[j]
        else
            exchange A[r] ↔ A[i]
            return i
Yes, adding an exchange A[r] ↔ A[i] can make it work.
Why?
Because A[i] is now bigger than A[r], or i == r.
So we must exchange them to guarantee the partition property.
Move the pivot to the first position. (E.g., choose it by median of three; switch to insertion sort for small input sizes.)
Partition:
Repeatedly swap the current leftmost 1 with the current rightmost 0,
where 0 means cmp(val, pivot) == true and 1 means cmp(val, pivot) == false.
Stop when left < right no longer holds.
After that, swap the pivot with the rightmost 0.
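A possible C rendering of these steps is sketched below. It assumes the pivot has already been moved to a[lo] and that cmp means "less than"; both cmp_less and partition_zero_one are names invented for this illustration.

```c
#include <assert.h>

/* Assumed comparison: an item counts as a "0" when val < pivot. */
static int cmp_less(int val, int pivot) {
    return val < pivot;
}

/* Partition a[lo..hi]; the pivot is assumed to sit at a[lo] already.
   Returns the pivot's final index. */
int partition_zero_one(int a[], int lo, int hi) {
    assert(lo <= hi);
    int pivot = a[lo];
    int left = lo + 1, right = hi;
    for (;;) {
        /* advance to the leftmost 1 */
        while (left <= right && cmp_less(a[left], pivot)) left++;
        /* retreat to the rightmost 0 */
        while (left <= right && !cmp_less(a[right], pivot)) right--;
        if (!(left < right))
            break;
        int t = a[left]; a[left] = a[right]; a[right] = t;
    }
    /* right is now the rightmost 0, or lo if there is no 0 at all */
    int t = a[lo]; a[lo] = a[right]; a[right] = t;
    return right;
}
```

For 4,1,5,3,3,5,8,9,2,1 with the pivot 4 in front, this returns 5 and leaves 4 at index 5, with the five smaller values to its left.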