Quick sort programmed in C

I am reading ANSI C by K&R. I came across the qsort program and would like a little help. Suppose I have 9 elements with indices 0 to 8. Please read the comments to see whether I am understanding it correctly. Thanks a lot for your efforts.
void qsort(int v[] , int left, int right)
{
int i, j, last;
void swap(int v[], int i, int j);
if(left >= right) /*if the array has only one element return it*/
return;
swap(v,left, (left+right)/2); /* now, left=(left+right)/2= 0+8/2= 4 we have 4 as left*/
last= left; /* putting left = last of the first partition group i.e. last=4*/
for(i=left+1; i<=right,i++) /* now looping from 4+1=5 to 8 with increment of 1*/
if(v[i] < v[left]) /*if value at 5th is less than value at 4th */
swap(v, ++last, i);
I have a problem with this last swap step. With my values, ++last gives 4+1 = 5 and i is also 5, so the call swaps position 5 with position 5. How should I understand this? Shouldn't the swap be between 4 and 5, not 5 and 5?
code continues
swap(v,left, last);
qsort(v,left,last-1);
qsort(v,last+1,right);
}

Firstly, you have a small misconception about the swap function. Let's say the prototype of the function is -
swap(int array[], int i, int j)
The swap function swaps the elements at positions array[i] and array[j]; it exchanges array elements and nothing else. So, the line -
swap(v, left, (left + right) / 2);
means that the middle element of the array is swapped with the leftmost element. Clearly, the quicksort is taking the middle element as the pivot. This swap has no effect on the local variables or parameters. With your example input, the value of 'left' is 0 and the value of 'right' is 8, even after the swap. This is where you got confused: the elements of the array are swapped, not the values of the variables. So now the line -
last = left;
makes, 'last' point to the location of the pivot ('left'), so here the value of 'last' = 0 not 4. So, the loop,
for(i = left + 1; i <= right; i++)
runs from i = 1 to 8. (By the way, your loop header has a comma where the second semicolon should be.) Then the line
if(v[i] < v[left])
checks if the current element ('v[i]') is less than the pivot ('v[left]') or not. Then, accordingly swaps the lesser elements as in the line,
swap(v, ++last, i);
starting from location (last + 1) and moving on as 'last' is incremented. So, once the loop ends, the elements from 'left + 1' through 'last' are less than the pivot and the elements to the right of 'last' are greater than or equal to it. The line after the loop, swap(v, left, last), then brings the pivot, which sat at 'v[left]' during the partitioning, back between the two groups. Then the recursive calls play their roles.
I hope my answer has helped you, if it did, let me know..! ☺
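To make the point about elements versus indices concrete, here is a minimal sketch of the routine with a swap function filled in. The name kr_qsort is my own choice (an assumption, picked to avoid clashing with the standard library's qsort), and the assert is only a sanity check added for this sketch; the body otherwise follows the code in the question.

```c
#include <assert.h>

/* Exchange the ELEMENTS v[i] and v[j]; the indices themselves never change. */
static void swap(int v[], int i, int j)
{
    int tmp = v[i];
    v[i] = v[j];
    v[j] = tmp;
}

/* K&R-style quicksort on v[left..right], both bounds inclusive.
   kr_qsort is a hypothetical name used only in this sketch. */
void kr_qsort(int v[], int left, int right)
{
    int i, last;

    assert(v != 0);
    if (left >= right)                  /* fewer than two elements */
        return;
    swap(v, left, (left + right) / 2);  /* pivot VALUE moves to v[left]; left stays left */
    last = left;                        /* v[left+1..last] holds elements < pivot (empty so far) */
    for (i = left + 1; i <= right; i++)
        if (v[i] < v[left])
            swap(v, ++last, i);         /* may swap an element with itself; harmless */
    swap(v, left, last);                /* put the pivot between the two groups */
    kr_qsort(v, left, last - 1);
    kr_qsort(v, last + 1, right);
}
```

With nine elements the first call is kr_qsort(v, 0, 8). After swap(v, 0, 4) the variable left is still 0, so the loop runs i = 1..8, and a call such as swap(v, 1, 1) simply leaves the array unchanged while last advances.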

Related

Skiena's Quick Sort implementation

I find it hard to understand Skiena's quick sort. Specifically, what he is doing with the partition function, especially the firsthigh parameter?
quicksort(item_type s[], int l, int h) {
int p; /* index of partition */
if ((h - l) > 0) {
p = partition(s, l, h);
quicksort(s, l, p-1);
quicksort(s, p+1, h);
}
}
We can partition the array in one linear scan for a particular pivot element by maintaining three sections of the array: less than the pivot (to the left of firsthigh), greater than or equal to the pivot (between firsthigh and i), and unexplored (to the right of i), as implemented below:
int partition(item_type s[], int l, int h) {

int i; /* counter */
int p; /* pivot element index */
int firsthigh; /* divider position for pivot element */
p = h;
firsthigh = l;
for (i = l; i <h; i++) {
if (s[i] < s[p]) {
swap(&s[i],&s[firsthigh]);
firsthigh ++;
}
swap(&s[p],&s[firsthigh]);
return(firsthigh);
}
I recommend following the reasoning with pencil and paper while reading through this answer and its considered example case
Some braces are missing from the snippet:
int partition(item_type s[], int l, int h)
{
int i;/* counter */
int p;/* pivot element index */
int firsthigh;/* divider position for pivot element */
p = h;
firsthigh = l;
for (i = l; i < h; i++) {
if (s[i] < s[p]) {
swap(&s[i], &s[firsthigh]);
firsthigh++;
}
}
swap(&s[p], &s[firsthigh]);
return(firsthigh);
}
void quicksort(item_type s[], int l, int h)
{
int p; /* index of partition */
if ((h - l)>0) {
p = partition(s, l, h);
quicksort(s, l, p - 1);
quicksort(s, p + 1, h);
}
}
Anyway the partition function works as follows: suppose we have the array { 2,4,5,1,3 } of size 5. The algorithm grabs the last element 3 as the pivot and starts exploring the items iteratively:
2 is first encountered.. since 2 is less than the pivot element 3, it is swapped with the position 0 pointed by firsthigh. This has no effect since 2 is already at position 0
2,4,5,1,3
^
firsthigh is incremented since 2 is now a stable value at that position.
Then 4 is encountered. This time 4 is greater than 3 (than the pivot) so no swap is necessary. Notice that firsthigh continues pointing to 4. The same happens for 5.
When 1 is encountered, this value should be put after 2, therefore it is swapped with the position pointed by firsthigh, i.e. with 4's position
2,4,5,1,3
^ ^ swap
2,1,5,4,3
^ now firsthigh points here
When the elements end, the pivot element is swapped with firsthigh's position and therefore we get
2,1,| 3,| 4,5
notice how the values less than the pivot are put on the left while the values greater than the pivot remain on the right. Exactly what is expected by a partition function.
The position of the pivot element is returned and the process is repeated on the subarrays on the left and right of the pivot until a group of fewer than two elements is encountered (the if condition is the bottom of the recursion).
Therefore firsthigh means: the first element greater than the pivot that I know of. In the example above firsthigh is put on the first element since we still don't know if that element is greater or less than the pivot
2,4,5,1,3
^
as soon as we realize 2 is not the first element greater than the pivot or we swap a less-than-the-pivot element in that position, we try to keep our invariant valid: ok, advance firsthigh and consider 4 as the first element greater than the pivot. This gives us the three sections cited in the textbook.
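One way to convince yourself of the three sections is to let the code check them while it runs. The following is only a sketch: item_type is assumed to be int, the pointer-based swap is my guess at the helper the book uses, and partition_checked is a made-up name. The asserts state the invariant before each element is examined.

```c
#include <assert.h>

typedef int item_type;              /* assumption: plain ints for this sketch */

static void swap(item_type *a, item_type *b)
{
    item_type tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Same partition as above, with the invariant checked on every iteration. */
int partition_checked(item_type s[], int l, int h)
{
    int i, k;
    int firsthigh = l;              /* divider position for the pivot */
    item_type pivot = s[h];         /* the last element is the pivot */

    assert(l <= h);
    for (i = l; i < h; i++) {
        /* section 1: everything left of firsthigh is < pivot */
        for (k = l; k < firsthigh; k++)
            assert(s[k] < pivot);
        /* section 2: everything from firsthigh up to i-1 is >= pivot */
        for (k = firsthigh; k < i; k++)
            assert(s[k] >= pivot);
        /* section 3 (s[i..h-1]) is still unexplored */
        if (s[i] < pivot) {
            swap(&s[i], &s[firsthigh]);
            firsthigh++;
        }
    }
    swap(&s[h], &s[firsthigh]);     /* drop the pivot into its final place */
    return firsthigh;
}
```

Run on { 2,4,5,1,3 } with l = 0 and h = 4, this returns 2 and leaves the array as 2,1,3,4,5, matching the walkthrough.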
At all times, everything strictly to the left of firsthigh is known to be less than the pivot (notice that there are initially no elements in this set), and everything at or to the right of it is either unknown, or known to be >= the pivot. Think of firsthigh as the next place where we can put a value lower than the pivot.
This algorithm is very similar to the in-place algorithm you would use to delete all items that are >= the pivot while "compacting" the remaining items as far to the left as possible. For the latter, you would maintain two indices l and firsthigh (which you could think of as from and to, respectively) that both start at 0, and walk l through the array; whenever you encounter an s[l] that should not be removed, you shunt it as far left as possible: i.e., you copy it to s[firsthigh] and then you increment firsthigh. This is safe because we always have firsthigh <= l. The only difference here is that we can't afford to overwrite the item currently residing at s[firsthigh] (which may be >= the pivot and must be kept), so we swap the two items instead.
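For comparison, the deletion/compaction routine described in that analogy might look like the sketch below. The function name keep_less_than and the from/to variable names are assumptions made for illustration; the only difference from the partition loop is that the item is copied instead of swapped.

```c
#include <assert.h>
#include <stddef.h>

/* Keep only the items that are < pivot, packed to the left of s.
   Returns how many items were kept. `from` and `to` play the roles
   of l and firsthigh in the description. */
size_t keep_less_than(int s[], size_t n, int pivot)
{
    size_t from;
    size_t to = 0;

    assert(s != NULL || n == 0);
    for (from = 0; from < n; from++) {
        if (s[from] < pivot) {
            s[to] = s[from];   /* overwrite: whatever was at s[to] is discarded */
            to++;              /* to <= from always holds, so nothing unread is lost */
        }
    }
    return to;
}
```

Called on { 2,4,5,1,3 } with pivot 3, it keeps 2 and 1 in the first two slots and returns 2. Replacing the assignment with a swap of s[to] and s[from] is what turns it into the partition loop.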

quicksort code understanding

I have quicksort code that is supposed to run on the text "B A T T A J U S" (ignore blanks), but I don't seem to understand the code that well.
void quicksort (itemType a[], int l, int r)
{
int i, j; itemType v;
if (r>l)
{
v = a[r]; i = l-1; j = r;
for (;;)
{
while (a[++i] < v);
while (a[--j] >= v);
if (i >= j) break;
swap(a,i,j);
}
swap(a,i,r);
quicksort(a,l,i-1);
quicksort(a,i+1,r);
}
}
I can explain what I understand: the first if checks whether l < r, which in this case it is, since S is greater than B. Then I get a little confused: v is set equal to a[r]; does this mean S, since S is all the way to the right? Then l is set to outside the "array" since it's -1 (so it's undefined, I assume). Then j is set equal to r, but is that the position r, as in S?
I don't really understand what values are set to what: whether a[r] is the letter at that position or something else. Hopefully someone can explain how the first swap works, so I can learn this.
It is probably better to start with an understanding of the QuickSort algorithm, and then see how the code corresponds to it, than to study the code to try to figure out how QuickSort works. Basic QuickSort (which is what you have) is in fact a pretty simple algorithm. To sort an array A:
1. If the length of A is less than 2 then the array is already sorted. Otherwise,
2. Select any element of A to be a "pivot element".
3. Rearrange the other elements as needed so that all those that are less than the pivot are at the beginning of A, and those that are greater than or equal to the pivot are at the end. (This particular version also puts the pivot itself between the two, which is common but not strictly necessary; it could simply be included in the upper subarray, and the algorithm would still work.)
4. Apply the QuickSort procedure to each of the two sub-arrays produced by (3).
Your particular code chooses the right-most element of each (sub)array as the pivot element, and at step (4) it excludes the pivot from the sub-arrays to be recursively sorted.
Quick sort works by separating your array into a "left" subarray which contains only values strictly less than an arbitrarily chosen pivot value and a "right" subarray that contains only elements that are greater than or equal to the pivot. Once the array has been divided like this, each of the two subarrays is sorted using the same algorithm. Here is how this applies to your code:
v = a[r] sets the pivot value to the last element in the array. This works well since the array is presumably unsorted to begin with, so a[r] is as good a value as any.
while(a[++i] < v) ; advances i until it reaches the first element of the left sub-array that is greater than or equal to the pivot, v. When this loop ends, i is the index of an element that should be in the right sub-array rather than the left.
while(a[--j] >= v) ; does the same thing from the other end: it moves j leftward until it reaches the last element of the right sub-array that is strictly less than the pivot, v. When this loop ends, j is the index of an element that should be in the left sub-array rather than the right.
Whenever we find a pair of elements that are in the wrong sub-arrays, we swap them.
When the whole array has been partitioned (i meets j), we swap the pivot with the element at index i (which is guaranteed to belong in the right sub-array).
Since the pivot is guaranteed to be in the right position (left sub-array is strictly less and right sub-array is greater than or equal), we need to sort the sub-arrays but not the pivot. That is why the recursive calls use indices l,i-1 and i+1,r, leaving the pivot at index i.
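To try the steps above on "BATTAJUS", here is a small sketch. It assumes itemType is char and supplies an index-based swap. It also adds a j > l guard to the second inner loop; that guard is my addition, not part of the posted code, because without it the scan can step below index l when no element smaller than the pivot exists in the range.

```c
#include <assert.h>

typedef char itemType;          /* assumption: single letters, as in the example */

static void swap(itemType a[], int i, int j)
{
    itemType tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
}

void quicksort(itemType a[], int l, int r)
{
    int i, j;
    itemType v;

    assert(a != 0);
    if (r > l) {
        v = a[r];               /* pivot VALUE: the rightmost element */
        i = l - 1;              /* i and j are indices, not letters */
        j = r;
        for (;;) {
            while (a[++i] < v)              /* stops at a[r] == v at the latest */
                ;
            while (j > l && a[--j] >= v)    /* guard added in this sketch */
                ;
            if (i >= j)
                break;
            swap(a, i, j);      /* both elements were on the wrong side */
        }
        swap(a, i, r);          /* pivot goes to its final position i */
        quicksort(a, l, i - 1);
        quicksort(a, i + 1, r);
    }
}
```

On the first call (l = 0, r = 7) the pivot is 'S'; i stops at index 2 ('T') and j at index 5 ('J'), so the first swap exchanges those two letters. After the first partitioning pass the text reads BAJASTUT with 'S' at index 4, and the finished sort gives AABJSTTU.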
I can't offer a solution in that exact form. That code is overly complicated in my thinking.
Also not sure if what I'm proposing is a bubble sort, or modified bubble, but to me just easier. My added comment is that quicksort() is calling itself, therefore it is recursive. Not good in my book for something as simple as a sort. This all depends on what you need for size and efficiency. If you're sorting many terms, then my proposed sort is not the best.
for(i = 0; i < (n - 1); i++) {
for(j = (i + 1); j < n; j++) {
if(value[i] > value[j]) {
tmp = value[i];
value[i] = value[j];
value[j] = tmp;
}
}
}
Where
n is the number of total elements.
i, j, and tmp are integers
value[] is an array of integers to sort

What is the purpose of swap(v, left, (left + right)/2) in K&R's qsort() implementation?

in K&R second edition, section 5.11, page 107:
void qsort(void *v[], int left, int right, int (*comp)(void *, void *))
{
int i, last;
void swap(void *v[], int, int);
if (left >= right)
return;
swap(v, left, (left + right)/2);
last = left;
for (i = left+1; i <= right; i++)
if ((*comp)(v[i], v[left]) < 0) /* Here's the function call */
swap(v, ++last, i);
swap(v, left, last);
qsort(v, left, last-1, comp);
qsort(v, last+1, right, comp);
}
However, I am confused about the "swap(v, left, (left + right)/2);". I think it is useless... What's the purpose of this sentence?
Without that statement there is no problem as long as the array v[] holds random data.
However, if v[] is already sorted, the test in the loop never succeeds, no swap is performed, and the variable last never changes.
The last recursive call is then equivalent to qsort(v, left+1, right, comp), so each call shrinks the problem by only one element.
If the first call makes n comparisons, the whole sort needs n + (n-1) + (n-2) + ... + 1 = n(n+1)/2 comparisons, which takes a long time when n is big.
Furthermore, if v[] is very large, the recursion becomes so deep that a stack overflow may occur.
The statement exists to prevent both problems.
In addition, (left + right) / 2 should be written as left + (right - left) / 2 to prevent integer overflow.
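The effect on sorted input can be measured with a small experiment. This is a sketch under my own assumptions: it sorts plain ints instead of going through a comp function, and the names comparisons, sort_counting, count_on_sorted and use_middle are made up for the illustration. The midpoint is computed in the overflow-safe form.

```c
#include <assert.h>

static long comparisons;        /* counter used only for this experiment */

static void swap(int v[], int i, int j)
{
    int tmp = v[i];
    v[i] = v[j];
    v[j] = tmp;
}

/* use_middle != 0: middle element becomes the pivot (as in K&R).
   use_middle == 0: the swap is left out, so v[left] is the pivot. */
static void sort_counting(int v[], int left, int right, int use_middle)
{
    int i, last;

    if (left >= right)
        return;
    if (use_middle)
        swap(v, left, left + (right - left) / 2);   /* overflow-safe midpoint */
    last = left;
    for (i = left + 1; i <= right; i++) {
        comparisons++;
        if (v[i] < v[left])
            swap(v, ++last, i);
    }
    swap(v, left, last);
    sort_counting(v, left, last - 1, use_middle);
    sort_counting(v, last + 1, right, use_middle);
}

/* Sort the already-sorted array 0..n-1 and report the comparisons made. */
long count_on_sorted(int n, int use_middle)
{
    int v[1000];
    int k;

    assert(n >= 0 && n <= 1000);
    for (k = 0; k < n; k++)
        v[k] = k;
    comparisons = 0;
    sort_counting(v, 0, n - 1, use_middle);
    return comparisons;
}
```

For 100 sorted elements, count_on_sorted(100, 0) reports 4950 comparisons (99 + 98 + ... + 1) with a recursion depth of about 100, while count_on_sorted(100, 1) stays well under 1000 because every pivot splits its range roughly in half.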
This swap picks the pivot element and places it at the leftmost position. The variable last then tracks the boundary between the two partitions: it is the index of the last element found so far that is less than the pivot, so the right partition starts at last + 1. We never actually count elements; we only maintain that position. After the loop, we can safely put the pivot back where it belongs (immediately to the left of the right partition), which is what the second swap does. The left partition is formed at no extra cost, since every element that is neither the pivot nor part of the right partition must belong to the left one.
Still, I strongly recommend simply taking a piece of paper, writing down some random integer array, and walking through the code line by line to see how the partitioning process is performed.

quicksort special case - seems to be a faulty algorithm from K&R

I have a problem understanding the quicksort algorithm (the simplified version without pointers) from K&R. Dave Gamble has already provided a thorough explanation.
However I noticed that by starting with a slightly changed string we can obtain no swaps during many loops of the for loop.
Firstly the code:
void qsort(int v[], int left, int right)
{
int i, last;
void swap(int v[], int i, int j);
if (left >= right) /* do nothing if array contains */
return; /* fewer than two elements */
swap(v, left, (left + right)/2); /* move partition elem */
last = left; /* to v[0] */
for (i = left + 1; i <= right; i++) /* partition */
if (v[i] < v[left])
swap(v, ++last, i);
swap(v, left, last); /* restore partition elem */
qsort(v, left, last-1);
qsort(v, last+1, right);
}
Walkthrough in my opinion:
we start with CADBE; left=0; right=4; D is the pivot
so according to algorithm we swap D with C obtaining DACBE
last = left =0
i = 1 if ( v[1] < v[0] ) it is true so we swap v[1] (because last is incremented before operation) with v[1] so nothing changes, last = 1, still having DACBE;
now i = 2 if ( v[2] < v[0] ) -> true so we swap v[2] with v[2] nothing changed again; last = 2
now i = 3 if ( v[3] < v[0] ) -> true so we swap v[3] with v[3] nothing changed AGAIN (!), last = 3
So apparently something is wrong, algorithm does nothing.
Your opinions appreciated very much. I must be wrong, authors are better than me ;D
Thanks in advance!
The loop goes from left + 1 up to and including right. When i=4, the test fails and last does not get incremented.
Then the recursive calls sort BACDE with left=0,right=2 and left=4,right=4. (Which is correct when D is the pivot.)
Well, it just so happened that your input sub-array ACBE is already partitioned by D (ACB is smaller than D and E is bigger than D), so it is not surprising the partitioning cycle does not physically swap any values.
In reality, it is not correct to say that it "does nothing". It does not reorder anything in the cycle, since your input data need no extra reordering. But it still does one thing: it finds the value of last that says where smaller elements end and bigger elements begin, i.e. it separates ACBE into ACB and E parts. The cycle ends with last == 3, which is the partitioning point for further recursive steps.
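You can check that result by running just one partitioning pass on CADBE. The helper below is only a sketch: kr_partition is a name I made up, and it is simply the body of qsort without the two recursive calls, returning the final value of last.

```c
#include <assert.h>

static void swap(char v[], int i, int j)
{
    char tmp = v[i];
    v[i] = v[j];
    v[j] = tmp;
}

/* One K&R partitioning pass over v[left..right].
   Returns last, which is also where the pivot ends up. */
int kr_partition(char v[], int left, int right)
{
    int i, last;

    assert(left <= right);
    swap(v, left, (left + right) / 2);  /* move partition elem to v[left] */
    last = left;
    for (i = left + 1; i <= right; i++)
        if (v[i] < v[left])
            swap(v, ++last, i);         /* a self-swap still advances last */
    swap(v, left, last);                /* restore partition elem */
    return last;
}
```

For CADBE with left = 0 and right = 4 it returns 3 and leaves BACDE: none of the swaps inside the loop moved anything, yet last went from 0 to 3, and that value is what the two recursive calls need.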

QuickSort and Hoare Partition

I have a hard time translating QuickSort with Hoare partitioning into C code, and can't find out why. The code I'm using is shown below:
void QuickSort(int a[],int start,int end) {
int q=HoarePartition(a,start,end);
if (end<=start) return;
QuickSort(a,q+1,end);
QuickSort(a,start,q);
}
int HoarePartition (int a[],int p, int r) {
int x=a[p],i=p-1,j=r;
while (1) {
do j--; while (a[j] > x);
do i++; while (a[i] < x);
if (i < j)
swap(&a[i],&a[j]);
else
return j;
}
}
Also, I don't really get why HoarePartition works. Can someone explain why it works, or at least link me to an article that does?
I have seen a step-by-step work-through of the partitioning algorithm, but I don't have an intuitive feel for it. In my code, it doesn't even seem to work. For example, given the array
13 19 9 5 12 8 7 4 11 2 6 21
It will use pivot 13, but end up with the array
6 2 9 5 12 8 7 4 11 19 13 21
And will return j which is a[j] = 11. I thought it was supposed to be true that the array starting at that point and going forward should have values that are all larger than the pivot, but that isn't true here because 11 < 13.
Here's pseudocode for Hoare partitioning (from CLRS, second edition), in case this is useful:
Hoare-Partition (A, p, r)
x ← A[p]
i ← p − 1
j ← r + 1
while TRUE
repeat j ← j − 1
until A[j] ≤ x
repeat i ← i + 1
until A[i] ≥ x
if i < j
exchange A[i] ↔ A[j]
else return j
Thanks!
EDIT:
The right C code for this problem will end up being:
void QuickSort(int a[],int start,int end) {
int q;
if (end-start<2) return;
q=HoarePartition(a,start,end);
QuickSort(a,start,q);
QuickSort(a,q,end);
}
int HoarePartition (int a[],int p, int r) {
int x=a[p],i=p-1,j=r;
while (1) {
do j--; while (a[j] > x);
do i++; while (a[i] < x);
if (i < j)
swap(&a[i],&a[j]);
else
return j+1;
}
}
To answer the question of "Why does Hoare partitioning work?":
Let's simplify the values in the array to just three kinds: L values (those less than the pivot value), E values (those equal to the pivot value), and G values (those larger than the pivot value).
We'll also give a special name to one location in the array; we'll call this location s, and it's the location where the j pointer is when the procedure finishes. Do we know ahead of time which location s is? No, but we know that some location will meet that description.
With these terms, we can express the goal of the partitioning procedure in slightly different terms: it is to split a single array into two smaller sub-arrays which are not mis-sorted with respect to each other. That "not mis-sorted" requirement is satisfied if the following conditions are true:
The "low" sub-array, that goes from the left end of the array up to and includes s, contains no G values.
The "high" sub-array, that starts immediately after s and continues to the right end, contains no L values.
That's really all we need to do. We don't even need to worry where the E values wind up on any given pass. As long as each pass gets the sub-arrays right with respect to each other, later passes will take care of any disorder that exists inside any sub-array.
So now let's address the question from the other side: how does the partitioning procedure ensure that there are no G values in s or to the left of it, and no L values to the right of s?
Well, "the set of values to the right of s" is the same as "the set of cells the j pointer moves over before it reaches s". And "the set of values to the left of and including s" is the same as "the set of values that the i pointer moves over before j reaches s".
That means that any values which are misplaced will, on some iteration of the loop, be under one of our two pointers. (For convenience, let's say it's the j pointer pointing at a L value, though it works exactly the same for the i pointer pointing at a G value.) Where will the i pointer be, when the j pointer is on a misplaced value? We know it will be:
at a location in the "low" subarray, where the L value can go with no problems;
pointing at a value that's either an E or a G value, which can easily replace the L value under the j pointer. (If it wasn't on an E or a G value, it wouldn't have stopped there.)
Note that sometimes the i and j pointer will actually both stop on E values. When this happens, the values will be switched, even though there's no need for it. This doesn't do any harm, though; we said before that the placement of the E values can't cause mis-sorting between the sub-arrays.
So, to sum up, Hoare partitioning works because:
It separates an array into smaller sub-arrays which are not mis-sorted relative to each other;
If you keep doing that and recursively sorting the sub-arrays, eventually there will be nothing left of the array that's unsorted.
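The "not mis-sorted" condition can be checked mechanically. The sketch below is my own illustration, not code from the book: it assumes inclusive bounds with j starting at r + 1 as in the CLRS pseudocode, and hoare_partition, split_is_valid and hoare_quicksort are names invented for it.

```c
#include <assert.h>

static void swap(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Hoare partition of a[p..r], both bounds inclusive; returns s, the final j. */
int hoare_partition(int a[], int p, int r)
{
    int x = a[p];               /* pivot value: the leftmost element */
    int i = p - 1;
    int j = r + 1;

    assert(p <= r);
    for (;;) {
        do j--; while (a[j] > x);   /* j skips G values */
        do i++; while (a[i] < x);   /* i skips L values */
        if (i < j)
            swap(&a[i], &a[j]);
        else
            return j;
    }
}

/* Returns 1 if no element of a[p..s] is larger than any element of a[s+1..r]. */
int split_is_valid(const int a[], int p, int s, int r)
{
    int lo, hi;

    for (lo = p; lo <= s; lo++)
        for (hi = s + 1; hi <= r; hi++)
            if (a[lo] > a[hi])
                return 0;
    return 1;
}

void hoare_quicksort(int a[], int start, int end)
{
    int q;

    if (end <= start)
        return;
    q = hoare_partition(a, start, end);
    hoare_quicksort(a, start, q);       /* low part includes position q */
    hoare_quicksort(a, q + 1, end);
}
```

For the array in the question the partition returns 8 and a[8] is 11, which is what the question observed. That is consistent with the conditions above: s marks the end of the low sub-array, not the start of the high one, so a[0..8] contains no G values and a[9..11] contains no L values.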
I believe that there are two problems with this code. For starters, in your Quicksort function, I think you want to reorder the lines
int q=HoarePartition(a,start,end);
if (end<=start) return;
so that you have them like this:
if (end<=start) return;
int q=HoarePartition(a,start,end);
However, you should do even more than this; in particular this should read
if (end - start < 2) return;
int q=HoarePartition(a,start,end);
The reason for this is that the Hoare partition fails to work correctly if the range you're trying to partition has size zero or one. In my edition of CLRS this isn't mentioned anywhere; I had to go to the book's errata page to find this. This is almost certainly the cause of the problem you encountered with the "access out of range" error, since with that invariant broken you might run right off the array!
As for an analysis of Hoare partitioning, I would suggest starting off by just tracing through it by hand. There's also a more detailed analysis here. Intuitively, it works by growing two ranges from the ends of the range toward one another - one on the left-hand side containing elements smaller than the pivot and one on the right-hand side containing elements larger than the pivot. This can be slightly modified to produce the Bentley-McIlroy partitioning algorithm (referenced in the link) that scales nicely to handle equal keys.
Hope this helps!
Your final code is wrong, since the initial value of j should be r + 1 instead of r. Otherwise your partition function always ignores the last value.
Actually, HoarePartition works because for any array A[p...r] which contains at least 2 elements(i.e. p < r), every element of A[p...j] is <= every element of A[j+1...r] when it terminates.
So the next two segments that the main algorithm recurs on are [start...q] and [q+1...end]
So the right C code is as follows:
void QuickSort(int a[],int start,int end) {
if (end <= start) return;
int q=HoarePartition(a,start,end);
QuickSort(a,start,q);
QuickSort(a,q + 1,end);
}
int HoarePartition (int a[],int p, int r) {
int x=a[p],i=p-1,j=r+1;
while (1) {
do j--; while (a[j] > x);
do i++; while (a[i] < x);
if (i < j)
swap(&a[i],&a[j]);
else
return j;
}
}
More clarifications:
partition part is just the translation of the pseudocode. (Note the return value is j)
For the recursive part, note the base case check: it is end <= start rather than end <= start + 1, otherwise you will skip a two-element subarray such as [2 1].
First of all, you misunderstood Hoare's partition algorithm, as can be seen from your translation into C.
Since you took the leftmost element of the subarray as the pivot,
I'll explain it using the leftmost element as the pivot.
int HoarePartition (int a[],int p, int r)
Here p and r are the lower and upper bounds of the array to be partitioned, which may itself be a subarray of a larger array.
We start with the markers pointing just before and just after the ends of the array (simply because we use do-while loops, which move a marker before testing it). Therefore,
i=p-1,
j=r+1; // this is where you made the mistake
Partitioning should leave every element in the lower part less than or equal to the pivot and every element in the upper part greater than or equal to it.
So we move the 'i' marker until we reach an element that is greater than or equal to the pivot, and similarly the 'j' marker until we find an element less than or equal to the pivot.
Now if i < j we swap the two elements, because both are in the wrong part of the array. So the code will be
do j--; while (a[j] > x); // keep moving while a[j] is greater than the pivot
do i++; while (a[i] < x); // keep moving while a[i] is less than the pivot
if (i < j)
swap(&a[i],&a[j]);
Now if 'i' is not less than 'j', there is no element left in between to swap, so we return position 'j'.
After partitioning, the lower half runs from 'start' to 'j'
and the upper half from 'j+1' to 'end',
so the code will look like
void QuickSort(int a[],int start,int end) {
if (end<=start) return;
int q=HoarePartition(a,start,end);
QuickSort(a,start,q);
QuickSort(a,q+1,end);
}
Straightforward implementation in Java.
public class QuickSortWithHoarePartition {
public static void sort(int[] array) {
sortHelper(array, 0, array.length - 1);
}
private static void sortHelper(int[] array, int p, int r) {
if (p < r) {
int q = doHoarePartitioning(array, p, r);
sortHelper(array, p, q);
sortHelper(array, q + 1, r);
}
}
private static int doHoarePartitioning(int[] array, int p, int r) {
int pivot = array[p];
int i = p - 1;
int j = r + 1;
while (true) {
do {
i++;
}
while (array[i] < pivot);
do {
j--;
}
while (array[j] > pivot);
if (i < j) {
swap(array, i, j);
} else {
return j;
}
}
}
private static void swap(int[] array, int i, int j) {
int temp = array[i];
array[i] = array[j];
array[j] = temp;
}
}
Your last C code works, but it's not intuitive.
Luckily, I'm studying CLRS right now.
In my opinion, the pseudocode in CLRS (2nd edition) is wrong.
Eventually I found that it would be right if one place were changed.
Hoare-Partition (A, p, r)
x ← A[p]
i ← p − 1
j ← r + 1
while TRUE
repeat j ← j − 1
until A[j] ≤ x
repeat i ← i + 1
until A[i] ≥ x
if i < j
exchange A[i] ↔ A[j]
else
exchange A[r] ↔ A[i]
return i
Yes, adding an exchange A[r] ↔ A[i] makes it work.
Why?
Because A[i] is now bigger than A[r], or i == r.
So we must exchange them to guarantee the partition property.
Move the pivot to the first position. (E.g., use median of three; switch to insertion sort for small input sizes.)
Partition:
repeatedly swap the current leftmost 1 with the current rightmost 0,
where 0 means cmp(val, pivot) == true and 1 means cmp(val, pivot) == false;
stop when left < right no longer holds.
After that, swap the pivot with the rightmost 0.
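Read literally, those notes could be turned into something like the sketch below. It rests on several assumptions of mine: cmp(val, pivot) is taken to be val < pivot, the pivot is simply the first element (no median of three, no insertion-sort cutoff), and is_zero, partition01 and quicksort01 are made-up names.

```c
#include <assert.h>

static void swap(int a[], int i, int j)
{
    int tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
}

/* "0" means cmp(val, pivot) is true; here cmp is assumed to be val < pivot. */
static int is_zero(int val, int pivot)
{
    return val < pivot;
}

/* Partition a[lo..hi] around the pivot a[lo]; returns the pivot's final index. */
int partition01(int a[], int lo, int hi)
{
    int pivot = a[lo];
    int left = lo + 1;
    int right = hi;

    assert(lo <= hi);
    for (;;) {
        while (left <= right && is_zero(a[left], pivot))    /* find leftmost 1 */
            left++;
        while (left <= right && !is_zero(a[right], pivot))  /* find rightmost 0 */
            right--;
        if (!(left < right))
            break;
        swap(a, left, right);
    }
    swap(a, lo, right);     /* right is now the rightmost 0, or lo if there is none */
    return right;
}

void quicksort01(int a[], int lo, int hi)
{
    int q;

    if (lo >= hi)
        return;
    q = partition01(a, lo, hi);
    quicksort01(a, lo, q - 1);
    quicksort01(a, q + 1, hi);
}
```

On { 3,1,4,1,5,9,2,6 } the partition swaps 4 with 2, then stops, swaps the pivot 3 with the rightmost 0 and returns index 3, leaving 1,1,2,3,5,9,4,6.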
