I am writing a simple merge sort function to sort based on a given compar function:
void merge(int left, int mid, int right, int(*compar)(const void *, const void *))
{
// sublist sizes
int left_size = mid - left + 1;
int right_size = right - mid;
// counts
int i, j, k;
// create left and right arrays
B *left_list = (B*) malloc(left_size*sizeof(B));
B *right_list = (B*) malloc(right_size*sizeof(B));
// copy sublists, could be done with memcpy()?
for (i = 0; i < left_size; i++)
left_list[i] = list[left + i];
for (j = 0; j < right_size; j++)
right_list[j] = list[mid + j + 1];
// reset counts
i = 0; j = 0;
for (k = left; k <= right; k++)
{
if (j == right_size)
list[k] = left_list[i++];
else if (i == left_size)
list[k] = right_list[j++];
// here we call the given comparison function
else if (compar(&left_list[i], &right_list[j]) < 0)
list[k] = left_list[i++];
else
list[k] = right_list[j++];
}
}
void sort(int left, int right, int(*compar)(const void *, const void *))
{
if (left < right)
{
// find the pivot point
int mid = (left + right) / 2;
// recursive step
sort(left, mid, compar);
sort(mid + 1, right, compar);
// merge resulting sublists
merge(left, mid, right, compar);
}
}
I am then calling this several times on the same list array using different comparison functions. I am finding that the sort is stable for the first call, but then after that I see elements are swapped even though they are equal.
Can anyone suggest the reason for this behaviour?
I'm not sure if this will do it but try changing this line:
compar(&left_list[i], &right_list[j]) < 0
to this:
compar(&left_list[i], &right_list[j]) <= 0
This will make it so that if they are already equal it does the first action which will (hopefully) preserve the stability rather than moving things around.
This is just a guess though.
I think you got your sizes wrong
int left_size = mid - left;
And, as pointed out by arasmussen, you need to give preference to the left list in order to maintain stability
compar(&left_list[i], &right_list[j]) <= 0
In addition to all of this, you are not calling free after malloc-ing the helper lists. This will not make the algorithm return incorrect results, but it will cause your program's memory use to grow irreversibly every time you call the sort function.
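Putting the two points above together (prefer the left run on ties, and free the helper arrays), a minimal sketch could look like the following. It is not the asker's exact code: the Item type, its tag field, the by_key comparator and the names merge_items and sort_items are hypothetical, and the array is passed as a parameter instead of being a global. The run sizes assume inclusive bounds, as in the question.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type standing in for B from the question. */
typedef struct {
    int key;
    int tag;   /* original position, only used to observe stability */
} Item;

typedef int (*Compar)(const void *, const void *);

/* Merges list[left..mid] with list[mid+1..right].
   Returns 0 on success, -1 if an allocation fails. */
static int merge_items(Item *list, int left, int mid, int right, Compar compar)
{
    int left_size = mid - left + 1;
    int right_size = right - mid;
    Item *left_list = malloc(left_size * sizeof *left_list);
    Item *right_list = malloc(right_size * sizeof *right_list);
    if (left_list == NULL || right_list == NULL) {
        free(left_list);
        free(right_list);
        return -1;
    }
    memcpy(left_list, list + left, left_size * sizeof *left_list);
    memcpy(right_list, list + mid + 1, right_size * sizeof *right_list);

    int i = 0, j = 0;
    for (int k = left; k <= right; k++) {
        if (j == right_size)
            list[k] = left_list[i++];
        else if (i == left_size)
            list[k] = right_list[j++];
        /* <= 0: on a tie take from the left run, so equal items keep their order */
        else if (compar(&left_list[i], &right_list[j]) <= 0)
            list[k] = left_list[i++];
        else
            list[k] = right_list[j++];
    }
    free(left_list);    /* release the helper arrays on every call */
    free(right_list);
    return 0;
}

/* Sorts list[left..right]; returns 0 on success, -1 on allocation failure. */
static int sort_items(Item *list, int left, int right, Compar compar)
{
    if (left >= right)
        return 0;
    int mid = left + (right - left) / 2;
    if (sort_items(list, left, mid, compar) != 0) return -1;
    if (sort_items(list, mid + 1, right, compar) != 0) return -1;
    return merge_items(list, left, mid, right, compar);
}

/* Example comparator: orders by key only, so ties are possible. */
static int by_key(const void *a, const void *b)
{
    const Item *x = a, *y = b;
    return (x->key > y->key) - (x->key < y->key);
}
```

Sorting items with keys 2, 1, 2, 1, 2 through sort_items(a, 0, 4, by_key) should leave the tags of equal keys in ascending order, which is a quick way to check stability across repeated calls with different comparators.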
I need to implement a quicksort algorithm that uses a random pivot; I'm working with big matrices, so I can't afford the worst case.
Now, I've found this implementation that works correctly, but it uses the first element as the pivot.
I've modified it to fit my scenario (I'm working with Sparse Matrices, and I need to sort the elements by "row index, col index") and this is what I have:
void quicksortSparseMatrix(struct sparsematrix *matrix,int first,int last){
int i, j, pivot, temp_I, temp_J;
double temp_val;
if(first<last){
pivot=first; //(rand() % (last - first + 1)) + first;
i=first;
j=last;
while(i<j){
while(lessEqual(matrix,i, pivot)&&i<last)
i++;
while(greater(matrix,j, pivot))
j--;
if(i<j){
temp_I = matrix->I[i];
temp_J = matrix->J[i];
temp_val = matrix->val[i];
matrix->I[i] = matrix->I[j];
matrix->J[i] = matrix->J[j];
matrix->val[i] = matrix->val[j];
matrix->I[j]=temp_I;
matrix->J[j]=temp_J;
matrix->val[j]=temp_val;
}
}
temp_I = matrix->I[pivot];
temp_J = matrix->J[pivot];
temp_val = matrix->val[pivot];
matrix->I[pivot] = matrix->I[j];
matrix->J[pivot] = matrix->J[j];
matrix->val[pivot] = matrix->val[j];
matrix->I[j]=temp_I;
matrix->J[j]=temp_J;
matrix->val[j]=temp_val;
quicksortSparseMatrix(matrix,first,j-1);
quicksortSparseMatrix(matrix,j+1,last);
}
}
Now, the problem is that some of the matrices I'm working with are almost sorted and the algorithm runs extremely slowly. I want to modify my algorithm to make it use a random pivot, but if I apply the change you see commented out in the code above, pivot=(rand() % (last - first + 1)) + first;, the algorithm does not sort the data correctly.
Can anyone help me figure out how to change the algorithm to use a random pivot and sort the data correctly?
EDIT: this is the struct sparsematrix definition, I don't think you need it, but for completeness...
struct sparsematrix {
int M, N, nz;
int *I, *J;
double *val;
};
Pivot should be a value, not an index. The first comparison should be lessthan (not lessthanorequal), which will also eliminate the need for checking for i < last . After swapping, there should be i++ and j-- . The last two lines should be quicksortSparseMatrix(matrix,first,j); and quicksortSparseMatrix(matrix,i,last); , for this variation of Hoare partition scheme. Example code for array:
void QuickSort(int *a, int lo, int hi)
{
int i, j;
int p, t;
if(lo >= hi)
return;
p = a[lo + 1 + (rand() % (hi - lo))];
i = lo;
j = hi;
while (i <= j){
while (a[i] < p)i++;
while (a[j] > p)j--;
if (i > j)
break;
t = a[i];
a[i] = a[j];
a[j] = t;
i++;
j--;
}
QuickSort(a, lo, j);
QuickSort(a, i, hi);
}
A merge sort on an array of indexes to rows of matrix may be faster: more moves of the indexes, but fewer compares of rows of matrix. A second temp array of indexes will be needed for merge sort.
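As a rough sketch of how the array example above might carry over to the sparse matrix from the question, the pivot key (row, column) is copied by value before partitioning, so swapping entries cannot change it. The helpers cmp_entry and swap_entries are hypothetical replacements for lessEqual and greater, whose definitions are not shown in the question. The pivot index here is drawn from the whole range first..last, which is an assumption that differs slightly from the expression in the array example.

```c
#include <stdlib.h>

struct sparsematrix {
    int M, N, nz;
    int *I, *J;
    double *val;
};

/* Compares entry k with the pivot key (pi, pj): row index first, then column.
   Returns <0, 0 or >0. Hypothetical helper. */
static int cmp_entry(const struct sparsematrix *m, int k, int pi, int pj)
{
    if (m->I[k] != pi)
        return (m->I[k] > pi) - (m->I[k] < pi);
    return (m->J[k] > pj) - (m->J[k] < pj);
}

/* Swaps the full triplets (I, J, val) at positions a and b. Hypothetical helper. */
static void swap_entries(struct sparsematrix *m, int a, int b)
{
    int ti = m->I[a], tj = m->J[a];
    double tv = m->val[a];
    m->I[a] = m->I[b]; m->J[a] = m->J[b]; m->val[a] = m->val[b];
    m->I[b] = ti;      m->J[b] = tj;      m->val[b] = tv;
}

void quicksortSparseMatrix(struct sparsematrix *matrix, int first, int last)
{
    if (first >= last)
        return;
    /* Pick a random position, then keep its key as a value, not an index. */
    int p = first + rand() % (last - first + 1);
    int pi = matrix->I[p], pj = matrix->J[p];
    int i = first, j = last;
    while (i <= j) {
        while (cmp_entry(matrix, i, pi, pj) < 0) i++;   /* strictly less */
        while (cmp_entry(matrix, j, pi, pj) > 0) j--;   /* strictly greater */
        if (i > j)
            break;
        swap_entries(matrix, i, j);
        i++;
        j--;
    }
    quicksortSparseMatrix(matrix, first, j);
    quicksortSparseMatrix(matrix, i, last);
}
```

The initial call would be quicksortSparseMatrix(matrix, 0, matrix->nz - 1). Seeding rand() with srand() once at program start is left to the caller.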
I tried to implement Quicksort. It works fine except when there is a duplicate key, in which case there is an infinite loop and it never terminates. Can you help me understand what I am doing wrong?
// quick sort
void quickSort(int arr[], const unsigned size)
{
// base case
if (size < 2)
return;
int pivot = arr[size / 2];
unsigned L = 0, U = size - 1;
// partitioning
while (L < U) {
while (arr[L] < pivot)
L++;
while (arr[U] > pivot)
U--;
swap(&arr[L], &arr[U]);
}
quickSort(arr, L); // sort left array
quickSort(&arr[U + 1], size - L - 1); // sort right array
}
You have the less than and greater than conditions. But there is no condition for =. This is why it will run an infinite loop. Change them to <= and >=.
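For comparison, here is a sketch of a different, commonly used way to avoid the stall on duplicate keys: keep the strict comparisons, but step both indices past each swapped pair. This is a variant of the Hoare scheme, not the exact change described above. Two things differ from the question's code and are assumptions of this sketch: the pivot is taken at index (size - 1) / 2, so that it is never the last element of a two-element array (otherwise the left recursion would not shrink), and the split point is U rather than L. The helper swap_int is a hypothetical stand-in for the question's swap.

```c
#include <stddef.h>

/* Hypothetical stand-in for the swap() used in the question. */
static void swap_int(int *a, int *b)
{
    int t = *a;
    *a = *b;
    *b = t;
}

void quickSort(int arr[], size_t size)
{
    if (size < 2)
        return;
    int pivot = arr[(size - 1) / 2];
    size_t L = 0, U = size - 1;
    for (;;) {
        while (arr[L] < pivot) L++;
        while (arr[U] > pivot) U--;
        if (L >= U)
            break;
        swap_int(&arr[L], &arr[U]);
        L++;   /* step past the swapped pair so equal keys cannot stall the loop */
        U--;
    }
    /* Now arr[0..U] <= pivot and arr[U+1..size-1] >= pivot. */
    quickSort(arr, U + 1);
    quickSort(arr + U + 1, size - U - 1);
}
```

With an input such as {7, 7, 7, 7} each pass swaps equal elements and moves on, so the recursion always works on strictly smaller ranges.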
I have a recursive function that I wrote in C that looks like this:
void findSolutions(int** B, int n, int i) {
if (i > n) {
printBoard(B, n);
} else {
for (int x = 1; x <= n; x++) {
if (B[i][x] == 0) {
placeQueen(B, n, i, x);
findSolutions(B, n, i + 1);
removeQueen(B, n, i, x);
}
}
}
}
The initial call is (size is an integer given by user and B is a 2D array):
findSolutions(B, size, 1);
I tried to convert it into an iterative function, but there is another function call, removeQueen, after findSolutions, and I got stuck on where to put it. How can I solve this problem? Using a stack is also fine, but I'm having trouble doing that as well.
I'm going to assume that placeQueen(B, n, i, x) makes a change to B and that removeQueen(B, n, i, x) undoes that change.
This answer shows how to approach the problem generically. It doesn't modify the algorithm like Aconcagua has.
Let's start by defining a state structure.
typedef struct {
int **B;
int n;
int i;
} State;
The original code is equivalent to the following:
void _findSolutions(State *state) {
if (state->i > state->n) {
printBoard(state->B, state->n);
} else {
for (int x = 1; x <= state->n; ++x) {
if (state->B[state->i][x] == 0) {
State *state2 = State_clone(state); // Deep clone.
placeQueen(state2);
++state2->i;
_findSolutions(state2);
}
}
}
State_free(state); // Frees the board too.
}
void findSolutions(int** B, int n, int i) {
State *state = State_new(B, n, i); // Deep clones B.
_findSolutions(state);
}
Now, we're in position to eliminate the recursion.
void _findSolutions(State *state) {
StateStack *S = StateStack_new();
do {
if (state->i > state->n) {
printBoard(state->B, state->n);
} else {
for (int x = state->n; x>=1; --x) { // Reversed the loop to maintain order.
if (state->B[state->i][x] == 0) {
State *state2 = State_clone(state); // Deep clone.
placeQueen(state2);
++state2->i;
StateStack_push(S, state2);
}
}
}
State_free(state); // Frees the board too.
} while (StateStack_pop(S, &state));
StateStack_free(S);
}
void findSolutions(int** B, int n, int i) {
State *state = State_new(B, n, i); // Deep clones B.
_findSolutions(state);
}
We can eliminate the helper we no longer need.
void findSolutions(int** B, int n, int i) {
StateStack *S = StateStack_new();
State *state = State_new(B, n, i); // Deep clones B.
do {
if (state->i > state->n) {
printBoard(state->B, state->n);
} else {
for (int x = state->n; x>=1; --x) { // Reversed the loop to maintain order.
if (state->B[state->i][x] == 0) {
State *state2 = State_clone(state); // Deep clone.
placeQueen(state2);
++state2->i;
StateStack_push(S, state2);
}
}
}
State_free(state); // Frees the board too.
} while (StateStack_pop(S, &state));
StateStack_free(S);
}
Functions you need to implement:
StateStack *StateStack_new(void)
void StateStack_free(StateStack *S)
void StateStack_push(StateStack *S, State *state)
int StateStack_pop(StateStack *S, State **p)
State *State_new(int **B, int n, int i) (Note: Clones B)
State *State_clone(const State *state) (Note: Clones state->B)
void State_free(State *state) (Note: Frees state->B)
Structures you need to implement:
StateStack
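One possible way to fill in StateStack and its four functions is a growable array of State pointers. This is only a sketch under a few assumptions: the field names items, size and capacity are made up, the stack does not own the states it holds (StateStack_free releases only the slot array), and because StateStack_push is declared void above, it simply aborts if memory runs out.

```c
#include <stdlib.h>

typedef struct {
    int **B;
    int n;
    int i;
} State;

/* Growable array of State pointers; field names are assumptions. */
typedef struct {
    State **items;
    size_t size;
    size_t capacity;
} StateStack;

/* Returns an empty stack, or NULL if allocation fails. */
StateStack *StateStack_new(void)
{
    StateStack *S = malloc(sizeof *S);
    if (S != NULL) {
        S->items = NULL;
        S->size = 0;
        S->capacity = 0;
    }
    return S;
}

/* Frees the stack itself, not any states still stored in it. */
void StateStack_free(StateStack *S)
{
    if (S == NULL)
        return;
    free(S->items);
    free(S);
}

/* Pushes a state, doubling the capacity when the array is full. */
void StateStack_push(StateStack *S, State *state)
{
    if (S->size == S->capacity) {
        size_t cap = S->capacity ? S->capacity * 2 : 16;
        State **p = realloc(S->items, cap * sizeof *p);
        if (p == NULL)
            abort();   /* push is void, so there is no way to report failure */
        S->items = p;
        S->capacity = cap;
    }
    S->items[S->size++] = state;
}

/* Returns 1 and stores the top state in *p, or returns 0 if the stack is empty. */
int StateStack_pop(StateStack *S, State **p)
{
    if (S->size == 0)
        return 0;
    *p = S->items[--S->size];
    return 1;
}
```

Since the main loop above only stops once StateStack_pop returns 0, the stack is always empty by the time StateStack_free is called, so not freeing leftover states there is consistent with that usage.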
Tip:
It would be best if you replaced
int **B = malloc((n+1)*sizeof(int*));
for (int i=1; i<=n; ++i)
B[i] = calloc(n+1, sizeof(int));
...
for (int x = 1; x <= n; ++x)
...
B[i][x]
with
char *B = calloc(n*n, 1);
...
for (int x = 0; x < n; ++x)
...
B[(i-1)*n+(x-1)]
What the recursive call gives you is that the location of the queen in the current row is stored before you advance to the next row. You will have to reproduce this in the non-recursive version of your function.
You might use another array storing these positions:
unsigned int* positions = calloc(n + 1, sizeof(unsigned int));
// need to initialise all positions to 1 yet:
for(unsigned int i = 1; i <= n; ++i)
{
positions[i] = 1;
}
I reserved a dummy element so that we can use the same indices...
You can now count up last position from 1 to n, and when reaching n there, you increment next position, restarting with current from 1 – just the same way as you increment numbers in decimal, hexadecimal or octal system: 1999 + 1 = 2000 (zero based in this case...).
for(;;)
{
for(unsigned int i = 1; i <= n; ++i)
{
placeQueen(B, n, i, positions[i]);
}
printBoard(B, n);
for(unsigned int i = 1; i <= n; ++i)
{
removeQueen(B, n, i, positions[i]);
}
for(unsigned int i = 1; i <= n; ++i)
{
if(++positions[i] <= n)
// break incrementing if we are in between the numbers:
// 1424 will get 1431 (with last position updated already before)
goto CONTINUE;
positions[i] = 1;
}
// we completed the entire positions list, i. e. we reset very
// last position to 1 again (comparable to an overflow: 4444 got 1111)
// so we are done -> exit main loop:
break;
CONTINUE: (void)0;
}
It's untested code, so you might find a bug in it, but it should clearly illustrate the idea. It's the naive approach, always placing the queens and removing them again.
You can do it a bit cleverer, though: place all queens at positions 1 initially and only move the queens if you really need:
for(unsigned int i = 1; i <= n; ++i)
{
positions[i] = 1;
placeQueen(B, n, i, 1);
}
for(;;)
{
printBoard(B, n);
for(unsigned int i = 1; i <= n; ++i)
{
removeQueen(B, n, i, positions[i]);
if(++positions[i] <= n)
{
placeQueen(B, n, i, positions[i]);
goto CONTINUE;
}
placeQueen(B, n, i, 1);
positions[i] = 1;
}
break;
CONTINUE: (void)0;
}
// cleaning up the board again:
for(unsigned int i = 1; i <= n; ++i)
{
removeQueen(B, n, i, 1);
}
Again, untested...
You might discover that now the queens move within first row first, different to your recursive approach before. If that disturbs you, you can count down from n to 1 while incrementing the positions and you get original order back...
At the very end (after exiting the loop), don't forget to free the array again to avoid memory leak:
free(positions);
If n doesn't get too large (eight for a typical chess board?), you might use a VLA to prevent that problem.
Edit:
The above solutions will print every possible combination of eight queens with one queen per row. For an 8x8 board, you get 8^8 possible combinations, which is more than 16 million. You will almost certainly want to filter out some of these combinations, as you did in your original solution as well (if(B[i][x] == 0)), e. g.:
unsigned char* checks = malloc(n + 1);
for(;;)
{
memset(checks, 0, (n + 1));
for(unsigned int i = 1; i <= n; ++i)
{
if(checks[positions[i]] != 0)
goto SKIP;
checks[positions[i]] = 1;
}
// place queens and print board
SKIP:
// increment positions
}
(Trivial approach! Including the filter in the more elaborate approach will get more tricky!)
This will even be a bit more strict than your test, which would have allowed
_ Q _
Q _ _
_ Q _
on a 3x3 board, as you only compare against previous column, whereas my filter wouldn't (leaving a bit more than 40 000 boards to be printed for an 8x8 board).
Edit 2: The diagonals
To filter out those boards where the queens attack each other on the diagonals you'll need additional checks. For these, you'll have to find out what the common criterion is for the fields on the same diagonal. At first, we have to distinguish two types of diagonals, those starting at B[1][1], B[1][2], ... as well as B[2][1], B[3][1], ... – all these run from top left to bottom right. On the main diagonal, you'll discover that the difference between row and column index is always zero; on the neighbouring diagonals it is 1 and -1 respectively, and so on. So we'll have differences in the range [-(n-1); n-1].
If we make the checks array twice as large and shift all differences by n, we can do exactly the same checks as we did already for the columns:
unsigned char* checks = (unsigned char*)malloc(2*n + 1);
and after we checked the columns:
memset(checks, 0, (2 * n + 1));
for(unsigned int i = 1; i <= n; ++i)
{
if(checks[n + i - positions[i]] != 0)
goto SKIP;
checks[n + i - positions[i]] = 1;
}
Side note: Even if the array is larger, you still can just memset(checks, 0, n + 1); for the columns as we don't use the additional entries...
Next, we are interested in the diagonals going from bottom left to top right. Similarly to the other direction, you'll discover that the difference between n - i and positions[i] remains constant for fields on the same diagonal. Again we shift by n and end up with:
memset(checks, 0, (2 * n + 1));
for(unsigned int i = 1; i <= n; ++i)
{
if(checks[2 * n - i - positions[i]] != 0)
goto SKIP;
checks[2 * n - i - positions[i]] = 1;
}
Et voilà, only boards on which queens cannot attack each other.
You might discover that some boards are symmetries (rotational or reflection) of others. Filtering these, though, is much more complicated...
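To see the pieces above working together, here is a compact sketch that combines the odometer-style increment of positions with the column check and both diagonal checks. It only counts the valid boards instead of placing queens and printing, so it needs no board array; the function names board_is_valid and count_queen_boards are made up for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if no two queens share a column or a diagonal.
   positions[1..n] holds the column of the queen in each row;
   checks must have room for 2*n + 1 bytes. */
static int board_is_valid(const unsigned *positions, unsigned n,
                          unsigned char *checks)
{
    /* columns */
    memset(checks, 0, 2 * n + 1);
    for (unsigned i = 1; i <= n; ++i) {
        if (checks[positions[i]] != 0) return 0;
        checks[positions[i]] = 1;
    }
    /* diagonals from top left to bottom right: i - positions[i] is constant */
    memset(checks, 0, 2 * n + 1);
    for (unsigned i = 1; i <= n; ++i) {
        unsigned d = n + i - positions[i];
        if (checks[d] != 0) return 0;
        checks[d] = 1;
    }
    /* diagonals from bottom left to top right: i + positions[i] is constant */
    memset(checks, 0, 2 * n + 1);
    for (unsigned i = 1; i <= n; ++i) {
        unsigned d = 2 * n - i - positions[i];
        if (checks[d] != 0) return 0;
        checks[d] = 1;
    }
    return 1;
}

/* Counts the boards that pass all three filters; returns -1 if malloc fails. */
long count_queen_boards(unsigned n)
{
    unsigned *positions = malloc((n + 1) * sizeof *positions);
    unsigned char *checks = malloc(2 * n + 1);
    long count = -1;
    if (positions != NULL && checks != NULL) {
        count = 0;
        for (unsigned i = 1; i <= n; ++i)
            positions[i] = 1;
        for (;;) {
            if (board_is_valid(positions, n, checks))
                ++count;
            /* odometer step: bump row 1, carrying into the next row on overflow */
            unsigned i = 1;
            while (i <= n && ++positions[i] > n) {
                positions[i] = 1;
                ++i;
            }
            if (i > n)
                break;   /* every row wrapped around: all combinations visited */
        }
    }
    free(positions);
    free(checks);
    return count;
}
```

This still walks through all n^n combinations, so it is only practical for small n. Replacing the counter with calls to placeQueen, printBoard and removeQueen would turn it back into the printing version discussed above.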
In the question we were told that the crux of the algorithm is the fact that
"When we get down to single elements, that single
element is returned as the majority of its (1-element) array. At every other level, it will get return values from its
two recursive calls. The key to this algorithm is the fact that if there is a majority element in the combined array,
then that element must be the majority element in either the left half of the array, or in the right half of the array."
My implementation was this, probably very buggy but the general idea was this:
#include <stdio.h>
int merge(int *input, int left, int middle, int right, int maj1, int maj2)
{
// determine length
int length1 = middle - left + 1;
int length2 = right - middle;
// create helper arrays
int left_subarray[length1];
int right_subarray[length2];
// fill helper arrays
int i;
for (i=0; i<length1; ++i)
{
left_subarray[i] = input[left + i];
}
for (i=0; i<length2; ++i)
{
right_subarray[i] = input[middle + 1 + i];
}
left_subarray[length1] = 100;
right_subarray[length2] = 100;
//both return majority element
int count1 = 0;
int count2 = 0;
for (int i = 0; i < length1; ++i) {
if (left_subarray[i] == maj1) {
count1++;
}
if (right_subarray[i] == maj1) {
count1++;
}
}
for (int i = 0; i < length2; ++i) {
if (right_subarray[i] == maj2) {
count2++;
}
if (left_subarray[i] == maj2) {
count2++;
}
}
if (count1 > ((length1+length2) - 2)/2){
return maj1;
}
else if (count2 > ((length1+length2) - 2)/2){
return maj2;
}
else
return 0;
}
int merge_sort(int *input, int start, int end, int maj1, int maj2)
{
//base case: when array split to one
if (start == end){
maj1 = start;
return maj1;
}
else
{
int middle = (start + end ) / 2;
maj1 = merge_sort(input, start, middle, maj1, maj2);
maj2 = merge_sort(input, middle+1, end, maj1, maj2);
merge(input, start, middle, end, maj1, maj2);
}
return 0;
}
int main(int argc, const char* argv[])
{
int num;
scanf("%i", &num);
int input[num];
for (int i = 0; i < num; i++){
scanf("%i", &input[i]);
}
int maj;
int maj1 = -1;
int maj2 = -1;
maj = merge_sort(&input[0], 0, num - 1, maj1, maj2);
printf("%d", maj);
return 0;
}
This obviously isn't divide and conquer. I was wondering what is the correct way to implement this, so I can have a better understanding of divide and conquer implementations. My main gripe was in how to merge the two sub-array to elevate it to the next level, but I am probably missing something fundamental on the other parts too.
Disclaimer: This WAS for an assignment, but I am analyzing it now to further my understanding.
The trick about this particular algorithm, and why it ends up O(n log n) time is that you still need to iterate over the array you are dividing in order to confirm the majority element. What the division provides is the correct candidates for this iteration.
For example:
[2,1,1,2,2,2,3,3,3,2,2]
|maj 3| maj 2
maj 2 | maj None
<-------------------> still need to iterate
This is implicit in the algorithm statement: "if there is a majority element in the combined array, then that element must be the majority element in either the left half of the array." That "if" indicates confirmation is still called for.
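A minimal sketch of that idea, assuming a function that reports through its return value whether a majority exists (so that no value such as 0 has to double as "none"): each half hands back at most one candidate, and each candidate is then confirmed by counting it over the whole combined range. The names majority and count_in_range are made up for this illustration.

```c
#include <stddef.h>

/* Counts how often value occurs in a[lo..hi] (inclusive bounds). */
static size_t count_in_range(const int *a, size_t lo, size_t hi, int value)
{
    size_t c = 0;
    for (size_t k = lo; k <= hi; ++k)
        if (a[k] == value)
            ++c;
    return c;
}

/* Returns 1 and stores the majority element of a[lo..hi] in *out if one
   exists (strictly more than half of the elements), otherwise returns 0. */
int majority(const int *a, size_t lo, size_t hi, int *out)
{
    if (lo == hi) {            /* a single element is its own majority */
        *out = a[lo];
        return 1;
    }
    size_t mid = lo + (hi - lo) / 2;
    size_t len = hi - lo + 1;
    int left = 0, right = 0;
    int has_left = majority(a, lo, mid, &left);
    int has_right = majority(a, mid + 1, hi, &right);

    /* The halves only supply candidates; confirm each over the whole range. */
    if (has_left && count_in_range(a, lo, hi, left) > len / 2) {
        *out = left;
        return 1;
    }
    if (has_right && count_in_range(a, lo, hi, right) > len / 2) {
        *out = right;
        return 1;
    }
    return 0;
}
```

The two counting passes at each level are the linear "merge" work, and with the two half-size recursive calls that gives the O(n log n) running time mentioned above.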
I am trying to implement a custom quicksort and a custom comparator because I need to sort a struct by two elements (if the first are equal, sort by the second).
I used the following code, which was originally posted at the first answer of:
Sorting an array using multiple sort criteria (QuickSort)
typedef struct player {
int total;
char name[16];
} player;
void swap(player *p1,player *p2) {
player tmp = *p2;
*p2 = *p1;
*p1 = tmp;
}
int comp(const player *p1,const player *p2) {
if (p1->total < p2->total) return 1;
if (p1->total > p2->total) return -1;
return strcmp(p1->name, p2->name);
}
static void quickSort(player *arr, int left, int right) {
int m = (left+right)/2;
int l = left, r = right;
while (l <= r) {
while (comp(arr+l, arr+m) < 0) l++;
while (comp(arr+r, arr+m) > 0) r--;
if (l <= r) {
swap(arr+l, arr+r);
l++; r--;
}
}
if (r > left) quickSort(arr, left, r);
if (l < right) quickSort(arr, l, right);
}
I can't get this to work. It will successfully sort by total but fails to sort by name when the two totals are equal.
Yes, I've tried using this comparator with the standard qsort function and it worked just fine. But using it will be my last alternative.
Any help is appreciated.
Edit:
I am guessing the pivot is the problem. When I add 1 to it the 'name' ordering works fine but a few 'total' elements get out of order.
There are a number of discrepancies between your quicksort algorithm and a standard implementation (see e.g. http://www.codingbot.net/2013/01/quick-sort-algorithm-and-c-code.html), mainly based around edge conditions which is why you've been able to see the problems when you have a number of identical entries in your list to be sorted.
If you change the quickSort routine to this, all should be well - the main differences are:
1) main while loop does not continue with equality condition
2) do not swap if items are at the same index, and do not change our walking pointers after swapping.
3) choose the first item in the list as the pivot each time, and then swap that with one of the items we've walked towards the middle of the list (the right item in this case).
4) after completing the sort either side of the pivot, then search the top and bottom half explicitly (i.e. from start to pivot-1, then pivot+1 to end).
static void quickSort(player *arr, int left, int right) {
int m = left;
int l = left, r = right;
while (l < r) {
while (comp(arr+l, arr+m) <= 0) l++;
while (comp(arr+r, arr+m) > 0) r--;
if (l < r) {
swap(arr+l, arr+r);
}
}
swap (arr+m, arr+r);
if (r > left) quickSort(arr, left, r-1);
if (l < right) quickSort(arr, r+1, right);
}
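The problem with your quickSort function is that it does not consider that the pivot element itself may be moved by a swap, so arr+m can end up pointing at a different value partway through the partition.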
static void quickSort(player *arr, int left, int right) {
int m = (left+right)/2;
player mp = arr[m]; // copy the pivot by value so swaps cannot change it
int l = left, r = right;
while (l <= r) {
while (comp(arr+l, &mp) < 0) l++;
while (comp(arr+r, &mp) > 0) r--;
if (l <= r) {
swap(arr+l, arr+r);
l++; r--;
}
}
if (r > left) quickSort(arr, left, r);
if (l < right) quickSort(arr, l, right);
}
Yes, you should worry. Standard string functions expect NUL-terminated strings, and scanf stores strings in this manner, too. If you enter a string longer than 15 characters, it gets stored in some player structure's name member (say, arr[0].name), but it will overflow the name array, and the tail characters together with the terminating NUL (ASCII zero char) get stored outside the name array and outside the arr[0] variable, probably overwriting the next player's total (arr[1].total). Next you store the new player's data in arr[1], and some bytes of arr[1].total get 'glued' to the initial 16 chars of arr[0].name. That causes unpredictable differences between 'equal' names. Furthermore, during sorting the player structures get swapped and 'the same' name suddenly 'glues' to some new 'tail', resulting in inconsistent comparisons (the same data, when moved to a different place, may compare either less or greater than the pivot).
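One way to rule out that overflow is to give the %s conversion a field width one less than the array size. The sketch below assumes each input line has the form "name total", which the question does not actually show, and uses sscanf on a line that has already been read; the function name parse_player is made up.

```c
#include <stdio.h>

typedef struct player {
    int total;
    char name[16];
} player;

/* Parses "name total" from line into *p.
   Returns 1 on success, 0 if the line does not match. */
int parse_player(const char *line, player *p)
{
    /* %15s stores at most 15 characters plus the terminating NUL,
       so it can never write past the 16-byte name array. */
    return sscanf(line, "%15s %d", p->name, &p->total) == 2;
}
```

With this format an over-long name is not silently truncated into a valid record: the leftover characters fail to parse as the integer, so parse_player returns 0 and the caller can reject the line.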