Arguments to Partition function for Quicksort - c

So I've been attempting to write a stack-based quicksort function, which calls a partition function. The header for partition() is as follows:
int partition(void **A, int n, void *pivot, int (cmp)(void *, void *));
where A is the array, n is the size of the array and pivot is the value of the pivot (not the index).
My current call to partition is:
partition(&A[low_int], high_int + 1, A[low_int+(high_int-low_int)/2], cmp)
Above, my values for low and high are the classic 'l' and 'h' used in iterative quicksort, where l begins as the lowest index (and h the highest). These values then change as the function continues.
I'll post my partition function below:
int
partition(void **A, int n, void *pivot, int (cmp)(void *, void *)) {
int k;
int i = 0;
for (int j = 0; j <= n-2; j++) {
if (cmp(A[j], pivot) <= 0) {
i++;
swap(A, i, j);
}
}
swap(A, i+1, n-1);
k = i + 2;
return k; //k is the first value after the pivot in partitioned A
}
My problem is deciding on the inputs for my call to partition(). For the first argument, I've chosen &A[low_int], as I'm not using a "left" as one of my inputs and am therefore trying to create a pointer that essentially starts my array later. The third argument is for the pivot, where I've been trying to select an element within that range, but both this and argument 1 have been causing my code to either return an unsorted array or run infinitely.
Could I please get some help here with what I've done wrong and how to fix it?
I've tried to include all relevant code, but please let me know if I've missed anything important or if anything I've written is unclear, or you need more info. Thank you

Consider what happens if low_int is 1000 and high_int is 2000 and the array ends at 2000. Now you give it the array B = &A[1000] and the value 2001. The value of 2001 causes it to access the element B[2001-1] = B[2000] = A[3000]. It's accessing the array out of bounds.
Shouldn't you use something like high_int - low_int + 1 for the second argument? Note: I haven't verified that your code is free of off-by-one errors with the argument high_int - low_int + 1, but it seems to me that you should be subtracting low_int from high_int.
Another option would be to give it A, low_int and high_int.
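As a rough sketch of that last option, a partition that takes the array together with both bounds could look like the following. Note that this is a Lomuto-style partition using A[high] as the pivot, which is a swapped-in scheme and not the pivot-by-value interface from the question; the names partition_range, swap_ptrs and cmp_int are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: swap two slots of an array of pointers. */
void swap_ptrs(void **A, int i, int j) {
    void *tmp = A[i];
    A[i] = A[j];
    A[j] = tmp;
}

/* Example comparator (an assumption): the elements are pointers to int. */
int cmp_int(void *a, void *b) {
    int x = *(int *)a;
    int y = *(int *)b;
    return (x > y) - (x < y);
}

/* Lomuto partition of A[low..high] (both bounds inclusive).
   Uses A[high] as the pivot and returns the pivot's final index. */
int partition_range(void **A, int low, int high,
                    int (*cmp)(void *, void *)) {
    void *pivot = A[high];
    int i = low;                      /* next slot for an element <= pivot */
    for (int j = low; j < high; j++) {
        if (cmp(A[j], pivot) <= 0) {
            swap_ptrs(A, i, j);
            i++;
        }
    }
    swap_ptrs(A, i, high);            /* place the pivot between the two parts */
    return i;
}
```

With this shape, an iterative quicksort would push the ranges (low, p - 1) and (p + 1, high) for the returned index p, so no pointer offset into A and no separate length argument are needed.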

Related

Minimum number of arguments for a recursive function to explore matrix

Sorry if someone already asked this; I didn't find it.
I'm wondering what the minimum number of arguments is that I have to pass to a recursive function in order to explore all the values of a matrix.
I'll give an example: suppose I want to write a function which returns the sum of all the values contained in an MxM matrix. I can surely do it (and did it) like this:
int sum(int mat[][M], int i, int j){
if(j==M-1&&i==M-1){
return mat[i][j];
}
if(j==M){
i++;
j=0;
}
return mat[i][j] + sum(mat, i, j+1);
}
Calling
sum(mat,0,0);
I obtain the result.
My question is: can I obtain the same result by writing a function with fewer arguments?
About the example: can I obtain the same result by writing a function like:
int sum(int mat[][M], int i){...}
or just
int sum(int mat[][M]){...}
?
More abstractly speaking, what is the minimum number of arguments I need to pass to a recursive function in order to explore a matrix?
Thanks everyone.
You could do it with one argument by having it be the row-major index of the element to add. The function can calculate the row and column indexes from that.
int sum(int mat[M][M], int x){
int i = x / M;
int j = x % M;
if(j==M-1 && i==M-1){
return mat[i][j];
}
return mat[i][j] + sum(mat, x+1);
}
You can't do it by just passing the array, because you need some way of telling when you've reached the base case so the recursion should stop. A recursive call has to have some parameter that changes each time, getting closer to the base case. But the array value doesn't change.

What causes my array address to be corrupted (change) when passed to function?

I am performing Compressed Sparse Row matrix-vector multiplication (CSR SpMV). This involves dividing the array A into multiple chunks and then passing each chunk by reference to a function. However, only the first chunk (starting at A[0], the beginning of the array) is processed correctly. From the second loop iteration on (A[0 + chunkIndex]), when the function reads the sub-array it jumps and reads an address beyond the total array address range, although the indices are correct.
For reference:
The SPMV kernel is:
void serial_matvec(size_t TS, double *A, int *JA, int *IA, double *X, double *Y)
{
double sum;
for (int i = 0; i < TS; ++i)
{
sum = 0.0;
for (int j = IA[i]; j < IA[i + 1]; ++j)
{
sum += A[j] * X[JA[j]]; // the error is here, the function reads a different
// address of A, and JA, so the access
// will be out-of-bound
}
Y[i] = sum;
}
}
and it is called this way:
int chunkIndex = 0;
for(size_t k = 0; k < rows/TS; ++k)
{
chunkIndex = IA[k * TS];
serial_matvec(TS, &A[chunkIndex], &JA[chunkIndex], &IA[k*TS], &X[0], &Y[k*TS]);
}
Assume I process an (8x8) matrix, 2 rows per chunk, so the loop over k runs rows/TS = 4 times. The chunkIndex and the arrays passed to the function will be as follows:
chunkIndex: 0 --> loop k = 0, &A[0], &JA[0]
chunkIndex: 16 --> loop k = 1, &A[16], &JA[16] //[ERROR here, function reads different address]
chunkIndex: 32 --> loop k = 2, &A[32], &JA[32] //[ERROR here, function reads different address]
chunkIndex: 48 --> loop k = 3, &A[48], &JA[48] //[ERROR here, function reads different address]
When I run the code, only the first chunk executes correctly, the other 3 chunks memory are corrupted and the array pointers jump into boundary beyond the array size.
I've checked all the indices manually, for all the parameters, and they are all correct; however, when I print the addresses they are not the same. (I have been debugging this for 3 days now.)
I used valgrind and it reported:
Invalid read of size 8 and Use of uninitialised value of size 8 at the sum += A[j] * X[JA[j]]; line
I compiled it with -g -fsanitize=address and I got
heap-buffer-overflow
I tried to access these chunks manually outside the function, and they are correct, so what can cause the heap memory to be corrupted like this ?
The code is here, This is the minimum I can do.
The problem was that I was using global indices (indices inside main) when indexing the portion of the array (chunk) passed to the function, hence the out-of-bound problem.
The solution is to start indexing the sub-arrays from 0 at each function call, but I had another problem. At each function call, I process TS rows, each row has different number of non-zeros.
As an example, see the picture, chunk 1 (sorry for my bad handwriting, it is easier this way). As you can see, we need 3 indices: i for the TS rows processed per chunk, j because each row has a different number of non-zeros, and l to index the sub-array passed in, which was the original problem.
The serial_matvec function then becomes:
void serial_matvec(size_t TS, const double *A, const int *JA, const int *IA,
const double *X, double *Y) {
int l = 0;
for (int i = 0; i < TS; ++i) {
for (int j = 0; j < (IA[i + 1] - IA[i]); ++j) {
Y[i] += A[l] * X[JA[l]];
l++;
}
}
}
The complete code with test is here. If anyone has a more elegant solution, you are more than welcome.
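One possible alternative, offered only as a sketch: keep the original two-loop shape and subtract the chunk's first row offset, IA[0], from every row pointer, so that the global offsets stored in IA become local to the sub-arrays. This assumes the caller still passes &A[chunkIndex], &JA[chunkIndex] and &IA[k*TS] exactly as in the question; the name serial_matvec_offset is hypothetical.

```c
#include <stddef.h>

/* A and JA point at the first non-zero of the chunk.
   IA points at the chunk's first row pointer and still holds GLOBAL offsets. */
void serial_matvec_offset(size_t TS, const double *A, const int *JA,
                          const int *IA, const double *X, double *Y) {
    const int base = IA[0];           /* global offset of this chunk's first non-zero */
    for (size_t i = 0; i < TS; ++i) {
        double sum = 0.0;
        /* Convert global row bounds to indices local to A and JA. */
        for (int j = IA[i] - base; j < IA[i + 1] - base; ++j) {
            sum += A[j] * X[JA[j]];
        }
        Y[i] = sum;                   /* overwrite, so Y needs no prior zeroing */
    }
}
```

Compared with the running index l, this version writes each Y[i] directly instead of accumulating into it, so it does not depend on Y being zero-initialized.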

Understanding the following code

Given this code:
int solution(int X, int A[], int N) {
int *jumps = calloc(X+1, sizeof(int));
int counter = 0;
int i;
for(i=0; i<N; i++) {
if(A[i]<=X && *(jumps+A[i])!=1) {
*(jumps+A[i])=1;
if(++counter==X) {
return i;
}
}
}
free(jumps);
return -1;
}
Here is what I think I know:
1) int *jumps = calloc(X+1, sizeof(int));
This is making an array storing X+1 elements of an int type. Since it's
calloc they are all initialized as 0.
2) if(A[i]<=X && *(jumps+A[i])!=1)
This if statement's condition is that the element of A at index i is less than or equal to X and the second part I am confused with. I am totally confused what *(jumps+A[i])!=1) means. I know that whatever *(jumps+A[i]) is cannot equal 1.
3) if(++counter==X)
This also confuses me. I'm not sure what ++ does in front of counter. I thought ++ was used as adding an increment of 1 to something. Also, how does counter change? If given the example (5,[1,3,1,4,2,3,5,4]) it changes to 5 but I don't understand why.
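So here is what I understand:
Every value in A that is greater than X is ignored (A[i] <= X).
Every duplicate value in A is ignored: this is the purpose of the *(jumps+A[i]) statements. *(jumps+A[i]) is the same as jumps[A[i]], and it is set to 1 once the value A[i] has been seen.
Lastly, it returns the index of the current loop iteration as soon as A has supplied X distinct values that are less than or equal to X.
Conclusion: if X is 10, it returns the index in A at which the tenth distinct value no greater than 10 has been found, whatever their order. If A holds only values from 1 to 10, that is the point where each of them has appeared at least once. If that never happens, it returns -1. ++counter is a pre-increment: counter is incremented first, and the new value is then compared with X.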
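To make the pointer arithmetic and the pre-increment easier to follow, here is a sketch of the same logic written with array indexing. The name solution_indexed is made up; the sketch assumes every A[i] is non-negative, and unlike the code in the question it frees the buffer on the success path as well.

```c
#include <stdlib.h>

/* Same logic as solution() in the question, with seen[A[i]] in place of
   *(jumps + A[i]). Assumes A[i] >= 0 for all i; a negative value would
   index before the start of seen[]. */
int solution_indexed(int X, const int A[], int N) {
    int *seen = calloc((size_t)X + 1, sizeof *seen);  /* X+1 ints, all zero */
    if (seen == NULL) {
        return -1;              /* allocation failure, treated as "not found" here */
    }
    int counter = 0;
    int result = -1;
    for (int i = 0; i < N; i++) {
        if (A[i] <= X && seen[A[i]] != 1) {
            seen[A[i]] = 1;     /* mark this value as already seen */
            ++counter;          /* increment first ... */
            if (counter == X) { /* ... then compare the new value with X */
                result = i;
                break;
            }
        }
    }
    free(seen);
    return result;
}
```

For the example from the question, (5, [1,3,1,4,2,3,5,4]), the distinct values 1, 3, 4, 2 and 5 are first seen at indexes 0, 1, 3, 4 and 6, so counter reaches 5 at index 6 and the function returns 6.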

Array percentage algorithm implementation

So I just started programming in C a few days ago and I have this program which takes an unsorted file full of integers, sorts it using quicksort
1st algorithm
Any suggestions on what I have done wrong in this?
From what you have described, it sounds like you are almost there. You are attempting to get the first element of a collection whose value is equal to (or just greater than) 90% of all the other members of the collection. You have already done the sort. The rest should simply be a matter of following these steps (if I have understood your question):
1) sort collection into an int array (you've already done this I think)
2) count numbers in collection, store in float n; //number of elements in collection
3) index through sorted array to the 0.9*n th element (pick the first one beyond that point that is not a duplicate of the previous)
4) display results
Here is an implementation (sort of, I did not store n) of what I have described: (ignore the random number generator, et al., it is just a fast way to get an array)
#include <ansi_c.h>
#include <windows.h>
int randomGenerator(int min, int max);
int NotUsedRecently (int number);
int cmpfunc (const void * a, const void * b);
int main(void)
{
int array[1000];
int i;
for(i=0;i<1000;i++)
{
array[i]=randomGenerator(1, 1000);
Sleep(1);
}
//sort array
qsort(array, 1000, sizeof(int), cmpfunc);
//pick the first non repeat 90th percent and print
for(i=900;i<999;i++)
{
if(array[i+1] != array[i])
{
printf("this is the first number meeting criteria: %d", array[i+1]);
break;
}
}
getchar();
return 0;
}
int cmpfunc (const void * a, const void * b)
{
return ( *(int*)a - *(int*)b );
}
int randomGenerator(int min, int max)
{
int random=0, trying=0;
trying = 1;
srand(clock());
while(trying)
{
random = (rand()/32767.0)*(max+1);
(random >= min) ? (trying = 0) : (trying = 1);
}
return random;
}
And here is the output for my first randomly generated array (centering around the 90th percentile), compared against what the algorithm selected: Column on left is the element number, on the right is the sorted list of randomly generated integers. (notice it skips the repeats to ensure smallest value past 90%)
In summary: As I said, I think you are already, almost there. Notice how similar this section of my code is to yours:
You have something already, very similar. Just modify it to start looking at 90% index of the array (whatever that is), then just pick the first value that is not equal to the previous.
One issue in your code is that you need a break case for your second algorithm, once you find the output. Also, declaring variables inside the for loop header is only allowed in C99 and later, so with an older compiler or standard setting it would not compile.
Regarding this part:
int output = array[(int)(floor(0.9*count)) + 1];
int x = (floor(0.9*count) + 1);
while (array[x] == array[x + 1])
{
x = x + 1;
}
printf(" %d ", output);
In the while loop you do not check whether x has exceeded count... (What if all the top 10% numbers are equal?)
You set output in the first line and print it in the last, but do not do anything with it in the meantime. (So all those lines in between do nothing.)
You definitely are on the right track.
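As a minimal sketch of the bounds check discussed above, the search past the 90% mark could be written as follows. It assumes the array is already sorted in ascending order; the function name first_distinct_past_90 and the -1 return convention are illustrative choices, not part of the original code.

```c
/* Returns the index of the first element after position floor(0.9 * count)
   whose value differs from its predecessor, or -1 if there is none
   (for example when the whole top 10% holds equal values). */
int first_distinct_past_90(const int *sorted, int count) {
    int x = (count * 9) / 10;          /* floor(0.9 * count) in integer math */
    while (x + 1 < count) {            /* never read past the last element */
        if (sorted[x + 1] != sorted[x]) {
            return x + 1;
        }
        x++;
    }
    return -1;
}
```

The caller can then print sorted[index] when the result is not -1, instead of printing a value that was fixed before the loop ran.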

order of recursion in c

I have written a program to print all the permutations of the string using backtracking method.
# include <stdio.h>
/* Function to swap values at two pointers */
void swap (char *x, char *y)
{
char temp;
temp = *x;
*x = *y;
*y = temp;
}
/* Function to print permutations of string
This function takes three parameters:
1. String
2. Starting index of the string
3. Ending index of the string. */
void permute(char *a, int i, int n)
{
int j;
if (i == n)
printf("%s\n", a);
else
{
for (j = i; j <= n; j++)
{
swap((a+i), (a+j));
permute(a, i+1, n);
swap((a+i), (a+j)); //backtrack
}
}
}
/* Driver program to test above functions */
int main()
{
char a[] = "ABC";
permute(a, 0, 2);
getchar();
return 0;
}
What would the time complexity be here? Isn't it O(n^2)? How do I check the time complexity in the case of recursion? Correct me if I am wrong.
Thanks.
The complexity is O(N*N!), You have N! permutations, and you get all of them.
In addition, each permutation requires you to print it, which is O(N) - so totaling in O(N*N!)
My answer is going to focus on methodology since that's what the explicit question is about. For the answer to this specific problem see others' answer such as amit's.
When you are trying to evaluate complexity on algorithms with recursion, you should start counting just as you would with an iterative one. However, when you encounter recursive calls, you don't know yet what the exact cost is. Just write the cost of the line as a function and still count the number of times it's going to run.
For example (Note that this code is dumb, it's just here for the example and does not do anything meaningful - feel free to edit and replace it with something better as long as it keeps the main point):
int f(int n){ //Note total cost as C(n)
if(n==1) return 0; //Runs once, constant cost
int i;
int result = 0; //Runs once, constant cost
for(i=0;i<n;i++){
int j;
result += i; //Runs n times, constant cost
for(j=0;j<n;j++){
result+=i*j; //Runs n^2 times, constant cost
}
}
result+= f(n/2); //Runs once, cost C(n/2)
return result;
}
Adding it up, you end up with a recursive formula like C(n) = n^2 + n + 1 + C(n/2) and C(1) = 1. The next step is to try and change it to bound it by a direct expression. From there depending on your formula you can apply many different mathematical tricks.
For our example:
For n>=2: C(n) <= 2n^2 + C(n/2)
since C is monotone, let's consider C'(p)= C(2^p):
C'(p) <= 2*2^(2p) + C'(p-1)
which is a typical sum expression (not convenient to write here so let's skip to next step), that we can bound: C'(p) <= 2p*2^(2p) + C'(0)
turning back to C(n) <= 2*log2(n)*n^2 + C(1)
Hence the runtime is in O(n^2 * log n)
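For reference, the sum that was skipped can be written out explicitly. The following is a worked version under the same assumption that n is a power of two (n = 2^p); it unrolls the recurrence for C'(p) down to C'(0):

```latex
\begin{aligned}
C'(p) &\le C'(0) + \sum_{k=1}^{p} 2 \cdot 2^{2k}
       = C'(0) + \frac{8}{3}\left(4^{p} - 1\right) \\
      &\le C'(0) + 2p \cdot 2^{2p}
\end{aligned}
```

The last line is the bound used in the text, obtained by replacing each of the p terms with the largest one. Since 4^p = n^2, the exact geometric sum gives C(n) <= (8/3)*n^2 + C(1), so the runtime is in fact in O(n^2); O(n^2 * log n) remains a valid, but looser, upper bound.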
The exact number of loop iterations this program performs (for a string of length N) is as follows:
the first call runs N iterations, each of which starts a call that runs N-1 iterations, etc...
The number of iterations is N + N(N-1) + N(N-1)(N-2) + ... + N(N-1)...(2) (it ends with 2 since the next call just returns)
or N(1+(N-1)(1+(N-2)(1+(N-3)(1+...3(1+2)...))))
which is roughly 1.72*N! (the sum approaches (e-1)*N!), so it stays below 2N!
Adding a counter in the for loop (removing the printf) matches the formula:
N=3 : 9
N=4 : 40
N=5 : 205
N=6 : 1236
...
So, with the printf removed, the time complexity is O(N!)
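The counting experiment can be reproduced with a sketch like the one below. It follows the permute function from the question with the printf removed and a counter added inside the for loop; the names loop_count, permute_count and count_loop_iterations are made up for this illustration.

```c
/* Hypothetical global counter, incremented once per for-loop iteration. */
long loop_count = 0;

void swap_chars(char *x, char *y) {
    char temp = *x;
    *x = *y;
    *y = temp;
}

/* Same recursion as permute() in the question, without the printing. */
void permute_count(char *a, int i, int n) {
    if (i == n) {
        return;                        /* a full permutation; nothing to print here */
    }
    for (int j = i; j <= n; j++) {
        loop_count++;
        swap_chars(a + i, a + j);
        permute_count(a, i + 1, n);
        swap_chars(a + i, a + j);      /* backtrack */
    }
}

/* Counts the loop iterations for a string of length len (1 <= len <= 8). */
long count_loop_iterations(int len) {
    char buf[] = "ABCDEFGH";
    loop_count = 0;
    permute_count(buf, 0, len - 1);
    return loop_count;
}
```

Calling count_loop_iterations(3) corresponds to permute(a, 0, 2) on "ABC": 3 iterations at the top level plus 3*2 at the next level, which gives the 9 listed above.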
