Compare Function used by Qsort - Comparing (char **) - c

Here is the compare function:
int compare(const void *a, const void *b) {
char* *s = (char* *) a;
char* *t = (char* *) b;
return sort_order * strcmp(*s, *t); // sort_order is -1 or 1
}
Now my question is what is the reasoning behind casting to a double pointer of a particular type? Or rather, why is the double pointer cast needed and how is it used internally?
Other variables used:
char **wordlist; int nbr_words; (array elements are) char *word;
Ex qsort call: qsort(wordlist, nbr_words, sizeof(char *), compare);

It would help if you showed the definition of wordlist, but most likely it's defined as a char **. The compare() function receives a pointer to each element of your list. If each element of your list is of type char *, then compare() is going to receive two pointers to char *, or two char ** in other words.
The conversion to char ** is necessary because qsort() has to work on any kind of type, so the arguments get converted to void * before they are passed. (An explicit cast would be superfluous in this particular case if you weren't going from a const void pointer to a non-const char **.) You can't dereference a void *, so you have to convert the arguments back to their original types before doing anything with them.
For instance:
#include <stdio.h>
int compare_int(void * a, void * b) {
int * pa = a;
int * pb = b;
if ( *pa < *pb ) {
return -1;
} else if ( *pa > *pb ) {
return 1;
} else {
return 0;
}
}
int compare_double(void * a, void * b) {
double * pa = a;
double * pb = b;
if ( *pa < *pb ) {
return -1;
} else if ( *pa > *pb ) {
return 1;
} else {
return 0;
}
}
int compare_any(void * a, void * b, int (*cfunc)(void *, void *)) {
return cfunc(a, b);
}
int main(void) {
int a = 1, b = 2;
if ( compare_any(&a, &b, compare_int) ) {
puts("a and b are not equal");
} else {
puts("a and b are equal");
}
double c = 3.0, d = 3.0;
if ( compare_any(&c, &d, compare_double) ) {
puts("c and d are not equal");
} else {
puts("c and d are equal");
}
return 0;
}
Outputs:
paul#local:~/src/c/scratch$ ./comp
a and b are not equal
c and d are equal
paul#local:~/src/c/scratch$
The compare_any() function will compare any type which is supported, in this case, int and double, because we can pass a function pointer to it. However, the signature of the passed function must be the same, so we can't declare compare_int() to take two int * arguments, and compare_double() to take two double *. We have to declare them both as taking two void * arguments, and when we do this, we have to convert those void * arguments to something useful within those functions before we can work with them.
What's happening in your case is exactly the same, but the data themselves are pointers, so we're passing pointers to pointers, and so we need to convert void * to, in your case, char **.
EDIT: To explain some confusion in the comments to the original question about how qsort() works, here's the qsort() signature:
void qsort(void *base, size_t nmemb, size_t size,
int(*compar)(const void*, const void*))
base is a pointer to the first element of an array, nmemb is the number of members of that array, and size is the size of each element.
When qsort() calls compar on, say, the first and second elements of your array, it'll send the address of the first element (i.e. base itself) and the address of the second element (i.e. base advanced by size bytes).
If base was originally declared as an array of int, then the compare function must interpret those pointers it receives as pointers to int, as int *. If base was originally declared as an array of strings, as a char **, then the compare function must interpret those pointers as pointers to char *, i.e. as char **.
In all cases, the compare function gets pointers to elements. If you have an int array, then you must interpret those pointers as int * in your compare function. If you have a char * array, then you must interpret them as char **, and so on.
In this case, you obviously could call strcmp() if you just passed plain char * arguments to the compare function. But, because qsort() is generic it can only pass pointers to the compare function, it can't actually pass the value of your elements - it's the use of void * which allows it to be generic, because any type of object pointer can be converted to void *, but there is no equivalent datatype to which any non-pointer value can be converted. For that reason, it has to work the same way with regular types like int and double, with pointers, and with structs, and the only way to get it to work correctly with all possible types is to have it deal with pointers to elements, not with the elements themselves, even when the elements are themselves also pointers. For this reason, it might seem like you're getting an unnecessary level of indirection, here, but it actually is necessary in order for qsort() to be able to function in the generic way that it does.
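As a minimal sketch of the point above, the comparator below treats each void * as a pointer to an element of a char * array. The names compare_words and sort_words are hypothetical, and sort_order is assumed to be a file-scope int holding 1 or -1, as in the question.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed to be 1 (ascending) or -1 (descending), as in the question. */
static int sort_order = 1;

/* qsort() hands the comparator the ADDRESS of each element. The elements
   are char *, so each argument is really a pointer to a char *. */
static int compare_words(const void *a, const void *b)
{
    const char *const *s = a;   /* no cast needed: const is preserved */
    const char *const *t = b;
    int r = strcmp(*s, *t);

    /* Normalise to -1, 0 or 1 so that multiplying by -1 cannot overflow. */
    return sort_order * ((r > 0) - (r < 0));
}

/* Hypothetical helper: sorts an array of nbr_words strings in place. */
static void sort_words(char **wordlist, size_t nbr_words, int order)
{
    sort_order = order;
    qsort(wordlist, nbr_words, sizeof *wordlist, compare_words);
}
```

Calling sort_words(wordlist, nbr_words, -1) would sort the same array in descending order.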
You can see this more clearly if I modify the code above so that compare_any() is more similar to qsort(), and takes not two pointers, but a single pointer to a two-element array of various types (slightly contrived example, but we're keeping it simple):
#include <stdio.h>
#include <string.h>
int compare_int(void * a, void * b) {
int * pa = a;
int * pb = b;
if ( *pa < *pb ) {
return -1;
} else if ( *pa > *pb ) {
return 1;
} else {
return 0;
}
}
int compare_double(void * a, void * b) {
double * pa = a;
double * pb = b;
if ( *pa < *pb ) {
return -1;
} else if ( *pa > *pb ) {
return 1;
} else {
return 0;
}
}
int compare_string(void * a, void * b) {
char ** pa = a;
char ** pb = b;
return strcmp(*pa, *pb);
}
int compare_any(void * arr, size_t size, int (*cfunc)(void *, void *)) {
char * first = arr;
char * second = first + size;
return cfunc(first, second);
}
int main(void) {
int n[2] = {1, 2};
if ( compare_any(n, sizeof(*n), compare_int) ) {
puts("a and b are not equal");
} else {
puts("a and b are equal");
}
double d[2] = {3.0, 3.0};
if ( compare_any(d, sizeof(*d), compare_double) ) {
puts("c and d are not equal");
} else {
puts("c and d are equal");
}
char * s[] = {"abcd", "bcde"};
if ( compare_any(s, sizeof(*s), compare_string) ) {
puts("'abcd' and 'bcde' are not equal");
} else {
puts("'abcd' and 'bcde' are equal");
}
return 0;
}
Outputs:
paul#local:~/src/c/scratch$ ./comp
a and b are not equal
c and d are equal
'abcd' and 'bcde' are not equal
paul#local:~/src/c/scratch$
As you can see, there's no way compare_any() could accept both an array of int, and an array of char *, without the compare_string() function getting a pointer it needs to treat as a char **, because of the pointer arithmetic it performs on the array elements. Without that additional level of indirection, neither compare_int() nor compare_double() could function.

Related

type-agnostic belongs function

I'm trying to build a function that checks whether a particular pointer value is stored in a given array. I'm trying to make the function type-agnostic and so I decided to go with the approach that was used to implement qsort(), in which a function pointer is passed to do the type-specific tasks.
The function looks like the following:
int is_in(void* array, int size, void* pelement, int (*equals)(void* this, void* that)) {
for(int k = 0; k < size; k++) {
if(equals(array + k, pelement)) {
return 1;
}
}
return 0;
}
The equals() function checks whether the second parameter is equal to the value pointed at by the first parameter.
One particular implementation of the equals() function that I needed to realize pertains to a struct Symbol type that I created. The implementation looks like the following:
int ptreq(void* ptr1, void* ptr2) {
return ((*((Symbol**) ptr1) == (Symbol*) ptr2));
}
The struct Symbol is defined as follows:
enum SymbolType {
TERMINAL,
NONTERMINAL
} typedef SymbolType;
struct Symbol {
char* content;
SymbolType type;
} typedef Symbol;
void set_symbol(Symbol* pS, SymbolType type, char* content) {
pS->content = malloc(strlen(content) + 1); // room for the string and its terminating '\0'
strcpy(pS->content, content);
pS->type = type;
}
However, when I tried testing is_in() with a base example, I ended up with incorrect results. For instance, the following code:
#include <stdlib.h>
#include <stdio.h>
#include "string.h"
#include <stdarg.h>
#include <unistd.h>
int main(int argc, char* argv[]) {
Symbol F, E;
set_symbol(&E, NONTERMINAL, "E");
set_symbol(&F, NONTERMINAL, "F");
Symbol** pptest = malloc(2*sizeof(Symbol*));
pptest[0] = &E;
pptest[1] = &F;
printf("Is F in pptest? %d\n", is_in(pptest, 2, &F, &ptreq));
return 0;
}
Gives the following Output:
Is F in pptest? 0
Even though &F is within pptest.
What could be the problem with this approach?
The type void is an incomplete type, so the pointer arithmetic in the expression array + k that you use in the if statement
if(equals(array + k, pelement)) {
is invalid.
You also need to pass the function the size of the objects stored in the array, so that it can be used in the pointer arithmetic.
With your approach, the function should be declared similarly to the standard C function bsearch, which looks like
void *bsearch(const void *key, const void *base,
size_t nmemb, size_t size,
int (*compar)(const void *, const void *));
Only the return type must be changed from void * to int.
That is, the declaration of your function will look like
int is_in( const void *pvalue,
const void *array,
size_t nmemb,
size_t size,
int cmp( const void *, const void *) );
The function can be defined the following way
int is_in( const void *pvalue,
const void *array,
size_t nmemb,
size_t size,
int cmp( const void *, const void *) )
{
size_t i = 0;
while ( i < nmemb && cmp( pvalue, ( const char * )array + i * size ) != 0 ) i++;
return i != nmemb;
}
In general, the comparison function shall return an integer less than, equal to, or greater than zero if the searched element is considered, respectively, to be less than, to match, or to be greater than the array element.
In your case, you have an array of pointers that can point to arbitrary objects, so the function should return 0 if the two elements passed to it are equal and any non-zero value (here simply a positive one) if they are not.
int ptreq( const void *ptr1, const void *ptr2 )
{
return *( const Symbol ** )ptr1 != *( const Symbol ** )ptr2;
}
Note that the search element passed to the function must have the type Symbol **.
Here is a demonstration program.
#include <stdio.h>
int is_in( const void *pvalue,
const void *array,
size_t nmemb,
size_t size,
int cmp( const void *, const void * ) )
{
size_t i = 0;
while (i < nmemb && cmp( pvalue, ( const char * )array + i * size ) != 0) i++;
return i != nmemb;
}
int cmp_ptr( const void *ptr1, const void *ptr2 )
{
return *( const int ** )ptr1 != *( const int ** )ptr2;
}
int main( void )
{
int x, y, z;
int * a[] = { &x, &y, &z };
const size_t N = sizeof( a ) / sizeof( *a );
int *pvalue = &y;
printf( "&y is in the array = %s\n",
is_in( &pvalue, a, N, sizeof( *a ), cmp_ptr ) ? "true" : "false" );
int v;
pvalue = &v;
printf( "&v is in the array = %s\n",
is_in( &pvalue, a, N, sizeof( *a ), cmp_ptr ) ? "true" : "false" );
}
The program output is
&y is in the array = true
&v is in the array = false
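Applied to the Symbol type from the question, the same idea might look like the sketch below. It repeats is_in() so that it stands alone; cmp_symbol_ptr is a hypothetical name, and the struct is reduced to the fields shown in the question.

```c
#include <stddef.h>

typedef enum SymbolType { TERMINAL, NONTERMINAL } SymbolType;

typedef struct Symbol {
    char *content;
    SymbolType type;
} Symbol;

/* Same shape as bsearch(): key first, then array, count and element size. */
static int is_in(const void *pvalue, const void *array,
                 size_t nmemb, size_t size,
                 int (*cmp)(const void *, const void *))
{
    size_t i = 0;
    /* Step through the array in bytes, because void * has no element size. */
    while (i < nmemb && cmp(pvalue, (const char *)array + i * size) != 0)
        i++;
    return i != nmemb;
}

/* Both arguments point at a Symbol *; returns 0 when the pointers match. */
static int cmp_symbol_ptr(const void *p1, const void *p2)
{
    const Symbol *const *a = p1;
    const Symbol *const *b = p2;
    return *a != *b;
}
```

A call passes the address of a Symbol * as the key, for example Symbol *key = &F; followed by is_in(&key, pptest, 2, sizeof *pptest, cmp_symbol_ptr).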
A Symbol** passed to a void* parameter doesn't come out as an array in the other end unless you cast it to a proper type. array + k is invalid C and will not compile cleanly on conforming compilers. You cannot do pointer arithmetic on void* nor can you iterate through what it points at without knowing the item size - there's a reason why qsort takes that as parameter.
A correctly written standard C function might look something like this:
#include <stddef.h>
#include <stdbool.h>
bool is_in (const void* array,
size_t n_items,
size_t item_size,
const void* element,
int (*equals)(const void*, const void*))
{
const unsigned char* byte = array; // keep const: array is a const void*
for(size_t i=0; i<n_items; i++)
{
if(equals(&byte[i*item_size], element))
{
return true;
}
}
return false;
}
int symbol_equal (const void* obj1, const void* obj2)
{
const Symbol* s1 = obj1;
const Symbol* s2 = obj2;
...
// in case you passed an array of pointers, then an extra level of dereferencing here
}
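One way the elided comparison might be written is sketched below. It is purely an assumption that two symbols count as equal when their type and content both match; symbol_ptr_equal is a hypothetical name for the variant that handles an array of pointers, and the Symbol definition is copied from the question.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum SymbolType { TERMINAL, NONTERMINAL } SymbolType;

typedef struct Symbol {
    char *content;
    SymbolType type;
} Symbol;

static bool is_in(const void *array, size_t n_items, size_t item_size,
                  const void *element,
                  int (*equals)(const void *, const void *))
{
    const unsigned char *byte = array;  /* byte pointer for portable stepping */
    for (size_t i = 0; i < n_items; i++) {
        if (equals(&byte[i * item_size], element))
            return true;
    }
    return false;
}

/* Assumption: two symbols are equal when type and text both match. */
static int symbol_equal(const void *obj1, const void *obj2)
{
    const Symbol *s1 = obj1;
    const Symbol *s2 = obj2;
    return s1->type == s2->type && strcmp(s1->content, s2->content) == 0;
}

/* Variant for an array of Symbol *: one extra dereference on the array side. */
static int symbol_ptr_equal(const void *slot, const void *element)
{
    const Symbol *const *ps = slot;
    return symbol_equal(*ps, element);
}
```

With an array of Symbol structs the call would be is_in(syms, n, sizeof syms[0], &key, symbol_equal); with an array of Symbol * only the last argument changes to symbol_ptr_equal.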
As others have pointed out, the problem results from trying to perform addition on void*, which is not defined in standard C. While other answers avoid this by passing the item size, as is the case with qsort(), I managed to solve the problem by separating array and k into distinct parameters of the equals() routine, which then performs pointer arithmetic after casting into the proper, non-void* type.
int is_in(void* list, int size, void* pelement, int (*equals)(void* this, int k, void* that)) {
for(int k = 0; k < size; k++) {
if(equals(list, k, pelement)) {
return 1;
}
}
return 0;
}
int ptreq(void* ptr1, int k, void* ptr2) {
return (*(((Symbol**) ptr1) + k) == (Symbol*) ptr2);
}

Strcmp causes segfault

Here is the code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int my_compare(const void * a, const void * b);
int main()
{
char s[][80] =
{ "gxydyv", "gdyvjv", "lfdtvr", "ayfdbk", "sqkpge", "axkoev", "wdjitd", "pyrefu", "mdafyu",
"zdgjjf", "awhlff", "dqupga", "qoprcn", "axjyfb", "hfrgjf", "dvhhhr" };
int i;
puts("#Before:#");
for (i = 0; i < 16; i++)
puts(s[i]);
qsort(s, 16, sizeof *s, my_compare);
putchar('\n');
puts("#After:#");
for (i = 0; i < 16; i++)
puts(s[i]);
return 0;
}
int my_compare(const void *a, const void *b)
{
return strcmp(*(char **)a, *(char **)b);
}
Here is the output:
#Before:#
gxydyv
gdyvjv
lfdtvr
ayfdbk
sqkpge
axkoev
wdjitd
pyrefu
mdafyu
zdgjjf
awhlff
dqupga
qoprcn
axjyfb
hfrgjf
dvhhhr
Segmentation fault
I also notice that the prototype of strcmp is:
int strcmp(const char *s1,const char *s2);
I suppose that the type of a and b in my_compare is "pointer to array-of-char". As a result, *(char **)a is a "pointer to char", which is exactly what strcmp expects.
So where is the problem?
Change:
return strcmp(*(char **) a, *(char **) b);
To:
return strcmp(a,b);
You had an extra level of pointer dereferencing that was incorrect, and that's why you got the segfault. Each argument already points at the first character of a row of s, so *(char **)a reinterprets the first bytes of the string itself as a pointer [the cast masked the type error].
Note: no need to cast from void * here.
UPDATE:
In response to your question, yes, because of the way you defined s and the qsort call.
Your original my_compare would have been fine if you had done:
char *s[] = { ... };
And changed your qsort call to:
qsort(s, 16, sizeof(char *), my_compare);
To summarize, here are two ways to do it
int
main()
{
char s[][80] = { ... };
qsort(s, 16, 80, my_compare);
return 0;
}
int
my_compare(const void *a, const void *b)
{
return strcmp(a,b);
}
This is a bit cleaner [uses less space in array]:
int
main()
{
char *s[] = { ... };
qsort(s, 16, sizeof(char *), my_compare);
return 0;
}
int
my_compare(const void *a, const void *b)
{
return strcmp(*(char **) a,*(char **) b);
}
UPDATE #2:
To answer your second question: No
None of these even compile:
return strcmp((char ()[80])a,(char ()[80])b);
return strcmp(*(char ()[80])a,*(char ()[80])b);
return strcmp((char [][80])a,(char [][80])b);
return strcmp(*(char [][80])a,*(char [][80])b);
But, even if they did, they would be logically incorrect. The following does not compile either, but is logically closer to what qsort is passing:
return strcmp((char [80])a,(char [80])b);
But, when a function passes something defined as char x[80] it's just the same as char *x, so qsort is passing char * [disguised as void *].
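For reference, a cast that names a pointer to the element type of char s[][80] does compile: the elements are arrays of 80 char, so a pointer to one has type char (*)[80]. The sketch below (compare_rows and ROW_LEN are hypothetical names) behaves the same as calling strcmp(a, b) directly and is shown only to make the types explicit.

```c
#include <stdlib.h>
#include <string.h>

#define ROW_LEN 80  /* matches the second dimension of char s[][80] */

/* Each argument points at one row, i.e. at a char[ROW_LEN]. */
static int compare_rows(const void *a, const void *b)
{
    const char (*ra)[ROW_LEN] = (const char (*)[ROW_LEN])a;
    const char (*rb)[ROW_LEN] = (const char (*)[ROW_LEN])b;

    /* *ra is an array of char, which decays to const char * here. */
    return strcmp(*ra, *rb);
}
```

It would be used as qsort(s, 16, sizeof *s, compare_rows); with the array from the question.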
A side note: Using char *s[] is far superior. It allows for arbitrary length strings. The other form char s[][80] will actually fail if a given string exceeds [or is exactly] 80 chars.
I think it's important for you to understand:
Arrays are effectively passed by reference: an array argument decays to a pointer to its first element.
The interchangeability of arrays and pointers.
The following two are equivalent:
char *
strary(char p[])
{
for (; *p != 0; ++p);
return p;
}
char *
strptr(char *p)
{
for (; *p != 0; ++p);
return p;
}
Consider the following [outer] definitions:
char x[] = { ... };
char *x = ...;
Either of these two may be passed to strary and/or strptr in any of the following forms [total of 20]:
strXXX(x);
strXXX(x + 0);
strXXX(&x[0]);
strXXX(x + 1);
strXXX(&x[1]);
Also, see my recent answer here: Issue implementing dynamic array of structures
You can just cast it to a const char *, it should work now:
int my_compare(const void *a, const void *b) {
return strcmp((const char *)a, (const char *)b);
}
And also you should add:
#include <stdlib.h>

Sort char pointer array in C

I am trying to sort array of char pointers, for that purpose I use qsort function, but I can't understand what I am doing wrong and how I can sort that array.
int StringCompare( const void* a, const void* b)
{
char const *char_a = (char const *)a;
char const *char_b = (char const *)b;
return strcmp(char_a, char_b);
}
int main() {
char *a[] = { "Garima",
"Amit",
"Gaurav",
"Vaibhav"
};
int n;
qsort( a, 4, sizeof(char*), StringCompare);
for (n=0; n<4; n++)
printf ("%c ",*a[n]);
}
The Output is: G A G V
The issue is that the values passed to the sort function (a.k.a StringCompare) are pointers into the a array. In other words, they are of type const char **.
You need to instead declare char_a and char_b as const char **, and dereference them in the call to strcmp:
int StringCompare( const void* a, const void* b)
{
char const * const *char_a = a;
char const * const *char_b = b;
return strcmp(*char_a, *char_b);
}
Also note the casts are unnecessary.
proper comparator:
int StringCompare( const void* a, const void* b)
{
char const *char_a = *(char const **)a;
char const *char_b = *(char const **)b;
return strcmp(char_a, char_b);
}
NOTE:
According to the qsort description, the comparator function is:
compar
Pointer to a function that compares two elements.
This function is called repeatedly by qsort to compare two elements.
It shall follow the following prototype:
int compar (const void* p1, const void* p2);
So the comparator receives not a char *, but a char ** (a pointer to each element).
proper output cycle:
for (n=0; n<4; n++)
printf ("%s ", a[n]);
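Define your StringCompare function this way:
int StringCompare(const void *a, const void *b)
{
const char * const *pa = a;
const char * const *pb = b;
return strcmp(*pa, *pb);
}
No need to clutter the code with explicit casts, because a void pointer converts implicitly to any other object pointer type. Keep the const void * parameters, though: qsort() expects a comparator of type int (*)(const void *, const void *), and a function declared with const char ** parameters has an incompatible type.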
If you want to sort char arrays by their first letters, you could implement a function that compares the (unsigned) values of the first char in each array, since these correspond to the character codes of the ASCII standard. But you have to be careful if you mix upper-case and lower-case characters, because all upper-case letters sort before all lower-case ones.
I know it's not a specially implemented library function, but I once programmed it that way and it worked.
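A minimal sketch of such a first-letter comparison, assuming the array holds char * elements as in the question, could fold case with tolower() so that mixed upper and lower case sorts together. The name compare_first_letter is hypothetical.

```c
#include <ctype.h>

/* Compares two elements of a char * array by first character only,
   ignoring case. Strings with the same first letter compare equal. */
static int compare_first_letter(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;

    /* Convert through unsigned char: tolower() requires a value
       representable as unsigned char (or EOF). */
    int ca = tolower((unsigned char)(*sa)[0]);
    int cb = tolower((unsigned char)(*sb)[0]);

    return (ca > cb) - (ca < cb);
}
```

Because strings that share a first letter compare equal and qsort() is not guaranteed to be stable, their relative order after sorting is unspecified.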

Sorting Array of Struct Pointers

I am trying to sort array of structs using qsort() but frustratingly, it's not working. I have read the manpage for qsort() and I think I have the comparator function that syntactically looks okay, but when I print the "sorted" array after calling qsort(), nothing is sorted in my array.
The code:
#include <stdlib.h>
#include <stdio.h>
#define ARRAY_SZ 5
typedef struct SingleChar
{
unsigned char Character;
unsigned int Weight;
} *SingleCharPtr;
int CompareWeights(const void *a, const void *b)
{
const SingleCharPtr p1 = (SingleCharPtr)a;
const SingleCharPtr p2 = (SingleCharPtr)b;
// printf("Weight1: %u\tWeight2: %u\n", p1->Weight, p2->Weight);
// return (p1->Weight - p2->Weight);
if (p1->Weight < p2->Weight)
return -1;
else if (p1->Weight > p2->Weight)
return 1;
else
return 0;
}
SingleCharPtr MakeChar(unsigned char c, unsigned int w)
{
SingleCharPtr scptr = malloc(sizeof(struct SingleChar));
if (!scptr)
{
fprintf(stderr, "[Error] Out of memory\n");
exit(1);
}
scptr->Character = c;
scptr->Weight = w;
return scptr;
}
int main(void)
{
SingleCharPtr *chars = malloc(ARRAY_SZ * sizeof(SingleCharPtr));
chars[0] = MakeChar('B', 3);
chars[1] = MakeChar('E', 7);
chars[2] = MakeChar('A', 4);
chars[3] = MakeChar('D', 6);
chars[4] = MakeChar('C', 2);
qsort(chars, ARRAY_SZ, sizeof(SingleCharPtr), &CompareWeights);
int i;
for (i = 0; i < ARRAY_SZ; i++)
{
printf("Character: %c\tWeight: %u\n", chars[i]->Character, chars[i]->Weight);
free(chars[i]);
}
free(chars);
return 0;
}
Also, in the comparator function (CompareWeights()), I found out that when I print the weight of the structs pointed by SingleCharPtr, I get 0s for all of them.
Any pointer to right direction would be highly appreciated.
The problem: qsort() passes in pointers to the elements to be compared to the comparator function, and not the elements themselves. So, the arguments to your CompareWeights() function are actually const SingleCharPtr *, disguised as const void *. What you should do in that function is:
const SingleCharPtr p1 = *(const SingleCharPtr *)a;
etc.
Sidenotes:
I. If your assumption had been valid, then you wouldn't have needed the cast:
const SingleCharPtr p1 = a;
is preferred over
const SingleCharPtr p1 = (SingleCharPtr)a;
because the conversion from a void pointer is implicit in C, and a redundant cast can hide mistakes.
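II. The comparison function need not return exactly -1, 0 or 1; it only has to return an integer less than, equal to, or greater than 0. The if chain in CompareWeights() can therefore be shortened. A plain p1->Weight - p2->Weight is not safe here, though: Weight is an unsigned int, so the difference wraps around instead of going negative. Write
return (p1->Weight > p2->Weight) - (p1->Weight < p2->Weight);
instead.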
III. SingleCharPtr *chars = malloc(ARRAY_SZ * sizeof(SingleCharPtr)); - Why? You only use the chars array locally in the main() function, you don't need dynamic allocation for that. Why not write
SingleCharPtr chars[ARRAY_SZ];
instead?
If you see the example in e.g. this manual page, you will see that when qsort is passed an array of pointers (just like you have), the arguments to the sorting function are actually pointers to pointers. This is because qsort passes pointers to the elements, not the elements themselves.
To accommodate that, change the comparator accordingly:
int CompareWeights(const void *a, const void *b)
{
const SingleCharPtr p1 = *(const SingleCharPtr *)a;
const SingleCharPtr p2 = *(const SingleCharPtr *)b;
// Weight is unsigned, so compare instead of subtracting to avoid wrap-around
return (p1->Weight > p2->Weight) - (p1->Weight < p2->Weight);
}
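The sketch below restates that comparator without the pointer typedef, which makes the two levels of indirection and the const qualifiers easier to see. The name compare_weights is hypothetical; the struct mirrors the one in the question. It compares rather than subtracts because Weight is unsigned.

```c
#include <stdlib.h>

struct SingleChar {
    unsigned char Character;
    unsigned int Weight;
};

/* Elements are struct SingleChar *, so each argument points at such a pointer. */
static int compare_weights(const void *a, const void *b)
{
    const struct SingleChar *const *pa = a;
    const struct SingleChar *const *pb = b;
    unsigned int wa = (*pa)->Weight;
    unsigned int wb = (*pb)->Weight;

    /* Yields -1, 0 or 1 without subtracting, so large unsigned
       values cannot wrap around and flip the sign. */
    return (wa > wb) - (wa < wb);
}
```

It is passed to qsort() in the usual way: qsort(chars, ARRAY_SZ, sizeof chars[0], compare_weights);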

Sorting any kind of element using void pointers in C

Hello everyone, I am writing a program for sorting general elements in C. It should sort any type of object (int, float, complex numbers, objects).
What I have thought of is using void pointers:
void qsort(void *ptr,int sz,int i,int j,int (*fptr) (const void *,const void *) )
{
if(i<j)
{
int p=(i+j)/2;
p=partition(ptr,sz,i,j,p,fptr);
qsort(ptr,sz,i,p-1,fptr);
qsort(ptr,sz,p+1,j,fptr);
}
}
FOR Comparison
The value of sz tells us whether it is a pointer to a string, an int, a char, a float, etc.
int compare(const void* a,const void* b,int sz)
{
if(sz==0) //means pointer to a string
return strcmp( (char*)a, (char*)b );
else if(sz==1) //means int
return *(int*)a - *(int*)b;
else if(sz==2) //means float
return *(float*)a- *(float*)b;
else if(sz==3)
return *(char*)a- *(char*)b;
}
FOR SWAPPING TWO ELEMENTS
void swap(void *a,void *b,int sz)//for swapping
{
if(sz==0)
{
void *c;
c=a;
a=b;
b=c;
}
else if(sz==1)
{
a=(int*)a;
b=(int*)b;
int c;
c= *a;
*a=*b;
*b=c;
}
else if(sz==2)
{
a=(float*)a;
b=(float*)b;
float c;
c= *a;
*a=*b;
*b=c;
}
EDITED
qsort(arr,4,0,9,&compare);
The full code is under construction, please tell me if there could be some optimizations in my approach, or some better alternatives for this problem.
As it seems to me that it is really going to be big in size
Many many thanx in advance
Since your swap routine will likely be used by the partition function, it should work with arbitrary sized objects, not just the ones you plan to pass in to the code.
void swap (void *a, void *b, int sz) {
char buf[512];
void *p = buf;
if (sz > sizeof(buf)) p = malloc(sz);
memcpy(p, a, sz);
memcpy(a, b, sz);
memcpy(b, p, sz);
if (p != buf) free(p);
}
From the way you have written your comparison routine, it seems you only plan to send in certain types of arrays. But, sz is usually used to tell how big the individual elements in the array are, not as a type identifier, as you seem to be trying to use it.
struct x { int key; /*...*/ };
int cmp_x (const void *a, const void *b) {
const struct x *xa = a;
const struct x *xb = b;
return (xa->key > xb->key) - (xa->key < xb->key);
}
struct x array_x[100];
/* populate array */
qsort(array_x, sizeof(struct x), 0, 99, cmp_x); /* assuming i and j are inclusive indices, as in the question's own call */
This is how I imagine your qsort should be called. (Thanks to Ambroz Bizjak for the nifty comparison implementation.)
For an array of int:
int cmp_int (const void *a, const void *b) {
int ia = *(const int *)a;
int ib = *(const int *)b;
return (ia > ib) - (ia < ib);
}
int array_i[100];
/* populate array */
qsort(array_i, sizeof(int), 0, 99, cmp_int); /* assuming i and j are inclusive indices, as in the question's own call */
The problem is that this does not allow sorting of custom types, like structs. The usual approach is to accept a function pointer which you call to do the comparisons.
What you should be doing is passing in the comparison as a function pointer. You are passing in a function pointer but you don't seem to be using it to compare the values. You don't have to predefine all of the comparisons because you can define them when you use them, for the type of values you're using.
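To illustrate the function-pointer approach, here is a minimal sketch of a type-agnostic sort that never inspects the element type: it only moves bytes and calls the comparator. The names generic_sort and swap_bytes are hypothetical, and insertion sort is used in place of quicksort purely to keep the example short.

```c
#include <stddef.h>

/* Swap two objects of `size` bytes one byte at a time (no allocation). */
static void swap_bytes(void *a, void *b, size_t size)
{
    unsigned char *pa = a, *pb = b;
    while (size--) {
        unsigned char tmp = *pa;
        *pa++ = *pb;
        *pb++ = tmp;
    }
}

/* Hypothetical generic_sort(): insertion sort, chosen for brevity rather
   than speed. Same parameters as the standard qsort(). */
static void generic_sort(void *base, size_t nmemb, size_t size,
                         int (*compar)(const void *, const void *))
{
    unsigned char *arr = base;
    for (size_t i = 1; i < nmemb; i++) {
        /* Move element i left until its left neighbour is not larger. */
        for (size_t j = i;
             j > 0 && compar(arr + (j - 1) * size, arr + j * size) > 0;
             j--) {
            swap_bytes(arr + (j - 1) * size, arr + j * size, size);
        }
    }
}

/* Example comparator for int; the caller supplies one per element type. */
static int cmp_int(const void *a, const void *b)
{
    int ia = *(const int *)a;
    int ib = *(const int *)b;
    return (ia > ib) - (ia < ib);
}
```

Supporting a new element type then means writing one more comparator, not adding another branch to the sort or swap code.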

Resources