So I have a C function that goes like this:
int cmp (const void *a, const void *b) {
return rot13cmp( (const char*)a, (const char*)b );
}
and rot13cmp is another function that takes two parameters of type const char *.
I pass this function into the compare parameter for the C qsort function but it doesn't seem to work.
However, if I instead cast the const void * variables by doing
return rot13cmp ( *(const char **)a, *(const char **)b );
the function then starts to work. I looked this up on SO, but every source said that the first method of casting should work, so I wanted to know why only the second one worked for me.
Edit: Here's the relevant code I have,
int cmp (const void *a, const void *b) {
return rot13cmp( (const char *)a, (const char *)b );
}
int rot13cmp (const char *a, const char *b) {
while (*a == *b && *a != '\n') {
a++;
b++;
}
if (*a == *b) { return 0; }
else if (*a == '\n') { return 1; }
else if (*b == '\n') { return 1; }
else { return rot13(*a) - rot13(*b); }
}
and rot13 returns an int for the ASCII code of a letter rotated by 13 letters in the alphabet.
I called qsort by doing
qsort(words, word_count, sizeof(char*), cmp);
where words is an array of char** and word_count is an int. cmp is also just
qsort() calls the comparison function with pointers to the array elements that should be compared.
If your array contains const char*, that means the comparison function is called with pointers to those pointers, and you have to cast and dereference accordingly.
With (const char*)a you are interpreting the parameter as if it were a pointer to const char. But it isn't. In reality it's a pointer to the const char* in the input array.
That's why (const char**)a is the correct cast: it interprets the parameter as a pointer to a const char*. To do string comparison you want that pointed-to const char*, which you access by dereferencing the cast value with *.
You can think of it as first correcting the type (by casting), and then accessing the pointed-to value (by dereferencing).
The difference between the two attempts is that the second case does an additional dereference. This is important since qsort() doesn't pass the const char* directly, but rather passes a pointer to it. So we have to look at the pointed-to value to find what we are looking for. By casting directly to const char* we just claim that the variable would contain such a pointer, which won't end well because that's not the case.
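As a minimal sketch of the cast-then-dereference idea, assuming the array holds char * elements and using plain strcmp() as a stand-in for rot13cmp (which would slot in the same way). The names cmp_words and sort_words are made up for illustration:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() hands the comparator the ADDRESS of each element.
 * The elements are char *, so each argument is really a char **. */
static int cmp_words(const void *a, const void *b)
{
    const char *sa = *(const char *const *)a;  /* fix the type, then dereference */
    const char *sb = *(const char *const *)b;
    return strcmp(sa, sb);
}

/* Hypothetical helper: sorts an array of word_count string pointers. */
static void sort_words(char **words, size_t word_count)
{
    assert(words != NULL || word_count == 0);
    qsort(words, word_count, sizeof *words, cmp_words);
}
```

Calling sort_words(words, word_count) only rearranges the pointers in the array; the characters they point to are not moved.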
Related
I have a function, where I have 2 void pointers (part of the specification), but I know they are char *. I want to iterate through the char arrays, so I tried to create some pointers to iterate through them. When I do the following, my program doesn't work:
int foo(void const * first, void const * second)
{
char const * firstIt = (const char*) first;
char const * secondIt = (const char*) second;
...
}
However, if I do:
int foo(void const * first, void const * second)
{
char const * firstIt = *(const char**) first;
char const * secondIt = *(const char**) second;
...
}
What is the difference between the two and why does the second one work? I don't know if I included enough detail, so if more information is needed I'd be happy to provide it.
If the second one works, this is because the void pointer that you declare for the function can really point to anything, and I am guessing that you are passing the function a pointer to a pointer. For example, the following code works:
#include <stdio.h>
#include <stdlib.h>
int foo(void const * first, void const * second);
int goo(void const * first, void const * second);
int main () {
char * a, * b;
a = malloc (sizeof (char));
b = malloc (sizeof (char));
*a = 'z';
*b = 'x';
goo (&a, &b); /* critical line */
free (a);
free (b);
return 0;
}
int foo(void const * first, void const * second) {
char const * firstIt = (const char*) first;
char const * secondIt = (const char*) second;
printf ("%c %c", *firstIt, *secondIt);
return 1;
}
int goo(void const * first, void const * second) {
char const * firstIt = *(const char**) first;
char const * secondIt = *(const char**) second;
printf ("%c %c", *firstIt, *secondIt);
return 2;
}
However, for the above program to work with the function foo you need to replace the critical line with a call of the form:
foo (a, b);
Does the difference make sense? Did it solve your problem?
The first approach assumes the caller has passed a char * (const qualified in some way).
The second assumes the caller has passed a char ** (const qualified in some way).
If the second one works, that means your caller is passing a char **.
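The reason the first wouldn't work is that it reads the wrong memory: what is stored at the address you were given is the char * itself, not the characters of the string. Converting a pointer to another type and dereferencing it as anything other than the original type generally gives undefined behaviour (character types are an exception, but then you are only reading the bytes of the pointer). A round trip via a void pointer doesn't change that.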
That is why compilers complain about implicit conversions of one pointer type to another (except to and from void pointers).
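To make the two assumptions concrete, here is a small sketch with two hypothetical functions, one for each calling convention. The names first_char_direct and first_char_indirect are invented for this illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Expects the caller to pass the string itself: first_char_direct(s). */
static char first_char_direct(const void *p)
{
    const char *s = p;            /* p already points at the characters */
    assert(s != NULL);
    return s[0];
}

/* Expects the caller to pass the address of the pointer:
 * first_char_indirect(&s). */
static char first_char_indirect(const void *p)
{
    const char *s = *(const char *const *)p;  /* p points at a char *; fetch it first */
    assert(s != NULL);
    return s[0];
}
```

Mixing them up, for example calling first_char_direct(&s), compiles without complaint because of the void * parameter, but it returns a byte of the pointer's own representation rather than a character of the string.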
Here is the compare function:
int compare(const void *a, const void *b) {
char* *s = (char* *) a;
char* *t = (char* *) b;
return sort_order * strcmp(*s, *t); // sort_order is -1 or 1
}
Now my question is what is the reasoning behind casting to a double pointer of a particular type? Or rather, why is the double pointer cast needed and how is it used internally?
Other variables used:
char **wordlist; int nbr_words; (array elements are) char *word;
Ex qsort call: qsort(wordlist, nbr_words, sizeof(char *), compare);
It would help if you showed the definition of wordlist, but most likely it's defined as a char **. The compare() function receives a pointer to each element of your list. If each element of your list is of type char *, then compare() is going to receive two pointers to char *, or two char ** in other words.
The conversion to char ** is necessary because qsort() has to work on any kind of type, so the arguments are converted to void * before they are passed. You can't dereference a void *, so you have to convert them back to their original types before doing anything with them. (An explicit cast is only needed here because the code goes from a const void pointer to a non-const char **; otherwise the conversion from void * is implicit.)
For instance:
#include <stdio.h>
int compare_int(void * a, void * b) {
int * pa = a;
int * pb = b;
if ( *pa < *pb ) {
return -1;
} else if ( *pa > *pb ) {
return 1;
} else {
return 0;
}
}
int compare_double(void * a, void * b) {
double * pa = a;
double * pb = b;
if ( *pa < *pb ) {
return -1;
} else if ( *pa > *pb ) {
return 1;
} else {
return 0;
}
}
int compare_any(void * a, void * b, int (*cfunc)(void *, void *)) {
return cfunc(a, b);
}
int main(void) {
int a = 1, b = 2;
if ( compare_any(&a, &b, compare_int) ) {
puts("a and b are not equal");
} else {
puts("a and b are equal");
}
double c = 3.0, d = 3.0;
if ( compare_any(&c, &d, compare_double) ) {
puts("c and d are not equal");
} else {
puts("c and d are equal");
}
return 0;
}
Outputs:
paul#local:~/src/c/scratch$ ./comp
a and b are not equal
c and d are equal
paul#local:~/src/c/scratch$
The compare_any() function will compare any type which is supported, in this case, int and double, because we can pass a function pointer to it. However, the signature of the passed function must be the same, so we can't declare compare_int() to take two int * arguments, and compare_double() to take two double *. We have to declare them both as taking two void * arguments, and when we do this, we have to convert those void * arguments to something useful within those functions before we can work with them.
What's happening in your case is exactly the same, but the data themselves are pointers, so we're passing pointers to pointers, and so we need to convert void * to, in your case, char **.
EDIT: To explain some confusion in the comments to the original question about how qsort() works, here's the qsort() signature:
void qsort(void *base, size_t nmemb, size_t size,
int(*compar)(const void*, const void*))
base is a pointer to the first element of an array, nmemb is the number of members of that array, and size is the size of each element.
When qsort() calls compar on, say, the first and second elements of your array, it'll send the address of the first element (i.e. base itself) and the address of the second element (i.e. base + size bytes).
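The address arithmetic can be sketched as follows. This is only an illustration of the idea, not the code of any particular C library's qsort(), and element_address is a made-up name:

```c
#include <assert.h>
#include <stddef.h>

/* Address of element i in an array that starts at base and whose
 * elements are size bytes each. The arithmetic is done on char *
 * because void * has no element size in standard C. */
static const void *element_address(const void *base, size_t i, size_t size)
{
    assert(base != NULL);
    return (const char *)base + i * size;
}
```

For an int array, element_address(nums, 1, sizeof nums[0]) is the same address as &nums[1]; for a char * array it is the address of the second pointer, which is why the comparator sees a char **.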
If base was originally declared as an array of int, then the compare function must interpret those pointers it receives as pointers to int, as int *. If base was originally declared as an array of strings, as a char **, then the compare function must interpret those pointers as pointers to char *, i.e. as char **.
In all cases, the compare function gets pointers to elements. If you have an int array, then you must interpret those pointers as int * in your compare function. If you have a char * array, then you must interpret them as char **, and so on.
In this case, you obviously could call strcmp() if you just passed plain char * arguments to the compare function. But, because qsort() is generic it can only pass pointers to the compare function, it can't actually pass the value of your elements - it's the use of void * which allows it to be generic, because any type of object pointer can be converted to void *, but there is no equivalent datatype to which any non-pointer value can be converted. For that reason, it has to work the same way with regular types like int and double, with pointers, and with structs, and the only way to get it to work correctly with all possible types is to have it deal with pointers to elements, not with the elements themselves, even when the elements are themselves also pointers. For this reason, it might seem like you're getting an unnecessary level of indirection, here, but it actually is necessary in order for qsort() to be able to function in the generic way that it does.
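As a sketch of the same rule applied to non-pointer elements, the comparators below receive pointers to int and to a struct. The struct item type and all function names are hypothetical and exist only for this example:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record type, used only for this illustration. */
struct item {
    int id;
    double weight;
};

/* Elements are int, so each argument is really an int *. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* -1, 0 or 1, with no overflow risk */
}

/* Elements are struct item, so each argument is a struct item *. */
static int cmp_item_by_weight(const void *a, const void *b)
{
    const struct item *ia = a;
    const struct item *ib = b;
    return (ia->weight > ib->weight) - (ia->weight < ib->weight);
}

/* Hypothetical helper: sorts n items by ascending weight. */
static void sort_items(struct item *items, size_t n)
{
    assert(items != NULL || n == 0);
    qsort(items, n, sizeof *items, cmp_item_by_weight);
}
```

In every case the comparator converts the void * to "pointer to element type"; for an array of char * that element type is itself a pointer, which gives char **.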
You can see this more clearly if I modify the code above so that compare_any() is more similar to qsort(), and takes not two pointers, but a single pointer to a two-element array of various types (slightly contrived example, but we're keeping it simple):
#include <stdio.h>
#include <string.h>
int compare_int(void * a, void * b) {
int * pa = a;
int * pb = b;
if ( *pa < *pb ) {
return -1;
} else if ( *pa > *pb ) {
return 1;
} else {
return 0;
}
}
int compare_double(void * a, void * b) {
double * pa = a;
double * pb = b;
if ( *pa < *pb ) {
return -1;
} else if ( *pa > *pb ) {
return 1;
} else {
return 0;
}
}
int compare_string(void * a, void * b) {
char ** pa = a;
char ** pb = b;
return strcmp(*pa, *pb);
}
int compare_any(void * arr, size_t size, int (*cfunc)(void *, void *)) {
char * first = arr;
char * second = first + size;
return cfunc(first, second);
}
int main(void) {
int n[2] = {1, 2};
if ( compare_any(n, sizeof(*n), compare_int) ) {
puts("a and b are not equal");
} else {
puts("a and b are equal");
}
double d[2] = {3.0, 3.0};
if ( compare_any(d, sizeof(*d), compare_double) ) {
puts("c and d are not equal");
} else {
puts("c and d are equal");
}
char * s[] = {"abcd", "bcde"};
if ( compare_any(s, sizeof(*s), compare_string) ) {
puts("'abcd' and 'bcde' are not equal");
} else {
puts("'abcd' and 'bcde' are equal");
}
return 0;
}
Outputs:
paul#local:~/src/c/scratch$ ./comp
a and b are not equal
c and d are equal
'abcd' and 'bcde' are not equal
paul#local:~/src/c/scratch$
As you can see, there's no way compare_any() could accept both an array of int, and an array of char *, without the compare_string() function getting a pointer it needs to treat as a char **, because of the pointer arithmetic it performs on the array elements. Without that additional level of indirection, neither compare_int() nor compare_double() could function.
I am trying to sort array of char pointers, for that purpose I use qsort function, but I can't understand what I am doing wrong and how I can sort that array.
int StringCompare( const void* a, const void* b)
{
char const *char_a = (char const *)a;
char const *char_b = (char const *)b;
return strcmp(char_a, char_b);
}
int main() {
char *a[] = { "Garima",
"Amit",
"Gaurav",
"Vaibhav"
};
int n;
qsort( a, 4, sizeof(char*), StringCompare);
for (n=0; n<4; n++)
printf ("%c ",*a[n]);
}
The Output is: G A G V
The issue is that the values passed to the sort function (a.k.a StringCompare) are pointers into the a array. In other words, they are of type const char **.
You need to instead declare char_a and char_b as const char **, and dereference them in the call to strcmp:
int StringCompare( const void* a, const void* b)
{
char const * const *char_a = a; /* extra const: a is a pointer to const data */
char const * const *char_b = b;
return strcmp(*char_a, *char_b);
}
Also note the casts are unnecessary.
proper comparator:
int StringCompare( const void* a, const void* b)
{
char const *char_a = *(char const **)a;
char const *char_b = *(char const **)b;
return strcmp(char_a, char_b);
}
NOTE:
According to the qsort() description, the comparator function is:
compar
Pointer to a function that compares two elements.
This function is called repeatedly by qsort to compare two elements.
It shall follow the following prototype:
int compar (const void* p1, const void* p2);
So, for an array of char *, it receives not a char * but a pointer to one, i.e. a char **.
proper output cycle:
for (n=0; n<4; n++)
printf ("%s ", a[n]);
Define your StringCompare function this way:
int StringCompare(const char **a, const char **b)
{
return strcmp(*a, *b);
}
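This avoids explicit casts inside the function, since a void pointer converts implicitly to any other object pointer type. Note, however, that this signature does not match the int (*)(const void *, const void *) that qsort() expects, so the call itself needs a cast on the function pointer, and calling a function through an incompatible pointer type is technically undefined behaviour even though it works on common platforms. Keeping the const void * parameters and converting inside the function is the portable choice.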
If you want to sort char arrays by their first letters, you could implement a function that compares the (unsigned) values of the first char in each array, since those values are the character codes (ASCII on most systems). But be careful if you mix upper-case and lower-case characters, because in ASCII every upper-case letter sorts before every lower-case one.
I know it's not a specially implemented library function, but I once programmed it that way and it worked.
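A rough sketch of that idea for use with qsort(), assuming the array holds char * elements and that case should be ignored. The name cmp_first_letter is invented here:

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Compares two strings by their first character only, ignoring case.
 * The array elements are char *, so each argument is really a char **. */
static int cmp_first_letter(const void *a, const void *b)
{
    const char *sa = *(const char *const *)a;
    const char *sb = *(const char *const *)b;
    assert(sa != NULL && sb != NULL);
    /* tolower() needs a value representable as unsigned char. */
    int ca = tolower((unsigned char)sa[0]);
    int cb = tolower((unsigned char)sb[0]);
    return (ca > cb) - (ca < cb);
}
```

Strings that share a first letter compare as equal here, and qsort() makes no promise about the relative order of equal elements, so this is only suitable when the first letter is all that matters.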
While learning to use qsort to sort an array of strings, I ran into a question that puzzled me.
For example, to sort the following array s:
char *s[] = {
"Amit",
"Garima",
"Gaurav",
"Vaibhav"
};
To use qsort, you must provide a comparison function like the following cstring_cmp. I guess that inside qsort, the type of the parameter passed to cstring_cmp is char **. How is a char ** converted to a void *? Why can we convert a char ** to a void *?
int cstring_cmp(const void *a, const void *b)
{
const char **ia = (const char **)a;
const char **ib = (const char **)b;
return -strcasecmp(*ia, *ib);
/* return the negative of the normal comparison */
}
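Your question seems a bit vague but I'll give it a go anyway. To answer how: any object pointer, including a char **, converts to void * implicitly, with no cast needed, and can be converted back by assignment or with a cast. To answer why: that's how C is defined; void * is the language's generic object pointer.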
The qsort() function requires a function with the given prototype (with const void *) parameters. This is because qsort() is unaware of the actual data type you are sorting, and must use a consistent function prototype for the comparison callback. Your comparison callback is responsible for converting the const void * parameters to pointers to the actual types in your array, in your case const char **.
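The round trip through a generic pointer can be sketched like this. The helper name string_at is hypothetical:

```c
#include <assert.h>
#include <string.h>

/* Receives the address of a char * through a generic pointer and
 * returns the string that element refers to. */
static const char *string_at(const void *elem)
{
    const char *const *p = elem;   /* void * to pointer-to-pointer: no cast needed */
    assert(p != NULL);
    return *p;
}
```

Passing &s[1] for an array char *s[] converts the char ** to const void * implicitly at the call, and string_at() converts it back before dereferencing, which is the same thing a qsort() comparator does.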
The example you provide is set up to ask qsort() to sort an array of char pointers (char *). The comparator you're providing is given each 'pair' of items the algorithm needs, by address: the addresses of two char pointers. The addresses qsort() uses are based on the root address you give it, adding size bytes per "item". Since each "item" is a char *, the size of each item is, in fact, the size of a pointer.
I've modified the comparator to demonstrate what is being compared, and what addresses are being passed in. You will see they are all increments off the base address of the array containing all the char *s.
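#include <stdio.h>
#include <stdlib.h>
#include <strings.h> /* strcasecmp() is POSIX, declared in <strings.h> */
char *mystrings[] =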
{
"This",
"is",
"a",
"test",
"of",
"pointers",
"to",
"strings"
};
int cstring_cmp(const void *a, const void *b)
{
const char **ia = (const char **)a;
const char **ib = (const char **)b;
printf("%p:%s - %p:%s\n", a, *ia, b, *ib);
return -strcasecmp(*ia, *ib);
}
int main(int argc, char *argv[])
{
printf("Base address of our pointer array: %p\n\n", mystrings);
qsort(mystrings, sizeof(mystrings)/sizeof(mystrings[0]), sizeof(char*), cstring_cmp);
for (size_t i=0; i<sizeof(mystrings)/sizeof(mystrings[0]);i++)
printf("%s\n", mystrings[i]);
return 0;
}
produces the following output:
Base address of our pointer array: 0x100006240
0x100006240:This - 0x100006260:of
0x100006260:of - 0x100006278:strings
0x100006240:This - 0x100006278:strings
0x100006248:is - 0x100006240:strings
0x100006278:This - 0x100006240:strings
0x100006250:a - 0x100006240:strings
0x100006270:to - 0x100006240:strings
0x100006258:test - 0x100006240:strings
0x100006260:of - 0x100006240:strings
0x100006268:pointers - 0x100006240:strings
0x100006260:of - 0x100006240:strings
0x100006240:test - 0x100006248:This
0x100006248:test - 0x100006250:to
0x100006240:This - 0x100006248:to
0x100006260:of - 0x100006268:pointers
0x100006268:of - 0x100006270:a
0x100006270:a - 0x100006278:is
0x100006268:of - 0x100006270:is
to
This
test
strings
pointers
of
is
a
An even more compact version:
int cstring_cmp(const void *a, const void *b){
return -strcasecmp((char *)(*((char **)a)), (char *)(*((char **)b)));
}
But you can see that a and b are really char **; they are dereferenced to give char *, which is then passed to strcasecmp.
#include <stdlib.h>
#include <stdio.h>
#include <strings.h> /* strcasecmp() is POSIX, declared in <strings.h> */
int cstring_cmp(const void *a, const void *b){
return -strcasecmp((char *)(*((char **)a)), (char *)(*((char **)b)));
}
int main(){
char *s[] = { "Amit",
"Garima",
"Vaibhav",
"Gaurav"};
qsort(s, 4, sizeof(char *), cstring_cmp);
printf("%s\n%s\n%s\n%s\n", s[0], s[1], s[2], s[3]);
return 0;
}
Output:
Vaibhav
Gaurav
Garima
Amit
It is legal to convert any object pointer to void * or char *: void * is C's generic object pointer, and a char * may be used to inspect the individual bytes of any object.
I have one function:
int compare(char * c1, char * c2){
...
...
}
What are the various styles in which I can write a function int ret_compare(void * item) that returns a pointer to compare?
There are two main styles, one using a typedef and one not (with two variants of the typedef). Your comparator should take constant pointers, as below:
int compare(const char *c1, const char *c2) { ... }
// Raw definition of a function returning a pointer to a function that returns an int
// and takes two constant char pointers as arguments
int (*ret_compare1(void *item))(const char *, const char *)
{
// Unused argument - item
return compare;
}
// More usual typedef; a Comparator2 is a pointer to a function that returns an int
// and takes two constant char pointers as arguments
typedef int (*Comparator2)(const char *, const char *);
// And ret_compare2 is a function returning a Comparator2
Comparator2 ret_compare2(void *item)
{
// Unused argument - item
return compare;
}
// Less usual typedef; a Comparator3 is a function that returns an int
// and takes two constant char pointers as arguments
typedef int Comparator3(const char *, const char *);
// And ret_compare3 is a function returning a pointer to a Comparator3
Comparator3 *ret_compare3(void *item)
{
// Unused argument - item
return compare;
}
Note that these comparators cannot be used with bsearch() and qsort() (unless you use fairly gruesome casts) because those comparators are expected to take const void * arguments.
Note, too, that for comparing strings, as opposed to single characters, the function used by qsort() or bsearch() should be similar to:
int string_comparator(const void *v1, const void *v2)
{
const char *s1 = *(char **)v1;
const char *s2 = *(char **)v2;
return(strcmp(s1, s2));
}
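One way to avoid casting the function pointer is a small adapter with the signature qsort() and bsearch() expect, forwarding to the typed comparator. This is a sketch: the body of compare is an assumption (the question elides it), and compare_adapter is a made-up name:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the typed comparator from the question; the body is
 * an assumption, since the original elides it. */
static int compare(const char *c1, const char *c2)
{
    return strcmp(c1, c2);
}

/* Adapter with the signature qsort() and bsearch() expect. It unwraps
 * the element pointers and forwards to the typed comparator. */
static int compare_adapter(const void *v1, const void *v2)
{
    const char *s1 = *(const char *const *)v1;
    const char *s2 = *(const char *const *)v2;
    assert(s1 != NULL && s2 != NULL);
    return compare(s1, s2);
}
```

With bsearch(), the key must be passed the same way as the elements, i.e. as the address of a char * variable holding the string to look for.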