void qsort(void *base, size_t nitems, size_t size, int (*compar)(const void *, const void*))
Is there a way to pass, let's say strcmp to qsort without making a helper function?
I was trying to do:
qsort(..., (int (*) (const void*, const void*) (strcmp)));
Your attempt at the cast simply has a misplaced right (closing) parenthesis. The one at the end should be after the type of the cast. So, you can change:
(int (*) (const void*, const void*) (strcmp))
// ^ wrong
to
(int (*) (const void*, const void*)) (strcmp)
// ^ right
Alternatively, although hiding pointer types in typedef aliases is severely frowned-upon, function pointer types are an exception to that guideline. So, it is easier/clearer to define the required type for the qsort comparator first:
typedef int (*QfnCast) (const void*, const void*);
Then, you can cast to that type:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef int (*QfnCast) (const void*, const void*);
int main(void)
{
char list[5][8] = {
"Fred",
"Bob",
"Anna",
"Gareth",
"Joe"
};
qsort(list, 5, 8, (QfnCast)(strcmp));
for (int i = 0; i < 5; ++i) printf("%s\n", list[i]);
return 0;
}
int (*)(const void*, const void*) and int (*)(const char*, const char*) are not compatible function pointer types.
Casting between different, non-compatible function pointer types is explicitly undefined behavior, C17 6.3.2.3/8 emphasis mine:
A pointer to a function of one type may be converted to a pointer to a function of another
type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the referenced type, the behavior is undefined.
So if you cast strcmp to something else, you are explicitly invoking undefined behavior. It will likely work in practice on any system where all pointer types are of equal size. But if you are going to rely on that, you might as well cook up something like this:
typedef union
{
int (*strcmp) (const char*, const char*);
int (*compare)(const void*, const void*);
} strcmp_t;
const strcmp_t hack = { strcmp };
...
qsort(str, x, y, hack.compare);
This is just as undefined behavior (and as likely to work in practice) but more readable.
You can never do qsort(str, x, y, strcmp) because again strcmp is not compatible with the function pointer type expected by qsort. Function parameter passing is done as per assignment, so the rules of simple assignment are the relevant part, from C17 6.5.16.1:
Constraints
...
the left operand has atomic, qualified, or unqualified pointer type, and (considering the type the left operand would have after lvalue conversion) both operands are pointers to qualified or unqualified versions of compatible types, and the type pointed to by the left has all the qualifiers of the type pointed to by the right;
Therefore qsort(str, x, y, strcmp) is always invalid C and this is not a quality of implementation issue. Rather, compilers letting this through without diagnostics are to be regarded as hopelessly broken.
And finally as noted in comments, strcmp only makes sense to use with bsearch/qsort in case you have a true 2D array of characters such as char str[x][y];. In my experience that's a rather rare use-case. When dealing with strings, you are far more likely to have char* str[x], in which case you must write a wrapper around strcmp anyway.
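For the true 2D-array case, a wrapper with exactly the signature qsort expects is only a few lines and avoids the function-pointer cast altogether. The sketch below is illustrative only; the name cmp_rows is made up here. It assumes each element is one row of a char[x][y] array holding a NUL-terminated string, so the pointer qsort passes can go straight to strcmp without a dereference.

```c
#include <stdlib.h>
#include <string.h>

/* Comparator for the rows of a char[N][M] array (hypothetical name).
   qsort passes the address of each row; a row is itself a
   NUL-terminated char array, so no extra dereference is needed. */
int cmp_rows(const void *a, const void *b)
{
    const char *s1 = a;   /* implicit conversion from const void * */
    const char *s2 = b;
    return strcmp(s1, s2);
}
```

With char list[5][8], the call would then be qsort(list, 5, sizeof list[0], cmp_rows);.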
There are two problems with what you're trying to do.
First, strcmp has type int (*)(const char *, const char *). This type is incompatible with the type int (*)(const void*, const void*) expected by the function because the parameter types are not compatible. This will result in qsort calling strcmp via an incompatible pointer type, and doing so triggers undefined behavior.
This might work if char * and void * have the same representation, but there's no guarantee this will be the case.
The second problem is that even if the call "works", what's ultimately being passed to strcmp isn't actually a char * but a char **. This means that strcmp will be attempting to read a char * value as if it were a sequence of char values.
So you have to use a helper function to get the results you want:
int compare(const void *a, const void *b)
{
const char * const *s1 = a;
const char * const *s2 = b;
return strcmp(*s1, *s2);
}
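As @Some programmer dude has already stated, it depends on what you're sorting. If it's an array of fixed-size character arrays (a true 2D array), you can pass strcmp without a helper function and use a cast to silence the warning, although calling strcmp through the converted pointer is technically undefined behavior: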
char s_array[100][100] = { "z", "a", ... };
qsort( s_array, 100, 100, (int (*)(const void *, const void *))strcmp );
If it's an array of pointers you need a helper function because it gets passed pointers to pointers:
char *p_array[100] = { "z", "a", ... };
int cmp( const void *p1, const void *p2 )
{
return strcmp( *(const char **)p1, *(const char **)p2 );
}
qsort( p_array, 100, sizeof *p_array, cmp );
I was wondering how to get rid of the following warning:
kwic1.c:118:48: warning: incompatible pointer types passing 'int (const char *,
const char *)' to parameter of type 'int (* _Nonnull)(const void *, const
void *)' [-Wincompatible-pointer-types]
I am implementing a comparator for qsort. Here is my function
int comparator(const char *p, const char *q)
{
int index_p = 0;
int index_q = 0;
while(p[index_p] != '\0')
{
if(isupper(p[index_p]))
break;
index_p++;
}
...
I have tried casting p and q, but it didn't work.
Use (implicit) pointer type casting in your function:
int comparator(const void *p1, const void *q1){
const char *p = p1, *q = q1;
// The rest of the code requires no change
It's important for function prototypes to match exactly when passing as function pointers. i.e., you can't pass a int (*)(const char*, const char*) function pointer to a parameter of int (*)(const void*, const void*). All you should do is convert the pointer into desired types in your comparison function.
qsort() is a truly generic function in C, but it trips people up because types matter.
The arguments to the comparator should be (const void*).
If you were sorting int, then you could simply cast to (const int*).
(The void was replaced by int.)
But it looks like you are sorting (char*)s. Thus you need to remember to properly cast that extra indirection: (const void*) → (const (char*)*) → (const char**).
(The void was replaced by char*.)
I'm not sure what you are trying to accomplish with the comparator snippet you have provided, but
Here is an example of qsort()ing c-strings.
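As a minimal sketch of the two cases described above (not the original poster's example; the names cmp_int and cmp_cstr are made up here): when the elements are int, the comparator's arguments are really pointers to int, and when the elements are char *, they are pointers to char *, so one extra dereference is needed.

```c
#include <stdlib.h>
#include <string.h>

/* Elements are int: each argument is really a const int *. */
int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids the possible overflow of x - y */
}

/* Elements are char *: each argument is really a pointer to a char *,
   so it is dereferenced once to reach the string itself. */
int cmp_cstr(const void *a, const void *b)
{
    const char *s1 = *(const char *const *)a;
    const char *s2 = *(const char *const *)b;
    return strcmp(s1, s2);
}
```

Either function can be passed to qsort directly, for example qsort(words, n, sizeof words[0], cmp_cstr);, with no cast on the function pointer.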
I have a function that takes a void** argument and an integer that indicates its datatype
void foo (void** values, int datatype)
Inside the function, depending on the datatype, I malloc it this way:
if (datatype == 1)
*values = (int*) malloc (5 * sizeof(int));
else if (datatype == 2)
*values = (float*) malloc (5 * sizeof(float));
All is good up to now. However, when character strings come into the picture, things get complicated. The void** would need to be void***, since I will need to do something like this:
*values = (char**) malloc (5 * sizeof(char*));
for(i=0;i<5;i++)
(*values)[i] = (char*) malloc (10);
..
strncpy( (*values)[0], "hello", 5);
How should such a situation be handled?
Can I pass a char*** to the function that expects a void** but cast it correctly inside it?
void foo (void** values, int datatype) {
if(datatype == 3) {
char*** tmp_vals = (char***) values;
*tmp_vals = (char**) malloc (5 * sizeof(char*));
...
(*tmp_vals)[i] = (char*) malloc (10 * sizeof(char));
strncpy ( (*tmp_vals)[i], "hello", 5);
}
So I just cast the void** into a char***. I tried this and ignoring the warnings, it worked fine.
But is this safe? Is there a more graceful alternative?
How should such a situation be handled? Can I pass a char*** to the function that expects a void** but cast it correctly inside it?
No, that's technically Undefined Behavior. It may appear to work on your computer, but it may fail on some future computer that implements different pointer types with different representations, which is allowed by the C language standard.
If your function expects a void**, then you better pass it a void**. Any pointer type can be implicitly converted to void*, but that only works at the top level: char* can be converted to void*, and char** can be implicitly converted to void* (because char** is "pointer to char*"), but char** cannot be converted to void**, and likewise char*** also cannot be converted to void**.
The proper way to call this function is to pass it a proper void**, then cast the resulting void* pointer back to its original type:
void foo(void **values, int datatype)
{
if(datatype == 3)
{
char ***str_values = ...;
*values = str_values; // Implicit conversion from char*** to void*
}
else
...
}
...
void *values;
foo(&values, 3);
char ***real_values = (char ***)values;
Provided the pointer stored in *values really was a char***, this cast is valid, and none of the code paths involves Undefined Behavior.
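The same pattern can be written out in full. This is only a sketch under assumptions: make_strings is a hypothetical name, the element count and buffer size are taken from the question, and here the void * carries a char ** (an array of five strings). The caller passes the address of a real void * object and converts the stored pointer back afterwards.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator: stores a new array of 5 strings in *values.
   Returns 0 on success, -1 on allocation failure. */
int make_strings(void **values)
{
    char **strs = malloc(5 * sizeof *strs);
    if (strs == NULL)
        return -1;
    for (int i = 0; i < 5; i++) {
        strs[i] = malloc(10);
        if (strs[i] == NULL) {
            while (i-- > 0)          /* release what was allocated so far */
                free(strs[i]);
            free(strs);
            return -1;
        }
        strcpy(strs[i], "hello");    /* 6 bytes including the terminator */
    }
    *values = strs;                  /* char ** converts implicitly to void * */
    return 0;
}
```

A caller would write void *raw; make_strings(&raw); char **strs = raw; and later free each string and then the array.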
A void * is just a pointer to an unspecified type; it could be a pointer to an int, or a char, or a char *, or a char **, or anything you wanted, as long as you ensure that when you dereference, you treat it as the appropriate type (or one which the original type could safely be interpreted as).
Thus, a void ** is just a pointer to a void *, which could be a pointer to any type you want such as a char *. So yes, if you are allocating arrays of some types of objects, and in one case those objects are char *, then you could use a void ** to refer to them, giving you something that could be referred to as a char ***.
It's generally uncommon to see this construction directly, because usually you attach some type or length information to the array, rather than having a char *** you have a struct typed_object **foo or something of the sort where struct typed_object has a type tag and the pointer, and you cast the pointer you extract from those elements to the appropriate types, or you have a struct typed_array *foo which is a struct that contains a type and an array.
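A rough sketch of the struct typed_array idea follows. The member names, the enum tags and typed_array_init are all assumptions made for illustration; the answer above only names the struct.

```c
#include <stdlib.h>

/* Hypothetical tags standing in for the question's integer datatype codes. */
enum elem_type { ELEM_INT, ELEM_FLOAT, ELEM_STRING };

struct typed_array {
    enum elem_type type;   /* says how to interpret data */
    size_t count;          /* number of elements */
    void *data;            /* int *, float * or char ** depending on type */
};

/* Allocates a zero-filled array of count elements of the given type.
   Returns 0 on success, -1 on failure. */
int typed_array_init(struct typed_array *arr, enum elem_type type, size_t count)
{
    size_t elem_size;

    switch (type) {
    case ELEM_INT:    elem_size = sizeof(int);    break;
    case ELEM_FLOAT:  elem_size = sizeof(float);  break;
    case ELEM_STRING: elem_size = sizeof(char *); break;
    default:          return -1;
    }
    arr->data = calloc(count, elem_size);
    if (arr->data == NULL)
        return -1;
    arr->type = type;
    arr->count = count;
    return 0;
}
```

The tag travels with the pointer, so code that receives a struct typed_array can check arr->type before converting arr->data to int *, float * or char **.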
A couple of notes on style. For one, doing this kind of thing can make your code hard to read. Be very careful to structure it and document it clearly so that people (including yourself) can figure out what's going on. Also, don't cast the result of malloc; the void * is implicitly converted to the type it's assigned to, and casting the result of malloc can hide subtle bugs if you forget to include <stdlib.h> or you update the type declaration but forget to update the cast. See this question for more info.
And it's generally a good habit to attach the * in a declaration to the variable name, not the type name, as that's how it actually parses. The following declares one char and one char *, but if you write it the way you've been writing them, you might expect it to declare two char *:
char *foo, bar;
Or written the other way:
char* foo, bar;
You don't need to (and probably shouldn't) use a void ** at all - just use a regular void *. Per C11 6.3.2.3.1, "a pointer to void may be converted to or from a pointer to any object type. A pointer to any object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer." A pointer variable, including a pointer to another pointer, is an object. void ** is not "a pointer to void". You can convert freely and safely to and from void *, but you're not guaranteed to be able to convert safely to and from void **.
So you can just do:
void foo (void* values, int datatype) {
if ( datatype == 1 ) {
int ** pnvalues = values;
*pnvalues = malloc(5 * sizeof(int));
/* Rest of function */
}
and so on, and then call it similar to:
int * new_int_array;
foo(&new_int_array, 1);
&new_int_array is of type int **, which will get implicitly converted to void * by foo(), and foo() will convert it back to type int ** and dereference it to indirectly modify new_int_array to point to the new memory it has dynamically allocated.
For a pointer to a dynamic array of strings:
void foo (void* values, int datatype) {
/* Deal with previous datatypes */
} else if ( datatype == 3 ) {
char *** psvalues = values;
*psvalues = malloc(5 * sizeof(char *));
(*psvalues)[0] = malloc(5);
/* Rest of function */
}
and so on, and call it:
char ** new_string_array;
foo(&new_string_array, 3);
Similarly, &new_string_array is type char ***, again gets implicitly converted to void *, and foo() converts it back and indirectly makes new_string_array point to the newly allocated blocks of memory.
There is a builtin mechanism to do this already with the added bonus that it allows a variable number of arguments. It is commonly seen in this format yourfunc(char * format_string,...)
/*_Just for reference_ the functions required for variable arguments can be defined as:
#define va_list char*
#define va_arg(ap,type) (*(type *)(((ap)+=(((sizeof(type))+(sizeof(int)-1)) \
& (~(sizeof(int)-1))))-(((sizeof(type))+ \
(sizeof(int)-1)) & (~(sizeof(int)-1)))))
#define va_end(ap) (void) 0
#define va_start(ap,arg) (void)((ap)=(((char *)&(arg))+(((sizeof(arg))+ \
(sizeof(int)-1)) & (~(sizeof(int)-1)))))
*/
So here is a basic example that you could use with a format string and a variable number of args (it needs #include <stdarg.h>; some_intfxn and the other callees are placeholders):
#define INT '0'
#define DOUBLE '1'
#define STRING '2'
void yourfunc(char *fmt_string, ...){
va_list args;
va_start (args, fmt_string);
while(*fmt_string){
switch(*fmt_string++){
case INT: some_intfxn(va_arg(args, int)); break;
case DOUBLE: some_doublefxn(va_arg(args, double)); break;
case STRING: some_stringfxn(va_arg(args, char *)); break;
/* extend this as you like using pointers and casting to your type */
default: handlfailfunc();
}
}
va_end (args);
}
So you can run it as: yourfunc("0122",42,3.14159,"hello","world");
or since you only wanted 1 to begin with yourfunc("1",2.17); It doesn't get much more generic than that. You could even set up multiple integer types to tell it to run a different set of functions on that particular integer. If the format_string is too tedious, then you can just as easily use int datatype in its place, but you would be limited to 1 arg (technically you could use bit ops to OR datatype | num_args but I digress)
Here is the one type one value form:
#define INT '0'
#define DOUBLE '1'
#define STRING '2'
void yourfunc(int datatype, ...){ /*leaving "..." for future while on datatype(s)*/
va_list args;
va_start (args, datatype);
switch(datatype){
case INT: some_intfxn(va_arg(args, int)); break;
case DOUBLE: some_doublefxn(va_arg(args, double)); break;
case STRING: some_stringfxn(va_arg(args, char *)); break;
/* extend this as you like using pointers and casting to your type */
default: handlfailfunc();
}
va_end (args);
}
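For reference, here is a small self-contained sketch of the same format-string technique using the standard <stdarg.h> macros. The function sum_tagged and the TAG_ names are invented for this illustration; the tag characters mirror the '0', '1' and '2' codes used above. It adds up the int and double arguments and the lengths of the string arguments, so the result is easy to check.

```c
#include <stdarg.h>
#include <string.h>

/* Hypothetical tags matching the format characters used above. */
#define TAG_INT    '0'
#define TAG_DOUBLE '1'
#define TAG_STRING '2'

/* Adds the int and double arguments and the lengths of the string
   arguments named by fmt. Returns -1.0 on an unknown tag. */
double sum_tagged(const char *fmt, ...)
{
    va_list args;
    double total = 0.0;

    va_start(args, fmt);
    for (; *fmt != '\0'; fmt++) {
        switch (*fmt) {
        case TAG_INT:
            total += va_arg(args, int);
            break;
        case TAG_DOUBLE:
            total += va_arg(args, double);
            break;
        case TAG_STRING:
            total += (double)strlen(va_arg(args, char *));
            break;
        default:
            va_end(args);       /* always pair va_start with va_end */
            return -1.0;
        }
    }
    va_end(args);
    return total;
}
```

Each case ends in break so that one tag consumes exactly one argument; the caller is responsible for making the tags match the arguments actually passed, because va_arg cannot check them.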
With some tricks, you can do it. See example:
size_t sizes[] = { 0, sizeof(int), sizeof(float), sizeof(char *) };
void *foo(int datatype) {
void *rc = malloc(5 * sizes[datatype]);
switch(datatype) {
case 1: {
int *p_int = (int*)rc;
for(int i = 0; i < 5; i++)
p_int[i] = 1;
} break;
case 3: {
char **p_ch = (char**)rc;
for(int i = 0; i < 5; i++)
p_ch[i] = strdup("hello");
} break;
} // switch
return rc;
} // foo
In the caller, just cast returned value to appropriate pointer, and work with it.
Through trial and error I managed to get the following string comparison function to work with qsort() as I intended but I don't really understand why the asterisk is needed in the (const char*) cast expression. Can someone please dissect and explain:-
int strCompare(const void *a, const void *b) {
return strcmp((const char*)a, (const char*)b);
}
Appendix:-
void findStrings(int * optionStats, char strings[][MAX_STRING_SIZE + 1], int numStrings)
{
qsort(strings, numStrings, 21*sizeof(char), strCompare);
}
Is there a way of eliminating the call to strcmp() through strCompare() and just using strcmp() as the parameter to qsort() instead?
You need an asterisk because you want to convert a pointer to const void to a pointer to const char and an asterisk designates that they are pointer types.
In fact you don't really need conversion, since pointer to void type can be implicitly converted to pointer to T type in C language, which isn't the case for C++.
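As has been mentioned by others here, you don't strictly need to define a new function; you can cast the function pointer instead. With the 2D array from the question, the cast silences the warning, but note that calling strcmp through the converted pointer is technically undefined behavior, and the trick does not work at all for an array of char * (qsort would then hand strcmp pointers to the pointers):
qsort(strings,
numStrings,
sizeof strings[0],
(int(*)(const void *, const void *))strcmp);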
The signature of strcmp is (there's another one, but this is the one you are using):
int strcmp(const char *s1, const char *s2);
so, as the parameters of your function (a and b) have type const void *, you have to perform those casts.
This will be correct as long as the variables you are using as parameters when calling qsort will be passed to strCompare as char *.
Because
int strcmp(
const char *string1,
const char *string2
);
is defined like that. Without the cast to const char *, the variable a has type pointer to const void. It is easier to see if you write
const void *a as const void* a
so that the asterisk visibly belongs to the data type.
So to cast a to a pointer to const char, the cast has to include the asterisk too.
This is from a 'magic' array library that I'm using.
void
sort(magic_list *l, int (*compare)(const void **a, const void **b))
{
qsort(l->list, l->num_used, sizeof(void*),
(int (*)(const void *,const void *))compare);
}
My question is: what on earth is the last argument to qsort doing?
(int (*)(const void *, const void*))compare)
qsort takes int (*comp_fn)(const void *,const void *) as its comparator argument, but this sort function takes a comparator with double pointers. Somehow, the line above converts the double pointer version to a single pointer version. Can someone help explain?
That's exactly what the cast you quoted does: it converts a pointer of type
int (*)(const void **, const void **)
to a pointer of type
int (*)(const void *, const void *)
The latter is what is expected by qsort.
Things like this are encountered rather often in poor-quality code. For example, when someone wants to sort an array of ints, they often write a comparison function that accepts pointers to int
int compare_ints(const int *a, const int *b) {
return (*a > *b) - (*a < *b);
}
and when the time comes to actually call qsort they forcefully cast it to the proper type to suppress the compiler's complaints
qsort(array, n, sizeof *array, (int (*)(const void *,const void *)) compare_ints);
This is a "hack", which leads to undefined behavior. It is, obviously, a bad practice. What you see in your example is just a less direct version of the same "hack".
The proper approach in such cases would be to declare the comparison function as
int compare_ints(const void *pa, const void *pb) {
int a = *(const int *) pa;
int b = *(const int *) pb;
return (a > b) - (a < b);
}
and then use it without any casts
qsort(array, n, sizeof *array, compare_ints);
In general, if one expects their comparison functions to be used as comparators in qsort (and similar functions), one should implement them with const void * parameters.
The last argument to qsort is casting a function pointer taking double pointers, to one taking single pointers that qsort will accept. It's simply a cast.
On most hardware you can assume that pointers all look the same at the hardware level. For example, in a system with flat 64bit addressing pointers will always be a 64bit integer quantity. The same is true of pointers to pointers or pointers to pointers to pointers to pointers.
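Therefore, on such hardware, whatever method is used to invoke a function with two pointers will in practice work with any function that takes two pointers, and the specific type of the pointers doesn't matter. The C standard itself does not guarantee this, though: calling a function through a pointer of an incompatible type is undefined behavior.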
qsort treats pointers generically, as though each is opaque. So it doesn't know or care how they're dereferenced. It knows what order they're currently in and uses the compare argument to work out what order they should be in.
The library you're using presumably keeps lists of pointers to pointers about. It has a compare function that can compare two pointers to pointers. So it casts that across to pass to qsort. It's just syntactically nicer than, e.g.
qsort(l->list, l->num_used, sizeof(void*), compare);
/* elsewhere */
int compare(const void *ptr1, const void *ptr2)
{
// these are really pointers to pointers, so read the stored pointers out
// (struct item is a hypothetical element type with an int sort_key member)
const struct item *item1 = *(void *const *)ptr1;
const struct item *item2 = *(void *const *)ptr2;
// do whatever with item1 and item2 here, e.g.
return (item2->sort_key > item1->sort_key) - (item2->sort_key < item1->sort_key);
}
It is casting a function pointer. I imagine that the reason is so that compare can be applied to the pointers that are dereferenced rather than whatever they are pointing to.
(int (*)(const void *,const void *))compare is a C style cast to cast the function pointer compare to a function pointer with two const void * args.
The last argument is a function pointer. It specifies that it takes a pointer to a function that returns an int and takes two const void ** arguments.
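If the cast in sort() is to be avoided, one option is a small adapter that has exactly the type qsort expects and forwards to the library-style comparator. The sketch below is an assumption-laden illustration, not the library's code: the layout of magic_list is guessed from the fields used in the question, and user_compare, adapter and by_int_value are invented names. Because qsort has no context argument, the comparator is kept in a file-scope variable, which makes this version non-reentrant.

```c
#include <stdlib.h>

/* Assumed layout; the real magic_list is not shown in the question. */
typedef struct {
    void **list;
    size_t num_used;
} magic_list;

/* Comparator supplied by the caller, set for the duration of sort().
   A file-scope variable makes this sketch non-reentrant. */
static int (*user_compare)(const void **a, const void **b);

/* Has exactly the type qsort expects, so no function-pointer cast is needed. */
static int adapter(const void *a, const void *b)
{
    /* Each argument is the address of one void * element of the list. */
    const void *ea = *(void *const *)a;
    const void *eb = *(void *const *)b;
    return user_compare(&ea, &eb);
}

void sort(magic_list *l, int (*compare)(const void **a, const void **b))
{
    user_compare = compare;
    qsort(l->list, l->num_used, sizeof(void *), adapter);
}

/* Example comparator in the library's style: the elements point to int. */
int by_int_value(const void **a, const void **b)
{
    int x = *(const int *)*a;
    int y = *(const int *)*b;
    return (x > y) - (x < y);
}
```

The trade-off is one extra indirect call per comparison in exchange for never calling a function through an incompatible pointer type.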
From the man page of qsort, in an example of sorting strings:
static int
cmpstringp(const void *p1, const void *p2)
{
/* The actual arguments to this function are "pointers to
pointers to char", but strcmp(3) arguments are "pointers
to char", hence the following cast plus dereference */
return strcmp(* (char * const *) p1, * (char * const *) p2);
}
Why is it necessary to have char * const * in the arguments to strcmp()? Isn't char * enough?
strcmp is declared as
int strcmp(
const char *string1,
const char *string2
);
This properly expresses the function's interface contract - which is that strcmp will not modify its input data - and allows the compiler to optimize inside the function (assuming it were not part of the CRT, and likely in assembler already).
const void* p1 says that whatever p1 points at is not changed by this function. If you did
char** p1_copy = (char**) p1;
that would be a setup to potentially break that promise, because you could then do
*p1_copy = "Something else";
So a cast from const void* to char** is said to "cast away const". Legal, but some compilers will warn if you use a cast to both cast away const and otherwise change the type at once.
The cast that doesn't break the promise of the const void* p1 declaration is the one used:
char* const* p1_arg = (char* const*) p1;
Now *p1_arg, the thing p1 points to, can't be changed just like we said. You could change the characters in it though:
*p1_arg[0] = 'x';
The function declaration never said anything about them, and you say you know them to originally be non-const chars. So it's allowable, even though the function doesn't actually do any such thing.
Then you dereference that (as an rvalue) to get a char*. That can legally be passed as the const char* argument to strcmp, because adding const to the pointed-to type is an implicit conversion.
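The chain of conversions described above can be sketched in a few lines. The function name pointed_length is made up for this illustration; it assumes its argument is really the address of a char *, as in the qsort comparator.

```c
#include <string.h>

/* Returns the length of the string whose char * is stored at p1.
   p1 is assumed to be the address of a char * object. */
size_t pointed_length(const void *p1)
{
    char *const *pp = p1;   /* keeps the const promised by const void * */
    /* *pp = "other";          would not compile: *pp is a const char-pointer */
    const char *s = *pp;    /* char * to const char * is an implicit conversion */
    return strlen(s);
}
```

No cast is needed on the first line because the conversion from const void * does not drop the const qualifier; converting to plain char ** would.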
Technically, if you wanted to get rid of the consts, the cast would be to char **, not char *. The const is left in the cast because the arguments to cmpstringp are also const.
A comparison function passed to qsort has no business modifying the items it's comparing.
This is why the general case of qsort looks like:
void qsort(void *base, size_t nmemb, size_t size, int(*compar)(const void *, const void *));