I want to use quicksort in C, which has a function signature of void qsort(void *base, size_t nitems, size_t size, int (*compar)(const void *, const void*)), but the signature of my comparison function is int (*compar)(const void *, const void*, const int) with the third argument being constant during one call to quicksort.
As an illustration, assume that I want to sort an array of vectors according to different norms (L0, L1, L2 and Linfinity norm, for example). The norm to use is passed as the third argument to the comparison function, but remains constant during the call to qsort. Is it possible to do an assignment of a form like
//Function declaration for parametric comparison
int cmp3(int* a_vec, int* b_vec, int x);
// Somewhere in main
int (*cmp2)(int, int);
cmp2 = cmp3(int*, int*, 2);//2 could mean L2 norm
to be able to call something like
qsort(a, 100, sizeof(a), cmp2);
I know this does not work, but I hope it gives an idea what I want to accomplish. Also it is not possible to make different comparison functions and calls to qsort as the number of different ways of comparing is too big.
This is called partial function application, and in C you can only achieve something like it with a wrapper:
int cmp2(const void *a, const void *b) {
    return cmp3(a, b, 2);
}
If you're into partial function application or maybe mappings or pure functions, you may want to look into functional programming languages, like Haskell.
The main problem is that the function expects three arguments on the stack when it is called. Old C compilers were "smart": if you didn't pass enough parameters, they would "complete" the stack with empty (zeroed) variables.
Nowadays, if you do that (assuming the compiler accepts it), the third parameter will have an undefined value on the stack, not the value you are expecting.
You should do a "proxy" function as a previous comment said:
int cmp2(const void *a, const void *b) {
    return cmp3(a, b, 2);
}
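A minimal sketch of how such a proxy might be wired up when the parameter changes between calls: the wrapper reads the norm from a file-scope variable that is set just before qsort runs. The names current_norm, norm_of and sort_by_norm are hypothetical, the vectors are assumed to be fixed-length int arrays, and the approach is neither reentrant nor thread-safe.

```c
#include <stdlib.h>

#define DIM 3  /* assumed fixed vector length for this sketch */

/* Hypothetical selector read by the wrapper; set before each qsort call. */
static int current_norm = 2;

/* Ordering key for a vector: 1 = L1 (sum of |x|), 2 = squared L2
   (sum of x*x, which orders the same as L2), otherwise Linfinity (max |x|). */
static long norm_of(const int *v, int norm)
{
    long acc = 0;
    for (int i = 0; i < DIM; i++) {
        long x = labs((long)v[i]);
        if (norm == 1)
            acc += x;
        else if (norm == 2)
            acc += x * x;
        else if (x > acc)
            acc = x;
    }
    return acc;
}

/* The three-argument comparison from the question. */
static int cmp3(const void *a, const void *b, int norm)
{
    long na = norm_of(a, norm);
    long nb = norm_of(b, norm);
    return (na > nb) - (na < nb);
}

/* Two-argument wrapper with the signature qsort expects. */
static int cmp2(const void *a, const void *b)
{
    return cmp3(a, b, current_norm);
}

/* Sorts n vectors of DIM ints by the chosen norm. */
static void sort_by_norm(int (*vecs)[DIM], size_t n, int norm)
{
    current_norm = norm;
    qsort(vecs, n, sizeof vecs[0], cmp2);
}
```

Some platforms also provide qsort variants that take an extra context argument (qsort_r, qsort_s), but their signatures differ between implementations, so check your own library before relying on one.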
Lately I was programming (in C) and I realized that my code would be simpler if I could write my own loop function. I needed to run a piece of code a number of times (the pieces of code vary throughout the program), but I have no idea how to take a piece of code as an argument to my function.
For instance, take the for(){"X"} loop: its output may vary depending on "X", so we could somehow refer to "X" as an argument to the function.
Although I solved the problem in my code without defining the new function, it led to a more general question whose answer I couldn't find online: is there a way to use a variable piece of code in a function (in the same manner that for() does)?
Edit: Here is a similar problem that I found online. However my question is more generalized than this one.
You can put code in functions and pass a function (as a pointer) to other functions (or use it in a loop), like this:
#include <stdio.h>
static int add(int a, int b)
{
    return a + b;
}
static int multiply(int a, int b)
{
    return a * b;
}
/* The third argument to g, f, is a pointer to a function that takes two int
   parameters and returns an int.
*/
static void g(int a, int b, int (*f)(int a, int b), const char *name)
{
    // This uses the pointer f to call the function.
    printf("The %s of %d and %d is %d.\n", name, a, b, f(a, b));
}
int main(void)
{
    // These pass the function add or multiply to g.
    g(3, 4, add, "sum");
    g(3, 4, multiply, "product");
}
C does not have a lot of flexibility about this. The functions involved should mostly have the same signature (take the same types of parameters and have the same return type). There is some flexibility available, using variable argument lists or by casting to different function types, but, when using pointers to functions, you should generally seek a uniform signature.
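For the "own loop function" in the question, one common way to keep a uniform signature while still letting each body work on its own data is to pass a void * context alongside the function pointer. A minimal sketch, in which repeat, add_index, print_index and the context convention are all hypothetical names chosen for illustration:

```c
#include <stdio.h>

/* Calls body(i, ctx) for i = 0 .. n-1. */
static void repeat(int n, void (*body)(int i, void *ctx), void *ctx)
{
    for (int i = 0; i < n; i++)
        body(i, ctx);
}

/* One possible loop body: adds the index to a running total. */
static void add_index(int i, void *ctx)
{
    int *total = ctx;
    *total += i;
}

/* Another loop body: prints the index with a caller-supplied label. */
static void print_index(int i, void *ctx)
{
    const char *label = ctx;
    printf("%s %d\n", label, i);
}
```

With int total = 0; the call repeat(5, add_index, &total); leaves 10 in total, and repeat(3, print_index, "step"); prints three labelled lines. Every body has the same signature; only the data behind ctx differs.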
I'd like to use C's qsort() function to sort arrays each having different types, like these:
int a[] = {1, 2, 3};
const char *b[] = {"foo", "bar", "bas"};
my_defined_type_t *c[100]; for (i=0; i<100; i++) { fill(c[i]); }
Is it necessary to write comparison functions for each type, like intComparator(), stringComparitor(), myDefinedTypeComparitor() and make calls to qsort with each comparison function in turn, or can something like this be done in C:
int myGrandUnifiedComparisonFunction(const void* a, const void* b) {
if *a, *b are integers: intComparatorCode;
if *a, *b are strings: stringComparitorCode;
if *a, *b are my_defined_type_t's: myDefinedTypeComparitorCode;
/* etc. */
}
There are two problems to consider:
1) The Problem of Information
Your comparison function gets handed two void pointers. That's just some bit patterns which could mean anything. C attaches no information to, for example, floats or character pointers, so it's impossible to tell if some piece of data is the one or the other if you don't know beforehand.
That said, you can attach this information yourself by wrapping your data in a struct together with an enum value telling you what's inside. But you wouldn't technically be comparing floats or char pointers, but wrapped floats and wrapped char pointers. Something like:
typedef enum { Float, String, MyType } typ;
typedef struct {
    typ t;
    union {
        float f;
        char *s;
        myType mt;
    } u;
} wrappedData;
Then you can just write one function which compares wrappedData *.
That's just about what every dynamic language does.
And then, even your grand unified function would still have to compare them appropriately, that is, differently for each type, so you wouldn't have gained much. On the contrary, you would mould logic together into one function which doesn't really belong together.
2) The Problem of Efficiency
While this may not bother you, unwrapping a pointer and checking its type would be done with every single comparison operation, which may increase the runtime of your sort by a lot.
Conclusion:
You'd have to go some way and wrap your data, for a dubious advantage and a significant disadvantage (efficiency). Don't do it.
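To make the trade-off concrete, here is a minimal sketch of what comparing wrapped values might look like. The names (wrapped, tag, compare_wrapped) are hypothetical, only floats and strings are handled, and ordering values of different types by their tag is an arbitrary choice made for this sketch.

```c
#include <string.h>

typedef enum { TAG_FLOAT, TAG_STRING } tag;

typedef struct {
    tag t;              /* says which union member is valid */
    union {
        float f;
        const char *s;
    } u;
} wrapped;

/* qsort-compatible comparison for arrays of wrapped values. */
static int compare_wrapped(const void *a, const void *b)
{
    const wrapped *x = a;
    const wrapped *y = b;

    /* Arbitrary rule for mixed types: order by tag. */
    if (x->t != y->t)
        return (x->t > y->t) - (x->t < y->t);

    switch (x->t) {
    case TAG_FLOAT:
        return (x->u.f > y->u.f) - (x->u.f < y->u.f);
    case TAG_STRING:
        return strcmp(x->u.s, y->u.s);
    }
    return 0;
}
```

The tag check and the switch run on every single comparison, which is exactly the overhead described under the efficiency problem.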
C has no introspection, so there's no way of knowing what type a void* is pointing to.
You need one comparison function per type, and have to call qsort with the correct callback.
Your idea of
int myGrandUnifiedComparisonFunction(const void* a, const void* b) {
if *a, *b are integers: intComparatorCode;
if *a, *b are strings: stringComparitorCode;
if *a, *b are my_defined_type_t's: myDefinedTypeComparitorCode;
/* etc. */
}
Is pretty great. Have you tried implementing it?
The problem is that there is no way of implementing it in C or C++. There is no way of determining what kind of variable the void* points to.
No, you cannot have a generic function, because types are not passed in C at runtime (unlike in object-oriented languages). The type has to be known at compile time.
Thus you need to have a function that knows how to compare each type and tell qsort that.
I guess you can if you somehow magically know the types, but why bother? Also, you've got the size to pass in as well so you'd need to do the check in 2 places.
Not sure what the advantage of this is.
I am implementing simple library for lists in C, and I have a problem with writing find function.
I would like my function to accept any type of argument to find, both:
find(my_list, 3) and find(my_list, my_int_var_to_find).
I already know the type of the list's elements.
So far I've found a couple of ways of dealing with this:
different functions with a suffix for each type: int findi(void* list, int i), int findd(void* list, double d). I don't like this approach; it seems redundant to me and makes the API confusing.
using union:
typedef union {
int i;
double d;
char c;
...
} any_type;
but this way I force the user both to know about the any_type union and to create one before calling find. I would like to avoid that.
using a variadic function: int find(void* list, ...). I like this approach. However, I am concerned that there is no restriction on the number of arguments. The user is free to write int x = find(list, 1, 2.0, 'c'), although I don't know what that should mean.
I have seen also answer to this question: C : send different structures for one function argument but it's irrelevant, because I want to accept non-pointer arguments.
What is the proper way of handling this function?
You could instead try implementing your function similar to a generic function like bsearch, which can perform a binary search on an array of any data type:
void *bsearch(const void *key, const void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *))
Rather than hard-coding the different implementations for different data types inside your function, you instead pass a pointer to a function which will do the type-dependent operation, and only it knows the underlying implementation. In your case, that could be some sort of traversal/iteration function.
The other thing bsearch needs to know (apart from the obvious - search key and array length) is the size of each element in the array, so that it can calculate the address of each element in the array and pass it to the comparison function.
If you had a finite list of types that were to be operated on, there's nothing wrong with having a family of findX() functions. The above method requires a function for each data type to be passed to the bsearch function, however one of the main differences is that common functionality doesn't need to be repeated and the generic function can be used for any data type.
I wouldn't really say there's any proper way to do this, it's up to you and really depends on the problem you're trying to solve.
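A minimal sketch of the bsearch-style idea applied to a list, assuming a singly linked list whose nodes hold a pointer to the element. The node layout and the names list_find and cmp_int_key are hypothetical, not part of any existing library.

```c
#include <stddef.h>

/* Hypothetical list node holding a pointer to caller-owned data. */
typedef struct node {
    void *data;
    struct node *next;
} node;

/* Returns the index of the first element for which cmp(key, element)
   is 0, or -1 if there is none. */
static int list_find(const node *head, const void *key,
                     int (*cmp)(const void *, const void *))
{
    int i = 0;
    for (const node *n = head; n != NULL; n = n->next, i++) {
        if (cmp(key, n->data) == 0)
            return i;
    }
    return -1;
}

/* Comparison for lists whose elements are ints. */
static int cmp_int_key(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}
```

A literal can still be searched for without naming a variable by using a C99 compound literal, for example list_find(head, &(int){3}, cmp_int_key).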
I am not sure whether answering my own question is polite, but I want your opinion.
I tried to solve this problem using va_list. Why so? Because this way I can write only one function. Please, mind that I know what type the argument should be. This way I can do this:
int find(void* list, ...) {
    any_type object = {0};
    int i = -1;
    va_list args;
    va_start(args, list);
    switch(type_of_elem(list)) {
        case INT: object.i = va_arg(args, int); break;
        ...
    }
    /* now &object is a pointer to memory ready for comparison,
     * e.g. using memcmp */
    va_end(args);
    return i;
}
The advantage of this solution is that I can wrap presented switch-case and reuse it with other functions.
After researching my concern about the unlimited number of arguments a little more, I realized that printf has no such limit either. You can write printf("%d", 1, 2, 3).
But I tweaked my solution with an additional macro:
#define find_(list, object) find((list), (object))
This produces an error message at compile time when three arguments are passed, saying that the find_ macro expects 2 arguments, not 3.
What do you think about it? Do you think this is better solution than previously suggested?
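For reference, a self-contained sketch of how the variadic approach might look once filled in. It assumes an array-backed list that records its element type; typed_list, elem_type and the ELEM_ constants are hypothetical stand-ins for the real list and its type_of_elem lookup.

```c
#include <stdarg.h>
#include <stddef.h>

typedef enum { ELEM_INT, ELEM_DOUBLE, ELEM_CHAR } elem_type;

/* Hypothetical array-backed list that records its element type. */
typedef struct {
    elem_type type;
    size_t len;
    const void *items;
} typed_list;

/* Returns the index of the first element equal to the key, or -1. */
static int find(const typed_list *l, ...)
{
    int result = -1;
    va_list args;

    va_start(args, l);
    switch (l->type) {
    case ELEM_INT: {
        int key = va_arg(args, int);
        const int *p = l->items;
        for (size_t i = 0; i < l->len && result < 0; i++)
            if (p[i] == key)
                result = (int)i;
        break;
    }
    case ELEM_DOUBLE: {
        double key = va_arg(args, double);
        const double *p = l->items;
        for (size_t i = 0; i < l->len && result < 0; i++)
            if (p[i] == key)
                result = (int)i;
        break;
    }
    case ELEM_CHAR: {
        /* A char argument is promoted to int when passed through "...". */
        char key = (char)va_arg(args, int);
        const char *p = l->items;
        for (size_t i = 0; i < l->len && result < 0; i++)
            if (p[i] == key)
                result = (int)i;
        break;
    }
    }
    va_end(args);
    return result;
}

/* Fixes the argument count at exactly two. */
#define find_(list, object) find((list), (object))
```

One caveat with this design: the compiler cannot check the argument against the list's element type. Calling find_(&doubles, 3) passes an int, and reading it back with va_arg(args, double) is undefined behavior, so the caller has to write 3.0.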
I am trying to write a function that takes an array of variable size in C.
void sort(int s, int e, int arr[*]){
...
}
It says that for variable length arrays, it needs to be bounded in the function declaration. What does that mean? I am using xcode 4.0, with the LLVM compiler 2.0.
Thanks for the help.
As I see that no one answers the real question, here I give mine.
In C99 you have variable length arrays (VLA) that are declared with a length that is evaluated at run time, and not only at compile time as in previous versions of C. But passing arrays to functions is a bit tricky.
A one dimensional array is always just passed as a pointer so
void sort(size_t n, int arr[n]) {
}
is equivalent to
void sort(size_t n, int *arr){
}
Higher dimensions, however, are passed through to the function, so
void toto(size_t n, size_t m, int arr[n][m]){
}
is equivalent to
void toto(size_t n, size_t m, int (*arr)[m]){
}
With such a definition in the inside of such a function you can access the elements with expressions as arr[i][j] and the compiler knows how to compute the correct element.
Now comes the syntax that you discovered, which is only useful for prototypes, that is, places where you forward-declare the interface of the function:
void toto(size_t, size_t, int arr[*][*]);
so here you may replace the array dimensions by * as placeholders. But this is only useful when you don't have the names of the dimensions at hand, and it is much clearer to use exactly the same version as for the definition:
void toto(size_t n, size_t m, int arr[n][m]);
In general, to use this consistently it is important that the dimensions come first in the parameter list. Otherwise they would not be known when the compiler parses the declaration of arr.
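A small compilable sketch of a two-dimensional VLA parameter, with the dimensions placed first as described. The function name sum2d is hypothetical. Note that VLAs became an optional feature in C11, so a compiler may define __STDC_NO_VLA__ and reject this.

```c
#include <stddef.h>

/* Prototype: the dimensions come first so that arr[n][m] can refer to them. */
long sum2d(size_t n, size_t m, int arr[n][m]);

/* Sums every element of an n-by-m array. */
long sum2d(size_t n, size_t m, int arr[n][m])
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < m; j++)
            total += arr[i][j];   /* the compiler knows the row length is m */
    return total;
}
```

A caller with int a[2][3] simply writes sum2d(2, 3, a); the array decays to a pointer to its first row, which has type int (*)[3].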
If you're not using the C99 variable length arrays (it appears you are, so see below), the usual solution is to pass in a pointer to the first element, along with any indexes you want to use for accessing the elements.
Here's a piece of code that prints out a range of an array, similar to what you're trying to do with your sort.
#include <stdio.h>
static void fn (int *arr, size_t start, size_t end) {
    size_t idx;
    for (idx = start; idx <= end; idx++) {
        printf ("%d ", arr[idx]);
    }
    putchar ('\n');
}
int main (void) {
    int my_array[] = {9, 8, 7, 6, 5, 4, 3, 2, 1, 0};
    fn (my_array, 4, 6);
    return 0;
}
This outputs elements four through six inclusive (zero-based), giving:
5 4 3
A couple of points to note.
Using my_array in that function call to fn automatically "decays" the array into a pointer to its first element. This actually happens under most (not all) circumstances when you use arrays, so you don't have to explicitly state &(my_array[0]).
C already has a very good sort function built in to the standard library, called qsort. In many cases, that's what you should be using (unless either you have a specific algorithm you want to use for sorting, or you're doing a homework/self-education exercise).
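As a sketch of how qsort might replace a hand-written sort over a start/end range like the one in the question (sort_range and cmp_int are hypothetical names, and end is assumed to be inclusive, as in the printing example):

```c
#include <stdlib.h>

/* Ascending comparison for ints, in the form qsort expects. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sorts arr[start] .. arr[end] inclusive, leaving the rest untouched. */
static void sort_range(int *arr, size_t start, size_t end)
{
    if (end < start)
        return;
    qsort(arr + start, end - start + 1, sizeof arr[0], cmp_int);
}
```

The comparison returns (x > y) - (x < y) rather than x - y so that it cannot overflow for large values.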
If you are using real VLAs, you should be aware that the [*] construct is only valid in the function prototype, not in an actual definition of the function.
So, while:
void xyzzy(int, int[*]);
is valid, the following is not:
void xyzzy(int sz, int plugh[*]) { doSomething(); }
That's because, while you don't need the size parameter in the prototype, you do very much need it in the definition. And, since you have it, you should just use it:
void xyzzy(int sz, int plugh[sz]) { doSomething(); }
The gcc compiler actually has a reasonably clear error message for this, far better than the "needs to be bounded in the function declaration" one you saw:
error: ‘[*]’ not allowed in other than function prototype scope
What you want to do is make your argument an int *; pass in the length of the array (which the caller presumably knows, but this routine does not) as a separate argument. You can pass an array as such an argument.
The usage of * inside of array brackets for variable-length arrays is limited to prototypes, and serves merely as a placeholder. When the function is later defined, the array's size should be stored in a variable available at either file scope or as one of the parameters. Here's a simple example:
void foo(int, int[*]);
/* asterisk is placeholder */
void foo(int size, int array[size]) {
/* note size of array is specified now */
}
There are the following declarations:
void qsort(void *lineptr[], int left, int right, int (*comp)(void *, void *));
int numcmp(char *, char *);
int strcmp(char *s, char *t);
Then, somewhere in the program there is the following call:
qsort((void**) lineptr, 0, nlines-1,
(int (*)(void*,void*))(numeric ? numcmp : strcmp));
(Ignore the first three arguments and numeric).
I ask what is this:
(int (*)(void*,void*))(numeric ? numcmp : strcmp)
I understand that qsort is expecting a "pointer to function that gets two void pointers and returns an int" as its 4th argument, but how does what's written above satisfy that?
It seems to me like some sort of cast because it is made of two parenthesized parts, but that would be a very odd cast, because it takes a function and makes this function a "pointer to function that gets two void pointers and returns an int", which is meaningless.
(I followed here the rule that a type in parentheses before a variable promotes the variable to that type).
So I think I just get it wrong, maybe someone can tell me how to read this, what's the order?
What's happening here is indeed a cast. Let's ignore the ternary for a second and pretend that numcmp is always used. For the purpose of this question, functions can act as function pointers in C. So if you look at the type of numcmp, it is actually
(int (*)(char*,char*))
In order for this to be properly used in qsort it needs to have void* parameters. Because the types here all have the same size with respect to parameters and return types, it's possible to substitute one for the other. All that's needed is a cast to make the compiler happy.
(int (*)(void*,void*))(numcmp )
You've missed the trick here - the portion
(numeric ? numcmp : strcmp)
is using the ternary operator to choose which function is being called inside of qsort. If the data is numeric, it uses numcmp. If not, it uses strcmp. A more readable implementation would look like this:
int (*comparison_function)(void*,void*) =
(int (*)(void*,void*))(numeric ? numcmp : strcmp);
qsort((void**) lineptr, 0, nlines-1, comparison_function);
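The conditional expression is not tied to the cast. When both functions already have the type the caller expects, the same run-time selection works with no cast at all. A small sketch using the standard qsort, with hypothetical names ascending, descending and sort_ints_dir:

```c
#include <stdlib.h>

static int ascending(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

static int descending(const void *a, const void *b)
{
    return ascending(b, a);
}

/* Sorts n ints, choosing the comparison function at run time. */
static void sort_ints_dir(int *a, size_t n, int reverse)
{
    /* Both operands have the same function type, so the conditional
       expression yields a pointer of that type and no cast is needed. */
    int (*cmp)(const void *, const void *) = reverse ? descending : ascending;
    qsort(a, n, sizeof a[0], cmp);
}
```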
As others have pointed out, for
(int (*)(void*,void*))(numeric ? numcmp : strcmp)
then the following is a type cast
(int (*)(void*,void*))
and the expression is
(numeric ? numcmp : strcmp)
C declarations can be quite difficult to read, but it is possible to learn. The method is to start at the inner part and then go right one step, then left one step, continuing right, left, right, left, etc. outwards until finished. You do not cross outside a pair of parentheses before everything inside has been evaluated. For instance, for the type cast above, (*) indicates this is a pointer. The pointer was the only thing inside the parentheses, so then we evaluate to the right side outside it. (void*,void*) indicates that it is a pointer to a function with two pointer arguments. Finally, int indicates the return type of the function. The outer parentheses make this a type cast.
Update: two more detailed articles: The Clockwise/Spiral Rule and Reading C Declarations: A Guide for the Mystified.
However, the good news is that although the above is extremely useful to know, there is an extremely simple way to cheat: the cdecl program can convert from C to English description and vice versa:
cdecl> explain (int (*)(void*,void*))
cast unknown_name into pointer to function (pointer to void, pointer to void) returning int
cdecl> declare my_var as array 5 of pointer to int
int *my_var[5]
cdecl>
Exercise: What kind of variable is i?
int *(*(*i)[])(int *)
Answer in rot13 in case you do not have cdecl installed on your machine (but you really should!):
pqrpy> rkcynva vag *(*(*v)[])(vag *)
qrpyner v nf cbvagre gb neenl bs cbvagre gb shapgvba (cbvagre gb vag) ergheavat cbvagre gb vag
pqrpy>
You can do it without the function pointer cast. Here's how. In my experience, in most places, if you are using a cast, you are doing it wrong.
Note that the standard definition of qsort() includes const:
void qsort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *));
Note that the string comparator is given two 'char **' values, not 'char *' values.
I write my comparators so that casts are unnecessary in the calling code:
#include <stdlib.h> /* qsort() */
#include <string.h> /* strcmp() */
int num_cmp(const void *v1, const void *v2)
{
    int i1 = *(const int *)v1;
    int i2 = *(const int *)v2;
    if (i1 < i2)
        return -1;
    else if (i1 > i2)
        return +1;
    else
        return 0;
}
int str_cmp(const void *v1, const void *v2)
{
    const char *s1 = *(const char **)v1;
    const char *s2 = *(const char **)v2;
    return(strcmp(s1, s2));
}
Forcing people to write casts in the code using your functions is ugly. Don't.
The two functions I wrote match the function prototype required by the standard qsort(). The name of a function when not followed by parentheses is equivalent to a pointer to the function.
You will find in older code, or code written by those who were brought up on older compilers, that pointers to functions are used using the notation:
result = (*pointer_to_function)(arg1, arg2, ...);
In modern style, that is written:
result = pointer_to_function(arg1, arg2, ...);
Personally, I find the explicit dereference clearer, but not everyone agrees.
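For completeness, a sketch of how calls using comparators like these might look. The comparators are repeated (as static functions) so that the block compiles on its own; sort_ints and sort_strings are hypothetical helper names, and the string comparator here reads its arguments through const char *const * so that it does not cast away const.

```c
#include <stdlib.h>
#include <string.h>

static int num_cmp(const void *v1, const void *v2)
{
    int i1 = *(const int *)v1;
    int i2 = *(const int *)v2;
    return (i1 > i2) - (i1 < i2);
}

/* Each argument points at an array element, i.e. at a char pointer. */
static int str_cmp(const void *v1, const void *v2)
{
    const char *s1 = *(const char *const *)v1;
    const char *s2 = *(const char *const *)v2;
    return strcmp(s1, s2);
}

/* Sorts n ints; the comparator is passed by name, with no cast. */
static void sort_ints(int *a, size_t n)
{
    qsort(a, n, sizeof a[0], num_cmp);
}

/* Sorts an array of n string pointers. */
static void sort_strings(const char **a, size_t n)
{
    qsort(a, n, sizeof a[0], str_cmp);
}
```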
Whoever wrote that code snippet was trying to be too clever. In his mind, he probably thinks he is being a good programmer by making a clever "one-liner". In reality, he is making code that is less readable and is obnoxious to work with over the long term and should be rewritten in a more obvious form similar to Harper Shelby's code.
Remember the adage from Brian Kernighan:
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
I do lots of performance critical coding with hard real time deadlines... and I have still not seen a place where a dense one-liner is appropriate.
I have even messed around with compiling and checking the asm to see if the one-liner has a better compiled asm implementation but have never found the one-liner to be worth it.
I would probably read it like this:
typedef int (*PFNCMP)(void *, void *);
PFNCMP comparison_function;
if (numeric)
{
    comparison_function = (PFNCMP)numcmp;
}
else
{
    comparison_function = (PFNCMP)strcmp;
}
qsort((void**) lineptr, 0, nlines-1, comparison_function);
The example in the question has an explicit cast.
Your logic is correct, I think. It is indeed casting to "pointer to function that gets two void pointers and returns an int", which is the type required by the function signature.
Both numcmp and strcmp are functions which take two char * parameters and return an int (used in an expression, each decays to a pointer to function). The qsort routine expects a pointer to a function that takes two void * parameters and returns an int. Hence the cast. This works in practice because void * and char * share the same representation, although strictly speaking the C standard leaves calling a function through a pointer of an incompatible type undefined. Now, on to reading the declaration: Let's take your strcmp's declaration:
int strcmp(char *, char *);
The compiler reads it as strcmp is actually:
int (strcmp)(char *, char *)
a function (decaying to a pointer to a function in most cases) which takes two char * arguments. The type of the pointer strcmp is therefore:
int (*)(char *, char *)
Hence, when you need to cast another function to be compatible to strcmp you'd use the above as the type to cast to.
Similarly, qsort's comparator argument takes two void *s, and thus the odd cast!