c comparator best practice for when more arguments needed - c

I intend to sort an array of multidimensional points, each time by a different coordinate.
I want to use C's qsort() function, but its comparator takes only two pointers as input, so I can't pass it the coordinate to sort by.
I came up with two solutions and am struggling to choose between them:
1. Use a static variable (an int in this example), initialize it to -1, and set it to the wanted coordinate before calling qsort. The comparator then reads this variable and compares based on it.
2. Build a new struct that holds a pointer to the point and the desired coordinate, then have the comparator sort pointers to such structs and use the additional info from the struct.
The first sounds like a quick solution, though it might be a loophole, while the second sounds like overkill for such a simple task.
I would be glad to hear of any better solution to the problem, if there is one.

If your system has qsort_r, you can use that. qsort_r has an extra parameter that you can use to pass your coordinate in.
int comparator(const void *l, const void *r, void *param)
{
const Point* lPoint = l; // keep const: the comparator must not modify elements
const Point* rPoint = r;
const Coordinate* coord = param;
// Do your comparison ....
}
void mySort(Point* list, size_t listSize, Coordinate sortCoord)
{
qsort_r(list, listSize, sizeof(Point), comparator, &sortCoord);
}
It's definitely available if you are using glibc (e.g. on Linux) and on OS X. However, it is not officially portable. See this answer on portability
How portable is the re-entrant qsort_r function compared to qsort?
My code example is written for the Linux version. With OS X, the comparator must be declared as
int comparator( void *param, const void *l, const void *r)
and the qsort_r call is
qsort_r(list, listSize, sizeof(Point), &sortCoord, comparator);

The most portable version would probably indeed be to encapsulate the comparison object(s) ("functors") in a separate file, then use a static file scope variable to change the behavior of the function.
something.h
void set_something (something_t s);
int compare_something (const void* obj1, const void* obj2);
something.c
#include "something.h"
static something_t something = 0;
void set_something (something_t s)
{
something = s;
}
int compare_something (const void* obj1, const void* obj2)
{
const coord_t* c1 = obj1;
const coord_t* c2 = obj2;
// do stuff with c1, c2 based on something
}
The only down-side with this is that you can't have multiple threads performing different kinds of sorting at the same time, but that's not likely to be an issue. (If it is, design some way to pass a "something" variable as parameter to each thread callback instead.)
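Applied to the points question, the file-scope approach might look like the minimal sketch below. The `Point` layout, `DIMS`, and the names `set_sort_axis`, `compare_points` and `sort_points_by_axis` are assumptions made for illustration; none of them come from the question.

```c
#include <stdlib.h>

#define DIMS 3  /* assumed number of coordinates per point */

typedef struct {
    double coord[DIMS];
} Point;

/* File-scope state read by the comparator; not thread-safe. */
static int sort_axis = 0;

void set_sort_axis(int axis)
{
    sort_axis = axis;
}

int compare_points(const void *l, const void *r)
{
    const Point *lp = l;
    const Point *rp = r;
    double a = lp->coord[sort_axis];
    double b = rp->coord[sort_axis];
    return (a > b) - (a < b);  /* -1, 0 or 1 */
}

/* Select the axis, then sort. The two steps must not be interleaved
   with another sort that uses a different axis. */
void sort_points_by_axis(Point *pts, size_t n, int axis)
{
    set_sort_axis(axis);
    qsort(pts, n, sizeof *pts, compare_points);
}
```

Wrapping both steps in one function keeps callers from forgetting to set the axis before sorting.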

I suggest an option #3
C11 introduces qsort_s(), which provides the needed context parameter. It is specified in Annex K, which is normative but optional, so a C11-conforming compiler need not provide it.
The availability of this function may not be wide.
K.3.6.3.2 The qsort_s function
errno_t qsort_s(void *base, rsize_t nmemb, rsize_t size,
int (*compar)(const void *x, const void *y,
void *context), void *context);
... The third argument to the comparison function is the
context argument passed to qsort_s. The sole use of context by qsort_s is to pass it to the comparison function
If this function is not available, consider an alternative that mimics this interface.
This idea is similar to @JeremyP's answer, which suggests qsort_r. The two functions are similar, apart from the argument order among the qsort_r() variants and the return type.
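If neither qsort_s nor qsort_r is available, an alternative that mimics the interface could look roughly like this. It is only a sketch: `sort_ctx`, `ctx_compar` and `compare_by_axis` are made-up names, and a plain insertion sort stands in for a real sorting algorithm to keep the example short.

```c
#include <stddef.h>

typedef int (*ctx_compar)(const void *x, const void *y, void *context);

/* Swap two elements of `size` bytes each, one byte at a time. */
static void swap_bytes(unsigned char *a, unsigned char *b, size_t size)
{
    while (size--) {
        unsigned char t = *a;
        *a++ = *b;
        *b++ = t;
    }
}

/* Insertion sort with the same parameter order as qsort_s.
   O(n^2), so only suitable for small arrays. */
void sort_ctx(void *base, size_t nmemb, size_t size,
              ctx_compar compar, void *context)
{
    unsigned char *p = base;
    for (size_t i = 1; i < nmemb; i++) {
        for (size_t j = i; j > 0; j--) {
            unsigned char *cur = p + j * size;
            unsigned char *prev = cur - size;
            if (compar(prev, cur, context) <= 0)
                break;
            swap_bytes(prev, cur, size);
        }
    }
}

/* Example comparator: each element is an array of doubles and the
   context points at the index of the coordinate to compare. */
int compare_by_axis(const void *x, const void *y, void *context)
{
    const double *a = x;
    const double *b = y;
    size_t axis = *(const size_t *)context;
    return (a[axis] > b[axis]) - (a[axis] < b[axis]);
}
```

Because the context travels through the call, two threads can sort by different coordinates at the same time.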

Related

Can I use memcmp along with qsort?

I am making a C dynamic array library, kind of. Note that I'm doing it for fun in my free time, so please do not recommend any of the millions of existing libraries.
I started implementing sorting. The array is of arbitrary element size, defined as struct:
typedef struct {
//[PRIVATE] Pointer to array data
void *array;
//[READONLY] How many elements are in array
size_t length;
//[PRIVATE] How many elements can further fit in array (allocated memory)
size_t size;
//[PRIVATE] Bytes per element
size_t elm_size;
} Array;
I originally prepared this to start with the sort function:
/** sorts the array using provided comparator method
* if method not provided, memcmp is used
* Comparator signature
* int my_comparator ( const void * ptr1, const void * ptr2, size_t type_size );
**/
void array_sort(Array* a, int(*comparator)(const void*, const void*, size_t)) {
if(comparator == NULL)
comparator = &memcmp;
// Sorting algorithm should follow
}
However I learned about qsort:
void qsort (void* base, size_t num, size_t size, int (*compar)(const void*,const void*));
Apparently, I could just pass my internal array to qsort. I could just call that:
qsort (a->array, a->length, a->elm_size, comparator_callback);
But there's a catch - qsort's comparator signature reads as:
int (*compar)(const void*,const void*)
While memcmp's signature is:
int memcmp ( const void * ptr1, const void * ptr2, size_t type_size );
The element size is missing in qsort's callback, meaning I can no longer have a generic comparator function when NULL is passed as callback. I could manually generate comparators up to X bytes of element size, but that sounds ugly.
Can I use qsort (or another built-in sort) along with memcmp? Or do I have to choose between a built-in comparator and a built-in sorting function?
C11 provides you with an (admittedly optional) qsort_s function, which is intended to deal with this specific situation. It allows you to pass-through a user-provided void * value - a context pointer - from the calling code to the comparator function. The comparator callback in this case has the following signature
int (*compar)(const void *x, const void *y, void *context)
In the simplest case you can pass a pointer to the size value as context
#define __STDC_WANT_LIB_EXT1__ 1
#include <stdlib.h>
#include <string.h> // for memcmp
...
int comparator_callback(const void *x, const void *y, void *context)
{
size_t elm_size = *(const size_t *) context;
return memcmp(x, y, elm_size);
}
...
qsort_s(a->array, a->length, a->elm_size, comparator_callback, &a->elm_size);
Or it might make sense to pass a pointer to your entire array object as context.
Some *nix-based implementations have been providing a similar qsort_r function for a while, although it is non-standard.
A non-thread-safe way is to use a private global variable to pass the size.
static size_t compareSize = 0;
int defaultComparator(const void *p1, const void *p2) {
return memcmp(p1, p2, compareSize);
}
void array_sort(Array* a, int(*comparator)(const void*, const void*)) {
if(comparator == NULL) {
compareSize = a->elm_size; // read by defaultComparator
comparator = &defaultComparator;
}
qsort(a->array, a->length, a->elm_size, comparator);
}
You can make it thread-safe by making compareSize a thread-local variable (__thread in GCC, _Thread_local in C11).
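A minimal sketch of the thread-local variant, assuming a C11 compiler. The helper name `sort_bytewise` is invented here and is not part of the asker's library.

```c
#include <stdlib.h>
#include <string.h>

/* One copy per thread, so concurrent sorts do not interfere.
   _Thread_local is C11; __thread is the older GCC/Clang spelling. */
static _Thread_local size_t compare_size = 0;

static int default_comparator(const void *p1, const void *p2)
{
    return memcmp(p1, p2, compare_size);
}

/* Hypothetical helper: sort `count` elements of `elm_size` bytes
   in bytewise (memcmp) order. */
void sort_bytewise(void *data, size_t count, size_t elm_size)
{
    compare_size = elm_size;
    qsort(data, count, elm_size, default_comparator);
}
```

This is still not re-entrant: a comparator that itself started another bytewise sort on the same thread would overwrite `compare_size`.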
The qsort() API is a legacy of simpler times. There should be an extra "environment" pointer passed unaltered from the qsort() call to each comparison. That would allow you to pass the object size and any other necessary context in a thread safe manner.
But it's not there. @BryanChen's method is the only reasonable one.
The main reason I'm writing this answer is to point out that there are very few cases where memcmp will do something useful. There are not many kinds of objects where comparison by lexicographic order of constituent unsigned chars makes any sense.
Certainly comparing structs that way is dangerous because padding byte values are unspecified. Even the equality part of the comparison can fail. In other words,
struct foo { int i; };
void bar(void) {
struct foo a, b;
a.i = b.i = 0;
if (memcmp(&a, &b, sizeof a) == 0) printf("equal!");
}
may - by the C standard - print nothing!
Another example: for something as simple as unsigned ints, you'll get different sort orders for big- vs. little-endian storage order.
unsigned a = 0x0102;
unsigned b = 0x0201;
printf("%s", memcmp(&a, &b, sizeof a) < 0 ? "Less!" : "More!");
will print Less or More depending on the machine where it's running.
Indeed the only object type I can imagine that makes sense to compare with memcmp is equal-sized blocks of unsigned bytes. This isn't a very common use case for sorting.
In all, a library that offers memcmp as a comparison function is doomed to be error prone. Someone will try to use it as a substitute for a specialized comparison that's really the only way to obtain the desired result.
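For comparison, a specialized comparator for the unsigned int case could be as small as the sketch below (`compare_unsigned` and `sort_unsigned` are illustrative names). It compares values rather than bytes, so the order does not depend on endianness.

```c
#include <stdlib.h>

/* Compare two unsigned ints by value; independent of byte order. */
int compare_unsigned(const void *p1, const void *p2)
{
    unsigned a = *(const unsigned *)p1;
    unsigned b = *(const unsigned *)p2;
    return (a > b) - (a < b);
}

void sort_unsigned(unsigned *values, size_t count)
{
    qsort(values, count, sizeof *values, compare_unsigned);
}
```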

Why are function pointers useful?

So, I was looking over function pointers, and in the examples I have seen, particularly in this answer here, they seem rather redundant.
For example, if I have this code:
int addInt(int n, int m) {
return n+m;
}
int (*functionPtr)(int,int);
functionPtr = &addInt;
int sum = (*functionPtr)(2, 3); // sum == 5
It seems that creating the function pointer has no purpose here. Wouldn't it be easier just to do this?
int sum = addInt(2, 3); // sum == 5
If so, then why would you need to use them, so what purpose would they serve? (and why would you need to pass function pointers to other functions)
Simple examples of pointers seem similarly useless. It's when you start doing more complicated things that it helps. For example:
// Elsewhere in the code, there's a sum_without_safety function that blindly
// adds the two numbers, and a sum_with_safety function that validates the
// numbers before adding them.
int (*sum_function)(int, int);
if(needs_safety) {
sum_function = sum_with_safety;
}
else {
sum_function = sum_without_safety;
}
int sum = sum_function(2, 3);
Or:
// This is an array of functions. We'll choose which one to call based on
// the value of index.
int (*sum_functions[])(int, int) = { ...a bunch of different sum functions... };
int (*sum_function)(int, int) = sum_functions[index];
int sum = sum_function(2, 3);
Or:
// This is a poor man's object system. Each number struct carries a table of
// function pointers for various operations; you can look up the appropriate
// function and call it, allowing you to sum a number without worrying about
// exactly how that number is stored in memory.
struct number {
struct {
int (*sum)(struct number *, int);
int (*product)(struct number *, int);
...
} * methods;
void * data;
};
struct number * num = get_number();
int sum = num->methods->sum(num, 3);
The last example is basically how C++ does virtual member functions. Replace the methods struct with a hash table and you have Objective-C's method dispatch. Like variable pointers, function pointers let you abstract things in valuable ways that can make code much more compact and flexible. That power, though, isn't really apparent from the simplest examples.
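The array-of-functions idea can be made concrete with a small dispatch table. This is only a sketch; the operation names and `apply_op` are invented for the example.

```c
typedef int (*binary_op)(int, int);

static int op_add(int a, int b) { return a + b; }
static int op_sub(int a, int b) { return a - b; }
static int op_mul(int a, int b) { return a * b; }

/* Operation codes double as indexes into the table. */
enum { OP_ADD, OP_SUB, OP_MUL, OP_COUNT };

static const binary_op ops[OP_COUNT] = { op_add, op_sub, op_mul };

/* Look up the operation and call it. Returns 0 on success,
   -1 if the code is out of range. */
int apply_op(int code, int a, int b, int *result)
{
    if (code < 0 || code >= OP_COUNT)
        return -1;
    *result = ops[code](a, b);
    return 0;
}
```

Adding an operation means adding one function and one table entry; `apply_op` itself never changes.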
They are one of the most useful things in C! They allow you to make a lot more modular software.
Callbacks
For example:
typedef void (*serial_data_callback)(int length, unsigned char* data);
static serial_data_callback on_data_received = NULL; // currently registered handler
void serial_port_data_received(serial_data_callback callback)
{
on_data_received = callback;
}
void data_received(int length, unsigned char* data)
{
if(on_data_received != NULL) on_data_received(length, data);
}
This means your code can use the general serial routines. You might then have two things that use serial, modbus and terminal:
serial_port_data_received(modbus_handle_data);
serial_port_data_received(terminal_handle_data);
and they can implement the callback function and do what's appropriate.
They allow for Object Oriented C code. It's a simple way to create "Interfaces" and then each concrete type might implement things different. For this, generally you will have a struct that will have function pointers, then functions to implement each function pointer, and a creation function that will setup the function pointers with the right functions.
typedef struct
{
void (*send)(int length, unsigned char* data);
} connection_t;
void connection_send(connection_t* self, int length, unsigned char* data)
{
if(self->send != NULL) self->send(length, data);
}
void serial_send(int length, unsigned char* data)
{
// send
}
void tcp_send(int length, unsigned char* data)
{
// send
}
void create_serial_connection(connection_t* connection)
{
connection->send = serial_send;
}
Then other code can use a connection_t without caring whether it's via serial, TCP, or anything else that you can come up with.
They reduce dependencies between modules. Sometimes a library must query the calling code for things (are these objects equal? Are they in a certain order?). But you can't hardcode a call to the proper function without making the library (a) depend on the calling code and (b) non-generic.
Function pointers provide the missing pieces of information while keeping the library module independent of any code that might use it.
They're indispensable when an API needs a callback back to the application.
Another use is for the implementation of event-emitters or signal handlers: callback functions.
What if you're writing a library in which the user inputs a function? Like qsort that can work on any type, but the user must write and supply a compare function.
Its signature is
void qsort (void* base, size_t num, size_t size,
int (*compar)(const void*,const void*));

Trying to understand function pointers in C

I am trying to understand function pointers and am struggling. I have seen the sorting example in K&R and a few other similar examples. My main problem is with what the computer is actually doing. I created a very simple program to try to see the basics. Please see the following:
#include <stdio.h>
int func0(int*,int*);
int func1(int*,int*);
int main(){
int i = 1;
myfunc(34,23,(int(*)(void*,void*))(i==1?func0:func1));//34 and 23 are arbitrary inputs
}
void myfunc(int x, int y, int(*somefunc)(void *, void *)){
int *xx =&x;
int *yy=&y;
printf("%i",somefunc(xx,yy));
}
int func0(int *x, int *y){
return (*x)*(*y);
}
int func1(int *x, int *y){
return *x+*y;
}
The program either multiplies or adds two numbers depending on some variable (i in the main function - should probably be an argument in the main). func0 multiplies two ints and func1 adds them.
I know that this example is simple but how is passing a function pointer preferable to putting a conditional inside the function myfunc?
i.e. in myfunc have the following:
if(i == 1)printf("%i",func0(xx,yy));
else printf("%i",func1(xx,yy));
If I did this the result would be the same but without the use of function pointers.
Your understanding of how function pointers work is just fine. What you're not seeing is how a software system will benefit from using function pointers. They become important when working with components that are not aware of the others.
qsort() is a good example. qsort will let you sort any array and is not actually aware of what makes up the array. So if you have an array of structs, or more likely pointers to structs, you would have to provide a function that could compare the structs.
struct foo {
char * name;
int magnitude;
int something;
};
int cmp_foo(const void *_p1, const void *_p2)
{
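// foos holds pointers, so qsort hands us pointers to those pointers
const struct foo *p1 = *(const struct foo * const *)_p1;
const struct foo *p2 = *(const struct foo * const *)_p2;
// compare without the overflow risk of plain subtraction
return (p1->magnitude > p2->magnitude) - (p1->magnitude < p2->magnitude);
}
struct foo ** foos;
// init 10 foo structures...
qsort(foos, 10, sizeof(struct foo *), cmp_foo);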
Then the foos array will be sorted based on the magnitude field.
As you can see, this allows you to use qsort for any type -- you only have to provide the comparison function.
Another common use of function pointers is callbacks, for example in GUI programming. If you want a function to be called when a button is clicked, you would provide a function pointer to the GUI library when setting up the button.
how is passing a function pointer preferable to putting a conditional inside the function myfunc
Sometimes it is impossible to put a condition there: for example, if you are writing a sorting algorithm, and you do not know what you are sorting ahead of time, you simply cannot put a conditional; function pointer lets you "plug in" a piece of computation into the main algorithm without jumping through hoops.
As far as how the mechanism works, the idea is simple: all your compiled code is located in the program memory, and the CPU executes it starting at a certain address. There are instructions to make CPU jump between addresses, remember the current address and jump, recall the address of a prior jump and go back to it, and so on. When you call a function, one of the things the CPU needs to know is its address in the program memory. The name of the function represents that address. You can supply that address directly, or you can assign it to a pointer for indirect access. This is similar to accessing values through a pointer, except in this case you access the code indirectly, instead of accessing the data.
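To make the direct-versus-indirect point concrete, here is a small sketch based on the question's multiply/add idea. The names `int_binop`, `combine` and `direct_and_indirect_agree` are made up, and the functions take `const int *` so that no cast is needed.

```c
typedef int (*int_binop)(const int *, const int *);

int multiply(const int *x, const int *y) { return *x * *y; }
int add(const int *x, const int *y) { return *x + *y; }

/* The caller decides which computation to plug in;
   this function only knows the signature. */
int combine(int x, int y, int_binop op)
{
    return op(&x, &y);   /* indirect call through the pointer */
}

/* A function name converts to the function's address, so the
   direct call and the call through p reach the same code. */
int direct_and_indirect_agree(void)
{
    int a = 34, b = 23;
    int_binop p = multiply;   /* same as &multiply */
    return multiply(&a, &b) == p(&a, &b);
}
```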
First of all, you should never call a function through a pointer to an incompatible function type. The cast itself is permitted (C11 6.3.2.3), but calling the function through the converted pointer is undefined behavior in C (C11 6.5.2.2).
A very important piece of advice when dealing with function pointers is to always use typedefs.
So, your code could/should be rewritten as:
typedef int (*func_t)(int*, int*);
int func0(int*,int*);
int func1(int*,int*);
int main(){
int i = 1;
myfunc(34,23, (i==1?func0:func1)); //34 and 23 are arbitrary inputs
}
void myfunc(int x, int y, func_t func){
To answer the question, you want to use function pointers as parameters when you don't know the nature of the function. This is common when writing generic algorithms.
Take the standard C function bsearch() as an example:
void *bsearch (const void *key,
const void *base,
size_t nmemb,
size_t size,
int (*compar)(const void *, const void *));
This is a generic binary search algorithm, searching through any form of one-dimensional array, containing unknown types of data, such as user-defined types. Here, the "compar" function compares two objects of unknown nature and returns a number to indicate their relative order.
"The function shall return an integer less than, equal to, or greater than zero if the key object is considered, respectively, to be less than, to match, or to be greater than the array element."
The function is written by the caller, who knows the nature of the data. In computer science, this is called a "function object" or sometimes "functor". It is commonly encountered in object-oriented design.
An example (pseudo code):
typedef struct // some user-defined type
{
int* ptr;
int x;
int y;
} Something_t;
int compare_Something_t (const void* p1, const void* p2)
{
const Something_t* s1 = (const Something_t*)p1;
const Something_t* s2 = (const Something_t*)p2;
return s1->y - s2->y; // some user-defined comparison relevant to the object
}
...
Something_t search_key = { ... };
Something_t array[] = { ... };
Something_t* result;
result = bsearch(&search_key,
array,
sizeof(array) / sizeof(Something_t), // number of objects
sizeof(Something_t), // size of one object
compare_Something_t // function object
);
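A compilable variant of the same idea might look like this. The `Record` type and the function names are assumptions made for the sketch, and the comparator avoids subtraction so that it cannot overflow.

```c
#include <stdlib.h>

typedef struct {
    int id;
    const char *name;
} Record;

/* Order records by id; usable for both sorting and searching. */
int compare_record_id(const void *p1, const void *p2)
{
    const Record *r1 = p1;
    const Record *r2 = p2;
    return (r1->id > r2->id) - (r1->id < r2->id);
}

/* Returns the matching record, or NULL if the id is absent.
   `records` must already be sorted by id. */
const Record *find_record(const Record *records, size_t count, int id)
{
    Record key = { id, NULL };
    return bsearch(&key, records, count, sizeof *records, compare_record_id);
}
```

bsearch never looks inside a `Record`; everything it knows about the ordering comes from the function pointer.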

Globals: Best option when callback function parameters don't provide enough information in C?

Let's take qsort()'s comparison callback function as an example
int (*compar)(const void *, const void *)
What happens when the result of the comparison function depends on the current value of a variable? It appears my only two options are to use a global var (yuck) or to wrap each element of the unsorted array in a struct that contains the additional information (double yuck).
With qsort() being a standard function, I'm quite surprised that it does not allow for additional information to be passed in; something along the lines of execv()'s NULL-terminated char *const argv[] argument.
The same thing can be applied to other functions which define a callback that leave no headroom for additional parameters, ftw() & nftw() being two others I've had this problem with.
Am I just "doing it wrong" or is this a common problem and chalked up to an oversight with these types of callback function definitions?
EDIT
I've seen a few answers which say to create multiple callback functions and determine which one is appropriate to pass to qsort(). I understand this method in theory, but how would you apply it in practice if say I wanted the comparison callback function to sort an array of ints depending on how close the element is to a variable 'x'?. It would appear that I would need one callback function for each possible value of 'x' which is a non-starter.
Here is a working example using the global variable 'x'. How would you suggest I do this via multiple callback functions?
#include <stdio.h>
#include <stdlib.h>
int bin_cmp(const void*, const void*);
int x;
int main(void)
{
int i;
int bins[6] = { 140, 100, 180, 80, 240, 120 };
x = 150;
qsort(bins, 6, sizeof(int), bin_cmp);
for(i=0; i < 6; i++)
printf("%d ", bins[i]);
return 0;
}
int bin_cmp(const void* a, const void* b)
{
int a_delta = abs(*(int*)a - x);
int b_delta = abs(*(int*)b - x);
if ( a_delta == b_delta )
return 0;
return a_delta < b_delta ? -1 : 1;
}
Output
140 180 120 100 80 240
Change the value of the function pointer. The whole point (no pun intended) of function pointers is that the user can pass in different functions under different circumstances. Rather than passing in a single function which acts differently based on external circumstances, pass in different functions based on the external circumstances.
This way, you only need to know about the values of variables in the context of the call to qsort() (or whatever function you're using), and write a couple different simpler comparison functions instead of one big one.
In response to your edit
To deal with the issue described in your update, just refer to the global variable by name in your comparison function. This will certainly work if the variable is at global (file) scope; more generally, it works for any variable that is visible where the comparison function is defined.
However, this approach won't work if you want to pass a value straight into the sorting process without putting it in a variable.
It sounds like you need the bsearch_s() and qsort_s() functions defined by TR 24731-1:
§6.6.3.1 The bsearch_s function
Synopsis
#define __STDC_WANT_LIB_EXT1__ 1
#include <stdlib.h>
void *bsearch_s(const void *key, const void *base,
rsize_t nmemb, rsize_t size,
int (*compar)(const void *k, const void *y, void *context),
void *context);
§6.6.3.2 The qsort_s function
Synopsis
#define __STDC_WANT_LIB_EXT1__ 1
#include <stdlib.h>
errno_t qsort_s(void *base, rsize_t nmemb, rsize_t size,
int (*compar)(const void *x, const void *y, void *context),
void *context);
Note that the interface has the context that you need.
Something rather close to this should be available in the MS Visual Studio system.
With qsort's signature being what it is, I think your options are pretty limited. You can use a global variable or wrap your elements in structs as you suggested, but I'm not aware of any good "clean" way of doing this in C. I imagine there are other solutions out there, but they won't be any cleaner than using global variables. If your application is single-threaded, I would say this is your best bet as long as you're careful with the globals.
I'd listen to tlayton, and wrap that logic in a function that returns a pointer to the appropriate comparison function.
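A sketch of that selector idea, with invented names (`choose_comparator`, `sort_ints`) and ascending/descending order standing in for whatever the real circumstances are:

```c
#include <stdlib.h>

typedef int (*compar_fn)(const void *, const void *);

static int cmp_ascending(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

static int cmp_descending(const void *a, const void *b)
{
    return cmp_ascending(b, a);
}

/* Pick the comparator once, outside the sort. */
compar_fn choose_comparator(int descending)
{
    return descending ? cmp_descending : cmp_ascending;
}

void sort_ints(int *values, size_t count, int descending)
{
    qsort(values, count, sizeof *values, choose_comparator(descending));
}
```

This only helps when the set of behaviours is small and fixed. It does not cover the "distance to x" case from the edit, where x can take any value.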
You can write your own specialized qsort implementation and customize it to your need.
To give you an idea about how short it is, this is a qsort implementation with your delta.
void swap(int *a, int *b)
{
int t=*a; *a=*b; *b=t;
}
void yourQsort(int arr[], int beg, int end, int delta)
{
if (end > beg + 1)
{
int piv = arr[beg], l = beg + 1, r = end;
while (l < r)
{
//Here use your var something like this
int a_delta = abs(arr[l] - delta);
int b_delta = abs(piv - delta);
if (a_delta <= b_delta)
l++;
else
swap(&arr[l], &arr[--r]);
}
swap(&arr[--l], &arr[beg]);
yourQsort(arr, beg, l, delta);
yourQsort(arr, r, end, delta);
}
}
More C-optimized implementations are said to be here.
You will have to prepare several function callbacks and check the value before running qsort, then send the correct one.
IMO, go with the global. The problem is that you're really asking qsort to do two things, both sort and map, where it's only designed to do the first.
The other thing you could do is break it down into a couple steps. First compute an array of these (one for each element of the source array):
struct sort_element {
int delta; // This is the delta value
int index; // This is the index of the value in the source array
};
Call qsort on that array, using a sort function that compares delta values. Then use the indexes in the sorted array to order your original array. This consumes some memory, but it may not matter that much. (Memory is cheap, and the only thing you have to store in the temporary array is the key, not the entire value.)
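The two-step approach could be sketched as follows. `sort_by_distance` is a made-up name, `index` is a `size_t` here rather than an `int`, and the comparator breaks ties on the original position so the result is deterministic (an addition that the answer does not mention).

```c
#include <stdlib.h>

struct sort_element {
    int delta;     /* distance of the value from x */
    size_t index;  /* position of the value in the source array */
};

static int cmp_delta(const void *a, const void *b)
{
    const struct sort_element *ea = a;
    const struct sort_element *eb = b;
    if (ea->delta != eb->delta)
        return (ea->delta > eb->delta) - (ea->delta < eb->delta);
    /* Tie-break on original position. */
    return (ea->index > eb->index) - (ea->index < eb->index);
}

/* Reorder `values` by closeness to x. Returns 0 on success,
   -1 if temporary memory could not be allocated. */
int sort_by_distance(int *values, size_t count, int x)
{
    if (count == 0)
        return 0;
    struct sort_element *keys = malloc(count * sizeof *keys);
    int *copy = malloc(count * sizeof *copy);
    if (!keys || !copy) {
        free(keys);
        free(copy);
        return -1;
    }
    for (size_t i = 0; i < count; i++) {
        keys[i].delta = abs(values[i] - x);  /* compute each key once */
        keys[i].index = i;
        copy[i] = values[i];
    }
    qsort(keys, count, sizeof *keys, cmp_delta);
    for (size_t i = 0; i < count; i++)
        values[i] = copy[keys[i].index];
    free(keys);
    free(copy);
    return 0;
}
```

No global is involved: x is used only while the keys are built, and the comparator sees nothing but precomputed deltas.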

Given a char* with the prototype, can we cast a void* to a function pointer? and run it?

I've declared many functions in one driver, and am passing the pointers to the functions to another driver in a list with the node format:
struct node
{
char def_prototype[256]; //example:(int (*)(wchar, int, int))
void *def_function;
};
Is there a way to typecast def_function to the prototype given in def_prototype?
Currently I'm using simple switch and strcmp, but I wanted to generalize it if possible.
PS: I know that casting between void pointer and function pointer is unsafe (as mentioned in various places in SO), but desperate times call for desperate measures and I have taken lot of care.
EDIT:
Sorry for the lack in clarity. I want to actually call the function (not just cast it), making a function pointer at runtime based on the char[] provided.
EDIT AGAIN:
Since I'm working at the kernel level (windows driver), I don't have access to much resources, so, I'm sticking to my current implementation (with some changes to kill back-doors). Thanks to all for your help.
ISO-C does not allow casting between function and data pointers, ie you should use a void (*)(void) instead of a void * to hold your function.
That aside, YeenFei is correct in his assertion that there is no general platform-independent solution, meaning the best you can do in C itself is to supply a list of supported signatures.
You should implement your own encoding scheme instead of using plain C prototypes. It's common to use a string where each char represents a function argument (and the first one the return value); a function of type int (*)(wchar, int, int) for example could have the signature "iwii".
Signature lookup tables can then be easily built using bsearch() and strcmp(); here's a complete example:
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static int cmp(const void *key, const void *element)
{
return strcmp(key, *(const char * const *)element);
}
static _Bool dispatch(const char *sig, void (*func)(void), void *retval, ...)
{
// supported signatures; must be ordered according to strcmp()
static const char * const SIGS[] = { "iii", "v", "vi" };
const char * const *match = bsearch(
sig, SIGS, sizeof SIGS / sizeof *SIGS, sizeof *SIGS, cmp);
if(!match) return 0;
va_list args;
va_start(args, retval);
switch(match - SIGS)
{
case 0: { // iii
int a1 = va_arg(args, int);
int a2 = va_arg(args, int);
int rv = ((int (*)(int, int))func)(a1, a2);
if(retval) memcpy(retval, &rv, sizeof rv);
break;
}
case 1: { // v
func();
break;
}
case 2: { // vi
int a1 = va_arg(args, int);
((void (*)(int))func)(a1);
break;
}
default:
assert(!"PANIC");
}
va_end(args);
return 1;
}
// example code:
static int add(int a, int b)
{
return a + b;
}
int main(void)
{
int sum;
dispatch("iii", (void (*)(void))add, &sum, 3, 4);
printf("%i", sum);
return 0;
}
Unless you want to mess with assembly thunking (pushing data onto the stack before jumping, etc.), there is no better way than some kind of switch-case.
If the set of destination functions is finite and known, why not create a lookup table (map<string, functor>) for it?
A good implementation of similar ideas is libffi. This implements the gory details of declaring and calling functions with arbitrary calling conventions and signatures. It is (surprisingly) platform portable, and known to work on Linux and Windows out of the box.
An example of its use is the Lua extension library alien. That demonstrates calling arbitrary functions declared at runtime and adapting from native Lua types to the types required for the calling conventions. The specific Lua binding won't be useful to you, but it serves as a complete working example of how and why one might actually use libffi.
Since C has no runtime type information, there is absolutely no need to do a dynamic cast as you are considering. Just pass the pointer and if everything fits, it will work. If the pointer doesn't point to a function with the right signature, there is no way to fix it.
There are basically two solutions:
Go to the assembly level and parse the prototype string there and put the arguments you find in the prototype there where the other function will expect them.
Make a long list of all supported prototypes and compare the current one with the list. When you find a match, you can make the typecast as needed. The most common structure for this test would be an if-else ladder.

Resources