How to make generic function using void * in c? - c

I have an incr function to increment the value by 1
I want to make it generic,because I don't want to make different functions for the same functionality.
Suppose I want to increment int,float,char by 1
void incr(void *vp)
{
(*vp)++;
}
But the problem, as I understand it, is that a void pointer cannot be dereferenced. The compiler may reject it with an error such as: invalid use of void expression.
My main function is:
int main()
{
int i=5;
float f=5.6f;
char c='a';
incr(&i);
incr(&f);
incr(&c);
return 0;
}
How can I solve this? Is there a way to do it in C only, or will I have to define incr() for each data type? If so, then what is the use of void *?
I have the same problem with swap() and sort(): I want to swap and sort all kinds of data types with the same function.

You can implement the first as a macro:
#define incr(x) (++(x))
Of course, this can have unpleasant side effects if you're not careful. It's about the only method C provides for applying the same operation to any of a variety of types though. In particular, since the macro is implemented using text substitution, by the time the compiler sees it, you just have the literal code ++whatever;, and it can apply ++ properly for the type of item you've provided. With a pointer to void, you don't know much (if anything) about the actual type, so you can't do much direct manipulation on that data.
void * is normally used when the function in question doesn't really need to know the exact type of the data involved. In some cases (e.g., qsort) it uses a callback function to avoid having to know any details of the data.
Since it does both sort and swap, let's look at qsort in a little more detail. Its signature is:
void qsort(void *base, size_t nmemb, size_t size,
int(*cmp)(void const *, void const *));
So, the first is the void * you asked about -- a pointer to the data to be sorted. The second tells qsort the number of elements in the array. The third, the size of each element in the array. The last is a pointer to a function that can compare individual items, so qsort doesn't need to know how to do that. For example, somewhere inside qsort will be some code something like:
// if (base[j] < base[i]) ...
if (cmp((char *)base + j*size, (char *)base + i*size) < 0)
Likewise, to swap two items, it'll normally have a local array for temporary storage. It'll then copy bytes from array[i] to its temp, then from array[j] to array[i] and finally from temp to array[j]:
char temp[size];
memcpy(temp, (char *)base + i*size, size); // temp = base[i]
memcpy((char *)base + i*size, (char *)base + j*size, size); // base[i] = base[j]
memcpy((char *)base + j*size, temp, size); // base[j] = temp
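To make the mechanics concrete, here is a minimal sketch of a sort that works the way described above: byte offsets computed from the element size, a comparison callback, and a byte-wise swap. This is not how any particular qsort is implemented; the names generic_insertion_sort, swap_bytes and cmp_int are made up for illustration, and insertion sort is used only because it is short.

```c
#include <stddef.h>
#include <string.h>

/* Swap two elements of `size` bytes each. Works in fixed-size chunks,
   so no variable-length array is needed. */
void swap_bytes(void *a, void *b, size_t size)
{
    unsigned char *pa = a, *pb = b;
    unsigned char tmp[64];
    while (size > 0) {
        size_t n = size < sizeof tmp ? size : sizeof tmp;
        memcpy(tmp, pa, n);
        memcpy(pa, pb, n);
        memcpy(pb, tmp, n);
        pa += n;
        pb += n;
        size -= n;
    }
}

/* Hypothetical helper: insertion sort over an array of unknown element type. */
void generic_insertion_sort(void *base, size_t nmemb, size_t size,
                            int (*cmp)(const void *, const void *))
{
    char *arr = base;
    for (size_t i = 1; i < nmemb; i++) {
        /* Move element i left while the element before it compares greater. */
        for (size_t j = i;
             j > 0 && cmp(arr + (j - 1) * size, arr + j * size) > 0;
             j--) {
            swap_bytes(arr + (j - 1) * size, arr + j * size, size);
        }
    }
}

/* Example comparison callback for int elements. */
int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}
```

A call would look like generic_insertion_sort(values, count, sizeof values[0], cmp_int). Only the callback knows that the elements are ints.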

Using void * will not give you polymorphic behavior, which is what I think you're looking for. void * simply lets you pass a pointer around without the compiler checking what type it points to. To achieve actual polymorphic behavior, you will have to either pass in the type information as another variable and check it in your incr function, casting the pointer to the desired type, or pass in the operations on your data as function pointers (others have mentioned qsort as an example). C does not have automatic polymorphism built in to the language, so it is on you to simulate it. Behind the scenes, languages that build in polymorphism are doing something just like this.
To elaborate, void * is a pointer to a generic block of memory, which could be anything: an int, float, string, etc. The length of the block of memory isn't even stored in the pointer, let alone the type of the data. Remember that internally, all data are bits and bytes, and types are really just markers for how the logical data are physically encoded, because intrinsically, bits and bytes are typeless. In C, this information is not stored with variables, so you have to provide it to the compiler yourself, so that it knows whether to apply operations to treat the bit sequences as 2's complement integers, IEEE 754 double-precision floating point, ASCII character data, functions, etc.; these are all specific standards of formats and operations for different types of data. When you cast a void * to a pointer to a specific type, you as the programmer are asserting that the data pointed to actually is of the type you're casting it to. Otherwise, you're probably in for weird behavior.
So what is void * good for? It's good for dealing with blocks of data without regards to type. This is necessary for things like memory allocation, copying, file operations, and passing pointers-to-functions. In almost all cases though, a C programmer abstracts from this low-level representation as much as possible by structuring their data with types, which have built-in operations; or using structs, with operations on these structs defined by the programmer as functions.
You may want to check out the Wikipedia explanation for more info.
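As a rough sketch of the second option (passing the operation in as a function pointer), something like the following could work. The names elem_op, for_each, incr_int and incr_float are invented for this example. The generic loop never looks inside the elements; only the callbacks know the real type.

```c
#include <stddef.h>

/* Hypothetical operation type: modifies one element in place. */
typedef void (*elem_op)(void *elem);

/* Apply `op` to each of `n` elements of `size` bytes starting at `base`. */
void for_each(void *base, size_t n, size_t size, elem_op op)
{
    char *p = base;
    for (size_t i = 0; i < n; i++) {
        op(p + i * size);
    }
}

/* Type-specific callbacks: each one casts back to the type it was written for. */
void incr_int(void *elem)   { ++*(int *)elem; }
void incr_float(void *elem) { ++*(float *)elem; }
```

The caller is still responsible for pairing the right callback with the right data; passing incr_float for an int array compiles without complaint and misbehaves at run time.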

You can't do exactly what you're asking - operators like increment need to work with a specific type. So, you could do something like this:
enum type {
TYPE_CHAR,
TYPE_INT,
TYPE_FLOAT
};
void incr(enum type t, void *vp)
{
switch (t) {
case TYPE_CHAR:
(*(char *)vp)++;
break;
case TYPE_INT:
(*(int *)vp)++;
break;
case TYPE_FLOAT:
(*(float *)vp)++;
break;
}
}
Then you'd call it like:
int i=5;
float f=5.6f;
char c='a';
incr(TYPE_INT, &i);
incr(TYPE_FLOAT, &f);
incr(TYPE_CHAR, &c);
Of course, this doesn't really give you anything over just defining separate incr_int(), incr_float() and incr_char() functions - this isn't the purpose of void *.
The purpose of void * is realised when the algorithm you're writing doesn't care about the real type of the objects. A good example is the standard sorting function qsort(), which is declared as:
void qsort(void *base, size_t nmemb, size_t size, int(*compar)(const void *, const void *));
This can be used to sort arrays of any type of object - the caller just needs to supply a comparison function that can compare two objects.
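For instance, sorting an array of double might look roughly like this. The names cmp_double and sort_doubles are just for illustration.

```c
#include <stdlib.h>

/* Comparison callback for double elements.
   Returns a negative, zero or positive value, as qsort requires. */
int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Hypothetical wrapper: sort `n` doubles in ascending order. */
void sort_doubles(double *values, size_t n)
{
    qsort(values, n, sizeof values[0], cmp_double);
}
```

The comparison uses (x > y) - (x < y) rather than a subtraction, because converting a double difference to int could truncate small differences to zero.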
Both your swap() and sort() functions fall into this category. swap() is even easier - the algorithm doesn't need to know anything other than the size of the objects to swap them:
void swap(void *a, void *b, size_t size)
{
unsigned char *ap = a;
unsigned char *bp = b;
size_t i;
for (i = 0; i < size; i++) {
unsigned char tmp = ap[i];
ap[i] = bp[i];
bp[i] = tmp;
}
}
Now given any array you can swap two items in that array:
int ai[10];
double ad[10];
/* x and y are valid indices into the arrays */
swap(&ai[x], &ai[y], sizeof(int));
swap(&ad[x], &ad[y], sizeof(double));

Example for using "Generic" swap.
This code swaps two blocks of memory.
void memswap_arr(void* p1, void* p2, size_t size)
{
size_t i;
char* pc1= (char*)p1;
char* pc2= (char*)p2;
char ch;
for (i= 0; i<size; ++i) {
ch= pc1[i];
pc1[i]= pc2[i];
pc2[i]= ch;
}
}
And you call it like this:
int main() {
int i1,i2;
double d1,d2;
i1= 10; i2= 20;
d1= 1.12; d2= 2.23;
memswap_arr(&i1,&i2,sizeof(int)); //I use memswap_arr to swap two integers
printf("i1==%d i2==%d \n",i1,i2);
memswap_arr(&d1,&d2,sizeof(double)); //I use the SAME function to swap two doubles
printf("d1==%f d2==%f \n",d1,d2);
return 0;
}
I think that this should give you an idea of how to use one function for different data types.

Sorry if this may come off as a non-answer to the broad question "How to make generic function using void * in c?".. but the problems you seem to have (incrementing a variable of an arbitrary type, and swapping 2 variables of unknown types) can be much easier done with macros than functions and pointers to void.
Incrementing's simple enough:
#define increment(x) ((x)++)
For swapping, I'd do something like this:
#define swap(x, y) \
({ \
typeof(x) tmp = (x); \
(x) = (y); \
(y) = tmp; \
})
...which works for ints, doubles and char pointers (strings), based on my testing.
Whilst the incrementing macro should be pretty safe, the swap macro relies on the typeof() operator and on the ({ ... }) statement expression. Both are GCC/clang extensions: typeof only became part of standard C in C23, and statement expressions are still NOT standard C (though if you only really ever compile with gcc or clang, this shouldn't be too much of a problem).
I know that kind of dodged the original question; but hopefully it still solves your original problems.

You can use the type-generic facilities (C11 standard). If you intend to use more advanced math functions (more advanced than the ++ operator), you can go to <tgmath.h> (available since C99), which provides type-generic versions of the functions in <math.h> and <complex.h>.
You can also use the _Generic keyword to define a type-generic function as a macro. Below is an example:
#include <stdio.h>
#define add1(x) _Generic((x), int: ++(x), float: ++(x), char: ++(x), default: ++(x))
int main(){
int i = 0;
float f = 0;
char c = 0;
add1(i);
add1(f);
add1(c);
printf("i = %d\tf = %g\tc = %d", i, f, c);
}
You can find more information on the language standard and more sophisticated examples in this post from Rob's programming blog.
As for the void *, swap and sort questions, better refer to Jerry Coffin's answer.
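Closer to the original incr(&i) call style, _Generic can also select a typed function based on the pointer type of the argument. This is only a sketch: incr_int, incr_float and incr_char are hypothetical names, and any pointer type not listed is rejected at compile time.

```c
/* Typed implementations; the names are made up for this sketch. */
void incr_int(int *p)     { ++*p; }
void incr_float(float *p) { ++*p; }
void incr_char(char *p)   { ++*p; }

/* Pick the implementation from the static type of the pointer argument (C11). */
#define incr(p) _Generic((p),   \
        int *:   incr_int,      \
        float *: incr_float,    \
        char *:  incr_char)(p)
```

With this in place, incr(&i), incr(&f) and incr(&c) from the question each resolve to the matching function, and no void * is involved at all.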

You should cast your pointer to a concrete type before dereferencing it. So you also need to pass in some information that says what type the pointer really points to.

Related

Is it UB to cast a pointer to a void pointer and write to it?

I was working on a way to create dynamic arrays in C, and I came up with this solution as a general structure for how I want my functions/macros to work:
//dynarray.h
#define dynarray(TYPE)\
struct{\
TYPE *data;\
size_t size;\
size_t capacity;\
}
int dynarray_init_internal(void **ptr, size_t *size, size_t *cap, size_t type_size, size_t count);
#define dynarray_init(ARR, SIZE) dynarray_init_internal(&ARR->data, &ARR->size, &ARR->capacity, sizeof(*ARR->data), SIZE)
//dynarray.c
int dynarray_init_internal(void **ptr, size_t *size, size_t *cap, size_t type_size, size_t count){
*ptr = malloc(type_size*count);
if(*ptr == NULL){
return 1;
}
*size = 0;
*cap = count;
return 0;
}
Is this an acceptable approach to have a generic function/macro combo that deals with dynamically allocating memory in a type agnostic way?
The only doubts I have about this is that I'm not sure if this is undefined behavior or not. I imagine this could be easily expanded for other functions that are typically expected for a dynamic array structure. The only issue I can see with it is that since it's an anonymous struct you can't pass it as an argument anywhere (easily at least), but that can be easily fixed by creating a dynarray_def(TYPE, NAME) macro which would define a dynamic array struct with NAME and have it hold data of TYPE while still having it work with all the other function/macro style listed above.
This is undefined behavior because you're converting (for example) an int ** to a void ** and dereferencing it to yield a void *. The automatic conversion to/from a void * does not extend to void **. Reading or writing one type as another (in this case, writing an int * as a void *) violates the strict aliasing rules.
The best way to handle this is to make the entire init routine a macro:
#define dynarray_init(ARR, SIZE) \
do {\
(ARR)->data = malloc(sizeof(*(ARR)->data) * (SIZE));\
if ((ARR)->data == NULL){\
_exit(1);\
}\
(ARR)->size = 0;\
(ARR)->capacity = (SIZE);\
} while (0)
EDIT:
If you're looking to shy away from function-like macros, you can instead use a macro to create a function and the struct type it works with:
#include <stdio.h>
#include <stdlib.h>
#define dynarray(TYPE)\
struct dynarray_##TYPE {\
TYPE *data;\
size_t size;\
size_t capacity;\
};\
\
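int dynarray_##TYPE##_init(struct dynarray_##TYPE **ptr, size_t count){\
*ptr = malloc(sizeof(**ptr));\
if(*ptr == NULL){\
return 1;\
}\
(*ptr)->data = malloc(sizeof(TYPE)*count);\
if((*ptr)->data == NULL){\
free(*ptr);\
*ptr = NULL;\
return 1;\
}\
(*ptr)->size = 0;\
(*ptr)->capacity = count;\
return 0;\
}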
// generate types and functions
dynarray(int)
dynarray(double)
int main()
{
struct dynarray_int *da1;
dynarray_int_init(&da1, 5);
// use da1
struct dynarray_double *da2;
dynarray_double_init(&da2, 5);
// use da2
return 0;
}
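Another possibility, sketched here under the assumption that you want to keep the anonymous-struct style from the question, is to have the real function return a void * instead of writing through a void **. The macro then assigns the result to the typed data member, so the ordinary void * conversion applies. The helper name dynarray_alloc is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Same shape as the anonymous struct in the question. */
#define dynarray(TYPE) struct { TYPE *data; size_t size; size_t capacity; }

/* Hypothetical helper: allocate room for `count` elements of `type_size` bytes.
   Returns NULL on failure or if the byte count would overflow. */
void *dynarray_alloc(size_t type_size, size_t count)
{
    if (type_size != 0 && count > SIZE_MAX / type_size) {
        return NULL;
    }
    return malloc(type_size * count);
}

/* Evaluates to 0 on success and 1 on failure. The void * result is converted
   to the real element pointer type by plain assignment, so no void ** is needed. */
#define dynarray_init(ARR, COUNT)                                          \
    (((ARR)->data = dynarray_alloc(sizeof *(ARR)->data, (COUNT))) == NULL  \
        ? 1                                                                \
        : ((ARR)->size = 0, (ARR)->capacity = (COUNT), 0))
```

Two caveats: COUNT is evaluated twice on the success path, so it should not have side effects, and a count of 0 may be reported as a failure because malloc(0) is allowed to return NULL.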
Because some rare implementations use different representations for different types of pointers, the Standard does not require that implementations allow them to be manipulated interchangeably. Instead, it regards support for such manipulation as a "popular extension" for which support is a "Quality of Implementation" issue outside its jurisdiction. Just about any compiler for a remotely-commonplace platform will be configurable to support the construct, and while the authors of the Standard wanted to give programmers a "fighting chance" [their words] to write portable code, they have explicitly said they did not wish to "demean" programs that weren't 100% portable.
Note, however, that some optimizers are unable to handle such constructs unless type-based aliasing analysis is completely disabled, and any program that uses such constructs will need to document that requirement. On the other hand, unless one needs to target obscure architectures, it's often better to use such constructs and document their usage than to jump through hoops to accommodate poor-quality optimizers.
Note, btw, that even good quality compilers could get tripped up by some sufficiently-tricky usage patterns involving pointer casts. The authors of the Standard didn't want to forbid implementations from performing useful optimizations merely because some tricky and contrived usage patterns could yield incorrect behavior, but they expected that implementations would be able to recognize patterns their users would actually use. For example, given:
float f;
int **ip; float *fp;
int **ipp = (int**)(&fp);
...
float test(void)
{
fp = &f;
f = 1.0;
**ip+=1;
return f;
}
a compiler would have no realistic way of recognizing that a write to **ip could realistically affect an object of type float. If, however, the address of fp had been stored into ip between the write to f and the later read therefrom, optimizing compilers in the era when the Standard was written would recognize that converting a T* to a U* should be regarded as a potential memory clobber on any object of type T* that might be accessed via the U*. I suspect your usage patterns fit the latter pattern far more strongly than the former.
*ipp = someFloat;
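If the goal is simply to look at one type's bytes as another type, and you would rather not depend on how a given optimizer treats pointer casts, copying the bytes with memcpy is a different technique that avoids the question. A minimal sketch, assuming float and uint32_t have the same size (the function names are made up):

```c
#include <stdint.h>
#include <string.h>

/* Return the object representation of a float as a 32-bit integer.
   Assumes sizeof(float) == sizeof(uint32_t), which holds on common platforms. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    _Static_assert(sizeof bits == sizeof f, "float is assumed to be 32 bits");
    memcpy(&bits, &f, sizeof bits);   /* copying bytes is always permitted */
    return bits;
}

/* Inverse operation: build a float from a 32-bit pattern. */
float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Compilers for mainstream targets typically turn these small memcpy calls into a plain register move, so the portable version need not cost anything.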

Casting Structs With Void Pointers into Structs With Typed Pointers

Short version:
Suppose I have two structs:
struct charPtrWithLen
{
size_t len;
char * charPtr;
};
struct voidPtrWithLen
{
size_t len;
void * voidPtr;
};
Is there a way to cast voidPtrWithLen into charPtrWithLen and vice-versa, or even better, implicitly convert one into the other, much the same way that a char * and a void * can be readily cast and implicitly converted between each other?
Put another way:
I am trying to write all my C so that all pointers to arrays bring their size information with them. I am also trying to write generic functions using void pointers where applicable to keep operations which are essentially identical, well, identical. I am looking for a way to pass the typed-pointer-containing 'sized-array' structs into the generic functions taking void-pointer-containing 'sized-array' arguments.
Long version, with involved example:
So, void pointers are wonderfully flexible, so I can do this:
int foo(void * ptr, size_t dataLen);
/* ... */
char * c;
size_t c_n;
/* ... */
foo(c, c_n);
/* ... */
int * i;
size_t i_n;
/* ... */
foo(i, i_n);
But since the pattern of "pointer to arbitrary length array, plus size there-of" is so common, suppose at some point I get tired of specifying my various functions in terms of pairs of arguments, pointer and length, and instead I start to code with such pairs encapsulated in a struct instead:
typedef struct
{
size_t v_n;
void * v;
}
pointerWithSize;
/* ... */
int foo(pointerWithSize);
So far so good. I can always assign my "char * c" or "int * i" into the pointerWithSize's "void * v" with minimal difficulty. But when you do this long enough, using the same pattern, you run into the following problem: Soon enough you have a bunch of general functions which work with the data agnostically, and are thus happy to take void pointers, for example things like:
pointerWithSize combinePointersWithSize(pointerWithSize p1, pointerWithSize p2);
int readFromStream(FILE * readFromHere, pointerWithSize * readIntoHere);
But you also end up with functions which are inherently intended for specific data types:
size_t countOccurancesOfChar(pointerWithSize str, char c);
int summate(pointerWithSize integers);
And then you end up with the annoyance of having to do casts inside the latter category of functions. E.g. you end up with stuff like this:
/* This inside countOccurancesOfChar */
if(((char * )str.v)[i] == c) {
/* ..or this inside summate: */
sum += ((int * )integers.v)[i];
So you get to a point where you have a lot of functions which operate specifically on "strings with size", and in all of those cases, you don't want to have to muck around with void pointers. So instead, in those cases you start doing stuff like this:
typedef struct
{
size_t v_n;
char * v;
}
stringWithSize;
/* ... */
size_t countOccurancesOfChar(stringWithSize str, char c);
int parseFormatting(stringWithSize str, struct someFormat_t foo);
Which is great, because now all the string related code doesn't need to be cluttered with casts. BUT, now I can't use my wonderful generic function combinePointersWithSize to concatenate my strings contained within the stringWithSize, in a way that's as syntactically clean, as I could if I was still writing my functions in terms of two separate arguments for each pointer-and-size pair.
To finish up the illustration:
pointerWithSize combinePointersWithSize(pointerWithSize p1, pointerWithSize p2);
void * combineAlternative(void * p1, size_t p_n1, void * p2, size_t p_n2);
/* ... */
stringWithSize a, b, c;
/* ... */
/* This doesn't work, incompatible types: */
c = combinePointersWithSize(a, b);
/* But this works, because char * can be passed into void * parameter. */
c.v_n = a.v_n + b.v_n;
c.v = combineAlternative(a.v, a.v_n, b.v, b.v_n); /* Works fine. */
Possible Solutions I've Considered:
1: Don't write my functions with those structs as arguments, instead write them with individual pair arguments. But this is a big part of what I want to avoid in the first place - I like the 'cleanness' and clarity of intent that having a size_t and a pointer bundled in one struct represents.
2: Do something like this:
stringWithSize a, b, c;
/* ... */
pointerWithSize d;
d = combinePointersWithSize((pointerWithSize){.v=a.v, .v_n=a.v_n}, (pointerWithSize){.v=b.v, .v_n=b.v_n});
/* and then do either this: */
c.v = d.v;
c.v_n = d.v_n;
foo(c);
/* ..or this: */
foo((stringWithSize){.v=d.v, .v_n=d.v_n});
..but I think most would agree, this is also as bad or worse as the original problem of casting within the library functions. On the surface it looks worse, because it offloads the casting burden to the client code instead of library code which can hopefully be fairly stable after being implemented/completed (incl. testing/etc). On the other hand, if you did keep every function defined in terms of the void * containing pointerWithSize, you could end up forcing similar casts to the kind you're doing inside your own functions, elsewhere in their code, and worse, you're losing the advantage of the compiler yelling at you, because now the code is carrying everything within the same pointerWithSize struct.
I'm also concerned about how many compilers out there have the ability to optimize the first of the two variants of this solution away (where 'd' serves merely as a temporary result holder).
3: Union-of-pointers. Instead of my prior pointerWithSize example, I would do:
typedef union
{
void * asVoid;
char * asChar;
int * asInt;
/* ...and so on... */
}
rainbowPointer;
typedef struct
{
size_t v_n;
rainbowPointer v;
}
pointerWithSize;
At first glance this is almost good enough. However, I very frequently end up wanting to store arrays of some struct which is specific to the program I'm working on inside this "pointer with size" construct, and in those cases, a predefined union of pointer types would be useless to me, I'd still be right back at this problem.
4: I could write wrapper functions for each permuted pointer type. I could EVEN write function-like macros to define each of these pointer-with-size struct types, which would in the same swoop generate the wrapper functions. For example:
#define pointerWithSizeDef(T, name) \
typedef struct \
{ \
size_t v_n; \
T * v; \
} \
name; \
int foo_ ## name (name p1) \
{ \
/* generic function code defined in macro */ \
/* Or something like this: */ \
return foo((pointerWithSize){.v=p1.v, .v_n=p1.v_n}); \
}
/* Then, stuff like this: */
pointerWithSizeDef(char, stringWithSize)
My intuition is that sooner or later this method would become unwieldy.
5: If there is a mechanism with no performance impact, but which is unappealing otherwise, I could write my generic functions as function-like macros, which in turn invoke the underlying actual function:
int foo_actual(void * v, size_t v_n);
#define foo(p) \
foo_actual(p.v, p.v_n);
..or even something like this, to replace casting syntax:
#define castToPointerWithSize(p) \
((pointerWithSize){.v=p.v, .v_n=p.v_n})
/* ... */
stringWithSize a;
foo(castToPointerWithSize(a));
But as these examples for possible-solution-#5 show, I can't actually think of a way to do this that wouldn't quickly become a possible problem (e.g. if someone wanted to place a function call which returned a pointerWithSize in place of 'p' in the above examples, you'd be running the function twice, and it wouldn't be at all obvious from the code).
So I don't think any of the solutions I've thought of are really sufficient for my usecase, so I'm hoping some of you know of some C syntax or mechanism I could take advantage of here to make it easy to cast/"cast" between two structs which are identical save for the pointer type of one of their members.
Firstly, any kind of "actual" casting isn't going to be allowed per the letter of the standard, because C makes no guarantee at all that all pointers have the same format. A cast from some arbitrary pointer type to a void pointer is allowed to involve a conversion of representation (that gets reversed when you cast it back in order to access the data), including possibly to a different size of pointer or a pointer existing in a separate address space. So a simple reinterpretation of a bit pattern to change pointer type is not safe; void*'s bit pattern isn't guaranteed to mean anything in particular, and the bit patterns of other types aren't guaranteed to be related in any particular way. (How many systems actually take advantage of this, I have no idea.)
Since the explicit conversion between void* and other types has to exist somewhere, using whole-value conversion is probably the safest idea. What you could do is define a macro to quickly and easily generate "cast functions" for you, e.g.:
#define GEN_CAST(NAME, FROM_TYPE, TO_TYPE) \
static inline TO_TYPE NAME(FROM_TYPE from) { \
return (TO_TYPE){ .v=from.v, .v_n=from.v_n }; \
}
GEN_CAST(s_to_v, stringWithSize, pointerWithSize)
GEN_CAST(v_to_s, pointerWithSize, stringWithSize)
...that you can then use in place of the cast operator in expressions:
stringWithSize a, b, c;
pointerWithSize d;
d = combinePointersWithSize(s_to_v(a), s_to_v(b));
foo(v_to_s(d));
A good compiler should recognise that on common platforms the conversion function is an identity operation, and remove it entirely.
You should be able to cast one to another by converting one to a pointer, casting it to a pointer of the other type, and dereferencing it. This will work in reverse too.
struct charPtrWithLen
{
size_t len;
char * charPtr;
};
struct voidPtrWithLen
{
size_t len;
void * voidPtr;
};
int main() {
struct charPtrWithLen cpwl = {.len = 6, .charPtr = "Hello"};
struct voidPtrWithLen vpwl = *(struct voidPtrWithLen *)&cpwl;
return 0;
}
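Note this will only work as long as the struct layout is the same for both structs. Strictly speaking, accessing one struct type through a pointer to the other also breaks C's aliasing rules, so it relies on the compiler tolerating it; copying the members one by one is the portable alternative.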

Trying to understand function pointers in C

I am trying to understand function pointers and am struggling. I have seen the sorting example in K&R and a few other similar examples. My main problem is with what the computer is actually doing. I created a very simple program to try to see the basics. Please see the following:
#include <stdio.h>
int func0(int*,int*);
int func1(int*,int*);
int main(){
int i = 1;
myfunc(34,23,(int(*)(void*,void*))(i==1?func0:func1));//34 and 23 are arbitrary inputs
}
void myfunc(int x, int y, int(*somefunc)(void *, void *)){
int *xx =&x;
int *yy=&y;
printf("%i",somefunc(xx,yy));
}
int func0(int *x, int *y){
return (*x)*(*y);
}
int func1(int *x, int *y){
return *x+*y;
}
The program either multiplies or adds two numbers depending on some variable (i in the main function - should probably be an argument in the main). func0 multiplies two ints and func1 adds them.
I know that this example is simple but how is passing a function pointer preferable to putting a conditional inside the function myfunc?
i.e. in myfunc have the following:
if(i == 1)printf("%i",func0(xx,yy));
else printf("%i",func1(xx,yy));
If I did this the result would be the same but without the use of function pointers.
Your understanding of how function pointers work is just fine. What you're not seeing is how a software system will benefit from using function pointers. They become important when working with components that are not aware of the others.
qsort() is a good example. qsort will let you sort any array and is not actually aware of what makes up the array. So if you have an array of structs, or more likely pointers to structs, you would have to provide a function that could compare the structs.
struct foo {
char * name;
int magnitude;
int something;
};
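int cmp_foo(const void *_p1, const void *_p2)
{
// the array holds struct foo * elements, so qsort hands us pointers to those pointers
const struct foo *p1 = *(const struct foo * const *)_p1;
const struct foo *p2 = *(const struct foo * const *)_p2;
return p1->magnitude - p2->magnitude;
}
struct foo ** foos;
// init 10 foo structures...
qsort(foos, 10, sizeof(struct foo *), cmp_foo);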
Then the foos array will be sorted based on the magnitude field.
As you can see, this allows you to use qsort for any type -- you only have to provide the comparison function.
Another common use of function pointers is callbacks, for example in GUI programming. If you want a function to be called when a button is clicked, you would provide a function pointer to the GUI library when setting up the button.
how is passing a function pointer preferable to putting a conditional inside the function myfunc
Sometimes it is impossible to put a condition there: for example, if you are writing a sorting algorithm, and you do not know what you are sorting ahead of time, you simply cannot put a conditional; function pointer lets you "plug in" a piece of computation into the main algorithm without jumping through hoops.
As far as how the mechanism works, the idea is simple: all your compiled code is located in the program memory, and the CPU executes it starting at a certain address. There are instructions to make CPU jump between addresses, remember the current address and jump, recall the address of a prior jump and go back to it, and so on. When you call a function, one of the things the CPU needs to know is its address in the program memory. The name of the function represents that address. You can supply that address directly, or you can assign it to a pointer for indirect access. This is similar to accessing values through a pointer, except in this case you access the code indirectly, instead of accessing the data.
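To see "the name of the function represents that address" in action, here is a small sketch in which function addresses are stored in an array and a call is made through whichever entry a selector picks. The names multiply, add, binary_op, ops and apply are invented for the example.

```c
/* Two operations with the same signature. */
int multiply(int x, int y) { return x * y; }
int add(int x, int y)      { return x + y; }

/* A function pointer type and a table of function addresses. */
typedef int (*binary_op)(int, int);
const binary_op ops[] = { multiply, add };

/* Call whichever function the selector picks.
   The call goes through the address stored in the table. */
int apply(unsigned selector, int x, int y)
{
    return ops[selector % (sizeof ops / sizeof ops[0])](x, y);
}
```

Adding a third operation means adding one entry to the table; apply itself does not change, which is the practical difference from a hard-coded if/else.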
First of all, you should never call a function through a function pointer of a different, incompatible type. The cast itself is allowed, but making the call is undefined behavior in C (C11 6.5.2.2).
A very important piece of advice when dealing with function pointers is to always use typedefs.
So, your code could/should be rewritten as:
typedef int (*func_t)(int*, int*);
int func0(int*,int*);
int func1(int*,int*);
void myfunc(int x, int y, func_t func);
int main(){
int i = 1;
myfunc(34,23, (i==1?func0:func1)); //34 and 23 are arbitrary inputs
}
void myfunc(int x, int y, func_t func){
printf("%i", func(&x, &y));
}
To answer the question, you want to use function pointers as parameters when you don't know the nature of the function. This is common when writing generic algorithms.
Take the standard C function bsearch() as an example:
void *bsearch (const void *key,
const void *base,
size_t nmemb,
size_t size,
int (*compar)(const void *, const void *));
This is a generic binary search algorithm, searching through any form of sorted one-dimensional array, containing unknown types of data, such as user-defined types. Here, the "compar" function compares two objects of unknown nature, returning a number to indicate their relative order.
"The function shall return an integer less than, equal to, or greater than zero if the key object is considered, respectively, to be less than, to match, or to be greater than the array element."
The function is written by the caller, who knows the nature of the data. In computer science, this is called a "function object" or sometimes "functor". It is commonly encountered in object-oriented design.
An example (pseudo code):
typedef struct // some user-defined type
{
int* ptr;
int x;
int y;
} Something_t;
int compare_Something_t (const void* p1, const void* p2)
{
const Something_t* s1 = (const Something_t*)p1;
const Something_t* s2 = (const Something_t*)p2;
return s1->y - s2->y; // some user-defined comparison relevant to the object
}
...
Something_t search_key = { ... };
Something_t array[] = { ... };
Something_t* result;
result = bsearch(&search_key,
array,
sizeof(array) / sizeof(Something_t), // number of objects
sizeof(Something_t), // size of one object
compare_Something_t // function object
);
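A compilable variant of the same idea, as a sketch: the Point type, compare_point_y and find_by_y are stand-ins invented here, and the array is assumed to be already sorted by the y member, as bsearch requires.

```c
#include <stdlib.h>

typedef struct {
    int x;
    int y;
} Point;  /* stand-in for the user-defined type */

/* Order Points by their y member. */
int compare_point_y(const void *p1, const void *p2)
{
    const Point *a = p1;
    const Point *b = p2;
    return (a->y > b->y) - (a->y < b->y);
}

/* Return a pointer to an element whose y equals `y`, or NULL if none exists.
   `sorted` must already be ordered by y. */
Point *find_by_y(Point *sorted, size_t n, int y)
{
    Point key = { 0, y };
    return bsearch(&key, sorted, n, sizeof sorted[0], compare_point_y);
}
```

Note that the key passed to bsearch is a whole Point here, so the same comparison function can be used for sorting the array with qsort beforehand.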

generic programming in C with void pointer

Even though it is possible to write generic code in C using void pointer(generic pointer), I find that it is quite difficult to debug the code since void pointer can take any pointer type without warning from compiler.
(e.g function foo() take void pointer which is supposed to be pointer to struct, but compiler won't complain if char array is passed.)
What kind of approach/strategy do you all use when using void pointer in C?
The solution is not to use void* unless you really, really have to. The places where a void pointer is actually required are very few: parameters to thread functions, and a handful of other places where you need to pass implementation-specific data through a generic function. In every case, the code that accepts the void* parameter should only accept one data type passed via the void pointer, and the type should be documented in comments and slavishly obeyed by all callers.
This might help:
comp.lang.c FAQ list · Question 4.9
Q: Suppose I want to write a function that takes a generic pointer as an argument and I want to simulate passing it by reference. Can I give the formal parameter type void **, and do something like this?
void f(void **);
double *dp;
f((void **)&dp);
A: Not portably. Code like this may work and is sometimes recommended, but it relies on all pointer types having the same internal representation (which is common, but not universal; see question 5.17).
There is no generic pointer-to-pointer type in C. void * acts as a generic pointer only because conversions (if necessary) are applied automatically when other pointer types are assigned to and from void * 's; these conversions cannot be performed if an attempt is made to indirect upon a void ** value which points at a pointer type other than void *. When you make use of a void ** pointer value (for instance, when you use the * operator to access the void * value to which the void ** points), the compiler has no way of knowing whether that void * value was once converted from some other pointer type. It must assume that it is nothing more than a void *; it cannot perform any implicit conversions.
In other words, any void ** value you play with must be the address of an actual void * value somewhere; casts like (void **)&dp, though they may shut the compiler up, are nonportable (and may not even do what you want; see also question 13.9). If the pointer that the void ** points to is not a void *, and if it has a different size or representation than a void *, then the compiler isn't going to be able to access it correctly.
To make the code fragment above work, you'd have to use an intermediate void * variable:
double *dp;
void *vp = dp;
f(&vp);
dp = vp;
The assignments to and from vp give the compiler the opportunity to perform any conversions, if necessary.
Again, the discussion so far assumes that different pointer types might have different sizes or representations, which is rare today, but not unheard of. To appreciate the problem with void ** more clearly, compare the situation to an analogous one involving, say, types int and double, which probably have different sizes and certainly have different representations. If we have a function
void incme(double *p)
{
*p += 1;
}
then we can do something like
int i = 1;
double d = i;
incme(&d);
i = d;
and i will be incremented by 1. (This is analogous to the correct void ** code involving the auxiliary vp.) If, on the other hand, we were to attempt something like
int i = 1;
incme((double *)&i); /* WRONG */
(this code is analogous to the fragment in the question), it would be highly unlikely to work.
Arya's solution can be changed a little to support a variable size:
#include <stdio.h>
#include <string.h>
void swap(void *vp1,void *vp2,int size)
{
char buf[size];
memcpy(buf,vp1,size);
memcpy(vp1,vp2,size);
memcpy(vp2,buf,size); //memcpy ->inbuilt function in std-c
}
int main()
{
int array1[] = {1, 2, 3};
int array2[] = {10, 20, 30};
swap(array1, array2, 3 * sizeof(int));
int i;
printf("array1: ");
for (i = 0; i < 3; i++)
printf(" %d", array1[i]);
printf("\n");
printf("array2: ");
for (i = 0; i < 3; i++)
printf(" %d", array2[i]);
printf("\n");
return 0;
}
The approach is to minimize the use of void * pointers; they are needed only in specific cases. If you really need to pass a void *, you should also pass the size of the object it points to.
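If a temporary buffer is not wanted at all, one possible variation is to swap the two objects byte by byte. This is only a sketch: swap_bytes is a hypothetical name, it takes the size as a size_t, and it assumes the two objects do not overlap.

```c
#include <stddef.h>

/* Swaps two non-overlapping objects of `size` bytes each, one byte at a
   time, so neither a VLA nor a fixed-size buffer is needed. */
void swap_bytes(void *vp1, void *vp2, size_t size)
{
    unsigned char *p = vp1;
    unsigned char *q = vp2;

    while (size--) {
        unsigned char tmp = *p;
        *p++ = *q;
        *q++ = tmp;
    }
}
```

It is called the same way as the memcpy version, for example swap_bytes(&a, &b, sizeof a).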
This generic swap function will help you a lot in understanding generic void *
#include<stdio.h>
#include<string.h>
void swap(void *vp1,void *vp2,int size)
{
char buf[100]; /* fixed buffer: size must not exceed 100 bytes */
memcpy(buf,vp1,size);
memcpy(vp1,vp2,size);
memcpy(vp2,buf,size); //memcpy ->inbuilt function in std-c
}
int main()
{
int a=2,b=3;
float d=5,e=7;
swap(&a,&b,sizeof(int));
swap(&d,&e,sizeof(float));
printf("%d %d %.0f %.0f\n",a,b,d,e);
return 0;
}
We all know that the C typesystem is basically crap, but try to not do that... You still have some options to deal with generic types: unions and opaque pointers.
Anyway, if a generic function takes a void pointer as a parameter, it shouldn't try to dereference it!
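For the union option, a tagged union is one common shape. The sketch below is an illustration, not a standard facility: struct number, enum num_kind and the member name as are all invented here. The tag records which member is valid, so a single incr() can handle all three types from the question without a void *.

```c
/* Hypothetical tag: says which union member currently holds the value. */
enum num_kind { NUM_INT, NUM_FLOAT, NUM_CHAR };

struct number {
    enum num_kind kind;
    union {
        int   i;
        float f;
        char  c;
    } as;
};

/* One incr() for all three types; the tag selects the operation. */
void incr(struct number *n)
{
    switch (n->kind) {
    case NUM_INT:   n->as.i++; break;
    case NUM_FLOAT: n->as.f++; break;
    case NUM_CHAR:  n->as.c++; break;
    }
}
```

The cost is that callers have to wrap their values, e.g. struct number n = { NUM_INT, { .i = 5 } }; incr(&n);, instead of passing a plain int.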

Solution for "dereferencing `void *' pointer" warning in struct in C?

I was trying to create a pseudo super struct to print array of structs. My basic
structures are as follows.
/* Type 10 Count */
typedef struct _T10CNT
{
int _cnt[20];
} T10CNT;
...
/* Type 20 Count */
typedef struct _T20CNT
{
long _cnt[20];
} T20CNT;
...
I created the below struct to print the array of above mentioned structures. I got dereferencing void pointer error while compiling the below code snippet.
typedef struct _CMNCNT
{
long _cnt[3];
} CMNCNT;
static int printCommonStatistics(void *cmncntin, int cmncnt_nelem, int cmncnt_elmsize)
{
int ii;
for(ii=0; ii<cmncnt_nelem; ii++)
{
CMNCNT *cmncnt = (CMNCNT *)&cmncntin[ii*cmncnt_elmsize];
fprintf(stout,"STATISTICS_INP: %d\n",cmncnt->_cnt[0]);
fprintf(stout,"STATISTICS_OUT: %d\n",cmncnt->_cnt[1]);
fprintf(stout,"STATISTICS_ERR: %d\n",cmncnt->_cnt[2]);
}
return SUCCESS;
}
T10CNT struct_array[10];
...
printCommonStatistics(struct_array, NELEM(struct_array), sizeof(struct_array[0]);
...
My intention is to have a common function to print all the arrays. Please let me know the correct way of using it.
Appreciate the help in advance.
Edit: The parameter name is changed to cmncntin from cmncnt. Sorry it was typo error.
Thanks,
Mathew Liju
I think your design is going to fail, but I am also unconvinced that the other answers I see fully deal with the deeper reasons why.
It appears that you are trying to use C to deal with generic types, something that always gets to be hairy. You can do it, if you are careful, but it isn't easy, and in this case, I doubt if it would be worthwhile.
Deeper Reason: Let's assume we get past the mere syntactic (or barely more than syntactic) issues. Your code shows that T10CNT contains 20 int and T20CNT contains 20 long. On modern 64-bit machines - other than under Win64 - sizeof(long) != sizeof(int). Therefore, the code inside your printing function should be distinguishing between dereferencing int arrays and long arrays. In C++, there's a rule that you should not try to treat arrays polymorphically, and this sort of thing is why. The CMNCNT type contains 3 long values; different from both the T10CNT and T20CNT structures in number, though the base type of the array matches T20CNT.
Style Recommendation: I strongly recommend avoiding leading underscores on names. In general, names beginning with underscore are reserved for the implementation to use, and to use as macros. Macros have no respect for scope; if the implementation defines a macro _cnt it would wreck your code. There are nuances to what names are reserved; I'm not about to go into those nuances. It is much simpler to think 'names starting with underscore are reserved', and it will steer you clear of trouble.
Style Suggestion: Your print function returns success unconditionally. That is not sensible; your function should return nothing, so that the caller does not have to test for success or failure (since it can never fail). A careful coder who observes that the function returns a status will always test the return status, and have error handling code. That code will never be executed, so it is dead, but it is hard for anyone (or the compiler) to determine that.
Surface Fix: Temporarily, we can assume that you can treat int and long as synonyms, but you must get out of the habit of thinking that they are. The void * argument is the correct way to say "this function takes a pointer of indeterminate type". However, inside the function, you need to convert from a void * to a specific type before you do indexing.
typedef struct _CMNCNT
{
long count[3];
} CMNCNT;
static void printCommonStatistics(const void *data, size_t nelem, size_t elemsize)
{
size_t i;
for (i = 0; i < nelem; i++)
{
const CMNCNT *cmncnt = (const CMNCNT *)((const char *)data + (i * elemsize));
fprintf(stdout,"STATISTICS_INP: %ld\n", cmncnt->count[0]);
fprintf(stdout,"STATISTICS_OUT: %ld\n", cmncnt->count[1]);
fprintf(stdout,"STATISTICS_ERR: %ld\n", cmncnt->count[2]);
}
}
(I like the idea of a file stream called stout too. Suggestion: use cut'n'paste on real source code--it is safer! I generally use "sed 's/^/ /' file.c" to prepare code for cut'n'paste into an SO answer.)
What does that cast line do? I'm glad you asked...
The first operation is to convert the const void * into a const char *; this allows you to do byte-size operations on the address. In the days before Standard C, char * was used in place of void * as the universal addressing mechanism.
The next operation adds the correct number of bytes to get to the start of the ith element of the array of objects of size elemsize.
The second cast then tells the compiler "trust me - I know what I'm doing" and "treat this address as the address of a CMNCNT structure".
From there, the code is easy enough. Note that since the CMNCNT structure contains long values, I used %ld to tell the truth to fprintf().
Since you aren't about to modify the data in this function, it is not a bad idea to use the const qualifier as I did.
Note that if you are going to be faithful to sizeof(long) != sizeof(int), then you need two separate blocks of code (I'd suggest separate functions) to deal with the 'array of int' and 'array of long' structure types.
The type void is deliberately left incomplete. It follows that you cannot dereference a void pointer, nor can you take the sizeof of what it points to. This means you cannot use the subscript operator on it as if it were an array.
The moment you assign something to a void pointer, any type information of the original pointed to type is lost, so you can only dereference if you first cast it back to the original pointer type.
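Since the type information is gone, the caller has to supply it again somehow. One way to sketch this is an explicit tag argument that tells the function which pointer type to cast back to. The names count_at and enum count_kind are hypothetical.

```c
#include <stddef.h>

/* Hypothetical tag: the caller states what the void * really points to. */
enum count_kind { COUNT_INT, COUNT_LONG };

/* Reads element i of an int or long array through a void *,
   casting back to the ORIGINAL element type before indexing. */
long count_at(const void *counts, enum count_kind kind, size_t i)
{
    if (kind == COUNT_INT)
        return ((const int *)counts)[i];
    return ((const long *)counts)[i];
}
```

With the question's types this might be called as count_at(struct_array[0]._cnt, COUNT_INT, 0). Nothing stops a caller from passing the wrong tag, which is the usual weakness of this approach.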
First and most important: you pass a T10CNT* to the function, but you try to cast (and dereference) it as a CMNCNT* inside the function. This is not valid; it is undefined behavior.
You need a printCommonStatistics function for each type of array element. So, have a printCommonStatisticsInt, printCommonStatisticsLong, and printCommonStatisticsChar, which all differ in their first argument (one taking int*, another taking long*, and so on). You might create them using macros, to avoid redundant code.
Passing the struct itself is not a good idea, since then you have to define a new function for each different size of the contained array within the struct (since they are all different types). So better pass the contained array directly (struct_array[0]._cnt, call the function for each index)
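A minimal sketch of the macro idea, assuming the functions receive the contained array directly and print its first three counters as in the question. DEFINE_PRINT_STATS and the print_stats_* names are invented for this example, and the extra FILE * parameter is an assumption added so the output stream can be chosen.

```c
#include <stdio.h>

/* Generates one print function per element type.
   NAME: suffix of the generated function, TYPE: element type,
   FMT: printf conversion that matches TYPE. */
#define DEFINE_PRINT_STATS(NAME, TYPE, FMT)                     \
    void print_stats_##NAME(FILE *out, const TYPE *cnt)         \
    {                                                           \
        fprintf(out, "STATISTICS_INP: " FMT "\n", cnt[0]);      \
        fprintf(out, "STATISTICS_OUT: " FMT "\n", cnt[1]);      \
        fprintf(out, "STATISTICS_ERR: " FMT "\n", cnt[2]);      \
    }

DEFINE_PRINT_STATS(int,  int,  "%d")   /* defines print_stats_int  */
DEFINE_PRINT_STATS(long, long, "%ld")  /* defines print_stats_long */
```

A T10CNT element would then be printed with print_stats_int(stdout, struct_array[0]._cnt), and a T20CNT element with print_stats_long, so each format string always matches the real element type.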
Change the function declaration to char * like so:
static int printCommonStatistics(char *cmncnt, int cmncnt_nelem, int cmncnt_elmsize)
The void type does not assume any particular size, whereas a char is always one byte.
You can't do this:
cmncnt->_cnt[0]
if cmncnt is a void pointer.
You have to specify the type. You may need to re-think your implementation.
The function
static int printCommonStatistics(void *cmncntin, int cmncnt_nelem, int cmncnt_elmsize)
{
char *cmncntinBytes;
int ii;
cmncntinBytes = (char *) cmncntin;
for(ii=0; ii<cmncnt_nelem; ii++)
{
CMNCNT *cmncnt = (CMNCNT *)(cmncntinBytes + ii*cmncnt_elmsize); /* Ptr Line */
fprintf(stdout,"STATISTICS_INP: %ld\n",cmncnt->_cnt[0]); /* _cnt holds long, so %ld */
fprintf(stdout,"STATISTICS_OUT: %ld\n",cmncnt->_cnt[1]);
fprintf(stdout,"STATISTICS_ERR: %ld\n",cmncnt->_cnt[2]);
}
return SUCCESS;
}
Works for me.
The issue is that on the line commented "Ptr Line" the code adds an integer to a pointer. Since our pointer is a char *, we move forward in memory by sizeof(char) * ii * cmncnt_elmsize bytes, which is what we want since a char is one byte. Your code tried to do the equivalent thing by moving forward sizeof(void) * ii * cmncnt_elmsize, but void doesn't have a size, so the compiler gave you the error.
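The byte-offset step can be isolated in a tiny helper, sketched here under the assumption that the caller passes the true element size. elem_at is a hypothetical name.

```c
#include <stddef.h>

/* Returns the address of element `index` in an array whose elements are
   `elem_size` bytes each. The cast to const char * makes the pointer
   arithmetic count in bytes, which is not possible on a void *. */
const void *elem_at(const void *base, size_t index, size_t elem_size)
{
    return (const char *)base + index * elem_size;
}
```

The caller still has to cast the result to the correct element type before dereferencing it.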
I'd change T10CNT and T20CNT to both use int or long instead of one with each. As written, you're depending on sizeof(int) == sizeof(long).
On this line:
CMNCNT *cmncnt = (CMNCNT *)&cmncnt[ii*cmncnt_elmsize];
You are trying to declare a new variable called cmncnt, but a variable with this name already exists as a parameter to the function. You might want to use a different variable name to solve this.
Also you may want to pass a pointer to a CMNCNT to the function instead of a void pointer, because then the compiler will do the pointer arithmetic for you and you don't have to cast it. I don't see the point of passing a void pointer when all you do with it is cast it to a CMNCNT. (Which is not a very descriptive name for a data type, by the way.)
Your expression
(CMNCNT *)&cmncntin[ii*cmncnt_elmsize]
tries to take the address of cmncntin[ii*cmncnt_elmsize] and then cast that pointer to type (CMNCNT *). It can't get the address of cmncntin[ii*cmncnt_elmsize] because cmncntin has type void*.
Study C's operator precedences and insert parentheses where necessary.
Point of Information: Internal Padding can really screw this up.
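Consider struct { char c; int i; }; -- the members add up to 5 bytes (with a 4-byte int), but sizeof() is typically 8, because the compiler pads after c so that i is aligned. Array elements are always sizeof() bytes apart, so the element size passed to a function like this must come from sizeof, never from adding up member sizes by hand.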
Certain assembly operations don't handle mis-aligned data gracefully. (For example, if an int spans two memory words.) (YES, I have been bitten by this before.)
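To see what padding does on a particular platform, a small sketch like the following can help. struct mixed and print_layout are made-up names, and the printed numbers depend on the compiler and target. In standard C the distance between consecutive array elements equals sizeof the element type, so passing sizeof(struct_array[0]) as the element size already accounts for any padding.

```c
#include <stddef.h>
#include <stdio.h>

struct mixed {        /* hypothetical example type */
    char tag;
    long value;
};

/* Prints the layout facts that matter when stepping through an array by bytes. */
void print_layout(FILE *out)
{
    struct mixed arr[4];

    fprintf(out, "sum of member sizes:  %zu\n", sizeof(char) + sizeof(long));
    fprintf(out, "sizeof(struct mixed): %zu\n", sizeof(struct mixed));
    fprintf(out, "offsetof(value):      %zu\n", offsetof(struct mixed, value));
    fprintf(out, "array stride:         %zu\n",
            (size_t)((char *)&arr[1] - (char *)&arr[0]));
}
```

On many platforms the first two lines differ, while the last line always matches sizeof(struct mixed).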
.
Second: In the past, I've used variably sized arrays. (I was dumb back then...) It works if you are not changing type. (Or if you have a union of the types.)
E.g.:
struct T { int sizeOfArray; int data[1]; };
Allocated as
T * t = (T *) malloc( sizeof(T) + sizeof(int)*(NUMBER-1) );
t->sizeOfArray = NUMBER;
(Though padding/alignment can still screw you up.)
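Since C99 the same idea can be written with a flexible array member instead of the data[1] trick. This is a sketch with hypothetical names (struct int_vec, int_vec_new); it assumes a C99 or later compiler and zero-fills the new elements.

```c
#include <stdlib.h>

struct int_vec {          /* hypothetical name */
    size_t count;
    int data[];           /* C99 flexible array member: no fake [1] element */
};

/* Allocates a vector with room for `count` ints, all set to zero.
   Returns NULL if the allocation fails. */
struct int_vec *int_vec_new(size_t count)
{
    struct int_vec *v = malloc(sizeof *v + count * sizeof v->data[0]);

    if (v != NULL) {
        v->count = count;
        for (size_t i = 0; i < count; i++)
            v->data[i] = 0;
    }
    return v;
}
```

The size calculation no longer needs the NUMBER-1 adjustment, and the whole object is released with a single free(v).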
.
Third: Consider:
struct T {
int sizeOfArray;
enum FOO arrayType;
union U { short s; int i; long l; float f; double d; } data [1];
};
It solves problems with knowing how to print out the data.
.
Fourth: You could just pass in the int/long array to your function rather than the structure. E.g:
void printCommonStatistics( int * data, int count )
{
for( int i=0; i<count; i++ )
cout << "FOO: " << data[i] << endl;
}
Invoked via:
_T10CNT foo;
printCommonStatistics( foo._cnt, 20 );
Or:
int a[10], b[20], c[30];
printCommonStatistics( a, 10 );
printCommonStatistics( b, 20 );
printCommonStatistics( c, 30 );
This works much better than hiding data in structs. As you add members to one of your struct's, the layout may change between your struct's and no longer be consistent. (Meaning the address of _cnt relative to the start of the struct may change for _T10CNT and not for _T20CNT. Fun debugging times there. A single struct with a union'ed _cnt payload would avoid this.)
E.g.:
struct FOO {
union {
int bar [10];
long biff [20];
} u;
};
.
Fifth:
If you must use structs... C++, iostreams, and templating would be a lot cleaner to implement.
E.g.:
template<class TYPE> void printCommonStatistics( TYPE & mystruct, int count )
{
for( int i=0; i<count; i++ )
cout << "FOO: " << mystruct._cnt[i] << endl;
} /* Assumes all mystruct's have a "_cnt" member. */
But that's probably not what you are looking for...
C isn't my cup o'java, but I think your problem is that "void *cmncnt" should be CMNCNT *cmncnt.
Feel free to correct me now, C programmers, and tell me this is why java programmers can't have nice things.
This line is kind of tortured, don'tcha think?
CMNCNT *cmncnt = (CMNCNT *)&cmncntin[ii*cmncnt_elmsize];
How about something more like
CMNCNT *cmncnt = (CMNCNT *)((char *)cmncntin + (ii * cmncnt_elmsize));
Or better yet, if cmncnt_elmsize = sizeof(CMNCNT)
CMNCNT *cmncnt = ((CMNCNT *)cmncntin) + ii;
That should also get rid of the warning, since you are no longer dereferencing a void *.
BTW: I'm not real sure why you are doing it this way, but if cmncnt_elmsize is sometimes not sizeof(CMNCNT), and can in fact vary from call to call, I'd suggest rethinking this design. I suppose there could be a good reason for it, but it looks really shaky to me. I can almost guarantee there is a better way to design things.

Resources