I often see void pointers being cast back and forth with other types of pointers and wonder why. I also see that malloc() returns a void pointer that needs to be cast before it is used.
I am very new to C and come from Python so pointers are a very new concept to wrap my head around.
Is the purpose of the void * pointer to offer some "dynamic" typing?
Take this code as an example. Is it valid?
struct GenericStruct{
void *ptr;
};
// use void pointer to store an arbitrary type?
void store(struct GenericStruct *strct, int *myarr){
strct->ptr = (void *) myarr;
}
// use void pointer to load an arbitrary type as int*?
int *load(struct GenericStruct *strct){
return (int *) strct->ptr;
}
The purpose of a void * is to provide a welcome exception to some of C's typing rules. With the exception of void *, you cannot assign a pointer value of one type to an object of a different pointer type without a cast - for example, you cannot write
int p = 10;
double *q = &p; // BZZT - cannot assign an int * value to a double *
When assigning to pointers of different types, you have to explicitly cast to the target type:
int p = 10;
double *q = (double *) &p; // convert the pointer to p to the right type before assigning to q
except for a void *:
int p = 10;
void *q = &p; // no cast required here.
In the old days of K&R C, char * was used as a "generic" pointer type [1]: the memory allocation functions malloc/calloc/realloc all returned char *, the callback functions for qsort and bsearch took char * arguments, and so on. But because you couldn't directly assign between different pointer types, you had to add an explicit cast (if the target wasn't a char *, anyway):
int *mem = (int *) malloc( N * sizeof *mem );
Using explicit casts everywhere was a bit painful.
The 1989/1990 standard (C89/C90) introduced the void data type - it's a data type that cannot store any values. An expression of type void is evaluated only for its side effects (if any) [2]. A special rule was created for the void * type such that a value of that type can be assigned to/from any other object pointer type without need of an explicit cast, which made it the new "generic" pointer type. malloc/calloc/realloc were all changed to return void *, qsort and bsearch callbacks now take void * arguments instead of char *, and now things are a bit cleaner:
int *mem = malloc( sizeof *mem * N );
You cannot dereference a void * - in our example above, where q has type void *, we cannot get at the value of p without a cast:
printf( "p = %d\n", *(int *)q );
Note that C++ is different in this regard - C++ still converts any object pointer to void * implicitly, but it requires an explicit cast to convert a void * back to another pointer type. C++ can afford that stricter rule because it provides overloading and template mechanisms that C doesn't.
[1] Every object type should be mappable to an array of char.
[2] In K&R C, all functions had to return a value - if you didn't explicitly type the function, the compiler assumed it returned int. This made it difficult to determine which functions were actually meant to return a value vs. functions that only had side effects. The void type was handy for typing functions that weren't meant to return a value.
C suffers from the absence of function overloading. So most C "generic" functions, for example qsort or bsearch, take parameters of type void * to be able to deal with objects of different types.
In C you do not need to cast a pointer of any object type to void *. And a pointer of any object type can be assigned a value of type void * without casting.
So in C the functions from your code snippet can be rewritten like
void store(struct GenericStruct *strct, int *myarr){
strct->ptr = myarr;
}
int *load(struct GenericStruct *strct){
return strct->ptr;
}
Is the purpose of the void * pointer to offer some "dynamic" typing?
No, because C has no dynamic typing. But your title is about right, on the other hand -- void * serves roughly as a generic pointer type, in the sense that a pointer of that type can point to an object of any type. But there is no dynamism as that term is usually applied in this sort of context, because values of type void * do not contain any information about the type of object to which they point.
The use, then, is for pointers where the type of the object pointed to does not matter (for a given purpose). In some cases the size of the pointed-to object does matter, however, so that is sometimes provided alongside such pointers. The standard qsort() function is a canonical example. In other cases, such as generic linked lists, it is up to the user to know, somehow, what type of object is pointed to.
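A minimal sketch of a function where the type does not matter but the size does. The name swap_objects is hypothetical, not a standard function; the caller passes the size alongside the pointers, as with qsort():

```c
#include <stddef.h>

/* Swaps two objects of the same size without knowing their type.
   The caller supplies the size, because a void * does not carry it. */
void swap_objects(void *a, void *b, size_t size)
{
    unsigned char *pa = a;   /* implicit conversion from void * */
    unsigned char *pb = b;

    for (size_t i = 0; i < size; i++) {
        unsigned char tmp = pa[i];
        pa[i] = pb[i];
        pb[i] = tmp;
    }
}
```

It could be called as swap_objects(&x, &y, sizeof x) for two ints, or for two doubles, with no casts at the call site.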
As @Vlad already observed, however, casting between void * and other object pointer types is not required in C. Such conversions are performed automatically upon assignment, and in other contexts that use the same rules as assignment.
Why are void pointers necessary, as long as one could cast any pointer type to any pointer type, i.e.:
char b = 5;
int*a = (int*)&b;//both upcasting
or
int b = 10;
char*a = (char*)&b;//and downcasting are allowed
?
Also, why is there no need for a cast when using malloc/calloc/realloc?
one could cast any pointer type to any pointer type
Not quite. Any object pointer [1] can be converted to void * and back again with an equivalent value, and this does not need any cast.
On less common architectures, the size and range of some other pointer types may be smaller than those of void *. Casting between other pointer types may lose necessary information.
void * provides a universal object pointer type.
void *p = any_object_pointer; // No casts required
any_object_pointer = p; // No casts required
char * could substitute for void *, except conversion to and from other object pointers requires casts.
OP's char b = 5; int*a = (int*)&b; risks undefined behavior, because the alignment requirement of int may exceed that of char.
[1] Function pointers may be wider than void *. void * and the other pointers discussed here are object pointers. C lacks a truly universal pointer type.
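One way to sidestep the alignment risk is to copy the bytes instead of converting the pointer. This is a sketch only; read_int_at is a hypothetical helper name:

```c
#include <string.h>

/* Reads an int stored at an arbitrary (possibly unaligned) byte address.
   memcpy copies byte by byte, so no misaligned int access takes place. */
int read_int_at(const void *bytes)
{
    int value;
    memcpy(&value, bytes, sizeof value);
    return value;
}
```

Because the parameter is a const void *, any object pointer can be passed without a cast.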
void pointers are really useful for creating a generic API. Think of the qsort function, which can sort arrays of any type. void pointers can be used whenever the API does not know the concrete type of the pointer.
void qsort(
void *base,
size_t number,
size_t width,
int (*compare)(const void *, const void *)
);
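As an illustration of how a caller plugs a concrete type into that generic interface, here is a small sketch. compare_ints and sort_ints are hypothetical names; qsort itself is the standard library function:

```c
#include <stdlib.h>

/* Comparator for qsort(): it receives const void * and converts back to
   the element type that the caller knows the array really holds. */
static int compare_ints(const void *lhs, const void *rhs)
{
    const int *a = lhs;
    const int *b = rhs;
    return (*a > *b) - (*a < *b);   /* avoids the overflow risk of *a - *b */
}

/* Sorts an int array in ascending order. */
void sort_ints(int *values, size_t count)
{
    qsort(values, count, sizeof *values, compare_ints);
}
```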
Regarding allocation functions, it's the same thing. The C runtime does not know the type of the object that will be stored. But this is not a problem, as the user can use the generic pointer type void *.
So void pointers are considered generic pointers, very useful for polymorphism. That's why the C language makes casting to and from void * optional.
"Why are void pointers necessary, as long as one could cast any pointer type".
The alloc function would then have to use a common denominator such as char *. But then it would be confusing to have to cast to whatever we really need. "void" just means "a bunch of bytes".
I have seen void * explained as a pointer to an unused chunk of memory. I have also seen void * described as a pointer to any type, or a pointer to any type can be cast to void *.
From what I know, int * means a pointer to type int. So keeping this in mind, what does void * mean literally? Is it a pointer to type void? This doesn't seem right because I did not know that void could be a type.
void * is a pointer to void.
C11: 6.3.2.3 Pointers:
A pointer to void may be converted to or from a pointer to any object type. A pointer to
any object type may be converted to a pointer to void and back again; the result shall
compare equal to the original pointer.
The compiler will not let you dereference a void * pointer because it does not know the size or type of the object pointed to. You need to cast the void * to the right pointer type before you dereference it.
Let's begin with this example:
int a = 65;
int *q = &a;
void *p = q;
char *c = p;
(figure omitted; source: qiniudn.com)
We define an int variable a and an int pointer q pointing to it. p is a void * pointer and c is a char * pointer, both holding the same address.
The beginning address of a is 0x8400(just for simplicity).
A pointer is an address, no more.
On most common platforms, all object pointers have the same size regardless of their type, and their value is an address.
So,
printf("%p, %p", p, (void *) q);
will display:
0x8400, 0x8400
Type: how you interpret the data
As you see in the graph, the bytes in memory are 0x41 0x00 0x00 0x00 (65 stored little endian in a 4-byte int). If we want to use it, we have to specify what it is! And type does that.
printf("%d %c", *q, *c);
If we print it as an integer through q, we get 65. If we print it as a char through c, we get A (its ASCII code is 65).
And c + 1 points to 0x8401, while q + 1 points to 0x8404.
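The different step sizes can be measured directly. The two helpers below are hypothetical names written for illustration; they report how many bytes a pointer advances when 1 is added to it:

```c
#include <stddef.h>

/* Number of bytes between q and q + 1 for an int pointer. */
size_t int_step(const int *q)
{
    return (size_t) ((const char *) (q + 1) - (const char *) q);
}

/* Same measurement for a char pointer: always 1. */
size_t char_step(const char *c)
{
    return (size_t) ((c + 1) - c);
}
```

int_step returns sizeof(int), which is 4 on the platform assumed in this answer, while char_step returns 1.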
void *: a universal type
According to wikipedia:
A program can probably convert a pointer to any type of data (except a function pointer) to a pointer to void and back to the original type without losing information, which makes these pointers useful for polymorphic functions.
Yes, void * defines the most basic kind of pointer: it can be converted to any object pointer and vice versa. But you can't dereference it, because it doesn't specify a type.
If you want to pass raw memory around without committing to a type, use void *; to read or write the individual bytes, convert it to unsigned char * first.
Isn't char * the same as void *
Not exactly.
The C language standard does not explicitly guarantee that the different pointer types have the same size.
You can't assume that char * has the same size as every other pointer type on every platform.
And converting char * to int * can be confusing, so mistakes are easy to make.
It means: a pointer to some memory, but this pointer does not contain any information about the type of data that may be stored in that memory.
This is why it's not possible to dereference a void *: the operation of dereferencing (and obtaining an rvalue, or writing through an lvalue) requires that the bits in the memory location be interpreted as a particular type, but we don't know which type to interpret the memory as.
The onus is on the programmer to make sure that data read in matches the type of data read out. The programmer might help himself in this by converting the void * to a pointer to an object type.
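One common way for the programmer to keep track of the type is to store a tag next to the pointer. This is a sketch under assumptions: the enum, struct and function names are hypothetical and only two types are supported:

```c
/* Hypothetical tag: the program, not the language, records what ptr points to. */
enum value_kind { KIND_INT, KIND_DOUBLE };

struct tagged_value {
    enum value_kind kind;  /* says how to interpret ptr */
    void *ptr;             /* carries no type information itself */
};

/* Converts the pointed-to value to double, choosing the cast from the tag. */
double tagged_as_double(const struct tagged_value *v)
{
    switch (v->kind) {
    case KIND_INT:    return (double) *(const int *) v->ptr;
    case KIND_DOUBLE: return *(const double *) v->ptr;
    }
    return 0.0; /* not reached for valid tags */
}
```

If the tag and the real object type ever disagree, the behavior is undefined, so keeping them in sync remains the programmer's job.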
It's useful if you want to have a common interface for dealing with memory of multiple possible types, without requiring the user to do a lot of casting. for example free() takes a void * because the type information isn't necessary when doing the free operation.
void * is a pointer to data of unspecified type. As such, it can't be used directly; it must be cast to a usable datatype before it can be dereferenced.
I have a situation where the address stored in a void pointer has to be copied to another pointer. Doing this without a type cast gives no warnings or errors. The pseudo code looks like
structA A;
void * p = &A;
structA * B = p; // Would like to confirm this step
I don't foresee any problems with this.
But since this operation is used in a lot of places, I would like to confirm whether it can have any repercussions. Also, is there any compiler dependency?
No, this is fine and 100% standard.
A void * can be converted to and from any other data/object pointer without problems.
Note that data/object restriction: function pointers do not convert to/from void *.
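For function pointers, the standard instead guarantees that any function pointer type can be converted to another function pointer type and back. A sketch of using that as a generic holder follows; generic_fn, add_one and call_through_generic are hypothetical names:

```c
/* Generic function pointer type used only for storage. */
typedef void (*generic_fn)(void);

static int add_one(int x) { return x + 1; }

/* Stores a function pointer generically, then converts it back before calling. */
int call_through_generic(int arg)
{
    /* Function pointer to function pointer: needs a cast, but is well defined. */
    generic_fn stored = (generic_fn) add_one;

    /* Convert back to the original type; calling through `stored` itself
       would be undefined behavior. */
    int (*restored)(int) = (int (*)(int)) stored;
    return restored(arg);
}
```

Unlike void *, these conversions always need explicit casts.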
This is the reason why you shouldn't cast the return value of malloc(), which returns a void *.
In C a void * is implicitly convertible to and from any object pointer type. That's why you don't need a cast when, for example, passing pointers of any type to functions taking void * arguments, or when assigning to a pointer (still of any type) from a function returning void * (malloc is a good example here).
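Both directions can be seen in one small sketch. make_zeroed_ints is a hypothetical helper; malloc and memset are the standard functions:

```c
#include <stdlib.h>
#include <string.h>

/* Allocates `count` ints and fills them with zero bytes.
   Returns NULL on allocation failure; the caller frees the result. */
int *make_zeroed_ints(size_t count)
{
    int *values = malloc(count * sizeof *values);  /* void * -> int *, no cast */
    if (values == NULL)
        return NULL;
    memset(values, 0, count * sizeof *values);     /* int * -> void *, no cast */
    return values;
}
```

In real code calloc would do the same job in one call; the two-step version is used here only to show both conversions.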
Is it legal to access a pointer type through a void **?
I've looked over the standards quotes on pointer aliasing but I'm still unsure on whether this is legal C or not:
int *array;
void **vp = (void**)&array;
*vp = malloc(sizeof(int)*10);
Trivial example, but it applies to a more complex situation I'm seeing.
It seems that it wouldn't be legal since I'm accessing an int * through a variable whose type is not int * or char *. I can't come to a simple conclusion on this.
Related:
Does C have a generic "pointer to a pointer" type?
C-FAQ question 4.9
No. void ** has a specific type (pointer to a pointer-to-void). I.e. the underlying type of the pointer is "pointer-to-void"
You're not storing a like-pointer value when storing a pointer-to-int. That a cast is required is a strong indicator what you're doing is not defined behavior by the standard (and it isn't). Interestingly enough, however, you can use a regular void* coming and going and it will exhibit defined behavior. In other words, this:
#include <stdio.h>
#include <stdlib.h>
int main()
{
int *array;
void *vp = &array;
int **parray = vp;
*parray = malloc(sizeof(int)*10);
}
is legitimate. Your original example won't even compile if I remove the cast and use apple llvm 4.2 (clang), due precisely to incompatible pointer types, i.e. the very subject of your question. The specific error is:
"Incompatible pointer types initializing 'void **' with an expression of type 'int **'"
and rightfully so.
Pointers to different types can have different sizes.
You can store a pointer to any type into a void * and then you can recover it back but this means simply that a void * must be large enough to hold all other pointers.
Treating a variable that is holding an int * like it's indeed a void * is instead, in general, not permitted.
Note also that doing a cast (e.g. casting to int * the result of malloc) is something completely different from treating an area of memory containing an int * like it's containing a void *. In the first case the compiler is informed of the conversion if needed, in the second instead you're providing false information to the compiler.
On X86 however they're normally the same size and you're safe if you just play with pointers to data (pointers to functions could be different, though).
About aliasing: any write done through a char * can modify any object (you cannot write through a void * directly), so the compiler must consider aliasing as possible.
In your example, however, you're writing through a void ** (a different thing), and the compiler is free to ignore potential aliasing with an int *.
Your code may work on some platforms, but it is not portable. The reason is that C doesn't have a generic pointer to pointer type. In the case of void * the standard explicitly permits conversions between it and other pointer to complete/incomplete types, but this is not the case with void **. What this means is that in your code, the compiler has no way of knowing if the value of *vp was converted from any type other than void *, and therefore can not perform any conversions except the one you explicitly cast yourself.
Consider this code:
void dont_do_this(struct a_t **a, struct b_t **b)
{
void **x = (void **) a;
*x = *b;
}
The compiler will not complain about the implicit conversion from b_t * to void * in the *x = *b line, even though that line is trying to put a pointer to a b_t in a place where only pointers to a_t should be put. The mistake is in fact in the previous line, which is converting "a pointer to a place where pointers to a_t can be put" to "a pointer to a place where pointers to anything can be put". This is the reason there is no implicit conversion possible. For an analogous example with pointers to arithmetic types, see the C FAQ.
Your cast, then, even though it shuts the compiler warning up, is dangerous because not all pointer types may have the same internal representation/size (e.g. void * and int *). To make your code work in all cases, you have to use an intermediate void * and copy the result back afterwards:
int *array;
void *varray;
void **vp = &varray;
*vp = malloc(sizeof(int) * 10);
array = varray; /* converted back to int * here */
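The same pattern applies when a function has a void ** out-parameter. In this sketch, alloc_bytes stands in for such a function and alloc_ints wraps it; both names are hypothetical:

```c
#include <stdlib.h>

/* Hypothetical allocator with a void ** out-parameter.
   Returns 0 on success, -1 on failure. */
int alloc_bytes(void **out, size_t size)
{
    void *mem = malloc(size);
    if (mem == NULL)
        return -1;
    *out = mem;
    return 0;
}

/* Gives callers an int * without ever treating an int * object as a void *. */
int *alloc_ints(size_t count)
{
    void *tmp = NULL;                        /* a real void * object */
    if (alloc_bytes(&tmp, count * sizeof(int)) != 0)
        return NULL;
    return tmp;                              /* converted to int * on return */
}
```

The void ** only ever points at a genuine void * object, so no cast is needed.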
Being new to C, the only practical usage I have gotten out of void pointers is for versatile functions that may store different data types in a given pointer. Therefore I did not type-cast my pointer when doing memory allocation.
I have seen some code examples that sometimes use void pointers, but they get type-cast. Why is this useful? Why not directly create desired type of pointer instead of a void?
There are two reasons for casting a void pointer to another type in C.
1. If you want to access something being pointed to by the pointer ( *(int*)p = 42 )
2. If you are actually writing code in the common subset of C and C++, rather than "real" C. See also Do I cast the result of malloc?
The reason for 1 should be obvious. Number two is because C++ disallows the implicit conversion from void* to other types, while C allows it.
You need to cast void pointers to something else if you want to dereference them. For instance, say you get a void pointer as a function parameter and you know for sure that it points to an integer:
void some_function(void * some_param) {
int some_value = *some_param; /* won't work, you can't dereference a void pointer */
}
void some_function(void * some_param) {
int some_value = *((int *) some_param); /* ok */
}
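A typical place where this comes up is a callback that receives a void * context argument. The following is a sketch; for_each_int and add_to_sum are hypothetical names:

```c
#include <stddef.h>

/* Calls fn for each element; ctx is passed through untouched. */
void for_each_int(const int *values, size_t count,
                  void (*fn)(int value, void *ctx), void *ctx)
{
    for (size_t i = 0; i < count; i++)
        fn(values[i], ctx);
}

/* Callback that knows ctx really points to a long accumulator. */
void add_to_sum(int value, void *ctx)
{
    long *sum = ctx;   /* convert back before dereferencing */
    *sum += value;
}
```

A caller would declare long sum = 0; and pass &sum as the last argument. for_each_int never needs to know what the context points to.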