Void pointers pretending to be void double pointers - c

I've been doing some thinking. I haven't found anything directly answering this question, but I think I know the answer; I just want some input from some more experienced persons.
Knowns:
A void pointer points to just a memory address. It includes no type information.
An int pointer points to a memory address containing an int. It will read whatever is in the memory address pointed to as an integer, regardless of what was stuffed into the address originally.
Question:
If a void double pointer void ** foo were to point to a dynamically allocated array of void pointers
void ** foo = malloc(sizeof(void *) * NUM_ELEMENTS);
is it true, as I am supposing, that because of the unique nature of void pointers actually lacking any sort of type information that instead of void ** foo an equivalent statement would be
void * bar = malloc(sizeof(void *) * NUM_ELEMENTS);
and that when I use indirection to access by assigning a specific type, such as with
(It was pointed out that I can't dereference void pointers. For clarity to the purpose of the question the next line is changed to be appropriate to that information)
int ** fubar = bar;
that I would get an appropriate pointer from the single void pointer which is just acting like a double pointer?
Or is this all just in my head?

It is permissible to assign the result of malloc to a void * object and then later assign it to an int ** object. This is because the return value of malloc has type void * anyway, and it is guaranteed to be suitable for assignment to a pointer to any type of object with a fundamental alignment requirement.
However, this code:
#define NUM_ELEMENTS 1000
void *bar = malloc(sizeof(void *) * NUM_ELEMENTS);
int **fubar = bar;
*fubar = 0;
is not guaranteed by the C standard to work; it may have undefined behavior. The reason for this is not obvious. The C standard does not require different types of pointers to have the same size. A C implementation may set the size of an int * to one million bytes and the size of a void * to four bytes. In this case, the space allocated for 1000 void * would not be enough to hold one int *, so the assignment to *fubar has undefined behavior. Generally, one would implement C in such a way only to prove a point. However, similar errors are possible on a smaller scale: There are C implementations in which pointers of different types have different sizes.
A pointer to an object type may be converted to a pointer to another object type provided the pointer has alignment suitable for the destination type. If it does, then converting it back yields a pointer with the original value. Thus, you may convert pointers to void * to pointers to void and back, and you may convert pointers to void * to pointers to int * and back, provided the alignments are suitable (which they will be if the pointers were returned by malloc and you are not using custom objects with extended alignments).
In general, you cannot write using a pointer to an object type and then read the same bytes using a pointer to a different object type. This violates the aliasing rules. An exception is when one of the pointers is to a character type. Also, many C implementations do support such aliasing, but it may require command-line options to enable that support.
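A minimal sketch of the two escape hatches that do not break the aliasing rules: inspecting an object's bytes through unsigned char *, and copying a representation with memcpy. The function names are hypothetical, and float_bits assumes float and uint32_t have the same size (checked at compile time).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the non-zero bytes of any object. Reading through
   unsigned char * is the character-type exception to the aliasing rules. */
size_t count_nonzero_bytes(const void *obj, size_t size)
{
    assert(obj != NULL);
    const unsigned char *bytes = obj;
    size_t count = 0;
    for (size_t i = 0; i < size; i++)
        if (bytes[i] != 0)
            count++;
    return count;
}

/* Assumption: float is 32 bits wide on this implementation. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Copy the bytes of a float into an integer instead of reading the
   float through a uint32_t *, which would violate the aliasing rules. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Both functions only ever touch the original object as bytes, so the compiler's type-based alias analysis has nothing to object to.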
This prohibition on aliasing includes reinterpreting pointers. Consider this code:
int a;
int *b = &a;
void **c = (void **) &b;
void *d = *c;
int *e = (int *) d;
In the fourth line, c points to the bytes that b occupies but *c tries to interpret those bytes as a void *. This is not guaranteed to work, so the value that d gets is not necessarily a pointer to a, even when it is converted to int * as in the last line.
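By contrast, converting the pointer value itself is well defined. The following sketch (hypothetical function names) round-trips an int * through void * by value and stores it through a void ** that really does point at a void * object:

```c
#include <assert.h>
#include <stddef.h>

/* Convert int * to void * and back by value. The compiler performs a
   real conversion at each assignment, so the result equals the input. */
int *round_trip_through_void(int *p)
{
    void *generic = p;   /* int * -> void * */
    int *back = generic; /* void * -> int * */
    return back;
}

/* Store an int * into a void * slot. Here slot points at an actual
   void * object, so dereferencing the void ** is fine. */
void store_generic(void **slot, int *p)
{
    assert(slot != NULL);
    *slot = p; /* converted to void * on assignment */
}
```

The difference from the code above is that no pointer object is ever read or written through an lvalue of a different pointer type.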

Under the C Standard, the behavior of the code you gave is undefined because you allocated an array of void pointers and then tried to use it as an array of int pointers. There is nothing in the Standard that requires these two kinds of pointer to have the same size or alignment. Now if you had said
void * bar = malloc(sizeof(int*) * NUM_ELEMENTS);
int ** fubar = bar;
Then all would be fine.
Now on the vast majority of machines, an int* and a void* will actually have the same size and alignment. So your code ought to work fine in practice.
Additionally, these two are not equivalent:
void ** foo = malloc(sizeof(void *) * NUM_ELEMENTS);
void * bar = malloc(sizeof(void *) * NUM_ELEMENTS);
This is because foo can be dereferenced at any element to get a void pointer, while bar cannot. For example, this program is correct and prints 00000000 on my 32-bit machine:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
void **a = calloc(10, sizeof(void*));
printf("%p\n", a[0]);
return 0;
}
One other point is that you seem to be thinking that type information is explicit in the pointer at the machine level. This is not true (at least for the vast majority of implementations). The type of C pointers is normally represented only while the program is being compiled. By the time compilation is done, explicit type information is normally lost except in debugging symbol tables, which are not runnable code. (There are some minor exceptions to this. And for C++ the situation is very different.)

Related

How can I allocate a pointer to a struct in C?

#include <stdlib.h>
struct foo{
int a;
};
int main(int argc, char * * argv){
struct foo * bar = malloc(sizeof(struct foo));
free(bar);
return EXIT_SUCCESS;
}
Wouldn't this cause undefined behavior according to the standard? If so, what should I do instead to adhere to the standard?
https://stackoverflow.com/a/1241314/13959281
If the question is what happens when malloc fails and bar is assigned NULL, then the answer is: nothing happens when free is called. The free function checks whether the pointer passed is NULL; if it is, no action is taken. So there is no UB here.
As a general remark: it is safer (or at least less error-prone) if instead of types the actual objects are used:
struct foo * bar = malloc(sizeof(*bar));
#EDIT#
The OP's comment clarifies the question. The size of a pointer in the implementation does not matter, because the C standard guarantees that any pointer to an object type (not a function pointer) can be converted to void *, and that void * can be converted to any object pointer type. How this is actually done is left to the implementation. So it is 100% safe, as it is guaranteed by the C standard.

Is it safe to use void ** when freeing double pointers of other types?

I instantiated a two dimensional array in a double int pointer by the following methodology.
int **make2Dint(int rows, int columns) {
//a is our 2D array
int **a = (int **)calloc(rows, sizeof(int *));
//If we can't allocate the necessary memory, return null.
if (!a) {
return (int **)NULL;
}
//Allocate each row with the correct amount of column memory.
for (int i = 0; i < rows; i++) {
a[i] = (int *) calloc(columns, sizeof(int));
//If we can't allocate the necessary memory, return null.
if (!a[i]) {
return (int **)NULL;
}
}
return a;
}
For this, I also wrote a free function, which is as follows
void free2DArray(void **a, int rows) {
for (int i = 0; i < rows; i++) {
free(a[i]);
}
free (a);
}
When I went to use my free function (by way of free2DArray(array, rows);), however, gcc gave me a warning.
In file included from life.c:6:0:
twoD.h:14:6: note: expected ‘void **’ but argument is of type ‘int **’
void free2DArray(void **a, int rows);
^~~~~~~~~~~
Now, I can make this go away with a cast to void **, but this seems to indicate that my usage of void ** is problematic.
There is no generic pointer-to-pointer type in C. void * acts as a generic pointer only because conversions (if necessary) are applied automatically when other pointer types are assigned to and from void *'s; these conversions cannot be performed if an attempt is made to indirect upon a void ** value which points at a pointer type other than void *. When you make use of a void ** pointer value (for instance, when you use the * operator to access the void * value to which the void ** points), the compiler has no way of knowing whether that void * value was once converted from some other pointer type. It must assume that it is nothing more than a void *; it cannot perform any implicit conversions.
In other words, any void ** value you play with must be the address of an actual void * value somewhere; casts like (void **)&dp, though they may shut the compiler up, are nonportable (and may not even do what you want; see also question 13.9). If the pointer that the void ** points to is not a void *, and if it has a different size or representation than a void *, then the compiler isn't going to be able to access it correctly.
However, since I'm not dereferencing this pointer, I'm not sure that it's problematic. Can someone explain whether or not it's problematic, and why?
No. Even if the pointer types have the same size and representation, C does not permit distinct types to alias except in a few special cases, and compilers can and will make transformations (optimizations) assuming this. If you really want such a generic free loop, you can put the loop in a macro that expands in the context where it has the right pointer type (in the caller), but you should really just avoid "deep" pseudo 2D arrays.
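One way such a macro might look is sketched below. FREE_2D is a hypothetical name; because it expands in the caller, arr keeps its real type (int **, double **, and so on) and each row pointer is converted to void * only when it is passed to free.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical macro: free every row of a nested array, then the
   array of row pointers itself. A NULL arr is ignored. */
#define FREE_2D(arr, rows)                          \
    do {                                            \
        assert((rows) >= 0);                        \
        if ((arr) != NULL) {                        \
            for (int i_ = 0; i_ < (rows); i_++)     \
                free((arr)[i_]);                    \
            free(arr);                              \
        }                                           \
    } while (0)
```

A call such as FREE_2D(array, rows); then works for any element type without a cast and without going through void **.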
Creating 2D arrays as nested 1D arrays is bad practice. You should allocate the entire space in a single step, and free() it just once.
Not only will this entirely sidestep your current issue, it will give you superior performance for rectangular (not jagged) 2D arrays, because all the memory will be contiguous, rather than scattered by row/column.
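A sketch of the single-allocation approach, assuming a rectangular array stored in row-major order. The helper names make_grid and grid_at are hypothetical, and the multiplication is not checked for overflow.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate rows * columns zeroed ints as one contiguous block.
   Returns NULL on failure; release with a single free(). */
int *make_grid(size_t rows, size_t columns)
{
    assert(rows > 0 && columns > 0);
    return calloc(rows * columns, sizeof(int));
}

/* Address of element (r, c) in a row-major grid with `columns` columns. */
int *grid_at(int *grid, size_t columns, size_t r, size_t c)
{
    assert(grid != NULL && c < columns);
    return &grid[r * columns + c];
}
```

Since there is only one allocation, cleanup is just free(grid), and no pointer-to-pointer type is involved at all.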

Can representation and alignment be assumed for qualified types within a family

I have read this post.
From that, I read this:
From C99 6.2.5, paragraph 27:
A pointer to void shall have the same representation and alignment
requirements as a pointer to a character type.39) Similarly, pointers
to qualified or unqualified versions of compatible types shall have
the same representation and alignment requirements. All pointers to
structure types shall have the same representation and alignment
requirements as each other. All pointers to union types shall have the
same representation and alignment requirements as each other.
Pointers to other types need not have the same representation or alignment requirements. (emphasis mine)
My interpretation of the parts important to me for the purpose of this question seems to say
if I have:
int *a, **b;
representation and alignment are guaranteed, and that all of these statements are true:
sizeof(a)==sizeof(*a)&&
sizeof(int *)==sizeof(b)&&
sizeof(*b)==sizeof(**b);// all essentially int pointers,
// and would be equal
but if I have:
int *a;
float*b;
representation and alignment are not guaranteed, i.e.:
sizeof(a)!=sizeof(b)&&
sizeof(float *)!=sizeof(int *)&&
sizeof(*b)!=sizeof(*a);//all pointers, but not of compatible types
//therefore not guaranteed to be equal.
The reason I ask is because of this discussion,
where I posted an answer showing a function that creates a 3D array:
int *** Create3D(int p, int c, int r)
{
int ***arr;
int x,y;
arr = calloc(p, sizeof(arr));
for(x = 0; x < p; x++)
{
arr[x] = calloc(c ,sizeof(arr));
for(y = 0; y < c; y++)
{
arr[x][y] = calloc(r, sizeof(int));
}
}
return arr;
}
Is the following statement safe in terms of using sizeof()?
arr = calloc(p, sizeof(arr));
Or, even though only int types are used, should it be:
arr = calloc(p, sizeof(int **));
or:
arr = calloc(p, sizeof *arr);
The question:
Given arr is declared as int ***:
For allocating memory, as long as type stays int is there any danger of using any of the variations of int pointer (int *, int **, arr, *arr, int ***) as the argument to sizeof ?
Is one form preferred over the other? (please give reasons other than style)
My interpretation of the parts important to me for the purpose of this question seems to say
if I have:
int *a, **b;
representation and alignment are guaranteed,
a is a pointer to int. b is a pointer to int *. These are not compatible types, and the standard does not require pointers to these types to have the same representation or alignment.
and that all of these statements are true[:]
sizeof(a)==sizeof(*a)&&
sizeof(int *)==sizeof(b)&&
sizeof(*b)==sizeof(**b);// all essentially int pointers,
// and would be equal
No, the standard does not require any of those to be true. The first is frequently false on 64-bit systems. The others are typically true, but if you're looking for guarantees then the standard does not offer them.
but if I have:
int *a;
float*b;
representation and alignment are not guaranteed, i.e.:
Correct, the representations and alignment of float * and int * are not guaranteed to be the same.
The reason I ask is because of this discussion,
where I posted an answer showing a function that creates a 3D array:
[..]
int ***arr;
int x,y;
arr = calloc(p, sizeof(arr));
That will often work as intended because on most systems, all object pointers in fact do have the same size, but C does not require it to be correct. It should be:
arr = calloc(p, sizeof(*arr));
or
arr = calloc(p, sizeof(int **));
Likewise, this:
arr[x] = calloc(c ,sizeof(arr));
should be
arr[x] = calloc(c ,sizeof(*arr[x]));
or
arr[x] = calloc(c ,sizeof(int *));
This one is OK, though:
arr[x][y] = calloc(r, sizeof(int));
For allocating memory, and as long as type stays int, in that statement, is there any danger of using any variation of int pointer (int *, int **, arr, *arr) as the argument to sizeof ?
Yes. Given any type T, T * is a different, incompatible type. Regardless of qualification or pointerness, the standard provides no guarantee that the two have the same representation or alignment. Part of not having a guarantee of the same representation is not having a guarantee of the same size. In particular, if T is int then there are common cases in which T and T * have different size.
Is one form preferred over the other? (please give reasons other than style)
Although it's arguable whether this is a point of style, this form:
arr = calloc(p, sizeof(*arr));
has the advantage that if you change the type of arr, you don't have to modify the calloc call. The correct size follows from the declaration of arr. It's also easy to tell that the size is right, without looking up the declaration of arr. And it's easy to write a macro around that form if you should wish to do so.
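Such a macro might look like the sketch below. CALLOC_ARRAY, create3d and free3d are hypothetical names, not taken from the original answer; the point is only that every size comes from the pointer being assigned, so each level gets sizeof of the right type.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical macro: allocate `count` zeroed elements of whatever
   type `ptr` points to, assign to `ptr`, and yield the new pointer. */
#define CALLOC_ARRAY(ptr, count) ((ptr) = calloc((count), sizeof *(ptr)))

/* Free a p x c x r structure; tolerates NULL at any level. */
void free3d(int ***arr, size_t p, size_t c)
{
    if (arr == NULL)
        return;
    for (size_t x = 0; x < p; x++) {
        if (arr[x] == NULL)
            continue;
        for (size_t y = 0; y < c; y++)
            free(arr[x][y]);
        free(arr[x]);
    }
    free(arr);
}

int ***create3d(size_t p, size_t c, size_t r)
{
    int ***arr;
    assert(p > 0 && c > 0 && r > 0);
    if (CALLOC_ARRAY(arr, p) == NULL)           /* p elements of int ** */
        return NULL;
    for (size_t x = 0; x < p; x++)
        arr[x] = NULL;  /* explicit: all-bits-zero need not be a null pointer */
    for (size_t x = 0; x < p; x++) {
        if (CALLOC_ARRAY(arr[x], c) == NULL)    /* c elements of int * */
            goto fail;
        for (size_t y = 0; y < c; y++)
            arr[x][y] = NULL;
        for (size_t y = 0; y < c; y++)
            if (CALLOC_ARRAY(arr[x][y], r) == NULL) /* r elements of int */
                goto fail;
    }
    return arr;
fail:
    free3d(arr, p, c);
    return NULL;
}
```

Changing the element type later means editing only the declarations; none of the allocation calls mention a type.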
Consider a fantasy compiler that only makes pointers as wide as needed.
Code uses a large set (trillions) of int and only a small set (2) of int *.
int *ticket = malloc(sizeof *ticket * 1000ULL*1000*1000*1000);
int **purchase = malloc(sizeof *purchase * 2);
Pointer arithmetic need only work within the range + 1 of the allocated memory. Pointers to int need to have a precision of 39+ bits. Pointers to int * need only have 2+ bits. Such a compiler could maintain the base of all int in code, somewhere, and add a scaled 39-bit pointer when needed to form the physical address. Pointers to int * could use their few bits along with a hidden in-code base and scale to produce the physical address of the int *.
This is a reason for casting to void * on printf("%p", (void *) ptr) as such a compiler would need to put together the base and scale to form a generic void * pointer which could point to any object in memory. It is also a reason why casting (int **)((void *)ticket) is UB.
Similar systems existed in DOS segment:offset days with the functions in one address space (being 32 or 16 bit) and data in another address space (being independent from code 32 or 16 bit).
Present-day embedded processors sometimes use one pointer size for constant data and another size for variable data.
C supports many architectures, not only the flat model of a 64-bit pointer that can point anywhere.

what does it mean to have a void* member of a struct in c?

I don't understand what kind of property the mystery member is below:
typedef struct _myobject
{
long number;
void *mystery;
} t_myobject;
What kind of member is this void member? How much memory does that take up? Where can I get more information about what that accomplishes (for instance, why would one use a void member?)
EDIT-- updated title to say void*
A void* variable is a "generic" pointer to an address in memory.
The field mystery itself consumes sizeof(void*) bytes in memory, which is typically either 4 or 8, depending on your system (on the size of your virtual memory address space, to be more accurate). However, it may point to some other object which consumes a different amount of memory.
A few usage examples:
int var;
char arr[10];
t_myobject obj;
obj.mystery = &var;
obj.mystery = arr;
obj.mystery = malloc(100);
Your struct declaration says void *, and your question says void. A void pointer member is a pointer to any kind of data, which is cast to the correct pointer type according to conditions known at run-time.
A void member is an "incomplete type" error.
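As a sketch of what "conditions known at run-time" can mean in practice, the struct below pairs the void * with a tag that records what it points to. The enum, the struct name and the describe function are all hypothetical additions for illustration; they are not part of the struct in the question.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical tag saying what the void * currently points to. */
enum payload_kind { PAYLOAD_INT, PAYLOAD_TEXT };

typedef struct tagged_object {
    enum payload_kind kind;
    void *mystery;
} tagged_object;

/* Format the payload into buf, casting mystery according to the tag.
   Returns what snprintf returns, or -1 for an unknown tag. */
int describe(const tagged_object *obj, char *buf, size_t size)
{
    assert(obj != NULL && obj->mystery != NULL);
    switch (obj->kind) {
    case PAYLOAD_INT:
        return snprintf(buf, size, "int %d", *(const int *)obj->mystery);
    case PAYLOAD_TEXT:
        return snprintf(buf, size, "text %s", (const char *)obj->mystery);
    }
    return -1;
}
```

The pointer member itself stays the same size whatever it points to; only the code that dereferences it needs to know the real type.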
A variable of type void * can hold the address of any object. Assignment to this variable can be done directly, but to dereference it, it must be cast to the actual pointer type. This tells the compiler how many bytes to access when dereferencing; the data type is what determines the size of a variable.
int a = 10;
char b = 'c';
void *c = NULL;
c = &a;
printf("int is %d\n", *((int*)c));
c = &b;
printf("char is %c\n", *((char*)c));
In the above example, the void pointer variable c first stores the address of the int variable a, so to dereference it, c is cast to int *. This tells the compiler to access sizeof(int) bytes (typically 4) to get the value. In the second printf, c is cast to char *, which tells the compiler to access one byte (the size of char) to get the value.
Wrong question header. The member is 'void *', not 'void'.
A pointer to anything, rather than nothing.

Casting a void pointer to an arbitrary type pointer

I am attempting to do my own implementation of common data structures in C for my own learning benefit. My current effort is a vector, and I want it to be able to hold a single arbitrary type (or at least type size, but isn't that all that really matters in C?). My struct is as follows:
struct vector
{
void *item;
size_t element_size;
size_t num_elements;
};
However, what I don't understand is how I can refer to specific elements in the *item array if the type is supposed to be arbitrary. I know the element_size, but that doesn't help me with index referencing (e.g. item[5]) because void is not a type. I figured it would be easiest to refer to the elements as byte offsets. So if I was holding a vector of structs with size 12, item[5] would be at 12*5=60 bytes from item*. However, I don't understand how to retrieve that data. I know I want 12 bytes from item+60, but how would I make the compiler understand that? Am I getting into preprocessor territory?
sizeof is measured in chars, so you can do this:
void *start_of_sixth_element = ((char*)item) + (5 * element_size);
Someone who knows what the type of each element is could then cast start_of_sixth_element to the correct type to use it.
void* is a bad choice in a way, since you can't do pointer arithmetic with void* pointers in standard C (there's a GNU extension to allow it, but for portable code you use char* or unsigned char* at least for the arithmetic).
Code that knows the correct type before applying the offset 5, could just do this:
correct_type *sixth_element = ((correct_type *)item) + 5;
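Putting the byte-offset idea together, here is a minimal sketch of generic element access for the struct from the question. The function names vector_at, vector_get and vector_set are hypothetical; the arithmetic goes through char * and the data is moved with memcpy, so the code never needs to know the element type.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct vector {
    void *item;
    size_t element_size;
    size_t num_elements;
};

/* Address of element `index`, computed in bytes through char *. */
void *vector_at(const struct vector *v, size_t index)
{
    assert(v != NULL && index < v->num_elements);
    return (char *)v->item + index * v->element_size;
}

/* Copy element `index` out into the caller's storage. */
void vector_get(const struct vector *v, size_t index, void *out)
{
    memcpy(out, vector_at(v, index), v->element_size);
}

/* Copy the caller's value into element `index`. */
void vector_set(const struct vector *v, size_t index, const void *in)
{
    memcpy(vector_at(v, index), in, v->element_size);
}
```

A caller that knows the element type passes the address of a variable of that type, for example vector_get(&v, 5, &my_struct);, and never has to cast item itself.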
void is not a type
It is a type, it's just not a "complete type". "Incomplete type" pretty much means the compiler doesn't know what a void* really points to.
void * is a generic pointer type for any object pointer type. You have to know what type it points to in order to cast the void * correctly and dereference the pointer.
Here is an example with a void * object used with different pointer types.
void *p;
int a = 42, b;
double f = 3.14159, g;
p = &a;
b = *(int *) p;
p = &f;
g = *(double *) p;
So if I was holding a vector of structs with size 12, item[5] would be at 12*5=60 bytes from item*. However, I don't understand how to retrieve that data. I know I want 12 bytes from item+60, but how would I make the compiler understand that?
Say that your given type is foo. As you already understood, you need to reach the address 5 * 12 bytes past item, then cast it to foo * and dereference it. Standard C does not allow arithmetic on a void *, so do the offset through a char *:
foo my_stuff = *(foo *)((char *)item + 5 * 12);
You can also use sizeof if the type of your data is known at compile time:
foo my_stuff = *(foo *)((char *)item + 5 * sizeof(foo));
