Why is a double-void pointer required here? Dynamic "generic" array - c

I am trying to implement a small collections library. I do this whenever I learn a new language, because it teaches most of the language's details.
So I started with a "generic" dynamic array. It is not really generic, because it only holds pointers to the actual data.
But to be honest, I don't fully understand why I need a double void pointer here.
This is the Vector struct defined in my header file. (I declared every function and put the #include directives in the header file, but I omitted that here to keep the code readable. I also omitted some bounds checks.)
typedef struct {
size_t capacity; //the allocated capacity
size_t length; //the actual length
void **data; //here I don't fully understand, why I need a double pointer.
} Vector;
Here is my implementation of a few functions. The compiler complains about them when I use a single void pointer in my struct, that is, void *data instead of void **data.
#include "utils.h"
const size_t INITIAL_SIZE = 16;
//Creates a new empty vector.
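Vector *vec_new(void) {
printf("sizeof Vector is: %zu\n", sizeof(Vector)); //%zu is the conversion for size_t
Vector *vec = malloc(sizeof(Vector));
if(vec == NULL) {
fprintf(stderr, "Error allocating memory.");
exit(EXIT_FAILURE);
}
vec->length = 0;
vec->capacity = INITIAL_SIZE;
void *data = calloc(INITIAL_SIZE, sizeof(void*));
if(data == NULL) {
free(vec); //vec->data is not initialized yet, so free the vector itself
fprintf(stderr, "Error allocating memory.");
exit(EXIT_FAILURE);
}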
vec->data = data;
return vec;
}
//This method appends the specified value at the end of the vector.
void vec_push(Vector *vec, void *data) {
if(vec->length == vec->capacity-1) {
vec_resize(vec);
}
vec->data[vec->length] = data;
vec->length += 1;
}
//gets the value at the specified index or NULL if index is out of bounds.
void *vec_get(Vector *vec, size_t index) {
if(index >= vec->length) {
return NULL; //out of bounds, as promised in the comment above
}
return vec->data[index];
}
//Resizes the vector to 1.5x its current capacity.
void vec_resize(Vector *vec) {
vec->capacity *= 1.5;
void *data = realloc(vec->data, sizeof(void*) * vec->capacity);
if(data == NULL) {
free(vec->data);
fprintf(stderr, "Error allocating memory.");
exit(EXIT_FAILURE);
}
vec->data = data;
}
This seems to be where the magic happens, which I do not yet understand:
void *data = malloc(...);
vec->data = data;
malloc/calloc return a void pointer, so I either have to declare a pointer of an actual type or just use the returned void pointer. So the first line is clear.
vec->data is, assuming I do not use a double pointer in the struct definition, equivalent to (*vec).data as far as I understand it. So this line should simply assign a void pointer to a void pointer.
Could someone explain in simple terms why exactly a single void pointer is not enough here, or where I misunderstand something?

But to be honest, I don't fully understand, why I need a double void pointer here.
Some background first - maybe you already know that:
A pointer of the type someType * is a pointer to some variable of the type someType or to an array of variables of the type someType.
A pointer of the type someType ** is a pointer to a variable of the type someType * - this means: A pointer to a pointer to a variable of the type someType.
A pointer of the type void * is a pointer to anything; because the compiler does not know what kind of element this pointer points to, it is not possible to access such an element directly.
In contrast to this, it is known what variable a pointer of the type void ** points to: It points to a variable of the type void *.
Why you need void** in this position:
The key lines are:
vec->data[vec->length] = data;
...
return vec->data[index];
In these lines, the code accesses the data that vec->data points to. For this reason, vec->data cannot be void *; it must be xxx *, where xxx is the type of the data that vec->data points to. Because vec->data points to a pointer of the type void *, xxx is void *, so xxx * is void **.
vec->data = data;
Your observation is correct: vec->data is of the type void ** and data is of the type void *.
The reason is that malloc() returns some memory and the compiler does not know which kind of data is stored in this memory. So the value returned by malloc() is void * and not void **.
In the automotive industry, you would use an explicit pointer cast like this:
vec->data = (void **)data;
The expression (xxx *)y tells the compiler that the pointer y points to some data of the type xxx. So (void **) tells the compiler that the pointer points to an element of the type void *.
However, in desktop applications you often don't write the (void **), because C converts void * to any other object pointer type implicitly.
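To illustrate the two spellings discussed above, here is a minimal sketch. The function name demo_cast_optional and its variables are made up for this illustration and are not part of the question's code. It only shows that the assignment with and without the (void **) cast yields the same pointer, and that indexing works once the pointer has the type void **.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical demo function, not part of the question's code. */
int demo_cast_optional(void) {
    int x = 5;

    /* malloc only knows the byte count, so it returns void*. */
    void *raw = malloc(4 * sizeof(void *));
    if (raw == NULL) {
        return -1;
    }

    void **no_cast = raw;            /* implicit conversion, valid C */
    void **with_cast = (void **)raw; /* explicit cast, same result */
    assert(no_cast == with_cast);

    /* Indexing works because the element type (void*) is now known. */
    no_cast[0] = &x;
    int got = *(int *)with_cast[0];

    free(raw);
    return got; /* the value of x, read back through the array */
}
```

Writing raw[0] = &x; instead would not compile, because raw is a void * and the compiler has no element type to index with.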

If you have a pointer of the type
T *p1;
where T is some type specifier, for example void, then a pointer to this pointer is declared like
T **p2 = &p1;
In this call of calloc
calloc(INITIAL_SIZE, sizeof(void*))
you allocate an array of pointers of the type void *. The function returns a pointer to the first element of the allocated array. So you should write
void **data = calloc(INITIAL_SIZE, sizeof(void*));
To make this clearer, assume that you need to dynamically allocate an integer array. In this case you would write
int *data = calloc( INITIAL_SIZE, sizeof( int ) );
So dereferencing the pointer data, as in *data, gives you an object of the type int, more precisely the first element of the allocated array.
When the elements of the array have the type void *, dereferencing the pointer data, as in *data, must give you a pointer of the type void * (the first element of the allocated array). So for the operation to be correct, the pointer data must have the type void **.
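The comparison between the int array and the void * array can be sketched side by side. The function name demo_element_types and the variable names are assumptions made for this example only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical demo function: the pointer type is "element type plus one star". */
int demo_element_types(void) {
    int value = 42;

    int *ints = calloc(16, sizeof(int));      /* elements are int   */
    void **ptrs = calloc(16, sizeof(void *)); /* elements are void* */
    if (ints == NULL || ptrs == NULL) {
        free(ints);
        free(ptrs);
        return -1;
    }

    *ints = 7;      /* *ints has the type int   */
    *ptrs = &value; /* *ptrs has the type void* */

    assert(ints[0] == 7);
    assert(ptrs[0] == (void *)&value);

    /* The stored void* must be converted back before it is dereferenced. */
    int result = *ints + *(int *)*ptrs;

    free(ints);
    free(ptrs);
    return result; /* 7 + 42 */
}
```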

Related

Understanding use of memcpy on memory allocation

I am looking at the source code of e2fsprogs and want to understand its internal memory routines for allocating and freeing.
More to the point: why do they use memcpy instead of assigning the pointer directly?
Allocate
For example ext2fs_get_mem is:
/*
* Allocate memory. The 'ptr' arg must point to a pointer.
*/
_INLINE_ errcode_t ext2fs_get_mem(unsigned long size, void *ptr)
{
void *pp;
pp = malloc(size);
if (!pp)
return EXT2_ET_NO_MEMORY;
memcpy(ptr, &pp, sizeof (pp));
return 0;
}
I guess the local variable is used so as not to invalidate the passed ptr if malloc fails.
Why memcpy instead of setting ptr to pp on success?
Free
The pointer is copied to a local variable, then freed, and then a null pointer is copied back with memcpy through the passed pointer to pointer. As the allocation uses memcpy, I guess it has to do some juggling on free as well.
Can it not free directly?
And what does the last memcpy do? Isn't sizeof(p) the size of an int here?
/*
* Free memory. The 'ptr' arg must point to a pointer.
*/
_INLINE_ errcode_t ext2fs_free_mem(void *ptr)
{
void *p;
memcpy(&p, ptr, sizeof(p));
free(p);
p = 0;
memcpy(ptr, &p, sizeof(p));
return 0;
}
Example of use:
ext2_file_t is defined as:
typedef struct ext2_file *ext2_file_t;
where ext2_file has, amongst other members, char *buf.
In dump.c : dump_file()
Here we have:
ext2_file_t e2_file;
retval = ext2fs_file_open(current_fs, ino, 0, &e2_file);
It calls ext2fs_file_open(), which does:
ext2_file_t file;
retval = ext2fs_get_mem(sizeof(struct ext2_file), &file);
retval = ext2fs_get_array(3, fs->blocksize, &file->buf);
And the free routine is for example:
if (file->buf)
ext2fs_free_mem(&file->buf);
ext2fs_free_mem(&file);
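You cannot assign directly to the ptr parameter, as it is a local variable of the function; such an assignment would never reach the caller. memcpy to ptr, in contrast, writes to the location ptr points to, which is the caller's pointer variable. Compare the following usage code: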
struct SomeData* data;
//ext2fs_get_mem(256, data); // wrong!!!
ext2fs_get_mem(256, &data);
// ^ (!)
You would achieve exactly the same with a double pointer indirection:
_INLINE_ errcode_t ext2fs_get_mem_demo(unsigned long size, void** ptr)
{
*ptr = malloc(size);
return *ptr ? 0 : EXT2_ET_NO_MEMORY;
}
but this variant requires the pointer being passed to be of type void*, which the original variant avoids:
void* p;
ext2fs_get_mem_demo(256, &p);
struct SomeData* data = p;
Note: this needs one additional variable and one additional line of code (or at the very least a cast).
Note, too, that in the usage example ext2_file_t should be a typedef to a pointer type to make this work correctly (or uintptr_t), or at least have a pointer as its first member (the address of a struct and the address of its first member are guaranteed to be the same in C).
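Here is a minimal sketch of the memcpy pattern used with a typed pointer. The names get_mem, free_mem, struct some_data and demo_get_mem are hypothetical stand-ins, not the e2fsprogs functions. The sketch also assumes that all object pointers share the size and representation of void *, which holds on mainstream platforms but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the allocation wrapper: 'ptr' must point to a pointer. */
static int get_mem(size_t size, void *ptr) {
    void *pp = malloc(size);
    if (pp == NULL) {
        return 1; /* the caller's pointer is left untouched on failure */
    }
    /* Copy the address itself into the caller's pointer variable. */
    memcpy(ptr, &pp, sizeof pp);
    return 0;
}

/* Hypothetical stand-in for the free wrapper: frees and then nulls the caller's pointer. */
static void free_mem(void *ptr) {
    void *p;
    memcpy(&p, ptr, sizeof p); /* read the caller's pointer value */
    free(p);
    p = NULL;
    memcpy(ptr, &p, sizeof p); /* overwrite it with a null pointer */
}

struct some_data {
    int id;
    char name[16];
};

int demo_get_mem(void) {
    struct some_data *data = NULL;

    /* No intermediate void* variable and no cast at the call site. */
    if (get_mem(sizeof *data, &data) != 0) {
        return -1;
    }
    data->id = 7;
    int id = data->id;

    free_mem(&data);
    assert(data == NULL); /* assumes a null void* and a null struct pointer share a representation */
    return id;
}
```

The second memcpy in free_mem is what resets the caller's pointer, so a later accidental use hits a null pointer instead of freed memory.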
/* The 'ptr' arg must point to a pointer. */
can be read as "The ptr can point to pointer to ANYTHING".
It is a very simple malloc-wrapper in a library; to be useful it has to work for any type. So void * is the argument.
With a real type the function looks like this, with direct pointer assignment:
int g(unsigned long size, int **ptr)
{
void *pp;
pp = malloc(size);
if (!pp)
return 1;
*ptr = pp;
return 0;
}
The same *ptr = pp gives an invalid-void error with void *ptr as the argument declaration. Somewhat disappointing, but then again it is called void *, not any *.
With void **ptr there is a type warning like:
expected 'void **' but argument is of type 'int **'
So memcpy to the rescue. It looks like even without optimization, the call is replaced by a quadword MOV.

How to get struct address inside array of structs?

I'm trying to get a struct's address.
I want to store the address in an int * and change it by adding numbers to the int *. I tried several ways, but I can't solve it.
struct num_d {
unsigned char data;
unsigned char pad1;
unsigned char pad2;
unsigned char pad3;
};
struct num_d **m = malloc(sizeof(struct num_d *) * row);
for (int i = 0; i < row; i++)
{
m[i] = malloc(sizeof(struct num_d) * col);
}
How can I get m[0][0]'s address in an int *?
First things first: let's typedef your struct, so we can type less and be clearer:
typedef struct num_d num_d;
void pointer
A pointer to void is a "generic" pointer type. A void * can be converted to any other object pointer type without an explicit cast. You cannot dereference a void * or do pointer arithmetic with it; you must first convert it to a pointer to a complete type (int *, for example) and then dereference it or do the arithmetic.
Now, malloc() returns a void * that points to the allocated heap buffer (if the allocation fails, the return value is NULL).
Your code becomes:
num_d** m = malloc(sizeof(num_d*) * row); /*m points to an array of num_d* pointers (not initialized)*/
for (int i = 0; i < row; i++)
{
m[i] = malloc(sizeof(num_d) * col); /*each element of m points to an array of col num_d structs on the heap*/
}
On typical platforms sizeof(void*) is the same as the size of any other object pointer (function pointers can differ on some machines/OSes), although the C standard does not guarantee this.
putting it all together
How can I get m[0][0]'s address in an int *?
Strictly speaking, m is not an array of void *: it points to the first element of an array of num_d * pointers, and each m[i] holds the heap address of a row of num_d structs.
If you want the start address of the i-th row, just use m[i]; it already is the address of m[i][0]. Assigning it to an int * without a cast is a constraint violation in C, because num_d * and int * are incompatible pointer types:
int* ptr = m[i];
Compilers diagnose this, typically with a warning (some newer compilers treat it as an error):
warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
With an explicit cast the conversion compiles without a diagnostic:
int* ptr = (int*)m[i];
I don't know why you need such behavior; it makes more sense to keep the pointer as a num_d*, and reading the struct through an int* is not guaranteed to work.
If you want the first data member of the struct num_d, or its address, access it through a num_d* (the casts below are redundant when m is declared as num_d**, but harmless):
unsigned char data = ((num_d*)m[i])->data;
unsigned char* p_data = &((num_d*)m[i])->data;
You don't need to have the address in an int * in order to add to it. The [] operator works by adding to the pointer and dereferencing the result.
You can just write *(m[0] + 1) to get the second element.
How about:
int *ptr = (int *) m[0];
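As a sketch of the alternative to the int * cast: the address of m[0][0] can stay a struct num_d *, and pointer arithmetic on it moves in whole structs. The function name demo_element_address and the row/col values are assumptions for this example. Where the four bytes really are wanted as one integer, the sketch copies them with memcpy instead of reading through an int *.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct num_d {
    unsigned char data;
    unsigned char pad1;
    unsigned char pad2;
    unsigned char pad3;
};

/* Hypothetical demo function; row and col are example values. */
int demo_element_address(void) {
    size_t row = 2, col = 3;
    int result = -1;

    struct num_d **m = malloc(row * sizeof *m);
    if (m == NULL) {
        return -1;
    }
    for (size_t i = 0; i < row; i++) {
        m[i] = calloc(col, sizeof **m); /* zero-filled rows */
    }

    if (m[0] != NULL && m[1] != NULL) {
        m[0][1].data = 9;

        struct num_d *p = &m[0][0]; /* address of the first element */
        assert(p == m[0]);          /* m[0] already is that address */
        assert(p + 1 == &m[0][1]);  /* + 1 advances by one whole struct */

        /* Copy the bytes of m[0][1] into an integer instead of aliasing. */
        uint32_t word = 0;
        memcpy(&word, p + 1, sizeof word);
        assert(word != 0); /* one byte is 9, whatever the byte order */

        result = (p + 1)->data;
    }

    for (size_t i = 0; i < row; i++) {
        free(m[i]); /* free(NULL) is a no-op */
    }
    free(m);
    return result;
}
```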

Assignment from void pointer to another void pointer

I want to copy the bits from one void * to another void *.
How can I do it?
I tried this:
static void* copyBlock(void* ptr) {
if (!ptr) {
return NULL;
}
int sizeOfBlock=*(int*)ptr+13;
void* copy = malloc(sizeOfBlock);
if (!copy) {
return NULL;
}
for(int i=0;i<sizeOfBlock;i++){
*(copy+i)=*(ptr+i);
}
return copy;
}
but I get: invalid use of void expression
You cannot dereference, index, or perform pointer arithmetic on a void pointer, because it has no base type and therefore no object size. You must cast the void pointer to a pointer to the type of the data units you are copying, so that the compiler knows the size of the data to copy.
All that said, you'd be better off using:
memcpy( copy, ptr, sizeOfBlock ) ;
This design (storing block size inside of a block without any struct) seems dangerous to me, but I still know the answer.
*(copy+i)=*(ptr+i);
Here you get the error, because you can't dereference a void pointer. You need to cast it to a pointer to some type first. Like this:
((char *)copy)[i] = ((char *)ptr)[i];
You should use the memcpy function:
memcpy(copy, ptr, sizeOfBlock);
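memcpy takes void * parameters, so this call needs no cast, in C or in C++. If you are compiling as C++ and not as C, it is the result of malloc that needs a cast, because C++ does not convert void * to other pointer types implicitly. Casting the arguments is allowed but redundant:
memcpy((char *) copy, (const char *) ptr, sizeOfBlock);
Note: The parameter of the function should be const-qualified (for example const void *ptr), to make sure you don't change the contents of ptr by mistake.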
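Putting the suggestions together, a corrected copy function could look roughly like this. It is a sketch under the question's own assumption that the block starts with an int and that the total size is that int plus 13; the names copy_block, HEADER_EXTRA and demo_copy_block are made up for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HEADER_EXTRA 13 /* constant taken from the question; its meaning is not stated there */

/* Hypothetical rewrite of copyBlock using memcpy instead of a void* loop. */
static void *copy_block(const void *ptr) {
    if (ptr == NULL) {
        return NULL;
    }

    /* Read the leading int with memcpy, which makes no alignment assumption. */
    int payload;
    memcpy(&payload, ptr, sizeof payload);
    if (payload < 0) {
        return NULL;
    }

    size_t size = (size_t)payload + HEADER_EXTRA;
    void *copy = malloc(size);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, ptr, size); /* byte-wise copy, no dereference of void* */
    return copy;
}

/* Builds a small block, copies it, and reports whether the bytes match. */
int demo_copy_block(void) {
    unsigned char block[HEADER_EXTRA + 3] = {0};
    int payload = 3;
    memcpy(block, &payload, sizeof payload);
    block[13] = 'a';
    block[14] = 'b';
    block[15] = 'c';

    unsigned char *copy = copy_block(block);
    if (copy == NULL) {
        return -1;
    }
    assert(copy != block); /* a separate allocation */
    int same = memcmp(copy, block, sizeof block) == 0;
    free(copy);
    return same;
}
```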

What does the declaration void** mean in the C language?

I'm beginning to learn C and came across the following code:
public void** list_to_array(List* thiz){
int size = list_size(thiz);
void **array = malloc2(sizeof(void *) * size);
int i=0;
list_rewind(thiz);
for(i=0; i<size; i++){
array[i] = list_next(thiz);
}
list_rewind(thiz);
return array;
}
I don't understand the meaning of void**. Could someone explain it with some examples?
void** is a pointer to a pointer to void (unspecified type). It means that the variable (memory location) contains an address to a memory location, that contains an address to another memory location, and what is stored there is not specified. In this question's case it is a pointer to an array of void* pointers.
Sidenote: A void pointer can't be dereferenced, but a void** can.
void *a[100];
void **aa = a;
By doing this one should be able to do e.g. aa[17] to get at the 18th element of the array a.
To understand such declarations you can use this tool and might as well check a related question or two.
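The aa[17] example from this answer can be sketched as follows. The function name demo_index and the values array are assumptions added so that the stored pointers point at something readable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical demo function: a void** walks an array of void*. */
int demo_index(void) {
    static int values[100];
    void *a[100];

    for (size_t i = 0; i < 100; i++) {
        values[i] = (int)i * 10;
        a[i] = &values[i]; /* any object pointer converts to void* */
    }

    void **aa = a; /* the array decays to a pointer to its first element */

    assert(aa[17] == a[17]);   /* same element, the 18th */
    assert(aa + 17 == &a[17]); /* arithmetic steps by sizeof(void*) */

    /* aa[17] is a void*, so it must be converted before dereferencing. */
    return *(int *)aa[17];
}
```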
void** is a pointer to void*, or a pointer to a void pointer if you prefer!
This notation is traditionally used in C to implement a matrix, for example. So, in the matrix case, that would be a pointer to an array of pointers.
Normally void * pointers are used to denote a pointer to an unknown data type. In this case your function returns an array of such pointers thus the double star.
In C, a pointer is often used to reference an array. Eg the following assignment is perfectly legal:
char str1[10];
char *str2 = str1;
Now when void is used, it means that instead of char you have a variable of unknown type.
Pointers to an unknown data type are useful for writing generic algorithms. Eg. the qsort function in standard C library is defined as:
void qsort ( void * base,
size_t num,
size_t size,
int ( * comparator )
( const void *, const void * ) );
The sorting algorithm itself is generic, but has no knowledge of the contents of the data. Thus the user has to provide an implementation of a comparator that can deal with it. The algorithm will call the comparator with two pointers to the elements to be compared. These pointers are of void * type, because there is no information about the type of data being sorted.
Take a look at this thread for more examples
http://forums.fedoraforum.org/showthread.php?t=138213
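A short sketch of the qsort usage described above, sorting an int array. The names compare_ints and demo_qsort are made up for this example; qsort itself is the standard library function from stdlib.h.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparator: qsort passes pointers to two elements as const void*. */
static int compare_ints(const void *lhs, const void *rhs) {
    int a = *(const int *)lhs;
    int b = *(const int *)rhs;
    return (a > b) - (a < b); /* avoids the overflow that a - b could cause */
}

/* Hypothetical demo function: sorts four ints in ascending order. */
int demo_qsort(void) {
    int values[] = {42, 7, 19, 3};
    size_t count = sizeof values / sizeof values[0];

    qsort(values, count, sizeof values[0], compare_ints);

    assert(values[0] == 3 && values[3] == 42);
    return values[1]; /* second smallest element */
}
```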
void pointers are used to hold the address of any data type. void** means pointer to void pointer. void pointers are used where we want a function to receive different types of data as its argument. Please check the example below:
void func_for_int(void *int_arg)
{
int *ptr = (int *)int_arg;
//some code
}
void func_for_char(void *char_arg)
{
char *ptr = (char *)char_arg;
//some code
}
int common_func(void * arg, void (*func)(void *arg))
{
func(arg);
return 0; //the function is declared to return int, so return a value
}
int main()
{
int a = 10;
char b = 5;
common_func((void *)&a, func_for_int);
common_func((void *)&b, func_for_char);
return 0;
}

What is the use of void** as an argument in a function?

I have to implement a wrapper for malloc called mymalloc with the following signature:
void mymalloc(int size, void ** ptr)
Is the void** needed so that no type casting will be needed in the main program and the ownership of the correct pointer (without type cast) remains in main()?
void mymalloc(int size, void ** ptr)
{
*ptr = malloc(size) ;
}
main()
{
int *x;
mymalloc(4,&x); // do we need to type-cast it again?
// How does the pointer mechanism work here?
}
Now, will the pointer being passed need to be type-cast again, or will it get type-cast implicitly?
I do not understand how this works.
malloc returns a void*. For your function, the user is expected to create their own, local void* variable first, and give you a pointer to it; your function is then expected to populate that variable. Hence you have an extra pointer in the signature, a dereference in your function, and an address-of operator in the client code.
The archetypal pattern is this:
void do_work_and_populate(T * result)
{
*result = the_fruits_of_my_labour;
}
int main()
{
T data; // uninitialized!
do_work_and_populate(&data); // pass address of destination
// now "data" is ready
}
For your usage example, substitute T = void *, and the fruits of your labour are the results of malloc (plus checking).
However, note that an int* isn't the same as a void*, so you cannot just pass the address of x off as the address of a void pointer. Instead, you need:
void * p;
mymalloc(4, &p);
int * x = p; // conversion is OK
Contrary to void *, the type void ** is not a generic pointer type so you need to cast before the assignment if the type is different.
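A minimal sketch of the wrapper and a call that needs no cast, going through a void * variable first. The size parameter is written as size_t here instead of the int from the question, which is a deliberate deviation, and demo_mymalloc is a made-up name for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Wrapper in the spirit of the question; size_t replaces the original int parameter. */
static void mymalloc(size_t size, void **ptr) {
    *ptr = malloc(size); /* writes into the caller's void* variable */
}

/* Hypothetical demo function showing a call without any cast. */
int demo_mymalloc(void) {
    void *p = NULL;
    mymalloc(sizeof(int), &p); /* &p really is a void**, so no cast is needed */
    if (p == NULL) {
        return -1;
    }

    int *x = p; /* void* converts to int* implicitly in C */
    assert((void *)x == p);

    *x = 4;
    int result = *x;
    free(p);
    return result;
}
```

Passing &x directly, with x declared as int *, would hand the function an int **, which is not compatible with void ** and draws a diagnostic.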
void ** ptr
Here, "ptr" is a pointer to a pointer, and can be treated as a pointer to an array of pointers. Since your result is stored there (nothing returned from mymalloc), you need to clarify what you wish to allocate into "ptr". The argument "size" is not a sufficient description.

Resources