Understanding use of memcpy on memory allocation - c

I'm looking at the source code for e2fsprogs and want to understand its internal memory routines for allocating and freeing.
More to the point: why use memcpy instead of direct assignment?
Allocate
For example ext2fs_get_mem is:
/*
* Allocate memory. The 'ptr' arg must point to a pointer.
*/
_INLINE_ errcode_t ext2fs_get_mem(unsigned long size, void *ptr)
{
void *pp;
pp = malloc(size);
if (!pp)
return EXT2_ET_NO_MEMORY;
memcpy(ptr, &pp, sizeof (pp));
return 0;
}
I guess the local variable is used so as not to invalidate the passed ptr if malloc fails.
Why memcpy instead of setting ptr to pp on success?
Free
The pointer is copied into a local variable, then freed, then a zeroed pointer is memcpy'd back through the passed pointer to pointer. As the allocation uses memcpy, I guess it has to do some juggling on free as well.
Can it not free directly?
And what does the last memcpy do? Isn't sizeof(p) the size of an int here?
/*
* Free memory. The 'ptr' arg must point to a pointer.
*/
_INLINE_ errcode_t ext2fs_free_mem(void *ptr)
{
void *p;
memcpy(&p, ptr, sizeof(p));
free(p);
p = 0;
memcpy(ptr, &p, sizeof(p));
return 0;
}
Example of use:
ext2_file_t is defined as:
typedef struct ext2_file *ext2_file_t;
where ext2_file has, amongst other members, char *buf.
In dump.c : dump_file()
Here we have:
ext2_file_t e2_file;
retval = ext2fs_file_open(current_fs, ino, 0, &e2_file);
It calls ext2fs_file_open(), which does:
ext2_file_t file;
retval = ext2fs_get_mem(sizeof(struct ext2_file), &file);
retval = ext2fs_get_array(3, fs->blocksize, &file->buf);
And the free routine is for example:
if (file->buf)
ext2fs_free_mem(&file->buf);
ext2fs_free_mem(&file);

You cannot assign directly to the ptr parameter, as this is a local variable. memcpying to ptr actually writes to where the pointer points to. Compare the following usage code:
struct SomeData* data;
//ext2fs_get_mem(256, data); // wrong!!!
ext2fs_get_mem(256, &data);
// ^ (!)
You would achieve exactly the same with a double pointer indirection:
_INLINE_ errcode_t ext2fs_get_mem_demo(unsigned long size, void** ptr)
{
*ptr = malloc(size);
return *ptr ? 0 : EXT2_ET_NO_MEMORY;
}
but this variant requires the pointer whose address is passed to be of type void*, which the original variant avoids:
void* p;
ext2fs_get_mem_demo(256, &p);
struct SomeData* data = p;
Note: One additional variable and one additional line of code (or at very least one would need a cast)...
Note, too, that in the usage example ext2_file_t should be a typedef for a pointer type to make this work correctly (or uintptr_t), or at least have a pointer as its first member (the address of a struct and the address of its first member are guaranteed to be the same in C).
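The memcpy variant can be sketched as follows. The names demo_get_mem, demo_make_data, struct some_data and DEMO_NO_MEMORY are hypothetical and not part of e2fsprogs. Like the original, the sketch assumes that every object pointer type has the same size and representation as void *, which holds on mainstream platforms but is not guaranteed by the C standard.

```c
#include <stdlib.h>
#include <string.h>

#define DEMO_NO_MEMORY 1  /* hypothetical error code, stands in for EXT2_ET_NO_MEMORY */

struct some_data { int value; };  /* hypothetical payload type */

/* 'ptr' must be the address of a pointer variable (of any object pointer type). */
int demo_get_mem(size_t size, void *ptr)
{
    void *pp = malloc(size);
    if (!pp)
        return DEMO_NO_MEMORY;      /* the caller's pointer is left untouched */
    memcpy(ptr, &pp, sizeof(pp));   /* overwrite the caller's pointer variable */
    return 0;
}

/* Hypothetical caller: no void * temporary and no cast is needed. */
struct some_data *demo_make_data(int value)
{
    struct some_data *data = NULL;
    if (demo_get_mem(sizeof(*data), &data) != 0)
        return NULL;
    data->value = value;
    return data;
}
```

The caller passes &data directly because any object pointer, including a pointer to a pointer, converts implicitly to void *.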

/* The 'ptr' arg must point to a pointer. */
can be read as "The ptr can point to pointer to ANYTHING".
It is a very simple malloc-wrapper in a library; to be useful it has to work for any type. So void * is the argument.
With a real type the function looks like this, with direct pointer assignment:
int g(unsigned long size, int **ptr)
{
void *pp;
pp = malloc(size);
if (!pp)
return 1;
*ptr = pp;
return 0;
}
The same *ptr = pp gives an invalid-void error when the argument is declared as void *ptr. Somewhat disappointing, but then again it is called void *, not any *.
With void **ptr there is a type warning like:
expected 'void **' but argument is of type 'int **'
So memcpy to the rescue. It looks like even without optimization, the call is replaced by a quadword MOV.
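On the free side, sizeof(p) is the size of the pointer object p, that is sizeof(void *), not the size of an int. The last memcpy writes a null pointer back into the caller's variable so that it no longer dangles. A minimal sketch, with the hypothetical name demo_free_mem standing in for ext2fs_free_mem, and assuming that the caller's pointer type has the same size and representation as void *:

```c
#include <stdlib.h>
#include <string.h>

/* 'ptr' must be the address of a pointer variable that holds a malloc'd block. */
int demo_free_mem(void *ptr)
{
    void *p;
    memcpy(&p, ptr, sizeof(p));   /* read the caller's pointer value */
    free(p);
    p = NULL;
    memcpy(ptr, &p, sizeof(p));   /* reset the caller's pointer to NULL */
    return 0;
}
```

A plain free(ptr) would be wrong here, because ptr is the address of the caller's pointer variable, not the address of the allocated block.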

Related

Why is a double-void pointer required here? Dynamic "generic" array

I tried to implement a form of collections-library. I do it all the time, when learning a new language, because it teaches most of the language details.
So, I started with a form of "generic" dynamic array. Well it is not really generic, because it just holds pointers to the actual data.
But to be honest, I don't fully understand, why I need a double void pointer here.
The Vector struct is defined in my header file (I declared every method and all #includes in the header file, but I omitted this here to keep the code readable. I also omitted some bounds checks):
typedef struct {
size_t capacity; //the allocated capacity
size_t length; //the actual length
void **data; //here I don't fully understand, why I need a double pointer.
} Vector;
Here is my implementation of a few methods, where the compiler complains when I use a single void pointer in my struct, so void *data instead of void **data.
#include "utils.h"
const size_t INITIAL_SIZE = 16;
//Creates a new empty vector.
Vector *vec_new(void) {
    printf("sizeof Vector is: %zu", sizeof(Vector)); //size_t is printed with %zu
    Vector *vec = malloc(sizeof(Vector));
    if(vec == NULL) {
        fprintf(stderr, "Error allocating memory.");
        exit(EXIT_FAILURE);
    }
    vec->length = 0;
    vec->capacity = INITIAL_SIZE;
    void *data = calloc(INITIAL_SIZE, sizeof(void*));
    if(data == NULL) {
        free(vec); //vec->data is not set yet, so release the struct itself
        fprintf(stderr, "Error allocating memory.");
        exit(EXIT_FAILURE);
    }
    vec->data = data;
    return vec;
}
//This method appends the specified value at the end of the vector.
void vec_push(Vector *vec, void *data) {
if(vec->length == vec->capacity-1) {
vec_resize(vec);
}
vec->data[vec->length] = data;
vec->length += 1;
}
//gets the value at the specified index or NULL if index is out of bounds.
void *vec_get(Vector *vec, size_t index) {
return vec->data[index];
}
//Resizes the vector to 1.5x its current capacity.
void vec_resize(Vector *vec) {
vec->capacity *= 1.5;
void *data = realloc(vec->data, sizeof(void*) * vec->capacity);
if(data == NULL) {
free(vec->data);
fprintf(stderr, "Error allocating memory.");
exit(EXIT_FAILURE);
}
vec->data = data;
}
It seems like this is where the magic happens, which I do not yet understand:
void *data = malloc(...);
vec->data = data;
malloc/calloc return a void pointer, so I either have to declare an actual type or just use the returned void pointer. So the first line is clear.
vec->data is, as far as I understand it, equivalent to (*vec).data, assuming I do not use a double pointer in the struct definition. So basically this line should assign a void pointer to a void pointer.
Can someone explain to me in simple terms why exactly a single void pointer is not enough here, or where I might be misunderstanding something?
But to be honest, I don't fully understand, why I need a double void pointer here.
Some background first - maybe you already know that:
A pointer of the type someType * is a pointer to some variable of the type someType or to an array of variables of the type someType.
A pointer of the type someType ** is a pointer to a variable of the type someType * - this means: A pointer to a pointer to a variable of the type someType.
A pointer of the type void * is a pointer to anything; because the compiler does not know to what kind of element this pointer points to, it is not possible to access such an element directly.
In contrast to this, it is known what variable a pointer of the type void ** points to: It points to a variable of the type void *.
Why you need void** in this position:
The key lines are:
vec->data[vec->length] = data;
...
return vec->data[index];
In these lines, the code accesses the data vec->data points to. For this reason, vec->data cannot be void *; it must be xxx *, where xxx is the type of data the pointer vec->data points to. And because vec->data points to a pointer of the type void *, xxx is void *, so xxx * is void **.
vec->data = data;
Your observation is correct: vec->data is of the type void ** and data is of the type void *.
The reason is that malloc() returns some memory and the compiler does not know which kind of data is stored in this memory. So the value returned by malloc() is void * and not void **.
In the automotive industry, you would use an explicit pointer cast like this:
vec->data = (void **)data;
The expression (xxx *)y tells the compiler that the pointer y points to some data of the type xxx. So (void **) tells the compiler that the pointer points to an element of the type void *.
However, in desktop applications you often don't write the (void **).
If you have a pointer of the type
T *p1;
where T is some type specifier, for example void, then a pointer to this pointer is declared like
T **p2 = &p1;
In this call of calloc
calloc(INITIAL_SIZE, sizeof(void*))
you are going to allocate an array of pointers of the type void *. The function returns a pointer to the first element of the allocated array. So you need to write
void **data = calloc(INITIAL_SIZE, sizeof(void*));
To make it more clear let's assume that you need to allocate dynamically an integer array. In this case you will write
int *data = calloc( INITIAL_SIZE, sizeof( int ) );
So by dereferencing the pointer data, as in *data, you get an object of the type int, more precisely the first element of the allocated array.
When the elements of the array have the type void *, dereferencing the pointer data, as in *data, must yield a pointer of the type void * (the first element of the allocated array). So for the operation to be correct, the pointer data must have the type void **.
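The points made in these answers can be sketched as a small pointer vector. This is a minimal illustration, not the asker's implementation: the names Vec, vec_create, vec_append, vec_at and vec_destroy are hypothetical, and it doubles the capacity rather than growing by 1.5x.

```c
#include <stdlib.h>

typedef struct {
    size_t capacity;  /* number of allocated slots */
    size_t length;    /* number of slots in use */
    void **data;      /* points to the first element of an array of void * */
} Vec;

Vec *vec_create(size_t initial_capacity)
{
    Vec *vec = malloc(sizeof *vec);
    if (vec == NULL)
        return NULL;
    if (initial_capacity == 0)
        initial_capacity = 1;
    /* calloc returns void *, which converts implicitly to void ** in C. */
    vec->data = calloc(initial_capacity, sizeof *vec->data);
    if (vec->data == NULL) {
        free(vec);
        return NULL;
    }
    vec->capacity = initial_capacity;
    vec->length = 0;
    return vec;
}

/* Returns 0 on success, -1 if growing the array failed. */
int vec_append(Vec *vec, void *item)
{
    if (vec->length == vec->capacity) {
        size_t new_capacity = vec->capacity * 2;
        void **grown = realloc(vec->data, new_capacity * sizeof *grown);
        if (grown == NULL)
            return -1;                 /* the old array is still valid */
        vec->data = grown;
        vec->capacity = new_capacity;
    }
    vec->data[vec->length++] = item;   /* indexing works because data is void ** */
    return 0;
}

/* Returns the stored pointer, or NULL if index is out of bounds. */
void *vec_at(const Vec *vec, size_t index)
{
    return index < vec->length ? vec->data[index] : NULL;
}

void vec_destroy(Vec *vec)
{
    if (vec != NULL) {
        free(vec->data);   /* frees the slot array, not the items it points to */
        free(vec);
    }
}
```

With void *data in the struct, the expression vec->data[i] would not compile, because the compiler would have to index an array whose element type is void.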

Assignment from void pointer to another void pointer

I want to copy the bits from one void * to another void *.
How can I do it?
I tried this:
static void* copyBlock(void* ptr) {
if (!ptr) {
return NULL;
}
int sizeOfBlock=*(int*)ptr+13;
void* copy = malloc(sizeOfBlock);
if (!copy) {
return NULL;
}
for(int i=0;i<sizeOfBlock;i++){
*(copy+i)=*(ptr+i);
}
return copy;
}
but I get: invalid use of void expression
You cannot dereference a void pointer, perform pointer arithmetic on it, or index it, because it has no base type and therefore no object size. You must therefore cast the void pointer to a pointer to the type of the data units you are copying, so that the compiler knows the size of each unit.
All that said, you'd be better off using:
memcpy( copy, ptr, sizeOfBlock ) ;
This design (storing block size inside of a block without any struct) seems dangerous to me, but I still know the answer.
*(copy+i)=*(ptr+i);
Here you get the error, because you can't dereference a void pointer. You need to cast it to pointer to something before. Like this:
((char *)copy)[i] = ((char *)ptr)[i];
You should use the memcpy function:
memcpy(copy, ptr, sizeOfBlock);
Casting the pointers to char pointers is harmless but not required, whether you compile as C or as C++, because memcpy takes void * and const void * parameters and any object pointer converts to them implicitly:
memcpy((char *) copy, (const char *) ptr, sizeOfBlock);
Note: The parameter of the function should be const char *ptr, to make sure you don't change the contents of ptr by mistake.
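Both suggestions can be sketched as follows. The function names are hypothetical, and copy_block here takes the size as an explicit parameter instead of reading it from the start of the block as the question does.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'd copy of 'size' bytes at 'src', or NULL on failure. */
void *copy_block(const void *src, size_t size)
{
    if (src == NULL || size == 0)
        return NULL;
    void *copy = malloc(size);
    if (copy == NULL)
        return NULL;
    memcpy(copy, src, size);   /* memcpy takes void *, so no cast is needed */
    return copy;
}

/* Byte-wise version of the question's loop, through unsigned char * views. */
void copy_bytes(void *dst, const void *src, size_t size)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < size; i++)
        d[i] = s[i];
}
```

The caller owns the block returned by copy_block and releases it with free.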

What is the use of void** as an argument in a function?

I have to implement a wrapper for malloc called mymalloc with the following signature:
void mymalloc(int size, void ** ptr)
Is the void** needed so that no type casting is required in the main program and ownership of the correct pointer (without a type cast) remains in main()?
void mymalloc(int size, void ** ptr)
{
*ptr = malloc(size) ;
}
main()
{
int *x;
mymalloc(4,&x); // do we need to type-cast it again?
// How does the pointer mechanism work here?
}
Now, will the pointer being passed need to be type-cast again, or will it get type-cast implicitly?
I do not understand how this works.
malloc returns a void*. For your function, the user is expected to create their own, local void* variable first, and give you a pointer to it; your function is then expected to populate that variable. Hence you have an extra pointer in the signature, a dereference in your function, and an address-of operator in the client code.
The archetypal pattern is this:
void do_work_and_populate(T * result)
{
*result = the_fruits_of_my_labour;
}
int main()
{
T data; // uninitialized!
do_work_and_populate(&data); // pass address of destination
// now "data" is ready
}
For your usage example, substitute T = void *, and the fruits of your labour are the results of malloc (plus checking).
However, note that an int* isn't the same as a void*, so you cannot just pass the address of x off as the address of a void pointer. Instead, you need:
void * p;
mymalloc(4, &p);
int * x = p; // conversion is OK
Contrary to void *, the type void ** is not a generic pointer type so you need to cast before the assignment if the type is different.
void ** ptr
Here, "ptr" is a pointer to a pointer, and can be treated as a pointer to an array of pointers. Since your result is stored there (nothing returned from mymalloc), you need to clarify what you wish to allocate into "ptr". The argument "size" is not a sufficient description.
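A minimal sketch of the void * temporary approach. The names my_malloc and make_int_array are hypothetical, and unlike the assignment's signature this wrapper takes a size_t and returns a status.

```c
#include <stdlib.h>

/* Stores the allocation in *ptr; returns 0 on success, -1 on failure. */
int my_malloc(size_t size, void **ptr)
{
    *ptr = malloc(size);
    return *ptr != NULL ? 0 : -1;
}

/* A caller that wants an int * goes through a void * temporary. */
int *make_int_array(size_t count)
{
    void *tmp = NULL;
    if (my_malloc(count * sizeof(int), &tmp) != 0)
        return NULL;
    return tmp;    /* void * converts to int * implicitly in C */
}
```

Passing the address of an int * directly would hand the function an int **, which does not convert implicitly to void **, so the compiler issues a diagnostic.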

How do I correctly use a void pointer in C?

Can someone explain why I do not get the value of the variable, but its memory address instead?
I need to use void* to point to "unsigned short" values.
As I understand void pointers, their size is unknown and their type is unknown.
Once I initialize them, however, they are known, right?
Why does my printf statement print the wrong value?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void func(int a, void *res){
res = &a;
printf("res = %d\n", *(int*)res);
int b;
b = * (int *) res;
printf("b =%d\n", b);
}
int main (int argc, char* argv[])
{
//trial 1
int a = 30;
void *res = (int *)a;
func(a, res);
printf("result = %d\n", (int)res);
//trial 2
unsigned short i = 90;
res = &i;
func(i, res);
printf("result = %d\n", (unsigned short)res);
return 0;
}
The output I get:
res = 30
b =30
result = 30
res = 90
b =90
result = 44974
One thing to keep in mind: C does not guarantee that int will be big enough to hold a pointer (including void*). That cast is not a portable thing/good idea. Use %p to printf a pointer.
Likewise, you're doing a "bad cast" here: void* res = (int*) a is telling the compiler: "I am sure that the value of a is a valid int*, so you should treat it as such." Unless you actually know for a fact that there is an int stored at memory address 30, this is wrong.
Fortunately, you immediately overwrite res with the address of the other a. (You have two vars named a and two named res, the ones in main and the ones in func. The ones in func are copies of the value of the one in main, when you call it there.) Generally speaking, overwriting the value of a parameter to a function is "bad form," but it is technically legal. Personally, I recommend declaring all of your functions' parameters as const 99% of the time (e.g. void func (const int a, const void* res))
Then, you cast res to an unsigned short. I don't think anybody's still running on a 16-bit address-space CPU (well, your Apple II, maybe), so that will definitely corrupt the value of res by truncating it.
In general, in C, typecasts are dangerous. You're overruling the compiler's type system, and saying: "look here, Mr Compiler, I'm the programmer, and I know better than you what I have here. So, you just be quiet and make this happen." Casting from a pointer to a non-pointer type is almost universally wrong. Casting between pointer types is more often wrong than not.
I'd suggest checking out some of the "Related" links further down this page to find a good overview of how C types and pointers work in general. Sometimes it takes reading over a few to really get a grasp of how this stuff fits together.
(unsigned short)res
is a cast applied to a pointer. res holds a memory address, so casting it to unsigned short gives you the address value truncated to an unsigned short, not the value stored at that address. To get the stored value, print
*(unsigned short*)res
The first cast, (unsigned short*)res, converts the void* pointer to a pointer to unsigned short. You can then read the value stored at the address res points to by dereferencing it with *.
If you have a void pointer ptr that you know points to an int, in order to access to that int write:
int i = *(int*)ptr;
That is, first cast it to a pointer-to-int with cast operator (int*) and then dereference it to get the pointed-to value.
You are casting the pointer directly to a value type, and although the compiler will happily do it, that's probably not what you want.
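The cast-then-dereference idiom can be sketched with two small helpers. The names read_ushort and store_int are hypothetical, and both rely on the caller passing the address of an object of the right type.

```c
/* Reads the unsigned short that 'res' points to. The caller must
   guarantee that 'res' really holds the address of an unsigned short. */
unsigned short read_ushort(const void *res)
{
    const unsigned short *p = res;   /* implicit conversion from void * */
    return *p;
}

/* Writes 'a' into the int that 'res' points to, so the caller sees it. */
void store_int(int a, void *res)
{
    *(int *)res = a;
}
```

Unlike the question's func, store_int writes through res instead of assigning a new address to its local copy of the pointer, so the result is visible to the caller.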
A void pointer is used in C as a kind of generic pointer. A void pointer variable can be used to contain the address of any variable type. The problem with a void pointer is once you have assigned an address to the pointer, the information about the type of variable is no longer available for the compiler to check against.
In general, void pointers should be avoided since the type of the variable whose address is in the void pointer is no longer available to the compiler. On the other hand, there are cases where a void pointer is very handy. However it is up to the programmer to know the type of variable whose address is in the void pointer variable and to use it properly.
Much of older C source has C style casts between type pointers and void pointers. This is not necessary with modern compilers and should be avoided.
The size of a void pointer variable is known. What is not known is the size of the variable whose pointer is in the void pointer variable. For instance here are some source examples.
// create several different kinds of variables
int iValue;
char aszString[6];
float fValue;
int *pIvalue = &iValue;
void *pVoid = 0;
int iSize = sizeof(*pIvalue); // get size of what int pointer points to, an int
int vSize = sizeof(*pVoid); // compile error, size of what void pointer points to is unknown
int vSizeVar = sizeof(pVoid); // compiles fine size of void pointer is known
pVoid = &iValue; // put the address of iValue into the void pointer variable
pVoid = &aszString[0]; // put the address of char string into the void pointer variable
pVoid = &fValue; // put the address of float into the void pointer variable
pIvalue = &fValue; // compiler error, address of float into int pointer not allowed
One way that void pointers have been used is by having several different types of structs which are provided as an argument for a function, typically some kind of a dispatching function. Since the interface for the function allows for different pointer types, a void pointer must be used in the argument list. Then the type of variable pointed to is determined by either an additional argument or inspecting the variable pointed to. An example of that type of use of a function would be something like the following. In this case we include an indicator as to the type of the struct in the first member of the various permutations of the struct. As long as all structs that are used with this function have as their first member an int indicating the type of struct, this will work.
struct struct_1 {
int iClass; // struct type indicator. must always be first member of struct
int iValue;
};
struct struct_2 {
int iClass; // struct type indicator. must always be first member of struct
float fValue;
};
void func2 (void *pStruct)
{
struct struct_1 *pStruct_1 = pStruct;
struct struct_2 *pStruct_2 = pStruct;
switch (pStruct_1->iClass) // this works because a struct is a kind of template or pattern for a memory location
{
case 1:
// do things with pStruct_1
break;
case 2:
// do things with pStruct_2
break;
default:
break;
}
}
void xfunc (void)
{
struct struct_1 myStruct_1 = {1, 37};
struct struct_2 myStruct_2 = {2, 755.37f};
func2 (&myStruct_1);
func2 (&myStruct_2);
}
Something like the above has a number of software design problems with the coupling and cohesion so unless you have good reasons for using this approach, it is better to rethink your design. However the C programming language allows you to do this.
There are some cases where the void pointer is necessary. For instance the malloc() function which allocates memory returns a void pointer containing the address of the area that has been allocated (or NULL if the allocation failed). The void pointer in this case allows for a single malloc() function that can return the address of memory for any type of variable. The following shows use of malloc() with various variable types.
void yfunc (void)
{
int *pIvalue = malloc(sizeof(int));
char *paszStr = malloc(sizeof(char)*32);
struct struct_1 *pStruct_1 = malloc (sizeof(*pStruct_1));
struct struct_2 *pStruct_2Array = malloc (sizeof(*pStruct_2Array)*21);
pStruct_1->iClass = 1; pStruct_1->iValue = 23;
func2(pStruct_1); // pStruct_1 is already a pointer so address of is not used
{
int i;
for (i = 0; i < 21; i++) {
pStruct_2Array[i].iClass = 2;
pStruct_2Array[i].fValue = 123.33f;
func2 (&pStruct_2Array[i]); // address of particular array element. could also use func2 (pStruct_2Array + i)
}
}
free(pStruct_1);
free(pStruct_2Array); // free the entire array which was allocated with single malloc()
free(pIvalue);
free(paszStr);
}
If what you want to do is pass the variable a by name and use it, try something like:
void func(int* src)
{
printf( "%d\n", *src );
}
If you get a void* from a library function, and you know its actual type, you should immediately store it in a variable of the right type:
int *ap = calloc( 1, sizeof(int) );
There are a few situations in which you must receive a parameter by reference as a void* and then cast it. The one I’ve run into most often in the real world is a thread procedure. So, you might write something like:
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
void* thread_proc( void* arg )
{
    const int a = *(int*)arg;
    /** Alternatively, with no explicit casts:
     * const int* const p = arg;
     * const int a = *p;
     */
    printf( "Daughter thread: %d\n", a );
    fflush(stdout); /* If more than one thread outputs, should be atomic. */
    return NULL;
}
int main(void)
{
    int a = 1;
    pthread_t tid;
    /* pthread_create returns 0 on success and stores the thread ID in tid. */
    if (pthread_create( &tid, NULL, thread_proc, &a ) != 0) {
        fprintf( stderr, "pthread_create failed\n" );
        return EXIT_FAILURE;
    }
    pthread_join(tid, NULL);
    return EXIT_SUCCESS;
}
If you want to live dangerously, you could pass a uintptr_t value cast to void* and cast it back, but beware of trap representations.
printf("result = %d\n", (int)res); is printing the value of res (a pointer) as a number.
Remember that a pointer is an address in memory, so this will print some random looking 32bit number.
If you wanted to print the value stored at that address, then you need *(int *)res: a void * cannot be dereferenced directly, so it must be cast to int * first.
Edit: if you want to print the value (i.e. the address) of a pointer, then you should use %p. It is essentially the same, but it formats the value better and copes with an int and a pointer having different sizes on your platform.
void *res = (int *)a;
a is an int, not a pointer, so maybe it should be:
void *res = &a;
The size of a void pointer is known; it's the size of an address, so the same size as any other pointer. You are freely converting between an integer and a pointer, and that's dangerous. If you mean to take the address of the variable a, you need to convert its address to a void * with (void *)&a.

Malloc inside another function

I have to allocate a struct from within another function, obviously using pointers.
I've been staring at this problem for hours and tried in a million different ways to solve it.
This is some sample code (very simplified):
...
some_struct s;
printf("Before: %d\n", &s);
allocate(&s);
printf("After: %d\n", &s);
...
/* The allocation function */
int allocate(some_struct *arg) {
arg = malloc(sizeof(some_struct));
printf("In function: %d\n", &arg);
return 0;
}
This does give me the same address before and after the allocate-call:
Before: -1079752900
In function: -1079752928
After: -1079752900
I know it's probably because it makes a copy in the function, but I don't know how to actually work on the pointer I gave as argument. I tried defining some_struct *s instead of some_struct s, but no luck. I tried with:
int allocate(some_struct **arg)
which works just fine (the allocate-function needs to be changed as well), BUT according to the assignment I may NOT change the declaration, and it HAS to be *arg.. And it would be most correct if I just have to declare some_struct s.. Not some_struct *s.
The purpose of the allocation function is to initialize a struct (a some_struct), which also includes allocating it.
One more thing I forgot to mention. The return 0 in the allocate function is reserved for some status messages and therefore I can't return the address using this.
Typically, I'd return the pointer from allocate:
void * allocate()
{
void * retval = malloc(sizeof(some_struct));
/* initialize *retval */
return retval;
}
If you want to return it in a parameter, you have to pass a pointer to the parameter. Since this is a pointer to a some_struct, you have to pass a pointer to a pointer:
void allocate (some_struct ** ret)
{
*ret = malloc(sizeof(some_struct));
/* initialization of **ret */
return;
}
to be called as
some_struct *s;
allocate(&s);
I highly doubt this is what your teacher had in mind, but you can cheat using a series of legal type conversions.
int allocate(some_struct *arg)
/* we're actually going to pass in a some_struct ** instead.
Our caller knows this, and allocate knows this. */
{
void *intermediate = arg; /* strip away type information */
some_struct **real_deal = intermediate; /* the real type */
*real_deal = malloc(sizeof **real_deal); /* store malloc's return in the
object pointed to by real_deal */
return *real_deal != 0; /* return something more useful than always 0 */
}
Then your caller does the same:
{
some_struct *s;
void *address_of_s = &s;
int success = allocate(address_of_s);
/* what malloc returned should now be what s points to */
/* check whether success is non-zero before trying to use it */
}
This relies on a rule in C that says any pointer to an object can be implicitly converted to a void pointer, and vice-versa, without loss.
Note that formally this is undefined, but it is all but sure to work. While any object pointer value is required to be able to convert to a void* and back without loss, there is nothing in the language that guarantees that a some_struct* can store a some_struct** without loss. But it has a very high likelihood of working just fine.
Your teacher gave you no option but to write formally illegal code. I don't see that you have any other option besides "cheating" like this.
int func(some_struct *arg) {
arg = malloc(sizeof(some_struct));
...
}
Here you just assign the result of malloc to the local arg variable. pointers are passed by value in C, a copy of the pointer gets passed to the function. You cannot change the pointer of the caller this way. Keep in mind the difference in a pointer and what it points to.
You have various options:
Return the pointer from the function:
some_struct *func(void) {
some_struct *arg = malloc(sizeof(some_struct));
...
return arg;
}
...
some_struct *a = func();
Allocate the structure in the caller:
int func(some_struct *arg) {
...
arg->something = foo;
}
...
some_struct a;
func(&a);
Or dynamically allocate it
some_struct *a = malloc(sizeof *a);
func(a);
Using a pointer to the callers pointer:
int func(some_struct **arg) {
*arg = malloc(sizeof **arg);
}
...
some_struct *a;
func(&a);
Use a global variable (ugly..)
some_struct *global;
int func(void) {
global = malloc(sizeof *global);
}
...
some_struct *a;
func();
a = global;
You can't do it this way. You can't declare a struct by value, and then change it by address.
some_struct *s = NULL;
printf("Before: %p\n", (void *)s);
allocate(&s);
printf("After: %p\n", (void *)s);
...
/* The allocation function */
int allocate(some_struct **arg) {
    *arg = malloc(sizeof(some_struct));
    printf("In function: %p\n", (void *)*arg);
    return 0;
}
You need to modify the pointed value for the struct. So you need another level of indirection, thus you have to send a pointer to the struct pointer.
Well, C uses pass-by-value, which means that functions get copies of their arguments, and any changes made to those copies don't affect the original in the caller.
/* The allocation function */
int allocate(some_struct *arg) {
arg = malloc(sizeof(some_struct));
printf("In function: %d\n", &arg);
return 0;
}
Here you pass in the address of your some_struct s. Then you discard that address, and replace it with whatever was returned by malloc. Then you return, and the return value of malloc is lost forever, and you've leaked memory. And your some_struct s has not been changed. It still has whatever random number it was initialized to, which you printed out.
If you may not change the signature of the allocate function, it can never be useful. It must either take the address of a pointer, so that it can modify the value of that pointer, or it must return a pointer that your caller can tuck away.
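The difference between the two signatures can be sketched like this. The struct definition and the function names are hypothetical; the by-value variant frees its allocation only so that the demonstration does not leak.

```c
#include <stdlib.h>

typedef struct { int id; } some_struct;   /* hypothetical definition */

/* Assigns only to its local copy of the pointer; the caller sees nothing. */
int allocate_by_value(some_struct *arg)
{
    arg = malloc(sizeof *arg);
    free(arg);          /* released here so this demo does not leak */
    return 0;
}

/* Writes through the pointer-to-pointer; returns 0 on success, -1 on failure. */
int allocate_by_address(some_struct **arg)
{
    *arg = malloc(sizeof **arg);
    if (*arg == NULL)
        return -1;
    (*arg)->id = 0;     /* initialize the new object */
    return 0;
}
```

After allocate_by_value(s) the caller's s is unchanged, while after a successful allocate_by_address(&s) it points to a freshly allocated, initialized struct that the caller must free.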
