filling dynamically allocated struct inside function - c

I'm trying to make a function that accepts a pre-allocated memory pointer as input and fills an array of structs at that location with data. In this example, I expected the output to be:
W 100
L 200
However, the first line is correct, but the second line prints no character and a zero. What am I doing wrong?
typedef struct{
char word;
long number;
}record;
void makerec(record** data){
data[0]->word='W';
data[0]->number=100;
data[1]->word='L';
data[1]->number=200;
}
int main(){
record* data=(record*)malloc(sizeof(record)*1000);
makerec(&data);
printf("%c %ld\n",data[0].word,data[0].number);
printf("%c %ld\n",data[1].word,data[1].number);
free(data);
return 0;
}

You're not dealing with the right types. Simply change:
void makerec(record** data) {
to:
void makerec(record * data) {
and:
makerec(&data);
to:
makerec(data);
as well as changing data[0]->word='W'; and friends to data[0].word = 'W';
data is already a pointer, and you want to change the thing to which it points, so you can just pass it directly to makerec. You'd pass a pointer to data if you wanted makerec() to make it point to something different, but that's not what you're doing here, so just passing data itself is correct.
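Putting those changes together, a minimal sketch of the corrected function might look like this (it assumes the record type from the question and that the caller has allocated at least two records):

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    char word;
    long number;
} record;

/* data points at the first element of a caller-owned array
   that must hold at least two records. */
void makerec(record *data)
{
    assert(data != NULL);
    data[0].word = 'W';
    data[0].number = 100;
    data[1].word = 'L';
    data[1].number = 200;
}
```

Called as makerec(data) after the malloc in main, both elements are filled in the block that data points to.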
Incidental to your main issue, but:
record* data=(record*)malloc(sizeof(record)*1000);
should be:
record* data = malloc(1000 * sizeof *data);
if ( !data ) {
perror("memory allocation failed");
exit(EXIT_FAILURE);
}
Notes:
You don't need to (and, to my mind, shouldn't) cast the return value of malloc() and friends.
sizeof *data is better than sizeof(record), since it continues to work if the type of data changes, and more importantly, it removes the possibility of you applying the sizeof operator to an incorrect type, which is a common mistake.
Reversing the positions of 1000 and sizeof *data is merely cosmetic, to make the multiple '*'s easier to comprehend.
You should always check the return value of malloc() in case the allocation failed, and take appropriate action (such as exiting your program) if it did.

Related

What is the correct way to temporarily cast void* for arithmetic?

I am a C novice but have been a programmer for some years, so I am trying to learn C by following along with Stanford's course from 2008 and doing Assignment 3 on Vectors in C.
It's just a generic array basically, so the data is held inside a struct as a void *. The compiler flag -Wpointer-arith is turned on so I can't do arithmetic (and I understand the reasons why).
The struct around the data must not know what type the data is, so that it is generic for the caller.
To simplify things I am trying out the following code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct {
void *data;
int aindex;
int elemSize;
} trial;
void init(trial *vector, int elemSize)
{
vector->aindex = 0;
vector->elemSize = elemSize;
vector->data = malloc(10 * elemSize);
}
void add(trial *vector, const void *elemAddr)
{
if (vector->aindex != 0)
vector->data = (char *)vector->data + vector->elemSize;
vector->aindex++;
memcpy(vector->data, elemAddr, sizeof(int));
}
int main()
{
trial vector;
init(&vector, sizeof(int));
for (int i = 0; i < 8; i++)
{add(&vector, &i);}
vector.data = (char *)vector.data - ( 5 * vector.elemSize);
printf("%d\n", *(int *)vector.data);
printf("%s\n", "done..");
free(vector.data);
return 0;
}
However I get an error at free with free(): invalid pointer. So I ran valgrind on it and received the following:
==21006== Address 0x51f0048 is 8 bytes inside a block of size 40 alloc'd
==21006== at 0x4C2CEDF: malloc (vg_replace_malloc.c:299)
==21006== by 0x1087AA: init (pointer_arithm.c:13)
==21006== by 0x108826: main (pointer_arithm.c:29)
At this point my guess is I am either not doing the char* correctly, or maybe using memcpy incorrectly
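This happens because you add eight elements to the vector, which advances the pointer seven times (the first add does not move it), and then "roll back" the pointer by only five steps before attempting a free. That leaves it 2 * sizeof(int) = 8 bytes inside the block, which is exactly what valgrind reports. You can fix that by using vector->aindex - 1 to decide by how many steps the pointer is to be rolled back.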
The root cause of the problem, however, is that you modify vector->data. You should avoid modifying it in the first place, relying on a temporary pointer inside of your add function instead:
void add(trial *vector, const void *elemAddr, size_t sz) {
char *base = vector->data;
memcpy(base + vector->aindex*sz, elemAddr, sz);
vector->aindex++;
}
Note the use of sz, you need to pass sizeof(int) to it.
Another problem in your code is when you print by casting vector.data to int*. This would probably work, but a better approach would be to write a similar read function to extract the data.
If you don't know the array's data type beforehand, you must assume a certain amount of memory when you first initialize it, for example, 32 bytes or 100 bytes. Then if you run out of memory, you can grow the block with realloc, which copies your previous data over to the new block if the block has to move. Common C++ std::vector implementations grow by a factor of 1.5 or 2.
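As a rough sketch of that growth strategy, assuming a doubling factor and hypothetical names (gvec, gvec_init, gvec_push, gvec_at are not part of the Stanford assignment), the base pointer is never moved by hand and realloc is only ever given the pointer it handed out:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical generic vector; the names are illustrative only. */
typedef struct {
    void  *data;      /* always the pointer returned by malloc/realloc */
    size_t count;     /* elements in use */
    size_t capacity;  /* elements allocated */
    size_t elemSize;  /* bytes per element */
} gvec;

/* Returns 0 on success, -1 if the initial allocation fails. */
int gvec_init(gvec *v, size_t elemSize, size_t capacity)
{
    assert(v != NULL && elemSize > 0 && capacity > 0);
    v->data = malloc(capacity * elemSize);
    if (v->data == NULL)
        return -1;
    v->count = 0;
    v->capacity = capacity;
    v->elemSize = elemSize;
    return 0;
}

/* Appends a copy of *elem, doubling the capacity when the block is full. */
int gvec_push(gvec *v, const void *elem)
{
    if (v->count == v->capacity) {
        /* realloc preserves the old contents; on failure the old block stays valid */
        void *tmp = realloc(v->data, 2 * v->capacity * v->elemSize);
        if (tmp == NULL)
            return -1;
        v->data = tmp;
        v->capacity *= 2;
    }
    memcpy((char *)v->data + v->count * v->elemSize, elem, v->elemSize);
    v->count++;
    return 0;
}

/* Returns the address of element i without modifying v->data. */
void *gvec_at(const gvec *v, size_t i)
{
    assert(i < v->count);
    return (char *)v->data + i * v->elemSize;
}
```

Because v->data is only ever assigned from malloc or realloc, free(v->data) is always valid when the vector is no longer needed.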
Next up is your free. There's a big thing you must know here. What if the user stores objects that own dynamically allocated memory of their own, for example a char * that they allocated previously? Simply freeing the data member of your vector won't be enough. For element types that require special attention, you need to ask the caller for a function pointer that releases one element (for example, as an extra parameter).
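One possible shape for such a callback, as a sketch only (elem_free_fn, dispose_elems and free_string_elem are made-up names, and the callback is passed at disposal time here purely to keep the example short):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical callback type: releases whatever a single element owns. */
typedef void (*elem_free_fn)(void *elem);

/* Frees a block holding count elements of elemSize bytes each.
   If free_fn is not NULL, it is called on every element first. */
void dispose_elems(void *data, size_t count, size_t elemSize,
                   elem_free_fn free_fn)
{
    assert(data != NULL || count == 0);
    if (free_fn != NULL) {
        for (size_t i = 0; i < count; i++)
            free_fn((char *)data + i * elemSize);
    }
    free(data);
}

/* Example callback for a vector whose elements are char * strings:
   elem is the address of the stored pointer, not the string itself. */
void free_string_elem(void *elem)
{
    free(*(char **)elem);
}
```

A vector of plain ints would pass NULL as the callback, while a vector of malloc'd strings would pass free_string_elem.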
Lastly, you are making a big mistake in these lines:
if (vector->aindex != 0)
vector->data = (char *)vector->data + vector->elemSize;
You are modifying your pointer address!!! Your initial address is lost here! You must never do this. Use a temporary char* to hold your initial data address and manipulate it instead.
Your code is somewhat confusing, there's probably a mis-understanding or two hiding in there.
A few observations:
You can't change a pointer returned by malloc() and then pass the new value to free(). Every value passed to free() must be the exact same value returned by one of the allocation functions.
As you've guessed, the copying is best done by memcpy() and you have to cast to char * for the arithmetic.
The function to append a value could be:
void add(trial *vector, const void *element)
{
memcpy((char *) vector->data + vector->aindex * vector->elemSize, element, vector->elemSize);
++vector->aindex;
}
Of course this doesn't handle overflowing the vector, since the length is not stored (I didn't want to assume it was hard-coded at 10).
Changing the data value in vector for each object is very odd, and makes things more confusing. Just add the required offset when you need to access the element; that's super-cheap and very straightforward.

Pointers and assignment in a sub-function

I have a small program that creates a semver struct with some variables in it:
typedef struct {
unsigned major;
unsigned minor;
unsigned patch;
char * note;
char * tag;
} semver;
Then, I would like to create a function which creates a semver struct and returns it to the caller. Basically, a Factory.
That factory would call an initialize function to set the default values of the semver struct:
void init_semver(semver * s) {
s->major = 0;
s->minor = 0;
s->patch = 0;
s->note = "alpha";
generate_semver(s->tag, s);
}
And on top of that, I would like a function to generate a string of the complete semver tag.
void generate_semver(char * tag, semver * s) {
sprintf( tag, "v%d.%d.%d-%s",
s->major, s->minor, s->patch, s->note);
}
My problem appears to lie in this function. I have tried returning a string, but have heard that mallocing some space is bad unless you explicitly free it later ;) In order to avoid this problem, I decided to try to pass a string to the function to have it be changed within the function with no return value. I'm trying to loosely follow something like DI practices, even though I'd really like to separate the concerns of these functions and have the generate_semver function return a string that I can use like so:
char * generate_semver(semver * s) {
char * full_semver;
sprintf( full_semver, "v%d.%d.%d-%s",
s->major, s->minor, s->patch, s->note);
return full_semver; // I know this won't work because it is defined in the local stack and not outside.
}
semver->tag = generate_semver(semver);
How can I do this?
My problem appears to lie in this function. I have tried returning a string, but have heard that mallocing some space is bad unless you explicitly free it later.
Explicitly freeing dynamically allocated memory is required to avoid memory leaks. However, it is not necessarily a task that the end users need to perform directly: an API often provides a function to deal with this.
In your case, you should provide a deinit_semver function that does the clean up of memory that init_semver has allocated dynamically. These two functions behave in a way that is similar to constructor and destructor; init_semver is not a factory function, because it expects the semver struct to be allocated, rather than allocating it internally.
Here is one way of doing it:
void init_semver(semver * s, int major, int minor, int patch, const char * note) {
s->major = major;
s->minor = minor;
s->patch = patch;
size_t len = strlen(note);
s->note = malloc(len+1);
strcpy(s->note, note);
s->tag = malloc(40 + len); // room for "v", three ints, separators, the note and the terminator
sprintf(s->tag, "v%d.%d.%d-%s", major, minor, patch, note);
}
void deinit_semver(semver *s) {
free(s->note);
free(s->tag);
}
Note the changes above: rather than using fixed values for the components of struct semver, this code takes the values as parameters. In addition, the code copies the note into a dynamically allocated buffer, rather than pointing to it directly.
The deinit function does the clean-up by free-ing both fields that were allocated dynamically.
A char * on its own is just a pointer to memory. To accomplish what you want you will either need to use a fixed-size field instead, e.g. char[32], or dynamically allocate the memory as needed.
As it is, your generate_semver function is attempting to print to an unknown address. Let's look at one solution.
typedef struct {
unsigned major;
unsigned minor;
unsigned patch;
char note[32];
char tag[32];
} semver;
Now, in your init_semver function, the line that was previously s->note = "alpha"; will become a string copy, as arrays cannot be assigned to.
strncpy(s->note, "alpha", 31);
s->note[31] = '\0';
strncpy will copy a string from the second parameter to the first up to the number of bytes in the third parameter. The second line ensures that a trailing null terminator is in place.
Similarly, in the generate_semver function, it would directly work in the buffer:
void generate_semver(semver * s) {
snprintf( s->tag, sizeof s->tag, "v%u.%u.%u-%s", // %u because the fields are unsigned
s->major, s->minor, s->patch, s->note);
}
This will directly print to the array in the structure, with a maximum character limit. snprintf does append a trailing null terminator (unlike strncpy), so we don't need to worry about adding it ourselves.
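One thing the fixed-size approach hides is silent truncation. A minimal sketch of detecting it through snprintf's return value (format_tag is a hypothetical helper, not part of the question's code):

```c
#include <assert.h>
#include <stdio.h>

/* Writes "vMAJOR.MINOR.PATCH-NOTE" into buf.
   Returns 0 on success, or -1 if snprintf failed or the text did not fit
   (in the latter case buf holds a truncated but terminated string). */
int format_tag(char *buf, size_t size, unsigned major, unsigned minor,
               unsigned patch, const char *note)
{
    assert(buf != NULL && size > 0 && note != NULL);
    /* snprintf returns the length the full text would have had */
    int needed = snprintf(buf, size, "v%u.%u.%u-%s",
                          major, minor, patch, note);
    if (needed < 0 || (size_t)needed >= size)
        return -1;
    return 0;
}
```

The caller can then decide whether a truncated tag is acceptable or should be treated as an error.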
You mention having to free allocated memory, and then say: "In order to avoid this problem". Well, it's not so much a problem, but rather a necessity of the C language. It's common to have functions that allocate memory, and require the caller to free it again.
The idiomatic way is to have a pair of "create" and "destroy" functions. So I'd suggest doing it like this:
// Your factory function
semver* create_semver(void) {
semver* instance = malloc(sizeof(*instance));
if (instance == NULL)
return NULL; // allocation failed
init_semver(instance); // will also allocate instance->tag and ->note
return instance;
}
// Your destruction function
void free_semver(semver* s) {
if (s == NULL)
return;
free(s->tag);
free(s->note);
free(s);
}
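The signature the question originally wanted, a generate_semver that returns the string, can also be made to work as long as the caller (or a destroy function like the one above) frees the result. A sketch, assuming the semver struct from the question and using snprintf with a NULL buffer to measure the required length first:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    unsigned major;
    unsigned minor;
    unsigned patch;
    char *note;
    char *tag;
} semver;

/* Returns a newly allocated tag string, or NULL on failure.
   The caller owns the result and must free() it. */
char *generate_semver(const semver *s)
{
    assert(s != NULL && s->note != NULL);
    /* First pass: with a NULL buffer and size 0, snprintf only reports the length. */
    int len = snprintf(NULL, 0, "v%u.%u.%u-%s",
                       s->major, s->minor, s->patch, s->note);
    if (len < 0)
        return NULL;
    char *out = malloc((size_t)len + 1);
    if (out == NULL)
        return NULL;
    /* Second pass: write into a buffer of exactly the right size. */
    snprintf(out, (size_t)len + 1, "v%u.%u.%u-%s",
             s->major, s->minor, s->patch, s->note);
    return out;
}
```

With this version, s->tag = generate_semver(s); works as the question intended, and the tag is released with free(s->tag).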

Recursive struct and malloc()

I have a recursive struct which is:
typedef struct dict dict;
struct dict {
dict *children[M];
list *words[M];
};
Initialized this way:
dict *d = malloc(sizeof(dict));
bzero(d, sizeof(dict));
I would like to know what bzero() exactly does here, and how can I malloc() recursively for children.
Edit: This is how I would like to be able to malloc() the children and words:
void dict_insert(dict *d, char *signature, unsigned int current_letter, char *w) {
int occur;
occur = (int) signature[current_letter];
if (current_letter == LAST_LETTER) {
printf("word found : %s!\n",w);
list_print(d->words[occur]);
char *new;
new = malloc(strlen(w) + 1);
strcpy(new, w);
list_append(d->words[occur],new);
list_print(d->words[occur]);
}
else {
d = d->children[occur];
dict_insert(d,signature,current_letter+1,w);
}
}
bzero(3) initializes the memory to zero. It's equivalent to calling memset(3) with a second parameter of 0. In this case, it initializes all of the member variables to null pointers. bzero is considered deprecated, so you should replace uses of it with memset; alternatively, you can just call calloc(3) instead of malloc, which automatically zeroes out the returned memory for you upon success.
You should not use either of the two casts you have written. In C, a void* pointer can be implicitly converted to any other object pointer type, and any object pointer type can be implicitly converted to void*. malloc returns a void*, so you can just assign it to your dict *d variable without a cast. Similarly, the first parameter of bzero is a void*, so you can just pass it your d variable directly without a cast.
To understand recursion, you must first understand recursion. Make sure you have an appropriate base case if you want to avoid allocating memory infinitely.
In general, when you are unsure what the compiler is generating for you, it is a good idea to use a printf to report the size of the struct. In this case, the size of dict should be 2 * M * the size of a pointer. In this case, bzero will fill a dict with zeros. In other words, all M elements of the children and words arrays will be zero.
To initialize the structure, I recommend creating a function that takes a pointer to a dict and mallocs each child and then calls itself to initialize it:
void init_dict(dict* d)
{
int i;
for (i = 0; i < M; i++)
{
d->children[i] = malloc(sizeof(dict));
init_dict(d->children[i]);
/* initialize the words elements, too */
}
}
+1 to you if you can see why this code won't work as is. (Hint: it has an infinite recursion bug and needs a rule that tells it how deep the children tree needs to be so it can stop recursing.)
bzero just zeros the memory. bzero(addr, size) is essentially equivalent to memset(addr, 0, size). As to why you'd use it, from what I've seen around half the time it's used, it's just because somebody thought zeroing the memory seemed like a good idea, even though it didn't really accomplish anything. In this case, it looks like the effect would be to set some pointers to NULL (though it's not entirely portable for that purpose).
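If portability of the null pointers matters, one option is to assign NULL explicitly instead of zero-filling. A small sketch, where M = 10 and the opaque list type are assumptions (the question shows neither definition), and dict_new and dict_is_empty are hypothetical helpers:

```c
#include <assert.h>
#include <stdlib.h>

#define M 10                 /* assumed branching factor */
typedef struct list list;    /* opaque stand-in for the question's list type */
typedef struct dict dict;
struct dict {
    dict *children[M];
    list *words[M];
};

/* Allocates one node with every pointer explicitly set to NULL.
   Returns NULL if malloc fails. */
dict *dict_new(void)
{
    dict *d = malloc(sizeof *d);
    if (d == NULL)
        return NULL;
    for (size_t i = 0; i < M; i++) {
        d->children[i] = NULL;
        d->words[i] = NULL;
    }
    return d;
}

/* Returns 1 if the node has no children and no word lists, 0 otherwise. */
int dict_is_empty(const dict *d)
{
    assert(d != NULL);
    for (size_t i = 0; i < M; i++) {
        if (d->children[i] != NULL || d->words[i] != NULL)
            return 0;
    }
    return 1;
}
```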
To allocate recursively, you'd basically just keep track of a current depth, and allocate child nodes until you reached the desired depth. Something along these lines would do the job:
void alloc_tree(dict **root, size_t depth) {
int i;
if (depth == 0) {
(*root) = NULL;
return;
}
(*root) = malloc(sizeof(**root));
for (i=0; i<M; i++)
alloc_tree((*root)->children+i, depth-1);
}
I should add that I can't quite imagine doing recursive allocation like this though. In a typical case, you insert data, and allocate new nodes as needed to hold the data. The exact details of that will vary depending on whether (and if so how) you're keeping the tree balanced. For a multi-way tree like this, it's fairly common to use some B-tree variant, in which case the code I've given above won't normally apply at all -- with a B-tree, you fill a node, and when it's reached its limit, you split it in half and promote the middle item to the parent node. You allocate a new node when this reaches the top of the tree, and the root node is already full.
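To illustrate the "allocate nodes as needed" approach for a plain (unbalanced) multi-way tree, here is a rough sketch. It is not the question's dict_insert: the names are hypothetical, signatures are assumed to be strings of the digits '0' to '9', and a simple counter stands in for the word lists.

```c
#include <assert.h>
#include <stdlib.h>

#define M 10   /* assumed: one child per digit '0'..'9' */

typedef struct node node;
struct node {
    node    *children[M];
    unsigned word_count;   /* stand-in for the question's word lists */
};

/* Allocates an empty node, or returns NULL if malloc fails. */
node *node_new(void)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    for (size_t i = 0; i < M; i++)
        n->children[i] = NULL;
    n->word_count = 0;
    return n;
}

/* Follows the signature from root, creating missing children on the way,
   and counts one word at the final node. Returns 0 on success, -1 on
   allocation failure or if a non-digit character is found. */
int trie_insert(node *root, const char *signature)
{
    assert(root != NULL && signature != NULL);
    node *cur = root;
    for (const char *p = signature; *p != '\0'; p++) {
        if (*p < '0' || *p > '9')
            return -1;
        int slot = *p - '0';
        if (cur->children[slot] == NULL) {
            cur->children[slot] = node_new();   /* allocate only when needed */
            if (cur->children[slot] == NULL)
                return -1;
        }
        cur = cur->children[slot];
    }
    cur->word_count++;
    return 0;
}

/* Frees a node and everything below it. */
void trie_free(node *n)
{
    if (n == NULL)
        return;
    for (size_t i = 0; i < M; i++)
        trie_free(n->children[i]);
    free(n);
}
```

Only the paths that are actually used get allocated, so there is no need to pick a depth up front.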

How to get size of different kinds of types in the same function using C?

I'm writing a function which increases the size of a dynamic memory object created with malloc. The function should as arguments take a pointer to the memory block to be increased, the current size of the block and the amount the block is going to be increased.
Something like this:
int getMoreSpace(void **pnt, int size, int add) {
xxxxxx *tmp; /* a pointer to the same as pnt */
if (tmp = realloc(pnt, (size+add)*sizeof(xxxxxx))) { /* get size of what pnt points to */
*pnt=tmp;
return 1;
else return 0;
}
The problem is that I want the function to work no matter what pnt points to. How do I achieve that?
This type of function cannot possibly work, because pnt is local and the new pointer is lost as soon as the function returns. You could take an argument of type xxxxxx ** so that you could update the pointer, but then you're stuck with only supporting a single type.
The real problem is that you're writing an unnecessary and harmful wrapper for realloc. Simply use realloc directly as it was meant to be used. There is no way to make it simpler or more efficient by wrapping it; it's already as simple as possible.
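For reference, using realloc directly needs only a temporary pointer so that the old block is not lost on failure. A sketch for one concrete element type (grow_ints is a hypothetical name; because the type is fixed, sizeof **arr is known and no void ** is involved):

```c
#include <assert.h>
#include <stdlib.h>

/* Grows *arr from *len ints to *len + add ints.
   Returns 0 on success. On failure returns -1 and leaves
   *arr and *len untouched, so the old block is still valid. */
int grow_ints(int **arr, size_t *len, size_t add)
{
    assert(arr != NULL && len != NULL && add > 0);
    int *tmp = realloc(*arr, (*len + add) * sizeof **arr);
    if (tmp == NULL)
        return -1;
    *arr = tmp;
    *len += add;
    return 0;
}
```

The same three steps (realloc into a temporary, check, assign) can just as well be written inline at the call site for any pointer type.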
You pass in the size as an argument. You can use a convenience macro to make it look the same as your function:
#define getMoreSpace(P, SZ, ADD) getMoreSpaceFunc(&(P), sizeof(*(P)), (SZ), (ADD))
int getMoreSpaceFunc(void **pnt, size_t elem_size, int size, int add) {
*pnt = ...
}
Edit to show that your convenience macro would also need to add call-by-reference semantics.
Pass in the element size as a separate parameter:
int getMoreSpace(void **pnt, int size, int add, size_t eltSize)
{
void *tmp;
if ((tmp = realloc(*pnt, (size+add)*eltSize)) != NULL)
{
*pnt=tmp;
return 1;
}
else
return 0;
}
...
int *p = malloc(100 * sizeof *p);
...
if (!getMoreSpace(&p, 100, 20, sizeof *p))
{
// panic
}
It's the most straightforward solution, if not the most elegant. C just doesn't lend itself to dynamic type information.
Edit
Changed the type of pnt in response to Steve's comment.
As caf points out, this won't work, even with the "fix" per Steve. R.'s right; don't do this.

How can I allocate memory and return it (via a pointer-parameter) to the calling function?

I have some code in a couple of different functions that looks something like this:
void someFunction (int *data) {
data = (int *) malloc (sizeof (data));
}
void useData (int *data) {
printf ("%p", data);
}
int main () {
int *data = NULL;
someFunction (data);
useData (data);
return 0;
}
someFunction () and useData () are defined in separate modules (*.c files).
The problem is that, while malloc works fine, and the allocated memory is usable in someFunction, the same memory is not available once the function has returned.
An example run of the program can be seen here, with output showing the various memory addresses.
Can someone please explain to me what I am doing wrong here, and how I can get this code to work?
EDIT: So it seems like I need to use double pointers to do this - how would I go about doing the same thing when I actually need to use double pointers? So e.g. data is
int **data = NULL; //used for 2D array
Do I then need to use triple pointers in function calls?
You want to use a pointer-to-pointer:
void someFunction (int **data) {
*data = malloc (sizeof (int));
}
void useData (int *data) {
printf ("%p\n", (void *) data);
}
int main () {
int *data = NULL;
someFunction (&data);
useData (data);
return 0;
}
Why? Well, you want to change your pointer data in the main function. In C, if you want to change something that's passed in as a parameter (and have that change show up in the caller's version), you have to pass in a pointer to whatever you want to change. In this case, that "something you want to change" is a pointer -- so to be able to change that pointer, you have to use a pointer-to-pointer...
Note that on top of your main problem, there was another bug in the code: sizeof(data) gives you the number of bytes required to store the pointer (4 bytes on a 32-bit OS or 8 bytes on a 64-bit OS), whereas you really want the number of bytes required to store what the pointer points to (an int, i.e. 4 bytes on most OSes). Because typically sizeof(int *)>=sizeof(int), this probably wouldn't have caused a problem, but it's something to be aware of. I've corrected this in the code above.
Here are some useful questions on pointers-to-pointers:
How do pointer to pointers work in C?
Uses for multiple levels of pointer dereferences?
A common pitfall, especially if you moved from Java to C/C++.
Remember that when you pass a pointer, it is passed by value, i.e. the value of the pointer is copied. That's fine for making changes to the data the pointer points to, but any change to the pointer itself is purely local, since it is a copy!!
The trick is to pass the pointer by reference (that is, pass a pointer to it), since you want to change it, e.g. by assigning the result of malloc to it.
**pointer --> will scare a noobie C programmer ;)
You have to pass a pointer to the pointer if you want to modify the pointer.
i.e.:
void someFunction (int **data) {
*data = malloc (sizeof (int)*ARRAY_SIZE);
}
edit :
Added ARRAY_SIZE, at some point you have to know how many integers you want to allocate.
That is because pointer data is passed by value to someFunction.
int *data = NULL;
//data is passed by value here.
someFunction (data);
//the memory allocated inside someFunction is not available.
Passing a pointer to the pointer, or returning the allocated pointer, would solve the problem.
void someFunction (int **data) {
*data = (int *) malloc (sizeof (**data));
}
or:
int* someFunction (int *data) {
data = (int *) malloc (sizeof (*data));
return data;
}
someFunction() takes its parameter as int*. So when you call it from main(), a copy of the value you passed is created. Whatever you modify inside the function is this copy, and hence the changes will not be reflected outside. As others suggested, you can use int** to get the changes reflected in data. Another way of doing it is to return int* from someFunction().
Apart from using the double-pointer technique, if only one return value is needed, rewrite it as follows:
int *someFunction (void) {
return (int *) malloc (sizeof (int));
}
and use it:
int *data = someFunction ();
Here's the general pattern for allocating memory in a function and returning the pointer via parameter:
void myAllocator (T **p, size_t count)
{
*p = malloc(sizeof **p * count);
}
...
void foo(void)
{
T *p = NULL;
myAllocator(&p, 100);
...
}
Another method is to make the pointer the function's return value (my preferred method):
T *myAllocator (size_t count)
{
T *p = malloc(sizeof *p * count);
return p;
}
...
void foo(void)
{
T *p = myAllocator(100);
...
}
Some notes on memory management:
The best way to avoid problems with memory management is to avoid memory management; don't muck with dynamic memory unless you really need it.
Do not cast the result of malloc() unless you're using an implementation that predates the 1989 ANSI standard or you intend to compile the code as C++. If you forget to include stdlib.h or otherwise don't have a prototype for malloc() in scope, casting the return value will suppress a valuable compiler diagnostic.
Use the size of the object being allocated instead of the size of the data type (i.e., sizeof *p instead of sizeof (T)); this will save you some heartburn if the data type has to change (say from int to long or float to double). It also makes the code read a little better IMO.
Isolate memory management functions behind higher-level allocate and deallocate functions; these can handle not only allocation but also initialization and errors.
Here you are trying to modify the pointer itself, i.e. change it from "data == NULL" to "data == 0xabcd" (the address of some other memory you allocated). To modify data, you need to pass the address of data, i.e. &data.
void someFunction (int **data) {
*data = (int *) malloc (sizeof (int));
}
Replying to your additional question you edited in:
'*' denotes a pointer to something. So '**' would be a pointer to a pointer to something, '***' a pointer to a pointer to a pointer to something, etc.
The usual interpretation of 'int **data' (if data is not a function parameter) would be a pointer to a list of int pointers, each pointing to its own array of ints. You can index it as data[i][j], although the memory layout is not the same as that of a true 2D array such as 'int a [100][100]'.
So you'd need to first allocate the list of pointers and then your int arrays (I am using a direct call to malloc() for the sake of simplicity):
data = (int**) malloc(arrayCount * sizeof *data); //allocate a list of int pointers
for (int i = 0; i < arrayCount; i++) //assign a list of ints to each int pointer
data [i] = (int*) malloc(arrayElemCount * sizeof *data[i]);
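So yes, if the 2D table itself has to come back through a parameter, the function takes an int ***. A sketch with error handling (alloc_table and free_table are hypothetical names; calloc is used so the ints start at zero):

```c
#include <assert.h>
#include <stdlib.h>

/* Stores a rows x cols table of zeroed ints in *out.
   Returns 0 on success. On failure returns -1, frees any
   partial allocation and leaves *out set to NULL. */
int alloc_table(int ***out, size_t rows, size_t cols)
{
    assert(out != NULL && rows > 0 && cols > 0);
    *out = NULL;
    int **table = malloc(rows * sizeof *table);   /* list of row pointers */
    if (table == NULL)
        return -1;
    for (size_t r = 0; r < rows; r++) {
        table[r] = calloc(cols, sizeof *table[r]);   /* one row of ints */
        if (table[r] == NULL) {
            while (r > 0)
                free(table[--r]);   /* undo the rows allocated so far */
            free(table);
            return -1;
        }
    }
    *out = table;
    return 0;
}

/* Releases a table created by alloc_table. */
void free_table(int **table, size_t rows)
{
    if (table == NULL)
        return;
    for (size_t r = 0; r < rows; r++)
        free(table[r]);
    free(table);
}
```

In main this would be used as int **data = NULL; followed by alloc_table(&data, rows, cols). Returning the int ** from the function instead avoids the third level of indirection.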
Rather than using a double pointer, we can just allocate a new pointer and return it; there is no need to pass a double pointer, because the parameter is not used for anything else in the function.
Return void * so that it can be used for any type of allocation.
void *someFunction (size_t size) {
return malloc (size);
}
and use it as:
int *data = someFunction (sizeof(int));
For simplicity, let me call the single pointer parameter above p and the double pointer pp (pointing to p).
In a function, the object that p points to can be changed, and that change is visible outside the function. However, if p itself is changed, the change does not leave the function.
Unfortunately, assigning the result of malloc to p changes p itself. That is why the original code does not work.
The correction uses the pointer pp, which points to p. In the corrected function, p is changed (through *pp) but pp is not. Thus it works.
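The difference between the three cases can be seen in a small sketch (the function names are made up for illustration, and static variables are used instead of malloc so nothing needs freeing):

```c
#include <assert.h>
#include <stddef.h>

/* Changes the object p points to: the caller sees the new value. */
void set_pointee(int *p)
{
    assert(p != NULL);
    *p = 42;
}

/* Changes only the local copy of the pointer: the caller's pointer
   still points where it did before the call. */
void reseat_copy(int *p)
{
    static int other = 7;
    p = &other;   /* this assignment is lost when the function returns */
    (void)p;
}

/* Changes the caller's pointer itself, by writing through pp. */
void reseat_through(int **pp)
{
    static int other = 7;
    assert(pp != NULL);
    *pp = &other;
}
```

The original someFunction behaves like reseat_copy, and the corrected version behaves like reseat_through.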

Resources