How should I malloc/realloc with a struct that includes an array? - c

I'm pretty new to C, so if my steps are wrong, please let me know. Let's say that I have something like the following:
struct graphNode{
int val;
graphNode* parent;
int succSize;
int succMaxSize;
graphNode* succ[1];
};
I will create a new node with:
graphNode *n;
n = malloc(sizeof(struct graphNode));
assert(n);
n->val = 1;
n->parent = NULL;
n->succSize = 0;
n->succMaxSize = 1;
Then, if I want to add a successor to the node
if (n->succSize == n->succMaxSize){
n->succ = realloc(n->succ, sizeof(graphNode*) * n->succMaxSize * 2);
n->succMaxSize *= 2;
}
n->succ[succSize] = n2; //n2 is of type graphNode*
succSize++;
Is this correct? Do I need to realloc for the struct as well or is realloc of the array enough? Do I need to malloc for the initial array? Should the initial array size be included in my malloc call for n?

The usual way to define a "stretchy" array member in C is to either leave the size out entirely (a C99 flexible array member) or specify a size of 0 (a GNU extension, common in older code), e.g.:
struct foo {
int stuff;
bar theBars[]; // or theBars[0]
};
With this definition, sizeof(struct foo) will include all the elements other than the array at the end, and you can allocate the right size by saying malloc(sizeof(struct foo) + numberOfBars * sizeof(bar)).
If you need to reallocate it to change the number of bar elements, then you'll use the same formula (but with a new numberOfBars).
To be clear, you can't just realloc part of a struct. You have to realloc the whole thing.
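As a rough sketch of how that could look for the node type in the question: the helper names nodeNew and nodeAddSucc are made up for illustration, the typedef is added so that graphNode is a valid type name, and the counters are size_t rather than int. The struct and its trailing array live in one block, so growing the array means reallocating the whole node.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct graphNode graphNode;

struct graphNode {
    int val;
    graphNode *parent;
    size_t succSize;     /* slots in use */
    size_t succMaxSize;  /* slots allocated */
    graphNode *succ[];   /* C99 flexible array member, must be last */
};

/* Allocate a node with room for `capacity` successors (hypothetical helper). */
graphNode *nodeNew(int val, size_t capacity)
{
    graphNode *n = malloc(sizeof *n + capacity * sizeof n->succ[0]);
    if (n == NULL)
        return NULL;
    n->val = val;
    n->parent = NULL;
    n->succSize = 0;
    n->succMaxSize = capacity;
    return n;
}

/* Append `child`, reallocating the whole block when it is full.
 * Returns the (possibly moved) node, or NULL on failure, in which
 * case the original node is untouched and still valid. */
graphNode *nodeAddSucc(graphNode *n, graphNode *child)
{
    assert(n != NULL);
    if (n->succSize == n->succMaxSize) {
        size_t newMax = n->succMaxSize ? n->succMaxSize * 2 : 1;
        graphNode *tmp = realloc(n, sizeof *tmp + newMax * sizeof tmp->succ[0]);
        if (tmp == NULL)
            return NULL;
        n = tmp;
        n->succMaxSize = newMax;
    }
    n->succ[n->succSize++] = child;
    return n;
}
```

Callers would write something like n = nodeAddSucc(n, n2) (after checking the result), because realloc may move the node. That is also the catch with this layout in a graph: any other pointer to the node, such as a successor's parent field, goes stale when the node moves.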

realloc(ptr,size) needs 2 parameters, not 1 as used in realloc(sizeof(graphNode*) * n->succMaxSize * 2)
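// Something like this, assuming succ is declared as graphNode **succ:
graphNode *n = malloc(sizeof *n); // check for NULL before use
n->succSize = 0;
n->succMaxSize = 0; // set to 0
n->succ = NULL; // Initialize to NULL, so the first realloc behaves like malloc
// Then, if OP wants to add a successor to the node
if (n->succSize >= n->succMaxSize){
    int newMax = n->succMaxSize ? n->succMaxSize * 2 : 1; // 0 * 2 would stay 0
    graphNode **tmp = realloc(n->succ, sizeof *tmp * newMax);
    if (tmp == NULL)
        abort(); // or other error handling; n->succ is still valid here
    n->succ = tmp;
    n->succMaxSize = newMax;
}
n->succ[n->succSize++] = n2;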
As with all memory allocations, check for NULL return. In realloc(), one should save the original value, so if the realloc() fails, the original pointer is not lost.
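To make the "save the original value" advice concrete, here is a minimal sketch of an add-successor function. It assumes succ is changed to a separately allocated graphNode **succ and adds a typedef so graphNode is a valid type name; the function name addSucc and its return convention are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct graphNode graphNode;

struct graphNode {
    int val;
    graphNode *parent;
    int succSize;
    int succMaxSize;
    graphNode **succ;    /* separately allocated array of pointers */
};

/* Returns 0 on success, -1 if memory could not be obtained. */
int addSucc(graphNode *n, graphNode *n2)
{
    assert(n != NULL);
    if (n->succSize >= n->succMaxSize) {
        /* Start at 1 when the capacity is still 0, otherwise double it. */
        int newMax = n->succMaxSize > 0 ? n->succMaxSize * 2 : 1;
        graphNode **tmp = realloc(n->succ, (size_t)newMax * sizeof *tmp);
        if (tmp == NULL)
            return -1;   /* n->succ still points at the old array */
        n->succ = tmp;
        n->succMaxSize = newMax;
    }
    n->succ[n->succSize++] = n2;
    return 0;
}
```

Because only the successor array is reallocated, the node itself never moves, so other nodes can keep pointing at it.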

Usually when you see a struct definition where the last field is an array of size 0 or 1, it means the author is going to do some subtle stuff with malloc when the struct is allocated.
For example
struct foo {
int x;
:
:
type a[0];
};
With a malloc like
struct foo *p = malloc(sizeof(*p) + (n * sizeof(type)));
What this does is it allocates a contiguous chunk of memory for the struct and the trailing array. In this case the array size is n. So references to the array in this case are:
p->a[i] // where i >= 0 and i < n
One reason for doing this is to save memory.
I'm sure there are better explanations for this on Stack Overflow; it's a very common C idiom.
It's generally not used when the array is dynamic. Rather, it is used when the array size is known at malloc() time. You can use it dynamically, of course, but you have to realloc the entire memory chunk, not just the struct or array by itself. To increase the size to 2n you would say
p = realloc(p, sizeof(*p) + (2 * n * sizeof(type)));
Now your array is twice as big as it was, and it's still one chunk of memory.

If you only want a single array, just make succ a single pointer and only use malloc/realloc etc. to allocate memory for the array.
graphNode* succ;
What you are doing is almost certain to break.

I too am new to C, but there are some things that I can see right off the bat. First of all, you can't reallocate arrays. In C89, their size is fixed at compile time. In C99 and C11, the size can be chosen at run time (variable-length arrays), but they still can't be reallocated (as far as I'm aware). So for this, you need to allocate a
graphNode *succ;
pointer, and malloc(nodes * sizeof(node)).
graphNode* succ[1];
This creates an array of size one, not an array with a maximum index of one. So, it is the same (almost) functionally as
graphNode* succ;
except that you can't change its size once you've made it.
I think what you want is to make a tree, with a dynamically re-allocable amount of branches. In this case, you only need to reallocate the size of the graphNode* pointer, and then access each element via index as you would an array.

Related

Simulating a List with array

Good morning!
I must handle a struct array (global variable) that simulates a list. In practice, every time I call a method, I have to increase the size of the array 1 and insert it into the new struct.
Since the array size is static, my idea is to use pointers like this:
The struct array is declared as a pointer to a second struct array.
Each time I call the increaseSize () method, the content of the old array is copied to a new n + 1 array.
The global array pointer is updated to point to a new array
In theory, the solution seems easy ... but I'm a noob of c. Where is that wrong?
struct task {
char title[50];
int execution;
int priority;
};
struct task tasks = *p;
int main() {
//he will call the increaseSize() somewhere...
}
void increaseSize(){
int dimension = (sizeof(*p) / sizeof(struct task));
struct task newTasks[dimension+1];
for(int i=0; i<dimension; i++){
newTasks[i] = *(p+i);
}
free(&p);
p = newTasks;
}
You mix up quite a lot here!
int dimension = (sizeof(*p) / sizeof(struct task));
p is a pointer and *p is a struct task, so sizeof(*p) will be equal to sizeof(struct task), and dimension will always be 1...
You cannot use sizeof in this situation. You will have to store the size (number of elements) in a separate variable.
struct task newTasks[dimension+1];
This will create a new array, yes – but with scope local to the current function (so normally, it is allocated on the stack). This means that the array will be cleaned up again as soon as you leave your function.
What you need is to create the array on the heap, using the malloc function (or calloc or realloc).
Additionally, I recommend not increasing the array by 1, but rather doubling its capacity. You then need to store the number of elements it actually contains, too.
Putting all together:
#include <stdlib.h>
struct task* p;
size_t count;
size_t capacity;
void initialize(void)
{
    count = 0;
    capacity = 16;
    p = malloc(capacity * sizeof(struct task));
    if(!p)
    {
        // malloc failed, appropriate error handling!
        capacity = 0;
    }
}
// returns 1 on success, 0 if the allocation failed
int increase(void)
{
    size_t c = capacity ? capacity * 2 : 16;
    // realloc is very convenient here:
    // if allocation is successful, it copies the old values
    // to the new location and frees the old memory, so there is
    // nothing to worry about except for allocation failure
    struct task* pp = realloc(p, c * sizeof(struct task));
    if(!pp)
    {
        // appropriate error handling; p is still valid here
        return 0;
    }
    p = pp;
    capacity = c;
    return 1;
}
Finally, as completion:
// returns 1 on success, 0 if the element could not be stored
int push_back(struct task t)
{
    if(count == capacity && !increase())
        return 0;
    p[count++] = t;
    return 1;
}
Removing elements is left to you – you'd have to copy the subsequent elements all to one position less and then decrease count.
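For completeness, removal could look roughly like this. It is only a sketch: the name remove_at and the choice to pass the array and count as parameters (instead of using the globals above) are assumptions made so the example stands on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct task {
    char title[50];
    int execution;
    int priority;
};

/* Remove the element at `index` from an array holding `*count` tasks,
 * shifting the later elements down by one position.
 * Returns 0 on success, -1 if `index` is out of range. */
int remove_at(struct task *tasks, size_t *count, size_t index)
{
    assert(tasks != NULL && count != NULL);
    if (index >= *count)
        return -1;
    /* memmove handles the overlapping source and destination ranges. */
    memmove(&tasks[index], &tasks[index + 1],
            (*count - index - 1) * sizeof tasks[0]);
    (*count)--;
    return 0;
}
```

The capacity is left alone here; shrinking the allocation after removals is optional and usually not worth doing on every call.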

free struct of unions in c

I have a dynamically allocated vector of a special struct, and I am trying to free it, but the software always crashes.
the structure is :
typedef struct {
Type_e type;
union {
char m_char;
int m_int;
// more types (none of them is a pointer)
} my_data;
} Data_t;
where Type is an enum that contain all possible data types.
I allocate and initialize the vector as follows
void vector(Data_t **vec, UInt32_t start_element, UInt32_t end_element, Type_e type)
{
UInt32_t i;
Data_t *vec_ptr;
*vec=(Data_t *)malloc((size_t) ((end_element-start_element+1) * sizeof(Data_t)));
vec_ptr = *vec;
if (!vec_ptr)
{
// Write error
}
for (i =start_element; i <= end_element + 1; i++)
{
vec_ptr->type = type;
switch (type)
{
case UINT32: vec_ptr->my_data.m_int = 0; break;
// more possible cases
default:
break;
}
(vec_ptr)++;
}
}
I call this function as follows
Data_t *lVector = NULL;
vector(&lVector,0,10,INT32)
but when I try to free the allocated memory as follows,
free (lVector+start_element-1);
I tried
free (lVector+start_element);
and
free (lVector);
where start_element = 0 (in this case)
But in all cases, it crashes. Am I doing anything wrong?
This is incorrect:
*vec = *vec + sizeof(Data_t);
It advances *vec by sizeof(Data_t)*sizeof(Data_t) bytes, because pointer arithmetic automatically multiplies integer offsets by sizeof(*p).
Replace with (*vec)++, and let the compiler do the math for you. Similarly, remove multiplication in all places where you manipulate pointers. The only place in your code where you need to multiply by sizeof is when you call malloc.
Note: your code is hard to read because you move *vec back and forth as you go through the loop. You would be better off declaring and using a plain temporary pointer for iterating the vector, and keeping *vec fixed to whatever has been allocated by malloc.
You must free exactly the pointer returned by malloc, and do so exactly once. You store the return value of malloc in *vec, so free(*vec) would be correct in the same function or free(lVector) in the calling function. However, you subsequently assign other values to *vec, so to be able to free it correctly you would need to somehow restore the original return value of malloc (a better choice would almost certainly be to use another variable instead).
You also seem to misunderstand pointer arithmetic. p += n already advances the address pointed to by sizeof(*p) * n. So you mustn't multiply the changes to *vec by sizeof(Data_t) (which is sizeof(**vec)).
this parameter is a pointer to a pointer to 'Data_t' (an out-parameter through which the function hands the allocation back to the caller)
Data_t **vec,
and this line:
*vec=(Data_t *)malloc((size_t) ((end_element-start_element+1) * sizeof(Data_t)));
allocates memory for an array of 'Data_t' and stores its address in the caller's pointer
in C, do not cast the returned value from malloc
the argument to malloc() is already a 'size_t', so casting to 'size_t' just clutters the code
This line:
for (i =start_element; i <= end_element + 1; i++)
iterates over the array from index 0 to index 11; however, the valid indexes run from 0 to 10, because C array indexes start at 0 and end at (number of elements - 1). Writing to index 11 goes past the end of the allocation, which is undefined behaviour and can corrupt the heap so that a later free() crashes
this line:
(*vec)->type = type;
only ever addresses the element that '*vec' currently points to, so the code has to keep moving '*vec' to reach the others
this line:
*vec = *vec + sizeof(Data_t);
steps too far (pointer arithmetic already scales by the element size) and it also loses the pointer to the malloc'd memory, so the original address can no longer be passed to free()
This line:
*vec = *vec - ((end_element-start_element+1) * sizeof(Data_t));
doesn't restore it correctly either, for the same scaling reason and because the prior 'for' statement iterates one too many times.
Strongly suggest indexing off '*vec' rather than changing its contents, i.e. (*vec)[i]
Where do you try to call free()?
If you call free(vec) inside vector(), you will be trying to free '&lVector', which is on the stack and can't be freed.
You can only free space you allocated with malloc(), so you can free *vec, but not vec.
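Pulling those points together, a corrected version might look like the sketch below. The UInt32_t typedef and the Type_e enumerators are stand-ins, since the question does not show them; using calloc for zero-initialisation and returning the element count are choices made for this example, not part of the original code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for the question's types, which are not shown in full. */
typedef uint32_t UInt32_t;
typedef enum { CHAR8, INT32, UINT32 } Type_e;

typedef struct {
    Type_e type;
    union {
        char m_char;
        int m_int;
    } my_data;
} Data_t;

/* Allocates end_element - start_element + 1 zeroed elements and tags
 * each with `type`. On failure *vec is NULL and 0 is returned;
 * otherwise the element count is returned. */
size_t vector(Data_t **vec, UInt32_t start_element, UInt32_t end_element, Type_e type)
{
    assert(vec != NULL && start_element <= end_element);
    size_t count = (size_t)(end_element - start_element) + 1;
    *vec = calloc(count, sizeof **vec);
    if (*vec == NULL)
        return 0;
    for (size_t i = 0; i < count; i++)
        (*vec)[i].type = type;   /* index the array; never move *vec itself */
    return count;
}
```

Since *vec is never modified after the allocation, the caller can simply call free(lVector) with the pointer it got back.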

How can I dynamically update the array within a struct?

So I have this struct
#define MAX 128
typedef struct this_struct {
Type items[MAX];
} *SVar;
Lets say we create something like this
SVar first = malloc(sizeof(struct this_struct));
Now when I push values into the array and it fills to the MAX which is 128, I need to dynamically create a new array but I don't know how since the array is inside.
Here are my current thoughts on how I want to do it:
Create a new SVar names "second" with second->items[MAX *2]
free(first)
How can I go about doing this?
The typical way to do that is to make your struct contain three values: a pointer to an array of items, a count of the currently allocated array size, and (in practice you will need it) a count of the array slots you're actually using.
So, with your struct, it would be something like this:
Type *items;
int item_length; /* Number allocated */
int item_count; /* Number in use */
you initially allocate a "batch" of entries, say 100:
first = malloc(sizeof(struct this_struct));
first->items = malloc(sizeof(Type) * 100);
first->item_length = 100;
first->item_count = 0;
Then you add items one at a time. Simplistically, it's this:
first->items[first->item_count] = new_item;
first->item_count += 1;
But really you need to make sure each time you're not going to overflow the currently-allocated space, so it's really like this:
if (first->item_count == first->item_length) {
first->item_length += 100;
first->items = realloc(first->items, sizeof(Type) * first->item_length);
}
first->items[first->item_count] = new_item;
first->item_count += 1;
You're basically just using slots one at a time as long as your currently allocated space is large enough. Once you've used all the space you've allocated, realloc will either extend the array in place if there is room after it, or it will allocate a new, larger block, move all the existing data there, and free the old block.
In practice, you should check the return value of the malloc and realloc calls. For realloc, that means assigning the result to a temporary pointer first, so that first->items is not lost if the call fails.
A usual trick is to do something like this:
typedef struct {
int max_items; /* enough memory allocated for so many items */
...
Whatever_type items[1]; /* must be the last member */
} Dyn_array;
...
int n = 127;
Dyn_array *p = malloc(sizeof(Dyn_array) + n*sizeof(p->items[0]));
p->max_items = n + 1;
...
n = 1023;
p = realloc(p, sizeof(Dyn_array) + n*sizeof(p->items[0]));
p->max_items = n + 1;
and so on. The code using the structure indexes past the declared bound of the items array, which is declared to store one item only. This usually works in practice, since C does not do any bounds checking, as long as the allocation always leaves enough space for max_items items. Strictly speaking, though, the standard does not guarantee it for an array declared with one element; since C99 the standard way to express this is a flexible array member (items[] with no size), in which case sizeof(Dyn_array) no longer counts any items.

malloc for struct and pointer in C

Suppose I want to define a structure representing length of the vector and its values as:
struct Vector{
double* x;
int n;
};
Now, suppose I want to define a vector y and allocate memory for it.
struct Vector *y = (struct Vector*)malloc(sizeof(struct Vector));
My search over the internet show that I should allocate the memory for x separately.
y->x = (double*)malloc(10*sizeof(double));
But, it seems that I am allocating the memory for y->x twice, one while allocating memory for y and the other while allocating memory for y->x, and it seems a waste of memory.
It is very much appreciated if let me know what compiler really do and what would be the right way to
initialize both y, and y->x.
No, you're not allocating memory for y->x twice.
Instead, you're allocating memory for the structure (which includes a pointer) plus something for that pointer to point to.
Think of it this way:
1 2
+-----+ +------+
y------>| x------>| *x |
| n | +------+
+-----+
You actually need the two allocations (1 and 2) to store everything you need.
Additionally, your type should be struct Vector *y since it's a pointer, and you should never cast the return value from malloc in C.
It can hide certain problems you don't want hidden, and C is perfectly capable of implicitly converting the void* return value to any other pointer.
And, of course, you probably want to encapsulate the creation of these vectors to make management of them easier, such as with having the following in a header file vector.h:
#include <stddef.h> // for size_t
struct Vector {
double *data; // Use readable names rather than x/n.
size_t size;
};
struct Vector *newVector(size_t sz);
void delVector(struct Vector *vector);
//void setVectorItem(struct Vector *vector, size_t idx, double val);
//double getVectorItem(struct Vector *vector, size_t idx);
Then, in vector.c, you have the actual functions for managing the vectors:
#include "vector.h"
#include <stdlib.h> // malloc, free
// Atomically allocate a two-layer object. Either both layers
// are allocated or neither is, simplifying memory checking.
struct Vector *newVector(size_t sz) {
// First, the vector layer.
struct Vector *vector = malloc(sizeof (struct Vector));
if (vector == NULL)
return NULL;
// Then data layer, freeing vector layer if fail.
vector->data = malloc(sz * sizeof (double));
if (vector->data == NULL) {
free(vector);
return NULL;
}
// Here, both layers worked. Set size and return.
vector->size = sz;
return vector;
}
void delVector(struct Vector *vector) {
// Can safely assume vector is NULL or fully built.
if (vector != NULL) {
free(vector->data);
free(vector);
}
}
By encapsulating the vector management like that, you ensure that vectors are either fully built or not built at all - there's no chance of them being half-built.
It also allows you to totally change the underlying data structures in future without affecting clients. For example:
if you wanted to make them sparse arrays to trade off space for speed.
if you wanted the data saved to persistent storage whenever changed.
if you wished to ensure all vector elements were initialised to zero.
if you wanted to separate the vector size from the vector capacity for efficiency(1).
You could also add more functionality such as safely setting or getting vector values (see commented code in the header), as the need arises.
For example, you could (as one option) silently ignore setting values outside the valid range and return zero if getting those values. Or you could raise an error of some description, or attempt to automatically expand the vector under the covers(1).
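A minimal sketch of the first option (silently ignore out-of-range writes, return zero for out-of-range reads), using the signatures from the commented-out prototypes in the header. The struct is repeated here only so the snippet compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

struct Vector {
    double *data;
    size_t size;
};

/* Stores val at idx; an out-of-range idx is silently ignored. */
void setVectorItem(struct Vector *vector, size_t idx, double val)
{
    assert(vector != NULL);
    if (idx < vector->size)
        vector->data[idx] = val;
}

/* Returns the value at idx, or 0.0 when idx is out of range. */
double getVectorItem(struct Vector *vector, size_t idx)
{
    assert(vector != NULL);
    if (idx >= vector->size)
        return 0.0;
    return vector->data[idx];
}
```

The trade-off is that a bad index goes unnoticed by the caller; returning a status code from the setter would be the obvious alternative if that matters.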
In terms of using the vectors, a simple example is something like the following (very basic) main.c
#include "vector.h"
#include <stdio.h>
int main(void) {
struct Vector *myvec = newVector(42);
if (myvec == NULL)
    return 1;
myvec->data[0] = 2.718281828459;
delVector(myvec);
}
(1) That potential for an expandable vector bears further explanation.
Many vector implementations separate capacity from size. The former is how many elements you can use before a re-allocation is needed, the latter is the actual vector size (always <= the capacity).
When expanding, you want to generally expand in such a way that you're not doing it a lot, since it can be an expensive operation. For example, you could add 5% more than was strictly necessary so that, in a loop continuously adding one element, it doesn't have to re-allocate for every single item.
The first time around, you allocate memory for Vector, which means the variables x,n.
However x doesn't yet point to anything useful.
So that is why second allocation is needed as well.
In principle you're doing it correct already. For what you want you do need two malloc()s.
Just some comments:
struct Vector y = (struct Vector*)malloc(sizeof(struct Vector));
y->x = (double*)malloc(10*sizeof(double));
should be
struct Vector *y = malloc(sizeof *y); /* Note the pointer */
y->x = calloc(10, sizeof *y->x);
In the first line, you allocate memory for a Vector object. malloc() returns a pointer to the allocated memory, so y must be a Vector pointer. In the second line you allocate memory for an array of 10 doubles.
In C you don't need the explicit casts, and writing sizeof *y instead of sizeof(struct Vector) is better for type safety, and besides, it saves on typing.
You can rearrange your struct and do a single malloc() like so:
struct Vector{
int n;
double x[];
};
struct Vector *y = malloc(sizeof *y + 10 * sizeof(double));
Few points
struct Vector y = (struct Vector*)malloc(sizeof(struct Vector)); is wrong
it should be struct Vector *y = (struct Vector*)malloc(sizeof(struct Vector)); since y holds pointer to struct Vector.
1st malloc() only allocates memory enough to hold Vector structure (which is pointer to double + int)
2nd malloc() actually allocate memory to hold 10 double.
When you allocate memory for struct Vector, you only allocate memory for the pointer x itself, i.e. for the place where its value (an address) will be stored. You do not allocate the block that y->x will refer to.
The first malloc allocates memory for the struct, including memory for x (a pointer to double). The second malloc allocates memory for the double values which x points to.
You could actually do this in a single malloc by allocating for the Vector and the array at the same time. Eg:
struct Vector *y = (struct Vector*)malloc(sizeof(struct Vector) + 10*sizeof(double));
y->x = (double*)((char*)y + sizeof(struct Vector));
y->n = 10;
This allocates Vector 'y', then makes y->x point to the extra allocated data immediate after the Vector struct (but in the same memory block).
If resizing the vector is required, you should do it with the two allocations as recommended. The internal y->x array would then be able to be resized while keeping the vector struct 'y' intact.
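When you malloc(sizeof(struct_name)) it allocates memory for the full size of the struct, so you don't need to malloc each member separately. A pointer member is the exception: the struct only holds the pointer itself, so whatever it points to still needs its own allocation.
Compile with the -fsanitize=address flag (GCC and Clang) to check how your program uses memory.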

Are "malloc(sizeof(struct a *))" and "malloc(sizeof(struct a))" the same?

This question is a continuation of Malloc call crashing, but works elsewhere
I tried the following program and found it working (i.e. not crashing, as was also mentioned in the linked question). I may just be lucky that it works, but I'm looking for a reasonable explanation from the SO experts on why this is working.
Here are some basic understanding on allocation of memory using malloc() w.r.t structures and pointers
malloc(sizeof(struct a) * n) allocates n number of type struct a elements. And, this memory location can be stored and accessed using a pointer-to-type-"struct a". Basically a struct a *.
malloc(sizeof(struct a *) * n) allocates n number of type struct a * elements. Each element can then point to elements of type struct a. Basically malloc(sizeof(struct a *) * n) allocates an array(n-elements)-of-pointers-to-type-"struct a". And, the allocated memory location can be stored and accessed using a pointer-to-(pointer-to-"struct a"). Basically a struct a **.
So when we create an array(n-elements)-of-pointers-to-type-"struct a", is it
valid to assign that to struct a * instead of struct a ** ?
valid to access/de-reference the allocated array(n-elements)-of-pointers-to-type-"struct a" using pointer-to-"struct a" ?
data * array = NULL;
if ((array = (data *)malloc(sizeof(data *) * n)) == NULL) {
printf("unable to allocate memory \n");
return -1;
}
The code snippet is as follows:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
typedef struct {
int value1;
int value2;
}data;
int n = 1000;
int i;
int val=0;
data * array = NULL;
if ((array = (data *)malloc(sizeof(data *) * n)) == NULL) {
printf("unable to allocate memory \n");
return -1;
}
printf("allocation successful\n");
for (i=0 ; i<n ; i++) {
array[i].value1 = val++;
array[i].value2 = val++;
}
for (i=0 ; i<n ; i++) {
printf("%3d %3d %3d\n", i, array[i].value1, array[i].value2);
}
free(array);
printf("freeing successful\n");
return 0;
}
EDIT:
OK say if I do the following by mistake
data * array = NULL;
if ((array = (data *)malloc(sizeof(data *) * n)) == NULL) {
Is there a way to catch (at compile time, using any GCC flags) this kind of unintended typo, which could work at times and might blow up at any time? I compiled this using -Wall and found no warnings!
There seems to be a fundamental misunderstanding.
malloc(sizeof(struct a) * n) allocates n number of type struct a elements.
No, that's just how one usually uses it after such a call. malloc(size) allocates a memory region of size bytes. What you do with that region is entirely up to you. The only thing that matters is that you don't overstep the limits of the allocated memory. Assuming 4 byte float and int and 8 byte double, after a successful malloc(100*sizeof(float));, you can use the first 120 of the 400 bytes as an array of 15 doubles, the next 120 as an array of 30 floats, then place an array of 20 chars right behind that and fill up the remaining 140 bytes with 35 ints if you wish. That's perfectly harmless defined behaviour.
malloc returns a void*, which can be implicitly converted to a pointer of any object type, so
some_type **array = malloc(100 * sizeof(data *)); // intentionally unrelated types
is perfectly fine, it might just not be the amount of memory you wanted. In this case it very likely is, because pointers tend to have the same size regardless of what they're pointing to.
More likely to give you the wrong amount of memory is
data *array = malloc(n * sizeof(data*));
as you had it. If you use the allocated piece of memory as an array of n elements of type data, there are three possibilities
sizeof(data) < sizeof(data*). Then your only problem is that you're wasting some space.
sizeof(data) == sizeof(data*). Everything's fine, no space wasted, as if you had no typo at all.
sizeof(data) > sizeof(data*). Then you'll access memory you shouldn't have accessed when touching later array elements, which is undefined behaviour. Depending on various things, that could consistently work as if your code was correct, immediately crash with a segfault or anything in between (technically it could behave in a manner that cannot meaningfully be placed between those two, but that would be unusual).
If you intentionally do that, knowing point 1. or 2. applies, it's bad practice, but not an error. If you do it unintentionally, it is an error regardless of which point applies, harmless but hard to find while 1. or 2. applies, harmful but normally easier to detect in case of 3.
In your examples, data was 4 and 8 bytes respectively (probably), which on a 64-bit system puts them into cases 1 and 2 with high probability, and on a 32-bit system into cases 2 and 3.
The recommended way to avoid such errors is to
type *pointer = malloc(num_elems * sizeof(*pointer));
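Applied to the data struct from the question, that idiom might look like the sketch below. The function name makeData is made up for the example; the point is only that sizeof *array always matches whatever array points to, so the sizeof(data *) typo cannot happen.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int value1;
    int value2;
} data;

/* Allocates and fills n elements; returns NULL if malloc fails. */
data *makeData(size_t n)
{
    assert(n > 0);
    /* sizeof *array is sizeof(data), taken from the pointer's own type. */
    data *array = malloc(n * sizeof *array);
    if (array == NULL)
        return NULL;
    int val = 0;
    for (size_t i = 0; i < n; i++) {
        array[i].value1 = val++;
        array[i].value2 = val++;
    }
    return array;
}
```

If the element type of array is changed later, the malloc line stays correct without being touched.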
No.
sizeof(struct a*) is the size of a pointer.
sizeof(struct a) is the size of the entire struct.
This array = (data *)malloc(sizeof(data *) * n) allocates space for n pointers to data (n * sizeof(data *) bytes). If that is what you want, array needs to be a data ** (pointer to pointer).
In your case you want array to point to n structures of sizeof(data) bytes each, not to other pointers, so the size passed to malloc should be sizeof(data) * n.
is it valid to assign that to struct a * instead of struct a ** ?
Well, technically speaking, it is valid to assign like that, but it is wrong (UB) to dereference such pointer. You don't want to do this.
valid to access/de-reference the allocated array(n-elements)-of-pointers-to-type-"struct a" using pointer-to-"struct a" ?
No, undefined behavior.
