Passing a dynamic array into functions in C

I'm trying to create a function which takes an array as an argument, adds values to it (increasing its size if necessary) and returns the count of items.
So far I have:
int main(int argc, char** argv) {
int mSize = 10;
ent a[mSize];
int n;
n = addValues(a,mSize);
for(i=0;i<n;i++) {
//Print values from a
}
}
int addValues(ent *a, int mSize) {
int size = mSize;
i = 0;
while(....) { //Loop to add items to array
if(i>=size-1) {
size = size*2;
a = realloc(a, (size)*sizeof(ent));
}
//Add to array
i++;
}
return i;
}
This works if mSize is large enough to hold all the potential elements of the array, but if it needs resizing, I get a Segmentation Fault.
I have also tried:
int main(int argc, char** argv) {
...
ent *a;
...
}
int addValues(ent *a, int mSize) {
...
a = calloc(1, sizeof(ent));
//usual loop
...
}
To no avail.
I assume this is because when I call realloc, the copy of 'a' is pointed elsewhere - how is it possible to modify this so that 'a' always points to the same location?
Am I going about this correctly? Are there better ways to deal with dynamic structures in C? Should I be implementing a linked list to deal with these?

The main problem here is that you're trying to use realloc with a stack-allocated array. You have:
ent a[mSize];
That's automatic allocation on the stack. If you wanted to use realloc() on this later, you would create the array on the heap using malloc(), like this:
ent *a = (ent*)malloc(mSize * sizeof(ent));
So that the malloc library (and thus realloc(), etc.) knows about your array. From the looks of this, you may be confusing C99 variable-length arrays with true dynamic arrays, so be sure you understand the difference there before trying to fix this.
Really, though, if you are writing dynamic arrays in C, you should try to use OOP-ish design to encapsulate information about your arrays and hide it from the user. You want to consolidate information (e.g. pointer and size) about your array into a struct and operations (e.g. allocation, adding elements, removing elements, freeing, etc.) into special functions that work with your struct. So you might have:
typedef struct dynarray {
elt *data;
int size;
} dynarray;
And you might define some functions to work with dynarrays:
// malloc a dynarray and its data and returns a pointer to the dynarray
dynarray *dynarray_create();
// add an element to dynarray and adjust its size if necessary
void dynarray_add_elt(dynarray *arr, elt value);
// return a particular element in the dynarray
elt dynarray_get_elt(dynarray *arr, int index);
// free the dynarray and its data.
void dynarray_free(dynarray *arr);
This way the user doesn't have to remember exactly how to allocate things or what size the array is currently. Hope that gets you started.
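As a rough sketch of how those four functions might be filled in: the code below assumes elt is int, adds a capacity field that is not in the struct above, and makes dynarray_add_elt return an int status instead of void so allocation failure can be reported. All of these are illustrative choices, not the only way to do it.

```c
#include <assert.h>
#include <stdlib.h>

typedef int elt;  /* assumption: any element type would do */

typedef struct dynarray {
    elt *data;
    int size;      /* number of elements in use */
    int capacity;  /* allocated slots (added for this sketch) */
} dynarray;

/* Allocates an empty dynarray; returns NULL if malloc fails. */
dynarray *dynarray_create(void)
{
    dynarray *arr = malloc(sizeof *arr);
    if (arr == NULL)
        return NULL;
    arr->data = NULL;
    arr->size = 0;
    arr->capacity = 0;
    return arr;
}

/* Appends value, doubling the buffer when it is full.
 * Returns 0 on success, -1 if memory could not be allocated. */
int dynarray_add_elt(dynarray *arr, elt value)
{
    if (arr->size == arr->capacity) {
        int new_cap = arr->capacity ? arr->capacity * 2 : 8;
        elt *tmp = realloc(arr->data, (size_t)new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;          /* arr->data is still valid */
        arr->data = tmp;
        arr->capacity = new_cap;
    }
    arr->data[arr->size++] = value;
    return 0;
}

/* Returns the element at index; index must be in range. */
elt dynarray_get_elt(const dynarray *arr, int index)
{
    assert(index >= 0 && index < arr->size);
    return arr->data[index];
}

/* Frees the data buffer and the dynarray itself. */
void dynarray_free(dynarray *arr)
{
    if (arr != NULL) {
        free(arr->data);
        free(arr);
    }
}
```

Because callers only ever hold the dynarray pointer, it does not matter to them when realloc moves the data buffer.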

Try reworking it so a pointer to a pointer to the array is passed in, i.e. ent **a. Then you will be able to update the caller on the new location of the array.
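A minimal sketch of that approach, assuming ent is just an int (the question never shows its definition). The name add_values and the capacity and count parameters are made up for the illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef int ent;  /* assumption: the real ent type is not shown */

/* Appends the values 0..count-1 to *a, growing the buffer as needed.
 * *a must be NULL or a pointer obtained from malloc/calloc/realloc.
 * Returns the number of items stored, or -1 if allocation fails
 * (in that case *a is still valid and must still be freed). */
int add_values(ent **a, int *capacity, int count)
{
    int n = 0;
    while (n < count) {
        if (n >= *capacity) {
            int new_cap = (*capacity > 0) ? *capacity * 2 : 4;
            ent *tmp = realloc(*a, (size_t)new_cap * sizeof **a);
            if (tmp == NULL)
                return -1;
            *a = tmp;            /* the caller's pointer is updated here */
            *capacity = new_cap;
        }
        (*a)[n] = n;
        n++;
    }
    return n;
}
```

The caller would start with ent *a = NULL; int cap = 0; and call add_values(&a, &cap, 100), then free(a) when done.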

this is a nice reason to use OOP. yes, you can do OOP on C, and it even looks nice if done correctly.
in this simple case you don't need inheritance nor polymorphism, just the encapsulation and methods concepts:
define a structure with a length and a data pointer. maybe an element size.
write getter/setter functions that operate on pointers to that struct.
the 'grow' function modifies the data pointer within the struct, but any struct pointer stays valid.

If you changed the variable declaration in main to be
ent *a = NULL;
the code would work more like you envisioned, because realloc would no longer be handed a stack-allocated array. Setting a to NULL works because realloc treats this as if the user called malloc(size). Keep in mind that with this change, the prototype of addValues needs to change to
int addValues(ent **a, int mSize)
and that the code needs to handle the case of realloc failing. For example
while(....) { //Loop to add items to array
ent *tmp = realloc(*a, size*sizeof(ent));
if (tmp) {
*a = tmp;
} else {
// allocation failed. either free *a or keep *a and
// return an error
}
//Add to array
i++;
}
Many implementations of realloc can sometimes extend a block in place, but you can only rely on the number of bytes you actually asked for, so the original code's
size = size * 2;
is still needed to keep the number of realloc calls small.

You are passing the array pointer by value. What this means is:
int main(int argc, char** argv) {
...
ent *a; // This...
...
}
int addValues(ent *a, int mSize) {
...
a = calloc(1, sizeof(ent)); // ...is not the same as this
//usual loop
...
}
so changing the value of a in the addValues function does not change the value of a in main. To change the value of a in main you need to pass a reference to it to addValues. At the moment, the value of a is being copied and passed to addValues. To pass a reference to a use:
int addValues (ent **a, int mSize)
and call it like:
int main(int argc, char** argv) {
...
ent *a; // This...
...
addValues (&a, mSize);
}
In the addValues, access the elements of a like this:
(*a)[element]
and reallocate the array like this:
(*a) = calloc (...);

Xahtep explains how your caller can deal with the fact that realloc() might move the array to a new location. As long as you do this, you should be fine.
realloc() might get expensive if you start working with large arrays. That's when it's time to start thinking of using other data structures -- a linked list, a binary tree, etc.

As stated you should pass pointer to pointer to update the pointer value.
But I would suggest a redesign that avoids this technique; in most cases it can and should be avoided. Without knowing exactly what you are trying to achieve it's hard to suggest an alternative design, but I'm 99% sure that it's doable another way. And as Javier said, think object oriented and you will always get better code.

Are you really required to use C? This would be a great application of C++'s "std::vector", which is precisely a dynamically-sized array (easily resizable with a single call you don't have to write and debug yourself).

Related

What is the correct way to temporarily cast void* for arithmetic?

I am a C novice but have been a programmer for some years, so I am trying to learn C by following along with Stanford's course from 2008 and doing Assignment 3 on Vectors in C.
It's just a generic array basically, so the data is held inside a struct as a void *. The compiler flag -Wpointer-arith is turned on so I can't do arithmetic (and I understand the reasons why).
The struct around the data must not know what type the data is, so that it is generic for the caller.
To simplify things I am trying out the following code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct {
void *data;
int aindex;
int elemSize;
} trial;
void init(trial *vector, int elemSize)
{
vector->aindex = 0;
vector->elemSize = elemSize;
vector->data = malloc(10 * elemSize);
}
void add(trial *vector, const void *elemAddr)
{
if (vector->aindex != 0)
vector->data = (char *)vector->data + vector->elemSize;
vector->aindex++;
memcpy(vector->data, elemAddr, sizeof(int));
}
int main()
{
trial vector;
init(&vector, sizeof(int));
for (int i = 0; i < 8; i++)
{add(&vector, &i);}
vector.data = (char *)vector.data - ( 5 * vector.elemSize);
printf("%d\n", *(int *)vector.data);
printf("%s\n", "done..");
free(vector.data);
return 0;
}
However I get an error at free with free(): invalid pointer. So I ran valgrind on it and received the following:
==21006== Address 0x51f0048 is 8 bytes inside a block of size 40 alloc'd
==21006== at 0x4C2CEDF: malloc (vg_replace_malloc.c:299)
==21006== by 0x1087AA: init (pointer_arithm.c:13)
==21006== by 0x108826: main (pointer_arithm.c:29)
At this point my guess is I am either not doing the char* correctly, or maybe using memcpy incorrectly
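This happens because you add eight elements to the vector, which advances the pointer seven times (the first add does not move it), and then "roll back" the pointer by only five steps before attempting a free. That leaves it two elements (8 bytes) past the start of the block, which matches the valgrind report. You can fix that by using vector->aindex to decide how far the pointer has to be rolled back.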
The root cause of the problem, however, is that you modify vector->data. You should avoid modifying it in the first place, relying on a temporary pointer inside of your add function instead:
void add(trial *vector, const void *elemAddr, size_t sz) {
char *base = vector->data;
memcpy(base + vector->aindex*sz, elemAddr, sz);
vector->aindex++;
}
Note the use of sz, you need to pass sizeof(int) to it.
Another problem in your code is when you print by casting vector.data to int*. This would probably work, but a better approach would be to write a similar read function to extract the data.
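One possible shape for such a pair of functions, as a sketch only: it reuses the trial struct from the question, takes the element size from vector->elemSize rather than a separate argument, and the name get is made up here. Capacity is still not tracked, so the caller has to make sure there is room.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    void *data;    /* always the pointer returned by malloc */
    int aindex;    /* number of elements stored so far */
    int elemSize;  /* size of one element in bytes */
} trial;

/* Copies one element into the next free slot without touching
 * vector->data itself. The caller must ensure there is room. */
void add(trial *vector, const void *elemAddr)
{
    char *base = vector->data;
    memcpy(base + (size_t)vector->aindex * vector->elemSize,
           elemAddr, (size_t)vector->elemSize);
    vector->aindex++;
}

/* Hypothetical read function: copies element number index into *out. */
void get(const trial *vector, int index, void *out)
{
    const char *base = vector->data;
    assert(index >= 0 && index < vector->aindex);
    memcpy(out, base + (size_t)index * vector->elemSize,
           (size_t)vector->elemSize);
}
```

Since vector->data never changes, free(vector.data) receives exactly the pointer that malloc returned.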
If you don't know the array's data type beforehand, you must assume a certain amount of memory when you first initialize it, for example, 32 bytes or 100 bytes. Then if you run out of memory, you can expand it using realloc, which copies your previous data over to the new block if the block has to move. C++ std::vector implementations typically grow by a constant factor of about 1.5 or 2.
Next up is your free. There's a big thing you must know here. What if the user stores objects that own memory of their own, for example a char* that they allocated previously? Simply freeing the data member of your vector won't be enough. You need to ask the caller for a function pointer that knows how to clean up one element, in case the data type requires special attention.
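A sketch of what that could look like. The names vec, free_fn, vec_dispose and free_string are invented for this example, and the callback is stored in the struct, which is only one of several places it could live.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Called once per element with the ADDRESS of the stored element. */
typedef void (*free_fn)(void *elemAddr);

typedef struct {
    void *data;
    int count;
    int elemSize;
    free_fn freeElem;  /* may be NULL when elements own nothing */
} vec;

/* Lets every element clean up after itself, then frees the storage. */
void vec_dispose(vec *v)
{
    if (v->freeElem != NULL) {
        char *base = v->data;
        for (int i = 0; i < v->count; i++)
            v->freeElem(base + (size_t)i * v->elemSize);
    }
    free(v->data);
    v->data = NULL;
    v->count = 0;
}

/* Example callback for a vector whose elements are char * strings:
 * elemAddr points at the stored char *, so dereference it once. */
void free_string(void *elemAddr)
{
    free(*(char **)elemAddr);
}
```

A vector of plain ints would pass NULL as the callback, while a vector of heap-allocated strings would pass free_string.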
Lastly, you are making a big mistake on these lines:
if (vector->aindex != 0)
vector->data = (char *)vector->data + vector->elemSize;
You are modifying your pointer address! Your initial address is lost here, so you can no longer pass it to free. You must never do this. Use a temporary char* that starts at your initial data address and manipulate that instead.
Your code is somewhat confusing, there's probably a mis-understanding or two hiding in there.
A few observations:
You can't change a pointer returned by malloc() and then pass the new value to free(). Every value passed to free() must be the exact same value returned by one of the allocation functions.
As you've guessed, the copying is best done by memcpy() and you have to cast to char * for the arithmetic.
The function to append a value could be:
void add(trial *vector, const void *element)
{
memcpy((char *) vector->data + vector->aindex * vector->elemSize, element, vector->elemSize);
++vector->aindex;
}
Of course this doesn't handle overflowing the vector, since the length is not stored (I didn't want to assume it was hard-coded at 10).
Changing the data value in vector for each object is very odd, and makes things more confusing. Just add the required offset when you need to access the element, that's super-cheap and very straight forward.

How to return a char** in C

I've been trying for a while now and I can not seem to get this working:
char** fetch (char *lat, char*lon){
char emps[10][50];
//char** array = emps;
int cnt = -1;
while (row = mysql_fetch_row(result))
{
char emp_det[3][20];
char temp_emp[50] = "";
for (int i = 0; i < 4; i++){
strcpy(emp_det[i], row[i]);
}
if ( (strncmp(emp_det[1], lat, 7) == 0) && (strncmp(emp_det[2], lon, 8) == 0) ) {
cnt++;
for (int i = 0; i < 4; i++){
strcat(temp_emp, emp_det[i]);
if(i < 3) {
strcat(temp_emp, " ");
}
}
strcpy(emps[cnt], temp_emp);
}
}
}
mysql_free_result(result);
mysql_close(connection);
return array;
Yes, I know array = emps is commented out, but without it commented, it tells me that the pointer types are incompatible. This, in case I forgot to mention, is in a char** type function and I want it to return emps[10][50] or the next best thing. How can I go about doing that? Thank you!
An array expression of type T [N][M] does not decay to T ** - it decays to type T (*)[M] (pointer to M-element array).
Secondly, you're trying to return the address of an array that's local to the function; once the function exits, the emps array no longer exists, and any pointer to it becomes invalid.
You'd probably be better off passing the target array as a parameter to the function and have the function write to it, rather than creating a new array within the function and returning it. You could dynamically allocate the array, but then you're doing a memory management dance, and the best way to avoid problems with memory management is to avoid doing memory management.
So your function definition would look like
void fetch( char *lat, char *lon, char emps[][50], size_t rows ) { ... }
and your function call would look like
char my_emps[10][50];
...
fetch( lat, lon, my_emps, 10 );
What you're attempting won't work, even if you attempt to cast, because you'll be returning the address of a local variable. When the function returns, that variable goes out of scope and the memory it was using is no longer valid. Attempting to dereference that address will result in undefined behavior.
What you need is to use dynamic memory allocation to create the data structure you want to return:
char **emps;
emps = malloc(10 * sizeof(char *));
for (int i=0; i<10; i++) {
emps[i] = malloc(50);
}
....
return emps;
The calling function will need to free the memory created by this function. It also needs to know how many allocations were done so it knows how many times to call free.
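A sketch of a matching allocate/free pair, so the caller does not have to repeat the loop by hand. The names alloc_rows and free_rows are invented for this example, and len is assumed to be at least 1.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates rows buffers of len bytes each (len >= 1) and returns the
 * array of pointers, or NULL if any allocation fails. */
char **alloc_rows(size_t rows, size_t len)
{
    char **emps = malloc(rows * sizeof *emps);
    if (emps == NULL)
        return NULL;
    for (size_t i = 0; i < rows; i++) {
        emps[i] = malloc(len);
        if (emps[i] == NULL) {
            while (i > 0)          /* undo the rows already allocated */
                free(emps[--i]);
            free(emps);
            return NULL;
        }
        emps[i][0] = '\0';         /* start each row as an empty string */
    }
    return emps;
}

/* Frees each row, then the array of pointers. The caller must pass
 * the same row count that was used for the allocation. */
void free_rows(char **emps, size_t rows)
{
    if (emps == NULL)
        return;
    for (size_t i = 0; i < rows; i++)
        free(emps[i]);
    free(emps);
}
```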
If you found a way to cast char emps[10][50]; into a char * or char **, you wouldn't be able to properly map the data (dimensions, etc.). Multi-dimensional char arrays are not char **. They're just contiguous memory with index calculation, which is a better fit for a char *, by the way.
But the biggest problem would be that emps would go out of scope, and the automatic memory would be reused for some other variable, destroying the data.
There's a way to do it, though, if your dimensions are really fixed:
You can create a function that takes a char[10][50] as an in/out parameter (you cannot return an array, not allowed by the compiler, you could return a struct containing an array, but that wouldn't be efficient)
Example:
void myfunc(char emp[10][50])
{
emp[4][5] = 'a'; // update emp in the function
}
int main()
{
char x[10][50];
myfunc(x);
// ...
}
The main program is responsible for the memory of x, which is passed as modifiable to the myfunc routine: it is safe and fast (no memory copy)
Good practice: define a type like this typedef char matrix10_50[10][50]; it makes declarations more logical.
The main drawback here is that dimensions are fixed. If you want to use myfunc for another dimension set, you have to copy/paste it or use macros to define both (like a poor man's template).
EDIT: a fine comment suggests that some compilers support variable array size.
So you could pass dimensions alongside your unconstrained array:
void myfunc(int rows, int cols, char emp[rows][cols])
Tested, works with gcc 4.9 (probably on earlier versions too) only on C code, not C++ and not in .cpp files containing plain C (but still beats cumbersome malloc/free calls)
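For illustration, a complete function using that signature might look like the sketch below. The name fill and the ch parameter are invented here. Variably modified parameters like this are a C99 feature that later standards made optional, so it assumes a compiler such as gcc or clang in C mode.

```c
#include <assert.h>

/* Fills every cell with ch. rows and cols must come before emp in the
 * parameter list so they are in scope for the array declarator. */
void fill(int rows, int cols, char emp[rows][cols], char ch)
{
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            emp[r][c] = ch;
}
```

The same function then works for char x[10][50] via fill(10, 50, x, 'a') and for char y[3][4] via fill(3, 4, y, 'b').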
In order to understand why you can't do that, you need to understand how matrices work in C.
A matrix, let's say your char emps[10][50], is a contiguous block of storage capable of storing 10*50=500 chars (imagine an array of 500 elements). When you access emps[i][j], it accesses the element at index 50*i + j in that "array" (pick up a piece of paper and a pen to understand why). The problem is that the 50 in that formula is the number of columns in the matrix, which is known at compile time from the data type itself. When you have a char** the compiler has no way of knowing how to access a random element in the matrix.
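The index formula can be checked with a few lines of code. In this sketch the helper offset_of is a made-up name; its parameter has type char (*)[50], which is what emps decays to when passed to a function.

```c
#include <assert.h>
#include <stddef.h>

/* Returns how many chars lie between the start of the matrix and
 * m[i][j]. The column count 50 is part of the parameter's type. */
ptrdiff_t offset_of(char (*m)[50], int i, int j)
{
    return &m[i][j] - (char *)m;
}
```

With char emps[10][50], offset_of(emps, 3, 7) gives 50*3 + 7 = 157, and sizeof emps is 500.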
A way of building the matrix such that it is a char** is to create an array of pointers to char and then allocate each of those pointers:
char **emps = malloc(10 * sizeof(char*)); // create an array of 10 pointers to char
for (int i = 0; i < 10; i++)
emps[i] = malloc(50 * sizeof(char)); // create 10 arrays of 50 chars each
The point is, you can't convert a matrix to a double pointer in a similar way you convert an array to a pointer.
Another problem: Returning a 2D matrix as 'char**' is only meaningful if the matrix is implemented using an array of pointers, each pointer pointing to an array of characters. As explained previously, a 2D matrix in C is just a flat array of characters. The most you can return is a pointer to the [0][0] entry, a 'char*'. There's a mismatch in the number of indirections.

How can I make a pool with pointers in C?

I'm making my own library, and just when I thought I understood pointer syntax, I got confused, searched the web and got even more confused.
Basically I want to make a pool, here is what I actually want to do:
The following points must be respected:
when I add an object to the pool, the pointers in the current array are copied to a new array of pointers that is one element larger (to hold the new object).
the new array is pointed to by "objects" in my foo structure.
the old array is freed.
when I call the cleanup function, all the objects in the pool are freed.
How should I define my structure ?
typedef struct {
int n;
(???)objects
} foo;
foo *the_pool;
here's the code to manage my pool :
void myc_pool_init ()
{
the_pool = (???)malloc(sizeof(???));
the_pool->n = 0;
the_pool->objects = NULL;
}
void myc_push_in_pool (void* object)
{
if (object != NULL) {
int i;
(???)new_pointers;
the_pool->n++;
new_pointers = (???)malloc(sizeof(???)*the_pool->n);
for (i = 0; i < the_pool->n - 1; ++i) {
new_pointers[i] = (the_pool->objects)[i]; // that doesn't work (as I'm not sure how to handle it)
}
new_array[i] = object;
free(the_pool->objects);
the_pool->objects = new_array; // that must be wrong
}
}
void myc_pool_cleanup ()
{
int i;
for (i = 0; i < the_pool->n; ++i)
free((the_pool->objects)[i]); // as in myc_push_in_pool, it doesn't work
free(the_pool->objects);
free(the_pool);
}
Note: the type of objects added to the pool are not known in advance, so i should handles all pointers as void
any feedback would be very welcomed.
A straight answer to your question would be: use void *. This type is very powerful as it allows you to put any kind of pointer in your pool. However, it's up to you to do the correct casts when retrieving a void * pointer from your pool.
Your struct would look like this
typedef struct {
int n;
void **objects;
} foo;
foo *the_pool;
As in, an array of pointers.
Your malloc:
new_pointers = (void **)malloc(sizeof(void *)*the_pool->n);
There is a performance issue here. You could simply allocate an array of a fixed size, and only reallocate if the number of elements exceeds a predefined load factor (= number used / max size)
Also, instead of allocating a new pointer each time you add something to your pool, you could just use realloc (http://www.cplusplus.com/reference/cstdlib/realloc/)
the_pool->objects = (void **)realloc(the_pool->objects, the_pool->n* sizeof(void*));
Realloc tries to increase the current allocated area, without the need to copy everything. Only if the function cannot increase the allocated area contiguously will it allocate a new area and copy everything.
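Put together, a realloc-based version of the pool might look like this sketch. It passes the pool explicitly instead of using the global the_pool, and the names pool_push and pool_cleanup and the int return value are choices made for this example. It assumes every pushed object came from malloc, since cleanup calls free on each one.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int n;
    void **objects;  /* array of n pointers to pooled objects */
} foo;

/* Takes ownership of object. Returns 0 on success, -1 on failure
 * (the pool is unchanged and the caller still owns object). */
int pool_push(foo *pool, void *object)
{
    if (object == NULL)
        return -1;
    void **tmp = realloc(pool->objects,
                         (size_t)(pool->n + 1) * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    tmp[pool->n] = object;
    pool->objects = tmp;
    pool->n++;
    return 0;
}

/* Frees every pooled object, then the pointer array itself. */
void pool_cleanup(foo *pool)
{
    for (int i = 0; i < pool->n; i++)
        free(pool->objects[i]);
    free(pool->objects);
    pool->objects = NULL;
    pool->n = 0;
}
```

Starting from foo pool = { 0, NULL }; works because realloc with a NULL pointer behaves like malloc.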
Firstly, you already answered your "What should the type of foo.objects be?" question: void *objects;, malloc already returns void *. Your struct needs to store the size_t item_size;, too. n should probably also be a size_t.
typedef struct {
size_t item_count;
size_t item_size;
void *objects;
} foo;
foo *the_pool;
You could use a home-grown loop, but I'd consider memcpy to be a more convenient way to copy your old items to your new space, and the new item to its new space.
Dereferencing a void * is a constraint violation, as is pointer arithmetic on a void *, so new_pointers will need to be a different type. You need a type that points to objects of the right size. You could use an array of the right number of unsigned char, like so:
// new_pointers is a pointer to array of the_pool->item_size unsigned chars.
unsigned char (*new_pointers)[the_pool->item_size] = malloc((the_pool->item_count + 1) * sizeof *new_pointers);
// copy the old items
memcpy(new_pointers, the_pool->objects, the_pool->item_count * sizeof *new_pointers);
// copy the new items
memcpy(new_pointers + the_pool->item_count, object, sizeof *new_pointers);
Remember, free() is only for pointers returned by malloc(), and there should be a one-to-one correspondence: Each malloc() should be free()d. Look how you malloc: new_pointers = malloc(sizeof(???)*the_pool->n); ... What makes you think you need a loop (in myc_pool_cleanup) to free each item, when you can free them all in one fell swoop?
You could use realloc, but you otherwise seem to be handling malloc/memcpy/free *in myc_push_in_pool* flawlessly. Lots of people tend to mess up when writing realloc code.

Dynamically allocate array of file pointers

is it possible to 'dynamically' allocate file pointers in C?
What I mean is this :
FILE **fptr;
fptr = (FILE **)calloc(n, sizeof(FILE*));
where n is an integer value.
I need an array of pointer values, but I don't know how many before I get a user-input, so I can't hard-code it in.
Any help would be wonderful!
You're trying to implement what's sometimes called a flexible array (or flex array), that is, an array that changes size dynamically over the life of the program. Such an entity doesn't exist in C's native type system, so you have to implement it yourself. In the following, I'll assume that T is the type of element in the array, since the idea doesn't have anything to do with any specific type of content. (In your case, T is FILE *.)
More or less, you want a struct that looks like this:
struct flexarray {
T *array;
int size;
};
and a family of functions to initialize and manipulate this structure. First, let's look at the basic accessors:
T fa_get(struct flexarray *fa, int i) { return fa->array[i]; }
void fa_set(struct flexarray *fa, int i, T p) { fa->array[i] = p; }
int fa_size(struct flexarray *fa) { return fa->size; }
Note that in the interests of brevity these functions don't do any error checking. In real life, you should add bounds-checking to fa_get and fa_set. These functions assume that the flexarray is already initialized, but don't show how to do that:
void fa_init(struct flexarray *fa) {
fa->array = NULL;
fa->size = 0;
}
Note that this starts out the flexarray as empty. It's common to make such an initializer create an array of a fixed minimum size, but starting at size zero makes sure you exercise your array growth code (shown below) and costs almost nothing in most practical circumstances.
And finally, how do you make a flexarray bigger? It's actually very simple:
void fa_grow(struct flexarray *fa) {
int newsize = (fa->size + 1) * 2;
T *newarray = malloc(newsize * sizeof(T));
if (!newarray) {
// handle error
return;
}
memcpy(newarray, fa->array, fa->size * sizeof(T));
free(fa->array);
fa->array = newarray;
fa->size = newsize;
}
Note that the new elements in the flexarray are uninitialized, so you should arrange to store something to each new index i before fetching from it.
Growing flexarrays by some constant multiplier each time is generally speaking a good idea. If instead you increase its size by a constant increment, you spend quadratic time copying elements of the array around.
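To see the difference, the following sketch counts element copies without allocating anything. It models appending n items one at a time and copying every existing item whenever the array is full. The function names are invented for the illustration.

```c
#include <assert.h>

/* Copies made when a full array grows by the rule used in fa_grow,
 * newsize = (size + 1) * 2. */
long copies_doubling(long n)
{
    long cap = 0, size = 0, copies = 0;
    while (size < n) {
        if (size == cap) {
            copies += size;        /* move existing items across */
            cap = (cap + 1) * 2;
        }
        size++;
    }
    return copies;
}

/* Copies made when a full array grows by a fixed number of slots. */
long copies_increment(long n, long step)
{
    long cap = 0, size = 0, copies = 0;
    while (size < n) {
        if (size == cap) {
            copies += size;
            cap += step;
        }
        size++;
    }
    return copies;
}
```

For 1000 appends the doubling rule copies 1004 elements in total, while growing by 10 slots at a time copies 49500.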
I haven't shown the code to shrink an array, but it's very similar to the growth code.
Anyway, it's just pointers, so you can allocate memory for them,
but don't forget to fclose() each file pointer and then free() the memory.
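A sketch of that open/close pairing. It uses tmpfile() only so the example needs no file names; real code would presumably call fopen with names from the user. The function names are invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Opens n temporary files. Returns NULL if anything fails, after
 * closing whatever had already been opened. */
FILE **open_temp_files(size_t n)
{
    FILE **fptr = calloc(n, sizeof *fptr);
    if (fptr == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        fptr[i] = tmpfile();
        if (fptr[i] == NULL) {
            while (i > 0)
                fclose(fptr[--i]);
            free(fptr);
            return NULL;
        }
    }
    return fptr;
}

/* Closes every stream first, then releases the array itself. */
void close_files(FILE **fptr, size_t n)
{
    if (fptr == NULL)
        return;
    for (size_t i = 0; i < n; i++)
        if (fptr[i] != NULL)
            fclose(fptr[i]);
    free(fptr);
}
```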

How to realloc an array inside a function with no lost data? (in C )

I have a dynamic array of structures, so I thought I could store the information about the array in the first structure.
So one attribute will represent the amount of memory allocated for the array and another one representing number of the structures actually stored in the array.
The trouble is, that when I put it inside a function that fills it with these structures and tries to allocate more memory if needed, the original array gets somehow distorted.
Can someone explain why is this and how to get past it?
Here is my code
#define INIT 3
typedef struct point{
int x;
int y;
int c;
int d;
}Point;
Point empty(){
Point p;
p.x=1;
p.y=10;
p.c=100;
p.d=1000; //if you put different values it will act differently - weird
return p;
}
void printArray(Point * r){
int i;
int total = r[0].y+1;
for(i=0;i<total;i++){
printf("%2d | P [%2d,%2d][%4d,%4d]\n",i,r[i].x,r[i].y,r[i].c,r[i].d);
}
}
void reallocFunction(Point * r){
r=(Point *) realloc(r,r[0].x*2*sizeof(Point));
r[0].x*=2;
}
void enter(Point* r,int c){
int i;
for(i=1;i<c;i++){
r[r[0].y+1]=empty();
r[0].y++;
if( (r[0].y+2) >= r[0].x ){ /*when the amount of Points is near
*the end of allocated memory.
reallocate the array*/
reallocFunction(r);
}
}
}
int main(int argc, char** argv) {
Point * r=(Point *) malloc ( sizeof ( Point ) * INIT );
r[0]=empty();
r[0].x=INIT; /*so here I store for how many "Points" is there memory
//in r[0].y theres how many Points there are.*/
enter(r,5);
printArray(r);
return (0);
}
Your code does not look clean to me for other reasons, but...
void reallocFunction(Point * r){
r=(Point *) realloc(r,r[0].x*2*sizeof(Point));
r[0].x*=2;
r[0].y++;
}
The problem here is that r in this function is the parameter, hence any modifications to it are lost when the function returns. You need some way to change the caller's version of r. I suggest:
Point * // Note new return type...
reallocFunction(Point * r){
r=(Point *) realloc(r,r[0].x*2*sizeof(Point));
r[0].x*=2;
r[0].y++;
return r; // Note: now we return r back to the caller..
}
Then later:
r = reallocFunction(r);
Now... Another thing to consider is that realloc can fail. A common pattern for realloc that accounts for this is:
Point *reallocFunction(Point * r){
void *new_buffer = realloc(r, r[0].x*2*sizeof(Point));
if (!new_buffer)
{
// realloc failed, pass the error up to the caller..
return NULL;
}
r = new_buffer;
r[0].x*=2;
r[0].y++;
return r;
}
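This ensures that r is not overwritten (and thus leaked) inside the function when the memory allocation fails. The caller then has to decide what happens when your function returns NULL, and should check the result before assigning it over its only copy of r.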
But, some other things I'd point out about this code (I don't mean to sound like I'm nitpicking about things and trying to tear them apart; this is meant as constructive design feedback):
The names of variables and members don't make it very clear what you're doing.
You've got a lot of magic constants. There's no explanation for what they mean or why they exist.
reallocFunction doesn't seem to really make sense. Perhaps the name and interface can be clearer. When do you need to realloc? Why do you double the X member? Why do you increment Y? Can the caller make these decisions instead? I would make that clearer.
Similarly it's not clear what enter() is supposed to be doing. Maybe the names could be clearer.
It's a good thing to do your allocations and manipulation of member variables in a consistent place, so it's easy to spot (and later, potentially change) how you're supposed to create, destroy and manipulate one of these objects. Here it seems in particular like main() has a lot of knowledge of your structure's internals. That seems bad.
Use of the multiplication operator in parameters to realloc in the way that you do is sometimes a red flag... It's a corner case, but the multiplication can overflow and you can end up shrinking the buffer instead of growing it. This would make you crash and in writing production code it would be important to avoid this for security reasons.
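One way to guard against that, as a sketch: check the multiplication before calling realloc. The helper name resize_checked and its interface are invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Resizes *buf to hold count elements of elem_size bytes each.
 * Returns 0 on success; on failure returns -1 and leaves *buf alone. */
int resize_checked(void **buf, size_t count, size_t elem_size)
{
    if (count == 0 || elem_size == 0)
        return -1;                   /* avoid a zero-size realloc */
    if (count > SIZE_MAX / elem_size)
        return -1;                   /* count * elem_size would wrap */
    void *tmp = realloc(*buf, count * elem_size);
    if (tmp == NULL)
        return -1;
    *buf = tmp;
    return 0;
}
```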
You also do not seem to initialize r[0].y. As far as I understood, you should have a r[0].y=0 somewhere.
Anyway, you using the first element of the array to do something different is definitely a bad idea. It makes your code horribly complex to understand. Just create a new structure, holding the array size, the capacity, and the pointer.
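A sketch of such a structure, with invented names (PointArray, point_array_push) and an initial capacity of 3 to mirror INIT from the question. Because functions receive a pointer to the PointArray, they can reallocate items without the caller's view going stale.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int x, y, c, d; } Point;

typedef struct {
    Point *items;     /* heap buffer, or NULL while empty */
    size_t count;     /* Points actually stored */
    size_t capacity;  /* Points the buffer can hold */
} PointArray;

/* Appends p, doubling the buffer when full.
 * Returns 0 on success, -1 if realloc fails (array left unchanged). */
int point_array_push(PointArray *pa, Point p)
{
    if (pa->count == pa->capacity) {
        size_t new_cap = pa->capacity ? pa->capacity * 2 : 3;
        Point *tmp = realloc(pa->items, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        pa->items = tmp;
        pa->capacity = new_cap;
    }
    pa->items[pa->count++] = p;
    return 0;
}
```

main would then declare PointArray pa = { NULL, 0, 0 }; push Points in a loop, and call free(pa.items) at the end.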

Resources