Dynamic Array Allocation confusion - c

I am to read in several values from the user and store those in an array. Then I need to create an array which is big enough to store all those values. Using some functions I wrote I sort/lsearch/bsearch through the array for given values.
I already have my program written and everything, but for a static array implementation. I am sort of getting confused on where to actually use the dynamic array.
It makes sense to use it when the user starts entering values, since I can't assume how many values he enters, so the array needs to be big enough to hold it. It also makes sense (Sort of) to use it when I am creating a big enough array that can hold all the value (Acts as a copy of the first array).
I'm not asking for any code, everything is done but on a static approach. I am just trying to visualize where I would need to use darrays here. My thoughts are:
When the user first enters the values
When I copy arr1 into a new arr2 that needs to be big enough to hold all of arr1's values.
Am I right or wrong on this?

Start by using malloc or calloc to allocate an array of some known starting size, and keep track of the current capacity in a variable.
As you're reading values in, if your array isn't big enough, then use realloc to double the size of the array.
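A minimal sketch of that approach, assuming int values. The names IntVec, intvec_push and intvec_free are made up for illustration, and the starting capacity of 8 is arbitrary. It starts from an empty array and relies on realloc(NULL, n) behaving like malloc(n):

```c
#include <stdlib.h>

/* Hypothetical growable int array: storage, elements in use, capacity. */
typedef struct {
    int *data;
    size_t len;
    size_t cap;
} IntVec;

/* Appends value, doubling the capacity when the array is full.
   Returns 0 on success, -1 if the allocation fails. */
int intvec_push(IntVec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 8; /* 8 is an arbitrary start */
        int *tmp = realloc(v->data, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;      /* v->data is still valid and unchanged */
        v->data = tmp;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}

/* Releases the storage and resets the array to empty. */
void intvec_free(IntVec *v)
{
    free(v->data);
    v->data = NULL;
    v->len = v->cap = 0;
}
```

Initialize with IntVec v = {NULL, 0, 0}; and call intvec_push(&v, x) for every value read. Assigning realloc's result to a temporary first means the old block is not leaked if the call fails.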

The best solution is not to copy the entire array each time a user inputs a value. The demands on malloc and free will be heavy, and get worse with larger arrays.
You need to calculate the size of your array with the number of elements as the input:
int* array = newArray(10);
int* newArray(int size) {
return malloc(size * sizeof(int));
}
Keep in mind that an int* can be indexed just like an array, so you can still do array[3]. But if you centralize the storage of the number of used elements and the current size, you can allocate a few elements up front and only grow when the available elements are exhausted.
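struct DynamicIntArray {
    int used;      /* number of elements currently in use */
    int size;      /* number of elements allocated */
    int* storage;
};
void add(struct DynamicIntArray* array, int value) {
    if (array->used == array->size) {
        /* Storage is full: allocate 10 more elements and copy the old contents. */
        int newSize = array->size + 10;
        int* newStorage = malloc(newSize * sizeof(int));
        if (newStorage == NULL) {
            return; /* allocation failed; the array is left unchanged */
        }
        for (int i = 0; i < array->used; i++) {
            newStorage[i] = array->storage[i];
        }
        free(array->storage);
        array->storage = newStorage;
        array->size = newSize;
    }
    /* There is now room for at least one more element. */
    array->storage[array->used] = value;
    array->used++;
}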
With such an example, you should be able to write the newDynamicIntArray(...) function, the freeDynamicIntArray(struct DynamicIntArray* array) function, and any other methods you care about.
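One possible shape for those two functions, as a sketch only. The answer leaves the parameters of newDynamicIntArray open, so the initialSize parameter and the choice to heap-allocate the struct itself are assumptions:

```c
#include <stdlib.h>

struct DynamicIntArray {
    int used;      /* number of elements currently in use */
    int size;      /* number of elements allocated */
    int *storage;
};

/* Allocates the struct plus room for initialSize ints.
   Returns NULL if initialSize is not positive or an allocation fails. */
struct DynamicIntArray *newDynamicIntArray(int initialSize)
{
    if (initialSize <= 0)
        return NULL;

    struct DynamicIntArray *array = malloc(sizeof *array);
    if (array == NULL)
        return NULL;

    array->storage = malloc((size_t)initialSize * sizeof *array->storage);
    if (array->storage == NULL) {
        free(array);
        return NULL;
    }
    array->used = 0;
    array->size = initialSize;
    return array;
}

/* Releases the element storage and then the struct; NULL is accepted. */
void freeDynamicIntArray(struct DynamicIntArray *array)
{
    if (array != NULL) {
        free(array->storage);
        free(array);
    }
}
```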

I think you ask the wrong question.
The question is:
Is a dynamic array (a contiguous block of memory) the proper data structure to hold and process the data in your application?
There is only one especially useful application for arrays, and that is as an associative array, which means that the array index itself has a meaning and can be used to retrieve the contents you are searching for with an effort of O(1).
For example, a list of track runners could be stored in an array where the array index equals the track number. This is the perfect data structure if you want to display the names of the runners per track. It's a terrible data structure if you want to sort the names of all runners alphabetically.
But according to your application description, the array index has no meaning for you. This is an indication that an array is not the best choice.

If you are not sure how many entries will be inserted at runtime, I suggest you use a linked list data structure. It avoids allocating memory for elements you never use.

Related

Is there a way to dynamically increase an array size in C? [duplicate]

I am experimenting a little bit with gamestudio.
I am now making a shooter game.
I have an array with the pointers to the enemies. When an enemy is killed, I want to remove him from the list. And I also want to be able to create new enemies.
Gamestudio uses a scripting language named lite-C. It has the same syntax as C and on the website they say, that it can be compiled with any C compiler. It is pure C, no C++ or anything else.
I am new to C. I normally program in .NET languages and some scripting languages.
You can't. This is normally done with dynamic memory allocation.
// Like "ENEMY enemies[100]", but from the heap
ENEMY* enemies = malloc(100 * sizeof(ENEMY));
if (!enemies) { /* handle the allocation failure here */ }
// You can index pointers just like arrays.
enemies[0] = CreateEnemy();
// Make the array bigger
ENEMY* more_enemies = realloc(enemies, 200 * sizeof(ENEMY));
if (!more_enemies) { /* handle the failure here; enemies is still valid */ }
enemies = more_enemies;
// Clean up when you're done.
free(enemies);
Once an array in C has been created, it is set. You need a dynamic data structure like a Linked List or an ArrayList.
Arrays are static, so you won't be able to change an array's size. You'll need to create a linked list data structure. The list can grow and shrink on demand.
Take a look at realloc, which allows you to resize the memory pointed to by a given pointer, provided that memory came from malloc, calloc, or realloc. (In C, an array is not a pointer, although it converts to a pointer to its first element in most expressions.)
As NickTFried suggested, Linked List is one way to go.
Another one is to have a table big enough to hold the maximum number of items you'll ever have and manage that (which ones are valid or not, how many enemies currently in the list).
As far as resizing, you'd have to use a pointer instead of a table and you could reallocate, copy over and so on... definitely not something you want to do in a game.
If performance is an issue (and I am guessing it is), the table properly allocated is probably what I would use.
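A rough sketch of such a fixed-size table. The ENEMY struct here is only a stand-in for the real type, MAX_ENEMIES is an assumed upper bound, and the function names are invented. The table stores enemies by value; the same idea works with an array of pointers:

```c
#include <stddef.h>

#define MAX_ENEMIES 64              /* assumed upper bound */

typedef struct { int health; } ENEMY;   /* stand-in for the real type */

typedef struct {
    ENEMY items[MAX_ENEMIES];
    size_t count;                   /* slots 0 .. count-1 are valid */
} EnemyTable;

/* Adds an enemy. Returns its index, or -1 if the table is full. */
int enemy_add(EnemyTable *t, ENEMY e)
{
    if (t->count == MAX_ENEMIES)
        return -1;
    t->items[t->count] = e;
    return (int)t->count++;
}

/* Removes the enemy at idx by moving the last valid entry into the gap.
   Constant time, but the order of the entries is not preserved. */
void enemy_remove(EnemyTable *t, size_t idx)
{
    if (idx >= t->count)
        return;
    t->items[idx] = t->items[t->count - 1];
    t->count--;
}
```

Because removal moves the last entry, any index stored elsewhere for that entry becomes stale. If stable indices matter, a per-slot "valid" flag is the other option mentioned above.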
I wanted to reduce the array size, but this didn't work:
myArray = new int[10];//Or so.
So I've tried creating a new one, with the size based on a saved count.
int count = 0;
for (int i = 0; i < sizeof(array1)/sizeof(array1[0]); i++){
if (array1[i] > 0) count++;
}
int positives[count];
And then re-pass the elements in the first array to add them in the new one.
//Create the new array element reference.
int x = 0;
for (int i = 0; i < sizeof(array1)/sizeof(array1[0]); i++){
if (array1[i] > 0){ positives[x] = array1[i]; x++; }
}
And here we have a brand new array with the exact number of elements that we want (in my case).

How to import a large quantity of numerical data

I'm thinking what is the best technique for importing a large amount of data, whether integer or floating point type, from a file into an array to be processed later.
Considering that the number of data can vary (not all import files are of equal size), therefore in one file there can be 100 numbers, in another file 1 million numbers and they are in ASCII format, I thought that before sizing the array to hold the data i should know how much data will fill it.
I can't size the array upfront if I don't know how much data will go into that array. So I could read the data from the file and as they are read, use the realloc instruction to resize the array every time (in doing so, however, it seems to me to waste system resources since if the file consists of a million numbers, it is forced to resize the array 1 million times).
Or (but I think this would be fine if it were in binary format), understand the file size, know which separator there is between the numbers and then calculate, based on this, the size of the array.
Or again, if the file as I said is in ASCII format, first read the number of separators (for example, they can be spaces or commas), and based on this understand the quantity of elements and size the array accordingly.
I don't know which technique would be the best.
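For reference, the "count first, then size the array" idea could be sketched as below. It assumes the numbers are separated by whitespace only (commas would need extra handling) and that the stream can be rewound, so it would not work on a pipe. load_numbers is a made-up name:

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads every whitespace-separated number from fp into a newly allocated
   array. Pass 1 counts the values, pass 2 stores them. The caller frees
   the result. Returns NULL if nothing was read or the allocation fails. */
double *load_numbers(FILE *fp, size_t *out_count)
{
    double value;
    size_t count = 0;

    *out_count = 0;

    rewind(fp);
    while (fscanf(fp, "%lf", &value) == 1)
        count++;                      /* pass 1: count only */
    if (count == 0)
        return NULL;

    double *data = malloc(count * sizeof *data);
    if (data == NULL)
        return NULL;

    rewind(fp);
    size_t i = 0;
    while (i < count && fscanf(fp, "%lf", &value) == 1)
        data[i++] = value;            /* pass 2: store */

    *out_count = i;
    return data;
}
```

The cost is parsing the file twice. Whether that is cheaper than growing an array while reading depends on the file size and the storage it comes from.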
Here's an example of the realloc dynamic resizing approach [as Bodo mentioned] from some code I've had lying around. Note the ary_grow can be set to whatever you want.
// qwklib/ary.c -- quick dynamic array control
#include <string.h>
#include <stdlib.h>
typedef void (*aryinit_p)(void *);
typedef struct {
void *ary_base; // base address
int ary_siz; // size of elements
int ary_cnt; // current count
int ary_max; // maximum count
int ary_grow; // amount to grow
aryinit_p ary_init; // initialization
} ary_t;
typedef ary_t *ary_p;
// aryinit -- initialize the array
ary_p
aryinit(ary_p ary,int siz,int grow)
{
memset(ary,0,sizeof(ary_t));
ary->ary_siz = siz;
ary->ary_grow = grow;
return ary;
}
static inline void *
aryloc(ary_p ary,int idx)
{
char *ptr; // char * because arithmetic on void * is a GNU extension
ptr = ary->ary_base;
ptr += (ary->ary_siz * idx);
return ptr;
}
// arypush -- add to dynamic array
void *
arypush(ary_p ary)
{
aryinit_p init;
int cnt;
char *ptr; // char * because arithmetic on void * is a GNU extension
do {
// got enough space already
if (ary->ary_cnt < ary->ary_max)
break;
if (ary->ary_siz == 0)
ary->ary_siz = 1;
// get number of elements to grow by
if (ary->ary_grow == 0)
ary->ary_grow = 10;
// add to allocated space, keeping the old block if realloc fails
ptr = realloc(ary->ary_base,(ary->ary_max + ary->ary_grow) * ary->ary_siz);
if (ptr == NULL)
return NULL;
ary->ary_base = ptr;
ary->ary_max += ary->ary_grow;
// zero the new slots, which start at element index ary_cnt
ptr = aryloc(ary,ary->ary_cnt);
cnt = ary->ary_max - ary->ary_cnt;
memset(ptr,0,ary->ary_siz * cnt);
init = ary->ary_init;
if (init == NULL)
break;
for (; cnt > 0; --cnt, ptr += ary->ary_siz)
init(ptr);
} while (0);
} while (0);
// get pointer to first available slot
ptr = aryloc(ary,ary->ary_cnt);
// advance count for next time
ary->ary_cnt += 1;
return ptr;
}
// arytrim -- trim allocated array size to in-use size
void
arytrim(ary_p ary)
{
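void *ptr;
// nothing in use: avoid calling realloc with a size of 0
if (ary->ary_cnt == 0)
return;
ptr = realloc(ary->ary_base,ary->ary_cnt * ary->ary_siz);
// if shrinking fails, keep the existing allocation
if (ptr == NULL)
return;
ary->ary_base = ptr;
ary->ary_max = ary->ary_cnt;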
}
// aryclean -- free up storage
void
aryclean(ary_p ary)
{
free(ary->ary_base);
}
Note that, for completeness, you may wish to use size_t instead of int for some variables if your array indexes could overflow a 32-bit number, as well as adding proper error checking for realloc.
One thing you could do is not store the data in an array, but rather in a linked list storing one piece of data per list node. That way, you could add elements to the linked list at will, without ever having to resize anything. However, this has the following disadvantages:
Dynamic memory allocation is rather slow.
Linked lists aren't cached as well as arrays, which is bad for performance.
It is not very space efficient. For example, on a 64-bit system, pointers are normally 8 bytes long. So, if every node contains a 32-bit int as data, you will have 4 bytes of data per node and 8 bytes of overhead from the pointer (16 bytes if the linked list is doubly-linked). This means that more than half of the space is being wasted. In addition, the memory allocator itself likely has a few bytes of internal overhead for every memory allocation, so even more space is wasted.
For this reason, it would be more efficient to allocate an array of several kilobytes of memory at once using malloc and, if it later turns out that you need more memory, allocate another array of the same size (or maybe a larger size) using malloc. These individual arrays could be linked with each other using a linked list, so the number of new arrays you can allocate would only be limited by your available memory.
However, this efficient solution is also more complicated. Therefore, if the disadvantages mentioned above are acceptable to you, then a simple linked list storing one piece of data per list node would probably be the easiest and most flexible solution.
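To give an idea of the extra complexity, here is a sketch of the "linked list of arrays" variant for int data. CHUNK_CAP and all names are assumptions chosen for illustration, and every chunk has the same size:

```c
#include <stdlib.h>

#define CHUNK_CAP 1024              /* elements per chunk (assumed) */

typedef struct Chunk {
    struct Chunk *next;
    size_t used;                    /* elements filled in this chunk */
    int data[CHUNK_CAP];
} Chunk;

typedef struct {
    Chunk *head;
    Chunk *tail;
    size_t total;                   /* elements across all chunks */
} ChunkList;

/* Appends value, allocating a new chunk only when the last one is full.
   Returns 0 on success, -1 if the allocation fails. */
int chunklist_push(ChunkList *l, int value)
{
    if (l->tail == NULL || l->tail->used == CHUNK_CAP) {
        Chunk *c = malloc(sizeof *c);
        if (c == NULL)
            return -1;
        c->next = NULL;
        c->used = 0;
        if (l->tail != NULL)
            l->tail->next = c;
        else
            l->head = c;
        l->tail = c;
    }
    l->tail->data[l->tail->used++] = value;
    l->total++;
    return 0;
}

/* Frees every chunk and resets the list to empty. */
void chunklist_free(ChunkList *l)
{
    Chunk *c = l->head;
    while (c != NULL) {
        Chunk *next = c->next;
        free(c);
        c = next;
    }
    l->head = l->tail = NULL;
    l->total = 0;
}
```

Existing elements are never copied or moved, and the pointer overhead is paid once per chunk rather than once per element. The price is that the data is no longer one contiguous block.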
An alternative would be to allocate one single array and expand it as necessary using realloc in large steps of several kilobytes (instead of once for every new element). This would be significantly faster than calling realloc once for every new element. However, when compared to the linked list solution, it has the following two disadvantages:
If there is not enough room to expand the array, the entire array must be copied to a new location with more room. Even if this is handled internally by realloc (so you don't have to program it yourself), it can be bad for performance.
If the memory is too fragmented, the allocator may not be able to find any room anywhere for a large enough array to store all elements.
When deciding whether to use arrays or linked lists, it is also worth taking into consideration that certain operations are better suited for linked lists (such as insert operations), whereas other operations (such as random access) are better suited for arrays.

Split C Array on element

Say, I have an array T*array and a predicate p, I want to split the array into different sub-arrays T**subs on every element matching p.
So something like:
typedef bool (*P) (T element);
T**subs(T*array,P p){....}
How can the code for subs() look like?
Note, that the code is just pseudo code, you can use variables like array_length and so on in your example, because I just want to get the idea on how to implement subs().
Of more importance than "what would the code look like" is the question "what data structure do you want/need to use?"
For example, if you need to change the sub-arrays without changing the original values, you need to copy the array elements into new arrays. If you do not change the values of the sub-arrays, you can just return an array of pointers or indices into the original array. The collection of pointers could also be a list instead of an array.
Once you have decided on a data structure that matches your requirements, you can develop the algorithm. But if your algorithm turns out to be cumbersome or slow, you might need to adapt your data structure to allow faster processing.
So you see, your question needs a lot of "design" and decisions from you, based on your requirements.
Assuming there will be no gaps between the sub-arrays, you can return a pointer to a dynamically created array T ** result with N entries for M = N-2 detected sub-arrays.
This array needs to be NULL-terminated to indicate its size, that is, result[N-1] needs to be NULL.
Each of the first M elements of result points into the source array, indicating the start (the 1st element) of a sub-array.
result[N-2] points just beyond the last element of the source array.
The size of sub-array i (for i = 0 ... M-1) can then be derived by computing result[i+1]-result[i].
No copying and no additional array to indicate the sub-arrays' sizes is needed. Only the source array's size needs to be passed to subs().
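A sketch of this layout with T taken to be int. Two things are assumptions rather than part of the question: the extra array_length parameter, and the rule that every matching element begins a new sub-array and stays inside it. is_negative is just an example predicate:

```c
#include <stdbool.h>
#include <stdlib.h>

typedef int T;                      /* assumed element type */
typedef bool (*P)(T element);

/* Example predicate, for illustration only. */
static bool is_negative(T x) { return x < 0; }

/* Returns a malloc'ed, NULL-terminated array of boundary pointers:
   the start of each sub-array, then one pointer just past the end of
   the source array, then NULL. The caller frees the result. */
T **subs(T *array, size_t array_length, P p)
{
    /* Worst case: every element starts a sub-array, plus end and NULL. */
    T **result = malloc((array_length + 3) * sizeof *result);
    if (result == NULL)
        return NULL;

    size_t n = 0;
    result[n++] = array;                    /* first sub-array */
    for (size_t i = 1; i < array_length; i++) {
        if (p(array[i]))
            result[n++] = array + i;        /* a match starts a new one */
    }
    result[n++] = array + array_length;     /* one past the last element */
    result[n] = NULL;
    return result;
}
```

For the input {1, 2, -1, 3, -4, 5} with is_negative, the boundaries land at offsets 0, 2, 4 and 6, so the three sub-arrays each have length 2.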
We call that a callback function, not predicate.
typedef bool (*P) (T element);
T * * subs(T * array, P callback) {
T * * retval = malloc(sizeof(T*) * max_groups); // either count before, or realloc as needed
size_t group = 0;
retval[group] = array;
for (size_t i = 0; i < array_length; ++i) {
if (callback(array[i])) {
retval[++group]=array + i;
}
}
return retval;
}
This reuses the memory of the argument array and doesn't return any information about the lengths of the groups, but since you only wanted a general idea on how to solve this, I think this should be enough starting point for you to get exactly what you want.

How we can insert array elements when array size is already fixed in C?

Whenever I read about the differences between linked lists and arrays, I see on a lot of sites that inserting an element into an array is very costly because a lot of data has to be moved. But one thing I never understood is how we can create space for one more element while inserting, as the size of the array (or the number of elements in the array) is fixed at compile time. Can anyone please let me know how we can insert an element into a fixed-size array? And is there any concept called a dynamic array in C?
There is, indeed, the concept of a dynamic array. You just need a pointer and to reserve memory of the size you want with malloc. You need also to keep track of the number of elements you have.
int* my_array = malloc(10 * sizeof(int));
int n_used_elements = 0; // Need to keep track of the used elements and the size
int my_array_size = 10; // reserved size
However, when you exceed the number of elements in your array, you need to reserve the whole thing again and copy it again to the new reserved memory, which is also costly.
Usually, when using arrays for amounts of data that grow and shrink dynamically, one of the most typical approaches is the following: when you exceed the size of your array, you double the size (i.e. you do not just add one more element, but reserve room for extra elements in anticipation of further growth), copy the elements of the old, smaller array, and keep going. Whenever you exceed the size, you double it. On the other hand, to avoid wasting memory, if fewer than a certain number of elements are occupied, sometimes you halve the size of the array.
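The shrinking half of that policy might look like the sketch below. The Vec type, the vec_pop name, the one-quarter threshold and the minimum capacity of 8 are all assumptions. Shrinking only when a quarter or less is in use avoids repeated grow/shrink cycles right at the boundary:

```c
#include <stdlib.h>

typedef struct {
    int *data;
    size_t used;    /* elements in use */
    size_t size;    /* elements allocated */
} Vec;

/* Removes the last element and stores it in *out.
   Halves the capacity when at most a quarter of it is still in use.
   Returns 0 on success, -1 if the array is empty. */
int vec_pop(Vec *v, int *out)
{
    if (v->used == 0)
        return -1;
    *out = v->data[--v->used];

    if (v->size >= 8 && v->used <= v->size / 4) {
        size_t new_size = v->size / 2;
        int *tmp = realloc(v->data, new_size * sizeof *tmp);
        if (tmp != NULL) {          /* on failure, keep the larger block */
            v->data = tmp;
            v->size = new_size;
        }
    }
    return 0;
}
```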
Inserting a new element into an array is very costly because you have to shift all the elements after the insertion index one position to the right. The bigger the array, the bigger the cost (in the worst case it is proportional to the size of the array). And you always need to consider the possibility of exceeding the size of the array.
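The shifting step can be done with memmove, which handles overlapping source and destination. This sketch assumes the buffer already has spare capacity; insert_at and its parameters are illustrative names:

```c
#include <string.h>

/* Inserts value at index idx in arr, which currently holds *count
   elements and has room for cap elements in total.
   Returns 0 on success, -1 if the array is full or idx is out of range. */
int insert_at(int *arr, size_t *count, size_t cap, size_t idx, int value)
{
    if (*count >= cap || idx > *count)
        return -1;

    /* Shift elements idx .. count-1 one position to the right. */
    memmove(&arr[idx + 1], &arr[idx], (*count - idx) * sizeof arr[0]);

    arr[idx] = value;
    (*count)++;
    return 0;
}
```

Inserting at idx == *count appends without moving anything; inserting at index 0 moves every element.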
In C, there is no "native" concept of a dynamic array. You can create fixed length arrays via declaration:
int myArray[10];
Or dynamically via malloc/calloc:
int* myArray = malloc(10 * sizeof(int)); // or: calloc(10, sizeof(int))
The reason that "inserting" into a fixed array is so costly, is because you need to:
Create a new, bigger array.
Copy the old data into the new array.
Insert the new element into the appropriate spot in the new array.
Your options are to create your own storage mechanism (e.g. a stack, queue, or linked list), or to use an existing implementation of one.
If you have an array like int a[10]; (and you use all 10 elements) it is not possible to resize it to fit another element.
For a dynamic size you have to use a pointer int* a;, allocate memory yourself with a = malloc(10*sizeof(int)); and take care of moving elements around when you insert in the middle.
There's no built-in dynamic array in C. If you need a dynamic array, you can't escape pointers.
typedef struct {
int *array;
size_t used;
size_t size;
} Array;
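void insertArray(Array *a, int element) {
if (a->used == a->size) {
// double the size when exceeding the size of the array (start at 1 if it was 0)
size_t newSize = a->size ? a->size * 2 : 1;
int *tmp = realloc(a->array, newSize * sizeof(int));
if (tmp == NULL)
return; // allocation failed; the array is left unchanged
a->array = tmp;
a->size = newSize;
}
a->array[a->used++] = element;
}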
Check out this post for more details and examples.

Dynamic Array printing

I am trying to print a dynamic array, but I am having trouble with the bounds for the array.
For a simple example, lets say I'm trying to loop through an array of ints. How can I get the size of the array? I was trying to divide the size of the array by the size of the type like this sizeof(list)/sizeof(int) but that was not working correctly. I understand that I was trying to divide the size of the pointer by the type.
int *list
// Populate list
int i;
for(i = 0; i < ????; i++)
printf("%d", list[i]);
With dynamic arrays you need to maintain a pointer to the beginning address of the array and a value that holds the number of elements in that array. There may be other ways, but this is the easiest way I can think of.
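One way to keep the two together is a small struct, sketched here with invented names (IntList, intlist_init, intlist_print):

```c
#include <stdio.h>
#include <stdlib.h>

/* The pointer and the element count always travel together. */
typedef struct {
    int *items;
    size_t count;
} IntList;

/* Allocates count zero-initialized ints. Returns 0 on success, -1 on failure. */
int intlist_init(IntList *l, size_t count)
{
    l->items = calloc(count, sizeof *l->items);
    l->count = (l->items != NULL) ? count : 0;
    return (l->items != NULL) ? 0 : -1;
}

/* Prints one element per line, using the stored count as the loop bound.
   Returns the number of elements printed. */
size_t intlist_print(const IntList *l)
{
    size_t i;
    for (i = 0; i < l->count; i++)
        printf("%d\n", l->items[i]);
    return i;
}
```

The loop bound comes from l->count, which was recorded at allocation time, rather than from sizeof.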
sizeof(list) will return 4 here because the compiler is calculating the size of an integer pointer, not the size of your array. A pointer is typically 4 bytes on 32-bit platforms and 8 bytes on 64-bit platforms.
YOU should know the size of the array, as you are who allocated it.
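sizeof is an operator that is evaluated at compile time (except when it is applied to a variable length array). Applied to a pointer, it gives you the size of the pointer itself, not the length of the array it points to.
So, sizeof(int*) is 4 or 8 bytes (32 or 64 bits), depending on the architecture.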
Take a look at std::vector.
There is no standardized method to get the size of allocated memory block. You should keep size of list in unsigned listSize like this:
int *list;
unsigned listSize;
list = malloc(x * sizeof(int));
listSize = x;
If you are coding in C++, then it is better to use STL container like std::vector<>
As you wrote, you really did try to divide the size of a pointer, since list is declared as a pointer and not as an array. In those cases, you should keep track of the size of the list while building it, or finish the list with a special sentinel value, say NULL for an array of pointers, or any other value that will not otherwise be used in the array.
Seeing some of the inappropriate links to C++ tools for a C question, here is an answer for modern C.
Your idea of what went wrong is quite correct: the way you did it, you only have a pointer and no size information about the allocation.
Modern C has variable length arrays (VLA) that you can either use directly or via malloc. Directly:
int list[n];
and then your idea with the sizeof works out of the box, even if you changed your n in the mean time. This use is to be taken with a bit of care, since this is allocated on the stack. You shouldn't reserve too much, here. For a use with malloc:
int (*list)[n] = malloc(sizeof *list);
Then you'd have to adapt your code a bit basically putting a (*list) everywhere you had just list.
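A small sketch of the malloc variant, assuming a compiler that supports VLAs (they are optional since C11). vla_demo is a made-up function that recovers the element count with sizeof and returns it:

```c
#include <stdlib.h>

/* Allocates an array of n ints through a pointer to a VLA type, then
   recovers the element count with sizeof.
   Returns the recovered count, or 0 if n is 0 or the allocation fails. */
size_t vla_demo(size_t n)
{
    if (n == 0)
        return 0;                   /* a zero-length VLA type is not allowed */

    int (*list)[n] = malloc(sizeof *list);   /* sizeof *list is n * sizeof(int) */
    if (list == NULL)
        return 0;

    size_t count = sizeof *list / sizeof (*list)[0];
    for (size_t i = 0; i < count; i++)
        (*list)[i] = (int)i;        /* note the (*list) on every access */

    free(list);
    return count;
}
```

The size information lives in the pointer's type, not in the allocation, so it is only available where that type (and the value of n it was declared with) is in scope.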
If by size you mean the number of elements, then you could keep it in a counter that is incremented each time you push an element. Or, if you don't mind the lost cycles, you could write a function that takes a copy of the pointer to the first node and runs through the list, keeping a counter, until it finds a node with k.next == NULL. Or you could keep a list that has both a next and a prev pointer; that way you wouldn't care if you lost the beginning.

Resources