How can we insert array elements when the array size is already fixed in C?

Whenever I read about the differences between linked lists and arrays, I see on a lot of sites that inserting an element into an array is very costly because a lot of data has to be moved. But one thing I never understood is how we can create space for one more element while inserting, as the size of the array (the number of elements in the array) is fixed at compile time. Can anyone please tell me how we can insert an element into a fixed-size array? And is there any concept called a dynamic array in C?

There is, indeed, the concept of a dynamic array. You just need a pointer and to reserve memory of the size you want with malloc. You need also to keep track of the number of elements you have.
int* my_array = malloc(10 * sizeof(int));
int n_used_elements = 0; // Need to keep track of the used elements and the size
int my_array_size = 10; // reserved size
However, when you exceed the reserved size of your array, you need to reserve a larger block and copy the existing elements into it, which is also costly.
Usually, when using arrays for amounts of data that grow and shrink dynamically, one of the most typical approaches goes as follows: when you exceed the size of your array, you double the size (i.e. you do not just add one more slot, but reserve extra elements in anticipation of growing again), copy the elements of the old, smaller array and keep going. Whenever you exceed the size, you double it. On the other hand, to avoid wasting memory, if fewer than a certain fraction of the elements are occupied, you sometimes halve the size of the array.
Inserting a new element into an array is costly because you have to shift all the elements after the insertion index one position to the right. The bigger the array, the bigger the cost (i.e. it is proportional to the number of elements after the insertion point, which in the worst case is the whole array). And you always need to consider the possibility of exceeding the size of the array.
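A minimal sketch of the shift-and-double idea described above, assuming int elements. The names IntVec and intvec_insert are made up for illustration and are not part of any library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable int array: data points to cap slots, len are in use. */
typedef struct {
    int *data;
    size_t len;
    size_t cap;
} IntVec;

/* Insert value at position index (0 <= index <= len).
   Returns 0 on success, -1 if memory could not be allocated. */
int intvec_insert(IntVec *v, size_t index, int value) {
    assert(v != NULL && index <= v->len);
    if (v->len == v->cap) {
        /* Full: double the capacity (start with 4 slots when empty). */
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return -1;              /* the old block is still valid */
        v->data = p;
        v->cap = new_cap;
    }
    /* Shift the tail one slot to the right; this is the O(n) part. */
    memmove(&v->data[index + 1], &v->data[index],
            (v->len - index) * sizeof v->data[0]);
    v->data[index] = value;
    v->len++;
    return 0;
}
```

Starting from IntVec v = {NULL, 0, 0}; a call such as intvec_insert(&v, 0, 10) allocates the first block, and the caller releases it with free(v.data) when done. Appending (index equal to len) moves nothing, so only insertions near the front pay the full shifting cost.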

In C, there is no "native" concept of a dynamic array. You can create fixed length arrays via declaration:
int myArray[10];
Or dynamically via malloc/calloc:
int* myArray = malloc(10 * sizeof(int));
The reason that "inserting" into a fixed array is so costly, is because you need to:
Create a new, bigger array.
Copy the old data into the new array.
Insert the new element into the appropriate spot in the new array.
Your options are to create your own storage mechanism (e.g. a stack, queue, or linked list), or to use an existing implementation of one.
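The three steps listed above (new array, copy, insert) could be sketched roughly as follows. The function name insert_copy is hypothetical, and the sketch assumes that old points to at least n valid ints.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated array of n + 1 ints holding the contents of old
   with value placed at position pos, or NULL if allocation fails.
   The caller owns the result and must free() it. */
int *insert_copy(const int *old, size_t n, size_t pos, int value) {
    assert(old != NULL && pos <= n);
    int *bigger = malloc((n + 1) * sizeof *bigger);   /* 1. new, bigger array */
    if (bigger == NULL)
        return NULL;
    memcpy(bigger, old, pos * sizeof *old);            /* 2. copy elements before pos */
    bigger[pos] = value;                                /* 3. insert the new element */
    memcpy(bigger + pos + 1, old + pos,
           (n - pos) * sizeof *old);                    /* 2. copy the remaining elements */
    return bigger;
}
```

Because every insertion copies all n elements, this is only reasonable for occasional insertions; for repeated growth, over-allocating with realloc avoids most of the copies.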

If you have an array like int a[10]; (and you use all 10 elements) it is not possible to resize it to fit another element.
For a dynamic size you have to use a pointer int* a;, allocate the memory yourself with a = malloc(10*sizeof(int)); and take care of moving elements around when you insert in the middle.

There's no built-in dynamic array in C. If you need a dynamic array, you can't escape pointers.
typedef struct {
int *array;
size_t used;
size_t size;
} Array;
void insertArray(Array *a, int element) {
if (a->used == a->size) {
a->size *= 2; // double the size when exceeding the size of the array
a->array = (int *)realloc(a->array, a->size * sizeof(int));
}
a->array[a->used++] = element;
}
Check out this post for more details and examples.

Related

How to import a large quantity of numerical data

I'm thinking what is the best technique for importing a large amount of data, whether integer or floating point type, from a file into an array to be processed later.
The amount of data can vary (not all import files are of equal size): one file may contain 100 numbers and another 1 million, all in ASCII format. So I thought that before sizing the array to hold the data, I should know how much data will fill it.
I can't size the array up front if I don't know how much data will go into it. So I could read the data from the file and, as the values are read, use the realloc function to resize the array every time. Doing so, however, seems to me to waste system resources: if the file consists of a million numbers, the array is resized 1 million times.
Or (but I think this would be fine if it were in binary format), understand the file size, know which separator there is between the numbers and then calculate, based on this, the size of the array.
Or again, if the file as I said is in ASCII format, first read the number of separators (for example, they can be spaces or commas), and based on this understand the quantity of elements and size the array accordingly.
I don't know which technique would be the best.
Here's an example of the realloc dynamic resizing approach [as Bodo mentioned] from some code I've had lying around. Note the ary_grow can be set to whatever you want.
// qwklib/ary.c -- quick dynamic array control
#include <string.h>
#include <stdlib.h>
typedef void (*aryinit_p)(void *);
typedef struct {
void *ary_base; // base address
int ary_siz; // size of elements
int ary_cnt; // current count
int ary_max; // maximum count
int ary_grow; // amount to grow
aryinit_p ary_init; // initialization
} ary_t;
typedef ary_t *ary_p;
// aryinit -- initialize the array
ary_p
aryinit(ary_p ary,int siz,int grow)
{
memset(ary,0,sizeof(ary_t));
ary->ary_siz = siz;
ary->ary_grow = grow;
return ary;
}
static inline void *
aryloc(ary_p ary,int idx)
{
void *ptr;
ptr = ary->ary_base;
ptr += (ary->ary_siz * idx);
return ptr;
}
// arypush -- add to dynamic array
void *
arypush(ary_p ary)
{
aryinit_p init;
int cnt;
void *ptr;
do {
// got enough space already
if (ary->ary_cnt < ary->ary_max)
break;
if (ary->ary_siz == 0)
ary->ary_siz = 1;
// get number of elements to grow by
if (ary->ary_grow == 0)
ary->ary_grow = 10;
// add to allocated space
ary->ary_max += ary->ary_grow;
ptr = realloc(ary->ary_base,ary->ary_max * ary->ary_siz);
ary->ary_base = ptr;
// point at the first newly added element (byte offset = count * element size)
ptr += ary->ary_cnt * ary->ary_siz;
cnt = ary->ary_max - ary->ary_cnt;
memset(ptr,0,ary->ary_siz * cnt);
init = ary->ary_init;
if (init == NULL)
break;
for (; cnt > 0; --cnt, ptr += ary->ary_siz)
init(ptr);
} while (0);
// get pointer to first available slot
ptr = aryloc(ary,ary->ary_cnt);
// advance count for next time
ary->ary_cnt += 1;
return ptr;
}
// arytrim -- trim allocated array size to in-use size
void
arytrim(ary_p ary)
{
void *ptr;
ary->ary_max = ary->ary_cnt;
ptr = realloc(ary->ary_base,ary->ary_max * ary->ary_siz);
ary->ary_base = ptr;
}
// aryclean -- free up storage
void
aryclean(ary_p ary)
{
free(ary->ary_base);
}
Note that, for completeness, you may wish to use size_t instead of int for some variables if your array indexes could overflow a 32-bit number, and to add proper error checking for realloc.
One thing you could do is not store the data in an array, but rather in a linked list storing one piece of data per list node. That way, you could add elements to the linked list at will, without ever having to resize anything. However, this has the following disadvantages:
Dynamic memory allocation is rather slow.
Linked lists aren't cached as well as arrays, which is bad for performance.
It is not very space efficient. For example, on a 64-bit system, pointers are normally 8 bytes long. So, if every node contains a 32-bit int as data, you will have 4 bytes of data per node and 8 bytes of overhead from the pointer (16 bytes if the linked list is doubly-linked). This means that more than half of the space is being wasted. In addition, the memory allocator itself likely has a few bytes of internal overhead for every memory allocation, so even more space is wasted.
For this reason, it would be more efficient to allocate an array of several kilobytes of memory at once using malloc and, if it later turns out that you need more memory, allocate another array of the same size (or maybe a larger size) using malloc. These individual arrays could be linked with each other using a linked list, so the number of new arrays you can allocate would only be limited by your available memory.
However, this efficient solution is also more complicated. Therefore, if the disadvantages mentioned above are acceptable to you, then a simple linked list storing one piece of data per list node would probably be the easiest and most flexible solution.
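As a rough illustration of the "linked list of arrays" idea, here is a sketch that stores ints in fixed-size chunks. The type and function names (Chunk, ChunkList, chunklist_push, chunklist_free) and the chunk capacity of 1024 are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdlib.h>

#define CHUNK_CAP 1024   /* ints per chunk: an assumed, tunable value */

typedef struct Chunk {
    struct Chunk *next;      /* next chunk in the list, or NULL */
    size_t used;             /* number of filled slots in values */
    int values[CHUNK_CAP];
} Chunk;

typedef struct {
    Chunk *head;             /* first chunk, for iteration */
    Chunk *tail;             /* last chunk, where new values go */
    size_t total;            /* total number of stored values */
} ChunkList;

/* Appends value, allocating a new chunk only when the last one is full.
   Returns 0 on success, -1 on allocation failure. */
int chunklist_push(ChunkList *list, int value) {
    assert(list != NULL);
    if (list->tail == NULL || list->tail->used == CHUNK_CAP) {
        Chunk *c = malloc(sizeof *c);
        if (c == NULL)
            return -1;
        c->next = NULL;
        c->used = 0;
        if (list->tail != NULL)
            list->tail->next = c;
        else
            list->head = c;
        list->tail = c;
    }
    list->tail->values[list->tail->used++] = value;
    list->total++;
    return 0;
}

/* Releases every chunk and resets the list to empty. */
void chunklist_free(ChunkList *list) {
    Chunk *c = list->head;
    while (c != NULL) {
        Chunk *next = c->next;
        free(c);
        c = next;
    }
    list->head = list->tail = NULL;
    list->total = 0;
}
```

Existing elements are never copied when the structure grows, and the per-node pointer overhead is spread over 1024 values instead of one. The price is that element i is no longer at a single base address plus an offset.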
An alternative would be to allocate one single array and expand it as necessary using realloc in large steps of several kilobytes (instead of once for every new element). This would be significantly faster than calling realloc once for every new element. However, when compared to the linked list solution, it has the following two disadvantages:
If there is not enough room to expand the array, the entire array must be copied to a new location with more room. Even if this is handled internally by realloc (so you don't have to program it yourself), it can be bad for performance.
If the memory is too fragmented, the allocator may not be able to find any room anywhere for a large enough array to store all elements.
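A sketch of the single-array approach with large growth steps, assuming the numbers are separated by whitespace (fscanf with "%lf" will stop at commas or other separators). The function name read_doubles and the starting capacity of 1024 are illustrative choices.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads whitespace-separated numbers from fp until EOF or the first token
   that is not a number. Returns a malloc'ed array (caller frees) and stores
   the element count in *count; returns NULL on allocation failure. */
double *read_doubles(FILE *fp, size_t *count) {
    assert(fp != NULL && count != NULL);
    size_t cap = 1024, len = 0;
    double *data = malloc(cap * sizeof *data);
    if (data == NULL)
        return NULL;
    double x;
    while (fscanf(fp, "%lf", &x) == 1) {
        if (len == cap) {
            /* Grow geometrically instead of once per element. */
            size_t new_cap = cap * 2;
            double *p = realloc(data, new_cap * sizeof *p);
            if (p == NULL) {
                free(data);
                return NULL;
            }
            data = p;
            cap = new_cap;
        }
        data[len++] = x;
    }
    *count = len;
    return data;
}
```

With doubling from 1024 elements, a file of one million numbers needs about ten calls to realloc rather than a million. If the unused tail matters, a final realloc down to len elements can trim it.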
When deciding whether to use arrays or linked lists, it is also worth taking into consideration that certain operations are better suited for linked lists (such as insert operations), whereas other operations (such as random access) are better suited for arrays.

Dynamic Array Allocation confusion

I am to read in several values from the user and store those in an array. Then I need to create an array which is big enough to store all those values. Using some functions I wrote I sort/lsearch/bsearch through the array for given values.
I already have my program written and everything, but for a static array implementation. I am sort of getting confused on where to actually use the dynamic array.
It makes sense to use it when the user starts entering values, since I can't assume how many values he enters, so the array needs to be big enough to hold it. It also makes sense (Sort of) to use it when I am creating a big enough array that can hold all the value (Acts as a copy of the first array).
I'm not asking for any code, everything is done but on a static approach. I am just trying to visualize where I would need to use darrays here. My thoughts are:
When the user first enters the values
When i copy arr1 into a new arr2 that needs to be big enough to hold all of arr1's values.
Am I right or wrong on this?
Start by using malloc or calloc to allocate an array of some known starting size, and keep track of the current capacity in a variable.
As you're reading values in, if your array isn't big enough, then use realloc to double the size of the array.
It is best not to copy the entire array each time the user inputs a value. The demands on malloc and free will be heavy, and get worse with larger arrays.
Allocate your array with a function that takes the number of elements as its input:
int* newArray(int size) {
    return malloc(size * sizeof(int));
}
int* array = newArray(10);
Keep in mind that an int* can be indexed like an array, so you can still do array[3]. But, if you centralize the storage of the number of used elements and the current size, you can allocate a few elements and only grow when the available elements are exhausted.
struct DynamicIntArray {
    int used;
    int size;
    int* storage;
};
void add(struct DynamicIntArray* array, int value) {
    if (array->used == array->size) {
        // storage is full: allocate a larger block and copy the old elements over
        int newSize = array->size + 10;
        int* newStorage = malloc(newSize * sizeof(int));
        int* oldStorage = array->storage;
        for (int i = 0; i < array->size; i++) {
            newStorage[i] = oldStorage[i];
        }
        array->storage = newStorage;
        array->size = newSize;
        free(oldStorage);
    }
    // there is now room for at least one more element
    array->storage[array->used] = value;
    array->used++;
}
With such an example, you should be able to write the newDynamicIntArray(...) function, the freeDynamicIntArray(struct DynamicIntArray* array) function, and any other functions you care about.
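One possible shape for those two functions is sketched below. The answer leaves the parameters of newDynamicIntArray open, so the single initialSize parameter is an assumption, and the struct is repeated only to keep the example self-contained.

```c
#include <assert.h>
#include <stdlib.h>

struct DynamicIntArray {
    int used;       /* number of elements currently stored */
    int size;       /* number of elements storage can hold */
    int *storage;
};

/* Allocates the struct and its initial storage (assumed signature).
   Returns NULL if either allocation fails. */
struct DynamicIntArray *newDynamicIntArray(int initialSize) {
    assert(initialSize > 0);
    struct DynamicIntArray *array = malloc(sizeof *array);
    if (array == NULL)
        return NULL;
    array->storage = malloc((size_t)initialSize * sizeof(int));
    if (array->storage == NULL) {
        free(array);
        return NULL;
    }
    array->used = 0;
    array->size = initialSize;
    return array;
}

/* Releases the storage and the struct; passing NULL is a no-op. */
void freeDynamicIntArray(struct DynamicIntArray *array) {
    if (array == NULL)
        return;
    free(array->storage);
    free(array);
}
```

Every successful newDynamicIntArray call should eventually be paired with exactly one freeDynamicIntArray call.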
I think you ask the wrong question.
The question is:
Is a dynamic array (a contiguous block of memory) the proper data structure to hold and process the data in your application?
There is only one especially useful application for arrays, and that is as an associative array, which means that the array index itself has a meaning and can be used to retrieve the contents you are searching for with an effort of O(1).
For example, a list of track runners could be stored in an array, where the array index equals the track number. This is the perfect data structure if you want to display the name of the runner on each track. It's a terrible data structure if you want to sort the names of all runners alphabetically.
But according to your application description, the array index has no meaning for you. This is an indication that an array is not the best choice.
If you are not sure how many entries will be inserted at runtime, I suggest you use a linked list data structure. It only allocates memory for the entries you actually insert.

How to implement a dynamic 2D array in C with a known number of columns

I am trying to create a 2D array at compile time that has an unknown number of rows, which I can dynamically allocate throughout the program, but a fixed number of columns: 8.
Something like ---->Elements[?][8];
If you have to use a 2D array instead of a list of arrays, you have to start with an array of i rows, with i = 1:
foo[i][8]
and every time you want to expand that array:
make temp_foo[i][8]
copy foo to temp_foo
delete foo
make foo[i + 1][8] and increment i
copy temp_foo to foo
But that gets confusing, and I think it is better to use a linked list:
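struct node
{
    int foo[8];
    struct node *next;
};
Adding the first element:
struct node *element_head = malloc(sizeof *element_head);
/* fill element_head->foo[0..7] with the values of the row */
element_head->next = NULL;
Adding a new element:
struct node *temp = malloc(sizeof *temp);
/* fill temp->foo[0..7] with the values of the row */
temp->next = element_head;
element_head = temp;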
Knowing the number of columns, and making only the number of rows dynamic you can either use a VLA or dynamic allocation. A VLA is straight forward:
int rows;
// get rows somehow
int table[rows][8];
Keep in mind that a VLA has automatic storage duration and ceases to exist once the enclosing scope ends. VLAs also cannot be globals.
If your implementation doesn't support VLA's, automatic storage space is a concern, or you need a global variable for some nefarious purpose, you'll have to manage this dynamically (which it sounds like you want to do anyway). To do that, declare a pointer to an array of 8 elements, as such:
int rows;
// get rows somehow
int (*table)[8] = malloc(rows * sizeof(*table));
The rest is straight forward. You can reference your elements as table[i][j] for i in 0..rows-1 and j in 0..7. Just remember to free your allocation when finished:
free(table);
and don't reference it again.
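If the number of rows has to change after the first allocation, the same pointer-to-array type also works with realloc. A minimal sketch, where the helper name resize_rows and the COLS macro are assumptions made for the example:

```c
#include <assert.h>
#include <stdlib.h>

#define COLS 8   /* fixed number of columns, as in the question */

/* Resizes *table to new_rows rows of COLS ints each; existing rows are kept.
   Returns 0 on success, or -1 on failure, in which case *table is unchanged. */
int resize_rows(int (**table)[COLS], size_t new_rows) {
    assert(table != NULL && new_rows > 0);
    int (*p)[COLS] = realloc(*table, new_rows * sizeof **table);
    if (p == NULL)
        return -1;
    *table = p;
    return 0;
}
```

Starting from int (*table)[COLS] = NULL;, a call such as resize_rows(&table, 2) behaves like the malloc above, and later calls with a larger row count keep the existing rows intact. The added rows are uninitialized until you write to them.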
As far as I know, you can't have foo[][8] in C. You might be able to hack around it by making a struct and casting a pointer to that struct to an array, as discussed here, but that is a somewhat fragile hack.
What you can do is change the definition of rows and columns in your problem space, so that, in order to access row i, column j, you would do foo[j][i] instead of foo[i][j].
In this case you could declare your array like this: <typename> * foo[8].
I'd go with this approach when the dimensions are unknown.
Assuming data type to be int.
int* a; //this will point to your 2D array
allocate it when you know the dimensions (ROW, COL):
a = malloc(sizeof(int)*ROW*COL);
and access it like
a[COL*i + j] = value; // equivalent of a[i][j]
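In a row-major layout, each row occupies a run of consecutive elements as long as the number of columns, so the row index is multiplied by the column count. A small sketch with a hypothetical helper flat_index makes the arithmetic explicit:

```c
#include <assert.h>
#include <stdlib.h>

/* Row-major offset of element (i, j) in a grid that has cols columns. */
size_t flat_index(size_t i, size_t j, size_t cols) {
    assert(j < cols);
    return i * cols + j;
}
```

For a grid with 8 columns, element (2, 7) lives at offset 2 * 8 + 7 = 23, so a[flat_index(2, 7, 8)] plays the role of a[2][7].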
I don't think the array can be created if you do not supply its size at compile time. My suggestion is to use dynamic memory allocation, as you don't know how many rows there will be.

Is there a way to initialize an array without defining the size

Is there a way to initialize an array without defining the size?
The size of the array should increase on its own: as the loop runs, it reallocates the array.
There is no such thing out of the box. You will have to create your own array-like data structure that does this. It shouldn't be very hard to implement, if you're careful.
What you're looking for is, roughly, a data structure that, when created, allocates (using malloc, for instance) a predefined size and starts using the consecutive space inside it as slots of an array. Then, as more items are added, it reallocates (say, using realloc) that space.
Of course, you won't be able to use the indexer syntax you're used to with simple arrays. Instead, your data structure will have to provide its own pair of set/get functions that take care of the above, under the hood. Therefore, the set function will check the index specified in its arguments and, if that index is greater than the current size of the array, perform a reallocation. Then, in any case, set the value provided to the specified index.
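A rough sketch of such a set/get pair, under the assumption that elements are ints and that slots which were never set should read as 0. The names AutoArray, auto_set and auto_get are invented for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int *items;
    size_t size;   /* number of addressable slots */
} AutoArray;

/* Stores value at index, growing the array (zero-filled) when index is
   outside the current size. Returns 0 on success, -1 on allocation failure. */
int auto_set(AutoArray *arr, size_t index, int value) {
    assert(arr != NULL);
    if (index >= arr->size) {
        /* Double until the index fits, starting from 8 slots. */
        size_t new_size = arr->size ? arr->size : 8;
        while (new_size <= index)
            new_size *= 2;
        int *p = realloc(arr->items, new_size * sizeof *p);
        if (p == NULL)
            return -1;
        memset(p + arr->size, 0, (new_size - arr->size) * sizeof *p);
        arr->items = p;
        arr->size = new_size;
    }
    arr->items[index] = value;
    return 0;
}

/* Returns the value at index, or 0 if index is outside the current size. */
int auto_get(const AutoArray *arr, size_t index) {
    assert(arr != NULL);
    return index < arr->size ? arr->items[index] : 0;
}
```

Usage starts from AutoArray arr = {NULL, 0}; and ends with free(arr.items). The sketch does not guard against an index so large that doubling overflows size_t.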
You can initialize an array without specifying the size but it would not be useful unless you allocated space for it before you used it. Normally when you declare a variable in C, the compiler reserves a specific amount of memory for that variable on the "stack". If you want an array to be able to grow throughout the program, however, this is not what you are looking for because the amount of space allocated for a variable on the "stack" is static.
The solution, therefore, is to have the program decide how much memory to allocate to your variable at run-time, instead of compile-time. This way, while the program is running, you will be able to decide how much space your variable needs to have reserved.
In practice, this is called dynamic memory allocation and it is accomplished in C using the functions malloc() and realloc(). I would suggest reading up on these functions, I think they will be very useful to you.
If you have follow up questions feel free to ask.
One last thing!
Whenever you use malloc() to allocate memory for a variable, you should remember to call the function free() on that variable at the end of the program or whenever you are done using the variable.
Here's a simple implementation of such a datastructure (for ints, but you can replace the int with whatever type you need). I've omitted error-handling for clarity.
#include <stdlib.h>
typedef struct array_s {
int len, cap;
int *a;
} array_s, *array_t;
/* Create a new array with 0 length, and the given capacity. */
array_t array_new(int cap) {
array_t result = malloc(sizeof(array_s));
array_s a = {0, cap, malloc(sizeof(int) * cap)};
*result = a;
return result;
}
/* Destroy an array. */
void array_free(array_t a) {
free(a->a);
free(a);
}
/* Change the size of an array, truncating if necessary. */
void array_resize(array_t a, int new_cap) {
    a->cap = new_cap;
    a->a = realloc(a->a, new_cap * sizeof(int));
    if (a->len > a->cap) {
        a->len = a->cap;
    }
}
/* Add a new element to the end of the array, resizing if necessary. */
void array_append(array_t a, int x) {
    if (a->len == a->cap) {
        // use at least 4 slots in case cap is 0.
        array_resize(a, a->cap * 2 > 4 ? a->cap * 2 : 4);
    }
    a->a[a->len++] = x;
}
By storing len (the current length of the array), and cap (the amount of space you've reserved for the array), you can extend the array in O(1) up to the point when len is cap, then resize the array (eg: using realloc), perhaps by multiplying the existing cap by 2 or 1.5 or something. This is what most vector or list types do in languages that support resizable arrays. I've coded this in array_append as an example.

Dynamic Array printing

I am trying to print a dynamic array, but I am having trouble with the bounds for the array.
For a simple example, let's say I'm trying to loop through an array of ints. How can I get the size of the array? I was trying to divide the size of the array by the size of the type, like this: sizeof(list)/sizeof(int), but that was not working correctly. I understand that I was actually dividing the size of the pointer by the size of the type.
int *list;
// Populate list
int i;
for(i = 0; i < ????; i++)
printf("%d", list[i]);
With dynamic arrays you need to maintain a pointer to the beginning address of the array and a value that holds the number of elements in that array. There may be other ways, but this is the easiest way I can think of.
sizeof(list) will not give you the array size either, because the compiler calculates the size of an integer pointer, not the size of your array. This is typically 4 bytes on a 32-bit platform and 8 bytes on a 64-bit one.
YOU should know the size of the array, as you are who allocated it.
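sizeof is an operator, which means it normally does its job at compile time (variable length arrays are the exception). It will give you the size of an object in bytes, but not the number of elements behind a pointer.
So, sizeof(int*) is 4 or 8 bytes (32 or 64 bits), depending on the architecture.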
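The difference between a true array and a pointer can be sketched as follows. ARRAY_LEN and sum_ints are hypothetical names; the macro is only meaningful where its argument is still an array and has not decayed to a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a true array; gives a wrong answer if arr is a pointer. */
#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Inside a function the parameter is just a pointer, so the element count
   has to be passed in separately. */
long sum_ints(const int *list, size_t count) {
    assert(list != NULL || count == 0);
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += list[i];
    return total;
}
```

For int fixed[10], ARRAY_LEN(fixed) is 10, whereas for int *p = fixed, sizeof p is only the size of the pointer itself. That is why the count is passed alongside the pointer, as in sum_ints(p, 10).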
Take a look at std::vector.
There is no standardized method to get the size of allocated memory block. You should keep size of list in unsigned listSize like this:
int *list;
unsigned listSize;
list = malloc(x * sizeof(int));
listSize = x;
If you are coding in C++, then it is better to use STL container like std::vector<>
As you wrote, you really did divide the size of a pointer, since list is declared as a pointer and not as an array. In such cases, you should keep track of the size of the list while building it, or terminate the list with a special sentinel value (like the NULL that ends an array of pointers) that will never be used as real data in the array.
Seeing some of the inappropriate links to C++ tools for a C question, here is an answer for modern C.
Your ideas of what went wrong are quite correct: the way you did it, you only have a pointer and no size information about the allocation.
Modern C has variable length arrays (VLA) that you can either use directly or via malloc. Directly:
int list[n];
and then your idea with sizeof works out of the box, even if you change n in the meantime. This use is to be taken with a bit of care, since the array is allocated on the stack. You shouldn't reserve too much there. For use with malloc:
int (*list)[n] = malloc(sizeof *list);
Then you'd have to adapt your code a bit, basically putting (*list) everywhere you had just list.
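A small sketch of the malloc variant, assuming a compiler with VLA support (mandatory in C99, optional since C11). The function name vla_demo is made up; it only shows that sizeof recovers the element count from the pointer-to-array type.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates n ints on the heap through a pointer to a variable length
   array, fills them with 0..n-1 and returns their sum (-1 if malloc fails). */
long vla_demo(size_t n) {
    assert(n > 0);
    int (*list)[n] = malloc(sizeof *list);
    if (list == NULL)
        return -1;
    /* sizeof *list is evaluated at run time and covers all n elements. */
    size_t count = sizeof *list / sizeof((*list)[0]);
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        (*list)[i] = (int)i;
        total += (*list)[i];
    }
    free(list);
    return total;
}
```

The size information lives in the type of list, not in the allocated block, so it is lost again as soon as the pointer is converted to a plain int * (for example when passed to a function that takes int *).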
If by size you mean the number of elements, then you could keep it in a counter that is incremented each time you push an element. Or, if you don't mind the lost cycles, you could write a function that takes a copy of the pointer to the first node and runs through the list, keeping a counter, until it finds a node k with k.next == NULL. Or you could keep a list that has both a next and a prev pointer, so that you wouldn't care if you lost the beginning.

Resources