how to deal with large 2D arrays


I have a 2D array of size 5428x5428, and it is symmetric. When compiling, I get an error saying that the array size is too large. Can anyone suggest a way around this?

This array is too large for the program's stack memory; that's your error.
int main()
{
double arr[5428][5428]; // 8bytes*5428*5428 = 224MB
// ...
// use arr[y][x]
// ...
// no memory freeing needed
}
Use dynamic array allocation:
#include <stdlib.h>
int main()
{
int i;
double ** arr;
arr = (double**)malloc(sizeof(double*)*5428);
for (i = 0; i < 5428; i++)
arr[i] = (double*)malloc(sizeof(double)*5428);
// ...
// use arr[y][x]
// ...
for (i = 0; i < 5428; i++)
free(arr[i]);
free(arr);
}
Or allocate a plain array of size M*N and use ptr[y*width + x]:
#include <stdlib.h>
int main()
{
double * arr;
arr = (double*)malloc(sizeof(double)*5428*5428);
// ...
// use arr[y*5428 + x]
// ...
free(arr);
}
Or use a combined method:
#include <stdlib.h>
int main()
{
int i;
double * arr[5428]; // sizeof(double*)*5428 = about 21 KB of stack on 32-bit x86
for(i = 0; i < 5428; i++)
arr[i] = (double*)malloc(sizeof(double)*5428); // cast to a pointer type, not to double
// ...
// use arr[y][x]
// ...
for(i = 0; i < 5428; i++)
free(arr[i]);
}

When arrays get large, there are a number of solutions. The one that is good for you depends heavily on what you are actually doing.
I'll list a few to get you thinking:
Buy more memory.
Move your array from the stack to the heap.
The stack has tighter size limitations than the heap.
Simulate portions of the array (you say yours is symmetric, so just under 1/2 of the data is redundant).
In your case, the array is symmetric, so instead of using an array, use a "simulated array"
int getArray(array, col, row);
void setArray(array, col, row, value);
where array is a data structure that holds only the lower-left half and the diagonal. getArray(..) then checks whether the column is greater than the row and, if it is, returns getArray(array, row, col) (note the reversed indices). This leverages the symmetry of the array without actually storing both symmetric halves.
Simulate the array using a list (or tree or hash table) of "only the value holding items"
This works very well for sparse arrays, as you no longer need to allocate memory to hold large numbers of zero (or empty) values. In the event that someone "looks up" a non-set value, your code "discovers" no value set for that entry, and then returns the "zero" or empty value without it actually being stored in your array.
Again without more details, it is hard to know what kind of solution is the best approach.
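The "simulated array" idea for a symmetric matrix can be sketched as follows. This is only a minimal illustration, assuming double elements; the names SymMatrix, sym_init, sym_get, sym_set and sym_free are made up for this example and stand in for the getArray/setArray pair above. Only the lower triangle and the diagonal are stored, row by row.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical packed storage for a symmetric n x n matrix:
   only the lower triangle and the diagonal are kept. */
typedef struct {
    size_t n;      /* logical dimension */
    double *data;  /* n * (n + 1) / 2 elements */
} SymMatrix;

/* Map (row, col) to an offset in the packed buffer.
   Indices are swapped when col > row, so both halves share one slot. */
static size_t sym_index(size_t row, size_t col)
{
    if (col > row) {
        size_t tmp = row;
        row = col;
        col = tmp;
    }
    return row * (row + 1) / 2 + col;
}

/* Returns 0 on success, -1 if the allocation fails. */
int sym_init(SymMatrix *m, size_t n)
{
    m->n = n;
    m->data = calloc(n * (n + 1) / 2, sizeof *m->data);
    return m->data ? 0 : -1;
}

double sym_get(const SymMatrix *m, size_t row, size_t col)
{
    assert(row < m->n && col < m->n);
    return m->data[sym_index(row, col)];
}

void sym_set(SymMatrix *m, size_t row, size_t col, double value)
{
    assert(row < m->n && col < m->n);
    m->data[sym_index(row, col)] = value;
}

void sym_free(SymMatrix *m)
{
    free(m->data);
    m->data = NULL;
}
```

For n = 5428 this stores 5428 * 5429 / 2 = 14,734,306 doubles, about 112 MiB instead of about 225 MiB for the full array.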

When you create local variables, they go on the stack, which is of limited size. You're blowing through that limit.
You want your array to go on the heap, which is all the virtual memory your system has, i.e. gigs and gigs on a modern system. There are two ways to manage that. One is to dynamically allocate the array as in k06a's answer; use malloc() or your platform-specific allocator function (e.g. GlobalAlloc() on Windows) . The second is to declare the array as a global or module static variable, outside of any function.
Using a global or static has the disadvantage that this memory will be allocated for the entire lifetime of your program. Also, pretty much everybody hates globals on principle. On the other hand, you can use the two-dimensional array syntax, "array[x][y]" and the like, to access array elements... easier than doing array[x + y * width], plus you don't have to remember whether you're supposed to be doing "x + y * width" or "x * height + y" .
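As a rough sketch of the second option, the array can be declared at file scope with static storage, which keeps it off the stack while preserving the two-dimensional syntax. The names grid, grid_set and grid_get are invented for this illustration, and the symmetric write in grid_set is just one way the array might be used.

```c
#include <assert.h>
#include <stddef.h>

#define N 5428

/* File-scope static storage: not on the stack, zero-initialized,
   and alive for the whole run of the program. */
static double grid[N][N];

/* Store a value in both symmetric positions using plain 2D indexing. */
void grid_set(size_t row, size_t col, double value)
{
    assert(row < N && col < N);
    grid[row][col] = value;
    grid[col][row] = value;
}

double grid_get(size_t row, size_t col)
{
    assert(row < N && col < N);
    return grid[row][col];
}
```

Making the array static at file scope limits its visibility to one source file, which softens the usual objection to globals somewhat.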

Related

How can I shorten an array?

I wanted to create a function that deletes from an array of segments the ones that are longer than a given number, by freeing the memory I don't need anymore. The problem is that the function I've created frees also all the memory allocated after the given point. How can I limit it, so that it frees just one pointer without compromising the others?
Here is the code I've written so far:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
typedef struct
{
double x1;
double y1;
double x2;
double y2;
} Segment;
double length(Segment* s)
{
return sqrt(pow(s->x1 - s->x2, 2) + pow(s->y1 - s->y2, 2));
}
// HERE IS THE PROBLEM!!
void delete_longer(Segment* as[], int n, double max_len)
{
for(int i = 0; i < n; i++)
{
if(length(as[i]) > max_len)
{
as[i] = NULL; // Those two lines should be swapped, but the problem remains
free(as[i]);
}
}
}
int main()
{
const int SIZE = 5;
Segment** arr = (Segment**)calloc(SIZE, sizeof(Segment*));
for(int i = 0; i < SIZE; i++)
{
arr[i] = (Segment*)malloc(sizeof(Segment));
}
srand(time(0));
for(int i = 0; i < SIZE; i++)
{
arr[i]->x1 = rand() % 100;
arr[i]->x2 = rand() % 100;
arr[i]->y1 = rand() % 100;
arr[i]->y2 = rand() % 100;
printf("Lungezza: %d\n", (int)length(arr[i]));
}
delete_longer(arr, SIZE, 80);
for(int i = 0; i < SIZE && arr[i]; i++)
{
printf("Lunghezza 2: %d\n", (int)length(arr[i]));
}
return 0;
}

First of all, free should be called before the pointer is set to NULL, but that's not the main cause of the problem.
What caused the behaviour I described was the fact that the second for loop in main stops after finding the first NULL pointer. Instead I should have written:
for(int i = 0; i < SIZE ; i++)
{
if(arr[i])
printf("Lunghezza 2: %d\n", (int)length(arr[i]));
}

You have two main problems:
In the delete function you write:
as[i] = NULL;
free(as[i]);
This is the wrong order. You must first free the memory and then set the element to null. But note that this is not the cause of your perceived problem, it only causes a memory leak (i.e. the memory of as[i] becomes inaccessible). You should write:
free(as[i]);
as[i] = NULL;
Your second problem is in your for loop, which stops at the first null element. So the memory after it is not actually deleted; you just don't print it. The loop should be, for example:
for(int i = 0; i < SIZE; i++)
{
printf("Lunghezza 2: %d\n", arr[i]?(int)length(arr[i]):0);
}
Note: free(NULL) is defined as a no-op by the C standard, although some pre-standard library implementations did not handle it. As a matter of personal style, I still prefer never to pass free a null pointer.
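Putting the two fixes together, a corrected delete function might look like the sketch below. It is only an illustration: the return value (number of segments freed) and the helper length_squared are additions not present in the question, and comparing squared lengths is a swapped-in shortcut that avoids sqrt, assuming max_len is not negative.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    double x1;
    double y1;
    double x2;
    double y2;
} Segment;

/* Squared length; enough for comparisons and needs no math library. */
static double length_squared(const Segment *s)
{
    double dx = s->x1 - s->x2;
    double dy = s->y1 - s->y2;
    return dx * dx + dy * dy;
}

/* Frees every segment longer than max_len and leaves NULL in its slot.
   Returns the number of segments that were freed. */
int delete_longer(Segment *as[], int n, double max_len)
{
    assert(as != NULL && max_len >= 0.0);
    int freed = 0;
    for (int i = 0; i < n; i++) {
        if (as[i] != NULL && length_squared(as[i]) > max_len * max_len) {
            free(as[i]);   /* release the memory first ... */
            as[i] = NULL;  /* ... then mark the slot as empty */
            freed++;
        }
    }
    return freed;
}
```

Any loop that walks the array afterwards has to skip NULL slots rather than stop at the first one.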

There's no way to change the size of an array at runtime. The compiler assigns the memory statically, and even automatic arrays are fixed size (the exception is variable-length arrays, introduced in C99, whose size can be chosen at declaration time, but even then the size stays fixed until the array goes out of scope). The reason is that, once allocated, the memory of an array is surrounded by other declarations that, being fixed, make it difficult to use the memory otherwise.
The alternative is to allocate the array dynamically. You allocate a fixed number of cells and store with the array not only its size but also its capacity (the maximum number of cells it is allowed to grow to). Keep in mind that erasing an element of an array requires moving all the elements behind it one place towards the front, which is in general an expensive thing to do. If your array is filled with references to other objects, a common technique is to put NULL pointers in unused cells, or to shift all the following elements one place towards the beginning.
Whatever technique you use, arrays are a very efficient way to access multiple objects of the same type, but they are difficult to shorten or lengthen.
Finally, a common technique for treating arrays as variable length is to allocate a fixed number of cells initially and, if you need more memory, allocate double the original space (there are other approaches, like growing the array along a Fibonacci sequence), keeping track of both the size of the array and its actual capacity. Only when the array is full do you call a function that allocates a new, larger array, adjusts the capacity, copies the elements to the new array, and deallocates the old one. This will work until you fill it again.
You don't post any code, so I shall do the same. If you have some issue with some precise code, don't hesitate to post it in your question, I'll try to provide you with a working solution.
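For an array of pointers in which deleted entries have been set to NULL, the "shift elements towards the beginning" idea can be sketched in a single pass instead of one shift per deletion. The function name compact and the use of void * elements are assumptions made for this illustration; for a Segment * array the element type would be changed accordingly.

```c
#include <assert.h>
#include <stddef.h>

/* Moves all non-NULL pointers to the front, preserving their order.
   Returns the new logical size; the slots past it are set to NULL. */
size_t compact(void *items[], size_t n)
{
    assert(items != NULL || n == 0);
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (items[i] != NULL)
            items[kept++] = items[i];
    }
    for (size_t i = kept; i < n; i++)
        items[i] = NULL;
    return kept;
}
```

The underlying allocation keeps its capacity; only the logical size returned by the function shrinks.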

Shift elements by one index with memmove

I am trying to shift the elements in a dynamically created 3d array by one index, so that each element [i][j][k] should be on [i+1][j][k].
This is how my array creation looks like
typedef struct stencil{
int ***arr;
int l;
int m;
int n;}matrix;
void createMatrix(matrix *vector){
vector->arr = (int***) malloc(sizeof(int**) * (vector->l+2));
for (int i = 0; i< vector->l+2; ++i) {
vector->arr[i] = (int**) malloc(sizeof(int*) * (vector->m+2));
for (int j = 0; j < vector->m+2; ++j) {
vector->arr[i][j] = (int*) calloc((vector->n+2),sizeof(int));
}
}
}
This is basically what I want to achieve with memmove
for(int i = vector->l-1; i >= 0; --i){
for(int j = vector->m; j >= 0; --j){
for(int k = vector->n; k >= 0; --k){
vector->arr[i+1][j][k] = vector->arr[i][j][k];
}
}
}
for some reason memmove shifts 2 indices.
memmove(&(vector->arr[1][1][1]), &(vector->arr[0][1][1]), (vector->l+2)*(vector->m+2)*(vector->n)*sizeof(int*));
Could anyone give me a hint?

When you create a dynamic multi-dimensional array like this, the array contents are not contiguous -- each row is a separate allocation. So you can't move it all with a single memmove().
But you don't need to copy all the data, just shift the pointers in the top-level array.
int **temp = arr[l-1]; // save last pointer, which will be overwritten
memmove(&arr[1], &arr[0], (l-1) * sizeof(arr[0])); // l = number of top-level entries
arr[0] = temp;
I've shifted the last element around to the first, to avoid having two elements that point to the same data. You could also free the old last element (including freeing the arrays it points to) and create a new first element, but this was simpler.
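A self-contained version of this pointer rotation might look like the following sketch. The function name shift_planes and the parameter planes are assumptions for the example; planes is the number of entries in the top-level array, which would be l + 2 for the allocation shown in the question.

```c
#include <assert.h>
#include <string.h>

/* Rotates the top-level pointers of an int*** array by one position:
   plane i moves to index i + 1 and the last plane wraps around to index 0.
   No element data is copied, only `planes` pointers are moved. */
void shift_planes(int ***arr, size_t planes)
{
    assert(arr != NULL);
    if (planes < 2)
        return;
    int **last = arr[planes - 1];  /* would otherwise be overwritten */
    memmove(&arr[1], &arr[0], (planes - 1) * sizeof arr[0]);
    arr[0] = last;
}
```

Because the old last plane ends up at index 0, its contents are stale data and may need to be cleared before reuse.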

Compile with a higher optimization level (-O3). Obtain a direct reference on vector->arr instead of forcing dereferencing on every single array access.
Your call to memmove looks half correct under the assumption that you allocated arr as continuous memory. However, since you said "dynamic", I very much doubt that. Plus the size calculation appears very much wrong, with the sizeof(int*).
I suppose arr is not int arr[constexpr][constexpr][constexpr] (single, continuous allocation), but rather int ***arr.
In which case the memmove goes horribly wrong. After moving the int** contents of the arr field by one (which actually already did the move), it caused a nasty overflow on the heap, most likely by chance hitting also a majority of the int* allocations following.
Looks like a double move, and leaves behind a completely destroyed heap.

Simply doing this would work (illustrated with a 3d array):
memmove(arr[1], arr[0], Y*Z*sizeof(int));
where Y and Z denote the other two dimensions of the 3d array.
Here arr[X][Y][Z] is the int array, with X>=2. This call copies one Y*Z plane; to shift every plane up by one, move (X-1)*Y*Z*sizeof(int) bytes instead.
In the case of dynamically allocated memory you need to do each contiguous chunk one by one. Then it would work.
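For the dynamically allocated int*** layout, copying "each chunk one by one" could be sketched like this. The name shift_data and its parameters are made up for the illustration; planes, rows and cols are the allocated extents of the three levels. Rows are separate allocations that never overlap, so memcpy is sufficient here, and planes are processed from the back so nothing is overwritten before it has been copied.

```c
#include <assert.h>
#include <string.h>

/* Copies the contents of plane i - 1 into plane i for every i >= 1,
   one row at a time. Plane 0 keeps its old contents. */
void shift_data(int ***arr, size_t planes, size_t rows, size_t cols)
{
    assert(arr != NULL && planes >= 1);
    for (size_t i = planes - 1; i > 0; i--) {
        for (size_t j = 0; j < rows; j++)
            memcpy(arr[i][j], arr[i - 1][j], cols * sizeof arr[i][j][0]);
    }
}
```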

Add item to empty array in C and getting array length

I've made many attempts at solving this problem but failed every time.
I have an array
char *array[1024] = {};
Now I would like to add an item to the array and would also access the items by numbers
For example:
array[0] would be the first item
array[1] would be the second
array[2] would be the third item
But also I would like to know how many items are in the array so I could use something like
for(int i = 0; i <= totalitemsinarray; i++) {
print(array[i]);
}

You cannot change the size of an array in C. You can however allocate a sufficiently large array and then fill it up with entries. First, declare an array with a sufficient size, say, 1024.
char *array[1024];
Then declare a variable fill that counts the number of used slots in array. Initialize it to 0 as 0 slots are used in the beginning. Then, each time you insert an item, increment fill:
array[fill++] = ...;
...
array[fill++] = ...;
Make sure that you never attempt to insert more than 1024 items into the array, C doesn't check that for you.
For a more flexible approach, use malloc() to allocate memory for the array and then periodically enlarge it with realloc() when it's full. If you increase the array size in exponential steps (say, multiply by Φ = 0.5 + 0.5 √5 ≈ 1.62), this runs in O(1) amortised time per entry inserted.
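The fixed-size variant with a fill counter can be sketched as below. The names StringList and list_add are invented for this example; the bounds check is the part that C will not do for you.

```c
#include <assert.h>
#include <stddef.h>

#define CAPACITY 1024

typedef struct {
    const char *items[CAPACITY];
    size_t fill;  /* number of used slots */
} StringList;

/* Appends item; returns 0 on success, -1 if the list is already full. */
int list_add(StringList *list, const char *item)
{
    assert(list != NULL);
    if (list->fill >= CAPACITY)
        return -1;
    list->items[list->fill++] = item;
    return 0;
}
```

Iterating over the stored items is then a loop with the condition i < list.fill, using < rather than <=, since valid indices run from 0 to fill - 1.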

There is no way to do what you're asking directly with C. One option could be if you knew that only certain values were valid. For example, you have an array of char *s so often people use NULL as a flag/invalid value. In that case you could initialize your array to have all NULLs and use that to know the size of the array:
char *array[1024];
memset(array, 0, sizeof(array));
/* .... */
for (int i = 0; i < sizeof(array)/sizeof(char*); i++) {
if (array[i]) {
printf("%s\n", array[i]);
}
}

char *array[1024] = {};
First, that is an array with 1024 char pointers/strings. Those elements can be 0s or plain garbage. If you don't plan to set them all you may want to nullify the array.
For the matter of storing the values and the count you might want to have a look at structs. For example:
typedef struct elem {
int count;
char *value;
} elem;
Then elem.count would be the number and elem.value would be the value accordingly.
And then initialize them in a for loop.

The only really valid way to approach this, is to dynamically grow the array. Allocate the array on the heap, and manage two counts: 1. the count of currently used elements, and 2. the count of elements for which you currently have memory allocated. Something like this:
//the setup
size_t arrayLength = 0, allocatedSize = 8;
int* array = malloc(sizeof(*array) * allocatedSize);
//grow the array -> first check that we have space to add an element
if(arrayLength == allocatedSize) {
array = realloc(array, sizeof(*array) * (allocatedSize *= 2)); // size is in bytes, not elements
assert(array);
}
assert(arrayLength < allocatedSize);
//grow the array -> add an element
array[arrayLength++] = ...;
You see, the realloc() call is not too much hassle, but it will protect you from bugs when the requirements change. My experience is that any fixed limit in the code, as insanely large as it may seem to be, will eventually be exceeded, and miserable failure will result. The only safeguard is to use as much memory as needed everywhere.
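Wrapped into a small type, the same grow-on-demand scheme might look like this sketch. IntVec, intvec_push and intvec_free are names made up for the example, and the initial capacity of 8 simply mirrors the snippet above. Unlike the snippet, the result of realloc() goes into a temporary first, so the old block is not lost if the call fails.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int *data;
    size_t length;    /* elements currently in use */
    size_t capacity;  /* elements currently allocated */
} IntVec;

/* Appends value, doubling the capacity when the array is full.
   Returns 0 on success, -1 on allocation failure (the vector stays intact). */
int intvec_push(IntVec *v, int value)
{
    assert(v != NULL);
    if (v->length == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 8;
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        v->data = p;
        v->capacity = new_cap;
    }
    v->data[v->length++] = value;
    return 0;
}

void intvec_free(IntVec *v)
{
    free(v->data);
    v->data = NULL;
    v->length = 0;
    v->capacity = 0;
}
```

A vector starts out zero-initialized (IntVec v = {0};), which works because realloc() with a null pointer behaves like malloc().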

Int-stream in C

I'm implementing a function in C where I convert a byte[] to an int[]. The problem is that the length of the int[] depends on the contents of the byte[] (not just the length of the byte[]), so I won't know the total length of the int[] until I've iterated over the entire byte[]. I'm therefore looking for some form of int-stream or dynamically growing int-list which I can write to and then convert to an int[] once I'm done writing all the ints. My C experience is a bit limited at the moment, so I'm not really sure what's considered best practice for this kind of problem. Any suggestions?

The easiest method would be to allocate the int[] to be the same length (number of elements) as the byte[], and when you're done and know the size, call realloc to shrink it.
This assumes, of course, that interpreting the data would never create more integers than there are bytes in the stream.
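A sketch of this allocate-then-shrink approach is shown below. The conversion rule itself (keep only the non-zero bytes) is purely hypothetical and stands in for whatever the real transformation does; the function name bytes_to_ints is likewise an assumption. The relevant part is the upper-bound allocation followed by realloc().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical conversion: keeps only the non-zero bytes, widened to int.
   On success returns a malloc'ed array and stores its length in *out_len.
   Returns NULL on allocation failure or when no ints were produced. */
int *bytes_to_ints(const unsigned char *bytes, size_t n, size_t *out_len)
{
    assert(out_len != NULL);
    *out_len = 0;
    if (n == 0)
        return NULL;

    int *out = malloc(n * sizeof *out);  /* upper bound: one int per byte */
    if (out == NULL)
        return NULL;

    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (bytes[i] != 0)
            out[count++] = bytes[i];
    }
    if (count == 0) {
        free(out);
        return NULL;
    }

    /* Shrink to the size actually used; if realloc fails, the original
       (larger) block is still valid, so keep using it. */
    int *shrunk = realloc(out, count * sizeof *out);
    if (shrunk != NULL)
        out = shrunk;

    *out_len = count;
    return out;
}
```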

There are a few ways of doing this I can think of.
I'm assuming, based on your question, that the transformation of your char[] to the corresponding int[]s is expensive (which is why you want to avoid performing that calculation twice: once to determine the size, and again to populate the contents).
So, here's how I would go about it:
First, is there a maximum size you can associate to the transformation? EX: Is there a maximum 2-to-1 size difference? (For each char in the char[] can it create "up to X" ints?)
If this is the case, and memory usage isn't an issue (you're not super constrained) - Go ahead and alloc the maximum size, populate it as you perform your translation, and realloc when you're done to shrink your memory footprint.
If this is not the case, you're in tougher waters, and should look to non-contiguous schemes - such as a linked list. Once you've performed your translation and built your linked list, you can then allocate space for your array, and visit each element in the linked list to populate the array.

First, inspect byte[] to determine the resulting int[] size. Then use malloc() to allocate the appropriately sized int[] structure.
#include <stdlib.h>
...
// imagine that the resulting int[] size depends on the sum of the bytes
int j, size = 0;
for (j = 0; byte[j]; ++j)
size += byte[j];
int *int_array = (int *) malloc (size * sizeof (int)); // malloc takes a byte count
for (j = 0; j < size; ++j)
int_array [j] = whatever;

First, if you can use C++, then you can just use a vector, which is a dynamically-sized array. Otherwise, you'll have to first iterate through your byte array to determine what the int array size should be, then dynamically allocate the int array. Second, C doesn't have a byte type, so the type normally used is char.
#include <stdlib.h>
char byte_array[ size ];
int i, int_size = 0;
int *int_array;
for ( i = 0; i < size; i++ ) {
int_size += f( byte_array[i] );
}
int_array = (int*) malloc( int_size * sizeof *int_array ); /* malloc takes a byte count */
where f() is some function you write that looks at one element of the byte array to help determine how large the int array should be.

Coding problem using a 2-d array of structs inside another struct in C

I am working with a 2-dimensional array of structs which is a part of another struct. It's not something I've done a lot with so I'm having a problem. This function ends up failing after getting to the "test" for-loop near the end. It prints out one line correctly before it seg faults.
The parts of my code which read data into a dummy 2-d array of structs work just fine, so the problem must be in how I assign the array to be part of another struct (the imageStruct).
Any help would be greatly appreciated!
/*the structure of each pixel*/
typedef struct
{
int R,G,B;
}pixelStruct;
/*data for each image*/
typedef struct
{
int height;
int width;
pixelStruct *arr; /*pointer to 2-d array of pixels*/
} imageStruct;
imageStruct ReadImage(char * filename)
{
FILE *image=fopen(filename,"r");
imageStruct thisImage;
/*get header data from image*/
/*make a 2-d array of pixels*/
pixelStruct imageArr[thisImage.height][thisImage.width];
/*Read in the image. */
/*I know this works because after storing the image data in the
imageArr array, I printed each element from the array to the
screen.*/
/*so now I want to take the array called imageArr and put it in the
imageStruct called thisImage*/
thisImage.arr = malloc(sizeof(imageArr));
//allocate enough space in struct for the image array.
*thisImage.arr = *imageArr; /*put imageArr into the thisImage imagestruct*/
//test to see if assignment worked: (this is where it fails)
for (i = 0; i < thisImage.height; i++)
{
for (j = 0; j < thisImage.width; j++)
{
printf("\n%d: R: %d G: %d B: %d\n", i ,thisImage.arr[i][j].R,
thisImage.arr[i][j].G, thisImage.arr[i][j].B);
}
}
return thisImage;
}
(In case you are wondering why I am using a dummy array in the first place, well it's because when I started writing this code, I couldn't figure out how to do what I am trying to do now.)
EDIT: One person suggested that I didn't initialize my 2-d array correctly in the typedef for the imageStruct. Can anyone help me correct this if it is indeed the problem?

You seem to be able to create variable-length-arrays, so you're on a C99 system, or on a system that supports it. But not all compilers support those. If you want to use those, you don't need the arr pointer declaration in your struct. Assuming no variable-length-arrays, let's look at the relevant parts of your code:
/*data for each image*/
typedef struct
{
int height;
int width;
pixelStruct *arr; /*pointer to 2-d array of pixels*/
} imageStruct;
arr is a pointer to pixelStruct, and not to a 2-d array of pixels. Sure, you can use arr to access such an array, but the comment is misleading, and it hints at a misunderstanding. If you really wish to declare such a variable, you would do something like:
pixelStruct (*arr)[2][3];
and arr would be a pointer to an "array 2 of array 3 of pixelStruct", which means that arr points to a 2-d array. This isn't really what you want. To be fair, this isn't what you declare, so all is good. But your comment suggests a misunderstanding of pointers in C, and that is manifested later in your code.
At this point, you will do well to read a good introduction to arrays and pointers in C, and a really nice one is C For Smarties: Arrays and Pointers by Chris Torek. In particular, please make sure you understand the first diagram on the page and everything in the definition of the function f there.
Since you want to be able to index arr in a natural way using "column" and "row" indices, I suggest you declare arr as a pointer to pointer. So your structure becomes:
/* data for each image */
typedef struct
{
int height;
int width;
pixelStruct **arr; /* Image data of height*width dimensions */
} imageStruct;
Then in your ReadImage function, you allocate memory you need:
int i;
thisImage.arr = malloc(thisImage.height * sizeof *thisImage.arr);
for (i=0; i < thisImage.height; ++i)
thisImage.arr[i] = malloc(thisImage.width * sizeof *thisImage.arr[i]);
Note that for clarity, I haven't done any error-checking on malloc. In practice, you should check if malloc returned NULL and take appropriate measures.
Assuming all the memory allocation succeeded, you can now read your image in thisImage.arr (just like you were doing for imageArr in your original function).
Once you're done with thisImage.arr, make sure to free it:
for (i=0; i < thisImage.height; ++i)
free(thisImage.arr[i]);
free(thisImage.arr);
In practice, you will want to wrap the allocation and deallocation parts above in their respective functions that allocate and free the arr object, and take care of error-checking.
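One possible shape for those wrapper functions is sketched below, with the error checking that was left out above. The names image_alloc and image_free are assumptions for this example, not something from the question.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int R, G, B;
} pixelStruct;

typedef struct {
    int height;
    int width;
    pixelStruct **arr;  /* height row pointers, each to width pixels */
} imageStruct;

/* Frees all rows, then the row table. Also safe on a partially built image. */
void image_free(imageStruct *img)
{
    if (img->arr != NULL) {
        for (int i = 0; i < img->height; ++i)
            free(img->arr[i]);  /* free(NULL) is a no-op for missing rows */
        free(img->arr);
    }
    img->arr = NULL;
}

/* Allocates height rows of width zeroed pixels.
   Returns 0 on success, -1 on failure (nothing is leaked). */
int image_alloc(imageStruct *img, int height, int width)
{
    assert(img != NULL && height > 0 && width > 0);
    img->height = height;
    img->width = width;
    img->arr = malloc((size_t)height * sizeof *img->arr);
    if (img->arr == NULL)
        return -1;
    for (int i = 0; i < height; ++i)
        img->arr[i] = NULL;  /* so image_free can run at any point */
    for (int i = 0; i < height; ++i) {
        img->arr[i] = calloc((size_t)width, sizeof *img->arr[i]);
        if (img->arr[i] == NULL) {
            image_free(img);
            return -1;
        }
    }
    return 0;
}
```

ReadImage would then call image_alloc once the header has been parsed and read the pixel data straight into thisImage.arr[i][j].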

I don't think sizeof imageArr works as you expect it to when you're using runtime-sized arrays. Which, btw, are a sort of "niche" C99 feature. You should add some printouts of crucial values, such as that sizeof to see if it does what you think.
Clearer would be to use explicit allocation of the array:
thisImage.arr = malloc(thisImage.width * thisImage.height * sizeof *thisImage.arr);
I also think that it's hard (if even possible) to implement a "true" 2D array like this. I would recommend just doing the address computation yourself, i.e. accessing a pixel like this:
unsigned int x = 3, y = 1; // Assume image is larger.
pixelStruct p = thisImage.arr[y * thisImage.width + x];
printf("pixel at (%u,%u) is r=%d g=%d b=%d\n", x, y, p.R, p.G, p.B);
I don't see how the required dimension information can be associated with an array at run-time; I don't think that's possible.
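The manual address computation can be hidden behind a small accessor, as in this sketch. The names image_alloc_flat and pixel_at are invented for the illustration; the struct keeps the single pixelStruct * member from the question and stores the image row by row in one allocation.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int R, G, B;
} pixelStruct;

typedef struct {
    int height;
    int width;
    pixelStruct *arr;  /* height * width pixels, stored row by row */
} imageStruct;

/* Allocates one zeroed block for the whole image.
   Returns 0 on success, -1 on failure. */
int image_alloc_flat(imageStruct *img, int height, int width)
{
    assert(img != NULL && height > 0 && width > 0);
    img->height = height;
    img->width = width;
    img->arr = calloc((size_t)height * (size_t)width, sizeof *img->arr);
    return img->arr ? 0 : -1;
}

/* Returns a pointer to the pixel in column x of row y. */
pixelStruct *pixel_at(const imageStruct *img, int x, int y)
{
    assert(x >= 0 && x < img->width && y >= 0 && y < img->height);
    return &img->arr[(size_t)y * (size_t)img->width + (size_t)x];
}
```

With a single allocation, cleanup is one free(img.arr) call, and the whole image can be copied with one memcpy.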

height and width are undefined; you might want to initialise them first, as in
thisImage.height = 10; thisImage.width = 20;
also,
what is colorRGB?

*thisImage.arr = *imageArr; /*put imageArr into the thisImage imagestruct*/
This won't work. You have to declare arr as colorRGB **, allocate it accordingly, etc.

It looks like you are trying to copy an array by assignment.
You cannot use the simple assignment operator to do that; you have to use a function to copy things, for example memcpy.
*thisImage.arr = *imageArr;
thisImage.arr[0] = imageArr[0];
The above statements do the same thing.
However, this is most likely not what causes the memory corruption.
Since you are working with two-dimensional arrays, make sure you initialize them correctly.
Looking at the code, it should not even compile: the array is declared as one-dimensional in your image structure, but you refer to it as two-dimensional?
