A suitable replacement for *****double - c

I have this big data structure which is a list of lists of lists of lists of lists of doubles. Clearly it's extremely inefficient to handle. Around 70% of the time spent running my application is used to write zeros into the doubles at the end of the lists. I need a faster replacement which satisfies two constraints:
1) All the memory must be allocated contiguously (that is, as one huge chunk of memory).
2) I must access this chunk using the usual A[][][][][] syntax.
For now, I have thought of using a double * to hold the entire chunk and reusing my list of lists of... to store pointers to the appropriate areas in the chunk.
Any better ideas?

One example of how to achieve this with a 2D array (I'm too lazy to do the 5D case) is
double **a;
a = malloc(n * sizeof(double *));
a[0] = malloc(n * m * sizeof(double));
for (int i = 1; i < n; ++i)
    a[i] = a[0] + i * m;
This way you can decide whether to index it with a[0][i*m + j] or a[i][j]. The memory is contiguous, and you get away with only two allocations. Of course, this also requires a free n*m*sizeof(double) block in memory, but since you demand that the memory be allocated contiguously I expect this to be acceptable. This also means that you will have to free it correctly with:
free(a[0]);
free(a);
so I would make a create5Darray (n,m,k,l,t) and a delete5Darray function to make this easier.
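For the 5D case, one possible sketch is a different technique from the pointer table above: declare a pointer to a 4D array, so that a single allocation is contiguous and can still be indexed as a[i][j][p][q][r]. This assumes a compiler with C99 variable-length array support; the function name and the value 42.0 are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical demo: allocates an n x m x k x l x t block of doubles as one
   contiguous chunk, writes the last element through a[][][][][] syntax and
   reads it back through a flat view of the same memory.
   Returns -1.0 if the allocation fails. */
double last_element_demo(size_t n, size_t m, size_t k, size_t l, size_t t)
{
    assert(n > 0 && m > 0 && k > 0 && l > 0 && t > 0);

    /* a points to the first of n arrays of type double[m][k][l][t];
       calloc zeroes the whole chunk in one call. */
    double (*a)[m][k][l][t] = calloc(n, sizeof(double[m][k][l][t]));
    if (a == NULL)
        return -1.0;

    a[n - 1][m - 1][k - 1][l - 1][t - 1] = 42.0;

    /* The same storage viewed as a flat array of n*m*k*l*t doubles. */
    double *flat = (double *)a;
    double value = flat[n * m * k * l * t - 1];

    free(a); /* one allocation, one free */
    return value;
}
```

Since everything lives in one block, zeroing it again later is a single memset over n*m*k*l*t*sizeof(double) bytes rather than a walk through nested lists.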

Related

Allocating matrix performances

I have two scenarios. In both I allocate 78*2*sizeof(int) bytes of memory and initialize it to 0.
Is there any difference in performance?
Scenario A:
int ** v = calloc(2 , sizeof(int*));
for (i=0; i<2; ++i)
{
    v[i] = calloc(78, sizeof(int));
}
Scenario B:
int ** v = calloc(78 , sizeof(int*));
for (i=0; i<78; ++i)
{
    v[i] = calloc(2, sizeof(int));
}
I supposed that, in performance terms, it's better to use calloc if an initialized array is needed. Let me know if I'm wrong.
First, discussing optimization abstractly has some difficulties because compilers are becoming increasingly better at optimization. (For some reason, compiler developers will not stop improving them.) We do not always know what machine code given source code will produce, especially when we write source code today and expect it to be used for many years to come. Optimization may consolidate multiple steps into one or may omit unnecessary steps (such as clearing memory with calloc instead of malloc immediately before the memory is completely overwritten in a for loop). There is a growing difference between what source code nominally says (“Do these specific steps in this specific order”) and what it technically says in the language abstraction (“Compute the same results as this source code in some optimized fashion”).
However, we can generally figure that writing source code without unnecessary steps is at least as good as writing source code with unnecessary steps. With that in mind, let’s consider the nominal steps in your scenarios.
In Scenario A, we tell the computer:
Allocate 2 int *, clear them, and put their address in v.
Twice, allocate 78 int, clear them, and put their addresses in the preceding int *.
In Scenario B, we tell the computer:
Allocate 78 int *, clear them, and put their address in v.
78 times, allocate two int, clear them, and put their addresses in the preceding int *.
We can easily see two things:
Both scenarios clear the memory for the int * and immediately fill it with other data. That is wasteful; there is no need to set memory to zero before setting it to something else. Just set it to something else. Use malloc for this, not calloc. malloc takes just one parameter for the size instead of two that are multiplied, so replace calloc(2, sizeof (int *)) with malloc(2 * sizeof (int *)). (Also, to tie the allocation to the pointer being assigned, use int **v = malloc(2 * sizeof *v); instead of repeating the type separately.)
At the step where Scenario B does 78 things, Scenario A does two things, but the code is otherwise very similar, so Scenario A has fewer steps. If both would serve some purpose, then A is likely preferable.
However, both scenarios allude to another issue. Presumably, the so-called array will be used later in the program, likely in a form like v[i][j]. Using this as a value means:
Fetch the pointer v.
Calculate i elements beyond that.
Fetch the pointer at that location.
Calculate j elements beyond that.
Fetch the int at that location.
Let’s consider a different way to define v: int (*v)[78] = malloc(2 * sizeof *v);.
This says:
Allocate space for 2 arrays of 78 int and put their address in v.
Immediately we see that involves fewer steps than Scenario A or Scenario B. But also look at what it does to the steps for using v[i][j] as a value. Because v is a pointer to an array instead of a pointer to a pointer, the computer can calculate where the appropriate element is instead of having to load an address from memory:
Fetch the pointer v.
Calculate i•78 elements beyond that.
Calculate j elements beyond that.
Fetch the int at that location.
So this pointer-to-array version is one step fewer than the pointer-to-pointer version.
Further, the pointer-to-pointer version requires an additional fetch from memory for each use of v[i][j]. Fetches from memory can be expensive relative to in-processor operations like multiplying and adding, so it is a good step to eliminate. Having to fetch a pointer can prevent a processor from predicting where the next load from memory might be based on recent patterns of use. Additionally, the pointer-to-array version puts all the elements of the 2×78 array together in memory, which can benefit the cache performance. Processors are also designed for efficient use of consecutive memory. With the pointer-to-pointer version, the separate allocations typically wind up with at least some separation between the rows and may have a lot of separation, which can break the benefits of consecutive memory use.
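To make the layout claim concrete, here is a minimal sketch (the function name and the ROWS/COLS constants are mine, not from the question) that fills the pointer-to-array version through v[i][j] and then checks that element (i, j) sits at flat offset i*78 + j in the same block.

```c
#include <assert.h>
#include <stdlib.h>

enum { ROWS = 2, COLS = 78 };

/* Returns 1 if v[i][j] and the flat offset i*COLS + j name the same int
   for every element, 0 otherwise. */
int rows_are_contiguous(void)
{
    int (*v)[COLS] = malloc(ROWS * sizeof *v); /* one allocation, no row pointers */
    assert(v != NULL); /* sketch only: a real program would handle failure */

    for (int i = 0; i < ROWS; ++i)
        for (int j = 0; j < COLS; ++j)
            v[i][j] = i * COLS + j;

    int *flat = (int *)v; /* the same block viewed as ROWS*COLS ints */
    int ok = 1;
    for (int k = 0; k < ROWS * COLS; ++k)
        if (flat[k] != k)
            ok = 0;

    free(v); /* a single free releases the whole 2x78 array */
    return ok;
}
```

With the pointer-to-pointer scenarios no such check can be written, because each row comes from its own allocation and the rows have no fixed distance from each other.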

Malloc or normal array definition?

When should I use malloc instead of a normal array definition in C?
I can't understand the difference between:
int a[3]={1,2,3};
int array[sizeof(a)/sizeof(int)];
and:
array=(int *)malloc(sizeof(int)*sizeof(a));
In general, use malloc() when:
the array is too large to be placed on the stack
the lifetime of the array must outlive the scope where it is created
Otherwise, use a stack allocated array.
int a[3]={1,2,3};
int array[sizeof(a)/sizeof(int)];
If used as local variables, both a and array would be allocated on the stack. Stack allocation has its pros and cons:
pro: it is very fast - it only takes one register subtraction operation to create stack space and one register addition operation to reclaim it back
con: stack size is usually limited (and also fixed at link time on Windows)
In both cases the number of elements in each array is a compile-time constant: 3 is obviously a constant, while sizeof(a)/sizeof(int) can be computed at compile time since both the size of a and the size of int are known at the point where array is declared.
When the number of elements is known only at run-time or when the size of the array is too large to safely fit into the stack space, then heap allocation is used:
array=(int *)malloc(sizeof(int)*sizeof(a));
As already pointed out, this should be malloc(sizeof(a)) since the size of a is already the number of bytes it takes and not the number of elements and thus additional multiplication by sizeof(int) is not necessary.
Heap allocation and deallocation are relatively expensive operations (compared to stack allocation), and this cost should be carefully weighed against the benefits they provide, e.g. in code that gets called many times in tight loops.
Modern C compilers support the C99 version of the C standard, which introduced so-called variable-length arrays (VLAs), resembling similar features available in other languages (C11 later made VLA support optional). A VLA's size is specified at run-time, like in this case:
void func(int n)
{
    int array[n];
    ...
}
array is still allocated on the stack as if memory for the array has been allocated by a call to alloca(3).
You definitely have to use malloc() if you don't want your array to have a fixed size. Depending on what you are trying to do, you might not know in advance how much memory you are going to need for a given task, or you might need to resize your array dynamically at runtime; for example, you might enlarge it if more data comes in. The latter can be done using realloc() without data loss.
Instead of initializing an array as in your original post, you should just declare a pointer to int, like this:
int* array; // this variable will just contain the address of an int-sized block in memory
int length = 5; // how long do you want your array to be
array = malloc(sizeof(int) * length); // this allocates the memory needed for your array and sets the pointer created above to the first block of that region
int newLength = 10;
array = realloc(array, sizeof(int) * newLength); // increase the size of the array while leaving its contents intact
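One caveat with assigning the result of realloc() straight back to array: if realloc() fails it returns NULL and leaves the original block allocated, so the only pointer to the old data would be overwritten. A minimal sketch of a failure-safe resize, using a hypothetical helper name resize_ints, could look like this:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: resizes *arr to hold new_len ints.
   Returns 0 on success; on failure returns -1 and leaves *arr untouched. */
int resize_ints(int **arr, size_t new_len)
{
    assert(arr != NULL && new_len > 0);

    int *tmp = realloc(*arr, new_len * sizeof **arr);
    if (tmp == NULL)
        return -1; /* the old block is still valid and still owned by the caller */

    *arr = tmp;    /* existing elements were preserved, possibly at a new address */
    return 0;
}
```

The caller keeps using the same variable afterwards, e.g. resize_ints(&array, newLength), and can still free the original block if the call reports failure.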
Your code is very strange.
The answer to the question in the title is probably something like "use automatically allocated arrays when you need quite small amounts of data that is short-lived, heap allocations using malloc() for anything else". But it's hard to pin down an exact answer, it depends a lot on the situation.
Not sure why you are showing first an array, then another array that tries to compute its length from the first one, and finally a malloc() call which tries to do the same.
Normally you have an idea of the number of desired elements, rather than an existing array whose size you want to mimic.
The second line is better as:
int array[sizeof a / sizeof *a];
No need to repeat a dependency on the type of a, the above will define array as an array of int with the same number of elements as the array a. Note that this only works if a is indeed an array.
Also, the third line should probably be:
array = malloc(sizeof a);
No need to get too clever (especially since you got it wrong) about the sizeof argument, and no need to cast malloc()'s return value.
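Putting the two corrected lines together, a small sketch (the helper name copy_of_a and the memcpy step are my additions for illustration) might be:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of a local array {1, 2, 3} and
   stores its element count in *count. The caller frees the result. */
int *copy_of_a(size_t *count)
{
    int a[3] = {1, 2, 3};

    assert(count != NULL);
    *count = sizeof a / sizeof *a;   /* element count, independent of the element type */

    int *array = malloc(sizeof a);   /* sizeof a is already the size in bytes */
    if (array != NULL)
        memcpy(array, a, sizeof a);
    return array;
}
```

Note that sizeof a only yields the full array size while a is a real array in scope; once it has decayed to a pointer (for example as a function parameter), the same expression gives the size of a pointer instead.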

How to insert more than 10^6 elements in an array

I want to operate on 10^9 elements. For this they should be stored somewhere, but in C it seems that an array can only store 10^6 elements. So is there any way to operate on such a large number of elements in C?
The error thrown is "error: size of array ‘arr’ is too large".
"For this they should be stored somewhere but in c it seems that an array only takes 10^6 elements."
Not at all. I think you're allocating the array the wrong way. Just writing
int myarray[big_number];
won't work, as it will try to allocate memory on the stack, which is very limited (often several MB in size, so 10^6 elements is a good rule of thumb). A better way is to allocate dynamically:
#include <stdio.h>
#include <stdlib.h>
int* myarray;
int main() {
    // Allocate the memory (big_number stands for the element count you need)
    myarray = malloc(big_number * sizeof(int));
    if (!myarray) {
        printf("Not enough space\n");
        return -1;
    }
    // ...
    // Free the allocated memory
    free(myarray);
    return 0;
}
This will allocate the memory (or, more precisely, big_number * sizeof(int) bytes, which is big_number * 4 on most current platforms) on the heap. Note: this might fail too, but it is limited mainly by the amount of free RAM, which is much closer to or even above 10^9 bytes (1 GB). Keep in mind that 10^9 ints at 4 bytes each need about 4 GB.
An array uses a contiguous memory space. Therefore, if your memory is fragmented, you won't be able to use such an array. Use a different data structure, like a linked list.
About linked lists:
Wikipedia definition - http://en.wikipedia.org/wiki/Linked_list
Implementation in C - http://www.macs.hw.ac.uk/~rjp/Coursewww/Cwww/linklist.html
On a side note, I tried on my computer, and while I can't create an int[1000000], a malloc(1000000*sizeof(int)) works.

how is dynamic memory allocation better than array?

int *numbers;
numbers = malloc ( sizeof(int) * 10 );
I want to know how this is dynamic memory allocation if I can store just 10 int items in the memory block. I could just use an array and store elements dynamically using an index. Why is the above approach better?
I am new to C, and this is my 2nd day and I may sound stupid, so please bear with me.
In this case you could replace 10 with a variable that is assigned at run time. That way you can decide how much memory space you need. But with arrays, you have to specify an integer constant during declaration. So you cannot know whether the user would actually need as many locations as were declared or, even worse, whether they might not be enough.
With a dynamic allocation like this, you could assign a larger memory location and copy the contents of the first location to the new one to give the impression that the array has grown as needed.
This helps to ensure optimum memory utilization.
The main reason why malloc() is useful is not because the size of the array can be determined at runtime - modern versions of C allow that with normal arrays too. There are two reasons:
Objects allocated with malloc() have flexible lifetimes;
That is, you get runtime control over when to create the object, and when to destroy it. The array allocated with malloc() exists from the time of the malloc() call until the corresponding free() call; in contrast, declared arrays either exist until the function they're declared in exits, or until the program finishes.
malloc() reports failure, allowing the program to handle it in a graceful way.
On a failure to allocate the requested memory, malloc() can return NULL, which allows your program to detect and handle the condition. There is no such mechanism for declared arrays - on a failure to allocate sufficient space, either the program crashes at runtime, or fails to load altogether.
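As a rough sketch of what handling that failure can look like (both function names are hypothetical, and the overflow guard is my addition rather than something stated above):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates room for n ints.
   Returns NULL if n is 0, if n * sizeof(int) would overflow, or if malloc() fails. */
int *alloc_ints(size_t n)
{
    if (n == 0 || n > SIZE_MAX / sizeof(int))
        return NULL;
    return malloc(n * sizeof(int));
}

/* Zeroes a temporary buffer of n ints. Returns 0 on success and -1 if the
   buffer could not be allocated, so the caller can react instead of crashing. */
int try_fill(size_t n)
{
    int *p = alloc_ints(n);
    if (p == NULL)
        return -1; /* graceful failure path */

    assert(n <= SIZE_MAX / sizeof(int)); /* guaranteed by alloc_ints */
    for (size_t i = 0; i < n; ++i)
        p[i] = 0;

    free(p);
    return 0;
}
```

A declared array such as int buf[n] offers no equivalent of the return -1 branch: if the space is not available, the program has no chance to notice before it faults.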
There is a difference in where the memory is allocated. Using the array syntax, the memory is allocated on the stack (assuming you are in a function), while malloc'ed arrays/bytes are allocated on the heap.
/* Allocates 4*1000 bytes on the stack (which might be a bit much depending on your system) */
int a[1000];
/* Allocates 4*1000 bytes on the heap */
int *b = malloc(1000 * sizeof(int));
Stack allocations are fast - and often preferred when:
"Small" amount of memory is required
Pointer to the array is not to be returned from the function
Heap allocations are slower, but have these advantages:
Available heap memory is (normally) >> than available stack memory
You can freely pass the pointer to the allocated bytes around, e.g. returning it from a function -- just remember to free it at some point.
A third option is to use statically initialized arrays if you have some common task, that always requires an array of some max size. Given you can spare the memory statically consumed by the array, you avoid the hit for heap memory allocation, gain the flexibility to pass the pointer around, and avoid having to keep track of ownership of the pointer to ensure the memory is freed.
Edit: If you are using C99 (the default with the GNU C compiler, I think?), you can use variable-length stack arrays like
int a = 4;
int b[a*a];
In the example you gave
int *numbers;
numbers = malloc ( sizeof(int) * 10 );
there are no explicit benefits. However, imagine that 10 is a value that changes at runtime (e.g. user input), and that you need to return this array from a function. E.g.
int *aFunction(size_t howMany, ...)
{
    int *r = malloc(sizeof(int)*howMany);
    // do something, fill the array...
    return r;
}
The malloc takes room from the heap, while something like
int *aFunction(size_t howMany, ...)
{
    int r[howMany];
    // do something, fill the array...
    // you can't return r unless you make it static, but this is in general
    // not good
    return somethingElse;
}
would consume the stack, which is not as big as the whole heap available.
More complex examples exist. E.g. if you have to build a binary tree that grows according to some computation done at runtime, you basically have no other choice but to use dynamic memory allocation.
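To illustrate the binary tree case, here is a minimal sketch of an unbalanced binary search tree of ints whose nodes are created with malloc() as values arrive. The node layout and the function names are assumptions made for this example only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type for an unbalanced binary search tree of ints. */
struct node {
    int value;
    struct node *left, *right;
};

/* Inserts value below *root; duplicates are ignored.
   Returns 0 on success (or duplicate), -1 if malloc() fails. */
int tree_insert(struct node **root, int value)
{
    assert(root != NULL);
    while (*root != NULL) {
        if (value == (*root)->value)
            return 0;
        root = value < (*root)->value ? &(*root)->left : &(*root)->right;
    }
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->left = n->right = NULL;
    *root = n; /* link the new node into the empty slot we found */
    return 0;
}

/* Number of nodes in the tree. */
size_t tree_count(const struct node *t)
{
    return t == NULL ? 0 : 1 + tree_count(t->left) + tree_count(t->right);
}

/* Frees every node, children before parents. */
void tree_free(struct node *t)
{
    if (t == NULL)
        return;
    tree_free(t->left);
    tree_free(t->right);
    free(t);
}
```

The number of nodes depends entirely on the input, and each node has to outlive the call that created it, which is exactly what a local array cannot provide.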
Array size is defined at compilation time whereas dynamic allocation is done at run time.
Thus, in your case, you can use your pointer as an array: numbers[5] is valid.
If you don't know the size of your array when writing the program, you have no choice but to use runtime allocation. Otherwise, you're free to use an array, which might be simpler (less risk of forgetting to free memory, for example).
Example:
to store a 3-D position, you might want to use an array as it's always 3 coordinates
to create a sieve to calculate prime numbers, you might want to use a parameter to give the max value and thus use dynamic allocation to create the memory area
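The sieve case could be sketched as follows, assuming a sieve of Eratosthenes with one flag byte per number; the function name count_primes is made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Counts the primes <= max with a sieve whose size is only known at run
   time. Returns 0 for max < 2 and also if the allocation fails. */
size_t count_primes(size_t max)
{
    if (max < 2 || max == SIZE_MAX) /* max + 1 below must not wrap around */
        return 0;

    /* composite[i] != 0 means i is known not to be prime; calloc zeroes it. */
    unsigned char *composite = calloc(max + 1, 1);
    if (composite == NULL)
        return 0;

    size_t count = 0;
    for (size_t i = 2; i <= max; ++i) {
        if (composite[i])
            continue;
        ++count;
        if (i > max / i)
            continue; /* i*i > max: no multiples left to cross off */
        for (size_t j = i * i; ; j += i) {
            composite[j] = 1;
            if (max - j < i)
                break; /* the next multiple would be past max */
        }
    }

    free(composite);
    return count;
}
```

The buffer is sized from the max parameter, so the same function serves a limit of 100 or of several million without recompiling.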
An array is used to allocate memory statically and in one go.
To allocate memory dynamically, malloc is required.
e.g. int numbers[10];
This will allocate memory statically and it will be contiguous memory.
If you do not know how many numbers there will be, use a variable like count.
int count;
int *numbers;
scanf("%d", &count);
numbers = malloc ( sizeof(int) * count );
This is not possible in the case of arrays.
Dynamic does not refer to the access. What is dynamic is the size passed to malloc. If you just use a constant number, like 10 in your example, it is no better than an array. The advantage comes when you don't know in advance how big it must be, e.g. because the user can enter the size at runtime. Then you can allocate with a variable, like malloc(sizeof(int) * userEnteredNumber). This is not possible with an array, as there you have to know the (maximum) size at compile time.

How can I free all allocated memory at once?

Here is what I am working with:
char* qdat[][NUMTBLCOLS];
char** tdat[];
char* ptr_web_data;
// Loop thru each table row of the query result set
for(row_index = 0; row_index < number_rows; row_index++)
{
// Loop thru each column of the query result set and extract the data
for(col_index = 0; col_index < number_cols; col_index++)
{
ptr_web_data = (char*) malloc((strlen(Data) + 1) * sizeof(char));
memcpy (ptr_web_data, column_text, strlen(column_text) + 1);
qdat[row_index][web_data_index] = ptr_web_data;
}
}
tdat[row_index] = qdat[col_index];
After the data is used, the memory allocated is released one at a time using free().
for(row_index = 0; row_index < number_rows; row_index++)
{
// Loop thru all columns used
for(col_index = 0; col_index < SARWEBTBLCOLS; col_index++)
{
// Free memory block pointed to by results set array
free(tdat[row_index][col_index]);
}
}
Is there a way to release all the allocated memory at once, for this array?
Thank You.
Not with the standard malloc() allocator - you need to investigate the use of memory pools. These work by allocating a big block of memory up-front, allocating from it and freeing back to it as you request via their own allocation functions, and then freeing the whole lot with a special "deallocate all" function.
I must say I've always found these things a bit ugly - it really isn't that hard to write code that doesn't leak. The only reason I can see for using them is to mitigate heap fragmentation, if that is a real problem for you.
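For reference, the simplest form of such a pool is a bump allocator. The sketch below is a simplified variant that does not support freeing individual allocations back to the pool; the type and function names are hypothetical, and it assumes C11 for _Alignof and max_align_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump-pointer pool: all names here are illustrative. */
struct pool {
    unsigned char *base; /* start of the big block */
    size_t size;         /* total bytes in the block */
    size_t used;         /* bytes handed out so far */
};

/* Allocates the big block up front. Returns 0 on success, -1 on failure. */
int pool_init(struct pool *p, size_t size)
{
    assert(p != NULL);
    p->base = malloc(size);
    p->size = p->base ? size : 0;
    p->used = 0;
    return p->base ? 0 : -1;
}

/* Hands out n bytes from the pool, or NULL if the pool is exhausted.
   Each allocation starts at an offset aligned for any object type. */
void *pool_alloc(struct pool *p, size_t n)
{
    const size_t align = _Alignof(max_align_t);
    size_t start = (p->used + align - 1) / align * align;
    if (start > p->size || n > p->size - start)
        return NULL;
    p->used = start + n;
    return p->base + start;
}

/* The "deallocate all" call: one free() releases every pool allocation. */
void pool_free_all(struct pool *p)
{
    free(p->base);
    p->base = NULL;
    p->size = p->used = 0;
}
```

In the question's loop, each malloc() call would become a pool_alloc() call on one shared pool, and the nested free loop would collapse into a single pool_free_all().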
No there is not. Memory which is separately allocated must be separately freed.
The only way you could free it all at once is if you allocated it all at once as one giant block. You would then have to do a bit of pointer math to assign every row the correct index into the array, but it's not terribly difficult. This approach does have a few downsides though:
Extra pointer math
Requires one giant contiguous block of memory vs. N smaller blocks of memory. Can be an issue in low memory or high fragmentation environments.
Extra work for no real stated gain.
If you want to release it all at once, you have to allocate it all at once.
A simple manual solution, if you know the total size you'll need in advance, is to allocate it all in one chunk and index into it as appropriate. If you don't know the size in advance you can use realloc to grow the memory, so long as you only access it indexed from the initial pointer, and don't store additional pointers anywhere.
That being said, direct allocation and deallocation is a simple solution, and harder to get wrong than the alternatives. Unless the loop to deallocate is causing you real difficulties, I would stick with what you have.
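If the one-chunk route is still wanted, a minimal sketch for a set of strings whose lengths are known up front could look like this. The helper name pack_strings and the block layout (pointer table first, string bytes after it) are assumptions of this example, not something taken from the question's code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies n strings into ONE malloc'ed block laid out as
   [n char* pointers][string bytes...]. Returns the pointer table, or NULL on
   allocation failure. A single free() on the result releases everything. */
char **pack_strings(const char *const *src, size_t n)
{
    assert(src != NULL && n > 0);

    /* First pass: total size of the pointer table plus all strings. */
    size_t bytes = n * sizeof(char *);
    for (size_t i = 0; i < n; ++i)
        bytes += strlen(src[i]) + 1;

    char **table = malloc(bytes);
    if (table == NULL)
        return NULL;

    /* Second pass: copy each string and record where it landed. */
    char *next = (char *)(table + n); /* string data starts after the table */
    for (size_t i = 0; i < n; ++i) {
        size_t len = strlen(src[i]) + 1;
        memcpy(next, src[i], len);
        table[i] = next;
        next += len;
    }
    return table;
}
```

The price is the extra pass to compute the total size, which is the "extra pointer math" trade-off mentioned earlier in exchange for a single free().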
