My question is pretty straightforward.
I'm building a small program to analyse and simulate random text using Markov chains. My first MC had memory of size 2, working on the alphabet {a, b, ..., z}. Therefore, my transition matrix was of size 26 * 26 * 26.
But now, I'd like to enhance my simulation using a MC with memory of size 4. Therefore, I need to store my probabilities of transitions in a 5D array of size 26*26*26*26*26.
The problem (I believe) is that C doesn't allow me to declare and manipulate such an array, as it might be too big. In fact, I got a "Segmentation fault: 11" error when writing:
int count[26][26][26][26][26];
Is there a way to get around this restriction?
Thanks!
On a typical PC architecture with 32-bit integers, int count[26][26][26][26][26] creates an object of size 47525504 bytes, 47MB, which is manageable on most current computers, but is likely too large for automatic allocation (aka on the stack).
You can declare count as a global or a static variable, or you can allocate it from the heap and make count a pointer with this declaration:
int (*count)[26][26][26][26] = calloc(26, sizeof(*count));
if (count == NULL) {
    /* handle allocation failure gracefully */
    fprintf(stderr, "cannot allocate memory for 5D array\n");
    exit(1);
}
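As a rough sketch of how such a heap-allocated table might then be used for the 5-letter counts: the names table4, alloc_counts and count_5grams are made up for illustration, and the input is assumed to contain only the lowercase letters a-z.

```c
#include <stdlib.h>
#include <string.h>

/* One "row" of the 5D table: a 26x26x26x26 block of ints. */
typedef int table4[26][26][26][26];

/* Hypothetical helper: allocate a zeroed 26^5 table on the heap.
   Returns NULL if the allocation fails; the caller must free() it. */
table4 *alloc_counts(void)
{
    return calloc(26, sizeof(table4));
}

/* Hypothetical helper: count every run of 5 consecutive letters.
   Assumes text contains only 'a'..'z'. */
void count_5grams(table4 *count, const char *text)
{
    size_t n = strlen(text);
    for (size_t i = 0; i + 5 <= n; i++) {
        const char *p = text + i;
        count[p[0] - 'a'][p[1] - 'a'][p[2] - 'a'][p[3] - 'a'][p[4] - 'a']++;
    }
}
```

The pointer is indexed exactly like the original 5D array, so the rest of the simulation code should not need to change.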
Make it global[1], make it static, or dynamically allocate the same amount of memory. Dynamically allocated memory comes from a region that is not subject to the limit you ran into, or at least has a much larger one. Variables with automatic storage duration are stored on the stack in most implementations; dynamically allocated memory comes from the heap in most implementations.
You can do this (illustration):
int (*a)[26][26][26][26] = malloc(sizeof *a * 26);
if (!a) { perror("malloc"); exit(1); }
...
free(a);
[1] Static storage duration: all variables defined at file scope have static storage duration.
With this kind of array declaration inside a function, your data will be stored on the stack. The stack is usually limited to about 8 MB on Unix-like systems and 1 MB on Windows, but you need at least 4*26^5 bytes (roughly 47.5 MB).
The preferred solution would be to allocate this array on the heap using malloc.
But you can also instruct the compiler or linker to increase the stack size...
Try this:
#define MAX 11881376 /* 26*26*26*26*26 */
static int count[MAX]; /* static (or file scope), so it is not placed on the stack */
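With a one-dimensional array of 26^5 elements, the five letter indices have to be combined into a single offset by hand. A minimal sketch, assuming each index is already in the range 0..25 (flat_index and ALPHABET are names invented here):

```c
#include <stddef.h>

#define ALPHABET 26

/* Map five letter indices (each 0..25) to a position in a flat array of
   26^5 elements, in the same row-major order C uses for
   count[a][b][c][d][e]. */
size_t flat_index(int a, int b, int c, int d, int e)
{
    return ((((size_t)a * ALPHABET + b) * ALPHABET + c) * ALPHABET + d)
           * ALPHABET + e;
}
```

An increment then looks like count[flat_index(a, b, c, d, e)]++ instead of count[a][b][c][d][e]++.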
What is the difference between defining an array whose length is not known before runtime (it depends on command-line arguments) with array[size] and using malloc()? array[size] puts the data on the stack, while malloc() uses the heap [see this stackoverflow].
So with large data I can run into stack overflows, but on a new machine a total of 30 chars and ints should not be a problem (the stack is around 1 MB on Windows according to this).
So I am probably missing something obvious.
As far as I understand, when defined in main(),the two should be the same:
Example 1
int size; // depends on some logic
char array[size];
Example 2
int size; // depends on some logic
char *array = malloc(size * sizeof(char));
free(array); // later after use
But when I use the array inside functions and pass it as a pointer (func(char* array)) or as an array (func(char array[])), the gdb debugger sometimes tells me that the function receives corrupted data in Example 1; using malloc() fixed the issue.
Is array[size] not okay to use when size is not determined at compile time? Is it some scoping issue? This answer has a comment suggesting such a thing, but I don't quite understand if that applies here.
I am using C99.
The main difference is that arrays declared like array[size] inside a function are stack allocated (and references to them or their elements are valid only within their scope), while malloc'd arrays are heap allocated.
Stack allocation means that variables are stored in the function's stack frame. Access to this memory is usually very fast, and for fixed-size variables the layout of the frame is worked out at compile time.
On the other hand, variables allocated on the heap have their memory allocated at runtime and, while allocating and accessing this memory can be slower, your main limit is the size of virtual memory.
Have a look here:
https://gribblelab.org/CBootCamp/7_Memory_Stack_vs_Heap.html
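To illustrate the lifetime point: a buffer obtained from malloc() stays valid after the function that created it returns, whereas returning the address of a local char array[size] would leave the caller with a dangling pointer. A small sketch (make_filled is a hypothetical helper, not from the question):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap buffer of `size` bytes filled with
   `fill`. The buffer outlives this function; the caller must free() it.
   Returns NULL if size is 0 or the allocation fails. */
char *make_filled(size_t size, char fill)
{
    if (size == 0)
        return NULL;
    char *buf = malloc(size);
    if (buf != NULL)
        memset(buf, fill, size);
    return buf;
}
```

Passing a stack array down into a called function is fine; it only becomes a problem when the pointer is used after the declaring function has returned.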
We learn that there is dynamic memory in C and dynamic variables:
#include <stdio.h>
#include <stdlib.h> // malloc, free, system
int a = 17;
int main(void)
{
    int b = 18; //automatic stack memory
    int * c;
    c = malloc( sizeof( int ) ); //dynamic heap memory
    *c = 19;
    // %p with a void * is the portable way to print an address
    printf("a = %d at address %p\n", a, (void *)&a);
    printf("b = %d at address %p\n", b, (void *)&b);
    printf("c = %d at address %p\n", *c, (void *)c);
    free(c);
    system("PAUSE");
    return 0;
}
How do I know which type of memory to use? When do I need one or the other?
Use dynamic allocation in the following situations:
When you need a lot of memory. A typical stack size is 1 MB, so anything bigger than 50-100 KB is better allocated dynamically, or you risk a crash. Some platforms have an even lower limit.
When the memory must live on after the function returns. Stack memory is destroyed when the function ends; dynamic memory is freed when you want.
When you're building a structure (like an array or a graph) whose size is unknown (i.e. may get big), changes dynamically, or is too hard to precalculate. Dynamic allocation allows your code to naturally request memory piece by piece at any moment and only when you need it. It is not possible to repeatedly request more and more stack space in a for loop.
Prefer stack allocation otherwise. It is faster and cannot leak.
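The "piece by piece" case can be sketched with a singly linked list, where each node is requested only when it is needed. The names below are illustrative only:

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Prepend one value. Returns the new head, or NULL if malloc fails
   (the old list is left untouched in that case). */
struct node *push_front(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = value;
    n->next = head;
    return n;
}

/* Add up all values in the list. */
long list_sum(const struct node *head)
{
    long sum = 0;
    for (; head != NULL; head = head->next)
        sum += head->value;
    return sum;
}

/* Release every node; this is the manual cleanup the stack never needs. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```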
You use dynamic memory when the size of your allocation is not known in advance, only at runtime.
For example, you ask a user to input names (let's say up to 10 names) and store them in a string array. Since you do not know how many names the user will provide (that is known only at runtime), you have to allocate the array after you know how much to allocate, so you use dynamic allocation.
You can of course use an array of fixed size 10, but for larger amounts this will be wasteful.
Use dynamic memory allocation if you don't know at compile time exactly how much memory your program will need.
int a[n], for example, will limit your array size to n. Also, it allocates n x 4 bytes of memory (assuming 4-byte ints) whether you use it or not. This is allocated on the stack, and before C99 n must be known at compile time.
int *a = (int *)malloc(n * sizeof (int)), on the other hand, is allocated at runtime, on the heap, and n needs to be known only at runtime, not necessarily at compile time.
This also ensures you allocate exactly as much memory as you really need. However, as you allocated it at runtime, the cleaning up has to be done by you using free.
You should use dynamic memory when:
You want your object to persist beyond the scope in which it was created.
Your object occupies a lot of memory. Stack sizes are usually limited, so a large object might exhaust the stack; in such cases one would usually go for dynamic memory allocation.
Note that the C99 standard introduced variable-length arrays (VLAs), so you need not use dynamic memory allocation just because you do not know the array dimensions beforehand (unless, of course, the second case above applies).
It is best to avoid dynamic memory allocation as much as you can, because it means managing the memory explicitly instead of relying on the automatic mechanism provided by the language. Explicit memory management makes you prone to more errors, which might have catastrophic effects.
Having said that, dynamic memory allocation cannot always be avoided and must be used when it is imperative (the two cases mentioned above).
If you can program without dynamic allocation, don't use it!
But one day you will be stuck, and the only way forward will be dynamic allocation. That is when to use it.
Als made an interesting point that you should allocate memory from the heap if your object needs to persist beyond the scope in which it was created. In the code above, you don't need to allocate memory from heap at all. You can rewrite it like this:
#include <stdio.h>
#include <stdlib.h> // system
int a = 17;
int main(void)
{
    int b = 18; //automatic stack memory
    int c[1]; // allocating stack memory. sizeof(int) * 1
    c[0] = 19;
    // %p with a void * is the portable way to print an address
    printf("a = %d at address %p\n", a, (void *)&a);
    printf("b = %d at address %p\n", b, (void *)&b);
    printf("c = %d at address %p\n", c[0], (void *)c);
    system("PAUSE");
    return 0;
}
In fact, as part of the C99 standard (Variable-length array), you can use the [] operator to allocate dynamic space for an array on the stack just as you would normally do to create an array. You don't even need to know the size of the array at compilation time. The compiler will just adjust the esp register (for x86 machines) based on the requested allocation space and you're good to go.
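A small sketch of a C99 variable-length array in use (sum_of_squares is a made-up example, and it assumes a compiler that supports VLAs, which became optional in C11):

```c
#include <stddef.h>

/* Sums the squares 0^2 .. (n-1)^2, using a C99 variable-length array as
   scratch space. Only sensible for small n: the array lives on the stack
   and there is no way to detect that it did not fit. */
long sum_of_squares(size_t n)
{
    if (n == 0)
        return 0;          /* a VLA of length 0 is not allowed */
    long squares[n];       /* size chosen at run time, released on return */
    for (size_t i = 0; i < n; i++)
        squares[i] = (long)(i * i);
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += squares[i];
    return sum;
}
```

Unlike malloc(), the declaration gives no failure indication, so this is only a reasonable choice when n is known to be small.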
I want to operate on 10^9 elements. For this they should be stored somewhere but in c, it seems that an array can only store 10^6 elements. So is there any way to operate on such a large number of elements in c?
The error thrown is "error: size of array ‘arr’ is too large".
"For this they should be stored somewhere but in c it seems that an array only takes 10^6 elements."
Not at all. I think you're allocating the array in a wrong way. Just writing
int myarray[big_number];
won't work, as it will try to allocate memory on the stack, which is very limited (often only a few MB, so about 10^6 elements is a reasonable rule of thumb). A better way is to allocate dynamically:
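#include <stdio.h>
#include <stdlib.h>
size_t big_number = 1000000000; /* example value: 10^9 elements, as in the question */
int* myarray;
int main() {
    // Allocate the memory
    myarray = malloc(big_number * sizeof(int));
    if (!myarray) {
        printf("Not enough space\n");
        return -1;
    }
    // ...
    // Free the allocated memory
    free(myarray);
    return 0;
}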
This will allocate the memory (more precisely, big_number * sizeof(int) bytes, i.e. big_number * 4 where int is 4 bytes) on the heap. Note: this might fail too, but it is mainly limited by the amount of free memory, which is far more likely than the stack to accommodate 10^9 elements (about 4 GB for 4-byte ints).
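For very large element counts the multiplication by sizeof(int) can itself overflow before malloc() ever sees it. A hedged sketch of a wrapper that guards against that (alloc_ints is a hypothetical name):

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate n ints, returning NULL if n is 0, if the
   byte count would overflow size_t, or if malloc fails. */
int *alloc_ints(size_t n)
{
    if (n == 0 || n > SIZE_MAX / sizeof(int))
        return NULL;
    return malloc(n * sizeof(int));
}
```

calloc(n, sizeof(int)) is another option if the elements should also start out as zero.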
An array uses a contiguous memory space. Therefore, if your memory is fragmented, you may not be able to allocate such an array. In that case, use a different data structure, like a linked list.
About linked lists:
Wikipedia definition - http://en.wikipedia.org/wiki/Linked_list
Implementation in C - http://www.macs.hw.ac.uk/~rjp/Coursewww/Cwww/linklist.html
On a side note, I tried on my computer, and while I can't create an int[1000000], a malloc(1000000*sizeof(int)) works.
I am trying to do the following:
#include <windows.h>
#include <stdio.h>
#define N 400000
int main(void) {
int a[N];
}
I get a stack overflow exception. My computer has 6 GB of main memory, so I can't be using it all up. How do I solve this problem? I am using VS 2008 on Windows 7 and coding in C.
The amount of stack size you're allowed to use is never going to be the full amount of main memory.
You can use this flag to set the stack size, which defaults to 1 MB. To store 400,000 4-byte ints you'll need at least 1,600,000 bytes (about 1.526 MiB).
Why not allocate this on the heap instead of the stack?
When you define a variable like that, you're requesting space on the stack. This is the managed section of memory that's used for variables in function calls, but isn't meant to store large amounts of data.
Instead, you'd need to allocate the memory manually, on the heap.
int *a = (int *) malloc(sizeof(int) * N);
This defines a as a pointer to the memory on the heap. This will behave the same as the array, except you will need to manually
free(a);
when you finish using it or you'll create a memory leak.
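Put together, a minimal sketch of the heap version might look like the following. sum_on_heap is an invented function that simply fills the array and adds it up, to show that the pointer is used with ordinary array indexing:

```c
#include <stdlib.h>

#define N 400000

/* Fills a heap-allocated array of N ints with 0..N-1 and returns their
   sum, or -1 if the allocation fails. */
long long sum_on_heap(void)
{
    int *a = malloc(N * sizeof *a);
    if (a == NULL)
        return -1;
    long long sum = 0;
    for (int i = 0; i < N; i++) {
        a[i] = i;          /* indexed exactly like int a[N] */
        sum += a[i];
    }
    free(a);               /* release the memory before returning */
    return sum;
}
```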
Automatic variables are allocated on the stack, which is usually limited to 1 MB by default on Windows. To solve this, allocate the memory on the heap:
int *a = (int*)malloc(sizeof(int) * N);
When you're done with that memory, you can deallocate it:
free(a);
That will return the memory to the system.
You need a stack larger than 400000*4 = 1600000 bytes (about 1.6 MB), but the default stack size in Visual Studio is 1 MB. There are two solutions:
1- You can change the stack size of your program:
right-click the project and choose Properties from the menu.
go to Configuration Properties->Linker->Command Line and add this parameter:
/STACK:2000000
2- Use a dynamic array allocated on the heap instead of an automatic array, as everyone else has said.
int *numbers;
numbers = malloc ( sizeof(int) * 10 );
I want to know how this is dynamic memory allocation if I can store just 10 int items in the memory block. I could just use an array and store elements dynamically using an index. Why is the above approach better?
I am new to C, this is my 2nd day, and I may sound stupid, so please bear with me.
In this case you could replace 10 with a variable that is assigned at run time. That way you can decide how much memory space you need. With ordinary (pre-C99) arrays, you have to specify an integer constant in the declaration, so you cannot know whether the user will actually need as many locations as were declared or, even worse, whether they will be enough.
With a dynamic allocation like this, you could assign a larger memory location and copy the contents of the first location to the new one to give the impression that the array has grown as needed.
This helps to ensure optimum memory utilization.
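The "allocate a larger block and copy" idea is what realloc() does for you. A minimal growable-array sketch, with struct and function names that are purely illustrative:

```c
#include <stdlib.h>

struct int_vec {
    int *data;
    size_t len;   /* elements in use */
    size_t cap;   /* elements allocated */
};

/* Appends value, doubling the capacity when the array is full.
   Returns 0 on success, -1 if realloc fails (the vector is unchanged). */
int int_vec_push(struct int_vec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 8;
        /* realloc(NULL, ...) behaves like malloc for the first call */
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        v->data = p;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}
```

Start from struct int_vec v = {0}; and call free(v.data) when finished.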
The main reason why malloc() is useful is not because the size of the array can be determined at runtime - modern versions of C allow that with normal arrays too. There are two reasons:
Objects allocated with malloc() have flexible lifetimes;
That is, you get runtime control over when to create the object, and when to destroy it. The array allocated with malloc() exists from the time of the malloc() call until the corresponding free() call; in contrast, declared arrays either exist until the function they're declared in exits, or until the program finishes.
malloc() reports failure, allowing the program to handle it in a graceful way.
On a failure to allocate the requested memory, malloc() can return NULL, which allows your program to detect and handle the condition. There is no such mechanism for declared arrays - on a failure to allocate sufficient space, either the program crashes at runtime, or fails to load altogether.
There is a difference with where the memory is allocated. Using the array syntax, the memory is allocated on the stack (assuming you are in a function), while malloc'ed arrays/bytes are allocated on the heap.
/* Allocates 4*1000 bytes on the stack (which might be a bit much depending on your system) */
int a[1000];
/* Allocates 4*1000 bytes on the heap */
int *b = malloc(1000 * sizeof(int));
Stack allocations are fast - and often preferred when:
"Small" amount of memory is required
Pointer to the array is not to be returned from the function
Heap allocations are slower, but have these advantages:
Available heap memory is (normally) much larger than available stack memory
You can freely pass the pointer to the allocated bytes around, e.g. returning it from a function -- just remember to free it at some point.
A third option is to use statically allocated arrays if you have some common task that always requires an array of some maximum size. Provided you can spare the memory permanently consumed by the array, you avoid the cost of heap allocation, gain the flexibility to pass the pointer around, and avoid having to keep track of ownership of the pointer to ensure the memory is freed.
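A sketch of that third option, assuming a fixed upper bound is acceptable (SCRATCH_MAX and scratch_buffer are names chosen for this example):

```c
#include <stddef.h>

#define SCRATCH_MAX 1024

/* Returns a pointer to a buffer with static storage duration. It is not
   on the stack, never needs free(), and stays valid after this function
   returns, but every caller shares the same memory. */
int *scratch_buffer(void)
{
    static int buf[SCRATCH_MAX];   /* zero-initialized, lives for the whole run */
    return buf;
}
```

The shared buffer is the trade-off: two users of scratch_buffer() at the same time will overwrite each other's data.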
Edit: If you are using C99 (which GCC supports), you can use variable-length stack arrays like
int a = 4;
int b[a*a];
In the example you gave
int *numbers;
numbers = malloc ( sizeof(int) * 10 );
there are no explicit benefits. Though, imagine 10 is a value that changes at runtime (e.g. user input), and that you need to return this array from a function. E.g.
int *aFunction(size_t howMany, ...)
{
int *r = malloc(sizeof(int)*howMany);
// do something, fill the array...
return r;
}
The malloc takes room from the heap, while something like
int *aFunction(size_t howMany, ...)
{
int r[howMany];
// do something, fill the array...
// you can't return r unless you make it static, but this is in general
// not good
return somethingElse;
}
would consume the stack, which is much smaller than the heap.
More complex examples exist. E.g. if you have to build a binary tree that grows according to some computation done at runtime, you basically have no other choice but to use dynamic memory allocation.
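As a rough illustration of the binary tree case, here is a minimal unbalanced binary search tree in which every node is allocated at the moment it is inserted. The names are hypothetical:

```c
#include <stdlib.h>

struct tree {
    int key;
    struct tree *left, *right;
};

/* Inserts key into the tree rooted at *root (duplicates are ignored).
   Returns 0 on success, -1 if malloc fails. */
int tree_insert(struct tree **root, int key)
{
    while (*root != NULL) {
        if (key == (*root)->key)
            return 0;
        root = key < (*root)->key ? &(*root)->left : &(*root)->right;
    }
    *root = malloc(sizeof **root);
    if (*root == NULL)
        return -1;
    (*root)->key = key;
    (*root)->left = (*root)->right = NULL;
    return 0;
}

/* Number of nodes in the tree. */
size_t tree_size(const struct tree *t)
{
    return t ? 1 + tree_size(t->left) + tree_size(t->right) : 0;
}

/* Frees every node (post-order). */
void tree_free(struct tree *t)
{
    if (t == NULL)
        return;
    tree_free(t->left);
    tree_free(t->right);
    free(t);
}
```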
The size of an ordinary array is fixed at compile time, whereas dynamic allocation is done at run time.
In your case, you can use your pointer as an array: numbers[5] is valid.
If you don't know the size of your array when writing the program, runtime allocation is your only option. Otherwise, you're free to use an array, which might be simpler (less risk of forgetting to free memory, for example).
Example:
to store a 3-D position, you might want to use an array as it's always 3 coordinates
to create a sieve to calculate prime numbers, you might want to use a parameter to give the max value and thus use dynamic allocation to create the memory area
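The sieve case could look roughly like this, with the table size taken from a parameter and therefore allocated at run time (count_primes is an invented name):

```c
#include <stdlib.h>

/* Counts the primes <= max with a sieve of Eratosthenes whose size is
   only known at run time. Returns -1 if the allocation fails. */
long count_primes(size_t max)
{
    if (max < 2)
        return 0;
    unsigned char *composite = calloc(max + 1, 1);   /* all zero: "maybe prime" */
    if (composite == NULL)
        return -1;
    long count = 0;
    for (size_t i = 2; i <= max; i++) {
        if (composite[i])
            continue;
        count++;
        if (i > max / i)       /* i*i would exceed max (and could overflow) */
            continue;
        for (size_t j = i * i; j <= max; j += i)
            composite[j] = 1;
    }
    free(composite);
    return count;
}
```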
An array is used to allocate memory statically and in one go.
To allocate memory dynamically, malloc is required.
e.g. int numbers[10];
This will allocate memory statically, and it will be contiguous memory.
If you do not know how many numbers there will be, use a variable such as count.
int count;
int *numbers;
scanf("%d", &count);
numbers = malloc ( sizeof(int) * count );
This is not possible with fixed-size arrays.
Dynamic does not refer to the access; it refers to the size passed to malloc. If you just use a constant, like the 10 in your example, it is no better than an array. The advantage comes when you don't know in advance how big it must be, e.g. because the user can enter the size at runtime. Then you can allocate with a variable, e.g. malloc(sizeof(int) * userEnteredNumber). This is not possible with a fixed-size array, where you have to know the (maximum) size at compile time.