Array of size 2^25 - c

I am trying to create an array of size 2^25 in C and then perform some elementary operations on it (a memsweep function). The C code is:
#include <stdio.h>
#include <time.h>

#define S (8191*4096)

int main(void)
{
    clock_t start = clock();
    unsigned i;
    volatile char large[S];

    for (i = 0; i < 10*S; i++)
        large[(4096*i + i) % S] = 1 + large[i % S];

    printf("%f\n", ((double)clock() - start) / CLOCKS_PER_SEC);
    return 0;
}
I am able to compile it, but on execution it gives a segmentation fault.

That might be bigger than your stack. You can:
- make large global
- use malloc

The array is too big to fit on your stack. Use the heap with char *large = malloc(S) instead.
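For illustration, here is a sketch of the same program with large moved to the heap, as the answers suggest (error checking and the free are added; the loop itself is unchanged):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define S (8191*4096)

int main(void)
{
    clock_t start = clock();
    unsigned i;
    /* allocate the buffer on the heap instead of the stack */
    volatile char *large = malloc(S);
    if (large == NULL) {
        perror("malloc");
        return 1;
    }

    for (i = 0; i < 10*S; i++)
        large[(4096*i + i) % S] = 1 + large[i % S];

    printf("%f\n", ((double)clock() - start) / CLOCKS_PER_SEC);
    free((void *)large);   /* the cast drops the volatile qualifier for free() */
    return 0;
}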

You don't have that much stack space for an array that big. On Linux, for instance, the default stack size is typically 8192 KB (8 MB, as reported by ulimit -s), and this array is about 32 MB, so you've definitely exceeded that.
The best option would be to allocate the memory on the heap using malloc(). So you would write char* large = malloc(S);. You can still access the array using the [] notation.
Optionally, if you're on Linux, you could run ulimit -s X in your shell before starting the program, where X is a size in kilobytes large enough for your array to fit on the stack (note that ulimit is a shell builtin, so sudo ulimit won't work) ... but I'd generally discourage that solution.

large is being allocated on the stack and you are overflowing it.
Try using char *large = malloc(S); instead.

Related

Problem with array declaration of large size in C [duplicate]

I'm implementing a sequential sorting program such as quicksort. I would like to test its performance on a huge array of 1 or 10 billion integers.
But the problem is that I get a segmentation fault due to the size of the array.
Sample code declaring this array:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 1000000000

int main(int argc, char **argv)
{
    int list[N], i;
    srand(time(NULL));
    for (i = 0; i < N; i++)
        list[i] = rand() % 1000;
    return 0;
}
Someone suggested using the mmap function, but I don't know how to use it. Can anybody help me use it?
I'm working on Ubuntu 10.04 64-bit, gcc version 4.4.3.
Thanks for your replies.
Michael is right, you can't fit that much on the stack. However, you can make it global (or static) if you don't want to malloc it.
#include <stdlib.h>
#include <time.h>

#define N 1000000000

static int list[N];

int main(int argc, char **argv)
{
    size_t i;
    srand(time(NULL));
    for (i = 0; i < N; i++)
        list[i] = rand() % 1000;
    return 0;
}
You must use malloc for this sort of allocation. That much on the stack will fail nearly every time.
int *list;
list = malloc(N * sizeof(int));
This puts the allocation on the heap where there is a lot more memory available.
You probably shouldn't create so large an array, and if you do, you certainly shouldn't create it on the stack; the stack just isn't that big.
If you have a 32-bit address space and a 4-byte int, then you can't create an array with a billion ints; there just won't be enough contiguous space in memory for that large an object (there probably won't be enough contiguous space for an object a fraction of that size). If you have a 64-bit address space, you might get away with allocating that much space.
If you really want to try, you'll need either to create it statically (i.e., declare the array at file scope or with the static qualifier in the function) or dynamically (using malloc).
On Linux systems, malloc of very large chunks just does an mmap under the hood, so it is probably not worth the trouble to use mmap directly.
Be careful that you have neither overflow (signed integers) nor silent wraparound (unsigned integers) for your array bounds and indices. Use size_t as the type for those; since you are on a 64-bit machine, this should then work.
As a habit, though, you should definitely check your bounds against SIZE_MAX, something like assert(N*sizeof(data[0]) <= SIZE_MAX), to be sure.
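Since the question asks about mmap specifically, here is a minimal sketch of an anonymous private mapping, roughly what glibc's malloc does internally for very large requests. MAP_ANONYMOUS is a Linux/BSD extension, and you still need roughly 4 GB of RAM plus swap to actually touch all of it:

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

#define N 1000000000ULL

int main(void)
{
    size_t bytes = N * sizeof(int);

    /* ask the kernel for an anonymous, private mapping (not backed by a file) */
    int *list = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (list == MAP_FAILED) {
        perror("mmap");
        return EXIT_FAILURE;
    }

    list[0] = 42;            /* use it like an ordinary array */
    munmap(list, bytes);
    return 0;
}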
The stack allocation makes it break: N = 1 billion ints => 4 GB of memory (with both a 32-bit and a 64-bit compiler). But if you want to measure the performance of quicksort, or a similar algorithm of yours, this is not the way to go about it.
Try instead to run multiple quicksorts in sequence on prepared samples of a large size:
- Create a large random sample of not more than half your available memory. Make sure it doesn't fill your RAM; if it does, all measuring efforts are in vain. 500 M elements is more than enough on a 4 GB system.
- Decide on a test size (e.g. N = 100 000 elements).
- Start the timer.
- Run the algorithm on the range from start = i*N to end = (i+1)*N, and rinse and repeat for the next i until the large random sample is depleted.
- Stop the timer.
Now you have a very precise answer to how much time your algorithm has consumed. Run it a few times to get a feel for how precise that is (use a new srand(seed) each time), and vary N for further inspection.
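A rough sketch of that chunked timing loop follows; SAMPLE, CHUNK, and cmp_int are made-up names, the sizes are examples you should shrink to fit well under half your RAM, and the standard qsort stands in for your own quicksort:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define SAMPLE 100000000UL   /* hypothetical: 100 M ints = 400 MB */
#define CHUNK  100000UL      /* hypothetical: elements sorted per timed chunk */

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

int main(void)
{
    size_t i;
    int *sample = malloc(SAMPLE * sizeof *sample);
    if (sample == NULL)
        return 1;

    /* prepare the large random sample once */
    srand((unsigned)time(NULL));
    for (i = 0; i < SAMPLE; i++)
        sample[i] = rand() % 1000;

    clock_t start = clock();
    /* sort chunk after chunk until the prepared sample is depleted */
    for (i = 0; (i + 1) * CHUNK <= SAMPLE; i++)
        qsort(sample + i * CHUNK, CHUNK, sizeof *sample, cmp_int);
    printf("%f s\n", (double)(clock() - start) / CLOCKS_PER_SEC);

    free(sample);
    return 0;
}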
Another option is to dynamically allocate a linked list of smaller arrays. You'll have to wrap them with accessor functions, but it's far more likely that you can grab sixteen 256 MB chunks of memory than a single 4 GB chunk.
typedef struct node_s node, *node_ptr;

struct node_s
{
    int data[N/NUM_NODES];
    node_ptr next;
};
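NUM_NODES isn't defined in the snippet above. A sketch of how the chunks and an accessor might look, assuming 16 nodes, that NUM_NODES divides N evenly, and hypothetical helper names make_chunks and get:

#include <stdlib.h>

#define N 1000000000
#define NUM_NODES 16                   /* assumption: 16 chunks of ~250 MB each */
#define CHUNK_LEN (N / NUM_NODES)

typedef struct node_s node, *node_ptr;
struct node_s
{
    int data[CHUNK_LEN];
    node_ptr next;
};

/* allocate the chain of chunks; returns NULL on failure (cleanup omitted) */
static node_ptr make_chunks(void)
{
    node_ptr head = NULL;
    int k;
    for (k = 0; k < NUM_NODES; k++) {
        node_ptr n = malloc(sizeof *n);
        if (n == NULL)
            return NULL;
        n->next = head;
        head = n;
    }
    return head;
}

/* accessor: walk to the chunk holding logical index i, then index into it */
static int get(node_ptr head, size_t i)
{
    size_t chunk = i / CHUNK_LEN;
    while (chunk--)
        head = head->next;
    return head->data[i % CHUNK_LEN];
}

int main(void)
{
    node_ptr head = make_chunks();
    if (head == NULL)
        return 1;
    head->data[0] = 7;
    return get(head, 0) == 7 ? 0 : 1;  /* trivial smoke test */
}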

Segmentation Fault on Small(ish) 2d array

I keep getting a segmentation fault with the following code. Changing the 4000 to 1000 makes the code run fine. I would think that I have enough memory here... How can I fix this?
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <string.h>

#define MAXLEN 4000

void initialize_mx(float mx[][MAXLEN])
{
    int i, j;
    float c = 0;
    for (i = 0; i < MAXLEN; i++) {
        for (j = 0; j < MAXLEN; j++)
            mx[i][j] = c;
    }
}

int main(int ac, char *av[])
{
    int i, j;
    float confmx[MAXLEN][MAXLEN];
    initialize_mx(confmx);
    return 0;
}
The problem is you're overflowing the stack.
When main() runs, it allocates stack space for its local variables (confmx in your case). This space, which is limited by your OS (check ulimit -s if you're on Linux), gets overflowed if local variables are too big.
Basically you can:
Declare confmx as a global variable, as cnicutar suggests.
Allocate memory for the array dynamically and pass a pointer to initialize_mx().
EDIT: Just realized you must still allocate memory space if you pass a pointer, so you have those two options :)
You are using 4000*4000*4 bytes on your stack; if I didn't make any calculation errors, that's about 61 MB, which is a lot. It works with 1000 because in that case you are only using about 4 MB of stack.
4000*4000*sizeof(float)==64000000. I suspect your operating system may have a limit on the stack size between 4 and 64 MB.
As others have noted, "smallish" isn't small for automatic variables, which are allocated on the stack.
Depending on your needs, you could use
static float confmx[MAXLEN][MAXLEN];
which would allocate the storage in the BSS segment. You might also want to consider a different storage scheme, since one often only needs a sparse matrix, and there are more efficient ways to store and access matrices where many of the cells are zero.
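If you'd rather not use a static or global array, here is a sketch of the heap option mentioned above; it keeps the existing initialize_mx() signature by allocating MAXLEN rows of MAXLEN floats as one contiguous block:

#include <stdio.h>
#include <stdlib.h>

#define MAXLEN 4000

void initialize_mx(float mx[][MAXLEN])
{
    int i, j;
    for (i = 0; i < MAXLEN; i++)
        for (j = 0; j < MAXLEN; j++)
            mx[i][j] = 0.0f;
}

int main(void)
{
    /* one contiguous heap block of MAXLEN x MAXLEN floats (~61 MB) */
    float (*confmx)[MAXLEN] = malloc(MAXLEN * sizeof *confmx);
    if (confmx == NULL) {
        perror("malloc");
        return 1;
    }
    initialize_mx(confmx);
    free(confmx);
    return 0;
}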

Basic array usage in C?

Is this how you guys get the size of an array in ANSI C99? Seems kind of, um, clunky coming from a higher-level language.
int tests[7];
for (int i = 0; i < sizeof(tests)/sizeof(int); i++) {
    tests[i] = rand();
}
Also, this segfaults:
int r = 10000000;
printf ("r: %i\n", r);
int tests[r];
run it:
r: 10000000
Segmentation fault
10000000 seg faults, but 1000000 works.
How do I get more info out of this? What should I be checking and how would I debug something like this? Is there a limit on C arrays? What's a segmentation fault?
Getting the size of an array in C is easy. This will give you the size of the array in bytes:
sizeof(x)
But I guess what you require is the number of elements; in that case it would be:
sizeof(x) / sizeof(x[0])
You can write a simple macro for this:
#define NumElements(x) (sizeof(x) / sizeof(x[0]))
For example:
int a[10];
int size_a = sizeof(a); /* size in bytes */
int numElm = NumElements(a); /* number of elements, here 10 */
Why calculate the size?
Define a constant containing the size and use that when declaring the array. Reference the constant whenever you want the size of the array.
As a primarily C++ programmer, I'll say that historically the constant was often defined as an enum value or a #define. In C, that may be current rather than historic, though - I don't know how current C handles "const".
If you really want to calculate the size, define a macro to do it. There may even be a standard one.
The reason for the segfault is most likely because the array you're trying to declare is about 40 megabytes worth, and is declared as a local variable. Most operating systems limit the size of the stack. Keep your array on the heap or in global memory, and 40 megabytes for one variable will probably be OK for most systems, though some embedded systems may still cry foul. In a language like Java, all objects are on the heap, and only references are kept on the stack. This is a simple and flexible system, but often much less efficient than storing data on the stack (heap allocation overheads, avoidable heap fragmentation, indirect access overheads...).
Arrays in C don't know how big they are, so yes, you have to do the sizeof array / sizeof array[0] trick to get the number of elements in an array.
As for the segfault issue, I'm guessing that you exceeded your stack size by attempting to allocate 10000000 * sizeof int bytes. A rule of thumb is that if you need more than a few hundred bytes, allocate it dynamically using malloc or calloc instead of trying to create a large auto variable:
int r = 10000000;
int *tests = malloc(sizeof *tests * r);
Note that you can treat tests as though it were an array type in most circumstances (i.e., you can subscript it, you can pass it to any function that expects an array, etc.), but it is not an array type; it is a pointer type, so the sizeof tests / sizeof tests[0] trick won't work.
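To see the difference concretely, here is a small sketch using the NumElements macro from the answer above (assuming a typical 64-bit system where a pointer is 8 bytes and an int is 4, so the pointer version prints 2 rather than 10):

#include <stdio.h>
#include <stdlib.h>

#define NumElements(x) (sizeof(x) / sizeof((x)[0]))

int main(void)
{
    int arr[10];
    int *p = malloc(10 * sizeof *p);

    printf("%zu\n", NumElements(arr)); /* 10: sizeof sees the whole array object   */
    printf("%zu\n", NumElements(p));   /* sizeof(p) is only the pointer's own size */

    free(p);
    return 0;
}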
Traditionally, an array has a static size. So we can do
#define LEN 10
int arr[LEN];
but not
int len;
scanf("%d", &len);
int arr[len]; // not valid in C89 (C99 allows it as a variable-length array, but it still lives on the stack)
Since we know the size of an array at compile time, getting the size of an array tends to be trivial. We don't need sizeof because we can figure out the size by looking at our declaration.
C++ provides heap arrays, as in
int len;
scanf("%d", &len);
int *arr = new int[len];
but since this involves pointers instead of stack arrays, we have to store the size in a variable which we pass around manually.
I suspect that it is because of integer overflow. Try printing the value using a printf:
printf("%d", 10000000);
If it prints a negative number, that is the issue.
Stack Overflow! Try allocating on the heap instead of the stack.

Segmentation fault due to lack of memory in C

This code gives me segmentation fault about 1/2 of the time:
int main(int argc, char **argv) {
    float test[2619560];
    int i;
    for (i = 0; i < 2619560; i++)
        test[i] = 1.0f;
}
I actually need to allocate a much larger array; is there some way of getting the operating system to allow me more memory?
I am using Linux Ubuntu 9.10
You are overflowing the default maximum stack size, which is 8 MB.
You can either increase the stack size, e.g. to 32 MB:
ulimit -s 32767
... or you can switch to allocation with malloc:
float *test = malloc(2619560 * sizeof test[0]);
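If you want to check the limit from inside the program rather than with ulimit, here is a small sketch using the POSIX getrlimit call (note that ulimit -s reports kilobytes, while these values are in bytes):

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    /* RLIMIT_STACK is the same limit that `ulimit -s` shows, but in bytes */
    if (getrlimit(RLIMIT_STACK, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    printf("stack soft limit: %llu bytes, hard limit: %llu bytes\n",
           (unsigned long long)rl.rlim_cur,
           (unsigned long long)rl.rlim_max);
    return 0;
}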
Right now you're allocating (or at least trying to) 2619560*sizeof(float) bytes on the stack. At least in most typical cases, the stack can use only a limited amount of memory. You might try defining it static instead:
static float test[2619560];
This gets it out of the stack, so it can typically use any available memory instead. In other functions, defining something as static changes the semantics, but in the case of main it makes little difference (other than the mostly theoretical possibility of a recursive main).
Don't put such a large object on the stack. Instead, consider storing it in the heap, by allocation with malloc() or its friends.
2.6M floats isn't that many, and even on a 32-bit system you should be ok for address space.
If you need to allocate a very large array, be sure to use a 64-bit system (assuming you have enough memory!). 32-bit systems can only address about 3 GB per process, and even then you can't allocate it all as a single contiguous block.
It is a stack overflow.
You'd better use malloc to get memory larger than the stack size, which you can find with ulimit -s.

