I need to fill a 2-D array with 0s, but the compiled program crashes. What's wrong?
int main()
{
int vert[1001][1001];
int hor[1001][1001];
int dudiag[1416][1416];
int uddiag[1416][1416];
int n, k, m;
int row, col;
int i, j;
int answer = 0;
for(i = 0; i <= 1000; i++){
for(j = 0; j <= 1000; j++){
vert[i][j] = 0;
hor[i][j] = 0;
}
}
...
}
When the loop is commented out, it works properly.
The problem is that you are trying to allocate too much memory in automatic storage (AKA "on the stack"). When you comment out the loop, the compiler optimizes out the allocation along with the now-unused variables, so you do not get a crash.
You need to change the allocation to either static or dynamic memory (AKA "the heap") to fix this problem. Since the issue is inside main, making the arrays static would be an appropriate choice.
int main()
{
static int vert[1001][1001];
static int hor[1001][1001];
static int dudiag[1416][1416];
static int uddiag[1416][1416];
...
}
In functions other than main you could make these arrays dynamic, allocate them using malloc/calloc, and then free them once your program has finished using them.
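A minimal sketch of the dynamic option, assuming the same 1001 x 1001 dimensions as in the question. The names `DIM` and `make_grid` are hypothetical and do not appear in the original code. Because `calloc` returns zero-filled memory, no separate clearing loop is needed.

```c
#include <stdlib.h>

#define DIM 1001  /* hypothetical name for the question's array dimension */

/* Allocate a zero-filled DIM x DIM grid of int on the heap.
   Returns NULL if the allocation fails; the caller must free() the result. */
int (*make_grid(void))[DIM]
{
    /* calloc zero-initializes, so every element starts at 0 */
    return calloc(DIM, sizeof(int[DIM]));
}
```

A caller could write `int (*vert)[DIM] = make_grid();`, index it as `vert[i][j]` exactly like the original array, and call `free(vert)` when done.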
What's wrong?
You are trying to reserve about 24MB of arrays on the stack: two of roughly 4MB each (1001 x 1001 ints) and two of roughly 8MB each (1416 x 1416 ints), assuming a 4-byte int. On many Linux distributions, the default stack size is 8MB (you can change this with ulimit -s unlimited).
Even with an unlimited stack, the Linux kernel will not extend the stack by more than some constant, so ulimit -s unlimited may not help you avoid the crash.
As dasblinkenlight's answer says, if you need arrays that large, allocate them dynamically.
Finally, an explicit for loop is an inefficient way to zero out an array. Using memset is likely to be much more efficient, and requires much less code.
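A short sketch of the memset approach, assuming the array has static storage so that it does not live on the stack. The names `DIM` and `clear_vert` are hypothetical. Setting all bytes to zero yields the value 0 for every int element.

```c
#include <string.h>

#define DIM 1001  /* hypothetical name for the question's array dimension */

/* Static storage: not on the stack, and already zero at program start. */
static int vert[DIM][DIM];

/* Reset every element of vert to 0 with a single call. */
void clear_vert(void)
{
    /* sizeof vert is the size of the whole 2-D array in bytes */
    memset(vert, 0, sizeof vert);
}
```

Note that `sizeof vert` only gives the full size because `vert` is a real array here; with a pointer obtained from malloc you would have to pass the byte count explicitly.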
I'm new to CUDA/C and new to stack overflow. This is my first question.
I'm trying to allocate memory dynamically in a kernel function, but the results are unexpected.
I read using malloc() in a kernel can lower performance a lot, but I need it anyway so I first tried with a simple int ** array just to test the possibility, then I'll actually need to allocate more complex structs.
In my main I used cudaMalloc() to allocate the space for the array of int *, and then I used malloc() for every thread in the kernel function to allocate the array for every index of the outer array. I then used another thread to check the result, but it doesn't always work.
Here's main code:
#define N_CELLE 1024*2
#define L_CELLE 512
extern "C" {
int main(int argc, char **argv) {
int *result = (int *)malloc(sizeof(int));
int *d_result;
int size_numbers = N_CELLE * sizeof(int *);
int **d_numbers;
cudaMalloc((void **)&d_numbers, size_numbers);
cudaMalloc((void **)&d_result, sizeof(int *));
kernel_one<<<2, 1024>>>(d_numbers);
cudaDeviceSynchronize();
kernel_two<<<1, 1>>>(d_numbers, d_result);
cudaMemcpy(result, d_result, sizeof(int), cudaMemcpyDeviceToHost);
printf("%d\n", *result);
cudaFree(d_numbers);
cudaFree(d_result);
free(result);
}
}
I used extern "C" because I couldn't compile while importing my header, which is not used in this example code. I pasted it since I don't know whether it is relevant or not.
This is kernel_one code:
__global__ void kernel_one(int **d_numbers) {
int i = threadIdx.x + blockIdx.x * blockDim.x;
d_numbers[i] = (int *)malloc(L_CELLE*sizeof(int));
for(int j=0; j<L_CELLE;j++)
d_numbers[i][j] = 1;
}
And this is kernel_two code:
__global__ void kernel_two(int **d_numbers, int *d_result) {
int temp = 0;
for(int i=0; i<N_CELLE; i++) {
for(int j=0; j<L_CELLE;j++)
temp += d_numbers[i][j];
}
*d_result = temp;
}
Everything works fine (i.e. the count is correct) as long as I use no more than 1024*2*512 total ints in device memory. For example, if I #define N_CELLE 1024*4 the program starts giving "random" results, such as negative numbers.
Any idea of what the problem could be?
Thanks anyone!
In-kernel memory allocation draws memory from a statically allocated runtime heap (8MB by default). At larger sizes, you are exceeding the size of that heap, so the in-kernel malloc() calls start returning NULL and your two kernels then attempt to read and write through invalid pointers. This produces a runtime error on the device and renders the results invalid. You would already know this if you had either added correct API error checking on the host side or run your code with the cuda-memcheck utility.
The solution is to ensure that the heap size is set to something appropriate before trying to run a kernel. Adding something like this:
size_t heapsize = sizeof(int) * size_t(N_CELLE) * size_t(2*L_CELLE);
cudaDeviceSetLimit(cudaLimitMallocHeapSize, heapsize);
to your host code before any other API calls, should solve the problem.
I don't know anything about CUDA but these are severe bugs:
You cannot convert from int** to void**. They are not compatible types. Casting doesn't solve the problem, but hides it.
&d_numbers gives the address of a pointer to pointer which is wrong. It is of type int***.
Both of the above bugs result in undefined behavior. If your program somehow seems to work under some conditions, that is purely by (bad) luck.
#include<stdio.h>
int main()
{
int t;
scanf("%d",&t);
while(t-->0)
{
long int size;
scanf("%ld",&size);
long int size2=size*size;
long int a[size],b[size2];
long int i=0;
for(i=0;i<size;i++)
{
scanf("%ld",&a[i]);
}
long int j=0;
long int y;
y=2*a[0];
for(i=0;i<size;i++)
{
for(j=0;j<size;j++)
{
if(i!=0 && j!=0)
{
b[i*size+j]=a[i]+a[j];
y=y^b[i*size+j];
}
}
}
printf("%ld\n",y);
}
}
I was solving one of the problems on a popular competitive coding website. I wrote this code, and it works for most of the test cases I tried, but it was rejected with RE (SIGSEGV), a runtime error due to a segmentation fault. The site didn't provide the test case that triggered the error.
I made sure the input is read correctly for each variable type, and even made sure my code stays within the allowed limit of 50000 bytes. Can somebody help me understand what is causing the segmentation fault here?
This segmentation fault is caused by allocating too much automatic memory through VLA (variable-length arrays).
I made sure my code stays within the allowed limit of 50000 bytes
Large VLAs may cause undefined behavior even if your program has plenty of memory available for dynamic allocation, because they take memory from the automatic storage (commonly referred to as "stack"). The amount of automatic storage available to your program is usually a fraction of the total memory available to your process.
You can fix this problem by switching to dynamic allocation:
long int *a = malloc(sizeof(long int)*size);
long int *b = malloc(sizeof(long int)*size2);
...
free(a);
free(b);
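Dynamic allocation can fail as well, and with size*size elements the byte count itself can overflow. Below is a minimal sketch of a checked allocation; the helper name `alloc_longs` is hypothetical and not part of the original answer.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate room for `count` long ints.
   Returns NULL if the byte count would overflow size_t
   or if the allocation itself fails. */
long *alloc_longs(size_t count)
{
    if (count > SIZE_MAX / sizeof(long))
        return NULL;  /* count * sizeof(long) would overflow */
    return malloc(count * sizeof(long));
}
```

The caller would check the result before use, for example `long *b = alloc_longs(size2); if (b == NULL) { /* report the error and stop */ }`, and still call `free(b)` afterwards.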
Is there a way to actually create dynamic arrays in C without having to use the stdlib?
malloc requires stdlib.h, which I am not allowed to use in my project.
If anyone has any ideas, please share. Thanks.
malloc is not just a library function; it is the way you ask the operating system for more memory for the running process. You could ask for more memory and manage free/occupied memory yourself, but that would be wrong on many levels.
But I am inclined to believe that your project is going to run on some kind of platform that does not have an operating system, is that right? [1] In that case, the fastest solution is to statically allocate some memory in a big global array first, and every time you need memory you ask a manager responsible for this big array.
Let me give you an example. For the sake of simplicity it will be tiny and not very functional, but it is a very good quick start.
#include <stdio.h> /* printf, NULL */
typedef char bool;
#define BLOCK_SIZE 1024 //Each allocation can be at most 1KB
#define MAX_DATA (1024*1024*10) //Our program statically allocates 10MB
#define BLOCKS (MAX_DATA/BLOCK_SIZE)
typedef char Scott_Block[BLOCK_SIZE];
Scott_Block Scott_memory[BLOCKS];
bool Scott_used_memory[BLOCKS];
void* Scott_malloc(unsigned size) {
if( size > BLOCK_SIZE )
return NULL;
unsigned int i;
for(i=0;i<BLOCKS;++i) {
if( Scott_used_memory[i] == 0 ) {
Scott_used_memory[i] = 1;
return Scott_memory[i];
}
}
return NULL;
}
void Scott_free(void* ptr) {
unsigned int pos = (unsigned int)(((char*)ptr - Scott_memory[0]) / BLOCK_SIZE);
printf("Pos %u\n", pos);
Scott_used_memory[pos] = 0;
}
I wrote this code to show how to emulate a memory manager. Let me point out a few improvements that may be done to it.
First, the Scott_used_memory could be a bitmap instead of a bool array.
Second, it does not allocate memory bigger than BLOCK_SIZE; it should search for consecutive blocks to create a bigger block. But for that you would need more control data to tell how many blocks an allocated void* occupies.
Third, the way free memory is searched (linearly) is very slow; usually the free blocks form a linked list.
But, like I said, this is a great quick start. And depending on your needs this may fulfill it very well.
[1] If not, then you have absolutely no reason not to use malloc.
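The third improvement mentioned above (keeping free blocks in a linked list) could look roughly like the sketch below. It is not the code from the answer: the names `Block`, `pool_alloc` and `pool_free` are hypothetical, and the pool is deliberately small. Each free block stores the pointer to the next free block inside its own memory, so allocating and freeing take constant time.

```c
#include <stddef.h>

#define BLOCK_SIZE 1024
#define BLOCKS 16  /* small pool for illustration only */

/* A block is either user data or, while free, a link to the next free block. */
typedef union Block {
    union Block *next;
    char data[BLOCK_SIZE];
} Block;

static Block pool[BLOCKS];
static Block *free_list = NULL;
static int initialized = 0;

/* Thread every block of the pool onto the free list. */
static void pool_init(void)
{
    for (size_t i = 0; i < BLOCKS; ++i) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
    initialized = 1;
}

/* Hand out one block; NULL if the request is too big or the pool is empty. */
void *pool_alloc(size_t size)
{
    if (!initialized)
        pool_init();
    if (size > BLOCK_SIZE || free_list == NULL)
        return NULL;
    Block *b = free_list;
    free_list = b->next;  /* pop the first free block */
    return b->data;
}

/* Return a block obtained from pool_alloc to the free list. */
void pool_free(void *ptr)
{
    if (ptr == NULL)
        return;
    Block *b = (Block *)ptr;
    b->next = free_list;  /* push the block back on the front */
    free_list = b;
}
```

Like the original example, this sketch still cannot serve requests larger than one block, and it does not detect a pointer being freed twice.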
Well why not this program (C99)
#include <stdio.h>
int main(int argc, char *argv[])
{
int sizet, i;
printf("Enter size:");
scanf("%d",&sizet);
int array[sizet];
for(i = 0; i < sizet; i++){
array[i] = i;
}
for(i = 0; i < sizet; i++){
printf("%d", array[i]);
}
return 0;
}
Like a boss! :-)
I started to learn C recently. I use Code::Blocks with MinGW and Cygwin GCC.
I made a very simple prime sieve for Project Euler problem 10, which prints primes below a certain limit to stdout. It works fine with a limit of up to roughly 500000, but above that my MinGW-compiled .exe crashes and the Cygwin GCC-compiled one throws a "STATUS_STACK_OVERFLOW" exception.
I'm puzzled as to why, since the code is totally non-recursive, consisting of simple for loops.
#include <stdio.h>
#include <math.h>
#define LIMIT 550000
int main()
{
int sieve[LIMIT+1] = {0};
int i, n;
for (i = 2; i <= (int)floor(sqrt(LIMIT)); i++){
if (!sieve[i]){
printf("%d\n", i);
for (n = 2; n <= LIMIT/i; n++){
sieve[n*i] = 1;
}
}
}
for (i; i <= LIMIT; i++){
if (!sieve[i]){
printf("%d\n", i);
}
}
return 0;
}
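Seems like you cannot allocate 550000 ints on the stack; allocate them dynamically instead. Use calloc rather than malloc here, because the sieve relies on the array starting out zeroed.
int * sieve;
sieve = calloc(LIMIT+1, sizeof(int)); /* needs #include <stdlib.h> */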
When your memory chunk is bigger than the stack, your basic options are to store it elsewhere:
allocating memory for the array on the heap with malloc (as @Binyamin explained)
storing the array in the Data/BSS segment by declaring it as static int sieve[SIZE_MACRO]
All the memory in that program is allocated on the stack. When you increase the size of the array, you increase the amount of space required on the stack. Eventually the function cannot be called because there isn't enough space on the stack to accommodate it.
Either experiment with allocating the array with malloc (so it's allocated on the heap), or learn how to tell the compiler to allocate a larger stack.
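As a sketch of the heap-based option, here is a sieve that sums the primes below a limit instead of printing them. The function name `sum_primes_below` is hypothetical, and using one byte per number rather than one int is a choice made here to save memory, not something the answers above prescribe.

```c
#include <stdlib.h>

/* Sum of all primes strictly below `limit`, using a heap-allocated sieve.
   Returns 0 if there are no such primes or if the allocation fails. */
unsigned long long sum_primes_below(size_t limit)
{
    if (limit < 3)
        return 0;  /* no primes below 0, 1 or 2 */

    /* One byte per number; calloc zero-fills, so 0 means "not crossed out". */
    unsigned char *composite = calloc(limit, 1);
    if (composite == NULL)
        return 0;

    unsigned long long sum = 0;
    for (size_t i = 2; i < limit; i++) {
        if (composite[i])
            continue;
        sum += i;  /* i was never crossed out, so it is prime */
        for (size_t n = 2 * i; n < limit; n += i)
            composite[n] = 1;  /* cross out every multiple of i */
    }

    free(composite);
    return sum;
}
```

Because the sieve lives on the heap, a limit such as 550001 or a few million does not touch the stack limit at all.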
New to C, thanks a lot for help.
Is it possible to define an array in C without either specifying its size or initializing it.
For example, can I prompt a user to enter numbers and store them in an int array ? I won't know how many numbers they will enter beforehand.
The only way I can think of now is to define a max size, which is not an ideal solution...
Well, you can dynamically allocate the size:
#include <stdio.h>
#include <stdlib.h> /* malloc, free */
int main(int argc, char *argv[])
{
int *array;
int cnt;
int i;
/* In the real world, you should do a lot more error checking than this */
printf("enter the amount\n");
scanf("%d", &cnt);
array = malloc(cnt * sizeof(int));
/* do stuff with it */
for(i=0; i < cnt; i++)
array[i] = 10*i;
for(i=0; i < cnt; i++)
printf("array[%d] = %d\n", i, array[i]);
free(array);
return 0;
}
Perhaps something like this:
#include <stdio.h>
#include <stdlib.h>
/* An arbitrary starting size.
Should be close to what you expect to use, but not really that important */
#define INIT_ARRAY_SIZE 8
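int array_size = INIT_ARRAY_SIZE;
int array_index = 0;
int *array = NULL; /* allocated in main: calling malloc at file scope is not valid C */
void array_push(int value) {
array[array_index] = value;
array_index++;
/* grow as soon as the array is full, so the next push always has room */
if(array_index >= array_size) {
array_size *= 2;
array = realloc(array, array_size * sizeof(int));
}
}
int main(int argc, char *argv[]) {
int shouldBreak = 0;
int val;
array = malloc(array_size * sizeof(int));
while (!shouldBreak) {
if (scanf("%d", &val) != 1)
break; /* stop on end of input or invalid input */
shouldBreak = (val == 0);
array_push(val);
}
free(array);
return 0;
}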
This will prompt for numbers and store them in an array, as you asked. It will terminate when given a 0.
You create an accessor function array_push for adding to your array, and you call realloc from within this function when you run out of space. You double the amount of allocated space each time. At most you'll allocate double the memory you need, and at worst you will call realloc log n times, where n is the final intended array size.
You may also want to check for failure after calling malloc and realloc. I have not done this above.
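A sketch of what that failure checking could look like, assuming the array state is kept in a small struct instead of globals. The names `IntVec`, `intvec_push` and `intvec_free` are hypothetical. The key point is to store the result of realloc in a temporary, so the old buffer is not lost if realloc returns NULL.

```c
#include <stdlib.h>

typedef struct {
    int *data;
    size_t len;  /* elements in use */
    size_t cap;  /* elements allocated */
} IntVec;

/* Append value. Returns 0 on success, -1 if memory could not be obtained.
   On failure the vector is left unchanged and still usable. */
int intvec_push(IntVec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 8;  /* double, starting at 8 */
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return -1;  /* v->data is still valid here */
        v->data = p;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}

/* Release the storage and reset the vector to its empty state. */
void intvec_free(IntVec *v)
{
    free(v->data);
    v->data = NULL;
    v->len = v->cap = 0;
}
```

A vector starts out as `IntVec v = {0};`, which works because realloc with a NULL pointer behaves like malloc.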
Yes, absolutely. C99 introduced the VLA or Variable Length Array.
Some simple code would be like such:
#include <stdio.h>
int main (void) {
int arraysize;
printf("How big do you want your array to be?\n");
scanf("%d",&arraysize);
int ar[arraysize];
return 0;
}
Arrays, by definition, are fixed-size memory structures. You want a vector. Since Standard C doesn't define vectors, you could try looking for a library, or hand-rolling your own.
You need to do dynamic allocation: you want a pointer to a block of memory of as-yet-unknown size. Read up on malloc and realloc.
If all you need is a data structure whose size can change dynamically, then the best option you can go for is a linked list. You can add data to the list by dynamically allocating memory for each element, and this would be much easier!
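A minimal sketch of such a linked list of ints, under the assumption that new values are added at the front (so the list ends up in reverse input order). The names `Node`, `list_push`, `list_length` and `list_free` are hypothetical.

```c
#include <stdlib.h>

typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* Add value at the front of the list.
   Returns 0 on success, -1 if the node could not be allocated. */
int list_push(Node **head, int value)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = *head;  /* the old first node becomes the second */
    *head = n;
    return 0;
}

/* Count the nodes by walking the list. */
size_t list_length(const Node *head)
{
    size_t count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Free every node and leave the list empty. */
void list_free(Node **head)
{
    while (*head != NULL) {
        Node *next = (*head)->next;
        free(*head);
        *head = next;
    }
}
```

The list never needs to be copied when it grows, but unlike an array it offers no constant-time access to the i-th element.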
If you're a beginner, maybe you don't want to deal with malloc and free yet. So if you're using GCC, you can allocate variable size arrays on the stack, just specifying the size as an expression.
For example:
#include <stdio.h>
void dyn_array(const unsigned int n) {
int array[n];
int i;
for(i=0; i<n;i++) {
array[i]=i*i;
}
for(i=0; i<n;i++) {
printf("%d\n",array[i]);
}
}
int main(int argc, char **argv) {
dyn_array(argc);
return 0;
}
But keep in mind that variable-length arrays are standard only in C99; they are an optional feature since C11 and a compiler extension in C90 and C++, so you shouldn't count on them if portability matters.
You can use malloc to allocate memory dynamically (i.e. the size is not known until runtime).
C is a low level language: you have to manually free up the memory after it's used; if you don't, your program will suffer from memory leaks.
UPDATE
Just read your comment on another answer.
You're asking for an array with a dynamically-changing-size.
Well, C has no language/syntactic facilities to do that; you either have to implement this yourself or use a library that has already implemented it.
See this question: Is there an auto-resizing array/dynamic array implementation for C that comes with glibc?
For something like this, you might want to look into data structures such as:
Linked Lists (Ideal for this situation)
Various Trees (Binary Trees, Heaps, etc)
Stacks & Queues
But as for instantiating a variable sized array, this isn't really possible.
The closest thing to a dynamic array is using malloc and its associated functions (free, realloc, etc.).
But in this situation, using commands like malloc may result in the need to expand the array, an expensive operation where you initialize another array and then copy the old array into that. Lists, and other datatypes, are generally much better at resizing.
If you're looking for array facilities and don't want to roll your own, try the following:
Glib
Apache APR
NSPR
The answers given above are correct, but there is one addition: the function malloc() reserves a block of memory of the specified size and returns a pointer of type void*, which can be cast to a pointer of any type. (The cast is optional in C but required in C++.)
Syntax: ptr = (cast-type*) malloc(byte-size)
#include<stdio.h>
#include<stdlib.h>
int main(int argc,char* argv[]){
int *arraySize,length;
scanf("%d",&length);
arraySize = (int*)malloc(length*sizeof(int));
for(int i=0;i<length;i++)
arraySize[i] = i*2;
for(int i=0;i<length;i++)
printf("arrayAt[%d]=%d\n",i,arraySize[i]);
free(arraySize);
}