I have a C program in which I declare a 2D array of size 8,388,608 × 23. When I run the program I get the following error:
[1] 12142 segmentation fault (core dumped)
I think that the size of the array is too big.
Here is my code:
int a[8388608][23];
a[0][0] = 10;
You likely declared int a[8388608][23]; inside a function, and the C implementation attempted to allocate space on the stack.
In common C implementations on macOS, Linux, and Windows, the space designated for the stack by default ranges from 1 MiB to 8 MiB (8,388,608 bytes), depending on the operating system and whether it is a main thread or a spawned thread. Since your array exceeded the space for the stack, using it accessed memory not mapped for your process and generated a segmentation fault.
The C standard requires an implementation to have sufficient memory to execute at least some programs (C 2018 5.2.4.1) but allows there to be a limit on the memory available and does not require an implementation to provide any warning or error handling when a program exceeds the limit. It allows a program to fail and abort.
The stack size for a program can be set through linker options. However, it is generally best not to use the stack for large amounts of data. If a program needs an array throughout its entire execution, it can be allocated statically by defining it outside of any function. The amount of memory needed is then computed at link time and reserved when the program is loaded.
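For illustration, a minimal sketch of the static approach, using the array from the question:

#include <stdio.h>

/* Defined at file scope: static storage duration, so it does not use the stack. */
static int a[8388608][23];

int main(void)
{
    a[0][0] = 10;   /* no stack overflow: the array lives in the program's data segment */
    printf("%d\n", a[0][0]);
    return 0;
}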
When a function needs a large amount of memory temporarily, it should be allocated dynamically. You can do this with malloc:
int (*a)[23] = malloc(8388608 * sizeof *a);
if (!a)
{
fprintf(stderr, "Error, unable to allocate memory.\n");
exit(EXIT_FAILURE);
}
When the function is done with the memory, it should release it with free(a);.
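Putting it together, a sketch of the whole lifecycle inside a function (the function name process is made up for illustration):

#include <stdio.h>
#include <stdlib.h>

/* Illustrative function: allocate the 2D array, use it, release it. */
static void process(void)
{
    int (*a)[23] = malloc(8388608 * sizeof *a);
    if (!a)
    {
        fprintf(stderr, "Error, unable to allocate memory.\n");
        exit(EXIT_FAILURE);
    }
    a[0][0] = 10;   /* use it like a normal 2D array */
    free(a);        /* release the memory when done */
}

int main(void)
{
    process();
    return 0;
}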
Actually, from the memory perspective there is no limit except the computer's or server's RAM.
On the other side, the compiler will not permit the size of a dimension to exceed what the widest integer type (long long) can represent:
int a[1ll << 31];  // ok as far as the type goes (if memory permits)
int b[1ll << 62];  // ok as far as the type goes
int c[1ll << 70];  // not ok, since in C there is no integer type wider than long long
Related
I'm studying dynamic memory allocation in C, and I want to ask a question - let us suppose we have a program that receives a text input from the user. We don't know how long that text will be, it could be short, it could also be extremely long, so we know that we have to allocate memory to store the text in a buffer. In cases in which we receive a very long text, is there a way to find out whether we have enough memory space to allocate more memory to the text? Is there a way to have an indication that there is no memory space left?
You can use the malloc() function: if it returns NULL, that means there is not enough memory space, but if it returns the address of the memory, memory space is available. Example (note that sizeof on a char * would give the pointer's size, so allocate the string's length plus one instead):
char *loc = malloc(strlen(string) + 1);   /* +1 for the terminating '\0' */
ANSI C has no standard functions to get the size of available free RAM.
You may use platform-specific solutions.
C - Check currently available free RAM?
In C we typically use malloc, calloc and realloc for allocation of dynamic memory. As it has been pointed out in both answers and comments these functions return a NULL pointer in case of failure. So typical C code would be:
SomeType *p = malloc(size_required);
if (p == NULL)
{
// oops... malloc failed... add error handling
}
else
{
// great... p now points to allocated memory that we can use
}
I'd like to add that on (at least) Linux systems, the return value from malloc (and friends) is not really an out-of-memory indicator.
If the return value is NULL, we know the call failed and that we didn't get any memory that we can use.
But even if the return value is non-NULL, there is no guarantee that the memory really is available.
From https://man7.org/linux/man-pages/man3/free.3.html :
By default, Linux follows an optimistic memory allocation
strategy. This means that when malloc() returns non-NULL there
is no guarantee that the memory really is available. In case it
turns out that the system is out of memory, one or more processes
will be killed by the OOM killer.
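As a rough illustration of overcommit (a sketch of Linux-specific behaviour; exact results depend on the overcommit mode and the machine), the allocation below may succeed even though the pages cannot all be backed by RAM; the failure, if any, only surfaces when the memory is touched:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t huge = (size_t)1 << 40;    /* 1 TiB, far beyond typical RAM */
    char *p = malloc(huge);

    if (p == NULL) {
        puts("malloc itself failed"); /* likely under overcommit mode 2 ("never") */
        return 1;
    }
    puts("malloc returned non-NULL"); /* with overcommit, this can happen anyway */
    /* Touching every page would force the kernel to actually commit the memory;
       under memory pressure the OOM killer may terminate the process at that point:
       memset(p, 0, huge); */
    free(p);
    return 0;
}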
We don't know how long that text will be
Sure we do: we always set a maximum limit, because all user input needs to be sanitised anyway, so we require a maximum limit on every single input. If you don't do this, it likely means your program is broken, since it's vulnerable to buffer overruns.
Typically you'll read each line of user input into a local array allocated on the stack. Then you can check that it is valid (is the string null-terminated, etc.) before allocating dynamic memory and copying it over there.
By checking the return value of malloc etc you'll see if there was enough memory left or not.
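A minimal sketch of that pattern (the 256-byte limit is an illustrative choice, not a requirement):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE 256                      /* the imposed input limit */

int main(void)
{
    char line[MAX_LINE];                  /* local stack buffer for the raw input */

    if (fgets(line, sizeof line, stdin) == NULL)
        return 1;                         /* EOF or read error */

    line[strcspn(line, "\n")] = '\0';     /* strip the trailing newline, if any */

    char *copy = malloc(strlen(line) + 1);
    if (copy == NULL) {                   /* this is the out-of-memory check */
        fprintf(stderr, "out of memory\n");
        return 1;
    }
    strcpy(copy, line);                   /* copy the validated input to the heap */

    printf("stored: %s\n", copy);
    free(copy);
    return 0;
}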
There is no standard library function that tells you how much memory is available for use.
The best you can do within the bounds of the standard library is to attempt the allocation using malloc, calloc, or realloc and check the return value - if it's NULL, then the allocation operation failed.
There may be system-specific routines that can provide that information, but I don’t know of any off the top of my head.
I made a test on Linux with 8 GB RAM. Overcommit has three main modes, 0, 1 and 2, which are default, unlimited, and never:
Default:
$ echo 0 > /proc/sys/vm/overcommit_memory
$ ./a.out
After loop: Cannot allocate memory
size 17179869184
size 400000000
log2(size)34.000000
This means 8.5 GB were successfully allocated, just about the amount of physical RAM. I tried to tweak it, but without changing swap, which is only 4 GB.
Unlimited:
$ echo 1 > /proc/sys/vm/overcommit_memory
$ ./a.out
After loop: Cannot allocate memory
size 140737488355328
size 800000000000
log2(size)47.000000
48 bits is the virtual address size: 140 TB. The physical address size is only 39 bits (about 500 GB).
No overcommit:
$ echo 2 > /proc/sys/vm/overcommit_memory
$ ./a.out
After loop: Cannot allocate memory
size 2147483648
size 80000000
log2(size)31.000000
2 GB is just what free command declares as free. Available are 4.6 GB.
malloc() fails in the same way if the process's resources are restricted, so this ENOMEM does not really tell you much. "Cannot allocate memory" (aka ENOMEM, aka errno 12) just says "malloc failed, guess why", or rather "malloc failed, NO more MEMory for you now".
Well here is a.out which allocates doubling sizes until error.
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <math.h>
int main(void)
{
    size_t sz = 4096;
    void *p;

    while (!errno) {            /* on failure, malloc sets errno to ENOMEM */
        p = malloc(sz *= 2);    /* double the request each time */
        free(p);                /* only probing: release it immediately */
    }
    perror("After loop");
    printf("size %zu\n", sz);
    printf("size %zx\n", sz);
    printf("log2(size)%f\n", log2((double)sz));
}
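Note: because the program uses log2() from <math.h>, on many systems it must be linked with the math library, e.g. cc a.c -lm.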
But I don't think this kind of probing is very useful.
Buffer
we have to allocate memory to store the text in a buffer
But not the whole text at once. With a real buffer (not just allocated memory as the destination) you could read portions of the input and store them away (out of memory, onto disk).
The only disadvantage: if I cannot use a partial input, then all the buffered copying and saving is wasted.
I really wonder what happens if I type and type fast for a couple of billion years -- without a newline.
We can allocate much more than we have as RAM, but we only need a fraction of that RAM: the buffer. But Lundin's answer shows it is much easier (typical) to rely on newlines and maximum length.
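A sketch of that idea, spooling arbitrarily long input to disk through a small fixed buffer (tmpfile() and the 4 KiB buffer size are illustrative choices):

#include <stdio.h>

int main(void)
{
    char buf[4096];              /* small, fixed-size buffer */
    size_t n;
    FILE *spool = tmpfile();     /* disk-backed temporary storage */

    if (spool == NULL) {
        perror("tmpfile");
        return 1;
    }
    while ((n = fread(buf, 1, sizeof buf, stdin)) > 0) {
        if (fwrite(buf, 1, n, spool) != n) {
            perror("fwrite");
            return 1;
        }
    }
    /* rewind(spool) and process the spooled input from disk here */
    return 0;
}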
getline(3)
This GNU/POSIX function has the malloc/realloc logic built in. The parameters are a bit complicated, because a new pointer and size can be returned by reference. And a return value of -1 can also mean ENOMEM, not just end-of-file.
fgets() is the line-truncating version.
fread() is newline-independent, with a fixed size. (But you asked about text input - long lines or long overall text, or both?)
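A sketch of the getline() usage described above (POSIX, so it needs the feature-test macro shown):

#define _POSIX_C_SOURCE 200809L   /* expose getline() in <stdio.h> */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *line = NULL;   /* getline mallocs/reallocs this buffer for us */
    size_t cap = 0;      /* its current capacity, returned by reference */
    ssize_t len;

    while ((len = getline(&line, &cap, stdin)) != -1)
        printf("read %zd bytes\n", len);

    /* A return of -1 can mean end-of-file or an error such as ENOMEM;
       check errno and feof(stdin) to tell them apart. */
    free(line);
    return 0;
}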
Good Q, good As, good comments about "live input":
getline how to limit amount of input as you can with fgets
My question is pretty straightforward.
I'm building a small program to analyse and simulate random text using Markov chains. My first MC had memory of size 2, working on the alphabet {a, b, ..., z}. Therefore, my transition matrix was of size 26 * 26 * 26.
But now, I'd like to enhance my simulation using a MC with memory of size 4. Therefore, I need to store my probabilities of transitions in a 5D array of size 26*26*26*26*26.
The problem is (I believe) that C doesn't allow me to declare and manipulate such an array, as it might be too big. In fact, I got a segmentation fault 11 when writing:
int count[26][26][26][26][26];
Is there a way to get around this restriction?
Thanks!
On a typical PC architecture with 32-bit integers, int count[26][26][26][26][26] creates an object of size 47,525,504 bytes (47 MB), which is manageable on most current computers, but is likely too large for automatic allocation (aka on the stack).
You can declare count as a global or a static variable, or you can allocate it from the heap and make count a pointer with this declaration:
int (*count)[26][26][26][26] = calloc(26, sizeof(*count));
if (count == NULL) {
/* handle allocation failure gracefully */
fprintf(stderr, "cannot allocate memory for 5D array\n");
exit(1);
}
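Note that count can then be indexed with exactly the same syntax as the automatic array (count[i][j][k][l][m]), because only the first dimension was converted to a pointer; release it with free(count) when you are done.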
Make it global¹, make it static, or dynamically allocate the same amount of memory. Dynamic allocation draws from a portion of memory that is not constrained as tightly as the one you ran into: variables with automatic storage duration are typically stored on the stack, while dynamic memory belongs to the heap in most implementations.
You can do this (Illustration):-
int (*a)[26][26][26][26] = malloc(sizeof *a * 26);
if (!a) { perror("malloc"); exit(1); }
...
free(a);
¹ Static storage duration - all variables defined at file scope have static storage duration.
With this kind of array declaration, your data will be stored on the stack. And the stack usually has only 8 MB on Unix-like systems and 1 MB on Windows. But you need at least 4*26^5 bytes (roughly 46 MB).
The preferred solution would be to allocate this array on the heap using malloc.
But you can also instruct compiler to increase the stack size...
Try this
#define MAX 11881376 // 26*26*26*26*26
static int count[MAX]; // flattened array; made static so it does not go on the stack
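With the array flattened to one dimension, you compute the index yourself. A sketch of the same idea (the idx() helper is made up for illustration):

#include <stdio.h>

#define MAX 11881376                 /* 26*26*26*26*26 */

static int count[MAX];               /* static, so it does not live on the stack */

/* Hypothetical helper: map five letter indices (each 0..25) to one flat index. */
static size_t idx(int a, int b, int c, int d, int e)
{
    return ((((size_t)a * 26 + b) * 26 + c) * 26 + d) * 26 + e;
}

int main(void)
{
    count[idx(0, 1, 2, 3, 4)]++;     /* the equivalent of count[0][1][2][3][4] */
    printf("%d\n", count[idx(0, 1, 2, 3, 4)]);
    return 0;
}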
I am doing this problem on SPOJ. http://www.spoj.com/problems/NHAY/. It requires reading input of dynamic size. In the code below, even though I am not allocating enough memory for char *needle with malloc() - I take l = 1 - I am still able to read input of any length, and it also prints out the entire string. Sometimes it gives a runtime error. Why is this, when I have not allocated enough memory for the string?
#include <stdio.h>
#include <malloc.h>
#include <ctype.h>
#include <stdlib.h>

int main()
{
    long long l;
    int i;
    char *needle;
    while (1) {
        scanf("%lld", &l);
        needle = (char *)malloc(sizeof(char) * l);
        scanf("%s", needle);
        i = 0;
        while (needle[i] != '\0') {
            printf("%c", needle[i]);
            i++;
        }
        free(needle);
    }
}
I also read on Stack Overflow that a string is a char *, so I should declare char *needle. How can I use this fact in the code? If I take l = 1, then no matter what the length of the input string is, it should contain characters only up to the memory allocated for the char * pointer, i.e. 1 byte. How can I do that?
Your code is producing an intentional buffer overflow by having scanf copy a string bigger than the allocated space into the memory allocated by malloc. This "works" because in most cases the buffer that is allocated is somewhere in the middle of a page, so copying more data into the buffer "only" overwrites adjacent data. C (and C++) don't do any array bounds checking on plain C arrays, and thus the error is uncaught.
In the cases where you end up with a runtime error, you most likely copied part of the string into unmapped or unallocated memory, which triggers an access violation.
Memory is usually allocated from the underlying OS in pages of a fixed size. For example, on x86 systems, pages are usually 4k in size. If the mapped address you are writing to is far enough away from the beginning and end of the page, the whole string will fit within the boundaries of the page. If you get close enough to the upper boundary, the code may attempt to write past the boundary, triggering the access violation.
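To actually limit what scanf() stores to the space you allocated, give %s a maximum field width; a sketch (the width must be at most the buffer size minus one, leaving room for the terminating '\0'):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *needle = malloc(11);        /* room for 10 characters plus '\0' */
    if (needle == NULL)
        return 1;

    if (scanf("%10s", needle) == 1)   /* "%10s" stops after at most 10 characters */
        printf("%s\n", needle);

    free(needle);
    return 0;
}

Note that the width in a scanf format must be a literal; for a length known only at run time, build the format string with snprintf() or use fgets() instead.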
[With some assumptions about the underlying system]
The reason it works for now is that the C library manages pools of memory allocated in pages from the operating system. The operating system only returns pages. The C library returns arbitrary amounts of data.
For your first allocation, you are getting read/write pages allocated by the operating system and managed by the pool. You are going off the edge of the data allocated by the library but are within the page returned by the operating system.
Doing what you are doing will corrupt the structure of the pool, and a more extensive program using dynamic memory will eventually crash.
The C language does not do bounds checking by default. At best it will crash while debugging; sometimes it will work as expected; otherwise you will end up overwriting other memory blocks.
It will not always work. It is Undefined Behaviour.
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *arr = (int *)malloc(10);
    int i;
    for (i = 0; i < 100; i++)
    {
        arr[i] = i;
        printf("%d", arr[i]);
    }
    return 0;
}
I am running the above program, and a call to malloc will allocate 10 bytes of memory; since each int variable takes up 2 bytes, in a way I can store 5 int variables of 2 bytes each, making up the total of 10 bytes I dynamically allocated.
But in the for loop it is allowing me to enter values even at the 99th index, and it is storing all these values as well. So if I am storing 100 int values, that means 200 bytes of memory, whereas I allocated only 10 bytes.
So where is the flaw with this code or how does malloc behave? If the behaviour of malloc is non-deterministic in such a manner then how do we achieve proper dynamic memory handling?
The flaw is in your expectations. You lied to the compiler: "I only need 10 bytes" when you actually wrote 100*sizeof(int) bytes. Writing beyond an allocated area is undefined behavior and anything may happen, ranging from nothing to what you expect to crashes.
If you do silly things expect silly behaviour.
That said, malloc is usually implemented to ask the OS for chunks of memory in sizes the OS prefers (such as a page) and then manage that memory itself. This speeds up future mallocs, especially if you are making lots of small allocations, and it reduces the number of context switches, which are quite expensive.
First of all, on most operating systems the size of int is 4 bytes. You can check that with:
printf("the size of int is %zu\n", sizeof(int));
When you call the malloc function, you allocate space in heap memory. The heap is a region set aside for dynamic allocation. There's no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time. Because your program is small and you have no collision in the heap, you can run this loop with even more than 100 values and it still runs.
When you know what you are doing with malloc, you build programs with proper dynamic memory handling. When your code has an improper malloc allocation, the behaviour of the program is "unknown". But you can use the gdb debugger to find where the segmentation fault occurs and how things look in the heap.
malloc behaves exactly as it states, allocates n number bytes of memory, nothing more. Your code might run on your PC, but operating on non-allocated memory is undefined behavior.
A small note...
int might not be 2 bytes; it varies across architectures/SDKs. When you want to allocate memory for n integer elements, you should use malloc( n * sizeof( int ) ).
In short, you manage dynamic memory with the other tools the language provides (sizeof, realloc, free, etc.).
C doesn't do any bounds-checking on array accesses; if you define an array of 10 elements, and attempt to write to a[99], the compiler won't do anything to stop you. The behavior is undefined, meaning the compiler isn't required to do anything in particular about that situation. It may "work" in the sense that it won't crash, but you've just clobbered something that may cause problems later on.
When doing a malloc, don't think in terms of bytes, think in terms of elements. If you want to allocate space for N integers, write
int *arr = malloc( N * sizeof *arr );
and let the compiler figure out the number of bytes.
This code gives me segmentation fault about 1/2 of the time:
int main(int argc, char **argv) {
float test[2619560];
int i;
for(i = 0; i < 2619560; i++)
test[i] = 1.0f;
}
I actually need to allocate a much larger array, is there some way of allowing the operating system to allow me get more memory?
I am using Linux Ubuntu 9.10
You are overflowing the default maximum stack size, which is 8 MB.
You can either increase the stack size - e.g. for 32 MB:
ulimit -s 32767
... or you can switch to allocation with malloc:
float *test = malloc(2619560 * sizeof test[0]);
Right now you're allocating (or at least trying to) 2619560*sizeof(float) bytes on the stack. At least in most typical cases, the stack can use only a limited amount of memory. You might try defining it static instead:
static float test[2619560];
This gets it out of the stack, so it can typically use any available memory instead. In other functions, defining something as static changes the semantics, but in the case of main it makes little difference (other than the mostly theoretical possibility of a recursive main).
Don't put such a large object on the stack. Instead, consider storing it in the heap, by allocation with malloc() or its friends.
2.6M floats isn't that many, and even on a 32-bit system you should be ok for address space.
If you need to allocate a very large array, be sure to use a 64-bit system (assuming you have enough memory!). 32-bit systems can only address about 3 GB per process, and even then you can't allocate it all as a single contiguous block.
It is a stack overflow.
You'd better use the malloc function to get memory larger than the stack size, which you can check with "ulimit -s".