For my program I need an array of bytes whose size is 1/8th of the process's virtual memory space.
I used the getrlimit() system call to get the virtual memory limit, raised the soft limit to the hard limit using setrlimit(), and then used mmap() to allocate an array 1/8th that size. Like so:
struct rlimit mem_limit;
if (getrlimit(RLIMIT_AS, &mem_limit) != 0) {
    return -errno;
}
mem_limit.rlim_cur = mem_limit.rlim_max;
if (setrlimit(RLIMIT_AS, &mem_limit) != 0) {
    return -errno;
}
array_size = mem_limit.rlim_cur / 8;
printf("memory size is %lu bytes, array size is %lu bytes\n", mem_limit.rlim_cur, array_size);
mem_array = (char *) mmap(0, array_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (mem_array == MAP_FAILED) {
    printf("mmap failed with %d. allocation size = %lu\n", errno, array_size);
    return -errno;
}
mmap() fails here with errno 12 (ENOMEM), which as far as I know means there's not enough memory. I don't understand why, since the program barely allocates any other memory, let alone the remaining 7/8ths of the address space.
I tried using malloc(), specifying an offset for mmap(), using the soft limit instead of the hard limit, allocating 1/32nd of the memory instead of 1/8th, and passing MAP_NORESERVE in the flags - nothing has worked so far.
I tried running a simple test program that only does the mmap() and no other memory allocations and it doesn't work either.
This is what I get:
memory size is 18446744073709551615 bytes, array size is 2305843009213693951 bytes
mmap failed with 12. allocation size = 2305843009213693951
The getrlimit manpage has the following explanation:
RLIMIT_AS
This is the maximum size of the process's virtual memory (address space).
It doesn't return the available memory size; since no limit is set, rlim_cur is RLIM_INFINITY, i.e. the size of the whole 64-bit address space (about 18 EB), and 1/8th of that can never be mapped.
If you need to find out the available memory size, you can use the sysinfo function.
#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/sysinfo.h>

int main(void)
{
    struct rlimit m;
    struct sysinfo s;

    getrlimit(RLIMIT_AS, &m);
    printf("getrlimit rlim_cur:%lu, rlim_max:%lu\n", m.rlim_cur, m.rlim_max);

    sysinfo(&s);
    printf("sysinfo totalram:%lu, freeram:%lu, totalswap:%lu, freeswap:%lu\n",
           s.totalram, s.freeram, s.totalswap, s.freeswap);
    return 0;
}
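Putting the two calls together for the original goal, here is a minimal sketch; the 1/8th fraction comes from the question, and note that totalram must be scaled by mem_unit to get bytes:

#include <stdio.h>
#include <sys/mman.h>
#include <sys/sysinfo.h>

int main(void)
{
    struct sysinfo s;
    if (sysinfo(&s) != 0)
        return 1;

    /* totalram is expressed in units of s.mem_unit bytes */
    size_t total_bytes = (size_t)s.totalram * s.mem_unit;
    size_t array_size = total_bytes / 8;   /* 1/8th, per the question */

    char *mem_array = mmap(NULL, array_size, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem_array == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    printf("mapped %zu bytes\n", array_size);
    munmap(mem_array, array_size);
    return 0;
}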
Related
I need to allocate memory for a vector with n=10^9 (1 billion) rows using calloc or malloc, but when I try to allocate this amount of memory the call fails and returns NULL, which I presume means the system is not allowing me to allocate this big a chunk of memory. I'm using Windows 10 on a 64-bit platform with 16 GB RAM.
However, when I ran the same code on a Linux OS (Debian), the system actually allocated the amount I demanded, so now I'm wondering:
How can I allocate this big chunk using Windows 10, since I don't have time to venture into Linux yet?
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t *a = calloc(1000000000, 4);
    printf("a = %p\n", (void *)a);
    return 0;
}
The C runtime won't let you do this, but Windows will: use the VirtualAlloc API instead of calloc. Specify NULL for the lpAddress parameter, MEM_COMMIT for flAllocationType, and PAGE_READWRITE for flProtect. Also note that even though dwSize carries the "dw" prefix that usually means DWORD, in this case the parameter is actually a SIZE_T, which is 64-bit for 64-bit builds.
Code would look like:
#include <windows.h>
...
LPVOID pResult = VirtualAlloc(NULL, dwSize, MEM_COMMIT, PAGE_READWRITE);
if(NULL == pResult)
{ /* Handle allocation error */ }
Here dwSize is the number of bytes of memory you wish to allocate. You later use VirtualFree to release the allocated memory: VirtualFree(pResult, 0, MEM_RELEASE);
The dwSize parameter to VirtualFree is for use when you specify MEM_DECOMMIT (rather than MEM_RELEASE), allowing you to put memory back in the reserved but uncommitted state (meaning that actual pages have not yet been found to satisfy the allocation).
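For illustration, here is a sketch of that decommit/recommit cycle, reusing pResult and dwSize from the snippet above:

/* Return the pages to the reserved-but-uncommitted state; the address
   range stays reserved and can be committed again later */
VirtualFree(pResult, dwSize, MEM_DECOMMIT);

/* Recommit the same range (note the explicit address) before touching it again */
pResult = VirtualAlloc(pResult, dwSize, MEM_COMMIT, PAGE_READWRITE);
if (NULL == pResult)
{ /* Handle recommit error */ }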
I'm looking to use madvise and malloc, but I always get the same error:
madvise error: Invalid argument
I tried to use the MADV_DONTDUMP to save some space in my binaries but it didn't work.
The page size is 4096.
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(int argc, char *argv[])
{
    void *p_optimize_object;
    unsigned int optimize_object_size = 4096 * 256;

    optimize_object_size = ((optimize_object_size / 4096) + 1) * 4096;
    printf("optimize_object_size = %d\n", optimize_object_size);

    p_optimize_object = malloc(optimize_object_size);
    if (madvise(p_optimize_object, optimize_object_size, MADV_DONTDUMP | MADV_SEQUENTIAL) == -1)
    {
        perror("madvise error");
    }
    printf("OK\n");
    return 0;
}
Here's the command:
$ gcc -g -O3 madvice.c && ./a.out
Output:
madvise error: Invalid argument
You can't, and even if you could in certain cases with certain flags (and the flags you're trying to use here should be relatively harmless), you shouldn't. madvise operates on memory from lower-level allocations than malloc gives you, and messing with the memory that backs malloc will likely break malloc.
If you want some block of memory that you can call madvise on, you should obtain it using mmap.
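A minimal sketch of that approach follows. Note also that madvise() takes a single advice value per call rather than OR-able flags, so MADV_DONTDUMP (available since Linux 3.4) and MADV_SEQUENTIAL each need their own call:

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    size_t len = 4096 * 257; /* already a multiple of the page size */

    /* mmap returns page-aligned memory, which madvise requires */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* one advice value per call */
    if (madvise(p, len, MADV_SEQUENTIAL) == -1)
        perror("madvise(MADV_SEQUENTIAL)");
    if (madvise(p, len, MADV_DONTDUMP) == -1)
        perror("madvise(MADV_DONTDUMP)");

    munmap(p, len);
    return 0;
}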
Your usage of sizeof is wrong; you are allocating only four bytes of memory (sizeof unsigned int), and calling madvise() with a size argument of 1M for the same chunk of memory.
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(int argc, char *argv[])
{
    void *p_optimize_object;
    unsigned int optimize_object_size = 4096 * 256;

    optimize_object_size = ((optimize_object_size / 4096) + 1) * 4096;
    printf("optimize_object_size = %d\n", optimize_object_size);

    /* deliberately reproduces the bug: this allocates sizeof(unsigned int) == 4 bytes */
    p_optimize_object = malloc(sizeof(optimize_object_size));
    fprintf(stderr, "Allocated %zu bytes\n", sizeof(optimize_object_size));

    if (madvise(p_optimize_object, optimize_object_size, MADV_WILLNEED | MADV_SEQUENTIAL) == -1)
    {
        perror("madvise error");
    }
    printf("OK\n");
    return 0;
}
Output:
optimize_object_size = 1052672
Allocated 4 bytes
madvise error: Invalid argument
OK
UPDATE:
And the other problem is that malloc() can give you memory that is not page-aligned (typically aligned to 4, 8, or 16 bytes), whereas madvise() wants page-aligned memory:
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(int argc, char *argv[])
{
    void *p_optimize_object;
    unsigned int optimize_object_size = 4096 * 256;
    int rc;

    optimize_object_size = ((optimize_object_size / 4096) + 1) * 4096;
    printf("optimize_object_size = %d\n", optimize_object_size);

#if 0
    /* 4-byte allocation: reproduces the original bug */
    p_optimize_object = malloc(sizeof(optimize_object_size));
    fprintf(stderr, "Allocated %zu bytes\n", sizeof(optimize_object_size));
#elif 0
    /* full-size allocation, but not page-aligned */
    p_optimize_object = malloc(optimize_object_size);
    fprintf(stderr, "Allocated %u bytes\n", optimize_object_size);
#else
    /* page-aligned allocation: this one satisfies madvise() */
    rc = posix_memalign(&p_optimize_object, 4096, optimize_object_size);
    fprintf(stderr, "Allocated %u bytes:%d\n", optimize_object_size, rc);
#endif

    // if (madvise(p_optimize_object, optimize_object_size, MADV_WILLNEED | MADV_SEQUENTIAL) == -1)
    if (madvise(p_optimize_object, optimize_object_size, MADV_WILLNEED | MADV_DONTFORK) == -1)
    {
        perror("madvise error");
    }
    printf("OK\n");
    return 0;
}
OUTPUT:
$ ./a.out
optimize_object_size = 1052672
Allocated 1052672 bytes:0
OK
And the alignment requirement appears to be Linux-specific:
Linux Notes
The current Linux implementation (2.4.0) views this system call more as a command than as advice and hence may return an error when it cannot do what it usually would do in response to this advice. (See the ERRORS description above.) This is non-standard behavior.
The Linux implementation requires that the address addr be page-aligned, and allows length to be zero. If there are some parts of the specified address range that are not mapped, the Linux version of madvise() ignores them and applies the call to the rest (but returns ENOMEM from the system call, as it should).
Finally:
I tried to use the MADV_DONTDUMP to save some space in my binaries but it didn't work.
Which, of course, doesn't make sense. malloc or posix_memalign add to your address space, making (at least) the virtual size (VSZ) of your running program larger. What happens to this space is completely in the hands of the (kernel) memory manager, driven by your program's references to the particular memory, with maybe a few hints from madvise.
I tried to use the MADV_DONTDUMP to save some space in my binaries but it didn't work.
Read again, and more carefully, the madvise(2) man page.
The address should be page-aligned. The result of malloc is generally not page-aligned (the page size is often 4 Kbytes, but see sysconf(3) for _SC_PAGESIZE). Use mmap(2) to ask for a page-aligned segment in your virtual address space.
You won't save any space in your binary executable. You'll just save space in your core dump, see core(5). And core dumps should not happen. See signal(7) (read also about segmentation fault and undefined behaviour).
To disable core dumps, consider instead setrlimit(2) with RLIMIT_CORE (or the ulimit -c bash builtin in a terminal running bash).
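A minimal sketch of doing that from inside the program with setrlimit(2):

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    /* rlim_cur == 0 means "no core file is written" */
    struct rlimit no_core = { 0, 0 };
    if (setrlimit(RLIMIT_CORE, &no_core) != 0)
        perror("setrlimit");
    return 0;
}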
I'm trying to write a program that obtains 1 GB of memory from the system via malloc(1024*1024*1024).
After I get the start address of the memory, my limited understanding is that I can initialize it simply with memset(). But in fact it triggers a segfault after a while.
I used gdb to find the cause, and discovered that touching more than 128 MB of the memory leads to the fault.
Is there some rule that limits a program to accessing less than 128 MB? Or did I use the wrong way to allocate and initialize it?
If there is a need for additional information, please tell me.
Any suggestion will be appreciated.
[Platform]
Linux 4.10.1 with gcc 5.4.0
Build program with gcc test.c -o test
CPU: Intel i7-6700
RAM: 16GB
[Code]
size_t mem_size = 1024 * 1024 * 1024;
...
void *based = malloc(mem_size); // mem_size = 1024^3
int stage = 65536;
int initialized = 0;
if (based) {
    printf("Allocated %zu Bytes from %lx to %lx\n", mem_size, based, based + mem_size);
} else {
    printf("Error in allocation.\n");
    return 1;
}
int n = 0;
while (initialized < mem_size) { // initialize it in batches
    printf("%6d %lx-%lx\n", n++, based + initialized, based + initialized + stage);
    memset(based + initialized, '$', stage);
    initialized += stage;
}
[Result]
Allocated 1073741824 Bytes from 7f74c9e66010 to 7f76c9e66010
...
2045 7f7509ce6010-7f7509d66010
2046 7f7509d66010-7f7509de6010
2047 7f7509de6010-7f7509e66010
2048 7f7509e66010-7f7509ee6010 //2048*65536(B)=128(MB)
Segmentation fault (core dumped)
There are two possible issues here. The first is that you're not using malloc() correctly: you need to check whether it returns NULL before using the result.
The other issue could be that the OS is over-committing memory, and the out-of-memory (OOM) killer is terminating your process. You can disable memory overcommit and enable core dumps to detect this.
Edit
Two major problems:
Don't do operations with side effects (e.g. n++) inside a logging statement. This is VERY BAD practice, as logging calls are often removed at compile time in large projects, and then the program behaves differently.
Cast based to a (char *).
This should help with your problem.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    size_t mem_size = 1024 * 1024 * 1024;

    printf("MEMSIZE: %zu\n", mem_size);
    printf("SIZE OF: void*:%zu\n", sizeof(void*));
    printf("SIZE OF: char*:%zu\n", sizeof(char*));

    void *based = malloc(mem_size); // mem_size = 1024^3
    int stage = 65536;
    size_t initialized = 0;

    if (based) {
        printf("Allocated %zu Bytes from %p to %p\n", mem_size, based,
               (void *)((char *)based + mem_size));
    } else {
        printf("Error in allocation.\n");
        return 1;
    }

    int n = 0;
    while (initialized < mem_size) { // initialize it in batches
        //printf("%6d %p-%p\n", n, (char *)based + initialized, (char *)based + initialized + stage);
        n++;
        memset((char *)based + initialized, '$', stage);
        initialized += stage;
    }
    free(based);
    return 0;
}
I finally found the problem: the pointer type was wrong.
Here is the complete code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
    /* Allocate 1GB memory */
    size_t mem_size = 1024 * 1024 * 1024;

    // the problem was here: I used to declare the pointer as long long,
    // which misled the offset calculation
    char *based = malloc(mem_size);

    if (based) {
        printf("Allocated %zu Bytes from %p to %p\n", mem_size,
               (void *)based, (void *)(based + mem_size));
    } else {
        printf("Allocation Error.\n");
        return 1;
    }

    /* Initialize the memory in batches */
    size_t stage = 65536;
    size_t initialized = 0;
    while (initialized < mem_size) {
        memset(based + initialized, '$', stage);
        initialized += stage;
    }

    /* And then I set the breakpoint and check the memory content with gdb */
    ...
    return 0;
}
Thank you to the people who have given me advice or comments :)
It is very unusual for a process to need such a large chunk of contiguous memory, and yes, the kernel does impose such memory limitations. You should probably know that when malloc() handles a memory request larger than 128 KB, it calls mmap() behind the curtain. You should try to use that directly.
You should also know that the kernel's default policy when allocating is to promise more memory than it actually has.
The logic is that most allocated memory is not actually used, so it is relatively safe to allow allocations that exceed the actual memory of the system.
EDIT: As some people have pointed out, when your process starts to actually use the memory that was successfully allocated, it will get killed by the OOM killer. This code produced the following output:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    char *c = malloc(sizeof(char) * 1024 * 1024 * 1024 * 5);
    if (c)
    {
        printf("allocated memory\n");
        memset(c, 1, sizeof(char) * 1024 * 1024 * 1024 * 5);
        free(c);
    }
    else
    {
        printf("Out of memory\n");
    }
    return 0;
}
Output:
$ ./a.out
allocated memory
Killed
But after you change the limits of the system:
# echo 2 > /proc/sys/vm/overcommit_memory
# exit
exit
$ ./a.out
Out of memory
As you can see, the memory allocation succeeded on the system, and the problem appeared only after the memory was actually used.
:EDIT
There are limits that the kernel imposes on how much memory you can allocate and you can check them out with these commands:
grep -e MemTotal -e CommitLimit -e Committed_AS /proc/meminfo
ulimit -a
The first command will print the total memory and the second will display the limit that the kernel imposes on allocations (CommitLimit). The limit is based on your system memory and the over-commit ratio defined on your system that you can check with this command cat /proc/sys/vm/overcommit_ratio.
The Committed_AS is the memory that is already allocated to the system at the moment. You will notice that this can exceed the Total Memory without causing a crash.
You can change the default behavior of your kernel to never overcommit by writing echo 2 > /proc/sys/vm/overcommit_memory. You can check the man pages for more info on this.
I recommend checking the limits on your system and then disabling the kernel's default overcommit behavior. Then see whether your system can actually allocate that much memory by checking whether malloc() or mmap() fails.
sources: LSFMM: Improving the out-of-memory killer
and Mastering Embedded Linux Programming by Chris Simmonds
I am using pread to obtain a huge amount of data at one time.
But if I try to gather a large amount of data (for instance 100 MB) and save it into an array, I get a segfault.
Is there a hard limit on the maximum number of bytes pread can read?
#define _FILE_OFFSET_BITS 64
#define BLKGETSIZE64 _IOR(0x12,114,size_t)
#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <inttypes.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int readdata(int fp, uint64_t seekpoint, uint64_t seekwidth) {
    int16_t buf[seekwidth]; /* variable-length array on the stack */
    if (pread(fp, buf, seekwidth, seekpoint) == seekwidth) {
        printf("SUCCES READING AT: %"PRIu64"| WITH READ WIDTH: %"PRIu64"\n", seekpoint, seekwidth);
        return 1;
    } else {
        printf("ERROR READING AT: %"PRIu64"| WITH READ WIDTH: %"PRIu64"\n", seekpoint, seekwidth);
        return 2;
    }
}

int main() {
    uint64_t readwith, offset;
    int fp = open("/dev/sdc", O_RDWR);

    readwith = 10000; offset = 0;
    readdata(fp, offset, readwith);
    readwith = 100000; offset = 0;
    readdata(fp, offset, readwith);
    readwith = 1000000; offset = 0;
    readdata(fp, offset, readwith);
    readwith = 10000000; offset = 0;
    readdata(fp, offset, readwith);
    readwith = 10000000; offset = 0;
    readdata(fp, offset, readwith);
    readwith = 100000000; offset = 0;
    readdata(fp, offset, readwith);
    readwith = 1000000000; offset = 0;
    readdata(fp, offset, readwith);
    readwith = 10000000000; offset = 0;
    readdata(fp, offset, readwith);
    readwith = 100000000000; offset = 0;
    readdata(fp, offset, readwith);
    readwith = 1000000000000; offset = 0;
    readdata(fp, offset, readwith);
    readwith = 10000000000000; offset = 0;
    readdata(fp, offset, readwith);
    readwith = 100000000000000; offset = 0;
    readdata(fp, offset, readwith);
    readwith = 1000000000000000; offset = 0;
    readdata(fp, offset, readwith);
    close(fp);
}
There is no hard limit on the maximum number of bytes that pread can read. However, reading that large an amount of data in one contiguous block is probably a bad idea. There are a few alternatives I'll describe later.
In your particular case, the problem is likely that you are trying to stack allocate the buffer. There is a limited amount of space available to the stack; if you run cat /proc/<pid>/limits, you can see what that is for a particular process (or just cat /proc/self/limits to check for the shell that you're running). In my case, that happens to be 8388608, or 8 MB. If you try to use more than this limit, you will get a segfault.
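You can also query the limit from inside the program; a small sketch using getrlimit(2) with RLIMIT_STACK:

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit st;
    if (getrlimit(RLIMIT_STACK, &st) == 0)
        /* commonly 8388608 (8 MB) on desktop Linux */
        printf("stack soft limit: %llu bytes\n",
               (unsigned long long)st.rlim_cur);
    return 0;
}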
You can increase the maximum stack size using setrlimit or the Linux-specific prlimit, but that's generally not considered something good to do; your stack is something that is permanently allocated to each thread, so increasing the size increases how much address space each thread has allocated to it. On a 32 bit system (which are becoming less relevant, but there are still 32 bit systems out there, or 32 bit applications on 64 bit systems), this address space is actually fairly limited, so just a few threads with a large amount of stack space allocated could exhaust your address space. It would be better to take an alternate approach.
One such alternate approach is to use malloc to dynamically allocate your buffer. Then you will only use this space when you need it, not all the time for your whole stack. Yes, you do have to remember to free the data afterwards, but that's not all that hard with a little bit of careful thought in your programming.
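For illustration, a heap-allocated variant of the question's readdata() might look like this sketch (same name and arguments as the question; the buffer is plain bytes here, sized to match the number of bytes pread is asked for):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <inttypes.h>

int readdata(int fp, uint64_t seekpoint, uint64_t seekwidth) {
    /* heap allocation instead of a variable-length array on the stack */
    void *buf = malloc(seekwidth);
    if (buf == NULL) {
        printf("ERROR ALLOCATING %"PRIu64" BYTES\n", seekwidth);
        return 2;
    }

    int rc;
    if (pread(fp, buf, seekwidth, seekpoint) == (ssize_t)seekwidth) {
        printf("SUCCESS READING AT: %"PRIu64"| WITH READ WIDTH: %"PRIu64"\n", seekpoint, seekwidth);
        rc = 1;
    } else {
        printf("ERROR READING AT: %"PRIu64"| WITH READ WIDTH: %"PRIu64"\n", seekpoint, seekwidth);
        rc = 2;
    }
    free(buf);
    return rc;
}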
Another approach, that can be good for large amounts of data like this, is to use mmap to map the file into your address space instead of trying to read the whole thing into a buffer. What this does is allocate a region of address space, and any time you access that address space, the data will be read from that file to populate the page that you are reading from. This can be very handy when you want random access to the file, but will not actually be reading the whole thing, you will be instead skipping around the file. Only the pages that you actually access will be read, rather than wasting time reading the whole file into a buffer and then accessing only portions of it.
If you use mmap, you will need to remember to munmap the address space afterwards, though on a 64-bit system forgetting to munmap matters a lot less than forgetting to free allocated memory (on a 32-bit system, address space is actually at a premium, so leaving around large mappings can still cause problems). mmap will only use up address space, not actual memory; since it's backed by a file, the kernel can respond to memory pressure by writing out any dirty data to disk and dropping the in-memory contents, whereas an allocated buffer must be preserved in RAM or swap space, which are generally fairly limited resources. And if you're only reading through the mapping, the kernel doesn't even have to flush dirty data to disk; it can simply free up the page and reuse it, reading the data back in if you access the page again.
If you don't need random access to all of that data at once, it's probably better to just read and process the data in smaller chunks, in a loop. Then you can use stack allocation for its simplicity, without worrying about increasing the amount of address space allocated to your stack.
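And a sketch of that chunked-loop pattern, reusing the question's /dev/sdc as the input (the 64 KB chunk size is an arbitrary choice):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    char buf[65536]; /* small enough to live comfortably on the stack */
    int fd = open("/dev/sdc", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    off_t offset = 0;
    ssize_t got;
    while ((got = pread(fd, buf, sizeof buf, offset)) > 0) {
        /* ... process buf[0..got) here ... */
        offset += got;
    }
    if (got < 0)
        perror("pread");
    close(fd);
    return 0;
}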
edit to add: Based on your sample code and your other question, you seem to be trying to read an entire 2TB disk as a single array. In this case, you will definitely need to use mmap, as you likely don't have enough RAM to hold the entire contents in memory. Here's an example; note that this particular example is specific to Linux:
#include <stdio.h>
#include <err.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc != 2)
        errx(1, "Wrong number of arguments");

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0)
        err(2, "Failed to open %s", argv[1]);

    struct stat statbuf;
    if (fstat(fd, &statbuf) != 0)
        err(3, "Failed to stat %s", argv[1]);

    size_t size;
    if (S_ISREG(statbuf.st_mode)) {
        size = statbuf.st_size;
    } else if (S_ISBLK(statbuf.st_mode)) {
        if (ioctl(fd, BLKGETSIZE64, &size) != 0)
            err(4, "Failed to get size of block device %s", argv[1]);
    }
    printf("Size: %zd\n", size);

    char *mapping = mmap(0, size, PROT_READ, MAP_SHARED, fd, 0);
    if (MAP_FAILED == mapping)
        err(5, "Failed to map %s", argv[1]);

    /* do something with `mapping` */

    munmap(mapping, size); /* unmap with the mapped size, not st_size */
    return 0;
}
I've been working with large sparse files on openSUSE 11.2 x86_64. When I try to mmap() a 1TB sparse file, it fails with ENOMEM. I would have thought that the 64 bit address space would be adequate to map in a terabyte, but it seems not. Experimenting further, a 1GB file works fine, but a 2GB file (and anything bigger) fails. I'm guessing there might be a setting somewhere to tweak, but an extensive search turns up nothing.
Here's some sample code that shows the problem - any clues?
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    char *filename = argv[1];
    int fd;
    off_t size = 1UL << 40; // 30 == 1GB, 40 == 1TB

    fd = open(filename, O_RDWR | O_CREAT | O_TRUNC, 0666);
    ftruncate(fd, size);
    printf("Created %ld byte sparse file\n", size);

    char *buffer = (char *)mmap(NULL, (size_t)size, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);
    if (buffer == MAP_FAILED) {
        perror("mmap");
        exit(1);
    }
    printf("Done mmap - returned 0x0%lx\n", (unsigned long)buffer);

    strcpy(buffer, "cafebabe");
    printf("Wrote to start\n");
    strcpy(buffer + (size - 9), "deadbeef");
    printf("Wrote to end\n");

    if (munmap(buffer, (size_t)size) < 0) {
        perror("munmap");
        exit(1);
    }
    close(fd);
    return 0;
}
The problem was that the per-process virtual memory limit was set to only 1.7GB. ulimit -v 1610612736 set it to 1.5TB and my mmap() call succeeded. Thanks, bmargulies, for the hint to try ulimit -a!
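For reference, the same fix can be applied programmatically before calling mmap(); a sketch raising the soft RLIMIT_AS limit to the hard limit:

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit as_lim;

    /* raise the virtual-memory soft limit as far as the hard limit allows */
    if (getrlimit(RLIMIT_AS, &as_lim) == 0) {
        as_lim.rlim_cur = as_lim.rlim_max; /* may be RLIM_INFINITY */
        if (setrlimit(RLIMIT_AS, &as_lim) != 0)
            perror("setrlimit");
    }
    return 0;
}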
Is there some sort of per-user quota, limiting the amount of memory available to a user process?
My guess is that the kernel is having difficulty allocating the memory it needs to keep track of this memory mapping. I don't know how swapped-out pages are tracked in the Linux kernel (and I assume most of the file would be in the swapped-out state most of the time), but it may end up needing an entry for each page of memory the file occupies in a table. Since this file might be mmapped by more than one process, the kernel has to track the mapping from each process's point of view, which maps to another point of view, which maps to secondary storage (and includes fields for device and location).
This would fit into your addressable space, but might not fit (at least contiguously) within physical memory.
If anyone knows more about how Linux does this I'd be interested to hear about it.