What determines how much memory can be allocated? - c

This is a follow-up to my previous question about why size_t is necessary.
Given that size_t is guaranteed to be big enough to represent the largest size of a block of memory you can allocate (meaning there can still be some integers bigger than size_t), my question is...
What determines how much you can allocate at once?

The architecture of your machine, the operating system (the two are intertwined), and your compiler and set of libraries determine how much memory you can allocate at once.
malloc doesn't need to be able to use all the memory the OS could give it. The OS doesn't need to make available all the memory present in the machine (various versions of Windows Server, for example, have different maximum memory sizes for licensing reasons).
But note that the OS can make available more memory than is present in the machine, and even more than the motherboard permits (say the motherboard has a single memory slot that accepts only a 1 GB stick; Windows could still let a program allocate 2 GB of memory). This is done through the use of virtual memory and paging (you know, the swap file, your old and slow friend :-) or, for example, through the use of NUMA.

I can think of three constraints, in actual code:
The largest value size_t can represent. size_t should be the same type (same size, etc.) that the OS's memory allocation mechanism uses.
The biggest block the operating system is able to handle in RAM (how is a block's size represented, and how does that representation affect the maximum block size?).
Memory fragmentation (largest free block) and the total available free RAM.
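
To see where these constraints land on a particular machine, one rough approach is to binary-search the largest size a single malloc() call will accept. This is only a sketch: the helper name largest_single_malloc is made up for illustration, it assumes malloc(SIZE_MAX) fails, and because the memory is never touched, on overcommitting systems the answer says more about address space than about free RAM.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Binary-search the largest size a single malloc() call accepts right now.
 * Blocks are freed immediately and never written to. */
size_t largest_single_malloc(void)
{
    size_t lo = 0;          /* largest size known to succeed */
    size_t hi = SIZE_MAX;   /* assumed to fail */

    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        void *p = malloc(mid);
        if (p != NULL) {
            free(p);
            lo = mid;       /* mid worked, look higher */
        } else {
            hi = mid;       /* mid failed, look lower */
        }
    }
    assert(lo < SIZE_MAX);
    return lo;
}
```

Printing the result with printf("%zu\n", largest_single_malloc()) on a few different systems, or in 32-bit versus 64-bit builds, should show how much the number depends on the platform rather than on size_t alone.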


Is there a performance cost to using large mmap calls that go beyond expected memory usage?

Edit: On systems that use on-demand paging
For initializing data structures that are both persistent for the duration of the program and require a dynamic amount of memory is there any reason not to mmap an upper bound from the start?
An example is an array that will persist for the entire program's life but whose final size is unknown. The approach I am most familiar with is something along the lines of:
type * array = malloc(size);
and when the array has reached capacity doubling it with:
array = realloc(array, 2 * size);
size *= 2;
I understand this is probably the best way to do this if the array might be freed mid-execution so that its VM can be reused, but if it is persistent is there any reason not to just initialize the array as follows:
array = mmap(0, upper_bound, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
so that the elements never need to be copied.
Edit: Specifically for an OS that uses on-demand paging.
Don't try to be smarter than the standard library, unless you 100% know what you are doing.
malloc() already does this for you. If you request a large amount of memory, malloc() will mmap() you a dedicated memory area. If what you are concerned about is the performance hit coming from doing size *= 2; realloc(old, size), then just malloc(huge_size) at the beginning, and then keep track of the actual used size in your program. There really is no point in doing an mmap() unless you explicitly need it for some specific reason: it isn't faster nor better in any particular way, and if malloc() thinks it's needed, it will do it for you.
It's fine to allocate upper bounds as long as:
You're building a 64bit program: 32bit ones have restricted virtual space, even on 64bit CPUs
Your upper bounds don't approach 2^47, as a mathematically derived one might
You're fine with crashing as your out-of-memory failure mode
You'll only run on systems where overcommit is enabled
As a side note, an end user application doing this may want to borrow a page from GHC's book and allocate 1TB up front even if 10GB would do. This unrealistically large amount will ensure that users don't confuse virtual memory usage with physical memory usage.
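
Under those conditions, the up-front reservation could look roughly like the following. This is a sketch rather than a recommendation: the function names are invented for illustration, MAP_NORESERVE is a Linux flag, and it assumes a 64-bit build with overcommit enabled.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS and MAP_NORESERVE on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Reserve upper_bound bytes of zero-filled virtual memory.
 * Physical pages are only assigned when a page is first touched.
 * Returns NULL on failure. */
void *reserve_upper_bound(size_t upper_bound)
{
    void *p = mmap(NULL, upper_bound, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Release a region obtained from reserve_upper_bound(). Returns 0 on success. */
int release_upper_bound(void *p, size_t upper_bound)
{
    return munmap(p, upper_bound);
}
```

The program still has to track how many elements are in use; the mapping only guarantees that the array never moves, not that the physical memory will be there when a new page is touched.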
If you know for a fact that wasting a chunk of memory (most likely an entire page which is likely 4096 bytes) will not cause your program or the other programs running on your system to run out of memory, AND you know for a fact that your program will only ever be compiled and run on UNIX machines, then this approach is not incorrect, but it is not good programming practice for the following reasons:
The <stdlib.h> file you #include to use malloc() and free() in your C programs is specified by the C standard, but it is specifically implemented for your architecture by the writers of the operating system. This means that your specific system was kept in mind when these functions were written, so finding a sneaky way to improve efficiency for memory allocation is unlikely unless you know the inner workings of memory management in your OS better than those who wrote it.
Furthermore, the <sys/mman.h> file you include to mmap() stuff is not part of the C standard, and will only compile on UNIX machines, which reduces the portability of your code.
There's also a really good chance (assuming a UNIX environment) that malloc() and realloc() already use mmap() behind-the-scenes to allocate memory for your process anyway, so it's almost certainly better to just use them. (read that as "realloc doesn't necessarily actively allocate more space for me, because there's a good chance there's already a chunk of memory that my process has control of that can satisfy my new memory request without calling mmap() again")
Hope that helps!

Fortran: insufficient virtual memory

I - not a professional software engineer - am currently extending a quite large scientific software.
At runtime I get an error stating "insufficient virtual memory".
At this point during runtime, the working memory in use is about 550 MB, and the error occurs when a rather big three-dimensional array is dynamically allocated. The array, if it were allocated, would be about 170 MB in size. Adding this to the 550 MB already in use, the program would still be well below the 2 GB boundary that is set for 32-bit applications. There is also more than enough working memory available on the system.
Visual Studio is currently set that it allocates arrays on the stack. Allocating them on the heap does not make any difference anyway.
Splitting the array into smaller arrays (being the size of the one big array in sum) results in the program running just fine. So I guess that the dynamically allocated memory has to be available in one adjacent block.
So there I am and I have no clue how to solve this. I can not deallocate some of the already used 550mb as the data is still required. I also can not change very much of the configuration (e.g. the compiler).
Is there a solution for my problem?
Thank you so much in advance and best regards
The virtual memory is the memory your program can address. It is usually the sum of the physical memory and the swap space. For example, if you have 16GB of physical memory and 4GB of swap space, the virtual memory will be 20GB. If your Fortran program tries to allocate more than those 20 addressable GB, you will get an "insufficient virtual memory" error.
To get an idea of the required memory of your 3D array:
allocate (A(nx,ny,nz))
You have nx*ny*nz elements and each element takes 8 bytes in double precision or 4 bytes in single precision. I'll let you do the math.
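
For anyone who would rather have the arithmetic spelled out, here is a small sketch in C (the question is about Fortran, but the math is the same). The function name array3d_bytes and the 280 x 280 x 280 dimensions in the note below are made up for illustration; they are not taken from the question.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for an nx*ny*nz array whose elements are elem_size bytes each.
 * Returns 0 if the product would overflow size_t (or if any factor is 0). */
size_t array3d_bytes(size_t nx, size_t ny, size_t nz, size_t elem_size)
{
    size_t dims[3] = { nx, ny, nz };
    size_t total = elem_size;

    for (int i = 0; i < 3; i++) {
        if (dims[i] != 0 && total > SIZE_MAX / dims[i])
            return 0;               /* multiplication would overflow */
        total *= dims[i];
    }
    return total;
}
```

With these hypothetical dimensions, array3d_bytes(280, 280, 280, 8) gives 175616000 bytes, i.e. roughly 176 MB in double precision, and half that in single precision.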
Some things:
1. It is usually preferable to allocate huge arrays using operating system services rather than language facilities. That will circumvent any underlying library problems.
2. You may have a problem with 550 MB in a 32-bit system. Usually there is some division of the 4 GB address space into dedicated regions.
3. You need to make sure you have enough virtual memory.
a) Make sure your page file space is large enough.
b) Make sure that your system is not configured to limit process address space sizes to smaller than what you need.
c) Make sure that your account's settings are not limiting your process address space to smaller than allowed by the system.

Under what circumstances can malloc return NULL?

It has never happened to me, and I've been programming for years now.
Can someone give me an example of a non-trivial program in which malloc will actually not work?
I'm not talking about memory exhaustion: I'm looking for the simple case where allocating just one memory block, with a bounded size given by the user (let's say an integer), causes malloc to fail.
You need to do some work in embedded systems, you'll frequently get NULL returned there :-)
It's much harder to run out of memory in modern massive-address-space-and-backing-store systems but still quite possible in applications where you process large amounts of data, such as GIS or in-memory databases, or in places where your buggy code results in a memory leak.
But it really doesn't matter whether you've never experienced it before - the standard says it can happen so you should cater for it. I haven't been hit by a car in the last few decades either but that doesn't mean I wander across roads without looking first.
And re your edit:
I'm not talking about memory exhaustion, ...
the very definition of memory exhaustion is malloc not giving you the desired space. It's irrelevant whether that's caused by allocating all available memory, or heap fragmentation meaning you cannot get a contiguous block even though the aggregate of all free blocks in the memory arena is higher, or artificially limiting your address space usage, such as by using the standards-compliant function:
void *malloc (size_t sz) { return NULL; }
The C standard doesn't distinguish between modes of failure, only that it succeeds or fails.
Just try to malloc more memory than your system can provide (either by exhausting your address space, or virtual memory - whichever is smaller).
malloc(SIZE_MAX)
will probably do it. If not, repeat a few times until you run out.
Any program at all written in C that needs to dynamically allocate more memory than the OS currently allows.
For fun, if you are using Ubuntu, type in
ulimit -v 5000
Any program you run will most likely crash (due to a malloc failure) as you've limited the amount of memory available to any one process to a paltry amount.
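
The same limit can be applied from inside a program, which makes for a contained experiment. The sketch below assumes Linux, where ulimit -v corresponds to RLIMIT_AS; the function name and the sizes in the note are arbitrary choices for illustration.

```c
#include <stdlib.h>
#include <sys/resource.h>

/* Temporarily cap the address space at limit bytes, try one malloc of
 * request bytes, then restore the previous soft limit.
 * Returns 1 if malloc returned NULL, 0 if it succeeded, -1 on setup error. */
int malloc_fails_under_limit(rlim_t limit, size_t request)
{
    struct rlimit old, capped;

    if (getrlimit(RLIMIT_AS, &old) != 0)
        return -1;

    capped = old;
    capped.rlim_cur = limit;            /* lower only the soft limit */
    if (setrlimit(RLIMIT_AS, &capped) != 0)
        return -1;

    void *p = malloc(request);
    int failed = (p == NULL);
    if (p != NULL)
        *(volatile char *)p = 1;        /* keeps the compiler from eliding the allocation */
    free(p);

    setrlimit(RLIMIT_AS, &old);         /* a soft limit can be raised back up to the hard limit */
    return failed;
}
```

For example, malloc_fails_under_limit((rlim_t)64 << 20, (size_t)256 << 20) caps the process at 64 MiB and asks for 256 MiB, which should come back as NULL on a typical Linux/glibc setup.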
Unless your memory is already completely reserved (or heavily fragmented), the only way to have malloc() return a NULL-pointer is to request space of size zero:
char *foo = malloc(0);
Citing from the C99 standard, §7.20.3, subsection 1:
If the size of the space requested is zero, the behavior is implementation-defined: either a null pointer is returned, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object.
In other words, malloc(0) may return a NULL-pointer or a valid pointer to zero allocated bytes.
Pick any platform, though embedded is probably easier. malloc (or new) a ton of RAM (or leak RAM over time or even fragment it by using naive algorithms). Boom. malloc does return NULL for me on occasion when "bad" things are happening.
In response to your edit. Yes again. Memory fragmentation over time can make it so that even a single allocation of an int can fail. Also keep in mind that malloc doesn't just allocate 4 bytes for an int, but can grab as much space as it wants. It has its own book-keeping stuff and quite often will grab 32-64 bytes minimum.
On a more-or-less standard system, using a standard one-parameter malloc, there are three possible failure modes (that I can think of):
The size of allocation requested is not allowed. E.g., some systems may not allow an allocation > 16M, even if more storage is available.
A contiguous free area of the size requested, with default boundary, cannot be located in the heap. There may still be plenty of heap, but just not enough in one piece.
The total allocated heap has exceeded some "artificial" limit. E.g., the user may be prohibited from allocating more than 100M, even if there's 200M free and available to the "system" in a single combined heap.
(Of course, you can get combinations of 2 and 3, since some systems allocate non-contiguous blocks of address space to the heap as it grows, placing the "heap size limit" on the total of the blocks.)
Note that some environments support additional malloc parameters such as alignment and pool ID which can add their own twists.
Just check the manual page of malloc.
On success, a pointer to the memory block allocated by the function.
The type of this pointer is always void*, which can be cast to the desired type of data pointer in order to be dereferenceable.
If the function failed to allocate the requested block of memory, a null pointer is returned.
Yes. malloc will return NULL when the kernel/system lib are certain that no memory can be allocated.
The reason you typically don't see this on modern machines is that malloc doesn't really allocate memory; rather, it requests that some “virtual address space” be reserved for your program so you might write in it. Kernels such as modern Linux actually overcommit, that is, they let you allocate more memory than your system can actually provide (swap + RAM) as long as it all fits in the address space of the system (typically 48 bits on 64-bit platforms, IIRC). Thus on these systems you will probably trigger the OOM killer before you trigger a return of a NULL pointer. A good example is a 32-bit machine with 512 MB of RAM: it's trivial to write a C program that will be eaten by the OOM killer because it tries to malloc all available RAM + swap.
(Overcommitting can be disabled at run time on Linux through the vm.overcommit_memory sysctl, so whether a given system overcommits depends on its configuration. However, stock desktop distros enable it by default.)
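
On Linux, the policy currently in force can be read from /proc/sys/vm/overcommit_memory. A minimal sketch, with a made-up function name, that simply gives up if the file is not there:

```c
#include <stdio.h>

/* Read the Linux overcommit policy: 0 = heuristic, 1 = always, 2 = never.
 * Returns -1 if the file cannot be read (e.g. not on Linux). */
int read_overcommit_mode(void)
{
    int mode = -1;
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;                  /* unexpected file contents */
    fclose(f);
    return mode;
}
```

A result of 2 means the kernel does strict accounting, and that is the configuration where malloc returning NULL becomes much easier to observe.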
Since you asked for an example, here's a program that will (eventually) see malloc return NULL:
perror();void*malloc();main(){for(;;)if(!malloc(999)){perror(0);return 0;}}
What? You don't like deliberately obfuscated code? ;) (If it runs for a few minutes and doesn't crash on your machine, kill it, change 999 to a bigger number and try again.)
EDIT: If it doesn't work no matter how big the number is, then what's happening is that your system is saying "Here's some memory!" but so long as you don't try to use it, it doesn't get allocated. In which case:
perror();char*p;void*malloc();main(){for(;;){p=malloc(999);if(p)*p=0;else{perror(0);return 0;}}}
Should do the trick. If we can use GCC extensions, I think we can get it even smaller by changing char*p;void*malloc(); to void*p,*malloc(); but if you really wanted to golf you'd be on the Code Golf SE.
When the malloc parameter is negative or 0, or when you have no memory left on the heap.
I had to correct somebody's code which looked like this.
const int8_t bufferSize = 128;
void *buffer = malloc(bufferSize);
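Here buffer is NULL because bufferSize is actually -128: 128 does not fit in an int8_t and typically wraps to -128, and that negative value is converted to an enormous size_t when it is passed to malloc.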
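
To make the conversion visible, here is a tiny sketch of what malloc ends up receiving. The helper name is invented for illustration; it only performs the same int8_t to size_t conversion that happens implicitly at the call.

```c
#include <stddef.h>
#include <stdint.h>

/* The size malloc() actually sees when handed an int8_t:
 * a negative value is converted to size_t modulo SIZE_MAX + 1. */
size_t size_seen_by_malloc(int8_t n)
{
    return (size_t)n;
}
```

size_seen_by_malloc(-128) is SIZE_MAX - 127, which no allocator can satisfy, so the NULL is a plain out-of-memory failure in disguise.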

How to find how much memory is actually used up by a malloc call?

If I call:
char *myChar = (char *)malloc(sizeof(char));
I am likely to be using more than 1 byte of memory, because malloc is likely to be using some memory on its own to keep track of free blocks in the heap, and it may effectively cost me some memory by always aligning allocations along certain boundaries.
My question is: Is there a way to find out how much memory is really used up by a particular malloc call, including the effective cost of alignment, and the overhead used by malloc/free?
Just to be clear, I am not asking to find out how much memory a pointer points to after a call to malloc. Rather, I am debugging a program that uses a great deal of memory, and I want to be aware of which parts of the code are allocating how much memory. I'd like to be able to have internal memory accounting that very closely matches the numbers reported by top. Ideally, I'd like to be able to do this programmatically on a per-malloc-call basis, as opposed to getting a summary at a checkpoint.
There isn't a portable solution to this, however there may be operating-system specific solutions for the environments you're interested in.
For example, with glibc on Linux, you can use the mallinfo() function from <malloc.h> which returns a struct mallinfo. The uordblks and hblkhd members of this structure contain the dynamically allocated address space used by the program including book-keeping overhead - if you take the difference of this before and after each malloc() call, you will know the amount of space used by that call. (The overhead is not necessarily constant for every call to malloc()).
Using your example:
char *myChar;
size_t s = sizeof(char);
struct mallinfo before, after;
int mused;
before = mallinfo();
myChar = malloc(s);
after = mallinfo();
mused = (after.uordblks - before.uordblks) + (after.hblkhd - before.hblkhd);
printf("Requested size %zu, used space %d, overhead %zu\n", s, mused, mused - s);
Really though, the overhead is likely to be pretty minor unless you are making a very very high number of very small allocations, which is a bad idea anyway.
It really depends on the implementation. You should really use some memory debugger. On Linux Valgrind's Massif tool can be useful. There are memory debugging libraries like dmalloc, ...
That said, typical overhead:
1 int for storing size + flags of this block.
possibly 1 int for storing size of previous/next block, to assist in coalescing blocks.
2 pointers, but these may only be used in free()'d blocks, being reused for application storage in allocated blocks.
Alignment to an appropriate type, e.g. double.
-1 int (yes, that's a minus) of the next/previous chunk's field containing our size if we are an allocated block, since we cannot be coalesced until we're freed.
So, a minimum size can be 16 to 24 bytes, and minimum overhead can be 4 bytes.
But you could also satisfy every allocation via mapping memory pages (typically 4 KB), which would mean overhead for smaller allocations would be huge. I think OpenBSD does this.
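
With glibc you can at least see the rounding part of that overhead directly. The sketch below uses malloc_usable_size(), which is a glibc extension rather than standard C; the wrapper name is made up, and the hidden header in front of the block is not included in the number it reports.

```c
#include <malloc.h>   /* malloc_usable_size(): glibc extension, not standard C */
#include <stdlib.h>

/* Usable bytes in the block malloc() hands back for a request of
 * request bytes. Returns 0 if the allocation fails. */
size_t usable_size_for(size_t request)
{
    void *p = malloc(request);
    if (p == NULL)
        return 0;

    size_t usable = malloc_usable_size(p);
    free(p);
    return usable;
}
```

Comparing usable_size_for(1) with usable_size_for(100) gives a feel for the minimum chunk size and the alignment rounding on a given allocator.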
There is nothing defined in the C library to query the total amount of physical memory used by a malloc() call. The amount of memory allocated is controlled by whatever memory manager is hooked up behind the scenes that malloc() calls into. That memory manager can allocate as much extra memory as it deems necessary for its internal tracking purposes, on top of whatever extra memory the OS itself requires. When you call free(), it accesses the memory manager, which knows how to access that extra memory so it all gets released properly, but there is no way for you to know how much memory that involves. If you need that much fine detail, then you need to write your own memory manager.
If you do use valgrind/Massif, there's an option to show either the malloc value or the top value, which differ a LOT in my experience. Here's an excerpt from the Valgrind manual http://valgrind.org/docs/manual/ms-manual.html :
...However, if you wish to measure all the memory used by your program,
you can use the --pages-as-heap=yes. When this option is enabled,
Massif's normal heap block profiling is replaced by lower-level page
profiling. Every page allocated via mmap and similar system calls is
treated as a distinct block. This means that code, data and BSS
segments are all measured, as they are just memory pages. Even the
stack is measured...

How can I reserve memory addresses without allocating them

I would like (in *nix) to allocate a large, contiguous address space, but without consuming resources straight away, i.e. I want to reserve an address range and allocate from it later.
Suppose I do foo=malloc(3*1024*1024*1024) to allocate 3G, but on a 1G computer with 1G of swap file. It will fail, right?
What I want to do is say "Give me a memory address range foo...foo+3G into which I will be allocating" so I can guarantee all allocations within this area are contiguous, but without actually allocating straight away.
In the example above, I want to follow the foo=reserve_memory(3G) call with a bar=malloc(123) call which should succeed since reserve_memory hasn't consumed any resources yet, it just guarantees that bar will not be in the range foo...foo+3G.
Later I would do something like allocate_for_real(foo,0,234) to consume bytes 0..234 of foo's range. At this point, the kernel would allocate some virtual pages and map them to foo...foo+123+N
Is this possible in userspace?
(The point of this is that objects in foo... need to be contiguous and cannot reasonably be moved after they are created.)
Thank you.
Short answer: it already works that way.
Slightly longer answer: the bad news is that there is no special way of reserving a range, but not allocating it. However, the good news is that when you allocate a range, Linux does not actually allocate it, it just reserves it for use by you, later.
The default behavior of Linux is to always accept a new allocation, as long as there is address range left. When you actually start using the memory though, there better be some memory or at least swap backing it up. If not, the kernel will kill a process to free memory, usually the process which allocated the most memory.
So the problem in Linux with default settings gets shifted from, "how much can I allocate", into "how much can I allocate and then still be alive when I start using the memory?"
Here is some info on the subject.
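
If you want the reserve-then-commit split to be explicit anyway, one common approach on Linux is to map the range with PROT_NONE and later open up pieces of it with mprotect(). This is a sketch under those assumptions: it borrows the names reserve_memory and allocate_for_real from the question, the signatures are my own guess at what was intended, and MAP_NORESERVE is Linux-specific.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS and MAP_NORESERVE on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Reserve len bytes of address space without making them accessible.
 * Returns NULL on failure. */
void *reserve_memory(size_t len)
{
    void *p = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Make [foo + offset, foo + offset + len) readable and writable.
 * offset must be a multiple of the page size. Returns 0 on success. */
int allocate_for_real(void *foo, size_t offset, size_t len)
{
    return mprotect((char *)foo + offset, len, PROT_READ | PROT_WRITE);
}
```

Touching any part of the range that has not been opened up this way faults immediately, which also catches stray writes past the committed portion. Pages are still only backed by physical memory when first written, so the caveat about running out later still applies.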
I think, a simple way would be to do that with a large static array.
On any modern system this will not be mapped to existing memory (in the executable file on disk or in the RAM of the machine running it) until you actually access it. Once you access it (and the system has enough resources), it will be miraculously initialized to all zeros.
And your program will seriously slow down once you reach the limit of physical memory and then randomly crash if you run out of swap.
