Confusion after counting maximum allocation that can be done by malloc() - c

I have the following piece of C code running on Ubuntu. It counts how many gigabytes the OS will let me allocate via malloc().
#include <sys/types.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
int main(){
int count = 0;
char* a;
while (1){
a = (char*)malloc(1024*1024*1024);
if (a==NULL) break;
count++;
}
printf("%d\n", count);
return 0;
}
Surprisingly, on my machine it prints more than 100,000, which seems unreasonable. I have 8 GB of RAM and a hard disk of about 500 GB, so where does this 100,000 come from?

This is memory overcommit:
[...]Under the default memory management strategy, malloc() essentially always succeeds, with the kernel assuming you're not really going to use all of the memory you just asked for. The malloc()'s will continue to succeed, but not until you actually try to use the memory you allocated will the kernel 'really' allocate it. [...]
If we look at a Linux man page for malloc it says (emphasis mine):
By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available. In case it turns out that the system is out of memory, one or more processes will be killed by the OOM killer.
and:
For more information, see the description of /proc/sys/vm/overcommit_memory and /proc/sys/vm/oom_adj in proc(5), and the Linux kernel source file Documentation/vm/overcommit-accounting.

The operating system allows you to allocate much more memory than it has available. If you actually try to use the memory, it will run out.
If you want to see it run out, try, e.g., doing a memset on the memory after each successful malloc.

Related

Cannot Exhaust Physical Memory in 32 bit Linux

So I've got an interesting OS based problem for you. I've spent the last few hours conversing with anyone I know who's experienced with C programming, and nobody seems to be able to come up with a definitive answer as to why this behaviour is occurring.
I have a program that is intentionally designed to cause an extreme memory leak, (as an example of what happens when you don't free memory after allocating it). On 64 bit operating systems, (Windows, Linux, etc), it does what it should do. It fills physical ram, then fills the swap space of the OS. In Linux, the process is then terminated by the OS. In Windows however, it is not, and it continues running. The eventual result is a system crash.
Here's the code:
#include <stdlib.h>
#include <stdio.h>
void main()
{
while(1)
{
int *a;
a = (int*)calloc(65536, 4);
}
}
However, if you compile and run this code on a 32 bit Linux distribution, it has no effect on physical memory usage at all. It uses approximately 1% of my 4 GB of allocated RAM, and it never rises after that. I don't have a legitimate copy of 32 Bit Windows to test on, so I can't be certain this occurs on 32 bit Windows as well.
Can somebody please explain why the use of calloc will fill the physical ram of a 64 bit Linux OS, but not a 32 bit Linux OS?
The malloc and calloc functions do not technically allocate memory, despite their name. They actually allocate portions of your program's address space with OS-level read/write permissions. This is a subtle difference and is not relevant most of the time.
This program, as written, only consumes address space. Eventually, calloc will start returning NULL but the program will continue running.
#include <stdlib.h>
// Note main should be int.
int main() {
while (1) {
// Note calloc should not be cast.
int *a = calloc(65536, sizeof(int));
}
}
If you write to the addresses returned from calloc, it will force the kernel to allocate memory to back those addresses.
#include <stdlib.h>
#include <string.h>
int main() {
size_t size = 65536 * 4;
while (1) {
// Allocates address space.
void *p = calloc(size, 1);
// Stop once calloc fails; passing NULL to memset is undefined behavior.
if (p == NULL) break;
// Forces the address space to have allocated memory behind it.
memset(p, 0, size);
}
}
Writing to a single location in the block returned from calloc is not enough, because the kernel allocates actual memory at page granularity (4 KiB is the most common page size). Writing one byte in each page is sufficient, though.
What about the 64-bit case?
There is some bookkeeping overhead for allocating address space. On a 64-bit system, you get something like 40 or 48 bits of address space, of which about half can be allocated to the program, which comes to at least 8 TiB. On a 32-bit system this comes to 2 GiB or so (depending on kernel configuration).
So on a 64-bit system, you can allocate ~8 TiB, and a 32-bit system you can allocate ~2 GiB, and the overhead is what causes the problems. There is typically a small amount of overhead for each call to malloc or calloc.
See also Why malloc+memset is slower than calloc?

Is malloc faster when I freed memory before

Suppose I allocate and free memory, and afterwards allocate a block that is at most as large as the part I just freed.
Can the second allocation be faster than the first?
Maybe because it already knows a memory region that is free?
Or because this part of the heap is still assigned to the process?
Are there other possible advantages?
Or does it generally make no difference?
Edit: As asked in the comments:
I am especially interested in gcc and MSVC.
My assumption was that the memory was not "redeemed" by the OS before.
Since a lot of this depends on implementation details, I'd like to make clear that this is a hypothetical question.
I don't intend to abuse this, but I just want to know IF this may occur and what the reasons for the hypothetical speedup might be.
On some common platforms like GCC x86_64, there are two kinds of malloc(): the traditional kind for small allocations, and the mmap kind for large ones. Large, mmap-based allocations will have less interdependence. But traditional small ones will indeed experience a big speedup in some cases when memory has previously been free()'d.
This is because as you suggest, free() does not instantly return memory to the OS. Indeed it cannot do so in general, because the memory might be in the middle of the heap which is contiguous. So on lots of systems (but not all), malloc() will only be slow when it needs to ask the OS for more heap space.
Memory allocation with malloc should be faster whenever you avoid making system calls like sbrk or mmap. You will at least save a context switch.
Try an experiment with the following program
#include <stdlib.h>
int main() {
void* x = malloc(1024*1024);
free(x);
x = malloc(1024*1024);
}
and run it with the command strace ./a.out
When you remove the call to free, you will notice two additional brk system calls.
Here's a simple benchmark I compiled at -O1:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char** argv){
for(int i=0;i<10000000;i++){
char volatile * p = malloc(100);
if(!p) { perror(0); exit(1); }
*p='x';
//free((char*)p);
}
return 0;
}
An iteration cost about 60ns with free and about 150ns without on my Linux.
Yes, mallocs after free can be significantly faster.
It depends on the allocated sizes. Blocks this small will not be returned to the OS. For larger sizes that are powers of two, glibc malloc starts mmapping and munmapping, and then I'd expect a slowdown in the variant that frees.

Does malloc() use brk() or mmap()?

C code:
// program break mechanism
// TLPI exercise 7-1
#include <stdio.h>
#include <stdlib.h>
void program_break_test() {
printf("%10p\n", sbrk(0));
char *bl = malloc(1024 * 1024);
printf("%x\n", sbrk(0));
free(bl);
printf("%x\n", sbrk(0));
}
int main(int argc, char **argv) {
program_break_test();
return 0;
}
When compiling the following line:
printf("%10p\n", sbrk(0));
I get this warning:
format ‘%p’ expects argument of type ‘void *’, but argument 2 has type ‘int’
Question 1: Why is that?
And after I malloc(1024 * 1024), it seems the program break didn't change.
Here is the output:
9b12000
9b12000
9b12000
Question 2: Does the process allocate heap memory at startup for future use? Or does the compiler change the point at which the allocation happens? If neither, why?
[update] Summary: brk() or mmap()
After reviewing TLPI and checking the man page (with help from the author of TLPI), I now understand how malloc() decides whether to use brk() or mmap():
mallopt() sets parameters that control the behavior of malloc(), and one of them is M_MMAP_THRESHOLD. In general:
If the requested size is less than the threshold, brk() is used;
If the requested size is greater than or equal to the threshold, mmap() is used.
The default value of the parameter is 128 KiB (on my system), but my test program requested 1 MiB, so mmap() was chosen. When I changed the request to 32 KiB, I saw that brk() was used.
TLPI covers this on pages 147 and 1035, but I had not read that part carefully.
Details of the parameter can be found in the man page for mallopt().
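If we change the program to include <unistd.h> (which declares sbrk(); without the declaration the compiler assumes it returns int, hence the warning in Question 1) and to print where the malloc'd memory is: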
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
void program_break_test() {
printf("%10p\n", sbrk(0));
char *bl = malloc(1024 * 1024);
printf("%10p\n", sbrk(0));
printf("malloc'd at: %10p\n", bl);
free(bl);
printf("%10p\n", sbrk(0));
}
int main(int argc, char **argv) {
program_break_test();
return 0;
}
It's perhaps a bit clearer that sbrk wouldn't change. The memory given to us by malloc is being mapped into a wildly different location.
You could also use strace on Linux to see what system calls are made, and find out that malloc is using mmap to perform the allocation.
malloc is not limited to using sbrk to allocate memory. It might, for example, use mmap to map a large MAP_ANONYMOUS block of memory; normally mmap will assign a virtual address well away from the data segment.
There are other possibilities, too. In particular, malloc, being a core part of the standard library, is not itself limited to standard library functions; it can make use of operating-system-specific interfaces.
If you use malloc in your code, it calls brk() at the beginning and allocates 0x21000 bytes for the heap; that is the address you printed. So, for Question 2: subsequent malloc requests can be met from this pre-allocated space, so those mallocs don't actually call brk. It is an optimization in malloc. The next time you malloc a size beyond that boundary, brk is called again (as long as the size is not larger than the mmap threshold).

Problem usage memory in C

Please help :)
OS: Linux
While the program is in sleep(1000), top (display Linux tasks) shows it using 7.7 %MEM.
valgrind: no memory leak found.
As far as I understand, the code is written correctly and every pointer ends up NULL after freeing.
But why doesn't my program's memory usage decrease during the sleep? What am I missing?
Sorry for my bad English, thanks.
~ # tmp_soft
For : Is it free?? no
Is it free?? yes
For 0
For : Is it free?? no
Is it free?? yes
For 1
END : Is it free?? yes
END
~ #top
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
23060 root 20 0 155m 153m 448 S 0 7.7 0:01.07 tmp_soft
Full source : tmp_soft.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
struct cache_db_s
{
int table_update;
struct cache_db_s * p_next;
};
void free_cache_db (struct cache_db_s ** cache_db)
{
struct cache_db_s * cache_db_t;
while (*cache_db != NULL)
{
cache_db_t = *cache_db;
*cache_db = (*cache_db)->p_next;
free(cache_db_t);
cache_db_t = NULL;
}
printf("Is it free?? %s\n",*cache_db==NULL?"yes":"no");
}
void make_cache_db (struct cache_db_s ** cache_db)
{
struct cache_db_s * cache_db_t = NULL;
int n = 10000000;
for (int i=0; i = n; i++)
{
if ((cache_db_t=malloc(sizeof(struct cache_db_s)))==NULL) {
printf("Error : malloc 1 -> cache_db_s (no free memory) \n");
break;
}
memset(cache_db_t, 0, sizeof(struct cache_db_s));
cache_db_t->table_update = 1; // tmp
cache_db_t->p_next = *cache_db;
*cache_db = cache_db_t;
cache_db_t = NULL;
}
}
int main(int argc, char **argv)
{
struct cache_db_s * cache_db = NULL;
for (int ii=0; ii < 2; ii++) {
make_cache_db(&cache_db);
printf("For : Is it free?? %s\n",cache_db==NULL?"yes":"no");
free_cache_db(&cache_db);
printf("For %d \n", ii);
}
printf("END : Is it free?? %s\n",cache_db==NULL?"yes":"no");
printf("END \n");
sleep(1000);
return 0;
}
For good reasons, virtually no memory allocator returns blocks to the OS
Memory can only be removed from your program in units of pages, and even that is unlikely to be observed.
calloc(3) and malloc(3) do interact with the kernel to get memory, if necessary. But very, very few implementations of free(3) ever return memory to the kernel [1]; they just add it to a free list that calloc() and malloc() will consult later in order to reuse the released blocks. There are good reasons for this design approach.
Even if a free() wanted to return memory to the system, it would need at least one contiguous memory page in order to get the kernel to actually protect the region, so releasing a small block would only lead to a protection change if it was the last small block in a page.
Theory of Operation
So malloc(3) gets memory from the kernel when it needs it, ultimately in units of discrete page multiples. These pages are divided or consolidated as the program requires. Malloc and free cooperate to maintain a directory. They coalesce adjacent free blocks when possible in order to be able to provide large blocks. The directory may or may not involve using the memory in freed blocks to form a linked list. (The alternative is a bit more shared-memory and paging-friendly, and it involves allocating memory specifically for the directory.) Malloc and free have little if any ability to enforce access to individual blocks even when special and optional debugging code is compiled into the program.
1. The fact that very few implementations of free() attempt to return memory to the system is not at all due to the implementors slacking off. Interacting with the kernel is much slower than simply executing library code, and the benefit would be small. Most programs have a steady-state or increasing memory footprint, so the time spent analyzing the heap looking for returnable memory would be completely wasted. Other reasons include the fact that internal fragmentation makes page-aligned blocks unlikely to exist, and it's likely that returning a block would fragment blocks to either side. Finally, the few programs that do return large amounts of memory are likely to bypass malloc() and simply allocate and free pages anyway.
If you're trying to establish whether your program has a memory leak, then top isn't the right tool for the job (valgrind is).
top shows memory usage as seen by the OS. Even if you call free, there is no guarantee that the freed memory would get returned to the OS. Typically, it wouldn't. Nonetheless, the memory does become "free" in the sense that your process can use it for subsequent allocations.
Edit: If your libc supports it, you could try experimenting with M_TRIM_THRESHOLD. Even if you do follow this path, it's going to be tricky (a single used block sitting close to the top of the heap would prevent all free memory below it from being released to the OS).
Generally, free() doesn't give physical memory back to the OS; the pages stay mapped in your process's virtual memory. If you allocate a big chunk of memory, libc may allocate it with mmap(); if you then free it, libc may release the memory to the OS with munmap(), and in that case top will show your memory usage coming down.
So, if you want to release memory to the OS explicitly, you can use mmap()/munmap().
When you free() memory, it is returned to the standard C library's pool of memory, and not returned to the operating system. From the operating system's point of view, as you see it through top, the process is still "using" this memory. Within the process, the C library has accounted for the memory and could return the same pointer from malloc() in the future.
I will explain it some more with a different beginning:
During your calls to malloc, the standard library implementation may determine that the process does not have enough allocated memory from the operating system. At that time, the library will make a system call to receive more memory from the operating system to the process (for example, sbrk() or VirtualAlloc() system calls on Unix or Windows, respectively).
After the library requests additional memory from the operating system, it adds this memory to its structure of memory available to return from malloc. Later calls to malloc will use this memory until it runs out. Then, the library asks the operating system for even more memory.
When you free memory, the library usually does not return the memory to the operating system. There are many reasons for this. One reason is that the library author assumed you will call malloc again. If you do not call malloc again, your program will probably end soon. In either case, there is not much advantage in returning the memory to the operating system.
Another reason that the library may not return the memory is that memory from the operating system is allocated in large, contiguous ranges. A range can only be returned when the whole of it is no longer in use, and the pattern of malloc and free calls may never leave an entire range unused.
Two problems:
In make_cache_db(), the line
for (int i=0; i = n; i++)
should probably read
for (int i=0; i<n; i++)
Otherwise, you'll only allocate a single cache_db_s node.
The way you're assigning cache_db in make_cache_db() seems to be buggy. It seems that your intention is to return a pointer to the first element of the linked list; but because you're reassigning cache_db in every iteration of the loop, you'll end up returning a pointer to the last element of the list.
If you later free the list using free_cache_db(), this will cause you to leak memory. At the moment, though, this problem is masked by the bug described in the previous bullet point, which causes you to allocate lists of only length 1.
Independent of these bugs, the point raised by aix is very valid: The runtime library need not return all free()d memory to the operating system.

Heap size limitation in C

I have a doubt regarding the heap in the execution layout diagram of a C program.
I know that all dynamically allocated memory is allotted on the heap, which grows dynamically. But I would like to know: what is the maximum heap size for a C program?
Here is a sample C program in which I try to allocate 1 GB of memory for a string and even memset it:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h> /* sleep */
int main(int argc, char *argv[])
{
size_t size = (size_t)1024*1024*1024*1;
char *temp;
char *mybuffer = malloc(size);
/* Check for NULL first: passing NULL to memset is undefined behavior. */
if (mybuffer == NULL) {
printf("Wrong\n");
return 1;
}
temp = memset(mybuffer, 0, size);
if (mybuffer == temp)
printf("%p - %p\n", (void *)mybuffer, (void *)&mybuffer[size - 1]);
else
printf("Wrong\n");
sleep(20);
free(mybuffer);
return 0;
}
If I run three instances of the above program at once, then malloc should fail in at least one instance [I feel so] ... but malloc still succeeds.
If it succeeds, how does the OS take care of 3 GB of dynamically allocated memory?
Your machine is very probably overcommitting RAM, and not using the memory until you actually write to it. Try writing to each block after allocating it, thus forcing the operating system to ensure there's real RAM mapped to the address malloc() returned.
From the Linux malloc man page:
BUGS
By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available. This is a really bad bug. In case it turns out that the system is out of memory, one or more processes will be killed by the infamous OOM killer. In case Linux is employed under circumstances where it would be less desirable to suddenly lose some randomly picked processes, and moreover the kernel version is sufficiently recent, one can switch off this overcommitting behavior using a command like:
# echo 2 > /proc/sys/vm/overcommit_memory
See also the kernel Documentation directory, files vm/overcommit-accounting and sysctl/vm.txt.
You're mixing up physical memory and virtual memory.
http://apollo.lsc.vsc.edu/metadmin/references/sag/x1752.html
http://en.wikipedia.org/wiki/Virtual_memory
http://duartes.org/gustavo/blog/post/anatomy-of-a-program-in-memory
Malloc reserves the memory but does not write to any of it, so it succeeds as long as virtual address space is available. Only when you write to the memory does the kernel have to back it with real memory, paging to the page file if necessary.
Calloc, if memory serves me correctly(!), writes zeros to every byte of the allocated memory before returning, so it needs to allocate the pages there and then.
