Dynamic Allocation of Memory - c

How is a malloc call managed by the user-space library? I need an explanation of how memory is allocated in user space when malloc is called, and who manages it. For example, sbrk() is called to enter kernel space.

The C runtime library manages the heap. The heap has some preallocated free store. If the runtime can't find a contiguous block of the requested size there, it tries to request more memory from the operating system by calling sbrk().
If that fails, "out of memory" is reported: malloc() returns a null pointer. If additional memory is obtained successfully and the received chunk is bigger than what the malloc() caller asked for, the chunk is split: one part is marked as occupied and returned to the caller, and the other is added to the free store.
From the point when sbrk() returns successfully, the memory chunk belongs to the calling program's address space.

The malloc() package of functions manages the space. It obtains relatively large chunks of memory from the system using sbrk() and passes out smaller chunks to its callers as requested, using whichever of the many possible algorithms it is designed to use. The free() function places released memory back into its list of 'available for use' memory. Very seldom does it actually release memory back to the operating system itself.
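To make the "large chunks in, small pieces out" idea concrete, here is a minimal sketch. It is not how any real malloc() is written: the names toy_alloc, TOY_CHUNK and toy_sbrk_calls are made up for this example, it assumes a Linux/glibc system where sbrk() is available, and it never frees anything.

```c
#define _DEFAULT_SOURCE          /* for sbrk() on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

#define TOY_CHUNK 65536u         /* hypothetical size of each request to the system */

static char *toy_cur;            /* next unused byte in the current chunk */
static char *toy_end;            /* one past the end of the current chunk */
static unsigned toy_sbrk_calls;  /* how often the kernel had to be asked */

/* Hand out n bytes, asking the system for a new chunk only when the
   current one cannot satisfy the request. */
void *toy_alloc(size_t n)
{
    if (n == 0 || n > TOY_CHUNK)
        return NULL;
    n = (n + 15u) & ~(size_t)15u;            /* round up to a multiple of 16 */

    if (toy_cur == NULL || (size_t)(toy_end - toy_cur) < n) {
        void *p = sbrk(TOY_CHUNK + 15);      /* a little extra so we can align */
        if (p == (void *)-1)
            return NULL;                     /* the system refused: out of memory */
        toy_sbrk_calls++;
        uintptr_t start = ((uintptr_t)p + 15u) & ~(uintptr_t)15u;
        toy_cur = (char *)start;
        toy_end = toy_cur + TOY_CHUNK;
        assert(toy_end <= (char *)p + TOY_CHUNK + 15);
    }
    void *out = toy_cur;                     /* carve the piece off the chunk */
    toy_cur += n;
    return out;
}
```

Two small requests in a row are served from the same chunk, so only the first one costs a system call. A real allocator adds a free list, per-block bookkeeping and thread safety on top of this.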
There are many articles on the design of different versions of malloc(). In particular, there are many debugging versions of malloc() that look for abuses of the allocated memory. You can read about memory allocation in Knuth's 'The Art of Computer Programming'; it's in volume 1, if my memory serves.

Related

Why does malloc() call mmap() and brk() interchangeably?

I'm new to C and heap memory, still struggling to understand dynamic memory allocation.
I traced Linux system calls and found that if I use malloc to request a small amount of heap memory, then malloc calls brk internally.
But if I use malloc to request a very large amount of heap memory, then malloc calls mmap internally.
So there must be a big difference between brk and mmap, but theoretically we should be able to use brk to allocate heap memory regardless of the requested size. So why does malloc call mmap when allocating a large amount of memory?
Quoting the question: "so why malloc calls mmap when it comes to allocate a large size of memory?"
The short answer is for improved efficiency on newer implementations of Linux, and the updated memory allocation algorithms that come with them. But keep in mind that this is a very implementation dependent topic, and the whys and wherefores would vary greatly for differing vintages and flavors of the specific Linux OS being discussed.
Here is a fairly recent write-up regarding the low-level parts mmap() and brk() play in Linux memory allocation. And a not-so-recent, but still relevant, Linux Journal article includes some content that is very on-point for the topic here, including this:
For very large requests, malloc() uses the mmap() system call to find
addressable memory space. This process helps reduce the negative
effects of memory fragmentation when large blocks of memory are freed
but locked by smaller, more recently allocated blocks lying between
them and the end of the allocated space. In this case, in fact, had
the block been allocated with brk(), it would have remained unusable
by the system even if the process freed it.
(emphasis mine)
Regarding brk():
Incidentally, "...mmap() didn't exist in the early versions of Unix. brk() was the only way to increase the size of the data segment of the process at that time. The first version of Unix with mmap() was SunOS in the mid 80's, the first open-source version was BSD-Reno in 1990.". Since that time, modern implementations of memory allocation algorithms have been refactored with many improvements, greatly reducing their need to use brk().
mmap (when used with MAP_ANONYMOUS) allocates a chunk of RAM that can be placed anywhere within the process's virtual address space, and that can be deallocated later (with munmap) independently of all other allocations.
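A minimal sketch of that primitive, assuming a Linux/glibc system (the wrapper names region_alloc and region_free are made up for this example):

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Ask the kernel for len bytes of zero-filled memory, placed wherever
   it likes in the address space. Returns NULL on failure. */
void *region_alloc(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Give the region back to the kernel; returns 0 on success.
   Unlike free(), munmap() needs the length as well. */
int region_free(void *p, size_t len)
{
    return munmap(p, len);
}
```

Each region can be released on its own, in any order, regardless of what else the process still has mapped.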
brk changes the ending address of a single, contiguous "arena" of virtual address space: if this address is increased it allocates more memory to the arena, and if it is decreased, it deallocates the memory at the end of the arena. Therefore, memory allocated with brk can only be released back to the operating system when a continuous range of addresses at the end of the arena is no longer needed by the process.
Using brk for small allocations, and mmap for big allocations, is a heuristic based on the assumption that small allocations are more likely to all have the same lifespan, whereas big allocations are more likely to have a lifespan that isn't correlated with any other allocations' lifespan. So, big allocations use the system primitive that lets them be deallocated independently from anything else, and small allocations use the primitive that doesn't.
This heuristic is not very reliable. The current generation of malloc implementations, if I remember correctly, has given up altogether on brk and uses mmap for everything. The malloc implementation I suspect you are looking at (the one in the GNU C Library, based on your tags) is very old and mainly continues to be used because nobody is brave enough to take the risk of swapping it out for something newer that will probably but not certainly be better.
brk() is a traditional way of allocating memory in UNIX -- it just expands the data area by a given amount. mmap() allows you to allocate independent regions of memory without being restricted to a single contiguous chunk of virtual address space.
malloc() uses the data space for "small" allocations and mmap() for "big" ones, for a number of reasons, including reducing memory fragmentation. It's just an implementation detail you shouldn't have to worry about.
Please check this question also.
Reducing fragmentation is commonly given as the reason why mmap is used for large allocations; see ryyker’s answer for details. But I think that’s not the real benefit nowadays; in practice there’s still fragmentation even with mmap, just in a larger pool (the virtual address space, rather than the heap).
The big advantage of mmap is discardability.
When allocating memory with sbrk, if the memory is actually used (so that the kernel maps physical memory at some point), and then freed, the kernel itself can’t know about that, unless the allocator also reduces the program break (which it can’t if the freed block isn’t the topmost previously-used block under the program break). The result is that the contents of that physical memory become “precious” as far as the kernel is concerned; if it ever needs to re-purpose that physical memory, it then has to ensure that it doesn’t lose its contents. So it might end up swapping pages out (which is expensive) even though the owning process no longer cares about them.
When allocating memory with mmap, freeing the memory doesn’t just return the block to a pool somewhere; the corresponding virtual memory allocation is returned to the kernel, and that tells the kernel that any corresponding physical memory, dirty or otherwise, is no longer needed. The kernel can then re-purpose that physical memory without worrying about its contents.
The key part of the reason, I think, is the following, which I copied from what Peter said in the chat:
free() is a user-space function, not a system call. It either hands them back to the OS with munmap or brk, or keeps them dirty in user-space. If it doesn't make a system call, the OS must preserve the contents of those pages as part of the process state.
So when you use brk to raise the end of your data segment, the only way to give memory back is to lower it again (for example by calling sbrk with a negative value), which means brk can only return the most recently allocated memory at the top of the heap. When you call malloc(huge), malloc(small), free(huge), the huge block cannot be returned to the system; the allocator can only keep it on its free list for this process, so the huge block is effectively still held by the process. This is the drawback of brk.
mmap and munmap avoid this.
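The malloc(huge), malloc(small), free(huge) case can be written down as a tiny model. This is only a sketch of the reasoning, not real allocator code: struct blk and releasable_by_brk are invented names, and the "heap" is just an array of blocks in address order.

```c
#include <assert.h>
#include <stddef.h>

/* One block of a brk-style heap; heap[0] has the lowest address. */
struct blk {
    size_t size;   /* bytes in this block */
    int    live;   /* 1 = still allocated, 0 = freed */
};

/* Bytes that could be handed back by lowering the program break.
   Only the run of freed blocks at the very top of the heap counts,
   because the break is a single address. */
size_t releasable_by_brk(const struct blk *heap, size_t n)
{
    assert(heap != NULL || n == 0);
    size_t bytes = 0;
    while (n > 0 && !heap[n - 1].live)
        bytes += heap[--n].size;
    return bytes;
}
```

With a freed 100 MB block below a live 64-byte block, the function returns 0: nothing can go back to the system until the small block at the top is freed as well.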
I want to emphasize another view point.
malloc is a library function that allocates memory.
You do not really need to debug it, because in some implementations it might give you memory from a static "arena" (e.g. a static char array).
In some other implementations it may just return a null pointer.
If you want to see what malloc really does, I suggest you look at
http://gee.cs.oswego.edu/dl/html/malloc.html
The glibc malloc used on Linux is based on this.
You can take a look at jemalloc too. It basically uses the same brk and mmap, but organizes the data differently and usually is "better".
Happy researching.

Large memory usage of aio [duplicate]

Here's my question: Does calling free or delete ever release memory back to the "system"? By system I mean, does it ever reduce the data segment of the process?
Let's consider the memory allocator on Linux, i.e ptmalloc.
From what I know (please correct me if I am wrong), ptmalloc maintains a free list of memory blocks and when a request for memory allocation comes, it tries to allocate a memory block from this free list (I know, the allocator is much more complex than that but I am just putting it in simple words). If, however, it fails, it gets the memory from the system using say sbrk or brk system calls. When a memory is free'd, that block is placed in the free list.
Now consider this scenario, on peak load, a lot of objects have been allocated on heap. Now when the load decreases, the objects are free'd. So my question is: Once the object is free'd will the allocator do some calculations to find whether it should just keep this object in the free list or depending upon the current size of the free list it may decide to give that memory back to the system i.e decrease the data segment of the process using sbrk or brk?
Documentation of glibc tells me that if the allocation request is much larger than page size, it will be allocated using mmap and will be directly released back to the system once free'd. Cool. But let's say I never ask for allocation of size greater than say 50 bytes and I ask a lot of such 50 byte objects on peak load on the system. Then what?
From what I know (correct me please), a memory allocated with malloc will never be released back to the system ever until the process ends i.e. the allocator will simply keep it in the free list if I free it. But the question that is troubling me is then, if I use a tool to see the memory usage of my process (I am using pmap on Linux, what do you guys use?), it should always show the memory used at peak load (as the memory is never given back to the system, except when allocated using mmap)? That is memory used by the process should never ever decrease(except the stack memory)? Is it?
I know I am missing something, so please shed some light on all this.
Experts, please clear my concepts regarding this. I will be grateful. I hope I was able to explain my question.
There isn't much overhead for malloc, so you are unlikely to achieve any run-time savings. There is, however, a good reason to implement an allocator on top of malloc, and that is to be able to trace memory leaks. For example, you can free all memory allocated by the program when it exits, and then check to see if your memory allocator calls balance (i.e. same number of calls to allocate/deallocate).
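A minimal sketch of such a tracing layer, with made-up names (traced_malloc, traced_free, traced_outstanding); it only counts calls and is not thread-safe:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static long traced_live;   /* allocations that have not been freed yet */

/* Same contract as malloc(), but successful calls are counted. */
void *traced_malloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        traced_live++;
    return p;
}

/* Same contract as free(), but the counter is decremented. */
void traced_free(void *p)
{
    if (p == NULL)
        return;                 /* free(NULL) is a no-op, so don't count it */
    assert(traced_live > 0);    /* more frees than allocations is a bug */
    traced_live--;
    free(p);
}

/* Zero at exit means every allocation was matched by a free. */
long traced_outstanding(void)
{
    return traced_live;
}
```

Checking traced_outstanding() just before the program exits tells you whether the calls balance; a nonzero value points to a leak somewhere.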
For your specific implementation, there is no reason to call free(), since malloc won't release memory to the system; you would only be releasing memory back to your own allocator.
Another reason for using a custom allocator is that you may be allocating many objects of the same size (i.e you have some data structure that you are allocating a lot). You may want to maintain a separate free list for this type of object, and free/allocate only from this special list. The advantage of this is that it will avoid memory fragmentation.
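A minimal sketch of such a per-type free list, assuming a fixed-capacity pool. The object type struct node and the names node_alloc, node_free and POOL_SLOTS are invented for the example:

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SLOTS 64                 /* hypothetical pool capacity */

struct node {                         /* the object type being pooled */
    int value;
    struct node *next;
};

/* A slot holds either a live object or a link to the next free slot. */
union slot {
    struct node obj;
    union slot *next_free;
};

static union slot pool[POOL_SLOTS];
static union slot *free_head;
static int pool_ready;

static void pool_init(void)
{
    for (size_t i = 0; i + 1 < POOL_SLOTS; i++)
        pool[i].next_free = &pool[i + 1];
    pool[POOL_SLOTS - 1].next_free = NULL;
    free_head = &pool[0];
    pool_ready = 1;
}

/* Pop a slot off the free list; NULL when the pool is exhausted. */
struct node *node_alloc(void)
{
    if (!pool_ready)
        pool_init();
    union slot *s = free_head;
    if (s == NULL)
        return NULL;
    free_head = s->next_free;
    return &s->obj;
}

/* Push the slot back; it is the first one reused by node_alloc(). */
void node_free(struct node *n)
{
    if (n == NULL)
        return;
    union slot *s = (union slot *)n;
    assert(s >= pool && s < pool + POOL_SLOTS);
    s->next_free = free_head;
    free_head = s;
}
```

Because every slot has the same size, any freed slot fits any later request, so this pool cannot fragment internally.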
No.
It's actually a bad strategy for a number of reasons, so it doesn't happen --except-- as you note, there can be an exception for large allocations that can be directly made in pages.
It increases internal fragmentation and therefore can actually waste memory. (You can only return aligned pages to the OS, so pulling aligned pages out of a block will usually create two guaranteed-to-be-small blocks --smaller than a page, anyway-- to either side of the block. If this happens a lot you end up with the same total amount of usefully-allocated memory plus lots of useless small blocks.)
A kernel call is required, and kernel calls are slow, so it would slow down the program. It's much faster to just throw the block back into the heap.
Almost every program will either converge on a steady-state memory footprint or it will have an increasing footprint until exit. (Or, until near-exit.) Therefore, all the extra processing needed by a page-return mechanism would be completely wasted.
It is entirely implementation dependent. On Windows VC++ programs can return memory back to the system if the corresponding memory pages contain only free'd blocks.
I think that you have all the information you need to answer your own question. pmap shows the memory that is currently being used by the process. So, if you call pmap before the process reaches peak memory, then no, it will not show peak memory. If you call pmap just before the process exits, then it will show peak memory for a process that does not use mmap. If the process uses mmap, then calling pmap at the point where maximum memory is being used will show peak memory usage, but this point may not be at the end of the process (it could occur anywhere).
This applies only to your current system (i.e. based on the documentation you have provided for free, mmap and malloc), but as the previous poster has stated, the behavior of these is implementation dependent.
This varies a bit from implementation to implementation.
Think of your memory as a massive long block, when you allocate to it you take a bit out of your memory (labeled '1' below):
111
If I allocate more memory with malloc it gets some from the system:
1112222
If I now free '1':
___2222
It won't be returned to the system, because '2' is in front of it (and memory is given as a continuous block). However, if the end of the memory is freed, then that memory is returned to the system. If I freed '2' instead of '1', I would get:
111
the bit where '2' was would be returned to the system.
The main benefit of freeing memory is that that bit can then be reallocated, as opposed to getting more memory from the system. e.g:
33_2222
I believe that the memory allocator in glibc can return memory back to the system, but whether it will or not depends on your memory allocation patterns.
Let's say you do something like this:
    void *pointers[10000];
    int i;
    for (i = 0; i < 10000; i++)
        pointers[i] = malloc(1024);   /* 10000 small allocations */
    for (i = 0; i < 9999; i++)
        free(pointers[i]);            /* free all but the last one */
The only part of the heap that can be safely returned to the system is the "wilderness chunk", which is at the end of the heap. This can be returned to the system using another sbrk system call, and the glibc memory allocator will do that when the size of this last chunk exceeds some threshold.
The above program would make 10000 small allocations, but only free the first 9999 of them. The last one should (assuming nothing else has called malloc, which is unlikely) be sitting right at the end of the heap. This would prevent the allocator from returning any memory to the system at all.
If you were to free the remaining allocation, glibc's malloc implementation should be able to return most of the pages allocated back to the system.
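One way to watch this on your own machine is to sample the program break with sbrk(0) around the allocations. This is only a sketch, assuming Linux with glibc; struct brk_trace and trace_small_allocs are made-up names, and whether the break actually drops at the end depends on the allocator and its thresholds.

```c
#define _DEFAULT_SOURCE          /* for sbrk() on glibc */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#define N_BLOCKS 10000

struct brk_trace {
    uintptr_t before;   /* program break before any allocation */
    uintptr_t peak;     /* break with all blocks allocated */
    uintptr_t after;    /* break after every block has been freed */
};

struct brk_trace trace_small_allocs(void)
{
    static void *pointers[N_BLOCKS];
    struct brk_trace t;
    int i;

    t.before = (uintptr_t)sbrk(0);
    for (i = 0; i < N_BLOCKS; i++) {
        pointers[i] = malloc(1024);
        assert(pointers[i] != NULL);
    }
    t.peak = (uintptr_t)sbrk(0);

    for (i = 0; i < N_BLOCKS; i++)   /* this time free the last block too */
        free(pointers[i]);
    t.after = (uintptr_t)sbrk(0);
    return t;
}
```

Printing peak - before and peak - after shows how far the heap grew and how much of it was handed back. Changing the second loop to stop at N_BLOCKS - 1 reproduces the case above where the last block pins the top of the heap.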
If you're allocating and freeing small chunks of memory, a few of which are long-lived, you could end up in a situation where you have a large chunk of memory allocated from the system, but you're only using a tiny fraction of it.
Here are some "advantages" to never releasing memory back to the system:
Having already used a lot of memory makes it very likely you will do so again, and
when you release memory the OS has to do quite a bit of paperwork
when you need it again, your memory allocator has to re-initialise all its data structures in the region it just received
Freed memory that isn't needed gets paged out to disk where it doesn't actually make that much difference
Often, even if you free 90% of your memory, fragmentation means that very few pages can actually be released, so the effort required to look for empty pages isn't terribly well spent
Many memory managers can perform TRIM operations where they return entirely unused blocks of memory to the OS. However, as several posts here have mentioned, it's entirely implementation dependent.
But lets say I never ask for allocation of size greater than say 50 bytes and I ask a lot of such 50 byte objects on peak load on the system. Then what ?
This depends on your allocation pattern. Do you free ALL of the small allocations? If so, and if the memory manager has special handling for small block allocations, then this may be possible. However, if you allocate many small items and then free all but a few scattered items, you may fragment memory and make it impossible to TRIM blocks, since each block will still have a few straggling allocations. In this case, you may want to use a different allocation scheme for the temporary allocations and the persistent ones, so you can return the temporary allocations back to the OS.

free() not freeing memory in embedded linux.

I have allocated memory using malloc() in embedded Linux (around 10 MB). I checked the free memory: it was 67080 kB, and even after freeing the allocation with free() it remains the same. Only after the application terminates is the memory available again. Does free() not make the freed memory available to the system? If so, how can I make it available?
free is a libc library call. It marks heap space as available for reuse. It does not guarantee that the associated virtual mapping will be released. Only after a dirty virtual mapping is released by your OS will that memory be free system-wide again. This can only happen in chunks of pages.
Also, if you allocated memory using malloc and family and didn't use it, then it didn't actually consume physical memory, so freeing it will do nothing.
Does free() not make the freed memory available to the system.
No, usually not.
malloc() normally requests memory from the OS by the low level sbrk() or mmap() call. Once assigned to the application, free() just returns the memory to a memory pool that belongs to the application. That is, it's not returned back to the OS for use in another process. (Though some heuristics are in-place to do so in certain circumstances).
If swap space is in place, this becomes less of a problem, the OS will swap out the unused memory of applications to make room for additional physical memory that's required.
if so how to make it available.
Exit the application.
Or you would need to write your own memory allocator that could do this (which in the general case is not an easy task, especially if you don't want to sacrifice overhead and speed).
For a relatively big single piece of 10MB, you could simply request anonymous memory with mmap() and the memory will be released back to the OS when you munmap() that piece of memory.
Taken from the malloc 3 man page:
Normally, malloc() allocates memory from the heap, and adjusts the
size of the heap as required, using sbrk(2). When allocating blocks
of memory larger than MMAP_THRESHOLD bytes, the glibc malloc()
implementation allocates the memory as a private anonymous mapping
using mmap(2). MMAP_THRESHOLD is 128 kB by default, but is
adjustable using mallopt(3)
You can try to modify the MMAP_THRESHOLD so that by using malloc you are invoking mmap. If you do so, free guarantees that the memory allocated through mmap will return back to the system as soon as you free it.
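A minimal sketch of lowering the threshold, assuming glibc (mallopt() and M_MMAP_THRESHOLD come from its <malloc.h>; the wrapper name set_mmap_threshold and the 64 kB value are just examples):

```c
#include <assert.h>
#include <malloc.h>   /* glibc-specific: mallopt(), M_MMAP_THRESHOLD */
#include <stdlib.h>

/* Ask glibc to serve requests of at least `bytes` bytes with mmap().
   Returns 1 on success and 0 on error, like mallopt() itself. */
int set_mmap_threshold(int bytes)
{
    assert(bytes >= 0);
    return mallopt(M_MMAP_THRESHOLD, bytes);
}
```

Call it once, early in the program, for example set_mmap_threshold(64 * 1024) before the first large malloc(). Blocks at or above the threshold should then come from mmap() and be unmapped when freed; smaller blocks still come from the ordinary heap.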
Your malloc() calls obtain memory from the system, and maintain a heap data structure for keeping track of used and free memory within the process. Your free() calls return memory to the heap, where they are marked free, but they're still part of the process's memory.
If you want memory deallocation to return pages to the system, you'll have to write your own memory manager, but keep in mind that it'll only be able to completely free memory under the right conditions: It depends on the behavior of your application, whether your allocations and deallocations span page boundaries and cleanly de-fragment, etc. You need to understand the memory allocation behavior of your application to know whether this will be any benefit.

How malloc knows the present free memory locations

How does malloc actually find the free memory space currently available in the microcontroller?
Does it keep a list of unallocated areas continuously at runtime?
How does it get information about the memory assigned by a previous malloc if there are two malloc statements in the code?
How can one know which memory is free and which is not at runtime? At compile time we can know which locations in RAM the compiler has assigned to variables. Does malloc use this information?
As the commenters said above, there are multiple implementations of malloc, and the algorithm may vary vastly between them. This is a vast and complicated area, and you should read up on memory management to get a complete picture of the topic.
In simple terms, all malloc implementations are backed by the kernel's memory management. The kernel sees the whole system memory as pages of a fixed size (4k, 8k, etc.), and all allocations and frees are done in pages. Every kernel has a memory management subsystem that does the accounting for all memory allocations and frees happening on the system. When you call malloc, the request eventually reaches this memory management subsystem, which looks for the next available free page in the pool and allocates it to the requesting process. Before giving the page to the requester, it marks the page as used; in the same way, when you free the memory, it adds the page back to the free pool and unmarks it. There are many implementations of how the kernel does all this efficiently (read up on memory manager implementations in Linux).
In common implementations, a minimal memory manager also exists in user space. The user-space process itself maintains a free pool, and when malloc is asked for memory, it first looks in its own free pool before going to the kernel. If memory is available there, it marks it as used and satisfies the request without the kernel's help. Similarly, when you free memory, the freed chunk does not immediately go back to the kernel's free pool; instead it stays in the process's free pool so that the next malloc can use it.
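Here is a minimal sketch of such a user-space pool, written so it would also make sense on a microcontroller: a first-fit allocator over a static arena, with no kernel involved. All names (mini_malloc, mini_free, ARENA_BYTES) are invented for the example, and a real implementation would also merge with the preceding free block and deal with thread safety.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_BYTES 4096
#define ALIGN 16u
#define ROUND_UP(x) (((x) + (ALIGN - 1u)) & ~(size_t)(ALIGN - 1u))
#define HDR ROUND_UP(sizeof(struct hdr))

struct hdr {
    size_t size;        /* payload bytes that follow this header */
    int    used;        /* 1 = handed out, 0 = free */
    struct hdr *next;   /* next block in address order */
};

static _Alignas(16) unsigned char arena[ARENA_BYTES];
static struct hdr *first;

void *mini_malloc(size_t n)
{
    if (n == 0 || n > ARENA_BYTES)
        return NULL;
    n = ROUND_UP(n);

    if (first == NULL) {                  /* first call: one big free block */
        first = (struct hdr *)arena;
        first->size = ARENA_BYTES - HDR;
        first->used = 0;
        first->next = NULL;
    }
    for (struct hdr *b = first; b != NULL; b = b->next) {
        if (b->used || b->size < n)
            continue;                     /* occupied or too small */
        if (b->size >= n + HDR + ALIGN) { /* split: the tail stays free */
            struct hdr *rest = (struct hdr *)((unsigned char *)b + HDR + n);
            rest->size = b->size - n - HDR;
            rest->used = 0;
            rest->next = b->next;
            b->size = n;
            b->next = rest;
        }
        b->used = 1;                      /* mark it and hand it out */
        return (unsigned char *)b + HDR;
    }
    return NULL;                          /* no free block is large enough */
}

void mini_free(void *p)
{
    if (p == NULL)
        return;
    struct hdr *b = (struct hdr *)((unsigned char *)p - HDR);
    assert(b->used);                      /* catches simple double frees */
    b->used = 0;                          /* back into the free pool */
    if (b->next != NULL && !b->next->used) {   /* merge with a free neighbour */
        b->size += HDR + b->next->size;
        b->next = b->next->next;
    }
}
```

The header in front of each block is how the allocator "remembers" earlier calls: the second mini_malloc() walks the list, sees the first block marked as used, and moves on to the free remainder.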
As I said at the beginning, this is a huge and complicated topic, and you can find a lot of documentation about it on the internet.

Is calloc exactly the same as malloc + memset?

In linux, is calloc exactly the same as malloc + memset or does this depend on the exact linux/kernel version?
I am particularly interested in the question of whether you can calloc more RAM than you physically have (as you can certainly malloc more RAM than you physically have, you just can't write to it). In other words, does calloc always actually write to the memory you have been allocated, as the specs suggest it should?
Of course, that depends on the implementation, but on a modern day Linux, you probably can. Easiest way is to try it, but I'm saying this based on the following logic.
You can malloc more than the memory you have (physical + virtual) because the kernel delays allocation of your memory until you actually use it. I believe that's to increase the chances of your program not failing due to memory limits, but that's not the question.
calloc is the same as malloc but zero initializes the memory. When you ask Linux for a page of memory, Linux already zero-initializes it. So if calloc can tell that the memory it asked for was just requested from the kernel, it doesn't actually have to zero initialize it! Since it doesn't, there is no access to that memory and therefore it should be able to request more memory than there actually is.
As mentioned in the comments this answer provides a very good explanation.
Whether calloc needs to write to the memory depends on whether it got the allocation from heap pages that are already assigned to the process, or it had to request more memory be assigned to the process by the kernel (using a system call such as sbrk() or mmap()). When the kernel assigns new memory to a process, it always zeroes it first (typically using a VM optimization, so it doesn't actually have to write to the page). But if it's reusing memory that was assigned previously, it has to use memset() to zero it.
It is not mentioned in the cited duplicate or here. Linux uses virtual memory and can allocate more memory than is physically available in the system. A naive implementation of calloc() that simply does a malloc() plus memset() in user space will touch every page.
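For reference, such a naive version might look like the sketch below. The name naive_calloc is made up, and this is not how glibc implements calloc(); the point is only that the memset() writes to every byte of the block.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* malloc() + memset(), with the overflow check calloc() must perform. */
void *naive_calloc(size_t count, size_t size)
{
    if (size != 0 && count > SIZE_MAX / size)
        return NULL;                 /* count * size would overflow */
    size_t total = count * size;
    void *p = malloc(total);
    if (p != NULL)
        memset(p, 0, total);         /* writes to, and so touches, every page */
    return p;
}
```

Note that an optimizing compiler may recognise the malloc-then-memset pattern and substitute a call to the real calloc(), so the page-touching behaviour of this function can depend on compiler flags.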
As Linux typically allocates in 4k chunks, all of the calloc() blocks are the same and initially read as zero. That is, the same 4k chunk of memory can be mapped read-only, and the entire calloc() space is only taking up approximately size/4k * pointer_size + 4k. As the program writes to the calloc() space, a page fault happens and Linux will allocate a new page (4k) and resume the program.
This is called copy-on-write, or COW for short. malloc() will generally behave the same way. For small sizes, the C library will use binning and share 4k pages with other small-sized allocations.
So, there are typically two layers involved.
Linux kernel's process memory management.
glibc heap management.
If the memory size requested is large and requires new memory allocated to the process, then most of the above applies (via Linux's process memory management). However, if the memory requested is small, then it will be like a malloc() plus memset(). In the large allocation size, the memset() is damaging as it touches the memory and the kernel thinks it needs a new page to allocate.
You can't malloc(3) more RAM than the kernel gives the process doing the malloc(3)-ing. malloc(3) returns NULL if you can't allocate the amount of memory you want to allocate. In addition, malloc(3) and memset(3) are defined by your C library (libc.so) and not your kernel. The Linux kernel defines mmap(2) and other low-level memory allocation functions, not the *alloc(3) family (excluding kmalloc()).
