From my book:
Recall from our first discussion that modern dynamic memory managers
not only use sbrk() but also mmap(). This process helps reduce the
negative effects of memory fragmentation when large blocks of memory
are freed but locked by smaller, more recently allocated blocks lying
between them and the end of the allocated space. In this case, had the
block been allocated with sbrk(), it would have probably remained
unused by the system for some time (or at least most of it).
Can someone kindly explain how using mmap reduces the negative effects of memory fragmentation? The given example didn't make any sense to me and wasn't clear at all.
it would have probably remained unused by the system for some time
Why was this claim made? When we free the block, the system can use it later. Maybe the OS keeps a list of freed blocks in the heap so it can reuse them when possible instead of using more heap space.
Please address both questions.
Advantages of mmap() over sbrk()?
1. brk/sbrk is LIFO. Let's say you increase the segment size by X bytes to make room for allocation A, then by another X bytes for allocation B, and then free A. You cannot reduce the allocated memory because B is still allocated. And since the segment is shared across the entire program, if multiple parts of the program use it directly, you have no way of knowing whether a particular part is still in use. And if one part of the program (let's say malloc) assumes entire control over the use of brk/sbrk, then calling them elsewhere will break the program.
By contrast, mmap regions can be unmapped in any order, and an allocation by one part of the program doesn't conflict with other parts of the program.
2. brk/sbrk are not part of the POSIX standard and thus not portable.
By contrast, mmap is standard and portable.
3. mmap can also do things like map files into memory, which is not possible using brk/sbrk.
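To make the first point concrete, here is a minimal sketch, assuming a Linux-like system where anonymous mappings are available through MAP_ANONYMOUS. The function name unmap_out_of_order is made up for illustration. It maps two regions A and B, then releases A while B is still in use, which is the order that brk/sbrk cannot handle.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map two anonymous regions, A then B, and release A first while B is
 * still in use. Returns 0 on success, -1 on any failure.
 * (Hypothetical helper, not part of any library.) */
int unmap_out_of_order(size_t len)
{
    assert(len > 0);

    char *a = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (a == MAP_FAILED)
        return -1;

    char *b = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (b == MAP_FAILED) {
        munmap(a, len);
        return -1;
    }

    a[0] = 'A';
    b[0] = 'B';

    /* A was mapped first, yet it can be handed back to the kernel first. */
    if (munmap(a, len) != 0) {
        munmap(b, len);
        return -1;
    }

    /* B is unaffected by the removal of A. */
    int b_intact = (b[0] == 'B');

    if (munmap(b, len) != 0)
        return -1;
    return b_intact ? 0 : -1;
}
```

Calling unmap_out_of_order(65536) from main should return 0 on such a system. With sbrk there is no equivalent: shrinking the break far enough to release A would also release B.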
it would have probably remained unused by the system for some time
Why this claim was made
See 1.
Maybe the OS keeps list of freed block
There are no "blocks". There is one (virtual) block called the data segment. brk/sbrk sets the size of that block.
But doesn't mmap allocate on heap
No. "Heap" is at the end of the data segment and heap is what grows using brk/sbrk. mmap does not allocate in the area of memory that has been allocated using brk/sbrk.
mmap creates a new segment elsewhere in the address space.
does malloc actually save the free blocks that were allocated with sbrk for later usage?
If it is allocated using brk/sbrk in the first place, and if malloc hasn't reduced the size of the "heap" (in case that was possible), then malloc may reuse a free "slot" that has been previously freed. It would be a useful thing to do.
"then calling them elsewhere will break the program." can you give an example
malloc(42);
sbrk(42);
malloc(42); // maybe kaboom, who knows?
In conclusion: Just don't use brk/sbrk to set the segment size. Perhaps there's little reason to use (anonymous) mmap either. Use malloc in C.
When sbrk() is used, the heap is just one, large block of memory. If your pattern of allocating and freeing doesn't leave large, contiguous blocks of memory, every large allocation will need to grow the heap. This can result in inefficient memory use, because of all the unused gaps that are left in the heap.
With mmap(), you can have a bunch of independent blocks of mapped memory. So you could use the sbrk() heap for your small allocations, which can be packed neatly, and use mmap() for large allocations. When you're done with one of these large blocks, you can just remove the entire mapping.
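As a rough sketch of that split (not how any particular malloc is written), the following sends small requests to the ordinary allocator and gives each large request its own anonymous mapping. The names hybrid_alloc, hybrid_free and hybrid_is_mapped and the 128 KiB cutoff are assumptions made for this example, and C11 is assumed for max_align_t.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Assumed cutoff between "small" and "large"; purely illustrative. */
#define LARGE_THRESHOLD ((size_t)128 * 1024)

/* Header stored just before the pointer handed to the caller. */
typedef union {
    size_t      mapped_len;  /* 0: from malloc; otherwise length passed to mmap */
    max_align_t align;       /* keeps the payload suitably aligned */
} header_t;

void *hybrid_alloc(size_t n)
{
    if (n > SIZE_MAX - sizeof(header_t))
        return NULL;                      /* size would overflow */
    size_t total = n + sizeof(header_t);
    header_t *h;

    if (n >= LARGE_THRESHOLD) {
        /* Large request: its own mapping, independent of the heap. */
        h = mmap(NULL, total, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (h == MAP_FAILED)
            return NULL;
        h->mapped_len = total;
    } else {
        /* Small request: packed into the regular heap. */
        h = malloc(total);
        if (h == NULL)
            return NULL;
        h->mapped_len = 0;
    }
    return h + 1;
}

int hybrid_is_mapped(const void *p)
{
    assert(p != NULL);
    return ((const header_t *)p - 1)->mapped_len != 0;
}

void hybrid_free(void *p)
{
    if (p == NULL)
        return;
    header_t *h = (header_t *)p - 1;
    if (h->mapped_len != 0)
        munmap(h, h->mapped_len);         /* whole mapping goes back at once */
    else
        free(h);
}
```

Freeing a large block here returns its pages to the kernel immediately, no matter what was allocated after it, which is the fragmentation benefit the book passage describes.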
I have read in the K&R book that "the space that malloc manages may not be contiguous. each block contains size, a pointer to next block.". But when I searched on Google, I saw that most people say that malloc always does contiguous memory allocation. So please clear up this doubt. What is the fact? Is the space that malloc manages always contiguous, or may it be non-contiguous?
A single block of memory returned by malloc, calloc or realloc will always be contiguous.
The next block of memory returned by a separate alloc call can be at any address and does not need to be contiguous with the first block.
Look at the diagram on the very page you read that sentence:
The region with dots is not owned by malloc, and is right between two regions that malloc manages. The region not owned by malloc is most likely allocated by another part of the program (for example, using mmap(2)). It could also just be not allocated.
Despite this 'hole' in the region of memory managed by malloc, malloc (and its family of functions) only allocates blocks of memory that are contiguous.
So, the region managed by malloc could be discontiguous, but the blocks allocated by malloc are contiguous.
The memory segment you receive from a single call to malloc() is always contiguous. There's no guarantee about the segments returned from different calls. You may get the impression that they are contiguous from the first calls to malloc at program start, but as memory is returned to the heap with free(), it is reused, and you eventually end up with some amount of dispersion, with different calls returning completely unrelated pointers.
On the other hand, malloc() uses the sbrk(2) system call, which increases or decreases the size of the data segment, if possible. But when it cannot use sbrk(), malloc() uses mmap() to get one or more heaps to operate on. mmap() was not available in Unix systems when that book was written, so things may have changed a lot since then. Those mmap()ed heaps are placed in the virtual address space of the process by the kernel, not by malloc(), depending on many factors such as the loading of dynamic shared objects, and this conflicts with any expectation of contiguity in the address space. The addresses will probably show ranges of discontiguous memory interspersed with ranges of completely different things, such as physical memory attachments (like video memory, if the operating system offers this service), memory-mapped files, shared libraries, etc.
Also, malloc must allocate some bookkeeping data for each chunk of memory that it gives you. Many implementations place this data on one side of the returned segment, so even if the chunks were contiguous, you normally cannot use the memory between them, or you will corrupt the memory malloc() uses to manage the heap.
We know that malloc calls mmap internally. But mmap doesn't necessarily map to the heap as mmap can map objects to any area in virtual memory, then how does malloc do internally to make sure that the requested size of memory is from the heap?
When malloc uses mmap to allocate memory, it doesn’t care where the memory comes from; it delegates the allocation to mmap, and relies on that to provide a usable block of memory.
In the GNU C library (and probably in other implementations too), such allocations are tracked separately from the allocations managed using sbrk. All operations involving mmaped allocations are also delegated (reallocation and freeing).
From the kernel’s perspective, such allocations are off-heap, i.e. after the program break. From the programmer’s perspective, they’re all the same; the main practical consequence compared to sbrk-only allocations is that you can’t assume that allocated blocks are within the program break, or that the address space between two allocated blocks is accessible, but you shouldn’t do that anyway.
See also the POSIX specification for malloc — it doesn’t say anything about the heap.
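If you want to observe this on your own system, one rough way is to check whether the program break moves across a malloc call. This is only a sketch: break_moved_by is a made-up helper, sbrk(0) is assumed to be available (it is on glibc), and the result depends entirely on the malloc implementation and its tuning, so no particular outcome is guaranteed.

```c
#define _DEFAULT_SOURCE          /* exposes sbrk() on glibc */
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns 1 if malloc(n) moved the program break, 0 if it did not,
 * and -1 if the allocation failed. (Hypothetical helper.) */
int break_moved_by(size_t n)
{
    assert(n > 0);

    void *before = sbrk(0);      /* current program break */
    void *p = malloc(n);
    void *after = sbrk(0);

    if (p == NULL)
        return -1;
    free(p);
    return before != after;
}
```

A result of 0 for a multi-megabyte request would be consistent with the allocation having been served by mmap rather than by growing the heap, but it could also mean the request was satisfied from memory the allocator already held.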
Till now what I understood is as follow:
malloc internally uses sbrk and brk to allocate memory by increasing top of heap.
mmap allocate memory in form of pages.
Now, let's say the current top of sbrk/malloc is 0x001000, and I use mmap to allocate a 4 KB page, which is placed at 0x002000. Later, I call malloc multiple times, so it has to increase the sbrk top. What if the top reaches 0x002000?
So, it will be great if someone can clarify the following.
Is the above scenario possible?
If not, then please point out the flaw in my understanding of malloc and mmap.
If yes, then I assume it is not safe to use them this way. So, is there any other way to use both safely?
Thank you.
malloc is normally not implemented this way today. malloc used sbrk(2) in old implementations, when extending the data segment was the only way to ask the system for more virtual memory. Newer systems use mmap(2) if available, as it allows more flexibility when the virtual address space is large enough (each mmaped chunk is managed as a new data segment for the process requesting it). sbrk(2) expands and shrinks the data segment, just like a stack, so you have to be careful if you are going to mix your own sbrk(2) calls with an sbrk-based implementation of malloc. The way malloc operates normally prevents you from returning any memory obtained with sbrk(2) if you intermix the calls, so you can only use it safely to grow the data segment.
sbrk(2) also allocates memory in pages. Since paged virtual memory emerged, almost all OS allocation is done in page units. Newer systems even have more than one page size (e.g. 4 KB and 2 MB), which you can take advantage of, depending on the application.
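For reference, a small sketch of what "allocation in page units" means in practice: query the base page size with sysconf(_SC_PAGESIZE) and round a request up to a whole number of pages. The helper name round_up_to_page is made up for this example, and it ignores overflow for sizes near SIZE_MAX.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Round n up to a multiple of the system's base page size.
 * (Hypothetical helper; does not guard against overflow near SIZE_MAX.) */
size_t round_up_to_page(size_t n)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);

    size_t page = (size_t)ps;
    return (n + page - 1) / page * page;
}
```

So a 1-byte mapping still occupies a full page, and a request of one page plus one byte occupies two.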
As 64-bit systems become more widely used, there's no problem in allocating an address space large enough to let both mechanisms live together. This is an advantage for a multiple-heap malloc implementation, as memory is allocated and deallocated independently, and never in LIFO order.
Malloc uses different approaches to allocate memory, but implementations normally try not to interfere with user sbrk(2) usage. You do have to be careful if you intermix malloc(3) calls with sbrk(2) on a system whose malloc is built on sbrk(2): you run the risk of sbrk(2)ing over the data segment that malloc has adjusted, and breaking malloc's internal data structures. You had better not use sbrk(2) yourself if you are using an sbrk(2)-based implementation of malloc.
Finally, to answer your question: mmap(2) allocates memory just as malloc(3) does, so malloc is not, and does not need to be, aware of the memory you allocated for your own use with mmap(2).
What exactly is heap memory?
Whenever a call to malloc is made, memory is assigned from something called the heap. Where exactly is the heap? I know that a program in main memory is divided into an instruction segment, where the program statements reside; a data segment, where global data resides; and a stack segment, where local variables and the corresponding function parameters are stored. Now, what about the heap?
The heap is part of your process's address space. The heap can be grown or shrunk; you manipulate it by calling brk(2) or sbrk(2). This is in fact what malloc(3) does.
Allocating from the heap is more convenient than allocating memory on the stack because it persists after the calling routine returns; thus, you can call a routine, say funcA(), to allocate a bunch of memory and fill it with something; that memory will still be valid after funcA() returns. If funcA() allocates a local array (on the stack) then when funcA() returns, the on-stack array is gone.
A drawback of using the heap is that if you forget to release heap-allocated memory, you may exhaust it. The failure to release heap-allocated memory (e.g., failing to free() memory gotten from malloc()) is sometimes called a memory leak.
Another nice feature of the heap, vs. just allocating a local array/struct/whatever on the stack, is that you get a return value saying whether your allocation succeeded; if you try to allocate a local array on the stack and you run out, you don't get an error code; typically your thread will simply be aborted.
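A minimal sketch of both points (the memory outlives the routine that allocated it, and failure is reported through the return value). The function name make_squares is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocate an array of n ints on the heap and fill it with squares.
 * The array stays valid after this function returns; the caller must
 * free() it. Returns NULL if the allocation fails. */
int *make_squares(size_t n)
{
    assert(n > 0);

    int *v = malloc(n * sizeof *v);
    if (v == NULL)
        return NULL;             /* failure is visible to the caller */

    for (size_t i = 0; i < n; i++)
        v[i] = (int)(i * i);
    return v;
}

/* By contrast, returning the address of a local array would be a bug:
 * the array lives on the stack and is gone once the function returns. */
```

A caller would write something like `int *v = make_squares(5);`, check `v` against NULL, use `v[0]` through `v[4]`, and call `free(v)` when done. Skipping that last step is the memory leak described above.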
The heap is the diametrical opposite of the stack. The heap is a large pool of memory that can be used dynamically – it is also known as the “free store”. This is memory that is not automatically managed – you have to explicitly allocate (using functions such as malloc), and deallocate (e.g. free) the memory. Failure to free the memory when you are finished with it will result in what is known as a memory leak – memory that is still “being used”, and not available to other processes. Unlike the stack, there are generally no restrictions on the size of the heap (or the variables it creates), other than the physical size of memory in the machine. Variables created on the heap are accessible anywhere in the program.
Oh, and heap memory requires you to use pointers.
A summary of the heap:
the heap is managed by the programmer, the ability to modify it is somewhat boundless
in C, variables are allocated and freed using functions like malloc() and free()
the heap is large, and is usually limited by the physical memory available
the heap requires pointers to access it
credit to craftofcoding
Basically, after memory is consumed by the needs of programs, what is left is the heap. In C that will be the memory available for the computer, for virtual machines it will be less than that.
But, this is the memory that can be used at run-time as your program needs memory dynamically.
You may want to look at this for more info:
http://computer.howstuffworks.com/c28.htm
Reading through this, this is actually beyond the realms of C. C doesn't specify that there's a heap behind malloc; it could just as easily be called a linked list; you're just calling it a heap by convention.
What the standard guarantees is that malloc will return either a null pointer or a pointer to an object that has allocated (dynamic) storage duration, and your heap is just one type of data structure that facilitates providing such a storage duration. It's the common choice. Nonetheless, the very developers who wrote your heap have recognised that it might not be a heap, and so you'll see no reference to the term heap in the POSIX malloc manual, for example.
Other things that are beyond the realms of standard C include such details of the machine code binary which is no longer C source code following compilation. The layout details, though typical, are all implementation-specific as opposed to C-specific.
The heap, or whichever book-keeping data structure is used to account for allocations, is generated during runtime; as malloc is called, new entries are (presumably) added to it and as free is called, new entries are (again, presumably) removed from it.
As a result, there's generally no need to have a section in the machine code binary for objects allocated using malloc, however there are cases where applications are shipped standalone baked into microprocessors, and in some of these cases you might find that flash or otherwise non-volatile memory might be reserved for that use.
Suppose we make a malloc request for a memory block of size n, where n != 2^k for any k > 0.
Malloc returns space for the requested memory block, but how is the remaining space in the page handled? I read that pages are generally blocks of memory whose sizes are powers of two.
Wiki states the following:
Like any method of memory allocation, the heap will become fragmented; that is,
there will be sections of used and unused memory in the allocated
space on the heap. A good allocator will attempt to find an unused area
of already allocated memory to use before resorting to expanding the heap.
So my question is how is this tracked?
EDIT: How is the unused memory tracked when using malloc ?
This really depends on the specific implementation, as Morten Siebuhr pointed out already. In very simple cases, there might be a list of free, fixed-size blocks of memory (possibly all having the same size), so the unused memory is simply wasted. Note that real implementations will never use such simplistic algorithms.
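To give an idea of what the "very simple case" looks like, here is a toy sketch of a free list of fixed-size blocks. Everything in it (the names pool_alloc and pool_free, the 64-byte block size, the 16-block pool) is invented for illustration, and as noted above, real allocators are considerably more elaborate.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE  64           /* every block has this size (assumed) */
#define BLOCK_COUNT 16           /* total blocks in the pool (assumed) */

/* While a block is free, its own memory holds the link to the next free
 * block, so tracking unused memory needs no extra storage. */
typedef union block {
    union block  *next;
    unsigned char payload[BLOCK_SIZE];
} block_t;

static block_t  pool[BLOCK_COUNT];
static block_t *free_list = NULL;
static int      pool_ready = 0;

static void pool_init(void)
{
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        pool[i].next = free_list;   /* push each block onto the free list */
        free_list = &pool[i];
    }
    pool_ready = 1;
}

/* Returns a BLOCK_SIZE-byte block, or NULL if the pool is exhausted. */
void *pool_alloc(void)
{
    if (!pool_ready)
        pool_init();
    block_t *b = free_list;
    if (b != NULL)
        free_list = b->next;        /* pop the first free block */
    return b;
}

/* Puts a block obtained from pool_alloc back on the free list. */
void pool_free(void *p)
{
    assert(p != NULL);
    block_t *b = p;
    b->next = free_list;
    free_list = b;
}
```

A request for 10 bytes still consumes a whole 64-byte block here, which is the wasted space the answer refers to.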
This is an overview over some simple possibilities: http://www.osdcom.info/content/view/31/39/
This Wikipedia entry has several interesting links, including the one above: http://en.wikipedia.org/wiki/Dynamic_memory_allocation#Implementations
As a final remark, googling "malloc implementation" turns up a heap (pun intended) of valuable links.
A standard BSD-style memory allocator basically works like this:
It keeps a linked list of pre-allocated memory blocks for sizes 2^k for k<=12 (for example).
In reality, each list for a given k is composed of memory-blocks from different areas, see below.
A malloc request for n bytes is serviced by calculating n', the closest 2^k >= n, then looking up the first area in the list for k, and then returning the first free block in the free-list for the given area.
When there is no pre-allocated memory block for size 2^k, an area is allocated, an area being some larger piece of contiguous memory, say a 4kB piece of memory. This piece of memory is then chopped up into pieces that are 2^k bytes. At the beginning of the contiguous memory area there is book-keeping information such as where to find the linked list of free blocks within the area. A bitmap can also be used, but a linked list typically has better cache behavior (you want the next allocated block to return memory that is already in the cache).
The reason for using areas is that free(ptr) can be implemented efficiently. ptr & 0xfffff000 in this example points to the beginning of the area which contains the book-keeping structures and makes it possible to link the memory block back into the area.
The BSD allocator will waste space by always returning a memory block 2^k in size, but it can reuse the memory of the block to keep the free-list, which is a nice property. Also allocation is blazingly fast.
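The two calculations in that description (picking the bucket for a request, and getting from a pointer back to its area) can be sketched as follows. The function names are made up, the 4 kB area size and the k <= 12 limit are taken from the example above, and the mask is written in a width-independent way rather than as the 32-bit constant 0xfffff000.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AREA_SIZE  ((uintptr_t)4096)   /* 4 kB areas, as in the example */
#define MAX_BUCKET 12                  /* largest bucket is 2^12 bytes */

/* Smallest k such that 2^k >= n, for 1 <= n <= 2^MAX_BUCKET. */
unsigned bucket_index(size_t n)
{
    assert(n >= 1 && n <= ((size_t)1 << MAX_BUCKET));

    unsigned k = 0;
    size_t size = 1;
    while (size < n) {
        size <<= 1;
        k++;
    }
    return k;
}

/* Number of bytes actually handed out for a request of n bytes. */
size_t bucket_size(size_t n)
{
    return (size_t)1 << bucket_index(n);
}

/* Start address of the area containing ptr, assuming areas are aligned
 * to AREA_SIZE. This is where the book-keeping data would live. */
uintptr_t area_base(const void *ptr)
{
    return (uintptr_t)ptr & ~(AREA_SIZE - 1);
}
```

For example, a 33-byte request lands in the 64-byte bucket, so 31 bytes are wasted, and any pointer inside an area maps back to the area's first byte by clearing its low 12 bits.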
Modifications to the above general idea include:
Using anonymous mmap for large allocations. This shifts the work over to the kernel for handling large mallocs and avoids wasting a lot of memory in these cases.
The GNU version of malloc has special cases for non-power-of-two buckets. There is nothing inherent in the BSD allocator that requires returning 2^k memory blocks, only that there are pre-defined bucket sizes. The GNU allocator has more buckets and thus wastes less space.
Sharing memory between threads is a tricky subject. Lock contention during allocation is an important consideration, so the GNU allocator, for example, will eagerly create extra areas for different threads for a given bucket size if it ever encounters lock contention during allocation.
This varies a lot from implementation to implementation. Some waste the space, some sub-divide pages until they get the requested size (or close to it) &c.
If you are asking out of curiosity, I suggest you read the source code for the implementation in question.
If it's because of performance worries, try to benchmark it and see what happens.