In linux, is calloc exactly the same as malloc + memset or does this depend on the exact linux/kernel version?
I am particularly interested in the question of whether you can calloc more RAM than you physically have (as you can certainly malloc more RAM than you physically have, you just can't write to it). In other words, does calloc always actually write to the memory you have been allocated as the specs suggest it should.
Of course, that depends on the implementation, but on a modern day Linux, you probably can. Easiest way is to try it, but I'm saying this based on the following logic.
You can malloc more than the memory you have (physical + virtual) because the kernel delays allocation of your memory until you actually use it. I believe that's to increase the chances of your program not failing due to memory limits, but that's not the question.
calloc is the same as malloc but zero initializes the memory. When you ask Linux for a page of memory, Linux already zero-initializes it. So if calloc can tell that the memory it asked for was just requested from the kernel, it doesn't actually have to zero initialize it! Since it doesn't, there is no access to that memory and therefore it should be able to request more memory than there actually is.
As mentioned in the comments this answer provides a very good explanation.
Whether calloc needs to write to the memory depends on whether it got the allocation from heap pages that are already assigned to the process, or it had to request more memory be assigned to the process by the kernel (using a system call such as sbrk() or mmap()). When the kernel assigns new memory to a process, it always zeroes it first (typically using a VM optimization, so it doesn't actually have to write to the page). But if it's reusing memory that was assigned previously, it has to use memset() to zero it.
It is not mentioned in the cited duplicate or here. Linux uses virtual memory and can allocate more memory that physically available in the system. A naive implementation of calloc() that simply does a malloc() plus memset() in user space will touch every page.
As Linux typically allocates in 4k chunks, all of the calloc() blocks are the same and initially read as zero. That is the same 4k chunk of memory can be mapped read only and the entire calloc() space in only taking up approximately size/4k * pointer_size + 4k. As the program writes to the calloc() space, a page fault happens and Linux will allocate a new page (4k) and resume the program.
This is called copy-on-write or COW for short. malloc() will generally behave the same way. For small sizes, the 'C' library will use binning and share 4k pages with other small sized allocation.
So, there are typically two layers involved.
Linux kernel's process memory management.
glibc heap management.
If the memory size requested is large and requires new memory allocated to the process, then most of the above applies (via Linux's process memory management). However, if the memory requested is small, then it will be like a malloc() plus memset(). In the large allocation size, the memset() is damaging as it touches the memory and the kernel thinks it needs a new page to allocate.
You can't malloc(3) more ram than the kernel gives the process doing the malloc(3)-ing. malloc(3) returns NULL if you can't allocate the amount of memory you want to allocate. In addition, malloc(3) and memset(3) are defined by your c library (libc.so) and not your kernel. The Linux kernel defines mmap(2) and other low-level memory allocation functions, not the *alloc(3) family (excluding kalloc()).
Related
I'm new to C and heap memory, still struggling to understand dynamic memory allocation.
I traced Linux system calls and found that if I use malloc to request a small amount of heap memory, then malloc calls brk internally.
But if I use malloc to request a very large amount of heap memory, then malloc calls mmap internally.
So there must be a big difference between brk and mmap, but theoretically we should be able to use brk to allocate heap memory regardless of the requested size. So why does malloc call mmap when allocating a large amount of memory?
so why malloc calls mmap when it comes to allocate a large size of memory?
The short answer is for improved efficiency on newer implementations of Linux, and the updated memory allocation algorithms that come with them. But keep in mind that this is a very implementation dependent topic, and the whys and wherefores would vary greatly for differing vintages and flavors of the specific Linux OS being discussed.
Here is fairly recent write-up regarding the low-level parts mmap() and brk() play in Linux memory allocation. And, a not so recent, but still relevant Linux Journal article that includes some content that is very on-point for the topic here, including this:
For very large requests, malloc() uses the mmap() system call to find
addressable memory space. This process helps reduce the negative
effects of memory fragmentation when large blocks of memory are freed
but locked by smaller, more recently allocated blocks lying between
them and the end of the allocated space. In this case, in fact, had
the block been allocated with brk(), it would have remained unusable
by the system even if the process freed it.
(emphasis mine)
Regarding brk():
incidentally, "...mmap() didn't exist in the early versions of Unix. brk() was the only way to increase the size of the data segment of the process at that time. The first version of Unix with mmap() was SunOS in the mid 80's, the first open-source version was BSD-Reno in 1990.". Since that time, modern implementation of memory allocation algorithms have been refactored with many improvements, greatly reducing the need for them to include using brk().
mmap (when used with MAP_ANONYMOUS) allocates a chunk of RAM that can be placed anywhere within the process's virtual address space, and that can be deallocated later (with munmap) independently of all other allocations.
brk changes the ending address of a single, contiguous "arena" of virtual address space: if this address is increased it allocates more memory to the arena, and if it is decreased, it deallocates the memory at the end of the arena. Therefore, memory allocated with brk can only be released back to the operating system when a continuous range of addresses at the end of the arena is no longer needed by the process.
Using brk for small allocations, and mmap for big allocations, is a heuristic based on the assumption that small allocations are more likely to all have the same lifespan, whereas big allocations are more likely to have a lifespan that isn't correlated with any other allocations' lifespan. So, big allocations use the system primitive that lets them be deallocated independently from anything else, and small allocations use the primitive that doesn't.
This heuristic is not very reliable. The current generation of malloc implementations, if I remember correctly, has given up altogether on brk and uses mmap for everything. The malloc implementation I suspect you are looking at (the one in the GNU C Library, based on your tags) is very old and mainly continues to be used because nobody is brave enough to take the risk of swapping it out for something newer that will probably but not certainly be better.
brk() is a traditional way of allocating memory in UNIX -- it just expands the data area by a given amount. mmap() allows you to allocate independent regions of memory without being restricted to a single contiguous chunk of virtual address space.
malloc() uses the data space for "small" allocations and mmap() for "big" ones, for a number of reasons, including reducing memory fragmentation. It's just an implementation detail you shouldn't have to worry about.
Please check this question also.
Reducing fragmentation is commonly given as the reason why mmap is used for large allocations; see ryyker’s answer for details. But I think that’s not the real benefit nowadays; in practice there’s still fragmentation even with mmap, just in a larger pool (the virtual address space, rather than the heap).
The big advantage of mmap is discardability.
When allocating memory with sbrk, if the memory is actually used (so that the kernel maps physical memory at some point), and then freed, the kernel itself can’t know about that, unless the allocator also reduces the program break (which it can’t if the freed block isn’t the topmost previously-used block under the program break). The result is that the contents of that physical memory become “precious” as far as the kernel is concerned; if it ever needs to re-purpose that physical memory, it then has to ensure that it doesn’t lose its contents. So it might end up swapping pages out (which is expensive) even though the owning process no longer cares about them.
When allocating memory with mmap, freeing the memory doesn’t just return the block to a pool somewhere; the corresponding virtual memory allocation is returned to the kernel, and that tells the kernel that any corresponding physical memory, dirty or otherwise, is no longer needed. The kernel can then re-purpose that physical memory without worrying about its contents.
the key part of the reason I think, which I copied from the chat said by Peter
free() is a user-space function, not a system call. It either hands them back to the OS with munmap or brk, or keeps them dirty in user-space. If it doesn't make a system call, the OS must preserve the contents of those pages as part of the process state.
So when you use brk to increase your memory adress, when return back, you have to use the brk a negtive value, so brk only can return the most recently memory block you allocated, when you call malloc(huge), malloc(small), free(huge). the huge cannot be returned back to system, you can only maintain a list of fragmentation for this process, so the huge is actually hold by this process. this is the drawback of brk.
but the mmap and munmap can avoid this.
I want to emphasize another view point.
malloc is system function that allocate memory.
You do not really need to debug it, because in some implementations, it might give you memory from static "arena" (e.g. static char array).
In some other implementations it may just return null pointer.
If you want to see what mallow really do, I suggest you look at
http://gee.cs.oswego.edu/dl/html/malloc.html
Linux gcc malloc is based on this.
You can take a look at jemalloc too. It basically uses same brk and mmap, but organizes the data differently and usually is "better".
Happy researching.
I used C library malloc to allocate 8MB memory, after using that memory I used free to free the 8MB memory.
But when i used malloc again to allocate 8MB of memory, it is allocating the same location as allocated previously.
How to avoid this problem and why does this occur?
EDIT: I'm implementing a tool to test main memory, if malloc allocates the same block of memory it is not possible to check the whole memory
This is not a problem per se, and is by design. Typical implementations of malloc will recycle blocks of memory for reasons of performance. In any case, since malloc returns addresses from a limited pool of values, there's no way it could guarantee not to recycle blocks.
The only sure fire way to stop malloc returning blocks that is has returned before is to stop freeing them. Of course, that's not really very practical.
I'm implementing a tool to test main memory. If malloc allocates the same block of memory it is not possible to check the whole memory.
Your tool to test main memory cannot be implemented with malloc, or indeed by any user mode program. Modern operating systems don't give you access to physical memory. Rather they present a virtualized view of memory. The addresses in your program are not physical addresses, they are virtual address. Testing physical memory requires you to go in at a much lower level than is possible from a user mode program.
This should help you
How malloc works?
To prevent this, allocate a few bytes using malloc/calloc and then free the bigger chunk of memory.
BTW this is not a wrong behavior to get the same memory address.
You might want to call system() to run a few linux commands (that provide detailed memory mgmt options) from your code.Main memory management cannot be done/tested using malloc/free, they are limited to operating on the memory allocated to your program when it is running.
I have allocated memory using malloc() in embedded Linux (around 10 MB). And checked the free memory it was 67080 kB but even after freeing it using free() it remains the same. It is only after the application is terminated the memory is available again. Does free() not make the freed memory available to the system, if so how to make it available.
free is a libc library call. it marks heap space as available for reuse. It does not guarantee that the associated virtual mapping will be released. Only after a dirty virtual mapping is released by your OS, then that memory will be system wide free again. This can only happen in chunks of pages.
Also if you allocated memory using malloc and family and didn't use it then it didn't actually consume physical memory until then - so freeing it will do nothing.
Does free() not make the freed memory available to the system.
No, usually not.
malloc() normally requests memory from the OS by the low level sbrk() or mmap() call. Once assigned to the application, free() just returns the memory to a memory pool that belongs to the application. That is, it's not returned back to the OS for use in another process. (Though some heuristics are in-place to do so in certain circumstances).
If swap space is in place, this becomes less of a problem, the OS will swap out the unused memory of applications to make room for additional physical memory that's required.
if so how to make it available.
Exit the application.
Or you would need to write your own memory allocator that could do this.(which in the general case is not an easy task especially if you don't want to sacrifice overhead and speed).
For a relatively big single piece of 10MB, you could simply request anonymous memory with mmap() and the memory will be released back to the OS when you munmap() that piece of memory.
Taken from the malloc 3 man page:
Normally, malloc() allocates memory from the heap, and adjusts the
size of the heap as required, using sbrk(2). When allocating blocks
of memory larger than MMAP_THRESHOLD bytes, the glibc malloc()
implementation allocates the memory as a private anonymous mapping
using mmap(2). MMAP_THRESHOLD is 128 kB by default, but is
adjustable using mallopt(3)
You can try to modify the MMAP_THRESHOLD so that by using malloc you are invoking mmap. If you do so, free guarantees that the memory allocated through mmap will return back to the system as soon as you free it.
Your malloc() calls obtain memory from the system, and maintain a heap data structure for keeping track of used and free memory within the process. Your free() calls return memory to the heap, where they are marked free, but they're still part of the process's memory.
If you want memory deallocation to return pages to the system, you'll have to write your own memory manager, but keep in mind that it'll only be able to completely free memory under the right conditions: It depends on the behavior of your application, whether your allocations and deallocations span page boundaries and cleanly de-fragment, etc. You need to understand the memory allocation behavior of your application to know whether this will be any benefit.
I have written my own my_malloc() function that manages its own physical memory. In my application I want to be able use both the libc malloc() as well as my own my_malloc() function. So I somehow need to partition the virtual address space, malloc should always assign a virtual address only if its from its dedicated pool, same thing with my_malloc(). I cannot limit heap size, I just need to guarantee that malloc() and my_malloc() never return the same/overlapping virtual addresses.
thanks!
One possibility would be to have my_malloc() call malloc() at startup to pre-allocate a large pool of memory, then apportion that memory to its callers and manage it accordingly. However, a full implementation would need to handle garbage collection and defragmentation.
Another possibility would be to have my_malloc() call malloc() each time it needs to allocate memory and simply handle whatever "bookkeeping" aspects you're interested in, such as number of blocks allocated, number of blocks freed, largest outstanding block, total allocated memory, etc. This is by far the safer and more efficient mechanism, as you're passing all of the "hard" operations to malloc().
One answer is to make your my_malloc use memory allocated by malloc. Using big enough blocks would mostly acheive that; then within each block your version will maintain its own structures and allocate parts of it to callers.
It gets tricky because you can't rely on extending the entire available space for your version, as you can when you get memory from sbrk or similar. So your version would have to maintain several blocks.
Reserve a large block of virtual address space, and have that be the pool from which my_malloc() allocates. Once you have reserved a large contiguous region of memory from the OS, then subsequent calls to malloc() have to come from elsewhere.
For example, on Windows, you can use VirtualAlloc() to reserve a 256mb block of space. The memory won't actually be allocated until you "commit" it with a subsequent call, but it will reserve an address range (such as 0x4000000-0x5000000) which subsequent malloc() will not use. Then your my_malloc() can commit blocks out of this reserved range as requested, and subdivide them by whatever allocation scheme you've written.
I'm told the equivalent Linux call is mmap(). (edit: I previously said "kmalloc or vmalloc, depending on whether you need the memory to be physically contiguous or not," but those are kernel-level functions.)
We use this mechanism in our app to redirect all allocations of a certain size into our own custom pooled-block allocator for speed and efficiency. Among other things, it lets us reserve virtual pages in certain specific sizes that are more efficient for the CPU to handle.
If you add an mmap(2) call near the start of the program, you can allocate as much memory as you need with whatever addresses you need (see the hint, that's usually left NULL for the OS to determine) immediately; that will prevent malloc(3), or any other memory allocation routines, from getting those particular pages.
Don't worry about the memory usage; since modern systems are quite happy to overcommit, you'll only use a few hundred kilobytes more kernel space to handle page tables. Not too bad.
How malloc call managed by user-library. I need the explanation of "How memory is being allocated in user space when malloc is called. Who manage it. Like sbrk() is called to enter in kernel space".
The C runtime library manages the heap. The heap has some preallocated free store. If the runtime can't find a contiguous block there it tries to request more memory from the operating system - calls sbrk().
If the latter fails "out of memory" is reported - malloc() returns a null pointer. If additional memory is requested successfully and the received chunk is bigger that what the malloc() caller asked for the block in chunk is divided - one part is marked as occupied and returned to the caller and the other one is added to the free store.
From the point when sbrk() returned successfully the memory chunk belongs to the calling program address space.
The malloc() package of functions manages the space. It obtains relatively large chunks of memory from the system using sbrk() and passes out smaller chunks to its callers as requested, using whichever of the many possible algorithms it is designed to use. The free() function places released memory back into its list of 'available for use' memory. Very seldom does it actually release memory back to the operating system itself.
There are many articles on the design of different versions of malloc(). There are many debugging versions of malloc(), in particular, which look for abuses of the allocated memory. You can read about memory allocation in Knuth 'The Art of Computer Programming'; it's in volume 1 in my memory serves.