I am allocating the array and freeing it every callback of an audio thread. The main user thread (a web browser) is constantly allocating and deallocating memory based on user input. I am sending the uninited float array to the audio card. (example in my page from my profile.) The idea is to hear program state changes.
When I call malloc(sizeof(float)*256*13) or smaller, I get an array filled with a wide range of floats which have a seemingly random distribution. It is not right to call it random - presumably this comes from whatever the memory block previously held. This is the behavior I expected and want to exploit. However, when I call malloc(sizeof(float)*256*14) or larger, I get an array filled only with zeros. I would like to know why this cliff exists and if there's something I can do to get around it. I know it is undefined behavior per the standard, but I'm hoping someone who knows the implementation of malloc on some system might have an explanation.
Does this mean malloc is also memsetting the block to zero for larger sizes? This would be surprising since it wouldn't be efficient. Even if there are more chunks of memory zeroed out, I'd expect something to happen sometimes, since the arrays are constantly changing.
If possible I would like to be able to obtain chunks of memory that are reallocated over recently freed memory, so any alternatives would be welcomed.
I guess this is a strange question for some because my goal is to explore undefined behavior and use bad programming practices deliberately, but this is the application I am interested in making, so please bear with the usage of uninited arrays. I know the behavior of such usage is undefined, so please bear with me and don't tell me not to do it. I'm developing on Mac OS X 10.5.
Most likely, the larger allocations result in the heap manager directly requesting pages of virtual address space from the kernel. Freeing will return that address space back to the kernel. The kernel must zero all pages that are allocated for a process - this is to prevent data leaking from one process to another.
Smaller allocations are handled by the user-mode heap manager within the process by taking these larger page allocations from the kernel, carving them up into smaller blocks, and reusing blocks on subsequent allocations. These do not need to be zero-initialized, since the memory contents always comes from your own process.
What you'll probably find is that previous requests could be filled using smaller blocks joined together. But when you request the bigger memory, then the existing free memory probably can't handle that much and flips some inbuilt switch for a request direct from the OS.
I'm writing a simple malloc implementation for a college project. One of the tasks is to sometimes give back freed memory to the OS (the example given was of a process using say 1GB malloc-ed memory during a period, and afterwards it only uses 100MB memory until it terminates), however I'm not sure how to implement this. I was thinking of periodically checking the amount of memory the process has allocated and the amount freed and, if possible, give back some of the freed pages to the OS, but I'm not sure if this is an efficient approach.
EDIT: I didn't realize when I first wrote this, but the way I worded this is too vague. By "unused memory" I'm talking specifically about freed one.
Asking the OS for memory or returning it are (relatively) expensive operations because they require a switch from user mode to kernel mode and back. For that reason, in most implementations, malloc only asks the OS for large chunks, allocates internally from those chunks, and manages freed memory with a list of free blocks. In that case, it only returns memory to the OS when a full chunk is present in the free list.
For a custom implementation, the rule for returning memory to the system is up to the programmer (you...).
Here's my question: Does calling free or delete ever release memory back to the "system". By system I mean, does it ever reduce the data segment of the process?
Let's consider the memory allocator on Linux, i.e ptmalloc.
From what I know (please correct me if I am wrong), ptmalloc maintains a free list of memory blocks and when a request for memory allocation comes, it tries to allocate a memory block from this free list (I know, the allocator is much more complex than that but I am just putting it in simple words). If, however, it fails, it gets the memory from the system using say sbrk or brk system calls. When a memory is free'd, that block is placed in the free list.
Now consider this scenario, on peak load, a lot of objects have been allocated on heap. Now when the load decreases, the objects are free'd. So my question is: Once the object is free'd will the allocator do some calculations to find whether it should just keep this object in the free list or depending upon the current size of the free list it may decide to give that memory back to the system i.e decrease the data segment of the process using sbrk or brk?
Documentation of glibc tells me that if the allocation request is much larger than page size, it will be allocated using mmap and will be directly released back to the system once free'd. Cool. But let's say I never ask for allocation of size greater than say 50 bytes and I ask a lot of such 50 byte objects on peak load on the system. Then what?
From what I know (correct me please), a memory allocated with malloc will never be released back to the system ever until the process ends i.e. the allocator will simply keep it in the free list if I free it. But the question that is troubling me is then, if I use a tool to see the memory usage of my process (I am using pmap on Linux, what do you guys use?), it should always show the memory used at peak load (as the memory is never given back to the system, except when allocated using mmap)? That is memory used by the process should never ever decrease(except the stack memory)? Is it?
I know I am missing something, so please shed some light on all this.
Experts, please clear my concepts regarding this. I will be grateful. I hope I was able to explain my question.
There isn't much overhead for malloc, so you are unlikely to achieve any run-time savings. There is, however, a good reason to implement an allocator on top of malloc, and that is to be able to trace memory leaks. For example, you can free all memory allocated by the program when it exits, and then check to see if your memory allocator calls balance (i.e. same number of calls to allocate/deallocate).
For your specific implementation, there is no reason to free() since the malloc won't release to system memory and so it will only release memory back to your own allocator.
Another reason for using a custom allocator is that you may be allocating many objects of the same size (i.e you have some data structure that you are allocating a lot). You may want to maintain a separate free list for this type of object, and free/allocate only from this special list. The advantage of this is that it will avoid memory fragmentation.
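A minimal sketch of that last idea, combined with the call-balance counter mentioned earlier: a per-type free list layered on top of malloc. The slot type, its 64-byte payload size and the function names are assumptions made for this example, and the code is not thread-safe.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical fixed-size slot: while free, the storage holds the
   free-list link; while in use, it holds the caller's object. */
typedef union slot {
    union slot *next;
    unsigned char payload[64];
} slot;

static slot *free_slots = NULL;  /* slots freed earlier, ready for reuse */
static long  slots_live = 0;     /* allocs minus frees; 0 at exit means balanced */

/* Take a slot from the free list, falling back to malloc when it is empty. */
static void *slot_alloc(void)
{
    slot *s = free_slots;
    if (s != NULL)
        free_slots = s->next;
    else
        s = malloc(sizeof *s);
    if (s != NULL)
        slots_live++;
    return s;
}

/* Put a slot back on the free list instead of calling free(). */
static void slot_free(void *p)
{
    if (p == NULL)
        return;
    slot *s = p;
    s->next = free_slots;
    free_slots = s;
    slots_live--;
}
```

Because every slot is the same size, any freed slot can satisfy any later request, so this list cannot fragment. Checking slots_live at exit gives the allocate/deallocate balance check.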
No.
It's actually a bad strategy for a number of reasons, so it doesn't happen --except-- as you note, there can be an exception for large allocations that can be directly made in pages.
It increases internal fragmentation and therefore can actually waste memory. (You can only return aligned pages to the OS, so pulling aligned pages out of a block will usually create two guaranteed-to-be-small blocks --smaller than a page, anyway-- to either side of the block. If this happens a lot you end up with the same total amount of usefully-allocated memory plus lots of useless small blocks.)
A kernel call is required, and kernel calls are slow, so it would slow down the program. It's much faster to just throw the block back into the heap.
Almost every program will either converge on a steady-state memory footprint or it will have an increasing footprint until exit. (Or, until near-exit.) Therefore, all the extra processing needed by a page-return mechanism would be completely wasted.
It is entirely implementation dependent. On Windows VC++ programs can return memory back to the system if the corresponding memory pages contain only free'd blocks.
I think that you have all the information you need to answer your own question. pmap shows the memory that is currently being used by the process. So, if you call pmap before the process reaches peak memory, then no, it will not show peak memory. If you call pmap just before the process exits, then it will show peak memory for a process that does not use mmap. If the process uses mmap, then if you call pmap at the point where maximum memory is being used, it will show peak memory usage, but this point may not be at the end of the process (it could occur anywhere).
This applies only to your current system (i.e. based on the documentation you have provided for free, mmap and malloc), but as the previous poster has stated, the behavior of these is implementation dependent.
This varies a bit from implementation to implementation.
Think of your memory as a massive long block, when you allocate to it you take a bit out of your memory (labeled '1' below):
111
If I allocate more memory with malloc, it gets some from the system:
1112222
If I now free '1':
___2222
It won't be returned to the system, because '2' is in front of it (and memory is given as a contiguous block). However, if the end of the memory is freed, then that memory is returned to the system. If I freed '2' instead of '1', I would get:
111
the bit where '2' was would be returned to the system.
The main benefit of freeing memory is that that bit can then be reallocated, as opposed to getting more memory from the system. e.g:
33_2222
I believe that the memory allocator in glibc can return memory back to the system, but whether it will or not depends on your memory allocation patterns.
Let's say you do something like this:
void *pointers[10000];
int i;
/* make 10000 small allocations */
for (i = 0; i < 10000; i++)
    pointers[i] = malloc(1024);
/* free all of them except the last one */
for (i = 0; i < 9999; i++)
    free(pointers[i]);
The only part of the heap that can be safely returned to the system is the "wilderness chunk", which is at the end of the heap. This can be returned to the system using another sbrk system call, and the glibc memory allocator will do that when the size of this last chunk exceeds some threshold.
The above program would make 10000 small allocations, but only free the first 9999 of them. The last one should (assuming nothing else has called malloc, which is unlikely) be sitting right at the end of the heap. This would prevent the allocator from returning any memory to the system at all.
If you were to free the remaining allocation, glibc's malloc implementation should be able to return most of the pages allocated back to the system.
If you're allocating and freeing small chunks of memory, a few of which are long-lived, you could end up in a situation where you have a large chunk of memory allocated from the system, but you're only using a tiny fraction of it.
Here are some "advantages" to never releasing memory back to the system:
Having already used a lot of memory makes it very likely you will do so again, and
when you release memory the OS has to do quite a bit of paperwork
when you need it again, your memory allocator has to re-initialise all its data structures in the region it just received
Freed memory that isn't needed gets paged out to disk where it doesn't actually make that much difference
Often, even if you free 90% of your memory, fragmentation means that very few pages can actually be released, so the effort required to look for empty pages isn't terribly well spent
Many memory managers can perform TRIM operations where they return entirely unused blocks of memory to the OS. However, as several posts here have mentioned, it's entirely implementation dependent.
But lets say I never ask for allocation of size greater than say 50 bytes and I ask a lot of such 50 byte objects on peak load on the system. Then what ?
This depends on your allocation pattern. Do you free ALL of the small allocations? If so, and if the memory manager has special handling for small block allocations, then this may be possible. However, if you allocate many small items and then free all but a few scattered items, you may fragment memory and make it impossible to TRIM blocks, since each block will still hold a few straggling allocations. In this case, you may want to use a different allocation scheme for the temporary allocations and the persistent ones so you can return the temporary allocations to the OS.
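On glibc specifically, a trim can be requested explicitly with malloc_trim() from <malloc.h>. The sketch below allocates a batch of small blocks, frees all of them, and then asks for a trim. The function names are made up for this example, the whole thing is guarded so it still compiles where glibc is not the C library, and whether anything is actually released depends on the allocator state at the time.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef __GLIBC__
#include <malloc.h>   /* malloc_trim is a glibc extension */
#endif

/* Hypothetical wrapper: returns 1 if memory was released to the OS,
   0 if nothing could be released, -1 if trimming is not available. */
static int trim_heap(void)
{
#ifdef __GLIBC__
    return malloc_trim(0);
#else
    return -1;
#endif
}

/* Allocate `count` blocks of `size` bytes, free every one of them,
   then request a trim. Returns trim_heap()'s result, or -1 if the
   bookkeeping array itself could not be allocated. */
static int churn_then_trim(size_t count, size_t size)
{
    assert(count > 0);
    void **blocks = malloc(count * sizeof *blocks);
    if (blocks == NULL)
        return -1;
    for (size_t i = 0; i < count; i++)
        blocks[i] = malloc(size);     /* NULL entries are fine: free(NULL) is a no-op */
    for (size_t i = 0; i < count; i++)
        free(blocks[i]);
    free(blocks);
    return trim_heap();
}
```

If you leave a few scattered blocks allocated before calling trim_heap(), you would expect less (or nothing) to be returned, which is the fragmentation problem described above.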
Is it possible to 'reserve' memory before a malloc() call? In other words, can I do something (perhaps OS-specific) which ensures there is a certain amount of free memory available, so that you know that your next malloc() (or realloc() etc.) call won't return NULL due to lack of memory?
The 'reservation' or 'pre-allocation' can fail just like a malloc, but if it succeeds, I want to be sure my next malloc() succeeds.
Notes:
Yes, I know, I want to allocate memory before allocating memory. That's exactly right. The thing is the later allocations are not really under my control and I want to be able to assume they succeed.
Bonus points for an answer regarding multi-threaded code as well.
My motivation: I was considering adopting the use of glib for my C development, but apparently it abort()s when it fails to allocate memory, and that's not acceptable to me.
Perhaps a solution which dynamically replaces the malloc symbol with something else? Or the symbol for the function wrapping the sbrk system call?
With glibc you can hook the allocation functions:
https://www.gnu.org/software/libc/manual/html_node/Hooks-for-Malloc.html
Now that you control memory allocation in your program, you can do what you like, including writing a function to reserve a (possibly thread-local, since you asked about multi-threading) chunk of memory from the system that future calls to your malloc and realloc hooks will use to return memory.
Obviously, you need to somehow know in advance an upper bound how much memory will be required by the series of malloc calls that you need to not fail.
Back in the old Mac Toolbox days it was extremely common to use a chunk of memory called a "rainy-day fund." You'd allocate enough memory such that if you freed it, there'd be enough free memory to throw up a dialog box explaining that the app had run out of memory, save your work, and exit. Then you'd keep that pointer around until malloc() returned null, and at least you'd be guaranteed to be able to deal with it gracefully.
That was on a 100% real-memory system, though, and things these days are very different. Still, if we're talking about those small and simple real-memory systems that still exist, then a similar strategy still makes sense.
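For what it's worth, the rainy-day fund can be sketched in a few lines of C. The names here are invented for the illustration, and on a modern virtual-memory system holding the block guarantees much less than it did on the old Toolbox.

```c
#include <assert.h>
#include <stdlib.h>

/* The reserve block; NULL once it has been spent (or never set up). */
static void *rainy_day_fund = NULL;

/* Set aside `bytes` at startup. Returns 1 on success, 0 on failure. */
static int rainy_day_reserve(size_t bytes)
{
    assert(rainy_day_fund == NULL);
    rainy_day_fund = malloc(bytes);
    return rainy_day_fund != NULL;
}

/* malloc wrapper: if the allocation fails, release the fund once and
   retry. *fund_spent is set to 1 when the fund had to be released, so
   the caller knows it is time to warn the user, save, and exit. */
static void *malloc_with_fund(size_t bytes, int *fund_spent)
{
    void *p = malloc(bytes);
    if (p == NULL && rainy_day_fund != NULL) {
        free(rainy_day_fund);
        rainy_day_fund = NULL;
        if (fund_spent != NULL)
            *fund_spent = 1;
        p = malloc(bytes);   /* may still fail if the request is too large */
    }
    return p;
}
```

The fund only helps if the failing request (plus whatever the cleanup path needs) fits in the block that was just freed, so its size has to be chosen with the shutdown code in mind.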
I realize the following does not directly answer your question with respect to malloc(). It is instead an attempt to offer up another avenue that might be applicable to your situation.
For a few years I was dealing with certified embedded systems. Two of the constraints were that 1) we were forbidden to free memory and 2) we were forbidden from allocating memory beyond a certain point during the initialization process. This was because fragmentation that could result from dynamic memory allocations and deallocations made it too costly to certify (and guarantee that allocations would succeed).
Our solution was to allocate pools of memory during the early initialization process. The blocks of memory handled by a given pool would all be the same size, thereby avoiding the fragmentation issue. Different pools would handle differently sized memory blocks for a different purpose. This meant that we had to allocate enough memory up front for our worst case memory consumption scenario as well as manage those pools ourselves.
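A rough sketch of one such pool follows. The block size, block count and function names are placeholders for this example rather than what we actually shipped, and it assumes a C11 compiler for max_align_t. All storage is static, so nothing is requested from malloc after startup.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BLOCK_SIZE  64    /* assumed payload size for this pool */
#define POOL_BLOCK_COUNT 128   /* assumed worst-case number of blocks */

/* A block is either on the free list (next is valid) or in use
   (bytes belong to the caller). max_align_t keeps it suitably aligned. */
typedef union pool_block {
    union pool_block *next;
    max_align_t align;
    unsigned char bytes[POOL_BLOCK_SIZE];
} pool_block;

static pool_block  pool_storage[POOL_BLOCK_COUNT];
static pool_block *pool_free_list = NULL;

/* Call once during early initialization: thread every block onto the free list. */
static void pool_init(void)
{
    pool_free_list = NULL;
    for (size_t i = 0; i < POOL_BLOCK_COUNT; i++) {
        pool_storage[i].next = pool_free_list;
        pool_free_list = &pool_storage[i];
    }
}

/* Take a block; returns NULL when the pool is exhausted. */
static void *pool_get(void)
{
    pool_block *b = pool_free_list;
    if (b != NULL)
        pool_free_list = b->next;
    return b;
}

/* Hand a block back to the pool it came from. */
static void pool_put(void *p)
{
    assert(p != NULL);
    pool_block *b = p;
    b->next = pool_free_list;
    pool_free_list = b;
}
```

Since every block in a pool is the same size, the pool cannot fragment, and the worst case is simply POOL_BLOCK_COUNT blocks in use at once.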
Hope this helps.
Obviously there's no magic way for your program to ensure your system has an arbitrary amount of memory, but you can get the memory as soon as your process starts, so that it won't fail unexpectedly part way through the work/day when it'll be a right pain.
On some OSes, simply doing a big malloc then freeing the memory immediately will still have called sbrk or similar OS function to grow your process memory, but even that's not a great solution because getting virtual address space is still a ways short of getting physical memory to back it when needed, so you'd want to write through that memory with some noise values, then even if it's swapped out to disk while unused you can expect that the virtual memory of the system is committed to your memory needs and will instead deny other programs (or smaller new/malloc requests you make later ;-P) memory should the system be running short.
Another approach is to seek an OS specific function to insist on locking memory pages in physical memory, such as mlock(2) on Linux.
(These kind of "I'm the most important thing on the server" assumptions tend to make for a fragile system once a few running programs have all taken that attitude....)
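As a sketch of the two steps above (write through the memory, then optionally lock it), assuming a POSIX system that provides mlock in <sys/mman.h>. The function name and the 0x5A fill value are arbitrary choices for the example, and mlock is allowed to fail, for instance when the process is over its locked-memory limit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: allocate `bytes`, write to every byte so the
   pages are really backed, then try to lock them in physical memory.
   *locked is set to 1 if mlock succeeded, 0 if it was refused.
   Returns NULL if the allocation itself failed. */
static void *reserve_committed(size_t bytes, int *locked)
{
    assert(bytes > 0);
    void *p = malloc(bytes);
    if (p == NULL)
        return NULL;

    memset(p, 0x5A, bytes);               /* touch every page */

    int ok = (mlock(p, bytes) == 0);      /* best effort only */
    if (locked != NULL)
        *locked = ok;
    return p;
}
```

If the lock succeeded, call munlock(p, bytes) before freeing the block. Some systems expect the address passed to mlock to be page-aligned, in which case the block would have to come from mmap or posix_memalign instead of malloc.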
I have read this:
Memory that is allocated by malloc (for example) and that is not freed using the free() function is released when the program terminates, and that is done by the operating system.
So when does having or not having a garbage collector come into the picture?
Or is it that not all operating systems do this automatic release of memory on program termination?
That claim about malloc and free is correct for all modern computing operating systems. But the statement as a whole reflects a complete misunderstanding of the purpose of garbage collection.
The reason you call free is not to clean things up for after your program terminates. The reason you call free is to permit the memory to be re-used during the subsequent execution of a long-running program.
Consider a message server that handles a hundred messages per second. You call malloc when you receive a new message. And then you have to do lots of things with it. You may have to log it. You may have to send it to other clients. You may have to write it to a database. When you are done, if you don't free it, after a few days you'll have millions of messages stuck in memory. So you have to call free.
But when do you call free? When one client is done sending a message, another client might still be using it. And maybe the database still needs it.
The purpose of garbage collection is to ensure that object's used memory is released (so it can be re-used to hold a new message during the application's lifetime) without having to burden the application programmer with the duty (and risks) associated with tracking exactly when the object is no longer required by any code that might be using it.
If an application doesn't run for very long or doesn't have any objects whose lifetimes are difficult to figure out, then garbage collection doesn't do very much good. And there are other techniques (such as reference-counted pointers) that can provide many of the same benefits as garbage collection. But there is a real problem that garbage collection does solve.
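To illustrate the reference-counting alternative for the message server example, here is a minimal sketch in plain C. The message type and function names are hypothetical, the counter is not atomic (so this version is single-threaded only), and messages_alive exists purely so the behaviour can be checked.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted message. */
typedef struct message {
    int    refs;     /* number of owners still using the message */
    size_t len;
    char   text[];   /* flexible array member holding the payload */
} message;

static int messages_alive = 0;   /* bookkeeping for the example only */

/* Create a message owned by the caller (refs == 1). NULL on failure. */
static message *message_new(const char *text)
{
    size_t len = strlen(text);
    message *m = malloc(sizeof *m + len + 1);
    if (m == NULL)
        return NULL;
    m->refs = 1;
    m->len = len;
    memcpy(m->text, text, len + 1);
    messages_alive++;
    return m;
}

/* Another user (logger, client, database writer) takes a reference. */
static message *message_retain(message *m)
{
    m->refs++;
    return m;
}

/* A user is done; the last one out frees the memory. */
static void message_release(message *m)
{
    assert(m->refs > 0);
    if (--m->refs == 0) {
        messages_alive--;
        free(m);
    }
}
```

Each component calls message_retain when it starts using a message and message_release when it finishes, so nobody has to know who the last user is. A garbage collector removes even those explicit calls.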
Most modern operating systems will indeed free everything you've allocated on program termination. However, a garbage collector will free unused memory before program termination. This allows your program to skip the frees, but still manage to keep allocating memory indefinitely, as long as it lets go of references to memory that isn't being used anymore, and as long as your total working set size doesn't exceed physical memory limits.
All operating systems free a program's memory when it quits. Memory leaks are 'only' a problem because they waste memory on the machine, or cause the program to crash. So you'd garbage collect to prevent these things from happening, but without worrying about actually freeing your own pointers when you're done with them.
They're two solutions to the same problem, really. But, it's because the problem happens during runtime that you're worried about it.
Imagine a long running process like a web server that malloc()s a bunch of data structures for every connection it services. If it never free()s any of that memory, the process memory usage will continually grow, possibly consuming everything available on the system.
I have made a program in C and wanted to see how much memory it uses, and noticed that the memory usage grows during normal use (at launch it uses about 250k and now it's at 1.5mb). afaik, I freed all the unused memory, and after some hours the app uses less memory. Could it be possible that the freed memory just goes from the 'active' memory to the 'wired' or something, so it's released when free space is needed?
btw. my machine runs on mac os x, if this is important.
How do you determine the memory usage? Have you tried using valgrind to locate potential memory leaks? It's really easy. Just start your application with valgrind, run it, and look at the well-structured output.
If you're looking at the memory usage from the OS, you are likely to see this behavior. Freed memory is not automatically returned to the OS, but normally stays with the process, and can be malloced later. What you see is usually the high-water mark of memory use.
As Konrad Rudolph suggested, use something that examines the memory from inside the process to look for memory leaks.
The C library does not usually return "small" allocations to the OS. Instead it keeps the memory around for the next time you use malloc.
However, many C libraries will release large blocks, so you could try doing a malloc of several megabytes and then freeing it.
On OSX you should be able to use MallocDebug.app if you have installed the Developer Tools from OSX (as you might have trouble finding a port of valgrind for OSX).
/Developer/Applications/PerformanceTools/MallocDebug.app
I agree with what everyone has already said, but I do want to add just a few clarifying remarks specific to os x:
First, the operating system actually allocates memory using vm_allocate, which allocates entire pages at a time. Because there is a cost associated with this, as others have stated, the C library does not just deallocate the page when you return memory via free(3). Specifically, if there are other allocations within the memory page, it will not be released. Currently memory pages are 4096 bytes in mac os x. The number of bytes in a page can be determined programmatically with sysctl(2) or, more easily, with getpagesize(2). You can use this information to optimize your memory usage.
Secondly, user-space applications do not wire memory. Generally the kernel wires memory for critical data structures. Wired memory is basically memory that can never be swapped out and will never generate a page fault. If, for some reason, a page fault is generated in a wired memory page, the kernel will panic and your computer will crash. If your application is increasing your computer's wired memory by a noticeable amount, it is a very bad sign. It generally means that your application is doing something that significantly grows kernel data structures, like allocating and not reaping hundreds of threads or child processes. (Of course, this is a general statement... in some cases, this growth is expected, like when developing a virtual host or something like that.)
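For the page-size query mentioned above, a small sketch. It uses sysconf(_SC_PAGESIZE), which is the POSIX way to ask the same question as getpagesize; the helper names and the 4096 fallback are assumptions for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Ask the system for its page size, falling back to 4096 if the
   query fails (the fallback value is an assumption). */
static size_t page_size(void)
{
    long n = sysconf(_SC_PAGESIZE);
    return n > 0 ? (size_t)n : 4096;
}

/* Round a request up to a whole number of pages, e.g. to size a
   buffer so that it does not share its last page with anything else. */
static size_t round_up_to_page(size_t bytes)
{
    size_t ps = page_size();
    assert(bytes <= (size_t)-1 - (ps - 1));   /* no overflow in the sum below */
    return (bytes + ps - 1) / ps * ps;
}
```

Sizing large buffers in whole pages makes it more likely that freeing them leaves complete pages the library can actually hand back.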
In addition to what the others have already written:
malloc() allocates bigger chunks from the OS and spits it out in smaller pieces as you malloc() it. When free()ing, the piece first goes into a free-list, for quick reuse by another malloc if the size fits. It may at this time be merged with another free item, to form bigger free blocks, to avoid fragmentation (a whole bunch of different algorithms exist there, from freeLists to binary-sized-fragments to hashing and what not else).
When freed pieces arrive so that multiple fragments can be joined, free() usually does this, but sometimes fragments remain, depending on the size and order of the malloc() and free() calls. Also, only when a big enough free block has been created will it (sometimes) be returned to the OS as a block. But usually, malloc() keeps things in its pocket, depending on the allocated/free ratio (there are many heuristics, and compile-time or flag options are often available).
Notice that there is not ONE malloc/free algorithm. There is a whole bunch of different implementations (and literature). It is highly system, OS and library dependent.
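As one illustration of the merging step (and only one of the many possible schemes), here is a sketch of an address-ordered free list that joins a freed range with its neighbours. It models blocks as (start, len) pairs in a small fixed array instead of real heap headers; the names and the capacity are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FREE_RANGES 64   /* assumed capacity of the model */

/* A free block, described by its start offset and length in bytes. */
typedef struct range { size_t start, len; } range;

static range  free_ranges[MAX_FREE_RANGES];  /* kept sorted by start */
static size_t n_free = 0;

/* Record that [start, start+len) was freed, merging with adjacent
   free ranges. Returns 1 on success, 0 if the table is full. */
static int add_free_range(size_t start, size_t len)
{
    assert(len > 0);
    size_t i = 0;
    while (i < n_free && free_ranges[i].start < start)
        i++;                                  /* i = first range after us */

    /* Touches the predecessor: grow it, then see if it now reaches the successor. */
    if (i > 0 && free_ranges[i - 1].start + free_ranges[i - 1].len == start) {
        free_ranges[i - 1].len += len;
        if (i < n_free &&
            free_ranges[i - 1].start + free_ranges[i - 1].len == free_ranges[i].start) {
            free_ranges[i - 1].len += free_ranges[i].len;
            memmove(&free_ranges[i], &free_ranges[i + 1],
                    (n_free - i - 1) * sizeof(range));
            n_free--;
        }
        return 1;
    }

    /* Touches only the successor: extend it downwards. */
    if (i < n_free && start + len == free_ranges[i].start) {
        free_ranges[i].start = start;
        free_ranges[i].len += len;
        return 1;
    }

    /* No neighbour: insert a new entry, keeping the array sorted. */
    if (n_free == MAX_FREE_RANGES)
        return 0;
    memmove(&free_ranges[i + 1], &free_ranges[i], (n_free - i) * sizeof(range));
    free_ranges[i] = (range){ start, len };
    n_free++;
    return 1;
}
```

Freeing offsets 0..16 and 32..48 leaves two fragments; freeing 16..32 afterwards joins all three into a single 48-byte block, which is the kind of big free block that might then be returned to the OS.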