Intricacies of malloc and free - c

I have two related questions hence I am asking them in this single thread.
Q1) How can I confirm if my OS is clearing un-"free"'ed memory (allocated using malloc) automatically when a program terminates? I am using Ubuntu 11.04, 32-bit with gcc-4.5.2
As per a tutorial page by Steve Summit here, "Freeing unused memory (malloc'ed) is a good idea, but it's not mandatory. When your program exits, any memory which it has allocated but not freed should be automatically released. If your computer were to somehow ``lose'' memory just because your program forgot to free it, that would indicate a problem or deficiency in your operating system."
Q2) Suppose foo.c mallocs B bytes of memory. Later on, foo.c frees those B bytes and returns them to the OS. Now my question is: can those PARTICULAR B bytes be re-allocated to foo.c (by the OS) during the current instance, or can they not be allocated to foo.c again until its current instance terminates?
EDIT: I would recommend everyone who reads my question to read the answer to a similar question here and here. Both answers explain the interaction and working of malloc() and free() in good detail without using very esoteric terms. To understand the DIFFERENCE between the memory-management tools used by the kernel (e.g. brk(), mmap()) and those provided by the C library (e.g. malloc(), free()), this is a MUST READ.

When a process ends, either through a terminating signal (e.g. SIGSEGV) or through the _exit(2) system call (which is also called when returning from main), all the process's resources are released by the kernel. In particular, the process address space is released, including heap memory, which the malloc library function obtains with the mmap(2) (or perhaps sbrk(2)) system call.
Of course, the free library function either (often) makes the freed memory zone reusable by future calls to malloc or (occasionally, for large memory zones) releases a big memory chunk to the kernel using e.g. the munmap(2) system call.
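To see the "reusable by future calls to malloc" part for yourself, here is a minimal sketch. The function name freed_block_is_reused is made up for illustration. Nothing in the C standard promises that the second malloc returns the same address, so the result depends on the allocator; many implementations do hand back the block that was just freed.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if a second malloc of the same size lands on the address that
   was just released by free, 0 if it lands elsewhere, -1 if malloc fails. */
int freed_block_is_reused(size_t size) {
    void *first = malloc(size);
    if (first == NULL) return -1;
    uintptr_t first_addr = (uintptr_t)first;  /* save the address before free */
    free(first);

    void *second = malloc(size);
    if (second == NULL) return -1;
    int reused = ((uintptr_t)second == first_addr);
    free(second);
    return reused;
}
```

Calling freed_block_is_reused(64) in a small test program and printing the result is enough to check what your own libc does. Either answer is legal.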
To know more about the memory map of process 1234, read sequentially the /proc/1234/maps pseudo-file (or /proc/self/maps from inside the process). The /proc file system is the preferred way to query the kernel about processes. There are also /proc/self/statm, /proc/self/smaps and many other interesting things.
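As a rough sketch of reading that pseudo-file from C: the helper below (address_is_mapped is a hypothetical name) scans a maps file line by line and reports whether a given address falls inside one of the listed ranges. It assumes the Linux format where each line starts with "start-end" in hexadecimal, and it returns -1 on systems without /proc.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if addr lies in a range listed in maps_path, 0 if it does not,
   and -1 if the file cannot be opened (e.g. on a non-Linux system). */
int address_is_mapped(const char *maps_path, const void *addr) {
    FILE *fp = fopen(maps_path, "r");
    if (fp == NULL) return -1;

    unsigned long target = (unsigned long)(uintptr_t)addr;
    char line[512];
    int found = 0;
    while (!found && fgets(line, sizeof line, fp) != NULL) {
        unsigned long start, end;
        /* Each line begins with "start-end" in hex; end is exclusive. */
        if (sscanf(line, "%lx-%lx", &start, &end) == 2 &&
            target >= start && target < end) {
            found = 1;
        }
        /* If the line was longer than the buffer, skip the rest of it. */
        if (strchr(line, '\n') == NULL) {
            int c;
            while ((c = fgetc(fp)) != EOF && c != '\n') { }
        }
    }
    fclose(fp);
    return found;
}
```

Passing a pointer obtained from malloc together with "/proc/self/maps" shows that the heap block lives inside one of the process's mappings.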
The detailed behavior of free and malloc is implementation dependent. You should view malloc as a way to get heap memory, and free as a way to say that a previously malloc-ed zone is useless, and the system (i.e. standard C library + kernel) can do whatever it wants with it.
Use valgrind to hunt memory leak bugs. You could also consider using Boehm's conservative garbage collector, i.e. use GC_malloc instead of malloc and don't bother about freeing memory manually.

Most modern operating systems reclaim the allocated memory, so you need not worry about that.
The OS doesn't know whether your program leaked memory or not; it simply reclaims whatever it allocated to a process once that process completes.
Yes, freed memory can be reused (if needed), and the reuse can happen within the same run of the program.

Q1. You just have to assume that the operating system is behaving correctly.
Q2. There is no reason why the bytes can't be reallocated to foo.c; it just depends on how the memory allocation routines work.

Q1) I'm not sure about how you can confirm. However, about the second paragraph, it's considered good style to always free whatever memory you allocate. A good explanation of this is found here: What REALLY happens when you don't free after malloc?.
Q2) Definitely; those bytes are usually the first to be reallocated (depending on the malloc implementation). For a great explanation, see: How do malloc() and free() work?.

Related

Does malloc without corresponding free always produce a memory leak?

Does malloc without corresponding free always produce a memory leak, or are there situations when it doesn't?
It depends on how you define "memory leak". If you define it as having any outstanding objects with allocated storage duration at the time of program exit, then yes, it's a leak. This is what tools like valgrind report. However, it's not a useful definition at all.
My definition of memory leak is roughly an unbounded increase in total memory consumption of the program over its lifetime despite having a bounded working set. For example, if I always have at most 10 tabs open in my browser, to the same 10 sites, but memory usage keeps increasing unboundedly, that's a memory leak. On the other hand, a program that allocates a buffer to load a whole file into memory, loads the file, prints it in reverse, then exits without freeing the memory does not have a memory leak.
One particular important case where a malloc without free is not only not-a-leak but absolutely necessary (for general code that can't make assumptions about the whole program it's running in) is any use of runtime-allocated constant tables whose generation is controlled by call_once. No matter how late you tried to free such tables, it would be possible for code (in another thread, or an atexit handler, etc.) to attempt to access it after free, and call_once type interfaces intentionally do not provide any way to synchronize any access except first call (this is how they avoid introducing unwanted acquire barriers/synchronization cost at every read).
Note that the concept of "working set" here is somewhat subjective and highly load-bearing. Often memory leaks are a matter of the software still considering something part of its working set when the user no longer considers it so.
A memory leak is a situation where a program allocates memory, does not free it when it is no longer in use, and loses track of its address (the value of the pointer returned by malloc, calloc or realloc).
Since the pointer is lost, the memory can no longer be freed and will stay attached to the program until it exits.
If the program exits, all memory associated with it is reclaimed by the operating system (except for rare circumstances beyond the scope of this question), so the memory leak has no consequences.
If the program executes for a long time, potentially until the system is shut down, unused blocks of memory attached to the program cannot be used for other purposes. If the amount of memory thus wasted is small, again no consequences are expected.
Conversely, if the program keeps allocating more memory without freeing it, the system will eventually run out of memory for the program to use. Allocation requests will then either return NULL, or the system will become unstable as it uses virtual memory to honor the requests at the expense of other programs and at the cost of lengthy swapping operations to storage devices or memory compression. At some point the system may kill processes at random to try and recover usable memory.
Such memory leaks are problematic and must be avoided. They are especially problematic in library functions that may be used in programs that run for extended sessions, such as web browsers, email readers, file managers, media players, program managers...
Unlike other programming languages, C does not have an embedded garbage collector that could determine which allocated blocks of memory are still in use, so it is the programmer's responsibility to keep track of all allocated blocks and free them as soon as possible. Advanced tools such as valgrind can be used to verify if all allocated blocks have been freed upon program exit. Although it is not necessary to free the memory at exit, it is good programming practice and a good way to determine if all allocated memory blocks have been accounted for.
The answer might depend on the implementation of malloc, but generally there are two cases where malloc is expected to not produce a memory leak:
When you pass it 0 as the size parameter: some implementations will just return NULL and not allocate anything, while others will return a unique pointer, and even though this counts as zero bytes allocated you will leak about 64 bytes anyway in bookkeeping records.
When an out-of-memory condition happens. In such cases malloc returns NULL as well; check the global variable errno for a specific value, normally ENOMEM, to see why it failed.
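Both cases can be handled with a small wrapper. This is only a sketch: alloc_or_errno is a made-up helper, and note that ENOMEM is what POSIX systems set, while ISO C itself does not promise any errno value for a failed malloc.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate size bytes and report failure through *err.
   A zero-size request is rounded up to one byte, which sidesteps the
   "NULL or unique pointer" ambiguity of malloc(0). On success *err is 0. */
void *alloc_or_errno(size_t size, int *err) {
    errno = 0;
    void *p = malloc(size != 0 ? size : 1);
    if (p == NULL) {
        /* Fall back to ENOMEM if the implementation left errno untouched. */
        *err = (errno != 0) ? errno : ENOMEM;
        return NULL;
    }
    *err = 0;
    return p;
}
```

A request that cannot possibly be satisfied, such as alloc_or_errno(SIZE_MAX, &err), returns NULL with err set, and nothing is leaked in that case because nothing was allocated.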
The standard does not require a memory leak at all. So per se, there is no situation that will guarantee a memory leak.
On the other hand the standard does not require a memory leak to not happen in the scenario you mention either.
In most situations, all allocated memory will be freed when the program exits. But there may be exceptions on some systems, especially embedded ones. If this is vital to your program, you should not rely on it.

Does terminating a program reclaim memory in the same way as free()?

I saw this answer to a stack overflow question that says that freeing memory at the very end of a c program is actually harmful because it moves variables that wouldn't be used again into system memory.
I'm confused why the free() method in C would do anything different than the operating system reclaiming the heap at the end of the program.
Does anyone know if there is a real difference between free() and termination in terms of memory management and if so how the operating system may treat these two differently?
e.g.
would anything different happen between these two short programs?
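/* Program 1: exits without freeing */
#include <stdlib.h>
int main(void) {
    int *mem = malloc(1);
    return 0;
}
/* Program 2: frees before exiting */
#include <stdlib.h>
int main(void) {
    int *mem = malloc(1);
    free(mem);
    return 0;
}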
No, terminating a program, as with exit or abort, does not reclaim memory in the same way as free. Using free causes some activity that ultimately has no effect when the operating system discards the data maintained by malloc and free.
exit has some complications, as it does not immediately terminate the program. For now, let’s just consider the effect of immediately terminating the program and consider the complications later.
In a general-purpose multi-user operating system, when a process is terminated, the operating system releases the memory it was using to make it available for other purposes.1 In large part, this simply means the operating system does some accounting operations.
In contrast, when you call free, software inside the program runs, and it has to look up the size of the memory you are freeing and then insert information about that memory into the pool of memory it is maintaining. There could be thousands or tens of thousands (or more) of such allocations. A program that frees all its data may have to execute many thousands of calls to free. Yet, in the end, when the program exits, all of the changes produced by free will vanish, as the operating system will discard all the data about that pool of memory—all of the data is in memory pages the operating system does not preserve.
So, in this regard, the answer you link to is correct, calling free is a waste. And, as it points out, the necessity of going through all the data structures in the program to fetch the pointers in them so the memory they point to can be freed causes all those data structures to be read into memory if they had been swapped out to disk. For large programs, it can take a considerable amount of time and other resources.
On the other hand, it is not clear it is easy to avoid many calls to free. This is because releasing memory is not the only thing a terminating program has to clean up. A program may want to write final data to files or send final messages to network connections. Furthermore, a program may not have established all of this context directly. Most large programs rely on layers of software, and each software package may have set up its own context, and often no way is provided to tell other software “I want to exit now. Finish the valuable context, but skip all the freeing of memory.” So all the desired clean-up tasks may be intertwined with the free-memory tasks, and there may be no good way to untangle them.
Software should generally be written so that nothing terrible happens if a program is suddenly aborted (since this can happen from a loss of power, not just deliberate user action). But even though a program might be able to tolerate an abort, there can still be value in a graceful exit.
Getting back to exit, calling the C exit routine does not exit the program immediately. Exit handlers (registered with atexit) are called, stream buffers are flushed, and streams are closed. Any software libraries you called may have set up their own exit handlers so that they can finish up when the program is exiting. So, if you want to be sure libraries you have used in your program are not calling free when you end the program, you have to call abort, not exit. But it is generally preferred to end a program gracefully, not by aborting. Calling abort will not call exit handlers, flush streams, close streams, or perform other wind-down code that exit does—data can be lost when a program calls abort.
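The difference between exit and an immediate termination can be observed with a small experiment. This is a sketch under some assumptions: it needs a POSIX system (fork, pipe), the names atexit_handler_runs and on_exit_handler are invented here, and _Exit is used as a stand-in for abort because both skip the atexit handlers while _Exit does not raise SIGABRT.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int notify_fd = -1;

/* Exit handler: tells the parent it ran by writing one byte to the pipe. */
static void on_exit_handler(void) {
    const char byte = 'x';
    if (notify_fd >= 0 && write(notify_fd, &byte, 1) != 1) { /* ignore */ }
}

/* Forks a child that registers the handler and then ends with exit()
   (graceful != 0) or _Exit() (graceful == 0). Returns 1 if the handler
   ran, 0 if it did not, -1 on error. */
int atexit_handler_runs(int graceful) {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    fflush(NULL);                 /* avoid duplicated stdio output in the child */
    pid_t pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return -1; }
    if (pid == 0) {
        close(fds[0]);
        notify_fd = fds[1];
        atexit(on_exit_handler);
        if (graceful) exit(0);    /* runs handlers, flushes streams */
        _Exit(0);                 /* terminates at once, no handlers */
    }
    close(fds[1]);
    char byte;
    ssize_t n = read(fds[0], &byte, 1);  /* 0 means EOF: nothing was written */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == 1 ? 1 : 0;
}
```

With this, atexit_handler_runs(1) reports that the handler ran and atexit_handler_runs(0) reports that it did not, which is the reason library clean-up code (including any calls to free) is skipped by an immediate termination.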
Footnote
1 Releasing memory does not mean it is immediately available for other purposes. The specific result of this depends on each page of memory. For example:
If the memory is shared with other processes, it is still needed for them, so releasing it from use by this process only decrements the number of processes using the memory. It is not immediately available for any other use.
If the memory is not in use by any other processes but contains data mapped from a file on disk, the operating system might mark it as available when needed but leave it alone for the moment. This is because you might run the same program again, and it would be nice if the data were still in memory, so why not just leave it in place just in case? The data might even be used by a different program that uses the same file. (For example, many programs might use the same shared library.)
If the memory is not in use by any other processes and was just used by the program as a work area, not mapped from a file, then system may mark it as immediately available and not containing anything useful.
would anything different happen between these two short programs?
The simple answer is: it makes no difference, the memory is released to the system in both cases. Calling free() is not strictly necessary and does incur an infinitesimal overhead but may prove useful when trying to track memory leaks in more complex programs.
Does terminating a program reclaim memory in the same way as free?
Not exactly:
Terminating a program releases the memory used by the program, be it for the program code, data, stack or heap. It also releases some other resources such as file handles, device handles, network sockets... All this is done efficiently, no matter how many blocks of memory have been allocated with malloc().
Conversely, free() makes the block of memory available for further use by the program for later calls to malloc() or realloc(). Depending on its size and the implementation of the heap, this freed block may or may not be returned to the OS for use by other programs. Also worth noting is the fragmentation problem, where small blocks of freed memory may not be usable for a larger allocation because they are surrounded by allocated blocks. The C heap does not perform packing or de-fragmentation, it merely coalesces adjacent free blocks. Freeing all allocated blocks before leaving the program may be useful for debugging purposes, but may be complicated and time consuming, while not necessary for the memory to be reused by the system after the program terminates.
free() is a user-level memory management function, and its behavior depends on the malloc implementation you are using. The user-level allocator might maintain a linked list of memory chunks; malloc takes a chunk of appropriate size from the list and free puts it back.
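To make the linked-list idea concrete, here is a toy fixed-size pool. It is not how any particular libc implements malloc (real allocators handle variable sizes, coalescing and threads); the names pool_alloc and pool_free and the sizes are invented for this sketch. The point is only that "freeing" is a pointer update inside the process, with no system call involved.

```c
#include <stddef.h>

#define POOL_BLOCKS 8
#define BLOCK_SIZE  64

/* A free block stores the link to the next free block in its own bytes.
   Note: blocks are only aligned for pointers, which is enough for a toy. */
typedef union block {
    union block *next;
    unsigned char payload[BLOCK_SIZE];
} block_t;

static block_t pool[POOL_BLOCKS];
static block_t *free_list = NULL;
static int pool_ready = 0;

/* Thread every block of the static pool onto the free list. */
static void pool_init(void) {
    for (size_t i = 0; i < POOL_BLOCKS; i++) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
    pool_ready = 1;
}

/* Take the first block off the free list; NULL when the pool is exhausted. */
void *pool_alloc(void) {
    if (!pool_ready) pool_init();
    block_t *b = free_list;
    if (b != NULL) free_list = b->next;
    return b;
}

/* Put the block back at the head of the list, ready for immediate reuse. */
void pool_free(void *p) {
    if (p == NULL) return;
    block_t *b = p;
    b->next = free_list;
    free_list = b;
}
```

Because pool_free pushes onto the head of the list, a pool_alloc right after a pool_free returns the same block, which mirrors the reuse behavior many real allocators show.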
exit() Destroys an address space and all regions.
This is related to malloced heap as well as some other regions and in-kernel data structures used for managing address space of the process:
Each address space consists of a number of page-aligned regions
of memory that are in use. They never overlap and represent a set
of addresses which contain pages that are related to each other in
terms of protection and purpose. These regions are represented by
a struct vm_area_struct and are roughly analogous to the
vm_map_entry struct in BSD. For clarity, a region may represent the
process heap for use with malloc(), a memory mapped file such as
a shared library or a block of anonymous memory allocated with
mmap(). The pages for this region may still have to be allocated, be
active and resident or have been paged out
Reference: https://www.kernel.org/doc/gorman/html/understand/understand007.html
The reason well-designed programs free memory at exit is to check for memory leaks. If your application-level memory allocation count does not go to zero after your last deallocation, you know that you have memory that is not being managed properly and probably have a memory leak in your code.
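One simple way to get such an application-level count is to route allocations through thin wrappers. This is a minimal, single-threaded sketch; tracked_malloc, tracked_free and tracked_live are names invented here, and tools like valgrind do the same bookkeeping far more thoroughly.

```c
#include <stdlib.h>

/* Number of blocks handed out by tracked_malloc and not yet released. */
static long live_allocations = 0;

void *tracked_malloc(size_t size) {
    void *p = malloc(size);
    if (p != NULL) live_allocations++;
    return p;
}

void tracked_free(void *p) {
    if (p != NULL) {
        live_allocations--;
        free(p);
    }
}

/* Should be 0 after the program's last deallocation if nothing leaked. */
long tracked_live(void) {
    return live_allocations;
}
```

Checking tracked_live() just before returning from main gives a quick signal: anything other than zero means some block was allocated and never released.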
would anything different happen between these two short programs?
YES
I'm confused why the free() method in C would do anything different than the operating system reclaiming the heap at the end of the program.
The operating system allocates memory in pages. Heap managers (such as malloc/free implementations) allocate pages from the operating system and subdivide the pages into smaller allocations. Calls to free() normally return memory to the heap. They do not return the pages to the operating system.

Is memory being freed when calling exec(3) an implementation detail?

From what I've read, the general consensus seems to be that you don't need to free memory before running exec(3). However, in the POSIX standard, handling of the heap / malloc memory does not seem to be explicitly detailed. I know it's common for people to not bother freeing memory when an application is exiting because the OS will clean up the data, but from what I understand, that's an OS implementation detail; the OS is not required to free the memory even though many modern systems do. Is this also the case with exec(3)? I'm wondering if freeing malloc'd memory before exec(3) is the right thing to do even though it's not necessary for many modern operating systems.
No, this is not an implementation detail. The freeing of memory is implicit in the definition of the exec function family:
The exec family of functions shall replace the current process image with a new process image.
The POSIX standard doesn't appear to have a clear definition of "process image", but from context, it's pretty clear that this includes all aspects of process state, including memory allocations.
From a practical standpoint, any system where memory left allocated at exec time was not deallocated would be essentially unusable, as those memory allocations would become unreachable.
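Here is a small sketch of what that looks like in practice on a POSIX system. The helper name exec_without_free is made up. The child allocates and touches a megabyte, deliberately skips free, and then calls execlp, so the old heap simply disappears along with the rest of the old process image.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that allocates heap memory, does not free it, and then
   replaces itself with `program`. Returns the new program's exit status,
   127 if exec failed, or -1 on fork/wait errors. */
int exec_without_free(const char *program) {
    fflush(NULL);                       /* keep stdio buffers out of the child */
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        size_t bytes = (size_t)1 << 20;
        char *buf = malloc(bytes);
        if (buf != NULL) memset(buf, 0xAB, bytes);  /* touch the pages */
        execlp(program, program, (char *)NULL);     /* no free() before this */
        _exit(127);                     /* only reached if exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, exec_without_free("true") should return 0 on a system where the true utility is on the PATH. The new program starts with a fresh address space and never sees the buffer.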

C, Xcode and memory

Why, after executing the following C code, does Xcode show 20KB more memory in use than before?
void *p = malloc(sizeof(int)*1000);
free(p);
Do I have to free the memory another way? Or is it just an Xcode mistake?
When you say "Xcode shows 20KB more than it was", I presume you mean that the little bar graph goes up by 20kB.
When you malloc an object, the C library first checks the process's address space to see if there is enough free space to satisfy the request. If there isn't enough memory, it goes to the operating system to ask for more virtual memory to be allocated to the process. The graph in Xcode measures the amount of virtual memory the process has.
When you free an object, the memory is usually not returned to the operating system; rather, it is "just" placed on the list of free blocks for malloc to reuse. I put the word "just" in scare quotes because the actual algorithm can be quite complex, in order to minimise fragmentation of the heap and the time taken to malloc and free blocks. The reason memory is usually not returned to the operating system is that it is very expensive to do system calls to the OS to get and free memory. (Very large blocks can be an exception, since some allocators do hand those back.)
Thus, you will often not see the memory usage of the process go down after a free. If you malloc some memory and then free it, the process may still appear to be using that much virtual memory.
If you want to see if your program really leaks, you need to use the leaks profile tool. This intercepts malloc and free calls so it knows which blocks are still nominally in use and which have been freed.

Heap memory allocation

If I allocate memory dynamically in my program using malloc() but I don't free the memory during program runtime, will the dynamically allocated memory be freed after program terminates?
Or if it is not freed, and I execute the same program over and over again, will it allocate the different block of memory every time? If that is the case, how should I free that memory?
Note: one answer I could think of is rebooting the machine on which I am executing the program. But if I am executing the program on a remote machine and rebooting is not an option?
Short answer: Once your process terminates, any reasonable operating system is going to free all memory allocated by that process. So no, memory allocations will not accumulate when you re-start your process several times.
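If you want to try this without rebooting anything, the sketch below runs a "leaky program" several times as child processes. It assumes a POSIX system; run_leaky_child is an invented name, and fork is used as a stand-in for launching the same program again. It does not measure reclaimed memory directly, it only shows that each run starts from a clean slate and succeeds.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs one child process that allocates `bytes`, writes to them and
   terminates without calling free. Returns the child's exit status
   (0 on success), or -1 if fork/wait failed. */
int run_leaky_child(size_t bytes) {
    fflush(NULL);
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        char *leak = malloc(bytes);
        if (leak == NULL) _exit(1);
        memset(leak, 1, bytes);   /* make sure the pages are really used */
        _exit(0);                 /* terminate with the block still allocated */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_leaky_child in a loop, say a few thousand times with a megabyte each, would exhaust memory if the kernel did not reclaim each child's heap at termination.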
Process and memory management are typically a responsibility of the operating system, so whether allocated memory is freed or not after a process terminates is actually dependent on the operating system. Different operating systems can handle memory management differently.
That being said, any reasonable operating system (especially a multi-tasking one) is going to free all of the memory that a process allocated once that process terminates.
I assume the reason behind this is that an operating system has to be able to gracefully handle irregular situations:
malicious programs (e.g. those that don't free their memory intentionally, in the hope of affecting the system they run on)
abnormal program terminations (i.e. situations where a program ends unexpectedly and therefore might not get a chance to explicitly free its dynamically allocated memory itself)
Any operating system worth its salt has to be able to deal with such situations. It has to isolate other parts of the system (e.g. itself and other running processes) from a faulty process. If it did not, a process' memory leak would propagate to the system. Meaning that the OS would leak memory (which is usually considered a bug).
One way to protect the system from memory leaks is by ensuring that once a process ends, all the memory (and possibly other resources) that it used get freed.
Any memory a program allocated should be freed when the program terminates, regardless of whether it's allocated statically or dynamically. The main exception to this is if the process is forked to another process.
If you do not explicitly free any memory you malloc, it will stay allocated until the process is terminated.
Even if your OS does cleanup on exit(). The syscall to exit is often wrapped by an exit() function. Here is some pseudo code, derived from studying several libc implementations, to demonstrate what happens around main() that could cause a problem.
//unfortunately gcc has no builtin for stack pointer, so we use assembly
#ifdef __x86_64__
#define STACK_POINTER "rsp"
#elif defined __i386__
#define STACK_POINTER "esp"
#elif defined __aarch64__
#define STACK_POINTER "sp"
#elif defined __arm__
#define STACK_POINTER "r13"
#else
#define STACK_POINTER "sp" //most commonly used name on other arches
#endif
char **environ;
void exit(int);
int main(int,char**,char**);
_Noreturn void _start(void){
register long *sp __asm__( STACK_POINTER );
//if you don't use argc, argv or envp/environ, just remove them
long argc = *sp;
char **argv = (char **)(sp + 1);
environ = (char **)(sp + argc + 2); //skip argc, the argc argv entries and argv's NULL terminator
//init routines for threads, dynamic linker, etc... go here
exit(main((int)argc, argv, environ));
__builtin_unreachable(); //or for(;;); to shut up compiler warnings
}
Notice that exit is called using the return value of main. On a static build without a dynamic linker or threads, exit() can be a directly inlined syscall(__NR_exit,main(...)); however if your libc uses a wrapper for exit() that does *_fini() routines (most libc implementations do), there is still 1 function to call after main() terminates.
A malicious program could LD_PRELOAD exit() or any of the routines it calls and turn it into a sort of zombie process that would never have its memory freed.
Even if you do free() before exit() the process is still going to consume some memory (basically the size of the executable and to some extent the shared libraries that aren't used by other processes), but some operating systems can re-use the non-malloc()ed memory for subsequent loads of that same program such that you could run for months without noticing the zombies.
FWIW, most libc implementations do have some kind of exit() wrapper with the exception of dietlibc (when built as a static library) and my partial, static-only libc.h that I've only posted on the Puppy Linux Forum.
If I allocate memory dynamically in my program using malloc() but I
don't free the memory during program runtime, will the dynamically
allocated memory be freed after program terminates?
The operating system will release the memory allocated through malloc so that it is available to other processes.
This is much more complex than your question makes it sound, as the physical memory used by a process may be written to disk (paged out). But on both Windows and Unix-like systems (Linux, Mac OS X, iOS, Android), the system will free the resources it has committed to the process.
Or if it is not freed, and I execute the same program over and over
again, will it allocate the different block of memory every time? If
that is the case, how should I free that memory?
Each launch of the program gets a new set of memory. This is taken from the system and provided as virtual addresses. Modern operating systems use address space layout randomization (ASLR) as a security feature, which means that the heap will usually be placed at different addresses each time your program launches. But as the resources from other runs have been tidied up, there is no need to free that memory.
As you have noted, if there is no way for a subsequent run to track where an earlier run committed resources, how could it be expected to free them?
Also note that you can launch your program several times so that multiple instances run at the same time. The memory allocated may appear to overlap: each instance may see the same address allocated, but that is "virtual memory". The operating system has set each process up independently, so it appears to use the same memory, but the RAM associated with each process is independent.
Not freeing the memory of a program when it exits will "work" on Windows and Unix, and probably any other reasonable operating system.
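A quick way to see that the same virtual address can refer to independent memory in two processes is the following sketch. It assumes a POSIX system and uses fork as a stand-in for two simultaneous launches (after fork, parent and child really do share the same virtual addresses). The name child_write_is_private is invented.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a write made by the child at a heap address is NOT seen by
   the parent at that same address, 0 if it is, -1 on error. */
int child_write_is_private(void) {
    int *value = malloc(sizeof *value);
    if (value == NULL) return -1;
    *value = 1;

    fflush(NULL);
    pid_t pid = fork();
    if (pid < 0) { free(value); return -1; }
    if (pid == 0) {
        *value = 2;                       /* same virtual address, child's copy */
        _exit(*value == 2 ? 0 : 1);
    }

    int status;
    int child_ok = waitpid(pid, &status, 0) == pid &&
                   WIFEXITED(status) && WEXITSTATUS(status) == 0;
    int unchanged = (*value == 1);        /* parent still sees its own data */
    free(value);
    if (!child_ok) return -1;
    return unchanged ? 1 : 0;
}
```

The child changes the value at the address it inherited, yet the parent keeps reading 1 from that address, because the kernel backs each process's address space separately.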
Benefits of not freeing memory
The operating system keeps a list of large memory chunks allocated to the process, and also the malloc library keeps tables of small chunks of memory allocated to malloc.
By not freeing the memory, you will save the work accounting for these small lists when the process terminates. This is even recommended in some cases (e.g. MSDN : Service Control Handler suggests SERVICE_CONTROL_SHUTDOWN should be handled by NOT freeing memory)
Disadvantages of not freeing memory
Programs such as valgrind and application verifier check for program correctness by monitoring the memory allocated to a process and reporting on leaks.
When you don't free the memory, these will report a lot of noise, making unintentional leaks difficult to find. This would be important, if you were leaking memory inside a loop, which would limit the size of task your program could deliver.
Several times in my career, I have converted a process to a shared object/dll. These were problematic conversions, because leaks that had been expected to be handled by the OS at process termination started to survive beyond the life of "main".
As the saying goes, the kernel is the brain of the operating system. The operating system has several responsibilities.
Memory Management is a function of kernel.
Kernel has full access to the system's memory and must allow processes
to safely access this memory as they require it.
Often the first step in doing this is virtual addressing, usually achieved by paging and/or segmentation. Virtual addressing allows the kernel to make a given physical address appear to be another address, the virtual address. Virtual address spaces may be different for different processes; the memory that one process accesses at a particular (virtual) address may be different memory from what another process accesses at the same address.
This allows every program to behave as if it is the only one (apart
from the kernel) running and thus prevents applications from crashing
each other
Memory Allocation
malloc
Allocate block of memory from heap
.NET Equivalent: Not applicable. To call the standard C function, use PInvoke.
The Heap
The heap is a region of your computer's memory that is not managed
automatically for you, and is not as tightly managed by the CPU. It is
a more free-floating region of memory (and is larger). To allocate
memory on the heap, you must use malloc() or calloc(), which are
built-in C functions. Once you have allocated memory on the heap, you
are responsible for using free() to deallocate that memory once you
don't need it any more. If you fail to do this, your program will have
what is known as a memory leak. That is, memory on the heap will
still be set aside (and won't be available to other processes).
Memory Leak
For Windows
A memory leak occurs when a process allocates memory from the paged or nonpaged pools, but does not free the memory. As a result, these limited pools of memory are depleted over time, causing Windows to slow down. If memory is completely depleted, failures may result.
Determining Whether a Leak Exists describes a technique you can use
if you are not sure whether there is a memory leak on your system.
Finding a Kernel-Mode Memory Leak describes how to find a leak that
is caused by a kernel-mode driver or component.
Finding a User-Mode Memory Leak describes how to find a leak that is
caused by a user-mode driver or application.
Preventing Memory Leaks in Windows Applications
Memory leaks are a class of bugs where the application fails to release memory when no longer needed. Over time, memory leaks affect the performance of both the particular application as well as the operating system. A large leak might result in unacceptable response times due to excessive paging. Eventually the application as well as other parts of the operating system will experience failures.
Windows will free all memory allocated by the application on process
termination, so short-running applications will not affect overall
system performance significantly. However, leaks in long-running
processes like services or even Explorer plug-ins can greatly impact
system reliability and might force the user to reboot Windows in order
to make the system usable again.
Applications can allocate memory on their behalf by multiple means. Each type of allocation can result in a leak if not freed after use. Here are some examples of common allocation patterns:
Heap memory via the HeapAlloc function or its C/C++ runtime
equivalents malloc or new
Direct allocations from the operating system via the VirtualAlloc
function.
Kernel handles created via Kernel32 APIs such as CreateFile,
CreateEvent, or CreateThread, hold kernel memory on behalf of the
application
GDI and USER handles created via User32 and Gdi32 APIs (by default,
each process has a quota of 10,000 handles)
For Linux
memprof is a tool for profiling memory usage and finding memory leaks.
It can generate a profile how much memory was allocated by each
function in your program. Also, it can scan memory and find blocks
that you’ve allocated but are no longer referenced anywhere.
Memory allocated by malloc needs to be freed by the allocating program. If it is not, and the program keeps allocating, a point will come where the program runs out of allowable memory and fails with a segmentation fault or an out-of-memory error. Every allocation made with malloc should be paired with a free.

Resources