I'm currently learning memory layout in C. So far I know that a C program's memory contains several sections: text, data, bss, heap and stack. I've also read that the heap is shared with other things beyond the program.
My questions are these.
What exactly is the heap shared with? One source states that the heap "must always be freed in order to make it available for other processes", whereas another says "the heap area is shared by all threads, shared libraries, and dynamically loaded modules in a process". If it is not shared with other processes, do I really have to free it while my program is running (not just at the end of it)?
Some sources also single out the high addresses (as a sixth section) for command-line arguments and environment variables. Should this be considered another section and part of the program's memory?
Are the other sections shared with anything else beyond a program?
The heap is per-process memory: each process has its own heap, which is shared only within that process's address space (for example between the process's threads, as you said). Why should you free it? Not so much to give space to other processes (at least on modern OSes, where a process's memory is reclaimed by the OS when the process dies), but to prevent heap exhaustion within your own process: in C, if you don't deallocate the heap regions you used, they will always be considered in use, even when they are no longer needed. Thus, to prevent undesired out-of-memory errors, it's good practice to free heap memory as soon as you don't need it anymore.
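To make this concrete, here is a minimal sketch (the 1 MiB buffer size and the job count are just illustrative) of freeing during the program's lifetime: remove the free() inside the loop and each iteration leaks a buffer, so the allocations would eventually start failing.

#include <stdio.h>
#include <stdlib.h>

/* Minimal sketch: each "job" needs a temporary buffer. Freeing the buffer as
   soon as it is no longer needed keeps the heap from growing without bound. */
int main(void)
{
    for (int job = 0; job < 1000; job++) {
        char *buf = malloc(1024 * 1024);   /* 1 MiB of scratch space per job */
        if (buf == NULL) {
            fprintf(stderr, "heap exhausted at job %d\n", job);
            return 1;
        }
        /* ... use buf for this job ... */
        free(buf);                         /* give the block back to the allocator */
    }
    return 0;
}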
In a C program the command-line arguments and environment are placed near the top of the stack region and reach main as its parameters. Usually the stack is allocated in the highest portion of a process's memory, which maps to the high addresses (this is probably why some sources single them out). But, generally speaking, there is no separate sixth memory area.
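If you want to see this for yourself, a small sketch like the following can help; the exact addresses and their ordering are platform-specific (and ASLR shuffles them between runs), but on a typical Linux system the argv and environment strings print near the stack addresses, at the high end of the address space.

#include <stdio.h>
#include <stdlib.h>

extern char **environ;            /* POSIX: the environment block */

int global_initialized = 42;      /* data segment */
int global_uninitialized;         /* bss segment */

int main(int argc, char **argv)
{
    int local = 0;                /* stack */
    char *dynamic = malloc(16);   /* heap */

    (void)argc;
    printf("code   (main)        : %p\n", (void *)main);  /* cast is a common extension */
    printf("data   (initialized) : %p\n", (void *)&global_initialized);
    printf("bss    (zero-filled) : %p\n", (void *)&global_uninitialized);
    printf("heap   (malloc)      : %p\n", (void *)dynamic);
    printf("stack  (local)       : %p\n", (void *)&local);
    printf("argv[0] string       : %p\n", (void *)argv[0]);
    printf("environ[0] string    : %p\n", (void *)environ[0]);

    free(dynamic);
    return 0;
}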
As the others have said, the text area can be shared by processes. This area usually contains the binary code, which is the same for different processes running the same binary. For performance reasons, the OS can share that memory area between them (think, for example, of what happens when you fork a child process).
The heap is shared with other processes only in the sense that all processes use RAM: the more of it you use, the less is available to other programs. Heap sharing with the other threads in your own program means that all your threads actually see and access the same heap (same virtual address space, same actual RAM, and with some luck also the same cache).
No.
text can be shared with other processes. These days it is marked as read-only, so having several processes share text makes sense. In practice this means that if you are already running top and you start another instance, it makes no sense to load the text part again; that would waste time and physical RAM. If the OS is smart enough, it can map those RAM pages into the virtual address spaces of both top instances, saving time and space.
On the official aspect:
The terms thread, process, text section, data section, bss, heap and stack are not even defined by the C language standard, and every platform is free to implement these components however it likes.
Threads and processes are typically implemented at the operating-system layer, while all the different memory sections are typically laid out at the compiler and linker layer.
On the practical aspect:
For every given process, all these memory sections (text section, data section, bss, heap and stack) are shared by all the threads of that process.
Hence, it is the programmer's responsibility to ensure mutual exclusion when accessing these memory sections from different threads.
Typically, this is achieved via synchronization utilities such as semaphores, mutexes and message queues.
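For instance, here is a minimal POSIX-threads sketch (the counter, iteration count and thread count are arbitrary) in which two threads share a single heap object and a mutex provides the mutual exclusion; compile with -pthread.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Both threads see the same heap object; the mutex serializes access to it. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    long *shared_counter = arg;            /* heap memory shared by all threads */
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);
        (*shared_counter)++;               /* critical section */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void)
{
    long *counter = calloc(1, sizeof *counter);   /* lives on the heap */
    pthread_t t1, t2;

    pthread_create(&t1, NULL, worker, counter);
    pthread_create(&t2, NULL, worker, counter);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    printf("final value: %ld\n", *counter);       /* 200000 with the mutex in place */
    free(counter);
    return 0;
}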
Between processes, it is the operating system's responsibility to ensure mutual exclusion.
Typically, this is achieved via virtual-memory abstraction, where each process runs inside its own logical address space, and each logical address space is mapped to a different physical address space.
Disclaimer: some would claim that each thread has its own stack, but technically speaking those stacks are just separate regions within the same process address space, and there's usually nothing to prevent a thread from accessing the stacks of other threads, whether intentionally or by mistake (e.g., by overflowing its own stack into a neighbouring one).
Related
I have a question about the location of threads in memory.
Where is a thread's stack located? And is there a way to display it (using gdb, readelf or something similar)?
is there a way to display it...using gdb...?
Sure, GDB can show you the stack of any thread. I don't remember the commands offhand, but they're right there in the manual. ISTR there's one command that lists all of the threads, and another that you use to tell it which thread you want to look at. Everything else works just like it does for a single-threaded program.
I think there's also a way to tell GDB to iterate over the threads (i.e., perform a single command, such as dumping the stack, once for each thread in the program).
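From memory (double-check against the GDB manual), the commands in question look roughly like this:

(gdb) info threads
(gdb) thread 3
(gdb) bt
(gdb) thread apply all bt

info threads lists the threads, thread N selects one so that bt shows that thread's stack, and thread apply all bt dumps every thread's stack in one go.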
Where is a thread's stack located?
Um, it's located in memory.
Seriously, why do you want to know? In most programming environments that I have ever heard of, the entire stack for a thread gets allocated all at once, and it cannot grow. There's usually some way for the program to say how big the stack of a new thread needs to be if the default size is not big enough.
In Linux, the program typically obtains space for a new thread's stack by calling mmap(...) with arguments that allow the OS to choose the virtual address. But there's no reason why it has to work that way. The program could allocate the stack from the heap, if that made any sense.
In other operating systems, there's probably some mechanism similar to mmap that lets the OS choose the address.
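As a sketch of the Linux case described above (error handling trimmed, and the 1 MiB size is arbitrary), a program can hand an mmap'd region to pthread_create as the new thread's stack via pthread_attr_setstack; compile with -pthread.

#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <sys/mman.h>

#define STACK_SIZE (1024 * 1024)   /* 1 MiB; must be at least PTHREAD_STACK_MIN */

static void *worker(void *arg)
{
    int local;
    (void)arg;
    printf("a local variable of this thread lives near %p\n", (void *)&local);
    return NULL;
}

int main(void)
{
    /* Ask the kernel for an anonymous mapping to serve as the thread's stack. */
    void *stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (stack == MAP_FAILED) return 1;

    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setstack(&attr, stack, STACK_SIZE);  /* use our region as its stack */

    pthread_t tid;
    pthread_create(&tid, &attr, worker, NULL);
    pthread_join(tid, NULL);

    pthread_attr_destroy(&attr);
    munmap(stack, STACK_SIZE);
    return 0;
}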
If you want an exact answer: it is typically in the memory region between the heap and the main stack.
If you go back thirty or more years, a process in a Unix-like OS would be given one contiguous block of virtual memory, typically starting at page one (page zero was left unmapped, which is how you get a segfault if the program follows a NULL pointer). The lowest addresses would contain the program's "text" segment (e.g., its code and immutable strings), then the "data" segment (initialized static variables), then the "bss" segment (uninitialized static variables).
Everything from the top of the BSS to the top of the given VM region was "wilderness" (i.e., untouched). The program's heap would grow up into the wilderness from the bottom, and its one and only call stack would grow down into the wilderness from the top. If the heap and the stack ever met, then you'd get a "stack overflow" or a malloc() error.
Things are more complicated these days, when a program can have dozens or even hundreds of call stacks. Instead of that "wilderness", Linux programs today can use mmap(...) to create additional VM regions--either for a new thread's stack, or to add to the heap, or to map a file into memory for random access.
I saw this answer to a Stack Overflow question that says that freeing memory at the very end of a C program is actually harmful because it moves variables that wouldn't be used again into system memory.
I'm confused about why the free() function in C would do anything different from the operating system reclaiming the heap at the end of the program.
Does anyone know if there is a real difference between free() and termination in terms of memory management and if so how the operating system may treat these two differently?
e.g.
would anything different happen between these two short programs?
#include <stdlib.h>

int main(void) {
    int *mem = malloc(1);
    return 0;
}
#include <stdlib.h>

int main(void) {
    int *mem = malloc(1);
    free(mem);
    return 0;
}
No, terminating a program, as with exit or abort, does not reclaim memory in the same way as free. Using free causes some activity that ultimately has no effect when the operating system discards the data maintained by malloc and free.
exit has some complications, as it does not immediately terminate the program. For now, let’s just consider the effect of immediately terminating the program and consider the complications later.
In a general-purpose multi-user operating system, when a process is terminated, the operating system releases the memory it was using to make it available for other purposes.1 In large part, this simply means the operating system does some accounting operations.
In contrast, when you call free, software inside the program runs, and it has to look up the size of the memory you are freeing and then insert information about that memory into the pool of memory it is maintaining. There could be thousands or tens of thousands (or more) of such allocations. A program that frees all its data may have to execute many thousands of calls to free. Yet, in the end, when the program exits, all of the changes produced by free will vanish, as the operating system will discard all the data about that pool of memory—all of the data is in memory pages the operating system does not preserve.
So, in this regard, the answer you link to is correct, calling free is a waste. And, as it points out, the necessity of going through all the data structures in the program to fetch the pointers in them so the memory they point to can be freed causes all those data structures to be read into memory if they had been swapped out to disk. For large programs, it can take a considerable amount of time and other resources.
On the other hand, it is not clear that it is easy to avoid many calls to free. This is because releasing memory is not the only thing a terminating program has to clean up. A program may want to write final data to files or send final messages to network connections. Furthermore, a program may not have established all of this context directly. Most large programs rely on layers of software, and each software package may have set up its own context, and often no way is provided to tell other software "I want to exit now. Finish the valuable clean-up, but skip all the freeing of memory." So all the desired clean-up tasks may be intertwined with the memory-freeing tasks, and there may be no good way to untangle them.
Software should generally be written so that nothing terrible happens if a program is suddenly aborted (since this can happen from a loss of power, not just deliberate user action). But even though a program might be able to tolerate an abort, there can still be value in a graceful exit.
Getting back to exit, calling the C exit routine does not exit the program immediately. Exit handlers (registered with atexit) are called, stream buffers are flushed, and streams are closed. Any software libraries you called may have set up their own exit handlers so that they can finish up when the program is exiting. So, if you want to be sure libraries you have used in your program are not calling free when you end the program, you have to call abort, not exit. But it is generally preferred to end a program gracefully, not by aborting. Calling abort will not call exit handlers, flush streams, close streams, or perform other wind-down code that exit does—data can be lost when a program calls abort.
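A tiny sketch of that difference: the handler registered with atexit runs when exit() is called (or main returns), but not if the program calls abort().

#include <stdio.h>
#include <stdlib.h>

static void goodbye(void)
{
    /* Runs on exit() or a return from main, but NOT if the program aborts. */
    printf("exit handler ran\n");
}

int main(void)
{
    atexit(goodbye);
    printf("about to exit\n");   /* flushed by exit() */
    exit(0);                     /* replace with abort() and the handler is skipped,
                                    and unflushed stdio output may be lost */
}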
Footnote
1 Releasing memory does not mean it is immediately available for other purposes. The specific result of this depends on each page of memory. For example:
If the memory is shared with other processes, it is still needed for them, so releasing it from use by this process only decrements the number of processes using the memory. It is not immediately available for any other use.
If the memory is not in use by any other processes but contains data mapped from a file on disk, the operating system might mark it as available when needed but leave it alone for the moment. This is because you might run the same program again, and it would be nice if the data were still in memory, so why not just leave it in place just in case? The data might even be used by a different program that uses the same file. (For example, many programs might use the same shared library.)
If the memory is not in use by any other processes and was just used by the program as a work area, not mapped from a file, then the system may mark it as immediately available and not containing anything useful.
would anything different happen between these two short programs?
The simple answer is: it makes no difference; the memory is released to the system in both cases. Calling free() is not strictly necessary and incurs a tiny overhead, but it may prove useful when trying to track memory leaks in more complex programs.
Does terminating a program reclaim memory in the same way as free?
Not exactly:
Terminating a program releases the memory used by the program, be it for the program code, data, stack or heap. It also releases some other resources such as file handles, device handles, network sockets... All this is done efficiently, no matter how many blocks of memory have been allocated with malloc().
Conversely, free() makes the block of memory available for further use by the program, for later calls to malloc() or realloc(). Depending on its size and the implementation of the heap, this freed block may or may not be returned to the OS for use by other programs. Also worth noting is the fragmentation problem, where small blocks of freed memory may not be usable for a larger allocation because they are surrounded by allocated blocks. The C heap does not perform packing or de-fragmentation; it merely coalesces adjacent free blocks. Freeing all allocated blocks before leaving the program may be useful for debugging purposes, but may be complicated and time-consuming, while not necessary for the memory to be reused by the system after the program terminates.
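As a small, allocator-dependent illustration (the 64-byte size is arbitrary, and the reuse shown is typical of glibc-style allocators but not guaranteed), freeing a block and then allocating one of the same size often returns the very same address, which shows that free() handed the block back to the program's own pool rather than to the OS:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    void *a = malloc(64);
    printf("first allocation : %p\n", a);
    free(a);                       /* block goes back into the process's heap pool */

    void *b = malloc(64);          /* many allocators will reuse the same chunk */
    printf("second allocation: %p\n", b);

    free(b);
    return 0;
}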
free() is a user-level memory-management function and depends on the malloc implementation you are currently using. The user-level allocator might maintain a linked list of memory chunks, and malloc/free will take a chunk of the appropriate size from the list / put it back.
exit() destroys the address space and all its regions.
This applies to the malloc'd heap as well as to some other regions and the in-kernel data structures used for managing the address space of the process:
Each address space consists of a number of page-aligned regions of memory that are in use. They never overlap and represent a set of addresses which contain pages that are related to each other in terms of protection and purpose. These regions are represented by a struct vm_area_struct and are roughly analogous to the vm_map_entry struct in BSD. For clarity, a region may represent the process heap for use with malloc(), a memory mapped file such as a shared library or a block of anonymous memory allocated with mmap(). The pages for this region may still have to be allocated, be active and resident or have been paged out.
Reference: https://www.kernel.org/doc/gorman/html/understand/understand007.html
The reason well-designed programs free memory at exit is to check for memory leaks. If your application-level memory accounting does not go to zero after your last deallocation, you know that you have memory that is not being managed properly and probably have a memory leak in your code.
would anything different happen between these two short programs?
YES
I'm confused about why the free() function in C would do anything different from the operating system reclaiming the heap at the end of the program.
The operating system allocates memory in pages. Heap managers (such as malloc/free implementations) allocate pages from the operating system and subdivide the pages into smaller allocations. Calls to free() normally return memory to the heap. They do not return the pages to the operating system.
I was trying to create the condition for malloc to return a NULL pointer. In the program below, although I can see malloc returning NULL, once the program is forcibly terminated I see that all other programs become slow, and finally I had to reboot the system. So my question is whether the memory for the heap is shared with other programs. If not, the other programs should not have been affected. Does the OS not allocate a certain amount of memory at the time of execution? I am using Windows 10 with MinGW.
#include <stdio.h>
#include <stdlib.h>

void mallocInFunction(void)
{
    int *ptr = malloc(500);
    if (ptr == NULL)
    {
        printf("Memory could not be allocated\n");
    }
    else
    {
        printf("Allocated memory successfully\n");
    }
    /* ptr is deliberately never freed: the point is to exhaust memory. */
}

int main(void)
{
    while (1)
    {
        mallocInFunction();
    }
    return 0;
}
So my question is whether the memory for the heap is shared with other programs?
Physical memory (RAM) is a resource that is shared by all processes. The operating system makes decisions about how much RAM to allocate to each process and adjusts that over time.
If not, the other programs should not have been affected. Does the OS not allocate a certain amount of memory at the time of execution?
At the time the program starts executing, the operating system has no idea how much memory the program will want or need. Instead, it deals with allocations as they happen. Unless configured otherwise, it will typically do everything it possibly can to allow the program's allocation to succeed because presumably there's a reason the program is doing what it's doing and the operating system won't try to second guess it.
... whether the memory for the heap is shared with other programs?
Well, the C standard doesn't exactly require a heap, but in the context of a task-switching, multi-user, multi-threaded OS, of course memory is shared between processes! The C standard doesn't require any of that either, but it's all pretty common stuff:
CPU cache memory tends to be preferred for code that's executed often, though this might get swapped around quite a bit; that may or may not be swapped to a heap.
Task switching causes registers to be swapped to other forms of memory; that may or may not be swapped to a heap.
Entire pages are swapped to and from disk so that other programs can make use of them when your OS switches execution away from your program to other programs, and swapped back when it's your program's turn to execute again, among other reasons. This may or may not involve manipulating the heap.
FWIW, you're referring to memory that has allocated storage duration. It's best to avoid using terms like heap and stack, as they're virtually meaningless. The memory you're referring to is on a silicon chip, regardless of whether it uses a heap or a stack.
... Does the OS not allocate a certain amount of memory at the time of execution?
Speaking of silicon chips and execution, your OS likely only has control of one processor (a silicon chip which contains some logic circuits and memory, among other things I'm sure) with which to execute many programs! To summarise this post, yes, your program is most likely sharing those silicon chips with other programs!
On a tangential note, I don't think heap overflow means what you think it means.
Your question cannot be answered in the context of C, the language. For C, there's no such thing as a heap, a process, ...
But it can be answered in the context of operating systems, even if a bit generically, because many modern multitasking OSes do similar things.
Given a modern multitasking OS, it will use a separate virtual address space for each process. The OS manages a fixed amount of physical RAM and divides it into pages; when a process needs memory, such pages are mapped into the process's virtual address space (typically at a different virtual address than the physical one). When all memory pages are claimed by the OS itself and by the running processes, the OS will typically save some pages that are not in active use to disk, in a swap area, in order to serve a fresh page to the next process requesting one. But when the original page is touched again (and this is typically the case with free(), see below), it must first be loaded from disk, and to have a free page for this, another page must be saved to swap space.
This is, like all disk I/O, slow, and it's probably what you see happening here.
Now to fully understand this: what does malloc() do? It typically requests that the operating system increase the process's memory (and if necessary, the OS does this by mapping another page), writes some bookkeeping information into this new memory about the block requested (so free() can work correctly later), and ultimately returns a pointer to a block that's free for the program to use. free() uses the information written by malloc(), modifies it to indicate the block is free again, and typically can't give any memory back to the OS because there are other malloc()'d blocks in the same page. It will give memory back when possible, but that's the exception in a typical scenario where dynamic allocations are heavily used.
So, the answer to your question is: Yes, the RAM is shared because there is only one set of physical RAM. The OS does the best it can to hide that fact and virtualize RAM, but if a process consumes all that is there, this will have visible effects.
malloc() is not a system call but a libc library function. When a program asks for memory via malloc(), the library uses the brk()/sbrk() or mmap() system calls to obtain page(s); more details here.
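As a rough Linux/glibc sketch (sbrk() is old-fashioned but still available, and the exact size threshold is an implementation detail), you can watch the program break move for a small allocation, while a large allocation, typically served by mmap(), leaves it untouched:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    void *before = sbrk(0);               /* current program break */
    void *small  = malloc(1000);          /* small request: typically served via brk() */
    void *after  = sbrk(0);

    printf("break before malloc: %p\n", before);
    printf("break after  malloc: %p\n", after);        /* usually higher than before */

    void *big = malloc(1024 * 1024);      /* large request: glibc typically uses mmap() */
    printf("break after big malloc: %p\n", sbrk(0));   /* often unchanged */

    free(small);
    free(big);
    return 0;
}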
Please keep in mind that the memory you get is all virtual in nature: even if you have 3 GB of physical RAM, you can often allocate much more than that. How does this happen? Via the concept called 'paging', where the system moves data between main memory (RAM) and secondary storage (HDD/SSD); more details here.
With this in place, running out of memory is usually quite rare, but with a program like the one above, which deliberately pushes the system's limits, it can happen. This is nicely explained here.
Now, why did the other programs hang or become slow? Because they all share the same operating system, and the system was starved for resources. In fact, at some point the system may crash and have to be rebooted.
Hope this helps?
I am not clear on how memory is managed while a process is executing.
Here is a diagram
I am not clear with the following in the image:
1) What is the stack which this image is referring to?
2) What is the memory-mapping segment, which refers to file mappings?
3) What does the heap have to do with a process? Is the heap only handled within a process, or is it something maintained by the operating system kernel, with memory allocated by malloc (using the heap) whenever a user-space application invokes it?
The article mentions
http://duartes.org/gustavo/blog/post/anatomy-of-a-program-in-memory/
a virtual address space, which in 32-bit mode is always a 4GB block of memory addresses. These virtual addresses are mapped to physical memory by page tables...
4) Does this mean that only one program runs in memory at a time, occupying the entire 4 GB of RAM?
The same article also mentions
Linux randomizes the stack, memory mapping segment, and heap by adding offsets to their starting addresses. Unfortunately the 32-bit address space is pretty tight, leaving little room for randomization and hampering its effectiveness.
5) Is it referring to randomizing the stack within a process, or to something which is left after counting the space of all the processes?
1) What is the stack which this image is referring to?
The stack is for allocating local variables and function call frames (which include things like function parameters, where to return after the function has called, etc.).
2) What is the memory-mapping segment, which refers to file mappings?
The memory-mapping segment holds dynamically linked (shared) libraries, and it is also where mmap calls are allocated. In general, a memory-mapped file is simply a region of memory backed by a file.
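As a minimal sketch of such a file mapping (the path /etc/hostname is just an example of a small readable file, and error handling is kept short), mmap() makes the file's contents appear as ordinary memory inside the memory-mapping segment:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/etc/hostname", O_RDONLY);    /* any readable file will do */
    if (fd < 0) return 1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) return 1;

    /* Map the file into the process's address space; the new region shows up
       in the memory-mapping segment, alongside the shared libraries. */
    char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) return 1;

    fwrite(data, 1, st.st_size, stdout);         /* read the file through memory */

    munmap(data, st.st_size);
    close(fd);
    return 0;
}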
3) What does the heap have to do with a process? Is the heap only handled within a process, or is it something maintained by the operating system kernel, with memory allocated by malloc (using the heap) whenever a user-space application invokes it?
The heap is process specific, and is managed by the process itself, however it must request memory from the OS to begin with (and as needed). You are correct, this is typically where malloc calls are allocated. However, most malloc implementations make use of mmap to request chunks of memory, so there is really less of a distinction between heap and the memory mapping segment. Really, the heap could be considered part of the memory mapped segment.
4) Does this mean that only one program runs in memory at a time, occupying the entire 4 GB of RAM?
No; it means the amount of addressable memory available to the program is limited to 4 GB. What is actually held in physical memory at any given time depends on how the OS allocates physical memory, and is beyond the scope of this question.
5) Is it referring to randomizing the stack within a process, or to something which is left after counting the space of all the processes?
I've never seen anything that suggests 4 GB of space "hampers" the effectiveness of memory allocation strategies used by the OS. Additionally, as @Jason notes, the locations of the stack, memory-mapped segment, and heap are randomized "to prevent predictable security exploits, or at least make them a lot harder than if every process the OS managed had each portion of the executable in the exact same virtual memory location." To be specific, the OS is randomizing the virtual addresses for the stack, memory-mapped region, and heap. On that note, everything the process sees is a virtual address, which is then mapped to a physical address in memory, depending on where the specific page is located. More information about the mapping between virtual and physical addresses can be found here.
This Wikipedia article on paging is a good starting point for learning how the OS manages memory between processes, and a good resource for answering questions 4 and 5. In short, memory is allocated to processes in pages, and these pages either exist in main memory or have been "paged out" to disk. When a process requests a memory address whose page is not resident, the OS will move that page from disk to main memory, replacing another page if needed. There are various page-replacement strategies in use, and I refer you to the article to learn about the advantages and disadvantages of each.
Part 1. The Stack ...
A function can call a function, which might call another function. Any local variables allocated end up on the stack with each call and are de-allocated as each function exits, hence "stack". You might consult Wikipedia for this stuff: http://en.wikipedia.org/wiki/Stack_%28abstract_data_type%29
Linux randomizes the stack, memory mapping segment, and heap by adding offsets to their starting addresses. Unfortunately the 32-bit address space is pretty tight, leaving little room for randomization and hampering its effectiveness.
I believe this is more of a generalization the article makes when comparing the ability to randomize in 32-bit vs. 64-bit address spaces. 3 GB of addressable memory in 32 bits is still quite a bit of space to "move around" in; it's just not as much room as a 64-bit OS affords, and there are certain applications, such as image editors, that are very memory-intensive and can easily use up the entire 3 GB of addressable memory available to them. Keep in mind I'm saying "addressable" memory; this depends on the platform, not on the amount of physical memory in the system.
I need a reliable measurement of allocated memory in a Linux process. I've been looking into mallinfo, but I've read that it is deprecated. What is the state-of-the-art alternative for this sort of statistics?
Basically, I'm interested in at least two numbers:
the number (and size) of memory blocks/pages allocated from the kernel by malloc or whatever allocator the C library of choice uses
(optional but still important) the amount of memory allocated by userspace code (via malloc, new, etc.) minus the memory it has deallocated (via free, delete, etc.)
One possibility I have is to override malloc calls with LD_PRELOAD, but it might introduce unwanted overhead at runtime, and it might not interact properly with other libraries I'm using that also rely on LD_PRELOAD aop-ness.
Another possibility I've read about is rusage.
To be clear, this is NOT for debugging purposes; the memory usage is an intrinsic feature of the application (similar to Mathematica or Matlab, which display the amount of memory used, only more precise, at the block level).
For this purpose - a "memory usage" introspection feature within an application - one traditional interface is malloc_hook(3). These are GNU extensions that allow you to hook every malloc(), realloc() and free() call and maintain your own statistics; note, however, that recent glibc releases have deprecated and ultimately removed these hooks.
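Since those hooks are on their way out, one portable approach is to keep the statistics in thin wrappers around malloc() and free(). This is only a sketch under stated assumptions: the xmalloc/xfree names and the header layout are purely illustrative, only allocations your own code routes through the wrappers are counted, and the counters are not thread-safe.

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

static size_t live_blocks = 0;   /* blocks currently allocated through xmalloc */
static size_t live_bytes  = 0;   /* bytes currently allocated through xmalloc */

typedef union {
    size_t size;
    max_align_t align;           /* keeps the user block suitably aligned */
} header_t;

static void *xmalloc(size_t n)
{
    header_t *h = malloc(sizeof *h + n);   /* stash the size in front of the block */
    if (h == NULL) return NULL;
    h->size = n;
    live_blocks++;
    live_bytes += n;
    return h + 1;                          /* hand out the memory past the header */
}

static void xfree(void *ptr)
{
    if (ptr == NULL) return;
    header_t *h = (header_t *)ptr - 1;
    live_blocks--;
    live_bytes -= h->size;
    free(h);
}

int main(void)
{
    char *buf = xmalloc(1000);
    printf("live: %zu blocks, %zu bytes\n", live_blocks, live_bytes);
    xfree(buf);
    printf("live: %zu blocks, %zu bytes\n", live_blocks, live_bytes);
    return 0;
}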
To see how much memory is mapped by your application from the kernel's point of view, you can read and collate the information in the /proc/self/smaps pseudofile. This also lets you see how much of each allocation is resident, swapped, shared/private, clean/dirty etc.
/proc/PID/status contains a few useful pieces of information (try running cat /proc/$$/status for example).
VmPeak is the largest your process's virtual memory space ever became during its execution. This includes all pages mapped into your process, including executable pages, mmap'ed files, stack, and heap.
VmSize is the current size of your process's virtual memory space.
VmRSS is the Resident Set Size of your process; i.e., how much of it is taking up physical RAM right now. (A typical process will have lots of stuff mapped that it never uses, like most of the C library. If no processes need a page, eventually it will be evicted and become non-resident. RSS measures the pages that remain resident and are mapped into your process.)
VmHWM is the High Water Mark of VmRSS; i.e. the highest that number has been during the lifetime of the process.
VmData is the size of your process's "data" segment; i.e., roughly its heap usage. Note that small blocks on which you have done malloc and then free will still be in use from the kernel's point of view; large blocks will actually be returned to the kernel when freed. (If memory serves, "large" means greater than 128k for current glibc.) This is probably the closest to what you are looking for.
These measurements are probably better than trying to track malloc and free, since they indicate what is "really going on" from a system-wide point of view. Just because you have called free() on some memory, that does not mean it has been returned to the system for other processes to use.
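As a small Linux-specific sketch of reading those numbers from inside the process (simple prefix matching on the Vm* lines of /proc/self/status):

#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL) return 1;

    char line[256];
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "Vm", 2) == 0)    /* VmPeak, VmSize, VmRSS, VmHWM, VmData, ... */
            fputs(line, stdout);
    }

    fclose(f);
    return 0;
}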