Memory Layout of a C program (Phyical vs Logical view) - c

As per my understanding, the logical view of the C program is divided into many segments such as
Code
Data
Bss
Heap
Stack (typical implementation: Heap and Stack growing in opposite directions).
How are these segments aligned in the physical memory?
As per my understanding, physical memory uses frames of fixed size to store the pages of the process.
If that is the case then what how is this actually consistent with the user view? Example: the stack and heap area might be distributed among many pages. Pages might be scattered through the memory.

If that is the case then what how is this actually consistent with the
user view? Example: the stack and heap area might be distributed among
many pages. Pages might be scattered through the memory.
The virtual memory system maps "virtual" addresses onto physical addresses so that user code never knows or cares about where the memory it's using is located in physical memory. This is typically done using a hardware memory management unit (MMU), but it could also be done by the operating system without the MMU. As far as user code is concerned, there's just one nice big address space that's always available.

Related

What is virtual memory? [duplicate]

This question already has answers here:
What are the differences between virtual memory and physical memory?
(6 answers)
Closed 3 years ago.
What is virtual memory and, how it differs from physical memory (RAM)? It says that physical memory is stored on sth on motherboard, while virtual memory is stored on disk.
Somewhere it also says that virtual spaces are used only when the physical memory is filled, which confused me a lot.
Then, why Windows uses virtual memory? Is it because the RAMs are small-spaced and not designed for big storage, so use the virtual to store more bigger-sized things?
The next thing is about the address. Since virtuals are on disk, they shouldn't share the address of physicals. So they have independent addresses. Is that right?
And,
When writing memory of another process, why recommend using VirtualAlloc instead of HeapAlloc?
Is it true that virtual memory is process-dependent and the physical memory shared through processes?
"Virtual memory" means there is a valid address space, which does not map to any particular physical memory or storage, hence virtual. In context of modern common operating systems, each process has its own virtual memory space, with overlapping virtual memory addresses.
This address space is divided into pages for easier management (example size 4 KB). Each valid page can be in 3 different states:
not stored physically (assumed to be all 0). If process writes to this kind of page, it needs to be given a page of physical memory (by OS, see below) so value can be stored.
Mapped to physical memory, meaning some page-size area in computers RAM stores the contents, and they can be directly used by the process.
Swapped out to disk (might be a swap file), in order to free physical RAM pages (done automatically by the operating system). If the process accesses the page (read or write), it needs to be loaded to page in RAM first (see above).
Only when virtual memory page is mapped to physical RAM page, is there something there. In other cases, if process accesses that page, there is a CPU exception, which transfers control to operating system. OS then needs to either map that virtual memory page to RAM (possibly needing to free some RAM first by swapping current data out to swap file, or terminating some application if out of all memory) and load the right data into it, or it can terminate the application (address was not in valid range, or is read-only but process tries to write).
Same page of memory can also be mapped to several places at once, for example with shared memory, so same data can be accessed by several processes at once (virtual address is probably different, so can't share pointer variables).
Another special case of virtual memory use is mapping a regular file on disk to virtual memory (same thing which happens with swap file, but now controlled by normal application process). Then OS takes care of actually reading bytes (in page-sized chunks) from disk and writing changes back, the process can just access the memory like any memory.
Every modern multi-tasking general purpose operating system uses virtual memory, because the CPUs they run support it, and because it solves a big bunch of problems, for example memory fragmentation, transparently using swapping to disk, memory protection... They could be solved differently, but virtual memory is the way today.
Physical memory is shared between processes the same way as computer power supply is shared, or CPU is shared. It is part of the physical computer. A normal process never handles actual physical memory addresses, all that it sees is virtual memory, which may be mapped to different physical locations.
The contents of virtual memory are not normally shared, except when they are (when using shared memory for example).
Not sure you mean by "When collecting memory for other process", so can't answer that.
Virtual memory can essentially be thought of as a per process virtual address that's mapped to a physical address. In the case of x86 there is a register CR3 that points to the translation table for that process. When allocating new memory the OS will allocate physical memory, which may not even be contiguous, and then set a free contiguous virtual region to point to that physical memory. Whenever the CPU accesses any virtual memory it uses this translation table in CR3 to convert it to the actual physical address.
More Info
https://en.m.wikipedia.org/wiki/Control_register#CR3
https://en.m.wikipedia.org/wiki/Page_table
To quote Wikipedia:
In computing, virtual memory (also virtual storage) is a memory management technique that provides an "idealized abstraction of the storage resources that are actually available on a given machine" which "creates the illusion to users of a very large (main) memory."
Because virtual memory is an illusory memory (so, non-existent), some other computer resources rather than RAM is used. In this case, the resource used is the disk, because it has a lot of space, more than RAM, where the OS can run its VM stuff.
Somewhere it also says that virtual spaces are used only when the physical memory is filled, which confused me a lot.
It shouldn't. VM uses the disk and I/O with the disk is much slower than I/O with the RAM. This is why physical memory is preferred nowadays and VM is used when physical memory is not enough.
Then, why Windows uses virtual memory? Is it because the RAMs are small-spaced and not designed for big storage, so use the virtual to store more bigger-sized things?
This is one of the main reasons, yes. In the past (the 70s), computer memory was very expensive so workarounds had to be conceived.

When a large block of memory is requested on the heap, if contiguous space is not available on the RAM, is it allocated on the disk(swap)?

In Linux, when memory is requested (using calloc / malloc), if a contiguous block of the requested size is not available does the kernel map multiple separate pieces of memory into one single virtual block and hand it over to the application or is it allocated on disk?
If it is allocated on disk, when a large enough block becomes free, is it automatically moved to the RAM or does it live on the disk for its life?
In all Virtual Memory capable OS's, nowadays all, the kernel uses a pool of different strategies together to manage the memory.
The processor guarantees to programs that memory exists up to the nominal addressing capability of the processor, or something less depending on specs. So you're allowed to allocate memory up to MAX legal memory amount for that OS.
The resources used are the physical memory installed in the system and disk storage.
Depending on program requests, the MM (memory manager) allocates virtual space: the OS creates a data structure that describes the requested memory (generally referred to as creating paging directories), but no real memory is allocated. The next stage is called commitment, you get into this stage when trying to access the memory previously allocated. If there is any physical memory available, the MM, called through the exception generated by the attempt to access the memory, commits pages of real memory, inserting their address in the page directory, then control flows back to user code. Now the memory exists!
And if we try to access memory not allocated? This is the case of invalid memory access that generates an exception, in Linux a SegFault.
Going back to the physical memory commitment, what if there is no more physical memory available? The MM use different algorithms to look for physical memory not accessed for a long time, no more requested or discardable. Even other things get involved such as task priority, task state (suspended tasks are good candidates for memory recollecting), etc. All this physical memory can be recollected, its contents saved in a swap file on disk (paging file), and physical storage can then be mapped into the new process.
The reverse process is made when a program tries to access its memory currently cached on disk. The MM recollects physical memory from other processes (using the same techniques described above), commits it in process space, copies data from the disk cache and then gives back control to user code.
All these processes are completely transparent to the user, and you don't have to worry about it.
If you want to read more search for memory management, paging files, GDT (Global Descriptor Tables) and LDT (Local Descriptor Tables) here for X86 architectures, other processor use different structures and registers, but the principle is the same.

Means to allocate contiguous physical memory

I am aware that with C malloc and posix_memaligh one can allocate contiguous memory from the virtual address space of a process. However, I was wondering whether somehow one can allocate a buffer of physically contiguous memory? I am investigating side channel attacks that exploit L2 cache so I want to be sure that I can access the right cache lines..
Your best and easiest take at continuous memory is to request a single "huge" page from the system. The availability of those depends on your CPU and kernel options (on x86_64 the 2MB huge pages are usually available and some CPUs can also do 1GB pages; other architectures can be more flexible than this). Check out Hugepagesize field in /proc/meminfo for the size of huge pages on your setup.
Those can be accessed in two ways:
By means of a MAP_HUGETLB flag passed to mmap(). This way you can be sure that the "huge" virtual page corresponds to a continuous physical memory range. Unfortunately, whether the kernel can supply you with a "huge" page depends on many factors (current layout of memory utilization, kernel options, etc - also see the hugepages kernel boot parameter).
By means of mapping a file from a dedicated HugeTLB filesystem (see here: http://lwn.net/Articles/375096/). With HugeTLB file system you can configure the number of huge pages available in advance for some assurance that the necessary amount of huge pages will be available.
The other approach is to write a kernel module which will allocate continuous physical memory on the kernel side and then map it into your process' address space on request. This approach is sometimes employed on special purpose hardware in embedded systems. Of course, there's still no guarantee that the kernel side memory allocator will be able to come with an appropriately sized continuous physical address range, so on some occasions such address ranges are pre-reserved on boot (one dumb approach is to pass max_addr parameter to kernel on boot to leave some of the RAM out of kernel's reach).
On (almost [Note 1]) all virtual memory architectures, virtual memory is mapped to physical memory in units of a "page". The size of a page is (almost) always a power of 2, and pages are aligned by that size, because the mapping is done by only using the high-order bits of the address. It's common to see a page size of 4K (12 bits of address), although modern CPUs have an option to map much larger pages in order to reduce the size of mapping tables.
Since L2_CACHE_SIZE will generally also be a power of 2 and will be smaller than the page size, any single aligned allocation of size L2_CACHE_SIZE will necessarily be in a single page, so the bytes in the alignment will be physically contiguous as well.
So in this particular case, you can be assured that your allocated memory will be a single cache-line (at least, on standard machine architectures).
Note 1: Undoubtedly there are machines -- possibly imaginary -- which do not function this way. But the one you are playing with is not one of them.

how does a program runs in memory and the way memory is handled by Operating system

I am not clear with memory management when a process is in execution
during run time
Here is a diagram
I am not clear with the following in the image:
1) What is the stack which this image is referring to?
2) What is memory mapping segment which is referring to file mappings?
3) What does the heap have to do with a process. Is the heap only handled in a process or is the heap something maintained by the operating system kernel and then memory space is allocated by malloc (using the heap) when ever a user space application invokes this?
The article mentions
http://duartes.org/gustavo/blog/post/anatomy-of-a-program-in-memory/
virtual address space, which in 32-bit mode is always a 4GB block of
memory addresses. These virtual addresses are mapped to physical
memory by page tables,
4) Does this mean that at a time only one program runs in memory occupying entire 4 GB of RAM?
The same article also mentions
Linux randomizes the stack, memory mapping segment, and heap by adding
offsets to their starting addresses. Unfortunately the 32-bit address
space is pretty tight, leaving little room for randomization and
hampering its effectiveness.
5) Is it referring to randomizing the stack within a process or is it referring to something which is left after counting the space of all the processes?
1) What is the stack which this image is referring to?
The stack is for allocating local variables and function call frames (which include things like function parameters, where to return after the function has called, etc.).
2) What is memory mapping segment which is referring to file mappings?
Memory mapping segment holds linked libraries. It also is where mmap calls are allocated. In general, a memory mapped file is simply a region of memory backed by a file.
3) What does the heap have to do with a process. Is the heap only handled in a process or is the heap something maintained by the operating system kernel and then memory space is allocated by malloc (using the heap) when ever a user space application invokes this?
The heap is process specific, and is managed by the process itself, however it must request memory from the OS to begin with (and as needed). You are correct, this is typically where malloc calls are allocated. However, most malloc implementations make use of mmap to request chunks of memory, so there is really less of a distinction between heap and the memory mapping segment. Really, the heap could be considered part of the memory mapped segment.
4) Does this mean that at a time only one program runs in memory occupying entire 4 GB of RAM?
No, that means the amount of addressable memory available to the program is limited to 4 GB of RAM, what is actually contained in memory at any given time is dependent on how the OS allocated physical memory, and is beyond the scope of this question.
5) Is it referring to randomizing the stack within a process or is it referring to something which is left after counting the space of all the processes?
I've never seen anything that suggests 4gb of space "hampers" the effectiveness of memory allocation strategies used by the OS. Additionally, as #Jason notes, the locations of the stack, memory mapped segment, and heap are randomized "to prevent predictable security exploits, or at least make them a lot harder than if every process the OS managed had each portion of the executable in the exact same virtual memory location." To be specific, the OS is randomizing the virtual addresses for the stack, memory mapped region, and heap. On that note, everything the process sees is a virtual address, which is then mapped to a physical address in memory, depending on where the specific page is located. More information about the mapping between virtual and physical addresses can be found here.
This wikipedia article on paging is a good starting point for learning how the OS manages memory between processes, and is a good resource to read up on for answering questions 4 and 5. In short, memory is allocated in pages to processes, and these pages either exist in main memory, or have been "paged out" to the disk. When a memory address is requested by a process, it will move the page from the disk to main memory, replacing another page if needed. There are various page replacement strategies that are used and I refer you to the article to learn more about the advantages and disadvantages of each.
Part 1. The Stack ...
A function can call a function, which might call another function. Any variables allocated end up on the stack through each iteration. And de-allocated as each function exits, hence "stack". You might consider Wikipedia for this stuff ... http://en.wikipedia.org/wiki/Stack_%28abstract_data_type%29
Linux randomizes the stack, memory mapping segment, and heap by adding offsets to their starting addresses. Unfortunately the 32-bit address space is pretty tight, leaving little room for randomization and hampering its effectiveness.
I believe this is more of a generalization being made in the article when comparing the ability to randomize in 32 vs. 64-bits. 3GB of addressable memory in 32-bits is still quite a bit of space to "move around" ... it's just not as much room as can be afforded in a 64-bit OS, and there are certain applications, such as image-editors, etc. that are very memory intensive, and can easily use up the entire 3GB of addressable memory available to them. Keep in mind I'm saying "addressable" memory ... this is dependent on the platform and not the amount of physical memory available in the system.

C accessing memory location

I mean the physical memory, the RAM.
In C you can access any memory address, so how does the operating system then prevent your program from changing memory address which is not in your program's memory space?
Does it set specific memory adresses as begin and end for each program, if so how does it know how much is needed.
Your operating system kernel works closely with memory management (MMU) hardware, when the hardware and OS both support this, to make it impossible to access memory you have been disallowed access to.
Generally speaking, this also means the addresses you access are not physical addresses but rather are virtual addresses, and hardware performs the appropriate translation in order to perform the access.
This is what is called a memory protection. It may be implemented using different methods. I'd recommend you start with a Wikipedia article on this subject — http://en.wikipedia.org/wiki/Memory_protection
Actually, your program is allocated virtual memory, and that's what you work with. The OS gives you a part of the RAM, you can't access other processes' memory (unless it's shared memory, look it up).
It depends on the architecture, on some it's not even possible to prevent a program from crashing the system, but generally the platform provides some means to protect memory and separate address space of different processes.
This has to do with a thing called 'paging', which is provided by the CPU itself. In old operating systems, you had 'real mode', where you could directly access memory addresses. In contrast, paging gives you 'virtual memory', so that you are not accessing the raw memory itself, but rather, what appears to your program to be the entire memory map.
The operating system does "memory management" often coupled with TLB's (Translation Lookaside Buffers) and Virtual Memory, which translate any address to pages, which the operation system can tag readable or executable in the current processes context.
The minimum requirement for a processors MMU or memory management unit is in current context restrict the accessable memory to a range which can be only set in processors registers in supervisor mode (as opposed to user mode).
The logical address is generated by the CPU which is mapped to the physical address by the memory mapping unit. Unlike the physical address space the logical address is not restricted by memory size and you just get to work with the logical address space. The address binding is done by MMU. So you never deal with the physical address directly.
Most computers (and all PCs since the 386) have something called the Memory Management Unit (or MMU). It's job is to translate local addresses used by a program into the physical addresses needed to fetch real bytes from real memory. It's the operating system's job to program the MMU.
As a result of this, programs can be loaded into any region of memory and appear, from that program's point of view while executing, to be be any any other address. It's common to find that the code for all programs appear (locally) to be at the same address and their data always appears (locally) to be at the same address even though physically they will be in different locations. With each memory access, the MMU transparently translates from the local address space to the physical one.
If a program trys to access a memory address that has not been mapped into its local address space, the hardware generates an exception and typically gets flagged as a "segmentation violation", followed by the forcible termination of the program. This protects from accessing the memory of other processes.
But that doesn't have to be the case! On systems with "virtual memory" and current resource demands on RAM that exceed the amount of physical memory, some pages (just blocks of memory of a common size, often on the order of 4-8kB) can be written out to disk and given as RAM to a program trying to allocate and use new memory. Later on, when that page is needed by whatever program owns it, the memory access causes an exception and the OS swaps out some other memory page and re-loads the needed one from disk. The program that "page-faulted" gets delayed while this happens but otherwise notices nothing.
There are lots of other tricks the MMU/OS can do as well, like sharing memory between processes, making a disk file appear to be direct-memory-accessible, setting some pages as "NX" so they can't be treated as executable code, using arbitrary sections of the logical memory space regardless of how much and at what address the physical ram uses, and more.

Resources