Do we require an MMU when the virtual address space is equal to the physical address space?

The MMU is used to translate virtual addresses to physical addresses for a running process, with the help of the page table corresponding to that process.
Let's take a scenario where the virtual address space is equal to the physical address space. Do we really require an MMU in that case, given that we won't have a situation where the same virtual address maps to different physical addresses?
Let's say
Virtual Address
| 20 bits (V) | 12 bits(PO) |
Physical Address
| 20 bits(PPN) | 12 bits(PO) |
where V = Virtual Page Number
PO = Page offset
PPN = Physical Page Number
Do we really require Page Table for every process?
What problems might appear when there are more than one process?
Please neglect cache memory for simplification.

The MMU does much more than map a virtual address space to a physical address space of a different size. The most important job of an MMU is memory protection, which is relevant even if both address spaces have the same size:
The MMU handles pages (e.g. 4 KB in size) of virtual memory that are mapped to pages of physical memory.
In most systems, there is not only a single virtual address space, but one for every process. Under MMU control, every process can only access pages as allowed by the operating system (which programs the MMU). Most pages of different processes are isolated from each other, so that e.g. one process cannot crash another process by writing into its memory.
Mapping of virtual to physical pages under OS control allows address space layout randomization, so that reading across a virtual page boundary results in reading random data instead of predictable data (a protection against e.g. buffer overflow attacks).
Moreover, even if there is only a single process, pages can be marked read-write, read-only, execute-only, or access-forbidden. This makes it possible to restrict a process to accessing its own pages only in the permitted way, e.g. it can make it impossible to execute stored "data".


What is virtual memory? [duplicate]

What is virtual memory, and how does it differ from physical memory (RAM)? One source says that physical memory is hardware on the motherboard, while virtual memory is stored on disk.
Somewhere it also says that virtual memory is used only when physical memory is full, which confused me a lot.
Then, why does Windows use virtual memory? Is it because RAM is small and not designed for bulk storage, so virtual memory is used to store bigger things?
The next question is about addresses. Since virtual memory is on disk, it shouldn't share addresses with physical memory, so the two have independent addresses. Is that right?
And,
When writing to the memory of another process, why is VirtualAlloc recommended instead of HeapAlloc?
Is it true that virtual memory is per-process while physical memory is shared among processes?
"Virtual memory" means there is a valid address space which does not map to any particular physical memory or storage, hence "virtual". In the context of modern common operating systems, each process has its own virtual memory space, with overlapping virtual memory addresses.
This address space is divided into pages for easier management (e.g. 4 KB each). Each valid page can be in 3 different states:
Not stored physically (assumed to be all zeros). If the process writes to this kind of page, it needs to be given a page of physical memory (by the OS, see below) so the value can be stored.
Mapped to physical memory, meaning some page-sized area in the computer's RAM stores the contents, and they can be directly used by the process.
Swapped out to disk (possibly to a swap file), in order to free physical RAM pages (done automatically by the operating system). If the process accesses the page (read or write), it must first be loaded back into a RAM page (see above).
Only when a virtual memory page is mapped to a physical RAM page is there actually something there. In the other cases, if the process accesses the page, there is a CPU exception, which transfers control to the operating system. The OS then needs either to map that virtual memory page to RAM (possibly needing to free some RAM first by swapping current data out to the swap file, or terminating some application if memory is completely exhausted) and load the right data into it, or to terminate the application (if the address was not in a valid range, or the page is read-only but the process tried to write).
The same page of physical memory can also be mapped to several places at once, for example with shared memory, so the same data can be accessed by several processes at once (the virtual addresses probably differ, so pointer variables can't be shared).
Another special case of virtual memory use is mapping a regular file on disk into virtual memory (the same mechanism as the swap file, but now controlled by a normal application process). The OS then takes care of actually reading bytes (in page-sized chunks) from disk and writing changes back; the process can simply access the memory like any other memory.
Every modern multi-tasking general-purpose operating system uses virtual memory, because the CPUs they run on support it, and because it solves a whole set of problems: memory fragmentation, transparent swapping to disk, memory protection, and more. These could be solved differently, but virtual memory is the way it is done today.
Physical memory is shared between processes the same way the computer's power supply is shared, or the CPU is shared: it is part of the physical computer. A normal process never handles actual physical memory addresses; all it sees is virtual memory, which may be mapped to different physical locations.
The contents of virtual memory are not normally shared, except when they are (when using shared memory for example).
Not sure what you mean by "When collecting memory for other process", so I can't answer that.
Virtual memory can essentially be thought of as a per-process virtual address space that is mapped to physical addresses. In the case of x86 there is a register, CR3, that points to the translation table for the current process. When allocating new memory, the OS will allocate physical memory, which may not even be contiguous, and then point a free contiguous virtual region at that physical memory. Whenever the CPU accesses any virtual memory, it uses the translation table referenced by CR3 to convert the address to the actual physical address.
More Info
https://en.m.wikipedia.org/wiki/Control_register#CR3
https://en.m.wikipedia.org/wiki/Page_table
To quote Wikipedia:
In computing, virtual memory (also virtual storage) is a memory management technique that provides an "idealized abstraction of the storage resources that are actually available on a given machine" which "creates the illusion to users of a very large (main) memory."
Because virtual memory is an illusion (it does not correspond one-to-one to RAM), some resource other than RAM is used to back it. In this case, the resource used is the disk, because it has far more space than RAM, and the OS can keep its VM data there.
Somewhere it also says that virtual memory is used only when physical memory is full, which confused me a lot.
It shouldn't. VM uses the disk, and disk I/O is much slower than RAM I/O. This is why physical memory is preferred, and the disk-backed part of VM is used only when physical memory is not enough.
Then, why does Windows use virtual memory? Is it because RAM is small and not designed for bulk storage, so virtual memory is used to store bigger things?
This is one of the main reasons, yes. In the past (the 70s), computer memory was very expensive so workarounds had to be conceived.

How does making the virtual address contiguous in physical address zone improve performance?

Recently I have been reading the hugepage code in DPDK (dpdk.org). I see that the code deliberately makes the virtual addresses of a physically contiguous zone contiguous as well. Specifically, it first checks whether there is a physically contiguous zone among the hugepages, and then maps that physically contiguous zone to contiguous virtual addresses. How does this improve performance?
The source code says:
To reserve a big contiguous amount of memory, we use the hugepage feature of linux. For that, we need to have hugetlbfs mounted. This code will create many files in this directory (one per page) and map them in virtual memory. For each page, we will retrieve its physical address and remap it in order to have a virtual contiguous zone as well as a physical contiguous zone.
Why is this remapping necessary?
map the physically contiguous zone into contiguous virtual address. How does this improve the performance?
DPDK needs both physical and virtual addresses. The virtual address is used normally, to load/store some data. The physical address is necessary for the userspace drivers to transfer data to/from devices.
For example, we allocate a few mbufs with virtual addresses 0x41000, 0x42000 and 0x43000. Then we fill them with some data and pass those virtual addresses to the PMD to transfer.
The driver has to convert those virtual addresses to physical ones. If physical pages are mapped into the virtual address space noncontiguously, then to convert a virtual address to a physical one we need to search through all the mappings. For example, virtual address 0x41000 might correspond to physical 0x81000, 0x42000 to physical 0x16000, and 0x43000 to physical 0x64000.
The best case for such a search is one memory read; the worst case is a few memory reads for each buffer.
But if we are sure that both the virtual and physical addresses of a memory zone are contiguous, we simply add an offset to the virtual address to get the physical one, and vice versa. For example, virtual 0x41000 corresponds to physical 0x81000, virtual 0x42000 to physical 0x82000, and virtual 0x43000 to physical 0x83000.
The offset is known from the mapping. The worst case of such a translation is one memory read for all the buffers in a burst, which is a huge improvement for the translation.
Why is this remapping necessary?
To map a huge page into a virtual address space, the mmap system call is used. The call's API allows specifying a fixed virtual address at which the huge page should be mapped. This makes it possible to map huge pages one after another, creating a contiguous virtual memory zone. For example, we can mmap a huge page at virtual address 0x200000, the next one at virtual address 0x400000, and so on.
Unfortunately, we don't know the physical addresses of the huge pages until they are mapped. So at virtual address 0x200000 we might map physical address 0x800000, and at virtual address 0x400000, physical 0x600000.
But once we have mapped those huge pages for the first time, we know both their physical and virtual addresses. So all we need to do is remap them in the correct order: at virtual address 0x1200000 we map physical 0x600000, and at 0x1400000, physical 0x800000.
Now we have a virtually and physically contiguous memory zone starting at virtual address 0x1200000 and physical address 0x600000. So to convert a virtual address to a physical one in this memory zone, we just subtract the offset 0xC00000 (i.e. 0x1200000 - 0x600000) from the virtual address, as described previously.
Hope this clarifies a bit the idea of contiguous memory zones and remapping.

How do I get DRAM address instead of Virtual address

I understand that if I print the address of an array element, it will be an address from virtual memory, not from real (physical) memory, i.e. DRAM.
printf("Addresses of A[5] and A[6] are %p and %p\n", (void *)&A[5], (void *)&A[6]);
I found that the addresses were consecutive (the elements are chars). In reality they may not be consecutive, at least not in DRAM. I want to know the real addresses. How do I get them?
I need to know this for either Windows or Linux.
You can't get the physical address for a virtual address from user code; only the lowest levels of the kernel deal with physical addresses, and you'd have to intercept things there.
Note that the physical address for a virtual address may not be constant while the program runs — the page might be paged out from one physical address and paged back in to a different physical address. And if you make a system call, this remapping could happen between the time when the kernel identifies the physical address and when the function call completes because the program requesting the information was unscheduled and partially paged out and then paged in again.
The simple answer is that, in general, for user processes or threads in a multiprocessing OS such as Windows or Linux, it is not possible to find the address even of a static variable in the processor's memory address space, let alone the DRAM address.
There are a number of reasons for this:
The memory allocated to a process is virtual memory. The OS can remap this process memory from time to time from one physical address range to another, and there is no way to detect this remapping in the user process. That is, the physical address of a variable can change during the lifetime of a process.
There is no interface from userspace to kernel space that would allow a userspace process to walk the kernel's process table and page cache in order to find the physical address of a process's page. On Linux, you can write a kernel module or driver that does this.
The DRAM is often mapped into the processor's address space through a memory management unit (MMU) and memory cache. Although the MMU mapping of DRAM into the processor address space is usually done only once, during system boot, the processor's use of the cache can mean that values written to a variable might not be written through to DRAM in all cases.
There are OS-specific ways to "pin" a block of allocated memory to a static physical location. This is often done by device drivers that use DMA. However, this requires a level of privilege not available to userspace processes, and, even if you have the physical address of such a block, there is no pragma or directive in the commonly used linkers that you could use to allocate the BSS for a process at such a physical address.
Even inside the Linux kernel, virtual to physical address translation is not possible in the general case, and requires knowledge about the means that were used to allocate the memory to which a particular virtual address refers.
Here is a link to an article called Translating Virtual to Physical Address on Windows: Physical Addresses that gives you a hint as to the extreme ends to which you must go to get physical addresses on Windows.

System Wide Page Table

I have a doubt: when each process has its own separate page table, why is a system-wide page table required? Also, if a page table maps virtual addresses to physical addresses, then I think two processes may map to the same physical address, because all processes have the same virtual address space. Is this true?
About the second part, mapping the same virtual address to the same physical address: for library code, and for different instances of the same application's code, this is indeed what is done. The code is given read-only access, and the same virtual address is mapped to the same physical address. In this way, there is no need to have multiple copies of the same code in physical memory, all this assuming ASLR is not enabled.
Now concerning the data part: modern OSs like Linux use demand paging, that is, a page is brought into physical memory only when it is accessed (read or written). At that point, the kernel can make sure to assign a unique physical page. I don't know what the purpose of a system-wide page table is, though.
A system-wide page table would be used by the kernel, which in most systems is always mapped into memory. (Typically a 32-bit system will allocate the lower 2-3 GB of virtual address space to the user process and the upper 1-2 GB to the kernel.) Making the kernel mappings common across all processes means you don't have to worry about whether the kernel code you're about to run is mapped when you enter a system call from userland.

Mapping of Virtual Address to Physical Address

I have a doubt: when each process has its own separate page table, why is a system-wide page table required? Also, if a page table maps virtual addresses to physical addresses, then I think two processes may map to the same physical address, because all processes have the same virtual address space. Any good link on system-wide page tables would also solve my problem.
Each process has its own independent virtual address space - two processes can have virtpage 1 map to different physpages. Processes can participate in shared memory, in which case they each have some virtpage mapping to the same physpage.
The virtual address space of a process can be used to map virtpages to physpages, to memory mapped files, devices, etc. Virtpages don't have to be wired to RAM. A process could memory-map an entire 1GB file - in which case, its physical memory usage might only be a couple megs, but its virtual address space usage would be 1GB or more. Many processes could do this, in which case the sum of virtual address space usage across all processes might be, say, 40 GB, while the total physical memory usage might be only, say, 100 megs; this is very easy to do on 32-bit systems.
Since lots of processes load the same libraries, the OS typically puts the libs in one set of read-only executable pages, and then loads mappings in the virtpage space for each process to point to that one set of pages, to save on physical memory.
Processes may have virtpage mappings that don't point to anything, for instance if part of the process's memory got written to the page file. The process will try to access that page, the CPU will trigger a page fault, and the OS will handle it by suspending the process, reading the pages back into RAM from the page file, and then resuming the process.
There are typically 3 types of page faults. The first type is when the CPU does not have the virtual-to-physical mapping in the TLB: the processor invokes the page-fault software interrupt in the OS, the OS puts the mapping into the processor for that process, and then the process re-runs the offending instruction. These happen thousands of times a second.
The second type is when the OS has no mapping because, say, the memory for the process has been swapped to disk, as explained above. These happen infrequently on a lightly loaded machine, but happen more often as memory pressure is increased, up to 100s to 1000s of times per second, maybe even more.
The third type is when the OS has no mapping because the mapping does not exist - the process is trying to access memory that does not belong to it. This generates a segfault, and typically, the process is killed. These aren't supposed to happen often, and solely depend on how well written the software is on the machine, and does not have anything to do with scheduling or machine load.
Even if you already knew all that, I figured I'd throw it in for the community.
