How does making the virtual address contiguous in physical address zone improve performance?


Recently I have been reading the hugepage code in DPDK (dpdk.org). I see that the code deliberately makes the virtual addresses contiguous over a physically contiguous zone. Specifically, it first checks if there exists a physically contiguous zone in the hugepages and map the physically contiguous zone into contiguous virtual address. How does this improve the performance?
The source code says:
To reserve a big contiguous amount of memory, we use the hugepage feature of linux. For that, we need to have hugetlbfs mounted. This code will create many files in this directory (one per page) and map them in virtual memory. For each page, we will retrieve its physical address and remap it in order to have a virtual contiguous zone as well as a physical contiguous zone.
Why is this remapping necessary?

map the physically contiguous zone into contiguous virtual address. How does this improve the performance?
DPDK needs both physical and virtual addresses. The virtual address is used normally, to load/store some data. The physical address is necessary for the userspace drivers to transfer data to/from devices.
For example, we allocate a few mbufs with virtual addresses 0x41000, 0x42000 and 0x43000. Then we fill them with some data and pass those virtual addresses to the PMD (poll mode driver) for transfer.
The driver has to convert those virtual addresses to physical ones. If the physical pages are mapped into the virtual address space noncontiguously, converting a virtual address to a physical one means searching through all the mappings. For example, virtual address 0x41000 might correspond to physical 0x81000, 0x42000 to 0x16000, and 0x43000 to 0x64000.
The best case of such a search is one memory read; the worst case is a few memory reads for each buffer.
But if we are sure that both the virtual and the physical addresses of a memory zone are contiguous, we simply add an offset to the virtual address to get the physical one, and vice versa. For example, virtual 0x41000 corresponds to physical 0x81000, virtual 0x42000 to physical 0x82000, and virtual 0x43000 to physical 0x83000.
We know the offset from the mapping. The worst case of such a translation is one memory read for all the buffers in a burst, which is a huge improvement.
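The difference between the two cases can be sketched in C as follows. This is only an illustration under assumptions, not DPDK's actual code: the names page_map, lookup_phys, memzone and zone_virt_to_phys are made up for this example, and the page size and addresses are the ones used in the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u

/* Hypothetical per-page mapping entry (noncontiguous case). */
struct page_map {
    uintptr_t virt;  /* page-aligned virtual address */
    uint64_t  phys;  /* page-aligned physical address */
};

/* Noncontiguous case: search the table for the page that contains va.
 * Returns 0 and stores the physical address, or -1 if va is not mapped. */
static int lookup_phys(const struct page_map *maps, size_t n,
                       uintptr_t va, uint64_t *pa)
{
    for (size_t i = 0; i < n; i++) {
        if (va >= maps[i].virt && va - maps[i].virt < PAGE_SIZE) {
            *pa = maps[i].phys + (va - maps[i].virt);
            return 0;
        }
    }
    return -1;
}

/* Hypothetical descriptor of a zone that is contiguous both
 * virtually and physically. */
struct memzone {
    uintptr_t virt_base;
    uint64_t  phys_base;
    size_t    len;
};

/* Contiguous case: one subtraction and one addition, no search. */
static uint64_t zone_virt_to_phys(const struct memzone *z, uintptr_t va)
{
    assert(va >= z->virt_base && va - z->virt_base < z->len);
    return z->phys_base + (va - z->virt_base);
}
```

With the table {0x41000 -> 0x81000, 0x42000 -> 0x16000, 0x43000 -> 0x64000}, lookup_phys has to scan entries for every buffer, while a zone with virt_base 0x41000 and phys_base 0x81000 translates any address in it with the same constant offset.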
Why is this remapping necessary?
To map a huge page into a virtual address space, the mmap system call is used. Its API allows specifying a fixed virtual address at which the huge page is to be mapped. This makes it possible to map huge pages one after another, creating a contiguous virtual memory zone. For example, we can mmap a huge page at virtual address 0x200000, the next one at virtual address 0x400000, and so on.
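A minimal sketch of mapping pages back to back at fixed addresses might look like this. It is an assumption-laden illustration rather than what DPDK does: it uses ordinary anonymous pages instead of files on hugetlbfs, it first reserves an address range so that MAP_FIXED cannot overwrite unrelated mappings, and the function name map_pages_back_to_back is made up.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve address space for n pages, then map each page at a fixed
 * address inside that reservation, one after another.
 * Returns the base of the contiguous virtual range, or NULL on failure. */
static void *map_pages_back_to_back(size_t n, size_t page_size)
{
    assert(n > 0 && page_size > 0);

    /* Step 1: reserve a contiguous virtual range with no access rights. */
    char *base = mmap(NULL, n * page_size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Step 2: map every page at the exact address we want.
     * In the hugepage case each of these would be a hugetlbfs file. */
    for (size_t i = 0; i < n; i++) {
        void *want = base + i * page_size;
        void *got = mmap(want, page_size, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        if (got != want) {
            munmap(base, n * page_size);
            return NULL;
        }
    }
    return base;
}
```

The returned range is contiguous only in virtual address space; nothing here controls which physical pages the kernel picks.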
Unfortunately, we don't know the physical addresses of the huge pages until they are mapped. So at virtual address 0x200000 we might map physical address 0x800000, and at virtual address 0x400000 physical address 0x600000.
But once we have mapped those huge pages for the first time, we know both their physical and virtual addresses. So all we need to do is remap them in the correct order: at virtual address 0x1200000 we map physical 0x600000, and at 0x1400000 physical 0x800000.
Now we have a virtually and physically contiguous memory zone starting at virtual address 0x1200000 and physical address 0x600000. So to convert virtual to physical addresses in this memory zone, we just need to subtract the offset 0xC00000 (0x1200000 - 0x600000) from the virtual address, as described previously.
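The reordering step can be sketched as a small simulation. No real mmap calls are made here; the code only computes which virtual address each page should be remapped to. The names hugepage and plan_remap are hypothetical, and the sample values are the ones from the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record of one huge page after its first mapping. */
struct hugepage {
    uintptr_t virt;  /* where it is (or should be) mapped */
    uint64_t  phys;  /* physical address found after the first mapping */
};

static int cmp_phys(const void *a, const void *b)
{
    const struct hugepage *x = a, *y = b;
    return (x->phys > y->phys) - (x->phys < y->phys);
}

/* Sort pages by physical address and assign consecutive virtual
 * addresses starting at new_base. Returns how many pages, counted from
 * the first one, are also physically contiguous. */
static size_t plan_remap(struct hugepage *pages, size_t n,
                         uintptr_t new_base, size_t page_size)
{
    assert(n > 0);
    qsort(pages, n, sizeof(pages[0]), cmp_phys);

    for (size_t i = 0; i < n; i++)
        pages[i].virt = new_base + i * page_size;

    size_t run = 1;
    while (run < n && pages[run].phys == pages[run - 1].phys + page_size)
        run++;
    return run;
}
```

For the two pages above (physical 0x800000 and 0x600000, page size 0x200000, new base 0x1200000), the plan puts physical 0x600000 at virtual 0x1200000 and physical 0x800000 at virtual 0x1400000, so virtual minus physical is 0xC00000 for both.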
Hope this clarifies a bit the idea of contiguous memory zones and remapping.

Related

Why both ioremap() and kmap() are needed?

In the case of memory-mapped I/O, both the memory-mapped I/O region and RAM are in the same CPU physical address space.
When we map highmem memory we use kmap(), and when we need to map memory-mapped I/O we use ioremap().
So my question is: why do we need two separate functions to get a kernel virtual address, since both (memory-mapped I/O and RAM) are in the same physical address space?
Why can't we have a single function, as both create a virtual address for some CPU physical address?

What is virtual memory? [duplicate]

What is virtual memory, and how does it differ from physical memory (RAM)? I have read that physical memory is stored on something on the motherboard, while virtual memory is stored on disk.
Somewhere it also says that virtual spaces are used only when the physical memory is filled, which confused me a lot.
Then, why Windows uses virtual memory? Is it because the RAMs are small-spaced and not designed for big storage, so use the virtual to store more bigger-sized things?
The next thing is about addresses. Since virtual memory is on disk, it shouldn't share the addresses of physical memory. So they have independent addresses. Is that right?
And,
When writing to the memory of another process, why is it recommended to use VirtualAlloc instead of HeapAlloc?
Is it true that virtual memory is per-process and physical memory is shared between processes?

"Virtual memory" means there is a valid address space, which does not map to any particular physical memory or storage, hence virtual. In context of modern common operating systems, each process has its own virtual memory space, with overlapping virtual memory addresses.
This address space is divided into pages for easier management (example size 4 KB). Each valid page can be in one of 3 different states:
Not stored physically (assumed to be all 0). If the process writes to this kind of page, it needs to be given a page of physical memory (by the OS, see below) so the value can be stored.
Mapped to physical memory, meaning some page-sized area in the computer's RAM stores the contents, and they can be used directly by the process.
Swapped out to disk (for example to a swap file), in order to free physical RAM pages (done automatically by the operating system). If the process accesses the page (read or write), it needs to be loaded into a page in RAM first (see above).
Only when a virtual memory page is mapped to a physical RAM page is there something there. In other cases, if the process accesses that page, there is a CPU exception, which transfers control to the operating system. The OS then needs to either map that virtual memory page to RAM (possibly needing to free some RAM first by swapping current data out to the swap file, or terminating some application if out of all memory) and load the right data into it, or it can terminate the application (the address was not in a valid range, or is read-only but the process tries to write).
The same page of memory can also be mapped to several places at once, for example with shared memory, so the same data can be accessed by several processes at once (the virtual address is probably different in each, so pointer variables can't be shared).
Another special case of virtual memory use is mapping a regular file on disk to virtual memory (the same thing that happens with a swap file, but now controlled by a normal application process). The OS then takes care of actually reading bytes (in page-sized chunks) from disk and writing changes back; the process can just access the memory like any other memory.
Every modern multitasking general-purpose operating system uses virtual memory, because the CPUs they run on support it, and because it solves a whole bunch of problems, for example memory fragmentation, transparent swapping to disk, memory protection... They could be solved differently, but virtual memory is the way today.
Physical memory is shared between processes the same way as the computer's power supply is shared, or the CPU is shared. It is part of the physical computer. A normal process never handles actual physical memory addresses; all it sees is virtual memory, which may be mapped to different physical locations.
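As a rough POSIX illustration of file mapping (Windows has its own, different API for this), the sketch below changes the start of a file by writing to memory. The helper name patch_file_via_mmap is made up, and the code assumes the file already exists and is at least len bytes long, since touching a mapping beyond the end of the file would fault.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Overwrite the first len bytes of an existing file through a shared
 * memory mapping. Returns 0 on success, -1 on failure.
 * Assumption: the file is at least len bytes long. */
static int patch_file_via_mmap(const char *path, const char *data, size_t len)
{
    assert(len > 0);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    memcpy(p, data, len);            /* an ordinary memory write */

    int rc = msync(p, len, MS_SYNC); /* ask the OS to write the pages back */
    munmap(p, len);
    return rc;
}
```

The process never calls read or write on the file here; the OS pages the data in on first access and writes the modified page back.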
The contents of virtual memory are not normally shared, except when they are (when using shared memory for example).
I'm not sure what you mean by "When collecting memory for other process", so I can't answer that.

Virtual memory can essentially be thought of as a per-process virtual address space that's mapped to physical addresses. In the case of x86 there is a register, CR3, that points to the translation table for the current process. When allocating new memory the OS will allocate physical memory, which may not even be contiguous, and then set a free contiguous virtual region to point to that physical memory. Whenever the CPU accesses any virtual memory, it uses the translation table that CR3 points to in order to convert the address to the actual physical address.
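To make the lookup concrete, here is a toy software model of a two-level walk in the style of classic 32-bit x86 paging without PAE (10-bit directory index, 10-bit table index, 12-bit offset). It is a simplification under assumptions: real page directory entries hold physical addresses and more flag bits, while this model uses ordinary C pointers and only a present bit. The names page_dir, page_table and translate are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT 0x1u
#define ENTRIES     1024

/* Toy page table: each entry is (frame address | flags). */
struct page_table {
    uint32_t entry[ENTRIES];
};

/* Toy page directory: a NULL pointer means "not present".
 * On real hardware, CR3 holds the physical address of this structure. */
struct page_dir {
    struct page_table *table[ENTRIES];
};

/* Walk the two levels for va. Returns 0 and stores the physical
 * address, or -1 where real hardware would raise a page fault. */
static int translate(const struct page_dir *cr3, uint32_t va, uint32_t *pa)
{
    assert(cr3 != NULL && pa != NULL);

    uint32_t dir_idx = va >> 22;             /* top 10 bits */
    uint32_t tbl_idx = (va >> 12) & 0x3FFu;  /* middle 10 bits */
    uint32_t offset  = va & 0xFFFu;          /* low 12 bits */

    const struct page_table *pt = cr3->table[dir_idx];
    if (pt == NULL)
        return -1;

    uint32_t pte = pt->entry[tbl_idx];
    if (!(pte & PTE_PRESENT))
        return -1;

    *pa = (pte & ~0xFFFu) | offset;
    return 0;
}
```

Switching processes then amounts to pointing the CPU at a different directory, which is why the same virtual address can resolve to different physical addresses in different processes.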
More Info
https://en.m.wikipedia.org/wiki/Control_register#CR3
https://en.m.wikipedia.org/wiki/Page_table

To quote Wikipedia:
In computing, virtual memory (also virtual storage) is a memory management technique that provides an "idealized abstraction of the storage resources that are actually available on a given machine" which "creates the illusion to users of a very large (main) memory."
Because virtual memory is an illusory memory (so, non-existent), some computer resource other than RAM is used. In this case, the resource used is the disk, because it has a lot of space, more than RAM, where the OS can run its VM stuff.
Somewhere it also says that virtual spaces are used only when the physical memory is filled, which confused me a lot.
It shouldn't. VM uses the disk and I/O with the disk is much slower than I/O with the RAM. This is why physical memory is preferred nowadays and VM is used when physical memory is not enough.
Then, why Windows uses virtual memory? Is it because the RAMs are small-spaced and not designed for big storage, so use the virtual to store more bigger-sized things?
This is one of the main reasons, yes. In the past (the 70s), computer memory was very expensive so workarounds had to be conceived.

What happens to my pointers when I initialize paging? C kernel dev

I'm writing a kernel from scratch and am confused about what will happen once I initialize paging and map my kernel to a different location in virtual memory. My kernel is loaded to physical address 0x100000 on startup but I plan on mapping it to the virtual address 0xC0100000 so I can leave virtual address 0x100000 available for VM86 processes (more specifically, I plan on mapping physical addresses 0x100000 through 0x40000000 to virtual addresses 0xC0100000 through 0xFFFFF000). Anyway, I have a bitmap to keep track of my page frames located at physical address 0x108000 with the address stored in a uint32_t pointer. My question is, what will happen to this pointer when I initialize paging? Will it still point to my bitmap located at physical address 0x108000 or will it point to whatever the virtual address 0x108000 is mapped to in my page table? If the latter is true, how do I get around the problem that my pointers will not be correct once paging is enabled? Will I have to update my pointers or am I going about this completely wrong?
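As a hedged illustration of the situation described: once paging is on, the CPU treats every pointer value as a virtual address, so a pointer that still holds 0x108000 refers to whatever virtual 0x108000 is mapped to. Assuming the linear mapping from the question (physical 0x100000 at virtual 0xC0100000, that is, a constant offset of 0xC0000000), addresses can be converted with helpers like the ones below. The names KERNEL_OFFSET, phys_to_virt and virt_to_phys are made up, and addresses are kept as uint32_t rather than pointers so the sketch also compiles on a 64-bit host.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constant offset of the kernel's linear mapping:
 * virtual 0xC0100000 minus physical 0x100000. */
#define KERNEL_OFFSET 0xC0000000u

/* Physical address inside the linearly mapped range -> virtual address. */
static inline uint32_t phys_to_virt(uint32_t pa)
{
    assert(pa < 0x40000000u);  /* range mapped in the question */
    return pa + KERNEL_OFFSET;
}

/* Virtual address inside the linear mapping -> physical address. */
static inline uint32_t virt_to_phys(uint32_t va)
{
    assert(va >= KERNEL_OFFSET);
    return va - KERNEL_OFFSET;
}
```

Under that assumption, the bitmap at physical 0x108000 would be reached through virtual 0xC0108000 after paging is enabled.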

Do we require MMU when virtual address space is equal to physical address space?

The MMU is used to translate virtual addresses to physical addresses for a running process with the help of the page table corresponding to that process.
Let's take a scenario where the virtual address space is equal in size to the physical address space. Do we really require an MMU in that case, as we won't be having a situation where the same virtual address maps to different physical addresses?
Let's say
Virtual Address
| 20 bits (V) | 12 bits (PO) |
Physical Address
| 20 bits (PPN) | 12 bits (PO) |
where V = Virtual Page
PO = Page offset
PPN = Physical Page Number
Do we really require Page Table for every process?
What problems might appear when there are more than one process?
Please neglect cache memory for simplification.
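The 20/12 split described above can be written out in C as follows. This is a small sketch with made-up helper names (page_number, page_offset, make_address); it only shows the bit manipulation, not a page table.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET_BITS 12u
#define PAGE_OFFSET_MASK ((1u << PAGE_OFFSET_BITS) - 1u)

/* Upper 20 bits: V for a virtual address, PPN for a physical one. */
static uint32_t page_number(uint32_t addr)
{
    return addr >> PAGE_OFFSET_BITS;
}

/* Lower 12 bits: the page offset (PO), unchanged by translation. */
static uint32_t page_offset(uint32_t addr)
{
    return addr & PAGE_OFFSET_MASK;
}

/* Rebuild an address from a 20-bit page number and a 12-bit offset. */
static uint32_t make_address(uint32_t pn, uint32_t po)
{
    assert(pn < (1u << 20) && po <= PAGE_OFFSET_MASK);
    return (pn << PAGE_OFFSET_BITS) | po;
}
```

Translation replaces V with a PPN and keeps PO: if a page table maps V = 0x12345 to PPN = 0x54321, virtual 0x12345678 becomes physical make_address(0x54321, page_offset(0x12345678)).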

The MMU does much more than mapping a virtual address space to a physical address space of different size. The most important point of an MMU is memory protection, which is relevant even if both address spaces have the same size:
The MMU handles pages (e.g. 4 KB in size) of virtual memory that are mapped to pages of physical memory.
In most systems, there is not only a single virtual address space, but one for every process. Under MMU control, every process can only access pages as allowed by the operating system (which programs the MMU). Most pages of different processes are isolated from each other, so that e.g. one process cannot crash another process by writing into its memory.
Mapping of virtual to physical pages under OS control allows address space randomization, so that reading across a virtual page boundary results in reading random data instead of certain data (a protection against e.g. buffer overflow attacks).
Moreover, even if there is a single process, pages can be marked as read-write, read-only, execute-only, or access forbidden. This makes it possible to restrict a process to accessing its own pages only in a permitted way, e.g. it can make it impossible to execute stored "data".
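The per-page permission check can be pictured with a toy model like the one below. The flag names and the function access_allowed are invented for this sketch; real MMUs encode permissions differently in their page table entries (and not every architecture can express every combination), so this only shows the idea of checking a requested access against per-page bits.

```c
#include <assert.h>

/* Toy permission bits for one page (not a real hardware encoding). */
enum {
    PERM_READ  = 1 << 0,
    PERM_WRITE = 1 << 1,
    PERM_EXEC  = 1 << 2
};

/* Returns 1 if every requested kind of access is allowed by the page's
 * permission bits, 0 if the access would fault. */
static int access_allowed(unsigned page_perms, unsigned requested)
{
    assert((requested & ~(unsigned)(PERM_READ | PERM_WRITE | PERM_EXEC)) == 0);
    return (page_perms & requested) == requested;
}
```

A data page marked PERM_READ | PERM_WRITE would pass a write check but fail an execute check, which is the "cannot execute data" case mentioned above.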
Some more information can be found here.

How do I get DRAM address instead of Virtual address

I understand that if I try to print the address of an element of an array, it will be an address in virtual memory, not in real memory (physical memory), i.e. DRAM.
printf("Address of A[5] and A[6] are %p and %p", (void *)&A[5], (void *)&A[6]);
I found addresses were consecutive (assuming elements are chars). In reality they may not be consecutive at least not in the DRAM. I want to know the real addresses. How do I get that?
I need to know this for either Windows or Linux.

In general, you can't get the physical address for a virtual address from user code; only the lowest levels of the kernel deal with physical addresses, and you'd have to intercept things there.
Note that the physical address for a virtual address may not be constant while the program runs — the page might be paged out from one physical address and paged back in to a different physical address. And if you make a system call, this remapping could happen between the time when the kernel identifies the physical address and when the function call completes because the program requesting the information was unscheduled and partially paged out and then paged in again.

The simple answer is that, in general, for user processes or threads in a multiprocessing OS such as Windows or Linux, it is not possible to find the address even of a static variable in the processor's memory address space, let alone the DRAM address.
There are a number of reasons for this:
The memory allocated to a process is virtual memory. The OS can remap this process memory from time to time from one physical address range to another, and there is no way to detect this remapping in the user process. That is, the physical address of a variable can change during the lifetime of a process.
There is no general-purpose interface from userspace to kernel space that would allow an ordinary userspace process to walk through the kernel's process table and page cache in order to find the physical address of the process. In Linux you can write a kernel module or driver that can do this.
The DRAM is often mapped to the processor address space through a memory management unit (MMU) and memory cache. Although the MMU mapping of DRAM to the processor address space is usually done only once, during system boot, the processor's use of the cache can mean that values written to a variable might not be written through to the DRAM in all cases.
There are OS-specific ways to "pin" a block of allocated memory to a static physical location. This is often done by device drivers that use DMA. However, this requires a level of privilege not available to userspace processes, and, even if you have the physical address of such a block, there is no pragma or directive in the commonly used linkers that you could use to allocate the BSS for a process at such a physical address.
Even inside the Linux kernel, virtual to physical address translation is not possible in the general case, and requires knowledge about the means that were used to allocate the memory to which a particular virtual address refers.
Here is a link to an article called Translating Virtual to Physical Address on Windows: Physical Addresses that gives you a hint as to the extreme ends to which you must go to get physical addresses on Windows.
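On Linux there is one narrow exception worth knowing about: the /proc/self/pagemap file holds one 64-bit entry per virtual page, with bit 63 meaning "page present" and bits 0-54 holding the page frame number. Since Linux 4.0 the frame number is only visible to processes with CAP_SYS_ADMIN (otherwise it reads as zero), and the caveats above still apply: the page can move unless it is pinned. The sketch below is an assumption-laden illustration; the function names pagemap_decode and virt_to_phys_linux are made up.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define PM_PRESENT  (1ULL << 63)        /* bit 63: page is present in RAM */
#define PM_PFN_MASK ((1ULL << 55) - 1)  /* bits 0-54: page frame number */

/* Decode one pagemap entry for virtual address va.
 * Returns 0 and stores the physical address, or -1 if the page is not
 * present or the frame number is hidden (reads as 0 without privilege). */
static int pagemap_decode(uint64_t entry, uintptr_t va,
                          uint64_t page_size, uint64_t *pa)
{
    assert(page_size != 0);
    uint64_t pfn = entry & PM_PFN_MASK;
    if (!(entry & PM_PRESENT) || pfn == 0)
        return -1;
    *pa = pfn * page_size + (va % page_size);
    return 0;
}

/* Look up the current physical address behind ptr (Linux only).
 * The result is a snapshot: the kernel may move the page afterwards. */
static int virt_to_phys_linux(const void *ptr, uint64_t *pa)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return -1;

    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;

    uintptr_t va = (uintptr_t)ptr;
    uint64_t entry = 0;
    off_t off = (off_t)(va / (uintptr_t)ps) * (off_t)sizeof(entry);
    ssize_t n = pread(fd, &entry, sizeof(entry), off);
    close(fd);
    if (n != (ssize_t)sizeof(entry))
        return -1;

    return pagemap_decode(entry, va, (uint64_t)ps, pa);
}
```

Run without privileges, virt_to_phys_linux is expected to fail with -1 on recent kernels, which matches the point made in the answers: ordinary user code does not get to see physical addresses.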