I'm on 32bit machine. From what I understand, User space's address ranges from 0x00000000 to 0xbfffffff, and kernel's ranges from 0xc0000000 to 0xffffffff.
But when I used pmap to see a process's memory allocation, I saw that the libraries are loaded at around 0xf7777777. Does that mean those libraries are loaded in kernel space? And when I used mmap(), I got an address around 0xe0000000. So did mmap() get memory from kernel space?
I'm on 32bit machine. From what I understand, User space's address
ranges from 0x00000000 to 0xbfffffff, and kernel's ranges from
0xc0000000 to 0xffffffff.
Not exactly. Kernel memory space starts at 0xC0000000, but it doesn't have to fill the entire GB. In fact, it fills up to virtual address 0xF7FFFFFF. This covers 896MB of physical memory. Virtual addresses 0xF8000000 and up are used as a 128MB window for the kernel to map any region of physical memory beyond the 896MB limit.
All user processes share the same memory map for virtual addresses 0xC0000000 and beyond, so if the kernel does not use its entire GB of virtual space, it may reuse part of it to map commonly used shared libraries, so every process can see them.
Architecture: X86, Linux
I have gone through several threads related to kmalloc() and vmalloc(), and I know the basic differences between them, but I still have some doubts. I need your help to understand the fundamentals.
As we know, in 32-bit systems the virtual address space is divided between user and kernel: the high 1GB of addresses is allocated to the kernel and the lower 3GB to user space.
Please correct me if I am wrong here:
So out of that 1GB, 896MB is mapped 1:1 (physically contiguous) to kernel logical addresses, and 128MB is used to access high memory, i.e., physical RAM beyond 896MB, by mapping it into that 128MB window using vmalloc()? (Let's say we have 2GB of RAM.)
And what would happen if we have only 1GB of physical RAM in point 1? The 896MB is a 1:1 mapping between kernel logical addresses and physical addresses, obtained through kmalloc(). Then what would be the use of the 128MB; why would we need vmalloc() anymore? If we have 1GB of RAM that can be accessed entirely through kernel logical addresses, does that mean this works only when the amount of memory requested by a thread/process can be allocated at physically contiguous addresses? And if no physically contiguous memory is available, the allocated memory will not be physically contiguous, so we need vmalloc() to map it into the 128MB window and access it through there?
What happens on a 64-bit system? Is it true that both kmalloc() and vmalloc() then have an enormous amount of address space and can access the entire RAM?
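To illustrate the distinction being asked about, here is a minimal, hedged kernel-module sketch (kernel code, not runnable in userspace; it uses only the standard kmalloc()/vmalloc() APIs, and assumes a 32-bit x86 build):

```c
/* kmalloc() returns a kernel *logical* address: physically contiguous,
 * mapped 1:1 into the lowmem (<=896MB on 32-bit) region, so
 * virt_to_phys() works on it.
 * vmalloc() returns a kernel *virtual* address: virtually contiguous,
 * but the backing pages may be scattered; on 32-bit they are mapped
 * through the ~128MB vmalloc window, and virt_to_phys() is invalid. */
#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/mm.h>

static void alloc_demo(void)
{
    void *k = kmalloc(PAGE_SIZE, GFP_KERNEL);
    void *v = vmalloc(16 * 1024 * 1024);  /* large: likely not
                                             physically contiguous */
    if (k) {
        phys_addr_t p = virt_to_phys(k);  /* valid: lowmem address */
        pr_info("kmalloc virt=%p phys=%pa\n", k, &p);
        kfree(k);
    }
    if (v) {
        /* virt_to_phys(v) would be WRONG here; per-page translation
         * needs vmalloc_to_page() instead. */
        vfree(v);
    }
}
```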
Arch=x86_64
I am working through a DMA solution following the process outlined in this question:
Direct Memory Access in Linux
My call to ioremap successfully returns with an address, pt.
In my call to remap_pfn_range I use virt_to_phys(pt) >> PAGE_SHIFT to specify the pfn of the area generated by the ioremap call.
When the userspace application using mmap executes and the call to remap_pfn_range is made, the machine crashes. I assume the mapping is off and I am forcing the system to use memory that is already allocated (the screen glitches before exit), but I'm not clear on where the mismatch is occurring. The system has 4GB of RAM, and I reserved 2GB using the kernel boot option mem=2048M.
I use BUFFER_SIZE=1024u*1024u*1024u and BUFFER_OFFSET=2u*1024u*1024u*1024u.
Putting these into pt = ioremap(BUFFER_SIZE, BUFFER_OFFSET), I believe pt should be a virtual address to the physical memory located from the 2GB boundary up to the 3GB boundary. Is this assumption accurate?
When I execute my kernel module but change my remap_pfn_range to use vma->vm_pgoff >> PAGE_SHIFT as the target pfn, the code executes with no error and I can read and write to the memory. However, this is not using the reserved physical memory that I intended.
Since everything works when using vma->vm_pgoff >> PAGE_SHIFT, I believe the culprit is between my ioremap and the remap_pfn_range.
Thanks for any suggestions!
The motivation behind the use of this kernel module is the need for large contiguous buffers for DMA from a PCI device. In this application, recompiling the kernel isn't an option so I'm trying to accomplish it with a module + hardware.
My call to ioremap successfully returns with an address, pt.
In my call to remap_pfn_range I use, virt_to_phys(pt) >> PAGE_SHIFT,
to specify the pfn of the area generated by the ioremap call.
This is illegal, because ioremap() reserves a virtual region in the vmalloc area. virt_to_phys() is valid only for the linearly mapped (lowmem) part of memory.
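Since the physical address of the reserved region is already known, one hedged fix is to compute the pfn from that physical address directly, with no virt_to_phys() round-trip at all. A sketch, reusing the question's own BUFFER_OFFSET and BUFFER_SIZE macros (kernel code, assumptions only):

```c
/* Sketch of an mmap file operation for the reserved region.
 * BUFFER_OFFSET is the physical start (the 2GB boundary) of the
 * memory hidden from the kernel with mem=2048M; the pfn comes
 * straight from it, so no virt_to_phys() is needed. */
static int my_mmap(struct file *filp, struct vm_area_struct *vma)
{
    unsigned long size = vma->vm_end - vma->vm_start;
    unsigned long pfn  = (BUFFER_OFFSET >> PAGE_SHIFT) + vma->vm_pgoff;

    if (size > BUFFER_SIZE)
        return -EINVAL;

    return remap_pfn_range(vma, vma->vm_start, pfn, size,
                           vma->vm_page_prot);
}
```

The ioremap() mapping is then only needed if the module itself wants to access the buffer, not for handing it to userspace.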
putting these into pt=ioremap(BUFFER_SIZE,BUFFER_OFFSET) I believe pt
should equal a virtual address to the physical memory located at the
2GB boundary up to the 3GB boundary. Is this assumption accurate?
That is not exactly true. For example, on my machine:
cat /proc/iomem
...
00001000-0009ebff : System RAM
...
00100000-1fffffff : System RAM
...
There may be several memory banks, and the memory will not necessarily start at address 0x0 of the physical address space.
This might be useful for you: the Dynamic DMA mapping Guide.
I came across the following paragraph in an article about malloc.
The heap is a continuous (in terms of virtual addresses) space of
memory with three bounds: a starting point, a maximum limit (managed
through sys/resource.h's functions getrlimit(2) and setrlimit(2)) and
an end point called the break. The break marks the end of the mapped
memory space, that is, the part of the virtual address space that has
correspondence into real memory.
I would like to better understand the concept of mapped region and unmapped region.
If memory addresses are 64 bits long, as in many modern computers, you have 18446744073709551616 possible memory addresses. (It depends on the processor architecture how many bits can actually be used, but addresses are stored using 64 bits.) That is more than 17 billion gigabytes, which is probably more memory than your computer actually has. So only some of those 17 billion gigabytes correspond to actual memory. For the rest of the addresses, the memory simply doesn't exist. There is no correspondence between the memory address and a memory location. Those addresses are, therefore, unmapped.
That is the simple explanation. In reality, it's a bit more complicated. The memory addresses of your program are not the actual memory addresses of the memory chips, the physical memory, in your computer. Instead, it is virtual memory. Each process has its own memory space, that is, its own 18446744073709551616 addresses, and the memory addresses that a process uses are translated to physical memory addresses by the computer hardware. So one process may have stored some data at memory address 4711, which is actually stored in a real physical memory chip over here, and another process may have also stored some data at memory address 4711, but that is a completely different place, stored in a real physical memory chip over there. The process-internal virtual memory addresses are translated, or mapped, to actual physical memory, but not all of them. The rest, again, are unmapped.
That is, of course, also a simplified explanation. You can use more virtual memory than the amount of physical memory in your computer. This is done by paging, that is, taking some chunks (called pages) of memory not being used right now, and storing them on disk until they are needed again. (This is also called "swapping", even if that term originally meant writing all the memory of a process to disk, not just parts of it.)
And to complicate it even further, some modern operating systems such as Linux and macOS (but, I am told, not Windows) overcommit when they allocate memory. This means that they allocate more memory addresses than can be stored on the computer, even using the disk. For example, my computer here, with 32 gigabytes of physical memory and just 4 gigabytes available for paging data out to disk, can't possibly allow for more than 36 gigabytes of actual, usable, virtual memory. But malloc happily allocates more than one hundred thousand gigabytes. It is not until I actually try to store things in all that memory that it is connected to physical memory or disk. But it was part of my virtual memory space, so I would call that too mapped memory, even though it wasn't mapped to anything.
The mapped region of the heap is the virtual memory area that is backed by (or can be backed by) physical memory. The unmapped region is the unused virtual address space beyond it, which does not correspond to any physical memory.
The boundary between the heap's mapped region and the unmapped region is the break. As malloc() is used to request memory, the break may be moved upward to enlarge the mapped region. Linux offers the brk() and sbrk() calls to raise and lower the virtual address of the break.
I'm writing a kernel from scratch and am confused about what will happen once I initialize paging and map my kernel to a different location in virtual memory. My kernel is loaded to physical address 0x100000 on startup, but I plan on mapping it to the virtual address 0xC0100000 so I can leave virtual address 0x100000 available for VM86 processes (more specifically, I plan on mapping physical addresses 0x100000 through 0x40000000 to virtual addresses 0xC0100000 through 0xFFFFF000).

Anyway, I have a bitmap to keep track of my page frames, located at physical address 0x108000, with the address stored in a uint32_t pointer. My question is: what will happen to this pointer when I initialize paging? Will it still point to my bitmap located at physical address 0x108000, or will it point to whatever virtual address 0x108000 is mapped to in my page table?

If the latter is true, how do I get around the problem that my pointers will not be correct once paging is enabled? Will I have to update my pointers, or am I going about this completely wrong?
I understand that if I try to print the address of an element of an array, it will be an address from virtual memory, not from real (physical) memory, i.e., DRAM.
printf("Addresses of A[5] and A[6] are %p and %p\n", (void *)&A[5], (void *)&A[6]);
I found the addresses were consecutive (the elements are chars). In reality they may not be consecutive, at least not in DRAM. I want to know the real addresses. How do I get them?
I need to know this for either Windows or Linux.
You can't get the physical address for a virtual address from user code; only the lowest levels of the kernel deal with physical addresses, and you'd have to intercept things there.
Note that the physical address for a virtual address may not be constant while the program runs: the page might be paged out from one physical address and paged back in at a different physical address. And if you make a system call, this remapping could happen between the time the kernel identifies the physical address and the time the call completes, because the program requesting the information was descheduled and partially paged out, then paged in again.
The simple answer is that, in general, for user processes or threads in a multiprocessing OS such as Windows or Linux, it is not possible to find the address even of a static variable in the processor's memory address space, let alone the DRAM address.
There are a number of reasons for this:
The memory allocated to a process is virtual memory. The OS can remap this process's memory from time to time from one physical address range to another, and there is no way to detect this remapping in the user process. That is, the physical address of a variable can change during the lifetime of a process.
There is no general interface from userspace to kernel space that would allow a userspace process to walk the kernel's process table and page tables to find its physical addresses. On Linux, a privileged process can read /proc/<pid>/pagemap, or you can write a kernel module or driver that performs the translation.
The DRAM is often mapped into the processor's address space through a memory management unit (MMU) and memory cache. Although the MMU mapping of DRAM to the processor address space is usually done only once, during system boot, the processor's use of the cache can mean that values written to a variable might not be written through to the DRAM in all cases.
There are OS-specific ways to "pin" a block of allocated memory to a static physical location. This is often done by device drivers that use DMA. However, this requires a level of privilege not available to userspace processes, and, even if you have the physical address of such a block, there is no pragma or directive in the commonly used linkers that you could use to allocate the BSS for a process at such a physical address.
Even inside the Linux kernel, virtual to physical address translation is not possible in the general case, and requires knowledge about the means that were used to allocate the memory to which a particular virtual address refers.
Here is a link to an article called Translating Virtual to Physical Address on Windows: Physical Addresses that gives you a hint as to the extreme ends to which you must go to get physical addresses on Windows.