I/O memory region remap - c

The main reason for I/O memory region is to read/write anything to that memory.
If the register address is given, we can use readx/writex (x stands for b/l/w).
Then why do we have to use the address returned by io_remap which is nothing but the same as the address of the particular register given in the data sheet?

ioremap is architecture specific function/macro. On some architectures it won't do anything and just basically return the address specified as an argument. It may do much more than that on other architectures, though. Take arm or x86 as an example - the ioremap will do a lot of checks before letting you using the memory region, for example.
What's more important than those checks, however, is that ioremap can setup a mapping of virtual addresses (from vmalloc area) to the requested physical ones and ensure that caching is disabled for addresses you are going to use. So in most cases pointer returned by ioremap will not be the same as just numeric address from the datasheet.
You want caching to be disabled because I/O registers are controlled by some external (from CPU point of view) devices. This means that processor can't know when its content changed, making cache content invalid.

The thing returned by request_mem_region is a struct resource *, you don't use it to access the I/O memory, and you don't have to do much of anything with it except check it for NULL. request_mem_region isn't part of the mapping you need to do to access the I/O, and your driver would actually (probably) work without it, but by calling it you make some information available in kernel data structures, and make sure that two drivers aren't trying to use overlapping memory ranges.

Related

Reserving memory for DMA usage

I am trying to make use of a contiguous memory i reserved while passing the "mem" parameter to Linux when booting.
Now, i have the physical address of this space i reserved earlier, and the length of it, and i wish to make use of this reserved space for DMA purposes in my driver.
Normally i would use dma_alloc_coherent() , and if i were using CMA i would use that too, but in this case, its different.
Now, i have read that an acceptable way of mapping a physical space to kernel virtual space is to use ioremap
And, an acceptable way of "taking over" a contiguous space for DMA purposes is to use dma_map_single (mapping it for bus address)
I'm having trouble combining the two. ioremap works and returns a virtual address. Now, i have read that this is no ordinary virtual address and i should only be using access methods to read/write from this memory.
Thing is, when i try to pass this virtual address to dma_map_single , it doesn't report an error, but i suspect that this is wrong.
Am i doing it right? What can i do to make it work like it should?
10x
You are doing right
You don't need to allocate the memory because you already set it on boot time but you need to use dam_map_single to prevent cache problems for example if you want to do a DMA from memory to the device but the RAM is not synchronised with the L2 cache (the cache has a newer version) you will get the wrong data so you need to map and unmap before and after the DMA operation

memcpy takes virtual address or physical address?

I am working on Video HAL Application & there I am getting Camera frame CallBack from HAL Layer. During programming I found that memcpy copying data from physical address gets crashed while it is ok by copying data from virtual address. I searched for such a information about memcpy but found no where & even not on its man page.
so, my question is does memcpy required physical address or virtual address? Anywhere mentioned this type of information about memcpy?
memcpy is implemented in C or optimized in assembler. As such, it doesn't care about what type of address it gets. It just loads the addresses in the CPU registers and executes mov instructions.
It is the operating system and memory hardware architecture that are responsible for mapping any logical (virtual) address to a physical address.
Note also that with modern OS/memory architectures, each process gets its own address space. Passing addresses between address spaces will not work.
In these cases, the OS will probably provide functionality to exchange memory objects (shared or otherwise) between processes.
As Paul Ogilvie correctly explained, memcpy deals with user space addresses. As such they are virtual addresses, not necessarily physical addresses.
Yet there is a possibility for very large areas with very specific alignment characteristics to optimize memcpy by requesting the OS to remap some of the destination virtual addresses as duplicates of the physical memory mapped to the source addresses. These pages would get the COPY_ON_WRITE attribute to ensure that the program only modifies pages in the appropriate array, if and when it writes to either one. The generic implementation of memcpy in the GlibC does this (see glibc-2.3.5/sysdeps/generic/memcpy.c). But this is transparent for the programmer who still provides addresses in user space.

Is the Kernel Virtual Memory struct first formed when the process is about to execute?

I have been bothering with similar questions indirectly on my other posts. Now, my understanding is better. Thus, my questions are better. So, I want to summarize the facts here. This example is based on X86-32-bit system.
Please say yes/no to my points. If no, then please explain.
MMU will look into the CR3 register to find the Process - Page Directory base address.
The CR3 register is set by the kernel.
Now MMU after reading the Page directory base address, will offset to the Page Table index (calculated from VA), from here it will read the Page frame number, now it will find the offset on the page frame number based on the VA given. It gets the physical memory address. All this is done in MMU right? Don't know when MMU is disabled, who will do all this circus? If software then it will be slow right?
I know then page fault occurs when the MMU cannot resolve the address. The kernel is informed. The kernel will update the page table based on the reading from kernel virtual memory area struct. Am I correct?
Keeping in mind, the point 4. Does it mean that before executing any process. Perhaps during loading process. Does Kernel first fills the kernel virtual memory area struct. For example, where the section of memory will be BSS, Code, DS,etc. It could be that some sections are in RAM, and some are in Storage device. When the sections of the program is moved from storage to main memory, I am assuming that kernel would be updating the Kernel virtual memory area struct. Am I correct here? So, it is the kernel who keeps a close track on the program location - whether in storage device or RAM - inode number of device and file offset.
Sequence wise -> During Process loading ( may be a loader program)-> Kernel will populate the data in the kernel virtual memory area struct. It will also set the CR3 register. Now Process starts executing, it will initially get some frequent page faults.Now the VM area struct will be updated (if required) and then the page table. Now, MMU will succeed in translating the address. So, when I say process accessing a memory, it is the MMU which is accessing the memory on behalf of the process. This is all about user-space. Kernel space is entirely different. The kernel space doesn't need the MMU, it can directly map to the physical address - low mem. For high mem ( to access user space from kernel space), it will do the temporary page table updation - internally. This is a separate page table for kernel, a temporary one. The kernel space doesn't need MMU. Am I correct?
Don't know when MMU is disabled, who will do all this circus?
Nobody. All this circus is intended to do two things: translate the virtual address you gave it into a real address, and if it can't do that then to abort the instruction entirely and start executing a routine addressed from an architecturally pre-defined address, see "page fault" there for the basic one.
When the MMU is shut off, no translation is done and the address you gave it is fed directly down the CPU's address-processing pipe just as any address the MMU might have translated it to would have been.
So, when I say process accessing a memory, it is the MMU which is accessing the memory on behalf of the process.
You're on the right track here, the MMU is mediating the access, but it isn't doing the access. It's doing only what you described before, translating it. What's generally called the Load/Store unit, gets it next, and it's the one that handles talking to whatever holds the closest good copy of the data at that address, "does the access".
The kernel space doesn't need the MMU, it can directly map to the physical address
That depends on how you define "need". It can certainly shut it off, but it almost never does. First, it has to talk to user space, and the MMU has to be running to translate what user space has to addresses the Load-Store unit can use. Second, the flexibility and protection provided by the MMU are very valuable, they're not discarded without a really compelling reason. I know at least one OS will (or would, it's been a while) run some bulk copies MMU-off, but that's about it.

C accessing memory location

I mean the physical memory, the RAM.
In C you can access any memory address, so how does the operating system then prevent your program from changing memory address which is not in your program's memory space?
Does it set specific memory adresses as begin and end for each program, if so how does it know how much is needed.
Your operating system kernel works closely with memory management (MMU) hardware, when the hardware and OS both support this, to make it impossible to access memory you have been disallowed access to.
Generally speaking, this also means the addresses you access are not physical addresses but rather are virtual addresses, and hardware performs the appropriate translation in order to perform the access.
This is what is called a memory protection. It may be implemented using different methods. I'd recommend you start with a Wikipedia article on this subject — http://en.wikipedia.org/wiki/Memory_protection
Actually, your program is allocated virtual memory, and that's what you work with. The OS gives you a part of the RAM, you can't access other processes' memory (unless it's shared memory, look it up).
It depends on the architecture, on some it's not even possible to prevent a program from crashing the system, but generally the platform provides some means to protect memory and separate address space of different processes.
This has to do with a thing called 'paging', which is provided by the CPU itself. In old operating systems, you had 'real mode', where you could directly access memory addresses. In contrast, paging gives you 'virtual memory', so that you are not accessing the raw memory itself, but rather, what appears to your program to be the entire memory map.
The operating system does "memory management" often coupled with TLB's (Translation Lookaside Buffers) and Virtual Memory, which translate any address to pages, which the operation system can tag readable or executable in the current processes context.
The minimum requirement for a processors MMU or memory management unit is in current context restrict the accessable memory to a range which can be only set in processors registers in supervisor mode (as opposed to user mode).
The logical address is generated by the CPU which is mapped to the physical address by the memory mapping unit. Unlike the physical address space the logical address is not restricted by memory size and you just get to work with the logical address space. The address binding is done by MMU. So you never deal with the physical address directly.
Most computers (and all PCs since the 386) have something called the Memory Management Unit (or MMU). It's job is to translate local addresses used by a program into the physical addresses needed to fetch real bytes from real memory. It's the operating system's job to program the MMU.
As a result of this, programs can be loaded into any region of memory and appear, from that program's point of view while executing, to be be any any other address. It's common to find that the code for all programs appear (locally) to be at the same address and their data always appears (locally) to be at the same address even though physically they will be in different locations. With each memory access, the MMU transparently translates from the local address space to the physical one.
If a program trys to access a memory address that has not been mapped into its local address space, the hardware generates an exception and typically gets flagged as a "segmentation violation", followed by the forcible termination of the program. This protects from accessing the memory of other processes.
But that doesn't have to be the case! On systems with "virtual memory" and current resource demands on RAM that exceed the amount of physical memory, some pages (just blocks of memory of a common size, often on the order of 4-8kB) can be written out to disk and given as RAM to a program trying to allocate and use new memory. Later on, when that page is needed by whatever program owns it, the memory access causes an exception and the OS swaps out some other memory page and re-loads the needed one from disk. The program that "page-faulted" gets delayed while this happens but otherwise notices nothing.
There are lots of other tricks the MMU/OS can do as well, like sharing memory between processes, making a disk file appear to be direct-memory-accessible, setting some pages as "NX" so they can't be treated as executable code, using arbitrary sections of the logical memory space regardless of how much and at what address the physical ram uses, and more.

How does mprotect() work?

I was stracing some of the common commands in the linux kernel, and saw mprotect() was used a lot many times. I'm just wondering, what is the deciding factor that mprotect() uses to find out that the memory address it is setting a protection value for, is in its own address space?
On architectures with an MMU1, the address that mprotect() takes as an argument is a virtual address. Each process has its own independent virtual address space, so there's only two possibilities:
The requested address is within the process's own address range; or
The requested address is within the kernel's address range (which is mapped into every process).
mprotect() works internally by altering the flags attached to a VMA2. The first thing it must do is look up the VMA corresponding to the address that was passed - if the passed address was within the kernel's address range, then there is no VMA, and so this search will fail. This is exactly the same thing happens if you try to change the protections on an area of the address space that is not mapped.
You can see a representation of the VMAs in a process's address space by examining /proc/<pid>/smaps or /proc/<pid>/maps.
1. Memory Management Unit
2. Virtual Memory Area, a kernel data structure describing a contiguous section of a process's memory.
This is about virtual memory. And about dynamic linker/loader. Most mprotect(2) syscalls you see in the trace are probably related to bringing in library dependencies, though malloc(3) implementation might call it too.
Edit:
To answer your question in comments - the MMU and the code inside the kernel protect one process from the other. Each process has an illusion of a full 32-bit or 64-bit address space. The addresses you operate on are virtual and belong to a given process. Kernel, with the help of the hardware, maps those to physical memory pages. These pages could be shared between processes implicitly as code, or explicitly for interprocess communications.
The kernel looks up the address you pass mprotect in the current process's page table. If it is not in there then it fails. If it is in there the kernel may attempt to mark the page with new access rights. I'm not sure, but it may still be possible that the kernel would return an error here if there were some special reason that the access could not be granted (such as trying to change the permissions of a memory mapped shared file area to writable when the file was actually read only).
Keep in mind that the page table that the processor uses to determine if an area of memory is accessible is not the one that the kernel used to look up that address. The processor's table may have holes in it for things like pages that are swapped out to disk. The tables are related, but not the same.

Resources