I'm working on a very simple application which will test a device. There is no need for a driver and I have admin permissions.
I was going to use mmap, and this is where I got confused.
The idea is to do the following:
int devD = open("/path/to/my/device", O_RDWR);
void *myDevPtr = mmap(start, length, prot, flags, devD, offset);
Here is where I found the documentation for it. I'm confused about every parameter except the file descriptor and the protection.
void *start. What exactly is this the start of? Is it the start of the memory map for my device?
size_t length. My device has its own memory map. Is it the length of my device's memory map, or is this something else?
int flags. This one puzzles me. If my file descriptor is a device, what do I set my flags to?
off_t offset. This one is also confusing. This is an offset from the start pointer, but what exactly is this offset into?
Another question that I have is about communicating with the mmapped device. Say I need to write data to a specific register in the device. How would I do it?
I realize that these questions might look too simplistic, but I've been at it for some time now and couldn't find a concrete example that would address my situation.
Any help with this is really appreciated.
void *start. What exactly is this the start of? Is it the start of the memory map for my device?
It's the logical address within your program where you want the mapping to occur. If you give it NULL, it will assign one [recommended]. This is a "hint", and the address of the mapped area is the mmap return value.
size_t length. My device has its own memory map. Is it the length of my device's memory map, or is this something else?
[Not knowing your device], I would assume it's the same. But, say your device was 6GB long. You may want to access this in sections, so you might specify (e.g.) 1MB instead. And, then, remap later [see the offset section]
int flags. This one puzzles me. If my file descriptor is a device, what do I set my flags to?
Use MAP_SHARED so that what you write to the area is flushed to the "backing store" (which is your device).
off_t offset. This one is also confusing. This is an offset from the start pointer, but what exactly is this offset into?
No, it is not an offset from start. It is the offset within your device that the mapping should be done to (i.e. just like the offset for lseek).
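To make the "map a section, then remap later" idea concrete, here is a minimal sketch in C. It assumes a descriptor that supports mmap; `struct window`, `window_open` and `window_close` are hypothetical names, not part of any library. Because mmap only accepts offsets that are a multiple of the page size, the helper rounds the requested offset down and keeps track of the slack.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>      /* perror(); tmpfile() for callers */
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: one mapped window of a file or device. */
struct window {
    void          *map;      /* page-aligned address returned by mmap */
    size_t         map_len;  /* length that was passed to mmap */
    unsigned char *data;     /* byte that corresponds to the requested offset */
};

/* Map `len` bytes starting at byte `off` of the open descriptor `fd`.
 * Returns 0 on success, -1 on failure. */
int window_open(int fd, off_t off, size_t len, struct window *w)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0 && off >= 0);

    off_t  aligned = off - (off % page);     /* round down to a page boundary */
    size_t slack   = (size_t)(off - aligned);

    void *p = mmap(NULL, len + slack, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, aligned);
    if (p == MAP_FAILED) {
        perror("mmap");
        return -1;
    }
    w->map     = p;
    w->map_len = len + slack;
    w->data    = (unsigned char *)p + slack;
    return 0;
}

/* Drop the window; call window_open again to look at another section. */
void window_close(struct window *w)
{
    munmap(w->map, w->map_len);
}
```

With this, `w.data[0]` is the byte at `off` in the file or device, regardless of whether `off` itself is page-aligned.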
UPDATE:
When you say: [Not knowing your device], I would assume it's the same. Do you mean that length is the length of the device's memory map?
From that standpoint, yes, if you want to map the entire area, which, as I mentioned, can be large.
Normally, you map the entire device/file, starting at offset 0 for the length of the device/file.
I'm also a little confused about the offset. Say my device has a register at offset 0x100. In order to read/write this register I would need to set offset to 0x100. Am I correct?
Yes and no. You can do it two ways. Herein, let's call the mapped address [the return value from mmap] by the name mapbase.
(1) Give mmap an offset parameter of 0x100. Then, do (e.g.) val = *mapbase. Effectively, this is saying to the OS: "I only care about this one register and you handle the mapping to it". Note that mmap requires the offset to be a multiple of the page size, so this only works as written for a register that sits on a page boundary.
(2) Give mmap an offset parameter of 0. Then, do val = mapbase[0x100] (with mapbase as a byte pointer). Effectively, this is saying: "I want a mapping to all the registers and I will handle the indexing/offsetting manually".
Method (2) is more usual (i.e. you want to create a single mapping that can access just about any register). If you use method (1), what about a register that is located at 0x80? It's inaccessible [unless you do a remap, which is time consuming].
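Method (2) could look roughly like the sketch below. It assumes 32-bit registers and a descriptor that supports mmap; `map_registers`, `reg_read32`, `reg_write32` and `unmap_registers` are hypothetical names. The pointer is declared volatile so the compiler does not cache or reorder the accesses.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>      /* perror(); tmpfile() for callers */
#include <sys/mman.h>
#include <unistd.h>     /* close(), ftruncate(), pread() for callers */

/* Map `len` bytes behind `fd`, starting at offset 0 (the whole register block).
 * Returns NULL on failure. */
volatile uint8_t *map_registers(int fd, size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return NULL;
    }
    return (volatile uint8_t *)p;
}

/* Read the 32-bit register located `byte_off` bytes into the mapping. */
uint32_t reg_read32(volatile uint8_t *mapbase, size_t byte_off)
{
    assert(byte_off % sizeof(uint32_t) == 0);   /* keep accesses aligned */
    return *(volatile uint32_t *)(mapbase + byte_off);
}

/* Write the 32-bit register located `byte_off` bytes into the mapping. */
void reg_write32(volatile uint8_t *mapbase, size_t byte_off, uint32_t val)
{
    assert(byte_off % sizeof(uint32_t) == 0);
    *(volatile uint32_t *)(mapbase + byte_off) = val;
}

void unmap_registers(volatile uint8_t *mapbase, size_t len)
{
    munmap((void *)mapbase, len);
}
```

For the register at 0x100 from the question, that becomes `reg_write32(mapbase, 0x100, value)`, and the register at 0x80 is reachable through the same mapping.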
UPDATE #2:
As arsv pointed out, you may need to open /dev/mem to map to a device's registers.
This depends upon your device's driver. Suppose we have /dev/mydevice. Now, suppose we do fdd = open("/dev/mydevice", O_RDWR).
It is up to the driver to provide a mapping between I/O done to the open file descriptor (fdd) and the device's registers.
Some drivers support this, but most don't. If the device does support this, then we do the mmap with fdd.
If it doesn't, we have to do fdm = open("/dev/mem", O_RDWR) and pass fdm to mmap. Of course, now the mmap offset parameter will be radically different.
Check Mapping a physical device to a pointer in User space.
start is the virtual memory address to map the device to. Leave it NULL.
flags should be MAP_SHARED.
offset is into the file being mmapped; for /dev/mem, that would be the page-aligned physical address of the device.
Then just write to the mmaped area.
char* ptr = mmap(..., [/dev/mem], BASE);
*(ptr + OFFSET) = value;
Keep in mind that the physical address in this case will be (BASE + OFFSET).
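A slightly fuller sketch of the same idea, assuming you already know the physical address of the device from its documentation. `split_phys` and `map_phys` are hypothetical helpers. Opening /dev/mem needs root, and some kernels restrict which ranges it exposes, so treat this as an outline rather than something guaranteed to work on every system.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Split a physical address into a page-aligned BASE (mmap's offset argument)
 * and the remaining OFFSET inside the mapping. `page` must be a power of two. */
void split_phys(uint64_t phys, uint64_t page, uint64_t *base, uint64_t *offset)
{
    assert(page != 0 && (page & (page - 1)) == 0);
    *base   = phys & ~(page - 1);
    *offset = phys & (page - 1);
}

/* Map `len` bytes of physical memory starting at `phys` through /dev/mem.
 * Returns a pointer to the byte at `phys`, or NULL on failure. `*map_out`
 * and `*map_len_out` receive the values needed for munmap later. */
volatile uint8_t *map_phys(uint64_t phys, size_t len,
                           void **map_out, size_t *map_len_out)
{
    uint64_t base, offset;
    split_phys(phys, (uint64_t)sysconf(_SC_PAGESIZE), &base, &offset);

    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0) {
        perror("open /dev/mem");
        return NULL;
    }

    size_t map_len = len + (size_t)offset;
    void *p = mmap(NULL, map_len, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, (off_t)base);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED) {
        perror("mmap");
        return NULL;
    }
    *map_out     = p;
    *map_len_out = map_len;
    return (volatile uint8_t *)p + offset;
}
```

For example, with 4096-byte pages a physical address of 0x43C00104 splits into a BASE of 0x43C00000 and an OFFSET of 0x104.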
I want to create a single kernel module driver for my application.
It interfaces with an AXIS FIFO in Programmable logic and I need to send the physical addresses of allocated memory to this device to be used in programmable logic.
My platform driver recognises the AXIS FIFO device, and using mmap makes its registers available to my user space app. (previous post of mine)
I also want to allocate memory to be used by the programmable logic, and do this by using an IOCTL command which calls kmalloc with the given size as argument. Since I want to use the physical address, I get it using __pa(x).
If I want to access this allocated memory to verify that the correct info was stored in RAM, how do I do this? Through
fd = open("/dev/mem", ...)
va = mmap (phys_address, ....)
The problem I have with this is that I can still wrongfully access parts of memory that I shouldn't. Is there a better way to do this?
Thanks.
I think the best way to do this is to create a /proc device file that maps to the allocated memory. Your kernel module kmalloc's the memory, creates the proc device, and services all the I/O calls to the device. Your userspace program reads and writes to this device, or possibly mmaps to it (if that will work, I'm not sure...).
I'm trying to port some legacy code (a large program) to CentOS 7, but I'm hitting a snag. The core of the code is a rather awkward structure built around using mmap to allocate a hard-coded address and map a file to it. The file acts like a database (and is built by one) and includes hard-coded pointers to different sections of the mapped memory. Very ugly, but it is what it is. The entire program is built around this structure, and nobody is going to fund a rewrite.
The problem comes on the mmap line. This worked before, but no longer on CentOS 7:
mmapAddr = mmap((void *) SMAddr, SMA_WINDOW_SIZE, PROT_READ | (readOnly ? 0 : PROT_WRITE), MAP_FILE | MAP_FIXED | MAP_SHARED, SMFileDesc, 0);
... where SMAddr is 0x8000000, SMA_WINDOW_SIZE is 127926272, and readOnly is false. So basically it's trying to map a file to the address 0x8000000 with size 122MB.
What might have changed between versions, I have no clue. But I do note that the file it's mapping is only 1.5MB. I'm not sure exactly why it needs to map so much more than the file size, but I know it's needed, and I know that a lot of nuance has gone into picking the size "122MB" for some reason.
Could a mismatch between actual file size and allocated size have been fine in the past but not any more? I know that SIGBUS means an attempt to access an invalid memory region. Given that mmap doesn't take any sort of allocated pointer, this has to be something it's doing internally.
I tried catching and blocking SIGBUS (thinking that maybe it'd be ignorable?), but the program still crashed with a SIGBUS at the same spot. Maybe I did that wrong.
Thoughts?
From here:
The mmap() function can be used to map a region of memory that is
larger than the current size of the object. Memory access within the
mapping but beyond the current end of the underlying objects may
result in SIGBUS signals being sent to the process. The reason for
this is that the size of the object can be manipulated by other
processes and can change at any moment. The implementation should tell
the application that a memory reference is outside the object where
this can be detected; otherwise, written data may be lost and read
data may not reflect actual data in the object.
Note that references beyond the end of the object do not extend the
object as the new end cannot be determined precisely by most virtual
memory hardware. Instead, the size can be directly manipulated by
ftruncate().
So most likely the bug is that your program tries to access a region of the mapped memory which lies outside the file. The mmap call should succeed, however. Which return value do you get?
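One way to check this is to compare the file size with the size of the window before touching the mapping. The sketch below is only an illustration: `backed_bytes` and `grow_to_window` are hypothetical helpers, and whether growing the file with ftruncate() is acceptable for the legacy program is something only its maintainers can judge.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* How many bytes at the start of a `map_len`-byte mapping of `fd` are backed
 * by the file? The file size is rounded up to whole pages, because the tail
 * of the last page can be accessed without SIGBUS. Returns 0 if fstat fails. */
size_t backed_bytes(int fd, size_t map_len)
{
    assert(fd >= 0);
    struct stat st;
    if (fstat(fd, &st) != 0) {
        perror("fstat");
        return 0;
    }
    size_t page    = (size_t)sysconf(_SC_PAGESIZE);
    size_t size    = (size_t)st.st_size;
    size_t rounded = (size + page - 1) / page * page;
    return rounded < map_len ? rounded : map_len;
}

/* Extend the file so that the whole window is backed. Never shrinks it.
 * Returns 0 on success, -1 on failure. */
int grow_to_window(int fd, size_t map_len)
{
    struct stat st;
    if (fstat(fd, &st) != 0)
        return -1;
    if ((size_t)st.st_size >= map_len)
        return 0;
    return ftruncate(fd, (off_t)map_len);
}
```

If `backed_bytes(SMFileDesc, SMA_WINDOW_SIZE)` is smaller than `SMA_WINDOW_SIZE`, any access past that point in the window is a candidate for the SIGBUS.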
The main reason for I/O memory region is to read/write anything to that memory.
If the register address is given, we can use readx/writex (x stands for b/l/w).
Then why do we have to use the address returned by ioremap, which seems to be nothing but the same address of the particular register given in the data sheet?
ioremap is an architecture-specific function/macro. On some architectures it won't do anything and will basically just return the address specified as an argument. It may do much more than that on other architectures, though. Take arm or x86 as an example: there, ioremap will do a lot of checks before letting you use the memory region.
What's more important than those checks, however, is that ioremap can set up a mapping of virtual addresses (from the vmalloc area) to the requested physical ones and ensure that caching is disabled for the addresses you are going to use. So in most cases the pointer returned by ioremap will not be the same as the numeric address from the datasheet.
You want caching to be disabled because I/O registers are controlled by devices that are external from the CPU's point of view. This means that the processor can't know when their contents have changed, which would make the cached contents invalid.
The thing returned by request_mem_region is a struct resource *, you don't use it to access the I/O memory, and you don't have to do much of anything with it except check it for NULL. request_mem_region isn't part of the mapping you need to do to access the I/O, and your driver would actually (probably) work without it, but by calling it you make some information available in kernel data structures, and make sure that two drivers aren't trying to use overlapping memory ranges.
I am trying to get the (physical) location associated with a particular byte inside a file. How would I go about doing that?
I can't do this in C, because I would have to read the file into a buffer and if I tried getting the physical (not virtual) address in RAM I would get the address of the buffer not the particular byte that is in the file.
Help would be greatly appreciated.
Thanks
Map the shared memory into your process via mmap, access the page containing the bad data, then read /proc/self/pagemap to find information about how the virtual memory page maps to physical memory.
* /proc/pid/pagemap. This file lets a userspace process find out which
physical frame each virtual page is mapped to. It contains one 64-bit
value for each virtual page, containing the following data (from
fs/proc/task_mmu.c, above pagemap_read):
* Bits 0-54 page frame number (PFN) if present
* Bits 0-4 swap type if swapped
* Bits 5-54 swap offset if swapped
* Bits 55-60 page shift (page size = 1<<page shift)
* Bit 61 reserved for future use
* Bit 62 page swapped
* Bit 63 page present
If the page is not present but in swap, then the PFN contains an
encoding of the swap file number and the page's offset into the
swap. Unmapped pages return a null PFN. This allows determining
precisely which pages are mapped (or in swap) and comparing mapped
pages between processes.
Efficient users of this interface will use /proc/pid/maps to
determine which areas of memory are actually mapped and llseek to
skip over unmapped regions.
Note: This seems to be on newer kernels only. Also, here is how to translate the PFN into a physical address.
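As a rough illustration of the bit layout quoted above, the following C sketch reads one pagemap entry and turns the PFN into a physical address (PFN times page size, plus the offset inside the page). `pagemap_entry` and `entry_to_phys` are hypothetical helper names. Be aware that recent kernels report the PFN as zero to processes without CAP_SYS_ADMIN, so the lookup may need root.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define PM_PRESENT  (UINT64_C(1) << 63)          /* bit 63: page present */
#define PM_SWAPPED  (UINT64_C(1) << 62)          /* bit 62: page swapped */
#define PM_PFN_MASK ((UINT64_C(1) << 55) - 1)    /* bits 0-54: PFN */

/* Read the 64-bit pagemap entry for the page containing `addr`.
 * Returns 0 on success, -1 on failure. */
int pagemap_entry(const void *addr, uint64_t *entry)
{
    long page = sysconf(_SC_PAGESIZE);
    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0) {
        perror("open /proc/self/pagemap");
        return -1;
    }
    /* One 8-byte entry per virtual page, indexed by virtual page number. */
    off_t pos = (off_t)((uintptr_t)addr / (uintptr_t)page * sizeof(uint64_t));
    ssize_t n = pread(fd, entry, sizeof *entry, pos);
    close(fd);
    return n == (ssize_t)sizeof *entry ? 0 : -1;
}

/* Convert a pagemap entry plus an in-page offset into a physical address.
 * Returns -1 if the page is not present or the PFN is hidden (reads as 0). */
int entry_to_phys(uint64_t entry, uint64_t page, uint64_t in_page_off,
                  uint64_t *phys)
{
    assert(in_page_off < page);
    if (!(entry & PM_PRESENT))
        return -1;
    uint64_t pfn = entry & PM_PFN_MASK;
    if (pfn == 0)
        return -1;
    *phys = pfn * page + in_page_off;
    return 0;
}
```

The page has to be touched (read or written) before the lookup, otherwise it may not be present yet.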
In order to do that, you should inspect the filesystem directly through a device. Which means you should be able to locate bitmap tables, i-node tables, directory entries and similar things. This is not trivial at all with modern and upcoming filesystems (e.g. Btrfs).
Apart from that, you should have to deal with block and sector offsets or addresses (maybe LBA or maybe cylinder based).
So, in my opinion, the answer is no or, at least, its solution would be incredibly complex.
I was stracing some common commands on Linux, and saw that mprotect() was used many times. I'm just wondering, what is the deciding factor that mprotect() uses to find out that the memory address it is setting a protection value for is in its own address space?
On architectures with an MMU [1], the address that mprotect() takes as an argument is a virtual address. Each process has its own independent virtual address space, so there are only two possibilities:
The requested address is within the process's own address range; or
The requested address is within the kernel's address range (which is mapped into every process).
mprotect() works internally by altering the flags attached to a VMA [2]. The first thing it must do is look up the VMA corresponding to the address that was passed. If the passed address was within the kernel's address range, then there is no VMA, and so this search will fail. This is exactly the same thing that happens if you try to change the protections on an area of the address space that is not mapped.
You can see a representation of the VMAs in a process's address space by examining /proc/<pid>/smaps or /proc/<pid>/maps.
1. Memory Management Unit
2. Virtual Memory Area, a kernel data structure describing a contiguous section of a process's memory.
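The "no VMA, so the lookup fails" behaviour can be observed from user space. The sketch below, with hypothetical helper names `try_protect` and `protect_demo`, maps two anonymous pages, unmaps the second one, and then calls mprotect() on both. On Linux the call on the unmapped page is expected to fail with ENOMEM.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Try to make `len` bytes at `addr` read-only.
 * Returns 0 on success, otherwise the errno value set by mprotect. */
int try_protect(void *addr, size_t len)
{
    assert(len > 0);
    if (mprotect(addr, len, PROT_READ) == 0)
        return 0;
    return errno;
}

/* Map two pages, unmap the second, then call try_protect on each.
 * Returns 0 if the demo could be set up, -1 otherwise. */
int protect_demo(int *mapped_result, int *unmapped_result)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return -1;
    }
    if (munmap(p + page, page) != 0) {   /* leave a hole with no VMA */
        perror("munmap");
        munmap(p, 2 * page);
        return -1;
    }
    *mapped_result   = try_protect(p, page);         /* VMA exists */
    *unmapped_result = try_protect(p + page, page);  /* no VMA here */
    munmap(p, page);
    return 0;
}
```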
This is about virtual memory. And about dynamic linker/loader. Most mprotect(2) syscalls you see in the trace are probably related to bringing in library dependencies, though malloc(3) implementation might call it too.
Edit:
To answer your question in comments - the MMU and the code inside the kernel protect one process from the other. Each process has an illusion of a full 32-bit or 64-bit address space. The addresses you operate on are virtual and belong to a given process. Kernel, with the help of the hardware, maps those to physical memory pages. These pages could be shared between processes implicitly as code, or explicitly for interprocess communications.
The kernel looks up the address you pass mprotect in the current process's page table. If it is not in there then it fails. If it is in there the kernel may attempt to mark the page with new access rights. I'm not sure, but it may still be possible that the kernel would return an error here if there were some special reason that the access could not be granted (such as trying to change the permissions of a memory mapped shared file area to writable when the file was actually read only).
Keep in mind that the page table that the processor uses to determine if an area of memory is accessible is not the one that the kernel used to look up that address. The processor's table may have holes in it for things like pages that are swapped out to disk. The tables are related, but not the same.