Confusion about second argument to remap_page_range

While going through remap_page_range(), I got confused about the usage of its second argument. As described at the following link, it expects this argument to be a physical address.
https://www.oreilly.com/library/view/linux-device-drivers/0596000081/ch13s02.html
But in the implementation:
int simple_mmap(struct file *filp, struct vm_area_struct *vma)
{
unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;
if (offset >= __pa(high_memory) || (filp->f_flags & O_SYNC))
vma->vm_flags |= VM_IO;
vma->vm_flags |= VM_RESERVED;
if (remap_page_range(vma->vm_start, offset,
vma->vm_end-vma->vm_start, vma->vm_page_prot))
return -EAGAIN;
return 0;
}
The offset is derived from vma->vm_pgoff, which seems to be a virtual address (related to the current user-space VMA). How, then, is the offset value related to a physical address?
EDIT:
I see that remap_page_range() is no longer part of the latest kernel source and has been replaced by remap_pfn_range(), whose 3rd argument seems to be a page frame number (derived from the physical address), but I still see vm_pgoff being used for it.
https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/hw/bnxt_re/ib_verbs.c#L3936
So, how is vm_pgoff related to the physical address?
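A hedged user-space sketch of the arithmetic involved, not kernel code: the byte offset a process passes to mmap() is stored in vma->vm_pgoff in units of pages, and it is the driver that decides what that file offset means. A /dev/mem-style driver such as simple_mmap() chooses to interpret it as a physical address. The constant SKETCH_PAGE_SHIFT (4 KiB pages assumed) and the sketch_* helper names are made up for illustration and are not kernel APIs.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages. The real value is the kernel's PAGE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12

/* The mmap() path stores the user's byte offset in the VMA
 * in units of pages (this becomes vma->vm_pgoff). */
static uint64_t sketch_offset_to_pgoff(uint64_t mmap_offset)
{
    return mmap_offset >> SKETCH_PAGE_SHIFT;
}

/* What the old simple_mmap() does: rebuild the byte offset and
 * hand it to remap_page_range() as a physical address. */
static uint64_t sketch_pgoff_to_phys(uint64_t vm_pgoff)
{
    return vm_pgoff << SKETCH_PAGE_SHIFT;
}

/* With remap_pfn_range(), a driver that treats the file offset as a
 * physical address can pass vm_pgoff directly as the page frame number. */
static uint64_t sketch_pgoff_to_pfn(uint64_t vm_pgoff)
{
    return vm_pgoff;
}
```

Under these assumptions, a process calling mmap() with offset 0x10008000 ends up with vm_pgoff equal to 0x10008, which such a driver reads back as physical address 0x10008000 or as page frame number 0x10008. Drivers that give the offset a different meaning (for example a lookup key) would not follow this pattern.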

Related

Write to physical memory from kernel module

This question has been asked at least a dozen times, but I cannot figure out where my issue is.
I am writing a kernel module that must read data from a reserved memory range. This data is written by an external device.
To control the device, we have a second register to which I want to write some data.
And this is where I start to get lost...
This is the part of the code that, from my understanding, should create a virtual mapping from the reg entry in my device tree:
// Read the control memory and map to virtual address
res_ctrl = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!res_ctrl) {
dev_err(&pdev->dev, "can't get device resources\n");
return -ENOENT;
}
p3chvideo_device->pchv_ctrl.paddr = res_ctrl->start;
p3chvideo_device->pchv_ctrl.size = resource_size(res_ctrl);
struct resource* res_d = request_mem_region(p3chvideo_device->pchv_ctrl.paddr, p3chvideo_device->pchv_ctrl.size, "p3chv");
p3chvideo_device->pchv_ctrl.vaddr = ioremap_nocache(p3chvideo_device->pchv_ctrl.paddr, p3chvideo_device->pchv_ctrl.size);
if (!p3chvideo_device->pchv_ctrl.vaddr) {
pr_info("Control buffer allocated vaddr: 0x%0llX paddr: 0x%0llX (size: 0x%0llX)\n", p3chvideo_device->pchv_ctrl.vaddr, p3chvideo_device->pchv_ctrl.paddr, p3chvideo_device->pchv_ctrl.size);
return -EADDRNOTAVAIL;
}
//p3chvideo_device->pchv_ctrl.vaddr = devm_ioremap(&pdev->dev, p3chvideo_device->pchv_ctrl.paddr, p3chvideo_device->pchv_ctrl.size);//ioremap(p3chvideo_device->pchv_ctrl.paddr, p3chvideo_device->pchv_ctrl.size);
pr_info("Control buffer allocated vaddr: 0x%0llX paddr: 0x%0llX (size: 0x%0llX)\n", p3chvideo_device->pchv_ctrl.vaddr, p3chvideo_device->pchv_ctrl.paddr, p3chvideo_device->pchv_ctrl.size);
From the messages in the kernel, the register is correctly detected (offset, size). In addition, I do see in /proc/iomem the reserved memory.
However, when I try to write then read the results, it doesn't work, the value I read is different from the value I wrote... It is as if the register value wasn't altered by the write operation.
static void buffer_loaded_enable_interrupt(void) {
pr_info("buffer loaded enable interrupt 0x%0llX 0x%0X\n", p3chvideo_device->pchv_ctrl.vaddr + IRQ_ENABLE_BUFFER_LOADED, (u32)(1 << 0));
// Clear buffer Loaded Interrupt
//wmb();
pr_info("Stored value before: 0x%0X", readl(p3chvideo_device->pchv_ctrl.vaddr + IRQ_ENABLE_BUFFER_LOADED));
//*(p3chvideo_device->pchv_ctrl.vaddr + IRQ_ENABLE_BUFFER_LOADED) = (u32)(1 << 0);
iowrite32((u32)(1 << 0), p3chvideo_device->pchv_ctrl.vaddr + IRQ_ENABLE_BUFFER_LOADED);
udelay(100);
pr_info("Stored value: 0x%0X", readl(p3chvideo_device->pchv_ctrl.vaddr + IRQ_ENABLE_BUFFER_LOADED));
}
If I use a devmem approach, I can write and then check the read, and it works...
What am I missing?

I finally found the issue and I am posting it so if anyone has the same kind of problem, maybe this could help!
The problem was not in the code I put above but in the type of vaddr. It was set as a ssize_t.
ssize_t vaddr;
As soon as I set it to void __iomem *, everything worked as expected!
void __iomem *vaddr;
Let's hope this post will help someone like me :)
Thanks

Dereferencing a pointer doesn't return the real value in the memory address

I'm developing an embedded application on STM8S using STVD IDE and Cosmic C compiler. I'm trying to read FLASH memory byte by byte to calculate CRC. Following is my code snippet:
uint32_t crc32_buffer(const uint8_t *buf, uint32_t len)
{
uint32_t index = 0;
uint32_t crc = 0xFFFFFFFF;
uint32_t flashIndex = 0;
uint8_t *ptr = buf;
volatile uint8_t value = 0;
volatile uint8_t i = 0;
for (index = 0; index < len; index++)
{
value = *ptr;
flashIndex = (crc & 0xFF) ^ value;
ptr++;
crc = (crc >> 8) ^ table[flashIndex];
if(bytesCntr >= 2685)
{
i++;
}
}
return ~crc;
}
The code works fine until 2694 bytes have been read from the FLASH. Viewing memory in the debugging session, I confirmed that the next byte in the FLASH has the value 0C. Checking the value of ptr, I confirmed that it holds the address of this 0C byte in the FLASH (which is 0x8B15). However, the value variable always gets 8B instead of 0C after ptr is dereferenced.
I also tried removing the unnecessary variables, so that it looks like this:
crc = (crc >> 8) ^ table[(crc & 0xFF) ^ buf[index]];
But the table index was not as it should be as the memory location was read as 8B instead of 0C.
I found that the byte before and the byte after address 0x8B15 are read correctly. Only this address is read wrongly.
UPDATE-1
The disassembly of the value = *ptr; is as following:
LDW X, (0x11,SP)
LD A, (X)
LD (0x13,SP),A
When reading the byte at address 0x8B15, if I put a breakpoint at the second assembly line, the value in the memory location is read correctly as 0C. However, if I put the breakpoint at the third assembly line instead, I find that register X holds 0x8B15 (the right address) but register A holds 0x8B (the wrong value).
UPDATE-2
I added an if statement inside the for loop for debugging (to place my breakpoint). I found that the memory byte which is read wrongly always holds the code inside this if statement. The disassembly of this code always has something to do with SP. Even if I change the code, the problematic memory byte is always the first instruction in the if statement. I also noticed that the wrongly read value is always 0x8B, regardless of what the right value is. Here is the disassembly stored at this memory location:
0x8b15 <crc32_buffer+104> 0x0C01 INC (0x01,SP) INC (_CRC_ONGOING_s,SP)

I came across the same issue last week. It seems to be a problem with the debugging firmware and your code both accessing the same location. If you have an active breakpoint at the same Flash location you are trying to read with your code, your code ends up reading 0x8B from that location. If you remove or deactivate all breakpoints, the location is read correctly.

In addition to my previous answer (which I couldn't edit): active breakpoints substitute the existing instruction at that particular Flash memory location with a BREAK instruction (opcode 0x8B), so when that memory location is read from within the application code, 0x8B will be the result.
So this is not really a 'problem', but rather a limitation of software breakpoints as implemented within the SWIM debugging firmware on the STM8S.
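The effect described above can be modeled with a small host-side simulation. This is only a sketch of the mechanism, not the SWIM firmware: the sketch_breakpoint struct and both helper functions are hypothetical, and the only fact taken from the answer is that BREAK has opcode 0x8B.

```c
#include <stdint.h>
#include <stddef.h>

/* STM8 BREAK opcode, as stated in the answer above. */
#define SKETCH_BREAK_OPCODE 0x8B

/* Hypothetical model of a debugger that patches code memory. */
struct sketch_breakpoint {
    size_t  index;  /* which byte of code memory was patched */
    uint8_t saved;  /* original byte, restored when the breakpoint is removed */
};

/* Setting a software breakpoint overwrites the first opcode byte. */
static struct sketch_breakpoint sketch_set_breakpoint(uint8_t *mem, size_t index)
{
    struct sketch_breakpoint bp = { index, mem[index] };
    mem[index] = SKETCH_BREAK_OPCODE;
    return bp;
}

/* Removing it puts the original byte back. */
static void sketch_clear_breakpoint(uint8_t *mem, struct sketch_breakpoint bp)
{
    mem[bp.index] = bp.saved;
}
```

In this model, any code that reads the patched byte while the breakpoint is active sees 0x8B, and sees the original 0x0C again once the breakpoint is cleared, which matches the behaviour reported in the question.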

Translate virtual address to physical from multi-level page tables

I'm trying to convert a virtual memory address to a physical one, but can't get it to work. I'm currently writing an operating system as an assignment, and I now have to implement a printf function for user mode: when the write syscall is invoked, the system should print the contents of the user-mode array to the serial port (for now). To do that, I have to convert the address from virtual to physical.
Here is my code from the syscall handler:
Pcb* pcb = getCR3(); // contains the page directory for usermode
setCR3(kernelPageDir); // set the CR3 register to the kernel page directory
uint32_t tableNum = (vAddr >> 22) & 0x3ffUL; // get the upper 10 bits
uint32_t pageIndex = (vAddr >> 12) & 0x3ffUL; // get the middle 10 bits
uint32_t offset = vAddr & 0xfffUL; // get the 12 lower bits
uint32_t* topTable = pcb->pageDirectory[tableNum]; // Access the top level table
uint32_t lowTable = topTable[pageIndex]; // Entry to the 2nd table
uint32_t* addr = lowTable + offset; // Should be the physical address
serialPrintf("Structure: tableNum=%08x pageIndex=%08x offset=%08x\n", tableNum, pageIndex, offset);
serialPrintf("Address: topTable=%08x lowTable=%08x addr=%08x\n",topTable, lowTable, addr);
serialPrintf("Char:%c", (char*)addr[0]);
When I run the code, it gives me a page fault when trying to access the value of it:
Structure: tableNum=00000020 pageIndex=00000048 offset=00000378
Address: topTable=00000000 lowTable=0015d000 addr=0015d378
Page fault! errcode=00000000 addr=0015d378
Here is the part from the book that explains the structure of the pages:

If you fill out a PTE for the last entry (index 1023) of the second-level table that is pointed to by the last entry (index 1023) of the first-level table, then:
0xfffff000 .. 0xfffffffc will be an alias of the first-level table,
and
0xffc00000 .. 0xffffffff will be an alias of the entire (sparse) page table.
For example:
int vtop(unsigned vaddr, unsigned *pa) {
unsigned *pdtb = (unsigned *)0xfffff000;
unsigned *pte = (unsigned *)0xffc00000;
if (pdtb[vaddr>>22] & 1) {
if (pte[vaddr>>12] & 1) {
*pa = pte[vaddr>>12] &~0xfff;
return 0;
}
return 2;
}
return 1;
}
You can do this for any index and adjust the pointer values, but it gets more confusing, and 0xffc/20 is out of the way.
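As a hedged sketch of the index arithmetic only (32-bit x86 with 4 KiB pages and no PAE assumed), the helpers below split a virtual address the same way the question does and then combine a page table entry with the page offset. The sketch_* names and the SKETCH_PRESENT constant are invented for illustration; the key point is that an entry carries flag bits in its low 12 bits, which have to be masked off before the offset is added.

```c
#include <stdint.h>

/* Bit 0 of a page directory or page table entry: present. */
#define SKETCH_PRESENT 0x1u

/* Upper 10 bits: index into the page directory. */
static uint32_t sketch_dir_index(uint32_t vaddr)
{
    return vaddr >> 22;
}

/* Middle 10 bits: index into the selected page table. */
static uint32_t sketch_table_index(uint32_t vaddr)
{
    return (vaddr >> 12) & 0x3ffu;
}

/* Lower 12 bits: byte offset inside the 4 KiB page. */
static uint32_t sketch_page_offset(uint32_t vaddr)
{
    return vaddr & 0xfffu;
}

/* Combine a page table entry with the page offset.
 * Returns 0 on success, 1 if the entry is not present. */
static int sketch_pte_to_phys(uint32_t pte, uint32_t vaddr, uint32_t *pa)
{
    if (!(pte & SKETCH_PRESENT))
        return 1;
    *pa = (pte & ~0xfffu) | sketch_page_offset(vaddr);
    return 0;
}
```

For the values printed in the question (tableNum 0x20, pageIndex 0x48, offset 0x378), the virtual address is 0x08048378. Note that the result is a physical address, so it can only be dereferenced directly if the current address space maps that physical page at the same virtual address.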

PCIE region not aligned, and not consistent

I am developing a PCIE device driver for openwrt, and I hit a data bus error when trying to access the io-memory in a timer interrupt, which I mentioned in my last question. After a lot of research I think I might have found the reason, but I am unable to solve it. Below are my troubles.
Last week I found out that the PCIE region size might have changed during system startup. The region size of bar0 is 4096 in my driver (returned from pci_resource_len), but it is 4097 in lspci -vv, which is not a multiple of the Linux kernel's page size. By reading the source code of pciutils, I found that the lspci command fetches the PCIE information from the /sys/devices/pci0000:00/0000:00:00.0/resource file. So I removed all my custom components and ran the original openwrt on my router. With cat /sys/devices/pci0000:00/0000:00:00.0/resource, the first line of the result (bar0) is
0x0000000010008000 0x0000000010009000 0x0000000000040200
Moreover, I also check the content of /proc/iomem, and the content related to PCIE is
10000000-13ffffff : mem_base
10000000-13ffffff : PCI memory space
10000000-10007fff : 0000:00:00.0
10008000-10008fff : 0000:00:00.0
It is very strange that the region size of bar0 indicated by the two files above differs! According to the PCIE specification, the region size should always be a power of 2. How can the region size become 4097?

After spending weeks reading the source code of the Linux kernel, I found out that this is a bug in Linux kernel 4.4.14.
The content of /sys/devices/pci0000:00/0000:00:00.0/resource is generated by the function resource_show in the file drivers/pci/pci-sysfs.c. The related code is
for (i = 0; i < max; i++) {
struct resource *res = &pci_dev->resource[i];
pci_resource_to_user(pci_dev, i, res, &start, &end);
str += sprintf(str, "0x%016llx 0x%016llx 0x%016llx\n",
(unsigned long long)start,
(unsigned long long)end,
(unsigned long long)res->flags);
}
The function pci_resource_to_user actually invoked is located in arch/mips/include/asm/pci.h
static inline void pci_resource_to_user(const struct pci_dev *dev, int bar,
const struct resource *rsrc, resource_size_t *start,
resource_size_t *end)
{
phys_addr_t size = resource_size(rsrc);
*start = fixup_bigphys_addr(rsrc->start, size);
*end = rsrc->start + size;
}
The calculation of *end is wrong and should be replaced by
*end = rsrc->start + size - (size ? 1 : 0);
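The off-by-one can be reproduced with plain arithmetic. The following is a user-space sketch with made-up helper names (sketch_*), relying only on the fact that resource ranges are inclusive, so that the size of a range is end - start + 1.

```c
#include <stdint.h>

/* Resource ranges are inclusive: [start, end]. */
static uint64_t sketch_resource_size(uint64_t start, uint64_t end)
{
    return end - start + 1;
}

/* End as computed by the MIPS helper quoted above. */
static uint64_t sketch_end_buggy(uint64_t start, uint64_t size)
{
    return start + size;
}

/* End with the correction proposed above. */
static uint64_t sketch_end_fixed(uint64_t start, uint64_t size)
{
    return start + size - (size ? 1 : 0);
}
```

For bar0 in this question, start is 0x10008000 and the size is 0x1000. The uncorrected formula yields an end of 0x10009000, which a tool treating the range as inclusive turns into a size of 4097, while the corrected formula yields 0x10008fff and a size of 4096, matching /proc/iomem.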

In u-boot, kernel_entry points to which function?

This is the function from u-boot:
static void boot_jump_linux(bootm_headers_t *images, int flag)
{
#ifdef CONFIG_ARM64
void (*kernel_entry)(void *fdt_addr);
int fake = (flag & BOOTM_STATE_OS_FAKE_GO);
kernel_entry = (void (*)(void *fdt_addr))images->ep;
debug("## Transferring control to Linux (at address %lx)...\n",
(ulong) kernel_entry);
bootstage_mark(BOOTSTAGE_ID_RUN_OS);
announce_and_cleanup(fake);
if (!fake)
kernel_entry(images->ft_addr);
#else
unsigned long machid = gd->bd->bi_arch_number;
char *s;
void (*kernel_entry)(int zero, int arch, uint params);
unsigned long r2;
int fake = (flag & BOOTM_STATE_OS_FAKE_GO);
kernel_entry = (void (*)(int, int, uint))images->ep;
s = getenv("machid");
if (s) {
strict_strtoul(s, 16, &machid);
printf("Using machid 0x%lx from environment\n", machid);
}
debug("## Transferring control to Linux (at address %08lx)" \
"...\n", (ulong) kernel_entry);
bootstage_mark(BOOTSTAGE_ID_RUN_OS);
announce_and_cleanup(fake);
if (IMAGE_ENABLE_OF_LIBFDT && images->ft_len)
r2 = (unsigned long)images->ft_addr;
else
r2 = gd->bd->bi_boot_params;
if (!fake)
kernel_entry(0, machid, r2);
#endif
}
I understood from the related question, Trying to understand the usage of function pointer, that kernel_entry is a pointer to a function. Can someone help me understand where that function is defined? I don't even know the name of this function, so I failed to grep it.
NOTE: The entire u-boot source code is here.

Indeed, kernel_entry is a function pointer. It is initialized from the ep field of the data passed in as images, which is of type bootm_headers_t. The definition of that struct is in include/image.h. It describes a bootable image header, i.e. the header of a kernel image, which contains the basic information needed to boot that image from the boot loader. Obviously, to start it, you need a program entry point, similar to the main function in regular C programs.
In that structure, the entry point is simply defined as a memory address (unsigned long), which the code you listed casts to that function pointer.
That structure has been obtained by loading the first blocks of the image file on disk, whose location is already known to the boot loader.
Hence the actual code pointed to by that function pointer belongs to a different binary, and the definition of the function must be located in different source code. For a Linux kernel, this entry point is a hand-coded assembly function whose source is in head.S. Since this function is highly arch-dependent, you will find many files of that name implementing it across the kernel tree.
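The cast described above can be tried out on a host machine with a small, hedged sketch. Everything named sketch_* is hypothetical: sketch_fake_kernel stands in for the kernel's entry code (which in reality lives in another binary), and sketch_image is a stripped-down stand-in for the image header that keeps only the entry point address.

```c
#include <stdint.h>

/* Records the second argument so the call can be observed. */
static int sketch_last_arg;

/* Stand-in for the kernel entry point. In a real boot this code is
 * part of the kernel image, not of the boot loader. */
static void sketch_fake_kernel(int zero, int arch, unsigned int params)
{
    (void)zero;
    (void)params;
    sketch_last_arg = arch;
}

/* Hypothetical miniature of the image header: only the entry point,
 * stored as a plain integer address. */
struct sketch_image {
    uintptr_t ep;
};

static void sketch_boot_jump(const struct sketch_image *img, int machid)
{
    /* Same pattern as boot_jump_linux(): integer address -> function pointer. */
    void (*kernel_entry)(int, int, unsigned int) =
        (void (*)(int, int, unsigned int))img->ep;

    kernel_entry(0, machid, 0u);
}
```

Storing a function's address in a uintptr_t and casting it back is implementation-defined in C, but it works on the usual flat-address platforms and mirrors what the boot loader does with the address read from the image header.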
