mmap'ed memory access very slow - c

I use v4l2 to get video frames from a camera using streaming I/O and need to do some calculations on each frame.
However, accessing the frame memory is 10 times slower than accessing
malloc'ed memory.
I guess the mmap'ed frame memory is not cacheable by the CPU.
Here is the test code.
//mmap video buffers
struct v4l2_buffer buf;
CLEAR(buf);
buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
buf.memory = V4L2_MEMORY_MMAP;
buf.index = i;
if (-1 == xioctl(_fdCamera, VIDIOC_QUERYBUF, &buf)) {
ERROR("VIDIOC_QUERYBUF");
goto error_querybuf;
}
_buffers[i].start = mmap(NULL, buf.length,
PROT_READ | PROT_WRITE,
MAP_SHARED, _fdCamera, buf.m.offset);
//use mmap'ed buffer to do some calculation
u16* frame=_buffers[0].start;
u64 sum=0;
u16* p=frame;
time[0]=GetMicrosecond64();
while(p!=frame+PIXELS){
sum+=*p;
p++;
}
time[1]=GetMicrosecond64();
printf("mmap:sum %llu,time %lld\n",(unsigned long long)sum, (long long)(time[1] - time[0]));
//use a copy of data to do some calculation
u16* frame_copy=(u16*)malloc(PIXELS*2);
memcpy(frame_copy,frame,PIXELS*2);
sum=0;
p=frame_copy;
time[0]=GetMicrosecond64();
while(p!=frame_copy+PIXELS){
sum+=*p;
p++;
}
time[1]=GetMicrosecond64();
printf("malloc:sum %llu,time %lld\n",(unsigned long long)sum, (long long)(time[1] - time[0]));
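If the mapping really is uncached, one possible workaround is to copy each frame once into ordinary heap memory and run the calculation on the copy. The timing in the test above starts after the memcpy, so it does not show whether the copy itself eats the gain. The sketch below times the copy and the sum together. `now_us`, `sum_pixels` and `sum_via_copy` are hypothetical helper names, not part of V4L2, and the pixel count is whatever the caller passes in.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Monotonic clock in microseconds (a stand-in for GetMicrosecond64). */
int64_t now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/* Sum 16-bit pixels from any buffer. */
uint64_t sum_pixels(const uint16_t *p, size_t pixels)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < pixels; i++)
        sum += p[i];
    return sum;
}

/*
 * Copy the frame into heap memory, then sum the copy.
 * elapsed_us (optional) receives the time for copy + sum together.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated.
 */
int sum_via_copy(const uint16_t *frame, size_t pixels,
                 uint64_t *sum, int64_t *elapsed_us)
{
    assert(frame && sum);
    uint16_t *copy = malloc(pixels * sizeof(*copy));
    if (!copy)
        return -1;
    int64_t t0 = now_us();
    memcpy(copy, frame, pixels * sizeof(*copy));
    *sum = sum_pixels(copy, pixels);
    if (elapsed_us)
        *elapsed_us = now_us() - t0;
    free(copy);
    return 0;
}
```

Calling `sum_via_copy(_buffers[0].start, PIXELS, &sum, &us)` and comparing `us` with the direct loop over the mmap'ed buffer would show whether the copy pays for itself on this hardware.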
update:
I use an s5pv210 with linux-2.6.35. The fimc_dev.c driver indicates that the mmap'ed memory is uncached (see the pgprot_noncached call below).
How can I make the frame buffer memory support both DMA and caching?
static inline int fimc_mmap_cap(struct file *filp, struct vm_area_struct *vma)
{
struct fimc_prv_data *prv_data =
(struct fimc_prv_data *)filp->private_data;
struct fimc_control *ctrl = prv_data->ctrl;
u32 size = vma->vm_end - vma->vm_start;
u32 pfn, idx = vma->vm_pgoff;
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
vma->vm_flags |= VM_RESERVED;
/*
* page frame number of the address for a source frame
* to be stored at.
*/
pfn = __phys_to_pfn(ctrl->cap->bufs[idx].base[0]);
if ((vma->vm_flags & VM_WRITE) && !(vma->vm_flags & VM_SHARED)) {
fimc_err("%s: writable mapping must be shared\n", __func__);
return -EINVAL;
}
if (remap_pfn_range(vma, vma->vm_start, pfn, size, vma->vm_page_prot)) {
fimc_err("%s: mmap fail\n", __func__);
return -EINVAL;
}
return 0;
}

Related

How to access PCI / MCH bar in linux

The background is based on Intel EDS document (example_doc)
see Section 5 ~ 5.2
I am trying to access the PCI configuration space to get the CPU vendor ID,
and then access the MCH BAR to read out some information from user space under the OS, without developing a driver.
I have searched for a few days, but most of the information is driver related or uses Linux kernel headers such as pci.h, which I would prefer to avoid.
My Process:
get mcfg base address from /sys/firmware/acpi/tables/MCFG
calculate actual address with bus0, dev0, func0, reg0 (ie. lspci 0000:00:00.0)
read data from the address
Code in C:
BOOLEAN IoConfigRead(uint32_t bus, uint32_t dev, uint32_t func, uint32_t reg, void* Buffer, unsigned long length){
static uint32_t pcieBar = 0;
int fd,fd1;
int sz;
uint32_t buffer[12];
uint32_t addr;
uint32_t *p;
uint32_t value;
if(pcieBar == 0){
fd = open("/sys/firmware/acpi/tables/MCFG", O_RDONLY);
if(fd == -1){
perror("Error opening /sys/firmware/acpi/tables/MCFG");
exit(EXIT_FAILURE);
}
sz = read(fd, buffer, 48);
if (sz != 48) {
printf("couldn't read 48 bytes from MCFG, sz=%d\n", sz);
exit(EXIT_FAILURE);
}
if (close(fd)) {
perror("Error closing /sys/firmware/acpi/tables/MCFG");
exit(EXIT_FAILURE);
}
if (buffer[0] != 0x4746434d) {
printf("MCFG signature not found\n");
exit(EXIT_FAILURE);
}
pcieBar = buffer[11];
}
addr = pcieBar;
addr |= ((uint32_t)(func & 0x7)) << 12;
addr |= ((uint32_t)(dev & 0x1f)) << 15;
addr |= (bus & 0xff) << 20;
fd1 = open("/dev/mem", O_RDWR|O_DSYNC);
if (fd1 == -1) {
perror("Error opening /dev/mem");
return EXIT_FAILURE;
}
p = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, fd1, addr);
if(p == MAP_FAILED){
perror("mmap"); /* report errno so the cause of the failure is visible */
printf("Failed to mmap address,0x%x\n",addr);
close(fd1); /* don't leak the descriptor on failure */
}
else{
int i = 0;
while (i < length/4){
((uint32_t*)Buffer)[i] = p[(reg/4)+i];
i++;
}
munmap(p,4096);
close(fd1);
}
return 0;
}
Error Msg:
Failed to mmap address,0xe0000000
Based on the ACPI table, the base address is
at byte 0xc(12), which is 0xe0000000, but I'm not able to map it with mmap.
This code seems to have worked in the past, but I'm not able to make it work now.
I'm not sure what I have missed. It is already run under sudo; maybe it needs some special permission?
I have tried it on kernels 5.3.18 and 4.12.14-23; neither of them worked.
gcc version : 7.5 / 7.3
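The address calculation from the process described above can be isolated and checked on its own. The sketch below assumes the standard MCFG layout (a 36-byte ACPI header, 8 reserved bytes, then the first allocation entry, whose 64-bit base address starts at byte offset 44, which is where `buffer[11]` reads from) and the usual ECAM encoding of bus, device, function and register with a start bus of 0. `mcfg_first_base` and `ecam_address` are hypothetical helper names. Using 64-bit arithmetic avoids truncating a base above 4 GiB, which a 32-bit `addr` cannot represent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of the first allocation entry's base address in MCFG. */
#define MCFG_FIRST_BASE_OFFSET 44

/* Read the little-endian 64-bit base address from a raw MCFG image. */
uint64_t mcfg_first_base(const uint8_t *table, size_t len)
{
    uint64_t base = 0;
    assert(table && len >= MCFG_FIRST_BASE_OFFSET + 8);
    for (int i = 7; i >= 0; i--)
        base = (base << 8) | table[MCFG_FIRST_BASE_OFFSET + i];
    return base;
}

/* Physical address of a config-space register (assumes start bus 0). */
uint64_t ecam_address(uint64_t base, uint32_t bus, uint32_t dev,
                      uint32_t func, uint32_t reg)
{
    return base + (((uint64_t)(bus  & 0xff) << 20) |
                   ((uint64_t)(dev  & 0x1f) << 15) |
                   ((uint64_t)(func & 0x07) << 12) |
                   (reg & 0xfff));
}
```

With the values from the question, `ecam_address(0xe0000000, 0, 0, 0, 0)` is 0xe0000000, the address shown in the error message. For mmap, the offset has to be page-aligned, so one would pass `reg = 0` and index into the mapped 4 KiB page, as the original code does.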

ARM32, phys_to_virt, Unable to handle kernel paging request at virtual address

I'm working on implementing a variant of https://apenwarr.ca/log/20190216. Long story short, the main idea is to have a region of memory in which to keep information and to retrieve that information after a soft reboot or panic.
In my case, I just want to keep some variables from one reboot to the next, so I've worked on a simple variant of this mechanism to do the job. The code is simply a copy-paste of the original patch with some rough adaptations. I've added a syscall to enter kernel mode and execute this code (not shown here).
struct logbits {
int magic; /* needed to verify the memory across reboots */
int state;
int nb_reboot;
};
#define PERSIST_SEARCH_START 0
#ifdef CONFIG_NO_BOOTMEM
#define PERSIST_SEARCH_END 0x5e000000
#else
#define PERSIST_SEARCH_END 0xfe000000
#endif
#define PERSIST_SEARCH_JUMP (4*1024)
#define PERSIST_MAGIC 0xba5eba11
/*
* arm uses one memory model, mips uses another
*/
phys_addr_t physmem_reserve(phys_addr_t size) {
#ifdef CONFIG_NO_BOOTMEM
phys_addr_t alloc;
alloc = memblock_find_in_range_node(size, SMP_CACHE_BYTES,
PERSIST_SEARCH_START, PERSIST_SEARCH_END,
NUMA_NO_NODE);
if (!alloc) return alloc;
if (memblock_reserve(alloc, size)) {
pr_err("info_keeper: memblock_reserve failed\n");
return 0;
}
return alloc;
#else
unsigned long where;
for (where = PERSIST_SEARCH_END - size;
where >= PERSIST_SEARCH_START && where <= PERSIST_SEARCH_END - size;
where -= PERSIST_SEARCH_JUMP) {
if (reserve_bootmem(where, size, BOOTMEM_EXCLUSIVE))
continue;
else
return where;
}
return 0;
#endif
}
struct logbits *log_buf_alloc(char **new_logbuf)
{
char *buf;
phys_addr_t alloc;
unsigned long size = sizeof(struct logbits);
unsigned long full_size = size;
struct logbits *new_logbits;
alloc = physmem_reserve(full_size);
if (alloc) {
printk(KERN_INFO "info_keeper: memory reserved # 0x%08x\n", alloc);
buf = phys_to_virt(alloc);
if(buf){
*new_logbuf = buf;
new_logbits = (void*)buf;
printk(KERN_INFO "info_keeper: memory virtual # 0x%08x\n", buf);
if (new_logbits->magic != PERSIST_MAGIC) {
printk(KERN_INFO "info_keeper: header invalid, " "cleared.\n");
memset(buf, 0, full_size);
memset(new_logbits, 0, sizeof(*new_logbits));
new_logbits->magic = PERSIST_MAGIC;
} else {
printk(KERN_INFO "info_keeper: header valid; " "state=%d\n" "nb_reboot=%d\n", new_logbits->state, new_logbits->nb_reboot);
}
return new_logbits;
}else{
printk(KERN_ERR "info_keeper: failed to get phys to virt");
buf = alloc_bootmem(full_size);
*new_logbuf = buf;
new_logbits = (struct logbits*)(buf);
memset(buf, 0, full_size);
}
} else {
/* replace the buffer */
printk(KERN_ERR "info_keeper: failed to reserve bootmem " "area. disabled.\n");
buf = alloc_bootmem(full_size);
*new_logbuf = buf;
new_logbits = (struct logbits*)(buf);
memset(buf, 0, full_size);
}
return new_logbits;
}
Upon execution, physmem_reserve succeeds and returns a memory region. I then get a virtual address for it from phys_to_virt. When I try to access that memory, I get the "Unable to handle kernel paging request at virtual address" error.
Here is a sample output:
[ 42.489639] info_keeper: memory reserved # 0x5dffffc0
[ 42.494781] info_keeper: memory virtual # 0x0dffffc0
[ 42.499778] Unable to handle kernel paging request at virtual address 0dffffc0
Any idea what is happening?
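The two addresses in the log differ by 0xb0000000 modulo 2^32, which is what a linear-map conversion on a 32-bit kernel would produce if the sum overflowed. The sketch below reproduces that arithmetic in user space. The `ASSUMED_PAGE_OFFSET` and `ASSUMED_PHYS_OFFSET` values (0xc0000000 and 0x10000000) are assumptions chosen to be consistent with the log, not values read from the actual board, and `linear_phys_to_virt` is a hypothetical stand-in for the kernel macro. If those assumptions hold, the reserved block lies above what the linear mapping can cover, and phys_to_virt is only meaningful for linearly mapped (lowmem) addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, picked to match the log; the real board may differ. */
#define ASSUMED_PAGE_OFFSET 0xc0000000u
#define ASSUMED_PHYS_OFFSET 0x10000000u

/* 32-bit linear-map conversion: virt = phys - PHYS_OFFSET + PAGE_OFFSET. */
uint32_t linear_phys_to_virt(uint32_t phys)
{
    return phys - ASSUMED_PHYS_OFFSET + ASSUMED_PAGE_OFFSET;
}

/* Nonzero if the conversion overflows 32 bits, i.e. phys is not linearly mapped. */
int linear_map_wraps(uint32_t phys)
{
    assert(phys >= ASSUMED_PHYS_OFFSET);
    return (uint64_t)(phys - ASSUMED_PHYS_OFFSET) + ASSUMED_PAGE_OFFSET > UINT32_MAX;
}
```

Under these assumptions, `linear_phys_to_virt(0x5dffffc0)` gives 0x0dffffc0, the faulting address in the log, and `linear_map_wraps` reports the overflow.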

Reading memory of another process in C without ptrace in linux

I am trying to read the memory of another process and print whatever is in it (heap and/or stack). I got the range of memory addresses from /proc.
I extracted an address range like the one below. Now I want to read that range from the other process.
5569032d2000-5569032f3000 rw-p 00000000 00:00 0 [heap]
I am stuck on how to access those memory addresses. I tried something like the code shown below, but it doesn't help much.
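Turning a line like the one above into numeric bounds is the first step whichever read mechanism is used afterwards. A minimal sketch, assuming the usual `start-end perms ...` layout of /proc/<pid>/maps lines; `parse_maps_line` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse one /proc/<pid>/maps line into start/end addresses and the
 * permission string. Returns 0 on success, -1 on a malformed line.
 */
int parse_maps_line(const char *line, unsigned long long *start,
                    unsigned long long *end, char perms[5])
{
    assert(line && start && end && perms);
    if (sscanf(line, "%llx-%llx %4s", start, end, perms) != 3)
        return -1;
    return *end > *start ? 0 : -1;
}
```

For the heap line shown above this yields start 0x5569032d2000 and end 0x5569032f3000, so the region is 0x21000 bytes long.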
int main(int argc, char *argv[]) {
off_t offset = strtoul(argv[1], NULL, 0);
size_t len = strtoul(argv[2], NULL, 0);
// Truncate offset to a multiple of the page size, or mmap will fail.
size_t pagesize = sysconf(_SC_PAGE_SIZE);
off_t page_base = (offset / pagesize) * pagesize;
off_t page_offset = offset - page_base;
int fd = open("/dev/mem", O_SYNC);
unsigned char *mem = mmap(NULL, page_offset + len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, page_base);
if (mem == MAP_FAILED) {
perror("Can't map memory");
return -1;
}
size_t i;
for (i = 0; i < len; ++i)
printf("%x ", (int)mem[page_offset + i]);
return 0;
}
Thanks.
I am making a kind of debug tool for my embedded system. I can't use ptrace() because it halts the running process while trying to peek into the device memory.
I figured out that, to read the memory of another process, I can use the process_vm_readv() function as follows:
pid_t pid; // Put value of pid in this
void *remotePtr; // Put starting address
size_t bufferLength; // Put size of buffer in this, aka size to read
// Build iovec structs
struct iovec local[1];
local[0].iov_base = calloc(bufferLength, sizeof(char));
local[0].iov_len = bufferLength;
struct iovec remote[1];
remote[0].iov_base = remotePtr;
remote[0].iov_len = bufferLength;
/* nread will contain the number of bytes read */
ssize_t nread = process_vm_readv(pid, local, 1, remote, 1, 0); /* liovcnt must match the size of local[] */
if (nread < 0) {
switch (errno) {
case EINVAL:
printf("ERROR: INVALID ARGUMENTS.\n");
break;
case EFAULT:
printf
("ERROR: UNABLE TO ACCESS TARGET MEMORY ADDRESS.\n");
break;
case ENOMEM:
printf("ERROR: UNABLE TO ALLOCATE MEMORY.\n");
break;
case EPERM:
printf
("ERROR: INSUFFICIENT PRIVILEGES TO TARGET PROCESS.\n");
break;
case ESRCH:
printf("ERROR: PROCESS DOES NOT EXIST.\n");
break;
default:
printf("ERROR: AN UNKNOWN ERROR HAS OCCURRED.\n");
}
return -1;
}
/* To print the read data */
printf("The read text is \n %s\n", (char *)local[0].iov_base);
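The fragment above can be wrapped into a small function that is easy to try out. The sketch below is one way to do that; `read_remote` is a hypothetical helper name. Reading from another process is subject to the same permission checks as a ptrace attach, so the simplest way to exercise the function is to point it at the caller's own pid.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Copy len bytes from address remote_addr in process pid into buf.
 * Returns the number of bytes read, or -1 with errno set.
 */
ssize_t read_remote(pid_t pid, const void *remote_addr, void *buf, size_t len)
{
    struct iovec local  = { .iov_base = buf, .iov_len = len };
    struct iovec remote = { .iov_base = (void *)remote_addr, .iov_len = len };

    assert(buf && len > 0);
    /* One local and one remote iovec; the flags argument must be 0. */
    return process_vm_readv(pid, &local, 1, &remote, 1, 0);
}
```

For a real target, `pid` and `remote_addr` would come from the tool's own inputs, for example a range taken from /proc/<pid>/maps. A return value smaller than `len` indicates a partial read.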

How to use mmap'd reserved kernel memory in get_user_pages_fast()?

I have reserved several GB of memory via memmap=nn[KMG]$ss[KMG] in the kernel cmd line parameters. I also have a custom char device driver which uses mmap() and write() from the file_operations struct and performs DMA operations to a custom PCIe device. I use the typical DMA API to create the scatter-gather list.
In my driver, mmap() is successful and I can read/write to this io_remap_pfn_range() memory all day long from user-space. My driver's write() function is supposed to use the mmap'd buffer to create a scatter-gather list.
However, my driver's write() function fails with EFAULT when calling get_user_pages_fast() and I cannot explain why. If I use a malloc'd buffer instead of an mmap'd buffer from user-space everything works as expected. It seems to be a problem with the way get_user_pages_fast() handles the mmap'd buffer. What's the workaround?
For example,
/* From user-space
* What currently works but has poor performance
*/
int fd;
FILE *fdat;
char *buf;
ssize_t rc;
size_t len = (size_t)4*1024*1024*1024; /* 4GB; the cast avoids int overflow */
fd = open("/dev/mycooldev", O_RDWR|O_SYNC);
rc = posix_memalign((void **)&buf, 4096, len);
fdat = fopen("mydat.bin", "r");
fread(buf, 1, len, fdat);
fclose(fdat);
rc = write(fd, buf, len);
close(fd);
/* From user-space
* What I want to do, i.e. use the many GB of contiguous reserved memory
*/
int fd;
FILE *fdat;
char *buf;
ssize_t rc;
size_t len = (size_t)4*1024*1024*1024; /* 4GB; the cast avoids int overflow */
fd = open("/dev/mycooldev", O_RDWR|O_SYNC);
buf = mmap(0, len, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
fdat = fopen("mydat.bin", "r");
fread(buf, 1, len, fdat);
fclose(fdat);
rc = write(fd, buf, len);
close(fd);
Relevant write() function in my device driver:
/* From the char device driver
* buf is a pointer to the buffer returned from either
* posix_memalign() or mmap(), but mmap() doesn't work
* because of get_user_pages_fast()
*/
ssize_t mycooldev_write(struct file *file, char __user *buf, size_t len, loff_t *pos)
{
int rc;
struct page **pages;
size_t npages;
struct sg_table sgt;
unsigned long uaddr, start, end;
...
/* EDIT: I added this call per a request in comments */
if (access_ok(VERIFY_WRITE, buf, len) == 0) {
pr_err("Not allowed to access user address\n");
return -EFAULT;
}
uaddr = (unsigned long)buf;
start = uaddr & PAGE_MASK_64; /* (0xFFFFFFFF00000000 | PAGE_MASK) */
end = uaddr + len + PAGE_SIZE - 1;
npages = (end - start) >> PAGE_SHIFT;
pages = kcalloc(npages, sizeof(struct page *), GFP_KERNEL);
/* This call fails with EFAULT when user passes in a buffer
* obtained with a call to mmap(). If user calls posix_memalign()
* to obtain their buffer, get_user_pages_fast() is successful.
*/
rc = get_user_pages_fast(uaddr, npages, 1, pages);
rc = sg_alloc_table_from_pages(&sgt, pages, npages, uaddr & ~PAGE_MASK, len, GFP_KERNEL); /* offset within the first page, not the file position */
...
}
Why does get_user_pages_fast() not like my mmap'd buffer, but is fine with the malloc'd one? Any help is greatly appreciated.
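Independently of the EFAULT itself, the page bookkeeping in mycooldev_write() (start, end, npages, and the offset handed to the scatter-gather table) can be checked in user space. The sketch below assumes 4 KiB pages; `pages_spanned` and `offset_in_page` are hypothetical helper names that mirror the driver's arithmetic. It does not address why get_user_pages_fast() rejects the mmap'd range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumes 4 KiB pages */
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)

/* Number of pages touched by the byte range [uaddr, uaddr + len). */
size_t pages_spanned(uintptr_t uaddr, size_t len)
{
    assert(len > 0);
    uintptr_t first = uaddr >> SKETCH_PAGE_SHIFT;
    uintptr_t last  = (uaddr + len - 1) >> SKETCH_PAGE_SHIFT;
    return (size_t)(last - first + 1);
}

/* Offset of uaddr inside its first page (the sg table "offset" argument). */
size_t offset_in_page(uintptr_t uaddr)
{
    return (size_t)(uaddr & (SKETCH_PAGE_SIZE - 1));
}
```

A two-byte buffer that straddles a page boundary needs two pages, which is the case the `+ PAGE_SIZE - 1` term in the driver is there to cover.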

Pinning user space buffer for DMA from Linux kernel

I'm writing a driver for devices that produce around 1 GB of data per second. Because of that, I decided to map the user buffer allocated by the application directly for DMA instead of copying through an intermediate kernel buffer.
The code works, more or less. But during long-running stress tests I see kernel oopses with "bad page state" triggered by unrelated applications (for instance updatedb), probably when the kernel wants to swap out some pages:
[21743.515404] BUG: Bad page state in process PmStabilityTest pfn:357518
[21743.521992] page:ffffdf844d5d4600 count:19792158 mapcount:0 mapping: (null) index:0x12b011e012d0132
[21743.531829] flags: 0x119012c01220124(referenced|lru|slab|reclaim|uncached|idle)
[21743.539138] raw: 0119012c01220124 0000000000000000 012b011e012d0132 012e011e011e0111
[21743.546899] raw: 0000000000000000 012101300131011c 0000000000000000 012101240123012b
[21743.554638] page dumped because: page still charged to cgroup
[21743.560383] page->mem_cgroup:012101240123012b
[21743.564745] bad because of flags: 0x120(lru|slab)
[21743.569555] BUG: Bad page state in process PmStabilityTest pfn:357519
[21743.576098] page:ffffdf844d5d4640 count:18219302 mapcount:18940179 mapping: (null) index:0x0
[21743.585318] flags: 0x0()
[21743.587859] raw: 0000000000000000 0000000000000000 0000000000000000 0116012601210112
[21743.595599] raw: 0000000000000000 011301310127012f 0000000000000000 012f011d010d011a
[21743.603336] page dumped because: page still charged to cgroup
[21743.609108] page->mem_cgroup:012f011d010d011a
...
Entering kdb (current=0xffff8948189b2d00, pid 6387) on processor 6 Oops: (null)
due to oops @ 0xffffffff9c87f469
CPU: 6 PID: 6387 Comm: updatedb.mlocat Tainted: G B OE 4.10.0-42-generic #46~16.04.1-Ubuntu
...
Details:
The user buffer consists of frames, and neither the buffer nor the frames are page-aligned. The frames in the buffer are used in a circular manner for "infinite" live data transfers. For each frame I get the memory pages via get_user_pages_fast, then convert them to a scatter-gather table with sg_alloc_table_from_pages, and finally map it for DMA using dma_map_sg.
I rely on sg_alloc_table_from_pages to merge consecutive pages into one DMA descriptor to reduce the size of the S/G table sent to the device. The devices are custom built and use an FPGA. I took inspiration from many drivers doing similar mapping, especially the i915 and radeon video drivers, but none has everything in one place, so I might have overlooked something.
Related functions (pin_user_buffer and unpin_user_buffer are called from separate IOCTLs):
static int pin_user_frame(struct my_dev *cam, struct udma_frame *frame)
{
const unsigned long bytes = cam->acq_frame_bytes;
const unsigned long first =
( frame->uaddr & PAGE_MASK) >> PAGE_SHIFT;
const unsigned long last =
((frame->uaddr + bytes - 1) & PAGE_MASK) >> PAGE_SHIFT;
const unsigned long offset =
frame->uaddr & ~PAGE_MASK;
int nr_pages = last - first + 1;
int err;
int n;
struct page **pages;
struct sg_table *sgt;
if (frame->uaddr + bytes < frame->uaddr) {
pr_err("%s: attempted user buffer overflow!\n", __func__);
return -EINVAL;
}
if (bytes == 0) {
pr_err("%s: user buffer has zero bytes\n", __func__);
return -EINVAL;
}
pages = kcalloc(nr_pages, sizeof(*pages), GFP_KERNEL | __GFP_ZERO);
if (!pages) {
pr_err("%s: can't allocate udma_frame.pages\n", __func__);
return -ENOMEM;
}
sgt = kzalloc(sizeof(*sgt), GFP_KERNEL);
if (!sgt) {
pr_err("%s: can't allocate udma_frame.sgt\n", __func__);
err = -ENOMEM;
goto err_alloc_sgt;
}
/* (rw == READ) means read from device, write into memory area */
err = get_user_pages_fast(frame->uaddr, nr_pages, READ == READ, pages);
if (err < nr_pages) {
if (err >= 0) {
pr_err("%s: can't pin all %d user pages, got %d\n",
__func__, nr_pages, err);
nr_pages = err; /* release only the pages that were pinned */
err = -EFAULT;
} else {
pr_err("%s: can't pin user pages\n", __func__);
nr_pages = 0; /* nothing was pinned */
}
goto err_get_pages;
}
for (n = 0; n < nr_pages; ++n)
flush_dcache_page(pages[n]); //<--- Is this needed?
err = sg_alloc_table_from_pages(sgt, pages, nr_pages, offset, bytes,
GFP_KERNEL);
if (err) {
pr_err("%s: can't build sg_table for %d pages\n",
__func__, nr_pages);
goto err_alloc_sgt2;
}
if (!dma_map_sg(&cam->pci_dev->dev, sgt->sgl, sgt->nents, DMA_FROM_DEVICE)) {
pr_err("%s: can't map %u sg_table entries for DMA\n",
__func__, sgt->nents);
err = -ENOMEM;
goto err_dma_map;
}
frame->pages = pages;
frame->nr_pages = nr_pages;
frame->sgt = sgt;
return 0;
err_dma_map:
sg_free_table(sgt);
err_alloc_sgt2:
err_get_pages:
for (n = 0; n < nr_pages; ++n)
put_page(pages[n]);
kfree(sgt);
err_alloc_sgt:
kfree(pages);
return err;
}
static void unpin_user_frame(struct my_dev *cam, struct udma_frame *frame)
{
int n;
dma_unmap_sg(&cam->pci_dev->dev, frame->sgt->sgl, frame->sgt->nents,
DMA_FROM_DEVICE);
sg_free_table(frame->sgt);
kfree(frame->sgt);
frame->sgt = NULL;
for (n = 0; n < frame->nr_pages; ++n) {
struct page *page = frame->pages[n];
set_page_dirty_lock(page);
mark_page_accessed(page); //<--- Without this the Oops are more frequent
put_page(page);
}
kfree(frame->pages);
frame->pages = NULL;
frame->nr_pages = 0;
}
static void unpin_user_buffer(struct my_dev *cam)
{
if (cam->udma_frames) {
int n;
for (n = 0; n < cam->udma_frame_count; ++n)
unpin_user_frame(cam, &cam->udma_frames[n]);
kfree(cam->udma_frames);
cam->udma_frames = NULL;
}
cam->udma_frame_count = 0;
cam->udma_buffer_bytes = 0;
cam->udma_buffer = NULL;
cam->udma_desc_count = 0;
}
static int pin_user_buffer(struct my_dev *cam)
{
int err;
int n;
const u32 acq_frame_count = cam->acq_buffer_bytes / cam->acq_frame_bytes;
struct udma_frame *udma_frames;
u32 udma_desc_count = 0;
if (!cam->acq_buffer) {
pr_err("%s: user buffer is NULL!\n", __func__);
return -EFAULT;
}
if (cam->udma_buffer == cam->acq_buffer
&& cam->udma_buffer_bytes == cam->acq_buffer_bytes
&& cam->udma_frame_count == acq_frame_count)
return 0;
if (cam->udma_buffer)
unpin_user_buffer(cam);
udma_frames = kcalloc(acq_frame_count, sizeof(*udma_frames),
GFP_KERNEL | __GFP_ZERO);
if (!udma_frames) {
pr_err("%s: can't allocate udma_frame array for %u frames\n",
__func__, acq_frame_count);
return -ENOMEM;
}
for (n = 0; n < acq_frame_count; ++n) {
struct udma_frame *frame = &udma_frames[n];
frame->uaddr =
(unsigned long)(cam->acq_buffer + n * cam->acq_frame_bytes);
err = pin_user_frame(cam, frame);
if (err) {
pr_err("%s: can't pin frame %d (out of %u)\n",
__func__, n + 1, acq_frame_count);
for (--n; n >= 0; --n)
unpin_user_frame(cam, &udma_frames[n]); /* unwind the frames pinned so far, not the failed one */
kfree(udma_frames);
return err;
}
udma_desc_count += frame->sgt->nents; /* Cannot overflow */
}
pr_debug("%s: total udma_desc_count=%u\n", __func__, udma_desc_count);
cam->udma_buffer = cam->acq_buffer;
cam->udma_buffer_bytes = cam->acq_buffer_bytes;
cam->udma_frame_count = acq_frame_count;
cam->udma_frames = udma_frames;
cam->udma_desc_count = udma_desc_count;
return 0;
}
Related structures:
struct udma_frame {
unsigned long uaddr; /* User address of the frame */
int nr_pages; /* Nr. of pages covering the frame */
struct page **pages; /* Actual pages covering the frame */
struct sg_table *sgt; /* S/G table describing the frame */
};
struct my_dev {
...
u8 __user *acq_buffer; /* User-space buffer received via IOCTL */
...
u8 __user *udma_buffer; /* User-space buffer for image */
u32 udma_buffer_bytes; /* Total image size in bytes */
u32 udma_frame_count; /* Nr. of items in udma_frames */
struct udma_frame
*udma_frames; /* DMA descriptors per frame */
u32 udma_desc_count; /* Total nr. of DMA descriptors */
...
};
Questions:
How do I properly pin user buffer pages and mark them as not movable?
If one frame ends and the next frame starts in the same page, is it correct to handle it as two independent pages, i.e. pin the page twice?
The data comes from the device to the user buffer and the app is not supposed to write to its buffer, but I have no control over that. Can I use DMA_FROM_DEVICE, or should I rather use DMA_BIDIRECTIONAL just in case?
Do I need to use something like SetPageReserved/ClearPageReserved or mark_page_reserved/free_reserved_page?
Is IOMMU/swiotlb somehow involved? E.g. the i915 driver doesn't use sg_alloc_table_from_pages if swiotlb is active.
What is the difference between the set_page_dirty, set_page_dirty_lock and SetPageDirty functions?
Thanks for any hint.
PS: I cannot change the way the application gets the data without breaking our library API maintained for many years. So please do not advise e.g. to mmap kernel buffer...
Why do you pass "READ == READ" as the third parameter? You need to pass flags there.
err = get_user_pages_fast(frame->uaddr, nr_pages, READ == READ, pages);
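You need to pass "FOLL_LONGTERM" here. FOLL_PIN is not something the caller passes; it is set internally by the pin_user_pages*() family (e.g. pin_user_pages_fast), which the linked documentation recommends for this case. See https://www.kernel.org/doc/html/latest/core-api/pin_user_pages.html#case-2-rdma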
In addition, you need to take care of memory coherence between the CPU and the device: call "dma_sync_sg_for_device(...)" before the DMA transfer and "dma_sync_sg_for_cpu(...)" after it.
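On the question of a frame ending and the next one starting in the same page: each successful get_user_pages_fast() call takes its own reference on every page it returns, so a page shared by two frames is referenced once per frame and has to be released once per frame. Whether two neighbouring frames actually share a page follows from the same first/last arithmetic that pin_user_frame uses. The sketch below checks that in user space, assuming 4 KiB pages and equal-size frames laid out back to back; `frame_pages` and `frames_share_page` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumes 4 KiB pages */

struct page_range {
    uintptr_t first; /* index of the first page touched by the frame */
    uintptr_t last;  /* index of the last page touched by the frame */
};

/* Page range covered by frame n of a buffer split into equal-size frames. */
struct page_range frame_pages(uintptr_t buffer, size_t frame_bytes, size_t n)
{
    struct page_range r;
    uintptr_t start = buffer + n * frame_bytes;

    assert(frame_bytes > 0);
    r.first = start >> SKETCH_PAGE_SHIFT;
    r.last  = (start + frame_bytes - 1) >> SKETCH_PAGE_SHIFT;
    return r;
}

/* Nonzero if frame n and frame n + 1 touch a common page. */
int frames_share_page(uintptr_t buffer, size_t frame_bytes, size_t n)
{
    return frame_pages(buffer, frame_bytes, n).last ==
           frame_pages(buffer, frame_bytes, n + 1).first;
}
```

With a buffer at 0x10010 and 6000-byte frames, frame 0 ends in page 0x11 and frame 1 starts in the same page, so that page would be pinned by both frames. With a page-aligned buffer and 8192-byte frames no page is shared.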

Resources