linux proc size limit problems - c

I'm trying to write a linux kernel module that can dump the contents of other modules to a /proc file (for analysis). In principle it works but it seems I run into some buffer limit or the like. I'm still rather new to Linux kernel development so I would also appreciate any suggestions not concerning the particular problem.
The memory that is used to store the module is allocated in this function:
char *get_module_dump(int module_num)
{
struct module *mod = unhiddenModules[module_num];
char *buffer;
buffer = kmalloc(mod->core_size * sizeof(char), GFP_KERNEL);
memcpy((void *)buffer, (void *)startOf(mod), mod->core_size);
return buffer;
}
'unhiddenModules' is an array of module structs
Then it is handed over to the proc creation here:
void create_module_dump_proc(int module_number)
{
struct proc_dir_entry *dump_module_proc;
dump_size = unhiddenModules[module_number]->core_size;
module_buffer = get_module_dump(module_number);
sprintf(current_dump_file_name, "%s_dump", unhiddenModules[module_number]->name);
dump_module_proc = proc_create_data(current_dump_file_name, 0, dump_proc_folder, &dump_fops, module_buffer);
}
The proc read function is as follows:
ssize_t dump_proc_read(struct file *filp, char *buf, size_t count, loff_t *offp)
{
char *data;
ssize_t ret;
data = PDE_DATA(file_inode(filp));
ret = copy_to_user(buf, data, dump_size);
*offp += dump_size - ret;
if (*offp > dump_size)
return 0;
else
return dump_size;
}
Smaller Modules are dumped correctly but if the module is larger than 126,796 bytes only the first 126,796 bytes are written and this error is displayed when reading from the proc file:
*** Error in `cat': free(): invalid next size (fast): 0x0000000001f4a040 ***
I've seem to run into some limit but I couldn't find anything on it. The error seems to be related so memory leaks but the buffer should be large enough so I don't see where this actually happens.

The procfs has a limit of PAGE_SIZE (one page) for read and write operations. Usually seq_file is used to iterate over the entries (modules in your case ?) to read and/or write smaller chunks. Since you are running into problems with only larger data, I suspect this is the case here.
Please have a look here and here if you are not familiar with seq_files.

A suspicious thing is that in dump_proc_read you are not using "count" parameter. I would have expected copy_to_user to take "count" as third argument instead of "dump_size" (and in subsequent calculations too). The way you do, always dump_size bytes are copied to user space, regardless the data size the application was expecting. The bigger dump_size is, the larger the user area that gets corrupted.

Related

How to pass a char *array (belonging to the user address space) to a tasklet or workqueue in a kernel module?

I’m writing a device driver. If someone calls the write operation I want it to be deferred (using tasklet or workqueue). The code should be something like that:
static ssize_t dev_write(struct file *filp, const char *buff, size_t len, loff_t *off) {
packed_work *the_task;
the_task = kzalloc(sizeof(packed_work), GFP_ATOMIC);
if (the_task == NULL) {
printk(KERN_ERR "%s: tasklet buffer allocation failure\n", MODNAME);
return -1;
}
the_task->buffer = the_task;
the_task->buff = buff;
the_task->len = len;
INIT_WORK(&(the_task->the_work), (void*)deferred_write);
schedule_work(&the_task->the_work);
return len;
}
void deferred_write(struct work_struct *data) {
printk(“the text: %s\n”, container_of(data, packed_work, the_work)->buff);
//copy_from_user(&(the_object->stream_content), container_of(data, packed_work, the_work)->buff, len);
kfree((void*)container_of(data,packed_work,the_work));
}
And the struct looks like this:
typedef struct _packed_work{
void *buffer;
const char *buff;
size_t len;
struct work_struct the_work;
} packed_work;
The problem is that the kernel crashes. It crashes even before the copy_from_user (that’s why I commented it). In the deferred_write() I can print the length of the string but not the string itself. Is it a problem because the buffer is in the user space memory?
I know that, as a workaround, I can copy the user buffer in the task struct (using the copy_from_user() in the function write()) and then use the strcpy() in the deferred_write() function. But I really would like to use the copy_from_user() in deferred_write(). Is it possible? What can I do?
Even if it is possible (and there is surely a way), the user process has probably changed the contents of the buffer in the time before deferred_write runs. Notice that user programs often allocate these buffers on the stack, so they get overwritten when the function that called write returns and calls other functions.
Even worse: the user process could have unmapped the buffer, or it could have exited.
So you should not delay reading the buffer. You should read the buffer inside the write call and not anywhere else.

delayed write from userspace to kernel space using framebuffer node

I have implemented a linux kernel driver which uses deferred IO mechanism to track the changes in framebuffer node.
static struct fb_deferred_io fb_defio = {
.delay = HZ/2,
.deferred_io = fb_dpy_deferred_io,
};
Per say the registered framebuffer node is /dev/graphics/fb1.
The sample application code to access this node is:
fbfd = open("/dev/graphics/fb1", O_RDWR);
if (!fbfd) {
printf("error\n");
exit(0);
}
screensize = 540*960*4;
/* Map the device to memory */
fbp = (unsigned char *)mmap(0, screensize, PROT_READ | PROT_WRITE, MAP_SHARED,
fbfd, 0);
if ((int)fbp == -1) {
printf("Error: failed to start framebuffer device to memory.");
}
int grey = 0x1;
for(cnt = 0; cnt < screensize; cnt++)
*(fbp + cnt) = grey<<4|grey;
This would fill up entire fb1 node with 1's.
The issue now is at the kernel driver when i try to read the entire buffer I find data mismatch at different locations.
The buffer in kernel is mapped as:
par->buffer = dma_alloc_coherent(dev, roundup((dpyw*dpyh*BPP/8), PAGE_SIZE),(dma_addr_t *) &DmaPhysBuf, GFP_KERNEL);
if (!par->buffer) {
printk(KERN_WARNING "probe: dma_alloc_coherent failed.\n");
goto err_vfree;
}
and finally the buffer is registered through register_framebuffer function.
On reading the source buffer I find that at random locations the data is not been written instead the old data is reflected.
For example:
At buffer location 3964 i was expecting 11111111 but i found FF00FF00.
On running the same application program with value of grey changed to 22222222
At buffer location 3964 i was expecting 22222222 but i found 11111111
It looks like there is some delayed write in the buffer. Is there any solution to this effect, because of partially wrong data my image is getting corrupted.
Please let me know if any more information is required.
Note: Looks like an issue of mapped buffer being cacheable or not. Its a lazy write to copy the data from cache to ram. Need to make sure that the data is copied properly but how still no idea.. :-(
"Deferred io" means that frame buffer memory is not really mapped to a display device. Rather, it's an ordinary memory area shared between user process and kernel driver. Thus it needs to be "synced" for kernel to actually do anything about it:
msync(fbp, screensize, MS_SYNC);
Calling fsync(fbfd) may also work.
You may also try calling ioctl(fbfd, FBIO_WAITFORVSYNC, 0) if your driver supports it. The call will make your application wait until vsync happens and the frame buffer data was definitely transferred to the device.
I was having a similar issue where I was having random artifacts displaying on the screen. Originally, the framebuffer driver was not using dma at all.
I tried the suggestion of using msync(), which improved the situation (artifacts happened less frequently), but it did not completely solve the issue.
After doing some research I came to the conclusion that I need to use dma memory because it is not cached. There is still the issue with mmap because it is mapping the kernel memory to userspace. However, I found that there is already a function in the kernel to handle this.
So, my solution was in my framebuffer driver, set the mmap function:
static int my_fb_mmap(struct fb_info *info, struct vm_area_struct *vma)
{
return dma_mmap_coherent(info->dev, vma, info->screen_base,
info->fix.smem_start, info->fix.smem_len);
}
static struct fb_ops my_fb_ops = {
...
.fb_mmap = my_fb_mmap,
};
And then in the probe function:
struct fb_info *info;
struct my_fb_par *par;
dma_addr_t dma_addr;
char *buf
info = framebuffer_alloc(sizeof(struct my_fb_par), &my_parent->dev);
...
buf = dma_alloc_coherent(info->dev, MY_FB_SIZE, dma_addr, GFP_KERNEL);
...
info->screen_base = buf;
info->fbops = &my_fb_ops;
info->fix = my_fb_fix;
info->fix.smem_start = dma_addr;
info->fix.smem_len = MY_FB_SIZE;
...
par = info->par
...
par->buffer = buf;
Obviously, I've left out the error checking and unwinding, but hopefully I have touched on all of the important parts.
Note: Comments in the kernel source say that dmac_flush_range() is for private use only.
Well eventually i found a better way to solve the issue. The data written through app at mmaped device node is first written in cache which is later written in RAM through delayed write policy. In order to make sure that the data is flushed properly we need to call the flush function in kernel. I used
dmac_flush_range((void *)pSrc, (void *)pSrc + bufSize);
to flush the data completely so that the kernel receives a clean data.

Within char device, where do i put ioread?

I've got a pci device and all I want is to read its memory by "cat"ing from /dev/pcidevice. My first attempt for the char device's read function looked like this:
ssize_t cdev_read(struct file *filp, char __user *buffer, size_t count, loff_t *f_pos) {
ssize_t retval = 0;
struct mypci_dev *device = filp->private_data;
/* reading data from pci device */
device->values.fst = ioread16(device->bar[0]+OFFSET_FST);
device->values.snd = ioread16(device->bar[0]+OFFSET_SND);
...
device->values.lst = ioread16(device->bar[0]+OFFSET_LST);
retval = copy_to_user(buffer, &device->values.fst, count);
return retval;
}
And it didn't work :/ I changed the copy_to_user line into
retval = copy_to_user(buffer, "dummy", strlen("dummy")+1);
but cat /dev/pcidevice still returned nothing.
Next I shifted all ioread16 calls into cdev_open() and I got what I wanted. But now I'm curious why it's only working this way. And how can I make it work the other way?
ATM I think about timers that start copying etc. but some kind of wait until ioreads have finished would be enough.
Any ideas?
copy_to_user() function returns number of bytes that could not be copied. On success, it returns 0. Now imagine how surprised the user space would be when it reads 0 bytes.
The real question is how that actually works inside cdev_open()? I hope that you read I/O bar in that function and not sending anything to the user space. In that case, try adding rmb(); after your reads before calling copy_to_user() to make sure every read is finished (rmb() is a read memory barrier).
Also, check out LDD chapter 3 and chapter 5 if you haven't done so already.
Hope it helps. Happy hacking!

copy_to_user vs memcpy

I have always been told(In books and tutorials) that while copying data from kernel space to user space, we should use copy_to_user() and using memcpy() would cause problems to the system. Recently by mistake i have used memcpy() and it worked perfectly fine with out any problems. Why is that we should use copy_to_user instead of memcpy()
My test code(Kernel module) is something like this:
static ssize_t test_read(struct file *file, char __user * buf,
size_t len, loff_t * offset)
{
char ani[100];
if (!*offset) {
memset(ani, 'A', 100);
if (memcpy(buf, ani, 100))
return -EFAULT;
*offset = 100;
return *offset;
}
return 0;
}
struct file_operations test_fops = {
.owner = THIS_MODULE,
.read = test_read,
};
static int __init my_module_init(void)
{
struct proc_dir_entry *entry;
printk("We are testing now!!\n");
entry = create_proc_entry("test", S_IFREG | S_IRUGO, NULL);
if (!entry)
printk("Failed to creats proc entry test\n");
entry->proc_fops = &test_fops;
return 0;
}
module_init(my_module_init);
From user-space app, i am reading my /proc entry and everything works fine.
A look at source code of copy_to_user() says that it is also simple memcpy() where we are just trying to check if the pointer is valid or not with access_ok and doing memcpy.
So my understanding currently is that, if we are sure about the pointer we are passing, memcpy() can always be used in place of copy_to_user.
Please correct me if my understanding is incorrect and also, any example where copy_to_user works and memcpy() fails would be very useful. Thanks.
There are a couple of reasons for this.
First, security. Because the kernel can write to any address it wants, if you just use a user-space address you got and use memcpy, an attacker could write to another process's pages, which is a huge security problem. copy_to_user checks that the target page is writable by the current process.
There are also some architecture considerations. On x86, for example, the target pages must be pinned in memory. On some architectures, you might need special instructions. And so on. The Linux kernels goal of being very portable requires this kind of abstraction.
This answer may be late but anyway copy_to_user() and it's sister copy_from_user() both do some size limits checks about user passed size parameter and buffer sizes so a read method of:
char name[] = "This message is from kernel space";
ssize_t read(struct file *f, char __user *to, size_t size, loff_t *loff){
int ret = copy_to_user(to, name, size);
if(ret){
pr_info("[+] Error while copying data to user space");
return ret;
}
pr_info("[+] Finished copying data to user space");
return 0;
}
and a user space app read as read(ret, buffer, 10); is OK but replace 10 with 35 or more and kernel will emit this error:
Buffer overflow detected (34 < 35)!
and cause the copy to fail to prevent memory leaks. Same goes for copy_from_user() which will also make some kernel buffer size checks.
That's why you have to use char name[] and not char *name since using pointer(not array) makes determining size not possible which will make kernel emit this error:
BUG: unable to handle page fault for address: ffffffffc106f280
#PF: supervisor write access in kernel mode
#PF: error_code(0x0003) - permissions violation
Hope this answer is helpful somehow.

locking of copy_[to/from]_user() in linux kernel

as stated in: http://www.kernel.org/doc/htmldocs/kernel-hacking.html#routines-copy this functions "can" sleep.
So, do I always have to do a lock (e.g. with mutexes) when using this functions or are there exceptions?
I'm currently working on a module and saw some Kernel Oops at my system, but cannot reproduce them. I have a feeling they are fired because I'm currently do no locking around copy_[to/from]_user(). Maybe I'm wrong, but it smells like it has something to do with it.
I have something like:
static unsigned char user_buffer[BUFFER_SIZE];
static ssize_t mcom_write (struct file *file, const char *buf, size_t length, loff_t *offset) {
ssize_t retval;
size_t writeCount = (length < BUFFER_SIZE) ? length : BUFFER_SIZE;
memset((void*)&user_buffer, 0x00, sizeof user_buffer);
if (copy_from_user((void*)&user_buffer, buf, writeCount)) {
retval = -EFAULT;
return retval;
}
*offset += writeCount;
retval = writeCount;
cleanupNewline(user_buffer);
dispatch(user_buffer);
return retval;
}
Is this save to do so or do I need locking it from other accesses, while copy_from_user is running?
It's a char device I read and write from, and if a special packet in the network is received, there can be concurrent access to this buffer.
You need to do locking iff the kernel side data structure that you are copying to or from might go away otherwise - but it is that data structure you should be taking a lock on.
I am guessing your function mcom_write is a procfs write function (or similar) right? In that case, you most likely are writing to the procfs file, your program being blocked until mcom_write returns, so even if copy_[to/from]_user sleeps, your program wouldn't change the buffer.
You haven't stated how your program works so it is hard to say anything. If your program is multithreaded and one thread writes while another can change its data, then yes, you need locking, but between the threads of the user-space program not your kernel module.
If you have one thread writing, then your write to the procfs file would be blocked until mcom_write finishes so no locking is needed and your problem is somewhere else (unless there is something else that is wrong with this function, but it's not with copy_from_user)

Resources