How to do data cache flush/invalidate from linux user space - c

I am trying to use cacheable mapped buffers in Linux user space. These buffers will be accessed by accelerators.
In the ARMv7-A architecture, is it possible to flush/invalidate the data cache explicitly from Linux user space?
I tried __clear_cache(), but it didn't work. As per https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html, my understanding is that it flushes only the instruction cache.
User space applications run in user mode; do we need to set any privileged-mode permissions for cache operations?
More info will be helpful.

There is no way to flush an ARMv7-A/ARMv8-A processor cache from userspace (kernel <= 5.13.x) without writing a kernel driver. A simple misc class driver is enough: it exposes an ioctl or sysfs action that makes the driver call the kernel API arch_sync_dma_for_device for the area of RAM that you wish to flush.
See #include <linux/dma-noncoherent.h> for the function prototype of arch_sync_dma_for_device.
So unless the logistics of your project allow you to add a kernel module to the system or rebuild and replace the kernel, you can't flush the processor caches from a userspace application. For legacy projects with product in the field, or projects whose kernel version is locked by digital signing, the logistics usually do not support this type of invasive solution.
I have successfully demonstrated such a misc driver that flushes the processor caches on an IPQ ARMv8a implementation for a new product design. The driver took me about two hours to write and test.

The __builtin___clear_cache function works in my case (Zynq MP, arm64 + Linux), but I think that is because I use mmapped memory from a custom Linux kernel module which allocates a DMA coherent buffer (dma_alloc_coherent).
Edit: Back to this topic, the __builtin___clear_cache function also works well in my case on a general /dev/mem mmapped DDR segment. I open /dev/mem without the O_SYNC flag.
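For reference, here is a minimal sketch of the call discussed in this answer. The helper name fill_and_sync is hypothetical, and the buffer in the usage note is an ordinary heap allocation; in the setups described above the pointer would instead come from mmap() on the driver's buffer or on /dev/mem. What the builtin actually does is target-specific (on some targets it compiles to nothing), and it is documented for instruction-cache coherence, so whether it is sufficient for a DMA device on a given platform is an assumption that has to be verified on the hardware.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: fill a buffer, then ask the compiler's cache builtin
 * to synchronize the address range [buf, buf + len). */
static int fill_and_sync(unsigned char *buf, size_t len, unsigned char value)
{
    if (buf == NULL || len == 0)
        return -1;

    memset(buf, value, len);

    /* GCC/Clang builtin: takes the start address and the one-past-the-end
     * address of the range. */
    __builtin___clear_cache((char *)buf, (char *)buf + len);
    return 0;
}
```

A caller would pass the mapped region and its length, for example fill_and_sync(p, 64, 0xAB) for a 64-byte buffer p.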

Related

Which of the following instructions can run in unprivileged mode? 1) Load 2) Store 3) Input 4) Output

Trying to understand user mode vs kernel mode and which instruction can run in unprivileged mode.
Per Abraham Silberschatz, all of these may require a system call.
It depends on which CPU it is and how the OS configured it.
For example, on 80x86 the in and out instructions can be executed in user mode if either of the following holds:
the IOPL field (I/O privilege level) is numerically not less than the current privilege level, or
the I/O Permission Bitmap (in the current Task State Segment) is configured to allow access to the specific I/O port(s).
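The two conditions above can be modelled as a small pure function. This is only an illustrative model of the check for a single byte-wide port access, not code that touches real hardware; the function name, the parameters, and the choice to deny when no bitmap is supplied are assumptions made for the sketch. In the bitmap, a cleared bit means the port is allowed.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

#define IO_BITMAP_BYTES 8192  /* 65536 ports, one bit per port */

/* Model of the x86 protected-mode check for a byte-wide in/out.
 * cpl:       current privilege level (0 = most privileged, 3 = user)
 * iopl:      I/O privilege level from the flags register
 * io_bitmap: I/O permission bitmap, or NULL if none is present (assumption:
 *            treated as "deny everything" in this model) */
static bool port_access_allowed(unsigned cpl, unsigned iopl,
                                const uint8_t *io_bitmap, uint16_t port)
{
    if (cpl <= iopl)
        return true;            /* privileged enough, bitmap not consulted */
    if (io_bitmap == NULL)
        return false;
    /* A cleared bit grants access to that port. */
    return (io_bitmap[port / 8] & (1u << (port % 8))) == 0;
}
```

For example, with CPL 3 and IOPL 0, access to port 0x3F8 is allowed only if bit 0 of bitmap byte 127 is clear.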
Of course a lot of IO is memory mapped, and that also allows a kernel to map any "memory mapped IO" areas into user space.
In general, a monolithic kernel will run device drivers in kernel space and won't allow user mode code to access any kind of I/O directly, whereas a micro-kernel will run device drivers in user space and will explicitly configure things so that a device driver process can do input/output for the device it is driving (but not for any other device).

How to unsafely remove blockdevice driver in Linux

I am writing a block device driver for linux.
It is crucial to support unsafe removal (like a USB unplug). In other words, I want to be able to shut down the block device without creating memory leaks or crashes, even while applications hold open files or perform I/O on my device, or while a file system on it is mounted.
Surely unsafe removal would possibly corrupt the data which is stored on the device, but that is something the customers are willing to accept.
Here are the basic steps I have taken:
Upon unsafe removal, the block device spawns a zombie which automatically fails all new I/O requests, ioctls, etc. The zombie substitutes the make_request function and changes other function pointers so that the kernel no longer needs the original block device.
The block device waits for all I/O that is currently running (and using my internal resources) to complete.
It calls del_gendisk(); however, this does not really free the kernel resources because they are still in use.
The block device frees itself.
The zombie keeps track of the number of open() and close() calls on the block device, and when the last close() occurs it automatically frees itself.
Result: I am not leaking the block device, request queue, gendisk, etc.
However, this is a very difficult mechanism which requires a lot of code and is extremely prone to race conditions. I am still struggling with corner cases, per_cpu counting of I/Os and occasional crashes.
My question: Is there a mechanism in the kernel which already does that? I searched manuals, literature, and countless source code examples of block device drivers, RAM disks and USB drivers but could not find a solution. I am sure that I am not the first one to encounter this problem.
Edited:
I learned about the hot-plug mechanism from the answer below by Dave S, but it does not help me. I need a way to safely shut down the driver, not a way to notify the kernel that the driver was shut down.
Example of one problem:
blk_queue_make_request() registers a function through which my block device serves I/O. In that function I increment per_cpu counters to know how many I/Os are in flight on each CPU. However, there is a race condition: the function may already have been called while the counter has not yet been incremented, so my device thinks there are 0 I/Os in flight, releases the resources, and then the I/O arrives and crashes the system. Hotplug will not help me with this problem, as far as I understand.
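One way to reason about the race described above is to order the two sides so that they cannot miss each other: the I/O path increments the counter first and only then checks a "dying" flag, while the removal path sets the flag first and only then reads the counter. The following is a user-space sketch of that ordering using C11 atomics with their default sequentially consistent ordering. It uses one shared counter instead of per-CPU counters, and every name in it (io_gate and its functions) is hypothetical. The kernel's percpu_ref API (percpu_ref_tryget_live(), percpu_ref_kill()) appears to target the same pattern and may be worth looking at, but it is not what this sketch uses.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct io_gate {
    atomic_int  in_flight;  /* I/Os currently using device resources */
    atomic_bool dying;      /* set once removal has started */
};

static void io_gate_init(struct io_gate *g)
{
    atomic_init(&g->in_flight, 0);
    atomic_init(&g->dying, false);
}

/* I/O path: count first, then check the flag. Returns true if the caller
 * may touch device resources; it must then call io_gate_exit(). */
static bool io_gate_enter(struct io_gate *g)
{
    atomic_fetch_add(&g->in_flight, 1);
    if (atomic_load(&g->dying)) {
        atomic_fetch_sub(&g->in_flight, 1);  /* back out, fail the I/O */
        return false;
    }
    return true;
}

static void io_gate_exit(struct io_gate *g)
{
    atomic_fetch_sub(&g->in_flight, 1);
}

static bool io_gate_drained(struct io_gate *g)
{
    return atomic_load(&g->in_flight) == 0;
}

/* Removal path: set the flag first, then look at the counter. Returns true
 * if no I/O is in flight; otherwise the caller waits until
 * io_gate_drained() becomes true before freeing resources. */
static bool io_gate_kill(struct io_gate *g)
{
    atomic_store(&g->dying, true);
    return io_gate_drained(g);
}
```

With this ordering, either the I/O path observes the flag and backs out, or the removal path observes a non-zero counter and waits; the case where both miss each other is excluded by the sequentially consistent atomics.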
About a decade ago I used hotplugging on a software driver project to safely add/remove an external USB disk drive which interfaced to an embedded Linux driven Set-top Box.
For your project you will also need to write a hotplug program. A hotplug program is one which the kernel uses to notify user mode software when some significant (usually hardware-related) event takes place. An example is when a USB device has just been plugged in or removed.
From Linux 2.6 kernel onwards, hotplugging has been integrated with the driver model core so that any bus or class can report hotplug events when devices are added or removed.
In the kernel tree, /usr/src/linux/Documentation/usb/hotplug.txt has basic information about USB Device Driver API support for hotplugging.
See also this link, and search the web for further examples and documentation:
http://linux-hotplug.sourceforge.net/
Another very helpful document which discusses hotplugging with block devices can be found here:
https://www.kernel.org/doc/pending/hotplug.txt
This document also gives a good illustration of hotplug event handling.
Below are the main variables you should be aware of:
Hotplug event variables:
Every hotplug event should provide at least the following variables:
ACTION: The current hotplug action: "add" to add the device, "remove" to remove it. The 2.6.22 kernel can also generate "change", "online", "offline", and "move" actions.
DEVPATH: Path under /sys at which this device's sysfs directory can be found.
SUBSYSTEM: If this is "block", it's a block device. Anything other subsystem is either a char device or does not have an associated device node.
The following variables are also provided for some devices:
MAJOR and MINOR: If these are present, a device node can be created in /dev for this device. Some devices (such as network cards) don't generate a /dev node.
DRIVER: If present, a suggested driver (module) for handling this device. No relation to whether or not a driver is currently handling the device.
INTERFACE and IFINDEX: When SUBSYSTEM=net, these variables indicate the name of the interface and a unique integer for the interface. (Note that "INTERFACE=eth0" could be paired with "IFINDEX=2" because eth0 isn't guaranteed to come before lo and the count doesn't start at 0.)
FIRMWARE: The system is requesting firmware for the device.
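As an illustration of how a hotplug helper might use the variables listed above, here is a minimal sketch in C. It only decides what kind of device node action an event calls for; it does not create or remove anything. The enum, the function names, and the exact decision rules are assumptions derived from the descriptions above (a node is possible only when MAJOR and MINOR are present; SUBSYSTEM "block" means a block device, anything else with a node is treated as a char device).

```c
#include <stdlib.h>
#include <string.h>

enum node_action {
    NODE_NONE,          /* nothing to do for this event */
    NODE_CREATE_BLOCK,  /* create a block device node */
    NODE_CREATE_CHAR,   /* create a char device node */
    NODE_REMOVE         /* remove the device node */
};

/* Decide what to do from the event variables. Any argument may be NULL,
 * which means the variable was not set for this event. */
static enum node_action decide_node_action(const char *action,
                                           const char *subsystem,
                                           const char *major,
                                           const char *minor)
{
    /* Without MAJOR and MINOR there is no /dev node for the device. */
    if (action == NULL || major == NULL || minor == NULL)
        return NODE_NONE;
    if (strcmp(action, "remove") == 0)
        return NODE_REMOVE;
    if (strcmp(action, "add") != 0)
        return NODE_NONE;   /* "change", "online", ... are ignored here */
    if (subsystem != NULL && strcmp(subsystem, "block") == 0)
        return NODE_CREATE_BLOCK;
    return NODE_CREATE_CHAR;
}

/* A helper invoked by the kernel would read the variables from its
 * environment. */
static enum node_action decide_from_environment(void)
{
    return decide_node_action(getenv("ACTION"), getenv("SUBSYSTEM"),
                              getenv("MAJOR"), getenv("MINOR"));
}
```

A real helper would go on to use DEVPATH to locate the device under /sys and would need error handling that this sketch omits.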
If the driver creates a SCSI-backed device, it is possible to delete it abruptly through sysfs:
echo 1 > /sys/block/device-name/device/delete where device-name may be sde, for example,
or
echo 1 > /sys/class/scsi_device/h:c:t:l/device/delete, where h is the HBA number, c is the channel on the HBA, t is the SCSI target ID, and l is the LUN.
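For a test harness that triggers such a removal programmatically, the two sysfs paths above can be built and written from C. This is a small sketch with hypothetical helper names; write_one() performs the same action as the echo commands, so it is destructive and needs root. The rule that a device name must be non-empty and must not contain '/' is an assumption added here as a basic sanity check.

```c
#include <stdio.h>
#include <string.h>

/* Build "/sys/block/<dev_name>/device/delete". Returns 0 on success,
 * -1 if the name is unacceptable or the buffer is too small. */
static int block_delete_path(char *out, size_t out_len, const char *dev_name)
{
    if (dev_name == NULL || dev_name[0] == '\0' || strchr(dev_name, '/'))
        return -1;
    int n = snprintf(out, out_len, "/sys/block/%s/device/delete", dev_name);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Build "/sys/class/scsi_device/h:c:t:l/device/delete". */
static int scsi_delete_path(char *out, size_t out_len,
                            unsigned h, unsigned c, unsigned t, unsigned l)
{
    int n = snprintf(out, out_len,
                     "/sys/class/scsi_device/%u:%u:%u:%u/device/delete",
                     h, c, t, l);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Equivalent of `echo 1 > path`. Destructive when path is a delete node. */
static int write_one(const char *path)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    int ok = (fputs("1\n", f) >= 0);
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```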
In my case, it perfectly simulates scenarios with crashed writes and recovery of data from the journal.
Normally, more steps are needed to remove a device safely, so deleting the device this way is a pretty drastic event for the data and can be useful for testing :)
Please see also:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/online_storage_reconfiguration_guide/removing_devices
http://www.sysadminshare.com/2012/09/add-remove-single-disk-device-in-linux.html

Linux device driver for a RS232 device in embedded system

I have recently started learning to write Linux device drivers for a specific project that I am working on. Previously most of the work I have done has been with devices running no OS so Linux drivers and development is somewhat new to me.
For the project I am working on I have an embedded system running a Linux-based operating system. I have an external device, controlled via RS232, that I need to write a driver for.
Questions:
1) Is there a way to access serial ports from within kernel space (and possibly use serial.h, serial_core.h, etc.)? How is this usually done? Are there any good examples?
2) From what I found, it seems like it would be much easier to access the serial ports in user space by just opening /dev/ttyS* and writing to it. When writing a driver for a device like this (an RS232 device), is it preferred to do it in user space, or is there a way to write a kernel module? How does one decide to write a driver as a kernel module rather than in user space, or vice versa?
Are drivers only for generic devices such as UART/serial and then above that is userspace or should this driver be written as a kernel module? I appreciate the help, I have been unable to find much information to answer my questions.
There are a few times when a module that communicates over a serial port may be in the kernel. The pppd (point to point protocol daemon) is one example as Linux has some kernel code devoted to that since it is a high traffic use of serial and it also needs to turn around and put the IP packets into kernel space.
Most other uses would work better from user space since you have a good API that already takes care of a lot of the errors that can happen. This also lessens the chance that your errors will result in massive system failure.
Doing things like this from user space does result in some latency. Reads and writes are buffered, and it's often difficult to tell where in the write operations the hardware actually is, and canceling an already succeeded write call isn't really doable from user space, even if the hardware hasn't yet received the bytes.
I would suggest attempting to do it from user space first and then move to OS driver if necessary. Even if it is necessary to move this into an OS level driver, you'll likely be able to get some progress made from user space.
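As a starting point for the user-space route, here is a minimal sketch of opening a serial port and putting it into raw mode with POSIX termios. The helper names are hypothetical, and the settings (9600 baud, 8 data bits, no parity, 1 stop bit, no flow-control configuration) are assumptions for illustration; the real values depend on the external device.

```c
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Adjust an existing termios structure for raw 9600 8N1 operation. */
static int serial_set_raw_9600_8n1(struct termios *t)
{
    /* No input translation, no software flow control on output. */
    t->c_iflag &= ~(tcflag_t)(IGNBRK | BRKINT | PARMRK | ISTRIP |
                              INLCR | IGNCR | ICRNL | IXON);
    /* No output post-processing. */
    t->c_oflag &= ~(tcflag_t)OPOST;
    /* No echo, no line editing, no signal characters. */
    t->c_lflag &= ~(tcflag_t)(ECHO | ECHONL | ICANON | ISIG | IEXTEN);
    /* 8 data bits, no parity, 1 stop bit; enable receiver, ignore modem
     * control lines. */
    t->c_cflag &= ~(tcflag_t)(CSIZE | PARENB | CSTOPB);
    t->c_cflag |= CS8 | CREAD | CLOCAL;
    /* read() blocks until at least one byte is available. */
    t->c_cc[VMIN] = 1;
    t->c_cc[VTIME] = 0;

    if (cfsetispeed(t, B9600) != 0 || cfsetospeed(t, B9600) != 0)
        return -1;
    return 0;
}

/* Open a port such as "/dev/ttyS0" and apply the settings.
 * Returns a file descriptor, or -1 on error. */
static int serial_open(const char *path)
{
    int fd = open(path, O_RDWR | O_NOCTTY);
    if (fd < 0)
        return -1;

    struct termios t;
    if (tcgetattr(fd, &t) != 0 ||
        serial_set_raw_9600_8n1(&t) != 0 ||
        tcsetattr(fd, TCSANOW, &t) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

After serial_open() succeeds, the device protocol can be implemented with ordinary read() and write() calls on the returned descriptor.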

Why does "read" have to be a system call run in "Kernel Mode"?

As I understand it, the UNIX function read() causes an interrupt (trap) and invokes the read system call. I also remember that it has to switch to "Kernel Mode" before invoking the system call, and that this switch is expensive.
I was wondering why the read operation has to be delegated to a system call in "Kernel Mode", instead of being done completely in "User Mode".
For example, if there were a service in "User Mode" which managed the access permissions of files, the read operation could just ask this service, without disturbing the kernel.
And for the disk driver, it is said in this link that
Device drivers can run in either user or kernel mode
Does anyone have ideas about this? Why does read have to be in Kernel Mode?
That is not the way operating systems are designed. By definition, an OS manages the computer's hardware and provides resources to its users. Operating systems also have the concept of user mode and kernel mode (as you said).
With these concepts, the OS draws a specific line between what a user may and may not do. Letting users manage hardware directly is definitely something the OS does not want.
read usually involves a hardware access. Accessing hardware is cumbersome and error-prone, and can leave the computer in an unusable state. The operating system uses drivers to control the computer's hardware.
Issuing a read (assuming hard disk I/O) generally makes a driver send a set of commands to the disk controller, read its output, pass it to main memory, etc. These are dangerous operations that shouldn't be entrusted to user mode.
Even if there were a service in user mode to handle this, a context switch would still be needed, because the service would be running as another process.
Certainly an operating system could be built that allows this, but mainstream modern operating systems aren't designed to behave this way.
There are other approaches to building operating systems that rely on microkernels. A microkernel does just the minimum to get the machine started and leaves everything else to other modules, meaning that if a module crashes, the system stays up. That's the case for specific drivers, filesystems, etc. I don't know if microkernels let these run in user space though.
Hope this helps!
First of all: it's no longer true that calling the kernel is very expensive. It used to be, when causing an exception/trap/fault/interrupt was the only way to switch from user mode to kernel mode on x86 systems, but that changed with the addition of the sysenter/sysexit machine instructions, which perform a more lightweight transition.
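To make the user-to-kernel transition a little more concrete, here is a small Linux-specific sketch that reads from a file descriptor in two ways: through the usual libc read() wrapper, and by passing the system call number to syscall() directly. Both end up in the same kernel code; which machine instruction performs the mode switch is chosen by libc and the kernel and is not visible here. The helper names are hypothetical, and the use of syscall() and SYS_read assumes Linux with glibc or a compatible libc.

```c
#define _DEFAULT_SOURCE 1
#include <sys/syscall.h>
#include <unistd.h>

/* Read through the ordinary libc wrapper. */
static long read_with_wrapper(int fd, void *buf, size_t n)
{
    return (long)read(fd, buf, n);
}

/* Read by naming the system call explicitly (Linux-specific). */
static long read_with_raw_syscall(int fd, void *buf, size_t n)
{
    return syscall(SYS_read, fd, buf, n);
}
```

Given a pipe that contains the bytes "abcd", two consecutive 2-byte reads, one with each helper, return "ab" and then "cd", showing that both paths consume from the same kernel-side object.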
Even if it is (or were) expensive in terms of time consumed, system calls that deal with character and block device drivers should run in kernel mode, because dealing with hardware devices involves reading and writing hardware registers, which may be memory mapped or accessed through I/O ports.
These registers must be protected from any access by userspace processes. Otherwise, any process could skip the established API for reading a file and use the hardware registers directly to read and write to the device. In the case of a disk holding files, this would allow the user process to bypass the filesystem entirely, and with it the whole security and permission system.
So, if we need to protect these hardware registers so that no user process can use them, the code that does use them cannot run at the same privilege level as any other user process. Hence, it runs in another (more privileged) mode, which is what is called "kernel mode".
Think about what would happen if you configured a Linux system so that /dev/sda (usually the main hard disk, on which the root filesystem lives) is readable and writable by anybody and everybody:
# chmod 666 /dev/sda
Doing this is more or less equivalent to exposing the hard disk device to any user process. You can effectively write a program that opens, reads, and writes files stored on this device, but at the same time you can write a program that opens, reads and writes ANY file within the partition, no matter which permissions the files have.
That said, there are cases in which a system runs only trusted applications. This kind of system doesn't need the level of protection that is present in a general purpose system, and hence it can benefit from the increased speed that comes from not depending on layers of APIs to isolate the process from the hardware. The most widely known example would be a video game console. I recall that Windows CE used to run all its programs and device drivers at the same privilege level too.

How are C file I/O operations handled at low level?

To extend the title: I am wondering how the OS handles functions like fwrite, fread, fopen and fclose.
What is actually a stream?
Sorry if I was not clear enough.
BTW I am using GNU/Linux Ubuntu 11.04.
A bit better explanation of what I am trying to ask.
I want to know how files are written to the HDD, how they are read into memory, and how a handle to them is later created. Is the BIOS doing that through drivers?
The C library takes a function like fopen and converts that to the proper OS system call. On Linux that is the POSIX open function. You can see the definition for this in a Linux terminal with man 2 open. On Windows the call would be CreateFile which you can see in the MSDN documentation. On Windows NT, that function is in turn another translation of the actual NT kernel function NtCreateFile.
A stream in the C library is a collection of information stored in a FILE struct. This is usually a 'handle' to the operating system's idea of the file, an area of memory allocated as a 'buffer', and the current read and write positions.
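The relationship between a FILE stream, its buffer, and the underlying OS handle can be observed with a short experiment. The sketch below writes five bytes through a fully buffered stream and asks the OS for the file size (via the descriptor obtained with fileno()) before and after fflush(). The function name is hypothetical, and the expectation that nothing reaches the OS before the flush assumes a POSIX system where a 5-byte write fits in a BUFSIZ-sized stream buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Writes "hello" to a temporary file through a buffered stream and reports
 * the size the OS sees before and after the stream is flushed.
 * Returns 0 on success, -1 on error. */
static int stream_buffer_demo(long *size_before_flush, long *size_after_flush)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    int rc = -1;
    int fd = fileno(fp);          /* the OS-level handle inside the FILE */
    struct stat st;

    /* Request full buffering explicitly, with a library-allocated buffer. */
    if (setvbuf(fp, NULL, _IOFBF, BUFSIZ) != 0)
        goto out;
    /* The data goes into the stream's buffer in user space. */
    if (fwrite("hello", 1, 5, fp) != 5)
        goto out;
    if (fstat(fd, &st) != 0)
        goto out;
    *size_before_flush = (long)st.st_size;

    /* fflush() hands the buffered bytes to the OS with a write call. */
    if (fflush(fp) != 0)
        goto out;
    if (fstat(fd, &st) != 0)
        goto out;
    *size_after_flush = (long)st.st_size;
    rc = 0;
out:
    fclose(fp);
    return rc;
}
```

The size reported before the flush reflects only what the OS has received, which is why the stream's buffer and the file descriptor are best thought of as two separate layers.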
I just noticed you tagged this with 'assembly'. You might then want to know about the really low level details. This seems like a good article.
Now you've changed the question to ask about even lower levels. Well, once the operating system gets a command to open a file, it passes that command to the VFS (Virtual File System). That piece of the operating system looks up the file name, including any directories needed, and does the necessary access checks. If this information is in the RAM cache then no disk access is needed. If not, the VFS sends a read request to the specific file system, which is probably EXT4. The EXT4 file system driver will then determine in which disk block that directory is located and send a read command to the disk device driver.
Assuming that the disk driver is AHCI, it will convert a request to read a block into a series of register writes that will set up a DMA (Direct Memory Access) request. This looks like a good source for some details.
At that point the AHCI controller on the motherboard takes over. It will communicate with the hard disk controller to cooperate in reading the data and writing into the DMA memory location.
While this is going on the operating system puts the process on hold so it can continue with other work. The hardware is taking care of things and the CPU isn't required to pay attention. The disk request will take many milliseconds during which the CPU can run millions of instructions.
When the request is complete the AHCI controller will send an interrupt. One of the system CPUs will receive the interrupt, look in its IDT (Interrupt Descriptor Table) and jump to the machine code at that location: the interrupt handler.
The operating system interrupt handler will read some data, find out that it has been interrupted by the AHCI controller, then it will jump into the AHCI driver code. The AHCI driver will read the registers on the controller, determine that the read is complete, put a marker into its operations queue, tell the OS scheduler that it needs to run, then return. Nothing else happens at this point.
The operating system will note that it needs to run the AHCI driver's queue. When it decides to do that (it might have a real-time task running or it might be reading networking packets at the moment) it will then go read the data from the memory block marked for DMA and copy that data to the EXT4 file system driver. That EXT4 driver will then return the data to the VFS which will put it into cache. The VFS will return an operating system file handle to the open system call, which will return that to the fopen library call, which will put that into the FILE struct and return a pointer to that to the program.
fopen et al are usually implemented on top of OS-specific system calls. On Unix, this means the APIs for working with file descriptors: open, read, write, close, and a few others. On Windows, it's CreateFile, ReadFile, etc.

Resources