How does Linux kernel discover PCI devices? - c

On the driver side, pci_register_driver() is called when a driver module is loaded, or at boot time if the module is built-in. (Whenever a device/driver is added, driver/device list is looped to find a match, I get that part.)
But where/when are pci devices discovered and registered with the bus? I imagine this is arch specific, and would involve BIOS on x86, such as - BIOS routine probe PCI devices and then put the results in a list some where in RAM, before loading the kernel, and each list entry contains information of a single pci device including vendorId/deviceId etc. Kernel then pick up the list and insert them into pci_bus_type.p.klist_devices at some point. But this is pure guess, can anyone give some hints?

Actually, BIOS need not be involved.
PCI standardizes a certain procedure for discovery of devices on the bus. This procedure can be triggered at any time (not exclusively on boot) by hotplug controller or even manually, via /sys/bus/pci/rescan (see pci_rescan_bus).
The scan will proceed recursively, traversing the bridges as discovered and reading the configuration space data off each device encountered (see PCI configuration space).
For each device found, if not yet active, the kernel will look for an instance of pci_driver object with a matching pci_device_id. Then it will call the probe method of that object (the rest is driver implementation specific).
If appropriate pci_driver instance is not found, kernel will emit an event to an user space daemon (udev or hotpug or whatever), which may load an appropriate module and create the necessary pci_driver object.

Related

How to use hyperbus driver file?

I am so confused to understand hyperbus to communicate between Traveo II Body Evaluation Board to one hyperram, I read the datasheet (https://www.cypress.com/part/s27kl0641dabhi020) but still not understand how to implement example code. so for that, I search on google and I found this link (https://github.com/torvalds/linux/blob/master/drivers/mtd/hyperbus/hyperbus-core.c) but not found any example code yet so anyone helps me or anyone provides the useful link or one example code?
In Linux, drivers/mtd/hyperbus/ implements hyperbus as a Memory Technology Device, also known as an MTD device; so, the hyperbus is accessed via the Linux MTD subsystem. mtd-utils contains example code for accessing the MTD subsystem; see http://linux-mtd.infradead.org/ for further links.
The Linux MTD hyperbus subsystem exports a character device (/dev/mtdN). An userspace application opens this device (read-write), then does an MEMGETINFO ioctl() to obtain the MTD device information into a struct mtd_info_user structure (defined in <mtd/mtd-user.h>). Operations to read, write, and erase blocks are done via ioctl()s (defined in <mtd/mtd-abi.h>). (On a Linux workstation, you'll find these in /usr/include/mtd/ if standard C development packages have been installed; on Debian derivatives, they're provided by the linux-libc-dev package.)
On these machines, Device Tree is used to describe the actual hardware, including hyperbus. You do not "configure" the hyperbus from the userspace when you open the character device; each hyperbus is described by a device tree blob (.dtb) or an overlay the kernel loads (or receives from the bootloader like uboot) at bootup. I assume you already have the Device Tree etc. configured, since you are asking about how to use the character device; the character device(s) are created by the kernel from the Device Tree descriptions, and only show up if the hyperbus(es) have been configured.

What does dev_net_set do in Linux?

I am writing a simple net device driver based on the loopback driver and want to register my net_device structure. This and that page on writing a net device say to just call register_netdev. But they're writing fancy drivers with PCI express and other complicated things.
So, if I just want something like the loopback driver, I should presumably base my code on loopback.c. My question is, what does the first line of this code in loopback_net_init do:
dev_net_set(dev, net);
err = register_netdev(dev);
Apparently net is determined by this code in net_namespace.c:
register_pernet_device(ops) ...
__register_pernet_operations(list, ops)
for_each_net(net) ...
What is this looping for? What might go wrong if I skip the dev_net_set call? Why are others not using it?
AFAIK, net is a structure that will allow the kernel to interact with the device. You need it to register the device and remove it in the module cleanup function. Please review the code under linux/net/8021q/ for examples.
AFAIK, looping happens at the level of sockets (layer 5-7), whereas net_dev is used as the kernel component that immediately interacts with the driver, when you actually want to use a say, ethernet card, or SLIP,PLIP for transmitting frames (layer 2-0). Loopback happens at the level of the network subsystem of the kernel, and lies way above the drivers which interact with the hardware. So I don't see why you would need a driver to use the loopback feature. However, there is also a provision for registering a dummy device with net_dev, though I don't know if that is what you are looking for.
That said, if your intention is to simply use some driver that simulates an actual physical device without one and say, reflects the packets that it recieves, that is possible too. Basically till the net_dev layer, the kernel does all the protocol stuff (TCP/IP), and finally passes off the packet to some handle that the device driver registers with the net_dev or something similar. Similarly on receiving stuff, the device triggers an interrupt, the driver does a DMA operation, and the kernel takes over from there. Hence instead of the code for doing the DMA operation, you can make a module that simply pass over a static packet, that is compatible with ethernet/TCP/IP . In a vast majority of cases, all these aspects (the network and other subsystems) are agnostic to the underlying bus details, i.e. it shouldn't matter whether the ethernet card is connected to PCI or ISA but there can be exceptions. Thus, IMHO, you are trying to do something that should only be attempted after having a thorough understanding of the network subsystem, and a good enough understanding of the kernel as a whole. Till then you will be shooting in the dark. Sometimes you may hit, but often-times you will miss.
http://man7.org/linux/man-pages/man8/ip-netns.8.html
A network namespace is logically another copy of the network stack,
with its own routes, firewall rules, and network devices.
So for_each_net is looping over these namespaces and creating a copy of all "per net" network devices in each one.
Use ip netns list to determine whether you are using network namespaces. Often they are not used, so drivers do not necessarily need to use dev_net_set.

SPI kernel module, how to istantiate the driver?

I've been tasked to import the spi driver into an existing platform running Openwrt.
After "successfully" build the full Openwrt: packages and the kernel matching the one running into the platform, including the spidev kernel module I run into some trouble in make this module work.
**insmod** of the driver terminates without error, and I see that in the **/sys/class** the creation of the directory **spidev**, but it is empty.
Looking at the code of the spidev kernel module, the function **probe** has caught my eye. My feel is that this function is what actually allocates the minor device number an make the device available to be used. But it's not clear to me who, or what should call it.
Another doubt I have is about the architecture. The spi depends on the underlaying architecture, which in my case is the MT7620 soc which is a mipsel architecture have a specific spi code. In my understanding this SOC specific code is wrapped by the spidev kernel module and the link between these two entities should be
status = spi_register_driver(&spidev_spi_driver);
in the
static int __init spidev_init(void)
function.
Again, I'm far to be sure of what I'm writing and I'm here asking for directions.
First, you may want to read the Linux Device Driver - Chapter 14.
Generally speaking, in the Linux kernel you have devices and drivers on a bus (subsystem). You have to register them using the functions register_device() and register_driver() (the exact name depends on the subsystem). When a device or a driver get registered the subsystem will try to match devices with drivers. If they match, the subsystem calls the driver's function probe(). So, the execution of the probe() function can be triggered by the driver registration or the device registration. This also means that the device exists before the execution of probe()
Back to your specific case: SPI. To register a device instance you need to use spi_alloc_device and spi_add_device, or spi_new_device. Where to put this code? This depends on your need. Typically, the SPI devices are declared in some architecture file or device-tree description. Probably you can also do it in a module (not sure).
The spidev is a Linux device driver that exports the SPI raw interface to the user-space. This means that once you register an SPI device instance the driver spidev take control of it. Actually, the spidev does nothing: it waits for an user-space program to read/write data on the SPI bus. So, you will end up writing the driver for your device in userspace.

How to unsafely remove blockdevice driver in Linux

I am writing a block device driver for linux.
It is crucial to support unsafe removal (like usb unplug). In other words, I want to be able to shut down the block device without creating memory leaks / crashes even while applications hold open files or performing IO on my device or if it is mounted with file system.
Surely unsafe removal would possibly corrupt the data which is stored on the device, but that is something the customers are willing to accept.
Here is the basics steps I have done:
Upon unsafe removal, block device spawns a zombie which will automatically fail all new IO requests, ioctls, etc. The zombie substitutes make_request function and changes other function pointers so kernel would not need the original block device.
Block device waits for all IO which is running now (and use my internal resources) to complete
It does del_gendisk(); however this does not really free's kernel resources because they are still used.
Block device frees itself.
The zombie keeps track of the amount of opens() and close() on the block device and when last close() occurs it automatically free() itself
Result - I am not leaking the blockdevice, request queue, gen disk, etc.
However this is a very difficult mechanism which requires a lot of code and is extremely prone to race conditions. I am still struggling with corner cases, per_cpu counting of io's and occasional crashes
My questions: Is there a mechanism in the kernel which already does that? I searched manuals, literature, and countless source code examples of block device drivers, ram disks and USB drivers but could not find a solution. I am sure, that I am not the first one to encounter this problem.
Edited:
I learned from the answer below, by Dave S about the hot-plug mechanism but it does not help me. I need a solution of how to safely shut down the driver and not how to notify the kernel that driver was shut down.
Example of one problem:
blk_queue_make_request() registers a function through which my block devices serves IO. In that function I increment per_cpu counters to know how many IO's are in flight by each cpu. However there is a race condition of function being called but counter was not increased yet, so my device thinks there are 0 IO's, releases the resources and then IO comes and crashes the system. Hotplug will not assist me with this problem as far as I understand
About a decade ago I used hotplugging on a software driver project to safely add/remove an external USB disk drive which interfaced to an embedded Linux driven Set-top Box.
For your project you will also need to write a hot plug. A hotplug is a program which is used by the kernel to notify user mode software when some significant (usually hardware-related) events take place. An example is when a USB device has just been plugged in or removed.
From Linux 2.6 kernel onwards, hotplugging has been integrated with the driver model core so that any bus or class can report hotplug events when devices are added or removed.
In the kernel tree, /usr/src/linux/Documentation/usb/hotplug.txt has basic information about USB Device Driver API support for hotplugging.
See also this link, and GOOGLE as well for examples and documentation.
http://linux-hotplug.sourceforge.net/
Another very helpful document which discusses hotplugging with block devices can be found here:
https://www.kernel.org/doc/pending/hotplug.txt
This document also gives a good example of illustrating hotplug events handling:
Below is a table of the main variables you should be aware of:
Hotplug event variables:
Every hotplug event should provide at least the following variables:
ACTION
The current hotplug action: "add" to add the device, "remove" to remove it.
The 2.6.22 kernel can also generate "change", "online", "offline", and
"move" actions.
DEVPATH
Path under /sys at which this device's sysfs directory can be found.
SUBSYSTEM
If this is "block", it's a block device. Anything other subsystem is
either a char device or does not have an associated device node.
The following variables are also provided for some devices:
MAJOR and MINOR
If these are present, a device node can be created in /dev for this device.
Some devices (such as network cards) don't generate a /dev node.
DRIVER
If present, a suggested driver (module) for handling this device. No
relation to whether or not a driver is currently handling the device.
INTERFACE and IFINDEX
When SUBSYSTEM=net, these variables indicate the name of the interface
and a unique integer for the interface. (Note that "INTERFACE=eth0" could
be paired with "IFINDEX=2" because eth0 isn't guaranteed to come before lo
and the count doesn't start at 0.)
FIRMWARE
The system is requesting firmware for the device.
If the driver is creating device it could be possible to suddenly delete it:
echo 1 > /sys/block/device-name/device/delete where device-name may be sde, for example,
or
echo 1 > /sys/class/scsi_device/h:c:t:l/device/delete, where h is the HBA number, c is the channel on the HBA, t is the SCSI target ID, and l is the LUN.
In my case, it perfectly simulates scenarios for crushing writes and recovery of data from journaling.
Normally to safely remove device more steps is needed so deleting device is a pretty drastic event for data and could be useful for testing :)
please consider this:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/online_storage_reconfiguration_guide/removing_devices
http://www.sysadminshare.com/2012/09/add-remove-single-disk-device-in-linux.html

Linux Device Tree: How to make the device file?

On my ARM system (Tegra based), I'm running the mainline linux kernel. It uses the device tree system.
I have enabled a hardware driver for the General-Memory-Bus (part of the SoC) in the .dts file by setting its status="okay". Recompiled the dtb and booted the kernel. But no device (/dev/xx) appears.
The driver is compiled into the kernel and can be seen by
cat /lib/modules/$(uname -r)/modules.builtin
The command
cat /sys/firmware/devicetree/base/<path to device>/status
returns "okay".
Do I need to make some kind of "mknod"?
What else is nessesary?
The traditional UNIX "stream of bytes" device model is a pretty high-level abstraction of most modern hardware, and as such there are plenty of drivers which do not create /dev entries for the devices they control largely because they don't fit that model. Bus drivers in particular are very much a case of that - they exist, but only for the sake of discovering and allowing access to the devices behind them; there is no /dev/sata that lets you interact with the actual host controller, sending out raw commands on any old port regardless of what's connected or not; there is no /dev/usb that lets you attempt arbitrary transfers to arbitrary endpoints which may or may not exist.
Furthermore, your typical 'external interface' controller as in this case is orders of magnitude less complex than an interface like SATA or USB - the 'device' itself is often little more than a register block controlling some clocks and a chip-select multiplexer. Even if the driver did create something you could interact with directly, there's not exactly much you could do with it.
The correct way to proceed in this situation is to describe your FPGA device in the DT as a child of the GMI bus, accurately reflecting the hardware, no less, then develop your own driver for that. The bus driver itself just sits transparently in the middle. And if you do want a quick and dirty way to get started by just reading and writing bus addresses directly, well, it's behind a memory-mapped I/O region; that's exactly what /dev/mem exists for.

Resources