I am currently working on writing my first Linux Networking driver and it seems to be going fairly smoothly right now. My network device is going to create an Ethernet interface but forward the Ethernet frames over PCIe to a PCIe endpoint. My question has to do with reception of forwarded Ethernet frames from the PCIe endpoint destined for my interface.
What I would normally do is allocate a large DMA buffer, tell the endpoint (which has bus-mastering capability) where the buffer is, and let it DMA into that buffer. It would then raise an interrupt to signal reception, and I would copy the data into an sk_buff.
My question is this:
In LDD3, it says that I should be able to DMA directly into the sk_buff because all sk_buffs are in DMA-capable memory. When and where do I allocate this buffer and tell the bus master where it is located? Do I do it once at initialization and then, once the bus master has written its first sk_buff and interrupted me to signal reception, allocate a new buffer and write the new location to the device? Can this only be done with polled reception enabled (I think it's called NAPI)?
Thanks in advance for your help.
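One common pattern in Linux Ethernet drivers is what the question guesses at: give every receive slot a buffer at initialization, and when a slot completes, hand the filled buffer up the stack and immediately put a fresh one in its place. Below is a minimal user-space simulation of that refill cycle, not kernel code. The names (rx_ring, rx_ring_poll, fake_device_rx) are made up for illustration, malloc() stands in for netdev_alloc_skb(), and the raw pointer stands in for the bus address a driver would get from dma_map_single(). The same refill step can run from a plain interrupt handler or from a NAPI poll function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define RX_SLOTS    4
#define RX_BUF_SIZE 2048

/* One RX descriptor as a (hypothetical) endpoint would see it. */
struct rx_desc {
    void    *buf;   /* stands in for the DMA address of the buffer */
    uint16_t len;   /* written by the device when a frame lands */
    uint8_t  done;  /* set by the device, cleared by the driver */
};

struct rx_ring {
    struct rx_desc desc[RX_SLOTS];
    unsigned next;  /* next slot the driver expects to complete */
};

/* Initialization: every slot gets a buffer before reception is enabled. */
int rx_ring_init(struct rx_ring *r)
{
    memset(r, 0, sizeof(*r));
    for (unsigned i = 0; i < RX_SLOTS; i++) {
        r->desc[i].buf = malloc(RX_BUF_SIZE);
        if (!r->desc[i].buf)
            return -1;
    }
    return 0;
}

/* Simulated device: "DMA" a frame into the slot's buffer, mark it done. */
void fake_device_rx(struct rx_ring *r, unsigned slot,
                    const void *frame, uint16_t len)
{
    assert(slot < RX_SLOTS && len <= RX_BUF_SIZE);
    memcpy(r->desc[slot].buf, frame, len);
    r->desc[slot].len = len;
    r->desc[slot].done = 1;
}

/*
 * Interrupt/poll path: take ownership of the filled buffer and refill the
 * slot with a fresh one. Returns the filled buffer (caller frees it), or
 * NULL if nothing is ready or no replacement buffer could be allocated.
 */
void *rx_ring_poll(struct rx_ring *r, uint16_t *len)
{
    struct rx_desc *d = &r->desc[r->next];
    void *fresh, *filled;

    if (!d->done)
        return NULL;
    fresh = malloc(RX_BUF_SIZE);
    if (!fresh)
        return NULL;          /* leave the slot as is and retry later */

    filled = d->buf;
    *len = d->len;
    d->buf = fresh;           /* real driver: write new DMA address to HW */
    d->len = 0;
    d->done = 0;
    r->next = (r->next + 1) % RX_SLOTS;
    return filled;
}
```

The point of the design is that no copy is needed: the buffer the device wrote into is the one passed onward, and the slot never sits empty.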
I have a very basic question regarding the Rx/Tx hardware queues in an Ethernet controller: what are they used for?
While looking at the following driver in the Linux kernel, it seems like they are used to carry DMA descriptors?
https://github.com/torvalds/linux/blob/master/drivers/net/ethernet/broadcom/genet/bcmgenet.c#L2276
You are correct, the rx/tx queues contain DMA descriptors for incoming and outgoing packets.
If you are curious how network drivers work, I recommend looking at the ixy userspace network driver: https://github.com/emmericp/ixy
The code is relatively simple and very well commented, and there is a paper which explains how it works: https://www.net.in.tum.de/fileadmin/bibtex/publications/papers/ixy-writing-user-space-network-drivers.pdf
See section 4.1 NIC Ring API in the paper for an explanation of the receive (rx) and transmit (tx) queues:
NICs expose multiple circular buffers called queues or rings to transfer packets. The simplest setup uses only one receive and one transmit queue. Multiple transmit queues are merged on the NIC, incoming traffic is split according to filters or a hashing algorithm if multiple receive queues are configured. Both receive and transmit rings work in a similar way: the driver programs a physical base address and the size of the ring. It then fills the memory area with DMA descriptors, i.e., pointers to physical addresses where the packet data is stored with some metadata. Sending and receiving packets is done by passing ownership of the DMA descriptors between driver and hardware via a head and a tail pointer. The driver controls the tail, the hardware the head. Both pointers are stored in device registers accessible via MMIO.
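To make the head/tail ownership rule from the quoted paragraph concrete, here is a small user-space model of one ring. It is only a sketch: the struct layout and function names are invented for illustration, head and tail are plain variables instead of MMIO registers, and the "hardware" is just a function. Descriptors in the range [head, tail) belong to the hardware; everything else belongs to the driver.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8   /* must be a power of two for the masking below */

/* Hypothetical descriptor: where the packet data lives plus metadata. */
struct dma_desc {
    uint64_t buf_addr;
    uint16_t length;
};

struct ring {
    struct dma_desc desc[RING_SIZE];
    uint32_t head;   /* advanced by the hardware */
    uint32_t tail;   /* advanced by the driver */
};

void ring_init(struct ring *r)
{
    r->head = 0;
    r->tail = 0;
}

/* Number of descriptors currently owned by the hardware: [head, tail). */
uint32_t ring_hw_owned(const struct ring *r)
{
    return (r->tail - r->head) & (RING_SIZE - 1);
}

/*
 * Driver side: fill the descriptor at tail and pass it to the hardware.
 * One slot is always left unused so that head == tail means "empty".
 */
int ring_post(struct ring *r, uint64_t buf_addr, uint16_t length)
{
    uint32_t next = (r->tail + 1) & (RING_SIZE - 1);

    if (next == r->head)
        return -1;                    /* ring is full */
    r->desc[r->tail].buf_addr = buf_addr;
    r->desc[r->tail].length = length;
    r->tail = next;                   /* real driver: MMIO write to tail */
    return 0;
}

/* Simulated hardware: consume the descriptor at head, if there is one. */
int ring_hw_consume(struct ring *r, struct dma_desc *out)
{
    assert(out);
    if (r->head == r->tail)
        return -1;                    /* nothing owned by the hardware */
    *out = r->desc[r->head];
    r->head = (r->head + 1) & (RING_SIZE - 1);
    return 0;
}
```

Because each side only ever writes its own index, driver and hardware never need a shared lock; passing ownership is just moving a pointer.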
My question is going to be rather vague, but I will try to explain in as much detail as I can what I am trying to solve.
While trying to learn the Linux kernel USB stack, I started thinking about writing a simple USB driver for my Atmel evaluation board, which is based on an ARM M0+ MCU, to get away from the Windows tools (a Visual Studio plugin).
I have spent a few days learning the kernel's USB API and have come to a conclusion about how to do this. My driver aims to make the board, connected to a PC through a USB cable, act like a simple USB flash drive. With that in place, I could easily program it with a new version of my firmware.
I have found that I need to identify the specific interface (in the sense of the USB specification, not the code abstraction we usually mean by that word) that holds the endpoint (pipe) responsible for interaction with flash memory. I can then map it to a character device and interact with it using the standard I/O operations described in the struct file_operations structure.
Simply by using cat on the /proc/* file created by the USB Core subsystem, I found that the interface responsible for interaction with flash memory holds a bulk endpoint (again, this term comes from the USB specification, CMIIAW) that acts as a "descriptor". The Linux kernel USB Core subsystem provides neat interfaces for talking to the different kinds of endpoints, whether control, interrupt, bulk or isochronous.
Now I come to my actual question.
The main transfer unit in USB communication is an abstraction called the urb: you allocate it, you fill it, you submit it to the USB Core subsystem, you read it if it was an IN urb and, finally, you free it. What confuses me, and is tightly related to my question, is the following API from include/linux/usb.h:
    static inline void usb_fill_bulk_urb(struct urb *urb,
                                         struct usb_device *dev,
                                         unsigned int pipe,
                                         void *transfer_buffer,
                                         int buffer_length,
                                         usb_complete_t complete_fn,
                                         void *context)
Assume I have obtained information from the board's datasheet about where to write program code. Let's say we have the memory region 0x00100 - 0x10000. I will compile my code, obtain a binary and then, using standard Linux tools or a simple user-space wrapper application, use lseek to set the file offset to 0x00100 and call the write system call with a buffer (the binary compiled previously) and its length.
In kernel space, I will have to allocate a urb in the write system call handler, fill it with the buffer sent from user space and submit this urb to USB Core.
BUT I cannot find a way to specify the OFFSET set earlier by lseek. Am I missing something? Maybe I have missed some concept or, perhaps, I am looking at this the wrong way?
When your embedded Linux device acts as a USB mass storage device, the flash, as a peripheral of the Linux device, is unmounted and the gadget driver is loaded. Linux then loses control of the flash, and the PC connected to your Linux device fully controls it. This is because flash exposed as a USB device can only have one USB host.
The gadget driver works purely in kernel space. It does not receive data from or transmit data to user space. It calls vfs_read() and vfs_write() to access the files on the flash, passing an offset argument. The offset is taken from the USB commands sent by your host, the Windows PC.
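To show where such an offset travels on the wire, here is a sketch that builds the 31-byte Command Block Wrapper a host sends on the bulk OUT endpoint before the data. It assumes the device speaks the USB mass-storage Bulk-Only Transport with SCSI commands, which may not be true for the board in the question. The function name build_write10_cbw is made up for illustration. The logical block address (the "offset") sits inside the SCSI WRITE(10) command, not in any USB API parameter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CBW_LEN        31
#define CBW_SIGNATURE  0x43425355u  /* bytes 'U','S','B','C' on the wire */
#define SCSI_WRITE_10  0x2A

/* CBW header fields are little-endian. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/*
 * Build a CBW that carries a SCSI WRITE(10) for `blocks` blocks starting
 * at logical block address `lba`. `tag` is an arbitrary value the device
 * echoes back in its status wrapper.
 */
void build_write10_cbw(uint8_t cbw[CBW_LEN], uint32_t tag,
                       uint32_t lba, uint16_t blocks, uint32_t block_size)
{
    uint8_t *cdb = cbw + 15;          /* command block starts at byte 15 */

    assert(block_size != 0);
    memset(cbw, 0, CBW_LEN);
    put_le32(cbw + 0, CBW_SIGNATURE);
    put_le32(cbw + 4, tag);
    put_le32(cbw + 8, (uint32_t)blocks * block_size); /* data phase size */
    cbw[12] = 0x00;                   /* flags: bit 7 clear = data OUT */
    cbw[13] = 0;                      /* LUN 0 */
    cbw[14] = 10;                     /* length of the command block */

    /* SCSI fields inside the command block are big-endian. */
    cdb[0] = SCSI_WRITE_10;
    cdb[2] = (uint8_t)(lba >> 24);    /* the "offset", in blocks */
    cdb[3] = (uint8_t)(lba >> 16);
    cdb[4] = (uint8_t)(lba >> 8);
    cdb[5] = (uint8_t)lba;
    cdb[7] = (uint8_t)(blocks >> 8);  /* transfer length, in blocks */
    cdb[8] = (uint8_t)blocks;
}
```

In a kernel driver, these 31 bytes would simply be the transfer_buffer of one bulk OUT urb, followed by further urbs for the data itself.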
There is no way to specify an offset using the USB subsystem's API. I misunderstood the whole concept of USB as a communication protocol, unwise me. You must first learn the underlying protocol your device uses to communicate with others.
If your device acts as a USB HID device, then learning the specification for exchanging data with USB HID devices is the way to go. If the protocol is proprietary, you can do nothing but reverse engineer it (capturing USB packets with a sniffer on a system where a driver for your device exists).
As for my board, it has an embedded debugger that also serves as a communication module. Specifically, my device is equipped with EDBG, and here is a link to a description of the protocol it uses for communication.
I want to send data packets onto the network, bypassing the Linux network stack. Is there any way I can hook into the network card driver and place a frame directly in the network card's buffer to send it onto the network? I am a newbie in Linux kernel hacking, so any guidance on how to get started would be very helpful.
You would be better off if you used some virtual device like TAP. You can easily hack a control interface into the TAP kernel module, via which you can then pass frames ready to be sent out to the driver. That approach can be compared to the performance of a regular socket application as the baseline. Since in the end the TAP device will "send" out egress frames via a character device, you can easily write a test application measuring performance and latency.
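For the user-space side of such a test application, a minimal sketch could look like the following. It opens a TAP interface through the standard /dev/net/tun character device and builds a raw Ethernet frame by hand. The helper names (tap_open, build_eth_frame) are made up for illustration, creating the interface needs CAP_NET_ADMIN, and none of this modifies the TAP module itself as the answer suggests. Note the direction: a frame written to the descriptor appears to the kernel as received on the TAP interface, while frames the kernel transmits on that interface are read from the descriptor.

```c
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

#define ETH_HDR_LEN 14   /* 6 bytes dst MAC + 6 bytes src MAC + 2 type */

/* Create or attach to a TAP interface; returns an fd, or -1 on error. */
int tap_open(const char *name)
{
    struct ifreq ifr;
    int fd = open("/dev/net/tun", O_RDWR);

    if (fd < 0)
        return -1;
    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI;   /* raw Ethernet, no extra header */
    strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);
    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Assemble an Ethernet II frame in `out`. Returns the frame length,
 * or 0 if `cap` is too small to hold header plus payload.
 */
size_t build_eth_frame(uint8_t *out, size_t cap,
                       const uint8_t dst[6], const uint8_t src[6],
                       uint16_t ethertype,
                       const void *payload, size_t payload_len)
{
    if (cap < ETH_HDR_LEN + payload_len)
        return 0;
    memcpy(out, dst, 6);
    memcpy(out + 6, src, 6);
    out[12] = (uint8_t)(ethertype >> 8);   /* EtherType is big-endian */
    out[13] = (uint8_t)(ethertype & 0xff);
    memcpy(out + ETH_HDR_LEN, payload, payload_len);
    return ETH_HDR_LEN + payload_len;
}
```

A test program would call tap_open("tap0"), build a frame, and pass it to write() on the returned descriptor, timing the calls to compare against a regular socket.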
I am a bit confused regarding DMA transfers with a PCIe device.
Say, for example, I have a slave PCIe device, and I want to transfer a block of data from the device to RAM using a DMA transaction. Note that the device is a slave and does not have a DMA "machine" on it.
I know I need to obtain a DMA-able buffer in RAM (either by allocating a coherent one, or by mapping a page) first.
But what's next? What is the API to start a DMA transfer of N bytes from address S to address D?
Can modern systems issue a DMA transfer to/from a slave PCI device? If so, what is the Linux API for that?
As explained here:
[ISA]
In the original IBM PC, there was only one Intel 8237 DMA controller [...]
A PCI architecture has no central DMA controller, unlike ISA. Instead, any PCI component can request control of the bus ("become the bus master") and request to read from and write to system memory
The PCI bus does not have a "central" DMA controller - instead, each device can be a DMA "controller".
First of all, there are no slaves and masters as such inside a modern PC. There is a south bridge (in PCI) or a Root Complex (the root of the PCI Express device tree), and there are other PCI/PCIe actors such as bridges, soldered chips, plugged-in cards, hardware debuggers, etc. I'll assume that you are asking about a plugged-in card or some other peripheral device, like a soldered sound card or Ethernet chip.
According to this detailed description of "Transaction Layer Packet" (TLP, "PCIe’s uppermost layer"), there is "Bus Mastership (DMA)":
On PCIe, it’s significantly less exotic. ... anyone on the bus can send read and write TLPs on the bus, exactly like the Root Complex. This allows the peripheral to access the CPU’s memory directly (DMA) or exchange TLPs with peer peripherals (to the extent that the switching entities support that).
DMA capability in plugged-in devices also has a well-known downside: the DMA attack. PCIe is listed there as capable of initiating DMA transfers:
Systems may be vulnerable to a DMA attack by an external device if they have a FireWire, ExpressCard, Thunderbolt, or other expansion port that, like PCI and PCI-Express in general, hooks up attached devices directly to the physical address space.
I think there is no universal API for programming DMA transfers that are initiated by the peripheral device itself. It depends on what the device is, when the DMA should be started and what will be sent.
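What "depends on the device" usually comes down to is a set of device-specific registers that the driver fills in before telling the device to go. The sketch below models that with an entirely hypothetical register block (fake_dma_regs and its bit definitions are invented here; a real layout comes from the device's datasheet) and runs in user space against a plain struct. In a real Linux driver, the bus address would come from dma_alloc_coherent() or dma_map_single(), and the registers would be reached with ioread32()/iowrite32() on a mapped BAR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical DMA register block of a bus-mastering device. */
struct fake_dma_regs {
    volatile uint32_t dst_lo;   /* destination bus address, low 32 bits */
    volatile uint32_t dst_hi;   /* destination bus address, high 32 bits */
    volatile uint32_t src_off;  /* source offset inside device memory */
    volatile uint32_t len;      /* number of bytes to transfer */
    volatile uint32_t ctrl;     /* control/status bits */
};

#define DMA_CTRL_START   (1u << 0)  /* set by driver, cleared when done */
#define DMA_CTRL_IRQ_EN  (1u << 1)  /* raise an interrupt on completion */

/*
 * Ask the device to copy `len` bytes from its own memory at `dev_off`
 * into host RAM at `bus_addr`. Returns 0 on success, -1 if the request
 * is empty or a transfer is still in flight.
 */
int dma_start_dev_to_ram(struct fake_dma_regs *regs, uint64_t bus_addr,
                         uint32_t dev_off, uint32_t len)
{
    assert(regs);
    if (len == 0 || (regs->ctrl & DMA_CTRL_START))
        return -1;

    regs->dst_lo  = (uint32_t)(bus_addr & 0xffffffffu);
    regs->dst_hi  = (uint32_t)(bus_addr >> 32);
    regs->src_off = dev_off;
    regs->len     = len;
    /* The control register is written last so the device only starts
     * once all parameters are in place. */
    regs->ctrl    = DMA_CTRL_START | DMA_CTRL_IRQ_EN;
    return 0;
}
```

The CPU never moves the data here; it only describes the transfer, and the device itself issues the memory writes on the bus.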
I'm working on a real-time control system that calculates the control signals in a buffered fashion (in a user-mode program) and outputs the array to the USB device through isochronous transfers. The USB device then reports the execution progress through interrupt transfers, so that the PC software can calculate and push the next control array.
The software is written in C on top of the raw Win32 API. (C++ is used only in the parts of the program that are not time-sensitive, such as the interface, 3D models...).
I would like to know if there is a way to register a callback function in response to an interrupt transfer?
From what I understand, although we are talking about interrupt transfers, the USB device still has to be polled using libusb_interrupt_transfer:
Interrupt transfers are typically non-periodic, small device "initiated" communication requiring bounded latency. An Interrupt request is queued by the device until the host polls the USB device asking for data.
Excerpt from https://www.beyondlogic.org/usbnutshell/usb4.shtml#Interrupt