How to offload NAPI poll function to workqueue - c

I'm working with Linux 3.3 and the Ethernet driver for the smsc911x, and I want to move the NAPI poll function to a workqueue.
My questions are :
1. How do I pass the NAPI poll function arguments to the work_struct?
2. How do I get the NAPI poll function arguments back from the work_struct? (related to Q.1 above)
3. How can I return the npackets value to the original NAPI poll function caller?
Here are some explanations :
The current NAPI poll function reads the receive FIFO directly, and I want to change it to use the DMA controller. For this DMA, I trigger the transfer, sleep with wait_event_interruptible, and get woken up by the DMA's ISR with wake_up_interruptible. As you know, the NAPI poll function runs in interrupt context (softirq), so I cannot sleep there waiting for DMA completion. I want to move the NAPI poll function (reading the RX FIFO) to a workqueue (process context) using a work_struct.
The problem is that the NAPI poll function is called by the kernel with two arguments: struct napi_struct *napi and int budget.
I want to pass those arguments to the work_struct and queue the work_struct to the workqueue (using the queue_work function).
The work_struct looks like this (include/linux/workqueue.h):
struct work_struct {
atomic_long_t data;
struct list_head entry;
work_func_t func;
#ifdef CONFIG_LOCKDEP
struct lockdep_map lockdep_map;
#endif
};
I assume that atomic_long_t data is meant for passing an argument to the work function. How can I pass the arguments through the work_struct?
I tried this (I added a member struct work_struct rx_work to the driver's device structure, struct smsc911x_data, to hold the work item):
struct work_arg { // a new struct for pass the arguments
struct napi_struct *napi;
int budget;
};
/* NAPI poll function */
static int smsc911x_poll(struct napi_struct *napi, int budget) {
struct smsc911x_data *pdata =
container_of(napi, struct smsc911x_data, napi);
struct net_device *dev = pdata->dev;
int npackets = 0;
if (enable_rx_use_dma == 1) { // when using DMA for FIFO read
prom_printf("moving it to workqueue\n");
struct work_arg *p;
p = kzalloc(sizeof(struct work_arg), GFP_KERNEL);
p->napi = napi;
p->budget = budget;
pdata->rx_work.data = (atomic_long_t) p; // <== THIS LINE
prom_printf("queue work, with napi = %x, budget = %d\n", napi, budget);
queue_work(rx_work_workqueue, &pdata->rx_work); // smsc911x_poll_work
} else {
/* original NAPI poll function: reads the FIFO until it's empty, enables the RX interrupt, and
 * keeps the number of processed packets in npackets. */
}
return npackets;
}
For "THIS LINE" above, I get a compile error.
With pdata->rx_work.data = p; , I get error: incompatible types when assigning to type 'atomic_long_t' from type 'struct work_arg *'
With pdata->rx_work.data = (atomic_long_t) p; , I get error: conversion to non-scalar type requested.
Also, in the new work function, how can I extract the original arguments? I tried the code below, which gives me errors.
/* New work function called by the default worker thread */
static int smsc911x_poll_work(struct work_struct *work) {
struct smsc911x_data *pdata =
container_of(work, struct smsc911x_data, rx_work);
struct net_device *dev = pdata->dev;
int npackets = 0;
struct napi_struct *napi = (struct work_struct *)work->data.napi; // <== THIS LINE
int budget = (struct work_struct *)work->data.budget; // <== THIS LINE ..
}
From the 'THIS LINE's above, I get the errors below.
error: 'atomic_long_t' has no member named 'napi'
error: 'atomic_long_t' has no member named 'budget'
I also don't know how to pass the return value to the original NAPI poll function caller.
I'm not sure if this kind of conversion (from NAPI poll to workqueue) is possible.
Sorry for the long questions but any help will be greatly appreciated.
ADD: Because struct smsc911x_data has both struct napi_struct napi; and struct work_struct rx_work; as members, I can easily obtain the struct napi_struct *napi from work_struct *work (the argument of the work function) by:
struct smsc911x_data *pdata =
container_of(work, struct smsc911x_data, rx_work);
struct napi_struct *napi = &pdata->napi;
So maybe I can just pass the int budget through a new member in struct smsc911x_data. I still want to know the correct practice for this case.

How do I pass the NAPI poll function arguments to the work_struct?
Just create a new structure that embeds a work_struct, and add your arguments to it:
struct my_work {
struct work_struct base_work;// Embedded work_struct
struct napi_struct *napi; // Your arguments
int budget;
};
static int smsc911x_poll(struct napi_struct *napi, int budget) {
struct my_work* p = kmalloc(sizeof(*p), GFP_ATOMIC /* Flag usable for interrupt context */);
INIT_WORK(&p->base_work, smsc911x_poll_work); // Initialize the underlying structure.
p->budget = budget; // Initialize your members
p->napi = napi;
...
}
How do I get the NAPI poll function arguments back from the work_struct? (related to Q.1 above)
Use container_of:
static void smsc911x_poll_work(struct work_struct *work) { // work functions return void (work_func_t)
struct my_work* p = container_of(work, struct my_work, base_work);
...
}
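The embed-and-recover idea from the two answers above can be tried outside the kernel. The following is only a user-space sketch: struct mock_work, poll_work_create(), mock_run_work() and the budget_seen field are hypothetical stand-ins invented for illustration, and the container_of macro is a simplified version of the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified user-space stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Mock of struct work_struct: only the callback pointer is modelled. */
struct mock_work {
    void (*func)(struct mock_work *work);
};

/* Wrapper that embeds the work item next to the arguments it carries. */
struct poll_work {
    struct mock_work base_work;  /* embedded work item */
    void *napi;                  /* stands in for struct napi_struct * */
    int budget;
    int *budget_seen;            /* hypothetical: where the handler reports back */
};

/* Work handler: recovers the wrapper from the embedded member, then frees it. */
static void poll_work_fn(struct mock_work *work)
{
    struct poll_work *p = container_of(work, struct poll_work, base_work);

    *p->budget_seen = p->budget;
    free(p);
}

/* Allocates a wrapper and fills in the arguments. In a driver, this is
 * where INIT_WORK() and queue_work() would be called. */
static struct mock_work *poll_work_create(void *napi, int budget, int *budget_seen)
{
    struct poll_work *p = malloc(sizeof(*p));

    if (!p)
        return NULL;
    p->base_work.func = poll_work_fn;
    p->napi = napi;
    p->budget = budget;
    p->budget_seen = budget_seen;
    return &p->base_work;
}

/* Runs a work item the way a worker would: it only sees the embedded pointer. */
static void mock_run_work(struct mock_work *work)
{
    assert(work != NULL);
    work->func(work);
}
```

Because the handler receives only a pointer to the embedded member, the arguments travel with the allocation itself, and the handler is the natural place to free it.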
How can I return the npackets value to the original NAPI poll function caller?
As I understand from the description (see, e.g., http://www.linuxfoundation.org/collaborate/workgroups/networking/napi), this function should process the packets that are ready, and that processing should be done within the function itself, without deferring to a workqueue or similar.

This approach seems very inefficient, since you need two interrupts: one when the packet is received and one when the DMA transfer is done.
I think this is how DMA-capable network interfaces normally work:
1. Before packets arrive, socket buffers are already allocated and mapped as DMA buffers, and the DMA is armed.
2. A packet is transferred from the NIC to the socket buffer through DMA.
3. The NIC raises a hardware interrupt (when the DMA transfer is done).
4. The hardware interrupt handler schedules the packet-receiving software interrupt (SOFTIRQ).
5. The SOFTIRQ calls NAPI poll() for further processing.
6. NAPI poll() processes the packets in the DMA buffers, passes them to the upper layers as sk_buffs, and initializes new DMA buffers. If all pending packets are processed before the quota is used up, the IRQ is re-enabled and NAPI is told to stop polling.
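The budget handling described in the last step can be illustrated with a small user-space simulation. This is not driver code: struct mock_rx_ring, mock_rx_irq() and mock_rx_poll() are hypothetical names, and the "DMA" is just an array of flags.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 8

/* Hypothetical model of an RX descriptor ring that the NIC fills by DMA. */
struct mock_rx_ring {
    bool filled[RING_SIZE];  /* true once "DMA" has completed for that slot */
    int next;                /* next slot the poll function will look at */
    bool irq_enabled;        /* models the RX interrupt enable bit */
    bool polling;            /* models NAPI being scheduled */
};

/* Models the hard IRQ handler: mask the RX interrupt and schedule polling. */
static void mock_rx_irq(struct mock_rx_ring *ring)
{
    ring->irq_enabled = false;
    ring->polling = true;
}

/* Models a NAPI-style poll: consume at most `budget` completed buffers.
 * If the ring is drained before the budget runs out, stop polling and
 * re-enable the interrupt. Returns the number of packets processed. */
static int mock_rx_poll(struct mock_rx_ring *ring, int budget)
{
    int done = 0;

    while (done < budget && ring->filled[ring->next]) {
        ring->filled[ring->next] = false;          /* hand packet up, re-arm slot */
        ring->next = (ring->next + 1) % RING_SIZE;
        done++;
    }
    if (done < budget) {
        ring->polling = false;
        ring->irq_enabled = true;
    }
    return done;
}
```

The return value is the piece the caller relies on: returning the full budget signals that more work may be pending, while a smaller value goes together with re-enabling the interrupt.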

Related

Why do some linux header files define a function to return 0 after the declaration?

I am looking at the Linux 4.14 kernel's include/linux/clk.h file, and have noticed that some of the functions are declared, and then later defined to return 0 or NULL.
For example:
struct clk *clk_get(struct device *dev, const char *id);
...
static inline struct clk *clk_get(struct device *dev, const char *id)
{
return NULL;
}
What is the purpose of doing this? I see multiple C source files that fully define this function and still include linux/clk.h.
The Linux kernel comes with many configuration options. For this particular function, you get the real implementation only if the CONFIG_HAVE_CLK option is defined; otherwise the inline stub that returns NULL is used:
#ifdef CONFIG_HAVE_CLK
/**
* clk_get - lookup and obtain a reference to a clock producer.
* @dev: device for clock "consumer"
* @id: clock consumer ID
*
* Returns a struct clk corresponding to the clock producer, or
* valid IS_ERR() condition containing errno. The implementation
* uses @dev and @id to determine the clock consumer, and thereby
* the clock producer. (IOW, @id may be identical strings, but
* clk_get may return different clock producers depending on @dev.)
*
* Drivers must assume that the clock source is not enabled.
*
* clk_get should not be called from within interrupt context.
*/
struct clk *clk_get(struct device *dev, const char *id);
[...]
#else /* !CONFIG_HAVE_CLK */
static inline struct clk *clk_get(struct device *dev, const char *id)
{
return NULL;
}
[...]
This parameter is defined in arch/Kconfig as:
config HAVE_CLK
bool
help
The <linux/clk.h> calls support software clock gating and
thus are a key power management tool on many systems.
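The same declare-or-stub pattern can be reproduced in a few lines. In this sketch MYLIB_HAVE_WIDGET, struct widget, widget_get() and widget_available() are hypothetical names that only mirror the structure of clk.h.

```c
#include <assert.h>
#include <stddef.h>

struct widget { int id; };

/* MYLIB_HAVE_WIDGET is a hypothetical switch standing in for CONFIG_HAVE_CLK. */
#ifdef MYLIB_HAVE_WIDGET
/* Feature enabled: only declare; some .c file provides the real definition. */
struct widget *widget_get(int id);
#else
/* Feature disabled: an inline stub lets callers compile and link unchanged. */
static inline struct widget *widget_get(int id)
{
    (void)id;
    return NULL;
}
#endif

/* Caller code is identical in both configurations. */
static int widget_available(int id)
{
    return widget_get(id) != NULL;
}
```

With the switch undefined, callers such as widget_available() need no #ifdef of their own, which is the point of shipping the stub in the header.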

Array's data is changed if I don't printf it

I am writing a C program on Eclipse to communicate from my ARM Cortex M4-F microcontroller in I2C with its master, another MCU.
In my I2C library, I use a static global variable to store all the parameters of the communication (address, lengths, data buffers). The issue is that part of this variable (an array containing the data to be transmitted, which are 8-bit integers) gets modified when the interrupt (a Start condition followed by the slave's address on the I2C bus) happens, even before the code I put in the handler executes. It gets set to 8, whatever the initial value.
I tried putting breakpoints basically everywhere, and a watchpoint on the variable. The change seemingly arises from nowhere: not in the while loop, and before the call to my_I2C_Handler(), so the interrupt is apparently the cause.
I also tried setting the variable as volatile, but that changed nothing.
I noticed one interesting thing: putting a printf of the array's data during my_I2C_Init() or my_SlaveAsync(), like so:
printf("%d\n", req.tx_data[0]);
corrects this problem, but why? I want to remove all prints after debugging.
#include <stdint.h>
#include <stdio.h> // for printf
#include "my_i2c.h"
void I2C1_IRQHandler(void)
{
printf("\nI2C Interrupt\n");
my_I2C_Handler(MXC_I2C1); // MXC_I2C1 is a macro for the registry used
}
int main(void)
{
int error = 0;
printf("\nStarting I2C debugging\n");
// Setup the I2C
my_I2C_Shutdown(MXC_I2C1);
my_I2C_Init(MXC_I2C1);
NVIC_EnableIRQ(I2C1_IRQn); // Enable interrupt
my_I2C_SlaveAsync(MXC_I2C1); // Prepare to receive communication
while (1)
{
LED_On(0);
LED_Off(0);
}
printf("\nDone testing\n");
return 0;
}
The structure of the request containing the parameters of the communication is like this:
typedef struct i2c_req i2c_req_t;
struct i2c_req {
uint8_t addr; // I2C 7-bit Address
unsigned tx_len; // Length of tx data
unsigned rx_len; // Length of rx
unsigned tx_num; // Number of tx bytes sent
unsigned rx_num; // Number of rx bytes sent
uint8_t *tx_data; // Data for master write/slave read
uint8_t *rx_data; // Data for master read/slave write
};
It is declared like so at the beginning of the file:
static i2c_req_t req;
and assigned this way in my_I2C_Init():
uint8_t rx[1] = {0};
uint8_t tx[1] = {12};
req.addr = 0xAA;
req.tx_data = tx;
req.tx_len = 1;
req.rx_data = rx;
req.rx_len = 1;
req.tx_num = 0;
req.rx_num = 0;
Many thanks for your help
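One possible cause, assuming rx and tx are declared as local variables inside my_I2C_Init() as the snippet suggests: their storage lives on the stack and ends when the function returns, so req.tx_data is left pointing at stack memory that later calls or the exception entry (which pushes registers onto the stack on Cortex-M) may overwrite. A printf would then only change the stack layout rather than fix anything. A minimal sketch of the usual remedy, giving the buffers static storage; mini_req_t, tx_buf, mini_init() and mini_first_tx_byte() are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

/* Trimmed-down, hypothetical stand-in for i2c_req_t. */
typedef struct {
    uint8_t *tx_data;
    unsigned tx_len;
} mini_req_t;

static mini_req_t req;

/* Static storage: this buffer lives for the whole program, so the
 * pointer stored in req stays valid after mini_init() returns. */
static uint8_t tx_buf[1] = {12};

static void mini_init(void)
{
    req.tx_data = tx_buf;
    req.tx_len = 1;
}

/* A later call that uses stack space of its own before reading the buffer. */
static uint8_t mini_first_tx_byte(void)
{
    volatile uint8_t scratch[64] = {0};

    (void)scratch;
    return req.tx_data[0];
}
```

If the arrays in the real code are in fact globals or statics already, this explanation does not apply and the cause lies elsewhere.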

How to share a hardware abstraction struct without making it global

This is a question about sharing data that is "global", mimicking a piece of addressable memory that any function could access.
I'm writing code for an embedded project, where I've decoupled my physical gpio pins from the application. The application communicates with the "virtual" gpio port, and device drivers then communicate with the actual hardware. The primary motivation for this is the comfort it allows me in switching out what pins are connected to what peripheral when developing, and to include things like button matrices that use fewer physical pins while still handling them as regular gpio device registers.
typedef struct GPIO_PinPortPair
{
GPIO_TypeDef *port; /* STM32 GPIO Port */
uint16_t pin; /* Pin number */
} GPIO_PinPortPair;
typedef struct GPIO_VirtualPort
{
uint16_t reg; /* Virtual device register */
uint16_t edg; /* Flags to signal edge detection */
GPIO_PinPortPair *grp; /* List of physical pins associated with vport */
int num_pins; /* Number of pins in vport */
} GPIO_VirtualPort;
This has worked well in the code I've written so far, but the problem is that I feel like I have to share the addresses to every defined virtual port as a global. A function call would look something like this, mimicking the way it could look if I were to use regular memory mapped io.
file1.c
GPIO_VirtualPort LEDPort;
/* LEDPort init code that associates it with a list of physical pins */
file2.c
extern GPIO_VirtualPort LEDPort;
vgpio_write_pin(&LEDPort, PIN_1, SET_PIN);
I've searched both SO and the internet for best practices when it comes to sharing variables, and I feel like I understand why I should avoid global variables (no way to pinpoint where in code something happens to the data) and that it's better to use local variables with pointers and interface functions (like a "get current tick" function rather than reading a global tick variable).
My question is, given that I want to keep the syntax as simple as possible, what is the best way to define these struct variables and then make them available for functions in other modules to use? Is it okay to use these struct variables as globals? Should I use some kind of master array of pointers to every virtual port I have and use a getter function to avoid using extern variables?
I like to do it this way:
file1.h
typedef enum
{
VirtualPortTypeLED
} VirtualPortType;
typedef struct GPIO_PinPortPair
{
GPIO_TypeDef *port; /* STM32 GPIO Port */
uint16_t pin; /* Pin number */
} GPIO_PinPortPair;
typedef struct GPIO_VirtualPort
{
uint16_t reg; /* Virtual device register */
uint16_t edg; /* Flags to signal edge detection */
GPIO_PinPortPair *grp; /* List of physical pins associated with vport */
int num_pins; /* Number of pins in vport */
} GPIO_VirtualPort;
file1.c
GPIO_VirtualPort LEDPort;
void VirtualPortInit()
{
/* fill in all structures and members here */
LEDPort.reg = 0x1234;
...
}
GPIO_VirtualPort *VirtualPortGet(VirtualPortType vpt)
{
switch(vpt) {
case VirtualPortTypeLED:
return &LEDPort;
}
return NULL;
}
file2.c
#include "file1.h"
GPIO_VirtualPort *myLed;
VirtualPortInit();
myLed = VirtualPortGet(VirtualPortTypeLED);
Btw, I didn't compile this ... :)
To do this without a global struct that references a given set of hardware, or a global set of addresses, you create a handle to the GPIO struct at the location you want when starting out.
I'm not sure how the STM32 is laid out as I have no experience with that family of devices but I have seen and used this method in the situation you describe.
If your hardware is located at a particular address in memory, eg: 0x50, then your calling code asks a GPIO_Init() to give it a handle to the memory at that location. This still allows you to assign the struct at different locations if you need, for example:
/* gpio.h */
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>
typedef struct GPIO_Port GPIO_Port; // forward declare the struct definition
GPIO_Port *GPIO_Init(void *memory, const size_t size);
void GPIO_Write_Pin(GPIO_Port *port_handle, uint8_t pin_number, bool state);
A simple implementation of the GPIO_Init() function might be:
/* gpio.c */
#include "gpio.h"
struct GPIO_Port // the memory mapped struct definition
{
uint16_t first_register;
uint16_t second_register;
// etc, ordered to match memory layout of GPIO registers
};
GPIO_Port *GPIO_Init(void *memory, const size_t size)
{
// if you don't feel the need to check this then the
// second function parameter probably won't be necessary
if (size < sizeof(GPIO_Port)) // size of the struct, not of a pointer to it
return NULL;
// here you could perform additional operations, e.g.
// clear the memory to all 0, depending on your needs
// return the handle to the memory the caller provided
return (GPIO_Port *)memory;
}
void GPIO_Write_Pin(GPIO_Port *port_handle, uint8_t pin_number, bool state)
{
uint16_t mask = 1u << pin_number;
// first_register is used here as the pin output register
if (state)
port_handle->first_register |= mask; // set bit
else
port_handle->first_register &= ~mask; // clear bit
}
Where the struct itself is defined only within the source file and there is no single global instance. Then you can use this like:
// this can be defined anywhere, or for eg, returned from malloc(),
// as long as it can be passed to the init function
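#define GPIO_PORT_START_ADDR (0x50)
// size of the register block; a hypothetical value, since sizeof(*myporthandle)
// is not available to callers while GPIO_Port is an opaque type
#define GPIO_PORT_SIZE (4u)
// get a handle at whatever address you like
GPIO_Port *myporthandle = GPIO_Init((void *)GPIO_PORT_START_ADDR, GPIO_PORT_SIZE);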
// use the handle
GPIO_Write_Pin(myporthandle, PIN_1, SET_HIGH);
For the init function you can pass in the address of the memory with the real hardware location of the GPIO registers, or you can allocate some new block of RAM and pass the address of that.
The addresses of the memory you use do not have to be global. They are just passed to GPIO_Init() from the calling code, so ultimately they could come from anywhere; the handle then takes over all subsequent references to that chunk of memory by being passed to the other GPIO functions. You should be able to build your more complex functions around this idea of passing in the information that changes together with the abstracted mapped memory, so that you still get the "virtual" port functionality you mention.
This method has the benefit of separation of concerns (your GPIO unit is concerned only with the GPIO, not memory, something else can handle that), encapsulation (only the GPIO source needs to concern itself with the members of the GPIO port struct) and no/few globals (the handle can be instantiated and passed around as needed).
Personally I find this pattern pretty handy when it comes to unit testing. In release I pass the address for the real hardware but in test I pass an address for a struct somewhere in memory and test that the members are changed as expected by the GPIO unit - no hardware involved.
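The unit-testing use mentioned above could look roughly like the following sketch, where an ordinary array plays the role of the hardware registers. For brevity the struct is shown in the same file; in a real project it would stay private to gpio.c. The member names pin_register and dir_register are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Register layout; the member names are hypothetical. */
typedef struct GPIO_Port {
    volatile uint16_t pin_register;  /* output bits, one per pin */
    volatile uint16_t dir_register;  /* unused in this sketch */
} GPIO_Port;

/* Returns a handle to caller-provided memory, or NULL if it is too small. */
static GPIO_Port *GPIO_Init(void *memory, size_t size)
{
    if (memory == NULL || size < sizeof(GPIO_Port))
        return NULL;
    return (GPIO_Port *)memory;
}

/* Sets or clears one bit of the output register through the handle. */
static void GPIO_Write_Pin(GPIO_Port *port_handle, uint8_t pin_number, bool state)
{
    uint16_t mask;

    assert(port_handle != NULL && pin_number < 16);
    mask = (uint16_t)(1u << pin_number);
    if (state)
        port_handle->pin_register |= mask;             /* set bit */
    else
        port_handle->pin_register &= (uint16_t)~mask;  /* clear bit */
}
```

In a test, the memory passed to GPIO_Init() can be a plain uint16_t array, so the effect of each write can be checked by reading the array back, with no hardware involved.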

RTC initialisation in an MCU - why use a global callback

The code below is related to the initialization of an RTC in an MCU.
Would anybody know the rationale for passing NULL to rtc_init() and then setting a global callback, global_rtc_cb, equal to it?
Why would you use a global callback at all when there is another function called rtc_callback that is defined and used as the callback in the struct?
int main() {
rtc_init(NULL);
}
//-----------------------------------------------------------------
void ( * global_rtc_cb)(void *);
int rtc_init(void (*cb)(void *)) {
rtc_config_t cfg;
cfg.init_val = 0;
cfg.alarm_en = true;
cfg.alarm_val = ALARM;
cfg.callback = rtc_callback;
cfg.callback_data = NULL;
global_rtc_cb = cb;
irq_request(IRQ_RTC_0, rtc_isr_0);
clk_periph_enable(CLK_PERIPH_RTC_REGISTER | CLK_PERIPH_CLK);
rtc_set_config(QM_RTC_0, &cfg);
return 0;
}
//---------------------------------------------------------------------
/**
* RTC configuration type.
*/
typedef struct {
uint32_t init_val; /**< Initial value in RTC clocks. */
bool alarm_en; /**< Alarm enable. */
uint32_t alarm_val; /**< Alarm value in RTC clocks. */
/**
* User callback.
*
* @param[in] data User defined data.
*/
void (*callback)(void *data);
void *callback_data; /**< Callback user data. */
} rtc_config_t;
The rtc_ functions are part of the RTC driver. The driver has something driver-specific to do when the event that prompts the callback occurs, and that happens in rtc_callback. But there may also be application-specific work to do when the event occurs, and that work belongs at the application layer, not within the driver. So if the application has something to do in response to the event, it can provide a callback to rtc_init. Presumably rtc_callback calls global_rtc_cb (when it is not NULL), so that both the driver-specific and the application-specific work are performed when the event occurs. Apparently your particular application doesn't need to do anything for this event, so it passes NULL to rtc_init. A different application that uses the same driver may provide a callback function.
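The body of rtc_callback is not shown in the question, so the following is only a guess at how such chaining is commonly written. Here rtc_init() is reduced to the callback bookkeeping, and driver_events, app_events and app_alarm_cb() are hypothetical names added for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Application callback registered through rtc_init(); NULL means "none". */
static void (*global_rtc_cb)(void *);

static int driver_events;  /* hypothetical: counts driver-side handling */
static int app_events;     /* hypothetical: counts application-side handling */

/* Driver-level callback: do the driver's own work, then forward to the app. */
static void rtc_callback(void *data)
{
    driver_events++;
    if (global_rtc_cb != NULL)
        global_rtc_cb(data);
}

/* Reduced rtc_init(): only the callback bookkeeping is modelled here. */
static int rtc_init(void (*cb)(void *))
{
    global_rtc_cb = cb;
    return 0;
}

/* Example application callback. */
static void app_alarm_cb(void *data)
{
    (void)data;
    app_events++;
}
```

Passing NULL, as in the question, leaves only the driver-side handling active; passing a function such as app_alarm_cb adds the application's reaction without touching the driver.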

c programming for interrupts in qnx?

This is about client-server communication: the client is the sender and the server is the receiver. When the server receives data on the Ethernet interface (UDP), the kernel on the server is triggered. I am using QNX on the server side. The server (i.e. the embedded PC target running QNX) handles interrupts so that it is alerted to process the newly arrived data.
const struct sigevent *handler1(void *area, int id1)
{
volatile double KernelStartExecutionTime;
struct sigevent *event = (struct sigevent *)area;
KernelStartExecutionTime = GetTimeStamp(); // time at which the kernel starts executing
measurements[18] = KernelStartExecutionTime ;
//return (NULL);
return event;
}
/* The InterruptAttach() kernel call attaches an interrupt handler to the hardware interrupt specified by intr (i.e. the IRQ). */
// The interrupt handler in this example is handler1.
void ISR(void) //void *ISR (void *arg)
{
/* the software must tell the OS that it wishes to associate the ISR with a particular source of interrupts.
* On x86 platforms, there are generally 16 hardware Interrupt Request lines (IRQs) */
volatile int irq = 0; // 0: a clock that runs at the resolution set by ClockPeriod()
struct sigevent event;
int id1; // interrupt ID returned by InterruptAttach()
event.sigev_notify = SIGEV_INTR;
ThreadCtl(_NTO_TCTL_IO, NULL); // request I/O privileges, which InterruptAttach() requires
id1 = InterruptAttach(irq, &handler1, &event, sizeof(event), 0); // handler1 is the interrupt handler
while(1)
{
InterruptWait( 0, NULL );
InterruptUnmask(irq, id1);
}
InterruptDetach( id1);
}
int main(int argc, char *argv[])
{
ISR(); //function call for ISR
// pthread_create (NULL, NULL, ISR, NULL);
return 0;
}
Question:
Should I create a new thread for handling interrupts within main()?
