Avoiding race conditions with an event queue in an event-driven embedded system - C

I am programming an STM32 and want to use an event-driven architecture. For example, I am going to toggle a pin when a timer interrupt occurs, transfer some data to external flash when the ADC DMA buffer-full interrupt occurs, and so on.
There will be multiple interrupt sources, each with the same priority, which disables nesting.
I will use the interrupts only to set a flag signalling main that an interrupt occurred, and process the data inside main. There will be no processing/instructions inside the ISRs.
What bothers me is that accessing a variable (flags in this case) from both main and the ISRs may cause a race-condition bug in the long run.
So I want to use a circular event queue instead of flags.
Only ISRs will be able to write to the event queue buffer and increment "head".
Only main will be able to read the event queue (and execute instructions according to the event) and increment "tail".
Since ISR nesting is disabled, each ISR will access a different element of the event queue array, and main will only react when there is a new event in the queue, so the race condition is avoided, right? Or am I missing something?
Please correct me if I am doing something wrong.
Thank you.

If the interrupt only sets a variable and nothing gets done until the main context is ready to do it, then there is really no reason to have an interrupt at all.
For example: if you get a DMA-complete hardware interrupt and set a variable, then all you have achieved is to copy one bit of information from a hardware register to a variable. You could have much simpler code, with identical performance and less potential for error, by not enabling the interrupt at all and polling the hardware flag directly instead of polling a variable.
Only enable the interrupt if you are actually going to do something in interrupt context that cannot wait, for example: reading a UART received data register so that the next character received doesn't overflow the buffer.
If, after the interrupt has done the thing that cannot wait, it then needs to communicate something to the main context, then you need shared data. This means you need some way of preventing race conditions. The simplest way is atomic access with only one side writing to each data item. If that is not sufficient, then the old-fashioned way is to turn off interrupts while the main context is accessing the shared data. There are more complicated ways using LDREX/STREX instructions, but you should only explore those once you are sure that the simple way isn't good enough for your application.
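For the single-writer case, a minimal sketch of a single-producer/single-consumer event queue, essentially what the question proposes, might look like the following. It assumes a single-core Cortex-M class target where the ISRs never nest (all at the same priority); the event names and queue size are illustrative.

#include <stdint.h>
#include <stdbool.h>

#define EVQ_SIZE 16U                /* queue length, illustrative */

typedef enum { EVT_TIMER_TICK, EVT_ADC_DMA_DONE } event_t;

static volatile event_t evq[EVQ_SIZE];
static volatile uint8_t evq_head;   /* written only by ISRs   */
static volatile uint8_t evq_tail;   /* written only by main() */

/* Called from ISRs only; they share one priority, so this never nests. */
static void evq_push(event_t e)
{
    uint8_t head = evq_head;
    uint8_t next = (uint8_t)((head + 1U) % EVQ_SIZE);

    if (next != evq_tail) {         /* drop the event if the queue is full */
        evq[head] = e;
        evq_head = next;            /* publish only after the slot is written */
    }
}

/* Called from main() only; returns true if an event was dequeued. */
static bool evq_pop(event_t *e)
{
    uint8_t tail = evq_tail;

    if (tail == evq_head)           /* empty */
        return false;

    *e = evq[tail];
    evq_tail = (uint8_t)((tail + 1U) % EVQ_SIZE);
    return true;
}

Because each index has exactly one writer and single-byte accesses are naturally atomic on this class of core, neither side needs to disable interrupts. If the indices were wider than the bus, or if producers ran at different (nesting) priorities, evq_push() would need a short critical section.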

Related

How can I determine if execution takes place in thread mode or if an exception is active? (ARMv7-A architecture)

I am using FreeRTOS on an ARM Cortex-A9 CPU and I'm desperately trying to find out whether it is possible to determine if the processor is executing a normal thread or an interrupt service routine. The CPU implements the ARMv7-A architecture.
I found some promising references hinting at the ICSR register (the VECTACTIVE bits), but this register only exists in the Cortex-M family. Is there a comparable register in the A family as well? I tried to read the processor mode from the Current Program Status Register (CPSR), but when read during an ISR the mode bits indicate supervisor mode rather than IRQ or FIQ mode.
It looks a lot like there is no way to determine which state the processor is in, but I wanted to ask anyway; maybe I missed something...
The processor has a PL390 Generic Interrupt Controller. Maybe it is possible to determine whether an interrupt has been triggered by reading some of its registers?
If anybody can give me a clue I would be very grateful!
Edit1:
The IRQ handler of FreeRTOS switches the processor to supervisor mode and subsequently switches back to system mode.
Can I just check whether the processor is in supervisor mode and assume that this means execution is taking place in an ISR, or are there other situations where the kernel may switch to supervisor mode without being in an ISR?
Edit2:
On request I'll add an overall background description of what I want to achieve in the first place by solving the problem of knowing the current execution context.
I'm writing a set of libraries for the Cortex-A9 and FreeRTOS that will access peripherals. Among others, I want to implement a library for the HW timers available among the processor's peripherals.
In order to secure access to the HW and to avoid multiple tasks trying to access the HW resource simultaneously, I added mutex semaphores to the timer library implementation. The first thing a library function does when called is to try to take the mutex. If that fails the function returns an error; otherwise it continues its execution.
Let's focus on the function that starts the timer:
static ret_val_e TmrStart(tmr_ctrl_t * pCtrl)
{
    ret_val_e retVal = RET_ERR_DEF;
    BaseType_t retVal_os = pdFAIL;
    XTtcPs * pHwTmrInstance;

    //Check status of driver (the NULL check must come before dereferencing pCtrl)
    if(pCtrl == NULL)
    {
        return RET_ERR_TMR_CTRL_REF;
    }
    else if(!pCtrl->bInitialized)
    {
        return RET_ERR_TMR_UNINITIALIZED;
    }
    else
    {
        retVal_os = xSemaphoreTake(pCtrl->osSemMux_Tmr, INSTANCE_BUSY_ACCESS_DELAY_TICKS);
        if(retVal_os != pdPASS)
        {
            return RET_ERR_OS_SEM_MUX;
        }
    }

    pHwTmrInstance = (XTtcPs *) pCtrl->pHwTmrInstance;

    //This function starts the timer
    XTtcPs_Start(pHwTmrInstance);
    (...)
Sometimes it can be helpful to start the timer directly inside an ISR. The problem is that, while the rest of the function would support it, the xSemaphoreTake() call MUST be changed to xSemaphoreTakeFromISR(); moreover, no wait ticks are supported when called from an ISR, in order to avoid blocking the ISR.
In order to have code that is suitable for both execution modes (thread mode and IRQ mode), the function would first need to check the execution state and, based on that, invoke either xSemaphoreTake() or xSemaphoreTakeFromISR() before proceeding to access the HW.
That's the context of my question. As mentioned in the comments, I do not want to implement this by adding a parameter, supplied by the user on every call, that tells the function whether it has been called from a thread or an ISR, as I want to keep the API as slim as possible.
I could take the FreeRTOS approach and implement a copy of the TmrStart() function named TmrStartFromISR(), containing the ISR-specific calls to FreeRTOS's system resources. But I would rather avoid that as well, since duplicating all my functions makes the code harder to maintain overall.
So determining the execution state by reading some processor registers is the only way I can think of. But apparently the A9, unlike the M3 for example, unfortunately does not expose this information easily.
Another approach that just came to my mind would be to set a global variable in the FreeRTOS assembly code that handles exceptions: it could be set in portSAVE_CONTEXT and reset in portRESTORE_CONTEXT.
The downside of this solution is that the library would then not work with the official A9 port of FreeRTOS, which does not sound good either. Moreover, you could get race conditions if the variable changes right after it has been checked by the library function, but I guess this would also be a problem when reading the state from a processor register directly... Probably one would need to enclose this check in a critical section that disables interrupts for a short period of time.
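To illustrate the idea only (this is a sketch, not working code for the official port): assume a hypothetical volatile flag g_ulInIsrContext that the modified port layer sets in portSAVE_CONTEXT and clears in portRESTORE_CONTEXT, and a hypothetical RET_OK success code. The dispatch inside the library could then look like this. Note that FreeRTOS mutex-type semaphores must not be taken from an ISR, so osSemMux_Tmr would have to be a binary or counting semaphore for the FromISR path to be legal.

/* Hypothetical flag maintained by a modified port layer:
 * set in portSAVE_CONTEXT, cleared in portRESTORE_CONTEXT. */
extern volatile uint32_t g_ulInIsrContext;

static ret_val_e TmrTakeMux(tmr_ctrl_t * pCtrl)
{
    if (g_ulInIsrContext != 0U)
    {
        /* IRQ context: blocking is not allowed, use the FromISR variant. */
        BaseType_t xWoken = pdFALSE;

        if (xSemaphoreTakeFromISR(pCtrl->osSemMux_Tmr, &xWoken) != pdPASS)
        {
            return RET_ERR_OS_SEM_MUX;
        }
        portYIELD_FROM_ISR(xWoken);
        return RET_OK;
    }

    /* Thread context: a bounded wait is acceptable. */
    if (xSemaphoreTake(pCtrl->osSemMux_Tmr,
                       INSTANCE_BUSY_ACCESS_DELAY_TICKS) != pdPASS)
    {
        return RET_ERR_OS_SEM_MUX;
    }
    return RET_OK;
}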
If somebody sees some other solutions that I did not think of please do not hesitate to bring them up.
Also please feel free to discuss the solutions I brought up so far.
I'd just like to find the best way to do it.
Thanks!
On a Cortex-A processor, when an interrupt handler is triggered, the processor enters IRQ mode with interrupts disabled. This is reflected in the mode field of the CPSR. IRQ mode is not suitable for receiving nested interrupts, because if a second interrupt happened, the banked return address for the first interrupt would be overwritten. So, if an interrupt handler ever needs to re-enable interrupts, it must switch to supervisor mode first.
Generally, one of the first things that an operating system's interrupt handler does is switch to supervisor mode. By the time the code reaches a particular driver, the processor is in supervisor mode. So the behavior you're observing is perfectly normal.
A FreeRTOS interrupt handler is a C function. It runs with interrupts enabled, in supervisor mode. If you want to know whether your code is running in the context of an interrupt handler, never call the interrupt handler function directly, and when it calls auxiliary functions that care, pass a variable that indicates who the caller is.
void code_that_wants_to_know_who_called_it(int context) {
    if (context != 0) {
        // called from an interrupt handler
    } else {
        // called from outside an interrupt handler
    }
}

void my_handler1(void) {
    code_that_wants_to_know_who_called_it(1);
}

void my_handler2(void) {
    code_that_wants_to_know_who_called_it(1);
}

int main(void) {
    Install_Interrupt(EVENT1, my_handler1);
    Install_Interrupt(EVENT2, my_handler2);
    code_that_wants_to_know_who_called_it(0);
}

Interrupt triggering type (edge-triggered vs level-sensitive) mismatch

I'm writing a small OS on ARMv8 architecture with GICv3.
During device initialization, I expected an interrupt to be delivered to the OS. However, it was never triggered. It turned out that I had misconfigured the interrupt triggering type of the device as "level-sensitive" even though the actual triggering type is "edge-triggered". After changing the configuration to "edge-triggered", I was able to receive the device's interrupt.
Here's my question:
How does the way the GIC receives and forwards asserted interrupts depend on the interrupt triggering type?
Given how the GIC works (receiving and forwarding), why was the interrupt never triggered with my wrong configuration?
What happens if I misconfigure the interrupt triggering type in the opposite way (configured as "edge-triggered" when the actual type is "level-sensitive")?
Thank you in advance.
An interrupt, in general, is just the assertion of a signal line. When the interrupt condition is present, the line is asserted; when not present the line is de-asserted.
The thing listening to the line, an interrupt controller, can interpret the signal in two ways: by transition or by state. By transition means that the controller remembers the line being asserted, and retains that memory until some action is taken to clear it. By state means that the controller merely reflects the current state of the signal line.
Transition, or edge, permits a simpler hardware controller that can merely assert pulses to indicate a change of state, whether it be carrier detected or character received. In a state, or level, controller, a simple thing like carrier detected becomes a cumbersome multi-state exchange between the controller and the CPU.
State, or level, however, permits a hardware controller to insist upon attention until the underlying condition is satisfied. This is convenient not only when a single interrupt signal is shared by multiple devices, but also for avoiding a race between the hardware controller and the CPU. In general, edge-oriented devices need to be re-examined after an interrupt is processed, to ensure that squelching the interrupt at the controller didn't cause an interrupt pulse to be missed.
So, when you set the mode in something like a GIC, you need to know how the hardware device behaves, and program the GIC correspondingly. It isn't a selection of how you would like the device to behave; rather it is a configuration of how the device does behave. Mercifully, device-tree gives you that information, and if you can't believe device tree, then all is lost.
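To make the configuration side concrete, here is a minimal sketch of programming the trigger type of a shared peripheral interrupt (SPI) through the GIC distributor's GICD_ICFGR registers, which hold 2 configuration bits per interrupt with the upper bit of each pair selecting edge-triggered. The distributor base address below is only a placeholder; the real one comes from your SoC's memory map or the device tree, and PPIs are configured through the redistributor's GICR_ICFGR1 instead.

#include <stdint.h>

#define GICD_BASE      0x08000000UL   /* placeholder, SoC-specific address */
#define GICD_ICFGR(n)  (*(volatile uint32_t *)(GICD_BASE + 0x0C00UL + 4UL * (n)))

/* Configure SPI `intid`: edge = 1 for edge-triggered, 0 for level-sensitive.
 * It is safest to do this while the interrupt is still disabled. */
static void gic_set_trigger(uint32_t intid, int edge)
{
    uint32_t reg   = intid / 16U;              /* 16 interrupts per ICFGR register */
    uint32_t shift = (intid % 16U) * 2U + 1U;  /* upper bit of the 2-bit field     */
    uint32_t val   = GICD_ICFGR(reg);

    if (edge)
        val |= (1UL << shift);
    else
        val &= ~(1UL << shift);

    GICD_ICFGR(reg) = val;
}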

The right way to clear an interrupt flag on STM32

I'm developing a bare-metal project on an STM32L4, starting from an existing code base.
The ISRs have been implemented in the following way:
read the interrupt status from the peripheral to know which event(s) provoked the interrupt
do something
clear the flags that were read at the beginning.
Is this the right way to clear the flags? Shouldn't the flags be cleared at the very beginning of the ISR? My understanding is that, if the same peripheral event happens a second time during step 2, it will not provoke a second IRQ and so would be lost. On the other hand, if you clear the flag as soon as you can, this second event would pulse the interrupt, whose state in the CPU would change to "pending and active": a second IRQ would happen.
PS: From STM32 Processor Programming Manual I read: "STM32 interrupts are both level-sensitive and pulse-sensitive".
Definitely at the beginning (unless you have special reasons in the program logic), as some time is needed for the actual write to the flag-clear register to propagate through the buses.
If you decide for some reason to put it at the end of the interrupt, you should leave some instructions after it, place a barrier instruction, or read back the register before the interrupt routine returns, to make sure that the clear operation has propagated across the buses. Otherwise you may get "phantom" duplicate routine calls.
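As an illustration, a minimal sketch of a timer ISR on an STM32L4 following that advice (clear first, handle afterwards), assuming the usual ST CMSIS device headers for TIM2, TIM_SR_UIF and __DSB():

void TIM2_IRQHandler(void)
{
    if (TIM2->SR & TIM_SR_UIF)
    {
        /* Clear the flag first: a new update event arriving while this one
         * is still being handled will re-pend the interrupt. SR bits are
         * rc_w0, so writing 1 to the other bits has no effect and there is
         * no read-modify-write race. */
        TIM2->SR = ~TIM_SR_UIF;

        /* ... handle the update event ... */
    }

    /* Cheap insurance that the SR write has reached the peripheral before
     * exception return; essential if the flag is cleared late in the handler. */
    __DSB();
}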

Is spinlock required for every interrupt handler?

In Chapter 5 of ULK the author states as follows:
"...each interrupt handler is serialized with respect to itself-that is, it cannot execute more than one concurrently. Thus, accessing the data struct does not require synchronization primitives"
I don't quite understand why interrupt handlers are "serialized" on modern CPUs with multiple cores. I'm thinking it could be possible that the same ISR runs on different cores simultaneously, right? If that's the case, and you don't use a spinlock to protect your data, you can end up with a race condition.
So my question is: on a modern system with multiple CPUs, is a spinlock always needed for every interrupt handler you write that reads and writes some data?
While executing interrupt handlers, the kernel explicitly disables that particular interrupt line at the interrupt controller, so one interrupt handler cannot be executed more than once concurrently. (The handlers of other interrupts can run concurrently, though.)
Clarification: as per CL.'s remark below, the kernel makes sure the handler does not fire concurrently for the same interrupt, but if you have multiple registrations of the same interrupt handler for multiple interrupts then the answer below is, I believe, correct.
You are right that the same interrupt handler can run concurrently on multiple cores and that shared data needs to be protected. However, a spinlock is not the only and certainly not always the recommended way to achieve this.
A multitude of other synchronization methods, from per-CPU data, to accessing the shared data only with atomic operations, and even Read-Copy-Update variants, may be used to protect the shared data.
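For instance, when the shared state is just a counter, the lock can be avoided entirely with atomics; a minimal sketch (the handler and variable names are illustrative):

#include <linux/atomic.h>
#include <linux/interrupt.h>

static atomic_t rx_events = ATOMIC_INIT(0);

/* May run concurrently on several CPUs if registered for several IRQ lines. */
static irqreturn_t my_counter_irq(int irq, void *dev_id)
{
    atomic_inc(&rx_events);              /* no spinlock needed for a counter */
    return IRQ_HANDLED;
}

/* Process context: read and reset without taking any lock. */
static int read_and_reset_rx_events(void)
{
    return atomic_xchg(&rx_events, 0);
}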
No, a spinlock is not always needed in an interrupt handler.
Please note one thing first:
When an interrupt from a particular device occurs at the interrupt controller, that interrupt is disabled at the controller, and hence on all cores, for that particular device. So the same device's interrupt cannot arrive on several CPUs simultaneously.
So in the normal case no spinlock is required, because the handler will not be re-entered.
There are, however, two cases below in which a spinlock is needed in an interrupt handler.
Please note: when an interrupt arrives from a device on an IRQ line, the handling core disables other interrupts locally, and that device's interrupt is also disabled on the other cores. Interrupts from other devices can still arrive on other cores.
Case 1: the same interrupt handler is registered for different devices, for example:
request_irq(A, func, ...);
request_irq(B, func, ...);
For device A the interrupt handler func is called.
For device B the same interrupt handler func is called.
The two invocations can run concurrently on different cores, so a spinlock should be used in this case to prevent a race condition.
Case 2: the same resource is used in the interrupt handler and also in some other code which runs in process context.
For example, there is a resource A.
One core may be running the interrupt handler in interrupt mode and modifying resource A, while another core runs in process context and modifies the same resource somewhere else.
So, to prevent a race condition on that resource, we should use a spinlock.
Section 4.6 of Understanding the Linux Kernel, 3rd Edition, by Daniel P. Bovet and Marco Cesati, gives you the answer.
The actual interrupt handler is dispatched by handle_IRQ_event(); irq_desc[irq].lock prevents concurrent execution of handle_IRQ_event() for the same IRQ on any other CPU.
If the critical data is shared between the interrupt handler and your process (which may be a kernel thread), then you need to protect your data, and hence a spinlock is required. A common kernel API for spinlocks is spin_lock().
There are also variants of this API, e.g. spin_lock_irqsave(), which can help avoid the deadlock problems one can face while acquiring/holding spinlocks. Please go through the link below for details of the subject:
http://www.linuxjournal.com/article/5833
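A minimal sketch of the pattern described above, with illustrative names: data shared between an interrupt handler and process context, where the handler takes the plain lock and the process-context side uses spin_lock_irqsave() so the interrupt cannot come in and deadlock on the same CPU while the lock is held.

#include <linux/interrupt.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(shared_lock);
static unsigned long shared_count;

/* Hard-IRQ context: local interrupts are already disabled here. */
static irqreturn_t my_irq_handler(int irq, void *dev_id)
{
    spin_lock(&shared_lock);
    shared_count++;
    spin_unlock(&shared_lock);
    return IRQ_HANDLED;
}

/* Process context (e.g. a kernel thread or sysfs handler). */
static unsigned long read_and_reset_count(void)
{
    unsigned long flags, val;

    spin_lock_irqsave(&shared_lock, flags);
    val = shared_count;
    shared_count = 0;
    spin_unlock_irqrestore(&shared_lock, flags);

    return val;
}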

WaitFor implementation for Cortex M3

I'm using an STM32F103 with GCC and have a task which can be described with the following pseudocode:
void http_server() {
    transmit(data, len);
    event = waitfor(data_sent_event | disconnect_event | send_timeout_event);
}

void tcp_interrupt() {
    if (int_reg & DATA_SENT) {
        emit(data_sent_event);
    }
}

void main(void) {
    run_task(http_server);
}
I know that all embedded OSes offer such functionality, but they are too heavyweight for this single task. I don't need preemption, mutexes, queues and other features; just waiting for flags in secondary tasks and raising these flags in interrupts.
I hope someone knows a good tutorial on this topic or has a piece of code showing context switching and a wait implementation.
You will probably need to use an interrupt-driven finite state machine.
There are a number of IP stacks that are independent of an operating system, or even of interrupts. lwIP (lightweight IP) comes to mind; I used it indirectly as it was provided by Xilinx. The FreeDOS folks may have had one, and certainly the Crynwr packet drivers come to mind, on top of which stacks were no doubt built.
As for the perhaps simpler question: your code sits in a foreground task, in the waitfor() function, which appears to want to be an infinite loop waiting for some global variables to change. An interrupt comes along and calls the interrupt handler, which, with a lot of stack work (to know it is a TCP interrupt), calls tcp_interrupt(), which modifies the flags; the interrupt finishes and now waitfor() sees the global flag change. The context switch is the interrupt itself, which is built into the processor; no need for an operating system or anything fancy, just a global variable or two and the ISR. The context switch and flags/events are a freebie compared to the TCP/IP stack. UDP is significantly easier; do you really need TCP, by the way?
If you want more than one of these waitfor() calls active, i.e. you don't want only the one foreground task sitting in a single waitfor(), then I would do one of two things. One: have the foreground task poll; instead of waitfor(something), change it to if (checkfor(something)) { then do something }.
Two: set up your system so that the interrupt handler, which in your pseudocode already has to be fairly complicated to know this is TCP packet data, examines the TCP header more deeply and knows to call the http_server() thing for port 80 events, and other functions for the other events for which you might have had a waitfor(). In this case, instead of a multitasking series of functions that are waitfor()ing, create a single list of the events and look for them in the ISR. Use a timer interrupt and globals for the timeouts (reset a counter when a packet arrives, bump the counter on each timer interrupt, and if the counter reaches N then a timeout has occurred, so call the timeout handler function).
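If the simple flag/event approach is enough, here is a minimal sketch of emit()/waitfor() for a single foreground loop on a Cortex-M3, assuming the CMSIS intrinsics __disable_irq(), __enable_irq() and __WFI(); the event names are illustrative.

#include <stdint.h>

#define DATA_SENT_EVENT     (1UL << 0)
#define DISCONNECT_EVENT    (1UL << 1)
#define SEND_TIMEOUT_EVENT  (1UL << 2)

static volatile uint32_t event_flags;

/* Called from interrupt context. */
void emit(uint32_t evt)
{
    event_flags |= evt;
}

/* Called from the foreground task: sleeps until one of `mask` is raised,
 * then consumes and returns the matching events. */
uint32_t waitfor(uint32_t mask)
{
    uint32_t got;

    for (;;)
    {
        __disable_irq();
        got = event_flags & mask;
        if (got != 0U)
        {
            event_flags &= ~got;
            __enable_irq();
            return got;
        }
        /* Sleep with interrupts masked: a newly pended interrupt still wakes
         * WFI, and its ISR runs as soon as __enable_irq() executes, so no
         * event can slip in between the check and the sleep. */
        __WFI();
        __enable_irq();
    }
}

This gives the waitfor() of the pseudocode without any context switching; for several tasks blocked in waitfor() at the same time you would still need the polling or ISR-dispatch approach above, or a real scheduler.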
