Interrupt-safe buffer

Interrupt-safe buffer - c

I'm writing code for an embedded system (Cortex M0) and do not have all the luxuries of mutexes/spinlocks/etc. Is there a simple way to add data to a shared buffer (log-file) which will be flushed to disk from my Main() loop?
If there is only a single producer (1 interrupt) and single consumer (main-loop), I could use a simple buffer where the producer increases the 'head' and the consumer the 'tail'. And it will be perfectly safe. But now that I have multiple producers (interrupts) it seems like I'm stuck.
I could give each interrupt its own buffer, and combine them in Main(), but this will require a lot of extra RAM and complexity.

You can implement this through a simple ring buffer (circular array), where you turn off the hardware interrupt sources during access. It only needs the functions init, add and remove.
I'm not certain how your particular MCU handles interrupts, but most likely they will remain pending, as long as you only enable/disable the particular hardware peripheral's interrupt. Depending on the nature of your application, you could also disable the global interrupt mask, but that's rather crude.
Generally, you don't need to worry about missing out interrupts, because if the code that handles the incoming interrupts is slower than the interrupt frequency, no software in the world will fix it. You would either have to accept data losses or increase the CPU clock to dodge such scenarios. But of course you should always try to keep the code inside the ISR as compact as possible.

Related

Avoiding Race Condition with event queue in event driven embedded system

I am trying to program stm32 and use event driven architecture. For example I am going to toggle a pin when timer interrupt occurs and transfer some data to external flash when ADC DMA buffer full interrupt occurs and so on..
There will be multiple interrupt sources each with same priority which disables nesting.
I will use the interrupts to set a flag to signal my main that interrupt occured and process data inside main. There will be no processing/instruction inside ISRs.
What bothers me is that accessing a variable(flags in this case) in main and ISRs may cause race condition bug in the long run.
So I want to use an circular event queue instead of flags.
Only ISRs will be able to write to event queue buffer and increment "head".
Only main will be able to read the event queue(and execute instructions according to event) and increment "tail".
Since ISR nesting is disabled and each ISR will access different element of event queue array and main function will only react when there is new event on event queue, race condition is avoided right? or am I missing something?
Please correct me if I am doing something wrong.
Thank you.

If the interrupt only sets a variable and nothing gets done until main context is ready to do it then there is really no reason to have an interrupt at all.
For example: if you get a DMA complete hardware interrupt and set a variable then all you have achieved is to copy one bit of information from a hardware register to a variable. You could have much simpler code with identical performance and less potential for error by instead of polling a variable just not enabling the interrupt and polling the hardware flag directly.
Only enable the interrupt if you are actually going to do something in interrupt context that cannot wait, for example: reading a UART received data register so that the next character received doesn't overflow the buffer.
If after the interrupt has done the thing that cannot wait it then needs to communicate something with main-context then you need to have shared data. This will mean that you need some way of preventing race-conditions. The simplest way is atomic access with only one side writing to the data item. If that is not sufficient then the old-fashioned way is to turn off interrupts when main context is accessing the shared data. There are more complicated ways using LDREX/STREX instructions but you should only explore these once you are sure that the simple way isn't good enough for your application.

How to exploit interrupts for data transfer over SPI peripheral

I have been implementing device driver for the SPI peripheral of the MCU in C language.
I would like to exploit interrupt mechanism for reception and also for transmission.
As far as the reception part I think that I can implement this via exposing
the function SpiRegisterCallback into the SPI driver interface. This function
enables the client register its function which will be invoked as soon as
data byte is received (reception buffer full interrupt is invoked).
As far as the transmission part I would like to use some SpiTransmit function
which will receive pointer to the data bytes to be transmitted and number of bytes
to be transmitted. As far as implementation I am going to define some internal
callback function of the SPI driver. This internal callback will be registered
for transmission buffer empty interrupt. In this callback function the passed data bytes will be gradually placed into the transmission buffer. I am not sure whether this approach
is appropriate. Can anybody give me an advice how to implement SPI peripheral
driver which exploits interrupts for data transmission? Thanks in advance for any
suggestions.

SPI is often very real-time critical, introducing a callback with function pointers means needless overhead code. The actual copying of data from SPI to RAM must be done internally by your driver. That's all the ISR should be doing. Some general guidance can be found here.
So your ISR should be filling up a buffer, then swap pointers to buffers (no slow memcpy!) in a protected way, so that the caller always has one buffer with valid data, and the ISR always has one working buffer to fill up. Let the caller poll a flag rather than to invoke a callback from inside an ISR. I like to use tripple buffering if I can spare the RAM. That is: one buffer for the ISR, one buffer for the caller and one spare that the ISR can swap with without disrupting the caller.
This is all rather intricate to code and most programmers get it wrong. DMA is superior to interrupts here, so you should really be considering DMA instead. This is something you should be considering when picking MCU.

A request for "any suggestions" does not really make this a great question because multiple answers may be acceptable, and few will be comprehensive. It invites comments rather then answers. However I will indulge:
First, this is not by any definition an exploit. To "exploit" implies making use of something for a purpose it was not intended - that is not the correct term in this case, you are not "exploiting" the interrupt mechanism, you are simply using it.
At high clock rates, in some cases the interrupt latency and context switch time involved in processing the interrupts may be less efficient than a simple busy-wait. If the transfers are more than two or three bytes at a time, you should in any case consider using DMA if available - so the interrupt will be the DMA interrupt for a complete transfer rather then a single character. For applications such as SD card interfacing or EEPROM, DMA will have a significant performance impact and free up the CPU to do other useful work concurrently. A driver that uses a busy-wait for single byte/word transfers and DMA for block transfers may be optimal. This is particularly true perhaps if you are using an RTOS and the ISR triggers a task context to process the data - the context switch overhead may be nearly as much or more than a busy-wait for a single byte. If your SPI clock is > 1MHz for example, you will wait 8us for a byte transfer, your ISR and call backs could easily be greater then that, in which case it is not worthwhile.
So my advice here is to only consider interrupts for SPI if you are using a slow clock and can get other useful work done whilst waiting for the interrupt.
A problem with allowing call-backs in interrupts is it allows the callback provider to do things ill-advised or illegal in an interrupt context, and you loose the ability to control the processing time of the interrupt. It is fine perhaps if the callback is intended for use by someone writing a device driver - they should be aware of what they are doing, but this is the device driver.

How does enabling and disabling interrupts from the kernel prevent race conditions?

Only thing I can think of is enabling/disabling interrupts also disables kernel pre-emption. This would make impossible (?) for multiple threads touching shared kernel data at the same time.
Is there something I'm missing (maybe because you can only enable/disable interrupts for one CPU at a time?)?

In ye olde days of single processor systems, blocking interrupts was the method of locking kernel data structures. If Interrupt X were in the middle of changing something, it would not want to higher priority interrupt Y to execute and leave the data structures in an ambiguous state. Of course, X should only block interrupts for the minimum amount of time required.
In multi-processor systems you have to add software locking to prevent another process from mucking with system data structures while they are being modified (both for interrupts and system calls).
However, you still have to block interrupts. If interrupt X had data structures locked (or partially locked) and interrupt Y were able to execute, it could try to lock the same data structured and would wait forever.

what's wrong in using interrupt handlers as event listeners

My system is simple enough that it runs without an OS, I simply use interrupt handlers like I would use event listener in a desktop program. In everything I read online, people try to spend as little time as they can in interrupt handlers, and give the control back to the tasks. But I don't have an OS or real task system, and I can't really find design information on OS-less targets.
I have basically one interrupt handler that reads a chunk of data from the USB and write the data to memory, and one interrupt handler that reads the data, sends the data on GPIO and schedule itself on an hardware timer again.
What's wrong with using the interrupts the way I do, and using the NVIC (I use a cortex-M3) to manage the work hierarchy ?

First of all, in the context of this question, let's refer to the OS as a scheduler.
Now, unlike threads, interrupt service routines are "above" the scheduling scheme.
In other words, the scheduler has no "control" over them.
An ISR enters execution as a result of a HW interrupt, which sets the PC to a different address in the code-section (more precisely, to the interrupt-vector, where you "do a few things" before calling the ISR).
Hence, essentially, the priority of any ISR is higher than the priority of the thread with the highest priority.
So one obvious reason to spend as little time as possible in an ISR, is the "side effect" that ISRs have on the scheduling scheme that you design for your system.
Since your system is purely interrupt-driven (i.e., no scheduler and no threads), this is not an issue.
However, if nested ISRs are not allowed, then interrupts must be disabled from the moment an interrupt occurs and until the corresponding ISR has completed. In that case, if any interrupt occurs while an ISR is in execution, then your program will effectively ignore it.
So the longer you spend inside an ISR, the higher the chances are that you'll "miss out" on an interrupt.

In many desktop programs, events are send to queue and there is some "event loop" that handle this queue. This event loop handles event by event so it is not possible to interrupt one event by other one. It also is good practise in event driven programming to have all event handlers as short as possible because they are not interruptable.
In bare metal programming, interrupts are similar to events but they are not send to queue.
execution of interrupt handlers is not sequential, they can be interrupted by interrupt with higher priority (numerically lower number in Cortex-M3)
there is no queue of same interrupts - e.g. you can't detect multiple GPIO interrupts while you are in that interrupt - this is the reason you should have all routines as short as possible.
It is possible to implement queues by yourself, feed these queues by interrupts and consume these queues in your super loop (consume while disabling all interrupts). By this approach, you can get sequential processing of interrupts. If you keep your handlers short, this is mostly not needed and you can do the work in handlers directly.
It is also good practise in OS based systems that they are using queues, semaphores and "interrupt handler tasks" to handle interrupts.

With bare metal it is perfectly fine to design for application bound or interrupt/event bound so long as you do your analysis. So if you know what events/interrupts are coming at what rate and you can insure that you will handle all of them in the desired/designed amount of time, you can certainly take your time in the event/interrupt handler rather than be quick and send a flag to the foreground task.
The common approach of course is to get in and out fast, saving just enough info to handle the thing in the foreground task. The foreground task has to spin its wheels of course looking for event flags, prioritizing, etc.
You could of course make it more complicated and when the interrupt/event comes, save state, and return to the forground handler in the forground mode rather than interrupt mode.
Now that is all general but specific to the cortex-m3 I dont think there are really modes like big brother ARMs. So long as you take a real-time approach and make sure your handlers are deterministic, and you do your system engineering and insure that no situation happens where the events/interrupts stack up such that the response is not deterministic, not too late or too long or loses stuff it is okay

What you have to ask yourself is whether all events can be services in time in all circumstances:
For example;
If your interrupt system were run-to-completion, will the servicing of one interrupt cause unacceptable delay in the servicing of another?
On the other hand, if the interrupt system is priority-based and preemptive, will the servicing of a high priority interrupt unacceptably delay a lower one?
In the latter case, you could use Rate Monotonic Analysis to assign priorities to assure the greatest responsiveness (the shortest execution-time handlers get the highest priority). In the first case your system may lack a degree of determinism, and performance will be variable under both event load, and code changes.
One approach is to divide the handler into real-time critical and non-critical sections, the time-critical code can be done in the handler, then a flag set to prompt the non-critical action to be performed in the "background" non-interrupt context in a "big-loop" system that simply polls event flags or shared data for work to complete. Often all that might be necessary in the interrupt handler is to copy some data to timestamp some event - making data available for background processing without holding up processing of new events.
For more sophisticated scheduling, there are a number of simple, low-cost or free RTOS schedulers that provide multi-tasking, synchronisation, IPC and timing services with very small footprints and can run on very low-end hardware. If you have a hardware timer and 10K of code space (sometimes less), you can deploy an RTOS.

I am taking your described problem first
As I interpret it your goal is to create a device which by receiving commands from the USB, outputs some GPIO, such as LEDs, relays etc. For this simple task, your approach seems to be fine (if the USB layer can work with it adequately).
A prioritizing problem exists though, in this case it may be that if you overload the USB side (with data from the other end of the cable), and the interrupt handling it is higher priority than that triggered by the timer, handling the GPIO, the GPIO side may miss ticks (like others explained, interrupts can't queue).
In your case this is about what could be considered.
Some general guidance
For the "spend as little time in the interrupt handler as possible" the rationale is just what others told: an OS may realize a queue, etc., however hardware interrupts offer no such concepts. If the event causing the interrupt happens, the CPU enters your handler. Then until you handle it's source (such as reading a receive holding register in the case of a UART), you lose any further occurrences of that event. After this point, until exiting the handler, you may receive whether the event happened, but not how many times (if the event happened again while the CPU was still processing the handler, the associated interrupt line goes active again, so after you return from the handler, the CPU immediately re-enters it provided nothing higher priority is waiting).
Above I described the general concept observable on 8 bit processors and the AVR 32bit (I have experience with these).
When designing such low-level systems (no OS, one "background" task, and some interrupts) it is fundamental to understand what goes on on each priority level (if you utilize such). In general, you would make the most real-time critical tasks the highest priority, taking the most care of serving those fast, while being more relaxed with the lower priority levels.
From an other aspect usually at design phase it can be planned how the system should react to missed interrupts, since where there are interrupts, missing one will eventually happen anyway. Critical data going across communication lines should have adequate checksums, an especially critical timer should be sourced from a count register, not from event counting, and the likes.
An other nasty part of interrupts is their asynchronous nature. If you fail to design the related locks properly, they will eventually corrupt something giving nightmares to that poor soul who will have to debug it. The "spend as little time in the interrupt handler as possible" statement also encourages you to keep the interrupt code reasonably short which means less code to consider for this problem as well. If you also worked with multitasking assisted by an RTOS you should know this part (there are some differences though: a higher priority interrupt handler's code does not need protection against a lower priority handler's).
If you can properly design your architecture regarding the necessary asynchronous tasks, getting around without an OS (from the no multitasking aspect) may even prove to be a nicer solution. It needs way more thinking to design it properly, however later there are much less locking related problems. I got through some mid-sized safety critical projects designed over a single background "task" with very few and little interrupts, and the experience and maintenance demands regarding those (especially the tracing of bugs) were quite satisfactory compared to some others in the company built over multitasking concepts.

AVR8 Real Time Scheduler, Serial Communication

I am currently programming an ATmega32u4. I have implemented serial communication which is implemented using a build in interrupt that executes every time there is a byte received on the Rx pin. The byte on the Rx pin is placed in a one byte buffer which is replaced when another byte is received on the Rx pin. This is a built in library in atmel.
ISR(USART1_RX_vect, ISR_BLOCK)
{
RingBuffer_Insert(&usart_rx_buffer,UDR1);
}
My code executes an interrupt when a byte is received on the Rx pin. When a byte is receives this byte is entered into my ring buffer uart_rx_buffer where it is later decoded.
If an interrupt is being executed and this causes the one byte buffer to be replaced before the UART interrupt can be executed, this byte is lost.
The result of this is that other interrupts cannot take longer than the baud rate to execute otherwise serial bytes are lost.Is there any way to avoid this problem?

One way to solve this problem would be to use the attribute ISR_NOBLOCK in all interrupts that take longer than the baud rate, causing the interrupt enable flag to be activated by the compiler as early as possible within the ISR and allowing the USART1_RX_vect to be executed inside other interrupts. However, "care should be taken to avoid stack overflows, or to avoid infinitely entering the ISR for those cases where the AVR hardware does not clear the respective interrupt flag before entering the ISR".
I've experienced this same problem and so far this was the best solution I could think of. I didn't use it nor tested it, though.
Edit: keep in mind that all other interrupts could also be executed inside interrupts declared with the attribute ISR_NOBLOCK, not just the interrupt you want. So you would basically allow all interrupts to be nested inside all interrupts, except USART1_RX_vect (and those declared with ISR_BLOCK). This is the main problem with this solution (besides the stack overflow problem).

The result of this is that other interrupts cannot take longer than the baud rate to execute otherwise serial bytes are lost. Is there any way to avoid this problem?
All your observations are correct. While allowing nested interrupts like suggested in Nuno's answer could work, it is normally something you would/should want to avoid. Allowing nested interrupts everywhere makes code petty unpredictable.
I would first try to optimize the execution time of the interrupts that are blocking your UART receive ISR. Take a look at the interrupt priorities. If several interrupts are pending, they will be executed according to this priority. This can result in "starvation" of lower level interrupts, if there is "always" a higher level interrupt pending.
What is your baud rate? Even at 115200 bit/s you can execute about 700 instructions (assuming 8MHz) per byte received. ISRs should be as short as possible. If there is one single ISR that is taking long and you can't optimize it for what reason whatsoever, you could consider just allowing nested interrupts in this single ISR (this is only feasible if the execution is not critical).
If you use a high baud rate, consider reducing it. 9600 baud is often enough, but may require asynchronous sending to prevent blocking code.