What situations cause timer interrupts to clash/miss?

I am working on a run-to-completion state-machine RTOS. I am wondering under what conditions interrupts can be missed. Can maximum CPU utilization (100%) cause interrupts to be missed? Also, if two timers with different handlers but the same interrupt line time out at the same tick, which ISR runs first?
[Appreciate a reply from the perspective of a software/firmware engineer, with limited hardware knowledge.]

The typical way that interrupts are missed is when the first occurrence of an interrupt is not serviced before the second occurrence of the same interrupt source. When you don't service the first occurrence fast enough then you miss the subsequent occurrence because it can't be distinguished from the first occurrence. There is no queue to stack up multiple occurrences of an interrupt so you need to service and clear each interrupt before that particular interrupt occurs again. (Note that different interrupt sources can be pending at the same time and serviced separately because they can be distinguished. It's two occurrences of the same interrupt source that can cause you to miss one.)
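As a rough sketch of that flag-based model (the register name, address, and vector below are made up, not any specific part):

#include <stdint.h>

#define PERIPH_STATUS  (*(volatile uint32_t *)0x40001000u)  /* hypothetical status register */
#define EVENT_FLAG     (1u << 0)                            /* set by hardware on each event */

extern void handle_event(void);

void PERIPH_IRQHandler(void)        /* hypothetical vector */
{
    if (PERIPH_STATUS & EVENT_FLAG) {
        PERIPH_STATUS = EVENT_FLAG; /* write-1-to-clear, a common convention */
        handle_event();
        /* If a second event arrived while the flag was still set from the first,
         * the hardware has nothing extra to record, so only one event is seen. */
    }
}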
100% CPU utilization doesn't necessarily cause interrupts to be missed but I guess it could contribute. It's likely to cause other problems as well.
Many microcontrollers include an interrupt prioritization mechanism, which defines which interrupt sources get serviced ahead of which others when more than one is pending. This varies from one microcontroller to the next, so you'd have to check the data sheet of your particular microcontroller for details.
Update:
So what conditions could cause an interrupt to not be serviced fast enough?
If the interrupts are disabled for too long then an interrupt may not be serviced fast enough. Or if a higher priority interrupt handler takes too long then a lower priority interrupt may not be serviced fast enough.
To avoid these situations, keep both the interrupt handler routines and the periods where interrupts are disabled as short as possible.
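For example, on a Cortex-M part with CMSIS, a short interrupts-disabled region might look like this (the shared variable is illustrative):

#include <stdint.h>
#include "cmsis_compiler.h"   /* or your CMSIS device header, for the __* intrinsics */

static volatile uint32_t shared_counter;

void increment_shared_counter(void)
{
    uint32_t primask = __get_PRIMASK();   /* remember the current interrupt state */
    __disable_irq();                      /* begin critical section */
    shared_counter++;                     /* keep this region as short as possible */
    __set_PRIMASK(primask);               /* restore, re-enabling only if it was enabled before */
}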

Related

How to know whether an IRQ was served immediately on ARM Cortex M0+ (or any other MCU)

For my application (running on an STM32L082) I need accurate (relative) timestamping of a few types of interrupts. I do this by running a timer at 1 MHz and taking its count as soon as the ISR is run. They are all given the highest priority so they pre-empt less important interrupts. The problem I'm facing is that they may still be delayed by other interrupts at the same priority and by code that disables interrupts, and there seems to be no easy way to know this happened. It is no problem that the ISR was delayed, as long as I know that the particular timestamp is not accurate because of this.
My current approach is to let each ISR and each block of code with interrupts disabled check whether interrupts are pending using NVIC->ISPR[0] and flagging this for the pending ISR. Each ISR checks this flag and, if needed, flags the timestamp taken as not accurate.
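A minimal sketch of that pattern, assuming the IRQ enum names from the STM32L0 CMSIS device header and a hypothetical timestamp_now() that reads the 1 MHz timer:

#include <stdint.h>
#include "stm32l0xx.h"   /* device/CMSIS header; exact name depends on your setup */

/* IRQs whose timestamps matter (enum names as in the STM32L0 CMSIS header) */
#define TS_IRQ_MASK  ((1UL << EXTI4_15_IRQn) | (1UL << RTC_IRQn))

extern uint32_t timestamp_now(void);        /* hypothetical: reads the 1 MHz timer count */
static volatile uint32_t ts_delayed_mask;   /* one bit per IRQ whose ISR was delayed */

/* Call just before re-enabling interrupts at the end of a critical section. */
static inline void note_delayed_ts_irqs(void)
{
    /* Anything already pending here has been held off by this critical section. */
    ts_delayed_mask |= NVIC->ISPR[0] & TS_IRQ_MASK;
}

void EXTI4_15_IRQHandler(void)
{
    uint32_t ts = timestamp_now();
    int accurate = !(ts_delayed_mask & (1UL << EXTI4_15_IRQn));
    ts_delayed_mask &= ~(1UL << EXTI4_15_IRQn);
    /* ... clear the EXTI pending flag and record (ts, accurate) ... */
    (void)ts; (void)accurate;
}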
Although this works, it feels like it's the wrong way around. So my question is: is there another way to know whether an IRQ was served immediately?
The IRQs in question are EXTI4-15 for a GPIO pin change and RTC for the wakeup timer. Unfortunately I'm not in the position to change the PCB layout and use TIM input capture on the input pin, nor to change the MCU used.
update
The fundamental limit to accuracy in the current setup is determined by the nature of the internal RTC calibration, which periodically adds/removes 32kHz ticks, leading to ~31 µs jitter. My goal is to eliminate (or at least detect) additional timestamping inaccuracies where possible. Having interrupts blocked incidentally for, say, 50+ µs is hard to avoid and influences measurements, hence the need to at least know when this occurs.
update 2
To clarify, I think this is a software question, asking if a particular feature exists and if so, how to use it. The answer I am looking for is one of: "yes it is possible, just check bit X of register Y", or "no it is not possible, but MCU ... does have such a feature, called ..." or "no, such a feature is generally not available on any platform (but the common workaround is ...)". This information will guide me (and future readers) towards a solution in software, and/or requirements for better hardware design.
In general
The ideal solution for accurate timestamping is to use timer capture hardware (built-in to the microcontroller, or an external implementation). Aside from that, using a CPU with enough priority levels to make your ISR always the highest priority could work, or you might be able to hack something together by making the DMA engine sample the GPIO pins (specifics below).
Some microcontrollers have connections between built-in peripherals that allow one peripheral to trigger another (like a GPIO pin triggering timer capture even though it isn't a dedicated timer capture input pin). Manufacturers have different names for this type of interconnection, but a general overview can be found on Wikipedia, along with a list of the various names. Exact capabilities vary by manufacturer.
I've never come across a feature in a microcontroller for indicating if an ISR was delayed by a higher priority ISR. I don't think it would be a commonly-used feature, because your ISR can be interrupted by a higher priority ISR at any moment, even after you check the hypothetical was_delayed flag. A higher priority ISR can often check if a lower priority interrupt is pending though.
For your specific situation
A possible approach is to use a timer and DMA (similar to audio streaming, double-buffered/circular modes are preferred) to continuously sample your GPIO pins to a buffer, and then you scan the buffer to determine when the pins changed. Note that this means the CPU must scan the buffer before it is overwritten again by DMA, which means the CPU can only sleep in short intervals and must keep the timer and DMA clocks running. ST's AN4666 is a relevant document, and has example code here (account required to download example code). They're using a different microcontroller, but they claim the approach can be adapted to others in their lineup.
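As a rough illustration, here is how the buffer scan might look once a timer-triggered DMA channel is filling a circular buffer with GPIO port samples. All names and the sample rate are illustrative; the timer/DMA setup itself is device-specific and omitted:

#include <stdint.h>

#define SAMPLE_COUNT   1024u
#define SAMPLE_RATE_HZ 1000000u
#define PIN_MASK       (1u << 4)            /* the pin we want to timestamp */

volatile uint16_t samples[SAMPLE_COUNT];    /* filled by DMA from the GPIO input data register */

/* Scan [start, end) for a rising edge; returns the sample index or -1. */
static int find_rising_edge(unsigned start, unsigned end)
{
    for (unsigned i = start; i != end; i = (i + 1) % SAMPLE_COUNT) {
        unsigned prev = (i + SAMPLE_COUNT - 1) % SAMPLE_COUNT;
        if (!(samples[prev] & PIN_MASK) && (samples[i] & PIN_MASK))
            return (int)i;                  /* edge occurred between prev and i */
    }
    return -1;
}

/* Timestamp in microseconds, relative to base_us at the buffer start. */
static uint32_t edge_time_us(uint32_t base_us, int index)
{
    return base_us + (uint32_t)index * (1000000u / SAMPLE_RATE_HZ);
}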
Otherwise, with your current setup, I don't think there is a better solution than the one you're using (the flag that's set when you detect a delay). The ARM Cortex-M0+ NVIC does not have a feature to indicate if an ISR was delayed.
A refinement to your current approach might be making the ISRs as short as possible, so they only do the timestamp collection and then put any other work into a queue for processing by the main application at a lower priority (only applicable if the work is more complex than the enqueue operation, and if the work isn't time-sensitive). Eliminating or making the interrupts-disabled regions short should also help.
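A rough sketch of that split, assuming a single producer (the ISR) and a single consumer (the main loop); the names are illustrative:

#include <stdint.h>

#define QLEN 16u                               /* power of two keeps the wrap cheap */
static volatile uint32_t ts_queue[QLEN];
static volatile uint8_t  q_head, q_tail;        /* ISR writes head, main loop writes tail */

extern uint32_t timestamp_now(void);            /* hypothetical 1 MHz timer read */
extern void process_event(uint32_t ts);         /* hypothetical non-ISR worker */

void pin_change_isr(void)                       /* illustrative ISR body */
{
    ts_queue[q_head % QLEN] = timestamp_now();  /* only grab the timestamp ... */
    q_head++;
    /* ... clear the interrupt source; everything else is deferred */
}

void main_loop_poll(void)
{
    while (q_tail != q_head) {                  /* drain at main-loop priority */
        process_event(ts_queue[q_tail % QLEN]);
        q_tail++;
    }
}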

What happens when a interrupt occurs in RTOS while currently in any task or another ISR?

I have read this question, but it does not fully answer mine: when one interrupt is executing, it does not always disable all the other interrupts. Whether it does depends on the interrupt type (and in some cases we have to handle it manually in our program).
My question is: what happens when an interrupt occurs while another interrupt is already executing? If a low-priority interrupt is executing and a high-priority interrupt occurs, what will happen?
It depends on the system. If the microcontroller/interrupt controller supports nested interrupts and the application enables that feature then a higher priority interrupt will interrupt a lower priority interrupt. In this case the lower priority interrupt will resume when the higher priority interrupt is complete. But if the system does not support nested interrupts then the subsequent interrupt request will pend and be serviced when the active interrupt service routine is complete.
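On a Cortex-M microcontroller with CMSIS, for example, the nesting behaviour follows directly from the NVIC priorities; a sketch (the device header and IRQ names are just examples):

#include "stm32f1xx.h"        /* example CMSIS device header */

void configure_irq_priorities(void)
{
    NVIC_SetPriority(USART1_IRQn, 1);   /* numerically lower value = higher priority */
    NVIC_SetPriority(TIM2_IRQn,   3);   /* lower priority */
    NVIC_EnableIRQ(USART1_IRQn);
    NVIC_EnableIRQ(TIM2_IRQn);
    /* If the TIM2 handler is running when the USART interrupt fires, the USART
     * handler preempts it and the TIM2 handler resumes afterwards.  Without
     * nesting support, the USART request would simply stay pending until the
     * TIM2 handler returned. */
}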
I think this is too broad for SO, and it is architecture dependent. I'll try to give a brief overview anyway, expecting some downvotes for it. ;)
Mainly, if the architecture allows nested interrupts, the lower-priority interrupt is interrupted while executing so the CPU can jump to the ISR of the higher-priority interrupt.
But there can also be an NMI (Non-Maskable Interrupt), which has priority over all other interrupts and cannot be disabled.
Usually (all, I think) architectures also have a global interrupt enable flag, which must be set to allow other interrupts to be served. This also means that an ISR, while it is executing, can disable other interrupts for the duration of its job.
Think, for example, of an RTOS implementation: the scheduler can easily be built on a timer and its interrupt. This interrupt should usually have the lowest priority and must not block other interrupts: this guarantees that interrupts are served as soon as possible, regardless of the context switching done by the RTOS scheduler.
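On a Cortex-M port, for example, this usually amounts to giving the tick and context-switch exceptions the lowest possible priority; a sketch using CMSIS (0xFF simply means "as low as possible", the hardware only keeps the implemented priority bits):

#include "stm32f1xx.h"        /* example CMSIS device header */

void rtos_set_scheduler_priorities(void)
{
    NVIC_SetPriority(SysTick_IRQn, 0xFF);  /* tick that drives the scheduler */
    NVIC_SetPriority(PendSV_IRQn,  0xFF);  /* exception where the context switch runs */
    /* Device interrupts keep numerically lower (i.e. higher) priorities, so
     * they are never held up by the scheduler's own interrupt work. */
}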
The question was
What happens when a interrupt occurs in RTOS while currently in any task or another ISR?
I have written two commercial RTOSes and there is no answer that satisfies all of the criteria. However, I CAN answer as broadly as the question:
Depending on what is allowed, it will act as a normal interrupt. The trouble is that "what happens" is a little broad - some RTOSes do work behind the scenes with interrupts - so the question is not specific enough.
"What happens" with regard to a task is that NOTHING happens with regard to a task. An interrupt is an interrupt, and its relation to a task depends on the programming. Since I don't read minds, again, the question is not specific enough.
The BEST answer is 42 (HHGTTG)
There is no single answer: sometimes nothing happens and the lower-priority interrupt continues to completion; sometimes the higher-priority one interrupts the lower. It depends first on the chip/system design, and second on the individual programmers involved, both the RTOS and the application folks.
Or, to put it another way, what happens is whatever those individuals intended to happen in their design and implementation.

ARM GIC Interrupt starvation

Not sure if there are similar questions. I tried to backread but can't find any, so here it is.
In my bare-metal application that uses an ARM Cortex-A9 (dual core with GIC), some of the interrupt sources are four FPGA interrupts (say IRQ IDs 58, 59, 60, 61) that have the same priority, and by design all of them trigger continuously and more or less simultaneously at run time. The interrupt handlers qualify as long, but not very long.
All interrupts fire and are detected by the GIC, and all are flagged as PENDING. The problem is that only the two lower-numbered interrupts (58, 59) get handled by the CPU, starving the other two. Once 58 or 59 is done, its source triggers again and grabs the CPU over and over. My other two interrupts are starved indefinitely.
I played around with priorities, assigning higher priorities to 60 and 61. Sure enough, 60 and 61 triggered and got handled by the CPU, but then 58 and 59 were starved. So it really is an issue of starvation.
Is there any way out of here, such that the other two will still be processed given their triggering rate?
Assuming the GIC implementation is one of ARM's designs, then the arbitration scheme for multiple interrupts at the same priority is fixed at "dispatch the lowest-numbered one", so if you were hoping it could be changed to some kind of round-robin scheme you're probably out of luck.
That said, if these interrupts are more or less permanently asserted and you're taking them back-to-back then that's a sign that you probably don't need to use interrupts, or at least that the design of your code is inappropriate. Depending on the exact nature of the task, here are some ideas I'd consider:
Just run a continuous polling loop cycling through each device in turn. If there are periods when each device might not need servicing and it's not straightforward to tell, retain a trivial interrupt handler that just atomically sets a flag/sequence number/etc. to inform the loop who's ready (see the sketch after these options).
Handle all the interrupts on one core, and the actual processing on the other. The handler just grabs the necessary data, stuffs it into a queue, and returns as quickly as possible, while the other guy just steadily chews through the queue.
If catching every single interrupt is less important than just getting "enough" of each of them on average, leave each one disabled for a suitable timeout after handling it. Alternatively, hack up your own round-robin scheduling by having only one enabled at a time, and the handler reenables the next interrupt instead of the one just taken.
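A minimal sketch of the first idea (a flag per device plus a round-robin polling loop); all names are illustrative:

#include <stdbool.h>

#define NUM_FPGA_SOURCES 4

static volatile bool fpga_ready[NUM_FPGA_SOURCES];

extern void service_fpga(unsigned id);       /* hypothetical per-device work */

void fpga_irq_handler(unsigned id)           /* trivial handler: note who fired and return */
{
    fpga_ready[id] = true;
    /* acknowledge/clear the source here so it can assert again */
}

void main_loop(void)
{
    for (;;) {
        for (unsigned id = 0; id < NUM_FPGA_SOURCES; id++) {   /* strict round robin */
            if (fpga_ready[id]) {
                fpga_ready[id] = false;
                service_fpga(id);
            }
        }
    }
}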
In my bare-metal application that uses ARM Cortex-A9 (dual core with GIC)...
Is there any way out of here, such that the other two will still be processed given their triggering rate?
Of course there are many ways.
You have a dual CPU so you can route a set to each CPU; 58/59 to CPU0 and 60/61 to CPU1. It is not clear how you have handled things with the distributor nor the per-CPU interfaces.
A second way is to read the status of 60/61 in the 58/59 handlers and do their work there as well. That is, you can always read the status of another interrupt source from within an IRQ handler.
You can also service each and every pending interrupt recorded at the start of the IRQ before acknowledging the original source - a variant of the second idea, implemented at the IRQ-controller layer.
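A rough sketch of that variant, with hypothetical device accessors standing in for the real FPGA status reads:

extern int  fpga_has_data(unsigned id);     /* hypothetical: reads the device's own status */
extern void fpga_service(unsigned id);      /* hypothetical: drains one item of work */

void fpga_shared_irq_handler(void)
{
    /* From whichever FPGA IRQ was dispatched, service every source that
     * currently has work, not just the one that caused the entry. */
    for (unsigned id = 0; id < 4; id++) {
        while (fpga_has_data(id))
            fpga_service(id);
    }
    /* then acknowledge/EOI the original interrupt to the GIC */
}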
I believe that most of these solutions avoid needless context save/restores and should also be more efficient.
Of course if you are asking the CPU to do more work than it can handle, priorities don't matter. The issue may be your code is not efficient; either the bare metal interrupt infrastructure or your FPGA IRQ handler. It is also quite likely the FPGA to CPU interface is not designed well. You may need to add FIFOs in the FPGA to buffer the data so the CPU can handle more data at a time. I have worked with several FPGA designers. They have a lot of flexibility and usually if you ask for something that will make the IRQ handler more efficient, they can implement it.

AVR8 Real Time Scheduler, Serial Communication

I am currently programming an ATmega32u4. I have implemented serial communication using a built-in interrupt that executes every time a byte is received on the Rx pin. The byte on the Rx pin is placed in a one-byte hardware buffer, which is overwritten when another byte is received on the Rx pin. This uses a built-in Atmel library.
ISR(USART1_RX_vect, ISR_BLOCK)
{
    /* Copy the received byte from the UART data register into the ring buffer */
    RingBuffer_Insert(&usart_rx_buffer, UDR1);
}
My code executes an interrupt when a byte is received on the Rx pin. When a byte is received it is entered into my ring buffer usart_rx_buffer, where it is later decoded.
If another interrupt is executing and this delays the UART interrupt long enough for the one-byte hardware buffer to be overwritten, that byte is lost.
The result of this is that other interrupts cannot take longer than one character time (determined by the baud rate) to execute, otherwise serial bytes are lost. Is there any way to avoid this problem?
One way to solve this problem would be to use the attribute ISR_NOBLOCK in all interrupts that take longer than the baud rate, causing the interrupt enable flag to be activated by the compiler as early as possible within the ISR and allowing the USART1_RX_vect to be executed inside other interrupts. However, "care should be taken to avoid stack overflows, or to avoid infinitely entering the ISR for those cases where the AVR hardware does not clear the respective interrupt flag before entering the ISR".
I've experienced this same problem and so far this was the best solution I could think of. I didn't use it nor tested it, though.
Edit: keep in mind that all other interrupts could also be executed inside interrupts declared with the attribute ISR_NOBLOCK, not just the interrupt you want. So you would basically allow all interrupts to be nested inside all interrupts, except USART1_RX_vect (and those declared with ISR_BLOCK). This is the main problem with this solution (besides the stack overflow problem).
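A minimal sketch of what this looks like with avr-libc (the vector chosen here is just an example of a long-running handler):

#include <avr/interrupt.h>

/* ISR_NOBLOCK makes the compiler re-enable global interrupts as early as
 * possible inside the handler, so USART1_RX_vect can still preempt it. */
ISR(TIMER1_COMPA_vect, ISR_NOBLOCK)
{
    /* long-running work that would otherwise block the UART receive ISR */
}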
The result of this is that other interrupts cannot take longer than one character time (determined by the baud rate) to execute, otherwise serial bytes are lost. Is there any way to avoid this problem?
All your observations are correct. While allowing nested interrupts as suggested in Nuno's answer could work, it is normally something you would/should want to avoid. Allowing nested interrupts everywhere makes code pretty unpredictable.
I would first try to optimize the execution time of the interrupts that are blocking your UART receive ISR. Take a look at the interrupt priorities. If several interrupts are pending, they will be executed according to this priority. This can result in "starvation" of lower level interrupts, if there is "always" a higher level interrupt pending.
What is your baud rate? Even at 115200 bit/s you can execute about 700 instructions (assuming 8MHz) per byte received. ISRs should be as short as possible. If there is one single ISR that is taking long and you can't optimize it for what reason whatsoever, you could consider just allowing nested interrupts in this single ISR (this is only feasible if the execution is not critical).
If you use a high baud rate, consider reducing it. 9600 baud is often enough, but may require asynchronous sending to prevent blocking code.

What is the irq latency due to the operating system?

How can I estimate the irq latency on an ARM processor?
What is the definition for irq latency?
Interrupt request (IRQ) latency is the time it takes for an interrupt request to travel from the source of the interrupt to the point at which it is serviced.
Because different interrupts come from different sources via different paths, their latency obviously depends on the type of the interrupt. You can find a table with very good explanations of latency (both values and causes) for particular interrupts on the ARM site.
You can find more information about it in ARM9E-S Core Technical Reference Manual:
4.3 Maximum interrupt latency
If the sampled signal is asserted at the same time as a multicycle instruction has started its second or later cycle of execution, the interrupt exception entry does not start until the instruction has completed.
The longest LDM instruction is one that loads all of the registers, including the PC. Counting the first Execute cycle as 1, the LDM takes 16 cycles.
• The last word to be transferred by the LDM is transferred in cycle 17, and the abort status for the transfer is returned in this cycle.
• If a Data Abort happens, the processor detects this in cycle 18 and prepares for the Data Abort exception entry in cycle 19.
• Cycles 20 and 21 are the Fetch and Decode stages of the Data Abort entry respectively.
• During cycle 22, the processor prepares for FIQ entry, issuing Fetch and Decode cycles in cycles 23 and 24.
• Therefore, the first instruction in the FIQ routine enters the Execute stage of the pipeline in stage 25, giving a worst-case latency of 24 cycles.
and
Minimum interrupt latency
The minimum latency for FIQ or IRQ is the shortest time the request can be sampled by the input register (one cycle), plus the exception entry time (three cycles). The first interrupt instruction enters the Execute pipeline stage four cycles after the interrupt is asserted.
There are three parts to interrupt latency:
1. The interrupt controller picking up the interrupt itself. Modern processors tend to do this quite quickly, but there is still some time between the device signalling its pin [or whatever the method of signalling interrupts is] and the interrupt controller picking it up - even if it's only 1 ns, it's time.
2. The time until the processor starts executing the interrupt code itself.
3. The time until the actual code supposed to deal with the interrupt is running - that is, after the processor has figured out which interrupt it is, and what portion of driver code or similar should deal with it.
Normally, the operating system won't have any influence over 1.
The operating system certainly influences 2. For example, an operating system will sometimes disable interrupts (to avoid an interrupt interfering with some critical operation, such as modifying something to do with interrupt handling, scheduling a new task, or even executing inside an interrupt handler). Some operating systems may disable interrupts for several milliseconds, where a good realtime OS will not have interrupts disabled for more than microseconds at most.
And of course, the time it takes from the first instruction in the interrupt handler runs, until the actual driver code or similar is running can be quite a few instructions, and the operating system is responsible for all of them.
For real-time behaviour, it's often the "worst case" that matters, whereas in non-real-time OSes the overall execution time is much more important. So if it's quicker not to enable interrupts for a few hundred instructions, because it saves a couple of instructions of "enable interrupts, then disable interrupts", a Linux or Windows type OS may well choose to do so.
Mats and Nemanja give some good information on interrupt latency. There are two more issues I would add, to the three given by Mats.
1. Other simultaneous/near-simultaneous interrupts.
2. OS latency added due to masking interrupts. Edit: this is in Mats' answer, just not explained as much.
If a single core is processing interrupts, then when multiple interrupts occur at the same time there is usually some priority resolution. However, interrupts are often disabled in the interrupt handler unless priority (nested) interrupt handling is enabled. So, for example, if a slow NAND flash IRQ is signaled and running and an Ethernet interrupt then occurs, the Ethernet interrupt may be delayed until the NAND flash IRQ finishes. Of course, if you have priority interrupts and you are concerned about the NAND flash interrupt, then things can actually be worse if the Ethernet is given priority.
The second issue is when mainline code clears/sets the interrupt flag. Typically this is done with something like,
mrs   r9, cpsr               @ read the current program status register
biceq r9, r9, #PSR_I_BIT     @ clear the IRQ-disable bit (conditionally)
msreq cpsr_c, r9             @ write it back so the change actually takes effect
Check arch/arm/include/asm/irqflags.h in the Linux source for many macros used by main line code. A typical sequence is like this,
lock interrupts;
manipulate some flag in struct;
unlock interrupts;
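In Linux C code this same sequence is usually written with the irqflags helpers; a minimal sketch (the struct and field are purely illustrative):

#include <linux/irqflags.h>

struct device_state {
    unsigned long status_flags;
};

static struct device_state state;

static void update_status_flag(void)
{
    unsigned long flags;

    local_irq_save(flags);        /* mask interrupts on this CPU ("lock interrupts") */
    state.status_flags |= 0x1;    /* manipulate some flag in the struct */
    local_irq_restore(flags);     /* restore the previous interrupt state ("unlock") */
}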
A very large interrupt latency can be introduced if that struct results in a page fault. The interrupts will be masked for the duration of the page fault handler.
The Cortex-A9 has lock-free (exclusive load/store, ldrex/strex) instructions that can prevent this by never masking interrupts; they are better than the older swp/swpb instructions. This second issue is much like the IRQ latency due to ldm/stm type instructions (these are simply the longest-running instructions).
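For example, with GCC/Clang a flag update can be written with an atomic builtin, which on ARMv7 compiles to an ldrex/strex retry loop instead of an interrupts-off section (the flag word is illustrative):

#include <stdint.h>

static uint32_t event_flags;

static void set_event_flag(uint32_t bit)
{
    /* ldrex/strex loop on Cortex-A9: no interrupt masking, so there is no
     * window where a page fault could occur with interrupts disabled. */
    __atomic_fetch_or(&event_flags, bit, __ATOMIC_SEQ_CST);
}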
Finally, a lot of the technical discussions will assume zero-wait state RAM. It is likely that the cache will need to be filled and if you know your memory data rate (maybe 2-4 machine cycles), then the worst case code path would multiply by this.
Whether you have SMP interrupt handling, priority interrupts, and lock free main line depends on your kernel configuration and version; these are issues for the OS. Other issues are intrinsic to the CPU/SOC interrupt controller, and to the interrupt code itself.

Resources