ARM WFI won't sleep

I am trying to enter standby mode on a Cortex-M4. The normal behaviour is that the device wakes up about every 2 minutes, but on my latest FW release the code seems to get "randomly" stuck.
After investigation it appears that the code passes the WFI instruction without going to standby (no standby => no reset => infinite loop => ... => 42).
After much reading of an unclear spec, my understanding is that WFI may not go to sleep if there are pending interrupts.
Can you confirm that?
How do I ensure all pending interrupts are cleared before calling WFI?

There are three conditions that cause the processor to wake up from a WFI instruction:
a non-masked interrupt occurs and its priority is greater than the current execution priority (i.e. the interrupt is taken)
an interrupt masked by PRIMASK becomes pending
a Debug Entry request.
If any of the wake up conditions are true when the WFI instruction executes, then it is effectively a NOP (i.e. you don't go to sleep).
As for making sure that no interrupts are pending, that is your code's job. Usually it means satisfying the interrupt source so that it stops asserting its interrupt request, and then clearing the corresponding pending bit. You can see what is pending by reading the interrupt pending registers, but interrupt handlers are usually tasked with making sure they leave things quiescent.
Note that most systems have to do some work immediately before or after executing WFI. For example, there is often a test that must be done to determine whether there is any additional work to do before deciding to go to sleep with WFI. That test and the execution of WFI are then done in a critical section where PRIMASK is set to 1 (so we are exercising option #2 above). This ensures that no interrupt gets in between the test and the WFI, and that after wakeup no interrupt runs before any additional operations (usually involving clocking) have been done. After wake up, PRIMASK is set back to 0 (exiting the critical section) and any pending interrupt is taken.
Also, ARM recommends executing a DSB instruction immediately before WFI to ensure that any outstanding data operations are finished before the processor goes to sleep. It may not be strictly necessary in all situations, but put it in just in case circumstances change and you overlook it.
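As a concrete illustration of the critical-section-plus-DSB pattern described above, here is a minimal CMSIS-style sketch (work_pending() and the device header name are assumptions, not from the original answer):

#include "device.h"          /* hypothetical device header providing the CMSIS intrinsics */

void sleep_if_idle(void)
{
    __disable_irq();         /* PRIMASK = 1: nothing can get between the test and the WFI */
    if (!work_pending()) {   /* hypothetical test for additional work */
        __DSB();             /* finish outstanding data operations before sleeping */
        __WFI();             /* sleeps, or falls through as a NOP if a wake condition is already true */
        /* restore clocks here, before any interrupt handler runs */
    }
    __enable_irq();          /* PRIMASK = 0: any pending interrupt is taken now */
}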

Related

The right way to clear an interrupt flag on STM32

I'm developing a bare-metal project on an STM32L4 and I'm starting from an existing code base.
The ISRs have been implemented the following way:
read interrupt status in the peripheral to know what event(s) provoked the interrupt
do something
clear the flags that were read at the beginning.
Is this the right way to clear the flags? Shouldn't the flags be cleared at the very beginning of the ISR? My understanding is that if the same peripheral event happens a second time during step 2, it will not provoke a second IRQ, so it would be lost. On the other hand, if you clear the flag as soon as you can, this second event would pulse the interrupt, whose state in the CPU would change to "pending and active": a second IRQ would happen.
PS: From the STM32 Processor Programming Manual I read: "STM32 interrupts are both level-sensitive and pulse-sensitive".
Definitely at the beginning (unless you have special reasons in the program logic), as some time is needed for the actual write to the flag-clear register to propagate through the buses.
If you decide for some reason to put it at the end of the interrupt, you should leave some instructions, place a barrier instruction, or read back the register before the interrupt routine returns, to make sure that the clear operation has propagated across the buses. Otherwise you may get "phantom" duplicate invocations of the routine.
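A minimal sketch of the clear-early pattern on an STM32L4 EXTI line (the handler name, line number, and register/bit names follow the STM32L4 CMSIS headers but should be treated as assumptions for your part):

void EXTI0_IRQHandler(void)       /* hypothetical handler for EXTI line 0 */
{
    uint32_t flags = EXTI->PR1;   /* read which event(s) fired */
    EXTI->PR1 = flags;            /* clear immediately: PR1 is write-1-to-clear */
    __DSB();                      /* barrier (or read PR1 back) so the clear propagates
                                     across the buses before the handler can return */
    if (flags & EXTI_PR1_PIF0) {
        /* do something with the recorded event */
    }
}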

Why can't processes preempt interrupts?

I know that when an interrupt occurs the running process is put on hold and the Interrupt Service Routine is called. The current pointer points to the process that was interrupted, and I was told that when an interrupt occurs it is not linked to a specific process. So my question is: why can only another interrupt preempt an existing interrupt routine?
Also, when a process (p2) preempts another process (p1), who is calling the schedule() method?
The first two answers both show some significant misunderstanding about interrupts and how they work.
Of particular interest: for the CPUs that we usually use (x86, PowerPC, 68xxx, ARM, and many others), each interrupt source has a priority.
Sadly, there are some CPUs, for instance the 68HC11, where all the interrupts except the reset interrupt and the NMI interrupt have the same priority, so servicing any one of those interrupt events will block all the other (same-priority) interrupt events.
For our discussion purposes, a higher-priority interrupt event can/will interrupt a lower-priority interrupt handler.
(An interrupt handler can modify the appropriate hardware register to disable all interrupt events or just certain interrupt events, or even enable lower-priority interrupts by clearing its own interrupt pending flag, usually a bit in a register.)
In general, the scheduler is invoked by an interrupt handler (or by a process willingly giving up the CPU). That interrupt is normally the result of a hardware timer expiring/reloading and triggering the interrupt event.
An interrupt is really just an event that is waiting to be serviced.
The interrupt event, when allowed, for instance by being the highest-priority interrupt that is currently pending, will cause the PC register to load the first address of the related interrupt handler.
The act of diverting the PC register to the interrupt handler will (at a minimum) push the prior PC register value and the status register onto the stack. (In some CPUs there is a special set of save areas for those registers, so they are saved there rather than on the stack.)
The act of returning from an interrupt, for instance via the RTI instruction, will 'automatically' cause the prior PC and status register values to be restored.
Note: returning from an interrupt handler does not clear the interrupt-event pending indication, so the interrupt handler needs to modify the appropriate register before exiting; otherwise the flow of execution will immediately re-enter the interrupt handler.
The interrupt handler has to, upon entry, push any other registers that it modifies and, when ready to exit, restore them.
Only interrupts of a lower priority are blocked by the interrupt event diverting the PC to the appropriate interrupt handler. Blocked, not disabled.
On some CPUs, for instance most DSPs, there are also software interrupts that can be triggered by an instruction execution.
This is usually used by hardware interrupt handlers to trigger the data processing after some amount of data has been input/saved into a buffer. This separates the I/O from the processing, thereby enabling the hardware interrupt event handler to be quick while still having the data processed in a timely manner (a sketch of this split follows at the end of this answer).
The above contradicts much of what the comments and other answers state. However, those comments and answers are from the misleading view of the 'user' side of the OS, while I normally program right on the bare hardware and so am very familiar with what actually happens.
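To make that software-interrupt/deferred-processing split concrete, here is a minimal Cortex-M flavoured sketch using the CMSIS NVIC calls (the IRQ number, handler names, and helper functions are illustrative assumptions, and a CMSIS device header is assumed to be included):

#define PROCESS_IRQn ((IRQn_Type)30)       /* assumption: a spare interrupt line reserved for software use */

volatile uint8_t  rx_buf[256];
volatile uint32_t rx_head;

void UART_IRQHandler(void)                 /* hypothetical hardware ISR: kept short */
{
    rx_buf[rx_head++ & 0xFF] = uart_read_byte();  /* hypothetical device read */
    NVIC_SetPendingIRQ(PROCESS_IRQn);      /* "software interrupt": pend the low-priority handler */
}

void PROCESS_IRQHandler(void)              /* runs once nothing more urgent is active */
{
    process_buffered_data();               /* hypothetical: the actual data processing */
}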
So my question is: why can only another interrupt preempt an existing interrupt routine?
A hardware interrupt usually puts the processor hardware in an interrupt state where all interrupts are disabled. The interrupt-handler can, and often does, explicitly re-enable interrupts of a higher priority. Such an interrupt can then preempt the lower-priority interrupt. That is the only mechanism that can interrupt a hardware interrupt.
Also, when a process (p2) preempts another process (p1), who is calling the schedule() method?
That depends somewhat on whether the preemption is initiated by a syscall from a thread already running, or by a hardware interrupt that causes a handler/driver to run and subsequently enter the kernel to request a reschedule. The exact mechanisms (states, stacks, etc.) used are architecture-dependent.
Regarding your first question: While an interrupt is running, interrupts are disabled on that processor. Therefore, it cannot be interrupted.
Regarding your second question: A process never preempts another process; it is always the OS doing that. The OS calls the scheduler routine regularly, and there it decides which process will run next. So p2 doesn't say "I want to run now"; it just has some attributes like a priority, remaining time slot, etc., and the OS then decides whether p2 should run now.

arm sleep mode entry and exit differences WFE, WFI

I am reasonably new to the ARM architectures and I am trying to wrap my head around the wake up mechanism.
So first of all I am finding it difficult to find good info on this. ARM's documentation seems to be very terse on the topic.
What I'd like to understand is when the Cortex (particularly the M0 as that's what I am working with) will wake up.
For reference, I have also consulted the following:
What is the purpose of WFI and WFE instructions and the event signals?
Why does the processor enter standby when using WFE instruction but not when using WFI instruction?
The docs on the WFE instructions are:
3.7.11. WFE
Wait For Event.
Syntax
WFE
Operation
If the event register is 0, WFE suspends execution until one of the following events occurs:
an exception, unless masked by the exception mask registers or the current priority level
an exception enters the Pending state, if SEVONPEND in the System Control Register is set
a Debug Entry request, if debug is enabled
an event signaled by a peripheral or another processor in a multiprocessor system using the SEV instruction.
If the event register is 1, WFE clears it to 0 and completes immediately.
For more information see Power management.
Note
WFE is intended for power saving only. When writing software assume that WFE might behave as NOP.
Restrictions
There are no restrictions.
Condition flags
This instruction does not change the flags.
Examples
WFE ; Wait for event
The docs on WFI:
3.7.12. WFI
Wait for Interrupt.
Syntax
WFI
Operation
WFI suspends execution until one of the following events occurs:
an exception
an interrupt becomes pending, which would preempt if PRIMASK was clear
a Debug Entry request, regardless of whether debug is enabled.
Note
WFI is intended for power saving only. When writing software assume that WFI might behave as a NOP operation.
Restrictions
There are no restrictions.
Condition flags
This instruction does not change the flags.
Examples
WFI ; Wait for interrupt
So, some questions:
1) Firstly, can someone please clarify the difference between:
a) System Handler Priority Registers
b) Interrupt Priority Registers.
Is it just that b) is for interrupts that aren't system-related, while system exceptions such as PendSV use a)?
Now for some scenarios. Really I would like to understand how the scenarios governed by:
NVIC IRQ enable
NVIC pending
PRIMASK
affect the entry and exit of WFE and WFI.
So the various combinations of these bits yield 8 different scenarios:
{NVIC_IRQ enable, NVIC pending, PRIMASK}.
I have already added my vague understanding so far. Please help me with this table.
000 - No prevention of WFE or WFI entry but no wake up condition either
001 - as 000
010 - How does pending affect entry into sleep mode for WFE and WFI?
011 - I guess the answer here is as 010 but with possibly different wake up conditions?
100 - I'd guess WFE and WFI both enter low power mode and exit low power mode no problem.
101 - Any difference to WFE and WFI power mode exit here?
110 - No idea!
111 - No idea!
I am excluding the priorities here as I'm not too concerned about the exception handling order just yet.
Excluding SEV and the event signals, does WFE behave the same as WFI if SEVONPEND is 0?
The primary mechanism for wake that you'll see on a Cortex-M is an interrupt, hence WFI (wait for interrupt). On all of the implementations that I've seen, that results in clock-gating the core, although deeper-sleep/higher-latency modes are sometimes available if the design supports them.
WFE is more relevant in multi-processor designs.
With regard to the questions:
1. Interrupts and System Handlers are very similar in the Cortex-M, differing primarily by how they are triggered. The architecture distinguishes between them, but in practice they are the same.
As for your bit tables, they don't really make sense. Each Cortex-M implementation has its own interpretation of what happens during WFI. It can vary from basic clock gating to deep-sleep modes. Consult your microprocessor documentation for the real story.
PRIMASK doesn't affect wake-from-sleep behavior.
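For reference, the standard CMSIS-level knob for selecting between clock gating and a deeper mode is the SLEEPDEEP bit in the System Control Register; a minimal sketch (what "deep sleep" actually enters is entirely device-specific, so treat this as an illustration only):

SCB->SCR &= ~SCB_SCR_SLEEPDEEP_Msk;   /* shallow sleep: typically just clock-gates the core */
__WFI();

SCB->SCR |= SCB_SCR_SLEEPDEEP_Msk;    /* request the implementation's deep-sleep mode */
__WFI();                              /* which low-power state this enters depends on the device */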
My answer to your question about the difference between WFI and WFE is based on the ARM Cortex-A9 MPCore; please take a look at the ARM Cortex-A9 MPCore TRM.
Basically, there are four CPU power modes: run mode, standby mode, dormant mode, and shutdown mode.
The difference between WFI and WFE is in how the CPU is brought back to run mode.
WFE also works with the execution of an SEV instruction on any processor in the multiprocessor system, and with an assertion of the EVENTI input signal.
WFI responds to neither of these.
They also differ in how the wake-up cause is handled:
waking from WFI goes through the IRQ handler; waking from WFE does not have to.
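That difference is what makes WFE suitable for sleeping inside a polling loop; a minimal sketch (the flag, and whoever sets it, an ISR or another core executing SEV afterwards, are assumptions):

volatile uint32_t data_ready = 0;   /* set by an ISR, or by another core that then executes SEV */

void wait_for_data(void)
{
    while (!data_ready)
        __WFE();                    /* sleep; a SEV, an interrupt, or a set event register re-checks the flag */
}

Waking this way need not cost an interrupt entry/exit: execution simply continues after the WFE.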

Is the abrupt ending of a process using Control + C a trap or an interrupt?

It seems as though the difference between a trap and an interrupt is clear: a trap is a software-invoked call to the kernel (such as through an exception) and an interrupt is pertinent to the hardware (the disk, I/O and peripheral devices such as the mouse and the keyboard...) (learn more about the difference here).
Knowing this, under what category should pressing Control + C to end a process be classified? Is it a software-invoked call and thus a trap since it can be executed from the Shell, etc. or is it an interrupt since it's a signal that the CPU receives from the keyboard? Or are interrupts wholly outside users' domain, meaning that it's the hardware interacting with the CPU at a level that the user cannot reach?
Thank you!
It's first and foremost a signal — pressing Control-C causes the kernel to send a signal (of type SIGINT) to the current foreground process. If that process hasn't set up a handler for that signal (using one of the system calls from the signal() family), it causes the process to be killed.
The signal is, I suppose, the "interrupt" signal, but this is unrelated to hardware interrupts. Those are only used internally by the kernel.
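A minimal sketch of intercepting that signal as described above (standard POSIX calls; the cleanup message is illustrative):

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigint = 0;

static void on_sigint(int sig)
{
    (void)sig;
    got_sigint = 1;                /* only async-signal-safe work in the handler */
}

int main(void)
{
    struct sigaction sa;
    sa.sa_handler = on_sigint;     /* without a handler, SIGINT kills the process */
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGINT, &sa, NULL);

    while (!got_sigint)
        pause();                   /* sleep until any signal arrives */

    printf("caught SIGINT, cleaning up\n");
    return 0;
}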
The difference between a trap and an interrupt is not as you described in your question (thanks for the reference) but lies in the asynchronous (or not) nature of the events producing them. A trap means an interruption of the code execution due to a normally incorrect/internal operation (like dividing by zero or a page fault, or, as you noted, a software interrupt), but it always occurs at the same place in the code (synchronously with the code execution), while an interrupt occurs because of external hardware: some device signals the CPU to interrupt what it is doing, for example because it is ready to send some data. By nature, traps are synchronous and interrupts aren't.
That said, both are anomalous events that change the normal course of execution of the CPU. Both are hardware-produced, but for different reasons: the first occurs synchronously (you always know when, at which instruction, it will be produced, if produced at all) and the second does not (you don't know in advance which instruction will be executing when the external hardware asserts the interrupt line). Also, there are two kinds of traps, depending on the event that triggered them: one leaves the instruction pointer pointing to the next instruction to be executed (for example a divide-by-zero trap), and the other leaves it pointing to the same instruction that caused the trap (for example a page fault, where the instruction has to be re-executed once the cause of the trap has been corrected). Of course, software interrupts are by their nature always traps, as the exact point in the program flow where the CPU will be interrupted can be predicted.
So, with this explanation, you can probably answer your question yourself: a Ctrl-C interrupt is an interrupt, as you cannot predict in advance when it will interrupt the CPU execution and you cannot mark that point in your code.
Remember: interrupts occur asynchronously, traps do not.
Pressing Ctrl+C on Linux systems kills a process with the signal SIGINT, which can be intercepted by a program so it can clean itself up before exiting, or not exit at all.
Had it been a trap, the process would have died instantly!
Hence, it is a kind of software interrupt!
Control-C is not an interrupt... at least not on PC (and now Mac) hardware. In other words, the keyboard controller doesn't generate a specific interrupt for the key combination "control" and "C".
The keyboard uses only one interrupt vector, which is triggered on a key down and a key up. The keyboard is an extremely slow hardware device. With the key repeat rate set to the fastest, holding down a key generates 33 interrupts per second.
If the designers of the operating system believe that control-C is extremely important, they may include the test "is this the key down for 'C', AND did the 'control' key trigger a keyboard interrupt some billions of machine cycles ago?" Then, while still processing the keyboard interrupt, they would generate a trap using a software interrupt instruction.
A better operating system would reduce the processing time of the keyboard interrupt to the strict minimum. It would just append to a circular buffer (ring buffer) the key code, which includes a pressed/released bit, and immediately terminate the interrupt.
The operating system would then, whenever it has time, notice the change in the ring buffer pointer. It would trigger the code that extracts the key code from the ring buffer, verify whether that code represents the "ctrl-C" combination, and set a flag saying "ctrl-C detected".
Finally, when the scheduler is ready to run a thread that belongs to the current process, it checks the "ctrl-C detected" flag. If it is set, the scheduler sets the PC to point to the SIGINT routine instead of resuming at the previous execution address.
No matter the details, "ctrl-C" cannot be an interrupt. It is either a trap, if raised from within the keyboard interrupt, or a synchronization object tested asynchronously by the scheduler.
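A minimal sketch of the ring-buffer scheme this answer describes (all names and the decode helper are illustrative assumptions):

#define KBUF_SIZE 64                          /* power of two so wrap-around stays cheap */

static volatile unsigned char kbuf[KBUF_SIZE];
static volatile unsigned int  k_head, k_tail; /* ISR writes head, consumer reads tail */

void keyboard_isr(void)                       /* hypothetical: runs on every key down/up */
{
    kbuf[k_head++ % KBUF_SIZE] = read_scancode();  /* hypothetical port read: append and return */
}

int ctrl_c_pending(void)                      /* called later, whenever the OS has time */
{
    while (k_tail != k_head) {
        unsigned char code = kbuf[k_tail++ % KBUF_SIZE];
        if (is_ctrl_c(code))                  /* hypothetical decode of the key-code stream */
            return 1;                         /* caller delivers SIGINT to the foreground process */
    }
    return 0;
}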

What is the irq latency due to the operating system?

How can I estimate the irq latency on ARM processor?
What is the definition for irq latency?
Interrupt request (IRQ) latency is the time it takes for an interrupt request to travel from the source of the interrupt to the point where it is serviced.
Because different interrupts come from different sources via different paths, their latency obviously depends on the type of the interrupt. You can find a table with very good explanations of latency (both values and causes) for particular interrupts on the ARM site.
You can find more information in the ARM9E-S Core Technical Reference Manual:
4.3 Maximum interrupt latency
If the sampled signal is asserted at the same time as a multicycle instruction has started its second or later cycle of execution, the interrupt exception entry does not start until the instruction has completed.
The longest LDM instruction is one that loads all of the registers, including the PC.
Counting the first Execute cycle as 1, the LDM takes 16 cycles.
• The last word to be transferred by the LDM is transferred in cycle 17, and the abort status for the transfer is returned in this cycle.
• If a Data Abort happens, the processor detects this in cycle 18 and prepares for the Data Abort exception entry in cycle 19.
• Cycles 20 and 21 are the Fetch and Decode stages of the Data Abort entry respectively.
• During cycle 22, the processor prepares for FIQ entry, issuing Fetch and Decode cycles in cycles 23 and 24.
• Therefore, the first instruction in the FIQ routine enters the Execute stage of the pipeline in stage 25, giving a worst-case latency of 24 cycles.
and
Minimum interrupt latency
The minimum latency for FIQ or IRQ is the shortest time the request can be sampled by the input register (one cycle), plus the exception entry time (three cycles). The first interrupt instruction enters the Execute pipeline stage four cycles after the interrupt is asserted.
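To put those cycle counts into time (assuming, purely for illustration, a hypothetical 100 MHz core clock, i.e. 10 ns per cycle): the 24-cycle worst case is about 240 ns and the 4-cycle minimum about 40 ns, before any operating-system effects are added.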
There are three parts to interrupt latency:
The interrupt controller picking up the interrupt itself. Modern processors tend to do this quite quickly, but there is still some time between the device signalling its pin [or whatever the method of signalling interrupts is] and the interrupt controller picking it up; even if it's only 1 ns, it's time.
The time until the processor starts executing the interrupt code itself.
The time until the actual code supposed to deal with the interrupt is running - that is, after the processor has figured out which interrupt, and what portion of driver-code or similar should deal with the interrupt.
Normally, the operating system won't have any influence over 1.
The operating system certainly influences 2. For example, an operating system will sometimes disable interrupts [to avoid an interrupt interfering with some critical operation, such as modifying something to do with interrupt handling, scheduling a new task, or even executing in an interrupt handler]. Some operating systems may disable interrupts for several milliseconds, where a good realtime OS will not have interrupts disabled for more than microseconds at the most.
And of course, the time from when the first instruction in the interrupt handler runs until the actual driver code or similar is running can be quite a few instructions, and the operating system is responsible for all of them.
For real-time behaviour, it's often the "worst case" that matters, whereas in non-real-time OSes the overall execution time is much more important; so if it's quicker not to re-enable interrupts for a few hundred instructions, because that saves several instructions of "enable interrupts, then disable interrupts", a Linux or Windows type OS may well choose to do so.
Mats and Nemanja give some good information on interrupt latency. There are two more issues I would add to the three given by Mats:
Other simultaneous/near-simultaneous interrupts.
OS latency added due to masking interrupts. Edit: this is in Mats' answer, just not explained as much.
If a single core is processing interrupts, then when multiple interrupts occur at the same time, usually there is some priority-based resolution. However, interrupts are often disabled in the interrupt handler unless priority (nested) interrupt handling is enabled. So, for example, if a slow NAND flash IRQ is signaled and running and then an Ethernet interrupt occurs, the Ethernet may be delayed until the NAND flash IRQ finishes. Of course, if you have priority interrupts and you are concerned about the NAND flash interrupt, then things can actually be worse if the Ethernet is given priority.
The second issue is when mainline code clears/sets the interrupt flag. Typically this is done with something like,
mrs r9, cpsr               @ read the current program status register
biceq r9, r9, #PSR_I_BIT   @ conditionally clear the IRQ-disable bit
msreq cpsr_c, r9           @ write back so the unmask actually takes effect
Check arch/arm/include/asm/irqflags.h in the Linux source for the many macros used by mainline code. A typical sequence looks like this (sketched with the kernel's real local_irq_save/local_irq_restore macros; the struct and flag are placeholders):
unsigned long flags;
local_irq_save(flags);       /* lock interrupts */
some_struct->flag = 1;       /* manipulate some flag in a struct */
local_irq_restore(flags);    /* unlock interrupts */
A very large interrupt latency can be introduced if that struct results in a page fault. The interrupts will be masked for the duration of the page fault handler.
The Cortex-A9 has lock-free instructions (the exclusive-access ldrex/strex, which supersede swp/swpb) that can prevent this by never masking interrupts. This second issue is much like the IRQ latency due to ldm/stm-type instructions (these are simply the longest instructions to run).
Finally, much of the technical discussion assumes zero-wait-state RAM. In practice the cache may need to be filled, and if you know your memory access cost (maybe 2-4 machine cycles per access), then the worst-case code path multiplies by this.
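As a rough worked example (assuming the 24-cycle worst case quoted earlier and a hypothetical 3-cycle cost per uncached fetch): 24 × 3 = 72 cycles before the first handler instruction executes, roughly tripling the zero-wait-state figure.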
Whether you have SMP interrupt handling, priority interrupts, and lock-free mainline code depends on your kernel configuration and version; these are issues for the OS. Other issues are intrinsic to the CPU/SoC interrupt controller and to the interrupt code itself.
