switching between ARM/THUMB state

Why does an ARM controller return to ARM state from THUMB state when an exception occurs?

One explanation might be that ARM mode is the CPU's "native" operating mode, and that it's possible to do more operations in that mode than in the limited Thumb mode. The Thumb mode, as far as I've understood, is optimized for code size, which might mean it lacks certain instructions that perhaps are necessary in exception processing.
This page mentions that exception processing is always done in ARM mode. It doesn't provide any reasons why, so maybe it's just The Way It Is, by design. It does talk about ways to exit from exception processing back to the proper (ARM or Thumb) mode though, so as long as you're not writing the exception handler yourself, you might be able to ignore this issue. That, of course, assumes that your system is set up with a "default" exception handler that does retain the execution mode.
On the other hand, this page says the following about the interrupt vectors of the Cortex-M3 ARM implementation:
The LSB of each exception vector indicates whether the exception is to be executed in the Thumb state.
So it doesn't seem to be universally true; perhaps you can make your particular exception run in Thumb mode.

Perhaps it is because each entry in the interrupt vector table is really an ARM instruction, and processing it requires being in ARM mode. This simplifies the programmer's job, as you don't have to write two handlers, one for ARM mode and one for Thumb mode: there is only one entry point per exception, so there can be only one instruction type to handle it. You can certainly switch to Thumb mode once the handler is entered, no different from switching to Thumb mode after a reset exception.
The Cortex-M3 has redefined the interrupt vector table to be more traditional (an address instead of an instruction). By necessity, I would assume: the Cortex-M3 is a Thumb-2-only processor, so either they redefine the vector table to hold Thumb instructions, or they redefine the table to hold addresses, or they keep just enough of an ARM core to process the load or jump that you normally see in a vector table entry.
Basically, you would either need two entries per exception, one for an ARM-based handler and one for a Thumb-based handler, or you require the user to write a handler whose entry point is in one specific mode.
Even with a single-mode entry point into a handler, you still have to be aware of the mode the processor was in when the exception occurred, both to know what address to return to and to be able to inspect the instruction that caused the exception.
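To make the contrast concrete, here is a minimal sketch (in C) of what a Cortex-M-style vector table can look like: an array of addresses rather than instructions. The handler names, the _stack_top symbol, and the section name are all made up for illustration and depend on your toolchain and linker script.
#include <stdint.h>

typedef void (*vector_t)(void);

/* Hypothetical handler names; for Thumb code the toolchain emits these
   addresses with the LSB set, marking them as Thumb entry points. */
void Reset_Handler(void);
void HardFault_Handler(void);

extern uint32_t _stack_top;   /* assumed: provided by the linker script */

/* Cortex-M style vector table: entry 0 is the initial stack pointer and
   every other entry is a handler ADDRESS, not an instruction. */
__attribute__((section(".vectors")))
const vector_t vector_table[] = {
    (vector_t)&_stack_top,    /* 0: initial SP (this slot is an address too) */
    Reset_Handler,            /* 1: reset                           */
    0,                        /* 2: NMI (left empty in this sketch) */
    HardFault_Handler,        /* 3: hard fault                      */
};
A classic ARM7/9 table at address 0 instead holds one ARM instruction per slot, typically an LDR PC, [PC, #offset] or a branch, which is exactly why the core has to be in ARM state to fetch and execute it.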

It depends on which CPU you have, as there are two Thumb instruction sets. The original Thumb instruction set (used in ARMv4T, ARMv5TE) lacked the instructions needed to deal with interrupts; the newer Thumb-2 set (in the Cortex series) has extra instructions, so you can remain in Thumb-2 mode to service an interrupt routine.

Traditional ARM systems boot into ARM mode and jump to the reset exception vector after reset. This means that all exception vectors have to be written in ARM assembly. Since the exception vectors hold ARM instructions, the CPU is naturally forced to change to ARM mode before exception handling; if this did not happen, the vector would decode as garbage, resulting in an undefined instruction exception, which would cause another one, and so on in an infinite loop.
The earliest ARM systems only had ARM instructions; the Thumb instructions were added later, which might be another explanation.

Related

how to single-step code on-target with no jtag, breakpoints, simulator, emulator

Let's say you have a pointer to a function whose source you do not have and which is "untrusted" because it might read/write a disallowed memory region.
Before it executes each assembly instruction, you want to verify that it doesn't access disallowed memory regions.
The OS is (almost) bare-metal, i.e. a custom RTOS (so no Linux or QNX).
This is functionality that needs to be enabled not only during development but also during normal runtime.
Ideally, it'd run something like this:
void (*fptr)(int);
fptr = &someFunction; // untrusted, don't have source
// enable interrupts for each assembly instruction
_EN_INT();
// call the function
fptr(0);
// every time the PC increments, some other code runs which verifies that
// if any load/stores are executed, they don't access a disallowed memory range
// disable interrupts for each assembly instruction
_DIS_INT();
QUESTION
Is it possible to call that function and pause execution after every assembly instruction?
The OS is (almost) bare-metal, i.e. a custom RTOS (so no Linux or QNX).
My answer assumes that you can modify the "OS" the way you need it...
Cortex MK20DX256VLH7
This seems to be a Cortex M4 CPU.
how to single-step code on-target with no jtag, breakpoints
From the doc, it doesn't say whether you NEED an external debugger to resume execution.
If the CPU is really stopped, you'll definitely need an external signal (e.g. from a debugger).
However, most CPUs support software debugging. This means that an interrupt service routine is executed whenever a breakpoint is hit. To continue execution you simply return from the interrupt service routine.
I don't know about the Cortex M4, but for the Cortex M3 you'll have to set some special registers to enable that feature. Whenever a "BKPT" instruction is hit, interrupt #12 (*) is executed.
For code in RAM you simply write a BKPT instruction (0xBExx, e.g. 0xBEBE) to the address where you want to set your breakpoint. (Before writing, you read out the old value so you can restore it later on.)
For code in Flash the M3 has a "Flash Patch and Breakpoint" (FPB) unit which allows you to specify a handful of addresses which shall be read out as a BKPT opcode (0xBExx) even if other data is stored there; a typical Cortex M3 FPB has six instruction comparators, so this allows you to set a few breakpoints in Flash.
Interesting for you: The register controlling the debug features in the M3 (named "DEMCR") also has a bit named "MON_STEP":
If you set this bit in interrupt handler #12 then exactly one instruction is executed after returning from the interrupt handler and interrupt #12 is triggered again. The use case for this feature is - of course - single-stepping code!
To stop single-stepping you'll have to clear the MON_STEP bit again...
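A rough sketch of what this could look like in C, assuming an ARMv7-M part (Cortex M3/M4), code running from RAM, and a usable DebugMonitor exception. DEMCR and its MON_EN/MON_STEP bits match the ARM documentation, but the function names are made up and a real handler needs much more logic (e.g. restoring the original opcode and adjusting the stacked PC):
#include <stdint.h>

/* ARMv7-M Debug Exception and Monitor Control Register (DEMCR). */
#define DEMCR           (*(volatile uint32_t *)0xE000EDFCu)
#define DEMCR_MON_EN    (1u << 16)  /* enable the DebugMonitor exception  */
#define DEMCR_MON_STEP  (1u << 18)  /* step one instruction per return    */

#define BKPT_OPCODE     0xBEBEu     /* BKPT #0xBE; any 0xBExx encoding works */

static volatile uint16_t *bp_addr;  /* where the breakpoint was planted   */
static uint16_t saved_opcode;       /* original halfword at that address  */

void set_ram_breakpoint(volatile uint16_t *addr)  /* name is made up */
{
    bp_addr = addr;
    saved_opcode = *addr;         /* read out the value to restore later */
    *addr = BKPT_OPCODE;          /* overwrite the instruction with BKPT */
    __asm__ volatile("dsb; isb"); /* make sure the core sees the change  */
    DEMCR |= DEMCR_MON_EN;        /* BKPT now enters the monitor handler */
}

void clear_ram_breakpoint(void)
{
    *bp_addr = saved_opcode;      /* put the original instruction back   */
    __asm__ volatile("dsb; isb");
}

/* DebugMonitor handler, vector offset 4*12 (conventionally named
   DebugMon_Handler in CMSIS startup files). */
void DebugMon_Handler(void)
{
    /* Inspect the stacked registers / PC here, then either:             */
    DEMCR |= DEMCR_MON_STEP;      /* single-step the next instruction
                                     (this handler then runs again), or  */
    /* DEMCR &= ~DEMCR_MON_STEP;     to resume at full speed.            */
}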
Important 1:
I don't know if the MK20DX256VLH7 really has all these features. However, because it is a Cortex M4 chip, and the M4 should have nearly all features of the M3, these features should be present...
Important 2:
Implementing single-stepping and debugging is not done quickly. Assembly-language knowledge will be very helpful, and you'll need a lot of time...
From the doc, ...
You will not only need the documentation for the MK20DX256VLH7 from NXP but you'll also need the Cortex M4 documentation from ARM.
(*) Offset 4*12 in the vector table is meant here (which is named "IRQ(-4)" in some ARM documents); not IRQ12.
Yes, the ARM emulator/interpreter sounds exactly like what I want. Is there a free one?
QEMU is open source; most of it is GPLv2: https://wiki.qemu.org/License. You'd probably need to modify it a lot, because it's designed for use as a stand-alone wrapper for a whole Unix process (qemu-user) or a whole machine (qemu-system).
I googled, and there's also http://www.unicorn-engine.org/ which is designed to be used as part of a larger program (written in C with bindings for calling from various languages). It's also GPLv2 (not LGPL), so you can use it if the rest of your code is also Free software.
It's actually based on the CPU-emulation code from QEMU; they stripped out all the device / BIOS emulation stuff to make a flexible library for just emulating CPUs.
Presumably you could configure some memory protections for it and set up the starting machine state, and let it run your function (with a return address that leads to some code that hands control back to your main code?)
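As a taste of what that looks like with Unicorn's C API, here is a minimal sketch that runs an untrusted blob with a hook on every load/store. The load addresses and the two-instruction Thumb "blob" are invented for the demo, and most error checking is omitted:
#include <stdio.h>
#include <unicorn/unicorn.h>

#define CODE_BASE  0x10000u
#define STACK_BASE 0x20000u

/* Called on every emulated load/store; a real checker would compare
   addr against the allowed ranges and call uc_emu_stop() on a violation. */
static void on_mem_access(uc_engine *uc, uc_mem_type type, uint64_t addr,
                          int size, int64_t value, void *user)
{
    printf("%s of %d bytes at 0x%llx\n",
           type == UC_MEM_WRITE ? "write" : "read",
           size, (unsigned long long)addr);
}

int main(void)
{
    /* Untrusted blob, made up for the demo: movs r0, #1; str r0, [sp] */
    const uint8_t code[] = { 0x01, 0x20, 0x00, 0x90 };
    uint32_t sp = STACK_BASE + 0x1000 - 16;
    uc_engine *uc;
    uc_hook hook;

    if (uc_open(UC_ARCH_ARM, UC_MODE_THUMB, &uc) != UC_ERR_OK)
        return 1;
    uc_mem_map(uc, CODE_BASE, 0x1000, UC_PROT_ALL);
    uc_mem_map(uc, STACK_BASE, 0x1000, UC_PROT_READ | UC_PROT_WRITE);
    uc_mem_write(uc, CODE_BASE, code, sizeof(code));
    uc_reg_write(uc, UC_ARM_REG_SP, &sp);

    /* Watch every load/store the untrusted code performs. */
    uc_hook_add(uc, &hook, UC_HOOK_MEM_READ | UC_HOOK_MEM_WRITE,
                on_mem_access, NULL, 1, 0);

    /* Bit 0 of the start address selects Thumb state. */
    uc_emu_start(uc, CODE_BASE | 1, CODE_BASE + sizeof(code), 0, 0);
    uc_close(uc);
    return 0;
}
You obviously wouldn't run Unicorn itself on a small RTOS target, but the hook-per-access structure is roughly what any interpreter-based sandbox ends up looking like.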

How to set the default ARM processor state for kernel code?

Since the ARM processor supports both the ARM and Thumb-2 states (instruction sets), how do you set which state kernel-mode code runs in, such as the kernel code for interrupts and system calls?
In the v7 architecture*, bit 30 of the SCTLR determines whether exceptions are taken in ARM or Thumb state, which defaults to ARM on reset unless an external signal says otherwise. For v6 and earlier, exceptions are always taken in ARM state.
In the context of actually writing an OS, if you wanted to write exception handlers in Thumb in a backwards-compatible way I imagine you could also simply use an ARM interworking branch from the exception vector to the handler code itself - I'm no expert, though, so I can't say for sure there's no nasty pitfalls to this.
*and the in-between oddity that is the 1156
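For a concrete (hedged) illustration, setting that bit from privileged ARM-state code on a v7-A core could look like the following, assuming a GCC toolchain; SCTLR is accessed through CP15, and the ISB makes the change take effect before the next exception:
#include <stdint.h>

/* Set SCTLR.TE (bit 30) so exceptions are taken in Thumb state.
   Privileged mode only; sketch for ARMv7-A with GCC inline assembly. */
static inline void take_exceptions_in_thumb(void)
{
    uint32_t sctlr;
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r"(sctlr)); /* read SCTLR  */
    sctlr |= (1u << 30);                                         /* TE bit      */
    __asm__ volatile("mcr p15, 0, %0, c1, c0, 0" :: "r"(sctlr)); /* write back  */
    __asm__ volatile("isb");                                     /* synchronize */
}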

ARM Program Status Register

I am currently doing work with the Cortex-M3 and am having trouble understanding the use of the xPSR. I am not sure why the ISR number is stored in the lowest 8 bits.
Online sources and datasheets give no clue as to why the number must be stored in the xPSR. On the ARM7TDMI the current mode (FIQ, IRQ, SVC, USR, ABT, UND) is stored in the low bits of the CPSR. I assume it is stored there so that, when an exception is raised and the mode is switched, the processor knows which bank of registers to save the state of the CPSR to. However, the Cortex-M3 doesn't have banked registers, and the xPSR is saved onto the stack when an ISR needs to be serviced.
Can anyone enlighten me?
Is this what you are talking about?
The IPSR
The processor writes to the IPSR on exception entry and exit. Software can use an MRS instruction to read the IPSR, but the processor ignores writes to the IPSR by an MSR instruction. The IPSR Exception Number field is defined as follows:
• in Thread mode, the value is 0
• in Handler mode, holds the exception number of the currently-executing exception.
An exception number indicates the currently executing exception and its entry vector; see Exception number definition on page B1-633 and The vector table on page B1-634.
On reset, the processor is in Thread mode and the Exception Number field of the IPSR is cleared to 0. As a result, the value 1, the exception number for reset, is a transitory value that software cannot see as a valid IPSR Exception Number.
I would see that as similar to the CPSR in the ARMv4/ARM7TDMI, as it gives you the state in which you are executing: whether you are executing in an exception and, if so, which one. It likely has meaning to the chip designers for similar reasons, and that is where that information, or a copy of it, is held. Perhaps to avoid re-entering an exception handler if already in that exception mode, for example. Or, on a second exception of the same type (say a prefetch abort while executing the prefetch abort handler), perhaps the processor hangs on purpose or chooses a different exception.
Well, I'm not entirely sure I understand the question - are you asking "why is it different" or "what is the point"?
In the first case - the answer is the same as for "why does it not support the ARM instruction set", "why are there only 2 execution modes", "why is the vector table huge and contains addresses rather than instructions" and "why are there only 2 possible stack pointers": "because". Because the M-profile has a completely different exception model.
(And because the exception model is different, what was the mode bits in the CPSR was free to use for something else.)
In the second case ... well, that's up to the developer, isn't it?
There happens to be a register that holds the currently active interrupt ID. If this is of benefit to you, do use it.
You could (for example) use it to have a single interrupt handler address stored into multiple locations in the vector table and then use the interrupt ID to identify which specific device had triggered the interrupt.
It also sounds handy from the point of view of reentrant exception handling.
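As a sketch of that vector-table trick on a Cortex-M3: point several vector-table slots at one handler, then read the exception number back out of the IPSR with MRS, exactly the field the quoted documentation describes. The peripheral names and IRQ numbers below are invented:
#include <stdint.h>

/* One handler installed in several vector-table slots; it asks the IPSR
   which exception is actually being serviced. */
void shared_irq_handler(void)
{
    uint32_t ipsr;
    __asm__ volatile("mrs %0, ipsr" : "=r"(ipsr));

    /* IPSR[8:0] is the exception number; external IRQ0 is exception 16. */
    int irq = (int)(ipsr & 0x1FFu) - 16;

    switch (irq) {
    case 3:          /* hypothetical UART IRQ  */
        /* uart_isr(); */
        break;
    case 7:          /* hypothetical timer IRQ */
        /* timer_isr(); */
        break;
    default:
        break;
    }
}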
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0552a/CHDBIBGJ.html

What does 'bank'ing a register mean?

Reading 'ARM Architecture' on Wikipedia and found the following statement:
Registers R0-R7 are the same across all CPU modes; they are never banked.
R13 and R14 are banked across all privileged CPU modes except system mode.
What does banking a register mean?
Register banking refers to providing multiple copies of a register at the same address.
Taken from section 1.4.6 of the ARM docs:
The term is referring to a solution for the problem that not all registers can be seen at once.
There is a different register bank for each processor mode. The banked registers give rapid context switching for dealing with processor exceptions and privileged operations.
If you're looking for more theoretical reasoning, I recommend this paper.
Edit: A much deeper answer than mine is given here
When the processor enters an exception, the banked registers are switched automatically with another set of these registers.
In effect, the exception handler routine doesn't have to save these registers on the stack to prevent them from being clobbered later on (by the functions the exception handler calls). The processor just keeps a safe copy of that set, and will restore the original set on exception return.
This clip explains it well
https://youtu.be/7LqPJGnBPMM?t=1419
Banked registers are registers that are not needed by, and not accessible from, the current execution mode. When the execution mode changes, the registers needed for the new mode become usable.

ARM modes and why are there so many?

I'm currently reading/learning about ARM architecture ...
and I was wondering why there are so many modes
(FIQ, User, System, Supervisor, IRQ, ...).
My question is why do we need so many modes? Wouldn't just User and System be enough?
Thanks in advance.
It's just an architectural decision. The big advantage of the multiple modes is that they have some banked registers. Those extra registers allow you to write much less complicated exception routines.
If you were to pick only two, just USR and SYS are probably as good a choice as any, but what would happen when you took an exception? The normal ARM model is to go to an exception mode, set the banked link register for that exception mode to point to the instruction you want to return to after you resolve the exception, save the processor state in the exception mode's SPSR register, and then jump to the exception vector. USR and SYS share all their registers - using this model, you'd blow away your function return address (in LR) every time you took an interrupt!
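To spell that sequence out, here is a toy model in C of the exception entry described above. It is purely illustrative: the struct is made up, and on real hardware this all happens as one atomic step.
#include <stdint.h>

struct cpu_state {
    uint32_t pc, cpsr;
    uint32_t lr_irq, spsr_irq;    /* IRQ-mode banked registers */
};

#define MODE_MASK 0x1Fu
#define MODE_IRQ  0x12u
#define I_BIT     (1u << 7)

static void take_irq(struct cpu_state *cpu)
{
    cpu->lr_irq   = cpu->pc + 4;  /* banked LR gets the return address
                                     (pipeline offset; return is via
                                     SUBS PC, LR, #4)                   */
    cpu->spsr_irq = cpu->cpsr;    /* banked SPSR saves the old state    */
    cpu->cpsr     = (cpu->cpsr & ~MODE_MASK) | MODE_IRQ | I_BIT;
                                  /* enter IRQ mode, mask further IRQs  */
    cpu->pc       = 0x18;         /* jump to the IRQ exception vector   */
}
Note how only the banked lr_irq and spsr_irq are touched: the interrupted mode's own LR survives, which is exactly what a USR/SYS-only design would lose.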
The FIQ mode in particular has even more banked registers than the other exception modes. Those extra registers are in keeping with the "F" part of FIQ - it stands for "Fast". Not having to save and restore more processor context in software will speed up your interrupt handler.
Not too much to add to Carl's answer. Not sure what family / architecture of ARM processors you're talking about, so I'll just assume based on your question (FIQ, IRQ, etc.) that you're talking about ARM7/9/11. I won't enumerate every difference between every mode in every ARM architecture variant.
In addition to what Carl said, a few other advantages of having different modes for different circumstances:
for example, with the FIQ you don't have to branch off right away; because the FIQ vector is the last entry in the vector table, the handler can start right at the vector and just keep on executing. With other exceptions you have to branch right away
with different modes, you have natural support for separate stacks. If you're multitasking (e.g., an RTOS) and you don't have a separate stack for interrupt mode, you have to build in extra space on each task stack for the worst-case interrupt situation
with different modes, certain registers (e.g. CPSR, MMU regs, etc. - depends on the architecture) are off-limits, and the same goes for certain instructions. You don't want to let user code modify privileged registers, now do you?

Resources