The MPU in ARM Cortex-M (M0+/M3/M4/M7/etc.) is often advertised as allowing to set up protection against dereferencing the NULL pointer. But how to do this in practice? (Some online discussions, like in the Zephyr Project, indicate that the issue is not quite trivial.)
I'm looking for the simplest possible MPU code running in "Privileged mode" on bare-metal ARM Cortex-M. Please note that "protection against dereferencing the NULL pointer" means to me protection both against reads and writes. Also, it is not just about the address 0x0, but small offsets from it as well. For example, accessing a struct member via a NULL pointer should also cause MPU exception:
struct foo {
. . .
uint8_t x;
};
. . .
uint8_t x = (*(struct foo volatile *)NULL)->x; // should fail!
After some experimentation, I've come up with the MPU setting that seems to work for most ARM Cortex-M MCUs. Here is the code (using the CMSIS):
/* Configure the MPU to prevent NULL-pointer dereferencing ... */
MPU->RBAR = 0x0U /* base address (NULL) */
| MPU_RBAR_VALID_Msk /* valid region */
| (MPU_RBAR_REGION_Msk & 7U); /* region #7 */
MPU->RASR = (7U << MPU_RASR_SIZE_Pos) /* 2^(7+1) region, see NOTE0 */
| (0x0U << MPU_RASR_AP_Pos) /* no-access region */
| MPU_RASR_ENABLE_Msk; /* region enable */
MPU->CTRL = MPU_CTRL_PRIVDEFENA_Msk /* enable background region */
| MPU_CTRL_ENABLE_Msk; /* enable the MPU */
__ISB();
__DSB();
This code sets up a no-access MPU region #7 around the address 0x0 (any other MPU region will do as well). This works even for the MCUs, where the Vector Table also resides at address 0x0. Apparently, the MPU does not check access to the region by instructions other than LDR/STR, such as reading the vector address during Cortex-M exception entry.
However, in case the Vector Table resides at 0, the size of the no-access region must not contain any data that the CPU would legitimately read with the LDR instruction. This means that the size of the no-access region should be about the size of the Vector Table. In the code above, the size is set to 2^(7+1)==256 bytes, which should be fine even for relatively small vector tables.
The code above works also for MCUs that automatically relocate the Vector Table, such as STM32. For these MCUs, the size of the no-access region can be increased all the way to the relocated Vector Table, like 0x0800'0000 in the case of STM32. (You could set the size to 2^(26+1)==0x0800'0000).
Protection against NULL-pointer dereferencing is an important tool for improving the system's robustness and even for preventing malicious attacks. I hope that this answer will help fellow embedded developers.
After a major refactoring of an embedded system (IAR C on TI CC2530), I've ended up in the following situation:
After basic initialization of peripherals and global interrupt enable, the execution incorrectly ends up in an interrupt handler that communicates with external hardware. Since this hardware is not ready (remember, we end up in the ISR incorrectly), the program freezes triggering a watchdog reset.
If I insert 1, 2, 3, 5, 6, 7 etc NOPs in main(), everything works fine.
But If I insert 0, 4, 8 etc NOPs, I get the faulty behaviour.
CC2530 fetches 4 bytes of instructions from flash memory, on 4-byte boundaries.
This tells me that something is misaligned when it comes to code memory, but I simply doesn't know where to start. Nothing has changed when it comes the target settings AFAIK.
Anyone here who has seen this situation before, or can point me in the right direction?
#include <common.h>
#include <timer.h>
#include <radio.h>
#include <encryption.h>
#include "signals.h"
#include "lock.h"
#include "nfc.h"
#include "uart1_trace.h"
#include "trace.h"
//------------------------------------------------------------------------------
// Public functions
//------------------------------------------------------------------------------
void main(void)
{
setTp;
// Initialize microcontroller and peripherals
ClockSourceInit();
WatchdogEnable();
PortsInit();
TraceInit();
Timer4Init();
SleepInit();
RadioInit();
Uart1Init();
LoadAesKey("\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0");
clrTp;
NfcInit();
__enable_interrupt();
asm("nop");
// Initialize threads
LockInit();
while (true)
{
WDR();
LockRun();
}
}
void NfcInit(void)
{
// Enable wake up interrupt on external RF field present
// The process for enabling interrupts is described in section 2.5.1 in the CC2530 datasheet.
// Configure interrupt source: interrupt on falling edge, Port 0, pin 7:0
PICTL |= BIT(0);
// 1. Clear port 0 individual interrupt flag. Read-modify-write is not allowed.
// Writing 1 to a bit in this register has no effect, so 1 should be written to flags that are not to be cleared.
P0IFG = ~BIT(3);
// Clear port 0 interrupt flag. This register is bit-accessible.
P0IF = 0;
// 2. Set pin 3 interrupt-enable
P0IEN |= BIT(3);
// 3. Set port 0 interrupt-enable
IEN1 |= BIT(5);
// 4. Global interrupt enable is set in main()
}
// Interrupt handler: falling edge on signal Wake.
// This interrupt will only occur when device is powered off and NFC field present.
// When device is powered on, VCORE is always asserted.
#pragma vector = 0x6B
__interrupt static void NFC_WAKE_ISR(void)
{
static uint16 cnt = 0;
TracePutUint16(cnt); TracePuts("\r\n");
if (++cnt > 10)
IEN1 &= ~BIT(5);
P0IFG = ~BIT(3); // Clear port 1 individual interrupt flag. Read-modify-write is not allowed.
P0IF = 0; // Clear port 1 CPU interrupt flag. This register is bit-accessible.
return;
Screenshot of software init.
CH1 = External interrupt signal, active low (signal Wake).
CH2 = TP in main.c (setTp / clrTp).
The reset button on CC-debugger seems not to be debounced, so the TP signal turns on and off a few times before stabilizing (should not be an issue). VCC is stable long before the reset. When TP goes low for the last time, all peripherals are initialized.
An external NFC IC is used to wake up the MCU from sleep mode when a NFC field is present. The NFC IC is powered by one of the CC2530 I/O-pins. Normally the IC is powered off to preserve power. In this state, the energy from the NFC field is enough to generate the wake signal (active low). When this signal is detected by the MCU, it wakes up, applies power to the NFC IC, and NFC communication starts.
The NFC IC generates the signal either when powered, or when a NFC field is present.
After reset, all I/O-pins are configured as inputs with pullups. This pulled up input is enough to power the NFC IC, which is why the wake-signal is generated. Immediatly after reset, the I/O is configured (in function PortsInit()), and power to NFC IC is turned off. This makes the wake signal go low. The slow rise- and fall times are probably due to a capacitor, that I will now remove.
Here is where things get weird. Despite the wake signal being low, the external interrupt is configured for falling edge and pending int flag is cleared right before global in enabled, I end up in the ISR a few ms later (not seen in the screen shot). But only with the right number of NOPs, as described above.
If I add a > 15 ms delay before global int enable, all is fine. This coincides with the time measured from TP low to wake high.
One might think that the int is incorrectly configured for active low, but in that case I should get multiple ints, and I don't. Also, that does not explain the magic NOPs...
Compiler generated ISR assembly code:
// 77 // Interrupt handler: falling edge on signal Wake.
// 78 // This interrupt will only occur when device is powered off and NFC field present.
// 79 // When device is powered on, VCORE is always asserted.
// 80 #pragma vector = 0x6B
RSEG NEAR_CODE:CODE:NOROOT(0)
// 81 __interrupt static void NFC_WAKE_ISR(void)
NFC_WAKE_ISR:
// 82 {
PUSH A
MOV A,#-0xe
LCALL ?INTERRUPT_ENTER_XSP
; Saved register size: 15
; Auto size: 0
// 83 static uint16 cnt = 0;
// 84
// 85 TracePutUint16(cnt); TracePuts("\r\n");
; Setup parameters for call to function PutUint16
MOV R4,#(TPutc & 0xff)
MOV R5,#((TPutc >> 8) & 0xff)
MOV DPTR,#??cnt
MOVX A,#DPTR
MOV R2,A
INC DPTR
MOVX A,#DPTR
MOV R3,A
LCALL PutUint16
; Setup parameters for call to function TPuts
MOV R2,#(`?<Constant "\\r\\n">` & 0xff)
MOV R3,#((`?<Constant "\\r\\n">` >> 8) & 0xff)
LCALL TPuts
// 86
// 87 if (++cnt > 10)
MOV DPTR,#??cnt
MOVX A,#DPTR
ADD A,#0x1
MOV R0,A
INC DPTR
MOVX A,#DPTR
ADDC A,#0x0
MOV R1,A
MOV DPTR,#??cnt
MOV A,R0
MOVX #DPTR,A
INC DPTR
MOV A,R1
MOVX #DPTR,A
CLR C
MOV A,R0
SUBB A,#0xb
MOV A,R1
SUBB A,#0x0
JC ??NFC_WAKE_ISR_0
// 88 IEN1 &= ~BIT(5);
CLR 0xb8.5
// 89
// 90
// 91 P0IFG = ~BIT(3); // Clear port 1 individual interrupt flag. Read-modify-write is not allowed.
??NFC_WAKE_ISR_0:
MOV 0x89,#-0x9
// 92 P0IF = 0; // Clear port 1 CPU interrupt flag. This register is bit-accessible.
CLR 0xc0.5
// 93
// 94 return;
MOV R7,#0x1
LJMP ?INTERRUPT_LEAVE_XSP
REQUIRE _A_P0
REQUIRE P0IFG
REQUIRE _A_P1
REQUIRE _A_IEN1
REQUIRE _A_IRCON
////////////////////////////////////////////////////////////////////////////////
// lnk51ew_CC2530F64.xcl: linker command file for IAR Embedded Workbench IDE
// Generated: Mon May 24 00:00:01 +0200 2010
//
////////////////////////////////////////////////////////////////////////////////
//
// Segment limits
// ==============
//
// IDATA
// -----
-D_IDATA0_START=0x00
-D_IDATA0_END=0xFF
//
// PDATA
// -----
// We select 256 bytes of (I)XDATA memory that can be used as PDATA (see also "PDATA page setup" below)
-D_PDATA0_START=0x1E00
-D_PDATA0_END=0x1EFF
//
//
// IXDATA
// ------
-D_IXDATA0_START=0x0001 // Skip address 0x0000 (to avoid ambiguities with NULL pointer)
-D_IXDATA0_END=0x1EFF // CC2530F64 has 8 kB RAM (NOTE: 256 bytes are used for IDATA)
//
//
// XDATA
// -----
-D_XDATA0_START=_IXDATA0_START
-D_XDATA0_END=_IXDATA0_END
//
// NEAR CODE
// ---------
-D_CODE0_START=0x0000
-D_CODE0_END=0xFFFF // CC2530F64 has 64 kB code (flash)
//
// Special SFRs
// ============
//
// Register bank setup
// -------------------
-D?REGISTER_BANK=0x0 // Sets default register bank (0,1,2,3)
-D_REGISTER_BANK_START=0x0 // Start address for default register bank (0x0, 0x8, 0x10, 0x18)
//
// PDATA page setup
// ----------------
-D?PBANK_NUMBER=0x1E // High byte of 16-bit address to the PDATA area
//
// Virtual register setup
// ----------------------
-D_BREG_START=0x00
-D?VB=0x20
-D?ESP=0x9B //Extended stack pointer register location
////////////////////////////////////////////////////////////////////////////////
//
// IDATA memory
// ============
-Z(BIT)BREG=_BREG_START
-Z(BIT)BIT_N=0-7F
-Z(DATA)REGISTERS+8=_REGISTER_BANK_START
-Z(DATA)BDATA_Z,BDATA_N,BDATA_I=20-2F
-Z(DATA)VREG+_NR_OF_VIRTUAL_REGISTERS=08-7F
-Z(DATA)PSP,XSP=08-7F
-Z(DATA)DOVERLAY=08-7F
-Z(DATA)DATA_I,DATA_Z,DATA_N=08-7F
-U(IDATA)0-7F=(DATA)0-7F
-Z(IDATA)IDATA_I,IDATA_Z,IDATA_N=08-_IDATA0_END
-Z(IDATA)ISTACK+_IDATA_STACK_SIZE#08-_IDATA0_END
-Z(IDATA)IOVERLAY=08-FF
//
// ROM memory
// ==========
//
// Top of memory
// -------------
-Z(CODE)INTVEC=0
-Z(CODE)CSTART=_CODE0_START-_CODE0_END
//
// Initializers
// ------------
-Z(CODE)BIT_ID,BDATA_ID,DATA_ID,IDATA_ID,IXDATA_ID,PDATA_ID,XDATA_ID=_CODE0_START-_CODE0_END
//
// Program memory
// --------------
-Z(CODE)RCODE,DIFUNCT,CODE_C,CODE_N,NEAR_CODE=_CODE0_START-_CODE0_END
//
// Checksum
// --------
-Z(CODE)CHECKSUM#_CODE0_END
//
// XDATA memory
// ============
//
// Stacks located in XDATA
// -----------------------
-Z(XDATA)EXT_STACK+_EXTENDED_STACK_SIZE=_EXTENDED_STACK_START
-Z(XDATA)PSTACK+_PDATA_STACK_SIZE=_PDATA0_START-_PDATA0_END
-Z(XDATA)XSTACK+_XDATA_STACK_SIZE=_XDATA0_START-_XDATA0_END
//
// PDATA - data memory
// -------------------
-Z(XDATA)PDATA_Z,PDATA_I=_PDATA0_START-_PDATA0_END
-P(XDATA)PDATA_N=_PDATA0_START-_PDATA0_END
//
// XDATA - data memory
// -------------------
-Z(XDATA)IXDATA_Z,IXDATA_I=_IXDATA0_START-_IXDATA0_END
-P(XDATA)IXDATA_N=_IXDATA0_START-_IXDATA0_END
-Z(XDATA)XDATA_Z,XDATA_I=_XDATA0_START-_XDATA0_END
-P(XDATA)XDATA_N=_XDATA0_START-_XDATA0_END
-Z(XDATA)XDATA_HEAP+_XDATA_HEAP_SIZE=_XDATA0_START-_XDATA0_END
-Z(CONST)XDATA_ROM_C=_XDATA0_START-_XDATA0_END
//
// Core
// ====
-cx51
////////////////////////////////////////////////////////////////////////////////
//
// Texas Instruments device specific
// =================================
//
// Flash lock bits
// ---------------
//
// The CC2530 has its flash lock bits, one bit for each 2048 B flash page, located in
// the last available flash page, starting 16 bytes from the page end. The number of
// bytes with flash lock bits depends on the flash size configuration of the CC2530
// (maximum 16 bytes, i.e. 128 page lock bits, for the CC2530 with 256 kB flash).
// Note that the bit that controls the debug interface lock is always in the last byte,
// regardless of flash size.
//
-D_FLASH_LOCK_BITS_START=(_CODE0_END-0xF)
-D_FLASH_LOCK_BITS_END=_CODE0_END
//
// Define as segment in case one wants to put something there intentionally (then comment out the trick below)
-Z(CODE)FLASH_LOCK_BITS=_FLASH_LOCK_BITS_START-_FLASH_LOCK_BITS_END
//
// Trick to reserve the FLASH_LOCK_BITS segment from being used as normal CODE, avoiding
// code to be placed on top of the flash lock bits. If code is placed on address 0x0000,
// (INTVEC is by default located at 0x0000) then the flash lock bits will be reserved too.
//
-U(CODE)0x0000=(CODE)_FLASH_LOCK_BITS_START-_FLASH_LOCK_BITS_END
//
////////////////////////////////////////////////////////////////////////////////
According to TI, that part has got an 8051 core. Apart from being dinosaur crap, 8051 is an 8-bitter so alignment does not apply.
When random modifications to the code result in completely unrelated errors or run-away code, it is most often caused by one of these things:
You got a stack overflow, or
You got undefined behavior bugs, such as uninitialized variables, array out of bounds access etc.
Also ensure that all ISRs are registred in the interrupt vector table.
EDIT after question change 6/4:
You should normally not return from interrupts! I don't know how your specific setup works, but with a general embedded systems compiler, the non-standard interrupt keyword means two things:
Ensure that the calling convention upon entering the ISR is correct, by stacking whatever registers the CPU/ABI state are not stacked by hardware, but software.
Ensure that the same registers are restored upon leaving the ISR and that the correct return instruction is used.
On 8051 this means that the disassembled ISR absolutely must end with a RETI instruction and not a RET instruction! Chances are high that return results in RET which will sabotage your stack. Disassemble to see if this is indeed the case.
The user's guide for the CC2530 states:
The instruction that sets the PCON.IDLE bit must be aligned in a
certain way for correct operation. The first byte of the assembly
instruction immediately following this instruction must not be placed
on a 4-byte boundary.
This is likely why the system fails on NOP multiples of four.
Just below the warning, there is an implementation for fixing this alignment specifically targeted at the IAR compiler.
I am starting to believe that this is a hardware issue, related to the connection between CC2530 and the NFC IC.
The power and reset to the NFC IC that sends the external interrupt request is controlled by a CC2530 I/O pin with 20 mA current drive capacity. At reset, before execution of the program starts, all the I/O pins defaults to inputs with internal weak pull-up. It seems like the current through the pull-up resistor is enough to power up the NFC IC. The interrupt signal from the NFC IC is high whenever the NFC is powered or a NFC field is present, and inverted by a FET transistor before reaching CC2530. Hence the ISR is triggered by a falling edge on the input.
So what happens at startup is that the NFC IC is incorrectly powered on (and later off, when the ports are initialized), and the WAKE signal falls and rises very slowly due to the poor drive capacity of a pull-up (to make things worse, a large capacitor of 1 uF is connected in parallel with the gate of the FET, and another 1uF filters the NFC IC power pin).
WAKE is supposed to trigger an interrupt only on falling edge, but staying in the transition region for up to 10 ms as seen in the oscilloscope screenshot above seems to cause CC2530 to fire the interrupt even when WAKE is rising. The ISR starts to communicate with the NFC IC via SPI, but at this time, the NFC IC seems to be messed up due to the spurious transitions on VCC and reset. It refuses to respond, the execution halts in the ISR and the watchdog bites. And the process starts over, forever.
When I insert a delay that ensures WAKE to be stable high before enabling the interrupt, all is well.
If I remove the 1 uF cap on the FET gate, WAKE rises very quickly, and there is no need for the delay anymore. And when I add a 4k7 pulldown to the NFC power, it is no longer powered up at reset.
Problem seems to be solved. The refactoring rearranged the code and changed the startup sequence, which led to a different delay that revealed the issue. With the proper hardware update, no delay will be needed.
But what still disturbes me is that I don't understand the magic NOPs. When CC2530 had the interrupt enabled and encountered a slowly rising WAKE, it wouldn't always end up incorrectly in the ISR. And when it did, I could always make it run by adding 1..3 NOPs. Naturally, whenever I added or removed a line of code, the number of NOPs required changed, which as you can imagine, drove me crazy.
It took me some time to narrow things down, and I am very grateful to all your comments and proposed solutions, especially Clifford that forced me to bring out the oscilloscope.
I'm trying to use SysTick_Handler in SW4STM32 for Linux, but whenever the SysTick interrupt is triggered, execution jumps to somewhere in system memory. By my understanding, it should jump into the void SysTick_Handler(void) that I declared, or failing that, into Default_Handler declared in startup_stm32.s where the interrupt vector table is defined. I set a break point in my SysTick_Handler, but it is never reached. In the code below, it gets through init_systick() and stays in the endless for loop if I don't include SysTick_CTRL_TICKINT_Msk, as expected, but when I do include it, the debugger tells me it ends up somewhere around address 0x1fffda7c.
main.c:
#include "stm32f0xx.h"
volatile uint32_t ticks = 0;
void SysTick_Handler(void) {
ticks++;
}
void init_systick(void) {
SysTick->LOAD = 43999;
SCB->SHP[1] |= 0x40000000L;
SysTick->VAL = 0;
SysTick->CTRL = SysTick_CTRL_CLKSOURCE_Msk | SysTick_CTRL_TICKINT_Msk | SysTick_CTRL_ENABLE_Msk;
}
int main(void)
{
init_systick();
for(;;);
}
I verified from the .map file that the linker is using the declared SysTick_Handler instead of Default_Handler.
I also tried the following variation to use the standard peripheral library for setup, along with other interrupt priority values, with the same results:
#include "stm32f0xx.h"
volatile uint32_t ticks = 0;
void SysTick_Handler(void) {
ticks++;
}
void init_systick(void) {
SysTick_Config(44000);
NVIC_EnableIRQ(SysTick_IRQn);
NVIC_SetPriority(SysTick_IRQn, 0);
}
int main(void)
{
init_systick();
for(;;);
}
This shouldn't be relevant, but since the target doesn't have a timing crystal, I also modified void SetSysClock(void) in system_stm32f0xx.c to use the HSI clock and PLL, which appears to be working correctly:
static void SetSysClock(void)
{
RCC->CFGR = (RCC->CFGR & ~RCC_CFGR_SW) | RCC_CFGR_SW_HSI;
while ((RCC->CFGR & RCC_CFGR_SWS) != RCC_CFGR_SWS_HSI) ;
FLASH->ACR = FLASH_ACR_PRFTBE | FLASH_ACR_LATENCY;
RCC->CR &= ~RCC_CR_PLLON;
while (RCC->CR & RCC_CR_PLLRDY) ;
RCC->CFGR = (RCC->CFGR & ~RCC_CFGR_PLLMUL & ~RCC_CFGR_PLLSRC) | RCC_CFGR_PLLMUL11; // PLL takes 8 MHz HSI / 2 as input
RCC->CR |= RCC_CR_PLLON;
while (!(RCC->CR & RCC_CR_PLLRDY)) ;
RCC->CFGR = (RCC->CFGR & ~RCC_CFGR_SW) | RCC_CFGR_SW_PLL;
while ((RCC->CFGR & RCC_CFGR_SWS) != RCC_CFGR_SWS_PLL) ;
}
-- EDIT: More info requested in the comments --
It's an M0 core, so it doesn't have vector table relocation. From the reference manual section 2.5 (page 44):
Unlike Cortex ® M3 and M4, the M0 CPU does not support the vector table relocation.
Address 0x00000000 should be mapped either to FLASH memory at 0x08000000, system memory at 0x1fffd800, or to SRAM at 0x20000000. The memory at address 0x00000000 matches system memory at 0x1fffd800, even though SYSCFG_CFGR1 MEM_MODE is set to 00, which should map FLASH memory there. Main FLASH memory at address 0x08000000 contains the correct vector table, but address 0x00000000 is populated with address 0x1fffd99d for the SysTick vector (and all other non-NULL vectors except the reset vector, which is 0x1fffdc41); the vectors shown by the debugger at address 0x00000000 are consistent with the observed behavior. All of this information was collected while paused at a breakpoint at address 0x08000298 (a correct position in FLASH memory where the correct code has been loaded), before executing the interrupt.
The VTOR is referenced in the arm-v6 reference manual. It is a good idea to check this.
https://static.docs.arm.com/ddi0419/d/DDI0419D_armv6m_arm.pdf
0xE000ED08 VTOR RW 0x00000000a Vector Table Offset Register, VTOR on page B3-231
That reference manual is for the arm-v6M architecture which is the architecture for cortex-m0 processors. This is the bible. Most of the generic cortex-m0 features of the stm line will not be mentioned in stm refrenece manuals, just in the arm arm.
I'll reiterate, check the VTOR.
And make sure you are building for the right line of STM32F030!
The STM32F030x4 and STM32F030x6 micros have a different memory map than the STM32F030x8.
This sounds like there might be a problem with the linker file.
Can you verify that your linker file has something that looks like the following? (if it is a default file it will probably be far more complex).
. = 0;
.text 0 :
{
*(.vector);
crt0*(.text*);
main*(.text*);
*(.text*);
} > flash
Basically what this is saying is that the 'text' (code) of the program starts at address 0x0, and the first thing to put there is the vector table, followed by startup code, main, and then other code.
You will also then want to check that you have some file specifying this vector table's contents that also agrees it should be at address 0x0. This example is from an ATSAMD21E18A.
.section .vector, "a", %progbits
.equ stack_base, 0x20004000
.word stack_base
.word reset_handler
.word nmi_handler
.word hardfault_handler
.word 0
// ...
.word 0
.word systick_handler
// ...
The important thing is that it is marked as the section vector which the linker file will try to put at 0x0. For a processor with VTOR (like the M0+), this table might be specified in C without any special markings, and its location won't matter, but since you don't have VTOR, you need to make sure the linker knows to put this section right at 0x0.
jonnconn is correct. VTOR is the issue. I just wanted to add to his post (but cannot because my reputation isn't high enough) that I ran into this issue recently with a new CubeMx project.
VTOR is set to 0 on reset. VTOR is initialized in low level init before main. SystemInit in system_stm32l4xx.c (or similar for your part) seems to only initialize VTOR if USER_VECT_TAB_ADDRESS is defined. But CubeMx had this commented out. I had to uncomment it (it is in the same file), and leave VECT_TAB_OFFSET as 0. Afterwards, VTOR config was correctly set to FLASH_BASE (assuming VECT_TAB_SRAM was left undefined).
According the datasheet, that region is the system memory (for built-in bootloader). You may try two things:
Double check BOOTx pins to make sure the MCU loads FLASH instead of the system memory.
Make sure you assigned SCB->VTOR to the correct address of your own interrupt vector table.
I am using ARM-GCC compiler and I found on Internet two versions for the startup_stm32f10x_cl.c file (startup code). The processor is: STM32F105RC (ARM Cortex M3).
Common part :
#define STACK_SIZE 0x100 /*!< The Stack size */
__attribute__ ((section(".co_stack")))
unsigned long pulStack[STACK_SIZE];
Then, the first version starts the vector table like this:
void (* const g_pfnVectors[])(void) =
{
(void *)&pulStack[STACK_SIZE], /*!< The initial stack pointer */
Reset_Handler, /*!< Reset Handler */
...
while the second version looks like this:
void (* const g_pfnVectors[])(void) =
{
(void *)&pulStack[STACK_SIZE - 1], /*!< The initial stack pointer */
Reset_Handler, /*!< Reset Handler */
...
So, my question is:
Which one is the correct stack pointer initialization?
From ARM documentation for M3 cores instruction-set:
PUSH uses the value in the SP register minus four as the highest
memory address
and
On completion, PUSH updates the SP register to point to the location
of the lowest stored value
So, my interpretation is that the starting point of SP must be +4 of the highest address, i.e. the address immediately after the Stack array bounds.
Hence
(void *)&pulStack[STACK_SIZE]
looks right, as that address (although not part of the array), will never be used.
Question: - how to locate application to non 0x0000.0000 address?
Processor: NXP LPC1768
Dev system: Keil ARM 4.73
Steps used:
1) scatter file below used to set load region and execution region to 0x0000.2000
2) copied vector table to 0x2000
3) udpated vtor register to 0x2000
Problem: Application does not run.
Scatter file used:
LR_IROM1 0x00002000 0x00000D000
{ ; load region size_region
ER_IROM1 0x00002000 0x0000D000
{ ; load address = execution address
*.o (RESET, +First)
*(InRoot$$Sections)
.ANY (+RO)
}
RW_IRAM1 0x10000000 0x00008000 { ; RW data
.ANY (+RW +ZI)
}
}
This follows instructions specified in NXP app note AN10744, something else I’m missing?
Vector Table Offset Register (VTOR) points to 0x00000000 at reset.
Thus, stack pointer must be at 0x00000000, and program start address (program counter) at 0x00000004.
If you change the location of the vector table in linker settings, you need to update VTOR to point to this new location. This can only happen at runtime.
This means that you need to have a small bootloader program which does the remapping, which means that first sector must be reserved for that purpose.
Bootloader needs to:
Make sure that interrupts are disabled, so you don't accidentally use VTOR.
Update VTOR register to address 0x2000.
Get stack pointer address from 0x2000 and update stack pointer register.
Get program start address from 0x2004 and update the program counter.
You might want to check out CMSIS library, it has functions like NVIC_SetVTOR and __set_MSP which make setting these registers a little easier.
To set the program counter, you can cast the address to function pointer and then call the function:
uint32_t * vtor = (uint32_t *)0x2000;
uint32_t startAddr = vtor[1];
( (void(*)(void))startAddr )(); // Cast and call