The MPU in ARM Cortex-M (M0+/M3/M4/M7/etc.) is often advertised as a way to set up protection against dereferencing the NULL pointer. But how do you do this in practice? (Some online discussions, like the one in the Zephyr Project, indicate that the issue is not quite trivial.)
I'm looking for the simplest possible MPU code running in "Privileged mode" on bare-metal ARM Cortex-M. Please note that "protection against dereferencing the NULL pointer" means to me protection against both reads and writes. Also, it is not just about the address 0x0, but small offsets from it as well. For example, accessing a struct member via a NULL pointer should also cause an MPU exception:
struct foo {
. . .
uint8_t x;
};
. . .
uint8_t x = ((struct foo volatile *)NULL)->x; // should fail!
After some experimentation, I've come up with an MPU setting that seems to work for most ARM Cortex-M MCUs. Here is the code (using the CMSIS):
/* Configure the MPU to prevent NULL-pointer dereferencing ... */
MPU->RBAR = 0x0U /* base address (NULL) */
| MPU_RBAR_VALID_Msk /* valid region */
| (MPU_RBAR_REGION_Msk & 7U); /* region #7 */
MPU->RASR = (7U << MPU_RASR_SIZE_Pos) /* 2^(7+1) region, see NOTE0 */
| (0x0U << MPU_RASR_AP_Pos) /* no-access region */
| MPU_RASR_ENABLE_Msk; /* region enable */
MPU->CTRL = MPU_CTRL_PRIVDEFENA_Msk /* enable background region */
| MPU_CTRL_ENABLE_Msk; /* enable the MPU */
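__DSB(); /* make sure the MPU register writes have completed */
__ISB(); /* refetch following instructions under the new MPU settings */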
This code sets up a no-access MPU region #7 around the address 0x0 (any other MPU region will do as well). This works even for MCUs where the Vector Table also resides at address 0x0. Apparently, the MPU only checks accesses made by load/store instructions such as LDR/STR; other accesses, such as the vector fetch during Cortex-M exception entry, are not checked.
However, if the Vector Table resides at 0, the no-access region must not contain any data that the CPU would legitimately read with the LDR instruction. This means that the no-access region should be about the size of the Vector Table. In the code above, the size is set to 2^(7+1)==256 bytes, which should be fine even for relatively small vector tables.
The code above also works for MCUs that automatically relocate the Vector Table, such as STM32. For these MCUs, the size of the no-access region can be increased all the way to the relocated Vector Table, like 0x0800'0000 in the case of STM32. (You could set the size to 2^(26+1)==0x0800'0000.)
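To avoid computing the SIZE field by hand, the relation region size = 2^(SIZE+1) can be inverted in a small helper. This is only a sketch: `mpu_rasr_size_field` is a hypothetical name (it is not part of CMSIS), and the 32-byte lower limit is an assumption that holds for the ARMv7-M MPU, so check the reference manual for your core.

```c
#include <stdint.h>

/* Hypothetical helper (not a CMSIS function).
 * Returns the RASR SIZE field for a power-of-two region size in bytes,
 * or -1 if the size is not a power of two or is smaller than 32 bytes
 * (assumed minimum region size, valid for the ARMv7-M MPU). */
static int mpu_rasr_size_field(uint32_t region_bytes)
{
    if (region_bytes < 32u || (region_bytes & (region_bytes - 1u)) != 0u) {
        return -1;
    }
    int log2 = 0;
    while ((region_bytes >> log2) != 1u) {
        ++log2;                 /* find n such that region_bytes == 2^n */
    }
    return log2 - 1;            /* region size == 2^(SIZE+1) */
}
```

With this helper, `mpu_rasr_size_field(256u)` gives 7 and `mpu_rasr_size_field(0x08000000u)` gives 26, the two sizes discussed above. The result would then be shifted by `MPU_RASR_SIZE_Pos` when building the RASR value.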
Protection against NULL-pointer dereferencing is an important tool for improving the system's robustness and even for preventing malicious attacks. I hope that this answer will help fellow embedded developers.
After a major refactoring of an embedded system (IAR C on TI CC2530), I've ended up in the following situation:
After basic initialization of peripherals and global interrupt enable, the execution incorrectly ends up in an interrupt handler that communicates with external hardware. Since this hardware is not ready (remember, we end up in the ISR incorrectly), the program freezes, triggering a watchdog reset.
If I insert 1, 2, 3, 5, 6, 7 etc. NOPs in main(), everything works fine.
But if I insert 0, 4, 8 etc. NOPs, I get the faulty behaviour.
CC2530 fetches 4 bytes of instructions from flash memory, on 4-byte boundaries.
This tells me that something is misaligned when it comes to code memory, but I simply don't know where to start. Nothing has changed when it comes to the target settings AFAIK.
Anyone here who has seen this situation before, or can point me in the right direction?
#include <common.h>
#include <timer.h>
#include <radio.h>
#include <encryption.h>
#include "signals.h"
#include "lock.h"
#include "nfc.h"
#include "uart1_trace.h"
#include "trace.h"
//------------------------------------------------------------------------------
// Public functions
//------------------------------------------------------------------------------
void main(void)
{
setTp;
// Initialize microcontroller and peripherals
ClockSourceInit();
WatchdogEnable();
PortsInit();
TraceInit();
Timer4Init();
SleepInit();
RadioInit();
Uart1Init();
LoadAesKey("\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0");
clrTp;
NfcInit();
__enable_interrupt();
asm("nop");
// Initialize threads
LockInit();
while (true)
{
WDR();
LockRun();
}
}
void NfcInit(void)
{
// Enable wake up interrupt on external RF field present
// The process for enabling interrupts is described in section 2.5.1 in the CC2530 datasheet.
// Configure interrupt source: interrupt on falling edge, Port 0, pin 7:0
PICTL |= BIT(0);
// 1. Clear port 0 individual interrupt flag. Read-modify-write is not allowed.
// Writing 1 to a bit in this register has no effect, so 1 should be written to flags that are not to be cleared.
P0IFG = ~BIT(3);
// Clear port 0 interrupt flag. This register is bit-accessible.
P0IF = 0;
// 2. Set pin 3 interrupt-enable
P0IEN |= BIT(3);
// 3. Set port 0 interrupt-enable
IEN1 |= BIT(5);
// 4. Global interrupt enable is set in main()
}
// Interrupt handler: falling edge on signal Wake.
// This interrupt will only occur when device is powered off and NFC field present.
// When device is powered on, VCORE is always asserted.
#pragma vector = 0x6B
__interrupt static void NFC_WAKE_ISR(void)
{
static uint16 cnt = 0;
TracePutUint16(cnt); TracePuts("\r\n");
if (++cnt > 10)
IEN1 &= ~BIT(5);
P0IFG = ~BIT(3); // Clear port 1 individual interrupt flag. Read-modify-write is not allowed.
P0IF = 0; // Clear port 1 CPU interrupt flag. This register is bit-accessible.
return;
}
Screenshot of software init.
CH1 = External interrupt signal, active low (signal Wake).
CH2 = TP in main.c (setTp / clrTp).
The reset button on CC-debugger seems not to be debounced, so the TP signal turns on and off a few times before stabilizing (should not be an issue). VCC is stable long before the reset. When TP goes low for the last time, all peripherals are initialized.
An external NFC IC is used to wake up the MCU from sleep mode when an NFC field is present. The NFC IC is powered by one of the CC2530 I/O pins. Normally the IC is powered off to preserve power. In this state, the energy from the NFC field is enough to generate the wake signal (active low). When this signal is detected by the MCU, it wakes up, applies power to the NFC IC, and NFC communication starts.
The NFC IC generates the signal either when powered, or when an NFC field is present.
After reset, all I/O pins are configured as inputs with pull-ups. This pulled-up input is enough to power the NFC IC, which is why the wake signal is generated. Immediately after reset, the I/O is configured (in function PortsInit()), and power to the NFC IC is turned off. This makes the wake signal go low. The slow rise and fall times are probably due to a capacitor, which I will now remove.
Here is where things get weird. Although the wake signal is low, the external interrupt is configured for falling edge, and the pending interrupt flag is cleared right before global interrupts are enabled, I end up in the ISR a few ms later (not seen in the screenshot). But only with the right number of NOPs, as described above.
If I add a > 15 ms delay before global int enable, all is fine. This coincides with the time measured from TP low to wake high.
One might think that the int is incorrectly configured for active low, but in that case I should get multiple ints, and I don't. Also, that does not explain the magic NOPs...
Compiler generated ISR assembly code:
// 77 // Interrupt handler: falling edge on signal Wake.
// 78 // This interrupt will only occur when device is powered off and NFC field present.
// 79 // When device is powered on, VCORE is always asserted.
// 80 #pragma vector = 0x6B
RSEG NEAR_CODE:CODE:NOROOT(0)
// 81 __interrupt static void NFC_WAKE_ISR(void)
NFC_WAKE_ISR:
// 82 {
PUSH A
MOV A,#-0xe
LCALL ?INTERRUPT_ENTER_XSP
; Saved register size: 15
; Auto size: 0
// 83 static uint16 cnt = 0;
// 84
// 85 TracePutUint16(cnt); TracePuts("\r\n");
; Setup parameters for call to function PutUint16
MOV R4,#(TPutc & 0xff)
MOV R5,#((TPutc >> 8) & 0xff)
MOV DPTR,#??cnt
MOVX A,@DPTR
MOV R2,A
INC DPTR
MOVX A,@DPTR
MOV R3,A
LCALL PutUint16
; Setup parameters for call to function TPuts
MOV R2,#(`?<Constant "\\r\\n">` & 0xff)
MOV R3,#((`?<Constant "\\r\\n">` >> 8) & 0xff)
LCALL TPuts
// 86
// 87 if (++cnt > 10)
MOV DPTR,#??cnt
MOVX A,@DPTR
ADD A,#0x1
MOV R0,A
INC DPTR
MOVX A,@DPTR
ADDC A,#0x0
MOV R1,A
MOV DPTR,#??cnt
MOV A,R0
MOVX @DPTR,A
INC DPTR
MOV A,R1
MOVX @DPTR,A
CLR C
MOV A,R0
SUBB A,#0xb
MOV A,R1
SUBB A,#0x0
JC ??NFC_WAKE_ISR_0
// 88 IEN1 &= ~BIT(5);
CLR 0xb8.5
// 89
// 90
// 91 P0IFG = ~BIT(3); // Clear port 1 individual interrupt flag. Read-modify-write is not allowed.
??NFC_WAKE_ISR_0:
MOV 0x89,#-0x9
// 92 P0IF = 0; // Clear port 1 CPU interrupt flag. This register is bit-accessible.
CLR 0xc0.5
// 93
// 94 return;
MOV R7,#0x1
LJMP ?INTERRUPT_LEAVE_XSP
REQUIRE _A_P0
REQUIRE P0IFG
REQUIRE _A_P1
REQUIRE _A_IEN1
REQUIRE _A_IRCON
////////////////////////////////////////////////////////////////////////////////
// lnk51ew_CC2530F64.xcl: linker command file for IAR Embedded Workbench IDE
// Generated: Mon May 24 00:00:01 +0200 2010
//
////////////////////////////////////////////////////////////////////////////////
//
// Segment limits
// ==============
//
// IDATA
// -----
-D_IDATA0_START=0x00
-D_IDATA0_END=0xFF
//
// PDATA
// -----
// We select 256 bytes of (I)XDATA memory that can be used as PDATA (see also "PDATA page setup" below)
-D_PDATA0_START=0x1E00
-D_PDATA0_END=0x1EFF
//
//
// IXDATA
// ------
-D_IXDATA0_START=0x0001 // Skip address 0x0000 (to avoid ambiguities with NULL pointer)
-D_IXDATA0_END=0x1EFF // CC2530F64 has 8 kB RAM (NOTE: 256 bytes are used for IDATA)
//
//
// XDATA
// -----
-D_XDATA0_START=_IXDATA0_START
-D_XDATA0_END=_IXDATA0_END
//
// NEAR CODE
// ---------
-D_CODE0_START=0x0000
-D_CODE0_END=0xFFFF // CC2530F64 has 64 kB code (flash)
//
// Special SFRs
// ============
//
// Register bank setup
// -------------------
-D?REGISTER_BANK=0x0 // Sets default register bank (0,1,2,3)
-D_REGISTER_BANK_START=0x0 // Start address for default register bank (0x0, 0x8, 0x10, 0x18)
//
// PDATA page setup
// ----------------
-D?PBANK_NUMBER=0x1E // High byte of 16-bit address to the PDATA area
//
// Virtual register setup
// ----------------------
-D_BREG_START=0x00
-D?VB=0x20
-D?ESP=0x9B //Extended stack pointer register location
////////////////////////////////////////////////////////////////////////////////
//
// IDATA memory
// ============
-Z(BIT)BREG=_BREG_START
-Z(BIT)BIT_N=0-7F
-Z(DATA)REGISTERS+8=_REGISTER_BANK_START
-Z(DATA)BDATA_Z,BDATA_N,BDATA_I=20-2F
-Z(DATA)VREG+_NR_OF_VIRTUAL_REGISTERS=08-7F
-Z(DATA)PSP,XSP=08-7F
-Z(DATA)DOVERLAY=08-7F
-Z(DATA)DATA_I,DATA_Z,DATA_N=08-7F
-U(IDATA)0-7F=(DATA)0-7F
-Z(IDATA)IDATA_I,IDATA_Z,IDATA_N=08-_IDATA0_END
-Z(IDATA)ISTACK+_IDATA_STACK_SIZE#08-_IDATA0_END
-Z(IDATA)IOVERLAY=08-FF
//
// ROM memory
// ==========
//
// Top of memory
// -------------
-Z(CODE)INTVEC=0
-Z(CODE)CSTART=_CODE0_START-_CODE0_END
//
// Initializers
// ------------
-Z(CODE)BIT_ID,BDATA_ID,DATA_ID,IDATA_ID,IXDATA_ID,PDATA_ID,XDATA_ID=_CODE0_START-_CODE0_END
//
// Program memory
// --------------
-Z(CODE)RCODE,DIFUNCT,CODE_C,CODE_N,NEAR_CODE=_CODE0_START-_CODE0_END
//
// Checksum
// --------
-Z(CODE)CHECKSUM#_CODE0_END
//
// XDATA memory
// ============
//
// Stacks located in XDATA
// -----------------------
-Z(XDATA)EXT_STACK+_EXTENDED_STACK_SIZE=_EXTENDED_STACK_START
-Z(XDATA)PSTACK+_PDATA_STACK_SIZE=_PDATA0_START-_PDATA0_END
-Z(XDATA)XSTACK+_XDATA_STACK_SIZE=_XDATA0_START-_XDATA0_END
//
// PDATA - data memory
// -------------------
-Z(XDATA)PDATA_Z,PDATA_I=_PDATA0_START-_PDATA0_END
-P(XDATA)PDATA_N=_PDATA0_START-_PDATA0_END
//
// XDATA - data memory
// -------------------
-Z(XDATA)IXDATA_Z,IXDATA_I=_IXDATA0_START-_IXDATA0_END
-P(XDATA)IXDATA_N=_IXDATA0_START-_IXDATA0_END
-Z(XDATA)XDATA_Z,XDATA_I=_XDATA0_START-_XDATA0_END
-P(XDATA)XDATA_N=_XDATA0_START-_XDATA0_END
-Z(XDATA)XDATA_HEAP+_XDATA_HEAP_SIZE=_XDATA0_START-_XDATA0_END
-Z(CONST)XDATA_ROM_C=_XDATA0_START-_XDATA0_END
//
// Core
// ====
-cx51
////////////////////////////////////////////////////////////////////////////////
//
// Texas Instruments device specific
// =================================
//
// Flash lock bits
// ---------------
//
// The CC2530 has its flash lock bits, one bit for each 2048 B flash page, located in
// the last available flash page, starting 16 bytes from the page end. The number of
// bytes with flash lock bits depends on the flash size configuration of the CC2530
// (maximum 16 bytes, i.e. 128 page lock bits, for the CC2530 with 256 kB flash).
// Note that the bit that controls the debug interface lock is always in the last byte,
// regardless of flash size.
//
-D_FLASH_LOCK_BITS_START=(_CODE0_END-0xF)
-D_FLASH_LOCK_BITS_END=_CODE0_END
//
// Define as segment in case one wants to put something there intentionally (then comment out the trick below)
-Z(CODE)FLASH_LOCK_BITS=_FLASH_LOCK_BITS_START-_FLASH_LOCK_BITS_END
//
// Trick to reserve the FLASH_LOCK_BITS segment from being used as normal CODE, avoiding
// code to be placed on top of the flash lock bits. If code is placed on address 0x0000,
// (INTVEC is by default located at 0x0000) then the flash lock bits will be reserved too.
//
-U(CODE)0x0000=(CODE)_FLASH_LOCK_BITS_START-_FLASH_LOCK_BITS_END
//
////////////////////////////////////////////////////////////////////////////////
According to TI, that part has got an 8051 core. Apart from being dinosaur crap, 8051 is an 8-bitter so alignment does not apply.
When random modifications to the code result in completely unrelated errors or run-away code, it is most often caused by one of these things:
You got a stack overflow, or
You got undefined behavior bugs, such as uninitialized variables, array out of bounds access etc.
Also ensure that all ISRs are registered in the interrupt vector table.
EDIT after question change 6/4:
You should normally not return from interrupts! I don't know how your specific setup works, but with a general embedded systems compiler, the non-standard interrupt keyword means two things:
Ensure that the calling convention upon entering the ISR is correct, by stacking in software whatever registers the CPU/ABI requires to be saved but the hardware does not stack.
Ensure that the same registers are restored upon leaving the ISR and that the correct return instruction is used.
On 8051 this means that the disassembled ISR absolutely must end with a RETI instruction and not a RET instruction! Chances are high that return results in RET which will sabotage your stack. Disassemble to see if this is indeed the case.
The user's guide for the CC2530 states:
The instruction that sets the PCON.IDLE bit must be aligned in a
certain way for correct operation. The first byte of the assembly
instruction immediately following this instruction must not be placed
on a 4-byte boundary.
This is likely why the system fails on NOP multiples of four.
Just below the warning, there is an implementation for fixing this alignment specifically targeted at the IAR compiler.
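The "multiples of four" pattern follows from simple address arithmetic, since each 8051 NOP occupies one byte. The sketch below only illustrates that arithmetic. The base address 0x0100 is a made-up value, and the sketch assumes the NOPs sit in front of the critical instruction; whether the firmware in the question sets PCON.IDLE at such a place is not shown there.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustration only: returns true if code that starts at base_addr lands on
 * a 4-byte boundary after nop_count one-byte NOPs are inserted before it.
 * base_addr is a hypothetical address, not taken from the real map file. */
static bool lands_on_4_byte_boundary(uint16_t base_addr, uint16_t nop_count)
{
    return ((uint16_t)(base_addr + nop_count) % 4u) == 0u;
}
```

With a base address that is itself 4-byte aligned (for example 0x0100), 0, 4 and 8 NOPs land on a boundary, while 1, 2, 3, 5, 6 and 7 do not, which is the same pattern as reported in the question. The map file or the list file gives the real address to check.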
I am starting to believe that this is a hardware issue, related to the connection between CC2530 and the NFC IC.
The power and reset to the NFC IC that sends the external interrupt request is controlled by a CC2530 I/O pin with 20 mA current drive capacity. At reset, before execution of the program starts, all the I/O pins default to inputs with an internal weak pull-up. It seems that the current through the pull-up resistor is enough to power up the NFC IC. The interrupt signal from the NFC IC is high whenever the NFC IC is powered or an NFC field is present, and it is inverted by a FET transistor before reaching the CC2530. Hence the ISR is triggered by a falling edge on the input.
So what happens at startup is that the NFC IC is incorrectly powered on (and later off, when the ports are initialized), and the WAKE signal falls and rises very slowly due to the poor drive capacity of a pull-up (to make things worse, a large capacitor of 1 uF is connected in parallel with the gate of the FET, and another 1uF filters the NFC IC power pin).
WAKE is supposed to trigger an interrupt only on falling edge, but staying in the transition region for up to 10 ms as seen in the oscilloscope screenshot above seems to cause CC2530 to fire the interrupt even when WAKE is rising. The ISR starts to communicate with the NFC IC via SPI, but at this time, the NFC IC seems to be messed up due to the spurious transitions on VCC and reset. It refuses to respond, the execution halts in the ISR and the watchdog bites. And the process starts over, forever.
When I insert a delay that ensures WAKE is stable high before enabling the interrupt, all is well.
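As an alternative to a fixed delay, the firmware could poll WAKE and enable the interrupt only after it has read high for a number of consecutive samples. This is not what I actually used, just a sketch of the idea. All names here are hypothetical, and the pin is read through a callback so that the logic can be tried on a PC; `sim_wake_pin` is a stand-in for the real port read.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback type: returns the current level of the WAKE pin. */
typedef bool (*read_pin_fn)(void *ctx);

/* Polls the pin up to max_polls times. Returns true as soon as it has read
 * high for 'required' consecutive polls, false if that never happens. */
static bool wait_stable_high(read_pin_fn read_pin, void *ctx,
                             uint16_t required, uint32_t max_polls)
{
    uint16_t run = 0u;
    for (uint32_t i = 0u; i < max_polls; ++i) {
        run = read_pin(ctx) ? (uint16_t)(run + 1u) : 0u;
        if (run >= required) {
            return true;
        }
    }
    return false;
}

/* Host-side stand-in for the pin: reads low for the first *ctx polls,
 * then high. On the target this would read the port register instead. */
static bool sim_wake_pin(void *ctx)
{
    uint32_t *polls_until_high = ctx;
    if (*polls_until_high > 0u) {
        --*polls_until_high;
        return false;
    }
    return true;
}
```

On the target, the call would go right before `__enable_interrupt()`, and a `false` result could be handled by leaving the port interrupt disabled. The watchdog would still need to be serviced if the polling can take long.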
If I remove the 1 uF cap on the FET gate, WAKE rises very quickly, and there is no need for the delay anymore. And when I add a 4k7 pulldown to the NFC power, it is no longer powered up at reset.
Problem seems to be solved. The refactoring rearranged the code and changed the startup sequence, which led to a different delay that revealed the issue. With the proper hardware update, no delay will be needed.
But what still disturbs me is that I don't understand the magic NOPs. When the CC2530 had the interrupt enabled and encountered a slowly rising WAKE, it wouldn't always end up incorrectly in the ISR. And when it did, I could always make it run by adding 1..3 NOPs. Naturally, whenever I added or removed a line of code, the number of NOPs required changed, which, as you can imagine, drove me crazy.
It took me some time to narrow things down, and I am very grateful for all your comments and proposed solutions, especially to Clifford, who forced me to bring out the oscilloscope.
I am using an Atmel SAM4E-16e on an Atmel SAM4E-EK board. I have written a bootloader for this configuration.
The bootloader receives the .bin file via UART and writes it into flash. This works without problems; I made a hex dump and it was exactly what I expected:
Bootloader at 0x400000 (Flash Start Address of AT SAM4E)
My Application at 0x420000
0x800000 is Flash End Address
This is the C-Code:
int main(void){
// Init and downloading the .bin to Flash
binary_exec((void*) 0x420000);
}
int binary_exec(void * vStart){
int i;
// -- Check parameters
// Should be at least 32 words aligned
if ((uint32_t)vStart & 0x7F)
return 1;
Disable_global_interrupt();
// Disable IRQs
for (i = 0; i < 8; i ++) NVIC->ICER[i] = 0xFFFFFFFF;
// Clear pending IRQs
for (i = 0; i < 8; i ++) NVIC->ICPR[i] = 0xFFFFFFFF;
// -- Modify vector table location
// Barriers
__DSB();
__ISB();
// Change the vector table
SCB->VTOR = ((uint32_t)vStart & SCB_VTOR_TBLOFF_Msk);
// Barriers
__DSB();
__ISB();
Enable_global_interrupt();
// -- Load Stack & PC
_binExec(vStart);
return 0;
}
void _binExec (void * l_code_addr){
__asm__ ("mov r1, r0 \n"
"ldr r0, [r1, #4] \n" // I also tried #5, but that doesn't work either
"ldr sp, [r1] \n"
"blx r0"
);
}
But when I try to jump to my application, the application does not start.
The code for jumping to the program is taken from an Atmel example for the SAM8X (Cortex M3). The debugger sometimes says that the PC jumps to another address (0x004003E2) instead, but execution does not go on.
I found the old topic "Bootloader for Cortex M3", where the solution was to just add one to the address, but this doesn't work for me, even when I use their code. The debugger then stops responding.
I am using Atmel Studio 7 with GCC. The processor runs in Thumb mode.
I hope you can help me solve this problem or give me some tips on what is going wrong here.
This code assumes that program loaded at address 0x420000 starts with a vector table:
SP at offset 0 (0x420000)
Reset address at offset 4 (0x420004).
For this, the code seems perfectly correct.
But are you sure that this vector table is correct? Is bit 0 of the data at 0x420004 set, as this is Thumb code? When you compile this code, is it aware that it will run from this address (for any absolute address it might use)? Do you have the possibility to use a debugger to understand when the first fault occurs?
I think you should provide the disassembly of the first instructions of the program you try to load at this address.
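These checks can also be done in the bootloader itself before jumping. The following is a rough sketch, not a complete validation: `image_layout_t` and `vector_table_plausible` are made-up names, and the address ranges must be filled in from the datasheet and the linker script of your own project.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of where the application may live. */
typedef struct {
    uint32_t flash_start;  /* first address of the application area */
    uint32_t flash_end;    /* one past the last application address */
    uint32_t ram_start;    /* first RAM address                     */
    uint32_t ram_end;      /* one past the last RAM address         */
} image_layout_t;

/* vt points at the first two words of the image:
 * vt[0] = initial stack pointer, vt[1] = reset vector. */
static bool vector_table_plausible(const uint32_t vt[2],
                                   const image_layout_t *m)
{
    uint32_t sp    = vt[0];
    uint32_t reset = vt[1];

    /* The initial SP may equal ram_end (full-descending stack). */
    if (sp < m->ram_start || sp > m->ram_end) {
        return false;
    }
    /* Bit 0 of the reset vector must be set for Thumb code. */
    if ((reset & 1u) == 0u) {
        return false;
    }
    /* The entry point must lie inside the application area. */
    uint32_t entry = reset & ~1u;
    return entry >= m->flash_start && entry < m->flash_end;
}
```

With the application area set to 0x420000..0x800000, a reset vector such as 0x004001A5 (an image linked for 0x400000) is rejected, while 0x004201A5 passes. These example vector values are invented for illustration.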
I have solved the problem now.
I still use the code I posted in my question. The problem was that the .bin file I write to my processor's flash at 0x420000 was linked as if it were located at the flash start address (0x400000).
The reset vector it loaded therefore pointed to 0x400xyz instead of 0x420xyz, so the application jumped to the wrong address.
The solution was to change the flash start address to 0x420000 in the project I want to upload via the bootloader.
Question: how do I locate an application at an address other than 0x0000.0000?
Processor: NXP LPC1768
Dev system: Keil ARM 4.73
Steps used:
1) scatter file below used to set load region and execution region to 0x0000.2000
2) copied vector table to 0x2000
3) updated vtor register to 0x2000
Problem: Application does not run.
Scatter file used:
LR_IROM1 0x00002000 0x0000D000
{ ; load region size_region
ER_IROM1 0x00002000 0x0000D000
{ ; load address = execution address
*.o (RESET, +First)
*(InRoot$$Sections)
.ANY (+RO)
}
RW_IRAM1 0x10000000 0x00008000 { ; RW data
.ANY (+RW +ZI)
}
}
This follows the instructions specified in NXP app note AN10744. Is there something else I'm missing?
Vector Table Offset Register (VTOR) points to 0x00000000 at reset.
Thus, the initial stack pointer value must be stored at 0x00000000, and the program start address (the reset vector, loaded into the program counter) at 0x00000004.
If you change the location of the vector table in linker settings, you need to update VTOR to point to this new location. This can only happen at runtime.
This means that you need to have a small bootloader program which does the remapping, which means that first sector must be reserved for that purpose.
Bootloader needs to:
Make sure that interrupts are disabled, so you don't accidentally use VTOR.
Update VTOR register to address 0x2000.
Get stack pointer address from 0x2000 and update stack pointer register.
Get program start address from 0x2004 and update the program counter.
You might want to check out CMSIS library, it has functions like NVIC_SetVTOR and __set_MSP which make setting these registers a little easier.
To set the program counter, you can cast the address to function pointer and then call the function:
uint32_t * vtor = (uint32_t *)0x2000;
uint32_t startAddr = vtor[1];
( (void(*)(void))startAddr )(); // Cast and call
I've written a bootloader for my SAM4S that sits in sector 0 and loads an application in sector 1. The problem however is that when I attempt to jump to the new function it appears to generate an exception (debugger goes to Dummy_Handler()).
Bootloader contains the following entries in map:
.application 0x00410000 0x0
0x00410000 . = ALIGN (0x4)
0x00410000 _sappl = .
0x00410004 _sjump = (. + 0x4)
The application image map file has:
.vectors 0x00410000 0xd0 src/ASF/sam/utils/cmsis/sam4s/source/templates/gcc/startup_sam4s.o
0x00410000 exception_table
…
.text.Reset_Handler
0x0041569c 0x100 src/ASF/sam/utils/cmsis/sam4s/source/templates/gcc/startup_sam4s.o
0x0041569c Reset_Handler
Exception table is defined as follows:
const DeviceVectors exception_table = {
/* Configure Initial Stack Pointer, using linker-generated symbols */
.pvStack = (void*) (&_estack),
.pfnReset_Handler = (void*) Reset_Handler,
The bootloader declares the application jump point as:
extern void (*_sjump) ();
and then makes the following call:
_sjump();
The memory contents at 0x00410004 are 0x0041569d, and I notice that this is not word aligned. Is this because we are using Thumb instructions? Either way why is it not 0x0041569c? Or more importantly why is this going to an exception?
Thanks,
Devan
Update:
Found this but it does not appear to work for me:
void (*user_code_entry)(void);
unsigned *p;
p = (uint32_t)&_sappl + 4;
user_code_entry = (void (*)(void))(*p - 1);
if(applGood && tempGood) {
SCB->VTOR = &_sappl;
PrintHex(p);
PrintHex(*p);
PrintHex(user_code_entry);
user_code_entry();
}
The code prints:
00410004
0041569D
0041569C
Update Update:
The code that attempted to jump with a C function pointer produced the following Disassembly:
--- D:\Zebra\PSPT_SAM4S\PSPT_SAM4S\SAM4S_PSPT\BOOTLOADER\Debug/.././BOOTLOADER.c
user_code_entry();
004005BA ldr r3, [r7, #4]
004005BC blx r3
I was able to get this working with the following assembly:
"mov r1, r0 \n"
"ldr r0, [r1, #4] \n"
"ldr sp, [r1] \n"
"blx r0"
Based on this I wonder if the stack reset is required and, if so, is it possible to accomplish such in C?
I had the same problem with SAM4E. I cannot guess what your problem might be, but I can point out difficulties that I had and information I used.
My bootloader was not storing in the correct memory location a few parts of the firmware. This was leading to the dummy_handler exception. When I fixed the error in the address calculations the bootloader worked perfectly.
My suggestions:
Follow ATMEL's example: the Document and the Example Code should be enough. The main.c is enough to understand how the bootloader should work. It is not necessary to get into partitioning details at the beginning.
You may want to read how you can execute functions/ISRs from RAM
This webpage explains the Intel HEX format.
Finally, after the bootloader finishes with the upgrade, you can read the flash and send it back to the host computer. Then compare it with the original image (using a script). That is how I debugged my bootloader.
Other ideas that might help:
Do you erase each page before you write it?
Do you unlock each memory space before you erase/write it?
You could lock the Bootloader's Section to avoid overwriting it by mistake
You could lock the section(s) of the upgraded firmware.
The address you have to point to is 0x00410000, not 0x00410004. Atmel's example code (see function binary_exec) in combination with the Intel HEX format (record type 05) should solve this question.
I hope this piece of information will be of some help!
I had the same problem on the SAM4S which brought me to this question. So if someone arrives here again, here is what I found.
As ChrisB mentions, following the Atmel example code is a good start; however, I found that the problem was the actual code requesting the jump, which just didn't work for the SAM4S.
What was missing was rebasing the stack pointer before the vector table and then loading the reset handler address.
Try something like this:
static void ExecuteApp(void)
{
uint32_t i;
// Pointer to the Application Section
void (*application_code_entry)(void);
// -- Disable interrupts and system timer
__disable_irq();
SysTick->CTRL = 0; // Disable System timer
// disable and clear pending IRQs
for (i = 0; i < 8; i++)
{
NVIC->ICER[i] = 0xFFFFFFFF; // disable IRQ
NVIC->ICPR[i] = 0xFFFFFFFF; // clear pending IRQ
}
// Barriers
__DSB(); // data synchronization barrier
__ISB(); // instruction synchronization barrier
// Rebase the Stack Pointer
__set_MSP(*(uint32_t *) APPCODE_START_ADDR);
// Rebase the vector table base address
SCB->VTOR = ((uint32_t) APPCODE_START_ADDR & SCB_VTOR_TBLOFF_Msk);
// Load the Reset Handler address of the application
application_code_entry = (void (*)(void))(unsigned *)(*(unsigned *)
(APP_START_RESET_VEC_ADDRESS));
__DSB();
__ISB();
// -- Enable interrupts
__enable_irq();
// Jump to user Reset Handler in the application
application_code_entry();
}