MPU settings for ARM Cortex M7 - arm

I am working on a SoC with several Cortex M7 cores. It has SRAM mapped to region 0x2000 0000 -> 0x3FFF FFFF and DDR mapped to 0x6000 0000 -> 0xDFFF FFFF.
It seems that configuring the 4 partitions of DDR to cached normal memory (WT, WB or WBA) is triggering an HW bug which freezes the whole chip after a few seconds. Even the debugger get disconnected. Note that I do not need to access the DDR to raise the problem. Just running code in SRAM to configure the MPU and do various stuff will, after a random delay, trigger the bug.
Is there any limitation to the attributes I set when configuring the MPU?
I would say only the system space in 0xE000 0000 has fixed attributes and others are freely configurable depending on the HW implementation, but I have a doubt because if I refer to ARMv7-M arch ref manual, I can for instance find this:
B3.1 The system address map (p. 588)
[...] A declared cache type can be demoted but not promoted [...]
I am not sure here if this is just a limitation during runtime for two sequential configurations or an absolute limitation that forbids enabling the cache for partitions at 0xA... and 0xC... even once during init as they are not enabled in the default address map.
Also, the documentation indicates that the DDR is normal memory, but is there any problem if I keep it configured by default as device space (without taking into account slower non re-ordered accesses inherent to this configuration)?
Here is the exact configuration for the 4 DDR partitions:
RBAR -> 60000000 80000000 A0000000 C0000000 (start addresses)
RASR => 03080039 03080039 13010039 13100039 (default settings)
RASR => 03020039 03020039 03020039 03020039 (new settings triggering the bug)
Note that for the SRAM I keep the same setting:
RBAR -> 20000000
RASR => 030B0039

Related

Cannot write TMC/ETF registers on STM32H753

This question is the follow-up of this one.
I try to configure the ETF in circular mode in order to be able to read the execution trace through the embedded software in case of critical error, on a STM32H753.
I am following the algorithm described in ARM's Trace Memory Controller reference manual (section 2.2.2)
But I cannot write the ETF registers: I unlocked the ETF macrocell by writing the magic number 0xC5ACCE55to register ETF_LAR but when I read thez registers they are all 0 (through debugger or printf) and when I write them they remain at 0.
Any advice on how to write ETF registers ?
There is actually a mistake in the STM32H7 doc:
and few pages later:
My understanding is that the second image is wrong.
Also I had to use the "Component base address (system bus)" and not the "component base address (debugger)". This notion of double address is pretty confusing as for example for ETM there is only one address.
Anyway using 0x5C014000as base address, I can read/write the registers
When accesses are made using the DAP (SWD or JTAG), the debug subsystem should be powered up as part of the handshake with the tools. In a self-hosted debug mode, then device specific control of clocks/power is usually necessary.
For STM32H7, parts:
The ETM is in the same clock/power domain as the core, so it will normally be visible.
CK_DBG_D1 clocks the trace components in the D1 power domain: System
ROM table 2, CoreSight trace funnel, ETF, system CTI and TPIU. It is a
gated version of the D1 domain system clock (CK_HCLK_D1).
Self hosted debug clock and power control uses a dedicated set of registers.
All the debug clocks (except DAPCLK) can be enabled and disabled by
register bits in the DBGMCU.

STM32F4 FSMC/FMC SRAM as Heap/Stack results in random hardfaults

we are currently evaluating to use an external SRAM for C/C++ heap storage on our platform using a STM32F439BI microcontroller.
The problem
Using the SRAM as storage for heap results in random hardfaults which are raised from buserrors/imprecice buserrors.
Without placing the heap on the SRAM, memory tests run successfully on the whole SRAM (8 bit/16 bit and 32 bit accesses).
Connecting a debugger I can observe these errors sometimes before a hardfault occurs. Most often a word is read from the SRAM and the CPU register fills with addresses of the following format: 0x-1F3-1F3 (- is most often '0', sometimes 'A' or '6'). The pattern '1F3' persists. If the same address is read again some lines further down the correct value is read (some other address in 0x60000000 space).
If I stop the program on a breakpoint at some point early in the program and step a few lines, I get these errors more frequently.
Further details
The SRAM is connected using the FMC/FSMC peripheral on FMC bank 1 and SRAM bank 1 and is therefore memory-mapped to address 0x60000000.
All settings for GPIO pins and FMC configuration are set from the startup file before main() executes or static objects are created.
The SRAM is the following: CY7C1041GN30
We connect all 16 data pins, all 18 address pins, BHE, BLE, OE, WE and CE to our controller. All pins are configured as push-pull-alternate-function, pull-up, AF_12 (FMC), very high speed. We enable clocks for all necessary pins and the clock for FMC. Note: Initially we started out without pull-up/down showing the same symptoms.
The controller runs with a clock speed of 168 MHz
As stated above, a memory test runs successfully
We use DMA for SPI, I2C and ADC data transfers
We frequently use interrupts, including external (pin) interrupts
We use the following timing settings:
AddressSetupTime: 2
AddressHoldTime: 4
DataSetupTime: 4
BusTurnAroundDuration: 1
CLKDivision: 2
DataLatency: 2
We configure the FMC as follows:
NSBank FMC_NORSRAM_BANK1,
DataAddressMux FMC_DATA_ADDRESS_MUX_DISABLE,
MemoryType FMC_MEMORY_TYPE_SRAM,
MemoryDataWidth FMC_NORSRAM_MEM_BUS_WIDTH_16,
BurstAccessMode FMC_BURST_ACCESS_MODE_DISABLE,
WaitSignalPolarity FMC_WAIT_SIGNAL_POLARITY_LOW,
WrapMode FMC_WRAP_MODE_DISABLE,
WaitSignalActive FMC_WAIT_TIMING_BEFORE_WS,
WriteOperation FMC_WRITE_OPERATION_ENABLE,
WaitSignal FMC_WAIT_SIGNAL_DISABLE,
ExtendedMode FMC_EXTENDED_MODE_DISABLE,
AsynchronousWait FMC_ASYNCHRONOUS_WAIT_DISABLE,
WriteBurst FMC_WRITE_BURST_DISABLE,
ContinuousClock FMC_CONTINUOUS_CLOCK_SYNC_ASYNC,
WriteFifo 0,
PageSize 0
We spend a lot of time of experimenting with longer timings and compared all the settings to examples including this one: Using STM32L476/486 FSMC peripheral
to drive external memories (although this one is for the STM32L4, I am fairly certain it applies to this controller as well)
Findings on similar problems
The problem sounds very similar to this errata sheet entry: "2.3.4 Corruption of data read from the FMC" but it also says the error is fixed in our revision of the controller (3)
I hope someone out there has seen this strange behaviour before and can help us. After over one week of debugging we expect some kind of error in the controller when interrupts/DMA accesses occur while the CPU accesses the SRAM (when we use it as heap, it is accessed very frequently). Hopefully you can shed some light on this topic.
Sorry for not getting back to you, internet.
Yes, we found out what the issue was (at least in our case). Problem was that the J-Link debugger we use is causing problems if it hangs above the power electronics on our pcb (it is mounted vertically). If we guide the ribbon cable out at the top (only digital electronics) the error disappears. So our guess is, that some noise from the electronics was caught up by the cable and directly injected into the JTAG port, which caused failures inside the MCU.
Just got a confirmation from ST, that there is a bug in the STM32F469 FMC that might cause incorrect values if the write fifo is disabled. The workaround is to have the fifo enabled. It is the same issue as in this F7 processor https://www.st.com/resource/en/errata_sheet/dm00145382.pdf

Atmel SAM3X dual bank switching not working

I'm currently working with an Atmel SAM3X8 ARM microcontroller that features a dual banked 2 x 256KB flash memory. I'm trying to implement a firmware update feature, that puts the new firmware into the currently unused flash bank, and when done swaps the banks using the flash remapping to run the new firmware.
The datasheet states to do so I need to set the GPNVM2 bit, then the MCU will remap the memory, so Flash 1 is now at 0x80000 and Flash 0 at 0xC0000. This will also lead to the MCU executing code beginning from Flash 1.
To cite the datasheet:
The GPNVM2 is used only to swap the Flash 0 and Flash 1. If GPNVM2 is ENABLE, the Flash 1 is mapped at
address 0x0008_0000 (Flash 1 and Flash 0 are continuous). If GPNVM2 is DISABLE, the Flash 0 is mapped at
address 0x0008_0000 (Flash 0 and Flash 1 are continuous).
[...]
GPNVM2 enables to select if Flash 0 or Flash 1 is used for the boot.
Setting GPNVM bit 2 selects the boot from Flash 1, clearing it selects the boot from Flash 0.
But when I set GPNVM2, either via SAM-BA or my own firmware using flash_set_gpnvm(2) (ASF SAM Flash Service API), it will still boot from the program in Flash 0, and the new program will still reside at Flash 1's offset 0xC0000. The state of GPNVM2 has been verified by flash_is_gpnvm_set(2)
Flashing the firmware itself to Flash1 bank works flawlessly, that has been verified by dumping the whole flash memory with SAM-BA.
There is an errata from Atmel about an issue, that the flash remapping only works for portions smaller than 64KB. My code is less than that (40KB), so this shouldn't be an issue.
I've not found any other people having this issue, nor any example how to use it, so maybe somebody could tell me if I'm doing something wrong here, or what else to check.
I had the same issue (see here: Atmel SAM3X8E dual bank switching for booting different behaviour).
After some more research I found an Application Note (Link: http://ww1.microchip.com/downloads/en/AppNotes/Atmel-42141-SAM-AT02333-Safe-and-Secure-Bootloader-Implementation-for-SAM3-4_Application-Note.pdf) which explains the boot behaviour of the SAM3X in a more clear way. The problem is that the datasheet is a bit misleading (at least I was confused too). The SAM3X has no ability to remap the the Flash banks. The booting behaviour is a bit different (see the picture in the link, it's a snipped from the Application note, page 33/34):
Booting behaviour SAM3X
Picture 3-9 shows the SAM3X's behaviour at the boot-up. The GPNVM bits 1 and 2 just determine which memory section (ROM/Flash0/Flash1) is mirrored to the boot memory (located at 0x00000000). The mapping of the Flash banks is not changed. Therefore Flash0 still is mapped to 0x00080000 and Flash1 to 0x000C0000).
As the Application Note states some other Atmel microcontrollers are able to really remap the Flash banks (e.g. SAM3SD8 and SAM4SD32/16). These processors change the location of the Flash banks as you can see in picture 3-10.
To be able to update your firmware it is therefore necessary to implement some kind of bootloader. I implemented one by myself and was able to update my firmware even without using the GPNVM bits at all. I also opend a support ticket at Microchip to clarify the booting behaviour. When I receive an answer I hope to tell you more.
EDIT:
Here's the answer from the Microchip support:
Setting the GPNVM2 bit in SAM3X will merely make the CPU 'jump to' or start from flash bank 1 i.e. 0xC0000.
No actual swap of memory addresses will take place.
To use flash bank 1, you will need to change the linker file (flash.ld) to reflect the flash start address 0xC0000.
For flash bank 0 application, change:
rom (rx) : ORIGIN = 0x00080000, LENGTH = 0x00080000 /* Flash, 512K /
to:
rom (rx) : ORIGIN = 0x00080000, LENGTH = 0x00040000 / Flash, 256K */
For flash bank 1 application, change:
rom (rx) : ORIGIN = 0x00080000, LENGTH = 0x00080000 /* Flash, 512K /
to:
rom (rx) : ORIGIN = 0x000C0000, LENGTH = 0x00040000 / Flash, 256K */
If this is not done, the reset handler in the flash 1 application will point to an address in the flash 0 application.
So, although code will start execution in flash 1 (if GPNVM2 is set), it will jump back to the flash 0 application.
The errata stating the 64kb limitation can be ignored.
Therefore the Application Note is right and no actual change of the mmory mapping is performed.
Cheers
Lukas

Start to writing a bootloader for stm32l0 in IAR

What are the appropriate steps to write add a custom bootloader for stm32l0 in IAR? The following questions are not clear:
Do I make a new IAR Project?
If yes, do I write the bootloader like a normal project and just change my original .icf file so there is a small ROM and an small RAM region for the bootloader?
if no, what things do I have to configure in the IAR proejct apart from icf file and code?
what other things do I need to think of?
I'm having trouble starting into this.
So the icf would be for the main project:
__region_ROM_start__ = 0x08000000;
__region_ROM_end__ = 0x08008FFF;
So the icf would be for the bootloader project:
__region_Bootloader_ROM_start__ = 0x08009000;
__region_Bootloader_ROM_end__ = 0x08009FFF;
and the same thing for about 0xFF of RAM?
You do not need to restrict the RAM - you can use all of it because when you switch to the application a new run-time environment will be established and the RAM will be reused.
The flash you reserve for the bootloader must be a whole number of flash pages starting from the reset address The STM32L0 has very small flash pages so there should be minimal waste, but you don't want to have to change it if your bootloader grows, because then you will have to rebuild your application code for the new start address and old application images will no longer be loadable. So consider giving yourself a little headroom.
The bootloader can be built just like any other STM32L0xx project; the application code ROM configuration must start from an address above the bootloader. So for example say you have a 1Kbyte bootloader:
Boot ROM Start: 0x0800 0000
Boot ROM End: 0x0800 03FF
Application Start: 0x0800 0400
Application End: Part size dependent.
The bootloader itself must have a means of determining that an update is available, if an update is available it must then read the application data and write it to the application flash memory, it must then disable any interrupts that may have been enabled, it may also be necessary to deinitialise any peripherals used (if they remain active when the switch to the application is made it may cause problems), then the switch to the application code is made.
It is possible if the bootloader and application both run from the same clock configuration to minimise the configuration in the application and rely on the bootloader. This is a small space saving, but less flexible. If for example you make the bootloader run using the internal RC oscillator it will be portable across multiple hardware designs that may have differing application speed and clocking requirements and different external oscillator frequencies
The switch to the application is pretty simple on Cortex-M, it simply requires the vector table to be switched to the application's vector table, then the program-counter to be loaded - the latter requires a little assembly code. The following is for Cortex-M3, it may need some adaptation for M0+ but possibly not:
Given the following in-line assembly function:
__asm void boot_jump( uint32_t address )
{
LDR SP, [R0] ;Load new stack pointer address
LDR PC, [R0, #4] ;Load new program counter address
}
The bootloader switched to the application image thus:
// Switch off core clock before switching vector table
SysTick->CTRL = 0 ;
// Switch off any other enabled interrupts too
...
// Switch vector table
SCB->VTOR = APPLICATION_START_ADDR ;
//Jump to start address
boot_jump( APPLICATION_START_ADDR ) ;
Where APPLICATION_START_ADDR is the base address of the application area; this address is the start of the application's vector table, which starts with the initial stack pointer and reset vector, the boot_jump() function loads these into the SP and PC registers to start the application as if it had been started at reset. The application's reset vector contains the application's execution start address.
Your needs may vary, but in my experience a serial bootloader (using UART) using XMODEM and decoding an image in Intel Hex format takes about 4Kb of Flash. On an STM32L0 you may want to use something simpler - 1Kb is probably feasible if you simply stream raw binary the data and use hardware flow control (you need to control data flow because erasing and programming the flash takes time and also stops the CPU from running because you cannot on STM32 write flash memory while simultaneously fetching instructions from it).
See also: How to jump between programs in Stellaris

Register mapping in a ARM based SoC

I want to understand how the registers of various peripherals/IPs are mapped to the ARM processor memory map in a microcontroller.
Say, I have a CONTROL register for UART block. When I do a write access to address (40005008), this register gets configured. Where does this mapping happens: Within the peripheral block code itself or while integrating this peripheral to the SoC/microcontroller.
For a simple peripheral like a UART it's straightforward - taking the ARM PL011 UART as an example (since I know where its documentation lives):
The programmer's model defines a bunch of registers at word-aligned offsets in a 4k block.
In terms of the actual hardware, we see the bus interface matches what the programmer's model suggests - PADDR[11:2] means only bits 11:2 of the address are connected, meaning it can only understand word-aligned addresses from 0x000 to 0xffc (similarly, note that only 16 bits of read/write data are connected, since no register is wider than that).
The memory-mapping between the UART's 12-bit address and the full 32-bit address that the CPU core spits out happens in the interconnect hardware between them. At design time, the interconnect address map will be configured to say "this 4k region at 0x40005000 is assigned to the UART0 block", etc., and the resulting bus circuitry will be generated for that.
More complex things like e.g. DMA-capable devices typically have separate interfaces for configuration and data access, so the registers can be mapped in a small relocatable block on a low-speed peripheral bus much like the UART.
Most significant bits are defined by your ASIC design, least significant bits are defined by the IP design. Your IP has several registers. The number of register, their order, is defined by the IP design. Here, your register is at address 8. Then when designing the ASIC, the peripherals are connected to the memory bus, and the way they are connected define their address. Your UART is at 40005000. You may have an other instance of the same IP at (for instance) 40006000. The two UART would be strictly identical, and you would be able to access CONTROL register of your second UART at address 40006008.

Resources