I have created a simple example program with the Xilinx SDK that uses FreeRTOS, and I am running into an unexpected issue. I want to fire a software interrupt, so I have set up the code this way:
void software_test( void ) __attribute__((interrupt_handler));
void software_test( void )
{
// clear the interrupt
*((volatile uint32_t *) 0x4120000C) = 0x80;
interrupt_occurred++;
}
When I try to build it, the linker complains about:
\interrupt_example_bsp\microblaze_0\libsrc\freertos823_xilinx_v1_1\src/portasm.S:288: multiple definition of `_interrupt_handler'
./src/freertos_hello_world.o:\Debug/../src/freertos_hello_world.c:130: first defined here
I checked portasm.S and it has the following code in it:
.global _interrupt_handler
... bunch more unrelated code here
.text
.align 4
_interrupt_handler:
portSAVE_CONTEXT
/* Stack the return address. */
swi r14, r1, portR14_OFFSET
/* Switch to the ISR stack. */
lwi r1, r0, pulISRStack
/* The parameter to the interrupt handler. */
ori r5, r0, configINTERRUPT_CONTROLLER_TO_USE
/* Execute any pending interrupts. */
bralid r15, XIntc_DeviceInterruptHandler
or r0, r0, r0
/* See if a new task should be selected to execute. */
lwi r18, r0, ulTaskSwitchRequested
or r18, r18, r0
/* If ulTaskSwitchRequested is already zero, then jump straight to
restoring the task that is already in the Running state. */
beqi r18, task_switch_not_requested
/* Set ulTaskSwitchRequested back to zero as a task switch is about to be
performed. */
swi r0, r0, ulTaskSwitchRequested
/* ulTaskSwitchRequested was not 0 when tested. Select the next task to
execute. */
bralid r15, vTaskSwitchContext
or r0, r0, r0
... bunch more code here
I am unclear how to fix this. Has anyone else encountered it?
Any help is greatly appreciated. Thanks in advance.
Here is some information on implementing a Microblaze ISR using FreeRTOS: http://www.freertos.org/RTOS-Xilinx-Microblaze-KC705.html#implementing_an_ISR
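The linker error suggests that the interrupt_handler attribute makes the compiler emit software_test under the symbol _interrupt_handler, which the FreeRTOS port already defines in portasm.S. Below is a minimal sketch of the alternative, assuming the port's xPortInstallInterruptHandler() and vPortEnableInterrupt() are used instead. The handler becomes a plain C function that the port's own _interrupt_handler reaches through XIntc. SOFTWARE_TEST_IRQ_ID is a hypothetical interrupt ID, and everything between the stand-in markers exists only so the sketch compiles off-target; on hardware those definitions come from FreeRTOS.h and the BSP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ---- Host-side stand-ins (assumption: on target these are replaced by
 * FreeRTOS.h and the BSP headers, which declare the two port functions). ---- */
typedef void (*XInterruptHandler)(void *callback_ref);
#define STANDIN_MAX_IDS 32u
static XInterruptHandler standin_handlers[STANDIN_MAX_IDS];
static void *standin_refs[STANDIN_MAX_IDS];
static uint32_t standin_enabled;

long xPortInstallInterruptHandler(uint8_t id, XInterruptHandler handler, void *ref)
{
    if (id >= STANDIN_MAX_IDS || handler == NULL) {
        return 0; /* pdFAIL */
    }
    standin_handlers[id] = handler;
    standin_refs[id] = ref;
    return 1; /* pdPASS */
}

void vPortEnableInterrupt(uint8_t id)
{
    if (id < STANDIN_MAX_IDS) {
        standin_enabled |= (uint32_t)1u << id;
    }
}

/* Mimics the dispatch the port performs when the interrupt fires. */
void standin_raise_interrupt(uint8_t id)
{
    if (id < STANDIN_MAX_IDS &&
        (standin_enabled & ((uint32_t)1u << id)) != 0u &&
        standin_handlers[id] != NULL) {
        standin_handlers[id](standin_refs[id]);
    }
}
/* ---- End of stand-ins ---- */

#define SOFTWARE_TEST_IRQ_ID 0u /* hypothetical: take the real ID from xparameters.h */

volatile uint32_t interrupt_occurred;

/* No interrupt_handler attribute: the port owns _interrupt_handler. */
static void software_test(void *callback_ref)
{
    (void)callback_ref;
    /* On target, clear the interrupt source here as the original handler does. */
    interrupt_occurred++;
}

/* Returns 1 on success, 0 if the handler could not be installed. */
int install_software_test(void)
{
    if (xPortInstallInterruptHandler(SOFTWARE_TEST_IRQ_ID, software_test, NULL) != 1) {
        return 0;
    }
    vPortEnableInterrupt(SOFTWARE_TEST_IRQ_ID);
    return 1;
}
```

Calling install_software_test() before the scheduler starts, or from a task, should be enough for the port to route the interrupt to software_test.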
I have a strange problem with a Cortex-M0+ (STM32G0B1RETx) running FreeRTOS (10.3.1, heap3) and a GUI built with lvgl (v8.3). The toolchain is GNU Tools for STM32 9-2020-q2-update, and the configuration was generated by STM32CubeIDE. The problem occurs in random places and seems related to code size: changing the code moves it to a different place. Currently it occurs in the code that handles button events:
static void leftEventHandler(lv_event_t *e) {
lv_event_code_t code = lv_event_get_code(e);
uint8_t index = (uint8_t)e->user_data;
if (code == LV_EVENT_CLICKED) {
Presenter_onLeftPressed(index);
}
}
When I press buttons at random, or even just enter a view that uses this handler, the function sometimes does not return and instead falls through into the next code in memory, which happens to be:
static void fragmentAttach(lv_fragment_t *self) {
Presenter_onStart();
}
Disassembly:
leftEventHandler:
08040700: push {r4, lr}
08040702: movs r4, r0
08040704: bl 0x800f1f0 <lv_event_get_code>
168 if (code == LV_EVENT_CLICKED) {
08040708: cmp r0, #7
0804070a: bne.n 0x8040714 <leftEventHandler+20>
169 Presenter_onLeftPressed(index);
0804070c: ldr r0, [r4, #12]
0804070e: uxtb r0, r0
08040710: bl 0x8047080 <Presenter_onLeftPressed>
08040714: pop {r4, pc}
120 Presenter_onStart();
fragmentAttach:
08040716: push {r4, lr}
08040718: bl 0x8046fbc <Presenter_onStart>
0804071c: pop {r4, pc}
0804071e: movs r0, r0
It looks like "pop {r4, pc}" sometimes doesn't update pc, although r4 is correctly restored. "push {r4, lr}" correctly places the registers on the stack, and those values are unchanged when "pop" executes. "Presenter_onLeftPressed" is not called; usually "code != LV_EVENT_CLICKED" when the problem occurs.
Breakpoint at "fragmentAttach" 08040716: [screenshots: registers, memory]
Increasing the stack size for the threads doesn't fix the problem. Disabling all threads, leaving only the idle and timer tasks, causes a hard fault in the scheduler task.
After fighting with this issue for a long time I found the fix: disabling "Prefetch Buffer" in STM32CubeMX fixed the problem :)
I am trying to apply a type of side-channel attack I read about in this paper, which tries to infer execution state from differences in IRQ latency on an MCU with a Cortex-M4 processor. The attack carefully interrupts instructions that occur right after a branch and measures the interrupt latency. When different branches have instructions of different lengths, you can look at the interrupt latency to determine in which of these branches the interrupt occurred and so leak some of the program state.
I wrote a simple function that I want to attack in the way described above. I am using the SysTick timer to generate the interrupt at the correct point in time. To get an initial good value for the interrupt timer I used GDB to stop the program at the target line to see the SysTick value at that time.
I implemented a very simple interrupt handler that:
- loads the SysTick timer value from memory,
- subtracts this value from the reload value to get the elapsed time since the interrupt (i.e. the IRQ latency), and
- clears the interrupt.
void __attribute__((interrupt("IRQ"))) SysTick_Handler(void)
{
/* USER CODE BEGIN SysTick_IRQn 0 */
SysTick->CTRL &= 0xfffffffe; // disable SysTick (~SysTick_CTRL_ENABLE_Msk)
*timer_value = SysTick->VAL; // capture counter value (as quickly as possible)
*timer_value = SysTick->LOAD - *timer_value; // subtract it from reload value to get IRQ latency
SysTick->VAL = 0; // reset initial value
}
However I find that I always get the same IRQ latency, regardless of the instruction that was interrupted. I expect the interrupt latency to be longer when a longer instruction is interrupted.
This is the function I wrote to test the attack
extern uint32_t *timer_value;
int sample_function(int *a, int *b){
/*
* function description -- store the smallest of the two value in a, if MEASURE_CYCLESS defined return the number
* of clock cycles that have been elapsed since the timer has been started
* r0 contains pointer to a
* r1 contains pointer to b
*/
__asm volatile(
/* push working registers */
"PUSH {r4-r8} \n"
/* move counter into r8 */
"MOV r8, #10 \n"
/* begin loop */
"begin_loop: \n"
/* decrement counter variable*/
"SUB r8, r8, #1 \n"
/* if counter variable not equal to 0, jump back to start of loop */
"CMP r8, #0 \n"
/* if r8 not equal to 0, jump back to begin of loop*/
"BNE begin_loop \n"
/* load a into r2 */
"LDR r2, [r0] \n"
/* load b into r3 */
"LDR r3, [r1] \n"
/* store a-b in r4, setting status flags -- if result is 0 Z flag is set */
"SUBS r4, r2, r3 \n"
/* if a-b positive, a is larger otherwise, b is larger (assuming a not equal to b) */
"BPL a_larger \n"
#ifdef SPY
/* load address of (*timer_value) into r4 -- use of LDR pseudo-instruction places constant in a literal pool*/
"LDR r4, =timer_value \n"
/* Load (*timer_value) into r4 */
"LDR r4, [r4] \n"
/* load address of Systick VAL into r5 */
"LDR r5, =0xe000e018 \n"
/* Load value at address stored in R5 (= Systick Val) */
"LDR r5, [r5] \n"
/* Store SysTick VAL at the address held in r4 (the location timer_value points to) */
"STR r5, [r4] \n"
#endif
"NOP \n"
/*instruction that gets interrupted -- swap value*/
"STR r2, [r1] \n"
/* store b's value (r3) at the address in r0 (a) */
"STR r3, [r0] \n"
"B end \n"
"a_larger: \n"
"MOV r0, #0 \n" // instruction that gets interrupted
"end: POP {r4-r8}"
); // pop working registers
}
Note, the section of code in the #ifdef SPY block is used to automatically determine a good timer reload value (instead of using GDB), but I'm currently not using the value I obtained this way.
I also have an empty loop in there to delay the instruction that is meant to be interrupted a bit.
The instruction that gets interrupted is the instruction right after the #ifdef SPY block. When I remove the NOP instruction I still get the same interrupt latency. When I increase or decrease the timer value (to interrupt some cycles earlier or later) I also still get the same IRQ latency.
Am I missing something here? Is there some behavior I do not know about?
Also, is it important to use the attribute __attribute__((interrupt("IRQ"))) for an interrupt handler?
This is what I was thinking and commenting on.
bootstrap
.thumb_func
reset:
bl notmain
ldr r4,=0xE000E018
ldr r0,=0xE000E010
mov r1,#7
str r1,[r0]
b hang
.thumb_func
hang:
nop
nop
nop
nop
nop
nop
nop
b hang
setup uart and systick
void notmain ( void )
{
uart_init();
hexstring(0x12345678);
PUT32(STK_CSR,4);
PUT32(STK_RVR,0xF40000);
PUT32(STK_CVR,0x00000000);
//PUT32(STK_CSR,7);
}
event handler
.thumb_func
.globl systick_handler
systick_handler:
ldr r0,[r4]
ldr r5,[sp,#0x18]
push {r0,lr}
bl hexstrings
mov r0,r5
bl hexstring
pop {r0,pc}
grab the timer and address of interrupted instruction and print them out.
00F3FFF4 08000054
00F3FFF4 08000056
00F3FFF4 08000058
00F3FFF4 0800005A
00F3FFF4 0800005C
00F3FFF4 0800005E
00F3FFF4 08000054
00F3FFF4 08000056
00F3FFF4 08000058
00F3FFF4 0800005A
00F3FFF4 08000050
08000050 <hang>:
8000050: bf00 nop
8000052: bf00 nop
8000054: bf00 nop
8000056: bf00 nop
8000058: bf00 nop
800005a: bf00 nop
800005c: bf00 nop
800005e: e7f7 b.n 8000050 <hang>
From ARM's documentation.
Interrupt Latency
There is a maximum of a twelve cycle latency from asserting the interrupt to execution of the first instruction of the ISR when the memory being accessed has no wait states being applied. When the FPU option is implemented and a floating point context is active and the lazy stacking is not enabled, this maximum latency is increased to twenty nine cycles. The first instructions to be executed are fetched in parallel to the stack push.
And that last line we can perhaps see happening here. You can try various instructions, but this architecture has the ability to restart the long duration instructions (reads and push/pop, multiply, and such). I think to see much of a latency difference you may need to create bus or shared resource contention (vs instructions)
Also systick is an exception not an interrupt, so there may be some differences with respect to latency.
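As a rough check of the numbers printed above, the captured counter value can be turned into a tick count. This is a minimal sketch with a hypothetical helper name. It assumes the reload value 0xF40000 written to STK_RVR and the captured value 00F3FFF4, and it counts ticks since the counter reloaded, so it also includes the handler's first load instruction.

```c
#include <assert.h>
#include <stdint.h>

/* SysTick is a 24-bit down-counter that restarts from the reload value after
 * reaching zero, so the ticks elapsed since the reload are reload - captured. */
uint32_t systick_elapsed_ticks(uint32_t reload, uint32_t captured)
{
    return (reload - captured) & 0x00FFFFFFu;
}
```

With the values from the listing, systick_elapsed_ticks(0x00F40000u, 0x00F3FFF4u) gives 12 for every interrupted address, whether that address holds a nop or the branch.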
I'm currently developing an operating system for the Raspberry Pi 2 as the final project for my university degree. Right now I'm having serious problems creating a simple timer that raises an interrupt every second, because the documentation provided by ARM doesn't make clear how to initialize that module.
I read the architecture reference manual; it's under ARM architecture/Reference manuals/ARMv7-AR.
Can someone explain to me the process of initializing a core timer?
Here is what I have tried so far:
In my C file
_local_timer_init();
// ROUTING IRQ
*(volatile uint32_t*)CORE0_L_TIMER_INT_CTL = 0x8;
In my assembly file
.globl _local_timer_init
/*
THIS STEPS APPLIES IN A SYSTEM WHERE THERE IS NOT VIRTUALIZATION SUPPORT
(I think so)
1. Look into CNTKCTL register if you need
2. Look into CNTP_CTL or CNTH_CTL or CNTV_CTL to enable or disable
the corresponding timer (bit 0)
3. You have to set the compare value for the corresponding timer
CNTP_CVAL, CNTH_CVAL, CNTV_CVAL if needed
4. It should be in boot.S but you have to initialize the counter
frequency register, CNTFRQ
5. Putting the corresponding TVAL register to a right value
6. Routing the IRQ and enabling IRQ of the corresponding core
*/
_local_timer_init:
// ENABLING TIMER
mov r0, #1
mcr p15, #0, r0, c14, c3, #1 //Write to CNTV_CTL
// SETTING FREQUENCY TIMER
//we don't need this right now
// SETTING TVAL REGISTER (virtual)
mrc p15, #0, r0, c14, c0, #0 //we obtain CNTFRQ
mcr p15, #0, r0, c14, c3, #0 //Write to CNTV_TVAL
bx lr //return to the C caller
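Because CNTFRQ holds the counter frequency in ticks per second, writing it to CNTV_TVAL as above requests an interrupt one second later. Here is a minimal sketch of the same arithmetic for other periods. The helper name is hypothetical, and the 19.2 MHz used in the usage note is only an example figure; the real value has to be read from CNTFRQ.

```c
#include <assert.h>
#include <stdint.h>

/* Number of counter ticks for a period in milliseconds, given the counter
 * frequency in Hz. The intermediate product is widened to 64 bits so it
 * cannot overflow before the division. */
uint32_t timer_tval_for_ms(uint32_t cntfrq_hz, uint32_t period_ms)
{
    return (uint32_t)(((uint64_t)cntfrq_hz * period_ms) / 1000u);
}
```

For example, timer_tval_for_ms(19200000u, 10u) yields 192000 ticks. The timer fires once per TVAL write, so the IRQ handler has to write TVAL again to get a periodic interrupt.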
I also created my custom assembly handler for IRQ exceptions, shown below. Maybe the problem is here; I really don't know. Is this the correct way to handle an IRQ exception?
irq_s_handler:
/*Mode: PL1 irq */
srsda sp!, #0x12 //we stores the spsr and lr at the address contained in sp of the mode irq
/*
It is necessary to switch to supervisor mode and store some registers
into it's stack for having support for nested exceptions
*/
push {r0-r12}
bl irq_c_handler
pop {r0-r12}
rfeib sp! //we do the inverse operation of srsdb
subs pc, lr, #4 //we adjust the appropiate value considered
Our current project includes FreeRTOS, and I added --use_frame_pointer to the ARMCC compiler options in Keil uVision. After loading the firmware onto an STM32F104 chip and running it, it crashed. Without --use_frame_pointer, everything is OK.
The hard fault handler shows that faultStackAddress is 0x40FFFFDC, which points to a reserved area. Does anyone have any idea what causes this error? Thanks a lot.
#if defined(__CC_ARM)
__asm void HardFault_Handler(void)
{
TST lr, #4
ITE EQ
MRSEQ r0, MSP
MRSNE r0, PSP
B __cpp(Hard_Fault_Handler)
}
#else
/* naked: no compiler prologue may touch lr or sp before the TST below */
__attribute__((naked)) void HardFault_Handler(void)
{
__asm("TST lr, #4");
__asm("ITE EQ");
__asm("MRSEQ r0, MSP");
__asm("MRSNE r0, PSP");
__asm("B Hard_Fault_Handler");
}
#endif
void Hard_Fault_Handler(uint32_t *faultStackAddress)
{
}
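The empty Hard_Fault_Handler above receives a pointer to the exception frame, which on Cortex-M holds r0-r3, r12, lr, pc and xPSR in that order. Below is a minimal sketch of decoding it, with hypothetical names. Since the reported pointer 0x40FFFFDC is not in RAM, the sketch also shows a range check to run before dereferencing; the SRAM base and size are parameters because they depend on the part.

```c
#include <assert.h>
#include <stdint.h>

/* Registers stacked by the hardware on exception entry, lowest address first. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, psr;
} fault_frame_t;

/* Non-zero if addr lies inside [base, base + size). */
int address_in_region(uint32_t addr, uint32_t base, uint32_t size)
{
    return addr >= base && (addr - base) < size;
}

/* Copies the eight stacked words into a named structure. Only call this
 * after checking that the pointer refers to readable RAM. */
fault_frame_t decode_fault_frame(const uint32_t *faultStackAddress)
{
    fault_frame_t f;
    f.r0  = faultStackAddress[0];
    f.r1  = faultStackAddress[1];
    f.r2  = faultStackAddress[2];
    f.r3  = faultStackAddress[3];
    f.r12 = faultStackAddress[4];
    f.lr  = faultStackAddress[5];
    f.pc  = faultStackAddress[6]; /* address of the faulting instruction */
    f.psr = faultStackAddress[7];
    return f;
}
```

When the range check fails, as it would for 0x40FFFFDC, the frame itself cannot be trusted, and the stack pointer was most likely already corrupted before the fault was taken.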
I stepped through each line of code, and the crash happened in the function below, in FreeRTOS's port.c, after I called vTaskDelete(NULL);
void vPortYieldFromISR( void )
{
/* Set a PendSV to request a context switch. */
portNVIC_INT_CTRL_REG = portNVIC_PENDSVSET_BIT;
}
But this does not seem to be the root cause, because the crash still happened after I removed vTaskDelete(NULL).
[update on Jan 8] sample code
#include "FreeRTOSConfig.h"
#include "FreeRTOS.h"
#include "task.h"
#include <stm32f10x.h>
void crashTask(void *param)
{
unsigned int i = 0;
/* halt the hardware. */
while(1)
{
i += 1;
}
vTaskDelete(NULL);
}
void testCrashTask()
{
xTaskCreate(crashTask, (const signed char *)"crashTask", configMINIMAL_STACK_SIZE, NULL, 1, NULL);
}
void Hard_Fault_Handler(unsigned int *faultStackAddress);
/* The fault handler implementation calls a function called Hard_Fault_Handler(). */
#if defined(__CC_ARM)
__asm void HardFault_Handler(void)
{
TST lr, #4
ITE EQ
MRSEQ r0, MSP
MRSNE r0, PSP
B __cpp(Hard_Fault_Handler)
}
#else
/* naked: no compiler prologue may touch lr or sp before the TST below */
__attribute__((naked)) void HardFault_Handler(void)
{
__asm("TST lr, #4");
__asm("ITE EQ");
__asm("MRSEQ r0, MSP");
__asm("MRSNE r0, PSP");
__asm("B Hard_Fault_Handler");
}
#endif
void Hard_Fault_Handler(unsigned int *faultStackAddress)
{
int i = 0;
while(1)
{
i += 1;
}
}
void nvicInit(void)
{
NVIC_PriorityGroupConfig(NVIC_PriorityGroup_4);
#ifdef VECT_TAB_RAM
NVIC_SetVectorTable(NVIC_VectTab_RAM, 0x0);
#else
NVIC_SetVectorTable(NVIC_VectTab_FLASH, 0x0);
#endif
}
int main()
{
nvicInit();
testCrashTask();
vTaskStartScheduler();
}
/* For now, the stack depth of IDLE has 88 left. if want add func to here,
you should increase it. */
void vApplicationIdleHook(void)
{ /* ATTENTION: all funcs called within here, must not be blocked */
//workerProbe();
}
void debugSendTraceInfo(unsigned int taskNbr)
{
}
When the crash happened, the Keil MDK IDE reported the fault information below in HardFault_Handler. I looked up the STKERR error, which mainly means that the stack pointer is corrupted. But I really have no idea why it is corrupted. Without --use_frame_pointer, everything works OK.
[update on Jan 13]
I did some further investigation. It seems the crash is caused by FreeRTOS's default timer task: if I comment out xTimerCreateTimerTask() in the vTaskStartScheduler() function (tasks.c), the crash does not happen.
Another odd thing is that if I debug it, step into the timer task's portYIELD_WITHIN_API() call, and then resume the application, it does not crash. So my guess is that this is timing-dependent, but I could not find the root cause.
Any thoughts? Thanks.
I ran into a similar problem in my project. It looks like armcc --use_frame_pointer tends to generate broken function epilogues. An example of generated code:
; function prologue
stmdb sp!, {r3, r4, r5, r6, r7, r8, r9, r10, r11, lr}
add.w r11, sp, #36
; ... actual function code ...
; function epilogue
mov sp, r11
; <--- imagine an interrupt happening here
sub sp, #36
ldmia.w sp!, {r3, r4, r5, r6, r7, r8, r9, r10, r11, pc}
This code actually seems to break the constraint from AAPCS section 5.2.1.1:
A process may only access (for reading or writing) the closed interval of the entire stack delimited by [SP, stack-base – 1] (where SP is the value of register r13).
Now, on Cortex-M3, when an exception/interrupt arrives, a partial register set is automatically pushed onto the current process' stack before jumping into the exception handler. If an exception is raised between the mov and the sub, that partial register set will overwrite the registers stored by the function prologue's stmdb instruction, thus corrupting the state of the caller function.
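The overwrite can be reproduced on a host with a small model of the stack. This is only a sketch under stated assumptions: the stack is an array of words, the exception frame is exactly eight words with no extra alignment word, and all names are hypothetical. It counts how many of the ten saved registers are destroyed when the exception arrives mid-epilogue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS     64u
#define SAVED_WORDS     10u /* r3-r11 and lr, stored by the stmdb */
#define FP_OFFSET_WORDS 9u  /* add.w r11, sp, #36 -> 36 bytes = 9 words */
#define EXC_FRAME_WORDS 8u  /* r0-r3, r12, lr, pc, xPSR */
#define SAVED_MARK      0xA5A50000u
#define EXC_MARK        0xDEADBEEFu

static uint32_t stack_mem[STACK_WORDS];

/* Exception entry: the hardware writes its frame just below the current sp.
 * sp is restored on exception return, so the caller's sp is unchanged. */
static void exception_entry(size_t sp)
{
    for (size_t i = 1; i <= EXC_FRAME_WORDS; i++) {
        stack_mem[sp - i] = EXC_MARK;
    }
}

/* Runs the prologue and epilogue with an exception arriving mid-epilogue and
 * returns how many of the ten saved registers were overwritten. */
size_t clobbered_saved_registers(int single_instruction_epilogue)
{
    size_t sp = STACK_WORDS; /* word index; the stack grows downwards */
    size_t clobbered = 0;

    /* Prologue: stmdb sp!, {r3-r11, lr} */
    sp -= SAVED_WORDS;
    for (size_t i = 0; i < SAVED_WORDS; i++) {
        stack_mem[sp + i] = SAVED_MARK | (uint32_t)i;
    }
    size_t r11 = sp + FP_OFFSET_WORDS; /* add.w r11, sp, #36 */

    sp -= 4; /* function body: some locals (arbitrary amount) */

    if (single_instruction_epilogue) {
        sp = r11 - FP_OFFSET_WORDS; /* sub sp, r11, #36 */
        exception_entry(sp);        /* exception after the sub */
    } else {
        sp = r11;                   /* mov sp, r11 */
        exception_entry(sp);        /* exception between mov and sub */
        sp -= FP_OFFSET_WORDS;      /* sub sp, #36 */
    }

    /* The ldmia would now reload whatever is in the ten saved slots. */
    for (size_t i = 0; i < SAVED_WORDS; i++) {
        if (stack_mem[sp + i] != (SAVED_MARK | (uint32_t)i)) {
            clobbered++;
        }
    }
    return clobbered;
}
```

In this model the two-instruction epilogue loses the saved r4-r11 (eight words), while the single-instruction form loses nothing because sp never sits above live data.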
Unfortunately, there doesn't seem to be any easy solution. None of the optimization settings fixes this code, even though the fix looks simple (merging the two instructions into sub sp, r11, #36). It seems that --use_frame_pointer is too broken to work on Cortex-M3 with multi-threaded code, at least on ARMCC 5.05u1; I didn't have the chance to check other versions.
If using a different compiler is an option for you, arm-none-eabi-gcc -fno-omit-frame-pointer seems to emit saner function epilogues, though.
I'm currently trying to port FreeRTOS to the TI AM335x processor, best known for its use on the BeagleBone boards. I am able to boot, drive GPIOs, and set up a compare-match timer for the system tick. If I disable interrupts, I can see the interrupt flag get set the correct amount of time after the timer was started. If I enable interrupts, my application dies after that same amount of time. The application also dies if I try to yield from a task, i.e. invoke the SWI handler. This makes me believe that the vector table is unavailable or incorrectly set up. The ROM exception vectors for SWI and IRQ contain 4030CE08h and 4030CE18h, which point into RAM, where some further branching is executed. The TRM says:
User code can redirect any exception to a custom handler either by writing its address to the appropriate location from 4030CE24h to 4030CE3Ch or by overriding the branch (load into PC) instruction between addresses from 4030CE04h to 4030CE1Ch.
My vIRQHandler function address is therefore written to 4030CE38h. One would hope this was enough, but sadly no. I suspect that there is something wrong in my boot.s file; however, my assembly has never been that great and I'm struggling to understand the code. The boot.s and the rest of the project were started from an OMAP3 port.
Boot.s:
.section .startup,"ax"
.code 32
.align 0
b _start /* reset - _start */
ldr pc, _undf /* undefined - _undf */
ldr pc, _swi /* SWI - _swi */
ldr pc, _pabt /* program abort - _pabt */
ldr pc, _dabt /* data abort - _dabt */
nop /* reserved */
ldr pc, _irq /* IRQ - read the VIC */
ldr pc, _fiq /* FIQ - _fiq */
_undf: .word 0x4030CE24 /* undefined */
_swi: .word 0x4030CE28 /* SWI */
_pabt: .word 0x4030CE2C /* program abort */
_dabt: .word 0x4030CE30 /* data abort */
_irq: .word 0x4030CE38
_fiq: .word 0x4030CE3C /* FIQ */
The branch to _start sets up a stack for each mode and clears the bss; I'm not sure if that is relevant. This is the code which seems relevant to me, and I have updated the words to fit the AM335x instead of the OMAP3.
Setting the IRQ handler:
#define E_IRQ (*(REG32 (0x4030CE38)))
....
/* Setup interrupt handler */
E_IRQ = ( long ) vIRQHandler;
If anyone has any pointers for an assembly newbie it would be much appreciated, because I'm completely stuck :)
U-boot had moved the exception vector table. However, instead of recompiling u-boot I just reset the exception vector table in my own start script.
Added this right before branching to main:
/* Set V=0 in CP15 SCTLR register - for VBAR to point to vector */
mrc p15, 0, r0, c1, c0, 0 /* Read CP15 SCTLR Register */
bic r0, #(1 << 13) /* V = 0 */
mcr p15, 0, r0, c1, c0, 0 /* Write CP15 SCTLR Register */
/* Set vector address in CP15 VBAR register */
ldr r0, =_vector_table
mcr p15, 0, r0, c12, c0, 0 /* Set VBAR */
bl main
And put in the _vector_table label at the start of my exception vector table:
.section .startup,"ax"
.code 32
.align 0
_vector_table: b _start /* reset - _start */
ldr pc, _undf /* undefined - _undf */
ldr pc, _swi /* SWI - _swi */
ldr pc, _pabt /* program abort - _pabt */
ldr pc, _dabt /* data abort - _dabt */
nop /* reserved */
ldr pc, _irq /* IRQ - read the VIC */
ldr pc, _fiq /* FIQ - _fiq */
Now all the exceptions get redirected to my code. Hopefully this will help anyone in the same situation that I was in :)
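The two register updates in the start script come down to clearing one bit and supplying a suitably aligned address. Below is a small host-side sketch of those two checks, with hypothetical helper names. It assumes the ARMv7-A layout in which SCTLR bit 13 is V and the low five bits of VBAR are reserved, so the table has to start on a 32-byte boundary.

```c
#include <assert.h>
#include <stdint.h>

#define SCTLR_V_BIT (1u << 13) /* 1 = high vectors, 0 = vectors at VBAR */

/* Value to write back to SCTLR so that exceptions use the VBAR address. */
uint32_t sctlr_use_vbar(uint32_t sctlr)
{
    return sctlr & ~SCTLR_V_BIT;
}

/* Non-zero if the address can be written to VBAR unchanged, i.e. the
 * reserved low five bits are all zero. */
int vbar_address_is_valid(uint32_t vector_table_addr)
{
    return (vector_table_addr & 0x1Fu) == 0u;
}
```

Whether _vector_table meets the alignment requirement depends on where the linker script places the .startup section, which is not shown here.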