Prevent nested calls - c

I have a function to disable interrupts, but the problem is that if I disable them and I call a function which also disables/enables them, they get re-enabled too early. Is the following logic enough to prevent this?
static volatile int IrqCounter = 0;
void EnableIRQ()
{
if(IrqCounter > 0)
{
IrqCounter--;
}
if(IrqCounter == 0)
{
__enable_irq();
}
}
void DisableIRQ()
{
if(IrqCounter == 0)
{
__disable_irq();
}
IrqCounter++;
}

The way every operating system I know of does it is to save IRQ state into a local variable, and then restore that.
Clearly, your code has TOCTOU issues: if two threads run at the same time and both check IrqCounter > 0 while IrqCounter == 1, each of them sees 1 and both decrement the counter.
I would definitely try to arrange something like this:
int irq_state = irq_save();
irq_disable();
/* ... do stuff with IRQs turned off ... */
irq_restore(irq_state);
Now, you don't have to worry about counters that can get out of sync, etc.
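A minimal host-side sketch of the save/restore pattern follows. The interrupt-enable flag is simulated with a plain variable, and irq_save, irq_disable and irq_restore are hypothetical helpers named after the snippet above, not a real OS or CMSIS API. On a Cortex-M part the equivalents would typically be built from __get_PRIMASK, __disable_irq and __set_PRIMASK.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated CPU interrupt-enable flag (assumption: stands in for hardware state). */
static bool irq_enabled = true;

/* Hypothetical helpers mirroring the names used in the answer. */
int  irq_save(void)         { return irq_enabled ? 1 : 0; }
void irq_disable(void)      { irq_enabled = false; }
void irq_restore(int state) { irq_enabled = (state != 0); }

/* Inner critical section: may be entered with interrupts on or off. */
void inner(void)
{
    int state = irq_save();
    irq_disable();
    assert(!irq_enabled);   /* work that needs interrupts off goes here */
    irq_restore(state);     /* puts back the caller's state, whatever it was */
}

/* Outer critical section that calls inner() while interrupts are off. */
void outer(void)
{
    int state = irq_save();
    irq_disable();
    inner();
    assert(!irq_enabled);   /* inner() did not re-enable interrupts early */
    irq_restore(state);
}
```

Because each function restores the state it found rather than unconditionally enabling, nesting works without any shared counter.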

Assuming that you've got a system where you can't change context when interrupts are disabled, then what you've got is fine, provided you keep careful track of when you call the enable().
In the usage you're describing in the comments below, you plan on using these sections from within an interrupt service routine. Your main use is blocking higher-priority interrupts from running for a certain portion of an ISR.
Be aware that you'll have to consider the stack depth of these nested ISRs: if you enable interrupts before you return from the interrupt, the rest of the ISR runs with interrupts enabled.
Regarding other answers: the lack of thread-safety of the enable() (due to the if(IrqCounter > 0)) doesn't matter, because any time you're in the enable(), context switches are already disabled due to interrupts being off. (Unless for some reason you have unmatched disable/enable pairs, and in that case you've got other issues.)
The only suggestion I'd have would be to add an ASSERT to the enable instead of the run-time check, as you should never be enabling interrupts that you didn't disable.
void EnableIRQ()
{
ASSERT(IrqCounter != 0); //should never be 0, or we'd have an unmatched enable/disable pair
IrqCounter--; //doesn't matter that this isn't thread safe, as the enable is always called with interrupts disabled.
if(IrqCounter == 0)
{
__enable_irq();
}
}
I prefer the technique you've listed over the save(); disable(); restore(); technique as I don't like having to keep track of a piece of the OS' data every time I work with the interrupts. But, you do have to be aware of when you (directly or indirectly) make a call to the enable() from an ISR.

That looks fine, except it's not thread-safe.
Another common option is to query the interrupt-enable/disable state and save it into a local variable, then disable interrupts, then do whatever you want to be done while interrupts are disabled, then restore the state from the local variable.

static volatile int IrqCounter = 0;
void EnableIRQ(void)
{
ASSERT(IrqCounter != 0); //should never be 0, or we'd have an unmatched enable/disable pair
if (IrqCounter > 0)
{
IrqCounter--;
}
if (IrqCounter == 0)
{
__enable_irq();
}
}
void DisableIRQ(void)
{
__disable_irq(); // Fix TOCTOU issues. In CMSIS there is no harm in extra disables, so always disable.
IrqCounter++;
}
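To see how the counting version behaves under nesting, the sketch below runs the same logic on a host. The CMSIS intrinsics are replaced by stand-ins (sim_disable_irq and sim_enable_irq are hypothetical names, as is the irq_on flag), and the standard assert takes the place of the ASSERT macro.

```c
#include <assert.h>
#include <stdbool.h>

static bool irq_on = true;                       /* simulated interrupt-enable state */
static void sim_disable_irq(void) { irq_on = false; }  /* stand-in for __disable_irq() */
static void sim_enable_irq(void)  { irq_on = true;  }  /* stand-in for __enable_irq() */

static volatile int IrqCounter = 0;

void DisableIRQ(void)
{
    sim_disable_irq();          /* always disable first, then count */
    IrqCounter++;
}

void EnableIRQ(void)
{
    assert(IrqCounter != 0);    /* an unmatched enable would be a bug */
    IrqCounter--;
    if (IrqCounter == 0) {
        sim_enable_irq();       /* only the outermost enable re-enables */
    }
}

/* A helper that protects its own work with a nested disable/enable pair. */
void helper(void)
{
    DisableIRQ();
    EnableIRQ();
}
```

Calling helper() between an outer DisableIRQ()/EnableIRQ() pair leaves the simulated interrupts off until the outer EnableIRQ() runs.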

Related

Memory ordering for a spin-lock "call once" implementation

Suppose I wanted to implement a mechanism for calling a piece of code exactly once (e.g. for initialization purposes), even when multiple threads hit the call site repeatedly. Basically, I'm trying to implement something like pthread_once, but with GCC atomics and spin-locking. I have a candidate implementation below, but I'd like to know if
a) it could be faster in the common case (i.e. already initialized), and,
b) is the selected memory ordering strong enough / too strong?
Architectures of interest are x86_64 (primarily) and aarch64.
The intended use API is something like this
void gets_called_many_times_from_many_threads(void)
{
static int my_once_flag = 0;
if (once_enter(&my_once_flag)) {
// do one-time initialization here
once_commit(&my_once_flag);
}
// do other things that assume the initialization has taken place
}
And here is the implementation:
int once_enter(int *b)
{
int zero = 0;
int got_lock = __atomic_compare_exchange_n(b, &zero, 1, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
if (got_lock) return 1;
while (2 != __atomic_load_n(b, __ATOMIC_ACQUIRE)) {
// on x86, insert a pause instruction here
}
return 0;
}
void once_commit(int *b)
{
(void) __atomic_store_n(b, 2, __ATOMIC_RELEASE);
}
I think that the RELAXED ordering on the compare exchange is okay, because we don't skip the atomic load in the while condition even if the compare-exchange gives us 2 (in the "zero" variable), so the ACQUIRE on that load synchronizes with the RELEASE in once_commit (I think), but maybe on a successful compare-exchange we need to use RELEASE? I'm unclear here.
Also, I just learned that lock cmpxchg is a full memory barrier on x86, and since we are hitting the __atomic_compare_exchange_n in the common case (initialization has already been done), that barrier is occurring on every function call. Is there an easy way to avoid this?
UPDATE
Based on the comments and accepted answer, I've come up with the following modified implementation. If anybody spots a bug please let me know, but I believe it's correct. Basically, the change amounts to implementing double-check locking. I also switched to using SEQ_CST because:
I mainly care that the common (already initialized) case is fast.
I observed that GCC doesn't emit a memory fence instruction on x86 for the first read (and it does do so on ARM even with ACQUIRE).
#ifdef __x86_64__
#define PAUSE() __asm __volatile("pause")
#else
#define PAUSE()
#endif
int once_enter(int *b)
{
if(2 == __atomic_load_n(b, __ATOMIC_SEQ_CST)) return 0;
int zero = 0;
int got_lock = __atomic_compare_exchange_n(b, &zero, 1, 0, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
if (got_lock) return 1;
while (2 != __atomic_load_n(b, __ATOMIC_SEQ_CST)) {
PAUSE();
}
return 0;
}
void once_commit(int *b)
{
(void) __atomic_store_n(b, 2, __ATOMIC_SEQ_CST);
}
a) What you need is double-checked locking.
Basically, instead of entering the lock every time, you do an acquiring-load to see if the initialisation has been done yet, and only invoke once_enter if it has not.
void gets_called_many_times_from_many_threads(void)
{
static int my_once_flag = 0;
if (__atomic_load_n(&my_once_flag, __ATOMIC_ACQUIRE) != 2) {
if (once_enter(&my_once_flag)) {
// do one-time initialization here
once_commit(&my_once_flag);
}
}
// do other things that assume the initialization has taken place
}
b) I believe this is enough: your initialisation happens before the releasing store of 2 to my_once_flag, and every other thread has to observe the value 2 with an acquiring load from the same variable.
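Putting both points together, one possible arrangement moves the fast-path check into once_enter itself. This is a sketch under assumptions, not a reviewed implementation: the state names ONCE_INIT, ONCE_BUSY and ONCE_DONE and the use_resource caller are illustrative, and acquire/release orderings are used here where the question's update uses SEQ_CST.

```c
#include <assert.h>

/* Flag states (illustrative names, not from the original post). */
enum { ONCE_INIT = 0, ONCE_BUSY = 1, ONCE_DONE = 2 };

int once_enter(int *b)
{
    /* Fast path: an acquire load only, no locked instruction. */
    if (__atomic_load_n(b, __ATOMIC_ACQUIRE) == ONCE_DONE) return 0;

    int expected = ONCE_INIT;
    if (__atomic_compare_exchange_n(b, &expected, ONCE_BUSY, 0,
                                    __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE))
        return 1;   /* this thread runs the initialization */

    /* Another thread is initializing: wait for its release store. */
    while (__atomic_load_n(b, __ATOMIC_ACQUIRE) != ONCE_DONE) {
        /* spin; a pause or yield hint could go here */
    }
    return 0;
}

void once_commit(int *b)
{
    /* Only the thread that won once_enter() may commit. */
    assert(__atomic_load_n(b, __ATOMIC_RELAXED) == ONCE_BUSY);
    __atomic_store_n(b, ONCE_DONE, __ATOMIC_RELEASE);
}

static int init_calls = 0;   /* counts how often the init body ran */

void use_resource(void)
{
    static int flag = ONCE_INIT;
    if (once_enter(&flag)) {
        init_calls++;        /* one-time initialization goes here */
        once_commit(&flag);
    }
}
```

With this layout the caller keeps the simple API from the question, and the already-initialized case costs a single acquire load.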

Protected Hardware Interrupt Handler Stuck? (DJGPP)

I'm trying to set up a hardware interrupt handler in protected mode, using djgpp-2 for compiling in dosbox-0.74. Here's the smallest code possible (timer interrupt), I guess:
#include <dpmi.h>
#include <go32.h>
#include <stdio.h>
unsigned int counter = 0;
void handler(void) {
++counter;
}
void endHandler(void) {}
int main(void) {
_go32_dpmi_seginfo oldInfo, newInfo;
_go32_dpmi_lock_data(&counter, sizeof(counter));
_go32_dpmi_lock_code(handler, endHandler - handler);
_go32_dpmi_get_protected_mode_interrupt_vector(8, &oldInfo);
newInfo.pm_offset = (int) handler;
newInfo.pm_selector = _go32_my_cs();
_go32_dpmi_allocate_iret_wrapper(&newInfo);
_go32_dpmi_set_protected_mode_interrupt_vector(8, &newInfo);
while (counter < 3) {
printf("%u\n", counter);
}
_go32_dpmi_set_protected_mode_interrupt_vector(8, &oldInfo);
_go32_dpmi_free_iret_wrapper(&newInfo);
return 0;
}
Note that I'm not chaining my handler but replacing it. The counter won't increase beyond 1 (therefore never stopping the main loop) making me guess that the handler doesn't return correctly or is called only once. Chaining on the other hand works fine (remove the wrapper-lines and replace set_protected_mode with chain_protected_mode).
Am I missing a line?
You need to chain the old interrupt handler, as in the example Jonathon Reinhart linked to in the documentation, because the old handler tells the interrupt controller to stop asserting the interrupt. This has the added benefit of keeping the BIOS clock ticking, so it doesn't lose a few seconds each time you run the program. Otherwise, when your interrupt handler returns, the CPU will immediately call the handler again and your program will get stuck in an infinite loop.
Also there's no guarantee that GCC will place endHandler after handler. I'd recommend just simply locking both the page handler starts on and the next page in case it straddles a page:
_go32_dpmi_lock_code((void *) handler, 4096);
Note that the cast is required here, as there's no automatic conversion from pointer-to-function types to pointer to void.

Variable value not updated by interrupt on STM32F4 Discovery

In the code below, I can see that the timer is working normally as the LED is always blinking. But the value of the count variable never changes inside the second while.
I don't know what could possibly go wrong?
// count variable used only in main and TIM2_IRQHandler.
uint8_t count=0;
int main(void)
{
count=0;
SystemInit();
GPIOInit();
NVIC_Configuration();
TIM_Configuration();
init_USART3(115200);
// All initialization is ok.
USART_puts(USART3, "\r\nConnection ok.\r\n");// Working normally
while (1)
{
if(asterixok==1)// No problem here: this check works and the process continues to the next step.
{
GPIO_SetBits(GPIOD , GPIO_Pin_12); // Led on (ok)
count=0;// Reset count; the timer is running, so count should change from here on
while(1)
{
//LED keeps blinking
//Timer interrupt is working normally: LED (pin 13) blinks.
//Here is the problem:
if(count>5) // Timer is working, but count never appears to change here (why?)
{
GPIO_SetBits(GPIOD , GPIO_Pin_14); // LED OFFFFFFFFFFFFFFFF
USART_puts(USART3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\r\n");
goto nextstate;
}
}
nextstate:
GPIO_SetBits(GPIOD , GPIO_Pin_15); // LED never turns on because execution is stuck in the while loop.
}
}
}
void USART3_IRQHandler(void)
{
if( USART_GetITStatus(USART3, USART_IT_RXNE) )
{
unsigned char t = USART3->DR;
if(t=='*')
{
asterixok=1;
}
}
}
void TIM2_IRQHandler(void)
{
if ( TIM_GetITStatus(TIM2 , TIM_IT_Update) != RESET )
{
TIM_ClearITPendingBit(TIM2 , TIM_FLAG_Update);
count++;
if(count>100)
count=0;
if( display )
{
GPIO_ResetBits(GPIOD , GPIO_Pin_13);
}
else
{
GPIO_SetBits(GPIOD , GPIO_Pin_13);
}
display = ~display;
}
}
I have tried with another Discovery board but the problem continues.
Please help. I'm going crazy!
You should declare count as volatile, like this:
volatile uint8_t count;
While compiling main the compiler was able to prove that count was not modified in the loop body, and so it probably cached its value in a register and maybe even optimized out the if statement. You could verify that by looking at a disassembly. The compiler does not know about interrupts as per the standard and so is permitted to perform such optimizations. Qualifying count as volatile will forbid the compiler from making these optimizations, forcing it to reload the variable from memory each time it is used.
In this simple case volatile will be enough but please be aware that it doesn't guarantee atomicity of operations, and it doesn't prevent the compiler and CPU from reordering instructions around accesses to the variable. It only forces the compiler to generate memory access instructions each time the variable is used. For atomicity you need locks, and to prevent reordering you need memory barriers.
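As a rough illustration of the fix, the sketch below keeps the counter volatile and polls it from the main-loop side. It runs on a host, so timer_tick is only a stand-in for TIM2_IRQHandler and is called directly, and timeout_elapsed is a hypothetical helper. The single-byte counter can be read in one access on this class of MCU, which is why volatile alone is enough here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shared between the (simulated) timer ISR and the main loop, hence volatile. */
static volatile uint8_t count = 0;

/* Stand-in for TIM2_IRQHandler: on real hardware this runs asynchronously. */
void timer_tick(void)
{
    count++;
    if (count > 100) {
        count = 0;
    }
    assert(count <= 100);   /* invariant kept by the wrap-around above */
}

/* Polled from the main loop; volatile forces a fresh read on every call. */
bool timeout_elapsed(void)
{
    return count > 5;
}
```

In the real firmware the inner while(1) loop would call something like timeout_elapsed() on each pass, and the compiler could no longer hoist the read of count out of the loop.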

Can a SysTick exception in Cortex-M4 preempt itself?

I have a handler for SysTick exception which counts ticks and calls other functions (f1, f2, f3) whose execution time can be longer than SysTick period. These functions set and clear their active status (global variables) so if a SysTick exception occurs it can detect an overload and return to interrupted function.
I have assigned a fixed priority to the SysTick exception (let's say 16). I want to make it possible for SysTick to generate an exception regardless of its prior active status, go to SysTickHandler, increase the tick counter and return to the interrupted function.
One solution which may be useful is to use BASEPRI. It can be set to a priority lower than SysTick so it would enable that exception. Unfortunately, using BASEPRI got me nowhere because nothing happened (I set it to max value). The BASEPRI value was 0 inside SysTickHandler before I changed it. Should that value be equal to the SysTick priority when the processor enters the handler function? Is the exception priority loaded automatically into BASEPRI?
I have also considered whether the NVIC has an issue with preempting an already active exception, but found nothing regarding that in the ARM documentation.
Also, a return from the handler when overload is detected could set the processor state to thread mode. Let's ignore that for now.
void SysTickHandler(void) {
ticks++;
//set_BASEPRI(max_value);
if (f1_act || f2_act || f3_act) return;
else {
f1();
f2();
f3();
}
}
A simpler example for this problem (without return) would be to increase tick counter when having an infinite loop inside handler.
void SysTickHandler(void) {
ticks++;
set_BASEPRI(max_value);
while(1);
}
If the interrupt becomes pending while its handler is already running, the handler will run to completion and immediately re-enter. Your tick will be aperiodic, and if the functions consistently take longer than one tick period, you may never leave the interrupt context.
It may be possible I suppose to increase the priority of the interrupt in the handler so that it will preempt itself, but even if that were to work, I would hesitate to recommend it.
It sounds like what you actually need is an RTOS.
Sorry to disappoint you, but it seems like an overall design problem to me...
Why won't you just set some flag in SysTick and read it somewhere else?
Like:
#include <stdbool.h>
volatile bool flag = false;
//Consider any form of atomicity here
//atomic_bool or LDREX/STREX instructions here. Bitbanding will also work
void sysTickHandler(void) {
ticks++;
if (f1_act || f2_act || f3_act) return;
else {
flag = true; //or increment some counter if you want to keep track of the amount of executions
}
}
And somewhere else:
int main() {
// some init code
//main loop
for(;;) {
foo(); //do something
bar(x); //do something else
if (flag) {
f1();
f2();
f3();
flag = false;
}
}
}
Or if we assume that every interrupt wakes the microcontroller and power-down mode is needed, then something like this might work:
if (flag) {
f1();
f2();
f3();
flag = false;
}
goToSleep(powerDownModeX); //whatever;
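The counter variant mentioned in the comment above could look roughly like this. It is a host-side sketch: sys_tick_handler is called directly instead of by hardware, do_work is a hypothetical placeholder for f1(); f2(); f3();, and C11 atomics are one assumed way to get the atomicity the comment asks for.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_uint ticks;      /* total ticks seen */
static atomic_uint pending;    /* ticks whose deferred work has not run yet */
static unsigned work_runs;     /* how many times the deferred work ran */

/* Stand-in for the SysTick handler: stays short and only records the tick. */
void sys_tick_handler(void)
{
    atomic_fetch_add(&ticks, 1u);
    atomic_fetch_add(&pending, 1u);
}

/* Hypothetical placeholder for f1(); f2(); f3(); */
static void do_work(void) { work_runs++; }

/* One pass of the main loop: run the deferred work once per pending tick. */
void main_loop_iteration(void)
{
    while (atomic_load(&pending) > 0u) {
        unsigned before = atomic_fetch_sub(&pending, 1u);
        assert(before > 0u);   /* only this loop decrements, so no underflow */
        do_work();
    }
}
```

If the work overruns a tick period, the tick count stays accurate and the backlog shows up in pending, which makes an overload easy to detect.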

Does this sound like a stack overflow?

I think I might be having a stack overflow problem or something similar in my embedded firmware code. I am a new programmer and have never dealt with a stack overflow, so I'm not sure if that is what's happening or not.
The firmware controls a device with a wheel that has magnets evenly spaced around it, and the board has a Hall effect sensor that senses when a magnet is over it. My firmware operates the stepper and also counts steps while monitoring the magnet sensor in order to detect if the wheel has stalled.
I am using a timer interrupt on my chip (8 bit, 8057 arch.) to set output ports to control the motor and for the stall detection. The stall detection code looks like this...
// Enter ISR
// Change the ports to the appropriate value for the next step
// ...
StallDetector++; // Increment the stall detector
if(PosSensor != LastPosMagState)
{
StallDetector = 0;
LastPosMagState = PosSensor;
}
else
{
if (PosSensor == ON)
{
if (StallDetector > (MagnetSize + 10))
{
HandleStallEvent();
}
}
else if (PosSensor == OFF)
{
if (StallDetector > (GapSize + 10))
{
HandleStallEvent();
}
}
}
This code is called every time the ISR is triggered. PosSensor is the magnet sensor. MagnetSize is the number of stepper steps that it takes to get through the magnet field. GapSize is the number of steps between two magnets. So I want to detect if the wheel gets stuck either with the sensor over a magnet or not over a magnet.
This works great for a long time but then after a while the first stall event will occur because 'StallDetector > (MagnetSize + 10)' but when I look at the value of StallDetector it is always around 220! This doesn't make sense because MagnetSize is always around 35. So the stall event should have been triggered at like 46 but somehow it got all the way up to 220? And I don't set the value of stall detector anywhere else in my code.
Do you have any advice on how I can track down the root of this problem?
The ISR looks like this
void Timer3_ISR(void) interrupt 14
{
OperateStepper(); // This is the function shown above
TMR3CN &= ~0x80; // Clear Timer3 interrupt flag
}
HandleStallEvent just sets a few variables back to their default values so that it can attempt another move...
#pragma save
#pragma nooverlay
void HandleStallEvent()
{
///*
PulseMotor = 0; //Stop the wheel from moving
SetMotorPower(0); //Set motor power low
MotorSpeed = LOW_SPEED;
SetSpeedHz();
ERROR_STATE = 2;
DEVICE_IS_HOMED = FALSE;
DEVICE_IS_HOMING = FALSE;
DEVICE_IS_MOVING = FALSE;
HOMING_STATE = 0;
MOVING_STATE = 0;
CURRENT_POSITION = 0;
StallDetector = 0;
return;
//*/
}
#pragma restore
Is PosSensor volatile? That is, do you update PosSensor somewhere, or is it directly reading a GPIO?
I assume GapSize is rather large (> 220?) It sounds to me like you might have a race condition.
// PosSensor == OFF, LastPosMagState == OFF
if(PosSensor != LastPosMagState)
{
StallDetector = 0;
LastPosMagState = PosSensor;
}
else
{
// Race Condition: PosSensor turns ON here
// while LastPosMagState still == OFF
if (PosSensor == ON)
{
if (StallDetector > (MagnetSize + 10))
{
HandleStallEvent();
}
}
else if (PosSensor == OFF)
{
if (StallDetector > (GapSize + 10))
{
HandleStallEvent();
}
}
}
You should cache the value of PosSensor once, right after doing StallDetector++, so that in the event PosSensor changes during your code, you don't start testing the new value.
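A sketch of that change is shown below, arranged so it can run on a host. The GapSize value, the stall_events counter and the reduced HandleStallEvent body are assumptions for illustration; only MagnetSize being around 35 comes from the question.

```c
#include <assert.h>

enum { OFF = 0, ON = 1 };

/* Assumed values; the question only says MagnetSize is around 35. */
static const unsigned MagnetSize = 35;
static const unsigned GapSize = 60;

static volatile int PosSensor = OFF;   /* stand-in for the hardware sensor input */
static int LastPosMagState = OFF;
static unsigned StallDetector = 0;
static unsigned stall_events = 0;      /* illustrative: counts detected stalls */

static void HandleStallEvent(void)
{
    stall_events++;
    StallDetector = 0;
}

/* Called once per step from the (simulated) timer ISR. */
void OperateStepper(void)
{
    StallDetector++;
    const int sensor = PosSensor;      /* read the sensor exactly once */
    assert(sensor == ON || sensor == OFF);

    if (sensor != LastPosMagState) {
        StallDetector = 0;
        LastPosMagState = sensor;
    } else {
        /* Every later test uses the same snapshot, so a mid-ISR change cannot
           pair the new sensor value with the old LastPosMagState. */
        const unsigned limit = (sensor == ON ? MagnetSize : GapSize) + 10;
        if (StallDetector > limit) {
            HandleStallEvent();
        }
    }
}
```

With the sensor held at ON, the first call registers the edge and the stall fires on the 46th call after it, which matches the threshold expected in the question.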
This is definitely not stack overflow. If you blew the stack (overflowed it) your application would simply crash. This sounds more like something we used to call memory stomping in my C++ days. You may not be accessing the memory location that the StallDetector value occupies via StallDetector variable alone. There may be another part of your code "stomping" this particular memory location erroneously.
Unfortunately, this kind of issue is very hard to track down. About the only thing you could do is systematically isolate (remove from execution) chunks of your code until you narrow down and find the bug.
Do you have nested ISRs on your system? It could be something along the lines of: your ISR starts and increments your count, then it is interrupted and does it again. Do this enough times and your interrupt stack can overflow. It could also explain such a high counter value.
Does HandleStallEvent() "look at" StallDetector within the ISR or does it trigger something on the main loop? If it's on the main loop, are you clearing the interrupt bit?
Or are you looking at StallDetector from a debugger outside the ISR? Then a retriggered interrupt would use the correct value each time, but execute too many times, and you would only see the final, inflated value.
On second thought, more likely you don't have to clear an interrupt-generating register, but rather the interrupt pin is remaining asserted by the sensor. You need to ignore the interrupt after it's first handled until the line deasserts, such as by having the original ISR disable itself and reinstalling it in a second ISR which handles the 1->0 transition.
You might then also need to add debouncing hardware or adjust it if you have it.
Check your parameter types. If you defined the parameters in a way different than the caller expects then calling your method could overwrite the space that variable is stored in. (For instance if you wrote the function expecting an int but it is pushing a long onto the stack.)
You could see what additional options your debugger supports. In Visual Studio, for example, it is possible to set a "data breakpoint", where you break when a memory location changes (or is set to a certain value, or above a threshold, ...).
If something like this is possible in your case, you could see where the data is changed and if there is someone else writing to the memory erroneously.
