While creating a FreeRTOS application project with STM32CubeMX, there are two ways to introduce a delay: osDelay and HAL_Delay.
What is the difference between them, and which one should be preferred?
osDelay Code:
/*********************** Generic Wait Functions *******************************/
/**
* @brief Wait for Timeout (Time Delay)
* @param millisec time delay value
* @retval status code that indicates the execution status of the function.
*/
osStatus osDelay (uint32_t millisec)
{
#if INCLUDE_vTaskDelay
  TickType_t ticks = millisec / portTICK_PERIOD_MS;

  vTaskDelay(ticks ? ticks : 1);          /* Minimum delay = 1 tick */

  return osOK;
#else
  (void) millisec;

  return osErrorResource;
#endif
}
HAL_Delay Code:
/**
* @brief This function provides accurate delay (in milliseconds) based
*        on variable incremented.
* @note  In the default implementation, SysTick timer is the source of time base.
*        It is used to generate interrupts at regular time intervals where uwTick
*        is incremented.
* @note  This function is declared as __weak to be overwritten in case of other
*        implementations in user file.
* @param Delay: specifies the delay time length, in milliseconds.
* @retval None
*/
__weak void HAL_Delay(__IO uint32_t Delay)
{
  uint32_t tickstart = 0;
  tickstart = HAL_GetTick();
  while((HAL_GetTick() - tickstart) < Delay)
  {
  }
}
HAL_Delay is NOT a FreeRTOS function, while osDelay is a wrapper around a FreeRTOS function. (according to @Clifford) They are entirely different functions, written by different developers for different purposes.
osDelay is part of the CMSIS-RTOS library and uses vTaskDelay() internally to introduce the delay; the difference is that the argument of osDelay is the delay time in milliseconds, while the argument of vTaskDelay() is the number of ticks to delay. (according to @Bence Kaulics) Using this function, the OS is notified about the delay and changes the task's state to Blocked for that time period.
HAL_Delay is part of the hardware abstraction layer for the processor. It uses polling to introduce the delay. (according to @Bence Kaulics) Using this function, the OS is not notified about the delay. Also, if you do not use an OS, HAL_Delay is the default (and only) blocking delay provided by the HAL library. (according to @Clifford) It is part of the HAL library and can be used without FreeRTOS, or before FreeRTOS is running.
To introduce a delay using FreeRTOS functions directly, you can use vTaskDelay() or vTaskDelayUntil() after the scheduler has started.
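For example, a minimal periodic task built directly on those FreeRTOS calls might look like this sketch (my illustration, not from the answers; it assumes a FreeRTOS version that provides pdMS_TO_TICKS(), and ToggleLED() is a hypothetical board-specific helper):

#include "FreeRTOS.h"
#include "task.h"

/* Sketch: toggle an LED every 100 ms without accumulating drift. */
void vBlinkTask( void *pvParameters )
{
    const TickType_t xPeriod = pdMS_TO_TICKS( 100 );
    TickType_t xLastWakeTime = xTaskGetTickCount();

    for( ;; )
    {
        ToggleLED();   /* hypothetical helper */
        /* Blocks this task; the scheduler runs other tasks meanwhile. */
        vTaskDelayUntil( &xLastWakeTime, xPeriod );
    }
}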
(according to @Clifford)
Always favour the FreeRTOS API functions if you want your application to be deterministic.
CubeMX is a collection of parts from multiple sources.
It does not look like HAL_Delay() is intended for use with an RTOS because it is a null-loop delay. If you call HAL_Delay() from an RTOS task, the task will continue to run until the delay has expired. Higher-priority tasks will be able to run, but lower-priority tasks will be starved of any processing time during the delay period. That is a waste of processing time and power, and can be detrimental to system responsiveness.
osDelay(), on the other hand, effects the delay through the RTOS. It tells the RTOS that the task has nothing to do until the delay period has expired, so the RTOS assigns no processing time to it during that period. That saves processing time, potentially saves power, and allows lower-priority tasks to get processing time during the delay period. See http://www.freertos.org/FAQWhat.html#WhyUseRTOS
Consider a task with the highest priority. If you use HAL_Delay to block it, there probably won't be a context switch, because the scheduler is not notified that the task is merely polling a tick counter in a while loop rather than doing any useful work. Tasks with lower priority won't run.
The other function uses the vTaskDelay function of the OS. I did not peek into its source code, but presumably it notifies the OS that the current task wants to be blocked for a certain time, so the task's state changes to Blocked and the scheduler can switch to a lower-priority task in the meantime.
HAL_Delay is used throughout the STM32 HAL library and is, in some cases, called from inside ISRs. Despite the name suggesting it is just part of the hardware abstraction layer, the time base behind HAL_Delay (HAL_GetTick) needs to have the highest NVIC priority, because it may be called inside an ISR and must not be blocked. Whether this is good or bad from an implementation point of view is debated on the web; however, this is the way ST does it, and you choose whether you want to use the STM32 HAL.
osDelay, in the CMSIS layer, is implemented with vTaskDelay, which uses the SysTick timer. FreeRTOS also uses SysTick for task context switching, and according to the FreeRTOS documentation, the NVIC priority of SysTick needs to be the lowest, so that it cannot cut into the middle of another ISR.
Which function is preferred depends on what you are doing: one tick source must have the highest priority and the other the lowest (per the ST and FreeRTOS recommendations). That is why, if you enable FreeRTOS in STM32CubeMX, it asks you to assign a hardware timer as the HAL tick in addition to SysTick.
The answer is quite simple: if your project is bare-metal (that is, without an OS), you should (or can) use HAL_Delay.
The __weak default implementation looks like the code below; you can provide your own definition if you want.
__weak void HAL_Delay(uint32_t Delay)
{
  uint32_t tickstart = HAL_GetTick();
  uint32_t wait = Delay;

  /* Add a period to guaranty minimum wait */
  if (wait < HAL_MAX_DELAY)
  {
    wait += (uint32_t)(uwTickFreq);
  }

  while((HAL_GetTick() - tickstart) < wait)
  {
  }
}
But if your project runs an OS (say, FreeRTOS or Keil RTX, or any other), then you should use osDelay. This is because, as @ARK4579 explained, with the definition above HAL_Delay is a blocking call: it just consumes cycles. With osDelay, the calling task goes into the Blocked state and returns to the Ready state once the ticks have elapsed, so no cycles are burned in between.
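Because HAL_Delay() is declared __weak, one common pattern is to override it so that HAL code blocks through the RTOS once the scheduler is running. The following is a sketch under the assumption of CMSIS-RTOS v1 on top of FreeRTOS (the device header name is a placeholder for your family), not something stated in the answers above:

#include "stm32f4xx_hal.h"   /* adjust to your device family */
#include "cmsis_os.h"

/* Overrides the __weak polling version shipped with the HAL. */
void HAL_Delay(uint32_t Delay)
{
    if (osKernelRunning())
    {
        /* Scheduler is up: block the calling task instead of spinning. */
        osDelay(Delay);
    }
    else
    {
        /* Before the scheduler starts, fall back to the polling loop. */
        uint32_t tickstart = HAL_GetTick();
        while ((HAL_GetTick() - tickstart) < Delay)
        {
        }
    }
}

Note that this override must never be reached from an ISR, since osDelay() cannot be called from interrupt context.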
Related
I'm working on an embedded project running on an ARM Cortex-M3 based microcontroller. Some code provided by our vendor uses a delay function that sets up a built-in hardware timer and then spins until the timer expires. Typically this is used to wait between 1 and a couple hundred microseconds. These delays exist almost always because the code is waiting on some register, chip, or bus to complete an action and needs to wait at least the given number of microseconds. The hardware timer also appears to cost at least 6 microseconds of setup overhead.
In a multithreaded environment this is a problem because there are N threads but only 1 hardware timer. I could disable interrupts while the timer is being used to prevent context switches and thus race conditions, but that seems a bit ugly. I am thinking of replacing the function that uses the hardware timer with one that uses the ARM CPU cycle counter (CCNT). Are there any pitfalls I am missing, or other alternatives? Obviously the cycle-counter approach requires tuning to the proper CPU frequency, which will never change for our system, but I suppose it could be detected programmatically at boot using the hardware timer.
Set up the timer once at startup and let the counter run continuously. When you want to start a delay, read the counter value and remember this start value. Then, in the delay loop, read the counter value again and loop until the counter value minus the start value is greater than or equal to the requested delay ticks. (If you do the subtraction correctly, rollovers wash out and you don't need special handling to check for them.)
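A sketch of that rollover-safe pattern (illustrative only; the counter address is hypothetical and board-specific):

#include <stdint.h>

/* Hypothetical memory-mapped free-running 32-bit hardware counter. */
#define TIMER_COUNT (*(volatile uint32_t *)0x40001010u)

/* Busy-wait for 'ticks' counter ticks. Because the subtraction is done
   in unsigned 32-bit arithmetic, a counter rollover between 'start' and
   the current read still yields the correct elapsed tick count. */
void delay_ticks(uint32_t ticks)
{
    uint32_t start = TIMER_COUNT;

    while ((uint32_t)(TIMER_COUNT - start) < ticks)
    {
        /* spin */
    }
}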
You could multiplex your timer so that you have a table of when each thread wants to fire off and a function pointer/vector for execution. When the timer interrupt occurs, fire off that thread's interrupt and then set the timer to the next one in the list, minus the elapsed time. This is what many *nix operating systems do in their kernel code, so there should be example code to draw from.
A bigger concern is the fact that you are spin-locking the thread while waiting for the timer. Besides the CPU usage, and depending on what OS you have (or whether you have an OS at all), you could easily introduce priority-inversion issues or even full-on lockups. It might be better to use threading primitives instead, so that the OS can actually sleep your threads and wake them when needed.
I work with embedded stuff, namely Microchip PIC32 CPUs, these days.
I'm familiar with several real-time kernels (AVIX, FreeRTOS, TNKernel), and all of them provide two versions of nearly every function: one for calling from a task, and a second one for calling from an ISR.
Of course this makes sense for functions that could switch context and/or sleep: obviously an ISR can't sleep, and a context switch has to be done in a different manner. But there are several functions that neither switch context nor sleep: say, a function may return the system tick count, or set up a software timer, etc.
Now I'm implementing my own kernel, TNeoKernel, which has well-formed code and is carefully tested, and I'm considering providing "universal" functions: ones that can be called from either task or ISR context. But since all three aforementioned kernels use separate functions, I'm afraid I'm going to do something wrong.
Say, in task and ISR context TNKernel uses different routines for disabling/restoring interrupts, but as far as I can see, the only possible difference is that the ISR functions may be "compiled out" as an optimization if the target platform doesn't support nested interrupts. If the target platform does support nested interrupts, then disabling/restoring interrupts looks absolutely the same in task and ISR context.
So, my question is: are there platforms on which disabling/restoring interrupts from ISR should be done differently than from non-ISR context?
If there are no such platforms, I'd prefer to go with "universal" functions. If you have any comments on this approach, they are highly appreciated.
UPD: I don't like having two sets of functions because they lead to notable code duplication and complication. Say, I need to provide a function that starts a software timer. Here is what it looks like:
enum TN_RCode _tn_timer_start(struct TN_Timer *timer, TN_Timeout timeout)
{
   /* ... real job is done here ... */
}
/*
 * Function to be called from task
 */
enum TN_RCode tn_timer_start(struct TN_Timer *timer, TN_Timeout timeout)
{
   TN_INTSAVE_DATA;        //-- define the variable to store interrupt status,
                           //   it is used by TN_INT_DIS_SAVE()
                           //   and TN_INT_RESTORE()
   enum TN_RCode rc = TN_RC_OK;

   //-- check that function is called from the right context
   if (!tn_is_task_context()){
      rc = TN_RC_WCONTEXT;
      goto out;
   }

   //-- disable interrupts
   TN_INT_DIS_SAVE();

   //-- perform real job, after all
   rc = _tn_timer_start(timer, timeout);

   //-- restore interrupts state
   TN_INT_RESTORE();

out:
   return rc;
}
/*
 * Function to be called from ISR
 */
enum TN_RCode tn_timer_istart(struct TN_Timer *timer, TN_Timeout timeout)
{
   TN_INTSAVE_DATA_INT;    //-- define the variable to store interrupt status,
                           //   it is used by TN_INT_IDIS_SAVE()
                           //   and TN_INT_IRESTORE()
   enum TN_RCode rc = TN_RC_OK;

   //-- check that function is called from the right context
   if (!tn_is_isr_context()){
      rc = TN_RC_WCONTEXT;
      goto out;
   }

   //-- disable interrupts
   TN_INT_IDIS_SAVE();

   //-- perform real job, after all
   rc = _tn_timer_start(timer, timeout);

   //-- restore interrupts state
   TN_INT_IRESTORE();

out:
   return rc;
}
So we need wrappers like the ones above for nearly every system function. This is an inconvenience for me as a kernel developer, as well as for kernel users.
The only difference between the two wrappers is the macros used: for tasks, these are TN_INTSAVE_DATA, TN_INT_DIS_SAVE() and TN_INT_RESTORE(); for interrupts, these are TN_INTSAVE_DATA_INT, TN_INT_IDIS_SAVE() and TN_INT_IRESTORE().
For platforms that support nested interrupts (ARM, PIC32), these macros are identical. For platforms that don't support nested interrupts, TN_INTSAVE_DATA_INT, TN_INT_IDIS_SAVE() and TN_INT_IRESTORE() expand to nothing. So it is a bit of a performance optimization, but the cost is too high in my opinion: it's harder to maintain, less convenient to use, and the code size increases.
It's all a matter of design and CPU capabilities. I'm not familiar with any of the PICs, but, for example, Freescale (Motorola) MCUs (among many others) can move the Condition Code Register (CCR) into the accumulator and back. This allows one to save the previous state of the interrupt enable/disable mask and restore it at the end, without bluntly enabling interrupts where they should stay disabled (inside ISRs).
To answer which platform(s) must do it differently inside and outside ISRs, however, would require familiarity with all of them, or at least with one that fails this test. If there is a CPU that does not allow saving and restoring the CCR (as mentioned above), one would have no option but to handle each case differently.
Kernel functions that normally cause scheduling to occur have simpler ISR versions because the scheduler runs on return from interrupt (there is usually an interrupt epilogue required to do that), not from the scheduling function itself.
It is simple enough to create a function that will work in any context, but it adds a small overhead. However the safety afforded by not calling an inappropriate function is probably worth it.
For example:
OSStatus semGive( OSSem sem )
{
    return isInterrupt() ? ISR_SemGive( sem ) : OS_SemGive( sem );
}
The implementation of isInterrupt() is platform dependent, and is discussed in "Safely detect, if function is called from an ISR?"
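On ARM Cortex-M parts, for instance, a common implementation (an assumption on my part, not taken from that discussion) reads the IPSR register, which is non-zero whenever an exception handler is executing:

#include <stdbool.h>
#include <stdint.h>

/* Cortex-M only: IPSR holds the active exception number, 0 in thread mode. */
static inline bool isInterrupt( void )
{
    uint32_t ipsr;
    __asm volatile ( "MRS %0, IPSR" : "=r" ( ipsr ) );
    return ipsr != 0u;
}

With CMSIS headers available, __get_IPSR() provides the same read.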
I have the following code from the big code base of an embedded application. I am trying to understand the code and have the following questions.
old_rate = sysAuxClkRateGet();
sysAuxClkRateSet(50);
sysAuxClkConnect ((FUNCPTR) scanDispatcher, 0);
/* Enable dispatcher */
sysAuxClkEnable ();
My questions are:
Is scanDispatcher called on every tick, or once after 50 ticks?
Does sysAuxClkRateSet(50) mean we have 50 ticks per second? Is my understanding right?
The auxiliary clock ISR will call scanDispatcher (with argument 0) every time it's invoked to handle the auxiliary clock interrupt.
sysAuxClkRateSet(50) defines the frequency of the auxiliary clock interrupt, in ticks per second. Since the auxiliary clock driver ISR doesn't perform any actions other than managing the timer device and calling the scanDispatcher routine, you can change the frequency.
There are two kinds of limits on the frequency values you can use:
The auxiliary clock driver (part of the BSP you're using) defines the absolute minimum and maximum values the driver is able to manage.
The real maximum limit is defined by the system load introduced by scanDispatcher and its execution time; remember, in any case, that scanDispatcher executes at interrupt time, so its execution time should always be very short.
A last caveat: the auxiliary clock isn't a mandatory device in VxWorks. Most BSPs support an auxiliary clock device, but in principle you could find a BSP that doesn't.
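To make the interrupt-time constraint concrete, a scanDispatcher compatible with that setup might look like this sketch (mine, not from the code base; the FUNCPTR cast in the question implies the int-argument signature):

/* Invoked from the auxiliary clock ISR on every tick (50 times per
   second after sysAuxClkRateSet(50)), with the argument 0 passed to
   sysAuxClkConnect(). Runs at interrupt time: keep it very short. */
void scanDispatcher (int arg)
{
    /* e.g. sample inputs and hand results to a task via a message
       queue or semaphore; defer all heavy work to task level */
}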
I want to write a task that polls some IOs. It must not hog the CPU, but it needs to check the IOs every microsecond or so.
I'm a relative VxWorks newbie and just realized that inserting usleep(1); into my polling loop probably won't do what I need it to do. How do I best go about this?
I have figured out that sysClkRateGet() returns 60, which isn't good enough for me. I need to poll and react fast, but without blocking everything else that's going on in the CPU, so I guess taskDelay() won't do it for me... Is there anything else that allows for a shorter sleep for my task (than 1/60 s)?
edit
I think I've figured out that it's much smarter to have a timer kick in every 1 µs and execute my short polling function.
I set up the timer like this:
timer_t polltimerID;
struct itimerspec poll_time;

poll_time.it_value.tv_sec = 0;
poll_time.it_value.tv_nsec = 1000;
poll_time.it_interval.tv_sec = 0;
poll_time.it_interval.tv_nsec = 1000;   // execute it every 1us

if (timer_create(CLOCK_REALTIME, NULL, &polltimerID))
    printf("problem in timer_create(): %s", strerror(errno));
if (timer_connect(polltimerID, MyPollFunction, 0))
    printf("problem in timer_connect(): %s", strerror(errno));
if (timer_settime(polltimerID, 0, &poll_time, NULL))
    printf("problem in timer_settime(): %s", strerror(errno));
But I'm not exactly sure yet what the priority of the timer is, and whether (and how) it is able to preempt the current task. Anyone?
The POSIX timer won't do what you want, as it's driven off the system clock (which, as you pointed out, runs at 60 Hz).
There is no "built-in" OS function that will give you a 100 kHz timer.
You will have to find some unused hardware timer on your board (the CPU reference manual is useful).
You will have to configure the timer registers for your 100 kHz rate (again, the reference manual is good).
You will have to hook the timer interrupt line up to your function: intConnect(vector, fn, arg); a sketch follows below.
The VxWorks Kernel Programmer's Guide has information about writing interrupt service routines.
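The shape of such a hookup might look like the sketch below. Everything timer-specific here is hypothetical (the interrupt number and register addresses must come from your BSP and CPU reference manual); only intConnect() itself is the real VxWorks call:

#include <vxWorks.h>
#include <intLib.h>
#include <iv.h>

/* Hypothetical, board-specific values -- replace with your BSP's. */
#define POLL_TIMER_INT_NUM   42
#define POLL_TIMER_CTRL      (*(volatile UINT32 *)0x40002000)
#define POLL_TIMER_ACK       (*(volatile UINT32 *)0x40002004)

/* Runs at interrupt level every 10 us (100 kHz) -- keep it very short. */
static void pollIsr (int arg)
{
    POLL_TIMER_ACK = 1;        /* clear the timer interrupt (hypothetical) */
    /* ... sample the IOs here ... */
}

STATUS pollTimerStart (void)
{
    if (intConnect (INUM_TO_IVEC (POLL_TIMER_INT_NUM),
                    (VOIDFUNCPTR) pollIsr, 0) != OK)
        return ERROR;

    POLL_TIMER_CTRL = 0x1;     /* enable the timer at 100 kHz (hypothetical) */
    return OK;
}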
I'm using the FreeRTOS port for the PIC32 microcontroller on the PIC32MX Starter Kit. I was just playing with tasks, but the tasks aren't context switching. Here are my main config settings:
#define configMAX_PRIORITIES ( ( unsigned portBASE_TYPE ) 5 )
#define configKERNEL_INTERRUPT_PRIORITY 0x01
#define configMAX_SYSCALL_INTERRUPT_PRIORITY 0x03
#define configTICK_RATE_HZ ( ( portTickType ) 100 )
Now I have two tasks defined which blink two LEDs. Both have a priority of 4 (the highest). Under normal operation the LEDs should blink alternately every 100 ticks. But this doesn't happen: the second LED blinks for 100 ticks and then control goes to the general exception handler. Why does this happen? It seems there is no scheduling at all.
FreeRTOS is a priority-based pre-emptive scheduler; tasks of equal priority that do not yield processor time will be round-robin scheduled. Relying on round-robin scheduling is seldom suitable for real-time tasks, and depending on the configured time slice, it may mess up your timing. Time-slicing may even be disabled.
Your tasks must enter the Blocked state waiting on some event (such as elapsed time) to allow each other to run as intended.
That said, entering the exception handler rather than simply one task starving another or not running with the intended timing is a different matter. For that you will need to post additional information, though your first approach should be to deploy your debugger.
The absolute first thing to check is your "tick" interrupt. Often interrupts are not enabled, timers aren't set up right, or clocks are not configured properly in the #pragmas that set up the PIC32... and all of those issues manifest themselves first as the lack of a tick.
This is the #1 cause of no task switching: the tick interrupt isn't firing. That's where the normal pre-emptive task switching happens.
Assuming you're using the "off the shelf" demo in MPLAB, set a breakpoint in the void vPortIncrementTick( void ) function (around line 177 in FreeRTOS\Source\portable\MPLAB\PIC32MX\port.c) and run your code. If the breakpoint is hit, your timer tick is working.
Are you sure both tasks were created successfully and that the scheduler has been launched?
Something like the following code would do the job:
xTaskCreate( yourFirstTask, "firstTask", STACK_SIZE, NULL, TASK_PRIORITY, NULL );
xTaskCreate( yourSecondTask, "secondTask", STACK_SIZE, NULL, TASK_PRIORITY, NULL );
vTaskStartScheduler();
You can also add an application tick hook to see whether the tick interrupt occurs correctly or whether there is a problem with the tick timer.
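A minimal tick-hook sketch (my illustration, using the standard FreeRTOS hook; enable it by setting configUSE_TICK_HOOK to 1 in FreeRTOSConfig.h):

#include "FreeRTOS.h"
#include "task.h"

/* Incremented from the tick interrupt -- watch it in the debugger. */
volatile unsigned long ulTickCount = 0;

void vApplicationTickHook( void )
{
    /* If this never increments, the tick interrupt is not firing. */
    ulTickCount++;
}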
There are standard demo tasks that just blink LEDs in the FreeRTOS/Demo/Common/Minimal/flash.c source file. The tasks created in that file are included in the standard PIC32 demo application (which targets the Microchip Explorer16 board).
In its very simplest form, a task that just toggles an LED every 500 ms would look like this:
/* Standard task prototype, the parameter is not used in this case. */
void vADummyTask( void *pvParameters )
{
    const portTickType xDelayTime = 500 / portTICK_RATE_MS;

    for( ;; )
    {
        ToggleLED();
        vTaskDelay( xDelayTime );
    }
}
Do you have a round-robin scheduler? Are your tasks sleeping for any length of time, or just yielding (or busy waiting)?
A very common gotcha in embedded OSes is that the scheduler will frequently not attempt to schedule multiple processes of the same priority fairly. That is, once A yields, if A is runnable then A may get scheduled again immediately, even if B has not had any CPU time for ages. This is very counterintuitive if you're used to desktop OSes, which go to a lot of effort to do fair scheduling (or at least, it was to me).
If you're running into this, you'll want to make sure that your tasks look like this:
for (;;)
{
    led(on);  sleep(delay);
    led(off); sleep(delay);
}
...to ensure that the task actually stops being runnable between blinks. It won't work if it looks like this:
for (;;)
{
    led(on);
    led(off);
}
(Also, as a general rule, you want to use normal priority rather than high priority unless you know you'll really need it --- if you starve a system task the system can behave oddly or crash.)