I am working on a system where I need to achieve (virtually) a real time behavior. I am using non-blocking bare-metal programming and a dsPIC33e microcontroller for this project. Tasks communicate with each other using queues.
I have low, medium and high priority tasks. The high priority task is, for example, an emergency shutdown triggered by a tactile switch. The low and medium priority tasks are for communication and for sensor reading and processing, respectively. All tasks have been checked under RMS (rate monotonic scheduling) and are working fine, utilizing 60% of processor time.
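For reference, here is a worked check of that 60% figure against the classic sufficient test for RMS (the Liu and Layland utilization bound). It assumes exactly three independent periodic tasks whose deadlines equal their periods, which is an assumption and not something measured here:

```latex
U = \sum_{i=1}^{n} \frac{C_i}{T_i} \le n\left(2^{1/n} - 1\right),
\qquad
n = 3:\quad 3\left(2^{1/3} - 1\right) \approx 0.780,
\qquad
U = 0.60 < 0.780 .
```

Here $C_i$ is the worst-case execution time and $T_i$ the period of task $i$. The bound is sufficient but not necessary, and any work moved into an ISR still has to be counted in the corresponding $C_i$.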
My question: I want to call the low and high priority tasks (linked lists of modules) inside a hardware timer ISR, because the dsPIC33E provides hardware-based context switching. But it's often said that an interrupt routine should be as small as possible, and that the ISR should only set flags that are then read in main. If I set flags and read them in main, I don't get preemptive behavior.
Can anybody advise whether it's still acceptable to call the linked lists inside the timer routines?
I am trying to find a better way to organize sub-tasks for embedded applications. I am more interested in Power Electronics applications. I am not a software engineer, but a Power Electronics Engineer. However, in most cases I need to develop the code.
In those applications, main stays in an infinite loop and the control algorithm runs in an ISR (Interrupt Service Routine). However, some applications need extra low-priority sub-tasks (e.g. communication, alarm handling). Those sub-tasks cannot run in the ISR because of the time constraint (the control algorithm has the higher priority). I would like to know the best ways to handle task scheduling for embedded applications.
One simple way, shown in the code snippet below, is to put all the sub-tasks inside the infinite loop (if they all have the same priority). The application runs the ISR periodically (each switching period, for example) and uses the remaining time to run the sub-tasks in a round-robin fashion. However, with this method the sub-tasks run at an unknown period. Consequently, I cannot add timer routines (increment and check) inside those tasks. Also, if the software gets trapped (due to some bad code) in a low-priority task, the other tasks will not be executed (or the watchdog timer will fire).
void main(void)
{
    Init();
    for (;;) /* The control algorithm runs separately, in an ISR */
    {
        SubTask1();
        SubTask2();
        SubTask3();
    }
}
It is possible to use other ISRs (driven by timer modules, for example) and use the interrupt priorities to run one specific task. However, this method demands a more careful study of the device in order to set all the interrupt priorities correctly.
Do you know a better method? Which task scheduling methods are the most efficient for embedded applications?
The question touches on some general principles of embedded software.
1) Limit what you do in ISRs to the bare minimum
2) Coordinate different activities by using an RTOS
3) Improve performance by designing the software as event driven
The way to efficiently implement the sub-tasks is to move them from a polled loop to being event driven. If a sub-task checks for an alarm condition periodically, use your RTOS to call that code from a timer. For communications, have that code do a blocking wait for an event, like the arrival of a message. Event driven code is much more efficient because it doesn't have to spin through all the polling looking for the events to handle.
The tools of an event driven design (threads, timers, blocking, etc.) are provided by an RTOS, so point 3) leads to point 2). An RTOS also solves your issues with the sub-tasks running at unknown times and for unknown durations, if there are remaining tasks that are not event driven.
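The advice above assumes an RTOS. Purely as an illustration of the "call that code from a timer" idea, here is a minimal bare-metal sketch in which the timer ISR only counts ticks and the main loop runs whichever sub-tasks are due. All names (`timer_tick_isr`, `dispatch_due_tasks`, `check_alarms`, `poll_comms`) are hypothetical, and an RTOS timer service would replace this in practice.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*task_fn)(void);

typedef struct {
    task_fn  run;       /* sub-task body; it must return quickly      */
    uint32_t period;    /* period in timer ticks                      */
    uint32_t next_due;  /* tick count at which the task is next due   */
} periodic_task;

/* Incremented by the (hypothetical) timer ISR; that is all the ISR does.
 * On a 16-bit MCU a 32-bit read is not atomic, so guard it there.       */
static volatile uint32_t g_ticks;

void timer_tick_isr(void) { g_ticks++; }

/* Example sub-tasks (placeholders that only count their invocations). */
unsigned alarm_runs, comm_runs;
void check_alarms(void) { alarm_runs++; }
void poll_comms(void)   { comm_runs++; }

/* Run every task whose due time has passed; returns how many ran.
 * The signed difference keeps the comparison valid across counter wrap. */
size_t dispatch_due_tasks(periodic_task *tasks, size_t count)
{
    size_t ran = 0;
    uint32_t now = g_ticks;

    for (size_t i = 0; i < count; i++) {
        if ((int32_t)(now - tasks[i].next_due) >= 0) {
            tasks[i].run();
            tasks[i].next_due += tasks[i].period;
            ran++;
        }
    }
    return ran;
}
```

The main loop would call `dispatch_due_tasks()` repeatedly. Each sub-task then runs at a known period, although a slow sub-task still delays the others, which is the part a preemptive RTOS addresses.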
Finally, there are a variety of reasons to limit how much you do in an ISR. It's harder to debug ISR code. It's harder to synchronize what the ISR does with the rest of the tasks. The alternative is to do the same thing in a high priority task that waits for an event from the ISR.
But the biggest reason is future flexibility. Running the control algorithm in the ISR makes it hard to add another high priority task. Or maybe there will be a new requirement for the control algorithm to report status or write to a disk. Moving the code out of the ISR gives you more options.
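One common way to hand work from an ISR to a task is a small single-producer, single-consumer queue. The sketch below assumes a single-core MCU where the ISR is the only writer and one task is the only reader; the names `sample_queue`, `queue_push_from_isr` and `queue_pop` are made up for this example. With an RTOS you would normally use its own queue or semaphore primitives instead.

```c
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_SIZE 8u   /* must be a power of two that divides 256 */

typedef struct {
    volatile uint8_t head;          /* written only by the ISR  */
    volatile uint8_t tail;          /* written only by the task */
    int16_t samples[QUEUE_SIZE];
} sample_queue;

/* Called from the ISR: store one sample, or report that the queue is full. */
bool queue_push_from_isr(sample_queue *q, int16_t sample)
{
    uint8_t used = (uint8_t)(q->head - q->tail);  /* free-running counters */

    if (used == QUEUE_SIZE) {
        return false;               /* full: the caller decides what to drop */
    }
    q->samples[q->head % QUEUE_SIZE] = sample;
    q->head++;                      /* publish only after the data is stored */
    return true;
}

/* Called from the task: fetch the oldest sample, if there is one. */
bool queue_pop(sample_queue *q, int16_t *out)
{
    if (q->head == q->tail) {
        return false;               /* empty */
    }
    *out = q->samples[q->tail % QUEUE_SIZE];
    q->tail++;
    return true;
}
```

The ISR stays short (one store and one increment), and the heavier processing happens in the task that calls `queue_pop()`. On multi-core or out-of-order processors this sketch would additionally need memory barriers.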
I'm learning FreeRTOS on a Cortex-M0. (Simultaneously, I'm learning the Cortex as well...) I've got plenty of experience with 8-bit MCUs.
I'm going through the newbie tutorials on FreeRTOS and I understand setting up basic tasks and the idle daemon.
I realize I don't really understand what FreeRTOS is doing to manage the underlying timing mechanics of the kernel. Which leads to one big question...
What is the ideal way to shut down an RTOS when you want to turn your device off? Not idle the device, but put your MCU into the deepest OFF there is (whatever you want to call it).
It seems trivial to idle between tasks, but what about shutting the MCU off and making sure it stays off, so that the RTOS kernel doesn't trigger an interrupt or something else that wakes the MCU back up?
This is deep sleep mode / power-down mode. For an 8-bit MCU, it is described in the ATmega128RFA1 datasheet on page 159 ff., at http://ww1.microchip.com/downloads/en/DeviceDoc/Atmel-8266-MCU_Wireless-ATmega128RFA1_Datasheet.pdf (together with the wake-up sources). In this mode all internal timers are disabled.
In FreeRTOS this is called Tickless Idle Mode, cf. https://www.freertos.org/low-power-tickless-rtos.html
Note: If eTaskConfirmSleepModeStatus() returns eNoTasksWaitingTimeout
when it is called from within portSUPPRESS_TICKS_AND_SLEEP() then the
microcontroller can remain in a deep sleep state indefinitely.
eTaskConfirmSleepModeStatus() will only return eNoTasksWaitingTimeout
when the following conditions are true:
Software timers are not being used, so the scheduler is not due to execute a timer callback function at any time in the future.
All the application tasks are either in the Suspended state, or in the Blocked state with an infinite timeout (a timeout value of
portMAX_DELAY), so the scheduler is not due to transition a task out
of the Blocked state at any fixed time in the future.
To avoid race conditions the RTOS scheduler is suspended before
portSUPPRESS_TICKS_AND_SLEEP() is called, and resumed when
portSUPPRESS_TICKS_AND_SLEEP() completes. This ensures application
tasks cannot execute between the microcontroller exiting its low power
state and portSUPPRESS_TICKS_AND_SLEEP() completing its execution.
Further, it is necessary for the portSUPPRESS_TICKS_AND_SLEEP()
function to create a small critical section between the tick source
being stopped and the microcontroller entering the sleep state.
eTaskConfirmSleepModeStatus() should be called from this critical
section.
All GCC, IAR and Keil ARM Cortex-M3 and ARM Cortex-M4 ports now
provide a default portSUPPRESS_TICKS_AND_SLEEP() implementation.
Important information on using the ARM Cortex-M implementation is
provided on the Low Power Features For ARM Cortex-M MCUs page.
So in FreeRTOS, invoking tickless idle mode is equivalent to deep sleep or power-down. You may have to manually disable internal timers on the Cortex ...
I had some problems powering down the ATmega128RFA1 MCU in Contiki OS ...
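As a rough sketch of how tickless idle is usually switched on, the fragment below shows the relevant `FreeRTOSConfig.h` settings. The values are illustrative, and whether your particular Cortex-M0 port ships a default `portSUPPRESS_TICKS_AND_SLEEP()` is an assumption you should check against the port you use.

```c
/* FreeRTOSConfig.h (fragment, illustrative values) */

/* 1 = use the port's built-in tickless idle implementation,
 * 2 = you supply your own portSUPPRESS_TICKS_AND_SLEEP() macro. */
#define configUSE_TICKLESS_IDLE                 1

/* Minimum number of idle ticks before the kernel bothers to sleep
 * (must not be less than 2). */
#define configEXPECTED_IDLE_TIME_BEFORE_SLEEP   2

/* No software timers, so no timer callback is ever due. This is one of
 * the conditions quoted above for sleeping indefinitely. */
#define configUSE_TIMERS                        0
```

For the indefinite sleep described in the quoted note, every application task also has to be suspended or blocked with `portMAX_DELAY`. Which hardware sleep level is actually entered depends on the port and on your own pre-sleep code.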
I know it has to do with time and efficiency, and how ISRs take time away from other processes, but I am unclear why this is. I am always told to keep ISRs very short. I am a bit confused why this is.
Normally, ISRs come into play when a hardware device needs to interact with the CPU. The device sends an interrupt signal that makes the CPU leave whatever it was doing to service the interrupt. That is what the ISR must take care of.
Now, this depends on many factors, the hardware environment and the nature of the interrupt being perhaps the most relevant ones, but in order to properly service an interrupt, ISRs usually run with interrupts disabled so that they cannot be interrupted. This means that the CPU cannot be shared among other processes while it is running ISR code, because the system timer interrupt that is used to run the scheduler (the part of the kernel that creates the illusion that the CPU can do several tasks at the same time) won't work.
So, if your ISR takes too much time to perform a certain operation with the device, your system will be affected as a whole, because the percentage of time the CPU is available for the rest of the processes will be lower than usual. This was very noticeable on old systems with PIO hard disks, which interrupt the CPU for every disk sector they want to transfer, and the ISR must do the actual transfer. If there is a lot of disk traffic, you may notice things like your mouse moving jerkily (because the interrupt that the mouse device sends to the CPU is not serviced in time).
OSes like Linux allow ISRs to defer time consuming operations with hardware devices to tasklets: sort of kernel threads that can share CPU time with other processes, yet keeping the atomic nature of hardware device operations (the OS ensures that there won't be more than one tasklet function -for the specific tasklet associated to the ISR- running in the system at the same time). The PIO transfer from disk to kernel buffers is an example of such operation.
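The split between a short ISR and deferred work can be sketched in plain C. This is only an analogue of the idea, not the Linux tasklet API. The names `top_half`, `run_bottom_halves`, `register_bottom_half` and `copy_disk_sector` are invented for the example, and the snapshot step assumes you would briefly mask interrupts around it on real hardware.

```c
#include <stdint.h>

#define MAX_SOURCES 8u

typedef void (*bottom_half_fn)(void);

static bottom_half_fn   g_handlers[MAX_SOURCES];
static volatile uint8_t g_pending;      /* one bit per interrupt source */

/* Example deferred handler (placeholder: only counts invocations). */
unsigned g_sectors_copied;
void copy_disk_sector(void) { g_sectors_copied++; }

void register_bottom_half(unsigned source, bottom_half_fn fn)
{
    if (source < MAX_SOURCES) {
        g_handlers[source] = fn;
    }
}

/* Top half: runs inside the ISR and only records that work is pending. */
void top_half(unsigned source)
{
    if (source < MAX_SOURCES) {
        g_pending |= (uint8_t)(1u << source);
    }
}

/* Bottom halves: run later, outside the ISR. Returns how many ran.
 * On real hardware the snapshot-and-clear needs interrupts masked. */
unsigned run_bottom_halves(void)
{
    unsigned ran = 0;
    uint8_t snapshot = g_pending;

    g_pending = 0;
    for (unsigned s = 0; s < MAX_SOURCES; s++) {
        if ((snapshot & (1u << s)) != 0 && g_handlers[s] != 0) {
            g_handlers[s]();
            ran++;
        }
    }
    return ran;
}
```

Because each source has a single pending bit, two interrupts that arrive before the deferred pass are coalesced into one handler run, and a given handler is never running twice at once.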
Some clarifications w.r.t. the accepted answer.
Interrupts are not necessarily disabled when running an ISR, and that is not necessarily the reason why the kernel processes all interrupts before returning to threads.
There is the concept of interrupt priorities. An interrupt of higher priority will preempt a running ISR: if the timer interrupt is of higher priority than the running ISR, it will run. However, a kernel will not handle context switches at this time, but rather defer them until all queued/pending ISRs have run.
Also, on some processors (e.g. ARM Cortex-M3), handling an interrupt is a mode of operation of the processor itself. The processor cannot go back to running threads until it gets out of interrupt mode. Once that happens, all interrupts are fully serviced: you cannot go back to running an ISR.
But the main reason why all ISRs must finish before going back to threads is that kernels do not have the concept of a thread-like running context for ISRs. An ISR thus cannot pend: it must run to completion. An ISR is thus hogging the CPU, except from higher-priority interrupts, until it finishes its purpose.
Usually, the main thread has lower priority than the ISRs. Depending on the scheduler, often the main code will be executed after all pending ISRs have been run.
Having a lot of computation-intensive code in one or many ISRs is generally not advisable, since it may cause delays or even CPU starvation of lower priority ISRs or threads, which may be detrimental if time-critical code needs to be executed.
However, when action needs to be taken immediately at an interrupt event, the fastest way is to execute code from the associated ISR (and possibly assign it a high priority).
If you plan on using several interrupt sources that execute time-consuming code, the way to go is by using an RTOS to allow safe and efficient interleaving of several threads to service each of the interrupts.
I would like to run a long term task on a dedicated core and would like that task to be minimally interrupted / preempted. I can see 2 solutions. Which one is better or any other solution?
1) Set affinity and isolate core using isolcpus
2) Make the thread real time using SCHED_FIFO and set the priority high
- if this is the better choice, how high should the priority be? Can I set it to 99?
What I am concerned about is being preempted by kernel threads, IPIs ...
Regarding the first solution you mentioned: adding the parameter isolcpus=[CPU no.] at boot instructs the Linux scheduler not to run any task on that CPU unless the user requests it through CPU affinity. The CPU may still receive interrupts, but that can be avoided by setting the IRQ affinity so that the isolated CPU doesn't receive any interrupt. Finally, in the code of your task you set the affinity to the isolated CPU and you are good to go.
But even if you follow these steps, kernel tasks are still executed on the isolated CPU core unless you are using a real-time kernel with the PREEMPT_RT patch; hence it might not be possible to completely isolate a CPU core without an RT kernel.
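The "set the affinity in your task" step could look roughly like the sketch below on Linux with glibc. The helper name `pin_calling_thread_to_cpu` is invented for this example, and the CPU number you pass is assumed to be the one you isolated at boot.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* needed for cpu_set_t and sched_setaffinity() in glibc */
#endif
#include <errno.h>
#include <sched.h>

/* Pin the calling thread to a single CPU.
 * Returns 0 on success, -1 on error with errno set. */
int pin_calling_thread_to_cpu(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE) {
        errno = EINVAL;
        return -1;
    }
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);

    /* A pid of 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

A long-running task would call this once at start-up, for example `pin_calling_thread_to_cpu(3)` if CPU 3 was listed in isolcpus, and check the return value, since the call fails if the CPU is not available to the process.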
Refer - http://elinux.org/CPU_Shielding_capability
The second solution, using the SCHED_FIFO scheduling policy with a high priority value, will still not prevent kernel threads, timer tick interrupts, IPIs, etc. from preempting your task, because the scheduling policies and priorities are for the kernel to schedule user-space processes and threads and do not apply to kernel threads or processes.
So setting a high priority for your task does not mean you will get 100% of the CPU dedicated to it. Also, the alternative of manually setting the CPU mask of your task to a CPUSET in the system can cause problems and suboptimal load balancer performance. Your task will still get interrupted from time to time by Linux code, including other tasks, such as the timer tick interrupt and the scheduler code, IPIs from other CPUs, and things like work queue kernel threads, although the interruption should be quite minimal if you don't have much activity going on in your other cores.
But the cleanest way to achieve this should come from a kernel tweak that I found at this link: http://www.linuxjournal.com/article/6799?page=0,2. Though I haven't tried it personally, I think the article is worth a look before you decide which method to use.
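For completeness, here is a minimal sketch of requesting SCHED_FIFO from C, with all the caveats above still applying. The helper names are hypothetical. On Linux the SCHED_FIFO range is 1 to 99, so 99 is accepted, although a value somewhat below the maximum is often suggested so that the task does not compete with real-time kernel threads.

```c
#include <errno.h>
#include <sched.h>

/* Clamp a requested priority into the range the system accepts for SCHED_FIFO. */
int clamp_fifo_priority(int requested)
{
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (requested < lo) {
        return lo;
    }
    if (requested > hi) {
        return hi;
    }
    return requested;
}

/* Switch the calling thread to SCHED_FIFO.
 * Returns 0 on success, or an errno value (typically EPERM when the
 * process lacks the privilege to use real-time policies). */
int make_calling_thread_fifo(int requested_priority)
{
    struct sched_param param = { 0 };

    param.sched_priority = clamp_fifo_priority(requested_priority);
    if (sched_setscheduler(0, SCHED_FIFO, &param) != 0) {
        return errno;
    }
    return 0;
}
```

This needs root or the CAP_SYS_NICE capability (or a suitable RLIMIT_RTPRIO), so the return value should always be checked.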
I am working on a customized/proprietary RTOS provided by my client.
The RTOS uses round robin scheduling with priority preemption.
Scenario is -
The Renesas H8S controller is running at 20 MHz
I have configured an interrupt for Ethernet (a LAN9221 chip raises it)
An OS task which reads the data from the LAN controller is running at the highest priority in the OS
Another OS task, TCP, is the second highest priority task in the system
An OS task which refreshes the watchdog
I have generated network traffic to simulate bombarding condition on the network.
The problem is that at high data rates on Ethernet (more than 500 packets/second), the watchdog, which is configured for 1 second, fires.
The watchdog is configured to be serviced by a lower priority OS task, in order to detect any problem in OS functionality.
I suspect that the frequency of the ISR and the higher priority tasks is not letting the watchdog task be scheduled. To confirm my suspicion, I serviced the watchdog in the ISR itself and found it working up to 2000 packets/second.
Could you please suggest how I can handle this situation so that the watchdog does not fire even at higher data/interrupt rates?
Watchdog is refreshed in OS task running at normal OS priority which helps in catching endless loop.
The task which is at highest OS priority is Ethernet packet reading task.
There is one hardware interrupt which is raised when Ethernet receives packet and in ISR we schedule waiting Ethernet packet reading task.
Also, in my system the OS is not driven by a timer interrupt (as other OSes are).
The OS is round robin and tasks relinquish control voluntarily. So raising the watchdog task priority above normal is not possible; otherwise the OS would always find it ready at a higher priority (the watchdog is refreshed in an infinite loop, without waiting for any event) and other tasks would not get time to execute.
Only tasks which are waiting on some event can have high priorities.
So the problem is that the watchdog task is not getting time to refresh because of frequent interrupts and continuous scheduling of high priority tasks (Ethernet packet reading).
Try giving your watchdog a higher priority.
This might seem wrong at first glance. A watchdog shouldn't get a high priority but that's only true for systems which aren't under heavy load. Under heavy load, the scheduling will push the watchdog back (it's low prio after all) which can cause spurious time outs.
Giving the watchdog a high priority should not have a big impact on performance (it's a small task, runs not very often, triggered by an interrupt) but makes sure it can't starve.
The disadvantage is that you can't catch endless loops anymore (since the loop can now be interrupted by the watchdog).
You should also consider badly designed hardware or a bad mapping of interrupts. Maybe you can give the watchdog IRQ a higher priority than the network card. That would allow the watchdog to process its interrupts in a timely fashion without you having to give the task a higher priority.
Or you can try to increment a counter when a network packet has been processed. A new, high priority watchdog thread could watch this counter and re-configure the low-prio watchdog task not to fire as long as the counter changes.
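The counter idea could be sketched as follows. Everything here is hypothetical naming (`note_packet_processed`, `wd_supervisor`), and the bounded "grace" count is my own addition, on the assumption that you do not want packet traffic to mask a hung low-priority task forever.

```c
#include <stdbool.h>
#include <stdint.h>

/* Incremented by the packet-reading task each time a packet is processed. */
static volatile uint32_t g_packets_processed;

void note_packet_processed(void) { g_packets_processed++; }

typedef struct {
    uint32_t last_seen;   /* counter value at the previous check            */
    uint32_t grace_left;  /* checks we may still cover for the starved task */
    uint32_t grace_max;   /* assumption: limit on consecutive cover-ups     */
} wd_supervisor;

void wd_supervisor_init(wd_supervisor *s, uint32_t grace_max)
{
    s->last_seen  = g_packets_processed;
    s->grace_left = grace_max;
    s->grace_max  = grace_max;
}

/* Call whenever the normal, low-priority watchdog task actually ran. */
void wd_supervisor_low_priority_ran(wd_supervisor *s)
{
    s->grace_left = s->grace_max;
}

/* Called periodically by the high-priority supervisor. Returns true if it
 * should refresh the hardware watchdog on behalf of the starved task. */
bool wd_supervisor_should_refresh(wd_supervisor *s)
{
    uint32_t now = g_packets_processed;
    bool progress = (now != s->last_seen);

    s->last_seen = now;
    if (!progress || s->grace_left == 0) {
        return false;   /* no traffic to blame, or busy for too long */
    }
    s->grace_left--;
    return true;
}
```

With this shape the watchdog is only held off while packets are demonstrably being processed, and only for a limited number of supervisor periods.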
In any form of real-time application you need, by definition, to be 100% aware of what is going on. You must know how much time each task consumes. Measure the time needed for each task with an oscilloscope by toggling a pin. Then calculate these times for the whole system. If the higher priority tasks take too much time, well, then obviously the dog will starve.
If this is too complex to measure because of acyclic or non-deterministic behavior, the program needs to be fixed. If the watchdog sits in a high priority task, you have pretty much disabled it for any task with lower prio. You might as well shut the watchdog off entirely then.
Trial & error patches, giving the watchdog higher prio, or increasing the CPU clock until the bug goes away is simply not a professional approach.
But then of course, the hardware might not be sufficient to service such a high data load as you expect. Then you may have no other option but to either use dirty patches or re-design the product from scratch with a suitable MCU.
It is probably not a matter of telling you how to do it; the architecture you described should work. What you need to do is discover why the watchdog is not serviced.
If your RTOS does not have instrumentation or tools for debugging and testing, you could add I/O toggling in the watchdog loop and watch it with a scope. All the periods where it stops toggling are where higher priority tasks or interrupts are running. If that happens for more than one second, the watchdog will trigger. You might then add similar instrumentation to your other tasks and ISRs to see what is taking the time.
Is it possible that you are deadlocking under high load, so that the system is in fact failing? That would be a situation where the watchdog firing is entirely valid. You don't want to stop it firing if it is in fact detecting a system failure; you want to fix the system failure.
If the task that handles network packets consumes so much time that it prevents the task responsible for refreshing the watchdog from getting CPU time, then the system is unable to handle high networking load. The watchdog problem is only a symptom of this "unable to handle high network load" problem.
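If no scope is at hand, a similar measurement can be approximated in software. The sketch below assumes a free-running millisecond counter is available to pass in as `now_ms` (for example a hardware timer); `gap_monitor` and `gap_monitor_mark` are made-up names.

```c
#include <stdint.h>

typedef struct {
    uint32_t last_ms;     /* time of the previous mark                 */
    uint32_t max_gap_ms;  /* longest interval seen between two marks   */
    int      started;     /* 0 until the first mark has been recorded  */
} gap_monitor;

/* Call once per pass of the loop being observed (e.g. the watchdog task).
 * Unsigned subtraction keeps the gap correct across counter wrap-around. */
void gap_monitor_mark(gap_monitor *m, uint32_t now_ms)
{
    if (m->started) {
        uint32_t gap = now_ms - m->last_ms;
        if (gap > m->max_gap_ms) {
            m->max_gap_ms = gap;
        }
    }
    m->last_ms = now_ms;
    m->started = 1;
}
```

Reading `max_gap_ms` after a load test shows how close the watchdog task came to its 1 second limit, and the same structure can be dropped into other tasks to see which one is holding the CPU.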
The solution is to use a faster CPU, slow down the network, reduce the overhead of handling packets, or some combination of these options, so that the system can handle high network load (and so that the task that refreshes the watchdog does get run). Note that "handling high network load" may include dropping packets, which is the normal/established approach for handling network congestion.
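One way to combine "reduce the work per pass" with "drop packets under congestion" is a per-pass budget. The sketch below only models the bookkeeping with counters. The names `rx_stats` and `rx_poll` are hypothetical, and the actual read from the LAN controller would go where the comment indicates.

```c
#include <stddef.h>

typedef struct {
    size_t pending;  /* packets currently waiting to be read */
    size_t handled;  /* packets processed so far             */
    size_t dropped;  /* packets discarded under overload     */
} rx_stats;

/* Process at most `budget` packets per pass and discard any backlog above
 * `backlog_limit`, then return so that lower priority tasks (such as the
 * watchdog refresh) get a chance to run. Returns packets handled this pass. */
size_t rx_poll(rx_stats *rx, size_t budget, size_t backlog_limit)
{
    if (rx->pending > backlog_limit) {
        rx->dropped += rx->pending - backlog_limit;
        rx->pending  = backlog_limit;
    }

    size_t n = (rx->pending < budget) ? rx->pending : budget;

    /* A real driver would read and process n packets from the controller here. */
    rx->pending -= n;
    rx->handled += n;
    return n;
}
```

In a cooperative scheduler like the one described in the question, the packet task would relinquish control after each `rx_poll()` call, which bounds how long it can keep other tasks from running.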