why there is many schedule() call in different place? - c

I am tracing Linux 0.11
https://mirrors.edge.kernel.org/pub/linux/kernel/Historic/old-versions/
I see there are many schedule() call in different place, not just the one inside do_timer().
Few questions here:
do_timer() (#sched.c) will be called every time the timer timeout? This timer is based on an x86 interrupt call?
Since there are many schedule() calls outside of do_timer(), can I say that is kind of preempting? or what's the purpose?

Any operation that blocks calls schedule() to yield control.

Some tasks' state has changed, it needs to be updated in schedule().
Some tasks' are working and still a lot of work, schedule() for balance.

Since there are many schedule() calls outside of do_timer(), can I say that is kind of preempting? or what's the purpose?
For a real OS; most task switches occur because a task blocks waiting for something (user input, network packet, disk IO, ..) or a task unblocks because something it was waiting for happened (and the unblocked task has higher priority and preempts the currently running lower priority task).
The whole "task switch caused by timer IRQ" thing is mostly just a fallback to guard against malicious CPU hogs (denial of service attacks); and for normal software under normal conditions you could disable it (delete the schedule() from the timer IRQ handler) and nobody would notice or care. Note: Some people will say it's also for "non-malicious" CPU bound tasks, but CPU bound tasks are relatively rare, and (ignoring the fact that the Linux scheduler has never been good for task priorities) for CPU bound tasks it's better to rely on an effective system of task priorities (e.g. give the CPU bound tasks a low priority so that almost everything will preempt them).
Also note that various courses on OS theory start with "so simple it never actually happens in practice" concepts, which is almost always a pure round-robin scheduler with tasks that never block (often with "Hey, we can accurately predict the future and know exactly how long each task will run for" nonsense), which is mostly fine as a first step (in a "learn to walk before you run" way) but sucks big salty dog balls if it's not followed by more realistic and more complex concepts (better scheduling algorithms, task priorities, multiple simultaneous scheduling algorithms/"scheduler policies", multi-CPU, interactive/latency sensitive tasks, ..) because it leaves the student/victim with little more than misinformation (e.g. the ever re-occurring "all tasks switches are caused by timer IRQ" misconception).
do_timer() (#sched.c) will be called every time the timer timeout? This timer is based on an x86 interrupt call?
I'm guessing that the timer was the raw PIT chip's IRQ (given that Linux version 0.11 was "absolute beginner developer with no intention of making it portable" historical memorabilia from before thousands of volunteers fixed half of the worst parts).
Also don't forget that the scheduler uses time for two different things - the "current task has used too much CPU time" thing that almost never matters, and figuring out when tasks that are blocked/sleeping (e.g. because they called sleep()) should unblock/wake up. The do_timer() might be for either of these things and might be for both (I don't know without looking at it).

Related

STM32 RTOS timer interrupt and threads

I am working on a project where I need to execute 2 pieces of code off TIM interrupts. One of them has a slightly higher priority than the other, and both will be running on 2 different timers (of course not at the same time interval). Due to both timers being proportional to another (one is 1KHz, one is 8Khz) both will trigger at the same time.
Since I am already using the RTOS middle-ware for another purposes (threads of a much lower priority than these too), I was thinking of creating one thread of each these routines.
However, looking at how cubeMX is generating code, I am even wondering if this is possible.
I can start/stop these timers from any thread, but there is only one HAL_TIM_PeriodElapsedCallback which you usually fill with if statements like so:
if (htim->Instance == TIM2)
Am I correct to assume, regardless of which thread the timers are started from, the TIM callback will always occur "outside" of the RTOS environment?
if so, what would be a better strategy to achieve something close to what I need?
Cheers
Interrupts will triger. But remember:
Its priority (not the RTOS priority as they are unrelated) must be lower the SVC interrupt if you want to use any ...fromISR RTOS functions
They will not happen at the same time (as you have only one core)
I am working on a project where I need to execute 2 pieces of code off
TIM interrupts. One of them has a slightly higher priority than the
other, and both will be running on 2 different timers...
What exactly do you mean by "one of them has a [..] higher priority" - the HW timer events will occur just when the timer underflows occur. I think you mean, the handler code servicing the timeout events.
... (of course not at the same time interval). Due to both timers being proportional to another (one is 1KHz, one is 8Khz) both will trigger at the same time.
In embedded realtime programming, you should never build on the assumption that IRQ events are not occurring at the same time: Your ISR handlers may be suppressed at the moment when a trigger event occurs. This way, even if two concurrent events trigger closely after each other, it may look for your software code as if they had triggered at the same time. The solution is what your question points at: Context priorities (of tasks (= "threads") and ISRs (= "Interrupt handlers")) let you avoid the question which event came earlier and control which event to treat first.
Since I am already using the RTOS middle-ware for another purposes (threads of a much lower priority than these too), I was thinking of creating one thread of each these routines.
You are free to deploy code to an RTOS task or to an ISR, but keep in mind that any ISR will have a higher priority than any task. Your TIM event will trigger an ISR (= interrupt context), but you can (and often should) use the ISR to send a notification (or event, or semaphore, or queue message) to a task in order to have the main part of the timer event processed at the lower priority of a task.
However, looking at how cubeMX is generating code, I am even wondering if this is possible.
CubeMX is not limiting you to use or not use tasks. The question is rather how far CubeMX will generate the code you need, and how much you have to add manually. Please note that you don't have to use the CubeMX feature to generate tasks through its configuration, but this can be done by your own C code, too.
I can start/stop these timers from any thread, but there is only one HAL_TIM_PeriodElapsedCallback which you usually fill with if statements like so:
if (htim->Instance == TIM2)
Am I correct to assume, regardless of which thread the timers are started from, the TIM callback will always occur "outside" of the RTOS environment?
Yes, you are. The question who started the timer is not relevant to the context type/selection triggered by the timer. In any case, the TIM will trigger its ISR (at the interrupt priority configured for that interrupt).
If you use the CubeHAL library, it will implement the root of that ISR, check which of the TIMs related to that ISR have elapsed, and invoke the code you printed. Here, you can insert your user code to the different TIM instances (like TIM2 in your case).
if so, what would be a better strategy to achieve something close to what I need?
Re-check your favourite textbook on RTOS and microcontrollers. Any SO answer cannot include all the theory to solve the problem properly.
Decide whether there will be any more urgent reaction on your system than treating the timeout events. If no, you may implement the timeout reaction in the ISR handler. If yes (or in cases of doubt), implement the ISR with a task notification that goes to a task where you do what the timeout event requires. This may be the task from where you started the timer, or another one.

Why not put task context in interrupt

Here is the story.
Its a safety critical project and needs to run a time critical functional routine in 20KHz. Now the design is to put functional routine in a 20KHz FIQ interrupt, meanwhile safety interrupt also in FIQ. Thats the only two FIQ in system. (Surely there are couples of IRQ enabled in the MCU)
I know that its not good to put task context in interrupt ISR, the proper way of doing this to set mark and run in OS task. But seems current design harm nobody.
The routine takes about 10us (main clock 300MHz), so basically it will not blocks IRQ/FIQ for unacceptable time. It even save time for extra context switch compare with using OS task to run the functional routine. To me, currently it feels like the design is against every principle written on text book in university but can not find a reason to say no to it.
How could I convince myself to move functional routine from ISR to OS? Should I?
Let's recollect your situation:
you are coding a safety critical system
the software architecture isn't specified otherwise you wouldn't ask the question at hand
the system requirements weren't processed correctly otherwise 2) wouldn't be in question
someone told you to "use minimum interrupt if possible in safety critical system"
you want to use the highest priority & non-interruptible code for "just some math work"
Sorry for being a bit harsh but I wouldn't want to use/be in your safety critical system.
For your actual problem:
you have to make sure two things
the code in the FIQ must be deterministic and WCET tested
the registers of the timer must be protected and supervised. Why? An unwanted/erroneous manipulation of the timers registers by a lower safety level code can congest the CPU so much that effectively nothing else but the interrupt is processed.
All this under the assumption that your safe state depends entirely on an external hardware watchdog.
PS: Which are the hazards for users of your system? Annoyance? Injury? Lethal? Are you in a SIL or ASIL context?
The reason to move complex code away from ISR is precisely to avoid lengthy processing in the ISR and thus timing jitter and delayed interrupt servicing resulting from it.
You are stating the your processing is not lengthy so do it in the ISR! Otherwise you are just adding bloat.
20Khz = 50us between interrupts, with 10us of processing time it gives you roughly 20% of CPU time just for this "task", and a jitter of 10us in any other routine that runs in your CPU, it will also sum 10us of processing time for each 40us that any other task will consum, if it is ok for your project, and you keep your total CPU processing time below 70% (which is the common maximum acceptable for critical systems), IMHO it should work without any issue.

How to Disable/Delay the watchDog Timer for a certain Task in an embedded system

I'm working on a project for automotive system where we use the MPC5748 MCU. The application uses an RTOS based on AUTOSAR OS, and this MPC target support two type of watchdogs; software and hardware (they have used soft WDT).
My mission is to fit an algorithm within this application, the development of the algorithm has been done, the problem is that in the task where the algorithm is running is a 1ms task and the algorithm needs much more time than the time dedicated to this function.
I'm a newbie to the embedded world.By the way, in the algorithm main function the program will reset itself and this seems to be a timeOut generated by the expiration of the watchdog.
My questions are:
Can I disable the watchdog timer for this specified function (which must not be disabled but just for testing purpose)? It is possible to use more timeOut for the watchdog on that specified function?
Must I develop another task with a big delay in other to run the algorithm? But the problem is that the algorithm need to be synchronised with the 1ms task since we are receiving CAN commands.
Can i add a sleep(<1ms) on the desired function in order to wait a little bit witout affecting other tasks
What are other options to try?
NB: This is a general problem on the watchdog timer and any useful informations will be much helpful for me. Sorry because I can't share the code.
Can I disable the watchdog timer for this specified function (which must not be disabled but just for testing purpose)? It is possible to use more timeOut for the watchdog on that specified function?
Let's forget that one - it is a really bad idea. If it is possible to defeat the watchdog, then it is possible to do it by error, and then the whole point of the watchdog is defeated. Apart from that its an XY question - a question about your proposed solution to a different problem - you should ask about the problem directly.
Must I develop another task with a big delay in other to run the algorithm? But the problem is that the algorithm need to be synchronised with the 1ms task since we are receiving CAN commands.
Yes you need another task, but you should not add a "big delay" and it is probably unnecessary and certainly a bad design. If the 1ms task needs the result of the algorithm then, the algorithm should run in a service task triggered by the 1ms task and run asynchronously to the 1ms task, the service task then makes the results available to the 1ms task when available (by shared memory or message passing perhaps). Alternatively if the result is not specifically needed by the 1ms task, the service task could take the necessary action independently of the 1ms task.
There are many options, but essentially it seems that your task partitioning is inappropriate; your CAN Rx task should be responsible for receiving CAN messages only, and any action required in response to CAN messages deferred to one or more other tasks, perhaps fed from a message queue.
What are other options to try ?
Software design should not be a matter of trial and error - get the design right, implement the design. However you might consider whether 1ms is appropriate; is it possible that the period can be extended to encompass the worst case execution time without causing a failure to meet deadlines in general? If the answer is "no" then the algorithm does not belong in this task.
I don't think so you can disable/delay the WATCHDOG timer and even if you could that's not a good option to go for.
The problem what think is that the task you are calling is of 1ms, which is very less to read CAN messages and then operate on the same. The minimum task time i think should be of 5ms and the optimal time should be of 10ms.
Can I disable the watchdog timer for this specified function (which must not be disabled but just for testing purpose)? It is possible to use more timeOut for the watchdog on that specified function?
You should never disable the watchdog anywhere in your code.
It might not even be possible, on the MPC5x families you typically set up the watchdog once, and then for safety reasons all watchdog registers turn to read-only registers.
Must I develop another task with a big delay in other to run the algorithm? But the problem is that the algorithm need to be synchronised with the 1ms task since we are receiving CAN commands.
Ideally you should only service the watchdog from one single location in the program. Your CAN peripheral will be FlexCAN, which has a lot of available "mailboxes" for CAN messages. In most cases, you shouldn't need to poll it, but a flag will be set when the desired message arrive.
So it isn't obvious to me why you would need a delay to wait for them. Simply do:
void the_task (void)
{
wdog_refresh();
... // do other things
if(can_message_available)
{
// do something with the message
}
... // do other things
}
rather than
// BAD:
while(!can_message_available)
; // do nothing
Even if you need to use the CAN as FIFO and poll it repeatedly, you would still use the same approach. You'd just have to ensure that the task runs often enough that there will never be an overflow in the FIFO buffer.

What could be delaying my select() call?

I have a small program running on Linux (on an embedded PC, dual-core Intel Atom 1.6GHz with Debian 6 running Linux 2.6.32-5) which communicates with external hardware via an FTDI USB-to-serial converter (using the ftdi_sio kernel module and a /dev/ttyUSB* device). Essentially, in my main loop I run
clock_gettime() using CLOCK_MONOTONIC
select() with a timeout of 8 ms
clock_gettime() as before
Output the time difference of the two clock_gettime() calls
To have some level of "soft" real-time guarantees, this thread runs as SCHED_FIFO with maximum priority (showing up as "RT" in top). It is the only thread in the system running at this priority, no other process has such priorities. My process has one other SCHED_FIFO thread with a lower priority, while everything else is at SCHED_OTHER. The two "real-time" threads are not CPU bound and do very little apart from waiting for I/O and passing on data.
The kernel I am using has no RT_PREEMPT patches (I might switch to that patch in the future). I know that if I want "proper" realtime, I need to switch to RT_PREEMPT or, better, Xenomai or the like. But nevertheless I would like to know what is behind the following timing anomalies on a "vanilla" kernel:
Roughly 0.03% of all select() calls are timed at over 10 ms (remember, the timeout was 8 ms).
The three worst cases (out of over 12 million calls) were 31.7 ms, 46.8 ms and 64.4 ms.
All of the above happened within 20 seconds of each other, and I think some cron job may have been interfering (although the system logs are low on information apart from the fact that cron.daily was being executed at the time).
So, my question is: What factors can be involved in such extreme cases? Is this just something that can happen inside the Linux kernel itself, i.e. would I have to switch to RT_PREEMPT, or even a non-USB interface and Xenomai, to get more reliable guarantees? Could /proc/sys/kernel/sched_rt_runtime_us be biting me? Are there any other factors I may have missed?
Another way to put this question is, what else can I do to reduce these latency anomalies without switching to a "harder" realtime environment?
Update: I have observed a new, "worse worst case" of about 118.4 ms (once over a total of around 25 million select() calls). Even when I am not using a kernel with any sort of realtime extension, I am somewhat worried by the fact that a deadline can apparently be missed by over a tenth of a second.
Without more information it is difficult to point to something specific, so I am just guessing here:
Interrupts and code that is triggered by interrupts take so much time in the kernel that your real time thread is significantly delayed. This depends on the frequency of interrupts, which interrupt handlers are involved, etc.
A thread with lower priority will not be interrupted inside the kernel until it yields the cpu or leaves the kernel.
As pointed out in this SO answer, CPU System Management Interrupts and Thermal Management can also cause significant time delays (up to 300ms were observed by the poster).
118ms seems quite a lot for a 1.6GHz CPU. But one driver that accidently locks the cpu for some time would be enough. If you can, try to disable some drivers or use different driver/hardware combinations.
sched_rt_period_us and sched_rt_period_us should not be a problem if they are set to reasonable values and your code behaves as you expect. Still, I would remove the limit for RT threads and see what happens.
What else can you do? Write a device driver! It's not that difficult and interrupt handlers get a higher priority than realtime threads. It may be easier to switch to a real time kernel but YMMV.

what's wrong in using interrupt handlers as event listeners

My system is simple enough that it runs without an OS, I simply use interrupt handlers like I would use event listener in a desktop program. In everything I read online, people try to spend as little time as they can in interrupt handlers, and give the control back to the tasks. But I don't have an OS or real task system, and I can't really find design information on OS-less targets.
I have basically one interrupt handler that reads a chunk of data from the USB and write the data to memory, and one interrupt handler that reads the data, sends the data on GPIO and schedule itself on an hardware timer again.
What's wrong with using the interrupts the way I do, and using the NVIC (I use a cortex-M3) to manage the work hierarchy ?
First of all, in the context of this question, let's refer to the OS as a scheduler.
Now, unlike threads, interrupt service routines are "above" the scheduling scheme.
In other words, the scheduler has no "control" over them.
An ISR enters execution as a result of a HW interrupt, which sets the PC to a different address in the code-section (more precisely, to the interrupt-vector, where you "do a few things" before calling the ISR).
Hence, essentially, the priority of any ISR is higher than the priority of the thread with the highest priority.
So one obvious reason to spend as little time as possible in an ISR, is the "side effect" that ISRs have on the scheduling scheme that you design for your system.
Since your system is purely interrupt-driven (i.e., no scheduler and no threads), this is not an issue.
However, if nested ISRs are not allowed, then interrupts must be disabled from the moment an interrupt occurs and until the corresponding ISR has completed. In that case, if any interrupt occurs while an ISR is in execution, then your program will effectively ignore it.
So the longer you spend inside an ISR, the higher the chances are that you'll "miss out" on an interrupt.
In many desktop programs, events are send to queue and there is some "event loop" that handle this queue. This event loop handles event by event so it is not possible to interrupt one event by other one. It also is good practise in event driven programming to have all event handlers as short as possible because they are not interruptable.
In bare metal programming, interrupts are similar to events but they are not send to queue.
execution of interrupt handlers is not sequential, they can be interrupted by interrupt with higher priority (numerically lower number in Cortex-M3)
there is no queue of same interrupts - e.g. you can't detect multiple GPIO interrupts while you are in that interrupt - this is the reason you should have all routines as short as possible.
It is possible to implement queues by yourself, feed these queues by interrupts and consume these queues in your super loop (consume while disabling all interrupts). By this approach, you can get sequential processing of interrupts. If you keep your handlers short, this is mostly not needed and you can do the work in handlers directly.
It is also good practise in OS based systems that they are using queues, semaphores and "interrupt handler tasks" to handle interrupts.
With bare metal it is perfectly fine to design for application bound or interrupt/event bound so long as you do your analysis. So if you know what events/interrupts are coming at what rate and you can insure that you will handle all of them in the desired/designed amount of time, you can certainly take your time in the event/interrupt handler rather than be quick and send a flag to the foreground task.
The common approach of course is to get in and out fast, saving just enough info to handle the thing in the foreground task. The foreground task has to spin its wheels of course looking for event flags, prioritizing, etc.
You could of course make it more complicated and when the interrupt/event comes, save state, and return to the forground handler in the forground mode rather than interrupt mode.
Now that is all general but specific to the cortex-m3 I dont think there are really modes like big brother ARMs. So long as you take a real-time approach and make sure your handlers are deterministic, and you do your system engineering and insure that no situation happens where the events/interrupts stack up such that the response is not deterministic, not too late or too long or loses stuff it is okay
What you have to ask yourself is whether all events can be services in time in all circumstances:
For example;
If your interrupt system were run-to-completion, will the servicing of one interrupt cause unacceptable delay in the servicing of another?
On the other hand, if the interrupt system is priority-based and preemptive, will the servicing of a high priority interrupt unacceptably delay a lower one?
In the latter case, you could use Rate Monotonic Analysis to assign priorities to assure the greatest responsiveness (the shortest execution-time handlers get the highest priority). In the first case your system may lack a degree of determinism, and performance will be variable under both event load, and code changes.
One approach is to divide the handler into real-time critical and non-critical sections, the time-critical code can be done in the handler, then a flag set to prompt the non-critical action to be performed in the "background" non-interrupt context in a "big-loop" system that simply polls event flags or shared data for work to complete. Often all that might be necessary in the interrupt handler is to copy some data to timestamp some event - making data available for background processing without holding up processing of new events.
For more sophisticated scheduling, there are a number of simple, low-cost or free RTOS schedulers that provide multi-tasking, synchronisation, IPC and timing services with very small footprints and can run on very low-end hardware. If you have a hardware timer and 10K of code space (sometimes less), you can deploy an RTOS.
I am taking your described problem first
As I interpret it your goal is to create a device which by receiving commands from the USB, outputs some GPIO, such as LEDs, relays etc. For this simple task, your approach seems to be fine (if the USB layer can work with it adequately).
A prioritizing problem exists though, in this case it may be that if you overload the USB side (with data from the other end of the cable), and the interrupt handling it is higher priority than that triggered by the timer, handling the GPIO, the GPIO side may miss ticks (like others explained, interrupts can't queue).
In your case this is about what could be considered.
Some general guidance
For the "spend as little time in the interrupt handler as possible" the rationale is just what others told: an OS may realize a queue, etc., however hardware interrupts offer no such concepts. If the event causing the interrupt happens, the CPU enters your handler. Then until you handle it's source (such as reading a receive holding register in the case of a UART), you lose any further occurrences of that event. After this point, until exiting the handler, you may receive whether the event happened, but not how many times (if the event happened again while the CPU was still processing the handler, the associated interrupt line goes active again, so after you return from the handler, the CPU immediately re-enters it provided nothing higher priority is waiting).
Above I described the general concept observable on 8 bit processors and the AVR 32bit (I have experience with these).
When designing such low-level systems (no OS, one "background" task, and some interrupts) it is fundamental to understand what goes on on each priority level (if you utilize such). In general, you would make the most real-time critical tasks the highest priority, taking the most care of serving those fast, while being more relaxed with the lower priority levels.
From an other aspect usually at design phase it can be planned how the system should react to missed interrupts, since where there are interrupts, missing one will eventually happen anyway. Critical data going across communication lines should have adequate checksums, an especially critical timer should be sourced from a count register, not from event counting, and the likes.
An other nasty part of interrupts is their asynchronous nature. If you fail to design the related locks properly, they will eventually corrupt something giving nightmares to that poor soul who will have to debug it. The "spend as little time in the interrupt handler as possible" statement also encourages you to keep the interrupt code reasonably short which means less code to consider for this problem as well. If you also worked with multitasking assisted by an RTOS you should know this part (there are some differences though: a higher priority interrupt handler's code does not need protection against a lower priority handler's).
If you can properly design your architecture regarding the necessary asynchronous tasks, getting around without an OS (from the no multitasking aspect) may even prove to be a nicer solution. It needs way more thinking to design it properly, however later there are much less locking related problems. I got through some mid-sized safety critical projects designed over a single background "task" with very few and little interrupts, and the experience and maintenance demands regarding those (especially the tracing of bugs) were quite satisfactory compared to some others in the company built over multitasking concepts.

Resources