How a context switch works in a RTOS, need clarity


I'm wondering whether I understand the concept of an RTOS, and more specifically the scheduling process, correctly.
So, I think I understand the process of a timer interrupt (I omitted the interrupt enable/disable commands for better readability here):
1. program runs...
2. A timer tick occurs that triggers a Timer Interrupt
3. The Timer ISR is called
The timer ISR looks like this:
3.1. Kernel saves context (registers etc.)
3.2. Kernel checks if there is a higher priority task
3.3. If so, the Kernel performs the context switch
3.4. Return from Interrupt
4. Program runs with another task executing
But what does the process look like when an interrupt comes from, let's say, an I/O pin?
1. program runs
2. an interrupt is triggered because data is available
3. a general ISR is called?
3.1. Kernel saves context
3.2. Kernel has to call the user-defined ISR, because the kernel doesn't know what to do now
3.2.1. User ISR runs and does whatever it should do (maybe change the priority of a task that should run now, because the data is now available)
3.2.2. Return from user ISR
3.3. Kernel checks if there is a higher priority task available
3.4. If so the Kernel performs a context switch
3.5. Return from Interrupt
4. program runs with the different task
In this case the kernel must implement a general ISR, so that all interrupts are mapped to this ISR. For example (as far as I know) the ATmega168p microcontroller has 26 interrupt vectors. So there should be a processor-specific file that maps all the interrupts to a general ISR. The kernel ISR determines what caused the interrupt and calls the specific user ISR (which handles the actual interrupt).
Did I misunderstand something?
Thank you for your help

There is a clear distinction between the OS tick interrupt and the OS scheduler; you have, however, conflated the two. When the OS tick ISR runs, the tick count is incremented. If that increment causes a timer or delay to expire, that is a scheduling event, and scheduling events cause the scheduler to run on exit from the interrupt context.
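As a minimal sketch of that separation (not taken from any particular RTOS), the tick ISR below only counts time and records that a delay expired. The names task_t, task_delay, tick_isr and sched_event_pending are all hypothetical, and a real kernel would also block the calling task in task_delay.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TASKS 4

/* Hypothetical task record: wake_tick is only meaningful while delayed. */
typedef struct {
    bool     delayed;
    uint32_t wake_tick;
} task_t;

static task_t   tasks[MAX_TASKS];
static uint32_t tick_count;
static bool     sched_event_pending;  /* examined on exit from the interrupt */

/* Task context: start a delay of 'ticks' tick periods (ticks must be > 0). */
void task_delay(size_t id, uint32_t ticks)
{
    assert(id < MAX_TASKS && ticks > 0);
    tasks[id].delayed   = true;
    tasks[id].wake_tick = tick_count + ticks;  /* wraps naturally */
}

/* Tick ISR body: counts time and records expiries. It never switches tasks. */
void tick_isr(void)
{
    tick_count++;
    for (size_t i = 0; i < MAX_TASKS; i++) {
        if (tasks[i].delayed && tasks[i].wake_tick == tick_count) {
            tasks[i].delayed    = false;  /* the task becomes ready */
            sched_event_pending = true;   /* a scheduling event occurred */
        }
    }
}

/* The interrupt exit path would consult this to decide whether to schedule. */
bool scheduling_event_pending(void)
{
    return sched_event_pending;
}
```

Most ticks leave sched_event_pending untouched, so most tick interrupts return straight to the interrupted task without the scheduler running at all.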
Different RTOS may have subtle differences, but in general in any ISR, if a scheduling event occurred, the scheduler runs immediately before exiting the interrupt context, setting up the threading context for whatever thread is due to run by the scheduling policy (normally highest priority ready thread).
Scheduling events include:
OS timer expiry
Task delay expiry
Timeslice expiry (for round-robin scheduling).
Semaphore give
Message queue post
Task event flag set
These last three can occur in any ISR (so long as they are "try semantics" non-blocking/zero timeout), the first three as a result of the tick ISR. So the scheduler will run on exit from the interrupt context when any interrupt has caused at least one scheduling event (there may have been nested or multiple simultaneous interrupts).
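The deferral with nested interrupts can be sketched with a nesting counter and a request flag. This is similar in spirit to the interrupt enter/exit bookkeeping described in the MicroC/OS-II book, but the function names below (isr_enter, isr_post_event, isr_exit) are hypothetical and the scheduler is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>

static int  isr_nesting;        /* depth of currently nested interrupts */
static bool resched_requested;  /* set by any scheduling event in an ISR */
static int  scheduler_runs;     /* counts scheduler invocations, for illustration */

static void run_scheduler(void)
{
    scheduler_runs++;           /* a real kernel would pick a task and switch here */
}

/* Called at the start of every ISR. */
void isr_enter(void)
{
    isr_nesting++;
}

/* Called from inside an ISR, e.g. by a non-blocking semaphore give. */
void isr_post_event(void)
{
    assert(isr_nesting > 0);
    resched_requested = true;
}

/* Called at the end of every ISR: only the outermost exit may schedule. */
void isr_exit(void)
{
    assert(isr_nesting > 0);
    if (--isr_nesting == 0 && resched_requested) {
        resched_requested = false;
        run_scheduler();
    }
}

int scheduler_run_count(void)
{
    return scheduler_runs;
}
```

However many events are posted while interrupts are nested, the scheduler runs once, when the outermost handler exits; an interrupt that posts nothing does not run it at all.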
Scheduling events may also occur in the task context, including on any potentially blocking action such as:
Semaphore give
Semaphore take
Message queue receive
Message queue post
Task event flag set
Task event flag wait
Task delay start
Timer wait
Explicit "yield"
The scheduler runs also when a thread triggers a scheduling event, so context switches do not only occur as the result of an interrupt.
To summarise, and with respect to your question specifically: the tick or any other interrupt does not directly cause the scheduler to run. Any interrupt can perform an action that makes the scheduler due to run. Unlike the thread context, where such an action causes the scheduler to run immediately, in the interrupt context the scheduler is deferred until all pending interrupts have been serviced, and it runs on exit from the interrupt context.
For details of a specific RTOS implementation of context switching see §§3.05, 3.06 and 3.10 of MicroC/OS-II: The Real Time Kernel (the kernel and the book were specifically developed to teach such principles, so it is a useful resource and the principles apply to other RTOS kernels). In particular Listings 3.18 to 3.20 and Figure 3.10 and the associated explanation.

Related

How does OS scheduler return?

I am developing a simple kernel for my upcoming OS. I have developed everything till the scheduler. I am wondering how the scheduler comes into its cycle.
For example,
The TIMER interrupt fires.
The handler calls the scheduler.
The scheduler jumps to the next process in the queue.
The interrupt must return (IRETD)
But if the scheduler has to jump to the next process, then when does the interrupt return? And if it does, wouldn't it go back to the last process?
I want this clarification: how does the timer interrupt return from the scheduler, and how does the scheduler communicate with the timer interrupt (if by a function call, then when does it return)?
Assume - Monolithic Kernel

When an interrupt occurs, the processor switches its context. It does so by updating a flag in the EFLAGS register and pushing some information on the stack (see the Intel manuals). If the interrupt occurs in user mode, then a stack switch also occurs according to the TSS of the current task.
The scheduling sequence is:
Came from user-process with interrupt state pushed on stack
Pick next process
IRETD on interrupt state of new process
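The key point is that the handler never "jumps" to the next process; it returns through a different saved interrupt state. A user-space sketch of that bookkeeping follows. The int_frame_t layout is a simplified stand-in for what the CPU really pushes, and process_t and timer_handler are hypothetical names; in a real kernel the returned frame would be what the final IRETD pops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the state pushed on interrupt entry. */
typedef struct {
    uint32_t eip;
    uint32_t esp;
    uint32_t eflags;
} int_frame_t;

typedef struct {
    int_frame_t saved;  /* frame captured when this process was last interrupted */
} process_t;

#define NPROC 2

static process_t procs[NPROC];
static size_t    current;

/* Timer handler: store the interrupted frame, pick the next process
   (plain round-robin here), and return the frame to resume from. */
const int_frame_t *timer_handler(const int_frame_t *interrupted)
{
    assert(interrupted != NULL);
    procs[current].saved = *interrupted;
    current = (current + 1) % NPROC;
    return &procs[current].saved;
}
```

The interrupt taken in process 0 is "returned from" only when process 0 is next selected, possibly many timer ticks later, which is why the handler does not go back to the last process.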

Is interrupt handler running like this, and for how long?

I have some confusion when looking at how interrupt handler(ISR) is run. In Wiki http://en.wikipedia.org/wiki/Context_switch, it describes interrupt handling with 2 steps:
1) context switching
When an interrupt occurs, the hardware automatically switches a part of the
context (at least enough to allow the handler to return to the interrupted code).
The handler may save additional context, depending on details of the particular
hardware and software designs.
2) running the handler
The kernel does not spawn or schedule a special process to handle interrupts,
but instead the handler executes in the (often partial) context established at
the beginning of interrupt handling. Once interrupt servicing is complete, the
context in effect before the interrupt occurred is restored so that the
interrupted process can resume execution in its proper state.
Let's say the interrupt handler is the upper half of a kernel-space device driver (I assume user-space device driver interrupts follow the same logic).
When an interrupt occurs:
1) The current kernel process is suspended. But what is the context situation here? Based on the Wiki's description, the kernel does not spawn a new process to run the ISR, and the context established at the beginning of interrupt handling sounds very much like another function call within the interrupted process. So is the interrupt handler using the interrupted process's stack (context) to run? Or would the kernel allocate some other memory space/resource to run it?
2) Since the ISR here is not a 'process' that can be put to sleep by the scheduler, does it have to be finished no matter what? Is it not even limited by any time-slice bound? What if the ISR hangs; how does the system deal with it?
Sorry if the question is fundamental. I have not delved into the subject long enough.
Thanks,

so is interrupt handler using the interrupted process's stack(context) to run? Or kernel would allocate some other memory space/resource to run it?
It depends on the CPU and on the kernel. Some CPUs execute ISRs using the current stack. Others automatically switch to a special ISR stack or to a kernel stack. The kernel may switch the stack as well, if needed.
since here ISR is not a 'process' type that can be put to sleep by scheduler. It has to be finished no matter what?
Yep, or you risk hanging your computer. You see, interrupts interrupt processes and threads. In fact, most CPUs have no concept of a thread or a process, and to them it doesn't matter what gets interrupted/preempted (it can even be another ISR!); it's just not going to execute again until the ISR finishes.
Not even limited by any time-slice bound? What if ISR hang, how does the system deal with it?
It hangs, especially if it's a single-CPU system. It may report an error and then hang/reboot. In fact, in Windows (since Vista?) hung or too slowly executing deferred procedures (DPCs), which aren't ISRs but are somewhat like them (they execute between ISRs and threads in terms of priority/preemption) can cause a "bugcheck". The OS monitors execution of DPCs and it can do that concurrently on multiple CPUs.
Anyway, it's not a normal situation and typically there's no way out of it other than a system reset. Look up watchdog timers. They help to discover such bad hangs and perform a reset. Many electronic devices have them.
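To illustrate the watchdog idea, here is a small software model. It assumes a hypothetical timeout of three watchdog clock periods and invented names (wdt_kick, wdt_clock); on real hardware the counter is an independent peripheral and reaching zero resets the chip rather than setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WDT_TIMEOUT_TICKS 3u

static uint32_t wdt_counter = WDT_TIMEOUT_TICKS;
static bool     reset_asserted;

/* Called by healthy code to show it is still making progress. */
void wdt_kick(void)
{
    wdt_counter = WDT_TIMEOUT_TICKS;
}

/* Models the independent watchdog counter: one call per watchdog clock period. */
void wdt_clock(void)
{
    if (reset_asserted) {
        return;                 /* the reset has already been triggered */
    }
    assert(wdt_counter > 0);
    if (--wdt_counter == 0) {
        reset_asserted = true;  /* real hardware would reset the system here */
    }
}

bool wdt_reset_asserted(void)
{
    return reset_asserted;
}
```

If an ISR hangs, the code that normally calls wdt_kick never runs again, the counter reaches zero, and the reset fires without any cooperation from the stuck software.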

Think of an interrupt handler as a function running in its own high-priority thread. When a device raises an interrupt, any activity with lower priority is suspended and the ISR is executed. This is like a thread context switch.
When an ISR hangs (for example, in an endless loop), the whole computer hangs, assuming that we are talking about an ISR in a PC driver. No activity with a priority lower than the ISR's is allowed to run, so the computer looks dead. However, it still reacts to hardware remote debugger commands, if a debugger is attached.

Can del_timer return while its handler is running?

I'm looking at some Linux kernel module code that starts and stops timers using add_timer and del_timer.
Sometimes, the implementation goes on to delete the timer "object" (the struct timer_list) right after calling del_timer.
I'd like to find out whether this is safe. Note that this is a uniprocessor implementation, with SMP disabled (which would mandate the use of del_timer_sync instead).
The del_timer_sync implementation checks if the timer is being handled anywhere right now, but del_timer does not. On a UP system, is it possible to have the timer being handled without del_timer knowing, i.e. the timer has been removed from the pending timers list and is being handled?

UP makes things quite a bit simpler, but I think the answer is still "it depends."
If you are doing del_timer in process context, then on UP I think you are safe in assuming the timer is not running anywhere after that returns: the timers are removed from the pending lists and run from the timer interrupt, and if that interrupt starts, it will run to completion before allowing the process context code to continue.
However, if you are in interrupt context, then your interrupt might have interrupted the timer interrupt, and so the timer might be in the middle of being run.

Gracefully (i.e eventually cooperatively) suspend thread execution

I have to develop an application that tries to emulate the executing flow of an embedded target. This target has 2 levels of priority : the highest one being preemptive on the lowest one. The low priority level is managed with a round-robin scheduler which gives 1ms of execution to each thread in turn.
My goal is to write a library that provide the thread_create, thread_start, and all the system calls that are available on my target and use POSIX functions to reproduce the behavior natively on a standard PC.
Thus, when a high priority thread executes, low priority threads should be suspended whatever they are doing at that very moment. It is the responsibility of the low priority thread's implementation to ensure that it won't be perturbed.
I know it is usually unsafe to suspend a thread, which explains why I didn't find any "suspend(pid)" function.
I basically imagine two solutions to the problem :
-find a way to suspend the low priority threads when a high priority thread starts (and resume them when there is no more high priority activity)
-periodically call a very small "suspend_if_necessary" function everywhere in my low-priority code, and whenever a high priority thread must start, wait for all low-priority threads to call that function and be suspended, execute the single high priority thread, then resume them all.
Even if it is not so clean, I quite like the second solution, but still have one problem: how to call the function everywhere without changing all my code?
I wonder if there is an easy way of doing that, somewhat like debugging code does: add a hook call at every line executed that checks for a flag and runs some specific code when that flag changes?
I'd be very happy if there is an easy solution to that problem, since I really need to be representative with the behavior of the target execution flow...
Thanks in advance,
Goulou.

Unfortunately, it's not really possible to implement what you want with true threads: even if the high prio thread is restarted, it can take arbitrarily long before the high prio thread is scheduled back in and goes to suspend all the low priority threads. Moreover, there is no reliable way to determine whether the high priority thread is blocked or not using only POSIX threads; you could try tracking things manually, but this runs the risk of both false positives (the thread's blocked on something, but the low prio threads think it's running and suspend themselves) and false negatives (you miss a resumed annotation, or there's lag between when the thread's actually resumed and when it marks itself as running).
If you want to implement a thread priority system with pure POSIX, one option is to not use threads, but rather use setcontext for cooperative multitasking. This would allow you to swap between threads at a user level. However you must explicitly yield the CPU in this case. It also doesn't help with blocking syscalls, which would then block all threads in your app; but since you're writing an emulator this might not be an issue.
You may also be able to swap threads using setcontext within a signal handler; I've not tested this case myself, but it could be worth a try scheduling using setcontext in a SIGALRM handler.
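A minimal sketch of the cooperative variant, assuming a platform that still ships the ucontext functions (they were dropped from POSIX.1-2008 but remain available in glibc). The task bodies, the 64 KiB stacks and the trace buffer are illustrative choices, not part of the original suggestion.

```c
#include <assert.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

static ucontext_t main_ctx, task_ctx[2];
static char       stacks[2][STACK_SIZE];
static char       trace[16];   /* records the order in which code ran */
static int        trace_len;

static void record(char c)
{
    assert(trace_len < (int)sizeof trace - 1);
    trace[trace_len++] = c;
    trace[trace_len]   = '\0';
}

static void task_a(void)
{
    record('A');
    swapcontext(&task_ctx[0], &task_ctx[1]);  /* explicit yield to task B */
    record('a');
    swapcontext(&task_ctx[0], &task_ctx[1]);  /* yield again; never resumed */
}

static void task_b(void)
{
    record('B');
    swapcontext(&task_ctx[1], &task_ctx[0]);  /* explicit yield to task A */
    record('b');
}   /* returning from the task resumes uc_link, i.e. main_ctx */

/* Runs both tasks on the calling thread and returns the execution trace. */
const char *run_tasks(void)
{
    for (int i = 0; i < 2; i++) {
        getcontext(&task_ctx[i]);
        task_ctx[i].uc_stack.ss_sp   = stacks[i];
        task_ctx[i].uc_stack.ss_size = STACK_SIZE;
        task_ctx[i].uc_link          = &main_ctx;
    }
    makecontext(&task_ctx[0], task_a, 0);
    makecontext(&task_ctx[1], task_b, 0);
    swapcontext(&main_ctx, &task_ctx[0]);     /* start with task A */
    return trace;
}
```

Calling run_tasks() yields the trace "ABab": each task runs only until it yields, and nothing can preempt it in between, which is exactly the limitation mentioned above.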

To suspend a thread, you sleep it. If you want to be able to wake it on command, sleep it using sigwait, which puts the thread to sleep until it gets a signal. You can send a specific thread a signal with pthread_kill (crazy name, but it actually just sends signals to a thread). This is a very fast way to sleep and wake up threads. 40x Faster than condition variables and very easy.

Force Win32 thread scheduling to a defined sequence based on priority

I am an embedded programmer attempting to simulate a real time preemptive scheduler in a Win32 environment using Visual Studio 2010 and MingW (as two separate build environments). I am very green on the Win32 scheduling environment and have hit a brick wall with what I am trying to do. I am not trying to achieve real time behaviour - just to get the simulated tasks to run in the same order and sequence as they would on the real target hardware.
The real time scheduler being simulated has a simple objective - always execute the highest priority task (thread) that is able to run. As soon a task becomes able to run - it must preempt the currently running task if it has a priority higher than the currently running task. A task can become able to run due to an external event it was waiting for, or a time out/block time/sleep time expiring - with a tick interrupt generating the time base.
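The selection rule itself is small. The sketch below assumes one task per priority level, at most 32 levels, and that a larger number means a higher priority; all of these, along with the function names, are illustrative assumptions rather than details of the scheduler being simulated.

```c
#include <assert.h>
#include <stdint.h>

#define NO_TASK (-1)

static uint32_t ready_mask;  /* bit n set = the task with priority n can run */

void task_set_ready(unsigned prio)
{
    assert(prio < 32);
    ready_mask |= UINT32_C(1) << prio;
}

void task_set_blocked(unsigned prio)
{
    assert(prio < 32);
    ready_mask &= ~(UINT32_C(1) << prio);
}

/* Returns the highest priority that is ready, or NO_TASK if none is. */
int pick_next_task(void)
{
    for (int prio = 31; prio >= 0; prio--) {
        if (ready_mask & (UINT32_C(1) << prio)) {
            return prio;
        }
    }
    return NO_TASK;
}
```

Preemption then amounts to calling pick_next_task() after every event that changes ready_mask and switching whenever the result differs from the running task.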
In addition to this preemptive behaviour, a task can yield or volunteer to give up its time slice because it is executing a sleep or wait type function.
I am simulating this by creating a low priority Win32 thread for each task that is created by the real time scheduler being simulated (the thread effectively does the context switching the scheduler would do on a real embedded target), a medium priority Win32 thread as a pseudo interrupt handler (handles simulated tick interrupts and yield requests that are signalled to it using a Win32 event object), and a higher priority Win32 thread to simulate the peripheral that generates the tick interrupts.
When the pseudo interrupt handler establishes that a task switch should occur it suspends the currently executing thread using SuspendThread() and resumes the thread that executes the newly selected task using ResumeThread(). Of the many tasks and their associated Win32 threads that may be created, only one thread that manages the task will ever be out of the suspended state at any one time.
It is important that a suspended thread suspends immediately that SuspendThread() is called, and that the pseudo interrupt handling thread executes as soon as the event telling it that an interrupt is pending is signalled - but this is not the behaviour I am seeing.
As an example problem that I already have a work around for: When a task/thread yields, the yield event is latched in a variable and the interrupt handling thread is signalled, as there is a pseudo interrupt (the yield) that needs processing. Now in a real time system as I am used to programming, I would expect the interrupt handling thread to execute immediately that it is signalled because it has a higher priority than the thread that signals it. What I am seeing in the Win32 environment is that the thread that signals the higher priority thread continues for some time before being suspended, either because it takes some time before the signalled higher priority thread starts to execute or because it takes some time for the suspended task to actually stop running; I'm not sure which. In any case this can easily be corrected by making the signalling Win32 thread block on a semaphore after signalling the Win32 interrupt handling thread, and having the interrupt handling Win32 thread unblock the thread when it has finished its function (handshake). This effectively uses thread synchronisation to force the scheduling pattern to what I need. I am using SignalObjectAndWait() for this purpose.
Using this technique the simulation works perfectly when the real time scheduler being simulated is functioning in co-operative mode - but not (as is needed) in preemptive mode.
The problem with preemptive task switching is, I guess, the same: the task continues to execute for some time after it has been told to suspend before it actually stops running, so the system cannot be guaranteed to be left in a consistent state when the thread that runs the task suspends. In the preemptive case though, because the task does not know when it is going to happen, the same technique of using a semaphore to prevent the Win32 thread continuing until it is next resumed cannot be used.
Has anybody made it this far down this post - sorry for its length!
My questions then are:
How I can force Win32 (XP) scheduling to start and stop tasks immediately that the suspend and resume thread functions are called - or - how can I force a higher priority Win32 thread to start executing immediately that it is able to do so (the object it is blocked on is signalled). Effectively forcing Win32 to reschedule its running processes.
Is there some way of asynchronously stopping a task to wait for an event when it's not in the task/thread's sequential execution path?
The simulator works well in a Linux environment where POSIX signals are used to effectively interrupt threads - is there an equivalent in Win32?
Thanks to anybody who has taken the time to read this long post, and especially thanks in advance to anybody that can hold my 'real time engineers' hand through this Win32 maze.

If you need to do your own scheduling, then you might consider using fibers instead of threads. Fibers are like threads, in that they are separate blocks of executable code, however fibers can be scheduled in user code whereas threads are scheduled by the OS only. A single thread can host and manage scheduling of multiple fibers, and fibers can even schedule each other.

Firstly, what priority values are you using for your threads?
If you set the high priority thread to THREAD_PRIORITY_TIME_CRITICAL it should run pretty much immediately --- only those threads associated with a real-time process will have higher priority.
Secondly, how do you know that the suspend and resume aren't happening immediately? Are you sure this is the problem?
You cannot force a thread to wait on something from outside without suspending the thread to inject the wait code; if SuspendThread isn't working for you then this isn't going to help.
The closest to a signal is probably QueueUserAPC, which will schedule a callback to run the next time the thread enters an "alertable wait state", e.g. by calling SleepEx or WaitForSingleObjectEx or similar.

#Anthony W - thanks for the advice. I was running the Win32 threads that simulated the real time tasks at THREAD_PRIORITY_ABOVE_NORMAL, and the threads that ran the pseudo interrupt handler and the tick interrupt generator at THREAD_PRIORITY_HIGHEST. The threads that were suspended I was changing to THREAD_PRIORITY_IDLE in case that made any difference. I just tried your suggestion of using THREAD_PRIORITY_TIME_CRITICAL but unfortunately it didn't make any difference.
With regards to your question am I sure that the suspend and resume not happening immediately is the problem - well no I'm not. It is my best guess in an environment I am unfamiliar with. My thinking regarding the failure of suspend and resume to work immediately stems from my observation when a task yields. If I make the call to yield (signal [using a Win32 event] a higher priority Win32 thread to switch to the next real time task) I can place a break point after the yield and that gets hit before a break point in the higher priority thread. It is unclear whether a delay in signalling the event and the higher priority task running, or a delay in suspending the thread and the thread actually stopping running was causing this - but the behaviour was definitely observed. This was fixed using a semaphore handshake, but that cannot be done for preemptions caused by tick interrupts.
I know the simulation is not running as I expect because a set of tests that check the sequence of scheduling of real time tasks is failing. It is always possible the scheduler has a problem, or the test has a problem, but the test will run for weeks without failing on a real real time target so I'm inclined to think the test and the scheduler are ok. A big difference is on the real time target the tick frequency is 1 ms, whereas on the Win32 simulated target it is 15ms with quite a lot of variation even then.
#Remy - I have done quite a bit of reading about fibers today, and my conclusion is that for simulating the scheduler in cooperative mode they would be perfect. However, as far as I can see they can only be scheduled by the fibers themselves calling the SwitchToFiber() function. Can a thread be made to block on a timer or sleep so it runs periodically, effectively preempting the fiber that was running at the time? From what I have read the answer is no because blocking one fiber will block all fibers running in the thread. If it could be made to work, could the periodically executing fiber then call the SwitchToFiber() function to select the next fiber to run before again sleeping for a fixed period? Again I think the answer is no because once it switches to another fiber it will no longer be executing and so will not actually call the Sleep() function until the next time the executing fiber switches back to it. Please correct my logic here if I have got the wrong idea of how fibers work.
I think it could work if the periodic functionality could remain in its own thread, separate from the thread that executed the fibers, but (again from what I have read) I don't think one thread can influence the execution of fibers running in a different thread. Again I would be grateful if you could correct my conclusions here if they are wrong.

[EDIT] - simpler than the hack below - it seems just ensuring all the threads run on the same CPU core also fixes the problem :o) After all that. The only problem then is the CPU runs at nearly 100% and I'm not sure if the heat is damaging to it.
[/EDIT]
Ahaa! I think I have a work around for this, but it's ugly. The ugliness is kept in the port layer though.
What I do now is store the thread ID each time a thread is created to run a task (a Win32 thread is created for each real time task that is created). I then added the function below - which is called using trace macros. The trace macros can be defined to do whatever you want, and have proven very useful in this case. The comments in the code below explain. The simulation is not perfect, and all this does is correct the thread scheduling when it has already deviated from the real time scheduling whereas I would prefer it not to go wrong in the first place, but the positioning of the trace macros makes the code containing this solution pass all the tests:
void vPortCheckCorrectThreadIsRunning( void )
{
    xThreadState *pxThreadState;

    /* When switching threads, Windows does not always seem to run the selected
    thread immediately. This function can be called to check if the thread
    that is currently running is the thread that is responsible for executing
    the task selected by the real time scheduler. The demo project for the Win32
    port calls this function from the trace macros which are seeded throughout
    the real time kernel code at points where something significant occurs.
    Adding this functionality allows all the standard tests to pass, but users
    should still be aware that extra calls to this function could be required
    if their application requires absolute fixes and predictable sequencing (as
    the port tests do). This is still a simulation - not the real thing! */
    if( xTaskGetSchedulerState() != taskSCHEDULER_NOT_STARTED )
    {
        /* Obtain the real time task to Win32 mapping state information. */
        pxThreadState = ( xThreadState * ) *( ( unsigned long * ) pxCurrentTCB );

        if( GetCurrentThreadId() != pxThreadState->ulThreadId )
        {
            SwitchToThread();
        }
    }
}