SetThreadPriority SetPriorityClass and SetProcessAffinityMask - c

I am having a small issue that I don't entirely understand.
Basically, I have a thread which is waiting on an event, and a timeSetEvent from WinMM which is pulsing the event every 1ms.
I put some QueryPerformanceCounter calls in my thread to find out the time distance between each thread start. The thread is currently just waiting for the event, checking its own rate, and doing nothing else.
I verified that the WinMM timer is correctly scheduled every 1ms. However, once the event is signaled, my thread is sometimes preempted and runs ~6ms later than expected. At this point I started playing with priorities and affinity, so I cranked my priority class up to real time and my threads to time critical. On core 0 my thread still gets preempted every now and then (~1-2 times every 15 seconds). If I instead set the affinity to core 2, it never gets preempted (I ran the test software for a few hours and it was never preempted once). Are there driver/system threads running with a priority above real time/time critical that are bound to core 0 only?
I am running Windows 7 Pro on an Intel i7-3470.
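For anyone wanting to reproduce the measurement, here is a minimal sketch of the arithmetic only. It assumes one raw counter reading is stored per wake-up (on Windows these would come from QueryPerformanceCounter, with the frequency from QueryPerformanceFrequency). The function names and the tolerance parameter are made up for this example and are not from the original test program.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert a difference of two counter readings to microseconds.
   ticks_per_second is the counter frequency (hypothetical input here;
   on Windows it is what QueryPerformanceFrequency reports). */
double ticks_to_us(int64_t delta_ticks, int64_t ticks_per_second)
{
    return (double)delta_ticks * 1e6 / (double)ticks_per_second;
}

/* Count the intervals between consecutive wake-up timestamps that are
   longer than period_us + tolerance_us. If worst_us is not NULL, the
   longest interval seen (in microseconds) is stored there. */
size_t count_late_wakeups(const int64_t *stamps, size_t n,
                          int64_t ticks_per_second,
                          double period_us, double tolerance_us,
                          double *worst_us)
{
    size_t late = 0;
    double worst = 0.0;

    for (size_t i = 1; i < n; ++i) {
        double dt = ticks_to_us(stamps[i] - stamps[i - 1], ticks_per_second);
        if (dt > worst)
            worst = dt;
        if (dt > period_us + tolerance_us)
            ++late;
    }
    if (worst_us != NULL)
        *worst_us = worst;
    return late;
}
```

With a 1ms period, a call such as count_late_wakeups(stamps, n, freq, 1000.0, 500.0, &worst) would flag the ~6ms gaps while ignoring ordinary sub-millisecond jitter.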

Related

Priority based multithreading?

I have written code for two threads, where one is assigned priority 20 (lower) and the other priority 10 (higher). When I execute my code, I get the expected result 70% of the time, i.e. the high_prio thread (with priority 10) executes first and then low_prio (with priority 20).
Why is my code not able to get 100 % correct result in all the executions? Is there any conceptual mistake that I am doing?
void *low_prio(){
Something here;
}
void *high_prio(){
Something here;
}
int main(){
Thread with priority 10 calls high_prio;
Thread with priority 20 calls low_prio;
return 0;
}
Is there any conceptual mistake that I am doing?
Yes — you have an incorrect expectation regarding what thread priorities do. Thread priorities are not meant to force one thread to execute before another thread.
In fact, in a scenario where there is no CPU contention (i.e. where there are always at least as many CPU cores available as there are threads that currently want to execute), thread priorities will have no effect at all -- because there would be no benefit to forcing a low-priority thread not to run when there is a CPU core available for it to run on. In this no-contention scenario, all of the threads will get to run simultaneously and continuously for as long as they want to.
The only time thread priorities may make a difference is when there is CPU contention -- i.e. there are more threads that want to run than there are CPU cores available to run them. At that point, the OS's thread-scheduler has to make a decision about which thread will get to run and which thread will have to wait for a while. In this instance, thread priorities can be used to indicate to the scheduler which thread it should prefer to run.
Note that it's even more complicated than that, however -- for example, in your posted program, both of your threads are calling printf() rather a lot, and printf() invokes I/O, which means that the thread may be temporarily put to sleep while the I/O (e.g. to your Terminal window, or to a file if you have redirected stdout to file) completes. And while that thread is sleeping, the thread-scheduler can take advantage of the now-available CPU core to let another thread run, even if that other thread is of lower priority. Later, when the I/O operation completes, your high-priority thread will be re-awoken and re-assigned to a CPU core (possibly "bumping" a low-priority thread off of that core in order to get it).
Note that inconsistent results are normal for multithreaded programs -- threads are inherently non-deterministic, since their execution patterns are determined by the thread-scheduler's decisions, which in turn are determined by lots of factors (e.g. what other programs are running on the computer at the time, the system clock's granularity, etc).
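If a fixed order is what you actually need, explicit synchronisation is the usual tool. Below is a minimal sketch, assuming POSIX threads (the question does not say which thread library is used). The struct, the log array, and run_in_order are invented for this illustration. low_prio is deliberately started first, yet it still waits on a condition variable until high_prio has run.

```c
#include <pthread.h>
#include <stddef.h>

/* Shared state for the demo; all names here are illustrative. */
struct ordering {
    pthread_mutex_t lock;
    pthread_cond_t  high_done_cv;
    int             high_done;  /* becomes 1 once high_prio has run */
    char            log[3];     /* records 'H' and 'L' in execution order */
    int             len;
};

static void *high_prio(void *arg)
{
    struct ordering *o = arg;
    pthread_mutex_lock(&o->lock);
    o->log[o->len++] = 'H';
    o->high_done = 1;
    pthread_cond_signal(&o->high_done_cv);  /* wake low_prio if it is waiting */
    pthread_mutex_unlock(&o->lock);
    return NULL;
}

static void *low_prio(void *arg)
{
    struct ordering *o = arg;
    pthread_mutex_lock(&o->lock);
    while (!o->high_done)                   /* loop guards against spurious wake-ups */
        pthread_cond_wait(&o->high_done_cv, &o->lock);
    o->log[o->len++] = 'L';
    pthread_mutex_unlock(&o->lock);
    return NULL;
}

/* Starts low_prio first on purpose, then high_prio, and copies the
   recorded order into out. Returns 0 on success, -1 if a thread could
   not be created. */
int run_in_order(char out[3])
{
    struct ordering o;
    pthread_t low, high;
    int rc = 0;

    pthread_mutex_init(&o.lock, NULL);
    pthread_cond_init(&o.high_done_cv, NULL);
    o.high_done = 0;
    o.len = 0;
    o.log[0] = o.log[1] = o.log[2] = '\0';

    if (pthread_create(&low, NULL, low_prio, &o) != 0) {
        rc = -1;
    } else {
        if (pthread_create(&high, NULL, high_prio, &o) != 0) {
            /* Release the waiter so that it can still be joined. */
            pthread_mutex_lock(&o.lock);
            o.high_done = 1;
            pthread_cond_signal(&o.high_done_cv);
            pthread_mutex_unlock(&o.lock);
            rc = -1;
        } else {
            pthread_join(high, NULL);
        }
        pthread_join(low, NULL);
    }

    out[0] = o.log[0];
    out[1] = o.log[1];
    out[2] = '\0';
    pthread_cond_destroy(&o.high_done_cv);
    pthread_mutex_destroy(&o.lock);
    return rc;
}
```

The order here comes from the condition variable, not from any priority setting, so it holds no matter how many cores are free.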

Is a Windows Timer as accurate as Sleep()?

Sleep() is very accurate, so for example if I want to sleep for 10 hours:
Sleep(36000000); // sleep for 10 hours
My thread will wait for exactly 10 hours (plus the time that Windows needs to wake up my thread, which is negligible).
However, since Sleep() will block my UI thread, I wish to use Windows Timers instead. So is a Windows Timer as accurate as Sleep()? That is, will it wait for exactly 10 hours (plus the time it needs for my Window Procedure to receive the WM_TIMER message)?
Yes, the basic plumbing underneath Sleep() and SetTimer() is the same. You can see this by calling timeBeginPeriod(), which affects the accuracy of both. The clock tick interrupt handler counts sleeps and timers down and gets the thread scheduled when it is ready to run again. Their sleep/wait time is adjusted when the system clock gets re-calibrated by a time server.

The disadvantages of using sleep()

In C programming, if I want to coordinate two concurrently executing processes, I can use sleep(). However, I heard that sleep() is not a good way to enforce the order of events between processes. Are there any reasons?
sleep() is not a coordination function. It never has been. sleep() makes your process do just that - go to sleep, not running at all for a certain period of time.
You have been misinformed. Perhaps your source was referring to what is known as a backoff after an acquisition of a lock fails, in which case a randomized sleep may be appropriate.
The way one generally establishes a relative event ordering between processes (i.e. creates a happens-before edge) is to use a concurrency-control structure such as a condition variable, which is only raised at a certain point, or a blunter barrier, which causes each thread hitting it to wait until all others have also reached that point in the program.
Using sleep() will impact the latency and CPU load. Let's say you sleep for 1ms and check some atomic shared variable. The average latency will be (at least) 0.5ms. You will be consuming CPU cycles in this non-active thread to poll the shared atomic variable. There are also often no guarantees about the sleep time.
The OS provides services to communicate/synchronize between threads/processes. Those have low latency, consume less CPU cycles, and often have other guarantees - those are the ones you should use... (E.g. condition variables, events, semaphores etc.). When you use those the thread/process does not need to "poll". The kernel wakes up the waiting threads/processes when needed (the thread/process "blocks").
There are some rare situations where polling is the best solution for thread/process synchronization, e.g. a spinlock, usually when the overhead of going through the kernel is larger than the time spent polling.
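As a rough illustration of letting the kernel do the waiting, here is a sketch assuming a POSIX system (the lower-case sleep() in the question suggests one). The parent blocks in read() on a pipe until the child writes a byte, so the "child first, then parent" ordering holds without any sleep() or polling. The function name wait_for_child_event is made up for this example.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that signals an "event" by writing one byte to a pipe.
   The parent blocks in read() until that byte arrives.
   Returns the byte received, or -1 on error. */
int wait_for_child_event(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                 /* child: produce the event */
        char ready = 'R';
        close(fds[0]);
        ssize_t w = write(fds[1], &ready, 1);
        close(fds[1]);
        _exit(w == 1 ? 0 : 1);
    }

    /* parent: wait for the event */
    close(fds[1]);
    char got = 0;
    ssize_t r;
    do {
        r = read(fds[0], &got, 1);  /* blocks until the child writes */
    } while (r < 0 && errno == EINTR);  /* retry if a signal interrupted us */
    close(fds[0]);

    int status = 0;
    waitpid(pid, &status, 0);       /* reap the child */
    (void)status;

    return r == 1 ? (int)got : -1;
}
```

The parent consumes no CPU while it waits and wakes as soon as the byte is available, which is the latency and load advantage described above. On Windows, an event or semaphore object would play the same role.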
Sleep would not be a very robust way to handle event ordering between processes as there are so many things that can go wrong.
What if your sleep() is interrupted?
You need to be a bit more specific about what you mean by "implement the order of events between processes".
In my case, I was using this function in Celery: I was calling time.sleep(10). It worked fine when the celery_task was called once or twice per minute, but it created chaos in one case.
Suppose the celery_task is called 1000 times.
I had 4 Celery workers, so those 1000 calls were queued for execution.
The first 4 calls were executed by the 4 workers and the remaining 996 were still in the queue.
The workers were busy with those 4 tasks for 10 seconds, and only then took the next 4 tasks. At this rate it takes around 1000/4*10 = 2500 seconds to drain the queue.
Eventually, we had to remove time.sleep, as it was blocking each worker for 10 seconds.

What could produce this bizarre behavior with two threads sleeping at the same time?

There are two threads. One is an events thread, and another does rendering. The rendering thread uses variables from the events thread. There are mutex locks but they are irrelevant since I noticed the behavior is same even if I remove them completely (for testing).
If I do a sleep() in the rendering thread alone, for 10 milliseconds, the FPS is normally 100.
If I do no sleep at all in the rendering thread and a sleep in the events thread, the rendering thread does not slow down at all.
But, if I do a sleep of 10 milliseconds in the rendering thread and 10 in the events thread, the FPS is not 100, but lower, about 84! (notice it's the same even if mutex locks are removed completely)
(If none of them has sleeps it normally goes high.)
What could produce this behavior?
--
The sleep command used is Sleep() on Windows or SDL_Delay() (which probably ends up calling Sleep() on Windows).
I believe I have found an answer (own answer).
Sleeping is not guaranteed to wait for a period, but it will wait at least a certain time, due to OS scheduling.
A better approach would be to calculate actual time passed explicitly (and allow execution via that, only if certain time has passed).
The threads run asynchronously unless you synchronise them, and will be scheduled according to the OS's scheduling policy. I would suggest that the behaviour will at best be non-deterministic (unless you were running on an RTOS perhaps).
You might do better to have one thread trigger another by some synchronisation mechanism such as a semaphore, then only have one thread Sleep, and the other wait on the semaphore.
I do not know what your "Events" thread does but, given its name, perhaps it would be better to wait on the events themselves rather than simply sleep and then poll for events (if that is what it does). Making the rendering periodic probably makes sense, but event handling would be better done by waiting on the events.
The behavior will vary depending on many factors such as the OS version (e.g. Win7 vs. Win XP) and number of cores. If you have two cores and two threads with no synchronization objects they should run concurrently and Sleep() on one thread should not impact the other (for the most part).
It sounds like you have some other synchronization between the threads because otherwise when you have no sleep at all in your rendering thread you should be running at >100FPS, no?
If there is absolutely no synchronization then, depending on how much processing happens in the two threads, having them both Sleep() may increase the probability of contention on a single-core system. That is, if only one thread calls Sleep(), it is generally likely to be given the next quantum once it wakes up, and assuming it does very little processing, i.e. yields right away, that behavior will continue. If two threads are calling Sleep(), there is some probability that they will wake up in the same quantum, and if at least one of them needs to do any amount of processing, the other will be delayed and the observed frequency will be lower. This should only apply if there's a single core available to run the two threads on.
If you want to maintain a 100FPS update rate you should keep track of the next scheduled update time and only Sleep for the remaining time. This will ensure that even if your thread gets bumped by some other thread for a CPU quantum you will be able to keep the rate (assuming there is enough CPU time for all processing). Something like:
DWORD next_frame_time = GetTickCount(); // Milliseconds. Note the resolution of GetTickCount()
while(1)
{
    next_frame_time += 10; // Time of next frame update in ms
    DWORD wait_for = next_frame_time - GetTickCount(); // How much time remains to next update
    if( wait_for < 11 ) // A simplistic test for the case where we're already too late
    {
        Sleep(wait_for);
    }
    // Do periodic processing here
}
Depending on the target OS and your accuracy requirements you may want to use a higher resolution time function such as QueryPerformanceCounter(). The code above will not work well on Windows XP where the resolution of GetTickCount() is ~16ms but should work in Win7 - it's mostly to illustrate my point rather than meant to be copied literally in all situations.
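The deadline arithmetic can also be pulled out into small helpers, which makes the "already too late" case explicit. This is only a sketch under my own assumptions: the names ms_until and next_deadline are invented, the inputs are free-running 32-bit millisecond counters like the value GetTickCount() returns, and the resynchronise-when-far-behind policy is one possible choice rather than part of the loop above.

```c
#include <stdint.h>

/* Milliseconds left until deadline_ms, given the current tick count.
   The subtraction is done in unsigned arithmetic and the result is then
   read as a signed value, so it stays meaningful when the 32-bit counter
   wraps (the unsigned-to-signed conversion is implementation-defined in
   C but wraps on mainstream compilers). Returns 0 if the deadline has
   already passed. */
uint32_t ms_until(uint32_t deadline_ms, uint32_t now_ms)
{
    int32_t remaining = (int32_t)(deadline_ms - now_ms);
    return remaining > 0 ? (uint32_t)remaining : 0u;
}

/* Advance the deadline by one period. If the caller has fallen more than
   one period behind the new deadline, restart the schedule from now_ms
   instead of trying to catch up on every missed frame. */
uint32_t next_deadline(uint32_t deadline_ms, uint32_t now_ms, uint32_t period_ms)
{
    uint32_t next = deadline_ms + period_ms;
    if ((int32_t)(now_ms - next) > (int32_t)period_ms)
        next = now_ms + period_ms;
    return next;
}
```

In a Windows loop this would be used roughly as next_frame_time = next_deadline(next_frame_time, GetTickCount(), 10); followed by Sleep(ms_until(next_frame_time, GetTickCount()));.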

Force Win32 thread scheduling to a defined sequence based on priority

I am an embedded programmer attempting to simulate a real time preemptive scheduler in a Win32 environment using Visual Studio 2010 and MingW (as two separate build environments). I am very green on the Win32 scheduling environment and have hit a brick wall with what I am trying to do. I am not trying to achieve real time behaviour - just to get the simulated tasks to run in the same order and sequence as they would on the real target hardware.
The real time scheduler being simulated has a simple objective - always execute the highest priority task (thread) that is able to run. As soon as a task becomes able to run, it must preempt the currently running task if it has a priority higher than the currently running task. A task can become able to run due to an external event it was waiting for, or a time out/block time/sleep time expiring - with a tick interrupt generating the time base.
In addition to this preemptive behaviour, a task can yield or volunteer to give up its time slice because it is executing a sleep or wait type function.
I am simulating this by creating a low priority Win32 thread for each task that is created by the real time scheduler being simulated (the thread effectively does the context switching the scheduler would do on a real embedded target), a medium priority Win32 thread as a pseudo interrupt handler (handles simulated tick interrupts and yield requests that are signalled to it using a Win32 event object), and a higher priority Win32 thread to simulate the peripheral that generates the tick interrupts.
When the pseudo interrupt handler establishes that a task switch should occur it suspends the currently executing thread using SuspendThread() and resumes the thread that executes the newly selected task using ResumeThread(). Of the many tasks and their associated Win32 threads that may be created, only one thread that manages the task will ever be out of the suspended state at any one time.
It is important that a suspended thread suspends immediately that SuspendThread() is called, and that the pseudo interrupt handling thread executes as soon as the event telling it that an interrupt is pending is signalled - but this is not the behaviour I am seeing.
As an example problem that I already have a work around for: When a task/thread yields, the yield event is latched in a variable and the interrupt handling thread is signalled, as there is a pseudo interrupt (the yield) that needs processing. Now, in a real time system of the kind I am used to programming, I would expect the interrupt handling thread to execute immediately that it is signalled, because it has a higher priority than the thread that signals it. What I am seeing in the Win32 environment is that the thread that signals the higher priority thread continues for some time before being suspended - either because it takes some time before the signalled higher priority thread starts to execute or because it takes some time for the suspended task to actually stop running - I'm not sure which. In any case this can easily be corrected by making the signalling Win32 thread block on a semaphore after signalling the Win32 interrupt handling thread, and having the interrupt handling Win32 thread unblock the thread when it has finished its function (a handshake). This effectively uses thread synchronisation to force the scheduling pattern to what I need. I am using SignalObjectAndWait() for this purpose.
Using this technique the simulation works perfectly when the real time scheduler being simulated is functioning in co-operative mode - but not (as is needed) in preemptive mode.
The problem with preemptive task switching is, I guess, the same: the task continues to execute for some time after it has been told to suspend before it actually stops running, so the system cannot be guaranteed to be left in a consistent state when the thread that runs the task suspends. In the preemptive case though, because the task does not know when it is going to happen, the same technique of using a semaphore to prevent the Win32 thread continuing until it is next resumed cannot be used.
Has anybody made it this far down this post - sorry for its length!
My questions then are:
How I can force Win32 (XP) scheduling to start and stop tasks immediately that the suspend and resume thread functions are called - or - how can I force a higher priority Win32 thread to start executing immediately that it is able to do so (the object it is blocked on is signalled). Effectively forcing Win32 to reschedule its running processes.
Is there some way of asynchronously stopping a task to wait for an event when it's not in the task/thread's sequential execution path?
The simulator works well in a Linux environment where POSIX signals are used to effectively interrupt threads - is there an equivalent in Win32?
Thanks to anybody who has taken the time to read this long post, and especially thanks in advance to anybody that can hold my 'real time engineers' hand through this Win32 maze.
If you need to do your own scheduling, then you might consider using fibers instead of threads. Fibers are like threads, in that they are separate blocks of executable code, however fibers can be scheduled in user code whereas threads are scheduled by the OS only. A single thread can host and manage scheduling of multiple fibers, and fibers can even schedule each other.
Firstly, what priority values are you using for your threads?
If you set the high priority thread to THREAD_PRIORITY_TIME_CRITICAL it should run pretty much immediately --- only those threads associated with a real-time process will have higher priority.
Secondly, how do you know that the suspend and resume aren't happening immediately? Are you sure this is the problem?
You cannot force a thread to wait on something from outside without suspending the thread to inject the wait code; if SuspendThread isn't working for you then this isn't going to help.
The closest to a signal is probably QueueUserAPC, which will schedule a callback to run the next time the thread enters an "alertable wait state", e.g. by calling SleepEx or WaitForSingleObjectEx or similar.
@Anthony W - thanks for the advice. I was running the Win32 threads that simulated the real time tasks at THREAD_PRIORITY_ABOVE_NORMAL, and the threads that ran the pseudo interrupt handler and the tick interrupt generator at THREAD_PRIORITY_HIGHEST. The threads that were suspended I was changing to THREAD_PRIORITY_IDLE in case that made any difference. I just tried your suggestion of using THREAD_PRIORITY_TIME_CRITICAL but unfortunately it didn't make any difference.
With regards to your question am I sure that the suspend and resume not happening immediately is the problem - well no I'm not. It is my best guess in an environment I am unfamiliar with. My thinking regarding the failure of suspend and resume to work immediately stems from my observation when a task yields. If I make the call to yield (signal [using a Win32 event] a higher priority Win32 thread to switch to the next real time task) I can place a break point after the yield and that gets hit before a break point in the higher priority thread. It is unclear whether a delay in signalling the event and the higher priority task running, or a delay in suspending the thread and the thread actually stopping running was causing this - but the behaviour was definitely observed. This was fixed using a semaphore handshake, but that cannot be done for preemptions caused by tick interrupts.
I know the simulation is not running as I expect because a set of tests that check the sequence of scheduling of real time tasks is failing. It is always possible the scheduler has a problem, or the test has a problem, but the test will run for weeks without failing on a real real time target so I'm inclined to think the test and the scheduler are ok. A big difference is on the real time target the tick frequency is 1 ms, whereas on the Win32 simulated target it is 15ms with quite a lot of variation even then.
@Remy - I have done quite a bit of reading about fibers today, and my conclusion is that for simulating the scheduler in cooperative mode they would be perfect. However, as far as I can see they can only be scheduled by the fibers themselves calling the SwitchToFiber() function. Can a thread be made to block on a timer or sleep so it runs periodically, effectively preempting the fiber that was running at the time? From what I have read the answer is no because blocking one fiber will block all fibers running in the thread. If it could be made to work, could the periodically executing fiber then call the SwitchToFiber() function to select the next fiber to run before again sleeping for a fixed period? Again I think the answer is no because once it switches to another fiber it will no longer be executing and so will not actually call the Sleep() function until the next time the executing fiber switches back to it. Please correct my logic here if I have got the wrong idea of how fibers work.
I think it could work if the periodic functionality could remain in its own thread, separate from the thread that executed the fibers - but (again from what I have read) I don't think one thread can influence the execution of fibers running in a different thread. Again I would be grateful if you could correct my conclusions here if they are wrong.
[EDIT] - simpler than the hack below - it seems just ensuring all the threads run on the same CPU core also fixes the problem :o) After all that. The only problem then is the CPU runs at nearly 100% and I'm not sure if the heat is damaging to it.
[/EDIT]
Ahaa! I think I have a work around for this - but it's ugly. The ugliness is kept in the port layer though.
What I do now is store the thread ID each time a thread is created to run a task (a Win32 thread is created for each real time task that is created). I then added the function below - which is called using trace macros. The trace macros can be defined to do whatever you want, and have proven very useful in this case. The comments in the code below explain. The simulation is not perfect, and all this does is correct the thread scheduling when it has already deviated from the real time scheduling whereas I would prefer it not to go wrong in the first place, but the positioning of the trace macros makes the code containing this solution pass all the tests:
void vPortCheckCorrectThreadIsRunning( void )
{
    xThreadState *pxThreadState;

    /* When switching threads, Windows does not always seem to run the selected
    thread immediately. This function can be called to check if the thread
    that is currently running is the thread that is responsible for executing
    the task selected by the real time scheduler. The demo project for the Win32
    port calls this function from the trace macros which are seeded throughout
    the real time kernel code at points where something significant occurs.
    Adding this functionality allows all the standard tests to pass, but users
    should still be aware that extra calls to this function could be required
    if their application requires absolute fixes and predictable sequencing (as
    the port tests do). This is still a simulation - not the real thing! */
    if( xTaskGetSchedulerState() != taskSCHEDULER_NOT_STARTED )
    {
        /* Obtain the real time task to Win32 mapping state information. */
        pxThreadState = ( xThreadState * ) *( ( unsigned long * ) pxCurrentTCB );
        if( GetCurrentThreadId() != pxThreadState->ulThreadId )
        {
            SwitchToThread();
        }
    }
}
