I wrote a CPU-intensive program in C to run on Windows. In the main loop I check for a key press, which allows you to interrupt execution in order to pause the program. The idea is to yield the CPU to other processes if the program is slowing the computer down too much. After a key press I wait for more keyboard input using fgets(), which allows you to restart the program later. This reduces the CPU usage shown in Task Manager quite well. But I was wondering if there is perhaps a more explicit way to tell the operating system that this process doesn't need any attention for a while, in order to reduce the overhead while idle to the absolute minimum.
My understanding is that the operating system periodically lets a process run and then stops running it after a certain amount of time. It then checks the rest of the processes in the same way until it comes back to this one again. If it has enough to do the process will run for the maximum allowed time. Otherwise, it will stop early and return control to the operating system. So a function like fgets must immediately return control if there is no keyboard input, which is why the process runs at near 0% CPU. So I guess another way of asking my question is how do I deliberately return control to the operating system in my own code.
my question is how do I deliberately return control to the operating system in my own code
You can use either Sleep(0) or SwitchToThread(). Both pass control back to the OS and might cause the calling thread to give up the remaining time slice but the devil is in the detail.
Sleep(0)
If no other thread with a matching priority is ready to run, the call returns immediately. Otherwise, the thread gives up its remaining time slice.
You can work around the priority issue by using SwitchToThread or Sleep(1). The disadvantage of the latter is that the thread gives up its time slice unconditionally, whether or not other threads are ready to run.
SwitchToThread()
If no other thread, irrespective of its priority, is ready to run on the calling thread's current processor, the call returns immediately. Otherwise, the thread gives up the remainder of its time slice, but the yield lasts for at most one time slice.
Alternatively, you could change the priority of the process (SetPriorityClass() with PROCESS_MODE_BACKGROUND_BEGIN) or thread (SetThreadPriority() with THREAD_MODE_BACKGROUND_BEGIN) so that the OS can take care of prioritizing more important processes/threads for you. In your scenario, doing so would be a better fit. The scheduler will respond to sudden CPU demand without any additional work on your end.
You can do it in pretty much two ways. Either read the input using a blocking function, like fgets, or read the input using a non-blocking function. In the second situation you would need to incorporate a timeout of some sort. Some functions do this for you, like select. Otherwise you need to regularly sleep your process or thread.
Effectively, the system uses interrupts and wait queues to track which processes are waiting on a specific event, and wakes a blocked process only when that event occurs.
pthread_yield is documented as "causes the calling thread to relinquish the CPU", but on a modern OS/scheduler, the relinquishing of the CPU happens automatically at the appropriate times (i.e. whenever the thread calls a blocking operation, and/or when the thread's quantum has expired). Is pthread_yield() therefore vestigial/useless except in the special case of running under a co-operative-only task scheduler? Or are there some use-cases where calling it would still be correct/useful even under a modern pre-emptive scheduler?
pthread_yield() gives you a chance to do a short sleep -- not a timed sleep. You relinquish the remainder of time slice to some other thread or process, but you don't put the thread in a wait queue.
Also, a while ago I read about how schedulers prioritize interactive processes. These are the processes the user interacts with directly, and they are the ones whose sluggishness you feel most (your system feels less slow if the UI stays responsive). One property of interactive processes is that they have little to do and mostly don't use their entire time slice. So if a process keeps yielding before its time slice is up, the scheduler assumes it is interactive and boosts its priority. There were exploits that used this trick to effectively use 99% of the CPU while the offending process showed up as 0%.
I have written code for two threads, where one is assigned priority 20 (lower) and the other priority 10 (higher). Upon executing my code, 70% of the time I get the expected result, i.e. the high_prio thread (with priority 10) executes first and then low_prio (with priority 20).
Why is my code not able to get 100 % correct result in all the executions? Is there any conceptual mistake that I am doing?
void *low_prio(){
Something here;
}
void *high_prio(){
Something here;
}
int main(){
Thread with priority 10 calls high_prio;
Thread with priority 20 calls low_prio;
return 0;
}
Is there any conceptual mistake that I am doing?
Yes -- you have an incorrect expectation regarding what thread priorities do. Thread priorities are not meant to force one thread to execute before another thread.
In fact, in a scenario where there is no CPU contention (i.e. where there are always at least as many CPU cores available as there are threads that currently want to execute), thread priorities will have no effect at all -- because there would be no benefit to forcing a low-priority thread not to run when there is a CPU core available for it to run on. In this no-contention scenario, all of the threads will get to run simultaneously and continuously for as long as they want to.
The only time thread priorities may make a difference is when there is CPU contention -- i.e. there are more threads that want to run than there are CPU cores available to run them. At that point, the OS's thread-scheduler has to make a decision about which thread will get to run and which thread will have to wait for a while. In this instance, thread priorities can be used to indicate to the scheduler which thread it should prefer allow to run.
Note that it's even more complicated than that, however -- for example, in your posted program, both of your threads are calling printf() rather a lot, and printf() invokes I/O, which means that the thread may be temporarily put to sleep while the I/O (e.g. to your Terminal window, or to a file if you have redirected stdout to file) completes. And while that thread is sleeping, the thread-scheduler can take advantage of the now-available CPU core to let another thread run, even if that other thread is of lower priority. Later, when the I/O operation completes, your high-priority thread will be re-awoken and re-assigned to a CPU core (possibly "bumping" a low-priority thread off of that core in order to get it).
Note that inconsistent results are normal for multithreaded programs -- threads are inherently non-deterministic, since their execution patterns are determined by the thread-scheduler's decisions, which in turn are determined by lots of factors (e.g. what other programs are running on the computer at the time, the system clock's granularity, etc).
When a program is doing I/O, my understanding is that the thread will briefly sleep and then resume (e.g. when writing to a file). My question is: when we print using printf(), does a C program's thread sleep in any way?
Since you've specifically asked about printf(), I'm going to assume the most generic case: printf() fills a reasonably sized buffer and then invokes the system call write(2) on stdout, and stdout happens to point to your terminal.
In most operating systems, when you invoke certain system calls, the calling thread/process is removed from the CPU's runnable list and placed on a separate wait list. This is true for all I/O calls like read/write/etc. Being temporarily descheduled while waiting for I/O is not the same as being put to sleep via a timer.
For example, Linux has an uninterruptible sleep state for threads/processes waiting on I/O, and an interruptible sleep state for those waiting on timers and events. Though from the user's perspective the two look the same, their implementations behind the scenes are significantly different.
To answer your question: a call to printf() isn't exactly sleeping; it is waiting for the buffer to be flushed to the device, which is not the same as a timed sleep. Even then there are a few more quirks, which you can read about in signal(7), and you can read more about the various process/thread states on Marek's blog.
Hope this helps.
Much of the point of stdio.h is that it buffers I/O: a call to printf will often simply put text into a memory buffer (owned by the library by default) and perform zero system calls, thus offering no opportunity to yield the CPU. Even when something like write(2) is called, the thread may continue running: the kernel can copy the data into kernel memory (from which it will be transferred to the disk later, e.g. by DMA) and return immediately.
Of course, even on a single-core system, most operating systems frequently interrupt the running thread in order to share the CPU. So another thread can still get to run at any time, even if no blocking calls are made.
I have a project with some soft real-time requirements. I have two processes (programs that I've written) that do some data acquisition. In either case, I need to continuously read in data that's coming in and process it.
The first program is heavily threaded, and the second one uses a library which should be threaded, but I have no clue what's going on under the hood. Each program is executed by the user and (by default) I see each with a priority of 20 and a nice value of 0. Each program uses roughly 30% of the CPU.
As it stands, both processes have to contend with a few background processes, and I want to give my two programs the best possible shot at the CPU. My main issue is that I talk to a device that has a 64-byte hardware buffer, and if I don't read from it in time, I get an overflow. I have noted this condition occurring once every 2-3 hours of run time.
Based on my research (http://oreilly.com/catalog/linuxkernel/chapter/ch10.html) there appear to be three ways of playing around with the priority:
Set the nice value to a lower number, and therefore give each process more priority. I can do this without any modification to my code using the nice command (or the corresponding system call).
Use sched_setscheduler() for the entire process to a particular scheduling policy.
Use pthread_setschedparam() to individually set each pthread.
I have run into the following roadblocks:
Say I go with choice 3: how do I prevent lower-priority threads from being starved? Is there also a way to ensure that shared locks cause lower-priority threads to be promoted to a higher priority? Say I have a real-time SCHED_RR thread that shares a lock with a default SCHED_OTHER thread. When the SCHED_OTHER thread holds the lock, I want it to execute at a higher priority so that it frees the lock quickly. How do I ensure this?
If a thread of SCHED_RR creates another thread, is the new thread automatically SCHED_RR, or do I need to specify this? What if I have a process that I have set to SCHED_RR -- do all its threads automatically follow this policy? What if a SCHED_RR process spawns a child process -- is it automatically SCHED_RR too?
Does any of this matter given that the code only uses about 60% of the CPU? Or are there still issues with the CPU being shared with background processes that I should be concerned about, and that could be causing my buffer overflows?
Sorry for the long winded question, but I felt it needed some background info. Thanks in advance for the help.
(1) pthread_mutex_setprioceiling
(2) A newly created thread inherits the scheduling policy and priority of its creating thread unless its thread attributes (e.g. pthread_attr_setschedparam / pthread_attr_setschedpolicy) direct otherwise when you call pthread_create.
(3) Since you don't yet know what causes the overflows, it is, in fairness, hard for anyone to say with assurance.
There are two threads. One is an events thread, and another does rendering. The rendering thread uses variables from the events thread. There are mutex locks but they are irrelevant since I noticed the behavior is same even if I remove them completely (for testing).
If I do a sleep() in the rendering thread alone, for 10 milliseconds, the FPS is normally 100.
If I do no sleep at all in the rendering thread and a sleep in the events thread, the rendering thread does not slow down at all.
But, if I do a sleep of 10 milliseconds in the rendering thread and 10 in the events thread, the FPS is not 100, but lower, about 84! (notice it's the same even if mutex locks are removed completely)
(If none of them has sleeps it normally goes high.)
What could produce this behavior?
--
The sleep command used is Sleep() of Windows, or SDL_Delay() (which probably ends up calling Sleep() on Windows).
I believe I have found an answer (own answer).
Sleeping is not guaranteed to wait exactly the requested period; it will wait at least that long, due to OS scheduling.
A better approach would be to measure the actual elapsed time explicitly (and only proceed with execution once the required time has passed).
The threads run asynchronously unless you synchronise them, and will be scheduled according to the OS's scheduling policy. I would suggest that the behaviour will at best be non-deterministic (unless you were running on an RTOS perhaps).
You might do better to have one thread trigger another by some synchronisation mechanism such as a semaphore, then only have one thread Sleep, and the other wait on the semaphore.
I do not know what your "Events" thread does but given its name, perhaps it would be better to wait on the events themselves rather than simply sleep and then poll for events (if that is what it does). Making the rendering periodic probably makes sense, but waiting on events would be better doing exactly that.
The behavior will vary depending on many factors such as the OS version (e.g. Win7 vs. Win XP) and number of cores. If you have two cores and two threads with no synchronization objects they should run concurrently and Sleep() on one thread should not impact the other (for the most part).
It sounds like you have some other synchronization between the threads because otherwise when you have no sleep at all in your rendering thread you should be running at >100FPS, no?
In the case that there is absolutely no synchronization, then depending on how much processing happens in the two threads, having them both Sleep() may increase the probability of contention on a single-core system. That is, if only one thread calls Sleep(), it is generally likely to be given the next quantum once it wakes up, and assuming it does very little processing (i.e. yields right away), that behavior will continue. If two threads are calling Sleep(), there is some probability they will wake up in the same quantum, and if at least one of them needs to do any amount of processing, the other will be delayed and the observed frequency will be lower. This should only apply when there is a single core available to run the two threads on.
If you want to maintain a 100FPS update rate you should keep track of the next scheduled update time and only Sleep for the remaining time. This will ensure that even if your thread gets bumped by some other thread for a CPU quanta you will be able to keep the rate (assuming there is enough CPU time for all processing). Something like:
DWORD next_frame_time = GetTickCount(); // Milliseconds. Note the limited resolution of GetTickCount()
while (1)
{
    next_frame_time += 10; // Time of next frame update in ms
    // Signed difference, so being late shows up as a negative value rather
    // than wrapping around to a huge unsigned number
    int wait_for = (int)(next_frame_time - GetTickCount());
    if (wait_for > 0) // Only sleep if the next update is still in the future
    {
        Sleep(wait_for);
    }
    // Do periodic processing here
}
Depending on the target OS and your accuracy requirements you may want to use a higher resolution time function such as QueryPerformanceCounter(). The code above will not work well on Windows XP where the resolution of GetTickCount() is ~16ms but should work in Win7 - it's mostly to illustrate my point rather than meant to be copied literally in all situations.