I have a C program on Linux. During execution of my program, I want to make some decisions if the process is facing scheduling delay above a threshold.
Any suggestions on how I can find this statistic?
P.S.: By scheduling delay I mean time spent by the process waiting to be scheduled i.e. time spent in the scheduler queue.
The time() function allows you to measure the "wall clock" time: http://linux.die.net/man/2/time
On the other hand, the clock() function allows you to measure the CPU time used by your process: http://linux.die.net/man/3/clock
By subtracting the two, you can get an approximation of what you asked for.
PS: for more accurate measurements (time() has one-second resolution) you can use clock_gettime(): http://linux.die.net/man/3/clock_gettime
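Here is a minimal sketch of that subtraction, assuming Linux and the CLOCK_MONOTONIC / CLOCK_PROCESS_CPUTIME_ID clocks. Keep in mind that the difference also includes time spent sleeping or blocked on I/O, not just time spent waiting in the run queue, so it is only an upper bound on the scheduling delay.

#include <stdio.h>
#include <time.h>

static double diff_sec(struct timespec a, struct timespec b)
{
    return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
}

int main(void)
{
    struct timespec wall0, cpu0, wall1, cpu1;

    clock_gettime(CLOCK_MONOTONIC, &wall0);
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &cpu0);

    /* ... the work you want to monitor ... */

    clock_gettime(CLOCK_MONOTONIC, &wall1);
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &cpu1);

    double wall = diff_sec(wall0, wall1);
    double cpu  = diff_sec(cpu0, cpu1);

    /* Time the process was not on a CPU: run-queue wait, blocking I/O, sleeps. */
    printf("wall %.3f s, cpu %.3f s, off-CPU (approx.) %.3f s\n",
           wall, cpu, wall - cpu);
    return 0;
}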
You could set a timer to go off, say, every minute, or whatever interval seems appropriate, and then gather stats with getrusage(). Based on those results (the difference between successive values), you could make your decision.
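Something along these lines, for example (a rough sketch; the interval, the threshold and the react_to_delay() hook are placeholders I made up, and you would typically run this loop in a dedicated monitoring thread). As with the wall-time-minus-CPU-time approach above, the gap also includes time the process spent blocked or sleeping, so treat it as a coarse signal rather than a precise run-queue measurement.

#include <stdio.h>
#include <sys/resource.h>
#include <time.h>
#include <unistd.h>

#define SAMPLE_SECONDS 60          /* sampling interval */
#define DELAY_THRESHOLD_SEC 1.0    /* tune to your needs */

static double timeval_sec(struct timeval tv)
{
    return tv.tv_sec + tv.tv_usec / 1e6;
}

static void react_to_delay(double gap)   /* placeholder decision hook */
{
    fprintf(stderr, "process was off-CPU for %.2f s in the last interval\n", gap);
}

void monitor_loop(void)
{
    struct rusage prev, cur;
    struct timespec prev_wall, cur_wall;

    getrusage(RUSAGE_SELF, &prev);
    clock_gettime(CLOCK_MONOTONIC, &prev_wall);

    for (;;) {
        sleep(SAMPLE_SECONDS);     /* or use a POSIX timer instead */

        getrusage(RUSAGE_SELF, &cur);
        clock_gettime(CLOCK_MONOTONIC, &cur_wall);

        double wall = (cur_wall.tv_sec - prev_wall.tv_sec) +
                      (cur_wall.tv_nsec - prev_wall.tv_nsec) / 1e9;
        double cpu  = (timeval_sec(cur.ru_utime) + timeval_sec(cur.ru_stime)) -
                      (timeval_sec(prev.ru_utime) + timeval_sec(prev.ru_stime));

        if (wall - cpu > DELAY_THRESHOLD_SEC)
            react_to_delay(wall - cpu);

        prev = cur;
        prev_wall = cur_wall;
    }
}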
I work for a company that produces automatic machines, and I help maintain the software that controls them. The software runs on a real-time operating system and consists of multiple threads running concurrently. The code bases are legacy and carry substantial technical debt. Among all the issues they exhibit, one stands out as rather bizarre to me: most of the timing algorithms that compute elapsed time to realize common timed features such as timeouts, delays, recording time spent in a particular state, etc., basically take the following form:
unsigned int shouldContinue = 1;
unsigned int blockDuration = 1; // Let's say 1 millisecond.
unsigned int loopCount = 0;
unsigned int elapsedTime = 0;
while (shouldContinue)
{
    .
    . // a bunch of statements, selections and function calls
    .
    blockingSystemCall(blockDuration);
    .
    . // a bunch of statements, selections and function calls
    .
    loopCount++;
    elapsedTime = loopCount * blockDuration;
}
The blockingSystemCall function can be any operating system's API that suspends the current thread for the specified blockDuration. The elapsedTime variable is subsequently computed by basically multiplying loopCount by blockDuration or by any equivalent algorithm.
To me, this kind of timing algorithm is wrong, and is not acceptable under most circumstances. All the instructions in the loop, including the condition of the loop, are executed sequentially, and each instruction requires measurable CPU time to execute. Therefore, the actual time elapsed is strictly greater than the value of elapsedTime in any given instance after the loop starts. Consequently, suppose the CPU time required to execute all the statements in the loop, denoted by d, is constant. Then, elapsedTime lags behind the actual time elapsed by loopCount • d for any loopCount > 0; that is, the deviation grows according to an arithmetic progression. This sets the lower bound of the deviation because, in reality, there will be additional delays caused by thread scheduling and time slicing, depending on other factors.
In fact, not too long ago, while testing a new data-driven predictive maintenance feature which relies on the operation time of a machine, we discovered that the operation time reported by the software lagged behind that of a standard reference clock by a whopping three hours after the machine was in continuous operation for just over two days. It was through this test that I discovered the algorithm outlined above, which I swiftly determined to be the root cause.
Coming from a background where I used to implement timing algorithms on bare-metal systems using timer interrupts, which allows the CPU to carry on with the execution of the business logic while the timer process runs in parallel, it was shocking for me to have discovered that the algorithm outlined in the introduction is used in the industry to compute elapsed time, even more so when a typical operating system already encapsulates the timer functions in the form of various easy-to-use public APIs, liberating the programmer from the hassle of configuring a timer via hardware registers, raising events via interrupt service routines, etc.
The kind of timing algorithm illustrated in the skeleton code above is found in at least two code bases independently developed by two distinct software engineering teams at two subsidiary companies located in different cities, albeit within the same state. This makes me wonder whether this is how things are normally done in the industry, or whether it is just an isolated case that is not widespread.
So, the question is: is the algorithm shown above common or acceptable for calculating elapsed time, given that the underlying operating system already provides highly optimized time-management system calls that can be used right out of the box to accurately measure elapsed time, or even used as basic building blocks for creating higher-level timing facilities with more intuitive methods, similar to, e.g., the Timer class in C#?
You're right that calculating elapsed time that way is inaccurate, since it assumes that the blocking call will take exactly the amount of time indicated and that everything that happens outside of the blocking system call takes no time at all, which would only be true on an infinitely fast machine. Since actual machines are not infinitely fast, the elapsed time calculated this way will always be somewhat less than the actual elapsed time.
As to whether that's acceptable, it's going to depend on how much timing accuracy your program needs. If it's just doing a rough estimate to make sure a function doesn't run for "too long", this might be okay. OTOH if it is trying for accuracy (and in particular accuracy over a long period of time), then this approach won't provide that.
FWIW the more common (and more accurate) way to measure elapsed time would be something like this:
const unsigned int startTime = current_clock_time();
while (shouldContinue)
{
    loopCount++;
    elapsedTime = current_clock_time() - startTime;
}
This has the advantage of not "drifting away" from the accurate value over time, but it does assume that you have a current_clock_time() type of function available, and that it's acceptable to call it within the loop. (If current_clock_time() is very expensive, or doesn't provide some real-time performance guarantees that the calling routine requires, that might be a reason not to do it this way)
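For what it's worth, on a POSIX system current_clock_time() could be implemented along these lines (the name is just the placeholder used above, not a standard function):

#include <stdint.h>
#include <time.h>

static uint64_t current_clock_time(void)    /* milliseconds, monotonic */
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);    /* immune to wall-clock jumps */
    return (uint64_t)ts.tv_sec * 1000u + ts.tv_nsec / 1000000u;
}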
I don't think these loops do what you think they do.
In a RTOS, the purpose of a loop like this is usually to perform a task at regular intervals.
blockingSystemCall(N) probably does not just sleep for N milliseconds like you think it does. It probably sleeps until N milliseconds after the last time your thread woke up.
More accurately, all the sleeps your thread has performed since starting are added to the thread start time to get the time at which the OS will try to wake the thread up. If your thread woke up due to an I/O event, then the last one of those times could be used instead of the thread start time. The point is that the inaccuracies in all these start times are corrected, so your thread wakes up at regular intervals and the elapsed time measurement is perfectly accurate according to the RTOS master clock.
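As a sketch of what that behaviour looks like with a generic POSIX API (whether your RTOS call actually works this way is exactly what you need to check in its documentation), clock_nanosleep() with TIMER_ABSTIME wakes the thread at absolute deadlines, so per-iteration overhead does not accumulate as drift:

#include <time.h>

void periodic_task(void)
{
    const long period_ns = 1000000L;            /* 1 ms period, as in the question */
    struct timespec next;

    clock_gettime(CLOCK_MONOTONIC, &next);
    for (;;) {
        /* ... do the per-iteration work ... */

        next.tv_nsec += period_ns;              /* advance the absolute deadline */
        if (next.tv_nsec >= 1000000000L) {
            next.tv_nsec -= 1000000000L;
            next.tv_sec  += 1;
        }
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
}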
There could also be very good reasons, besides simplicity, for measuring elapsed time by the RTOS master clock instead of a more accurate wall-clock time. This is because all of the guarantees that an RTOS provides (which are the reason you are using an RTOS in the first place) are expressed in that time scale. The amount of time taken by one task can affect the amount of time you are guaranteed to have available for other tasks, as measured by this clock.
It may or may not be a problem that your RTOS master clock runs slow by 3 hours every 2 days...
I read http://linux.die.net/man/3/clock_gettime and http://www.guyrutenberg.com/2007/09/22/profiling-code-using-clock_gettime/comment-page-1/#comment-681578
They said to use this to measure how long it takes for a function to run:
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &stop_time);
I tried that in my program. When I run it, it says the function took 15 sec. But when I measure it with a stopwatch, it is 30 sec.
Can you please tell me why clock_gettime returns half of the actual time it took?
Thank you.
In a multi-process environment, processes are constantly migrating from CPU(s) to 'run queue(s)'.
When performance testing an application, it is often convenient to know the amount of time a process has been running on a CPU, while excluding time that the process was waiting for a CPU resource on a 'run queue'.
In the case of this question, where CPU-time is about half of REAL-time, it is likely that other processes were actively competing for CPU time while your process was also running. It appears that your process was fairly successful in acquiring roughly half the CPU resources during its run.
Instead of using CLOCK_PROCESS_CPUTIME_ID, you might consider using CLOCK_REALTIME.
For additional details, see: Understanding the different clocks of clock_gettime()
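If it helps, here is a tiny sketch of the difference: around a one-second sleep, CLOCK_REALTIME advances by roughly a second while CLOCK_PROCESS_CPUTIME_ID barely moves, because the process consumes almost no CPU while it is sleeping or waiting in the run queue.

#include <stdio.h>
#include <time.h>
#include <unistd.h>

static double elapsed(clockid_t clk, const struct timespec *start)
{
    struct timespec now;
    clock_gettime(clk, &now);
    return (now.tv_sec - start->tv_sec) + (now.tv_nsec - start->tv_nsec) / 1e9;
}

int main(void)
{
    struct timespec real0, cpu0;
    clock_gettime(CLOCK_REALTIME, &real0);
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &cpu0);

    sleep(1);   /* stand-in for the timed code section */

    printf("real: %.3f s, cpu: %.3f s\n",
           elapsed(CLOCK_REALTIME, &real0),
           elapsed(CLOCK_PROCESS_CPUTIME_ID, &cpu0));
    return 0;
}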
I have a function which calculates the BPM of a track from incoming data packets from a CDJ. Let's say the BPM is 124.45 beats per minute; how would I go about calling a function every 0.482 seconds (i.e. once per beat)? Would it be possible to set up another thread and use a timer?
Maybe have a look at high-precision timers, here, for which Apple claims 500-microsecond accuracy, which is 0.1% of your 500 (ish) millisecond requirement. You can minimise skew by reading the time at the start of your processing and calculating an offset to the next beat. Also, if you find you are often being scheduled late and missing beats, you can sleep for, say, 95% of the time to your next beat so the CPU can schedule something else, and then busy-wait for the last few percent so you don't hog the CPU. A sketch of this structure follows.
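Roughly like this, in plain POSIX C (on Apple platforms you would use the high-precision timer APIs mentioned above, but the structure is the same; on_beat() is a placeholder for your per-beat function):

#include <time.h>

#define NSEC_PER_SEC 1000000000LL

static long long now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
}

void beat_loop(double bpm, void (*on_beat)(void))
{
    const long long period = (long long)(60.0 / bpm * NSEC_PER_SEC);  /* ~482 ms at 124.45 BPM */
    long long next = now_ns() + period;

    for (;;) {
        long long remaining = next - now_ns();
        if (remaining > 0) {
            /* Sleep ~95% of the remaining time so the CPU can do other work... */
            struct timespec ts = { .tv_sec  = (remaining * 95 / 100) / NSEC_PER_SEC,
                                   .tv_nsec = (remaining * 95 / 100) % NSEC_PER_SEC };
            nanosleep(&ts, NULL);
            /* ...then busy-wait the last few percent to hit the deadline. */
            while (now_ns() < next)
                ;
        }
        on_beat();
        next += period;   /* schedule from the previous deadline, not from "now" */
    }
}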
I have been looking for this. I am using the Micro C/OS-II real-time operating system, and I couldn't find a way to create a delay apart from writing nested loops. Is there any way to create a delay?
OSTimeDly() will delay/sleep a task for a specified number of ticks. OSTimeDlyHMSM() will delay a specified number of hours, minutes, seconds, milliseconds.
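A short sketch of both calls inside a uC/OS-II task (the header name and the OS_TICKS_PER_SEC value depend on your port and configuration):

#include "ucos_ii.h"

void MyTask(void *pdata)
{
    (void)pdata;
    for (;;) {
        /* ... task work ... */

        OSTimeDly(OS_TICKS_PER_SEC / 10);   /* suspend for roughly 100 ms worth of ticks */
        /* or, expressed directly in hours/minutes/seconds/milliseconds: */
        OSTimeDlyHMSM(0, 0, 0, 100);        /* 0 h, 0 min, 0 s, 100 ms */
    }
}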
I'm trying to simulate a key-down and key-up action with a delay between them.
For example: 2638 milliseconds.
SendMessage(hWnd, WM_KEYDOWN, keyCode, 0);
Sleep(2638);
SendMessage(hWnd, WM_KEYUP, keyCode, 0);
How would you know if it really worked?
You wouldn't with this code, since accurately measuring the time that code takes to execute is a difficult task.
To get to the question posed by your question title (you should really ask one question at a time...) the accuracy of said functions is dictated by the operating system. On Linux, the system clock granularity is 10ms, so timed process suspension via nanosleep() is only guaranteed to be accurate to 10ms, and even then it's not guaranteed to sleep for exactly the time you specify. (See below.)
On Windows, the clock granularity can be changed to accommodate power management needs (e.g. decrease the granularity to conserve battery power). See MSDN's documentation on the Sleep function.
Note that with Sleep()/nanosleep(), the OS only guarantees that the process suspension will last for at least as long as you specify. The execution of other processes can always delay resumption of your process.
Therefore, the key-up event sent by your code above will be sent at least 2.638 seconds later than the key-down event, and not a millisecond sooner. But it would be possible for the event to be sent 2.7, 2.8, or even 3 seconds later. (Or much later if a realtime process grabbed hold of the CPU and didn't relinquish control for some time.)
Sleep works in terms of the standard Windows thread scheduling and is accurate only to about 20-50 milliseconds.
So it's OK for things tied to the user experience. However, it's absolutely inappropriate for real-time work.
Besides this, there are much better ways to simulate keyboard/mouse events. Please see SendInput.
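For completeness, a minimal sketch of the same key-down/key-up sequence using SendInput; unlike posting WM_KEYDOWN/WM_KEYUP to a single window, SendInput injects the events into the system input stream (the 2638 ms pause mirrors the question, and Sleep still only guarantees "at least" that long):

#include <windows.h>

void press_key_for(WORD vk, DWORD hold_ms)
{
    INPUT in[2] = {0};

    in[0].type = INPUT_KEYBOARD;          /* key down */
    in[0].ki.wVk = vk;

    in[1].type = INPUT_KEYBOARD;          /* key up */
    in[1].ki.wVk = vk;
    in[1].ki.dwFlags = KEYEVENTF_KEYUP;

    SendInput(1, &in[0], sizeof(INPUT));
    Sleep(hold_ms);                       /* suspends for at least hold_ms */
    SendInput(1, &in[1], sizeof(INPUT));
}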
The sleep() function will return before the desired delay when the requested delay is shorter than the time left until the next timer interrupt occurs. But this only indicates that you want to sleep for a shorter period than your system currently supports. It is advisable to set the multimedia timer resource to a higher interrupt frequency to obtain a better match between the observed sleep delay and the desired delay.
See the comments in the following threads:
How to get an accurate 1ms Timer Tick under WinXP
Sleep Less Than One Millisecond
The Sleep() call ensures that the thread is suspended for at least the amount of time given as its argument; the operating system does not guarantee that it will resume exactly when that time expires. For a detailed discussion you can refer to the post below:
how is sleep implemented at OS level?