I'm trying to simulate a key down and key up action, with a delay between the two.
For example: 2638 milliseconds.
SendMessage(hWnd, WM_KEYDOWN, keyCode, 0);
Sleep(2638);
SendMessage(hWnd, WM_KEYUP, keyCode, 0);
How would you know if it really worked?
You wouldn't with this code, since accurately measuring the time that code takes to execute is a difficult task.
To get to the question posed by your title (you should really ask one question at a time...), the accuracy of these functions is dictated by the operating system. On Linux, the system clock granularity is 10ms, so timed process suspension via nanosleep() is only guaranteed to be accurate to within 10ms, and even then it's not guaranteed to sleep for exactly the time you specify. (See below.)
On Windows, the clock granularity can be changed to accommodate power management needs (e.g. decrease the granularity to conserve battery power). See MSDN's documentation on the Sleep function.
Note that with Sleep()/nanosleep(), the OS only guarantees that the process suspension will last for at least as long as you specify. The execution of other processes can always delay resumption of your process.
Therefore, the key-up event sent by your code above will be sent at least 2.638 seconds later than the key-down event, and not a millisecond sooner. But it would be possible for the event to be sent 2.7, 2.8, or even 3 seconds later. (Or much later if a realtime process grabbed hold of the CPU and didn't relinquish control for some time.)
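To get a feel for how large that overshoot is on your machine, here is a minimal sketch (it only measures the elapsed wall time around the Sleep call itself, not when the target window actually processes the key-up message):
#include <stdio.h>
#include <windows.h>
int main(void)
{
    LARGE_INTEGER freq, start, end;
    QueryPerformanceFrequency(&freq);   /* counts per second */
    QueryPerformanceCounter(&start);
    Sleep(2638);                        /* request a 2638 ms suspension */
    QueryPerformanceCounter(&end);
    double elapsedMs = (double)(end.QuadPart - start.QuadPart) * 1000.0
                       / (double)freq.QuadPart;
    printf("Requested 2638 ms, actually slept %.3f ms\n", elapsedMs);
    return 0;
}
On a lightly loaded machine the overshoot is typically a timer tick or two; under load it can grow considerably, exactly as described above.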
Sleep works in terms of the standard Windows thread scheduling. It is accurate to within about 20-50 milliseconds.
So it's OK for things that only affect the user experience. However, it's absolutely inappropriate for real-time work.
Besides this, there are much better ways to simulate keyboard/mouse events. Please see SendInput.
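As a rough sketch of that SendInput approach (the helper name and the hold time are mine; note that SendInput injects into whichever window currently has keyboard focus, unlike SendMessage to a specific hWnd):
#include <windows.h>
/* Press a key, hold it for roughly holdMs milliseconds, then release it.
   keyCode is a virtual-key code, e.g. 'A' or VK_SPACE. */
void PressKeyFor(WORD keyCode, DWORD holdMs)
{
    INPUT input = {0};
    input.type = INPUT_KEYBOARD;
    input.ki.wVk = keyCode;
    SendInput(1, &input, sizeof(INPUT));   /* key down */
    Sleep(holdMs);                         /* at least holdMs, possibly longer */
    input.ki.dwFlags = KEYEVENTF_KEYUP;
    SendInput(1, &input, sizeof(INPUT));   /* key up */
}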
The Sleep() function will return before the desired delay when the requested delay is shorter than the time left until the next timer interrupt occurs. But this only points out that you want to sleep for a shorter period than your system currently supports. It is advisable to set up the multimedia timer resource to a higher interrupt frequency to obtain a better match between the observed sleep delay and the desired delay (see the sketch below the linked threads).
See the comments in the following threads:
How to get an accurate 1ms Timer Tick under WinXP
Sleep Less Than One Millisecond
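On Windows, raising the multimedia timer's interrupt frequency, as suggested above, is typically done with timeBeginPeriod()/timeEndPeriod() from winmm (link against winmm.lib); a minimal sketch, where the function name is illustrative and the 20 ms Sleep is just an example:
#include <windows.h>
#include <mmsystem.h>
#pragma comment(lib, "winmm.lib")
void do_timed_work(void)
{
    timeBeginPeriod(1);   /* request 1 ms timer granularity */
    Sleep(20);            /* now tends to wake much closer to 20 ms */
    timeEndPeriod(1);     /* always undo the request with the same value */
}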
The Sleep() call ensures that the thread is suspended for at least the amount of time given as its argument; the operating system does not guarantee more than that. For a detailed discussion you can refer to the post below:
how is sleep implemented at OS level?
Related
I am writing a C program on Linux. I have a main thread which continuously updates the values of two variables, and another thread that writes those values to a file every 20 milliseconds. I have used usleep to achieve this time interval. Sample code is below.
main()
{
    /* ... */
    pthread_create(...write_file..); /* started another thread by passing the function write_file */
    while (variable1)
    {
        /* update the values of the variables */
    }
    return 0;
}
void write_file()
{
    /* ... */
    fp = fopen("sample.txt", "a");
    while (variable2)
    {
        fprintf(fp, " %d \n", somevariable);
        usleep(20 * 1000);  /* 20 ms */
    }
    fclose(fp);
}
Is it suitable to use the usleep function to achieve a 20 millisecond time interval, or should I use some other method, like a timer?
Is this usleep accurate enough? Does this sleep function affect the main thread in any way?
Using the sleep() family of functions often results in imprecise timing, especially when the process has many CPU-consuming threads and the required intervals are relatively small, like 20 ms. So you shouldn't assume that a *sleep() call blocks execution for exactly the specified time. In the situation described above, the actual sleep duration may be twice the specified value or even more (assuming the kernel is not a real-time one). As a result, you should implement some kind of compensation logic that adjusts the sleep duration for subsequent calls.
A more precise (but of course not ideal) approach is to use POSIX timers. See timer_create(). The most precise timers are the ones that use SIGEV_SIGNAL or SIGEV_THREAD_ID notifications (the latter is Linux-only). As the signal number you can use one of the real-time signals (SIGRTMIN to SIGRTMAX), but be aware that pthread implementations often use a few of these signals internally, so you should choose the actual number carefully. Also, doing anything in signal handler context requires extra attention, because not every library function may be used safely there. You can find the list of async-signal-safe functions here.
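A minimal sketch of such a POSIX timer, assuming SIGRTMIN as the chosen signal and a 20 ms period (the handler and variable names are illustrative; older glibc needs -lrt):
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
static volatile sig_atomic_t ticks = 0;
static void on_tick(int signo)
{
    (void)signo;
    ticks++;                        /* only async-signal-safe work in here */
}
int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGRTMIN, &sa, NULL);
    struct sigevent sev;
    memset(&sev, 0, sizeof(sev));
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo  = SIGRTMIN;
    timer_t timerid;
    timer_create(CLOCK_MONOTONIC, &sev, &timerid);
    struct itimerspec its;
    memset(&its, 0, sizeof(its));
    its.it_value.tv_nsec    = 20 * 1000 * 1000;   /* first expiry in 20 ms */
    its.it_interval.tv_nsec = 20 * 1000 * 1000;   /* then every 20 ms */
    timer_settime(timerid, 0, &its, NULL);
    while (ticks < 50)              /* run for roughly one second */
        pause();                    /* sleep until a signal arrives */
    printf("received %d ticks\n", (int)ticks);
    return 0;
}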
P.S. Also note that select() called with empty sets is a fairly portable way to sleep with subsecond precision.
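For instance, a small helper that sleeps for a given number of milliseconds via select() might look like this (the function name is illustrative):
#include <sys/select.h>
/* Sleep for roughly ms milliseconds using select() with empty fd sets. */
static void msleep_select(long ms)
{
    struct timeval tv;
    tv.tv_sec  = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    select(0, NULL, NULL, NULL, &tv);   /* no fds: just wait for the timeout */
}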
Sleeping: sleep() and usleep()
Now, let me start with the easier timing calls. For delays of multiple seconds, your best bet is probably to use sleep(). For delays of at least tens of milliseconds (about 10 ms seems to be the minimum delay), usleep() should work. These functions give the CPU to other processes ("sleep"), so CPU time isn't wasted. See the manual pages sleep(3) and usleep(3) for details.
For delays of under about 50 milliseconds (depending on the speed of your processor and machine, and the system load), giving up the CPU takes too much time, because the Linux scheduler (for the x86 architecture) usually takes at least about 10-30 milliseconds before it returns control to your process. Due to this, in small delays, usleep(3) usually delays somewhat more than the amount that you specify in the parameters, and at least about 10 ms.
nanosleep()
In the 2.0.x series of Linux kernels, there is a new system call, nanosleep() (see the nanosleep(2) manual page), that allows you to sleep or delay for short times (a few microseconds or more).
For delays <= 2 ms, if (and only if) your process is set to soft real time scheduling (using sched_setscheduler()), nanosleep() uses a busy loop; otherwise it sleeps, just like usleep().
The busy loop uses udelay() (an internal kernel function used by many kernel drivers), and the length of the loop is calculated using the BogoMips value (the speed of this kind of busy loop is one of the things that BogoMips measures accurately). See /usr/include/asm/delay.h for details on how it works.
Source: http://tldp.org/HOWTO/IO-Port-Programming-4.html
Try using nanosleep() instead of usleep(); it should be more accurate for a 20 ms interval.
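A minimal sketch of a 20 ms nanosleep() call that resumes the sleep if it is interrupted by a signal (the wrapper name is illustrative):
#include <errno.h>
#include <time.h>
static void sleep_20ms(void)
{
    struct timespec req = { 0, 20 * 1000 * 1000 };  /* 20 ms in nanoseconds */
    struct timespec rem;
    while (nanosleep(&req, &rem) == -1 && errno == EINTR)
        req = rem;                                  /* sleep for the remaining time */
}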
I have a small program running on Linux (on an embedded PC, dual-core Intel Atom 1.6GHz with Debian 6 running Linux 2.6.32-5) which communicates with external hardware via an FTDI USB-to-serial converter (using the ftdi_sio kernel module and a /dev/ttyUSB* device). Essentially, in my main loop I run
clock_gettime() using CLOCK_MONOTONIC
select() with a timeout of 8 ms
clock_gettime() as before
Output the time difference of the two clock_gettime() calls
To have some level of "soft" real-time guarantees, this thread runs as SCHED_FIFO with maximum priority (showing up as "RT" in top). It is the only thread in the system running at this priority, no other process has such priorities. My process has one other SCHED_FIFO thread with a lower priority, while everything else is at SCHED_OTHER. The two "real-time" threads are not CPU bound and do very little apart from waiting for I/O and passing on data.
The kernel I am using has no RT_PREEMPT patches (I might switch to that patch in the future). I know that if I want "proper" realtime, I need to switch to RT_PREEMPT or, better, Xenomai or the like. But nevertheless I would like to know what is behind the following timing anomalies on a "vanilla" kernel:
Roughly 0.03% of all select() calls are timed at over 10 ms (remember, the timeout was 8 ms).
The three worst cases (out of over 12 million calls) were 31.7 ms, 46.8 ms and 64.4 ms.
All of the above happened within 20 seconds of each other, and I think some cron job may have been interfering (although the system logs are low on information apart from the fact that cron.daily was being executed at the time).
So, my question is: What factors can be involved in such extreme cases? Is this just something that can happen inside the Linux kernel itself, i.e. would I have to switch to RT_PREEMPT, or even a non-USB interface and Xenomai, to get more reliable guarantees? Could /proc/sys/kernel/sched_rt_runtime_us be biting me? Are there any other factors I may have missed?
Another way to put this question is, what else can I do to reduce these latency anomalies without switching to a "harder" realtime environment?
Update: I have observed a new, "worse worst case" of about 118.4 ms (once over a total of around 25 million select() calls). Even when I am not using a kernel with any sort of realtime extension, I am somewhat worried by the fact that a deadline can apparently be missed by over a tenth of a second.
Without more information it is difficult to point to something specific, so I am just guessing here:
Interrupts and code that is triggered by interrupts take so much time in the kernel that your real time thread is significantly delayed. This depends on the frequency of interrupts, which interrupt handlers are involved, etc.
Without kernel preemption, a lower-priority thread that is executing inside the kernel will not be preempted until it yields the CPU or leaves the kernel.
As pointed out in this SO answer, CPU System Management Interrupts and Thermal Management can also cause significant time delays (up to 300ms were observed by the poster).
118 ms seems like quite a lot for a 1.6 GHz CPU. But one driver that accidentally locks the CPU for some time would be enough. If you can, try to disable some drivers or use different driver/hardware combinations.
sched_rt_period_us and sched_rt_runtime_us should not be a problem if they are set to reasonable values and your code behaves as you expect. Still, I would remove the limit for RT threads and see what happens.
What else can you do? Write a device driver! It's not that difficult and interrupt handlers get a higher priority than realtime threads. It may be easier to switch to a real time kernel but YMMV.
I am writing a Gif animator in C.
I have two threads running in parallel. The first allows the user to alter the speed of the animation. The second draws the current frame and then calls Sleep(Constant * 100 / CurrentSpeed), where CurrentSpeed is a percentage, ranging from 1 to 200.
The problem is that if you quickly change the speed from 100% to 1%, and then back to 100%, the second thread will execute the following:
Sleep(Constant * 100)
This will draw frame A, wait many seconds (even though the speed was changed by the user), and only then draw B and the following frames at the default speed.
It seems to me that Sleep is a poor choice of mine in this case. What can I do to solve this problem?
EDIT:
The code I currently have (Simplified):
while (1) {
    InvalidateRect(Handle, &ImageRect, FALSE);
    if (shouldDispose) {
        break;
    }
    if (DelayTime)
        Sleep(DelayTime * 100 / CurrentSpeed);
    SelectNextImage();
}
Instead of calling Sleep() with the desired frame rate, why don't you call it with a constant interval of 1 ms, for example, and use a variable as a counter?
For example, let C be a global variable (counter) which is loaded with a number of 'ticks' of 1ms. Then, write the loop:
while (1) {                  // Main loop of the player thread
    if (C > 0) C--;
    if (C == 0) nextframe(); // if the counter reaches 0, load the next frame
    Sleep(1);
}
The control thread would load C with a number of 1 ms ticks (i.e. the frame delay in milliseconds), and the player thread will never be stopped for more than 1 ms. The use of 1 ms as the base rate is arbitrary: use the minimum time that still allows the maximum frame rate, in order to load the CPU as little as possible.
EDIT
After some heated comments (arguing is good, after all), I'd like to point out that this solution is sub-optimal: it doesn't use any OS mechanism for signaling threads, or any other API for preventing the thread from wasting CPU time. The solution shown here is generic: it may be used on any system (even on embedded systems without a running OS). But above all, it is based on the original code posted by the user who asked the question: using Sleep(), how can I achieve my purpose? I give him my humble answer. Anyway, I encourage other people to write sample code using the appropriate API to achieve the same goal. With no hard feelings, special thanks to Martin James.
Find a synchronization API on your OS that allows a wait with a timeout, e.g. WaitForSingleObject() on Windows. If you want to change the delay, change the timeout and signal the event upon which the WFSO is waiting, to make it return 'early' and restart the wait with the new timeout.
Polling with Sleep(1) loops is rarely justifiable.
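A minimal sketch of that pattern, reusing the names from the question's loop (their exact types are assumed here) plus a hypothetical auto-reset event hSpeedChanged that the UI thread signals with SetEvent() whenever it changes CurrentSpeed:
#include <windows.h>
/* From the question's code; exact types assumed. */
extern HWND  Handle;
extern RECT  ImageRect;
extern BOOL  shouldDispose;
extern DWORD DelayTime, CurrentSpeed;
void SelectNextImage(void);
/* Assumed: created elsewhere as CreateEvent(NULL, FALSE, FALSE, NULL). */
extern HANDLE hSpeedChanged;
void PlayerLoop(void)
{
    while (!shouldDispose) {
        InvalidateRect(Handle, &ImageRect, FALSE);
        DWORD timeout = DelayTime ? DelayTime * 100 / CurrentSpeed : 0;
        DWORD r = WaitForSingleObject(hSpeedChanged, timeout);
        if (r == WAIT_TIMEOUT)      /* the full delay elapsed at the old speed */
            SelectNextImage();
        /* WAIT_OBJECT_0 means the speed changed: loop around and recompute
           the timeout instead of finishing the old, possibly long, wait */
    }
}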
Create a waitable timer. When you set the timer, you can specify a callback function that will run in the setting thread's context. This means you can do it with two threads, but it actually works just fine with only a single thread as well.
The main advantage of a waitable timer is, however, that it is more accurate and more reliable than Sleep. A timer is conceptually quite different from Sleep insofar as Sleep only gives up control; the scheduler marks the thread as ready to run when the time is up and the scheduler happens to run anyway. It doesn't do anything beyond that, which means that the thread will eventually be scheduled to run again, like any other thread that is ready.
A thread that is waiting on a timer (or other waitable object) causes the scheduler to run when the timer is up and has its priority temporarily boosted. It therefore runs not only more reliably and more closely to the desired time, but also earlier than all other threads with the same base priority. Which does not give a realtime guarantee but at least gives a sort of "soft guarantee".
If you still want to use Sleep, use SleepEx instead, which you can alert either by queueing an APC or by calling the undocumented NtAlertThread function.
In any case, Sleep is troublesome not only because it is unreliable, but also because it is based on the granularity of the system-wide timer. You can, of course, set that granularity to as low as 1 ms (or less on some systems), but that will cause a lot of unnecessary interrupts.
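For reference, a minimal sketch of the waitable-timer approach described above (the relative due time is specified in negative 100-nanosecond units; 20 ms is just an example period):
#include <stdio.h>
#include <windows.h>
int main(void)
{
    HANDLE hTimer = CreateWaitableTimer(NULL, TRUE, NULL);  /* manual-reset */
    if (!hTimer) return 1;
    LARGE_INTEGER due;
    due.QuadPart = -20LL * 10000;       /* relative time: 20 ms in 100 ns units */
    for (int i = 0; i < 10; i++) {
        SetWaitableTimer(hTimer, &due, 0, NULL, NULL, FALSE);
        WaitForSingleObject(hTimer, INFINITE);   /* wakes close to 20 ms later */
        printf("tick %d\n", i);
    }
    CloseHandle(hTimer);
    return 0;
}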
I'm trying to determine the granularity I can accurately schedule tasks to occur in C/C++. At the moment I can reliably schedule tasks to occur every 5 microseconds, but I'm trying to see if I can lower this further.
Any advice on how to achieve this / if it is possible would be greatly appreciated.
Since I know timer granularity can often be OS dependent: I am currently running on Linux, but would use Windows if the timing granularity is better (although I don't believe it is, based on what I've found for the QueryPerformanceCounter)
I execute all measurements on bare-metal (no VM). /proc/timer_info confirms nanosecond timer resolution for my CPU (but I know that doesn't translate to nanosecond alarm resolution)
Current
My current code can be found as a Gist here
At the moment, I'm able to execute a request every 5 microseconds (5000 nanoseconds) with less than 1% late arrivals. When late arrivals do occur, they are typically only one cycle (5000 nanoseconds) behind.
I'm doing 3 things at the moment
Setting the process to real-time priority (as pointed out by @Spudd86 here)
struct sched_param schedparm;
memset(&schedparm, 0, sizeof(schedparm));
schedparm.sched_priority = 99; // highest rt priority
sched_setscheduler(0, SCHED_FIFO, &schedparm);
Minimizing the timer slack
prctl(PR_SET_TIMERSLACK, 1);
Using timerfds (part of the 2.6 Linux kernel)
int timerfd = timerfd_create(CLOCK_MONOTONIC, 0);
struct itimerspec timspec;
bzero(&timspec, sizeof(timspec));
timspec.it_interval.tv_sec = 0;
timspec.it_interval.tv_nsec = nanosecondInterval;  /* period between expirations */
timspec.it_value.tv_sec = 0;
timspec.it_value.tv_nsec = 1;                      /* arm the timer (almost) immediately; 0 would disarm it */
timerfd_settime(timerfd, 0, &timspec, 0);
Possible improvements
Dedicate a processor to this process?
Use a nonblocking timerfd so that I can create a tight loop, instead of blocking (tight loop will waste more CPU, but may also be quicker to respond to an alarm)
Using an external embedded device for triggering (can't imagine why this would be better)
Why
I'm currently working on creating a workload generator for a benchmarking engine. The workload generator simulates an arrival rate (X requests / second, etc.) using a Poisson process. From the Poisson process, I can determine the relative times at which requests must be made from the benchmarking engine.
So for instance, at 10 requests a second, we may have requests made at:
t = 0.02, 0.04, 0.05, 0.056, 0.09 seconds
These requests need to be scheduled in advance and then executed. As the number of requests per second increases, the granularity required for scheduling these requests increases (thousands of requests per second requires sub-millisecond accuracy). As a result, I'm trying to figure out how to scale this system further.
You're very close to the limits of what vanilla Linux will offer you, and it's way past what it can guarantee. Adding the real-time patches to your kernel and tuning for full pre-emption will help give you better guarantees under load. I would also remove any dynamic memory allocation from your time-critical code; malloc and friends can (and will) stall for a not-inconsequential (in a real-time sense) period of time if they have to reclaim memory from the I/O cache. I would also consider removing swap from that machine to help guarantee performance. Dedicating a processor to your task will help to prevent context-switch times but, again, it's no guarantee.
I would also suggest that you be careful with that level of sched_priority, you're above various important bits of Linux there, which can lead to very strange effects.
What you gain from building a realtime kernel is more reliable guarantees (ie lower maximum latency) of the time between an IO/timer event handled by the kernel, and control being passed to your app in response. This comes at the price of lower throughput, and you might notice an increase in your best-case latency times.
However, the only reason for using OS timers to schedule events with high precision is if you're afraid of burning CPU cycles in a loop while you wait for your next due event. OS timers (especially in MS Windows) are not reliable for high-granularity timing events, and are very dependent on the sort of timing/HPET hardware available in your system.
When I require highly accurate event scheduling, I use a hybrid method. First, I measure the worst case latency - that is, the biggest difference between the time I requested to sleep, and the actual clock time after sleeping. Let's call this difference "D". (You can actually do this on-the-fly during normal running, by tracking "D" every time you sleep, with something like "D = (D*7 + lastD) / 8" to produce a temporal average).
Then never request to sleep beyond "N - D*2", where "N" is the time of the next event. When within "D*2" time of the next event, enter a spin loop and wait for "N" to occur.
This eats a lot more CPU cycles, but depending on the accuracy you require, you might be able to get away with a "sched_yield()" in your spin loop, which is more kind to your system.
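A minimal sketch of that hybrid scheme on Linux, assuming CLOCK_MONOTONIC timestamps in nanoseconds and an already-measured worst-case latency D (the function names are illustrative):
#include <sched.h>
#include <stdint.h>
#include <time.h>
static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
/* Wait until the absolute time target_ns: sleep while we are far away,
   then spin for the final 2*D nanoseconds (D = measured worst-case latency). */
static void wait_until(int64_t target_ns, int64_t D)
{
    int64_t remaining = target_ns - now_ns();
    if (remaining > 2 * D) {
        struct timespec req;
        req.tv_sec  = (remaining - 2 * D) / 1000000000LL;
        req.tv_nsec = (remaining - 2 * D) % 1000000000LL;
        nanosleep(&req, NULL);          /* coarse sleep, may overshoot by up to ~D */
    }
    while (now_ns() < target_ns)
        sched_yield();                  /* spin (gently) for the last stretch */
}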
I have a small program that needs to be run in a small Linux embedded system (ARM). It is written in C. It needs to poll some data (2x64-bit) from an API provided by the system manufacturer, and then do some calculations and send the data through the network. The data should be polled around 30 times every second (30Hz).
What would be the best way to do it in C? I've seen solutions using sleep(), but it does not seem to be the best option for the job.
I suggest considering the poll(2) multiplexing syscall to do the polling.
Notice that while poll is waiting for input, it does not consume any CPU.
If the processing of each event takes some significant time (e.g. a millisecond or more) you may want to recompute the delay.
You could use timerfd_create(2) (and give both your device file descriptor and your timer fd to poll). See also timer_create(2)...
Perhaps clock_gettime(2) could be useful.
And reading time(7) is definitely useful. Perhaps also the Advanced Linux Programming book.
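A minimal sketch of a 30 Hz loop built on timerfd and poll (read_sensor() and send_over_network() are placeholders for the manufacturer's API and the network code, not real functions):
#include <poll.h>
#include <stdint.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>
int main(void)
{
    int tfd = timerfd_create(CLOCK_MONOTONIC, 0);
    struct itimerspec its;
    its.it_value.tv_sec     = 0;
    its.it_value.tv_nsec    = 1;                  /* arm the timer (almost) immediately */
    its.it_interval.tv_sec  = 0;
    its.it_interval.tv_nsec = 1000000000L / 30;   /* ~33.3 ms: 30 expirations per second */
    timerfd_settime(tfd, 0, &its, NULL);
    struct pollfd pfd = { .fd = tfd, .events = POLLIN };
    for (;;) {
        poll(&pfd, 1, -1);              /* blocks without consuming CPU until a tick */
        uint64_t expirations;
        read(tfd, &expirations, sizeof expirations);   /* how many ticks have elapsed */
        /* read_sensor();         placeholder for the manufacturer's polling API */
        /* send_over_network();   placeholder for the calculation + network send  */
    }
}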
sleep() suspends execution in whole seconds; if you are looking for a more accurate sleep()-like function, use usleep(), which suspends execution in microseconds, or nanosleep(), which works in nanoseconds.