<100 µs accurate sleeps on Windows CE - timer

Is it possible to sleep for an amount of time that will be accurate to less than 100 microseconds on Windows CE? The less jitter the better - ideally we'd like single digit microsecond response times.
What we really want is a 5 ms timer with very low jitter. The Windows CE WaitFor[Single|Multiple]Objects and Sleep APIs work in units of milliseconds, so we can't correct for the sub-millisecond time our code takes to run each cycle, and the cycle would gradually drift.
If this is not possible, that information would be very helpful too.

This MSDN article has some code to set up a 500us timer interrupt in WinCE, so it's absolutely possible.
If you aren't locked into your version of WinCE, you might want to look into Tenasys, which claims to offer an RTOS running side by side with Windows on standard hardware.
I've also heard good things about QNX, but I haven't used their products either. I don't believe it is Windows-compatible in any way, however.

It's not possible; it's not even possible on desktops. Typical operating systems simply don't function in this manner.
If what you need is something that fires precisely every 4 milliseconds or so, you're out of luck. If what you really need is something that fires precisely 250 times every second, that may be more doable, and if that's the case I can suggest an approach.
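One common way to get "precisely N times per second" without long-term drift is to schedule each iteration against an absolute deadline, so that any overshoot in one cycle shortens the next sleep instead of accumulating. Below is a rough sketch of that idea using only Sleep and GetTickCount, both available on Windows CE; since both are millisecond-granularity, this addresses drift but not sub-millisecond jitter, and RunAt250Hz is just an illustrative name.

    #include <windows.h>

    /* Call a function ~250 times per second without long-term drift.
       Each sleep is computed from an absolute deadline, so overshoot in
       one cycle shortens the next sleep instead of accumulating. */
    void RunAt250Hz(void (*callback)(void))
    {
        const DWORD periodMs = 4;                  /* 250 Hz */
        DWORD nextDeadline = GetTickCount() + periodMs;

        for (;;) {
            callback();

            DWORD now = GetTickCount();
            if ((LONG)(nextDeadline - now) > 0)    /* wrap-safe comparison */
                Sleep(nextDeadline - now);         /* sleep up to the deadline */
            /* otherwise we're late: skip the sleep and catch up */

            nextDeadline += periodMs;
        }
    }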

If your need to sleep is not a battery/thread-yield issue and just a matter of accurate timing, you can use the performance counter (QueryPerformanceCounter) on Windows CE devices. On XScale and Qualcomm CPUs this is backed by the internal chip timer and has sub-millisecond granularity. On older OMAP and Samsung processors, the performance counter API just passes through the 1 ms system tick and has lots of jitter.
L.B.
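For completeness, a sub-millisecond delay built on that counter is essentially a spin-wait. A minimal sketch, assuming the platform's OAL really does expose a fast counter as described above; SpinWaitMicroseconds is a hypothetical helper, and it burns CPU rather than yielding, so it is only appropriate for very short waits.

    #include <windows.h>

    /* Busy-wait for roughly the requested number of microseconds using
       the high-resolution performance counter. Accuracy depends entirely
       on what the platform's counter actually resolves. */
    void SpinWaitMicroseconds(DWORD usec)
    {
        LARGE_INTEGER freq, start, now;

        if (!QueryPerformanceFrequency(&freq) || freq.QuadPart == 0)
            return;  /* no usable high-resolution counter on this platform */

        QueryPerformanceCounter(&start);
        LONGLONG target = start.QuadPart + (freq.QuadPart * usec) / 1000000;

        do {
            QueryPerformanceCounter(&now);
        } while (now.QuadPart < target);
    }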

To correct the jitter you need access to a high-resolution timer.
The CPU you have may have one; if not, the interrupt controller may.
The easiest way is to use a Linux with real-time support and use WINE to get at that library.
What you want is a periodic thread.
Take a look at this report from NIST.

Related

What is the reference for timing calculations in Linux?

I want to understand how timers behave in Linux.
On microcontrollers, the reference for timers/counters is the execution time of machine instructions, so we can loop for however long we need to implement a sleep/timer/counter.
But in Linux, where does the reference come from, and how does it work, such that sleep(5) elapses exactly 5 seconds? If anyone knows, please clarify.
Every operating system kernel (that I know of) has a whole machine independent framework for timers. This is pretty much one of the most central things a kernel must have because we need timers for everything, process scheduling, dealing with hardware errors, select/poll timeouts, network protocols, etc. At any point in time your kernel has dozens, if not thousands of timers waiting to be executed at some point in the future. Most of them will be canceled and never executed.
The simplest framework that pretty much everyone uses sets up one of the many clocks in a machine to generate an interrupt at a set interval. 100 Hz is the most common; Windows (at least in the past) set it to 64 Hz (but it could be changed by any application); some systems experimented with 1024 Hz. The timer interrupt fires and the interrupt handler checks if there's anything queued up to do at that time and, if there is, it is executed. There has been some work for Linux to improve this so that we can get shorter or longer intervals than 10 ms depending on the next scheduled timer, both to improve the precision of the timers and to save power, but in general it works as described above.
If I understand your question correctly, you think that there is something that measures how long a certain sequence of instructions takes and then loops until the required amount of time has passed. That is almost never done, because it wastes power, it blocks anything else from running at the same time, and it is quite unreliable. It is still done in modern kernels, but very rarely and only when high precision is required while talking to really, really stupid hardware. The last time I had to do it was 17 years ago, to talk to some Ethernet controller where you had to implement MII manually by bit-banging in software; it was terrible and hung the system for quite a long time every time you (un-)plugged an Ethernet cable. Nobody builds hardware that requires this anymore because it really ruins the performance of modern systems.
So in your question, sleep(5) will be implemented by registering a function in the timer framework to be called in 5 seconds from now and then putting the process to sleep. 5 seconds later the timer fires and the process gets awakened again.
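To make the "register a function in the timer framework" step concrete, here is a minimal kernel-module sketch using the timer_list API (timer_setup/mod_timer, available in kernels 4.15 and later). sleep(5) in userspace is ultimately backed by the same kind of machinery, routed through the scheduler rather than a module callback.

    #include <linux/module.h>
    #include <linux/timer.h>
    #include <linux/jiffies.h>

    static struct timer_list my_timer;

    /* Runs in timer (softirq) context once the requested interval expires. */
    static void my_timer_callback(struct timer_list *t)
    {
        pr_info("timer fired at %lu jiffies\n", jiffies);
    }

    static int __init my_init(void)
    {
        timer_setup(&my_timer, my_timer_callback, 0);
        /* Ask the timer framework to call us back ~5 seconds from now. */
        mod_timer(&my_timer, jiffies + 5 * HZ);
        return 0;
    }

    static void __exit my_exit(void)
    {
        del_timer_sync(&my_timer);
    }

    module_init(my_init);
    module_exit(my_exit);
    MODULE_LICENSE("GPL");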

Full software timer: deriving time?

I have been asked a question but I am not sure if I answered it correctly.
"Is it possible to rely only on software timer?"
My answer was "yes, in theory".
But then I added:
"Just relying on hardware timer at the kernel loading (rtc) and then
software only is a mess to manage since we must be able to know
how many cpu cycles each instruction took + eventual cache miss +
branching cost + memory speed and put a counter after each one or
group (good luck with out-of-order cpu).
And do the calculation to derivate the current cpu cycle. That is
insane.
Not talking about the overall performance drop.
The best we could have is a brittle approximation of the time which
become more wrong over time. Even possibly on short laps."
But even if it seems logical to me, did my thinking go wrong?
Thanks
On current processors and hardware (e.g. Intel, AMD, or ARM in laptops, desktops, or tablets) with common operating systems (Linux, Windows, FreeBSD, MacOSX, Android, iOS, ...), processes are scheduled at essentially random times. So cache behavior is non-deterministic, and hence instruction timing is not reproducible. You need some hardware time measurement.
A typical desktop or laptop gets hundreds, or thousands, of interrupts every second, most of them time related. Try running cat /proc/interrupts on a Linux machine twice, with a few seconds between the runs.
I guess that even with a single-tasking, MS-DOS-like operating system, you'll still get random behavior (e.g. induced by ACPI or SMM). On some laptops, the processor frequency can be throttled by its temperature, which depends upon the CPU load and the ambient temperature...
In practice you really want to use some timer provided by the operating system. For Linux, read time(7).
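For example, measuring an interval from userspace usually comes down to clock_gettime on a monotonic clock; a minimal sketch (on older glibc versions you may need to link with -lrt):

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        struct timespec start, end;

        clock_gettime(CLOCK_MONOTONIC, &start);
        /* ... work being timed ... */
        clock_gettime(CLOCK_MONOTONIC, &end);

        long long ns = (end.tv_sec - start.tv_sec) * 1000000000LL
                     + (end.tv_nsec - start.tv_nsec);
        printf("elapsed: %lld ns\n", ns);
        return 0;
    }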
So in practice you cannot rely on a purely software timer. The processor does have internal timers, though, and even in principle you cannot avoid timers on current processors.
You might be able to get some determinism if you could put your hardware in a very controlled (thermostatically regulated) environment and run a very limited piece of software (an OS-like freestanding thing) that sits entirely in the processor cache, but in practice current laptop, desktop, or tablet hardware is non-deterministic, and you cannot predict the time needed for a given small machine routine.
Timers are extremely useful in interesting (non-trivial) software; see e.g. J. Pitrat's "CAIA, a sleeping beauty" blog entry for an interesting point. Also look at the many uses of watchdog timers in software (e.g. in the Parma Polyhedra Library).
Read also about Worst Case Execution Time (WCET).
So I would say that even in theory it is not possible to rely upon a purely software timer (unless, of course, that software uses the processor's timers, which are hardware circuits). In the previous century (up to the 1980s or 1990s) hardware was much more deterministic, and the number of clock cycles or microseconds needed for each machine instruction was documented (although some instructions, e.g. division, needed a variable amount of time depending on the actual data!).

Ultra-low latency programming on Linux, where to begin?

I've heard there are ways to modify Linux so that a particular application can obtain very low latency: whenever it asks for a resource, the OS will try to provide it as soon as possible, sort of overriding the default preemptive multitasking mechanism. I don't have a CS background, but the application I am working on is very latency-sensitive. Can anyone point me to any docs or other material on this specific subject? Many thanks.
Guaranteed low-latency response is called the real time capability. It means that timing goals that are realistic are guaranteed to be met.
There is a project for it called RTLinux. See the Real-Time Linux Wiki: https://rt.wiki.kernel.org/index.php/Main_Page
There are two real-time models:
soft real-time system - you get it by applying the RT preempt kernel patches. I think it guarantees a context switch within 10 ms. The goal of this project is to eventually conform to hard real-time requirements.
hard real-time system - has stricter guarantees (responses within about 1 ms). There are some libraries (like Xenomai) that claim to provide a hard real-time system.
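Whichever model you end up on, a latency-sensitive application usually also requests a real-time scheduling class and locks its memory so page faults don't add latency. A minimal sketch under those assumptions; the priority value 80 is an arbitrary example, and SCHED_FIFO requires root or CAP_SYS_NICE:

    #include <sched.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void)
    {
        struct sched_param sp;
        memset(&sp, 0, sizeof(sp));
        sp.sched_priority = 80;                /* 1..99, higher preempts lower */

        /* Ask the kernel to schedule this process with a real-time policy. */
        if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
            perror("sched_setscheduler");
            return 1;
        }

        /* Keep the working set resident so page faults don't add latency. */
        if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
            perror("mlockall");

        /* ... latency-sensitive work ... */
        return 0;
    }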

Nanosecond timing across kernel?

I am planning to write some software that talks directly to an FPGA network card, to catch incoming customised network packets.
Eventually I believe I will send the data obtained either to the kernel or to a user application. This is for a latency-critical trading research project.
What kind of nanosecond timing instruments could I use, given the accuracy required and the fact that I am timing the duration between reception at the PCI-E network card and receipt in the kernel?
This will be on Linux, with "driver" code (I may put the user application at this level to cut latency) written in C.
On Linux, access to the CPU clock tick is through the TSC, the equivalent of Windows' QueryPerformanceCounter.
clock_gettime uses HPET if available, which is simple and as good and as reliable as you can get.
If HPET is not available, you have no reliable timer at that scale anyway, so unluckily the resolution of clock_gettime will be worse, but that's just what it is, and there's not much you can do about it.
Any other source, including tsc, is either lower resolution or unreliable or both.
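If you still want to experiment with the TSC despite those caveats, a raw read on x86 with GCC or Clang looks like the sketch below. Converting ticks to nanoseconds requires calibrating the TSC frequency against a known clock, and on older CPUs the TSC may not be invariant across frequency changes or cores.

    #include <stdint.h>
    #include <stdio.h>
    #include <x86intrin.h>   /* __rdtsc() with GCC/Clang on x86 */

    int main(void)
    {
        uint64_t start = __rdtsc();
        /* ... code under measurement ... */
        uint64_t end = __rdtsc();

        /* Raw cycle count only; dividing by the (separately calibrated)
           TSC frequency would give seconds. */
        printf("elapsed: %llu TSC ticks\n", (unsigned long long)(end - start));
        return 0;
    }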
In software, everything happens in multiples of the system clock. I think you can use any time measurement function that returns the number of elapsed clock ticks; clock(), for example, should give you enough accuracy.

Microsecond (or one ms) time resolution on an embedded device (Linux Kernel)

I have a kernel module I've built that requires at least 1 ms time resolution. I currently use do_gettimeofday(), but I'm concerned that this won't work once I move my module to an embedded device. The device has a 180 MHz processor (MIPS) and the default HZ value in the kernel is 100, so using jiffies will give me at best 10 ms resolution. That won't cut it.
What I'd like to know is if do_gettimeofday() is based on the timer interrupt (HZ). Can it be guaranteed to provide at least 1 ms of resolution?
Thanks!
ms is not microsecond; it's millisecond. Without knowing more about your choice of device, no one can possibly answer such an implementation-dependent question as whether gettimeofday is based on the timer interrupt. If you have chosen a device, which mentioning the instruction set and clock speed suggests, then why don't you look at the implementation of that particular kernel to find out?
On an embedded device, it can't be guaranteed. Seeing as it's MIPS-based, it's probably OK; most MIPS machines have cycle counters. But you're going to have to read the source for that part of the kernel to see what it does on your platform.
Yes, you need to enable CONFIG_HIGH_RES_TIMERS in your kernel and make sure that your platform registers a clock_event_device. This is the mechanism that allows high-resolution timers to be exposed to userspace. You can check the resolution of your timers by calling clock_getres() in userspace.
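A quick way to see what your kernel actually delivers is to query the clock resolution from userspace; a small sketch (with high-resolution timers enabled, CLOCK_MONOTONIC typically reports 1 ns, whereas a jiffy-based clock at HZ=100 reports 10 ms):

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        struct timespec res;

        if (clock_getres(CLOCK_MONOTONIC, &res) != 0) {
            perror("clock_getres");
            return 1;
        }
        printf("CLOCK_MONOTONIC resolution: %ld s %ld ns\n",
               (long)res.tv_sec, res.tv_nsec);
        return 0;
    }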
