Is QueryPerformanceCounter guaranteed to give you time since boot? - c

Is it safe to assume that the count returned from QueryPerformanceCounter relates to the time since the last system boot? Or could it be reset while the system is running? The MSDN article itself doesn't guarantee this, however I've seen some 3rd party information (such as this) that says that this is the case.

It's meant to be used for relative times. But I don't think it can be used to measure time since boot.
From what I hear, it's implemented using the rdtsc instruction which measures "pseudo" CPU cycles since the CPU was powered on. In that case, yes, it probably does give the time since boot, but I don't think this is specified.

Related

How to get CPU instruction count for a thread?

I know that getrusage() can provide per-thread CPU utilization, but only the time spent on the CPU. Is there any way to get the number of executed CPU instructions? Or the number of cycles spent on the cpu?
Basically, I need to find a reproducible measure of how much the thread spends on the CPU. Any suggestions to do this in C?
UPDATE (to respond to comments):
Ideally I'd need this in a platform independent way, but Linux would be the most useful.
Reproducibility is the most important for me, even if that means the actual runtime may be slightly different.
I know vTune (and have used it), but I'd like to have this info programmatically while my code is running. So vTune is out, as well as the suggestions made in the post linked by Craig Estey.
I did look at the Intel Intrinsics Guide, but did not find anything useful...
Take a look at google's filament engine. They are doing exactly that.
Look at their profiler.
https://github.com/google/filament/blob/master/libs/utils/src/Profiler.cpp
Also you can get more info from this link:
https://www.youtube.com/watch?v=Lcq_fzet9Iw

Full software timer : derivate time?

I have been asked a question but I am not sure if I answered it correctly.
"Is it possible to rely only on software timer?"
My answer was "yes, in theory".
But then I added:
"Just relying on hardware timer at the kernel loading (rtc) and then
software only is a mess to manage since we must be able to know
how many cpu cycles each instruction took + eventual cache miss +
branching cost + memory speed and put a counter after each one or
group (good luck with out-of-order cpu).
And do the calculation to derivate the current cpu cycle. That is
insane.
Not talking about the overall performance drop.
The best we could have is a brittle approximation of the time which
become more wrong over time. Even possibly on short laps."
But even if it seems logical to me, did my thinking go wrong?
Thanks
On current processors and hardware (e.g. Intel or AMD or ARM in laptops or desktops or tablets) with common operating systems (Linux, Windows, FreeBSD, MacOSX, Android, iOS, ...) processes are scheduled at random times. So cache behavior is non deterministic. Hence, instruction timing is non reproducible. You need some hardware time measurement.
A typical desktop or laptop gets hundreds, or thousands, of interrupts every second, most of them time related. Try running cat /proc/interrupts on a Linux machine twice, with a few seconds between the runs.
I guess that even with a single-tasked MS-DOS like operating system, you'll still get random behavior (e.g. induced by ACPI, or SMM). On some laptops, the processor frequency can be throttled by its temperature, which depends upon the CPU load and the external temperature...
In practice you really want to use some timer provided by the operating system. For Linux, read time(7)
So you practically cannot rely on a purely software timer. However, the processor has internal timers.... Even in principle, you cannot avoid timers on current processors ....
You might be able, if you can put your hardware in a very controlled environment (thermostatically) to run a very limited software (an OS-like free standing thing) sitting entirely in the processor cache and perhaps then get some determinism, but in practice current laptop or desktop (or tablet) hardware is non-deterministic and you cannot predict the time needed for a given small machine routine.
Timers are extremely useful in interesting (non-trivial) software, see e.g. J.Pitrat CAIA, a sleeping beauty blog entry for an interesting point. Also look at the many uses of watchdog timers in software (e.g. in the Parma Polyhedra Library)
Read also about Worst Case Execution Time (WCET).
So I would say that even in theory it is not possible to rely upon a purely software timer (unless of course that software uses the processor timers, which are hardware circuits). In the previous century (up to 1980s or 1990s) hardware was much more deterministic, and the amount of clock cycles or microsecond needed for each machine instruction was documented (but some instructions, e.g. division, needed a variable amount of time, depending on the actual data!).

Ultra-low latency programming on Linux, where to begin?

I heard there are some ways to modify linux such that an particular application can obtain very low latency such that whenver it ask resources, the OS will try to give it the resource as soon as possible, kind of overriding the default preemptive multitasking mechanism, I dont have a CS background, but the application I am working-on is very latency-sensitive, can anyone tell me are there any docs/stuff on this specific matter? many thanks.
Guaranteed low-latency response is called the real time capability. It means that timing goals that are realistic are guaranteed to be met.
There is a project for it called RTLinux. See the Real-Time Linux Wiki: https://rt.wiki.kernel.org/index.php/Main_Page
There are two real time models :
soft real time system - you get it by applying RT preempt kernel patches. I think it guaranties context switch within 10 ms. The goal of this project is to conform to hard real time requirements
hard real time system - have stricter guaranties (response of 1 ms). There are some libraries (like xenomai) that claim they provide hard real time system.

Is there a difference between a real time system and one that is just deterministic?

At work we're discussing the design of a new platform and one of the upper management types said it needed to run our current code base (C on Linux) but be real time because it needed to respond in less than a second to various inputs. I pointed out that:
That point doesn't mean it needs to be "real time" just that it needs a faster clock and more streamlining in its interrupt handling
One of the key points to consider is the OS that's being used. They wanted to stick with embedded Linux, I pointed out we need an RTOS. Using Linux will prevent "real time" because of the kernel/user space memory split thus I/O is done via files and sockets which introduce a delay
What we really need to determine is if it needs to be deterministic (needs to respond to input in <200ms 90% of the time for example).
Really in my mind if point 3 is true, then it needs to be a real time system, and then point 2 is the biggest consideration.
I felt confident answering, but then I was thinking about it later... What do others think? Am I on the right track here or am I missing something?
Is there any difference that I'm missing between a "real time" system and one that is just "deterministic"? And besides a RTC and a RTOS, am I missing anything major that is required to execute a true real time system?
Look forward to some great responses!
EDIT:
Got some good responses so far, looks like there's a little curiosity about my system and requirements so I'll add a few notes for those who are interested:
My company sells units in the 10s of thousands, so I don't want to go over kill on the price
Typically we sell a main processor board and an independent display. There's also an attached network of other CAN devices.
The board (currently) runs the devices and also acts as a webserver sending basic XML docs to the display for end users
The requirements come in here where management wants the display to be updated "quickly" (<1s), however the true constraints IMO come from the devices that can be attached over CAN. These devices are frequently motor controlled devices with requirements including "must respond in less than 200ms".
You need to distinguish between:
Hard realtime: there is an absolute limit on response time that must not be breached (counts as a failure) - e.g. this is appropriate for example when you are controlling robotic motors or medical devices where failure to meet a deadline could be catastrophic
Soft realtime: there is a requirement to respond quickly most of the time (perhaps 99.99%+), but it is acceptable for the time limit to be occasionally breached providing the response on average is very fast. e.g. this is appropriate when performing realtime animation in a computer game - missing a deadline might cause a skipped frame but won't fundamentally ruin the gaming experience
Soft realtime is readily achievable in most systems as long as you have adequate hardware and pay sufficient attention to identifying and optimising the bottlenecks. With some tuning, it's even possible to achieve in systems that have non-deterministic pauses (e.g. the garbage collection in Java).
Hard realtime requires dedicated OS support (to guarantee scheduling) and deterministic algorithms (so that once scheduled, a task is guaranteed to complete within the deadline). Getting this right is hard and requires careful design over the entire hardware/software stack.
It is important to note that most business apps don't require either: in particular I think that targeting a <1sec response time is far away from what most people would consider a "realtime" requirement. Having said that, if a response time is explicitly specified in the requirements then you can regard it as soft realtime with a fairly loose deadline.
From the definition of the real-time tag:
A task is real-time when the timeliness of the activities' completion is a functional requirement and correctness condition, rather than merely a performance metric. A real-time system is one where some (though perhaps not all) of the tasks are real-time tasks.
In other words, if something bad will happen if your system responds too slowly to meet a deadline, the system needs to be real-time and you will need a RTOS.
A real-time system does not need to be deterministic: if the response time randomly varies between 50ms and 150ms but the response time never exceeds 150ms then the system is non-deterministic but it is still real-time.
Maybe you could try to use RTLinux or RTAI if you have sufficient time to experiment with. With this, you can keep the non realtime applications on the linux, but the realtime applications will be moved to the RTOS part. In that case, you will(might) achieve <1second response time.
The advantages are -
Large amount of code can be re-used
You can manually partition realtime and non-realtime tasks and try to achieve the response <1s as you desire.
I think migration time will not be very high, since most of the code will be in linux
Just on a sidenote be careful about the hardware drivers that you might need to run on the realtime part.
The following architecture of RTLinux might help you to understand how this can be possible.
It sounds like you're on the right track with the RTOS. Different RTOSs prioritize different things either robustness or speed or something. You will need to figure out if you need a hard or soft RTOS and based on what you need, how your scheduler is going to be driven. One thing is for sure, there is a serious difference betweeen using a regular OS and a RTOS.
Note: perhaps for the truest real time system you will need hard event based resolution so that you can guarantee that your processes will execute when you expect them too.
RTOS or real-time operating system is designed for embedded applications. In a multitasking system, which handles critical applications operating systems must be
1.deterministic in memory allocation,
2.should allow CPU time to different threads, task, process,
3.kernel must be non-preemptive which means context switch must happen only after the end of task execution. etc
SO normal windows or Linux cannot be used.
example of RTOS in an embedded system: satellites, formula 1 cars, CAR navigation system.
Embedded System: System which is designed to perform a single or few dedicated functions.
The system with RTOS: also can be an embedded system but naturally RTOS will be used in the real-time system which will need to perform many functions.
Real-time System: System which can provide the output in a definite/predicted amount of time. this does not mean the real-time systems are faster.
Difference between both :
1.normal Embedded systems are not Real-Time System
2. Systems with RTOS are real-time systems.

<100uS accurate sleeps on Windows CE

Is it possible to sleep for an amount of time that will be accurate to less than 100 microseconds on Windows CE? The less jitter the better - ideally we'd like single digit microsecond response times.
What we really want is a 5ms timer with very low jitter - although the Windows CE WaitFor[Single|Multiple]Objects and Sleep APIs work in units of milliseconds, we can't then correct for the sub-ms time our code may take to run, so the cycle would gradually drift.
If this is not possible, that information would be very helpful too.
This MSDN article has some code to set up a 500us timer interrupt in WinCE, so it's absolutely possible.
If you aren't locked into your version of WinCE, you might want to look into Tenasys, who claims to offer an RTOS running side-by-side with Windows on standard hardware.
I've also heard good things about QNX, but I haven't used their products either. I do not believe it is Windows compatible in any way, however.
It's not possible. It's not even possible on desktops. Typical operating systems simply don't function in this manner.
If what you need is something to fire precisely every 4 milliseconds or whatever you're out of luck. If what you really need is something to fire precisely 250 times every second that may be more doable. If you're in need of the latter I can suggest an approach.
If your need to sleep is not a battery/thread-yield issue and just a matter of accurate timing, you can use the "performancecounter" in Windows CE devices. On XScale and Qualcomm CPUs, this is the internal chip timer and has sub 1-ms granularity. On older OMAP and Samsung processors, the performancecounter API is passing through the 1ms system tick and has lots of jitter.
L.B.
To correct the jitter you need access to a high resolution timer.
The CPU you have may have one. If not, the interrupt controller may.
The Easiest way is to use a Linux with realtime and WINE your way into that library.
you want a Periodic Thread.
Take a look at this report from NIST.

Resources