Timing/Clocks in the Linux Kernel

I am writing a device driver and want to benchmark a few pieces of code to get a feel for where I could be experiencing some bottlenecks. As a result, I want to time a few segments of code.
In userspace, I'm used to using clock_gettime() with CLOCK_MONOTONIC. Looking at the kernel sources (note that I am running kernel 4.4, but will be upgrading eventually), it appears I have a few choices:
getnstimeofday()
getrawmonotonic()
get_monotonic_coarse()
getboottime()
For convenience, I have written a function (see below) to get me the current time. I am currently using getrawmonotonic() because I figured this is what I wanted. My function returns the current time as a ktime_t, so then I can use ktime_sub() to get the elapsed time between two times.
static ktime_t get_time_now(void)
{
        struct timespec time_now;

        getrawmonotonic(&time_now);
        return timespec_to_ktime(time_now);
}
Given the available high resolution clocking functions (jiffies won't work for me), what is the best function for my given application? More generally, I'm interested in any/all documentation about these functions and the underlying clocks. Primarily, I am curious if the clocks are affected by any timing adjustments and what their epochs are.

Are you comparing measurements you're making in the kernel directly with measurements you've made in userspace? I'm wondering about your choice to use CLOCK_MONOTONIC_RAW as the timebase in the kernel, since you chose to use CLOCK_MONOTONIC in userspace. If you're looking for an analogous and non-coarse function in the kernel which returns CLOCK_MONOTONIC (and not CLOCK_MONOTONIC_RAW) time, look at ktime_get_ts().
It's possible you could also use raw kernel ticks for your measurement (rather than jiffies, which represent multiple kernel ticks), but I don't know how to do that off the top of my head.
In general if you're trying to find documentation about Linux timekeeping, you can take a look at Documentation/timers/timekeeping.txt. Usually when I try to figure out kernel timekeeping I also unfortunately just spend a lot of time reading through the kernel source in time/ (time/timekeeping.c is where most of the functions you're thinking of using right now live... it's not super well-commented, but you can probably wrap your head around it with a little bit of time). And if you're feeling altruistic after learning, remember that updating documentation is a good way to contribute to the kernel :)
To your question at the end about how clocks are affected by timing adjustments and what epochs are used:
CLOCK_REALTIME always starts at Jan 01, 1970 at midnight (colloquially known as the Unix Epoch) if no RTC is present and it hasn't already been set by a userspace application (or, if you want to be weird, a kernel module). Usually the userspace application which sets it is an NTP daemon: ntpd, chrony, or similar. Its value represents the number of seconds elapsed since 1970.
CLOCK_MONOTONIC represents the number of seconds elapsed since the device booted. If the device is suspended at a CLOCK_MONOTONIC value of x, it resumes with CLOCK_MONOTONIC still set to x. It's not supported on ancient kernels.
CLOCK_BOOTTIME is like CLOCK_MONOTONIC, but has time added to it across suspend/resume -- so if you suspend at a CLOCK_BOOTTIME value of x, for 5 seconds, you'll come back with a CLOCK_BOOTTIME value of x+5. It's not supported on old kernels (its support came about after CLOCK_MONOTONIC).
Full-fledged NTP daemons (not SNTP daemons -- a more lightweight, less accurate protocol) set the system clock, CLOCK_REALTIME, in two ways: settimeofday() for large adjustments ("steps" or "jumps"), which immediately change the value of CLOCK_REALTIME, and adjtime() for smaller adjustments ("slewing" or "skewing"), which change the rate at which CLOCK_REALTIME advances per CPU clock cycle. I think on some architectures you can actually tune the CPU clock rate through some means or other, and the kernel implements adjtime() that way where possible, but don't quote me on that. From the perspective of both the bulk of the kernel and of userspace, it doesn't actually matter.
CLOCK_MONOTONIC, CLOCK_BOOTTIME, and all other friends slew at the same rate as CLOCK_REALTIME, which is actually fairly convenient in most situations. They're not affected by steps in CLOCK_REALTIME, only by slews.
CLOCK_MONOTONIC_RAW, CLOCK_BOOTTIME_RAW, and friends do NOT slew at the same rate as CLOCK_REALTIME, CLOCK_MONOTONIC, and CLOCK_BOOTTIME. I guess this is useful sometimes.
Linux provides some process/thread-specific clocks to userspace (CLOCK_PROCESS_CPUTIME_ID, CLOCK_THREAD_CPUTIME_ID), which I know nothing about. I do not know if they're easily accessible in the kernel.

Related

Do CLOCK_MONOTONIC and CLOCK_MONOTONIC_COARSE have the same base?

The man page for clock_gettime() describes CLOCK_MONOTONIC_COARSE as:
A faster but less precise version of CLOCK_MONOTONIC. Use when you need very fast, but
not fine-grained timestamps.
What does it mean for one to be a "version of" the other?
Can I validly compare one to the other, assuming I truncate a CLOCK_MONOTONIC value to the same precision as the coarse one?
Here is the man page that lists the different "versions" of Posix/Linux clocks:
https://linux.die.net/man/2/clock_gettime
Sufficiently recent versions of glibc and the Linux kernel support the
following clocks:
CLOCK_REALTIME
System-wide clock that measures real (i.e., wall-clock) time.
Setting this clock requires appropriate privileges. This clock is
affected by discontinuous jumps in the system time (e.g., if the
system administrator manually changes the clock), and by the
incremental adjustments performed by adjtime(3) and NTP.
CLOCK_REALTIME_COARSE (since Linux 2.6.32; Linux-specific)
A faster but less precise version of CLOCK_REALTIME. Use when you
need very fast, but not fine-grained timestamps.
CLOCK_MONOTONIC
Clock that cannot be set and represents monotonic time since some
unspecified starting point. This clock is not affected by
discontinuous jumps in the system time (e.g., if the system
administrator manually changes the clock), but is affected by the
incremental adjustments performed by adjtime(3) and NTP.
CLOCK_MONOTONIC_COARSE (since Linux 2.6.32; Linux-specific)
A faster but less precise version of CLOCK_MONOTONIC. Use when you
need very fast, but not fine-grained timestamps.
CLOCK_MONOTONIC_RAW (since Linux 2.6.28; Linux-specific)
Similar to CLOCK_MONOTONIC, but provides access to a raw hardware-based time that is not subject to NTP adjustments or the
incremental adjustments performed by adjtime(3).
CLOCK_BOOTTIME (since Linux 2.6.39; Linux-specific)
Identical to CLOCK_MONOTONIC, except it also includes any time that the system is suspended. This allows applications to get a
suspend-aware monotonic clock without having to deal with the
complications of CLOCK_REALTIME, which may have discontinuities if the
time is changed using settimeofday(2).
CLOCK_PROCESS_CPUTIME_ID
High-resolution per-process timer from the CPU.
CLOCK_THREAD_CPUTIME_ID
Thread-specific CPU-time clock.
As you can see above, CLOCK_MONOTONIC_COARSE was introduced in Linux 2.6.32. Here is the rationale (and the specific source patch):
https://lwn.net/Articles/347811/
After talking with some application writers who want very fast, but not
fine-grained timestamps, I decided to try to implement a new clock_ids
to clock_gettime(): CLOCK_REALTIME_COARSE and CLOCK_MONOTONIC_COARSE
which returns the time at the last tick. This is very fast as we don't
have to access any hardware (which can be very painful if you're using
something like the acpi_pm clocksource), and we can even use the vdso
clock_gettime() method to avoid the syscall. The only trade off is you
only get low-res tick grained time resolution.
This isn't a new idea, I know Ingo has a patch in the -rt tree that
made the vsyscall gettimeofday() return coarse grained time when the
vsyscall64 sysctrl was set to 2. However this affects all applications
on a system.
With this method, applications can choose the proper speed/granularity
trade-off for themselves.
thanks
-john
ADDENDUM:
Q: What use cases might benefit from using CLOCK_MONOTONIC_COARSE or CLOCK_REALTIME_COARSE?
A: In the Linux 2.6.32 time frame (2010-2011), "...application workloads (especially databases and financial service applications) perform extremely frequent gettimeofday or similar time function calls":
Red Hat Enterprise Linux: 2.6. gettimeofday speedup
Many application workloads (especially databases and financial service
applications) perform extremely frequent gettimeofday or similar time
function calls. Optimizing the efficiency of these calls can provide
major benefits.
CLOCK_MONOTONIC_COARSE uses the same timebase as CLOCK_MONOTONIC (whereas CLOCK_MONOTONIC_RAW specifically does not): both apply wall_to_monotonic to a value derived from the timekeeper's xtime. The RAW clock uses a completely different time source.
Remember that CLOCK_MONOTONIC_COARSE is only updated once per tick (so usually about every 1 ms, but ask clock_getres() to be sure). If that accuracy is good enough, then by all means subtract your clock values.
The short answer is YES (at least for Linux!), you can compare them, compute delays, etc...
The precision would be that of the less precise, most probably COARSE one.
See this short program:
#include <time.h>
#include <stdio.h>

int main(void)
{
        int ret;
        struct timespec res;

        ret = clock_getres(CLOCK_MONOTONIC, &res);
        if (0 != ret)
                return ret;
        printf("CLOCK_MONOTONIC resolution is: %ld sec, %ld nsec\n",
               (long)res.tv_sec, (long)res.tv_nsec);

        ret = clock_getres(CLOCK_MONOTONIC_COARSE, &res);
        if (0 != ret)
                return ret;
        printf("CLOCK_MONOTONIC_COARSE resolution is: %ld sec, %ld nsec\n",
               (long)res.tv_sec, (long)res.tv_nsec);
        return 0;
}
On Ubuntu 20.04 (64-bit, kernel 5.4) it prints:
CLOCK_MONOTONIC resolution is: 0 sec, 1 nsec
CLOCK_MONOTONIC_COARSE resolution is: 0 sec, 4000000 nsec
So MONOTONIC reports nanosecond resolution, while COARSE reports 4-millisecond resolution.
Unlike the comment above, I would on the contrary recommend using the COARSE version whenever the timing accuracy you need allows it.
Clock calls are so frequent in user programs that they have a place in the vDSO.
With the COARSE versions you make exactly zero system calls, and a read costs only as long as your machine takes to run a few instructions: thanks to the vDSO, your program stays entirely in userland for the whole call.
With the other clock types you get a system call, and potentially a hardware access, so at least one switch into the kernel and back to userland.
This matters not at all if your program makes only a dozen calls, but it can be a huge time saver if the program leans heavily on the clock. That is why the vDSO exists in the first place: performance!
First define the accuracy you need for your timings: is a second enough, or do you need milliseconds, microseconds, and so on?
Keep in mind, unless you are tinkering with RT systems, that time is a relative value! Imagine you call clock_gettime() and, immediately after it returns, your thread is interrupted for some kernel business: what accuracy did you actually get? That is exactly the famous question that defeated HAL in 2001: A Space Odyssey: "what time is it?".
From that you can derive what type of clock you need.
You can mix MONOTONIC and its COARSE version and still compute delays or compare values (that was the original question); of course, the precision is that of the less precise one.
The monotonic clocks are best suited for measuring delays and making comparisons, since they do not depend on real time (what your watch displays) and do not change when the user sets the system time.
Conversely, if you need to display the (user-meaningful) time at which an event occurred, do not use a monotonic clock!

Measuring Elapsed Time In Linux (CLOCK_MONOTONIC vs. CLOCK_MONOTONIC_RAW)

I am currently trying to talk to a piece of hardware in userspace (underneath the hood, everything is using the spidev kernel driver, but that's a different story).
The hardware will tell me that a command has been completed by indicating so with a special value in a register, that I am reading from. The hardware also has a requirement to get back to me in a certain time, otherwise the command has failed. Different commands take different times.
As a result, I am implementing a way to set a timeout and then check for that timeout using clock_gettime(). In my "set" function, I take the current time and add the time interval I should wait for (usually this anywhere from a few ms to a couple of seconds). I then store this value for safe keeping later.
In my "check" function, I once again, get the current time and then compare it against the time I have saved. This seems to work as I had hoped.
Given my use case, should I be using CLOCK_MONOTONIC or CLOCK_MONOTONIC_RAW? I'm assuming CLOCK_MONOTONIC_RAW is better suited, since I am checking short intervals. I am worried that such a short interval might fall in a system-wide outlier, during which NTP was doing a lot of adjusting. Note that my target system runs Linux kernels 4.4 and newer.
Thanks in advance for the help.
Edited to add: given my use case, I need "wall clock" time, not CPU time. That is, I am checking to see if the hardware has responded in some wall clock time interval.
References:
Rutgers Course Notes
What is the difference between CLOCK_MONOTONIC & CLOCK_MONOTONIC_RAW?
Elapsed Time in C Tutorial

Avoid use of gettimeofday() API

gettimeofday() is hardware-dependent (it relies on the RTC).
Can someone suggest how to avoid using it in application programming?
How can we approach using system ticks instead?
Thanks in advance!
To get time in ticks you might like to use times().
However, it is not clear whether those ticks are measured from boot time.
From man times:
RETURN VALUE
times() returns the number of clock ticks that have elapsed since an
arbitrary point in the past. [...]
[...]
NOTES
On Linux, the "arbitrary point in the past" from which the return
value of times() is measured has varied across kernel versions. On
Linux 2.4 and earlier this point is the moment the system was booted.
Since Linux 2.6, this point is (2^32/HZ) - 300 (i.e., about 429
million) seconds before system boot time. This variability across
kernel versions (and across UNIX implementations), combined with the
fact that the returned value may overflow the range of clock_t, means
that a portable application would be wise to avoid using this value.
To measure changes in elapsed time, use clock_gettime(2) instead.
Reading this, using clock_gettime() with the CLOCK_BOOTTIME clock might be the safer and more portable way to go. Whether this function and/or clock is available on systems without an RTC, I'm not sure; others are encouraged to clarify this.

how to calculate the received packets rate on a linux based pc?like pps or fps

I am writing a network program which calculates an accurate data-packet rate (packets per second, frames per second, bps). I have a device called a testcenter which can send an accurate flow to a specific PC (the protocol is UDP/IP) running Linux, and I would like my program to measure an accurate pps (packets per second) rate. I have considered calling gettimeofday(&start, NULL) before recvfrom(), updating the packet counter, and then calling gettimeofday(&end, NULL) to derive the pps rate. I hope there is a better solution than this, since the user/kernel barrier is traversed on every system call.
Best regards.
I think you should use clock_gettime() with CLOCK_MONOTONIC_COARSE, but note it is only accurate up to the last tick, so it may be off by tens of milliseconds. It is definitely faster than using CLOCK_MONOTONIC_RAW, though. You can also use gettimeofday(), but clock_gettime() with CLOCK_MONOTONIC_RAW is slightly faster and has higher resolution than gettimeofday().
Also, gettimeofday() gives wall-clock time, which can change underneath you (even for daylight saving); I don't think you should use it to measure a traffic rate.
Your observation that gettimeofday() switches to kernel mode is incorrect for Linux on a few popular architectures, due to the use of vsyscalls, so gettimeofday() is clearly not a bad option here. You should, however, consider using a monotonic clock; see man 3 clock_gettime. Note that clock_gettime() has not yet been converted to a vsyscall on as many architectures as gettimeofday().
Beyond this option you may be able to set the SO_TIMESTAMP socket option and obtain precise timestamps via recvmsg.

Performance differences between nanosleep and /dev/rtc

I've been converting the main loop of an embedded linux program to run on a server and one of the nice to haves will be to run non-root. The program is in charge of periodically requesting an IO scan from a network device - every 2ms. Today I replaced the use of /dev/rtc with a call to nanosleep. In this particular case we can get away with plenty of latency but I'm wondering if the nanosleep call can be left in for the case when we're running on the embedded device with more strict timing requirements (which is the case for bigger projects). Is there a big difference in performance?
It depends on the Linux kernel version. From the time(7) manpage:
High-Resolution Timers
Before Linux 2.6.21, the accuracy of timer and sleep system calls (see below) was also limited by the
size of the jiffy.
Since Linux 2.6.21, Linux supports high-resolution timers (HRTs), optionally configurable via CON‐
FIG_HIGH_RES_TIMERS. On a system that supports HRTs, the accuracy of sleep and timer system calls is
no longer constrained by the jiffy, but instead can be as accurate as the hardware allows (microsec‐
ond accuracy is typical of modern hardware).
Note: the "jiffy" is the timer-tick period mentioned in the answer by "Greg". Also note that the system calls this refers to include nanosleep().
That is, if you have a recent enough kernel on your embedded target, nanosleep() should be good enough. If you have an older kernel version, you are indeed limited by the tick period, and in that case you might have problems, as 2 ms is quite close to the 1 ms tick period you get with CONFIG_HZ=1000.
The man page for nanosleep() suggests that kernel timers are used, making it sensitive to the value of HZ (the kernel tick rate). The tick is commonly 1000 Hz (1 ms), but on an embedded system this value may well differ; you can configure it with kernel parameters, depending on what timing sources you have available.
Since /dev/rtc uses its own interrupts, it doesn't tie you to HZ, so timing could be more precise. Of course, that also depends on your RTC hardware.
If in doubt, make a thin abstraction which lets you set a time and pass a callback so you can switch between implementations, and as always with embedded systems - measure it on the real device.
