How to perform arithmetic on fractions in a Linux kernel module (C)

I want to perform arithmetic on fractions, and I know that you can't perform floating-point arithmetic in kernel code, and I understand the reasons why the kernel doesn't allow that. What I am trying to do exactly is to load a module and calculate the elapsed time in seconds between the moment the kernel module is loaded and the moment it is removed. I know how to do that using the value of jiffies and HZ like this:
#include <linux/jiffies.h>
#include <linux/module.h>
unsigned long int first_jiff;
int start_init(void){
first_jiff = jiffies;
printk(KERN_INFO "loading kernel module\n");
return 0;
}
void exit_init(void){
float elapsed_seconds;
//calculate the difference between first value of jiffies and the current one
first_jiff = jiffies - first_jiff;
elapsed_seconds = (float)(first_jiff / HZ);
printk(KERN_INFO "elapsed_time:%f", elapsed_seconds);
}
module_init(start_init);
module_exit(exit_init);
but of course, I get this error
/include/linux/printk.h:464:44: error: SSE register return with SSE disabled
Is there any way to get around this?

You don't need floating point types here. Just do the division, which will be integer division, and store the result in an integer.
void exit_init(void){
unsigned long elapsed_seconds;
//calculate the difference between first value of jiffies and the current one
first_jiff = jiffies - first_jiff;
elapsed_seconds = first_jiff / HZ;
printk(KERN_INFO "elapsed_time:%lu", elapsed_seconds);
}
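If you do want sub-second output, you can still stay with integer arithmetic by printing the whole seconds and a scaled remainder separately. A minimal sketch, reusing first_jiff from the code above:
void exit_init(void){
    unsigned long whole, hundredths;
    /* Elapsed jiffies since module load. */
    first_jiff = jiffies - first_jiff;
    /* Whole seconds, plus the remainder scaled to hundredths of a second. */
    whole = first_jiff / HZ;
    hundredths = (first_jiff % HZ) * 100 / HZ;
    printk(KERN_INFO "elapsed_time:%lu.%02lu\n", whole, hundredths);
}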

Related

How is time(time_t *arg) rounding milliseconds or second fractions?

I need to know what
time(NULL)
will return when the current system time is:
1997-07-16T19:20:30.49+01:00
or
1997-07-16T19:20:30.50+01:00
If time() reports a count of seconds (the most common time_t), it will certainly truncate, else it could report that now is tomorrow.
C does not specify this, yet it is the only reasonable implementation of time().
See extensions like gettimeofday that explicitly provide the fraction and time_t.
A relatively new standard C function, int timespec_get(struct timespec *ts, int base), does provide guidance that likely applies to time() as well: the fractional part of now is truncated.
If base is TIME_UTC, the tv_sec member is set to the number of seconds since an implementation defined epoch, truncated to a whole value and the tv_nsec member is set to the integral number of nanoseconds, rounded to the resolution of the system clock.
Sample usage:
#include <stdio.h>
#include <time.h>
void print_now(void) {
struct timespec ts = { 0 };
int base = timespec_get(&ts, TIME_UTC);
if (base) {
printf("%lld.%09ld\n", (long long) ts.tv_sec, ts.tv_nsec);
}
}
The standard doesn't specify the exact behavior, but the most likely behavior is to truncate any fractional seconds.
As an example, I ran the following on CentOS7:
#include <stdio.h>
#include <time.h>
#include <sys/time.h>

int main(void) {
time_t t;
struct timeval tv;
int i;
for (i=0;i<500000;i++) {
t = time(NULL);
gettimeofday(&tv, NULL);
printf("t=%ld, tv=%ld.%06ld\n", (long) t, (long) tv.tv_sec, (long) tv.tv_usec);
}
return 0;
}
Which outputted the following when the second incremented:
t=1515099481, tv=1515099481.990469
t=1515099481, tv=1515099481.990469
t=1515099481, tv=1515099481.990470
t=1515099481, tv=1515099481.990470
t=1515099481, tv=1515099481.990470
t=1515099482, tv=1515099482.003241
t=1515099482, tv=1515099482.003250
t=1515099482, tv=1515099482.003250
t=1515099482, tv=1515099482.003251
time() will accept NULL (it simply does not store the result anywhere) and return the time in seconds since the start of the epoch, which is 1 January 1970.
Also, time_t is a type, not the same thing as the time function. Milliseconds will be truncated, and the granularity of the timer depends on the system settings as well as the hardware granularity.
Some systems use interpolation to get soft time beyond whole seconds.
If talking about time in Linux, see http://man7.org/linux/man-pages/man7/time.7.html, where you will find a clear distinction between hardware real time and soft time. Also read up on jiffies, and on how the hardware clock is set.
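As a small user-space illustration of the NULL-argument point above (a sketch; both values should match to the whole second):
#include <stdio.h>
#include <time.h>

int main(void) {
    time_t via_return = time(NULL);   /* NULL: result only via the return value */
    time_t via_arg;
    time(&via_arg);                   /* result also stored through the pointer */
    printf("%lld %lld\n", (long long) via_return, (long long) via_arg);
    return 0;
}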

Using Time stamp counter to get the time stamp

I have used the below code to get the clock cycle of the processor
unsigned long long rdtsc(void)
{
unsigned hi, lo;
__asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
return ( (unsigned long long)lo)|( ((unsigned long long)hi)<<32 );
}
I get some value, say 43, but what is the unit here? Is it in microseconds or nanoseconds?
I used the command below to get the frequency of my board.
cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
1700000
I also used the command below to find my processor speed:
dmidecode -t processor | grep "Speed"
Max Speed: 3700 MHz
Current Speed: 3700 MHz
Now how do I use the above frequency to convert that value to microseconds or milliseconds?
A simple answer to the stated question, "how do I convert the TSC frequency to microseconds or milliseconds?" is: You do not. What the TSC (Time Stamp Counter) clock frequency actually is, varies depending on the hardware, and may vary during runtime on some. To measure real time, you use clock_gettime(CLOCK_REALTIME) or clock_gettime(CLOCK_MONOTONIC) in Linux.
As Peter Cordes mentioned in a comment (Aug 2018), on most current x86-64 architectures the Time Stamp Counter (accessed by the RDTSC instruction and the __rdtsc() function declared in <x86intrin.h>) counts reference clock cycles, not CPU clock cycles. His answer to a similar question in C++ is valid for C also in Linux on x86-64, because the compiler provides the underlying built-in when compiling C or C++, and the rest of the answer deals with the hardware details. I recommend reading that one, too.
The rest of this answer assumes the underlying issue is microbenchmarking code, to find out how two implementations of some function compare to each other.
On x86 (Intel 32-bit) and x86-64 (AMD64, Intel and AMD 64-bit) architectures, you can use __rdtsc() from <x86intrin.h> to find out the number of TSC clock cycles elapsed. This can be used to measure and compare the number of cycles used by different implementations of some function, typically a large number of times.
Do note that there are hardware differences as to how the TSC clock is related to the CPU clock. The abovementioned more recent answer goes into some detail on that. For practical purposes in Linux, it is sufficient to use cpufreq-set to disable frequency scaling (to ensure the relationship between the CPU and TSC frequencies does not change during microbenchmarking), and optionally taskset to restrict the microbenchmark to specific CPU core(s). That ensures that the results gathered in that microbenchmark can be compared to each other.
(As Peter Cordes commented, we also want to add _mm_lfence() from <emmintrin.h> (included by <immintrin.h>). This ensures that the CPU does not internally reorder the RDTSC operation compared to the function to be benchmarked. You can use -DNO_LFENCE at compile time to omit those, if you want.)
Let's say you have functions void foo(void); and void bar(void); that you wish to compare:
#include <stdlib.h>
#include <x86intrin.h>
#include <stdio.h>
#ifdef NO_LFENCE
#define lfence()
#else
#include <emmintrin.h>
#define lfence() _mm_lfence()
#endif
static int cmp_ull(const void *aptr, const void *bptr)
{
const unsigned long long a = *(const unsigned long long *)aptr;
const unsigned long long b = *(const unsigned long long *)bptr;
return (a < b) ? -1 :
(a > b) ? +1 : 0;
}
unsigned long long *measure_cycles(size_t count, void (*func)())
{
unsigned long long *elapsed, started, finished;
size_t i;
elapsed = malloc((count + 2) * sizeof elapsed[0]);
if (!elapsed)
return NULL;
/* Call func() count times, measuring the TSC cycles for each call. */
for (i = 0; i < count; i++) {
/* First, let's ensure our CPU executes everything thus far. */
lfence();
/* Start timing. */
started = __rdtsc();
/* Ensure timing starts before we call the function. */
lfence();
/* Call the function. */
func();
/* Ensure everything has been executed thus far. */
lfence();
/* Stop timing. */
finished = __rdtsc();
/* Ensure we have the counter value before proceeding. */
lfence();
elapsed[i] = finished - started;
}
/* The very first call is likely the cold-cache case,
so in case that measurement might contain useful
information, we put it at the end of the array.
We also terminate the array with a zero. */
elapsed[count] = elapsed[0];
elapsed[count + 1] = 0;
/* Sort the cycle counts. */
qsort(elapsed, count, sizeof elapsed[0], cmp_ull);
/* This function returns all cycle counts, in sorted order,
although the median, elapsed[count/2], is the one
I personally use. */
return elapsed;
}
void benchmark(const size_t count)
{
unsigned long long *foo_cycles, *bar_cycles;
if (count < 1)
return;
printf("Measuring run time in Time Stamp Counter cycles:\n");
fflush(stdout);
foo_cycles = measure_cycles(count, foo);
bar_cycles = measure_cycles(count, bar);
if (!foo_cycles || !bar_cycles) {
free(bar_cycles);
free(foo_cycles);
return;
}
printf("foo(): %llu cycles (median of %zu calls)\n", foo_cycles[count/2], count);
printf("bar(): %llu cycles (median of %zu calls)\n", bar_cycles[count/2], count);
free(bar_cycles);
free(foo_cycles);
}
Note that the above results are very specific to the compiler and compiler options used, and of course to the hardware it is run on. The median number of cycles can be interpreted as "the typical number of TSC cycles taken", because the measurement is not completely reliable (it may be affected by events outside the process; for example, by context switches, or by migration to another core on some CPUs). For the same reason, I don't trust the minimum, maximum, or average values.
However, the two implementations' (foo() and bar()) cycle counts above can be compared to find out how their performance compares to each other in a microbenchmark. Just remember that microbenchmark results may not extend to real work tasks, because of how complex the resource-use interactions of real tasks are. One function might be superior in all microbenchmarks, but poorer than others in the real world, because it is only efficient when it has lots of CPU cache to use, for example.
In Linux in general, you can use the CLOCK_REALTIME clock to measure real time (wall-clock time) used, in the very same manner as above. CLOCK_MONOTONIC is even better, because it is not affected by direct changes to the realtime clock the administrator might make (say, if they noticed the system clock is ahead or behind); only drift adjustments due to NTP etc. are applied. Daylight saving time, or changes to it, does not affect the measurements using either clock. Again, the median of a number of measurements is the result I seek, because events outside the measured code itself can affect the result.
For example:
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#ifdef NO_LFENCE
#define lfence()
#else
#include <emmintrin.h>
#define lfence() _mm_lfence()
#endif
static int cmp_double(const void *aptr, const void *bptr)
{
const double a = *(const double *)aptr;
const double b = *(const double *)bptr;
return (a < b) ? -1 :
(a > b) ? +1 : 0;
}
double median_seconds(const size_t count, void (*func)())
{
struct timespec started, stopped;
double *seconds, median;
size_t i;
seconds = malloc(count * sizeof seconds[0]);
if (!seconds)
return -1.0;
for (i = 0; i < count; i++) {
lfence();
clock_gettime(CLOCK_MONOTONIC, &started);
lfence();
func();
lfence();
clock_gettime(CLOCK_MONOTONIC, &stopped);
lfence();
seconds[i] = (double)(stopped.tv_sec - started.tv_sec)
+ (double)(stopped.tv_nsec - started.tv_nsec) / 1000000000.0;
}
qsort(seconds, count, sizeof seconds[0], cmp_double);
median = seconds[count / 2];
free(seconds);
return median;
}
static double realtime_precision(void)
{
struct timespec t;
if (clock_getres(CLOCK_REALTIME, &t) == 0)
return (double)t.tv_sec
+ (double)t.tv_nsec / 1000000000.0;
return 0.0;
}
void benchmark(const size_t count)
{
double median_foo, median_bar;
if (count < 1)
return;
printf("Median wall clock times over %zu calls:\n", count);
fflush(stdout);
median_foo = median_seconds(count, foo);
median_bar = median_seconds(count, bar);
printf("foo(): %.3f ns\n", median_foo * 1000000000.0);
printf("bar(): %.3f ns\n", median_bar * 1000000000.0);
printf("(Measurement unit is approximately %.3f ns)\n", 1000000000.0 * realtime_precision());
fflush(stdout);
}
In general, I personally prefer to compile the benchmarked function in a separate unit (into a separate object file), and also to benchmark a do-nothing function to estimate the function-call overhead (although this tends to give an overestimate of the overhead, because some of the overhead consists of latencies rather than actual time taken, and some operations can proceed during those latencies in the actual functions).
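For example, a do-nothing overhead estimate might look like the sketch below; it assumes the measure_cycles() helper from the TSC example earlier in this answer is in scope.
#include <stdio.h>
#include <stdlib.h>

static void nothing(void) { /* deliberately empty */ }

/* Rough estimate of the per-call measurement overhead, in TSC cycles. */
void estimate_overhead(size_t count)
{
    unsigned long long *cycles = measure_cycles(count, nothing);
    if (cycles) {
        printf("call+measurement overhead: ~%llu cycles (median of %zu calls)\n",
               cycles[count / 2], count);
        free(cycles);
    }
}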
It is important to remember that the above measurements should only be used as indications, because in a real world application, things like cache locality (especially on current machines, with multi-level caching, and lots of memory) hugely affect the time used by different implementations.
For example, you might compare the speeds of a quicksort and a radix sort. Depending on the size of the keys, the radix sort requires rather large extra arrays (and uses a lot of cache). If the real application the sort routine is used in does not simultaneously use a lot of other memory (and thus the sorted data is basically what is cached), then a radix sort will be faster if there is enough data (and the implementation is sane). However, if the application is multithreaded, and the other threads shuffle (copy or transfer) a lot of memory around, then the radix sort using a lot of cache will evict other data also cached; even though the radix sort function itself does not show any serious slowdown, it may slow down the other threads and therefore the overall program, because the other threads have to wait for their data to be re-cached.
This means that the only "benchmarks" you should trust are wall-clock measurements taken on the actual hardware, running actual work tasks with actual work data. Everything else is subject to many conditions and is more or less suspect: an indication, yes, but not very reliable.

Measuring processor ticks in C

I wanted to calculate the difference in execution time when executing the same code inside a function. To my surprise, however, sometimes the clock difference is 0 when I use clock()/clock_t for the start and stop timer. Does this mean that clock()/clock_t does not actually return the number of clicks the processor spent on the task?
After a bit of searching, it seemed to me that clock_gettime() would return more fine-grained results. And indeed it does, but I instead end up with an arbitrary number of nano(?)seconds. It gives a hint of the difference in execution time, but it's hardly accurate as to exactly how many clicks of difference it amounts to. What would I have to do to find this out?
#include <math.h>
#include <stdio.h>
#include <time.h>
#define M_PI_DOUBLE (M_PI * 2)
void rotatetest(const float *x, const float *c, float *result) {
float rotationfraction = *x / *c;
*result = M_PI_DOUBLE * rotationfraction;
}
int main() {
int i;
long test_total = 0;
int test_count = 1000000;
struct timespec test_time_begin;
struct timespec test_time_end;
float r = 50.f;
float c = 2 * M_PI * r;
float x = 3.f;
float result_inline = 0.f;
float result_function = 0.f;
for (i = 0; i < test_count; i++) {
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &test_time_begin);
float rotationfraction = x / c;
result_inline = M_PI_DOUBLE * rotationfraction;
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &test_time_end);
test_total += test_time_end.tv_nsec - test_time_begin.tv_nsec;
}
printf("Inline clocks %li, avg %f (result is %f)\n", test_total, test_total / (float)test_count,result_inline);
test_total = 0;
for (i = 0; i < test_count; i++) {
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &test_time_begin);
rotatetest(&x, &c, &result_function);
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &test_time_end);
test_total += test_time_end.tv_nsec - test_time_begin.tv_nsec;
}
printf("Function clocks %li, avg %f (result is %f)\n", test_total, test_total / (float)test_count, result_inline);
return 0;
}
I am using gcc version 4.8.4 on Linux 3.13.0-37-generic (Linux Mint 16)
First of all: as already mentioned in the comments, clocking a single run of the operation will probably do you no good. In the worst case, the call for getting the time might actually take longer than the execution of the operation itself.
Please clock multiple runs of the operation (including a warm-up phase so everything is swapped in) and calculate the average running times.
clock() isn't guaranteed to be monotonic. It also isn't the number of processor clicks (whatever you define this to be) the program has run. The best way to describe the result from clock() is probably "a best effort estimation of the time any one of the CPUs has spent on calculation for the current process". For benchmarking purposes clock() is thus mostly useless.
As per specification:
The clock() function returns the implementation's best approximation to the processor time used by the process since the beginning of an implementation-dependent time related only to the process invocation.
And additionally
To determine the time in seconds, the value returned by clock() should be divided by the value of the macro CLOCKS_PER_SEC.
So, if the interval you are measuring is shorter than the resolution of clock(), you are out of luck.
For profiling/benchmarking, you should --if possible-- use one of the performance clocks that are available on modern hardware. The prime candidates are probably
The HPET
The TSC
Edit: The question now references CLOCK_PROCESS_CPUTIME_ID, which is Linux' way of exposing the TSC.
Whether either (or both) is available depends on the hardware and is also operating-system specific.
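Putting the two pieces of advice together — many runs averaged, timed with the CLOCK_PROCESS_CPUTIME_ID clock mentioned in the edit above — a minimal sketch (the work callback and the iteration count are placeholders):
#include <stdio.h>
#include <time.h>

/* Time `iterations` back-to-back runs of work() and report the average,
   so the clock_gettime() overhead is paid only twice in total. */
static double avg_ns_per_run(void (*work)(void), long iterations)
{
    struct timespec begin, end;
    long i;

    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &begin);
    for (i = 0; i < iterations; i++)
        work();
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end);

    return ((double)(end.tv_sec - begin.tv_sec) * 1e9
          + (double)(end.tv_nsec - begin.tv_nsec)) / (double)iterations;
}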
After googling a little bit, I can see that the clock() function can be used as a standard mechanism to find the time taken for execution, but be aware that the time will vary depending on the load of your processor.
You can just use the code below for the calculation:
#include <time.h>

clock_t begin, end;
double time_spent;
begin = clock();
/* here, do your time-consuming job */
end = clock();
time_spent = (double)(end - begin) / CLOCKS_PER_SEC;

Measuring time in millisecond precision

My program is going to race different sorting algorithms against each other, both in time and space. I've got space covered, but measuring time is giving me some trouble. Here is the code that runs the sorts:
void test(short* n, short len) {
short i, j, a[1024];
for(i=0; i<2; i++) { // Loop over each sort algo
memused = 0; // Initialize memory marker
for(j=0; j<len; j++) // Copy scrambled list into fresh array
a[j] = n[j]; // (Sorting algos are in-place)
// ***Point A***
switch(i) { // Pick sorting algo
case 0:
selectionSort(a, len);
break;
case 1:
quicksort(a, len);
break;
}
// ***Point B***
spc[i][len] = memused; // Record how much mem was used
}
}
(I removed some of the sorting algos for simplicity)
Now, I need to measure how much time the sorting algo takes. The most obvious way to do this is to record the time at point (a) and then subtract that from the time at point (b). But none of the C time functions are good enough:
time() gives me time in seconds, but the algos are faster than that, so I need something more accurate.
clock() gives me CPU ticks since the program started, but seems to round to the nearest 10,000; still not small enough
The time shell command works well enough, except that I need to run over 1,000 tests per algorithm, and I need the individual time for each one.
I have no idea what getrusage() returns, but it's also too long.
What I need is a time unit (significantly, if possible) smaller than the run time of the sorting functions, which is about 2 ms. So my question is: where can I get that?
gettimeofday() has microseconds resolution and is easy to use.
A pair of useful timer functions is:
#include <stdio.h>
#include <sys/time.h>

static struct timeval tm1;
static inline void start()
{
gettimeofday(&tm1, NULL);
}
static inline void stop()
{
struct timeval tm2;
gettimeofday(&tm2, NULL);
unsigned long long t = 1000 * (tm2.tv_sec - tm1.tv_sec) + (tm2.tv_usec - tm1.tv_usec) / 1000;
printf("%llu ms\n", t);
}
For measuring time, use clock_gettime with CLOCK_MONOTONIC (or CLOCK_MONOTONIC_RAW if it is available). Where possible, avoid using gettimeofday. It is specifically deprecated in favor of clock_gettime, and the time returned from it is subject to adjustments from time servers, which can throw off your measurements.
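A minimal sketch of that approach (the helper name and the work callback are illustrative, not a fixed API):
#include <stdio.h>
#include <time.h>

/* Report elapsed wall-clock milliseconds for one call of work(),
   using the monotonic clock so clock adjustments don't skew it. */
static void time_one_call(void (*work)(void))
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    work();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double ms = (double)(t1.tv_sec - t0.tv_sec) * 1000.0
              + (double)(t1.tv_nsec - t0.tv_nsec) / 1e6;
    printf("%.3f ms\n", ms);
}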
You can get the total user + kernel time (or choose just one) using getrusage as follows:
#include <sys/time.h>
#include <sys/resource.h>
double get_process_time() {
struct rusage usage;
if( 0 == getrusage(RUSAGE_SELF, &usage) ) {
return (double)(usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
(double)(usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / 1.0e6;
}
return 0;
}
I elected to create a double containing fractional seconds...
double t_begin, t_end;
t_begin = get_process_time();
// Do some operation...
t_end = get_process_time();
printf( "Elapsed time: %.6f seconds\n", t_end - t_begin );
The Time Stamp Counter could be helpful here:
static unsigned long long rdtsctime() {
unsigned int eax, edx;
unsigned long long val;
__asm__ __volatile__("rdtsc":"=a"(eax), "=d"(edx));
val = edx;
val = val << 32;
val += eax;
return val;
}
Though there are some caveats to this. The timestamps for different processor cores may be different, and changing clock speeds (due to power saving features and the like) can cause erroneous results.

How to measure time in milliseconds using ANSI C?

Using only ANSI C, is there any way to measure time with milliseconds precision or more? I was browsing time.h but I only found second precision functions.
There is no ANSI C function that provides better than 1 second time resolution but the POSIX function gettimeofday provides microsecond resolution. The clock function only measures the amount of time that a process has spent executing and is not accurate on many systems.
You can use this function like this:
#include <stdio.h>
#include <unistd.h>
#include <sys/time.h>

struct timeval tval_before, tval_after, tval_result;
gettimeofday(&tval_before, NULL);
// Some code you want to time, for example:
sleep(1);
gettimeofday(&tval_after, NULL);
timersub(&tval_after, &tval_before, &tval_result);
printf("Time elapsed: %ld.%06ld\n", (long int)tval_result.tv_sec, (long int)tval_result.tv_usec);
This returns Time elapsed: 1.000870 on my machine.
#include <time.h>
clock_t uptime = clock() / (CLOCKS_PER_SEC / 1000);
I always use the clock_gettime() function, returning time from the CLOCK_MONOTONIC clock. The time returned is the amount of time, in seconds and nanoseconds, since some unspecified point in the past, such as system startup or the epoch.
#include <stdio.h>
#include <stdint.h>
#include <time.h>
int64_t timespecDiff(struct timespec *timeA_p, struct timespec *timeB_p)
{
return ((int64_t)timeA_p->tv_sec * 1000000000LL + timeA_p->tv_nsec) -
((int64_t)timeB_p->tv_sec * 1000000000LL + timeB_p->tv_nsec);
}
int main(int argc, char **argv)
{
struct timespec start, end;
clock_gettime(CLOCK_MONOTONIC, &start);
// Some code I am interested in measuring
clock_gettime(CLOCK_MONOTONIC, &end);
int64_t timeElapsed = timespecDiff(&end, &start);
}
Implementing a portable solution
As already mentioned here, there is no proper ANSI solution with sufficient precision for the time-measurement problem, so I want to write about how to get a portable and, if possible, high-resolution time-measurement solution.
Monotonic clock vs. time stamps
Generally speaking there are two ways of time measurement:
monotonic clock;
current (date)time stamp.
The first one uses a monotonic clock counter (sometimes called a tick counter) which counts ticks with a predefined frequency, so if you have a ticks value and the frequency is known, you can easily convert ticks to elapsed time. It is actually not guaranteed that a monotonic clock reflects the current system time in any way; it may also count ticks since system startup. But it guarantees that the clock always runs in an increasing fashion regardless of the system state. Usually the frequency is bound to a high-resolution hardware source, which is why it provides high accuracy (this depends on the hardware, but most modern hardware has no problems with high-resolution clock sources).
The second way provides a (date)time value based on the current system clock value. It may also have a high resolution, but it has one major drawback: this kind of time value can be affected by different system time adjustments, e.g. a time zone change, a daylight saving time (DST) change, an NTP server update, system hibernation and so on. In some circumstances you can get a negative elapsed time value, which can lead to undefined behavior. Actually this kind of time source is less reliable than the first one.
So the first rule in time interval measuring is to use a monotonic clock if possible. It usually has a high precision, and it is reliable by design.
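As a tiny illustration of the tick-to-elapsed-time conversion mentioned above (the names are placeholders, not part of any particular API):
#include <stdint.h>

/* elapsed time = tick delta / frequency; scale to microseconds first
   so integer division doesn't throw the fraction away. May overflow
   for very large deltas. */
static uint64_t ticks_to_usec(uint64_t tick_delta, uint64_t ticks_per_sec)
{
    return tick_delta * 1000000ULL / ticks_per_sec;
}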
Fallback strategy
When implementing a portable solution it is worth considering a fallback strategy: use a monotonic clock if available, and fall back to the time-stamp approach if there is no monotonic clock in the system.
Windows
There is a great article called Acquiring high-resolution time stamps on MSDN about time measurement on Windows which describes all the details you may need to know about software and hardware support. To acquire a high precision time stamp on Windows you should:
query a timer frequency (ticks per second) with QueryPerformanceFrequency:
LARGE_INTEGER tcounter;
ULONGLONG freq = 0;
if (QueryPerformanceFrequency (&tcounter) != 0)
freq = tcounter.QuadPart;
The timer frequency is fixed on the system boot so you need to get it only once.
query the current ticks value with QueryPerformanceCounter:
LARGE_INTEGER tcounter;
ULONGLONG tick_value = 0;
if (QueryPerformanceCounter (&tcounter) != 0)
tick_value = tcounter.QuadPart;
scale the ticks to elapsed time, i.e. to microseconds:
ULONGLONG usecs = (tick_value - prev_tick_value) / (freq / 1000000);
According to Microsoft you should not have any problems with this approach on Windows XP and later versions in most cases. But you can also use two fallback solutions on Windows:
GetTickCount provides the number of milliseconds that have elapsed since the system was started. It wraps every 49.7 days, so be careful in measuring longer intervals.
GetTickCount64 is a 64-bit version of GetTickCount, but it is available starting from Windows Vista and above.
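For instance, a minimal fallback sketch built on GetTickCount64 (the helper name and the work callback are illustrative):
#include <windows.h>
#include <stdio.h>

/* Millisecond-resolution elapsed time; the 64-bit counter avoids the
   49.7-day wraparound of plain GetTickCount. */
static void report_elapsed_ms(void (*work)(void))
{
    ULONGLONG t0 = GetTickCount64();
    work();
    printf("%llu ms\n", (unsigned long long)(GetTickCount64() - t0));
}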
OS X (macOS)
OS X (macOS) has its own Mach absolute time units, which represent a monotonic clock. The best way to start is Apple's article Technical Q&A QA1398: Mach Absolute Time Units, which describes (with code examples) how to use the Mach-specific API to get monotonic ticks. There is also a related question called clock_gettime alternative in Mac OS X, which in the end may leave you a bit confused about what to do with possible value overflow, because the counter frequency is given in the form of a numerator and denominator. So, a short example of how to get the elapsed time:
get the clock frequency numerator and denominator:
#include <mach/mach_time.h>
#include <stdint.h>
static uint64_t freq_num = 0;
static uint64_t freq_denom = 0;
void init_clock_frequency ()
{
mach_timebase_info_data_t tb;
if (mach_timebase_info (&tb) == KERN_SUCCESS && tb.denom != 0) {
freq_num = (uint64_t) tb.numer;
freq_denom = (uint64_t) tb.denom;
}
}
You need to do that only once.
query the current tick value with mach_absolute_time:
uint64_t tick_value = mach_absolute_time ();
scale the ticks to elapsed time, i.e. to microseconds, using previously queried numerator and denominator:
uint64_t value_diff = tick_value - prev_tick_value;
/* To prevent overflow */
value_diff /= 1000;
value_diff *= freq_num;
value_diff /= freq_denom;
The main idea to prevent an overflow is to scale down the ticks to the desired accuracy before using the numerator and denominator. As the initial timer resolution is in nanoseconds, we divide by 1000 to get microseconds. You can find the same approach used in Chromium's time_mac.c. If you really need nanosecond accuracy, consider reading How can I use mach_absolute_time without overflowing?.
Linux and UNIX
The clock_gettime call is your best bet on any POSIX-friendly system. It can query time from different clock sources, and the one we need is CLOCK_MONOTONIC. Not all systems which have clock_gettime support CLOCK_MONOTONIC, so the first thing you need to do is check its availability:
if _POSIX_MONOTONIC_CLOCK is defined to a value >= 0, it means that CLOCK_MONOTONIC is available;
if _POSIX_MONOTONIC_CLOCK is defined to 0, it means that you should additionally check whether it works at runtime; I suggest using sysconf:
#include <unistd.h>
#ifdef _SC_MONOTONIC_CLOCK
if (sysconf (_SC_MONOTONIC_CLOCK) > 0) {
/* A monotonic clock presents */
}
#endif
otherwise a monotonic clock is not supported and you should use a fallback strategy (see below).
Usage of clock_gettime is pretty straightforward:
get the time value:
#include <time.h>
#include <sys/time.h>
#include <stdint.h>
uint64_t get_posix_clock_time ()
{
struct timespec ts;
if (clock_gettime (CLOCK_MONOTONIC, &ts) == 0)
return (uint64_t) ts.tv_sec * 1000000 + (uint64_t) ts.tv_nsec / 1000;
else
return 0;
}
I've scaled down the time to microseconds here.
calculate the difference with the previous time value received the same way:
uint64_t prev_time_value, time_value;
uint64_t time_diff;
/* Initial time */
prev_time_value = get_posix_clock_time ();
/* Do some work here */
/* Final time */
time_value = get_posix_clock_time ();
/* Time difference */
time_diff = time_value - prev_time_value;
The best fallback strategy is to use the gettimeofday call: it is not monotonic, but it provides quite good resolution. The idea is the same as with clock_gettime, but to get a time value you should:
#include <time.h>
#include <sys/time.h>
#include <stdint.h>
uint64_t get_gtod_clock_time ()
{
struct timeval tv;
if (gettimeofday (&tv, NULL) == 0)
return (uint64_t) tv.tv_sec * 1000000 + (uint64_t) tv.tv_usec;
else
return 0;
}
Again, the time value is scaled down to microseconds.
SGI IRIX
IRIX has the clock_gettime call, but it lacks CLOCK_MONOTONIC. Instead it has its own monotonic clock source defined as CLOCK_SGI_CYCLE which you should use instead of CLOCK_MONOTONIC with clock_gettime.
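A minimal sketch under that assumption, mirroring the POSIX helper above (it only applies on IRIX, where CLOCK_SGI_CYCLE is defined):
#include <time.h>
#include <stdint.h>

uint64_t get_irix_clock_time ()
{
    struct timespec ts;
    if (clock_gettime (CLOCK_SGI_CYCLE, &ts) == 0)
        return (uint64_t) ts.tv_sec * 1000000 + (uint64_t) ts.tv_nsec / 1000;
    else
        return 0;
}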
Solaris and HP-UX
Solaris has its own high-resolution timer interface gethrtime which returns the current timer value in nanoseconds. Though the newer versions of Solaris may have clock_gettime, you can stick to gethrtime if you need to support old Solaris versions.
Usage is simple:
#include <sys/time.h>
void time_measure_example ()
{
hrtime_t prev_time_value, time_value;
hrtime_t time_diff;
/* Initial time */
prev_time_value = gethrtime ();
/* Do some work here */
/* Final time */
time_value = gethrtime ();
/* Time difference */
time_diff = time_value - prev_time_value;
}
HP-UX lacks clock_gettime, but it supports gethrtime which you should use in the same way as on Solaris.
BeOS
BeOS also has its own high-resolution timer interface, system_time, which returns the number of microseconds that have elapsed since the computer was booted.
Example usage:
#include <kernel/OS.h>
void time_measure_example ()
{
bigtime_t prev_time_value, time_value;
bigtime_t time_diff;
/* Initial time */
prev_time_value = system_time ();
/* Do some work here */
/* Final time */
time_value = system_time ();
/* Time difference */
time_diff = time_value - prev_time_value;
}
OS/2
OS/2 has its own API to retrieve high-precision time stamps:
query the timer frequency (ticks per second) with DosTmrQueryFreq (for the GCC compiler):
#define INCL_DOSPROFILE
#define INCL_DOSERRORS
#include <os2.h>
#include <stdint.h>
ULONG freq;
DosTmrQueryFreq (&freq);
query the current ticks value with DosTmrQueryTime:
QWORD tcounter;
uint64_t time_low;
uint64_t time_high;
uint64_t timestamp;
if (DosTmrQueryTime (&tcounter) == NO_ERROR) {
time_low = (uint64_t) tcounter.ulLo;
time_high = (uint64_t) tcounter.ulHi;
timestamp = (time_high << 32) | time_low;
}
scale the ticks to elapsed time, i.e. to microseconds:
uint64_t usecs = (timestamp - prev_timestamp) / (freq / 1000000);
Example implementation
You can take a look at the plibsys library which implements all the described above strategies (see ptimeprofiler*.c for details).
timespec_get from C11
Returns up to nanoseconds, rounded to the resolution of the implementation.
Looks like an ANSI ripoff from POSIX' clock_gettime.
Example: a printf is done every 100ms on Ubuntu 15.10:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
static long get_nanos(void) {
struct timespec ts;
timespec_get(&ts, TIME_UTC);
return (long)ts.tv_sec * 1000000000L + ts.tv_nsec;
}
int main(void) {
long nanos;
long last_nanos;
long start;
nanos = get_nanos();
last_nanos = nanos;
start = nanos;
while (1) {
nanos = get_nanos();
if (nanos - last_nanos > 100000000L) {
printf("current nanos: %ld\n", nanos - start);
last_nanos = nanos;
}
}
return EXIT_SUCCESS;
}
The C11 N1570 standard draft, 7.27.2.5 "The timespec_get function", says:
If base is TIME_UTC, the tv_sec member is set to the number of seconds since an
implementation defined epoch, truncated to a whole value and the tv_nsec member is
set to the integral number of nanoseconds, rounded to the resolution of the system clock. (321)
321) Although a struct timespec object describes times with nanosecond resolution, the available
resolution is system dependent and may even be greater than 1 second.
C++11 also got std::chrono::high_resolution_clock: C++ Cross-Platform High-Resolution Timer
glibc 2.21 implementation
Can be found under sysdeps/posix/timespec_get.c as:
int
timespec_get (struct timespec *ts, int base)
{
switch (base)
{
case TIME_UTC:
if (__clock_gettime (CLOCK_REALTIME, ts) < 0)
return 0;
break;
default:
return 0;
}
return base;
}
so clearly:
only TIME_UTC is currently supported
it forwards to __clock_gettime (CLOCK_REALTIME, ts), which is a POSIX API: http://pubs.opengroup.org/onlinepubs/9699919799/functions/clock_getres.html
Linux x86-64 has a clock_gettime system call.
Note that this is not a fail-proof micro-benchmarking method because:
man clock_gettime says that this measure may have discontinuities if you change some system time setting while your program runs. This should be a rare event of course, and you might be able to ignore it.
this measures wall time, so if the scheduler decides to forget about your task, it will appear to run for longer.
For those reasons, getrusage() might be a better POSIX benchmarking tool, despite its lower maximum precision of one microsecond.
More information at: Measure time in Linux - time vs clock vs getrusage vs clock_gettime vs gettimeofday vs timespec_get?
The best precision you can possibly get is through the use of the x86-only rdtsc instruction, which can provide clock-level resolution (one must of course take into account the cost of the rdtsc call itself, which can be measured easily on application startup).
The main catch here is measuring the number of clocks per second, which shouldn't be too hard.
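For example, a rough calibration sketch, assuming x86/x86-64 with GCC or Clang and <x86intrin.h> (a one-second sleep gives only a ballpark figure):
#include <stdio.h>
#include <unistd.h>
#include <x86intrin.h>

int main(void)
{
    /* Count TSC ticks across a one-second sleep to estimate ticks/second. */
    unsigned long long start = __rdtsc();
    sleep(1);
    unsigned long long ticks_per_sec = __rdtsc() - start;
    printf("approx. %llu TSC ticks per second\n", ticks_per_sec);
    return 0;
}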
The accepted answer is good enough, but my solution is simpler. I just tested it on Linux, using gcc (Ubuntu 7.2.0-8ubuntu3.2) 7.2.0.
It also uses gettimeofday; tv_sec is the whole-second part, and tv_usec is in microseconds, not milliseconds.
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

long currentTimeMillis() {
struct timeval time;
gettimeofday(&time, NULL);
return time.tv_sec * 1000 + time.tv_usec / 1000;
}
int main() {
printf("%ld\n", currentTimeMillis());
// wait 1 second
sleep(1);
printf("%ld\n", currentTimeMillis());
return 0;
}
It prints:
1522139691342
1522139692342, exactly one second later.
As of ANSI/ISO C11 or later, you can use timespec_get() to obtain millisecond, microsecond, or nanosecond timestamps, like this:
#include <stdint.h>
#include <time.h>
/// Convert seconds to milliseconds
#define SEC_TO_MS(sec) ((sec)*1000)
/// Convert seconds to microseconds
#define SEC_TO_US(sec) ((sec)*1000000)
/// Convert seconds to nanoseconds
#define SEC_TO_NS(sec) ((sec)*1000000000)
/// Convert nanoseconds to seconds
#define NS_TO_SEC(ns) ((ns)/1000000000)
/// Convert nanoseconds to milliseconds
#define NS_TO_MS(ns) ((ns)/1000000)
/// Convert nanoseconds to microseconds
#define NS_TO_US(ns) ((ns)/1000)
/// Get a time stamp in milliseconds.
uint64_t millis()
{
struct timespec ts;
timespec_get(&ts, TIME_UTC);
uint64_t ms = SEC_TO_MS((uint64_t)ts.tv_sec) + NS_TO_MS((uint64_t)ts.tv_nsec);
return ms;
}
/// Get a time stamp in microseconds.
uint64_t micros()
{
struct timespec ts;
timespec_get(&ts, TIME_UTC);
uint64_t us = SEC_TO_US((uint64_t)ts.tv_sec) + NS_TO_US((uint64_t)ts.tv_nsec);
return us;
}
/// Get a time stamp in nanoseconds.
uint64_t nanos()
{
struct timespec ts;
timespec_get(&ts, TIME_UTC);
uint64_t ns = SEC_TO_NS((uint64_t)ts.tv_sec) + (uint64_t)ts.tv_nsec;
return ns;
}
// NB: for all 3 timestamp functions above: gcc defines the type of the internal
// `tv_sec` seconds value inside the `struct timespec`, which is used
// internally in these functions, as a signed `long int`. For architectures
// where `long int` is 64 bits, signed overflow would only occur after
// 2^63 sec ~= 2.9 x 10^11 years. For architectures where this type is
// 32 bits, it will occur after 2^31 sec ~= 68 years. If the
// implementation-defined epoch for the timespec is 1970, a 32-bit `tv_sec`
// therefore rolls over in the year 2038 (the well-known "year 2038 problem"),
// i.e. 68 - (2021 - 1970) = 17 years from 2021; with an earlier epoch it
// would be even sooner. Hopefully your program won't need to run that
// long. :) To see, by inspection, what your system's epoch is, simply print
// out a timestamp and calculate how far back a timestamp of 0 would have
// occurred. Ex: convert the timestamp to years and subtract that number of
// years from the present year.
For a much-more-thorough answer of mine, including with an entire timing library I wrote, see here: How to get a simple timestamp in C.
#Ciro Santilli Путлер also presents a concise demo of C11's timespec_get() function here, which is how I first learned how to use that function.
In my more-thorough answer, I explain that on my system, the best resolution possible is ~20ns, but the resolution is hardware-dependent and can vary from system to system.
Under Windows:
#include <windows.h>
#include <wchar.h>

SYSTEMTIME t;
wchar_t buff[32];
GetLocalTime(&t);
swprintf_s(buff, 32, L"[%02d:%02d:%02d:%d]\t", t.wHour, t.wMinute, t.wSecond, t.wMilliseconds);
