Which timer to use when comparing C code to CUDA code?

I'm currently doing two implementations of an algorithm, one in C and the other in CUDA, and am planning to do a comparison between the two in terms of runtime. My question is, what would be the best C timer to use considering I'm going to be comparing runtimes in C and CUDA. For CUDA, I shall be using Events, and I've read about wall clock timers in C such as clock() and gettimeofday() as well as high-resolution timers such as clock_gettime(), but am unsure which C one to use if I'm going to be comparing my C times against CUDA times?
Thanks :-)

For end-to-end measurements at application level, I would recommend using a high-precision host timer, as in the code below, which I have used for well over a decade. For detailed measurements of potentially extremely short GPU activity, I would suggest using CUDA events.
#if defined(_WIN32)
#if !defined(WIN32_LEAN_AND_MEAN)
#define WIN32_LEAN_AND_MEAN
#endif
#include <windows.h>
double second (void)
{
LARGE_INTEGER t;
static double oofreq;
static int checkedForHighResTimer;
static BOOL hasHighResTimer;
if (!checkedForHighResTimer) {
hasHighResTimer = QueryPerformanceFrequency (&t);
oofreq = 1.0 / (double)t.QuadPart;
checkedForHighResTimer = 1;
}
if (hasHighResTimer) {
QueryPerformanceCounter (&t);
return (double)t.QuadPart * oofreq;
} else {
return (double)GetTickCount() * 1.0e-3;
}
}
#elif defined(__linux__) || defined(__APPLE__)
#include <stddef.h>
#include <sys/time.h>
double second (void)
{
struct timeval tv;
gettimeofday(&tv, NULL);
return (double)tv.tv_sec + (double)tv.tv_usec * 1.0e-6;
}
#else
#error unsupported platform
#endif

It's probably best to stick to something relatively simple. I'd recommend gettimeofday, which provides a timestamp with microsecond resolution. Record the time before and after your computation, then subtract the two; the timersub macro does this subtraction for you.
http://linux.die.net/man/2/gettimeofday
http://linux.die.net/man/3/timercmp
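As a rough illustration of the before/after subtraction with timersub, here is a minimal sketch. The helper name elapsed_us is made up for this example, and the _DEFAULT_SOURCE define is an assumption about what glibc needs to expose the macro when compiling in strict standard mode.

```c
#define _DEFAULT_SOURCE   /* assumption: needed on glibc to expose timersub/timercmp */
#include <assert.h>
#include <sys/time.h>

/* Hypothetical helper: microseconds elapsed between two gettimeofday() samples. */
long long elapsed_us(const struct timeval *start, const struct timeval *stop)
{
    struct timeval diff;

    /* The later sample must not precede the earlier one. */
    assert(!timercmp(stop, start, <));

    /* diff = stop - start; timersub borrows from tv_sec so tv_usec stays in [0, 1000000). */
    timersub(stop, start, &diff);

    return (long long)diff.tv_sec * 1000000LL + (long long)diff.tv_usec;
}
```

To use it, call gettimeofday(&start, NULL) before the computation and gettimeofday(&stop, NULL) after it, then pass both to elapsed_us.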

#include <stdio.h>
#include <time.h>
// note: clock() measures processor time used by the host process, not wall-clock time
clock_t init, final;
init=clock();
...
//your sequential algorithm
...
final=clock()-init;
float seq_time = (float)((double)final / ((double)CLOCKS_PER_SEC));
printf("\nThe sequential duration is %f seconds.", seq_time);
//Clock is initialized again
init=clock();
...
//your parallel algorithm
...
final=clock()-init;
float par_time = (float)((double)final / ((double)CLOCKS_PER_SEC));
printf("\nThe parallel duration is %f seconds.", par_time);
printf("\n\nSpeed up is %f seconds. (%dX Faster)", (seq_time - par_time), ((int)(seq_time / par_time)));

I've used the following code with great/accurate success:
#include <time.h>
long unsigned int get_tick()
{
struct timespec ts;
if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) return (0);
return ts.tv_sec*(long int)1000 + ts.tv_nsec / (long int) 1000000;
}
Then, in the code you want to time, call get_tick before and after it and subtract the two values to get the elapsed time in milliseconds. Divide the result by 1000 to get seconds.

Related

clock_gettime on Raspberry Pi with C

I want to measure the time between the start to the end of the function in a loop. This difference will be used to set the amount of loops of the inner while-loops which does some here not important stuff.
I want to time the function like this :
#include <wiringPi.h>
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#define BILLION 1E9
float hz = 1000;
long int nsPerTick = BILLION/hz;
double unprocessed = 1;
struct timespec now;
struct timespec last;
clock_gettime(CLOCK_REALTIME, &last);
[ ... ]
while (1)
{
clock_gettime(CLOCK_REALTIME, &now);
double diff = (last.tv_nsec - now.tv_nsec );
unprocessed = unprocessed + (diff/ nsPerTick);
clock_gettime(CLOCK_REALTIME, &last);
while (unprocessed >= 1) {
unprocessed --;
DO SOME RANDOM MAGIC;
}
}
The difference between the timer is always negative. I was told this was where the error was:
if ( (last.tv_nsec - now.tv_nsec)<0) {
double diff = 1000000000+ last.tv_nsec - now.tv_nsec;
}
else {
double diff = (last.tv_nsec - now.tv_nsec );
}
But still, my variable diff is always negative, e.g. "-1095043244" (although the time spent in the function is of course positive).
What's wrong?
Your first issue is that you have last.tv_nsec - now.tv_nsec, which is the wrong way round.
last.tv_nsec is in the past (let's say it's set to 1), and now.tv_nsec will always be later (for example, 8ns later, so it's 9). In that case, last.tv_nsec - now.tv_nsec == 1 - 9 == -8.
The other issue is that tv_nsec isn't the time in nanoseconds: for that, you'd need to multiply the time in seconds by a billion and add that. So to get the difference in ns between now and last, you want:
((now.tv_sec - last.tv_sec) * ONE_BILLION) + (now.tv_nsec - last.tv_nsec)
(N.B. I'm still a little surprised that although now.tv_nsec and last.tv_nsec are both less than a billion, subtracting one from the other gives a value less than -1000000000, so there may yet be something I'm missing here.)
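To make the seconds-plus-nanoseconds formula concrete, here is a small sketch. The function name timespec_diff_ns and the NS_PER_SEC constant are invented for this example and stand in for ONE_BILLION above.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NS_PER_SEC 1000000000LL

/* Hypothetical helper: nanoseconds from `last` to `now` (positive when now is later). */
int64_t timespec_diff_ns(struct timespec last, struct timespec now)
{
    /* tv_nsec holds only the sub-second part, so it must lie in [0, 1e9). */
    assert(last.tv_nsec >= 0 && last.tv_nsec < NS_PER_SEC);
    assert(now.tv_nsec >= 0 && now.tv_nsec < NS_PER_SEC);

    /* Whole seconds scaled to ns, plus the (possibly negative) sub-second difference. */
    return (int64_t)(now.tv_sec - last.tv_sec) * NS_PER_SEC
         + (int64_t)(now.tv_nsec - last.tv_nsec);
}
```

Note that the nanosecond term alone can be negative even when now is later than last, for example when the two samples straddle a second boundary; the seconds term compensates for that.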
I was just investigating timing on Pi, with similar approach and similar problems. My thoughts are:
You don't have to use double. In fact you also don't need nanoseconds, as the clock on the Pi has 1 microsecond accuracy anyway (that's the way Broadcom did it). I suggest using gettimeofday() to get microseconds instead of nanoseconds. Then the computation is easy, it's just:
(1000 * 1000 * number of seconds) + number of micros
which you can simply calculate as unsigned int.
I've implemented the convenient API for this:
typedef struct
{
struct timeval startTimeVal;
} TIMER_usecCtx_t;
void TIMER_usecStart(TIMER_usecCtx_t* ctx)
{
gettimeofday(&ctx->startTimeVal, NULL);
}
unsigned int TIMER_usecElapsedUs(TIMER_usecCtx_t* ctx)
{
unsigned int rv;
/* get current time */
struct timeval nowTimeVal;
gettimeofday(&nowTimeVal, NULL);
/* compute diff */
rv = 1000000 * (nowTimeVal.tv_sec - ctx->startTimeVal.tv_sec) + nowTimeVal.tv_usec - ctx->startTimeVal.tv_usec;
return rv;
}
And the usage is:
TIMER_usecCtx_t timer;
TIMER_usecStart(&timer);
while (1)
{
if (TIMER_usecElapsedUs(&timer) > yourDelayInMicroseconds)
{
doSomethingHere();
TIMER_usecStart(&timer);
}
}
Also notice the gettime() calls on Pi take almost 1 [us] to complete. So, if you need to call gettime() a lot and need more accuracy, go for some more advanced methods of getting time... I've explained more about it in this short article about Pi get-time calls
Well, I don't know C, but if it's a timing issue on a Raspberry Pi it might have something to do with the lack of an RTC (real time clock) on the chip.
You should not be storing last.tv_nsec - now.tv_nsec in a double.
If you look at the documentation of time.h, you can see that tv_nsec is stored as a long. So you will need something along the lines of:
long diff = end.tv_nsec - begin.tv_nsec;
With that being said, comparing only the nanoseconds can go wrong. You also need to look at the number of seconds. So to convert everything to seconds, you can use this:
long nanosec_diff = end.tv_nsec - begin.tv_nsec;
time_t sec_diff = end.tv_sec - begin.tv_sec; // need <sys/types.h> for time_t
double diff_in_seconds = sec_diff + nanosec_diff / 1000000000.0;
Also, make sure you are always subtracting the start time from the end time (or else your time will still be negative).
And there you go!

Time in milliseconds in C

Using the following code:
#include<stdio.h>
#include<time.h>
int main()
{
clock_t start, stop;
int i;
start = clock();
for(i=0; i<2000;i++)
{
printf("%d", (i*1)+(1^4));
}
printf("\n\n");
stop = clock();
//(double)(stop - start) / CLOCKS_PER_SEC
printf("%6.3f", start);
printf("\n\n%6.3f", stop);
return 0;
}
I get the following output:
56789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681682683684685686687688689690691692693694695696697698699700701702703704
70570670770870971071171271371471571671771871972072172272372472572672772872973073173273373473573673773873974074174274374474574674774874975075175275375475575675775875976076176276376476576676776876977077177277377477577677777877978078178278378478578678778878979079179279379479579679779879980080180280380480580680780880981081181281381481581681781881982082182282382482582682782882983083183283383483583683783883984084184284384484584684784884985085185285385485585685785885986086186286386486586686786886987087187287387487587687787887988088188288388488588688788888989089189289389489589689789889990090190290390490590690790890991091191291391491591691791891992092192292392492592692792892993093193293393493593693793893994094194294394494594694794894995095195295395495595695795895996096196296396496596696796896997097197297397497597697797897998098198298398498598698798898999099199299399499599699799899910001001100210031004100510061007100810091010101110121013101410151016101710181019102010211022102310241025102610271028102910301031103210331034103510361037103810391040104110421043104410451046104710481049105010511052105310541055105610571058105910601061106210631064106510661067106810691070107110721073107410751076107710781079108010811082108310841085108610871088108910901091109210931094109510961097109810991100110111021103110411051106110711081109111011111112111311141115111611171118111911201121112211231124112511261127112811291130113111321133113411351136113711381139114011411142114311441145114611471148114911501151115211531154115511561157115811591160116111621163116411651166116711681169117011711172117311741175117611771178117911801181118211831184118511861187118811891190119111921193119411951196119711981199120012011202120312041205120612071208120912101211121212131214121512161217121812191220122112221223122412251226122712281229123012311232123312341235123612371238123912401241124212431244124512461247124812491250125112521253125412551256125712581259126012611262126312641265126612671268126912701271127212731274127512761277127
81279128012811282128312841285128612871288128912901291129212931294129512961297129812991300130113021303130413051306130713081309131013111312131313141315131613171318131913201321132213231324132513261327132813291330133113321333133413351336133713381339134013411342134313441345134613471348134913501351135213531354135513561357135813591360136113621363136413651366136713681369137013711372137313741375137613771378137913801381138213831384138513861387138813891390139113921393139413951396139713981399140014011402140314041405140614071408140914101411141214131414141514161417141814191420142114221423142414251426142714281429143014311432143314341435143614371438143914401441144214431444144514461447144814491450145114521453145414551456145714581459146014611462146314641465146614671468146914701471147214731474147514761477147814791480148114821483148414851486148714881489149014911492149314941495149614971498149915001501150215031504150515061507150815091510151115121513151415151516151715181519152015211522152315241525152615271528152915301531153215331534153515361537153815391540154115421543154415451546154715481549155015511552155315541555155615571558155915601561156215631564156515661567156815691570157115721573157415751576157715781579158015811582158315841585158615871588158915901591159215931594159515961597159815991600160116021603160416051606160716081609161016111612161316141615161616171618161916201621162216231624162516261627162816291630163116321633163416351636163716381639164016411642164316441645164616471648164916501651165216531654165516561657165816591660166116621663166416651666166716681669167016711672167316741675167616771678167916801681168216831684168516861687168816891690169116921693169416951696169716981699170017011702170317041705170617071708170917101711171217131714171517161717171817191720172117221723172417251726172717281729173017311732173317341735173617371738173917401741174217431744174517461747174817491750175117521753175417551756175717581759176017611762176317641765176617671768176917701771177217731774177517761777177
81779178017811782178317841785178617871788178917901791179217931794179517961797179817991800180118021803180418051806180718081809181018111812181318141815181618171818181918201821182218231824182518261827182818291830183118321833183418351836183718381839184018411842184318441845184618471848184918501851185218531854185518561857185818591860186118621863186418651866186718681869187018711872187318741875187618771878187918801881188218831884188518861887188818891890189118921893189418951896189718981899190019011902190319041905190619071908190919101911191219131914191519161917191819191920192119221923192419251926192719281929193019311932193319341935193619371938193919401941194219431944194519461947194819491950195119521953195419551956195719581959196019611962196319641965196619671968196919701971197219731974197519761977197819791980198119821983198419851986198719881989199019911992199319941995199619971998199920002001200220032004
2.169
2.169
1. Start and stop times are the same. Does it mean that the program hardly takes any time to complete execution?
2. If 1. is false, then at least the number of digits beyond the (.) should differ, which does not happen here. Is my logic correct?
Note: I need to calculate the time taken for execution, and hence the above code.
Yes, this program has likely used less than a millisecond. Try using microsecond resolution with timeval.
e.g:
#include <sys/time.h>
struct timeval stop, start;
gettimeofday(&start, NULL);
//do stuff
gettimeofday(&stop, NULL);
printf("took %lu us\n", (stop.tv_sec - start.tv_sec) * 1000000 + stop.tv_usec - start.tv_usec);
You can then query the difference (in microseconds) between stop.tv_usec - start.tv_usec. Note that this will only work for subsecond times (as tv_usec will loop). For the general case use a combination of tv_sec and tv_usec.
Edit 2016-08-19
A more appropriate approach on system with clock_gettime support would be:
struct timespec start, end;
clock_gettime(CLOCK_MONOTONIC_RAW, &start);
//do stuff
clock_gettime(CLOCK_MONOTONIC_RAW, &end);
uint64_t delta_us = (end.tv_sec - start.tv_sec) * 1000000 + (end.tv_nsec - start.tv_nsec) / 1000;
Here is what I write to get the timestamp in milliseconds.
#include<sys/time.h>
long long timeInMilliseconds(void) {
struct timeval tv;
gettimeofday(&tv,NULL);
return (((long long)tv.tv_sec)*1000)+(tv.tv_usec/1000);
}
A couple of things might affect the results you're seeing:
You're treating clock_t as a floating-point type; I don't think it is.
You might be expecting (1^4) to do something other than compute the bitwise XOR of 1 and 4, which is 5.
Since the XOR is of constants, it's probably folded by the compiler, meaning it doesn't add a lot of work at runtime.
Since the output is buffered (it's just formatting the string and writing it to memory), it completes very quickly indeed.
You're not specifying how fast your machine is, but it's not unreasonable for this to run very quickly on modern hardware, no.
If you have it, try adding a call to sleep() between the start/stop snapshots. Note that sleep() is POSIX though, not standard C.
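For the first point, one way to print a clock() interval is to convert the difference explicitly before formatting it. A minimal sketch, with a helper name (clock_diff_ms) invented for this example:

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: processor time between two clock() samples, in milliseconds. */
double clock_diff_ms(clock_t start, clock_t stop)
{
    /* clock() returns (clock_t)-1 when processor time is not available. */
    assert(start != (clock_t)-1 && stop != (clock_t)-1);

    /* Convert the tick difference to double explicitly instead of passing clock_t to %f. */
    return 1000.0 * (double)(stop - start) / CLOCKS_PER_SEC;
}
```

The result is a double, so it can be printed with printf("%6.3f ms\n", clock_diff_ms(start, stop)).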
This code snippet can be used for displaying time in seconds,milliseconds and microseconds:
#include <sys/time.h>
struct timeval start, stop;
double secs = 0;
gettimeofday(&start, NULL);
// Do stuff here
gettimeofday(&stop, NULL);
secs = (double)(stop.tv_usec - start.tv_usec) / 1000000 + (double)(stop.tv_sec - start.tv_sec);
printf("time taken %f\n",secs);
You can use gettimeofday() together with the timedifference_msec() function below to calculate the number of milliseconds elapsed between two samples:
#include <sys/time.h>
#include <stdio.h>
float timedifference_msec(struct timeval t0, struct timeval t1)
{
return (t1.tv_sec - t0.tv_sec) * 1000.0f + (t1.tv_usec - t0.tv_usec) / 1000.0f;
}
int main(void)
{
struct timeval t0;
struct timeval t1;
float elapsed;
gettimeofday(&t0, 0);
/* ... YOUR CODE HERE ... */
gettimeofday(&t1, 0);
elapsed = timedifference_msec(t0, t1);
printf("Code executed in %f milliseconds.\n", elapsed);
return 0;
}
Note that, when using gettimeofday(), you need to take seconds into account even if you only care about microsecond differences because tv_usec will wrap back to zero every second and you have no way of knowing beforehand at which point within a second each sample is obtained.
From man clock:
The clock() function returns an approximation of processor time used by the program.
So there is no indication you should treat it as milliseconds. Some standards require precise value of CLOCKS_PER_SEC, so you could rely on it, but I don't think it is advisable.
Second thing is that, as @unwind stated, it is not float/double. man times suggests that it will be an int.
Also note that:
this function will return the same value approximately every 72 minutes
And if you are unlucky you might hit the moment it is just about to start counting from zero, thus getting negative or huge value (depending on whether you store the result as signed or unsigned value).
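The 72-minute figure is consistent with a 32-bit clock_t counting one million ticks per second. Both numbers are assumptions about the platform the man page had in mind, and the helper below (clock_wrap_minutes) is only an illustration of the arithmetic:

```c
#include <assert.h>

/* Hypothetical helper: minutes until a counter of `bits` bits wraps
 * when it advances `ticks_per_sec` times per second. */
double clock_wrap_minutes(unsigned bits, double ticks_per_sec)
{
    assert(bits > 0 && bits < 64 && ticks_per_sec > 0.0);

    /* Number of distinct values the counter can hold before repeating. */
    double distinct_values = (double)(1ULL << bits);

    return distinct_values / ticks_per_sec / 60.0;
}
```

With 32 bits and 1000000 ticks per second this gives 4294967296 / 1000000 / 60, which is roughly 71.6 minutes.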
This:
printf("\n\n%6.3f", stop);
Will most probably print garbage as treating any int as float is really not defined behaviour (and I think this is where most of your problem comes). If you want to make sure you can always do:
printf("\n\n%6.3f", (double) stop);
Though I would rather go for printing it as long long int at first:
printf("\n\n%lld", (long long int) stop);
The standard C library provides timespec_get. It can tell time with up to nanosecond precision, if the system supports it. Calling it, however, takes a bit more effort because it involves a struct. Here's a function that just converts the struct to a simple 64-bit integer so you can get the time in milliseconds.
#include <stdio.h>
#include <inttypes.h>
#include <time.h>
int64_t millis()
{
struct timespec now;
timespec_get(&now, TIME_UTC);
return ((int64_t) now.tv_sec) * 1000 + ((int64_t) now.tv_nsec) / 1000000;
}
int main(void)
{
printf("Unix timestamp with millisecond precision: %" PRId64 "\n", millis());
}
Unlike clock, this function returns a Unix timestamp so it will correctly account for the time spent in blocking functions, such as sleep.
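To see the difference between the two kinds of clock, one can time a short sleep with both. This is only a sketch: the function name time_a_sleep is made up, nanosleep is POSIX rather than standard C, and a C11 compiler is assumed for timespec_get.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: needed for nanosleep in strict modes */
#include <assert.h>
#include <time.h>

/* Hypothetical helper: sleep for sleep_ms and report CPU time vs wall time, both in ms. */
void time_a_sleep(long sleep_ms, double *cpu_ms, double *wall_ms)
{
    assert(sleep_ms >= 0 && sleep_ms < 1000);

    struct timespec t0, t1;
    struct timespec delay = { .tv_sec = 0, .tv_nsec = sleep_ms * 1000000L };

    clock_t c0 = clock();
    timespec_get(&t0, TIME_UTC);
    nanosleep(&delay, NULL);            /* blocks without using the CPU */
    timespec_get(&t1, TIME_UTC);
    clock_t c1 = clock();

    /* clock() counts processor time only, so the sleep barely registers here. */
    *cpu_ms = 1000.0 * (double)(c1 - c0) / CLOCKS_PER_SEC;

    /* timespec_get follows the calendar clock, so the sleep is fully included. */
    *wall_ms = 1000.0 * (double)(t1.tv_sec - t0.tv_sec)
             + (double)(t1.tv_nsec - t0.tv_nsec) / 1e6;
}
```

On a typical system the wall-clock figure should come out close to the requested sleep, while the CPU figure stays near zero.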
Modern processors are fast enough that the whole run can finish within one tick of the timer, so the measured difference may be zero. In this case the interval between the start and end times is so small that both readings are the same after rounding.

clock_gettime alternative in Mac OS X

When compiling a program I wrote on Mac OS X after installing the necessary libraries through MacPorts, I get this error:
In function 'nanotime':
error: 'CLOCK_REALTIME' undeclared (first use in this function)
error: (Each undeclared identifier is reported only once
error: for each function it appears in.)
It appears that clock_gettime is not implemented in Mac OS X. Is there an alternative means of getting the epoch time in nanoseconds? Unfortunately gettimeofday is in microseconds.
After hours of perusing different answers, blogs, and headers, I found a portable way to get the current time:
#include <time.h>
#include <sys/time.h>
#ifdef __MACH__
#include <mach/clock.h>
#include <mach/mach.h>
#endif
struct timespec ts;
#ifdef __MACH__ // OS X does not have clock_gettime, use clock_get_time
clock_serv_t cclock;
mach_timespec_t mts;
host_get_clock_service(mach_host_self(), CALENDAR_CLOCK, &cclock);
clock_get_time(cclock, &mts);
mach_port_deallocate(mach_task_self(), cclock);
ts.tv_sec = mts.tv_sec;
ts.tv_nsec = mts.tv_nsec;
#else
clock_gettime(CLOCK_REALTIME, &ts);
#endif
or check out this gist: https://gist.github.com/1087739
Hope this saves someone time. Cheers!
None of the solutions above answers the question. Either they don't give you absolute Unix time, or their accuracy is 1 microsecond. The most popular solution by jbenet is slow (~6000ns) and does not count in nanoseconds even though its return suggests so. Below is a test for 2 solutions suggested by jbenet and Dmitri B, plus my take on this. You can run the code without changes.
The 3rd solution does count in nanoseconds and gives you absolute Unix time reasonably fast (~90ns). So if someone finds it useful, please let us all know here :-). I will stick to the one from Dmitri B (solution #1 in the code); it fits my needs better.
I needed commercial quality alternative to clock_gettime() to make pthread_…timed.. calls, and found this discussion very helpful. Thanks guys.
/*
Ratings of alternatives to clock_gettime() to use with pthread timed waits:
Solution 1 "gettimeofday":
Complexity : simple
Portability : POSIX 1
timespec : easy to convert from timeval to timespec
granularity : 1000 ns,
call : 120 ns,
Rating : the best.
Solution 2 "host_get_clock_service, clock_get_time":
Complexity : simple (error handling?)
Portability : Mac specific (is it always available?)
timespec : yes (struct timespec return)
granularity : 1000 ns (don't be fooled by timespec format)
call time : 6000 ns
Rating : the worst.
Solution 3 "mach_absolute_time + gettimeofday once":
Complexity : simple..average (requires initialisation)
Portability : Mac specific. Always available
timespec : system clock can be converted to timespec without float-math
granularity : 1 ns.
call time : 90 ns unoptimised.
Rating : not bad, but do we really need nanoseconds timeout?
References:
- OS X is UNIX System 3 [U03] certified
http://www.opengroup.org/homepage-items/c987.html
- UNIX System 3 <--> POSIX 1 <--> IEEE Std 1003.1-1988
http://en.wikipedia.org/wiki/POSIX
http://www.unix.org/version3/
- gettimeofday() is mandatory on U03,
clock_..() functions are optional on U03,
clock_..() are part of POSIX Realtime extensions
http://www.unix.org/version3/inttables.pdf
- clock_gettime() is not available on MacMini OS X
(Xcode > Preferences > Downloads > Command Line Tools = Installed)
- OS X recommends to use gettimeofday to calculate values for timespec
https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man3/pthread_cond_timedwait.3.html
- timeval holds microseconds, timespec - nanoseconds
http://www.gnu.org/software/libc/manual/html_node/Elapsed-Time.html
- microtime() is used by kernel to implement gettimeofday()
http://ftp.tw.freebsd.org/pub/branches/7.0-stable/src/sys/kern/kern_time.c
- mach_absolute_time() is really fast
http://www.opensource.apple.com/source/Libc/Libc-320.1.3/i386/mach/mach_absolute_time.c
- Only 9 decimal digits have meaning when int nanoseconds converted to double seconds
Tutorial: Performance and Time post uses .12 precision for nanoseconds
http://www.macresearch.org/tutorial_performance_and_time
Example:
Three ways to prepare absolute time 1500 milliseconds in the future to use with pthread timed functions.
Output, N = 3, stock MacMini, OSX 10.7.5, 2.3GHz i5, 2GB 1333MHz DDR3:
inittime.tv_sec = 1390659993
inittime.tv_nsec = 361539000
initclock = 76672695144136
get_abs_future_time_0() : 1390659994.861599000
get_abs_future_time_0() : 1390659994.861599000
get_abs_future_time_0() : 1390659994.861599000
get_abs_future_time_1() : 1390659994.861618000
get_abs_future_time_1() : 1390659994.861634000
get_abs_future_time_1() : 1390659994.861642000
get_abs_future_time_2() : 1390659994.861643671
get_abs_future_time_2() : 1390659994.861643877
get_abs_future_time_2() : 1390659994.861643972
*/
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h> /* gettimeofday */
#include <mach/mach_time.h> /* mach_absolute_time */
#include <mach/mach.h> /* host_get_clock_service, mach_... */
#include <mach/clock.h> /* clock_get_time */
#define BILLION 1000000000L
#define MILLION 1000000L
#define NORMALISE_TIMESPEC( ts, uint_milli ) \
do { \
ts.tv_sec += uint_milli / 1000u; \
ts.tv_nsec += (uint_milli % 1000u) * MILLION; \
ts.tv_sec += ts.tv_nsec / BILLION; \
ts.tv_nsec = ts.tv_nsec % BILLION; \
} while (0)
static mach_timebase_info_data_t timebase = { 0, 0 }; /* numer = 0, denom = 0 */
static struct timespec inittime = { 0, 0 }; /* nanoseconds since 1-Jan-1970 to init() */
static uint64_t initclock; /* ticks since boot to init() */
void init()
{
struct timeval micro; /* microseconds since 1 Jan 1970 */
if (mach_timebase_info(&timebase) != 0)
abort(); /* very unlikely error */
if (gettimeofday(&micro, NULL) != 0)
abort(); /* very unlikely error */
initclock = mach_absolute_time();
inittime.tv_sec = micro.tv_sec;
inittime.tv_nsec = micro.tv_usec * 1000;
printf("\tinittime.tv_sec = %ld\n", inittime.tv_sec);
printf("\tinittime.tv_nsec = %ld\n", inittime.tv_nsec);
printf("\tinitclock = %ld\n", (long)initclock);
}
/*
* Get absolute future time for pthread timed calls
* Solution 1: microseconds granularity
*/
struct timespec get_abs_future_time_coarse(unsigned milli)
{
struct timespec future; /* ns since 1 Jan 1970 to 1500 ms in the future */
struct timeval micro = {0, 0}; /* 1 Jan 1970 */
(void) gettimeofday(&micro, NULL);
future.tv_sec = micro.tv_sec;
future.tv_nsec = micro.tv_usec * 1000;
NORMALISE_TIMESPEC( future, milli );
return future;
}
/*
* Solution 2: via clock service
*/
struct timespec get_abs_future_time_served(unsigned milli)
{
struct timespec future;
clock_serv_t cclock;
mach_timespec_t mts;
host_get_clock_service(mach_host_self(), CALENDAR_CLOCK, &cclock);
clock_get_time(cclock, &mts);
mach_port_deallocate(mach_task_self(), cclock);
future.tv_sec = mts.tv_sec;
future.tv_nsec = mts.tv_nsec;
NORMALISE_TIMESPEC( future, milli );
return future;
}
/*
* Solution 3: nanosecond granularity
*/
struct timespec get_abs_future_time_fine(unsigned milli)
{
struct timespec future; /* ns since 1 Jan 1970 to 1500 ms in future */
uint64_t clock; /* ticks since init */
uint64_t nano; /* nanoseconds since init */
clock = mach_absolute_time() - initclock;
nano = clock * (uint64_t)timebase.numer / (uint64_t)timebase.denom;
future = inittime;
future.tv_sec += nano / BILLION;
future.tv_nsec += nano % BILLION;
NORMALISE_TIMESPEC( future, milli );
return future;
}
#define N 3
int main()
{
int i, j;
struct timespec time[3][N];
struct timespec (*get_abs_future_time[])(unsigned milli) =
{
&get_abs_future_time_coarse,
&get_abs_future_time_served,
&get_abs_future_time_fine
};
init();
for (j = 0; j < 3; j++)
for (i = 0; i < N; i++)
time[j][i] = get_abs_future_time[j](1500); /* now() + 1500 ms */
for (j = 0; j < 3; j++)
for (i = 0; i < N; i++)
printf("get_abs_future_time_%d() : %10ld.%09ld\n",
j, time[j][i].tv_sec, time[j][i].tv_nsec);
return 0;
}
In effect, it seems not to be implemented for macOS before Sierra 10.12. You may want to look at this blog entry. The main idea is in the following code snippet:
#include <mach/mach_time.h>
#define ORWL_NANO (+1.0E-9)
#define ORWL_GIGA UINT64_C(1000000000)
static double orwl_timebase = 0.0;
static uint64_t orwl_timestart = 0;
struct timespec orwl_gettime(void) {
// be more careful in a multithreaded environment
if (!orwl_timestart) {
mach_timebase_info_data_t tb = { 0 };
mach_timebase_info(&tb);
orwl_timebase = tb.numer;
orwl_timebase /= tb.denom;
orwl_timestart = mach_absolute_time();
}
struct timespec t;
double diff = (mach_absolute_time() - orwl_timestart) * orwl_timebase;
t.tv_sec = diff * ORWL_NANO;
t.tv_nsec = diff - (t.tv_sec * ORWL_GIGA);
return t;
}
#if defined(__MACH__) && !defined(CLOCK_REALTIME)
#include <sys/time.h>
#define CLOCK_REALTIME 0
// clock_gettime is not implemented on older versions of OS X (< 10.12).
// If implemented, CLOCK_REALTIME will have already been defined.
int clock_gettime(int /*clk_id*/, struct timespec* t) {
struct timeval now;
int rv = gettimeofday(&now, NULL);
if (rv) return rv;
t->tv_sec = now.tv_sec;
t->tv_nsec = now.tv_usec * 1000;
return 0;
}
#endif
Everything you need is described in Technical Q&A QA1398: Mach Absolute Time Units; basically, the function you want is mach_absolute_time.
Here's a slightly earlier version of the sample code from that page that does everything using Mach calls (the current version uses AbsoluteToNanoseconds from CoreServices). In current OS X (i.e., on Snow Leopard on x86_64) the absolute time values are actually in nanoseconds and so don't actually require any conversion at all. So, if you're good and writing portable code, you'll convert, but if you're just doing something quick and dirty for yourself, you needn't bother.
FWIW, mach_absolute_time is really fast.
uint64_t GetPIDTimeInNanoseconds(void)
{
uint64_t start;
uint64_t end;
uint64_t elapsed;
uint64_t elapsedNano;
static mach_timebase_info_data_t sTimebaseInfo;
// Start the clock.
start = mach_absolute_time();
// Call getpid. This will produce inaccurate results because
// we're only making a single system call. For more accurate
// results you should call getpid multiple times and average
// the results.
(void) getpid();
// Stop the clock.
end = mach_absolute_time();
// Calculate the duration.
elapsed = end - start;
// Convert to nanoseconds.
// If this is the first time we've run, get the timebase.
// We can use denom == 0 to indicate that sTimebaseInfo is
// uninitialised because it makes no sense to have a zero
// denominator in a fraction.
if ( sTimebaseInfo.denom == 0 ) {
(void) mach_timebase_info(&sTimebaseInfo);
}
// Do the maths. We hope that the multiplication doesn't
// overflow; the price you pay for working in fixed point.
elapsedNano = elapsed * sTimebaseInfo.numer / sTimebaseInfo.denom;
printf("multiplier %u / %u\n", sTimebaseInfo.numer, sTimebaseInfo.denom);
return elapsedNano;
}
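If the possible overflow in elapsed * numer is a concern, one option is to split the tick count into a quotient and a remainder before scaling. This is a sketch and not part of Apple's sample; the function name ticks_to_ns is invented, and the numer/denom parameters stand in for the fields of mach_timebase_info_data_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: ticks * numer / denom without overflowing the intermediate product. */
uint64_t ticks_to_ns(uint64_t ticks, uint32_t numer, uint32_t denom)
{
    assert(denom != 0);

    uint64_t whole = ticks / denom;   /* complete multiples of denom */
    uint64_t rest  = ticks % denom;   /* remainder, always smaller than denom */

    /* rest * numer fits in 64 bits because both factors are below 2^32;
     * whole * numer only overflows if the final result itself does not fit. */
    return whole * numer + rest * numer / denom;
}
```

The result equals the truncated value of ticks * numer / denom, because the whole-multiple part scales exactly and only the remainder term is rounded down.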
Note that macOS Sierra 10.12 now supports clock_gettime():
#include <stdio.h>
#include <time.h>
int main() {
struct timespec res;
struct timespec time;
clock_getres(CLOCK_REALTIME, &res);
clock_gettime(CLOCK_REALTIME, &time);
printf("CLOCK_REALTIME: res.tv_sec=%lu res.tv_nsec=%lu\n", res.tv_sec, res.tv_nsec);
printf("CLOCK_REALTIME: time.tv_sec=%lu time.tv_nsec=%lu\n", time.tv_sec, time.tv_nsec);
}
It does provide nanoseconds; however, the resolution is 1000, so it is (in)effectively limited to microseconds:
CLOCK_REALTIME: res.tv_sec=0 res.tv_nsec=1000
CLOCK_REALTIME: time.tv_sec=1475279260 time.tv_nsec=525627000
You will need Xcode 8 or later to use this feature. Code compiled to use it will not run on Mac OS X 10.11 or earlier.
Thanks for your posts
I think you can add the following lines
#ifdef __MACH__
#include <mach/mach_time.h>
#define CLOCK_REALTIME 0
#define CLOCK_MONOTONIC 0
int clock_gettime(int clk_id, struct timespec *t){
(void)clk_id; /* both clock ids map to the same Mach clock here */
mach_timebase_info_data_t timebase;
mach_timebase_info(&timebase);
uint64_t time;
time = mach_absolute_time();
/* total nanoseconds, then split into whole seconds and the sub-second remainder */
uint64_t nseconds = time * (uint64_t)timebase.numer / (uint64_t)timebase.denom;
t->tv_sec = nseconds / 1000000000ULL;
t->tv_nsec = nseconds % 1000000000ULL; /* tv_nsec must stay below one billion */
return 0;
}
#else
#include <time.h>
#endif
Let me know what you get for latency and granularity
Maristic has the best answer here to date. Let me simplify and add a remark. #include and Init():
#include <mach/mach_time.h>
double conversion_factor;
void Init() {
mach_timebase_info_data_t timebase;
mach_timebase_info(&timebase);
conversion_factor = (double)timebase.numer / (double)timebase.denom;
}
Use as:
uint64_t t1, t2;
Init();
t1 = mach_absolute_time();
/* profiled code here */
t2 = mach_absolute_time();
double duration_ns = (double)(t2 - t1) * conversion_factor;
Such a timer has a latency of 65ns +/- 2ns (2GHz CPU). Use this if you need the "time evolution" of a single execution. Otherwise, loop your code 10000 times and profile even with gettimeofday(), which is portable (POSIX) and has a latency of 100ns +/- 0.5ns (though only 1us granularity).
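As a rough illustration of the "loop your code 10000 times" suggestion, here is a minimal sketch that averages many calls using gettimeofday(). The names mean_ns_per_call, work and sink are made up for this example, and work() is just a placeholder for the code you want to profile.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper: run fn() `iterations` times and return the mean
 * wall-clock time per call in nanoseconds. */
static double mean_ns_per_call(void (*fn)(void), long iterations)
{
    struct timeval t0, t1;

    gettimeofday(&t0, NULL);
    for (long i = 0; i < iterations; ++i) {
        fn();
    }
    gettimeofday(&t1, NULL);

    /* Subtract the fields first so no precision is lost on large epoch values. */
    double elapsed_us = (double)(t1.tv_sec - t0.tv_sec) * 1.0e6
                      + (double)(t1.tv_usec - t0.tv_usec);
    return elapsed_us * 1.0e3 / (double)iterations;
}

/* Placeholder workload; volatile keeps the compiler from removing it. */
static volatile unsigned long sink;
static void work(void) { sink += 1; }
```

Calling mean_ns_per_call(work, 10000) returns the average cost of one work() call, including the loop and function-call overhead, so very short workloads will be dominated by that overhead.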
I tried the version with clock_get_time, and did cache the host_get_clock_service call. It's way slower than gettimeofday, it takes several microseconds per invocation. And, what's worse, the return value has steps of 1000, i.e. it's still microsecond granularity.
I'd advise using gettimeofday and multiplying tv_usec by 1000.
Based on the open source mach_absolute_time.c we can see that the line extern mach_port_t clock_port; tells us there's a mach port already initialized for monotonic time. This clock port can be accessed directly without having to resort to calling mach_absolute_time then converting back to a struct timespec. Bypassing a call to mach_absolute_time should improve performance.
I created a small Github repo (PosixMachTiming) with the code based on the extern clock_port and a similar thread. PosixMachTiming emulates clock_gettime for CLOCK_REALTIME and CLOCK_MONOTONIC. It also emulates the function clock_nanosleep for absolute monotonic time. Please give it a try and see how the performance compares. Maybe you might want to create comparative tests or emulate other POSIX clocks/functions?
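As of at least as far back as Mountain Lion, mach_absolute_time() on Intel Macs returns nanoseconds (the timebase is 1/1) and not absolute time (which was the number of bus cycles). This is not guaranteed on other hardware, so portable code should still scale the result with mach_timebase_info().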
The following code on my MacBook Pro (2 GHz Core i7) showed that the time to call mach_absolute_time() averaged 39 ns over 10 runs (min 35, max 45), which is basically the time between the return of the two calls to mach_absolute_time(), about 1 invocation:
#include <stdint.h>
#include <mach/mach_time.h>
#include <iostream>
using namespace std;
int main()
{
uint64_t now, then;
uint64_t abs;
then = mach_absolute_time(); // return nanoseconds
now = mach_absolute_time();
abs = now - then;
cout << "nanoseconds = " << abs << endl;
}
void clock_get_uptime(uint64_t *result);
void clock_get_system_microtime( uint32_t *secs,
uint32_t *microsecs);
void clock_get_system_nanotime( uint32_t *secs,
uint32_t *nanosecs);
void clock_get_calendar_microtime( uint32_t *secs,
uint32_t *microsecs);
void clock_get_calendar_nanotime( uint32_t *secs,
uint32_t *nanosecs);
For macOS, you can find good information on Apple's developer pages:
https://developer.apple.com/library/content/documentation/Darwin/Conceptual/KernelProgramming/services/services.html
I found another portable solution.
Declare in some header file (or even in your source one):
/* If compiled on DARWIN/Apple platforms. */
#ifdef DARWIN
#define CLOCK_REALTIME 0x2d4e1588
#define CLOCK_MONOTONIC 0x0
#endif /* DARWIN */
And then add the function implementation:
#ifdef DARWIN
/*
* Below we provide an alternative for clock_gettime,
* which is not implemented in Mac OS X.
*/
static inline int clock_gettime(int clock_id, struct timespec *ts)
{
struct timeval tv;
if (clock_id != CLOCK_REALTIME)
{
errno = EINVAL;
return -1;
}
if (gettimeofday(&tv, NULL) < 0)
{
return -1;
}
ts->tv_sec = tv.tv_sec;
ts->tv_nsec = tv.tv_usec * 1000;
return 0;
}
#endif /* DARWIN */
Don't forget to include <time.h>, <sys/time.h> (for gettimeofday) and <errno.h> (for errno and EINVAL).

How do I measure time in C?

I want to find out for how long (approximately) some block of code executes. Something like this:
startStopwatch();
// do some calculations
stopStopwatch();
printf("%lf", timeMeasuredInSeconds);
How?
You can use the clock function from <time.h>. Note that it measures the processor time used by the program, not wall-clock time.
Example:
clock_t start = clock();
/*Do something*/
clock_t end = clock();
float seconds = (float)(end - start) / CLOCKS_PER_SEC;
You can use the time.h library, specifically the time and difftime functions:
/* difftime example */
#include <stdio.h>
#include <time.h>
int main ()
{
time_t start,end;
double dif;
time (&start);
// Do some calculation.
time (&end);
dif = difftime (end,start);
printf ("Your calculations took %.2lf seconds to run.\n", dif );
return 0;
}
(Example adapted from the difftime webpage linked above.)
Please note that this method only has one-second resolution: time_t typically records whole seconds since the UNIX epoch (Jan 1st, 1970).
Sometimes you need to measure wall-clock (astronomical) time rather than CPU time (this applies especially on Linux):
#include <stdio.h>
#include <time.h>
double what_time_is_it()
{
struct timespec now;
clock_gettime(CLOCK_REALTIME, &now);
return now.tv_sec + now.tv_nsec*1e-9;
}
int main() {
double time = what_time_is_it();
printf("time taken %.6lf\n", what_time_is_it() - time);
return 0;
}
The standard C library provides the time function, which is useful if you only need to compare seconds. If you need millisecond precision, though, the most portable way is to call timespec_get (standard since C11). It can report time with up to nanosecond precision, if the system supports it. Calling it, however, takes a bit more effort because it involves a struct. Here's a function that just converts the struct to a simple 64-bit integer.
#include <stdio.h>
#include <inttypes.h>
#include <time.h>
int64_t millis()
{
struct timespec now;
timespec_get(&now, TIME_UTC);
return ((int64_t) now.tv_sec) * 1000 + ((int64_t) now.tv_nsec) / 1000000;
}
int main(void)
{
printf("Unix timestamp with millisecond precision: %" PRId64 "\n", millis());
}
Unlike clock, this function returns a Unix timestamp so it will correctly account for the time spent in blocking functions, such as sleep. This is a useful property for benchmarking and implementing delays that take running time into account.
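To make the difference from clock concrete, the sketch below times a sleep with both methods. It assumes a POSIX system, because it uses nanosleep() to block; time_a_sleep is a hypothetical helper name invented for this example.

```c
#define _POSIX_C_SOURCE 200809L  /* for nanosleep(); an assumption of this sketch */
#include <time.h>

/* Hypothetical helper: sleep for `ms` milliseconds and report how much
 * CPU time (clock) and wall-clock time (timespec_get) passed meanwhile. */
static void time_a_sleep(long ms, double *cpu_s, double *wall_s)
{
    struct timespec req = { ms / 1000, (ms % 1000) * 1000000L };
    struct timespec w0, w1;

    clock_t c0 = clock();
    timespec_get(&w0, TIME_UTC);
    nanosleep(&req, NULL);           /* blocks without using the CPU */
    timespec_get(&w1, TIME_UTC);
    clock_t c1 = clock();

    *cpu_s  = (double)(c1 - c0) / CLOCKS_PER_SEC;
    *wall_s = (double)(w1.tv_sec - w0.tv_sec)
            + (double)(w1.tv_nsec - w0.tv_nsec) * 1e-9;
}
```

After time_a_sleep(50, &cpu, &wall), wall should come out at roughly 0.05 seconds while cpu stays close to zero, since the process did almost no work while it was asleep.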
GetTickCount().
#include <stdio.h>
#include <windows.h>
void MeasureIt()
{
DWORD dwStartTime = GetTickCount();
DWORD dwElapsed;
DoSomethingThatYouWantToTime();
dwElapsed = GetTickCount() - dwStartTime;
/* Whole seconds, then the remaining milliseconds zero-padded to three digits. */
printf("It took %lu.%03lu seconds to complete\n", (unsigned long)(dwElapsed/1000), (unsigned long)(dwElapsed%1000));
}
I would use the QueryPerformanceCounter and QueryPerformanceFrequency functions of the Windows API. Call the former before and after the block and subtract (current − old) to get the number of "ticks" between the instances. Divide this by the value obtained by the latter function to get the duration in seconds.
For the sake of completeness, there is a more precise counter than GetTickCount() or clock(), which give you only a 32-bit result that can overflow relatively quickly: QueryPerformanceCounter(). QueryPerformanceFrequency() returns the counter frequency, which is the divisor for the difference between two counter readings, something like CLOCKS_PER_SEC in <time.h>.
#include <stdio.h>
#include <windows.h>
int main()
{
LARGE_INTEGER tu_freq, tu_start, tu_end;
__int64 t_ns;
QueryPerformanceFrequency(&tu_freq);
QueryPerformanceCounter(&tu_start);
/* do your stuff */
QueryPerformanceCounter(&tu_end);
t_ns = 1000000000ULL * (tu_end.QuadPart - tu_start.QuadPart) / tu_freq.QuadPart;
printf("dt = %g[s]; (%lld)[ns]\n", t_ns/(double)1e+9, t_ns);
return 0;
}
If you don't need fantastic resolution, you could use GetTickCount(): http://msdn.microsoft.com/en-us/library/ms724408(VS.85).aspx
(If it's for something other than your own simple diagnostics, then note that this number can wrap around, so you'll need to handle that with a little arithmetic).
QueryPerformanceCounter is another reasonable option. (It's also described on MSDN)
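The "little arithmetic" for wrap-around can be as small as one unsigned subtraction. The sketch below uses plain uint32_t instead of DWORD so it compiles anywhere; ticks_elapsed is a hypothetical helper name, and it assumes the counter wrapped at most once between the two readings.

```c
#include <stdint.h>

/* Elapsed ticks between two readings of a 32-bit counter that wraps to 0
 * after UINT32_MAX (a 32-bit millisecond counter does so after roughly
 * 49.7 days). Unsigned subtraction is defined modulo 2^32, so the result
 * is still correct when `now` has wrapped past `start` once. */
static uint32_t ticks_elapsed(uint32_t start, uint32_t now)
{
    return (uint32_t)(now - start);
}
```

For example, a start reading of 0xFFFFFFF0 followed by a reading of 0x00000010 gives 0x20 (32 ticks), not a huge or negative value.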

Calculating elapsed time in a C program in milliseconds

I want to calculate the time in milliseconds taken by the execution of some part of my program. I've been looking online, but there's not much info on this topic. Any of you know how to do this?
Best way to answer is with an example:
#include <sys/time.h>
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <time.h>
/* Return 1 if the difference is negative, otherwise 0. */
int timeval_subtract(struct timeval *result, struct timeval *t2, struct timeval *t1)
{
long long diff = (t2->tv_usec + 1000000LL * t2->tv_sec) - (t1->tv_usec + 1000000LL * t1->tv_sec);
result->tv_sec = diff / 1000000;
result->tv_usec = diff % 1000000;
return (diff<0);
}
void timeval_print(struct timeval *tv)
{
char buffer[30];
time_t curtime;
printf("%ld.%06ld", (long)tv->tv_sec, (long)tv->tv_usec);
curtime = tv->tv_sec;
strftime(buffer, 30, "%m-%d-%Y %T", localtime(&curtime));
printf(" = %s.%06ld\n", buffer, (long)tv->tv_usec);
}
int main()
{
struct timeval tvBegin, tvEnd, tvDiff;
// begin
gettimeofday(&tvBegin, NULL);
timeval_print(&tvBegin);
// lengthy operation
int i,j;
for(i=0;i<999999L;++i) {
j=sqrt(i);
}
//end
gettimeofday(&tvEnd, NULL);
timeval_print(&tvEnd);
// diff
timeval_subtract(&tvDiff, &tvEnd, &tvBegin);
printf("%ld.%06ld\n", (long)tvDiff.tv_sec, (long)tvDiff.tv_usec);
return 0;
}
Another option (at least on some UNIX systems) is clock_gettime and related functions. These give access to various realtime clocks; you can select one of the higher-resolution ones and discard the resolution you don't need.
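A minimal sketch of that idea, assuming a POSIX system that provides CLOCK_MONOTONIC: read the nanosecond clock and keep only milliseconds. monotonic_ms is a hypothetical helper name used for illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime(); an assumption of this sketch */
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: milliseconds from a monotonic clock, or -1 on error.
 * The value is only meaningful as a difference between two readings. */
static int64_t monotonic_ms(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) {
        return -1;
    }
    /* Throw away the sub-millisecond part of the nanosecond field. */
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}
```

Take one reading before and one after the code being timed and subtract them. A monotonic clock is used here because it is not affected by adjustments to the system time.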
The gettimeofday function returns the time with microsecond precision (if the platform can support that, of course):
"The gettimeofday() function shall obtain the current time, expressed as seconds and microseconds since the Epoch, and store it in the timeval structure pointed to by tp. The resolution of the system clock is unspecified."
C libraries have a function to let you get the system time. You can calculate elapsed time after you capture the start and stop times.
The function is called gettimeofday() and you can look at the man page to find out what to include and how to use it.
On Windows, you can just do this:
DWORD dwTickCount = GetTickCount();
// Perform some things.
printf("Code took: %lums\n", (unsigned long)(GetTickCount() - dwTickCount));
Not the most general/elegant solution, but nice and quick when you need it.

Resources