Summary
I'm trying to write an embedded application for an MC9S12VR microcontroller. This is a 16-bit microcontroller but some of the values I deal with are 32 bits wide and while debugging I've captured some anomalous values that seem to be due to torn reads.
I'm writing the firmware for this micro in C89 and running it through the Freescale HC12 compiler, and I'm wondering if anyone has any suggestions on how to prevent them on this particular microcontroller assuming that this is the case.
Details
Part of my application involves driving a motor and estimating its position and speed based on pulses generated by an encoder (a pulse is generated on every full rotation of the motor).
For this to work, I need to configure one of the MCU timers so that I can track the time elapsed between pulses. However, the timer has a clock rate of 3 MHz (after prescaling) and the timer counter register is only 16-bit, so the counter overflows every ~22ms. To compensate, I set up an interrupt handler that fires on a timer counter overflow, and this increments an "overflow" variable by 1:
// TEMP
static volatile unsigned long _timerOverflowsNoReset;
// ...
#ifndef __INTELLISENSE__
__interrupt VectorNumber_Vtimovf
#endif
void timovf_isr(void)
{
// Clear the interrupt.
TFLG2_TOF = 1;
// TEMP
_timerOverflowsNoReset++;
// ...
}
I can then work out the current time from this:
// TEMP
unsigned long MOTOR_GetCurrentTime(void)
{
const unsigned long ticksPerCycle = 0xFFFF;
const unsigned long ticksPerMicrosecond = 3; // 24 MHZ / 8 (prescaler)
const unsigned long ticks = _timerOverflowsNoReset * ticksPerCycle + TCNT;
const unsigned long microseconds = ticks / ticksPerMicrosecond;
return microseconds;
}
In main.c, I've temporarily written some debugging code that drives the motor in one direction and then takes "snapshots" of various data at regular intervals:
// Test
for (iter = 0; iter < 10; iter++)
{
nextWait += SECONDS(secondsPerIteration);
while ((_test2Snapshots[iter].elapsed = MOTOR_GetCurrentTime() - startTime) < nextWait);
_test2Snapshots[iter].position = MOTOR_GetCount();
_test2Snapshots[iter].phase = MOTOR_GetPhase();
_test2Snapshots[iter].time = MOTOR_GetCurrentTime() - startTime;
// ...
In this test I'm reading MOTOR_GetCurrentTime() in two places very close together in code and assign them to properties of a globally available struct.
In almost every case, I find that the first value read is a few microseconds beyond the point the while loop should terminate, and the second read is a few microseconds after that - this is expected. However, occasionally I find the first read is significantly higher than the point the while loop should terminate at, and then the second read is less than the first value (as well as the termination value).
The screenshot below gives an example of this. It took about 20 repeats of the test before I was able to reproduce it. In the code, <snapshot>.elapsed is written to before <snapshot>.time so I expect it to have a slightly smaller value:
For snapshot[8], my application first reads 20010014 (over 10ms beyond where it should have terminated the busy-loop) and then reads 19988209. As I mentioned above, an overflow occurs every 22ms - specifically, a difference in _timerOverflowsNoReset of one unit will produce a difference of 65535 / 3 in the calculated microsecond value. If we account for this:
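One overflow is worth 65535 / 3 ≈ 21845 µs in that calculation, and 20010014 - 21845 = 19988169 µs; the second read of 19988209 µs is then only about 40 µs later.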
A difference of 40 isn't that far off the discrepancy I see between my other pairs of reads (~23/24), so my guess is that there's some kind of tear going on involving an off-by-one read of _timerOverflowsNoReset. As in while busy-looping, it will perform one call to MOTOR_GetCurrentTime() that erroneously sees _timerOverflowsNoReset as one greater than it actually is, causing the loop to end early, and then on the next read after that it sees the correct value again.
I have other problems with my application that I'm having trouble pinning down, and I'm hoping that if I resolve this, it might resolve these other problems as well if they share a similar cause.
Edit: Among other changes, I've changed _timerOverflowsNoReset and some other globals from 32-bit unsigned to 16-bit unsigned in the implementation I now have.
You can read this value TWICE:
unsigned long GetTmrOverflowNo()
{
unsigned long ovfl1, ovfl2;
do {
ovfl1 = _timerOverflowsNoReset;
ovfl2 = _timerOverflowsNoReset;
} while (ovfl1 != ovfl2);
return ovfl1;
}
unsigned long MOTOR_GetCurrentTime(void)
{
const unsigned long ticksPerCycle = 0xFFFF;
const unsigned long ticksPerMicrosecond = 3; // 24 MHZ / 8 (prescaler)
const unsigned long ticks = GetTmrOverflowNo() * ticksPerCycle + TCNT;
const unsigned long microseconds = ticks / ticksPerMicrosecond;
return microseconds;
}
If _timerOverflowsNoReset increments much more slowly than GetTmrOverflowNo() executes, then in the worst case the inner loop runs only twice. In most cases ovfl1 and ovfl2 will be equal after the first pass through the while() loop.
Calculate the tick count, then check if while doing that the overflow changed, and if so repeat;
#define TCNT_BITS 16 // TCNT register width
uint32_t MOTOR_GetCurrentTicks(void)
{
uint32_t ticks = 0 ;
uint32_t overflow_count = 0;
do
{
overflow_count = _timerOverflowsNoReset ;
ticks = (overflow_count << TCNT_BITS) | TCNT;
}
while( overflow_count != _timerOverflowsNoReset ) ;
return ticks ;
}
The while loop will iterate either once or twice, no more.
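If microseconds are needed, the conversion can be kept separate. A hypothetical helper, assuming the 24 MHz / 8 prescaler from the question (i.e. 3 ticks per microsecond):
uint32_t MOTOR_TicksToMicroseconds(uint32_t ticks)
{
    return ticks / 3u; /* 3 MHz timer: 3 ticks per microsecond */
}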
Based on the answers #AlexeyEsaulenko and #jeb provided, I gained understanding into the cause of this problem and how I could tackle it. As both their answers were helpful and the solution I currently have is sort of a mixture of the two, I can't decide which of the two answers to accept, so instead I'll upvote both answers and keep this question open.
This is how I now implement MOTOR_GetCurrentTime:
unsigned long MOTOR_GetCurrentTime(void)
{
const unsigned long ticksPerMicrosecond = 3; // 24 MHZ / 8 (prescaler)
unsigned int countA;
unsigned int countB;
unsigned int timerOverflowsA;
unsigned int timerOverflowsB;
unsigned long ticks;
unsigned long microseconds;
// Loops until TCNT and the timer overflow count can be reliably determined.
do
{
timerOverflowsA = _timerOverflowsNoReset;
countA = TCNT;
timerOverflowsB = _timerOverflowsNoReset;
countB = TCNT;
} while (timerOverflowsA != timerOverflowsB || countA >= countB);
ticks = ((unsigned long)timerOverflowsA << 16) + countA;
microseconds = ticks / ticksPerMicrosecond;
return microseconds;
}
This function might not be as efficient as the other proposed answers, but it gives me confidence that it will avoid some of the pitfalls that have been brought to light. It works by repeatedly reading the timer overflow count and the TCNT register (twice each per iteration), and only exiting the loop when the following two conditions are satisfied:
the timer overflow count hasn't changed while reading TCNT for the first time in the loop
the second count is greater than the first count
This basically means that if MOTOR_GetCurrentTime is called around the time that a timer overflow occurs, we wait until we've safely moved on to the next cycle, indicated by the second TCNT read being greater than the first (e.g. 0x0001 > 0x0000).
This does mean that the function blocks until TCNT increments at least once, but since that occurs every 333 nanoseconds I don't see it being problematic.
I've tried running my test 20 times in a row and haven't noticed any tearing, so I believe this works. I'll continue to test and update this answer if I'm wrong and the issue persists.
Edit: As Vroomfondel points out in the comments below, the check I do involving countA and countB only works incidentally and can potentially cause the loop to repeat indefinitely if _timerOverflowsNoReset is read fast enough. I'll update this answer when I've come up with something to address this.
The atomic reads are not the main problem here.
The problem is that the overflow ISR and TCNT are closely related.
You get into trouble when you read TCNT first and then the overflow counter.
Three sample situations:
TCNT=0x0000, Overflow=0 --- okay
TCNT=0xFFFF, Overflow=1 --- fails
TCNT=0x0001, Overflow=1 --- okay again
You get the same problem when you change the order: first read the overflow counter, then TCNT.
You could solve it by reading the totalOverflows counter twice:
disable_ints();
uint16_t overflowsA=totalOverflows;
uint16_t cnt = TCNT;
uint16_t overflowsB=totalOverflows;
enable_ints();
uint32_t totalCnt = cnt;
if ( overflowsA != overflowsB )
{
if (cnt < 0x4000)
totalCnt += 0x10000;
}
totalCnt += (uint32_t)overflowsA << 16;
If the totalOverflows counter changed while reading TCNT, then it's necessary to check whether the value in cnt is already just past zero (but below e.g. 0x4000) or whether it was captured just before the overflow.
One technique that can be helpful is to maintain two or three values that, collectively, hold overlapping portions of a larger value.
If one knows that a value will be monotonically increasing, and one will never go more than 65,280 counts between calls to "update timer" function, one could use something like:
// Note: Assuming a platform where 16-bit loads and stores are atomic
uint16_t volatile timerHi, timerMed, timerLow;
void updateTimer(void) // Must be only thing that writes timers!
{
timerLow = HARDWARE_TIMER;
timerMed += (uint8_t)((timerLow >> 8) - timerMed);
timerHi += (uint8_t)((timerMed >> 8) - timerHi);
}
uint32_t readTimer(void)
{
uint16_t tempTimerHi = timerHi;
uint16_t tempTimerMed = timerMed;
uint16_t tempTimerLow = timerLow;
tempTimerMed += (uint8_t)((tempTimerLow >> 8) - tempTimerMed);
tempTimerHi += (uint8_t)((tempTimerMed >> 8) - tempTimerHi);
return (((uint32_t)tempTimerHi) << 16) | tempTimerLow;
}
Note that readTimer reads timerHi before it reads timerLow. It's possible that updateTimer might update timerLow or timerMed between the time readTimer reads timerHi and the time it reads those other values, but if that occurs, it will notice that the lower part of timerHi needs to be incremented to match the upper part of the value that got updated later.
This approach can be cascaded to arbitrary length, and need not use a full 8 bits of overlap. Using 8 bits of overlap, however, makes it possible to form a 32-bit value by using the upper and lower values while simply ignoring the middle one. If less overlap were used, all three values would need to take part in the final computation.
The problem is that the writes to _timerOverflowsNoReset aren't atomic and you don't protect them. This is a bug. Atomic writes from the ISR aren't very important, as the HCS12 blocks the background program during the interrupt. But atomic reads in the background program are absolutely necessary.
Also, keep in mind that CodeWarrior/HCS12 generates somewhat inefficient code for 32-bit arithmetic.
Here is how you can fix it:
Drop unsigned long for the shared variable. In fact you don't need a counter at all, given that your background program can service the variable within 22 ms real-time, which should be a very easy requirement. Keep your 32-bit counter local and away from the ISR.
Ensure that reads of the shared variable are atomic. Disassemble! It must be a single MOV instruction or similar; otherwise you must implement semaphores.
Don't read any volatile variable inside complex expressions. Not only the shared variable but also the TCNT. Your program as it stands has a tight coupling between the slow 32 bit arithmetic algorithm's speed and the timer, which is very bad. You won't be able to reliably read TCNT with any accuracy, and to make things worse you call this function from other complex code.
Your code should be changed to something like this:
static volatile bool overflow;
void timovf_isr(void)
{
// Clear the interrupt.
TFLG2_TOF = 1;
// TEMP
overflow = true;
// ...
}
unsigned long MOTOR_GetCurrentTime(void)
{
bool of = overflow; // read this on a line of its own, ensure this is atomic!
uint16_t tcnt = TCNT; // read this on a line of its own
overflow = false; // ensure this is atomic too
if(of)
{
_timerOverflowsNoReset++;
}
/* calculations here */
return microseconds;
}
If you don't end up with atomic reads, you will have to implement semaphores, block the timer interrupt or write the reading code in inline assembler (my recommendation).
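For illustration, a minimal sketch of the "block the timer interrupt" option; DISABLE_INTERRUPTS()/ENABLE_INTERRUPTS() are placeholders for whatever your toolchain provides (on the HCS12 the SEI instruction masks interrupts and CLI unmasks them), not CodeWarrior built-ins:
static unsigned int ReadSharedCounterAtomic(void)
{
    unsigned int copy;
    DISABLE_INTERRUPTS(); /* placeholder, e.g. inline assembly SEI */
    copy = _timerOverflowsNoReset;
    ENABLE_INTERRUPTS();  /* placeholder, e.g. inline assembly CLI */
    return copy;
}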
Overall I would say that your design relying on TOF is somewhat questionable. I think it would be better to set up a dedicated timer channel and let it count up a known time unit (10ms?). Any reason why you can't use one of the 8 timer channels for this?
It all boils down to how often you read the timer and how long the maximum interrupt sequence in your system may be (i.e. the maximum time the timer code can be stalled without making "substantial" progress).
Iff you test for time stamps more often than the cycle time of your hardware timer AND those tests guarantee that the end of one test is no more than one interval (in your case 22 ms) away from the start of its predecessor, all is well. If your code can be held up so long that these preconditions don't hold, the following solution will not work - but then the question is whether the time information coming from such a system has any value at all.
The good thing is that you don't need an interrupt at all - any attempt to compensate for the inability of the system to satisfy two equally hard real-time requirements (updating your overflow counter and delivering the hardware time) is either futile or ugly, and still doesn't meet the basic system properties.
unsigned long MOTOR_GetCurrentTime(void)
{
static uint16_t last;
static uint16_t hi;
volatile uint16_t now = TCNT;
if (now < last)
{
hi++;
}
last = now;
return now + (hi * 65536UL);
}
BTW: I return ticks, not microseconds. Don't mix concerns.
PS: the caveat is that such a function is not reentrant and in a sense a true singleton.
My Linux device driver has some obstinate logic which twiddles with some hardware and then waits for a signal to appear. The seemingly proper way is:
ulong timeout, waitcnt = 0;
...
/* 2. Establish programming mode */
gpio_bit_write (MPP_CFG_PROGRAM, 0); /* assert */
udelay (3); /* one microsecond should be long enough */
gpio_bit_write (MPP_CFG_PROGRAM, 1); /* de-assert */
/* 3. Wait for the FPGA to initialize. */
/* 100 ms timeout should be nearly 100 times too long */
timeout = jiffies + msecs_to_jiffies(100);
while (gpio_bit_read (MPP_CFG_INIT) == 0 &&
time_is_before_jiffies (timeout))
++waitcnt; /* do nothing */
if (!time_is_before_jiffies (timeout)) /* timed out? */
{
/* timeout error */
}
This always exercises the "timeout error" path and doesn't increment waitcnt at all. Perhaps I don't understand the meaning of time_is_before_jiffies(), or it is broken. When I replace it with the much more understandable direct comparison of jiffies:
while (gpio_bit_read (MPP_CFG_INIT) == 0 &&
jiffies <= timeout)
++waitcnt; /* do nothing */
It works just fine: it loops for a while (1600 µs), sees the INIT bit come on, and then proceeds without triggering a timeout error.
The comment for time_is_before_jiffies() is:
/* time_is_before_jiffies(a) return true if a is before jiffies */
#define time_is_before_jiffies(a) time_after(jiffies, a)
As the sense of the comparison seemed nonsensically backward, I replaced both with time_is_after_jiffies(), but that doesn't work either.
What am I doing wrong? Maybe I should replace use of this confusing macro with the straightforward jiffies <= timeout logic, though that seems less portable.
The jiffies <= timeout comparison does not work when jiffies wraps around, so you must use the time_is_ macros.
The condition you want to use can be described as "has not yet timed out".
This means that the current time (jiffies) has not yet reached the timeout time (timeout), i.e., jiffies is before the variable you are comparing it to, which means that your variable is after jiffies.
(All the time_is_ functions have jiffies on the right side of the comparison.)
So you have to use time_is_after_jiffies() in the while loop.
(And the <= implies that you actually want to use time_is_after_eq_jiffies().)
The timeout check would be better done by reading the GPIO bit, because it would be a shame if your code timed out even though it got the signal right at the end.
Furthermore, busy-looping for a hundred milliseconds is extremely evil; you should release the CPU if you don't need it:
unsigned long timeout = jiffies + msecs_to_jiffies(100);
bool ok = false;
for (;;) {
ok = gpio_bit_read(MPP_CFG_INIT) != 0;
if (ok || time_is_before_eq_jiffies(timeout))
break;
/* you should do msleep(...) or cond_resched() here, if possible */
}
if (!ok) /* timed out? */
...
(This loop uses time_is_before_eq_jiffies() because the condition is reversed.)
Suppose we have a following code:
if (timeout > jiffies)
{
/* we did not time out, good ... */
}
else
{
/* we timed out, error ... */
}
This code works fine when jiffies value do not overflow.
However, when jiffies overflow and wrap around to zero, this code doesn't work properly.
Linux apparently provides macros for dealing with this overflow problem
#define time_before(unknown, known) ((long)(unknown) - (long)(known) < 0)
and code above is supposed to be safe against overflow when replaced with this macro:
// SAFE AGAINST OVERFLOW
if (time_before(jiffies, timeout))
{
/* we did not time out, good ... */
}
else
{
/* we timed out, error ... */
}
But what is the rationale behind time_before (and the other time_ macros)?
time_before(jiffies, timeout) will be expanded to
((long)(jiffies) - (long)(timeout) < 0)
How does this code prevent overflow problems?
Let's actually give it a try:
#define time_before(unknown, known) ((long)(unknown) - (long)(known) < 0)
I'll simplify things down a lot by saying that a long is only two bytes, so in hex it can have a value in the range [0, 0xFFFF].
Now, it's signed, so the range [0, 0xFFFF] can be broken into two separate ranges [0, 0x7FFF], [0x8000, 0xFFFF]. Those correspond to the values [0, 32767], [ -32768, -1]. Here's a diagram:
[0x0 - - - 0xFFFF]
[0x0 0x7FFF][0x8000 0xFFFF]
[0 32,767][-32,768 -1]
Say timeout is 32,000. We want to check if we're inside our timeout, but in truth we overflowed, so jiffies is -31,000. So if we naively tried to evaluate jiffies < timeout we'd get True. But, plugging in the values:
time_before(jiffies, timeout)
== ((long)(jiffies) - (long)(timeout) < 0)
== (-31000 - 32000 < 0) // WTF is this. Clearly NOT -63000
== (-31000 - 1768 - 1 - 30231 < 0) // simply expanded 32000
== (-32768 - 1 - 30231 < 0) // this -1 causes an underflow
== (32767 - 30231 < 0)
== (2536 < 0)
== False
jiffies are 4 bytes, not 2, but the same principle applies. Does that help at all?
See for example here: http://fixunix.com/kernel/266713-%5Bpatch-1-4%5D-fs-autofs-use-time_before-time_before_eq-etc.html
Code that checked for overflow against some fixed small constant was converted there to use time_before. Why?
I'm just summarizing the comment that goes with the definition of the
time_after etc functions:
include/linux/jiffies.h:93
93 /*
94 * These inlines deal with timer wrapping correctly. You are
95 * strongly encouraged to use them
96 * 1. Because people otherwise forget
97 * 2. Because if the timer wrap changes in future you won't have to
98 * alter your driver code.
99 *
100 * time_after(a,b) returns true if the time a is after time b.
101 *
So, time_before and time_after are the better way of handling overflow.
Your test case is more likely to be timeout < jiffies (without overflow) than timeout > jiffies (with overflow):
unsigned long jiffies = 2147483658;
unsigned long timeout = 10;
And if you change timeout to
unsigned long timeout = -2146483000;
what will the answer be?
Or you can change the check from
printf("%d",time_before(jiffies,timeout));
to
printf("%d",time_before(jiffies,old_jiffles+timeout));
where old_jiffles is saved value of jiffles at the timer's start.
So, I think the usage of time_before can be like:
old_jiffies = jiffies;
timeout = 10; // or even 10*HZ for ten seconds
do_a_long_work_or_wait();
// check whether the timeout has been reached or not
if (time_before(jiffies, old_jiffies + timeout)) {
do_another_long_work_or_wait();
} else {
printk("ERROR: the timeout is reached; here is a problem");
panic();
}
Given that jiffies is an unsigned value, a simple comparison is safe across one wraparound point (where signed values would jump from positive to negative) but not safe across the other point (where signed values would jump from negative to positive, and where unsigned values jump from high to low). It's protection against this second point that the macro is intended to solve.
There is a fundamental assumption that timeout was initially calculated as jiffies + some_offset at some prior recent point in time -- specifically, less than half the range of the variables. If you're trying to measure times longer than this then things break down and you'll get the wrong answer.
If we pretend that jiffies is 16-bit wide for convenience in the explanation (similar to the other answers):
timeout > jiffies
This is an unsigned comparison that is intended to return true if we have not yet reached the timeout. Some examples:
timeout == 0x0300, jiffies == 0x0100: result is true, as expected.
timeout == 0x8100, jiffies == 0x7F00: result is true, as expected.
timeout == 0x0100, jiffies == 0xFF00: oops, result is false, but we haven't really reached the timeout, it just wrapped the counter.
timeout == 0x0100, jiffies == 0x0300: result is false, as expected.
timeout == 0x7F00, jiffies == 0x8100: result is false, as expected.
timeout == 0xFF00, jiffies == 0x0100: oops, result is true, but we did pass the timeout.
time_before(jiffies, timeout)
This does a signed comparison on the difference of the values rather than the values themselves, and again is expected to return true if the timeout has not yet been reached. Provided that the assumption above is upheld, the same examples:
timeout == 0x0300, jiffies == 0x0100: result is true, as expected.
timeout == 0x8100, jiffies == 0x7F00: result is true, as expected.
timeout == 0x0100, jiffies == 0xFF00: result is true, as expected.
timeout == 0x0100, jiffies == 0x0300: result is false, as expected.
timeout == 0x7F00, jiffies == 0x8100: result is false, as expected.
timeout == 0xFF00, jiffies == 0x0100: result is false, as expected.
If the offset you used when calculating timeout is too large or you allow too much time to pass after calculating timeout, then the result can still be wrong. eg. if you calculate timeout once but then just keep testing it repeatedly, then time_before will initially be true, then change to false after the offset time has passed -- and then change back to true again after 0x8000 time has passed (however long that is; it depends on the tick rate). This is why when you reach the timeout, you're supposed to remember this and stop checking the time (or recalculate a new timeout).
In the real kernel, jiffies is longer than 16 bits so it will take longer for it to wrap, but it's still possible if the machine is run for long enough. (And typically it's set to wrap shortly after boot, to catch these bugs more quickly.)
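To make the 16-bit examples above concrete, here is a minimal, hypothetical demo; time_before16 is an illustrative 16-bit analogue of the kernel macro, not kernel code:
#include <stdint.h>
#include <stdio.h>

#define time_before16(a, b) ((int16_t)(uint16_t)((a) - (b)) < 0)

int main(void)
{
    /* timeout was computed as jiffies + offset shortly before the counter wrapped. */
    uint16_t jiffies16 = 0xFF00;
    uint16_t timeout16 = 0x0100;
    printf("naive timeout > jiffies: %d\n", timeout16 > jiffies16);                     /* 0: wrongly looks timed out */
    printf("time_before(jiffies, timeout): %d\n", time_before16(jiffies16, timeout16)); /* 1: correctly not yet timed out */
    return 0;
}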
I couldn't easily understand the above answers, so hoping to help with my own:
#define time_after(a,b) ((long)((b) - (a)) < 0)
Here the cast to long makes the difference signed.
Example overflow:
For convenience, imagine 8-bit integers: jiffy1 is changing, and timeout is fixed and greater than jiffy1 -
for example jiffy1 = 252, timeout = 254, and jiffy2 becomes 0 or 1 after the overflow.
When we use unsigned:
jiffy1 < timeout and
jiffy2 < timeout (mistakenly due to overflow which we need to fix via MACRO)
When we use signed:
jiffy1 < timeout (more-negative < less-negative)
and
jiffy2 > timeout (positive > negative)
(because it will consider MSB as the sign bit, hence timeout will appear negative while our jiffy2 has become positive due to the overflow)
Do correct me if there is something wrong
How do I get a microseconds timestamp in C?
I'm trying to do:
struct timeval tv;
gettimeofday(&tv,NULL);
return tv.tv_usec;
But this returns some nonsense value that if I get two timestamps, the second one can be smaller or bigger than the first (second one should always be bigger). Would it be possible to convert the magic integer returned by gettimeofday to a normal number which can actually be worked with?
You need to add in the seconds, too:
unsigned long time_in_micros = 1000000 * tv.tv_sec + tv.tv_usec;
Note that this will only last for about 2^32 / 10^6 ≈ 4295 seconds, or roughly 71 minutes though (on a typical 32-bit system).
You have two choices for getting a microsecond timestamp. The first (and best) choice, is to use the timeval type directly:
struct timeval GetTimeStamp() {
struct timeval tv;
gettimeofday(&tv,NULL);
return tv;
}
The second, and for me less desirable, choice is to build a uint64_t out of a timeval:
uint64_t GetTimeStamp() {
struct timeval tv;
gettimeofday(&tv,NULL);
return tv.tv_sec*(uint64_t)1000000+tv.tv_usec;
}
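For example, a usage sketch measuring an elapsed interval with the uint64_t variant (assumes <stdio.h> and <stdint.h>):
uint64_t t0 = GetTimeStamp();
/* ... work to be measured ... */
uint64_t t1 = GetTimeStamp();
printf("Elapsed: %llu us\n", (unsigned long long)(t1 - t0));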
Get a timestamp in C in microseconds?
Here is a generic answer pertaining to the title of this question:
How to get a simple timestamp in C
in milliseconds (ms) with function millis(),
microseconds (us) with micros(), and
nanoseconds (ns) with nanos()
Quick summary: if you're in a hurry and using a Linux or POSIX system, jump straight down to the section titled "millis(), micros(), and nanos()", below, and just use those functions. If you're using C11 not on a Linux or POSIX system, you'll need to replace clock_gettime() in those functions with timespec_get().
2 main timestamp functions in C:
C11: timespec_get() is part of the C11 or later standard, but doesn't allow choosing the type of clock to use. It also works in C++17. See documentation for std::timespec_get() here. However, for C++11 and later, I prefer to use a different approach where I can specify the resolution and type of the clock instead, as I demonstrate in my answer here: Getting an accurate execution time in C++ (micro seconds).
The C11 timespec_get() solution is a bit more limited than the C++ solution in that you cannot specify the clock resolution nor the monotonicity (a "monotonic" clock is defined as a clock that only counts forwards and can never go or jump backwards--ex: for time corrections). When measuring time differences, monotonic clocks are desired to ensure you never count a clock correction jump as part of your "measured" time.
The resolution of the timestamp values returned by timespec_get(), therefore, since we can't specify the clock to use, may be dependent on your hardware architecture, operating system, and compiler. An approximation of the resolution of this function can be obtained by rapidly taking 1000 or so measurements in quick-succession, then finding the smallest difference between any two subsequent measurements. Your clock's actual resolution is guaranteed to be equal to or smaller than that smallest difference.
I demonstrate this in the get_estimated_resolution() function of my timinglib.c timing library intended for Linux.
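A minimal sketch of that estimation idea (not the timinglib implementation itself): take many back-to-back timestamps and report the smallest nonzero gap between consecutive samples.
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define SAMPLES 1000

int main(void)
{
    struct timespec ts;
    uint64_t prev = 0;
    uint64_t min_delta = UINT64_MAX;
    int i;

    for (i = 0; i < SAMPLES; i++)
    {
        uint64_t now;
        timespec_get(&ts, TIME_UTC);
        now = (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
        if (i > 0 && now > prev && now - prev < min_delta)
        {
            min_delta = now - prev;
        }
        prev = now;
    }
    printf("Estimated clock resolution: ~%llu ns\n", (unsigned long long)min_delta);
    return 0;
}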
Linux and POSIX: Even better than timespec_get() in C is the Linux and POSIX function clock_gettime() function, which also works fine in C++ on Linux or POSIX systems. clock_gettime() does allow you to choose the desired clock. You can read the specified clock resolution with clock_getres(), although that doesn't give you your hardware's true clock resolution either. Rather, it gives you the units of the tv_nsec member of the struct timespec. Use my get_estimated_resolution() function described just above and in my timinglib.c/.h files to obtain an estimate of the resolution.
So, if you are using C on a Linux or POSIX system, I highly recommend you use clock_gettime() over timespec_get().
C11's timespec_get() (ok) and Linux/POSIX's clock_gettime() (better):
Here is how to use both functions:
C11's timespec_get()
https://en.cppreference.com/w/c/chrono/timespec_get
Works in C, but doesn't allow you to choose the clock to use.
Full example, with error checking:
#include <stdint.h> // `UINT64_MAX`
#include <stdio.h> // `printf()`
#include <time.h> // `timespec_get()`
/// Convert seconds to nanoseconds
#define SEC_TO_NS(sec) ((sec)*1000000000)
uint64_t nanoseconds;
struct timespec ts;
int return_code = timespec_get(&ts, TIME_UTC);
if (return_code == 0)
{
printf("Failed to obtain timestamp.\n");
nanoseconds = UINT64_MAX; // use this to indicate error
}
else
{
// `ts` now contains your timestamp in seconds and nanoseconds! To
// convert the whole struct to nanoseconds, do this:
nanoseconds = SEC_TO_NS((uint64_t)ts.tv_sec) + (uint64_t)ts.tv_nsec;
}
Linux/POSIX's clock_gettime() -- USE THIS ONE WHENEVER POSSIBLE!
https://man7.org/linux/man-pages/man3/clock_gettime.3.html (best reference for this function) and:
https://linux.die.net/man/3/clock_gettime
Works in C on Linux or POSIX systems, and allows you to choose the clock to use!
I choose the CLOCK_MONOTONIC_RAW clock, which is best for obtaining timestamps used to time things on your system.
See definitions for all of the clock types here, too, such as CLOCK_REALTIME, CLOCK_MONOTONIC, CLOCK_MONOTONIC_RAW, etc: https://man7.org/linux/man-pages/man3/clock_gettime.3.html
Another popular clock to use is CLOCK_REALTIME. Do NOT be confused, however! "Realtime" does NOT mean that it is a good clock to use for "realtime" operating systems, or precise timing. Rather, it means it is a clock which will be adjusted to the "real time", or actual "world time", periodically, if the clock drifts. Again, do NOT use this clock for precise timing usages, as it can be adjusted forwards or backwards at any time by the system, outside of your control.
Full example, with error checking:
// This line **must** come **before** including <time.h> in order to
// bring in the POSIX functions such as `clock_gettime() from <time.h>`!
#define _POSIX_C_SOURCE 199309L
#include <errno.h> // `errno`
#include <stdint.h> // `UINT64_MAX`
#include <stdio.h> // `printf()`
#include <string.h> // `strerror(errno)`
#include <time.h> // `clock_gettime()` and `timespec_get()`
/// Convert seconds to nanoseconds
#define SEC_TO_NS(sec) ((sec)*1000000000)
uint64_t nanoseconds;
struct timespec ts;
int return_code = clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
if (return_code == -1)
{
printf("Failed to obtain timestamp. errno = %i: %s\n", errno,
strerror(errno));
nanoseconds = UINT64_MAX; // use this to indicate error
}
else
{
// `ts` now contains your timestamp in seconds and nanoseconds! To
// convert the whole struct to nanoseconds, do this:
nanoseconds = SEC_TO_NS((uint64_t)ts.tv_sec) + (uint64_t)ts.tv_nsec;
}
millis(), micros(), and nanos():
Anyway, here are my millis(), micros(), and nanos() functions I use in C for simple timestamps and code speed profiling.
I am using the Linux/POSIX clock_gettime() function below. If you are using C11 or later on a system which does not have clock_gettime() available, simply replace all usages of clock_gettime(CLOCK_MONOTONIC_RAW, &ts) below with timespec_get(&ts, TIME_UTC) instead.
Get the latest version of my code here from my eRCaGuy_hello_world repo here:
timinglib.h
timinglib.c
// This line **must** come **before** including <time.h> in order to
// bring in the POSIX functions such as `clock_gettime() from <time.h>`!
#define _POSIX_C_SOURCE 199309L
#include <time.h>
/// Convert seconds to milliseconds
#define SEC_TO_MS(sec) ((sec)*1000)
/// Convert seconds to microseconds
#define SEC_TO_US(sec) ((sec)*1000000)
/// Convert seconds to nanoseconds
#define SEC_TO_NS(sec) ((sec)*1000000000)
/// Convert nanoseconds to seconds
#define NS_TO_SEC(ns) ((ns)/1000000000)
/// Convert nanoseconds to milliseconds
#define NS_TO_MS(ns) ((ns)/1000000)
/// Convert nanoseconds to microseconds
#define NS_TO_US(ns) ((ns)/1000)
/// Get a time stamp in milliseconds.
uint64_t millis()
{
struct timespec ts;
clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
uint64_t ms = SEC_TO_MS((uint64_t)ts.tv_sec) + NS_TO_MS((uint64_t)ts.tv_nsec);
return ms;
}
/// Get a time stamp in microseconds.
uint64_t micros()
{
struct timespec ts;
clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
uint64_t us = SEC_TO_US((uint64_t)ts.tv_sec) + NS_TO_US((uint64_t)ts.tv_nsec);
return us;
}
/// Get a time stamp in nanoseconds.
uint64_t nanos()
{
struct timespec ts;
clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
uint64_t ns = SEC_TO_NS((uint64_t)ts.tv_sec) + (uint64_t)ts.tv_nsec;
return ns;
}
// NB: for all 3 timestamp functions above: gcc defines the type of the internal
// `tv_sec` seconds value inside the `struct timespec`, which is used
// internally in these functions, as a signed `long int`. For architectures
// where `long int` is 64 bits, that means it will have undefined
// (signed) overflow after 2^63 sec =~ 2.9 x 10^11 years. For architectures
// where this type is 32 bits, it will occur after 2^31 sec =~ 68 years. If the
// implementation-defined epoch for the timespec is 1970, that is the
// well-known year-2038 rollover, i.e. only 68 - (year 2021 - year 1970) =
// 68 - 51 = 17 years away at the time of writing. Hopefully your program
// won't need to run that long. :). To see, by inspection, what your system's
// epoch is, simply print out a timestamp and calculate how far back a
// timestamp of 0 would have occurred. Ex: convert the timestamp to years and
// subtract that number of years from the present year.
Timestamp Resolution:
On my x86-64 Linux Ubuntu 18.04 system with the gcc compiler, clock_getres() returns a resolution of 1 ns.
For both clock_gettime() and timespec_get(), I have also done empirical testing where I take 1000 timestamps rapidly, as fast as possible (see the get_estimated_resolution() function of my timinglib.c timing library), and look to see what the minimum gap is between timestamp samples. This reveals a range of ~14~26 ns on my system when using timespec_get(&ts, TIME_UTC) and clock_gettime(CLOCK_MONOTONIC, &ts), and ~75~130 ns for clock_gettime(CLOCK_MONOTONIC_RAW, &ts). This can be considered the rough "practical resolution" of these functions. See that test code in timinglib_get_resolution.c, and see the definition for my get_estimated_resolution() and get_specified_resolution() functions (which are used by that test code) in timinglib.c.
These results are hardware-specific, and your results on your hardware may vary.
References:
The cppreference.com documentation sources I link to above.
This answer here by #Ciro Santilli新疆棉花
My answer about usleep() and nanosleep() - it reminded me I needed to do #define _POSIX_C_SOURCE 199309L in order to bring in the clock_gettime() POSIX function from <time.h>!
https://linux.die.net/man/3/clock_gettime
https://man7.org/linux/man-pages/man3/clock_gettime.3.html
Mentions the requirement for:
_POSIX_C_SOURCE >= 199309L
See definitions for all of the clock types here, too, such as CLOCK_REALTIME, CLOCK_MONOTONIC, CLOCK_MONOTONIC_RAW, etc.
See also:
My shorter and less-thorough answer here, which applies only to ANSI/ISO C11 or later: How to measure time in milliseconds using ANSI C?
My 3 sets of timestamp functions (cross-linked to each other):
For C timestamps, see my answer here: Get a timestamp in C in microseconds?
For C++ high-resolution timestamps, see my answer here: Here is how to get simple C-like millisecond, microsecond, and nanosecond timestamps in C++
For Python high-resolution timestamps, see my answer here: How can I get millisecond and microsecond-resolution timestamps in Python?
https://en.cppreference.com/w/c/chrono/clock
POSIX clock_gettime(): https://pubs.opengroup.org/onlinepubs/9699919799/functions/clock_getres.html
clock_gettime() on Linux: https://linux.die.net/man/3/clock_gettime
Note: for C11 and later, you can use timespec_get(), as I have done above, instead of POSIX clock_gettime(). https://en.cppreference.com/w/c/chrono/clock says:
use timespec_get in C11
But, using clock_gettime() instead allows you to choose a desired clock ID for the type of clock you want! See also here: ***** https://people.cs.rutgers.edu/~pxk/416/notes/c-tutorials/gettime.html
Todo:
✓ DONE AS OF 3 Apr. 2022: Since timespec_getres() isn't supported until C23, update my examples to include one which uses the POSIX clock_gettime() and clock_getres() functions on Linux. I'd like to know precisely how good the clock resolution is that I can expect on a given system. Is it ms-resolution, us-resolution, ns-resolution, something else? For reference, see:
https://linux.die.net/man/3/clock_gettime
https://people.cs.rutgers.edu/~pxk/416/notes/c-tutorials/gettime.html
https://pubs.opengroup.org/onlinepubs/9699919799/functions/clock_getres.html
Answer: clock_getres() returns 1 ns, but the actual resolution is about 14~27 ns, according to my get_estimated_resolution() function here: https://github.com/ElectricRCAircraftGuy/eRCaGuy_hello_world/blob/master/c/timinglib.c. See the results here:
https://github.com/ElectricRCAircraftGuy/eRCaGuy_hello_world/blob/master/c/timinglib_get_resolution.c#L46-L77
Activate the Linux SCHED_RR soft real-time round-robin scheduler for the best and most-consistent timing possible. See my answer here regarding clock_nanosleep(): How to configure the Linux SCHED_RR soft real-time round-robin scheduler so that clock_nanosleep() can have improved resolution of ~4 us down from ~ 55 us.
struct timeval contains two components, the seconds and the microseconds. A timestamp with microsecond precision is represented as seconds since the epoch stored in the tv_sec field, and the fractional microseconds in tv_usec. Thus you cannot just ignore tv_sec and expect sensible results.
If you use Linux or *BSD, you can use timersub() to subtract two struct timeval values, which might be what you want.
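For example, a minimal sketch of timersub() usage (timersub() is a macro from <sys/time.h> on Linux/BSD; on glibc it may require _DEFAULT_SOURCE):
#define _DEFAULT_SOURCE /* may be needed on glibc to expose timersub() */
#include <stdio.h>
#include <sys/time.h>

int main(void)
{
    struct timeval start, end, diff;

    gettimeofday(&start, NULL);
    /* ... the work you want to measure ... */
    gettimeofday(&end, NULL);

    timersub(&end, &start, &diff); /* diff = end - start, handling the microsecond borrow */
    printf("Elapsed: %ld.%06ld s\n", (long)diff.tv_sec, (long)diff.tv_usec);
    return 0;
}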
timespec_get from C11
Returns with precision of up to nanoseconds, rounded to the resolution of the implementation.
#include <time.h>
struct timespec ts;
timespec_get(&ts, TIME_UTC);
struct timespec {
time_t tv_sec; /* seconds */
long tv_nsec; /* nanoseconds */
};
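Since the question asks for microseconds, a minimal conversion sketch from the timespec above (assumes <stdint.h> for uint64_t):
uint64_t microseconds = (uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u;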
See more details in my other answer here: How to measure time in milliseconds using ANSI C?
But this returns some nonsense value that if I get two timestamps, the second one can be smaller or bigger than the first (second one should always be bigger).
What makes you think that? The value is probably OK. It’s the same situation as with seconds and minutes – when you measure time in minutes and seconds, the number of seconds rolls over to zero when it gets to sixty.
To convert the returned value into a “linear” number you could multiply the number of seconds and add the microseconds. But if I count correctly, one year is about 1e6*60*60*24*360 μsec and that means you’ll need more than 32 bits to store the result:
$ perl -E '$_=1e6*60*60*24*360; say int log($_)/log(2)'
44
That’s probably one of the reasons to split the original returned value into two pieces.
Use an unsigned long long (i.e. a 64-bit unsigned integer) to represent the system time:
typedef unsigned long long u64;
u64 u64useconds;
struct timeval tv;
gettimeofday(&tv,NULL);
u64useconds = (1000000*tv.tv_sec) + tv.tv_usec;
Better late than never! This little programme can be used as the quickest way to get time stamp in microseconds and calculate the time of a process in microseconds:
#include <sys/time.h>
#include <stdio.h>
#include <time.h>
struct timeval GetTimeStamp()
{
struct timeval tv;
gettimeofday(&tv,NULL);
return tv;
}
int main(void)
{
struct timeval tv = GetTimeStamp(); // Take the start time
signed long time_in_micros = 1000000 * tv.tv_sec + tv.tv_usec; // Store the start time in microseconds
getchar(); // Replace this line with the process that you need to time
struct timeval tv2 = GetTimeStamp(); // Take the end time with a single call, so tv_sec and tv_usec are consistent
printf("Elapsed time: %ld microseconds\n", (1000000 * tv2.tv_sec + tv2.tv_usec) - time_in_micros);
return 0;
}
You can replace getchar() with a function/process. Finally, instead of printing the difference you can store it in a signed long. The programme works fine in Windows 10.
First we need to know the range of the microseconds field: 000000 to 999999 (1,000,000 microseconds equals 1 second). tv.tv_usec holds a value from 0 to 999999, not a zero-padded six-digit value, so when printing it after the seconds we might show 2.1 seconds instead of 2.000001 seconds, because taken on its own a tv_usec of 000001 is just 1.
It's better if you insert
if(tv.tv_usec<10)
{
printf("00000");
}
else if(tv.tv_usec<100&&tv.tv_usec>9)// i.e. 2digits
{
printf("0000");
}
and so on...
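As a simpler alternative (assuming a standard C library printf), the same zero padding can be done with a field width:
printf("%ld.%06ld\n", (long)tv.tv_sec, (long)tv.tv_usec); /* pads tv_usec to six digits */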
I want to know the time that has passed between the occurrence of two events.
Now, the simple way would be to use something like:
time_t x, y;
x = time(NULL);
/* Some other stuff happens */
y = time(NULL);
printf("Time passed: %i", y-x);
However, it is possible that the system time is changed between these two events.
Is there an alternative way to know the time that has passed between the two events? Or is there a way to detect changes to the system time?
Since you're apparently on Linux, you can use the POSIX CLOCK_MONOTONIC clock to get a timer that is unaffected by system time changes:
struct timespec ts1, ts2;
clock_gettime(CLOCK_MONOTONIC, &ts1);
/* Things happen */
clock_gettime(CLOCK_MONOTONIC, &ts2);
ts2.tv_sec -= ts1.tv_sec;
ts2.tv_nsec -= ts1.tv_nsec;
if (ts2.tv_nsec < 0)
{
ts2.tv_nsec += 1000000000L;
ts2.tv_sec -= 1;
}
printf("Elapsed time: %lld.%09lds\n", (long long)ts2.tv_sec, ts2.tv_nsec);
To check if the system supports CLOCK_MONOTONIC, check for sysconf(_SC_MONOTONIC_CLOCK) > 0.
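For example, a minimal sketch of that check (requires <unistd.h>):
#include <unistd.h>

/* Returns nonzero if the POSIX monotonic clock option is available. */
int have_monotonic_clock(void)
{
    return sysconf(_SC_MONOTONIC_CLOCK) > 0;
}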
You can use clock_gettime() on Linux or gethrtime() on some other Unix systems. Although the difference between two such values gives you a time interval, those call are not giving you regular time values. As HP-UX says regarding gethrtime():
The gethrtime() function returns the current high-resolution real time. Time is expressed as nanoseconds since a certain time in the past.