I am using the code below to convert nanoseconds to microseconds.
This code runs fine most of the time, but sometimes usTick gives a value far beyond the current time.
For example, if the current time in usTick is 63290061063, the value sometimes comes out as 126580061060. As you can see, it is roughly double.
Similarly, in another instance the current time was 45960787154, but usTick showed 91920787152.
#include <time.h>

typedef unsigned long long TUINT64;

unsigned long long GetMonoUSTick()
{
    static unsigned long long usTick;
    struct timespec t;

    clock_gettime(CLOCK_MONOTONIC, &t);
    usTick = ((TUINT64)t.tv_nsec) / 1000;               /* nanoseconds -> microseconds */
    usTick = usTick + ((TUINT64)t.tv_sec) * 1000000;    /* add whole seconds */
    return usTick;
}
If multiple threads of the same process access the same variables concurrently for reading/writing or writing/writing, those variables need to be protected. This can be achieved by using a mutex.
In this case the variable usTick needs to be protected, since it is defined static and is therefore shared by all callers.
Using POSIX threads, the code could look like this:
pthread_mutex_lock(&ustick_mutex);
usTick = ((TUINT64)t.tv_nsec) / 1000;
usTick = usTick +((TUINT64)t.tv_sec) * 1000000;
pthread_mutex_unlock(&ustick_mutex);
(error checking left out for clarity)
Take care to initialise ustick_mutex properly before using it.
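Putting it together, a minimal sketch could look like this (the statically initialised mutex and the local result variable are illustrative choices; error checking is again left out):

#include <pthread.h>
#include <time.h>

typedef unsigned long long TUINT64;

static pthread_mutex_t ustick_mutex = PTHREAD_MUTEX_INITIALIZER;

unsigned long long GetMonoUSTick()
{
    static unsigned long long usTick;
    unsigned long long result;
    struct timespec t;

    clock_gettime(CLOCK_MONOTONIC, &t);

    pthread_mutex_lock(&ustick_mutex);
    usTick = ((TUINT64)t.tv_nsec) / 1000;
    usTick = usTick + ((TUINT64)t.tv_sec) * 1000000;
    result = usTick;                    /* read back while still holding the lock */
    pthread_mutex_unlock(&ustick_mutex);

    return result;
}

Note that the value is copied into a local variable before the mutex is released; returning usTick directly after unlocking would reintroduce the race.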
Sounds like you are using threads and that once in a while two threads are calling this function at the same time. Because usTick is static, they are working with the same variable (not two different copies). They both convert the nanoseconds to microseconds and assign to usTick, one after the other, and then they both convert the seconds to microseconds and both add them to usTick (so the seconds get added twice).
EDIT - The following solution I proposed was based on two assumptions:
If two threads called the function at the same time, the difference between the current times returned by clock_gettime() would be too small to matter (the difference between the results calculated by the threads would be at most 1).
On most modern CPUs, reading/writing an integer is an atomic operation.
I think you should be able to fix this by changing:
usTick = ((TUINT64)t.tv_nsec) / 1000;
usTick = usTick +((TUINT64)t.tv_sec) * 1000000;
to:
usTick = ((TUINT64)t.tv_nsec) / 1000 + ((TUINT64)t.tv_sec) * 1000000;
The problem with my solution was that even if the latter assumption were correct, it might not hold for long long. Therefore, for example, this could happen:
Thread A and B call the function at (almost) the same time.
Thread A calculates current time in microsecond as 0x02B3 1F02 FFFF FFFF.
Thread B calculates current time in microsecond as 0x02B3 1F03 0000 0000.
Thread A writes the least significant 32 bits (0xFFFF FFFF) to usTick.
Thread B writes the least significant 32 bits (0x0000 0000) to usTick.
Thread B writes the most significant 32 bits (0x02B3 1F03) to usTick.
Thread A writes the most significant 32 bits (0x02B3 1F02) to usTick.
and then the function would in both threads return 0x02B3 1F02 0000 0000 which is off by 4294967295.
So do as @alk said: use a mutex to protect the read/write of usTick.
I'm working on an implementation of an obscure network protocol, and one of the requirements is that each packet header should contain a 56-bit timestamp, with the first 4 bytes containing an integer number of seconds since the epoch, and the remaining 3 bytes used to contain a binary fraction of the current second. In other words, the last 3 bytes should represent the number of 2^-24 seconds since the previous second. The first part of the timestamp is trivial, but I'm struggling to implement the C code that would store the fractional part of the timestamp. Can anyone shed some light on how to do this?
For completeness' sake, here's the timestamp code I have so far. primaryHeader is a char* that I'm using to store the header data for the packet. You can assume that the first 6 bytes in primaryHeader contain valid, unrelated data, and that primaryHeader is large enough to contain everything that needs to be stored in it.
int secs = (int)time(NULL);
memcpy(&primaryHeader[7], &secs, sizeof(int));
// TODO: Compute fractional portion of the timestamp and memcpy to primaryHeader[11]
The time() function will only give you whole seconds. You need a higher-resolution timer.
#include <time.h>
#include <stdint.h>
struct timespec t;
timespec_get(&t, TIME_UTC);
int secs = t.tv_sec;                                   // Whole seconds
int frac = ((int64_t)t.tv_nsec << 24) / 1000000000;    // Fractional part in units of 2^-24 s
If timespec_get is not available you can use clock_gettime, but clock_gettime is not found on Windows.
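To finish the packing described in the question, a rough sketch could look like this. It assumes the fraction is stored most-significant byte first (network byte order) and reuses the offsets from the question's snippet; both are assumptions, not requirements taken from the protocol.

#include <stdint.h>
#include <string.h>
#include <time.h>

void write_timestamp(char *primaryHeader)    /* illustrative helper name */
{
    struct timespec t;
    timespec_get(&t, TIME_UTC);

    uint32_t secs = (uint32_t)t.tv_sec;
    /* 24-bit binary fraction of the current second, in units of 2^-24 s */
    uint32_t frac = (uint32_t)(((int64_t)t.tv_nsec << 24) / 1000000000);

    memcpy(&primaryHeader[7], &secs, sizeof(secs));      /* whole seconds, as in the question */

    primaryHeader[11] = (char)((frac >> 16) & 0xFF);     /* most significant fraction byte */
    primaryHeader[12] = (char)((frac >> 8) & 0xFF);
    primaryHeader[13] = (char)(frac & 0xFF);             /* least significant fraction byte */
}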
I'm writing a Linux Kernel Module to calculate the function's running time in jiffies.
static int thread_fn(void *unused) {
    unsigned long j_0, j_1, j_seconds;
    int a, b, sum;

    printk("Inside thread creation's function\n");
    j_0 = jiffies;
    a = 100;
    b = 200;
    sum = a + b;
    printk("Result of addition is %d\n", sum);   /* sum is an int, so %d */
    j_1 = jiffies;
    j_seconds = j_1 - j_0;
    printk("Time elapsed in jiffies: %lu\n", j_seconds);
    while (!kthread_should_stop())
        schedule();
    return 0;
}
I'm assigning j_0 an initial jiffies value, and j_1 a later jiffies value. When I subtract them, the result is 0, even though I expected j_0 and j_1 to be distinct values.
Edit 1: I'm sorry. I printed both j_0 and j_1 during the same execution and they're not distinct. I had printed j_0 and j_1 separately during different module insertions, which is why I thought they were distinct.
So my question now is: why isn't jiffies incrementing? Shouldn't it, since operations occur between the two reads?
As noted in the comments, j_1 == j_0. This is understandable as jiffies is incremented every timer interrupt. The frequency with which this happens can be defined by CONFIG_HZ, e.g. on my VM:
grep 'CONFIG_HZ=' /boot/config-$(uname -r)
CONFIG_HZ=250
250Hz = one timer interrupt every 4 ms. This granularity is way too coarse to measure the impact of a printk (and a single addition).
For sub-jiffy time measurements, you can use ftrace, do_gettimeofday or perf. This has been asked before. See e.g. this question's answers.
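For example, a rough sketch of measuring the same region with microsecond resolution using do_gettimeofday() (available in older kernels; newer kernels provide the ktime_get_*() family instead, so treat the availability as an assumption about your kernel version):

#include <linux/time.h>      /* struct timeval, do_gettimeofday() */

/* Inside thread_fn(), replacing the jiffies-based measurement: */
struct timeval t0, t1;
long elapsed_us;

do_gettimeofday(&t0);
sum = a + b;
printk("Result of addition is %d\n", sum);
do_gettimeofday(&t1);

elapsed_us = (t1.tv_sec - t0.tv_sec) * 1000000L + (t1.tv_usec - t0.tv_usec);
printk("Time elapsed: %ld us\n", elapsed_us);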
I've written code to make each iteration of a while(1) loop take a specific amount of time (in this example 10000 µs, which equals 0.01 seconds). The problem is that this code works pretty well at the start but somehow stops after less than a minute, as if there were a limit on reading the Linux time. For now, I am using a boolean flag so that this time calculation runs only once instead of on every iteration. Since performance varies over time, it would be good to calculate the computation time for each loop iteration. Is there any other way to accomplish this?
void some_function(){
    struct timeval tstart, tend;
    long diff;

    while (1){
        gettimeofday (&tstart, NULL);
        ...
        Some computation
        ...
        gettimeofday (&tend, NULL);
        diff = (tend.tv_sec - tstart.tv_sec)*1000000L+(tend.tv_usec - tstart.tv_usec);
        usleep(10000-diff);
    }
}
From the man page of usleep:
#include <unistd.h>
int usleep(useconds_t usec);
usec is an unsigned int; now guess what happens when diff is greater than 10000 in the line below:
usleep(10000-diff);
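When diff is greater than 10000, the subtraction 10000-diff wraps around to a huge unsigned value and the loop sleeps for over an hour. A minimal sketch of a guarded version (doing the arithmetic in signed 64-bit is an illustrative choice, not the only fix):

#include <stdint.h>
#include <sys/time.h>
#include <unistd.h>

void some_function(void)
{
    struct timeval tstart, tend;
    int64_t diff;

    while (1) {
        gettimeofday(&tstart, NULL);
        /* ... some computation ... */
        gettimeofday(&tend, NULL);

        /* compute the elapsed time in signed 64-bit so a "late" iteration
           simply yields a value >= 10000 instead of wrapping around */
        diff = (int64_t)(tend.tv_sec - tstart.tv_sec) * 1000000
             + ((int64_t)tend.tv_usec - (int64_t)tstart.tv_usec);

        if (diff < 10000)                          /* only sleep if there is time left */
            usleep((useconds_t)(10000 - diff));
    }
}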
Well, the computation you make to get the difference is wrong:
diff = (tend.tv_sec - tstart.tv_sec)*1000000L+(tend.tv_usec - tstart.tv_usec);
You are mixing different integer types, and missing that tv_usec can be an unsigned quantity which you are subtracting from another unsigned value, so the subtraction can overflow. After that, you get as a result a full second plus a quantity that is around 4.0E9 usec. That is some 4000 sec, or more than an hour, approximately. It is better to check whether there is a borrow (tend.tv_usec < tstart.tv_usec) and, in that case, take one second from the seconds difference and add 1000000 to the microseconds difference to get a proper positive value.
I don't know the implementation you are using for struct timeval, but most probably tv_sec is a time_t (which can even be 64-bit) while tv_usec is normally just an unsigned 32-bit value, as it is never going to exceed 1000000.
Let me illustrate... suppose you have elapsed 100ms doing calculations.... and this happens to occur in the middle of a second.... you have
tstart.tv_sec = 123456789; tstart.tv_usec = 123456;
tend.tv_sec = 123456789; tend.tv_usec = 223456;
when you subtract, it leads to:
tv_sec = 0; tv_usec = 100000;
but let's suppose you have done your computation while the second changes
tstart.tv_sec = 123456789; tstart.tv_usec = 923456;
tend.tv_sec = 123456790; tend.tv_usec = 23456;
the time difference is again 100 msec, but now, when you calculate your expression you get, for the first part, 1000000 (one full second), but after subtracting the second part you get 23456 - 923456 => 4294067296 because of the unsigned overflow.
so you end up with something like usleep(4295067296), which is about 4295 s, or 1 h 11 min.
I think you have not had enough patience to wait for it to finish... but this is something that can be happening to your program, depending on how struct timeval is defined.
A proper way to make the borrow work out is to reorder the expression so the additions are done before the subtraction. This keeps the intermediate values positive and prevents a negative overflow in the unsigned quantities:
diff = (tend.tv_sec - tstart.tv_sec) * 1000000 + tend.tv_usec - tstart.tv_usec;
which is parsed as
diff = (((tend.tv_sec - tstart.tv_sec) * 1000000) + tend.tv_usec) - tstart.tv_usec;
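Alternatively, the borrow can be handled explicitly, as described above. A minimal sketch (the helper name usec_diff and the signed 64-bit result are illustrative):

#include <stdint.h>
#include <sys/time.h>

/* Difference tend - tstart in microseconds, handling the borrow from
   the microsecond field explicitly. */
static int64_t usec_diff(const struct timeval *tend, const struct timeval *tstart)
{
    int64_t sec  = (int64_t)tend->tv_sec  - (int64_t)tstart->tv_sec;
    int64_t usec = (int64_t)tend->tv_usec - (int64_t)tstart->tv_usec;

    if (usec < 0) {          /* borrow one second */
        sec  -= 1;
        usec += 1000000;
    }
    return sec * 1000000 + usec;
}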
I want to ask whether anyone here is familiar with the code below from Interbench. I'm trying to port it to the Windows platform but keep failing. I can only get microsecond accuracy by using timeval instead of timespec, and in the end I get divide-by-zero and access-violation errors.
unsigned long get_usecs(struct timeval *myts)
{
    if (clock_gettime(myts))
        terminal_error("clock_gettime");
    return (myts->tv_sec * 1000000 + myts->tv_usec);
}
void burn_loops(unsigned long loops)
{
    unsigned long i;

    /*
     * We need some magic here to prevent the compiler from optimising
     * this loop away. Otherwise trying to emulate a fixed cpu load
     * with this loop will not work.
     */
    for (i = 0; i < loops; i++)
        _ReadWriteBarrier();
}
void calibrate_loop()
{
    unsigned long long start_time, loops_per_msec, run_time = 0;
    unsigned long loops;
    struct timeval myts;

    loops_per_msec = 100000;
redo:
    /* Calibrate to within 1% accuracy */
    while (run_time > 1010000 || run_time < 990000) {
        loops = loops_per_msec;
        start_time = get_usecs(&myts);
        burn_loops(loops);
        run_time = get_usecs(&myts) - start_time;
        loops_per_msec = (1000000 * loops_per_msec / run_time ? run_time : loops_per_msec );
    }

    /* Rechecking after a pause increases reproducibility */
    Sleep(1 * 1000);
    loops = loops_per_msec;
    start_time = get_usecs(&myts);
    burn_loops(loops);
    run_time = get_usecs(&myts) - start_time;

    /* Tolerate 5% difference on checking */
    if (run_time > 1050000 || run_time < 950000)
        goto redo;

    loops_per_ms = loops_per_msec;
}
The only clock_gettime() function I know is the one specified by POSIX, and that function has a different signature than the one you are using. It does provide nanosecond resolution (though it is unlikely to provide single-nanosecond precision). To the best of my knowledge, however, it is not available on Windows. Microsoft's answer to obtaining nanosecond-scale time differences is to use its proprietary "Query Performance Counter" (QPC) API. Do put that aside for the moment, however, because I suspect clock resolution isn't your real problem.
Supposing that your get_usecs() function successfully retrieves a clock time with microsecond resolution and at least (about) millisecond precision, as seems to be the expectation, your code looks a bit peculiar. In particular, this assignment ...
loops_per_msec = (1000000 * loops_per_msec / run_time
? run_time
: loops_per_msec );
... looks quite wrong, as is more apparent when the formatting emphasizes operator precedence, as above (* and / have higher precedence than ?:). It will give you your divide-by-zero if you don't get a measurable positive run time, or otherwise it will always give you either the same loops_per_msec value you started with or else run_time, the latter of which doesn't even have the right units.
I suspect the intent was something more like this ...
loops_per_msec = ((1000000 * loops_per_msec)
/ (run_time ? run_time : loops_per_msec));
..., but that still has a problem: if 1000000 loops is not sufficient to consume at least one microsecond (as measured) then you will fall into an infinite loop, with loops_per_msec repeatedly set to 1000000.
This would be less susceptible to that particular problem ...
loops_per_msec = ((1000000 * loops_per_msec) / (run_time ? run_time : 1));
... and it makes more sense to me, too, because if the measured run time is 0 microseconds, then 1 microsecond is a better non-zero approximation to that than any other possible value. Do note that this will scale up your loops_per_msec quite rapidly (one million-fold) when the measured run time is zero microseconds. You can't do that many times without overflowing, even if unsigned long long turns out to have 128 bits, and if you get an overflow then you will go into an infinite loop. On the other hand, if that overflow happens then it indicates an absurdly large correct value for the loops_per_msec you are trying to estimate.
And that leads me to my conclusion: I suspect your real problem is that your timing calculations are wrong or invalid, either because get_usecs() isn't working correctly or because the body of burn_loops() is being optimized away (despite your effort to avoid that). You don't need sub-microsecond precision for your time measurements. In fact, you don't even really need better than millisecond precision, as long as your burn_loop() actually does work proportional to the value of its argument.
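For what it's worth, here is a rough sketch of what a QPC-based replacement for get_usecs() could look like on Windows (the function name and the lazily initialised frequency are illustrative; note that the multiplication can overflow for very long uptimes):

#include <windows.h>

unsigned long long get_usecs_qpc(void)
{
    static LARGE_INTEGER freq;            /* counter ticks per second */
    LARGE_INTEGER now;

    if (freq.QuadPart == 0)
        QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&now);

    /* convert ticks to microseconds since an arbitrary starting point */
    return (unsigned long long)now.QuadPart * 1000000ULL / (unsigned long long)freq.QuadPart;
}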
I'm trying to use the function nanosleep to make my process sleep for a random amount of time between 0 and 1/10th of a second.
I'm using srand() to seed my random number generator with the process ID, that is, I'm calling:
srand(getpid());
and then using:
struct timespec delay;
delay.tv_sec = 0;
delay.tv_nsec = rand();
nanosleep(&delay, NULL);
How can I make sure I'm sleeping for between 0 and 1/10th of a second?
I'd say you just need 100000000ULL * rand() / RAND_MAX nanoseconds; this is at most 0.1 s and at least 0 s. Alternatively, try usleep() with the argument 100000ULL * rand() / RAND_MAX. (I think usleep requires fewer CPU resources.)
(Edit: Added "unsigned long long" literal specifier to ensure that the number fits. See comments below, and thanks to caf for pointing this out!)
you need to "cap" your rand(), like:
delay.tv_nsec = rand() % 1e8;
Some experts say that this is not the optimal way to do it, because you use the LSB of the number and they are not as "random" as the higher bits, but this is fast and reliable.
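Putting the pieces together, a minimal complete sketch (using the scaling expression from the first answer; the modulo cap would work just as well):

#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    struct timespec delay;

    srand(getpid());                 /* seed with the process id, as in the question */

    delay.tv_sec = 0;
    /* scale rand() into [0, 100000000] nanoseconds, i.e. at most 0.1 s */
    delay.tv_nsec = (long)(100000000ULL * rand() / RAND_MAX);

    nanosleep(&delay, NULL);
    return 0;
}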