I've written code to make each iteration of a while(1) loop take a specific amount of time (in this example 10000 µs, which equals 0.01 seconds). The code works well at the start but somehow stops after less than a minute, as if there were a limit on accessing the Linux time. As a workaround, I currently use a boolean flag so that the time calculation runs only once instead of on every iteration. Since performance varies over time, it would be better to calculate the computation time for each iteration. Is there any other way to accomplish this?
void some_function(){
struct timeval tstart,tend;
while (1){
gettimeofday (&tstart, NULL);
...
Some computation
...
gettimeofday (&tend, NULL);
diff = (tend.tv_sec - tstart.tv_sec)*1000000L+(tend.tv_usec - tstart.tv_usec);
usleep(10000-diff);
}
}
From the man page of usleep:
#include <unistd.h>
int usleep(useconds_t usec);
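usec is an unsigned integer type. Now guess what happens when diff is greater than 10000 in the line below: the difference goes negative and, once converted to the unsigned parameter type, becomes a huge sleep time.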
usleep(10000-diff);
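One way to guard against this is to clamp the remaining time to zero before it ever reaches usleep. The following is a minimal sketch, not the asker's code; the helper name remaining_sleep_us is hypothetical, and it assumes the elapsed time has already been computed correctly in microseconds.

```c
#include <assert.h>

/* Hypothetical helper: microseconds left in the current period,
 * clamped to 0 when the work took longer than the period. */
static long remaining_sleep_us(long period_us, long elapsed_us)
{
    assert(period_us >= 0);
    if (elapsed_us < 0)           /* clock stepped backwards: count it as no time spent */
        elapsed_us = 0;
    if (elapsed_us >= period_us)  /* overran the period: do not sleep at all */
        return 0;
    return period_us - elapsed_us;
}
```

In the loop from the question this would be called as usleep((useconds_t)remaining_sleep_us(10000, diff)), so a slow iteration results in no sleep rather than a wrapped-around one.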
Well, the computation you make to get the difference is wrong:
diff = (tend.tv_sec - tstart.tv_sec)*1000000L+(tend.tv_usec - tstart.tv_usec);
You are mixing different integer types and overlooking that tv_usec can be an unsigned quantity; subtracting one unsigned value from another can wrap around. When that happens, the result is a full second plus roughly 4.0E09 µs, which is some 4000 s, or more than an hour. It is better to check for a borrow: if tend.tv_usec is smaller than tstart.tv_usec, decrement the tv_sec difference by one and add 1000000 to the tv_usec difference to get a proper positive value.
I don't know which implementation of struct timeval you are using, but most probably tv_sec is a time_t (which can even be 64 bits wide) while tv_usec is normally just an unsigned 32-bit value, as it is never going to exceed 1000000. (POSIX specifies suseconds_t as a signed type, so the wraparound described here only applies where the field really is unsigned.)
Let me illustrate. Suppose 100 ms have elapsed doing calculations, and this happens in the middle of a second. You have:
tstart.tv_sec = 123456789; tstart.tv_usec = 123456;
tend.tv_sec = 123456789; tend.tv_usec = 223456;
When you subtract, you get:
tv_sec = 0; tv_usec = 100000;
But now suppose your computation runs while the second changes:
tstart.tv_sec = 123456789; tstart.tv_usec = 923456;
tend.tv_sec = 123456790; tend.tv_usec = 23456;
The time difference is again 100 ms, but now, when you evaluate your expression, you get 1000000 (one full second) for the first part, while the second part gives 23456 - 923456 => 4294067296 after wrapping around as an unsigned 32-bit value.
So diff comes out as 4295067296 µs, or about 4295 s (more than 1 h 11 min), and the argument passed to usleep is wrong by the same order of magnitude.
I think you have not had enough patience to wait for it to finish, but this is something that can be happening in your program, depending on how struct timeval is defined.
A proper way to make the carry work is to reorder the summation so that all the additions are done first and then the subtractions. This forces conversions to signed integers when signed and unsigned values are mixed, and prevents a negative overflow in the unsigned values.
diff = (tend.tv_sec - tstart.tv_sec) * 1000000 + tend.tv_usec - tstart.tv_usec;
which is parsed as
diff = (((tend.tv_sec - tstart.tv_sec) * 1000000) + tend.tv_usec) - tstart.tv_usec;
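A variant that does not depend on how the fields of struct timeval are declared is to widen both fields to a signed 64-bit type before subtracting. This is only a sketch under that assumption; timeval_diff_us is a hypothetical helper name, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical helper: end - start in microseconds. Every operand is
 * converted to int64_t first, so a smaller tv_usec in `end` produces a
 * negative term instead of wrapping around. */
static int64_t timeval_diff_us(const struct timeval *start,
                               const struct timeval *end)
{
    assert(start != NULL && end != NULL);
    int64_t sec  = (int64_t)end->tv_sec  - (int64_t)start->tv_sec;
    int64_t usec = (int64_t)end->tv_usec - (int64_t)start->tv_usec;
    return sec * 1000000 + usec;
}
```

With the two pairs of timestamps from the illustration above, both calls give 100000, including the pair that straddles a second boundary.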
Related
I'm working on an implementation of an obscure network protocol, and one of the requirements is that each packet header should contain a 56-bit timestamp, with the first 4 bytes containing an integer number of seconds since the epoch, and the remaining 3 bytes used to contain a binary fraction of the current second. In other words, the last 3 bytes should represent the number of 2^-24 seconds since the previous second. The first part of the timestamp is trivial, but I'm struggling to implement the C code that would store the fractional part of the timestamp. Can anyone shed some light on how to do this?
For completeness' sake, here's the timestamp code I have so far. primaryHeader is a char* that I'm using to store the header data for the packet. You can assume that the first 6 bytes in primaryHeader contain valid, unrelated data, and that primaryHeader is large enough to contain everything that needs to be stored in it.
int secs = (int)time(NULL);
memcpy(&primaryHeader[7], &secs, sizeof(int));
// TODO: Compute fractional portion of the timestamp and memcpy to primaryHeader[11]
The time() function only gives you whole seconds. You need a higher-resolution timer.
struct timespec t;
timespec_get(&t, TIME_UTC); // C11, declared in <time.h>
int secs = t.tv_sec; // Whole seconds
int frac = ((int64_t)t.tv_nsec << 24) / 1000000000; // Fractional part, in units of 2^-24 s
If this is not available you can use clock_gettime, but clock_gettime is not found on Windows.
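To make the conversion and the 3-byte store concrete, here is a minimal sketch. Both function names are hypothetical, and the most-significant-byte-first order is an assumption, since the question does not say which byte order the protocol uses.

```c
#include <assert.h>
#include <stdint.h>

/* Convert nanoseconds within the current second (0..999999999)
 * to a count of 2^-24 second units (0..16777215). */
static uint32_t frac24_from_nsec(long nsec)
{
    assert(nsec >= 0 && nsec < 1000000000L);
    return (uint32_t)((((uint64_t)nsec) << 24) / 1000000000u);
}

/* Store the low 24 bits of frac into dst[0..2], most significant byte
 * first. Big-endian order is an assumption; use what the protocol says. */
static void put_frac24_be(unsigned char *dst, uint32_t frac)
{
    dst[0] = (unsigned char)((frac >> 16) & 0xFF);
    dst[1] = (unsigned char)((frac >> 8) & 0xFF);
    dst[2] = (unsigned char)(frac & 0xFF);
}
```

Writing the bytes one at a time avoids copying an int with memcpy, whose in-memory layout depends on the host's endianness.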
I am using the code below to convert nanoseconds to microseconds.
This code mostly runs fine, but sometimes usTick gives a value far beyond the current time.
For example, if the current time in usTick is 63290061063, the value sometimes comes out as 126580061060. As you can see, it is doubled.
Similarly, in one more instance the current time was 45960787154, but usTick showed 91920787152.
typedef unsigned long long TUINT64;
unsigned long long GetMonoUSTick()
{
static unsigned long long usTick;
struct timespec t;
clock_gettime(CLOCK_MONOTONIC, &t);
usTick = ((TUINT64)t.tv_nsec) / 1000;
usTick = usTick +((TUINT64)t.tv_sec) * 1000000;
return usTick;
}
If multiple threads of the same process access variables concurrently for reading/writing or writing/writing, those variables need to be protected. This can be achieved by using a mutex.
In this case the local variable usTick needs to be protected, as it is defined static.
Using POSIX-threads the code could look like this:
pthread_mutex_lock(&ustick_mutex);
usTick = ((TUINT64)t.tv_nsec) / 1000;
usTick = usTick +((TUINT64)t.tv_sec) * 1000000;
pthread_mutex_unlock(&ustick_mutex);
(error checking left out for clarity)
Take care to initialise ustick_mutex properly before using it.
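For illustration, the whole function with a statically initialised mutex might look like the sketch below. It assumes POSIX threads and that a static initialiser is acceptable. The result is copied into a local variable while the lock is still held, because reading usTick after unlocking would be unprotected again; the assert calls stand in for real error handling.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

typedef unsigned long long TUINT64;

/* Statically initialised, so no pthread_mutex_init() call is needed. */
static pthread_mutex_t ustick_mutex = PTHREAD_MUTEX_INITIALIZER;

TUINT64 GetMonoUSTick(void)
{
    static TUINT64 usTick;   /* shared between threads, guarded by ustick_mutex */
    struct timespec t;
    TUINT64 result;
    int rc;

    rc = clock_gettime(CLOCK_MONOTONIC, &t);
    assert(rc == 0);

    rc = pthread_mutex_lock(&ustick_mutex);
    assert(rc == 0);
    usTick = ((TUINT64)t.tv_nsec) / 1000;
    usTick = usTick + ((TUINT64)t.tv_sec) * 1000000;
    result = usTick;         /* copy while the lock is still held */
    rc = pthread_mutex_unlock(&ustick_mutex);
    assert(rc == 0);

    return result;
}
```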
Sounds like you are using threads and that once in a while two threads are calling this function at the same time. Because usTick is static, they are working with the same variable (not two different copies). They both convert the nanoseconds to microseconds and assign to usTick, one after the other, and then they both convert the seconds to microseconds and both add them to usTick (so the seconds get added twice).
EDIT - The following solution I proposed was based on two assumptions:
If two threads called the function at the same time, the difference between the current times returned by clock_gettime() would be too small to matter (the difference between the results calculated by the threads would be at most 1).
On most modern CPUs, reading/writing an integer is an atomic operation.
I think you should be able to fix this by changing:
usTick = ((TUINT64)t.tv_nsec) / 1000;
usTick = usTick +((TUINT64)t.tv_sec) * 1000000;
to:
usTick = ((TUINT64)t.tv_nsec) / 1000 + ((TUINT64)t.tv_sec) * 1000000;
The problem with my solution was that even if the latter assumption were correct, it might not hold for long long. Therefore, for example, this could happen:
Thread A and B call the function at (almost) the same time.
Thread A calculates current time in microsecond as 0x02B3 1F02 FFFF FFFF.
Thread B calculates current time in microsecond as 0x02B3 1F03 0000 0000.
Thread A writes the least significant 32 bits (0xFFFF FFFF) to usTick.
Thread B writes the least significant 32 bits (0x0000 0000) to usTick.
Thread B writes the most significant 32 bits (0x02B3 1F03) to usTick.
Thread A writes the most significant 32 bits (0x02B3 1F02) to usTick.
and then the function would in both threads return 0x02B3 1F02 0000 0000 which is off by 4294967295.
So do as @alk said and use a mutex to protect the reads and writes of usTick.
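A different technique from the mutex, shown here only as a sketch, is to drop the static variable altogether. If nothing else relies on usTick keeping its value between calls (an assumption about the surrounding program), each call can work purely on its own stack variables, and then there is no shared state left to protect. The name GetMonoUSTickLocal is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

typedef unsigned long long TUINT64;

/* Variant without shared state: every caller uses only local variables,
 * so concurrent calls cannot interfere with each other. */
TUINT64 GetMonoUSTickLocal(void)
{
    struct timespec t;
    int rc = clock_gettime(CLOCK_MONOTONIC, &t);
    assert(rc == 0);
    return (TUINT64)t.tv_sec * 1000000 + (TUINT64)t.tv_nsec / 1000;
}
```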
I would like to ask whether anyone here is familiar with the functions below from Interbench. I want to port them to Windows but keep failing. I can only get microsecond accuracy, using timeval instead of timespec. And in the end I get errors: divide-by-zero and access violation exceptions.
unsigned long get_usecs(struct timeval *myts)
{
if (clock_gettime(myts))
terminal_error("clock_gettime");
return (myts->tv_sec * 1000000 + myts->tv_usec);
}
void burn_loops(unsigned long loops)
{
unsigned long i;
/*
* We need some magic here to prevent the compiler from optimising
* this loop away. Otherwise trying to emulate a fixed cpu load
* with this loop will not work.
*/
for (i = 0; i < loops; i++)
_ReadWriteBarrier();
}
void calibrate_loop()
{
unsigned long long start_time, loops_per_msec, run_time = 0;
unsigned long loops;
struct timeval myts;
loops_per_msec = 100000;
redo:
/* Calibrate to within 1% accuracy */
while (run_time > 1010000 || run_time < 990000) {
loops = loops_per_msec;
start_time = get_usecs(&myts);
burn_loops(loops);
run_time = get_usecs(&myts) - start_time;
loops_per_msec = (1000000 * loops_per_msec / run_time ? run_time : loops_per_msec );
}
/* Rechecking after a pause increases reproducibility */
Sleep(1 * 1000);
loops = loops_per_msec;
start_time = get_usecs(&myts);
burn_loops(loops);
run_time = get_usecs(&myts) - start_time;
/* Tolerate 5% difference on checking */
if (run_time > 1050000 || run_time < 950000)
goto redo;
loops_per_ms = loops_per_msec;
}
The only clock_gettime() function I know is the one specified by POSIX, and that function has a different signature than the one you are using. It does provide nanosecond resolution (though it is unlikely to provide single-nanosecond precision). To the best of my knowledge, however, it is not available on Windows. Microsoft's answer to obtaining nanosecond-scale time differences is to use its proprietary "Query Performance Counter" (QPC) API. Do put that aside for the moment, however, because I suspect clock resolution isn't your real problem.
Supposing that your get_usecs() function successfully retrieves a clock time with microsecond resolution and at least (about) millisecond precision, as seems to be the expectation, your code looks a bit peculiar. In particular, this assignment ...
loops_per_msec = (1000000 * loops_per_msec / run_time
? run_time
: loops_per_msec );
... looks quite wrong, as is more apparent when the formatting emphasizes operator precedence, as above (* and / have higher precedence than ?:). It will give you your divide-by-zero if you don't get a measurable positive run time, or otherwise it will always give you either the same loops_per_msec value you started with or else run_time, the latter of which doesn't even have the right units.
I suspect the intent was something more like this ...
loops_per_msec = ((1000000 * loops_per_msec)
/ (run_time ? run_time : loops_per_msec));
..., but that still has a problem: if 1000000 loops is not sufficient to consume at least one microsecond (as measured) then you will fall into an infinite loop, with loops_per_msec repeatedly set to 1000000.
This would be less susceptible to that particular problem ...
loops_per_msec = ((1000000 * loops_per_msec) / (run_time ? run_time : 1));
... and it makes more sense to me, too, because if the measured run time is 0 microseconds, then 1 microsecond is a better non-zero approximation to that than any other possible value. Do note that this will scale up your loops_per_msec quite rapidly (one million-fold) when the measured run time is zero microseconds. You can't do that many times without overflowing, even if unsigned long long turns out to have 128 bits, and if you get an overflow then you will go into an infinite loop. On the other hand, if that overflow happens then it indicates an absurdly large correct value for the loops_per_msec you are trying to estimate.
And that leads me to my conclusion: I suspect your real problem is that your timing calculations are wrong or invalid, either because get_usecs() isn't working correctly or because the body of burn_loops() is being optimized away (despite your effort to avoid that). You don't need sub-microsecond precision for your time measurements. In fact, you don't even really need better than millisecond precision, as long as your burn_loops() actually does work proportional to the value of its argument.
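The two points above, a zero-safe rescaling step and a busy loop the compiler cannot remove, can be sketched as follows. This is not Interbench's code: rescale_loops is a hypothetical helper, and the volatile counter is offered as a portable stand-in for _ReadWriteBarrier(), on the assumption that an empty counted loop is an acceptable load.

```c
#include <assert.h>

/* Hypothetical helper: rescale the loop count so that the next run
 * should take about target_us microseconds. */
static unsigned long long rescale_loops(unsigned long long loops,
                                        unsigned long long run_time_us,
                                        unsigned long long target_us)
{
    assert(target_us > 0);
    if (run_time_us == 0)   /* too fast to measure: treat as 1 us */
        run_time_us = 1;
    return target_us * loops / run_time_us;
}

/* Busy loop with a volatile counter, so the compiler has to perform
 * every read and write of i and cannot drop the loop. */
static void burn_loops(unsigned long loops)
{
    volatile unsigned long i;
    for (i = 0; i < loops; i++)
        ;
}
```

Inside the calibration loop the assignment would then read loops_per_msec = rescale_loops(loops_per_msec, run_time, 1000000), matching the 1000000 µs target the original loop calibrates against.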
(You can skip this part; it is just an explanation of the code below. My problems are described under the code block.)
Hi. I'm trying to write an algorithm for throttling loop cycles based on how much bandwidth the Linux computer is using. I'm reading /proc/net/dev once a second and keeping track of the bytes transmitted in 2 variables: one holds the value from the last check, the other the most recent value. It then subtracts the last value from the recent one to calculate how many bytes have been sent in 1 second.
From there I have the variables max_throttle, throttle, max_speed, and sleepp.
The idea is to increase or decrease sleepp depending on the bandwidth being used: the less bandwidth, the lower the delay, and the more bandwidth, the longer the delay.
I am currently having two problems dealing with floats and ints. If I set all my variables to ints, max_throttle always becomes 0 no matter what I set the others to, even if I initialize them.
Also, even though my if statement says "if sleepp is less than 0, reset it to 0", it keeps going deeper and deeper into the negatives, then levels out at around -540 with 0 bandwidth being used.
The if(ii & 0x40) is for speed and usage control. In my application there will be no 1-second sleep, so this code lets me limit sleepp to changing about once every 20-30 iterations. I'm also having a problem with it, though: after the 2X iterations, once it does trigger, it continues to trigger on every iteration after that instead of being true only once and then being true again after 20-30 more iterations.
Edit: a simpler test case for my variable problem.
#include <stdio.h>
int main()
{
int max_t, max_s, throttle;
max_s = 400;
throttle = 90;
max_t = max_s * (throttle / 100);
printf("max throttle:%d\n", max_t);
return 0;
}
In C, the / operator performs integer division when both operands are integers. Therefore, 90/100 = 0. To do floating-point division with integers, first convert them to float (or double or another floating-point type).
max_t = (int)(max_s * ((float)throttle / 100.0) + 0.5);
The +0.5 rounds to the nearest integer before the conversion to int. You might want to consider the standard rounding or flooring functions instead; I don't know your use case.
Also note that 100.0 is a floating-point literal (of type double), whereas 100 would be an integer literal. So, although they seem identical, they are not.
As kralyk pointed out, C’s integer division of 90/100 is 0. But rather than using floats you can work with ints… Just do the division after the multiplication (note the omission of parentheses):
max_t = max_s * throttle / 100;
This gives you the general idea. For example if you want the kind of rounding kralyk mentions, add 50 before doing the division:
max_t = (max_s * throttle + 50) / 100;
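Wrapped into a function, the multiply-first approach might look like the sketch below. The name scale_percent is hypothetical, and the code assumes non-negative inputs, since adding 50 before dividing only rounds to nearest for non-negative values. The product is widened to long long so that large values of max_s cannot overflow int.

```c
#include <assert.h>

/* Hypothetical helper: value * percent / 100, rounded to nearest,
 * using only integer arithmetic. Assumes non-negative inputs. */
static int scale_percent(int value, int percent)
{
    assert(value >= 0 && percent >= 0);
    /* Widen first so value * percent cannot overflow int. */
    long long scaled = (long long)value * percent;
    return (int)((scaled + 50) / 100);
}
```

With the numbers from the test case, scale_percent(400, 90) yields 360 rather than 0.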
I am trying to measure how long a function takes.
I have a little issue: although I am trying to be precise and use floating point, every time I print my result using %lf I get one of two answers: 1.000... or 0.000... This leads me to wonder whether my code is correct:
#define BILLION 1000000000L;
// Calculate time taken by a request
struct timespec requestStart, requestEnd;
clock_gettime(CLOCK_REALTIME, &requestStart);
function_call();
clock_gettime(CLOCK_REALTIME, &requestEnd);
// Calculate time it took
double accum = ( requestEnd.tv_sec - requestStart.tv_sec )
+ ( requestEnd.tv_nsec - requestStart.tv_nsec )
/ BILLION;
printf( "%lf\n", accum );
Most of this code has not been made by me. This example page had code illustrating the use of clock_gettime:
Could anyone please let me know what is incorrect, or why I am only getting int values please?
Dividing an integer by an integer yields an integer. Try this:
#define BILLION 1E9
And don't use a semicolon at the end of the line. #define is a preprocessor directive, not a statement, and including the semicolon resulted in BILLION being defined as 1000000000L;, which would break if you tried to use it in most contexts. You got lucky because you used it at the very end of an expression and outside any parentheses.
( requestEnd.tv_nsec - requestStart.tv_nsec ) is of integer type, and is always less than BILLION, so the result of dividing one by the other in integer arithmetic will always be 0. You need to cast the result of the subtraction to e.g. double before doing the divide.
Note that (requestEnd.tv_nsec - requestStart.tv_nsec) can be negative, in which case you need to subtract 1 second from the tv_sec difference and add one BILLION to the tv_nsec difference.
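The borrow described above can be sketched as a pair of small helpers. The names timespec_sub and timespec_to_seconds are hypothetical, and the sketch assumes the end time is not earlier than the start time.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: end - start as a normalised timespec
 * (tv_nsec always in 0..999999999). Assumes end is not before start. */
static struct timespec timespec_sub(struct timespec start, struct timespec end)
{
    struct timespec d;
    d.tv_sec  = end.tv_sec  - start.tv_sec;
    d.tv_nsec = end.tv_nsec - start.tv_nsec;
    if (d.tv_nsec < 0) {        /* borrow one second */
        d.tv_sec  -= 1;
        d.tv_nsec += 1000000000L;
    }
    assert(d.tv_sec >= 0);
    return d;
}

/* Elapsed time in seconds; the cast to double happens before the divide. */
static double timespec_to_seconds(struct timespec d)
{
    return (double)d.tv_sec + (double)d.tv_nsec / 1e9;
}
```

In the question's code this would be used as accum = timespec_to_seconds(timespec_sub(requestStart, requestEnd)).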
I know the question was posted long ago, but I still don't see an answer suggesting that you convert the elapsed time into nanoseconds (or milliseconds) rather than into seconds as in your code sample.
The sample code fragment to illustrate the idea:
long long accum = ( requestEnd.tv_nsec - requestStart.tv_nsec )
+ ( requestEnd.tv_sec - requestStart.tv_sec ) * BILLION;
This way you can avoid floating-point arithmetic, which may be expensive on some platforms...
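A self-contained sketch of the same idea follows. It does not rely on the BILLION macro and widens to int64_t before multiplying, on the assumption that long may be only 32 bits on some platforms; timespec_diff_ns is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: end - start in nanoseconds using 64-bit integers.
 * A negative tv_nsec difference simply reduces the total, so no explicit
 * borrow is needed. */
static int64_t timespec_diff_ns(const struct timespec *start,
                                const struct timespec *end)
{
    assert(start != NULL && end != NULL);
    return ((int64_t)end->tv_sec - (int64_t)start->tv_sec) * INT64_C(1000000000)
         + ((int64_t)end->tv_nsec - (int64_t)start->tv_nsec);
}
```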