I'm trying to port u8glib (a graphics library) to a MIPS processor on an OpenWrt router. Here's an example in an ARM environment.
As such, one of the routines I must implement is:
delay_micro_seconds(uint32_t us)
Since this is a high resolution unit of time, how can I do this reliably in my environment? I've tried the following, but I'm not even sure how to validate it:
nanosleep((struct timespec[]){{0, 1000}}, NULL);
How can I validate this approach? If it's a bad approach, how could I reliably delay by 1 microsecond in C?
EDIT: I've tried this, but I'm getting strange output. I expect the difference between the two prints to be 1000 ns * 10 iterations = 10,000 ns, but it is actually closer to 670,000 nanoseconds:
#include <stdio.h>
#include <time.h>

int main(int argc, char **argv)
{
    long res, resb;
    struct timespec ts, tsb;
    int i;

    res = clock_gettime(CLOCK_REALTIME, &ts);
    for (i = 0; i < 10; i++) {
        nanosleep((struct timespec[]){{0, 1000}}, NULL);  /* 1000 ns = 1 us */
    }
    resb = clock_gettime(CLOCK_REALTIME, &tsb);

    if (0 == res) printf("%ld %ld\n", ts.tv_sec, ts.tv_nsec);
    else perror("clock_gettime");
    if (0 == resb) printf("%ld %ld\n", tsb.tv_sec, tsb.tv_nsec);
    else perror("clock_gettime"); // getting a 670k delta instead of 10k for tv_nsec

    return 0;
}
I assume your code runs under Linux.
First, use clock_getres(2) to find the clock's resolution; then, clock_nanosleep(2) may give you a more accurate sleep.
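For example, a minimal sketch of those two calls (the 1000 ns delay is just illustrative; on older glibc you may need to link with -lrt):

#include <stdio.h>
#include <string.h>
#include <time.h>

int main(void)
{
    struct timespec res, delay = {0, 1000};   /* request: 1000 ns = 1 us */

    /* How fine-grained is this clock really? */
    if (clock_getres(CLOCK_MONOTONIC, &res) == 0)
        printf("CLOCK_MONOTONIC resolution: %ld s %ld ns\n",
               (long)res.tv_sec, res.tv_nsec);
    else
        perror("clock_getres");

    /* Relative sleep; the kernel may round the request up to the timer resolution. */
    int err = clock_nanosleep(CLOCK_MONOTONIC, 0, &delay, NULL);
    if (err != 0)
        fprintf(stderr, "clock_nanosleep: %s\n", strerror(err));

    return 0;
}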
To validate the sleep, I suggest you check the elapsed time with clock_gettime(2):
struct timespec ts, delay = {0, 1000};  /* requested delay: 1000 ns */
int res;

res = clock_gettime(CLOCK_REALTIME, &ts);
if (0 == res) printf("%ld %ld\n", ts.tv_sec, ts.tv_nsec);
else perror("clock_gettime");

clock_nanosleep(CLOCK_REALTIME, 0, &delay, NULL);

res = clock_gettime(CLOCK_REALTIME, &ts);
if (0 == res) printf("%ld %ld\n", ts.tv_sec, ts.tv_nsec);
else perror("clock_gettime");
Also, if necessary, you can recompile your kernel with a higher HZ configuration.
It would be helpful to read the time(7) man page, especially the "The software clock, HZ, and jiffies" and "High-resolution timers" sections.
Though my man page says that high-resolution timers are not supported on the MIPS architecture, I just googled it and mips-linux apparently does support HRT.
Is this sleep synchronous? (I'm not familiar with your platform)
Validation
If you have development tools available, are there any profiling tools included? They can at least help you measure execution time.
You can also loop around your sleep call 1000+ times in a test program. Before you enter the loop, get a timestamp from the system clock. After the loop is done cycling, take another timestamp and compare it with the first. Remember that the loop itself has some overhead, but otherwise this will tell you how much your sleep overshoots or undershoots. (If you cycle 1,000,000 times around a 1 microsecond sleep function, you would expect it to finish quite near to 1 second.)
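A minimal sketch of that test, assuming your delay_micro_seconds() is simply a wrapper around nanosleep() (the wrapper shown here is hypothetical):

#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical routine under test; assumes us < 1,000,000. */
static void delay_micro_seconds(uint32_t us)
{
    struct timespec req = {0, (long)us * 1000};
    nanosleep(&req, NULL);
}

int main(void)
{
    enum { ITERATIONS = 1000000 };
    struct timespec t0, t1;
    double elapsed;
    int i;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < ITERATIONS; i++)
        delay_micro_seconds(1);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    /* Ideally ~1 second; the excess is the average per-call overhead/overshoot. */
    elapsed = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%d calls took %f s (%.2f us per call)\n",
           ITERATIONS, elapsed, elapsed * 1e6 / ITERATIONS);
    return 0;
}

In practice the measured per-call time is dominated by scheduler and syscall overhead, which is exactly what this test is meant to expose.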
Alternative
Sleep functions are not always exact to their stated resolution, but they are simple to use and reliably get you in the neighborhood of what they promise. Many individual statements, such as a++;, run much faster than a microsecond.
Using a similar method to the one above, you could make a homemade synchronous timer with good accuracy: a for loop with some pointless statement inside it. Once you find out how many iterations land you nearest to 1 microsecond, that count should stay stable on a fixed-clock CPU, and you could hard-code a function out of it (see the sketch below).
If you intend your delay to be asynchronous, with a multi-tasking process in mind, this obviously would not cooperate well with the other tasks.
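A minimal sketch of that calibration idea (ITERS_PER_US is a placeholder you would measure on your own CPU, e.g. with the loop-and-timestamp test above; volatile keeps the compiler from optimizing the loop away):

/* Hypothetical calibrated busy-wait; only meaningful while the CPU clock is fixed. */
#define ITERS_PER_US 120UL   /* placeholder, not a measured value */

static void busy_delay_us(unsigned long us)
{
    volatile unsigned long i;
    for (i = 0; i < us * ITERS_PER_US; i++)
        ;                    /* the "pointless statement" */
}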
Related
I'm trying to 'roughly' calculate the time of a thread context switch in a Linux system. I've written a program that uses pipes and multi-threading to achieve this. When running the program, the calculated time is clearly wrong (see output below). I am unsure whether this is due to using the wrong clock_id for this procedure or perhaps to my implementation.
I have used sched_setaffinity() so that the program only runs on core 0. I've tried to leave as much fluff out of the code as possible so as to only measure the time of a context switch, so the thread only writes a single character to the pipe and the parent does a 0-byte read.
I have a parent thread that creates one child thread, with a one-way pipe between them to pass data; the child thread runs a simple function to write to the pipe.
void* thread_1_function()
{
    write(fd2[1], "", sizeof(""));
    return NULL;
}
The parent thread creates the child thread, starts the time counter, and then calls a read on the pipe that the child thread writes to.
int main(int argc, char **argv)
{
//time struct declaration
struct timespec start,end;
//sets program to only use core 0
cpu_set_t cpu_set;
CPU_ZERO(&cpu_set);
CPU_SET(0,&cpu_set);
if((sched_setaffinity(0, sizeof(cpu_set_t), &cpu_set) < 1))
{
int nproc = sysconf(_SC_NPROCESSORS_ONLN);
int k;
printf("Processor used: ");
for(k = 0; k < nproc; ++k)
{
printf("%d ", CPU_ISSET(k, &cpu_set));
}
printf("\n");
if(pipe(fd1) == -1)
{
printf("fd1 pipe error");
return 1;
}
//fail on file descriptor 2 fail
if(pipe(fd2) == -1)
{
printf("fd2 pipe error");
return 1;
}
pthread_t thread_1;
pthread_create(&thread_1, NULL, &thread_1_function, NULL);
pthread_join(thread_1,NULL);
int i;
uint64_t sum = 0;
for(i = 0; i < iterations; ++i)
{
//initialize clock start
clock_gettime(CLOCK_MONOTONIC, &start);
//wait for child thread to write to pipe
read(fd2[0],input,0);
//record clock end
clock_gettime(CLOCK_MONOTONIC, &end);
write(fd1[1],"",sizeof(""));
uint64_t diff;
diff = billion * (end.tv_sec - start.tv_sec) + end.tv_nsec - start.tv_nsec;
diff = diff;
sum += diff;
}
The results I get while running this are typically like this:
3000
3000
4000
2000
12000
3000
5000
and so forth. When I inspect the times returned in the start and end timespec structs, I see that tv_nsec seems to be a 'rounded' number as well:
start.tv_nsec: 714885000, end.tv_nsec: 714888000
Would this be caused by CLOCK_MONOTONIC not being precise enough for what I'm attempting to measure, or some other problem that I'm overlooking?
I see that tv_nsec seems to be a 'rounded' number as well:
2626, 714885000, 2626, 714888000
Would this be caused by CLOCK_MONOTONIC not being precise enough for what I'm attempting to measure, or some other problem that I'm overlooking?
Yes, that's a possibility. Every clock supported by the system has a fixed resolution. struct timespec is capable of supporting clocks with nanosecond resolution, but that does not mean that you can expect every clock to actually have such resolution. It looks like your CLOCK_MONOTONIC might have a resolution of 1 microsecond (1000 nanoseconds), but you can check that via the clock_getres() function.
If it is available to you, then you might try CLOCK_PROCESS_CPUTIME_ID. It is possible that that would have higher resolution than CLOCK_MONOTONIC for you, but do note that single-microsecond resolution is pretty precise -- that's on the order of one tick per 3000 CPU cycles on a modern machine.
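For instance, a quick check of both clocks' reported resolutions might look like this (a sketch, not part of the question's code):

#include <stdio.h>
#include <time.h>

static void print_res(const char *name, clockid_t id)
{
    struct timespec res;
    if (clock_getres(id, &res) == 0)
        printf("%s resolution: %ld s %ld ns\n", name, (long)res.tv_sec, res.tv_nsec);
    else
        perror(name);
}

int main(void)
{
    print_res("CLOCK_MONOTONIC", CLOCK_MONOTONIC);
    print_res("CLOCK_PROCESS_CPUTIME_ID", CLOCK_PROCESS_CPUTIME_ID);
    return 0;
}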
Even so, I see several possible problems with your approach:
Although you set your process to have affinity for a single CPU, that does not prevent the system from scheduling other processes on that CPU, too. Thus, unless you've taken additional measures, you can't be certain -- it's not even likely -- that every context switch away from one of your program's threads is to the other thread.
You start your second thread and then immediately join it. There is no more context switching between your threads after that, because your second thread no longer exists after being successfully joined.
read() with a count of 0 may or may not check for errors, and it certainly does not transfer any data. It is totally unclear to me why you identify the time for that call with the time for a context switch.
If a context switch does occur in the space you're timing, then at least two need to occur there -- away from your program and back to it. Also, you're measuring the time consumed by whatever else runs in the other context as well, not just the switch time. The 1000-nanosecond steps may thus reflect time slices, rather than switching time.
Your main thread is writing null characters to the write end of a pipe, but there does not appear to be anything reading them. If indeed there isn't then this will eventually fill up the pipe's buffer and block. The purpose is lost on me.
I have read several forums with the same title, although what I am after is NOT a way to find out how long it takes my computer to execute the program.
I am interested in finding out how long the program is in use by the user. I have looked at several functions in <time.h>; however, it seems as though these functions (like clock_t) give the time it takes for my computer to execute my code, which is not what I am after.
Thank you for your time.
EDIT:
I have done:
clock_t start, stop;
long int x;
double duration;
start = clock();
//code
stop = clock(); // get number of ticks after loop
// calculate time taken for loop
duration = ( double ) ( stop - start ) / CLOCKS_PER_SEC;
printf( "\nThe number of seconds for loop to run was %.2lf\n", duration );
I receive 0.16 from the program; however, when I timed it I got 1 minute and 14 seconds. How does this possibly add up?
The clock function counts CPU time used, not elapsed time. For a program that's 100% CPU-intensive, the time reported by clock will be close to wall time, but otherwise it's likely to be less -- often much less.
One easy way of measuring elapsed or "wall clock" time is with the time function:
time_t start, stop;
start = time(NULL);
// code
stop = time(NULL);
printf("The number of seconds for loop to run was %ld\n", stop - start);
This is a POSIX function, though -- it's not part of the core C standards, and it may not exist on, say, embedded versions of C.
I think you're looking for the getrusage() function. It doesn't time a function the way a stop-watch would; it gives the overall resource usage of the current process, split into user time and system time. This is what the time command reports.
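For example, a minimal sketch of reading the process's own usage (RUSAGE_SELF):

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    /* ... do some work here ... */

    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) == 0) {
        printf("user time:   %ld.%06ld s\n",
               (long)usage.ru_utime.tv_sec, (long)usage.ru_utime.tv_usec);
        printf("system time: %ld.%06ld s\n",
               (long)usage.ru_stime.tv_sec, (long)usage.ru_stime.tv_usec);
    } else {
        perror("getrusage");
    }
    return 0;
}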
I've always used clock() to measure how much time my application took from start to finish, as:
int main(int argc, char *argv[]) {
const clock_t START = clock();
// ...
const double T_ELAPSED = (double)(clock() - START) / CLOCKS_PER_SEC;
}
Since I've started using POSIX threads, this seems to fail. It looks like clock() increases N times faster with N threads. As I don't know how many threads are going to be running simultaneously, this approach fails. So how can I measure how much time has passed?
clock() measures the CPU time used by your process, not the wall-clock time. When you have multiple threads running simultaneously, you can obviously burn through CPU time much faster.
If you want to know the wall-clock execution time, you need to use an appropriate function. The only one in ANSI C is time(), which typically only has 1 second resolution.
However, as you've said you're using POSIX, that means you can use clock_gettime(), defined in time.h. The CLOCK_MONOTONIC clock in particular is the best to use for this:
struct timespec start, finish;
double elapsed;
clock_gettime(CLOCK_MONOTONIC, &start);
/* ... */
clock_gettime(CLOCK_MONOTONIC, &finish);
elapsed = (finish.tv_sec - start.tv_sec);
elapsed += (finish.tv_nsec - start.tv_nsec) / 1000000000.0;
(Note that I have done the calculation of elapsed carefully to ensure that precision is not lost when timing very short intervals).
If your OS doesn't provide CLOCK_MONOTONIC (which you can check at runtime with sysconf(_SC_MONOTONIC_CLOCK)), then you can use CLOCK_REALTIME as a fallback - but note that the latter has the disadvantage that it will generate incorrect results if the system time is changed while your process is running.
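A sketch of that runtime check (the pick_clock() helper is just an illustration):

#include <time.h>
#include <unistd.h>

/* Pick CLOCK_MONOTONIC when the OS supports it, otherwise fall back
 * to CLOCK_REALTIME (which can jump if the system time is changed). */
static clockid_t pick_clock(void)
{
#ifdef _SC_MONOTONIC_CLOCK
    if (sysconf(_SC_MONOTONIC_CLOCK) > 0)
        return CLOCK_MONOTONIC;
#endif
    return CLOCK_REALTIME;
}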
What timing resolution do you need? You could use time() from time.h for second resolution. If you need higher resolution, then you could use something more system specific. See Timer function to provide time in nano seconds using C++
I code inside a virtual machine (Linux Ubuntu) that's installed on Windows.
According to this page, the CLOCKS_PER_SEC value in the library on Linux should always be 1 000 000.
When I run this code:
#include <stdio.h>
#include <time.h>

int main()
{
    printf("%ld\n", (long)CLOCKS_PER_SEC);
    while (1)
        printf("%f, %f \n\n", (double)clock(), (double)clock() / CLOCKS_PER_SEC);
    return 0;
}
The first value is 1 000 000 as it should be; however, the value that should show the number of seconds does NOT increase at the normal pace (it takes between 4 and 5 seconds to increase by 1).
Is this due to my working on a virtual machine? How can I solve this?
This is expected.
The clock() function does not return the wall time (the time that real clocks on the wall display). It returns the amount of CPU time used by your program. If your program is not consuming every possible scheduler slice, then it will increase slower than wall time, if your program consumes slices on multiple cores at the same time it can increase faster.
So if you call clock(), and then sleep(5), and then call clock() again, you'll find that clock() has barely increased at all. Even though sleep(5) waits for 5 real seconds, it doesn't consume any CPU, and CPU usage is what clock() measures.
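To see this yourself, here is a small demo (assuming a POSIX sleep() is available):

#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    clock_t before = clock();
    sleep(5);                 /* 5 seconds of wall time, essentially 0 CPU time */
    clock_t after = clock();

    /* Prints something very close to 0.000000, not 5. */
    printf("CPU seconds across sleep(5): %f\n",
           (double)(after - before) / CLOCKS_PER_SEC);
    return 0;
}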
If you want to measure wall clock time you will want clock_gettime() (or the older version gettimeofday()). You can use CLOCK_REALTIME if you want to know the civil time (e.g. "it's 3:36 PM") or CLOCK_MONOTONIC if you want to measure time intervals. In this case, you probably want CLOCK_MONOTONIC.
#include <stdio.h>
#include <time.h>

int main() {
    struct timespec start, now;
    clock_gettime(CLOCK_MONOTONIC, &start);
    while (1) {
        clock_gettime(CLOCK_MONOTONIC, &now);
        printf("Elapsed: %f\n",
               (now.tv_sec - start.tv_sec) +
               1e-9 * (now.tv_nsec - start.tv_nsec));
    }
}
The usual proscriptions against using busy-loops apply here.
I have a function that is called at specific intervals. I need to check the time it was previously called against the current time, and if the difference between calls is 10 milliseconds, execute some piece of code. Sleep should not be used, since some other things are executing in parallel. I have written the following code, and the function is called every 10 milliseconds, but the difference I am calculating sometimes comes out 1 or 2 milliseconds short. What is the best way to calculate the difference?
fxn()
{
int logCurTime;
static int logPrevTime = 0, logDiffTime = 0;
getCurrentTimeInMilliSec(&logCurTime);
if (logPrevTime > 0)
logDiffTime += logCurTime - logPrevTime;
if (logCurTime <= logPrevTime)
return;
if (logDiffTime >= 10)
{
...
...
logDiffTime = 0;
}
logPrevTime = logCurTime;
}
For example:
fxn is called 10 times at 10-millisecond intervals. In some instances logDiffTime is just 8 or 9, and the next instance accounts for the remaining time, i.e., 11 or 12.
Using sleep() to get code executed in specific time intervals is indeed a bad idea. Register your function as the handler for a timer interrupt. Then it will be called very precisely on time.
If you're doing heavy lifting in your function, then you should do it in another thread, because you will run into trouble if your function takes too long (it will just be called from the beginning again).
In POSIX (Linux) you could do it like this (wrapping fxn() in a handler that has the void (*)(int) signature signal() requires):
#include <sys/time.h>
#include <stdio.h>
#include <signal.h>

/* Signal handlers must take an int argument, so wrap fxn(). */
void fxn_alarm_handler(int sig)
{
    (void)sig;
    fxn();
}

if (signal(SIGALRM, fxn_alarm_handler) == SIG_ERR)
    perror("Setting your function as timer handler failed");

unsigned seconds = 42; // your time
struct itimerval old, new_time;
new_time.it_interval.tv_usec = 0;   // set it_interval non-zero if the timer should repeat
new_time.it_interval.tv_sec = 0;
new_time.it_value.tv_usec = 0;
new_time.it_value.tv_sec = (long int) seconds;
if (setitimer(ITIMER_REAL, &new_time, &old) != 0)
    perror("Setting the timer failed");
or on Windows:
#include <Windows.h>

void CALLBACK Fxn_Timer_Proc_Wrapper(HWND hwnd, UINT msg, UINT_PTR id, DWORD time)
{
    fxn();
}

unsigned seconds = 42; // your time
UINT_PTR timer_id;
if ((timer_id = SetTimer(NULL, 0, seconds * 1000, (TIMERPROC)Fxn_Timer_Proc_Wrapper)) == 0) {
    // failed to create a timer
}
It may not be exactly what you are looking for; however, I feel it should be clarified:
The sleep call only suspends the calling thread, not all threads of the process. Thus, you can still run parallel threads while one of them sleeps.
See this question for more:
Do sleep functions sleep all threads or just the one who call it?
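For illustration, a minimal sketch showing that sleep() in the main thread does not stop a second thread (compile with -pthread):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 5; i++) {
        printf("worker still running (%d)\n", i);
        sleep(1);              /* paces the worker; only this thread pauses here */
    }
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);

    sleep(3);                  /* main thread sleeps; the worker keeps printing */
    printf("main woke up\n");

    pthread_join(t, NULL);
    return 0;
}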
For a solution to your problem you should register your function with a timer interrupt. See the other answer on how to do that.
10 ms is at the edge of what is achievable; see stack overflow: 1ms timer. However, several suggestions on how to get 10 ms did come out.
timerfd_create allows your program to wait using select() (see the sketch below).
timer_settime allows your program to request the 10 ms interval.
The caveats on Linux are:
It may not be scheduled - the OS could be busy doing something else.
It may not be accurate - as 10 ms appears to be the shortest interval that works, it may be +/- 1 or 2 ms.
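A minimal sketch of the timerfd approach, waiting on a repeating 10 ms timer (note that the timerfd API pairs timerfd_create() with timerfd_settime(); timer_settime() belongs to the related POSIX timer_create() API):

#include <stdint.h>
#include <stdio.h>
#include <sys/timerfd.h>
#include <unistd.h>

int main(void)
{
    /* Timer file descriptor driven by the monotonic clock. */
    int tfd = timerfd_create(CLOCK_MONOTONIC, 0);
    if (tfd < 0) { perror("timerfd_create"); return 1; }

    /* First expiry after 10 ms, then every 10 ms. */
    struct itimerspec its = {
        .it_value    = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 },
        .it_interval = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 },
    };
    if (timerfd_settime(tfd, 0, &its, NULL) != 0) { perror("timerfd_settime"); return 1; }

    for (int i = 0; i < 100; i++) {
        uint64_t expirations;
        /* Blocks until the timer expires; the fd could also be used with select()/poll(). */
        if (read(tfd, &expirations, sizeof expirations) != (ssize_t)sizeof expirations) {
            perror("read");
            break;
        }
        /* fxn() would be called here, every ~10 ms. */
    }

    close(tfd);
    return 0;
}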