I have the following basic three lines of code to see if the clock is timing things correctly:
#include <stdio.h>   /* printf */
#include <time.h>    /* clock, CLOCKS_PER_SEC */
#include <unistd.h>  /* sleep */
#define wait(seconds) sleep( seconds )
printf("%lu\n", clock());
wait(2);
printf("%lu\n", clock());
wait() seems to be working fine -- as when I run it, it 'feels' like it is pausing for 2s.
However, here is what the print command gives:
253778
253796
And so when I do something like:
(double) (clock() - t0) / CLOCKS_PER_SEC
It gives me useless results.
What is clock() doing here that I'm not understanding, and how can I fix this to get an accurate timer?
ISO C says that the "clock function returns the implementation's best approximation to the processor time used by the program since the beginning of an implementation-defined era related only to the program invocation."
In other words, it is not a real-time clock.
On platforms where C code runs as a process that can be put to sleep by an underlying operating system, a proper implementation of clock stops counting when that happens.
ISO C provides a function time that is intended to provide calendar time, encoded as a time_t arithmetic value. The difftime function computes the difference between two time_t values as a floating-point value of type double measuring seconds. On many systems, the precision of time_t is no better than one second, though.
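For illustration (not part of the quoted standard text), a minimal sketch of measuring elapsed wall-clock time with time() and difftime(), keeping the one-second precision caveat in mind:
#include <stdio.h>
#include <time.h>

int main(void)
{
    time_t t0 = time(NULL);

    /* ... code to time; it must run long enough to register in whole seconds ... */

    time_t t1 = time(NULL);
    printf("Elapsed: %.0f s\n", difftime(t1, t0));
    return 0;
}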
There are POSIX functions for finer resolution time such as gettimeofday, or clock_gettime (with a suitable clock type argument).
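For example, here is a minimal sketch using clock_gettime() with CLOCK_MONOTONIC (a clock id commonly chosen for interval timing because it is unaffected by system clock adjustments), applied to the 2-second sleep from the question. It assumes a typical Linux/glibc toolchain and may need -lrt on older glibc:
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    sleep(2);   /* stands in for the code being timed */

    clock_gettime(CLOCK_MONOTONIC, &t1);
    double elapsed = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("Elapsed: %.6f s\n", elapsed);
    return 0;
}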
If you have a read of the man page for clock(), you will see the following listed in the notes:
On several other implementations, the value returned by clock() also includes the times of any children whose status has been collected via wait(2) (or another wait-type call). Linux does not include the times of waited-for children in the value returned by clock(). The times(2) function, which explicitly returns (separate) information about the caller and its children, may be preferable.
So clock() doesn't include the time taken by waited-for children on all implementations. If you are on Linux, you may want to take a look at the times() function, declared in <sys/times.h>, as it fills in a structure that includes the time (in clock ticks) taken by children (so the work of processes you wait() for is included).
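For illustration, a hedged sketch of the times() route on Linux, converting clock ticks to seconds with sysconf(_SC_CLK_TCK); the field names come from <sys/times.h>:
#include <stdio.h>
#include <sys/times.h>
#include <unistd.h>

int main(void)
{
    struct tms t;
    long ticks_per_sec = sysconf(_SC_CLK_TCK);

    /* ... run work here, possibly including waited-for child processes ... */

    times(&t);
    printf("self: user %.2f s, sys %.2f s\n",
           (double)t.tms_utime / ticks_per_sec,
           (double)t.tms_stime / ticks_per_sec);
    printf("children: user %.2f s, sys %.2f s\n",
           (double)t.tms_cutime / ticks_per_sec,
           (double)t.tms_cstime / ticks_per_sec);
    return 0;
}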
I'm looking for assistance on getting the real, user, and sys times of functions within my C program, for instance how long it took to read in a file. I have been looking at #include <time.h> and using the time() function with the -p flag, but I am striking out on the execution of this. I guess my question is: if I have:
time_t total_time;
time_t start, end;
start = time(-p, &start);
<some code>
end = time(-p, &end);
printf("real = %e, user = %S, sys = %S\n", ???????????);
I understand the differences between the three times, just don't know the proper execution of getting the results.
Check time()'s return value:
The value returned generally represents the number of seconds since 00:00 hours, Jan 1, 1970 UTC (i.e., the current unix timestamp). Although libraries may use a different representation of time: Portable programs should not use the value returned by this function directly, but always rely on calls to other elements of the standard library to translate them to portable types (such as localtime, gmtime or difftime).
To display the real time, read this answer.
For user and system, read: How to measure user/system cpu time for a piece of program?
The real elapsed time can be determined by calling time() before and after, and subtracting the results. If you want to be portable, you should use the difftime function to perform the subtraction, because the C standard does not guarantee that time_t encodes the time as a number of seconds.
If you are using a POSIX compliant system, however, the return value from time() will be the number of seconds since the epoch, so you could just subtract them to get the elapsed seconds.
If you want a higher resolution result, you could use gettimeofday(), which fills in a timeval struct containing microsecond resolution time.
The most portable way to access the user & system times would be to call the clock() C library function (note that it reports only the combined CPU time, not user and system separately); however, you have noted in your comments to @gsamaras that your system does not have this function.
Depending on the system you are working with, you may be able to call the system functions times() or getrusage(). getrusage() would be easier to use, because it returns the values as timeval structs, which contain seconds and microseconds. times() returns clock ticks, which you would have to convert if you wanted actual time units.
If your program is threaded, and you are using Linux, getrusage() offers an additional advantage: you can get the resource consumption for the current thread.
Whichever function you choose, the process is to obtain an initial reading and a final reading, and subtract the two results to see the time consumed by your function.
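As a sketch of that pattern (my own illustration, not the answer's code), using getrusage() with RUSAGE_SELF; the ru_utime and ru_stime fields are struct timeval values holding seconds and microseconds:
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>

/* convert a struct timeval to seconds as a double */
static double tv_to_sec(struct timeval tv)
{
    return tv.tv_sec + tv.tv_usec / 1e6;
}

int main(void)
{
    struct rusage before, after;
    getrusage(RUSAGE_SELF, &before);

    /* ... the code being measured, e.g. reading in a file ... */

    getrusage(RUSAGE_SELF, &after);
    printf("user = %f s, sys = %f s\n",
           tv_to_sec(after.ru_utime) - tv_to_sec(before.ru_utime),
           tv_to_sec(after.ru_stime) - tv_to_sec(before.ru_stime));
    return 0;
}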
I am writing a C program on Linux. I have a main thread which continuously updates the values of two variables, and another thread writes those values into a file every 20 milliseconds. I have used usleep to achieve this time interval. Sample code is below.
int main()
{
    ...
    pthread_create(...write_file..); /* started another thread by passing a function write_file */
    while (variable1)
    {
        /* update the values of the variables */
    }
    return 0;
}

void *write_file(void *arg)
{
    ...
    fp = fopen("sample.txt", "a");
    while (variable2)
    {
        fprintf(fp, " %d \n", somevariable);
        usleep(20 * 1000);   /* 20 ms */
    }
    fclose(fp);
    return NULL;
}
Is it suitable to use the usleep function to achieve a 20 millisecond time interval, or should I use some other method, like a timer?
Is usleep accurate enough? Does this sleep function affect the main thread in any way?
Use of the sleep() family often results in imprecise timing, especially when the process has many CPU-consuming threads and the required intervals are relatively small, like 20 ms. So you shouldn't assume that a *sleep() call blocks execution for exactly the specified time. In the situation described above, the actual sleep duration may be twice the specified value or more (assuming the kernel is not a real-time one). As a result you should implement some kind of compensation logic that adjusts the sleep duration of subsequent calls.
A more precise (but of course not ideal) approach is to use POSIX timers. See timer_create(). The most precise timers are the ones that use SIGEV_SIGNAL or SIGEV_THREAD_ID notifications (the latter is Linux-only). As the signal number you can use one of the real-time signals (SIGRTMIN to SIGRTMAX), but be aware that pthread implementations often use a few of these signals internally, so you should choose the actual number carefully. Also, doing anything in a signal handler context requires extra attention, because not every library function may be used safely there. You can find the list of async-signal-safe functions here.
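To make that concrete, here is a hedged sketch (mine, not the answer's code) of a 20 ms periodic timer that delivers SIGRTMIN and is consumed synchronously with sigwaitinfo(), which sidesteps the signal-handler safety concerns above. Error handling is omitted, and older glibc needs -lrt:
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

int main(void)
{
    /* Block SIGRTMIN so it can be received synchronously with sigwaitinfo() */
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);
    sigprocmask(SIG_BLOCK, &set, NULL);

    /* Create a timer that delivers SIGRTMIN on each expiry */
    struct sigevent sev;
    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIGRTMIN;
    timer_t timerid;
    timer_create(CLOCK_MONOTONIC, &sev, &timerid);

    /* Arm it: first expiry after 20 ms, then every 20 ms */
    struct itimerspec its;
    memset(&its, 0, sizeof its);
    its.it_value.tv_nsec = 20 * 1000000L;
    its.it_interval.tv_nsec = 20 * 1000000L;
    timer_settime(timerid, 0, &its, NULL);

    for (int i = 0; i < 10; i++) {
        siginfo_t si;
        sigwaitinfo(&set, &si);   /* wakes up once per tick */
        printf("tick %d\n", i);
    }
    return 0;
}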
P.S. Also note that select() called with empty sets is a fairly portable way to sleep with subsecond precision.
Sleeping: sleep() and usleep()
Now, let me start with the easier timing calls. For delays of multiple seconds, your best bet is probably to use sleep(). For delays of at least tens of milliseconds (about 10 ms seems to be the minimum delay), usleep() should work. These functions give the CPU to other processes ("sleep"), so CPU time isn't wasted. See the manual pages sleep(3) and usleep(3) for details.
For delays of under about 50 milliseconds (depending on the speed of your processor and machine, and the system load), giving up the CPU takes too much time, because the Linux scheduler (for the x86 architecture) usually takes at least about 10-30 milliseconds before it returns control to your process. Due to this, in small delays, usleep(3) usually delays somewhat more than the amount that you specify in the parameters, and at least about 10 ms.
nanosleep()
In the 2.0.x series of Linux kernels, there is a new system call, nanosleep() (see the nanosleep(2) manual page), that allows you to sleep or delay for short times (a few microseconds or more).
For delays <= 2 ms, if (and only if) your process is set to soft real time scheduling (using sched_setscheduler()), nanosleep() uses a busy loop; otherwise it sleeps, just like usleep().
The busy loop uses udelay() (an internal kernel function used by many kernel drivers), and the length of the loop is calculated using the BogoMips value (the speed of this kind of busy loop is one of the things that BogoMips measures accurately). See /usr/include/asm/delay.h for details on how it works.
Source: http://tldp.org/HOWTO/IO-Port-Programming-4.html
Try using nanosleep() instead of usleep(); it should be more accurate for a 20 ms interval.
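For example, a minimal sketch of a 20 ms delay with nanosleep() that restarts the call if a signal interrupts it (the fields are the standard timespec members):
#include <errno.h>
#include <stdio.h>
#include <time.h>

/* Sleep for 20 ms, resuming if interrupted by a signal */
static void delay_20ms(void)
{
    struct timespec req = { .tv_sec = 0, .tv_nsec = 20 * 1000000L };
    struct timespec rem;
    while (nanosleep(&req, &rem) == -1 && errno == EINTR)
        req = rem;   /* continue with the remaining time */
}

int main(void)
{
    for (int i = 0; i < 5; i++) {
        delay_20ms();
        printf("tick %d\n", i);
    }
    return 0;
}
If cumulative drift over many iterations matters, clock_nanosleep() with TIMER_ABSTIME lets you sleep until absolute deadlines instead of relative intervals.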
I have the following problem: I have to measure the time a program needs to execute. A scalar version of the program works fine with the code below, but when using OpenMP, it works on my PC but not on the resource I am supposed to use.
In fact:
scalar program rt 34s
openmp program rt 9s
that's my PC (everything working), compiled with Visual Studio
The resource I have to use (I think Linux, compiled with gcc):
scalar program rt 9s
openmp program rt 9s (but the text pops up immediately afterwards, so it should be 0-1s)
My guess is that it adds up the ticks of all threads, which is about the same total, and divides them by the tick rate of a single core. My question is how to solve this, and whether there is a better way to measure the time in the console in C.
#include <assert.h>
#include <stdio.h>
#include <time.h>

clock_t start, stop;
double t = 0.0;
assert((start = clock()) != (clock_t)-1);
... code running
assert((stop = clock()) != (clock_t)-1);
t = (double)(stop - start) / CLOCKS_PER_SEC;
printf("Run time: %f\n", t);
To augment Mark's answer: DO NOT USE clock()
clock() is an awful misunderstanding from the old computer era, whose actual implementation differs greatly from platform to platform. Behold:
on Linux, *BSD, Darwin (OS X) -- and possibly other 4.3BSD descendants -- clock() returns the processor time (not the wall-clock time!) used by the calling process, i.e. the sum of each thread's processor time;
on IRIX, AIX, Solaris -- and possibly other SysV descendants -- clock() returns the processor time (again not the wall-clock time) used by the calling process AND all its terminated child processes for which wait, system or pclose was executed;
HP-UX doesn't even seem to implement clock();
on Windows clock() returns the wall-clock time (not the processor time).
In the descriptions above processor time usually means the sum of user and system time. This could be less than the wall-clock (real) time, e.g. if the process sleeps or waits for file IO or network transfers, or it could be more than the wall-clock time, e.g. when the process has more than one thread, actively using the CPU.
Never use clock(). Use omp_get_wtime() - it exists on all platforms supported by OpenMP and always returns the wall-clock time.
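A minimal sketch, assuming the program is compiled with OpenMP enabled (e.g. -fopenmp for gcc/clang):
#include <omp.h>
#include <stdio.h>

int main(void)
{
    double t0 = omp_get_wtime();

    #pragma omp parallel
    {
        /* ... the parallel work being timed ... */
    }

    double t1 = omp_get_wtime();
    printf("Wall time: %f s\n", t1 - t0);
    return 0;
}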
Converting my earlier comment to an answer in the spirit of doing anything for reputation ...
Use two calls to omp_get_wtime to get the wall-clock time (in seconds) between two points in your code. Note that time is measured individually on each thread; there is no synchronisation of clocks across threads.
Your problem is clock. By the C standard it measures the CPU time used by your process, not the wall-clock time. This is what Linux does (usually they stick to the standards), and then the total CPU time for the sequential program and the parallel program are about the same, as they should be.
Windows deviates from that: there, clock gives the wall-clock time.
So use other time measurement functions. For standard C this would be time, or, if you need more precision, the new C11 standard offers timespec_get; for OpenMP there are other possibilities, as has already been mentioned.
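A minimal sketch of the timespec_get() route, assuming a C11 library that provides TIME_UTC (glibc 2.16 or later):
#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec t0, t1;
    timespec_get(&t0, TIME_UTC);

    /* ... code to time ... */

    timespec_get(&t1, TIME_UTC);
    double elapsed = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("Elapsed: %f s\n", elapsed);
    return 0;
}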
Does anybody know how to get the real value of clocks per second? clock() from time.h returns clock ticks since the start of my process, so it needs to be divided by CLOCKS_PER_SEC, but this constant always has the value 1000000.
Is there some POSIX standard for this?
That's how it is specified: the C standard leaves CLOCKS_PER_SEC implementation-defined, and POSIX requires it to be exactly one million, regardless of the actual clock resolution.
If you want to measure elapsed time, there are other (and better) functions, like gettimeofday for example.
I'm trying to find a way to get the execution time of a section of code in C. I've already tried both time() and clock() from time.h, but it seems that time() returns seconds and clock() gives me milliseconds (or centiseconds?). I would like something more precise, though. Is there a way I can grab the time with at least microsecond precision?
This only needs to be able to compile on Linux.
You referred to clock() and time() - were you looking for gettimeofday()?
That will fill in a struct timeval, which contains seconds and microseconds.
Of course the actual resolution is up to the hardware.
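A minimal sketch, with the timed section left as a placeholder:
#include <stdio.h>
#include <sys/time.h>

int main(void)
{
    struct timeval t0, t1;
    gettimeofday(&t0, NULL);

    /* ... code to time ... */

    gettimeofday(&t1, NULL);
    double elapsed = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    printf("Elapsed: %.6f s\n", elapsed);
    return 0;
}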
For what it's worth, here's one that's just a few macros:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
clock_t startm, stopm;
#define START if ( (startm = clock()) == -1) {printf("Error calling clock");exit(1);}
#define STOP if ( (stopm = clock()) == -1) {printf("Error calling clock");exit(1);}
#define PRINTTIME printf( "%6.3f seconds used by the processor.", ((double)stopm-startm)/CLOCKS_PER_SEC);
Then just use it with:
int main(void) {
START;
// Do stuff you want to time
STOP;
PRINTTIME;
}
From http://ctips.pbwiki.com/Timer
You want a profiler application.
Search keywords at SO and search engines: linux profiling
Have a look at gettimeofday, clock_*, or get/setitimer.
Try "bench.h"; it lets you put a START_TIMER; and STOP_TIMER("name"); into your code, allowing you to arbitrarily benchmark any section of code (note: only recommended for short sections, not things taking dozens of milliseconds or more). Its accurate to the clock cycle, though in some rare cases it can change how the code in between is compiled, in which case you're better off with a profiler (though profilers are generally more effort to use for specific sections of code).
It only works on x86.
You might want to google for an instrumentation tool.
You won't find a library call which lets you get past the clock resolution of your platform. Either use a profiler (man gprof) as another poster suggested, or - quick & dirty - put a loop around the offending section of code to execute it many times, and use clock().
gettimeofday() provides you with a resolution of microseconds, whereas clock_gettime() provides you with a resolution of nanoseconds.
int clock_gettime(clockid_t clk_id, struct timespec *tp);
The clk_id identifies the clock to be used. Use CLOCK_REALTIME if you want a system-wide clock visible to all processes. Use CLOCK_PROCESS_CPUTIME_ID for per-process timer and CLOCK_THREAD_CPUTIME_ID for a thread-specific timer.
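For example, a hedged sketch measuring the CPU time consumed by a section of code with CLOCK_PROCESS_CPUTIME_ID (substitute CLOCK_REALTIME or CLOCK_THREAD_CPUTIME_ID as needed); the busy loop is only a stand-in for the code being measured, and older glibc needs -lrt:
#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t0);

    /* ... the section of code being measured ... */
    volatile double x = 0.0;
    for (long i = 0; i < 10000000L; i++)
        x += i * 0.5;

    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t1);
    double cpu = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("CPU time: %.9f s\n", cpu);
    return 0;
}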
It depends on the conditions. Profilers are nice for general global views; however, if you really need an accurate view, my recommendation is KISS. Simply run the code in a loop such that it takes a minute or so to complete. Then compute a simple average based on the total run time and the number of iterations executed; see the sketch after the list below.
This approach allows you to:
Obtain accurate results with low resolution timers.
Not run into issues where instrumentation interferes with high-speed caches (L1, L2, branch predictors, etc.) close to the processor. However, running the same code in a tight loop can also provide optimistic results that may not reflect real-world conditions.
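For what it's worth, a hedged sketch of the loop-averaging idea, with a placeholder function_under_test() and an illustrative iteration count standing in for the real code:
#include <stdio.h>
#include <time.h>

/* placeholder for the code being measured */
static void function_under_test(void)
{
    volatile double x = 1.0;
    x *= 1.000001;
}

int main(void)
{
    const long N = 50000000L;   /* pick N so the loop runs for seconds, not milliseconds */
    clock_t start = clock();

    for (long i = 0; i < N; i++)
        function_under_test();

    double per_call = (double)(clock() - start) / CLOCKS_PER_SEC / (double)N;
    printf("Average per call: %.9f s\n", per_call);
    return 0;
}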
I don't know which environment/OS you are working on, but your timing may be inaccurate if another thread, task, or process preempts your timed code in the middle. I suggest exploring mechanisms such as mutexes or semaphores to prevent other threads from preempting your process.
If you are developing on x86 or x64, why not use the Time Stamp Counter (RDTSC)?
It will be more reliable than ANSI C functions like time() or clock(), as RDTSC is an atomic operation. Using C functions for this purpose can introduce problems, as you have no guarantee that the thread they are executing in will not be switched out, and as a result the value they return will not be an accurate description of the actual execution time you are trying to measure.
With RDTSC you can measure this better. You will need to convert the tick count back into a human-readable H:M:S time format, which will depend on the processor's clock frequency, but google around and I am sure you will find examples.
However, even with RDTSC you will be including the time your code was switched out of execution. While it is a better solution than using time()/clock(), if you need an exact measurement you will have to turn to a profiler that will instrument your code and take into account when your code is not actually executing due to context switches or whatever.
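For illustration, a hedged sketch of reading the time stamp counter with the __rdtsc() intrinsic (gcc/clang via <x86intrin.h>; MSVC uses <intrin.h>); the result is in CPU cycles, not seconds, and is still subject to the caveats above:
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>

int main(void)
{
    uint64_t c0 = __rdtsc();

    /* ... the code being measured ... */
    volatile double x = 0.0;
    for (int i = 0; i < 1000000; i++)
        x += i;

    uint64_t c1 = __rdtsc();
    printf("Elapsed cycles: %llu\n", (unsigned long long)(c1 - c0));
    return 0;
}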