I have written some C code which I call from MATLAB after compiling it with MEX. Inside the C code, I measure the time of a part of the computation using the following code:
clock_t begin, end;
double time_elapsed;
begin = clock();
/* do stuff... */
end = clock();
time_elapsed = (double)(end - begin) / CLOCKS_PER_SEC;
time_elapsed should then hold the execution time in seconds.
I then output the value of time_elapsed to MATLAB (it is properly exported; I checked). On the MATLAB side I call this compiled MEX function and measure its execution time using tic and toc. The absurd result is that the time measured with tic and toc is 0.0011 s (average over 500 runs, st. dev. 1.4e-4), while the time returned by the C code is 0.037 s (average over 500 runs, st. dev. 0.0016).
Here one may notice two very strange facts:
The execution time for the whole function is lower than the execution time for a part of the code. Hence, either MATLAB's or C's measurements are strongly inaccurate.
The execution times measured in the C code are very scattered and exhibit very high st. deviation (coeff. of variation 44%, compared to just 13% for tic-toc).
What is going on with these timers?
You're comparing apples to oranges.
Look at Matlab's documentation:
tic - http://www.mathworks.com/help/matlab/ref/tic.html
toc - http://www.mathworks.com/help/matlab/ref/toc.html
tic and toc let you measure real elapsed time.
Now look at the clock function http://linux.die.net/man/3/clock.
In particular,
The clock() function returns an approximation of processor time used by the program.
The value returned is the CPU time used so far as a clock_t; to get the number of seconds used, divide by CLOCKS_PER_SEC. If the processor time used is not available or its value cannot be represented, the function returns the value (clock_t) -1.
So what can account for your difference:
CPU time (measured by clock()) and real elapsed time (measured by tic and toc) are NOT the same. So you would expect CPU time to be less than elapsed time? Well, maybe. What if within those 0.0011 s you're driving 10 cores at 100%? Then the clock() measurement would be 10x the one from tic and toc. Possible, but unlikely.
clock() is grossly inaccurate and, consistent with the documentation, only an approximate CPU time measurement! I suspect it is pegged to the scheduler quantum size, but I didn't dig through the Linux kernel code to check. I also didn't check other OSes, but this dude's blog is consistent with that theory.
So what to do... for starters, compare apples to apples! Next, make sure you take into account timer resolution.
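For what it's worth, here is a minimal sketch (assuming a POSIX system; do_work() is just a placeholder workload, not part of the original code) that measures the same piece of work with both clock() and a wall-clock timer, so the two numbers can be compared like for like:

#include <stdio.h>
#include <time.h>

/* placeholder workload so the example is self-contained */
static void do_work(void)
{
    volatile double x = 0.0;
    for (long i = 0; i < 10000000L; ++i)
        x += i * 0.5;
}

int main(void)
{
    struct timespec w0, w1;
    clock_t c0, c1;

    clock_gettime(CLOCK_MONOTONIC, &w0);   /* wall-clock start */
    c0 = clock();                          /* CPU-time start   */

    do_work();

    c1 = clock();
    clock_gettime(CLOCK_MONOTONIC, &w1);

    double cpu  = (double)(c1 - c0) / CLOCKS_PER_SEC;
    double wall = (w1.tv_sec - w0.tv_sec) + (w1.tv_nsec - w0.tv_nsec) / 1e9;

    printf("CPU time:  %f s\n", cpu);
    printf("Wall time: %f s\n", wall);
    return 0;
}

If the CPU-time figure is much coarser or jumpier than the wall-clock one, that is the timer-resolution effect described above.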
Related
I have been trying to write C code where some numerical calculations involving time derivatives need to be performed in a real-time dynamic system setting. For this purpose, I need the most accurate measurement possible of the time from one cycle to the next, stored in a variable called "dt":
static clock_t prev_time;
prev_time = clock();
while (1) {
    clock_t current_time = clock();
    //do something
    double dt = (double)(current_time - prev_time) / CLOCKS_PER_SEC;
    prev_time = current_time;
}
However, when I test this code by integrating (continuously adding) dt, i.e.:
static double elapsed_time = 0;
//the above-mentioned declaration and initialization is done here
while (1) {
    //the above-mentioned code is executed here
    elapsed_time += dt;
    printf("elapsed_time : %lf\n", elapsed_time);
}
I obtain results that are significantly lower than reality, by a factor that is constant at runtime, but seems to vary when I edit unrelated parts of the code (it ranges between 1/10 and about half the actual time).
My current guess is that clock() doesn't account for the time required for memory access (at several points throughout the code, I open an external text file and save data in it for diagnostic purposes).
But I am not sure if this is the case.
Also, I couldn't find any other way to accurately measure time.
Does anyone know why this is happening?
EDIT: the code is compiled and executed on a Raspberry Pi 3, and is used to implement a feedback controller
UPDATE: As it turns out, clock() is not suitable for real-time numerical calculation, as it only accounts for the time spent on the processor. I solved the problem by using the clock_gettime() function declared in <time.h>. This returns the global time with nanosecond resolution (of course, this comes with a certain error which depends on your clock resolution, but it is about the best thing out there).
The answer was surprisingly hard to find, so thanks to Ian Abbott for the very useful help
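For reference, a minimal sketch of that clock_gettime() approach (assuming a POSIX system such as the Pi's Linux; the loop body is just a placeholder):

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec prev, now;
    double elapsed_time = 0.0;

    clock_gettime(CLOCK_MONOTONIC, &prev);   /* monotonic wall clock */

    while (1) {
        clock_gettime(CLOCK_MONOTONIC, &now);

        /* dt is the real time between iterations, in seconds */
        double dt = (now.tv_sec - prev.tv_sec)
                  + (now.tv_nsec - prev.tv_nsec) / 1e9;
        prev = now;

        /* ... numerical calculations using dt would go here ... */

        elapsed_time += dt;
        printf("elapsed_time : %lf\n", elapsed_time);
    }
}

CLOCK_MONOTONIC is used rather than CLOCK_REALTIME so that dt is never affected by clock adjustments.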
Does anyone know why this is happening?
clock measures CPU time, i.e. the time slices/clock ticks that the CPU spent on your process. This is useful for microbenchmarking, as you often don't want to count time the CPU spent on other processes, but it's not useful when you want to measure how much real time passes between events.
Instead, you should use time or gettimeofday to measure wall clock time.
I am working on encryption of real-time data. I have developed an encryption and decryption algorithm. Now I want to measure its execution time on a Linux platform in C. How can I measure it correctly? I have tried it as below:
gettimeofday(&tv1, NULL);
/* Algorithm Implementation Code*/
gettimeofday(&tv2, NULL);
Total_Runtime = (tv2.tv_usec - tv1.tv_usec) +
                (tv2.tv_sec - tv1.tv_sec) * 1000000;
which gives me the time in microseconds. Is this the correct way of measuring time, or should I use some other function? Any hint will be appreciated.
clock(): The value returned is the CPU time used so far as a clock_t;
Logic
Get CPU time at program beginning and at end. Difference is what you want.
Code
clock_t begin = clock();
/**** code ****/
clock_t end = clock();
double time_spent = (double)(end - begin) / CLOCKS_PER_SEC; /* in seconds */
To get the number of seconds used, divide the difference by CLOCKS_PER_SEC.
More accurate
In C11, timespec_get() provides time measurements with nanosecond resolution. But the accuracy is implementation defined and can vary.
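For example, a minimal sketch with timespec_get() (C11; the resolution you actually get is implementation defined):

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec t0, t1;

    timespec_get(&t0, TIME_UTC);
    /* ... code to measure ... */
    timespec_get(&t1, TIME_UTC);

    double seconds = (t1.tv_sec - t0.tv_sec)
                   + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("elapsed: %.9f s\n", seconds);
    return 0;
}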
Read time(7). You probably want to use clock_gettime(2) with CLOCK_PROCESS_CPUTIME_ID or CLOCK_MONOTONIC. Or you could just use clock(3) (for the CPU time in microseconds, since CLOCKS_PER_SEC is always a million on POSIX systems).
If you want to benchmark an entire program (executable), use the time(1) command.
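A minimal sketch of the clock_gettime(2) route (POSIX; CLOCK_PROCESS_CPUTIME_ID gives CPU time consumed by the process, swap in CLOCK_MONOTONIC for elapsed wall-clock time):

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t0);
    /* ... encryption/decryption code to benchmark ... */
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t1);

    double cpu_seconds = (t1.tv_sec - t0.tv_sec)
                       + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("CPU time: %.9f s\n", cpu_seconds);
    return 0;
}

On older glibc versions this may need linking with -lrt.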
Measuring the execution time of a proper encryption code is simple, although a bit tedious. The runtime of a good encryption code is independent of the input: no matter what you throw at it, it always needs the same number of operations per chunk of input. If it doesn't, you have a problem called a timing attack.
So the only thing you need to do is to unroll all loops, count the opcodes, and multiply each opcode by its number of clock ticks to get the exact runtime. There is one problem: some CPUs take a variable number of clock ticks for some of their operations, and you might have to replace those with operations that take a fixed number of clock ticks. A pain in the behind, admittedly.
If the only thing you want to know is whether the code runs fast enough to fit into a slot of your real-time OS, you can simply take the maximum and pad the cases that finish sooner with NOOPs (your RTOS might have a routine for that).
I've been wanting a pause function for a while and I found this. Being a beginner in C, I can't be sure, but it looks like it uses functions from <clock.h>.
I would like to implement this into my code, but not without understanding it.
void wait(int seconds){
    clock_t start, end;
    start = clock();
    end = clock();
    while (((end-start) / CLOCKS_PER_SEC) = !seconds)
        end = clock();
}
It's just a busy-wait loop, which is a very nasty way of implementing a delay, because it pegs the CPU at 100% while doing nothing. Use sleep() instead:
#include <unistd.h>
void wait(int seconds)
{
    sleep(seconds);
}
Also note that the code given in the question is buggy:
while (((end-start) / CLOCKS_PER_SEC) = !seconds)
should be:
while (((end-start) / CLOCKS_PER_SEC) != seconds)
or better still:
while (((end-start) / CLOCKS_PER_SEC) < seconds)
(but as mentioned above, you shouldn't even be using this code anyway).
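For completeness, the corrected busy-wait (still not something you should actually use) would look roughly like this:

#include <time.h>

void wait(int seconds)
{
    clock_t start = clock();
    clock_t end = clock();

    /* busy-wait: keeps the CPU at 100% until enough CPU time has passed */
    while (((end - start) / CLOCKS_PER_SEC) < seconds)
        end = clock();
}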
Generally, clock_t is an arithmetic type defined in the standard library's <time.h>. The clock() function returns the number of clock ticks elapsed from program start up to the point where you call it.
If you want to know more you can read details here: http://www.tutorialspoint.com/c_standard_library/c_function_clock.htm
The clock() function returns the amount of processor time, in clock ticks, consumed since the program began execution.
Here is the prototype: clock_t clock(void);
Note that the return value is not in seconds. To convert the result into seconds, you have to divide the return value by the CLOCKS_PER_SEC macro.
Your program does just this. Functionally, it stops program execution for seconds seconds (you pass this value as an argument to the wait function).
By the way, it uses time.h, not clock.h. And it is not the best place to start learning C.
To learn more: http://www.cplusplus.com/reference/ctime/?kw=time.h
clock function in <ctime>:
Returns the processor time consumed by the program.
The value returned is expressed in clock ticks, which are units of time of a constant but system-specific length (with a relation of CLOCKS_PER_SEC clock ticks per second).
reference
So, basically, it returns the number of clock ticks of processor time consumed since the start of the program. A clock tick is a unit of CPU time actually used by the process; it does not account for I/O time or anything else that does not use the CPU.
CLOCKS_PER_SEC is the number of clock ticks per second used to convert the result of clock() to seconds. It is a constant defined by the implementation (POSIX fixes it at one million), not something that changes at runtime; what does vary from machine to machine is the actual granularity of the underlying measurement, and a process doing a lot of I/O will simply accumulate fewer ticks because it spends more time off the CPU.
Also, this statement: ((end-start) / CLOCKS_PER_SEC) = !seconds
is not correct, because the right implementation is
while (((end-start) / CLOCKS_PER_SEC) != seconds)
    end = clock();
This does the trick of busy waiting: the program is trapped inside the while loop until seconds seconds have passed, using the CPU clock and CLOCKS_PER_SEC to determine the time passed.
Although I would suggest changing it to:
while (((end-start) / CLOCKS_PER_SEC) < seconds)
    end = clock();
Because if the process has a low priority, or the computer is too busy handling many other processes, the check might run so irregularly that it jumps past the exact value and never satisfies the equality (for instance when some buggy program takes up a lot of resources and has a high enough priority to cause CPU starvation).
Finally, I do not recommend using it, because you are still using the CPU while waiting, which can be avoided by using the sleep tools discussed here
I know this question may have been commonly asked before, but it seems most of those questions are regarding the elapsed time (based on the wall clock) of a piece of code. The elapsed time of a piece of code is unlikely to equal the actual execution time, as other processes may be executing during the elapsed time of the code of interest.
I used getrusage() to get the user time and system time of a process, and then calculated the actual execution time as (user time + system time). I am running my program on Ubuntu. Here are my questions:
How do I know the precision of getrusage()?
Are there other approaches that can provide higher precision than getrusage()?
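For reference, a minimal sketch of the getrusage() approach described in the question (user time plus system time, on a POSIX system):

#include <stdio.h>
#include <sys/resource.h>

/* user time + system time, converted to seconds */
static double cpu_seconds(const struct rusage *ru)
{
    return ru->ru_utime.tv_sec + ru->ru_utime.tv_usec / 1e6
         + ru->ru_stime.tv_sec + ru->ru_stime.tv_usec / 1e6;
}

int main(void)
{
    struct rusage before, after;

    getrusage(RUSAGE_SELF, &before);
    /* ... code to measure ... */
    getrusage(RUSAGE_SELF, &after);

    printf("CPU time used: %f s\n",
           cpu_seconds(&after) - cpu_seconds(&before));
    return 0;
}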
You can check the real CPU time of a process on Linux by utilizing the CPU time functionality of the C library:
#include <time.h>
clock_t start, end;
double cpu_time_used;
start = clock();
... /* Do the work. */
end = clock();
cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
Source: http://www.gnu.org/s/hello/manual/libc/CPU-Time.html#CPU-Time
That way, you count CPU clock ticks, i.e. the time the CPU actually spent working on the process, and thus get the real amount of work time.
The getrusage() function is the only standard/portable way that I know of to get "consumed CPU time".
There isn't a simple way to determine the precision of the returned values. I'd be tempted to call getrusage() once to get an initial value, then call it repeatedly until the value(s) returned are different from the initial value, and then assume the effective precision is the difference between the initial and final values. This is a hack (it would be possible for the precision to be higher than this method determines, and the result should probably be considered a worst-case estimate), but it's better than nothing.
I'd also be concerned about the accuracy of the values returned. Under some kernels I'd expect that a counter is incremented for whatever code happens to be running when a timer IRQ occurs; and therefore it's possible for a process to be very lucky (and continually block just before the timer IRQ occurs) or very unlucky (and unblock just before the timer IRQ occurs). In this case "lucky" could mean a CPU hog looks like it uses no CPU time, and "unlucky" could mean a process that uses very little CPU time looks like a CPU hog.
For specific versions of specific kernels on specific architecture/s (potentially depending on if/when the kernel is compiled with specific configuration options in some cases), there may be higher precision alternatives that aren't portable and aren't standard...
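A rough sketch of the probing hack described above (assumptions: RUSAGE_SELF is what you care about, and user time is the field that ticks first):

#include <stdio.h>
#include <sys/resource.h>

/* Estimate the effective granularity of getrusage() by polling until the
 * reported user time changes; treat the step as a worst-case resolution. */
int main(void)
{
    struct rusage initial, current;

    getrusage(RUSAGE_SELF, &initial);
    do {
        getrusage(RUSAGE_SELF, &current);
    } while (current.ru_utime.tv_sec  == initial.ru_utime.tv_sec &&
             current.ru_utime.tv_usec == initial.ru_utime.tv_usec);

    double step = (current.ru_utime.tv_sec - initial.ru_utime.tv_sec)
                + (current.ru_utime.tv_usec - initial.ru_utime.tv_usec) / 1e6;
    printf("apparent resolution: ~%f s\n", step);
    return 0;
}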
You can use this piece of code :
#include <sys/time.h>
struct timeval start, end;
double delta;
gettimeofday(&start, NULL);
.
.
.
gettimeofday(&end, NULL);
delta = ((end.tv_sec - start.tv_sec) * 1000000u +
end.tv_usec - start.tv_usec) / 1.e6;
printf("Time is : %f\n",delta);
It will show you the execution time for that piece of your code.
I have some sectors on my drive that read poorly. I could measure the read time required for each sector and then compare the times of the good sectors and the bad sectors.
I could use a timer of the processor to make the measurements.
How do I write a program in C/Assembly that measures the exact time it takes for each sector to be read?
So the procedure would be something like this:
Start the timer
Read the disk sector
Stop the timer
Read the time measured by the timer
The most useful functionality is the "rdtsc" instruction (ReaD Time-Stamp Counter), which reads a counter that is incremented every time the processor's internal clock ticks. For a 3 GHz processor it increments 3 billion times per second. It returns a 64-bit unsigned integer containing the number of clock cycles since the processor was powered on.
Obviously the difference between two read-outs is the number of elapsed clock cycles consumed by the code sequence in between. For a 3 GHz machine you could use any of the following formulas to convert to parts of seconds:
(time_difference+150)/300 gives a rounded off elapsed time in 0.1 us (tenths of microseconds)
(time_difference+1500)/3000 gives a rounded off elapsed time in us (microseconds)
(time_difference+1500000)/3000000 gives a rounded off elapsed time in ms (milliseconds)
The 0.1 us algorithm is the most precise value you can use without having to adjust for read-out overhead.
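As a concrete illustration, a minimal sketch on x86 with GCC or Clang, using the __rdtsc() intrinsic from <x86intrin.h> (the 3 GHz scaling is the assumption from the text above; on real hardware you would have to determine the TSC frequency yourself):

#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc() on GCC/Clang */

int main(void)
{
    uint64_t t0 = __rdtsc();

    /* ... code to measure, e.g. the disk-sector read ... */

    uint64_t t1 = __rdtsc();
    uint64_t cycles = t1 - t0;

    /* assuming a 3 GHz time-stamp counter, as in the text above */
    printf("%llu cycles (~%llu tenths of a microsecond)\n",
           (unsigned long long)cycles,
           (unsigned long long)((cycles + 150) / 300));
    return 0;
}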
In C, the function that would be most useful is clock() in time.h.
To time something, put calls to clock() around it, like so:
clock_t start, end;
float elapsed_time;
start = clock();
read_disk_sector();
end = clock();
elapsed_time = (float)(end - start) / (float)CLOCKS_PER_SEC;
printf("Elapsed time: %f seconds\n", elapsed_time);
This code prints out the number of seconds the read_disk_sector() function call took.
You can read more about the clock function here:
http://www.cplusplus.com/reference/clibrary/ctime/clock/