I am working on encryption of real-time data. I have developed the encryption and decryption algorithms, and now I want to measure their execution time in C on a Linux platform. How can I measure it correctly? I have tried it as below:
gettimeofday(&tv1, NULL);
/* Algorithm Implementation Code*/
gettimeofday(&tv2, NULL);
Total_Runtime = (tv2.tv_usec - tv1.tv_usec) +
                (tv2.tv_sec - tv1.tv_sec) * 1000000;
which gives me the time in microseconds. Is this the correct way of measuring time, or should I use some other function? Any hint will be appreciated.
clock(): The value returned is the CPU time used so far as a clock_t;
Logic
Get the CPU time at the beginning and at the end of the program. The difference is what you want.
Code
clock_t begin = clock();
/**** code ****/
clock_t end = clock();
double time_spent = (double)(end - begin) / CLOCKS_PER_SEC; // in seconds
To get the number of seconds used, we divide the difference by CLOCKS_PER_SEC, as above (the raw difference is in clock ticks, not microseconds).
More accurate
In C11, timespec_get() provides time measurement with nanosecond resolution, but the accuracy is implementation-defined and can vary.
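For instance, a minimal C11 sketch (assuming your C library actually provides timespec_get(); not all of them do) could look like this:

#include <stdio.h>
#include <time.h>

int main(void) {
    struct timespec ts1, ts2;

    timespec_get(&ts1, TIME_UTC);   /* timestamp before the code under test */
    /**** code ****/
    timespec_get(&ts2, TIME_UTC);   /* timestamp after */

    double elapsed = (ts2.tv_sec - ts1.tv_sec)
                   + (ts2.tv_nsec - ts1.tv_nsec) / 1e9;
    printf("elapsed: %.9f s\n", elapsed);
    return 0;
}

Note that TIME_UTC is a wall-clock base, so the result can be disturbed if the system time is adjusted while the code runs.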
Read time(7). You probably want to use clock_gettime(2) with CLOCK_PROCESS_CPUTIME_ID or
CLOCK_MONOTONIC. Or you could just use clock(3) (which gives CPU time in clock ticks; since POSIX requires CLOCKS_PER_SEC to be one million, a tick corresponds to a microsecond).
If you want to benchmark an entire program (executable), use time(1) command.
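As a rough sketch of the clock_gettime(2) approach (the structure is only illustrative, not a definitive implementation):

#define _POSIX_C_SOURCE 199309L  /* for clock_gettime() under strict -std= modes */
#include <stdio.h>
#include <time.h>

int main(void) {
    struct timespec wall1, wall2, cpu1, cpu2;

    clock_gettime(CLOCK_MONOTONIC, &wall1);          /* wall-clock interval timer */
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &cpu1);  /* CPU time of this process */

    /* Algorithm Implementation Code */

    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &cpu2);
    clock_gettime(CLOCK_MONOTONIC, &wall2);

    double wall = (wall2.tv_sec - wall1.tv_sec) + (wall2.tv_nsec - wall1.tv_nsec) / 1e9;
    double cpu  = (cpu2.tv_sec  - cpu1.tv_sec)  + (cpu2.tv_nsec  - cpu1.tv_nsec)  / 1e9;
    printf("wall: %.9f s, CPU: %.9f s\n", wall, cpu);
    return 0;
}

On older glibc versions you may need to link with -lrt for clock_gettime().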
Measuring the execution time of a proper encryption code is simple, although a bit tedious. The runtime of a good encryption code is independent of the content of the input: no matter what you throw at it, it always needs the same number of operations per chunk of input. If it doesn't, you have a problem called a timing attack.
So the only thing you need to do is unroll all loops, count the opcodes, and multiply each opcode by its number of clock ticks to get the exact runtime. There is one problem: some CPUs have a variable number of clock ticks for some of their operations, and you might have to change those into operations with a fixed number of clock ticks. A pain in the behind, admitted.
If the only thing you want to know is whether the code runs fast enough to fit into a slot of your real-time OS, you can simply take the maximum and pad the faster cases with NOPs (your RTOS might have a routine for that).
Related
I have been trying to write C code in which some numerical calculations containing time derivatives need to be performed in a real-time dynamic system setting. For this purpose, I need the most accurate possible measurement of the time from one cycle to the next, stored in a variable called "dt":
static clock_t prev_time;
prev_time = clock();
while (1) {
    clock_t current_time = clock();
    //do something
    double dt = (double)(current_time - prev_time) / CLOCKS_PER_SEC;
    prev_time = current_time;
}
However, when I test this code by integrating (continuously adding) dt, i.e
static double elapsed_time = 0;
//the above-mentioned declaration and initialization is done here
while (1) {
    //the above-mentioned code is executed here
    elapsed_time += dt;
    printf("elapsed_time : %lf\n", elapsed_time);
}
I obtain results that are significantly lower than reality, by a factor that is constant at runtime but seems to vary when I edit unrelated parts of the code (it ranges from 1/10 to about half of the actual time).
My current guess is that clock() doesn't account for the time required for memory access (at several points throughout the code, I open an external text file and save data in it for diagnostic purposes).
But I am not sure if this is the case.
Also, I couldn't find any other way to accurately measure time.
Does anyone know why this is happening?
EDIT: the code is compiled and executed on a Raspberry Pi 3 and is used to implement a feedback controller.
UPDATE: As it turns out, clock() is not suitable for real-time numerical calculation, as it only accounts for the time spent on the processor. I solved the problem by using the clock_gettime() function declared in <time.h>. This returns the global time with nanosecond resolution (of course, this comes with a certain error that depends on your clock resolution, but it is about the best thing out there).
The answer was surprisingly hard to find, so thanks to Ian Abbott for the very useful help
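For reference, a minimal sketch of what the clock_gettime()-based loop might look like (assuming CLOCK_MONOTONIC is available, as it is on a stock Raspberry Pi Linux kernel):

#include <time.h>

int main(void) {
    struct timespec prev, now;
    clock_gettime(CLOCK_MONOTONIC, &prev);

    while (1) {
        clock_gettime(CLOCK_MONOTONIC, &now);
        /* dt is the wall-clock time since the previous iteration, in seconds */
        double dt = (now.tv_sec - prev.tv_sec) + (now.tv_nsec - prev.tv_nsec) / 1e9;
        prev = now;

        /* use dt here, e.g. to integrate the controller state */
        (void)dt;
    }
    return 0;
}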
Does anyone know why this is happening?
clock measures CPU time, which is the amount of time slices/clock ticks that the CPU spent on your process. This is useful for microbenchmarking, where you often don't want to measure time that the CPU spent on other processes, but it's not useful when you want to measure how much time actually passes between events.
Instead, you should use time or gettimeofday to measure wall clock time.
I measured time with the function clock(), but it gave bad results: it reports the same time for the program with one thread and for the same program running with many OpenMP threads. But in fact, I can see from my watch that the multi-threaded program runs faster.
So I need some wall-clock timer...
My question is: which function is better for this,
clock_gettime() or maybe gettimeofday()? Or maybe something else?
If clock_gettime(), then with which clock: CLOCK_REALTIME or CLOCK_MONOTONIC?
I am using Mac OS X (Snow Leopard).
If you want wall-clock time, and clock_gettime() is available, it's a good choice. Use it with CLOCK_MONOTONIC if you're measuring intervals of time, and CLOCK_REALTIME to get the actual time of day.
CLOCK_REALTIME gives you the actual time of day, but is affected by adjustments to the system time -- so if the system time is adjusted while your program runs that will mess up measurements of intervals using it.
CLOCK_MONOTONIC doesn't give you the correct time of day, but it does count at the same rate and is immune to changes to the system time -- so it's ideal for measuring intervals, but useless when correct time of day is needed for display or for timestamps.
I think clock() counts the total CPU usage among all threads; I had this problem too...
The choice of wall-clock timing method is personal preference. I use an inline wrapper function to take timestamps (take the difference of two timestamps to time your processing). I've used floating point for convenience (the units are seconds, so you don't have to worry about integer overflow). With multi-threading, there are so many asynchronous events that, in my opinion, it doesn't make sense to time below 1 microsecond. This has worked very well for me so far :)
Whatever you choose, a wrapper is the easiest way to experiment
#include <sys/time.h>

inline double my_clock(void) {
    struct timeval t;
    gettimeofday(&t, NULL);
    return (1.0e-6 * t.tv_usec + t.tv_sec);
}
usage:
double start_time, end_time;
start_time = my_clock();
//some multi-threaded processing
end_time = my_clock();
printf("time is %lf\n", end_time-start_time);
I'm using something like this to count how long it takes my program from start to finish:
int main() {
    clock_t startClock = clock();
    .... // many codes
    clock_t endClock = clock();
    printf("%ld", (endClock - startClock) / CLOCKS_PER_SEC);
}
And my question is: since there are multiple processes running at the same time, say that for x amount of time my process is idle; during that time, will the clock tick within my program?
So basically my concern is: say 1000 clock cycles pass by, but my process only uses 500 of them; will I get 500 or 1000 from (endClock - startClock)?
Thanks.
This depends on the OS. On Windows, clock() measures wall time. On Linux/POSIX, it measures the combined CPU time of all the threads.
If you want wall-time on Linux, you should use gettimeofday().
If you want CPU-time on Windows, you should use GetProcessTimes().
EDIT:
So if you're on Windows, clock() will measure idle time.
On Linux, clock() will not measure idle time.
clock on POSIX measures cpu time, but it usually has extremely poor resolution. Instead, modern programs should use clock_gettime with the CLOCK_PROCESS_CPUTIME_ID clock-id. This will give up to nanosecond-resolution results, and usually it's really just about that good.
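If you want to see what a given clock claims to offer, here is a small sketch using clock_getres() (keep in mind the reported resolution is what the clock advertises, not a guarantee of measurement accuracy):

#include <stdio.h>
#include <time.h>

int main(void) {
    struct timespec res;

    /* resolution of the per-process CPU-time clock */
    if (clock_getres(CLOCK_PROCESS_CPUTIME_ID, &res) == 0)
        printf("CLOCK_PROCESS_CPUTIME_ID resolution: %ld ns\n", res.tv_nsec);

    /* resolution of the monotonic wall clock, for comparison */
    if (clock_getres(CLOCK_MONOTONIC, &res) == 0)
        printf("CLOCK_MONOTONIC resolution: %ld ns\n", res.tv_nsec);

    return 0;
}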
As per the definition on the man page (in Linux),
The clock() function returns an approximation of processor time used
by the program.
it will try to be as accurate as possible, but, as you say, some time (process switching, for example) is difficult to attribute to a process, so the numbers will be as accurate as possible, but not perfect.
I know this question may have been asked before, but it seems most of those questions are about the elapsed (wall-clock) time of a piece of code rather than its actual execution time. The elapsed time of a piece of code is unlikely to equal the actual execution time, as other processes may be executing during the elapsed time of the code of interest.
I used getrusage() to get the user time and system time of a process, and then calculated the actual execution time as (user time + system time). I am running my program on Ubuntu. Here are my questions:
How do I know the precision of getrusage()?
Are there other approaches that can provide higher precision than getrusage()?
You can check the CPU time used by a process on Linux by using the CPU time facility of the C library:
#include <time.h>
clock_t start, end;
double cpu_time_used;
start = clock();
... /* Do the work. */
end = clock();
cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
Source: http://www.gnu.org/s/hello/manual/libc/CPU-Time.html#CPU-Time
That way, you count the clock ticks of CPU time that were actually spent on the process, rather than wall-clock time, and thus get the real amount of work time.
The getrusage() function is the only standard/portable way that I know of to get "consumed CPU time".
There isn't a simple way to determine the precision of the returned values. I'd be tempted to call getrusage() once to get an initial value, then call it repeatedly until the values returned differ from the initial value, and assume the effective precision is the difference between the initial and final values. This is a hack (the precision could be higher than this method determines, and the result should probably be considered a worst-case estimate), but it's better than nothing.
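A rough sketch of that probing hack could look like this (treat the output as a worst-case estimate, as noted above):

#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>

/* user + system CPU time from a struct rusage, in microseconds */
static long long cpu_usec(const struct rusage *ru) {
    return (long long)(ru->ru_utime.tv_sec + ru->ru_stime.tv_sec) * 1000000LL
         + ru->ru_utime.tv_usec + ru->ru_stime.tv_usec;
}

int main(void) {
    struct rusage initial, current;
    getrusage(RUSAGE_SELF, &initial);

    /* spin (burning CPU time) until the reported value changes */
    do {
        getrusage(RUSAGE_SELF, &current);
    } while (cpu_usec(&current) == cpu_usec(&initial));

    printf("apparent getrusage() granularity: %lld us\n",
           cpu_usec(&current) - cpu_usec(&initial));
    return 0;
}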
I'd also be concerned about the accuracy of the values returned. Under some kernels I'd expect that a counter is incremented for whatever code happens to be running when a timer IRQ occurs; therefore it's possible for a process to be very lucky (and continually block just before the timer IRQ occurs) or very unlucky (and unblock just before the timer IRQ occurs). In this case "lucky" could mean a CPU hog looks like it uses no CPU time, and "unlucky" could mean a process that uses very little CPU time looks like a CPU hog.
For specific versions of specific kernels on specific architecture/s (potentially depending on if/when the kernel is compiled with specific configuration options in some cases), there may be higher precision alternatives that aren't portable and aren't standard...
You can use this piece of code:
#include <stdio.h>
#include <sys/time.h>

struct timeval start, end;
double delta;

gettimeofday(&start, NULL);
.
.
.
gettimeofday(&end, NULL);

delta = ((end.tv_sec - start.tv_sec) * 1000000u +
         end.tv_usec - start.tv_usec) / 1.e6;

printf("Time is : %f\n", delta);
It will show you the execution time for that piece of your code.
I'm trying to find a way to get the execution time of a section of code in C. I've already tried both time() and clock() from time.h, but it seems that time() returns seconds and clock() gives me milliseconds (or centiseconds?). I would like something more precise, though. Is there a way I can grab the time with at least microsecond precision?
This only needs to be able to compile on Linux.
You referred to clock() and time() - were you looking for gettimeofday()?
That will fill in a struct timeval, which contains seconds and microseconds.
Of course the actual resolution is up to the hardware.
For what it's worth, here's one that's just a few macros:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

clock_t startm, stopm;
#define START if ( (startm = clock()) == -1) {printf("Error calling clock");exit(1);}
#define STOP if ( (stopm = clock()) == -1) {printf("Error calling clock");exit(1);}
#define PRINTTIME printf( "%6.3f seconds used by the processor.", ((double)stopm-startm)/CLOCKS_PER_SEC);
Then just use it with:
int main(void) {
    START;
    // Do stuff you want to time
    STOP;
    PRINTTIME;
    return 0;
}
From http://ctips.pbwiki.com/Timer
You want a profiler application.
Search keywords at SO and search engines: linux profiling
Have a look at gettimeofday,
clock_*, or get/setitimer.
Try "bench.h"; it lets you put a START_TIMER; and STOP_TIMER("name"); into your code, allowing you to arbitrarily benchmark any section of code (note: only recommended for short sections, not things taking dozens of milliseconds or more). Its accurate to the clock cycle, though in some rare cases it can change how the code in between is compiled, in which case you're better off with a profiler (though profilers are generally more effort to use for specific sections of code).
It only works on x86.
You might want to google for an instrumentation tool.
You won't find a library call which lets you get past the clock resolution of your platform. Either use a profiler (man gprof), as another poster suggested, or (quick and dirty) put a loop around the offending section of code to execute it many times, and use clock().
gettimeofday() provides you with a resolution of microseconds, whereas clock_gettime() provides you with a resolution of nanoseconds.
int clock_gettime(clockid_t clk_id, struct timespec *tp);
The clk_id argument identifies the clock to be used. Use CLOCK_REALTIME if you want a system-wide clock visible to all processes, CLOCK_PROCESS_CPUTIME_ID for a per-process timer, and CLOCK_THREAD_CPUTIME_ID for a thread-specific timer.
It depends on the conditions. Profilers are nice for general global views, but if you really need an accurate view, my recommendation is KISS: simply run the code in a loop so that it takes a minute or so to complete, then compute a simple average based on the total run time and the number of iterations executed (a sketch follows the list below).
This approach allows you to:
Obtain accurate results with low-resolution timers.
Not run into issues where instrumentation interferes with the high-speed caches (L2, L1, branch, etc.) close to the processor. However, running the same code in a tight loop can also produce optimistic results that may not reflect real-world conditions.
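A sketch of that loop-and-average idea, using gettimeofday() for the coarse total (ITERATIONS is a made-up placeholder; tune it so the whole run takes long enough for your timer's resolution):

#include <stdio.h>
#include <sys/time.h>

#define ITERATIONS 10000000L  /* hypothetical count, tune to your code */

int main(void) {
    struct timeval start, end;

    gettimeofday(&start, NULL);
    for (long i = 0; i < ITERATIONS; i++) {
        /* code under test goes here; note that a truly empty loop
           may be optimized away entirely by the compiler */
    }
    gettimeofday(&end, NULL);

    double total = (end.tv_sec - start.tv_sec) + (end.tv_usec - start.tv_usec) / 1e6;
    printf("total: %f s, average per iteration: %g s\n", total, total / ITERATIONS);
    return 0;
}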
I don't know which environment/OS you are working on, but your timing may be inaccurate if another thread, task, or process preempts your timed code in the middle. I suggest exploring mechanisms such as mutexes or semaphores to prevent other threads from preempting your process.
If you are developing on x86 or x64, why not use the Time Stamp Counter: RDTSC.
It will be more reliable than ANSI C functions like time() or clock(), as RDTSC is an atomic instruction. Using C functions for this purpose can introduce problems, as you have no guarantee that the thread they are executing in will not be switched out, in which case the value they return will not be an accurate description of the actual execution time you are trying to measure.
With RDTSC you can measure this better. You will need to convert the tick count back into a human-readable H:M:S format, which will depend on the processor's clock frequency, but google around and I am sure you will find examples.
However, even with RDTSC you will be including the time during which your code was switched out of execution. While it is a better solution than using time()/clock(), if you need an exact measurement you will have to turn to a profiler that will instrument your code and take into account when your code is not actually executing due to context switches or whatever.
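If you want to experiment with this on GCC or Clang for x86/x64, the __rdtsc() intrinsic from <x86intrin.h> is the simplest way to read the counter; here is a minimal sketch (converting ticks to seconds requires the TSC frequency of your particular CPU, which you have to determine separately):

#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>   /* GCC/Clang: provides __rdtsc() */

int main(void) {
    uint64_t t0 = __rdtsc();   /* read the Time Stamp Counter before */
    /* code to measure */
    uint64_t t1 = __rdtsc();   /* and after */

    /* the difference is in TSC ticks; divide by the TSC frequency (Hz)
       to convert to seconds */
    printf("elapsed: %llu ticks\n", (unsigned long long)(t1 - t0));
    return 0;
}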