I'm assigning the value of time.h's clock() to two int variables, as follows:
int i;
int start_time = clock();
for (i = 0; i < 1000000; i++) {
    printf("%d\n", i + 1);
}
int end_time = clock();
However, when I print their values, the actual time elapsed differs from the time they report. The POSIX standard requires CLOCKS_PER_SEC to be one million, implying that one clock tick is a microsecond. Is the clock just not running at the speed the standard expects, or is my loop causing some weirdness in the calculation?
I'm trying to measure the speed of different operations in a similar fashion, and an inaccurate clock ruins my experiments.
It takes several seconds to print from 1 to 1,000,000. The value assigned to end_time is usually around 900,000, which in microseconds is less than a second.
Your processor is fast. Your I/O is not so much.
When your code is executed, your processor gets the time using hardware and assigns it to start_time. Then it goes through the loop and puts one million lines in the output buffer. Putting things in the output buffer does not mean the processor has finished displaying them.
That is why the measured time is less than a second, even though you watch the output scroll by for several seconds.
Edit: Just to clarify, it seems that the phrase "Putting things in the output buffer does not mean the processor has finished displaying them" has caused some confusion. It is written from the perspective of the executing process, and does not mean that the processor puts all the output in the output buffer in one go.
What actually happens (as n.m. has pointed out) is this: clock() returns the process's CPU time, not the time of day. Because the output buffer is small and the process spends long stretches waiting after flushing it, the process CPU time ends up significantly smaller than the actual wall-clock execution time. From the perspective of the process, execution looks fast even though the output takes several seconds to appear.
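To see the difference yourself, you can time the same loop with both clock() (process CPU time) and a wall-clock source. This is only a minimal sketch, not the original program; it writes the timings to stderr so they are not buried in the million lines of output:

#include <stdio.h>
#include <time.h>

int main(void)
{
    clock_t cpu_start  = clock();      /* process CPU time */
    time_t  wall_start = time(NULL);   /* wall-clock time, 1 s resolution */

    for (int i = 0; i < 1000000; i++)
        printf("%d\n", i + 1);

    clock_t cpu_end  = clock();
    time_t  wall_end = time(NULL);

    fprintf(stderr, "CPU time:  %.3f s\n",
            (double)(cpu_end - cpu_start) / CLOCKS_PER_SEC);
    fprintf(stderr, "Wall time: %.0f s\n", difftime(wall_end, wall_start));
    return 0;
}

On a typical machine the reported CPU time is well under the wall time, which is exactly the effect described above.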
Related
I am working on encryption of realtime data. I have developed the encryption and decryption algorithm. Now I want to measure its execution time on a Linux platform in C. How can I measure it correctly? I have tried it as below:
struct timeval tv1, tv2;

gettimeofday(&tv1, NULL);
/* Algorithm implementation code */
gettimeofday(&tv2, NULL);

Total_Runtime = (tv2.tv_usec - tv1.tv_usec) +
                (tv2.tv_sec - tv1.tv_sec) * 1000000;
which gives me the time in microseconds. Is this the correct way of measuring time, or should I use some other function? Any hint will be appreciated.
clock(): The value returned is the CPU time used so far as a clock_t;
Logic
Get CPU time at program beginning and at end. Difference is what you want.
Code
clock_t begin = clock();
/**** code ****/
clock_t end = clock();
double time_spent = (double)(end - begin) / CLOCKS_PER_SEC;  /* in seconds */
To get the number of seconds used, we divide the difference by CLOCKS_PER_SEC.
More accurate
In C11, timespec_get() provides time measurement with up to nanosecond resolution. But the accuracy is implementation defined and can vary.
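For example, a minimal sketch with timespec_get() (note that TIME_UTC is a wall-clock base, so this measures elapsed time, not CPU time):

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec t1, t2;

    timespec_get(&t1, TIME_UTC);
    /* ... code to measure ... */
    timespec_get(&t2, TIME_UTC);

    double elapsed = (t2.tv_sec - t1.tv_sec)
                   + (t2.tv_nsec - t1.tv_nsec) / 1e9;
    printf("Elapsed: %.9f s\n", elapsed);
    return 0;
}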
Read time(7). You probably want to use clock_gettime(2) with CLOCK_PROCESS_CPUTIME_ID or CLOCK_MONOTONIC. Or you could just use clock(3) (for the CPU time in microseconds, since POSIX requires CLOCKS_PER_SEC to be one million).
If you want to benchmark an entire program (executable), use the time(1) command.
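For instance, a minimal sketch using clock_gettime(2) with CLOCK_PROCESS_CPUTIME_ID (swap in CLOCK_MONOTONIC for elapsed wall time; older glibc may need linking with -lrt):

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec t1, t2;

    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t1);  /* CPU time of this process */
    /* ... encryption code to benchmark ... */
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t2);

    double cpu_seconds = (t2.tv_sec - t1.tv_sec)
                       + (t2.tv_nsec - t1.tv_nsec) / 1e9;
    printf("CPU time: %.9f s\n", cpu_seconds);
    return 0;
}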
Measuring the execution time of a proper encryption code is simple, although a bit tedious. The runtime of a good encryption code is independent of the input: no matter what you throw at it, it always needs the same number of operations per chunk of input. If it doesn't, you have a problem called a timing attack.
So the only thing you need to do is unroll all loops, count the opcodes, and multiply the individual opcodes by their number of clock ticks to get the exact runtime. There is one problem: some CPUs have a variable number of clock ticks for some of their operations, and you might have to change those to operations that have a fixed number of clock ticks. A pain in the behind, admitted.
If the only thing you want to know is whether the code runs fast enough to fit into a slot of your real-time OS, you can simply take the maximum and pad the faster cases with NOPs (your RTOS might have a routine for that).
I've been wanting a pause function for a while and I found this. Being a beginner in C, I can't be sure, but it looks like it uses functions from <clock.h>.
I would like to implement this into my code, but not without understanding it.
void wait(int seconds){
    clock_t start, end;
    start = clock();
    end = clock();
    while (((end-start) / CLOCKS_PER_SEC) = !seconds)
        end = clock();
}
It's just a busy-wait loop, which is a very nasty way of implementing a delay, because it pegs the CPU at 100% while doing nothing. Use sleep() instead:
#include <unistd.h>
void wait(int seconds)
{
sleep(seconds);
}
Also note that the code given in the question is buggy:
while (((end-start) / CLOCKS_PER_SEC) = !seconds)
should be:
while (((end-start) / CLOCKS_PER_SEC) != seconds)
or better still:
while (((end-start) / CLOCKS_PER_SEC) < seconds)
(but as mentioned above, you shouldn't even be using this code anyway).
Generally, clock_t is an arithmetic type defined in the standard library header <time.h>. clock() returns the number of clock ticks elapsed from program start up to the point where you call it.
If you want to know more you can read details here: http://www.tutorialspoint.com/c_standard_library/c_function_clock.htm
The clock() function returns the number of clock ticks of processor time used since the program began execution.
Here is the prototype: clock_t clock(void);
Note that the return value is not in seconds. To convert the result into seconds, you have to divide the return value by the CLOCKS_PER_SEC macro.
Your program does just this. Functionally, it stops the program's execution for seconds seconds (you pass this value as an argument to the wait function).
By the way, it uses time.h, not clock.h. And it is not the best place to start learning C.
To learn more: http://www.cplusplus.com/reference/ctime/?kw=time.h
clock function in <ctime>:
Returns the processor time consumed by the program.
The value returned is expressed in clock ticks, which are units of
time of a constant but system-specific length (with a relation of
CLOCKS_PER_SEC clock ticks per second).
reference
So, basically, it returns the amount of processor time the program has consumed since it started, expressed in clock ticks. These ticks only accumulate while the process is actually using the CPU; they do not account for I/O time or anything else that does not use the CPU.
CLOCKS_PER_SEC is the number of clock ticks per second and can differ from implementation to implementation (POSIX fixes it at one million). It is a constant, not something that changes over time; what does change is how many ticks your process accumulates, because time spent waiting on I/O is not charged to the process as CPU time.
Also this statement: (end-start) / CLOCKS_PER_SEC) = !seconds
is not correct, because the right implementation is
while (((end-start) / CLOCKS_PER_SEC) != seconds)
end = clock();
This does the trick of busy waiting: the program stays trapped inside the while loop until seconds seconds have passed, using CPU clock ticks and CLOCKS_PER_SEC to determine the time that has passed.
Although I would suggest changing it to:
while (((end-start) / CLOCKS_PER_SEC) < seconds)
end = clock();
Because if the process has a low priority, or the computer is busy handling many other processes, there is a chance that the elapsed tick count jumps past the target between two calls to clock() (for example when the system is bogged down by some buggy program that takes up a lot of resources and has a high enough priority to cause CPU starvation), in which case the != test would never become true and the loop would never exit.
Finally, I do not recommend using it, because you are still consuming CPU time while waiting, which can be avoided by using the sleep tools discussed here.
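For completeness, one such tool is POSIX nanosleep(), which also allows sub-second delays; a minimal sketch (error handling such as restarting after EINTR is omitted):

#include <time.h>

/* Sleep for the given number of seconds without busy-waiting. */
void wait_seconds(int seconds)
{
    struct timespec req = { .tv_sec = seconds, .tv_nsec = 0 };
    nanosleep(&req, NULL);
}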
Can someone please tell me how this function works? I'm using it in code and have an idea how it works, but I'm not 100% sure exactly. I understand the concept of an input variable N counting down, but how the heck does it work? Also, if I am using it repeatedly in my main() for different delays (different inputs for N), do I have to "zero" the function if I used it somewhere else?
Reference: MILLISEC is a constant defined by Fcy/10000, or system clock/10000.
Thanks in advance.
// DelayNmSec() gives a 1 ms to 65.5 seconds delay
/* Note that FCY is used in the computation. Please make the necessary
   changes (PLLx4 or PLLx8 etc.) to compute the right FCY as in the define
   statement above. */
void DelayNmSec(unsigned int N)
{
    unsigned int j;

    while (N--)
        for (j = 0; j < MILLISEC; j++);
}
This is referred to as busy waiting, a technique that just burns CPU cycles, thus "waiting" by keeping the CPU "busy" with empty loops. You don't need to reset the function; it will do the same thing if called repeatedly.
If you call it with N=3, it will repeat the while loop 3 times, every time counting with j from 0 to MILLISEC, which is supposedly a constant that depends on the CPU clock.
The original author of the code has timed it and looked at the generated assembler to work out the exact number of instructions executed per millisecond, and has configured the constant MILLISEC so that the inner for loop busy-waits for exactly that long.
The input parameter N is then simply the number of milliseconds the caller want to wait and the number of times the for-loop is executed.
The code will break if
used on a different or faster microcontroller (depending on how Fcy is maintained), or
the optimization level on the C compiler is changed, or
the C compiler version is changed (as it may generate different code),
so, if the person who wrote it was clever, there may be a calibration program which defines and configures the MILLISEC constant.
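As a rough illustration of that calibration idea, here is a hypothetical host-side sketch that uses clock() to estimate how many empty-loop iterations fit into one millisecond of CPU time; on the actual microcontroller the constant is normally derived from Fcy instead, and the loop counter is volatile so the compiler cannot remove the loop:

#include <time.h>

/* Returns an estimate of empty-loop iterations per millisecond. */
static unsigned long calibrate_millisec(void)
{
    const unsigned long probe = 10000000UL;   /* trial iteration count */
    volatile unsigned long j;

    clock_t start = clock();
    for (j = 0; j < probe; j++)
        ;                                     /* empty busy loop */
    clock_t end = clock();

    double seconds = (double)(end - start) / CLOCKS_PER_SEC;
    if (seconds <= 0.0)
        return 0;                             /* probe too small to measure */
    return (unsigned long)(probe / (seconds * 1000.0));
}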
This is what is known as a busy wait in which the time taken for a particular computation is used as a counter to cause a delay.
This approach does have problems in that on different processors with different speeds, the computation needs to be adjusted. Old games used this approach and I remember a simulation using this busy wait approach that targeted an old 8086 type of processor to cause an animation to move smoothly. When the game was used on a Pentium processor PC, instead of the rocket majestically rising up the screen over several seconds, the entire animation flashed before your eyes so fast that it was difficult to see what the animation was.
This sort of busy wait means that in the thread running, the thread is sitting in a computation loop counting down for the number of milliseconds. The result is that the thread does not do anything else other than counting down.
If the operating system is not a preemptive multi-tasking OS, then nothing else will run until the count down completes which may cause problems in other threads and tasks.
If the operating system is preemptive multi-tasking the resulting delays will have a variability as control is switched to some other thread for some period of time before switching back.
This approach is normally used for small pieces of software on dedicated processors where a computation has a known amount of time and where having the processor dedicated to the countdown does not impact other parts of the software. An example might be a small sensor that performs a reading to collect a data sample then does this kind of busy loop before doing the next read to collect the next data sample.
#include <windows.h>
#include <stdio.h>
#include <stdint.h>
// assuming we return times with microsecond resolution
#define STOPWATCH_TICKS_PER_US 1
uint64_t GetStopWatch()
{
    LARGE_INTEGER t, freq;

    QueryPerformanceCounter(&t);
    QueryPerformanceFrequency(&freq);
    return (uint64_t)(t.QuadPart / (double)freq.QuadPart * 1000000);
}
void task()
{
    printf("hi\n");
}
int main()
{
    uint64_t start = GetStopWatch();
    task();
    uint64_t stop = GetStopWatch();
    printf("Elapsed time (microseconds): %llu\n", (unsigned long long)(stop - start));
}
The above uses QueryPerformanceCounter, which retrieves the current value of the high-resolution performance counter, and QueryPerformanceFrequency, which retrieves the frequency of that counter. If I call the task() function multiple times, the difference between the start and stop times varies, but I would expect to get roughly the same time difference each time I call the task function. Could anyone help me identify the mistake in the above code?
The thing is, Windows is a pre-emptive multi-tasking operating system. What the hell does that mean, you ask?
'Simple' - Windows allocates time-slices to each of the running processes in the system. This gives the illusion of dozens or hundreds of processes running in parallel. In reality, you are limited to 2, 4, 8 or perhaps 16 parallel processes in a typical desktop/laptop. An Intel i3 has 2 physical cores, each of which can give the impression of doing two things at once. (But in reality, there are hardware tricks going on that switch the execution between each of the two threads that each core can handle at once.) This is in addition to the software context switching that Windows/Linux/MacOSX do.
These time-slices are not guaranteed to be of the same duration each time. You may find the pc does a sync with windows.time to update your clock, you may find that the virus-scanner decides to begin working, or any one of a number of other things. All of these events may occur after your task() function has begun, yet before it ends.
In the DOS days, you'd get very nearly the same result each and every time you timed a single iteration of task(). Though, thanks to TSR programs, you could still find an interrupt was fired and some machine-time stolen during execution.
It is for just these reasons that a more accurate determination of the time a task takes to execute may be calculated by running the task N times, dividing the elapsed time by N to get the time per iteration.
For some functions in the past, I have used values for N as large as 100 million.
EDIT: A short snippet.
LARGE_INTEGER tStart, tEnd;
LARGE_INTEGER tFreq;
double tSecsElapsed;
QueryPerformanceFrequency(&tFreq);
QueryPerformanceCounter(&tStart);
int i, n = 100;
for (i=0; i<n; i++)
{
// Do Something
}
QueryPerformanceCounter(&tEnd);
tSecsElapsed = (tEnd.QuadPart - tStart.QuadPart) / (double)tFreq.QuadPart;
double tMsElapsed = tSecsElapsed * 1000;
double tMsPerIteration = tMsElapsed / (double)n;
Code execution time on modern operating systems and processors is very unpredictable. There is no scenario where you can be sure that the elapsed time actually measured the time taken by your code, your program may well have lost the processor to another process while it was executing. The caches used by the processor play a big role, code is always a lot slower when it is executed the first time when the caches do not yet contain the code and data used by the program. The memory bus is very slow compared to the processor.
It gets especially meaningless when you measure a printf() statement. The console window is owned by another process so there's a significant chunk of process interop overhead whose execution time critically depends on the state of that process. You'll suddenly see a huge difference when the console window needs to be scrolled for example. And most of all, there isn't actually anything you can do about making it faster so measuring it is only interesting for curiosity.
Profile only code that you can improve. Take many samples so you can get rid of the outliers. Never pick the lowest measurement; that just creates unrealistic expectations. Don't pick the average either; that is affected too much by the long delays that other processes can impose on your test. The median value is a good choice.
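A minimal sketch of that approach, using the portable clock() for brevity (the same idea applies to QueryPerformanceCounter samples): time the work many times, sort the samples, and report the median.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define SAMPLES 101   /* odd count, so the median is a single element */

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

int main(void)
{
    double t[SAMPLES];

    for (int i = 0; i < SAMPLES; i++) {
        clock_t start = clock();
        /* ... code being profiled ... */
        clock_t end = clock();
        t[i] = (double)(end - start) / CLOCKS_PER_SEC;
    }

    qsort(t, SAMPLES, sizeof t[0], cmp_double);
    printf("Median: %.6f s\n", t[SAMPLES / 2]);
    return 0;
}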
I know this question may have been commonly asked before, but it seems most of those questions are about the elapsed (wall-clock) time of a piece of code. The elapsed time of a piece of code is unlikely to equal the actual execution time, as other processes may run during the elapsed time of the code of interest.
I used getrusage() to get the user time and system time of a process, and then calculate the actual execution time by (user time + system time). I am running my program on Ubuntu. Here are my questions:
How do I know the precision of getrusage()?
Are there other approaches that can provide higher precision than getrusage()?
You can check the real CPU time of a process on linux by utilizing the CPU Time functionality of the kernel:
#include <time.h>
clock_t start, end;
double cpu_time_used;
start = clock();
... /* Do the work. */
end = clock();
cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
Source: http://www.gnu.org/s/hello/manual/libc/CPU-Time.html#CPU-Time
That way, you count CPU ticks, i.e. the processor time actually charged to the process, rather than wall-clock time, and thus get the real amount of work time.
The getrusage() function is the only standard/portable way that I know of to get "consumed CPU time".
There isn't a simple way to determine the precision of the returned values. I'd be tempted to call getrusage() once to get an initial value, then call it repeatedly until the value(s) returned differ from the initial value, and then assume the effective precision is the difference between the initial and final values. This is a hack (the precision could be higher than this method determines, and the result should probably be considered a worst-case estimate), but it's better than nothing; a minimal sketch of this probing approach follows.
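Here is that probing hack sketched out, treating user plus system time as one counter (the result is only a rough, worst-case estimate):

#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Total CPU time (user + system) of an rusage sample, in microseconds. */
static long long cpu_us(const struct rusage *ru)
{
    return (long long)(ru->ru_utime.tv_sec + ru->ru_stime.tv_sec) * 1000000LL
         + (ru->ru_utime.tv_usec + ru->ru_stime.tv_usec);
}

int main(void)
{
    struct rusage first, now;

    getrusage(RUSAGE_SELF, &first);
    do {
        getrusage(RUSAGE_SELF, &now);        /* spin until the value changes */
    } while (cpu_us(&now) == cpu_us(&first));

    printf("Observed step: %lld microseconds\n", cpu_us(&now) - cpu_us(&first));
    return 0;
}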
I'd also be concerned about the accuracy of the values returned. Under some kernels I'd expect that a counter is incremented for whatever code happens to be running when a timer IRQ occurs; and therefore it's possible for a process to be very lucky (and continually block just before the timer IRQ occurs) or very unlucky (and unblock just before the timer IRQ occurs). In this case "lucky" could mean a CPU hog looks like it uses no CPU time, and "unlucky" could mean a process that uses very little CPU time looks like a CPU hog.
For specific versions of specific kernels on specific architecture/s (potentially depending on if/when the kernel is compiled with specific configuration options in some cases), there may be higher precision alternatives that aren't portable and aren't standard...
You can use this piece of code :
#include <stdio.h>
#include <sys/time.h>

struct timeval start, end;
double delta;

gettimeofday(&start, NULL);
.
.
.
gettimeofday(&end, NULL);

delta = ((end.tv_sec - start.tv_sec) * 1000000u +
         end.tv_usec - start.tv_usec) / 1.e6;
printf("Time is : %f\n", delta);
It will show you the elapsed (wall-clock) time for that piece of your code.