Measure the exact time (number of cycles) of a set of instructions - c

I have some sectors on my drive that read poorly. I could measure the read time required by each sector and then compare the times of the good sectors and the bad sectors.
I could use one of the processor's timers to make the measurements.
How do I write a program in C/Assembly that measures the exact time it takes for each sector to be read?
So the procedure would be something like this:
Start the timer
Read the disk sector
Stop the timer
Read the time measured by the timer

The most useful facility is the "rdtsc" instruction (ReaD Time Stamp Counter), which reads a counter that is incremented on every tick of the processor's internal clock. For a 3 GHz processor it increments 3 billion times per second. It returns a 64-bit unsigned integer containing the number of clock cycles since the processor was powered on.
Obviously the difference between two read-outs is the number of elapsed clock cycles consumed by the code sequence executed in between. For a 3 GHz machine you could use any of the following formulas to convert to fractions of a second:
(time_difference + 150) / 300 gives a rounded-off elapsed time in 0.1 us (tenths of microseconds)
(time_difference + 1500) / 3000 gives a rounded-off elapsed time in us (microseconds)
(time_difference + 1500000) / 3000000 gives a rounded-off elapsed time in ms (milliseconds)
The 0.1 us formula is the most precise value you can use without having to adjust for read-out overhead.
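A minimal sketch of how this could look in C, using the __rdtsc() compiler intrinsic (from <x86intrin.h> on GCC/Clang, <intrin.h> on MSVC) instead of hand-written assembly; the 3 GHz conversion factor is the assumption from above, and read_disk_sector() is just a placeholder for the operation being timed:

#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>   /* __rdtsc() on GCC/Clang; MSVC provides it via <intrin.h> */

static void read_disk_sector(void) { /* placeholder for the real sector read */ }

int main(void)
{
    uint64_t start = __rdtsc();              /* cycle count before */
    read_disk_sector();
    uint64_t end = __rdtsc();                /* cycle count after */

    uint64_t cycles = end - start;
    /* ASSUMPTION: 3 GHz clock, so 300 cycles = 0.1 us (see the formulas above) */
    uint64_t tenths_of_us = (cycles + 150) / 300;

    printf("%llu cycles, about %llu tenths of a microsecond\n",
           (unsigned long long)cycles, (unsigned long long)tenths_of_us);
    return 0;
}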

In C, the function that would be most useful is clock() in time.h.
To time something, put calls to clock() around it, like so:
#include <stdio.h>   /* printf */
#include <time.h>    /* clock, clock_t, CLOCKS_PER_SEC */

clock_t start, end;
float elapsed_time;

start = clock();
read_disk_sector();
end = clock();

elapsed_time = (float)(end - start) / (float)CLOCKS_PER_SEC;
printf("Elapsed time: %f seconds\n", elapsed_time);
This code prints out the number of seconds the read_disk_sector() function call took.
You can read more about the clock function here:
http://www.cplusplus.com/reference/clibrary/ctime/clock/

Related

What's the relationship between the real CPU frequency and the clock_t in C?

What's the relationship between the real CPU frequency and the clock_t (the unit is clock tick) in C?
Let's say I have the piece of C code below, which measures the CPU time consumed by running a for loop.
But since CLOCKS_PER_SEC is a constant value (basically 1,000,000) in the C standard library, I wonder how the clock function manages to measure the real CPU cycles consumed by the program while it runs on different computers with different CPU frequencies (for my laptop, it is 2.6 GHz).
And if they are not relevant, how does the CPU timer work in the mentioned scenario?
#include <time.h>
#include <stdio.h>

int main(void) {
    clock_t start_time = clock();
    for (int i = 0; i < 10000; i++) {}
    clock_t end_time = clock();
    printf("%fs\n", (double)(end_time - start_time) / CLOCKS_PER_SEC);
    return 0;
}
Effectively, clock_t values are unrelated to the CPU frequency.
While clock_t-type values could, in theory, have represented actual physical CPU clock ticks, in practice they do not: POSIX mandates that CLOCKS_PER_SEC be equal to 1,000,000 (one million). Thus the clock() function returns a value in microseconds.
There is no such thing as "the real CPU frequency". Not in everyone's laptop, at any rate.
On many systems, the OS can lower and raise the CPU clock speed as it sees fit. On some systems there is more than one kind of central processor or core, each with a different speed. Some CPUs are clockless (asynchronous).
Because of all this and for other reasons, most computers measure time with a separate clock device, independent from the CPU clock (if any).
For providing the information used in the shown code, measuring/knowing/using the CPU cycles is not relevant.
For providing the elapsed time, it is only necessary to measure the time.
Reading a hardware timer would be one way to do so.
Most computers (even non-embedded ones) do contain dedicated timers that count ticks of a clock with a known, constant frequency. (They are specifically not "CPU timers".)
Such a timer can be read and yields a value that increases once per tick (of constant period). Here "known period" means a period known to the appropriate driver for that timer; simplified, "known to the clock() function, not necessarily known to you".
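A minimal sketch of reading such a timer on a POSIX system, using clock_gettime() with the CLOCK_MONOTONIC clock from <time.h>; the empty loop is only a placeholder for the work being measured:

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);       /* read the constant-rate timer */
    for (volatile int i = 0; i < 10000; i++) {}   /* placeholder work */
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed = (end.tv_sec - start.tv_sec)
                   + (end.tv_nsec - start.tv_nsec) / 1e9;
    printf("%fs elapsed (wall-clock time)\n", elapsed);
    return 0;
}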
Note that even if the number of used CPU cycles were known, calculating the elapsed time from that info is near impossible nowadays, in the presence of:
pipelines
parallelisms
interrupts
branch prediction
More things influencing/preventing the calculation, contributed in a comment:
frequency scaling, temperature throttling and power settings
(David C. Rankin)

Can someone give me an explanation of this function?

I've been wanting a pause function for a while and I found this. Being a beginner in C, I can't be sure, but it looks like it uses functions from <clock.h>.
I would like to implement this into my code, but not without understanding it.
void wait(int seconds){
    clock_t start, end;
    start = clock();
    end = clock();
    while (((end-start) / CLOCKS_PER_SEC) = !seconds)
        end = clock();
}
It's just a busy-wait loop, which is a very nasty way of implementing a delay, because it pegs the CPU at 100% while doing nothing. Use sleep() instead:
#include <unistd.h>
void wait(int seconds)
{
sleep(seconds);
}
Also note that the code given in the question is buggy:
while (((end-start) / CLOCKS_PER_SEC) = !seconds)
should be:
while (((end-start) / CLOCKS_PER_SEC) != seconds)
or better still:
while (((end-start) / CLOCKS_PER_SEC) < seconds)
(but as mentioned above, you shouldn't even be using this code anyway).
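For reference, here is a fully written-out corrected version of the busy-wait (still not recommended, for the reasons above): it pegs the CPU at 100% and counts CPU time rather than wall-clock time, but it shows the fixed comparison in context:

#include <time.h>

/* Busy-waits until roughly `seconds` seconds of CPU time have elapsed.
 * Not recommended: it keeps one core at 100% for the whole wait. */
void wait(int seconds)
{
    clock_t start = clock();
    clock_t end = clock();
    while (((end - start) / CLOCKS_PER_SEC) < seconds)
        end = clock();
}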
Generally, clock_t is an arithmetic type defined in the <time.h> library. clock() returns the number of clock ticks elapsed from program initiation up to the point where you call it.
If you want to know more you can read details here: http://www.tutorialspoint.com/c_standard_library/c_function_clock.htm
The clock() function returns the number of clock ticks of processor time that have elapsed since the program began execution.
Here is the prototype: clock_t clock(void);
Note that, the return value is not in seconds. To convert the result into seconds, you have to divide the return value by CLOCKS_PER_SEC macro.
Your program does just this. Functionally it stops the program execution for seconds seconds (you pass this value as an argument to the wait function).
By the way, it uses time.h, not clock.h. And this code is not a good starting point for learning C.
To learn more: http://www.cplusplus.com/reference/ctime/?kw=time.h
clock function in <ctime>:
Returns the processor time consumed by the program.
The value returned is expressed in clock ticks, which are units of
time of a constant but system-specific length (with a relation of
CLOCKS_PER_SEC clock ticks per second).
So, basically it returns the number of processor clock ticks that have passed since the start of the program. Since it only counts time the process actually spends executing on the CPU, it does not account for I/O time or anything else that does not use the CPU.
CLOCKS_PER_SEC is the constant used to convert that tick count into seconds (1,000,000 on POSIX systems); it does not vary at run time. What varies is how much CPU time a process accumulates: a process that does a lot of I/O spends most of its time not using the CPU, so its clock() value grows more slowly than wall-clock time.
Also this statement: ((end-start) / CLOCKS_PER_SEC) = !seconds
is not correct, because the right implementation is
while (((end-start) / CLOCKS_PER_SEC) != seconds)
end = clock();
This does the trick of busy waiting: the program is trapped inside this while loop until seconds seconds have passed, using clock() and CLOCKS_PER_SEC to determine how much time has elapsed.
Although I would suggest changing it to:
while (((end-start) / CLOCKS_PER_SEC) < seconds)
end = clock();
Because if the process has a low priority, or the computer is busy handling many other processes, the measured value of (end-start) / CLOCKS_PER_SEC can jump by more than one between two loop iterations (for example when the system is bogged down by a buggy program that takes a lot of resources and has a high enough priority to cause CPU starvation), so the != test could be skipped over and never become false.
Finally, I do not recommend using it, because you are still using the CPU while waiting, which can be avoided by using the sleep tools discussed above.

Concept of clock tick and clock cycles

I have written a very small code to measure the time taken by my multiplication algorithm :
clock_t begin, end;
float time_spent;
begin = clock();
a = b*c;
end = clock();
time_spent = (float)(end - begin)/CLOCKS_PER_SEC;
I am working with mingw under Windows.
I am guessing that end = clock() will give me the clock ticks at that particular moment. Subtracting it from begin will give me clock ticks consumed by multiplication. When I divide with CLOCKS_PER_SEC, I will get the total amount of time.
My first question is: Is there a difference between clock ticks and clock cycle?
My algorithm here is so small that the difference end-begin is 0. Does this mean that my code execution time was less than 1 tick and that's why I am getting zero?
My first question is: Is there a difference between clock ticks and clock cycle?
Yes. A clock tick could be 1 millisecond or microsecond while the clock cycle could be 0.3 nanoseconds. On POSIX systems CLOCKS_PER_SEC must be defined as 1000000 (1 million). Note that if the CPU measurement cannot be obtained with microsecond resolution then the smallest jump in the return value from clock() will be larger than one.
My algorithm here is so small that the difference end-begin is 0. Does this mean that my code execution time was less than 1 tick and that's why I am getting zero?
Yes. To get a better reading I suggest that you loop enough iterations so that you measure over several seconds.
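A minimal sketch of that approach, reusing the multiplication from the question; the iteration count and the operand values are arbitrary, and the variables are declared volatile only so the compiler does not optimise the loop away:

#include <stdio.h>
#include <time.h>

int main(void)
{
    const long ITERATIONS = 100000000L;       /* arbitrary; pick enough to run for seconds */
    volatile int a = 0, b = 1234, c = 5678;   /* volatile keeps the loop from being removed */

    clock_t begin = clock();
    for (long i = 0; i < ITERATIONS; i++)
        a = b * c;
    clock_t end = clock();

    double total = (double)(end - begin) / CLOCKS_PER_SEC;
    printf("total: %f s, per multiplication: %e s\n", total, total / ITERATIONS);
    return 0;
}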
Answering the difference between clock tick and clock cycle from a systems perspective
Every processor is accompanied by a physical clock (usually a quartz crystal clock), which oscillates at a certain frequency (vibrations/sec). The processor keeps track of time with the help of interrupts generated from the physical clock, which interrupt the processor every time period T. This interrupt is called a 'clock tick'. The CPU counts the number of interrupts it has seen since the system started, and returns that value when you call clock(). By taking the difference between two clock tick values (obtained from clock()), you get how many interrupts were seen between those two points in time.
Most modern operating systems program the T value to be 1 microsecond, i.e. the physical clock interrupts every 1 microsecond; this is the lowest clock granularity widely supported by most physical clocks. With 1 microsecond as T, the tick rate works out to 1,000,000 ticks per second. So, with this information, you can calculate the elapsed time from the difference of two clock tick values, i.e. the difference between the two ticks multiplied by the tick period.
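For example, if two clock() readings differ by 2,500 ticks and T is 1 microsecond, the elapsed time is 2,500 × 1 µs = 2.5 ms.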
NOTE: the tick rate defined by the OS has to be <= the vibrations/sec of the physical clock, otherwise there will be a loss of precision.
For your first question: clock ticks refer to the main system clock. They are the smallest unit of time recognized by the device. A clock cycle is the time taken for a full processor pulse to complete. You can work this out from your CPU speed given in Hz: a 2 GHz processor performs 2,000,000,000 clock cycles per second.
For your second question: probably yes.
A clock cycle is a clock tick.
A clock cycle is the amount of time between two pulses of the processor's oscillator, and the number of cycles per second determines the speed of a computer processor, or CPU. Generally speaking, the higher the number of pulses per second, the faster the computer processor will be able to process information.

MATLAB's tic-toc & C's clock discrepancy

I have written some C code which I call from MATLAB after compiling it with MEX. Inside the C code, I measure the time of a part of the computation using the following code:
clock_t begin, end;
double time_elapsed;
begin = clock();
/* do stuff... */
end = clock();
time_elapsed = (double) ((double) (end - begin) / (double) CLOCKS_PER_SEC);
Elapsed time should be the execution time in seconds.
I then output the value time_elapsed to MATLAB (it is properly exported; I checked). Then MATLAB-side I call this C function (after I compile it using MEX) and I measure its execution time using tic and toc. What turns out to be a complete absurdity is that the time I compute using tic and toc is 0.0011s (average on 500 runs, st. dev. 1.4e-4) while the time that is returned by the C code is 0.037s (average on 500 runs, st. dev. 0.0016).
Here one may notice two very strange facts:
The execution time for the whole function is lower than the execution time for a part of the code. Hence, either MATLAB's or C's measurements are strongly inaccurate.
The execution times measured in the C code are very scattered and exhibit very high st. deviation (coeff. of variation 44%, compared to just 13% for tic-toc).
What is going on with these timers?
You're comparing apples to oranges.
Look at Matlab's documentation:
tic - http://www.mathworks.com/help/matlab/ref/tic.html
toc - http://www.mathworks.com/help/matlab/ref/toc.html
tic and toc let you measure real elapsed time.
Now look at the clock function http://linux.die.net/man/3/clock.
In particular,
The clock() function returns an approximation of processor time used by the program.
The value returned is the CPU time used so far as a clock_t; to
get the number of seconds used, divide by CLOCKS_PER_SEC. If the
processor time used is not available or its value cannot be
represented, the function returns the value (clock_t) -1.
So what can account for your difference:
CPU time (measured by clock()) and real elapsed time (measured by tic and toc) are NOT the same. So you would expect the CPU time to be less than the elapsed time? Well, maybe. What if within 0.0011 s you're driving 10 cores at 100%? That would mean the clock() measurement is 10x the one measured with tic and toc. Possible, but unlikely.
clock() is grossly inaccurate and, consistent with the documentation, an approximate CPU time measurement! I suspect that it is pegged to the scheduler quantum size, but I didn't dig through the Linux kernel code to check. I also didn't check other OSes, but one blog post I found is consistent with that theory.
So what to do... for starters, compare apples to apples! Next, make sure you take into account timer resolution.
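A minimal sketch of measuring both quantities around the same region, so the comparison is apples to apples; it assumes a POSIX system with clock_gettime() available, and do_stuff() stands in for the MEX computation:

#include <stdio.h>
#include <time.h>

static void do_stuff(void) { /* placeholder for the computation being timed */ }

int main(void)
{
    struct timespec w0, w1;
    clock_t c0, c1;

    clock_gettime(CLOCK_MONOTONIC, &w0);   /* wall-clock time, comparable to tic/toc */
    c0 = clock();                          /* CPU time, what the MEX code measured */
    do_stuff();
    c1 = clock();
    clock_gettime(CLOCK_MONOTONIC, &w1);

    double wall = (w1.tv_sec - w0.tv_sec) + (w1.tv_nsec - w0.tv_nsec) / 1e9;
    double cpu  = (double)(c1 - c0) / CLOCKS_PER_SEC;
    printf("wall: %f s, cpu: %f s\n", wall, cpu);
    return 0;
}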

Is clock-based timing reliable when CPU frequency is variable?

A common way to measure elapsed time is:
const clock_t START = clock();
// ...
const clock_t END = clock();
double T_ELAPSED = (double)(END - START) / CLOCKS_PER_SEC;
I know this is not the best way to measure real time, but I wonder if it works on a system with a variable frequency CPU. Is it just wrong?
There are system architectures that change the frequency of the CPU but have a separate and constant frequency to drive a system clock. One would think that a clock() function would return a time independent of the CPU frequency but this would have to be verified on each system the code is intended to run on.
It is not good to use on a variable clock speed CPU.
http://support.ntp.org/bin/view/Support/KnownHardwareIssues
The NTP (Network Time Protocol) daemon on Linux had issues with it.
Most OSes have API calls for more accurate values; an example on Windows is QueryPerformanceCounter:
http://msdn.microsoft.com/en-us/library/ms644904%28VS.85%29.aspx
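A minimal sketch of that Windows API; it assumes a Windows build environment with <windows.h>, and the commented region stands in for the code being timed:

#include <stdio.h>
#include <windows.h>

int main(void)
{
    LARGE_INTEGER freq, start, end;

    QueryPerformanceFrequency(&freq);   /* counts per second, fixed at system boot */
    QueryPerformanceCounter(&start);

    /* ... code to be timed ... */

    QueryPerformanceCounter(&end);

    double elapsed = (double)(end.QuadPart - start.QuadPart) / (double)freq.QuadPart;
    printf("Elapsed: %f seconds\n", elapsed);
    return 0;
}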
