Is there a simple way to measure computing time in C? I tried the time utility when running the program, but I need to measure a specific part of it.
Thanks
You can use the clock function in <time.h> along with the macro CLOCKS_PER_SEC:
clock_t start = clock() ;
do_some_work() ;
clock_t end = clock() ;
double elapsed_time = (end-start)/(double)CLOCKS_PER_SEC ;
Now elapsed_time holds the time it took to call do_some_work, in fractional seconds.
You can try the profiler "gprof". More information here: http://www.cs.utah.edu/dept/old/texinfo/as/gprof.html
You can generally use the clock() function to get the start and end times of a single call to your function being tested. If, however, do_some_work() is particularly fast, it needs to be put in a loop and have the cost of the loop itself factored out, something like:
#define COUNT 10000
// Get cost of naked loop.
clock_t start_base = clock();
for (int i = COUNT; i > 0; i--)
;
clock_t end_base = clock();
// Get cost of loop plus work.
clock_t start = clock();
for (int i = COUNT; i > 0; i--)
do_some_work() ;
clock_t end = clock();
// Calculate cost of single call.
double elapsed_time = end - start - (end_base - start_base);
elapsed_time = elapsed_time / CLOCKS_PER_SEC / COUNT;
This has at least two advantages:
you'll get an average time which is more representative of the actual time it should take; and
you'll get a more accurate answer in the case where the clock() function has a limited resolution.
@codebolt - Thank you, very nice! On Mac OS X, I added an include of time.h and pasted in your four lines. Then I printed the values of start and end (integers) and the elapsed time. 1 µs resolution.
output:
3 X: strcpy .name, .numDocks: start 0x5dc end 0x5e1 elapsed: 0.000005
calloc: start 0x622 end 0x630 elapsed: 0.000014
in my foo.c program I have
#include <libc.h>
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
but it works without explicitly including time.h. One of the others must bring it in.
Actual code:
clock_t start = clock() ;
strcpy( yard2.name, temp ); /* temp is only persistent in main... */
strcpy( yard1.name, "Yard 1");
strcpy( yard3.name, "3 y 3 a 3 r 3 d 3");
yard1.numDocks = MAX_DOCKS; /* or so I guess.. */
yard2.numDocks = MAX_DOCKS; /* or so I guess.. */
yard3.numDocks = MAX_DOCKS; /* or so I guess.. */
clock_t end = clock() ;
double elapsed_time = (end-start)/(double)CLOCKS_PER_SEC ;
printf("3 X: strcpy .name, .numDocks: start 0x%lx end 0x%lx elapsed: %-12.8f \n", (unsigned long)start, (unsigned long)end, elapsed_time );
start = clock() ;
arrayD = calloc( yard2.numDocks, sizeof( struct dock ) ); /* get some memory, init it to 0 */
end = clock() ;
elapsed_time = (end-start)/(double)CLOCKS_PER_SEC ;
printf("calloc: start 0x%lx end 0x%lx elapsed: %-12.8f \n", (unsigned long)start, (unsigned long)end, elapsed_time );
Related
I'd like to know if there is a way to reset the return value of the clock() function to 0.
I have code something like this:
#include <stdio.h>
#include <time.h>
#include <stdbool.h>
int main()
{
/* clock_t t1; */
unsigned int sec = 0;
while(true) {
if(clock() >= 1000) {
printf("%u seconds has passed\r", sec);
/* reset clock()'s return value to 0 */
sec++;
}
}
return 0;
}
What code should I put in place of the comment to reset the timer? Is there a way, or am I approaching the problem incorrectly?
clock() is always increasing.
clock() returns a tick count; there are CLOCKS_PER_SEC ticks in one second.
Note that clock() does not measure real time. clock() measures the processor time spent in your process. If you want to measure real time, use time() from time.h (or check your OS; on Linux you can use clock_gettime(CLOCK_MONOTONIC, ...) or CLOCK_REALTIME).
Save the current clock in a variable. Then compare the variable with current clock.
Usually stdout is line buffered. So until you write a newline character, nothing will show up. Make sure to flush stdout if you depend on that behavior.
#include <stdio.h>
#include <time.h>
int main() {
unsigned int sec = 0;
// we will stop the clock one second from now
clock_t stopclock = clock() + 1 * CLOCKS_PER_SEC;
while(1) {
// current time is greater than the stopping time
if (clock() > stopclock) {
// increment stopping time by one second
stopclock += 1 * CLOCKS_PER_SEC;
// count the second that just elapsed before printing it
sec++;
printf("\r%u seconds have passed", sec);
fflush(stdout);
}
}
return 0;
}
Note: calculations on clock_t type like clock() + 1 * CLOCKS_PER_SEC can potentially overflow - great code would handle such corner cases.
My best guess is that you are trying to time something. Take a look at this:
#include <time.h>
#include <stdio.h>
int main () {
clock_t start_t, end_t;
double total_t; /* seconds, so it must be a floating-point type */
int i;
start_t = clock();
printf("Starting of the program, start_t = %ld\n", (long)start_t);
printf("Going to scan a big loop, start_t = %ld\n", (long)start_t);
for(i=0; i< 1000000000; i++) {
}
end_t = clock();
printf("End of the big loop, end_t = %ld\n", (long)end_t);
total_t = (double)(end_t - start_t) / CLOCKS_PER_SEC;
printf("Total time taken by CPU: %f\n", total_t );
printf("Exiting of the program...\n");
return(0);
}
Suppose I have a nested for-loop and if-checks as shown below, and I want to see how many clock cycles (ultimately how many seconds) a particular for-loop or if-check takes to execute.
Should the sum of the clock cycles (seconds) taken by the inner for-loop and the if-check be equal (or approximately equal) to the clock cycles (seconds) taken by the outermost for-loop?
Or am I doing it wrong? Is there another way to time the loops?
Note: I have three functions doing pretty much the same thing. I declared three separate functions to measure each for-loop or if-check on its own, because if I timed all the sub-components in the same piece of code, the time measured for the outer for-loop would, I guess, include the extra instructions that compute the timings of the inner for-loop and the if-check.
void fun1(){
int i=0,j=0,k=0;
clock_t t=0,t_start=0,t_end=0;
//time the outermost forloop
t_start = clock();
for(i=0;i<100000;i++){
for(j=0;j<1000;j++){
//some code
}
if(k==0){
//some code
}
}
t_end = clock();
t=t_end-t_start;
double time_taken = ((double)t)/CLOCKS_PER_SEC;
printf("outer for-loop took %f seconds to execute \n", time_taken);
}
void fun2(){
int i=0,j=0,k=0;
clock_t t2=0,t2_start=0,t2_end=0;
for(i=0;i<100000;i++){
//time the inner for loop
t2_start=clock();
for(j=0;j<1000;j++){
//some code
}
t2_end=clock();
t2+=(t2_end-t2_start);
if(k==0){
//some code
}
}
double time_taken = ((double)t2)/CLOCKS_PER_SEC;
printf("inner for-loop took %f seconds to execute \n", time_taken);
}
void fun3(){
int i=0,j=0,k=0;
clock_t t3=0,t3_start=0,t3_end=0;
for(i=0;i<100000;i++){
for(j=0;j<1000;j++){
//some code
}
//time the if check
t3_start=clock();
if(k==0){
//some code
}
t3_end=clock();
t3+=(t3_end-t3_start);
}
double time_taken = ((double)t3)/CLOCKS_PER_SEC;
printf("if-check took %f seconds to execute \n", time_taken);
}
The expected answer is that t in fun1 will likely be slightly more than t2 + t3 from fun2 and fun3 respectively, the difference being the additional time to evaluate the outer loop itself.
Less obvious, however, is the time added by the measurement itself, which is roughly the time to invoke clock() once for each measurement. When measuring the inner parts, it is effectively multiplied by 100,000 because of the iterations of the outer loop.
Here's a program that measures the measurement itself and, for good measure, also the time to evaluate an empty outer loop.
#include <time.h>
#include <stdio.h>
int main () {
clock_t t = 0;
clock_t t_start, t_end;
for (int i = 0; i < 100000; i++) {
t_start = clock();
t_end = clock();
t += (t_end - t_start);
}
double time_taken = ((double) t) / CLOCKS_PER_SEC;
printf ("Time imposed by measurement itself: %fsec\n", time_taken);
t_start = clock();
for (int i = 0; i < 100000; i++) {
}
t_end = clock();
t = (t_end - t_start);
time_taken = ((double) t) / CLOCKS_PER_SEC;
printf ("Time to evaluate the loop: %fsec\n", time_taken);
}
Which, at least on my system, suggests the measurement may skew the results some:
Time imposed by measurement itself: 0.056949sec
Time to evaluate the loop: 0.000200sec
To get the amount of time your inner loops "really" take, you'll need to subtract the time added by the act of measuring them.
I wrote a program based on the idea of a Riemann sum to compute the value of an integral. It uses several threads, but its performance, compared to a sequential program I wrote later, is subpar. Algorithm-wise they are identical except for the threading, so the question is: what's wrong with it? I assume pthread_join is not the cause, because if one thread finishes sooner than the thread that join is currently waiting on, the later join on it will simply return immediately. Is that correct? The free call is probably wrong and there is no error checking on thread creation; I'm aware of that, I removed it while testing various things. Sorry for my bad English, and thanks in advance.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <sys/types.h>
#include <time.h>
int counter = 0;
float sum = 0;
pthread_mutex_t mutx;
float function_res(float);
struct range {
float left_border;
int steps;
float step_range;
};
void *calcRespectiveRange(void *ranges) {
struct range *rangs = ranges;
float left_border = rangs->left_border;
int steps = rangs->steps;
float step_range = rangs->step_range;
free(rangs);
//printf("left: %f steps: %d step range: %f\n", left_border, steps, step_range);
int i;
float temp_sum = 0;
for(i = 0; i < steps; i++) {
temp_sum += step_range * function_res(left_border);
left_border += step_range;
}
sum += temp_sum;
pthread_exit(NULL);
}
int main() {
clock_t begin, end;
if(pthread_mutex_init(&mutx, NULL) != 0) {
printf("mutex error\n");
}
printf("enter range, amount of steps and threads: \n");
float left_border, right_border;
int steps_count;
int threads_amnt;
scanf("%f %f %d %d", &left_border, &right_border, &steps_count, &threads_amnt);
float step_range = (right_border - left_border) / steps_count;
int i;
pthread_t tid[threads_amnt];
float chunk = (right_border - left_border) / threads_amnt;
int steps_per_thread = steps_count / threads_amnt;
begin = clock();
for(i = 0; i < threads_amnt; i++) {
struct range *ranges;
ranges = malloc(sizeof(ranges));
ranges->left_border = i * chunk + left_border;
ranges->steps = steps_per_thread;
ranges->step_range = step_range;
pthread_create(&tid[i], NULL, calcRespectiveRange, (void*) ranges);
}
for(i = 0; i < threads_amnt; i++) {
pthread_join(tid[i], NULL);
}
end = clock();
pthread_mutex_destroy(&mutx);
printf("\n%f\n", sum);
double time_spent = (double) (end - begin) / CLOCKS_PER_SEC;
printf("Time spent: %lf\n", time_spent);
return(0);
}
float function_res(float lb) {
return(lb * lb + 4 * lb + 3);
}
Edit: in short - can it be improved to reduce execution time (with mutexes, for example)?
The execution time will be shortened, provided you have multiple hardware threads available.
The problem is in how you measure time: clock returns the processor time used by the program, which means it sums the time taken by all the threads. If the work needs 1 second of CPU time in total and is split evenly across 2 threads running in parallel, each thread uses about 0.5 seconds and the program finishes in about 0.5 seconds of real time, but clock still reports roughly 1 second.
To get the actual time used (on Linux), use gettimeofday. I modified your code by adding
#include <sys/time.h>
and capturing the start time before the loop:
struct timeval tv_start;
gettimeofday( &tv_start, NULL );
and after:
struct timeval tv_end;
gettimeofday( &tv_end, NULL );
and calculating the difference in seconds:
printf("CPU Time: %lf\nTime passed: %lf\n",
time_spent,
((tv_end.tv_sec * 1000*1000.0 + tv_end.tv_usec) -
(tv_start.tv_sec * 1000*1000.0 + tv_start.tv_usec)) / 1000/1000
);
(I also fixed the malloc from malloc(sizeof(ranges)) which allocates the size of a pointer (4 or 8 bytes for 32/64 bit CPU) to malloc(sizeof(struct range)) (12 bytes)).
When running with the input parameters 0 1000000000 1000000000 1, that is, 1 billion iterations in 1 thread, the output on my machine is:
CPU Time: 4.352000
Time passed: 4.400006
When running with 0 1000000000 1000000000 2, that is, 1 billion iterations spread over 2 threads (500 million iterations each), the output is:
CPU Time: 4.976000
Time passed: 2.500003
For completeness' sake, I tested it with the input 0 1000000000 1000000000 4:
CPU Time: 8.236000
Time passed: 2.180114
It is a little faster, but not twice as fast as with 2 threads, and it uses roughly double the CPU time of the single-threaded run. This is because my CPU is a Core i3, a dual-core with hyperthreading: the two extra logical threads share the physical cores rather than being true hardware threads.
I'm trying to measure how long some code in a thread takes to execute, but clock() is returning 0.
This is the code:
int main(int argc, char *argv[])
{
int begin, end;
float time_spent;
clock_t i,j;
struct timeval tv1;
struct timeval tv2;
for(i = 0; i<6; i++)
{
begin = clock();
// Send Audio Data
....
gettimeofday(&tv1,NULL);
usleep(200000); // Wait 200 ms
gettimeofday(&tv2,NULL);
printf("GETTIMEOFDAY %d\n", tv2.tv_usec-tv1.tv_usec); // Time using date WORKING
end = clock() - begin;
// Store time
...
printf ("It took me %d clicks (%f seconds).\n",begin,((float)begin)/CLOCKS_PER_SEC);
printf ("It took me %d clicks (%f seconds).\n",end,((float)end)/CLOCKS_PER_SEC);
time_spent = (((float)end) * 1000.0 / ((float)CLOCKS_PER_SEC)); // Time using clock BAD
printf("\n TIME %dms|%dms|%fms|%d\n",begin,end, time_spent,CLOCKS_PER_SEC);
}
return 0;
}
But I get 0 clicks every time. I think usleep is not waiting exactly 200 ms, so I need to measure how long the function takes to encode audio using ffmpeg, for synchronization.
I think the problem is that you're using the clock() function.
The clock function returns the amount of processor time used since the invocation of the calling process, measured in units of 1/CLOCKS_PER_SEC of a second.
So for example:
clock_t start = clock();
sleep(8);
clock_t finish = clock();
printf("It took %ld seconds to execute the sleep.\n",
(long)((finish - start) / CLOCKS_PER_SEC));
This code will give you a value of 0, because the code was not using the processor; it was sleeping.
This code however:
long i;
clock_t start = clock();
for (i = 0; i < 100000000; ++i)
exp(log((double)i));
clock_t finish = clock();
printf("It took %ld seconds to execute the for loop.\n",
(long)((finish - start) / CLOCKS_PER_SEC));
will give you a count of about 8 seconds (the exact figure depends on the machine), because the code was using the processor the whole time.
This is the "algorithm", but when I want to measure the execution time it gives me zero. Why?
#define ARRAY_SIZE 10000
...
clock_t start, end;
start = clock();
for( i = 0; i < ARRAY_SIZE; i++)
{
non_parallel[i] = vec[i] * vec[i];
}
end = clock();
printf( "Number of seconds: %f\n", (end-start)/(double)CLOCKS_PER_SEC );
So what should I do to measure the time?
Two things:
10000 is not a lot on a modern computer. That loop will probably run in less than a millisecond, which is below the precision of clock(), so the difference comes out as zero.
If you aren't using the result of non_parallel, it's possible that the entire loop will be optimized out by the compiler.
Most likely, you just need a more expensive loop. Try increasing ARRAY_SIZE to something much larger.
Here's a test on my machine with a larger array size:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define ARRAY_SIZE 100000000
int main(){
clock_t start, end;
double *non_parallel = (double*)malloc(ARRAY_SIZE * sizeof(double));
double *vec = (double*)malloc(ARRAY_SIZE * sizeof(double));
start = clock();
for(int i = 0; i < ARRAY_SIZE; i++)
{
non_parallel[i] = vec[i] * vec[i];
}
end = clock();
printf( "Number of seconds: %f\n", (end-start)/(double)CLOCKS_PER_SEC );
free(non_parallel);
free(vec);
return 0;
}
Output:
Number of seconds: 0.446000
This is an unreliable way to measure elapsed seconds, since the clock() function has fairly low precision and your loop isn't doing a lot of work. You can either make your loop do more so that it runs longer, or use a better timing method.
The higher-precision methods are platform specific. For Windows, see "How to use QueryPerformanceCounter?"; for Linux, see "High resolution timer with C++ and Linux?"