I am searching for a standard way to identify the running time complexity of a program.
As described here, I am not looking for a solution that analyzes the code itself, but rather one that works through other parameters observed at program runtime.
Consider a program that requires the user to convert a binary string to its decimal equivalent. The time complexity of such a program should be O(n) at worst, when each binary digit is processed one at a time. With some cleverness, the running time can be reduced to O(n/4) (process 4 digits of the binary string at a time, assuming the binary string has 4k digits for some k = 1, 2, 3, ...).
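In outline, the digit-at-a-time version being timed looks something like the following sketch (names are illustrative, and the real program handles arbitrarily long strings rather than a machine integer):

    #include <stdio.h>
    #include <string.h>

    /* Sketch of the O(n) digit-at-a-time conversion described above.
     * A real 100000-digit input would need a big-number representation;
     * this only illustrates the loop that is being timed. */
    unsigned long long bin_to_dec(const char *s)
    {
        unsigned long long value = 0;
        for (size_t i = 0; s[i] == '0' || s[i] == '1'; ++i)
            value = (value << 1) | (unsigned long long)(s[i] - '0');
        return value;
    }

    int main(void)
    {
        printf("%llu\n", bin_to_dec("1011"));  /* prints 11 */
        return 0;
    }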
I wrote this program in C and timed it both with the time command and with a function that uses gettimeofday, on a Linux box with a 64-bit quad-core processor (each core at 800 MHz), under two categories:
When the system is under normal load (core usage 5-10%)
When the system is under heavy load (core usage 80-90%)
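For reference, the gettimeofday-based measurement is roughly the following (simplified; convert_binary_string is just a placeholder for the conversion under test):

    #include <stdio.h>
    #include <sys/time.h>

    /* Measure elapsed wall-clock time around the code under test with
     * gettimeofday(); convert_binary_string() is a placeholder name. */
    int main(void)
    {
        struct timeval start, end;

        gettimeofday(&start, NULL);
        /* convert_binary_string(input); */   /* code being timed */
        gettimeofday(&end, NULL);

        long ms = (long)((end.tv_sec - start.tv_sec) * 1000
                       + (end.tv_usec - start.tv_usec) / 1000);
        printf("Timed using gettimeofday (ms): %ld\n", ms);
        return 0;
    }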
Following are the readings for the O(n) algorithm, with a binary string of length 100000, under normal load:
Time spent in User (ms) - 216
Time Spent in Kernel (ms) - 8
Timed using gettimeofday (ms) - 97
Following are the readings for the O(n) algorithm, with a binary string of length 200000, under heavy load:
Time spent in User (ms) - 400
Time Spent in Kernel (ms) - 48
Timed using gettimeofday (ms) - 190
What I am looking for:
If I am using the time command, which output should I consider: real, user or sys?
Is there a standard method to calculate the running time of a program?
Every time I execute these commands, I get a different reading. How many times should I sample so that the average is stable, given that the code does not change?
What if I want to use multiple threads and measure the time in each thread by calling execve on such programs?
From the research I have done, I have not come across any standard approach. Also, whatever command or method I use seems to give a different output each time (I understand this is because of context switches and CPU cycles). We can assume here that I can even live with a solution that is machine-dependent.
To answer your questions:
Depending on what your code is doing, each component of the output of time may be significant. This question deals with what those components mean. If the code you're timing doesn't use system calls, measuring the "user" time is probably sufficient. I'd probably just use the "real" time.
What's wrong with time? If you need better granularity (i.e. you just want to time a section of code instead of the entire program), you can always record the start time before the block of code you are profiling, run the code, record the end time, and then calculate the difference to get the runtime. NEVER use gettimeofday for this, as that time does not monotonically increase: the system time can be changed by the administrator or by an NTP process. You should use clock_gettime instead.
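A minimal sketch of that start/stop pattern with clock_gettime and CLOCK_MONOTONIC (the loop is just a stand-in for the code you want to time; older glibc versions need -lrt at link time):

    #include <stdio.h>
    #include <time.h>

    /* Time a block of code with a monotonic clock. */
    int main(void)
    {
        struct timespec start, end;

        clock_gettime(CLOCK_MONOTONIC, &start);

        volatile unsigned long sink = 0;
        for (unsigned long i = 0; i < 10000000UL; ++i)   /* code under test */
            sink += i;

        clock_gettime(CLOCK_MONOTONIC, &end);

        double ms = (end.tv_sec - start.tv_sec) * 1000.0
                  + (end.tv_nsec - start.tv_nsec) / 1e6;
        printf("elapsed: %.3f ms\n", ms);
        return 0;
    }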
To minimise the runtime differences from run to run, I would check that CPU frequency scaling is OFF, especially if you're getting wildly differing results. This has caught me out before.
Once you start getting into multiple threads, you might want to start looking at a profiler. gprof is a good place to start.
Related
I want to measure a duration of 125 µs in order to implement a TDM (Time Division Multiplexing) scheme. However, I am not able to measure this duration with an accuracy of ±5 µs using the Linux operating system. I am using DPDK, which runs on Ubuntu and Intel hardware. If I get the time from the computer using clock_gettime(CLOCK_REALTIME), the call into the kernel to fetch the time adds overhead, which gives me an inaccurate duration.
Therefore, I dedicated a CPU core to keeping time without asking the kernel for it. To do this, I run a for loop for a maximum number of instructions (8000000) and find the number of instructions that need to be executed for the 125 µs duration (i.e. (125 * 8000000) / timespent).
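In outline, the calibration step looks something like the following sketch (simplified; the loop body and names are illustrative, and the real code runs on a dedicated core):

    #include <stdio.h>
    #include <time.h>

    #define CAL_ITERS 8000000UL   /* calibration loop length, as above */

    /* One-time calibration: time CAL_ITERS iterations of busy work, then
     * scale to estimate how many iterations correspond to 125 microseconds. */
    int main(void)
    {
        struct timespec t0, t1;
        volatile unsigned long sink = 0;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (unsigned long i = 0; i < CAL_ITERS; ++i)
            sink += i;                       /* busy work */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double us_spent = (t1.tv_sec - t0.tv_sec) * 1e6
                        + (t1.tv_nsec - t0.tv_nsec) / 1e3;
        unsigned long iters_125us =
            (unsigned long)(125.0 * CAL_ITERS / us_spent);

        printf("calibration took %.1f us, ~%lu iterations per 125 us\n",
               us_spent, iters_125us);
        return 0;
    }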
However, the problem is that this also gives inaccurate results (the results always differ, by roughly 1000 instructions).
Does anybody know why I am getting inaccurate results even though I am dedicating a CPU core to this?
Does anyone know a method to measure a very short duration (perhaps 125 µs) without making a call to the kernel? Thanks!
You are getting inaccurate results because you are on a multitasking operating system. You cannot do this on a modern computer; you can only do it on an embedded microcontroller where you control 100% of the CPU time. The operating system needs to manage your process even if you have a dedicated CPU, and the mouse and keyboard take time as well. You would have to run the process on 'bare metal'.
I have written an application program in C, and now I am trying to find the execution time and memory usage of my program. I have tried using the time.h header and did the following:
dif_sec = (double) difftime (time2,time1);
But every time I run the program, it gives a different execution time.
For example: the first time I got 19 milliseconds, and if I run the same program again it gives a different execution time, greater than 19 milliseconds, around 28 milliseconds. Sometimes it gives around 150 milliseconds. So I am trying to get an exact execution time.
I also need help finding the memory usage of my program.
I am running my program in Code::Blocks on Windows.
For Windows programs you can use QueryPerformanceCounter and QueryPerformanceFrequency to measure times to a decent accuracy. There's an article here.
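A minimal sketch of how that might look (the loop is only a placeholder for the code being timed):

    #include <stdio.h>
    #include <windows.h>

    /* Measure elapsed time with the high-resolution performance counter. */
    int main(void)
    {
        LARGE_INTEGER freq, start, end;

        QueryPerformanceFrequency(&freq);    /* counts per second */
        QueryPerformanceCounter(&start);

        volatile unsigned long sink = 0;
        for (unsigned long i = 0; i < 10000000UL; ++i)   /* code under test */
            sink += i;

        QueryPerformanceCounter(&end);

        double ms = (double)(end.QuadPart - start.QuadPart)
                  * 1000.0 / (double)freq.QuadPart;
        printf("elapsed: %.3f ms\n", ms);
        return 0;
    }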
In a program that calls getrusage() twice in order to obtain the time of a task by subtraction, I once saw an assertion fail which said that the time of the task should be nonnegative. This, of course, cannot easily be reproduced, although I could write a specialized program that might reproduce it more readily.
I have tried to find a guarantee that getrusage() increases over the course of execution, but neither the man page on my system (Linux on x86-64) nor this system-independent description says so explicitly.
The behavior was observed on a physical computer, with several cores, and NTP running.
Should I report a bug against the OS I am using? Am I asking too much when I expect getrusage() to increase with time?
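For reference, the call-twice-and-subtract pattern I am describing is roughly the following sketch (the work in between and the assertion are simplified placeholders, not my actual code):

    #include <assert.h>
    #include <stdio.h>
    #include <sys/resource.h>

    /* Convert the user-time member of struct rusage to seconds. */
    static double utime_sec(const struct rusage *ru)
    {
        return ru->ru_utime.tv_sec + ru->ru_utime.tv_usec / 1e6;
    }

    int main(void)
    {
        struct rusage before, after;

        getrusage(RUSAGE_SELF, &before);

        volatile unsigned long sink = 0;
        for (unsigned long i = 0; i < 10000000UL; ++i)   /* the task */
            sink += i;

        getrusage(RUSAGE_SELF, &after);

        double elapsed = utime_sec(&after) - utime_sec(&before);
        assert(elapsed >= 0.0);           /* the assertion that failed */
        printf("user time for the task: %f s\n", elapsed);
        return 0;
    }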
On many systems rusage (I presume you mean ru_utime and ru_stime) is not calculated accurately; it's just sampled once per clock tick, which is usually as slow as 100 Hz and sometimes even slower.
The primary reason for that is that many machines have clocks that are incredibly expensive to read, and you don't want to do this accounting (you'd have to read the clock twice for every system call). You could easily end up spending more time reading clocks than doing anything else in programs that make many system calls.
The counters should never go backwards, though. I saw that happen many years ago, where the total running time of the process was tracked on context switches (which was relatively cheap, and getrusage could calculate utime by using samples for stime and subtracting that from the total running time). The clock used in that case was the wall clock instead of a monotonic clock, and when you changed the time on the machine, the running time of processes could go backwards. But that was, of course, a bug.
How can we know how much load our program puts on the CPU?
I tried to find it using htop, but htop won't give the CPU load. It actually gives the CPU utilization percentage of my program (using its PID).
I am using C programming, Linux environment.
The function you are probably looking for is getrusage. It fills struct rusage. There are two members of the struct you are interested in:
ru_utime - user CPU time used
ru_stime - system CPU time used
You can call the function at regular intervals and, based on the results, estimate the CPU load (e.g. as a percentage) of your own process.
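A rough sketch of that idea, assuming a simple one-second sampling interval (in a real program the sampling would run alongside the actual work, e.g. in a monitoring thread; helper names are just for illustration):

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/resource.h>

    /* Total CPU time (user + system) used by this process so far, in seconds. */
    static double cpu_seconds(void)
    {
        struct rusage ru;
        getrusage(RUSAGE_SELF, &ru);
        return ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6
             + ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
    }

    int main(void)
    {
        const int interval = 1;                 /* sampling interval, seconds */
        double cpu_prev = cpu_seconds();

        for (;;) {
            sleep(interval);
            double cpu_now = cpu_seconds();
            /* CPU time used divided by wall time elapsed, as a percentage. */
            printf("cpu load: %.1f%%\n",
                   100.0 * (cpu_now - cpu_prev) / interval);
            cpu_prev = cpu_now;
        }
        return 0;
    }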
If you want to get it at the system level, then you need to read (and parse) the /proc/stat file (also at regular intervals); see here.
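And a rough sketch of the /proc/stat side, assuming the aggregate "cpu" line layout documented in proc(5) (system load over an interval is taken as 1 minus the idle fraction):

    #include <stdio.h>
    #include <unistd.h>

    /* Read the aggregate "cpu" line from /proc/stat and return total and
     * idle jiffies (idle + iowait). */
    static int read_cpu(unsigned long long *total, unsigned long long *idle)
    {
        unsigned long long v[10] = {0};
        FILE *f = fopen("/proc/stat", "r");
        if (!f)
            return -1;
        int n = fscanf(f, "cpu %llu %llu %llu %llu %llu %llu %llu %llu %llu %llu",
                       &v[0], &v[1], &v[2], &v[3], &v[4],
                       &v[5], &v[6], &v[7], &v[8], &v[9]);
        fclose(f);
        if (n < 4)
            return -1;

        *total = 0;
        for (int i = 0; i < 10; ++i)
            *total += v[i];
        *idle = v[3] + v[4];
        return 0;
    }

    int main(void)
    {
        unsigned long long t0, i0, t1, i1;

        if (read_cpu(&t0, &i0) != 0)
            return 1;
        sleep(1);
        if (read_cpu(&t1, &i1) != 0)
            return 1;

        printf("system cpu load: %.1f%%\n",
               100.0 * (1.0 - (double)(i1 - i0) / (double)(t1 - t0)));
        return 0;
    }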
I'm using a built-in benchmarking module for some quick and dirty tests. It gives me:
CPU time
system CPU time (actually I never get any result for this with the code I'm running)
the sum of the user and system CPU times (always the same as the CPU time in my case)
the elapsed real time
I didn't even know I needed all that information.
I just want to compare two pieces of code and see which one takes longer. I know that one piece of code probably does more garbage collection than the other but I'm not sure how much of an impact it's going to have.
Any ideas which metric I should be looking at?
And, most importantly, could someone explain why the "elapsed real time" is always longer than the CPU time - what causes the lag between the two?
There are many things going on in your system other than running your Ruby code. Elapsed time is the total real time taken and should not be used for benchmarking. You want the system and user CPU times since those are the times that your process actually had the CPU.
An example, if your process:
used the CPU for one second running your code; then
used the CPU for one second running OS kernel code; then
was swapped out for seven seconds while another process ran; then
used the CPU for one more second running your code,
you would have seen:
ten seconds elapsed time,
two seconds user time,
one second system time,
three seconds total CPU time.
The three seconds is what you need to worry about, since the ten depends entirely upon the vagaries of the process scheduling.
A multitasking operating system, stalls while waiting for I/O, and other moments when your code is not actively working.
You don't want to totally discount wall-clock time, though. Time spent waiting without another thread ready to use the CPU cycles may make one piece of code less desirable than another. One piece of code may take somewhat more CPU time but employ multithreading to dominate the other in the real world. It depends on your requirements and specifics. My point is: use all the metrics available to you to make your decision.
Also, as a good practice, if you want to compare two pieces of code you should be running as few extraneous processes as possible.
It may also be the case that the CPU time when your code is executing is not counted.
The extreme example is a real-time system where the timer triggers some activity which is always shorter than a timer tick. Then the CPU time for that activity may never be counted (depending on how the OS does the accounting).