Quoting man 3 ftime:
This function is obsolete. Don't use it. If the time in seconds suffices, time(2) can be used; gettimeofday(2) gives microseconds;
clock_gettime(2) gives nanoseconds but is not as widely available.
Why should it not be used? What are the perils?
I understand that time(2), gettimeofday(2) and clock_gettime(2) can be used instead of ftime(3), but ftime(3) reports the time in milliseconds directly, and this I find convenient, since milliseconds are exactly the precision I need.
Such advice is intended to help you make your program portable and avoid various pitfalls. While the ftime function likely won't be removed from systems that have it, new systems your software gets ported to might not have it, and you may run into problems, e.g. if the system model of time zone evolves to something not conveniently expressible in the format of ftime's structure.
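If milliseconds are all you need, you can get them from the recommended interfaces with a few lines. A minimal sketch, assuming a POSIX system with clock_gettime (older glibc may need -lrt); the helper name millis_since_epoch is mine, not a standard API:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: milliseconds since the Epoch, built on clock_gettime(). */
static int64_t millis_since_epoch(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_REALTIME, &ts) != 0)
        return -1;                       /* caller decides how to handle failure */
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

int main(void)
{
    printf("%" PRId64 "\n", millis_since_epoch());
    return 0;
}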
The time function in the header time.h is defined by POSIX to return a time_t which can, evidently, be a signed int or some kind of floating point number.
http://en.cppreference.com/w/c/chrono/time
The function, however, returns (time_t)(-1) on error.
Under what circumstances can time fail?
Based on the signature, time_t time( time_t *arg ), it seems like the function shouldn't allocate, so that rules out one potential cause of failure.
The time() function is actually defined by ISO, to which POSIX mostly defers except it may place further restrictions on behaviour and/or properties (like an eight-bit byte, for example).
And, since the ISO C standard doesn't specify how time() may fail(a), the list of possibilities is not limited in any way:
One way in which it may fail is in the embedded arena. It's quite possible that your C program may be running on a device with no real-time clock or other clock hardware (even a counter), in which case no time would be available.
Or maybe the function detects bad clock hardware that's constantly jumping all over the place and is therefore unreliable.
Or maybe you're running in a real-time environment where accesses to the clock hardware are time-expensive so, if it detects you're doing it too often, it decides to start failing so your code can do what it's meant to be doing :-)
The possibilities are literally infinite and, of course, I mean 'literally' in a figurative sense rather than a literal one :-)
POSIX itself calls out explicitly that it will fail if it detects the value won't fit into a time_t variable:
The time() function may fail if: [EOVERFLOW] The number of seconds since the Epoch will not fit in an object of type time_t.
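The practical upshot for portable code is simply to check the return value before trusting it. A minimal sketch (assumes a typical POSIX system where time_t is an arithmetic type that converts sensibly to long long):

#include <errno.h>
#include <stdio.h>
#include <time.h>

int main(void)
{
    errno = 0;
    time_t now = time(NULL);
    if (now == (time_t)-1) {
        /* POSIX may set errno to EOVERFLOW; ISO C promises nothing more. */
        perror("time");
        return 1;
    }
    printf("seconds since the Epoch: %lld\n", (long long)now);
    return 0;
}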
And, just on your comment:
Based on the signature, time_t time( time_t *arg ), it seems like the function shouldn't allocate.
You need to be circumspect about this. Anything not mandated by the standards is totally open to interpretation. For example, I can envisage a bizarre implementation that allocates space for an NTP request packet to go out to time.nist.somewhere.org so as to ensure all times are up to date even without an NTP client :-)
(a) In fact, it doesn't even specify how time_t encodes the time, so it's unwise to assume it's a count of seconds; it could just as well be the number of fortnights since the big bang :-) All it requires is that it's a real (arithmetic) type usable by the other time.h functions and that (time_t)(-1) is available to signal failure.
POSIX does state that it represents number of seconds (which ISO doesn't) but places no other restrictions on it.
I can imagine several causes:
the hardware timer isn't available, because the hardware doesn't support it.
the hardware timer just failed (hardware error, timer registers cannot be accessed for some reason)
arg is not null, but points to some illegal location. Instead of crashing, some implementations could detect an illegal pointer (or catch the resulting SEGV) and return an error instead.
as the provided link notes, "Implementations in which time_t is a 32-bit signed integer (many historical implementations) fail in the year 2038." So after 2^31 - 1 seconds since the epoch (1970-01-01), time's return value overflows (unless, that is, the hardware masks the problem by silently wrapping as well).
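If you want to know whether your own platform is exposed to that overflow, a quick check of time_t's width is a reasonable heuristic. A minimal sketch, assuming the usual case of an integer time_t:

#include <stdio.h>
#include <time.h>

int main(void)
{
    /* On typical POSIX systems time_t is a signed integer type;
       a 32-bit time_t overflows in January 2038. */
    if (sizeof(time_t) * 8 <= 32)
        puts("32-bit time_t: susceptible to the year-2038 overflow");
    else
        puts("time_t is wider than 32 bits");
    return 0;
}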
I've got an auxiliary function that does some operations that are pretty costly.
I'm trying to profile the main section of the algorithm, but this auxiliary function gets called a lot within it. Consequently, the measured time takes the auxiliary function's time into account.
To solve this, I decided to save and restore the time so that the auxiliary function appears to be instantaneous. I defined the following macros:
#define TIME_SAVE struct timeval _time_tv; gettimeofday(&_time_tv,NULL);
#define TIME_RESTORE settimeofday(&_time_tv,NULL);
...and used them as the first and last lines of the auxiliary function. For some reason, though, the auxiliary function's overhead is still included!
So, I know this is kind of a messy solution, and so I have since moved on, but I'm still curious as to why this idea didn't work.
Can someone please explain why?
If you insist on profiling this way, do not set the system clock. This will break all sorts of things, if you have permission to do it. Basically you should forget you ever heard of settimeofday. What you want to do is call gettimeofday both before and after the function you want to exclude from measurement, and compute the difference. You can then exclude the time spent in this function from the overall time.
With that said, this whole method of "profiling" is highly flawed, because gettimeofday probably (1) takes a significant amount of time compared to what you're trying to measure, and (2) involves a transition into kernelspace, which will do some serious damage to your program's cache coherency. This second problem, whereby in attempting to observe your program's performance characteristics you actually change them, is the most problematic.
What you really should do is forget about this kind of profiling (gettimeofday or even gcc's -pg/gmon profiling) and instead use oprofile or perf or something similar. These modern profiling techniques work based on statistically sampling the instruction pointer and stack information periodically; your program's own code is not modified at all, so it behaves as closely as possible to how it would behave with no profiler running.
There are a couple of possibilities that may be occurring. One is that Linux tries to keep the clock accurate, so adjustments to the clock may be smoothed out or otherwise 'fixed up' to preserve a consistent sense of time within the system. If you are running NTP, it will also try to maintain a reasonable sense of time.
My approach would have been to not modify the clock but instead track time consumed by each portion of the process. The calls to the expensive part would be accumulated (by getting the difference between gettimeofday on entry and exit, and accumulating) and subtracting that from overall time. There are other possibilities for fancier approaches, I'm sure.
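As a sketch of that accumulate-and-subtract idea (expensive_helper, usec_now and excluded_usec are invented names, and the real work is elided):

#include <stdio.h>
#include <sys/time.h>

static long long excluded_usec;   /* time spent in the helper, in microseconds */

static long long usec_now(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (long long)tv.tv_sec * 1000000 + tv.tv_usec;
}

static void expensive_helper(void)
{
    long long start = usec_now();
    /* ... the costly auxiliary work goes here ... */
    excluded_usec += usec_now() - start;
}

int main(void)
{
    long long start = usec_now();
    /* ... run the main algorithm, which calls expensive_helper() ... */
    expensive_helper();
    long long total = usec_now() - start;
    printf("total: %lld us, excluding helper: %lld us\n",
           total, total - excluded_usec);
    return 0;
}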
As the title basically implies: say we have a complex program and we want to make it faster however we can. Can we somehow detect which loops or other parts of its structure take most of the time, so we can target them for optimization?
edit: Note that, importantly, the software is assumed to be very complex, so we can't check each loop or other structure one by one, putting timers in them, etc.
You're looking for a profiler. There are several around; since you mention gcc you might want to check gprof (part of binutils). There's also Google Perf Tools although I have never used them.
You can use GDB for that, by this method.
Here's a blow-by-blow example of using it to optimize a realistically complex program.
You may find "hotspots" that you can optimize, but more generally the things that give you the greatest opportunity for saving time are mid-level function calls that you can avoid.
One example is, say, calling a function to extract information from a database, where the function is being called multiple times, when with some extra coding the result from a prior call could be used.
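As a sketch of what that "extra coding" might look like, here is a cached wrapper around an expensive lookup; lookup_config_value and get_config_value are invented stand-ins for illustration:

#include <stdbool.h>
#include <stdio.h>

/* Stand-in for the expensive call (e.g. a database query). */
static int lookup_config_value(void)
{
    puts("expensive lookup performed");
    return 42;
}

static int cached_value;
static bool cache_valid;

/* Cheap wrapper: pays the expensive cost only once. */
static int get_config_value(void)
{
    if (!cache_valid) {
        cached_value = lookup_config_value();
        cache_valid = true;
    }
    return cached_value;
}

int main(void)
{
    for (int i = 0; i < 3; i++)
        printf("value: %d\n", get_config_value());
    return 0;
}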
Often such calls are small and innocent-looking, and you're totally surprised to learn how much they're costing, as an overall percent of time.
Another example is doing some low-level I/O that escapes attention, but actually costs a hefty percent of clock time.
Another example is tidal waves of notifications that propagate from seemingly trivial changes to data.
Another good tool for finding these problems is Zoom.
Here's a discussion of the technical issues, but basically what to look for is:
It should tell you inclusive percent of time, at line-level resolution, not just functions.
a) Only knowing that a function is costly still leaves you wondering where the lines are in it that you should look at.
b) Inclusive percent tells the true cost of the line - how much bottom-line time it is responsible for and would not be spent if it were not there.
It should include both I/O (i.e. blocked) time and CPU time, not just CPU time. A tool that only considers CPU time will not see the first two problems mentioned above.
If your program is interactive, the tool should operate only during the time you care about, and not while waiting for user input. You don't want to include head-scratching time in your program's performance statistics.
gprof breaks it down by function. If you have many different loops in one function, it might not tell you which loop is taking the time. This is a clue to refactor ;-)
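For example, if one big function contains several loops, pulling each loop into its own function lets gprof attribute the time separately. A sketch of the idea (names and numbers invented):

#include <stdio.h>

#define N 1000000

/* Each phase is its own function so gprof's flat profile can attribute
   time to it separately. Compile with -pg, and add -fno-inline if
   optimization would otherwise merge the functions back together. */
static void phase_one(double *a, int n)
{
    for (int i = 0; i < n; i++)
        a[i] = a[i] * 1.0001 + 1.0;
}

static void phase_two(double *a, int n)
{
    for (int i = 0; i < n; i++)
        a[i] = a[i] / 1.0001;
}

int main(void)
{
    static double a[N];
    for (int r = 0; r < 100; r++) {
        phase_one(a, N);
        phase_two(a, N);
    }
    printf("%f\n", a[0]);   /* keep the work from being optimized away */
    return 0;
}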
Preferably for an x86-32 gcc implementation.
Considering modern C compilers optimize like crazy, I think you'll find timings to be very situationally dependent. What would be a slow operation in one situation might be optimized away to a faster operation, or the compiler might be able to use a faster 8- or 16-bit version of the same instruction, etc.
It depends on the particular case, but this is likely to vary substantially based on the platform, hardware, operating system, function, and function inputs. A general answer is "no." It also depends on what you mean by "time;" there is execution time and clock time, among other things.
The best way to determine how long something will take is to run it as best you can. If performance is an issue, profiling and perfecting will be your best bet.
Certain real-time systems place constraints on how long operations will take, but this is not specific to C.
I don't think such a thing is really possible when you consider how the time for the same operation varies with its arguments. For example, assuming a hypothetical function costOf did what you wanted, which costs more, memcpy or printf? It depends:
costOf(printf("Hello World")) > costOf(memcpy(a, b, 4))
costOf(printf("Hello World")) < costOf(memcpy(a, b, 4 * 1024 * 1024 * 1024))
IMHO, this is a micro-optimization, which should be disregarded until all profiling has been performed. In general, it is not library routines that consume the execution time, but rather resource access and programmer-created functions.
I also suggest spending more time on a program's quality and robustness rather than worrying about micro-optimizations. With computing power and memory sizes increasing, size and execution times are less of a problem for customers than quality and robustness. A customer is willing to wait for a program that produces correct output (or performs all requirements correctly) and doesn't crash, rather than demanding a fast program that has errors or crashes the system.
To answer your question, as others have stated, the execution times of library functions depend upon the library developer, the platform (hardware) and the operating system. Some platforms can execute floating-point instructions faster than, or in the same time as, integer operations. Some libraries will delegate work to the operating system, while others will package their own. Some functions are slower because they are written to work on a variety of platforms, while the same functions in other libraries can be faster because they are tailored to the specific platform.
Use the library functions that you need and don't worry about their speed. Use tested third-party libraries rather than rolling your own code. If the program is executing very slowly, review the design and profile. Perhaps you can gain more speed by using Data-Oriented Design rather than Object-Oriented Design or procedural programming. Again, concentrate your efforts on developing quality and robust code while learning how to produce software more efficiently.
I'm doing some prototyping work in C, and I want to compare how long a program takes to complete with various small modifications.
I've been using clock; from K&R:
clock returns the processor time used by the program since the beginning of execution, or -1 if unavailable.
This seems sensible to me, and has been giving results which broadly match my expectations. But is there something better to use to see what modifications improve/worsen the efficiency of my code?
Update: I'm interested in both Windows and Linux here; something that works on both would be ideal.
Update 2: I'm less interested in profiling a complex problem than total run time/clock cycles used for a simple program from start to finish—I already know which parts of my program are slow. clock appears to fit this bill, but I don't know how vulnerable it is to, for example, other processes running in the background and chewing up processor time.
Forget the time() functions; what you need is:
Valgrind!
And KCachegrind is the best GUI for examining callgrind profiling stats. In the past I have ported applications to Linux just so I could use these tools for profiling.
For a rough measurement of overall running time, there's time ./myprog.
But for performance measurement, you should be using a profiler. For GCC, there is gprof.
Both of these assume a Unix-ish environment. I'm sure there are similar tools for Windows, but I'm not familiar with them.
Edit: For clarification: I do advise against using any gettime() style functions in your code. Profilers have been developed over decades to do the job you are trying to do with five lines of code, and provide a much more powerful, versatile, valuable, and fool-proof way to find out where your code spends its cycles.
I've found that timing programs, and finding things to optimize, are two different problems, and for both of them I personally prefer low-tech.
For timing, the trick is to make it take long enough by wrapping a loop around it. For example, if you iterate an operation 1000 times and time it with a stopwatch, then seconds become milliseconds when you remove the loop.
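A minimal sketch of that loop trick, using clock() from the question (operation() is just a placeholder for whatever you're measuring):

#include <stdio.h>
#include <time.h>

/* Placeholder for the operation you want to time. */
static double operation(double x)
{
    return x * 1.0000001 + 0.5;
}

int main(void)
{
    const int iterations = 1000;
    double x = 1.0;

    clock_t start = clock();
    for (int i = 0; i < iterations; i++)
        x = operation(x);
    clock_t end = clock();

    double total = (double)(end - start) / CLOCKS_PER_SEC;
    printf("total %.6f s, per iteration %.9f s (x=%f)\n",
           total, total / iterations, x);
    return 0;
}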
For finding things to optimize, there are pieces of code (terminal instructions and function calls) that are responsible for various fractions of the time. During that time, they are exposed on the stack. So you can wrap a loop around the program to make it take long enough, and then take stackshots. The code to optimize will jump out at you.
In POSIX (e.g. on Linux), you can use gettimeofday() to get higher-precision timing values (microseconds).
In Win32, QueryPerformanceCounter() is popular.
Beware of CPU clock-changing effects: if your CPU decides to clock down during the test, results may be skewed.
If you can use POSIX functions, have a look at clock_gettime. A quick Google search turns up examples of how to use it. To measure the processor time taken by your program, pass CLOCK_PROCESS_CPUTIME_ID as the first argument to clock_gettime, if your system supports it. Since clock_gettime uses struct timespec, you can probably get useful nanosecond resolution.
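A minimal sketch of that, assuming the system defines CLOCK_PROCESS_CPUTIME_ID (older glibc may need -lrt); the busy loop is just a stand-in for the work you care about:

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec start, end;

    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start);

    /* ... work to be measured; busy loop as a stand-in ... */
    volatile double x = 0.0;
    for (long i = 0; i < 10000000L; i++)
        x += 1.0;

    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end);

    double cpu = (end.tv_sec - start.tv_sec)
               + (end.tv_nsec - start.tv_nsec) / 1e9;
    printf("CPU time: %.9f s (x=%f)\n", cpu, x);
    return 0;
}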
As others have said, for any serious profiling work, you will need to use a dedicated profiler.