Is it possible to use RRDs with high precision? By high precision I mean e.g. in the range of milliseconds.
If not, are there equally good alternatives to RRD with a C API that work under Linux?
The step size in rrdtool is an integer and thus cannot be less than one second. However, updates can carry a millisecond-precision timestamp and will be handled correctly; it is just that you cannot store samples more often than once per second.
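For example (the file name and value here are made up), an update with a fractional timestamp is accepted, but a second sample within the same one-second step would still be rejected:

rrdtool update speed.rrd 1400000000.500:42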
I am trying to profile a C++ function using gprof; I am interested in the %time taken. I did more than one run and for some reason I got a large difference in the results. I don't know what is causing this. I am assuming it is the sampling rate, or, as I read in other posts, that I/O has something to do with it. So is there a way to make it more accurate and generate reasonably consistent results?
I was thinking of the following:
increase the sampling rate
flush the caches before executing anything
use another profiler, but I want it to generate results in a format similar to gprof's (function time%, function name). I tried Valgrind, but it gave me a massive output file, so maybe I am generating the file with the wrong command.
Waiting for your input
Regards
I recommend printing a copy of the gprof paper and reading it carefully.
According to the paper, here's how gprof measures time. It samples the PC and counts how many samples land in each routine. Multiplying that count by the time between samples gives each routine's total self time.
It also records in a table, by call site, how many times routine A calls routine B, assuming routine B is instrumented by the -pg option. By summing those up, it can tell how many times routine B was called.
Starting from the bottom of the call tree (where total time = self time), it assumes the average time per call of each routine is its total time divided by the number of calls.
Then it works back up to each caller of those routines. The time of each routine is its average self time plus the average number of calls to each subordinate routine times the average time of the subordinate routine.
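For example, if routine B accumulates 10 seconds of self time over 100 calls, gprof assumes 0.1 seconds per call; if routine A made 30 of those calls, A is charged 3 seconds of B's time, whether or not A's particular calls were the expensive ones.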
You can see, even if recursions (cycles in the call graph) are not present, how this is fraught with possibilities for errors, such as assumptions about average times and average numbers of calls, and assumptions about subroutines being instrumented, which the authors point out. If there are recursions, they basically say "forget it".
All of this technology, even if it weren't problematic, raises the question: what is its purpose? Usually, the purpose is "find bottlenecks". According to the paper, it can help people evaluate alternative implementations. That's not finding bottlenecks. They do recommend looking at routines that seem to be called a lot of times, or that have high average times. Certainly routines with low average cumulative time should be ignored, but that doesn't localize the problem very much. And it completely ignores I/O, as if all the I/O that is done were unquestionably necessary.
So, to try to answer your question, try Zoom, for one, and don't expect to eliminate statistical noise in measurements.
gprof is a venerable tool, simple and rugged, but the problems it had in the beginning are still there, and far better tools have come along in the intervening decades.
Here's a list of the issues.
gprof is not very accurate, particularly for small functions; see http://www.cs.utah.edu/dept/old/texinfo/as/gprof.html#SEC11
If this is Linux then I recommend a profiler that doesn't require the code to be instrumented, e.g. Zoom - you can get a free 30 day evaluation license, after that it costs money.
All sampling profilers suffer from statistical inaccuracies - if the error is too large then you need to sample for longer and/or with a smaller sampling interval.
First off, I know there are a lot of similar questions and I have done a lot of digging, so please refrain from immediate hostility (in my experience the people on this site are pretty hostile if they believe a question has already been asked and answered) until you hear me out. If the answer is out there I haven't found it, and I don't want to hijack another person's question.
That being said, I am working in C on a Linux-based microcomputer. I have been using it to track and control motor RPM, which obviously requires good timekeeping. I was originally using calculations with the processor clock to track time on the order of milliseconds, but for a variety of reasons that are probably woefully apparent this was problematic. I then switched over to using time.h and specifically the difftime() function. This was a good solution which allows me to accurately track and control the motor's RPM with little to no issue. However, I now want to plot that data. This again was not overly problematic, except that the plot looks terrible because my time scale cannot go any lower than seconds.
The best solution I could find would be to use sys/time.h and gettimeofday(), which can give the time since the epoch at greater resolution. However, the issue is, as far as I can tell, that there is no difftime()-type function for this that maintains the higher time resolution. Why is this an issue? Because difftime() returns a double value that can easily be used to calculate RPM from a rotary encoder rotation count (rotations/(sec/60)), whereas there doesn't seem to be a way to do this with gettimeofday(), as one uses time_t values and the other uses timeval structs.
So is there a way to accurately return time differences between two times (as determined by real time elapsed since the epoch) with a better resolution than seconds? Or alternatively does anyone know of a better approach to accurately gauging elapsed time to calculate RPM? Thank you.
Convert the result of gettimeofday to a double:
struct timeval now;   /* requires #include <sys/time.h> */
gettimeofday(&now, NULL);
double dsecs = now.tv_sec + (now.tv_usec / 1000000.0);
Then your difftime() equivalent is just a subtraction of two of these dsecs values.
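A minimal sketch along those lines (the function and variable names are just illustrative, not from the original code):

#include <sys/time.h>

/* difftime()-style helper with microsecond resolution. */
static double tv_diff_seconds(const struct timeval *end,
                              const struct timeval *start)
{
    return (double)(end->tv_sec - start->tv_sec)
         + (double)(end->tv_usec - start->tv_usec) / 1000000.0;
}

/* RPM from an encoder rotation count over the elapsed interval. */
static double rpm_from_counts(double rotations,
                              const struct timeval *start,
                              const struct timeval *end)
{
    double elapsed = tv_diff_seconds(end, start);   /* seconds, with sub-second precision */
    return elapsed > 0.0 ? rotations * 60.0 / elapsed : 0.0;
}

Call gettimeofday() once at the start and once at the end of the measurement interval, then pass the two struct timeval values in.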
I'm trying to understand why this code is a security flaw. From my understanding, it is not safe to seed a random number generator with the system time when the program starts, but how is that guessable? I'm not sure I see a possible exploit.
Code example removed due to DMCA Request by D. Wagner
For a start, your code has a bug where it doesn't use the value in time_in_sec at all - it overwrites that in the following line:
seed = time_micro_sec >> 7;
Further, time_micro_sec only has 1000000 possible values, and after right-shifting by 7 that reduces to only 7813 possible values. This space is trivial to search through by brute-force.
Even if you fix these bugs, at the end of the day the srand() / rand() random number generator is just not a cryptographically-strong PRNG. The interface is ultimately limited by the fact that srand() takes an unsigned int argument, which on common platforms limits the entropy in the initial state to just 32 bits. This is insufficient to prevent a brute-force attack on the seed.
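To make the search-space point concrete, here is a minimal sketch of such a brute-force search (the observed values are placeholders, not real data):

#include <stdio.h>
#include <stdlib.h>

/* Try every seed the buggy code could have produced: with
   seed = time_micro_sec >> 7 there are only (1000000 >> 7) + 1 = 7813
   possibilities.  `observed` stands in for rand() outputs an attacker
   has actually seen. */
int main(void)
{
    int observed[3] = { 12345, 67890, 13579 };   /* placeholder observations */

    for (unsigned int seed = 0; seed <= (1000000u >> 7); seed++) {
        srand(seed);
        if (rand() == observed[0] && rand() == observed[1] && rand() == observed[2])
            printf("candidate seed: %u\n", seed);
    }
    return 0;
}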
Do not use srand() / rand() for security-critical random numbers.
If you could predict what the system time is when you run the program (which is not a stretch if you're the one running the program), then you can predict the value fed into the seed and thus determine what the "random" number generator will generate.
Ignoring the outright bug, which I assume is not intentional...
You don't have to predict the exact time for this to be a problem - just roughly when. For any given seed, the random values generated are a completely predictable sequence. So, if you trigger actions repeatedly, you can attempt to deduce which seed matches the pattern you're seeing. If you can narrow down the start time to within a minute, say, that's only 60,000 seeds to search over (assuming milliseconds, though you mix comments about millis, variables named micros, and values which don't appear to be either), which might be simple. A day is only 86,400,000 - larger, but doable.
If I have the power to, e.g. crash your application and force a restart, it becomes MUCH easier to predict your seed, and then further exploit it. If not, if you have local access, finding the start time is straightforward. If it's a remote attack, it may be harder, but, e.g. on an MMO, server restarts might be predictable, your bank may do service updates Sunday from 10PM to 4AM, etc. - all of which leak start times.
Basically: if you want something secure, use something that's designed to be secure in the first place.
I'm making a program which controls other processes (starting and stopping them).
One of the criteria is that the load on the computer is under a certain value.
So I need a function or small program which will cause the load to be very high for testing purposes. Can you think of anything?
Thanks.
I can think of this one:
for(;;);
If you want to actually generate peak load on a CPU, you typically want a modest-size, trivially parallelizable task (so that the working set fits entirely in cache) that someone has hand-optimized to use the vector unit of the processor. Common choices are things like FFTs, matrix multiplications, and basic operations on mathematical vectors.
These almost always generate much higher power and compute load than do more number-theoretic tasks like primality testing because they are essentially branch-free (other than loops), and are extremely homogeneous, so they can be designed to use the full compute bandwidth of a machine essentially all the time.
The exact function that should be used to generate a true maximum load varies quite a bit with the exact details of the processor micro-architecture (different machines have different load/store bandwidth in proportion to the number and width of multiply and add functional units), but numerical software libraries (and signal processing libraries) are great things to start with. See if any that have been hand tuned for your platform are available.
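As a rough illustration of the shape of such a kernel (naive C, deliberately not hand-tuned, so it will not reach true peak load):

#include <stdio.h>

#define N 64   /* small enough that all three matrices stay in cache */

/* Repeated small matrix multiplies to keep the floating-point units busy.
   A real peak-load test would call a hand-tuned BLAS or FFT routine for
   the target platform instead. */
static float burn_flops(int repeats)
{
    static float a[N][N], b[N][N], c[N][N];

    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            a[i][j] = (float)(i + j);
            b[i][j] = (float)(i - j);
        }

    for (int r = 0; r < repeats; r++)
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++) {
                float sum = 0.0f;
                for (int k = 0; k < N; k++)
                    sum += a[i][k] * b[k][j];
                c[i][j] = sum;
            }

    return c[0][0];   /* return a value so the work is not optimized away */
}

int main(void)
{
    printf("%f\n", burn_flops(100000));   /* tune the repeat count to taste */
    return 0;
}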
If you need to control how long the CPU burst will be, you can use something like the Sieve of Eratosthenes (an algorithm to find all primes up to a certain number) and supply a smallish integer (10000) for short bursts, and a big integer (100000000) for long bursts.
If you want to take I/O into account in the load, you can write to a file for each test in the sieve.
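A minimal sketch of that sieve approach (the function name and command-line handling are mine, not from the answer above):

#include <stdio.h>
#include <stdlib.h>

/* Burn CPU for a roughly controllable time by sieving primes up to `limit`.
   Use a small limit (e.g. 10000) for short bursts and a large one
   (e.g. 100000000) for long bursts. */
static long burn_cpu(long limit)
{
    char *composite = calloc(limit + 1, 1);
    long primes = 0;

    if (composite == NULL)
        return -1;

    for (long i = 2; i <= limit; i++) {
        if (composite[i])
            continue;
        primes++;                                /* i is prime */
        for (long j = 2 * i; j <= limit; j += i)
            composite[j] = 1;
    }

    free(composite);
    return primes;
}

int main(int argc, char **argv)
{
    long limit = argc > 1 ? atol(argv[1]) : 10000;
    printf("%ld primes up to %ld\n", burn_cpu(limit), limit);
    return 0;
}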
Does anybody know of a C code profiler like gprof which gives function call times in microseconds instead of milliseconds?
Take a look at Linux perf. You will need a pretty recent kernel though.
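A typical workflow looks something like this (the binary name is just a placeholder):

perf record -g ./myprog
perf report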
Let me just suggest how I would handle this, assuming you have the source code.
Knowing how long a function takes inclusively per invocation (including I/O), on average, multiplied by the number of invocations, divided by the total running time, would give you the fraction of time under the control of that function. That fraction is how you know if the function is a sufficient time-taker to bother optimizing. That is not easy information to get from gprof.
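In other words: fraction of time = (average inclusive time per call × number of calls) / total running time.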
Another way to learn what fraction of inclusive time is spent under the control of each function is timed or random sampling of the call stack. If a function appears on a fraction X of the samples (even if it appears more than once in a sample), then X is the time-fraction it takes (within a margin of error). What's more, this gives you per-line fraction of time, not just per-function.
That fraction X is the most valuable information you can get, because that is the total amount of time you could potentially save by optimizing that function or line of code.
The Zoom profiler is a good tool for getting this information.
What I would do is wrap a long-running loop around the top-level code, so that it executes repeatedly, long enough to take at least several seconds. Then I would manually sample the stack by interrupting or pausing it at random. It actually takes very few samples, like 10 or 20, to get a really clear picture of the most time-consuming functions and/or lines of code.
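One low-tech way to take those samples, assuming the program was built with debug info, is to run it under gdb and interrupt it by hand:

gdb ./myprog
(gdb) run
^C                # interrupt at a random moment
(gdb) bt          # record the call stack
(gdb) continue    # then repeat 10-20 times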
Here's an example.
P.S. If you're worried about statistical accuracy, let me get quantitative. If a function or line of code is on the stack exactly 50% of the time, and you take 10 samples, then the number of samples that show it will be 5 +/- 1.6, for a margin of error of 16%. If the actual time is smaller or larger, the margin of error shrinks. You can also reduce the margin of error by taking more samples. To get 1.6%, take 1000 samples. Actually, once you've found the problem, it's up to you to decide if you need a smaller margin of error.
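(The 1.6 is the binomial standard deviation sqrt(n*p*(1-p)): with n = 10 samples and p = 0.5 that is sqrt(2.5) ≈ 1.58; with n = 1000 it is about 15.8 samples, i.e. roughly 1.6% of the total.)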
gprof gives results either in milliseconds or in microseconds. I do not know the exact rationale, but my experience is that it will display results in microseconds when it thinks there is enough precision for them. To get microsecond output, you need to run the program for a longer time and/or not have any single routine that takes too much of the run time.
oprofile gets you times at clock resolution, i.e. nanoseconds. It produces output files compatible with gprof, so it is very convenient to use.
http://oprofile.sourceforge.net/news/
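If I recall the workflow correctly (check the oprofile documentation for your version), something like this produces a gmon.out that gprof can read:

operf ./myprog
opgprof ./myprog    # writes a gprof-compatible gmon.out
gprof ./myprog gmon.out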