get process start time in kernel module solaris - c

I am trying to get the process start time in a kernel module.
I get the proc struct pointer, and from it I read the p_mstart field:
typedef struct proc {
    ...
    /*
     * Microstate accounting, resource usage, and real-time profiling
     */
    hrtime_t p_mstart;   /* hi-res process start time */
This returns the number 1976026375725303:
struct proc *iterated_process_ptr = curproc;
LOG("***KERNEL***: PID=%d, StartTime=%lld", iterated_process_ptr->p_pidp->pid_id, iterated_process_ptr->p_mstart);
What is this number?
In the documentation, Solaris says:
The gethrtime() function returns the current high-resolution real time. Time is expressed as nanoseconds since some arbitrary time in the past.
And in the book Solaris Internals they write:
Within the process, the operating system maintains a high-resolution timestamp that marks process start and terminate times. The p_mstart field, the process start time, is set in the kernel fork() code when the process is created... It is a 64-bit value expressed in nanoseconds.
The number 1976026375725303 does not make sense at all.
If I divide by 1,000,000,000 and then by 3600 to get hours, I get about 549 hours, roughly 23 days, but my uptime is 5 days.

Based on an answer received at the Google group comp.unix.solaris:
Instead of going to proc->p_mstart, I need to take
iterated_process_ptr->p_user.u_start
This brings me the same struct (timestruc_t) that userspace sees in psinfo:
typedef struct psinfo {
    ...
    timestruc_t pr_start;   /* process start time, from the epoch */
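For illustration, a minimal sketch of reading that field from a kernel module (assuming a context where curproc is valid; cmn_err() is just one way to log it):

#include <sys/proc.h>
#include <sys/thread.h>
#include <sys/user.h>
#include <sys/cmn_err.h>

/* Sketch: log the start time of the current process from u_start,
 * which is a timestruc_t (seconds + nanoseconds since the epoch). */
static void log_curproc_start_time(void)
{
    struct proc *p = curproc;
    timestruc_t start = p->p_user.u_start;

    cmn_err(CE_NOTE, "PID=%d started at %ld.%09ld (seconds since the epoch)",
        (int)p->p_pidp->pid_id, (long)start.tv_sec, (long)start.tv_nsec);
}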

The number 1976026375725303 does not make sense at all.
Yes it does. Per the very documentation that you quoted:
Time is expressed as nanoseconds since some arbitrary time in the
past.
Thus, the value can be used to calculate how long ago the process started:
hrtime_t howLongAgo = gethrtime() - p->p_mstart;
That produces a value in nanoseconds for how long ago the process started.
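As a minimal sketch (assuming a kernel context and the NANOSEC constant from <sys/time.h>), converting that to whole seconds might look like:

hrtime_t elapsed_ns  = gethrtime() - iterated_process_ptr->p_mstart;
hrtime_t elapsed_sec = elapsed_ns / NANOSEC;   /* NANOSEC == 1000000000 */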
And note that the value produced is accurate, whereas the value from iterated_process_ptr->p_user.u_start is subject to system clock changes, so you can't say, "This process has been running for 3 hours, 15 minutes, and 3 seconds" unless you also know the system clock hasn't been reset or modified in any way.
Per the Solaris 11 gethrtime(9F) man page:
Description
The gethrtime() function returns the current high-resolution real
time. Time is expressed as nanoseconds since some arbitrary time in
the past; it is not correlated in any way to the time of day, and
thus is not subject to resetting or drifting by way of adjtime(2) or
settimeofday(3C). The hi-res timer is ideally suited to performance
measurement tasks, where cheap, accurate interval timing is required.
Return Values
gethrtime() always returns the current high-resolution real time.
There are no error conditions.
...
Notes
Although the units of hi-res time are always the same (nanoseconds),
the actual resolution is hardware dependent. Hi-res time is guaranteed
to be monotonic (it does not go backward, it does not periodically
wrap) and linear (it does not occasionally speed up or slow down for
adjustment, as the time of day can), but not necessarily unique: two
sufficiently proximate calls might return the same value.
The time base used for this function is the same as that for
gethrtime(3C). Values returned by both of these functions can be
interleaved for comparison purposes.

Related

Is it acceptable to measure time elapsed by iteratively blocking a thread for a fixed period and then multiply said period by the loop count?

I work for a company that produces automatic machines, and I help maintain the software that controls them. The software runs on a real-time operating system and consists of multiple threads running concurrently. The code bases are legacy and carry substantial technical debt. Among all the issues the code bases exhibit, one stands out as rather bizarre to me: most of the timing algorithms that compute elapsed time to implement common timed features such as timeouts, delays, and recording time spent in a particular state basically take the following form:
unsigned int shouldContinue = 1;
unsigned int blockDuration = 1; // Let's say 1 millisecond.
unsigned int loopCount = 0;
unsigned int elapsedTime = 0;

while (shouldContinue)
{
    // ... a bunch of statements, selections and function calls ...

    blockingSystemCall(blockDuration);

    // ... a bunch of statements, selections and function calls ...

    loopCount++;
    elapsedTime = loopCount * blockDuration;
}
The blockingSystemCall function can be any operating system's API that suspends the current thread for the specified blockDuration. The elapsedTime variable is subsequently computed by basically multiplying loopCount by blockDuration or by any equivalent algorithm.
To me, this kind of timing algorithm is wrong, and is not acceptable under most circumstances. All the instructions in the loop, including the condition of the loop, are executed sequentially, and each instruction requires measurable CPU time to execute. Therefore, the actual time elapsed is strictly greater than the value of elapsedTime in any given instance after the loop starts. Consequently, suppose the CPU time required to execute all the statements in the loop, denoted by d, is constant. Then, elapsedTime lags behind the actual time elapsed by loopCount • d for any loopCount > 0; that is, the deviation grows according to an arithmetic progression. This sets the lower bound of the deviation because, in reality, there will be additional delays caused by thread scheduling and time slicing, depending on other factors.
In fact, not too long ago, while testing a new data-driven predictive maintenance feature which relies on the operation time of a machine, we discovered that the operation time reported by the software lagged behind that of a standard reference clock by a whopping three hours after the machine was in continuous operation for just over two days. It was through this test that I discovered the algorithm outlined above, which I swiftly determined to be the root cause.
Coming from a background where I used to implement timing algorithms on bare-metal systems using timer interrupts, which allows the CPU to carry on with the execution of the business logic while the timer process runs in parallel, it was shocking for me to have discovered that the algorithm outlined in the introduction is used in the industry to compute elapsed time, even more so when a typical operating system already encapsulates the timer functions in the form of various easy-to-use public APIs, liberating the programmer from the hassle of configuring a timer via hardware registers, raising events via interrupt service routines, etc.
The kind of timing algorithm illustrated in the skeleton code above is found in at least two code bases independently developed by two distinct software engineering teams from two subsidiary companies located in two different cities, albeit within the same state. This makes me wonder whether this is how things are normally done in the industry, or whether it is just an isolated case that is not widespread.
So, the question is, is the algorithm shown above common or acceptable in calculating elapsed time, given that the underlying operating system already provides highly optimized time-management system calls that can be used right out of the box to accurately measure elapsed time or even used as basic building blocks for creating higher-level timing facilities that provide more intuitive methods similar to, e.g., the Timer class in C#?
You're right that calculating elapsed time that way is inaccurate -- since it assumes that the blocking call will take exactly the amount of time indicated, and that everything that happens outside of the blocking system call will take no time at all, which would only be true on an infinitely-fast machine. Since actual machines are not infinitely fast, the elapsed-time calculated this way will always be somewhat less than the actual elapsed time.
As to whether that's acceptable, it's going to depend on how much timing accuracy your program needs. If it's just doing a rough estimate to make sure a function doesn't run for "too long", this might be okay. OTOH if it is trying for accuracy (and in particular accuracy over a long period of time), then this approach won't provide that.
FWIW the more common (and more accurate) way to measure elapsed time would be something like this:
const unsigned int startTime = current_clock_time();
while (shouldContinue)
{
    loopCount++;
    elapsedTime = current_clock_time() - startTime;
}
This has the advantage of not "drifting away" from the accurate value over time, but it does assume that you have a current_clock_time() type of function available, and that it's acceptable to call it within the loop. (If current_clock_time() is very expensive, or doesn't provide some real-time performance guarantees that the calling routine requires, that might be a reason not to do it this way)
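For example, on a POSIX system current_clock_time() could be a thin wrapper over clock_gettime() with a monotonic clock; a sketch, assuming CLOCK_MONOTONIC is available and that calling it inside the loop is acceptable:

#include <stdint.h>
#include <time.h>

/* Returns milliseconds from a monotonic clock as a 64-bit value
 * (64 bits to avoid wrap-around on long runs). */
static uint64_t current_clock_time(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000u + (uint64_t)ts.tv_nsec / 1000000u;
}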
I don't think these loops do what you think they do.
In a RTOS, the purpose of a loop like this is usually to perform a task at regular intervals.
blockingSystemCall(N) probably does not just sleep for N milliseconds like you think it does. It probably sleeps until N milliseconds after the last time your thread woke up.
More accurately, all the sleeps your thread has performed since starting are added to the thread start time to get the time at which the OS will try to wake the thread up. If your thread woke up due to an I/O event, then the last one of those times could be used instead of the thread start time. The point is that the inaccuracies in all these start times are corrected, so your thread wakes up at regular intervals and the elapsed time measurement is perfectly accurate according to the RTOS master clock.
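A rough POSIX sketch of that behaviour (not the original RTOS API) is to sleep to an absolute deadline, so the wake-up times do not drift:

#include <time.h>

void periodic_loop(void)
{
    struct timespec next;
    clock_gettime(CLOCK_MONOTONIC, &next);          /* first deadline = now */
    for (;;) {
        next.tv_nsec += 1000000L;                   /* next deadline += 1 ms */
        if (next.tv_nsec >= 1000000000L) {
            next.tv_nsec -= 1000000000L;
            next.tv_sec += 1;
        }
        /* sleep until the absolute deadline, not for a relative duration */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        /* ... do the periodic work here ... */
    }
}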
There could also be very good reasons for measuring elapsed time by the RTOS master clock instead of a more accurate wall clock time, in addition to simplicity. This is because all of the guarantees that an RTOS provides (which is the reason you are using a RTOS in the first place) are provided in that time scale. The amount of time taken by one task can affect the amount of time you are guaranteed to have available for other tasks, as measured by this clock.
It may or may not be a problem that your RTOS master clock runs slow by 3 hours every 2 days...

Getting the real, user, sys time of functions within a program

I'm looking for assistance on getting the real, user, and sys times of functions within my C program, for instance how long it took to read in a file. I have been looking at #include <time.h> and using the time() function with the -p flag, but I am striking out on the execution of this. I guess my question is, if I have:
time_t total_time;
time_t start, end;
start = time(-p, &start);
<some code>
end = time(-p, &end);
printf("real = %e, user = %S, sys = %S\n", ???????????);
I understand the differences between the three times, just don't know the proper execution of getting the results.
Check time()'s return value:
The value returned generally represents the number of seconds since 00:00 hours, Jan 1, 1970 UTC (i.e., the current unix timestamp). Although libraries may use a different representation of time: Portable programs should not use the value returned by this function directly, but always rely on calls to other elements of the standard library to translate them to portable types (such as localtime, gmtime or difftime).
To display the real time, read this answer.
For user and system, read: How to measure user/system cpu time for a piece of program?
The real elapsed time can be determined by calling time() before and after, and subtracting the results. If you want to be portable, you should use the difftime function to perform the subtraction, because the return value from time() is not guaranteed to be a number.
If you are using a POSIX compliant system, however, the return value from time() will be the number of seconds since the epoch, so you could just subtract them to get the elapsed seconds.
If you wanted a higher-resolution result, you could use gettimeofday(), which returns a timeval struct containing microsecond-resolution time.
The most portable way to access the user & system times would be to call the clock() C library function, however you have noted in your comments to #gsamaras that your system does not have this function.
Depending on the system you are working with, you may be able to call the system functions times() or getrusage(). getrusage() would be easier to use, because it returns the values as timeval structs, which contain seconds and microseconds. times() returns clock ticks, which you would have to convert if you wanted actual time units.
If your program is threaded, and you are using Linux, getrusage() offers an additional advantage: you can get the resource consumption for the current thread.
Whichever function you choose the process would be to obtain the initial reading, the final reading, and subtract the two results to see the time consumed by your function.
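As an illustrative sketch of that process on a POSIX system, using gettimeofday() for real time and getrusage() for user/system time (do_work() is a placeholder for the code being timed):

#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

static void do_work(void) { /* the function being timed goes here */ }

/* Convert a struct timeval to seconds as a double. */
static double tv_sec(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

int main(void)
{
    struct timeval r0, r1;
    struct rusage  u0, u1;

    gettimeofday(&r0, NULL);
    getrusage(RUSAGE_SELF, &u0);

    do_work();

    gettimeofday(&r1, NULL);
    getrusage(RUSAGE_SELF, &u1);

    printf("real = %f, user = %f, sys = %f\n",
           tv_sec(r1) - tv_sec(r0),
           tv_sec(u1.ru_utime) - tv_sec(u0.ru_utime),
           tv_sec(u1.ru_stime) - tv_sec(u0.ru_stime));
    return 0;
}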

Precise Linux Timing - What Determines the Resolution of clock_gettime()?

I need to do precision timing to the 1 us level to time a change in duty cycle of a pwm wave.
Background
I am using a Gumstix Over Water COM (https://www.gumstix.com/store/app.php/products/265/) that has a single-core ARM Cortex-A8 processor running at 499.92 BogoMIPS (the Gumstix page claims up to 1 GHz, with 800 MHz recommended) according to /proc/cpuinfo. The OS is an Angstrom image version of Linux based on kernel version 2.6.34, and it is stock on the Gumstix Water COM.
The Problem
I have done a fair amount of reading about precise timing in Linux (and have tried most of it), and the consensus seems to be that using clock_gettime() and referencing CLOCK_MONOTONIC is the best way to do it. (I would have liked to use the RDTSC register for timing, since I have one core with minimal power-saving abilities, but this is not an Intel processor.) So here is the odd part: while clock_getres() returns 1, suggesting a resolution of 1 ns, actual timing tests suggest a minimum resolution of 30517 ns, or (it can't be a coincidence) exactly the period of a 32.768 kHz clock tick. Here's what I mean:
// Stackoverflow example
#include <stdio.h>
#include <time.h>

#define SEC2NANOSEC 1000000000

int main( int argc, const char* argv[] )
{
    // //////////////// Min resolution test //////////////////////
    struct timespec resStart, resEnd, ts;
    ts.tv_sec = 0;   // s
    ts.tv_nsec = 1;  // ns
    int iters = 100;
    double resTime, sum = 0;
    int i;
    for (i = 0; i < iters; i++)
    {
        clock_gettime(CLOCK_MONOTONIC, &resStart);      // start timer
        // clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, &ts);
        clock_gettime(CLOCK_MONOTONIC, &resEnd);        // end timer
        resTime = ((double)resEnd.tv_sec*SEC2NANOSEC + (double)resEnd.tv_nsec)
                - ((double)resStart.tv_sec*SEC2NANOSEC + (double)resStart.tv_nsec);
        sum = sum + resTime;
        printf("resTime = %f\n", resTime);
    }
    printf("Average = %f\n", sum/(double)iters);
    return 0;
}
(Don't fret over the double casting; tv_sec is a time_t and tv_nsec is a long.)
Compile with:
gcc soExample.c -o runSOExample -lrt
Run with:
./runSOExample
With the nanosleep commented out as shown, the result is either 0 ns or 30517 ns, with the majority being 0 ns. This leads me to believe that CLOCK_MONOTONIC is updated at 32.768 kHz, and most of the time the clock has not been updated before the second clock_gettime() call is made; in the cases where the result is 30517 ns, the clock has been updated between calls.
When I do the same thing on my development computer (AMD FX(tm)-6100 Six-Core Processor running at 1.4 GHz) the minimum delay is a more constant 149-151ns with no zeros.
So, let's compare those results to the CPU speeds. For the Gumstix, that 30517ns (32.768kHz) equates to 15298 cycles of the 499.93MHz cpu. For my dev computer that 150ns equates to 210 cycles of the 1.4Ghz CPU.
With the clock_nanosleep() call uncommented the average results are these:
Gumstix: Avg value = 213623 and the result varies, up and down, by multiples of that min resolution of 30517ns
Dev computer: 57710-68065 ns with no clear trend. In the case of the dev computer I expect the resolution to actually be at the 1 ns level and the measured ~150ns truly is the time elapsed between the two clock_gettime() calls.
So, my question's are these:
What determines that minimum resolution?
Why is the resolution of the dev computer 30000X better than the Gumstix when the processor is only running ~2.6X faster?
Is there a way to change how often CLOCK_MONOTONIC is updated and where? In the kernel?
Thanks! If you need more info or clarification just ask.
As I understand it, the difference between the two environments (the Gumstix and your dev computer) might be the underlying timer hardware they are using.
Commented nanosleep() case:
You are using clock_gettime() twice. To give you a rough idea of what this clock_gettime() ultimately gets mapped to in the kernel:
clock_gettime --> clock_get() --> posix_ktime_get_ts --> ktime_get_ts() --> timekeeping_get_ns() --> clock->read()
clock->read() basically reads the value of the counter provided by the underlying timer driver and corresponding hardware. Taking the difference between a stored counter value from the past and the current counter value, and then doing the nanosecond conversion mathematics, yields the nanoseconds elapsed and updates the timekeeping data structures in the kernel.
For example, if you have an HPET timer which gives you a 10 MHz clock, the hardware counter gets updated at 100 ns intervals.
Let's say, on the first clock->read(), you get a counter value of X.
The Linux timekeeping data structures read this value X, compute the difference 'D' compared to some old stored counter value, do the counter-difference 'D' to nanoseconds 'n' conversion mathematics, update the data structures by 'n', and yield this new time value to the user space.
When the second clock->read() is issued, it again reads the counter and updates the time.
Now, for an HPET timer, this counter is getting updated every 100 ns, and hence you will see this difference being reported to the user space.
Now, let's replace this HPET timer with a slow 32.768 kHz clock. Now clock->read()'s counter is updated only every 30517 ns, so if your second call to clock_gettime() comes before this period has elapsed, you will get 0 (which is the majority of cases), and in some cases your second call will land after the counter has incremented by 1, i.e. 30517 ns have elapsed. Hence the value of 30517 ns sometimes.
Uncommented nanosleep() case:
Let's trace clock_nanosleep() for monotonic clocks:
clock_nanosleep() --> nsleep --> common_nsleep() --> hrtimer_nanosleep() --> do_nanosleep()
do_nanosleep() will simply put the current task into the INTERRUPTIBLE state, wait for the timer to expire (which is 1 ns here), and then set the current task back to the RUNNING state. You see, there are a lot of factors involved now, mainly when your kernel thread (and hence the user space process) will be scheduled again. Depending on your OS, you will always face some latency when doing a context switch, and this is what we observe in the average values.
Now, your questions:
What determines that minimum resolution?
I think the resolution/precision of your system will depend on the underlying timer hardware being used (assuming your OS is able to provide that precision to the user space process).
Why is the resolution of the dev computer 30000X better than the Gumstix when the processor is only running ~2.6X faster?
Sorry, I missed you here. How is it 30000x faster? To me, it looks like something 200x faster (30517 ns / 150 ns ~ 200X?). But anyway, as I understand it, CPU speed may or may not have anything to do with the timer resolution/precision. So this assumption may be right on some architectures (when you are using the TSC hardware), but might fail on others (using HPET, PIT, etc.).
Is there a way to change how often CLOCK_MONOTONIC is updated and where? In the kernel?
You can always look into the kernel code for the details (that's how I looked into it).
In the Linux kernel code, look for these source files and Documentation:
kernel/posix-timers.c
kernel/hrtimer.c
Documentation/timers/hrtimers.txt
I do not have a Gumstix on hand, but it looks like your clocksource is slow.
run:
$ dmesg | grep clocksource
If you get back
[ 0.560455] Switching to clocksource 32k_counter
This might explain why your clock is so slow.
In recent kernels there is a directory /sys/devices/system/clocksource/clocksource0 with two files: available_clocksource and current_clocksource. If you have this directory, try switching to a different source by echoing its name into the second file.
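For example (the available source names are hardware dependent, so treat these commands as an illustration; writing current_clocksource requires root):
$ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
$ echo <name_of_faster_source> > /sys/devices/system/clocksource/clocksource0/current_clocksource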

How to measure user/system cpu time for a piece of program?

In section 3.9 of the classic APUE (Advanced Programming in the UNIX Environment), the author measured the user/system time consumed by his sample program (an I/O read/write program) run against varying buffer sizes.
The result table looks roughly like this (all times are in seconds):
BUFF_SIZE   USER_CPU   SYSTEM_CPU   CLOCK_TIME   LOOPS
1           124.89     161.65       288.64       103316352
...
512         0.27       0.41         7.03         201789
...
I'm really curious about how to measure the USER/SYSTEM CPU time for a piece of a program.
And in this example, what does CLOCK_TIME mean, and how do I measure it?
Obviously it isn't simply the sum of user CPU time and system CPU time.
You could easily measure the running time of a program using the time command under *nix:
$ time myprog
real 0m2.792s
user 0m0.099s
sys 0m0.200s
The real or CLOCK_TIME refers to the wall-clock time, i.e. the time taken from the start of the program to its finish, and it includes the time slices taken by other processes when the kernel context-switches them. It also includes any time the process is blocked (on I/O events, etc.).
The user or USER_CPU refers to the CPU time spent in the user space, i.e. outside the kernel. Unlike the real time, it refers to only the CPU cycles taken by the particular process.
The sys or SYSTEM_CPU refers to the CPU time spent in the kernel space, (as part of system calls). Again this is only counting the CPU cycles spent in kernel space on behalf of the process and not any time it is blocked.
In the time utility, user and sys are calculated from either the times() or wait() system calls. real is usually calculated from the difference between two timestamps gathered with the gettimeofday() system call at the start and end of the program.
One more thing you might want to know is real != user + sys. On a multicore system the user or sys or their sum can quite easily exceed the real time.
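As a minimal sketch of doing the same thing inside your own program with times() (POSIX; do_work() is a placeholder for the code being measured):

#include <stdio.h>
#include <sys/times.h>
#include <unistd.h>

static void do_work(void) { /* the code being measured goes here */ }

int main(void)
{
    long ticks_per_sec = sysconf(_SC_CLK_TCK);   /* clock ticks per second */
    struct tms t0, t1;

    clock_t w0 = times(&t0);   /* also returns elapsed real time in ticks */
    do_work();
    clock_t w1 = times(&t1);

    printf("real = %f, user = %f, sys = %f\n",
           (double)(w1 - w0) / ticks_per_sec,
           (double)(t1.tms_utime - t0.tms_utime) / ticks_per_sec,
           (double)(t1.tms_stime - t0.tms_stime) / ticks_per_sec);
    return 0;
}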
Partial answer:
Well, CLOCK_TIME is the same as the time shown by a clock: the time passed in the so-called "real world".
One way to measure it is to use the gettimeofday POSIX function, which stores the time into the caller's struct timeval, containing a UNIX seconds field and a microseconds field (the actual accuracy is often less). Example of using it in typical benchmark code (ignoring errors, etc.):
struct timeval tv1, tv2;
gettimeofday(&tv1, NULL);
do_operation_to_measure();
gettimeofday(&tv2, NULL);
// get difference, fix value if microseconds became negative
struct timeval tvdiff = { tv2.tv_sec - tv1.tv_sec, tv2.tv_usec - tv1.tv_usec };
if (tvdiff.tv_usec < 0) { tvdiff.tv_usec += 1000000; tvdiff.tv_sec -= 1; }
// print it
printf("Elapsed time: %ld.%06ld\n", tvdiff.tv_sec, tvdiff.tv_usec);

measuring time of multi-threading program

I measured time with the clock() function, but it gave bad results: it gives the same result for the program with one thread and for the same program running with OpenMP with many threads. But in fact, I can see from my watch that the program with many threads finishes faster.
So I need some wall-clock timer...
My question is: which function is better for this?
clock_gettime(), or maybe gettimeofday()? Or maybe something else?
If clock_gettime(), then with which clock: CLOCK_REALTIME or CLOCK_MONOTONIC?
I'm using Mac OS X (Snow Leopard).
If you want wall-clock time, and clock_gettime() is available, it's a good choice. Use it with CLOCK_MONOTONIC if you're measuring intervals of time, and CLOCK_REALTIME to get the actual time of day.
CLOCK_REALTIME gives you the actual time of day, but is affected by adjustments to the system time -- so if the system time is adjusted while your program runs that will mess up measurements of intervals using it.
CLOCK_MONOTONIC doesn't give you the correct time of day, but it does count at the same rate and is immune to changes to the system time -- so it's ideal for measuring intervals, but useless when correct time of day is needed for display or for timestamps.
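If clock_gettime() is available on your platform, a minimal seconds-as-double wrapper could look like this (a sketch, assuming CLOCK_MONOTONIC support):

#include <time.h>

/* Returns monotonic time in seconds as a double, for interval measurement. */
static double monotonic_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}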
I think clock() counts the total CPU usage among all threads; I had this problem too...
The choice of wall-clock timing method is personal preference. I use an inline wrapper function to take time-stamps (take the difference of 2 time-stamps to time your processing). I've used floating point for convenience (units are in seconds, don't have to worry about integer overflow). With multi-threading, there are so many asynchronous events that in my opinion it doesn't make sense to time below 1 microsecond. This has worked very well for me so far :)
Whatever you choose, a wrapper is the easiest way to experiment
#include <sys/time.h>   /* for gettimeofday() and struct timeval */

inline double my_clock(void) {
    struct timeval t;
    gettimeofday(&t, NULL);
    return (1.0e-6*t.tv_usec + t.tv_sec);
}
usage:
double start_time, end_time;
start_time = my_clock();
//some multi-threaded processing
end_time = my_clock();
printf("time is %lf\n", end_time-start_time);
