In a Linux/GNU/C environment, does a running thread have any visibility into whether it has been put to sleep? For example, say you have a function like
void foo() {
startClock();
bar();
endClock();
}
But you're only concerned with the running time of the code itself. I.e. you don't care about any time related to the thread being suspended during the run. Ideally you'd be able to leverage some system call or library, like countThreadSwitches(), such that:
void foo() {
int lastCount = countThreadSwitches();
startClock();
bar();
endClock();
if (countThreadSwitches() != lastCount)
discardClock();
}
Being able to tell whether the thread has been switched out between two statements would allow us to measure only the runs unaffected by context switches.
So, is there anything like that hypothetical countThreadSwitches() call? Or is that information opaque to the thread itself?
On Linux, int getrusage(int who, struct rusage *usage); can be used to fill a struct containing timeval ru_utime (user CPU time used) and timeval ru_stime (system CPU time used), for a process or (with RUSAGE_THREAD) for a thread.
These values, along with the system clock, will let you know how much CPU time your process/thread actually spent running compared to how much time was not spent running your process/thread.
For example, something like (ru_utime + ru_stime) / (clock_endtime - clock_starttime) * 100, with the CPU times taken as the change over the same interval, will give you CPU usage as a percentage of the time elapsed between start and end.
There are also some stats in there for number of context switches under certain circumstances, but that info isn't very useful.
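As a rough sketch of what the hypothetical countThreadSwitches() from the question might look like: struct rusage has the fields ru_nvcsw (voluntary) and ru_nivcsw (involuntary) context switches, and RUSAGE_THREAD restricts the numbers to the calling thread. The helper names below are made up for illustration, and how useful the counts are for a given measurement is something you would have to check on your own system.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE           /* RUSAGE_THREAD is a Linux extension */
#endif
#include <sys/resource.h>

#ifndef RUSAGE_THREAD
#define RUSAGE_THREAD 1       /* Linux value, in case the feature macro came too late */
#endif

/* Hypothetical helper: total context switches (voluntary + involuntary)
 * charged to the calling thread so far, or -1 on error. */
long count_thread_switches(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_THREAD, &ru) != 0)
        return -1;
    return ru.ru_nvcsw + ru.ru_nivcsw;
}

/* Hypothetical helper: CPU time (user + system) used by the calling
 * thread, in seconds, or a negative value on error. */
double thread_cpu_seconds(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_THREAD, &ru) != 0)
        return -1.0;
    return (double)ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6
         + (double)ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
}
```

You would call count_thread_switches() before startClock() and again after endClock(), and discard the sample if the two values differ. Keep in mind that each call is itself a system call, so it adds some overhead of its own.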
On Linux you can read and parse the nonvoluntary_ctxt_switches: line from /proc/self/status (probably best to just do a single 4096-byte read() before and after, then parse them both afterwards).
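A minimal sketch of that approach, assuming the usual "nonvoluntary_ctxt_switches:" line format and that the whole file fits in a 4096-byte buffer. The function names are hypothetical. As far as I know /proc/self/status describes the main thread of the process, so for another thread a per-thread file such as /proc/thread-self/status may be what you want.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: snapshot /proc/self/status into buf with a single
 * read(). Returns the number of bytes read, or -1 on failure.
 * size must be at least 1. */
long snapshot_status(char *buf, size_t size)
{
    int fd = open("/proc/self/status", O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, size - 1);
    close(fd);
    if (n < 0)
        return -1;
    buf[n] = '\0';            /* make the snapshot a C string */
    return (long)n;
}

/* Hypothetical helper: extract the nonvoluntary_ctxt_switches value from
 * a snapshot, or return -1 if the line is missing. */
long parse_nonvoluntary(const char *buf)
{
    static const char key[] = "nonvoluntary_ctxt_switches:";
    const char *p = strstr(buf, key);
    if (p == NULL)
        return -1;
    return strtol(p + sizeof(key) - 1, NULL, 10);
}
```

Take one snapshot before the timed code and one after, then parse both once the timing is done, so the parsing cost stays out of the measured region.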
I am writing a C program on Linux. I have a main thread which continuously updates the values of two variables, and another thread which writes those values into a file every 20 milliseconds. I have used usleep to achieve this time interval. Sample code is below.
main()
{
.
.
.
.
.
pthread_create(...write_file..); /* started another thread by passing a function write_file */
while(variable1)
{
updates value of variables
}
return 0;
}
void write_file()
{
.
.
.
.
fp = fopen("sample.txt" , "a");
while(variable2)
{
fprintf(fp," %d \n", somevariable);
usleep(20 * 1000);
}
fclose(fp);
}
Is it suitable to use the usleep function to achieve a 20-millisecond interval, or should I use some other method, like a timer?
Is usleep accurate enough? Does this sleep function affect the main thread in any way?
Using the sleep() family often results in imprecise timing, especially when the process has many CPU-consuming threads and the required intervals are relatively small, like 20 ms. So you shouldn't assume that a *sleep() call blocks execution for exactly the specified time. In the situation described above, the actual sleep duration may be twice the specified value or more (assuming that the kernel is not a real-time one). As a result, you should implement some kind of compensation logic that adjusts the sleep duration for subsequent calls.
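One common way to get that compensation, sketched here under the assumption that clock_nanosleep() with TIMER_ABSTIME is available (it is on Linux), is to sleep until an absolute deadline and advance the deadline by a fixed period each time. A late wake-up then shortens the next sleep instead of accumulating. The helper names are made up for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <time.h>

/* Hypothetical helper: record "now" as the starting point of the schedule. */
void periodic_start(struct timespec *next)
{
    clock_gettime(CLOCK_MONOTONIC, next);
}

/* Hypothetical helper: advance the deadline by period_ns and sleep until
 * that absolute time. Returns 0 on success, or the error number reported
 * by clock_nanosleep(). */
int periodic_wait(struct timespec *next, long period_ns)
{
    next->tv_nsec += period_ns;
    while (next->tv_nsec >= 1000000000L) {   /* keep tv_nsec normalized */
        next->tv_nsec -= 1000000000L;
        next->tv_sec += 1;
    }
    int rc;
    do {
        /* Absolute deadline: a signal just makes us retry the same one. */
        rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, next, NULL);
    } while (rc == EINTR);
    return rc;
}
```

In the writer thread from the question, you would call periodic_start(&next) once before the loop and replace usleep(20 * 1000) with periodic_wait(&next, 20000000L).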
A more precise (but of course not ideal) approach is to use POSIX timers. See timer_create(). The most precise timers are the ones that use SIGEV_SIGNAL or SIGEV_THREAD_ID notifications (the latter is available only on Linux systems). As the signal number you can use one of the real-time signals (SIGRTMIN to SIGRTMAX), but be aware that pthread implementations often use a few of these signals internally, so you should choose the actual number carefully. Also, doing something in signal handler context requires extra attention, because not every library function may be used safely there. You can find the safe list here.
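Here is a minimal sketch of a periodic POSIX timer with SIGEV_SIGNAL. To sidestep the signal-handler restrictions, it blocks the signal and collects it synchronously with sigwaitinfo() instead of installing a handler, which is a swap-in of my own and not the only way to do it. The function name and the choice of SIGRTMIN are assumptions for the example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: wait for `count` expirations of a timer firing
 * every period_ms milliseconds. Returns 0 on success, -1 on error. */
int wait_timer_ticks(long period_ms, int count)
{
    if (period_ms <= 0 || count < 0)
        return -1;

    const int signo = SIGRTMIN;   /* assumption: unused elsewhere in the program */
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    /* Block the signal so it stays pending until sigwaitinfo() takes it.
     * In a multithreaded program pthread_sigmask() would be used instead. */
    if (sigprocmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;

    struct sigevent sev;
    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = signo;

    timer_t timer;
    if (timer_create(CLOCK_MONOTONIC, &sev, &timer) != 0)
        return -1;

    struct itimerspec its;
    its.it_value.tv_sec = period_ms / 1000;
    its.it_value.tv_nsec = (period_ms % 1000) * 1000000L;
    its.it_interval = its.it_value;          /* repeat with the same period */
    if (timer_settime(timer, 0, &its, NULL) != 0) {
        timer_delete(timer);
        return -1;
    }

    int rc = 0;
    for (int i = 0; i < count; ++i) {
        siginfo_t info;
        while (sigwaitinfo(&set, &info) < 0) {
            if (errno != EINTR) {            /* retry only on interruption */
                rc = -1;
                break;
            }
        }
        if (rc != 0)
            break;
        /* The periodic work would go here. */
    }
    timer_delete(timer);
    return rc;
}
```

On older glibc versions the timer functions live in librt, so you may need to link with -lrt.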
P.S. Also note that select() called with empty sets is a fairly portable way to sleep with subsecond precision.
Sleeping: sleep() and usleep()
Now, let me start with the easier timing calls. For delays of multiple seconds, your best bet is probably to use sleep(). For delays of at least tens of milliseconds (about 10 ms seems to be the minimum delay), usleep() should work. These functions give the CPU to other processes ("sleep"), so CPU time isn't wasted. See the manual pages sleep(3) and usleep(3) for details.
For delays of under about 50 milliseconds (depending on the speed of your processor and machine, and the system load), giving up the CPU takes too much time, because the Linux scheduler (for the x86 architecture) usually takes at least about 10-30 milliseconds before it returns control to your process. Due to this, in small delays, usleep(3) usually delays somewhat more than the amount that you specify in the parameters, and at least about 10 ms.
nanosleep()
In the 2.0.x series of Linux kernels, there is a new system call, nanosleep() (see the nanosleep(2) manual page), that allows you to sleep or delay for short times (a few microseconds or more).
For delays <= 2 ms, if (and only if) your process is set to soft real time scheduling (using sched_setscheduler()), nanosleep() uses a busy loop; otherwise it sleeps, just like usleep().
The busy loop uses udelay() (an internal kernel function used by many kernel drivers), and the length of the loop is calculated using the BogoMips value (the speed of this kind of busy loop is one of the things that BogoMips measures accurately). See /usr/include/asm/delay.h for details on how it works.
Source: http://tldp.org/HOWTO/IO-Port-Programming-4.html
Try using nanosleep() instead of usleep(); it should be more accurate for a 20 ms interval.
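A small sketch of what that could look like, with a made-up helper name. nanosleep() reports the unslept time when a signal interrupts it, so the loop simply resumes with the remainder.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <time.h>

/* Hypothetical helper: sleep for `ms` milliseconds, resuming after
 * signal interruptions. Returns 0 on success, -1 on any other error. */
int sleep_ms(long ms)
{
    struct timespec req, rem;
    req.tv_sec = ms / 1000;
    req.tv_nsec = (ms % 1000) * 1000000L;
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem;            /* continue with the time still left */
    }
    return 0;
}
```

In the question's writer loop this would replace usleep(20 * 1000) with sleep_ms(20).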
I want to calculate the context switch time and I am thinking to use mutex and conditional variables to signal between 2 threads so that only one thread runs at a time. I can use CLOCK_MONOTONIC to measure the entire execution time and CLOCK_THREAD_CPUTIME_ID to measure how long each thread runs.
Then the context switch time is the (total_time - thread_1_time - thread_2_time).
To get a more accurate result, I can just loop over it and take the average.
Is this a correct way to approximate the context switch time? I can't think of anything that might go wrong, but I am getting answers that are under 1 nanosecond.
I forgot to mention that the more time I loop it over and take the average, the smaller results I get.
Edit
here is a snippet of the code that I have
typedef struct
{
struct timespec start;
struct timespec end;
}thread_time;
...
// each thread function looks similar like this
void* thread_1_func(void* arg)
{
    /* the variable must not reuse the typedef name thread_time */
    thread_time* times = (thread_time*) arg;
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &(times->start));
    for(x = 0; x < loop; ++x)
    {
        //where it switches to another thread
    }
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &(times->end));
    return NULL;
}
void* thread_2_func(void* time)
{
//similar as above
}
int main()
{
...
pthread_t thread_1;
pthread_t thread_2;
thread_time thread_1_time;
thread_time thread_2_time;
struct timespec start, end;
// stamps the start time
clock_gettime(CLOCK_MONOTONIC, &start);
// create two threads with the time structs as the arguments
pthread_create(&thread_1, NULL, &thread_1_func, (void*) &thread_1_time);
pthread_create(&thread_2, NULL, &thread_2_func, (void*) &thread_2_time);
// waits for the two threads to terminate
pthread_join(thread_1, NULL);
pthread_join(thread_2, NULL);
// stamps the end time
clock_gettime(CLOCK_MONOTONIC, &end);
// then I calculate the difference between between total execution time and the total execution time of two different threads..
}
First of all, using CLOCK_THREAD_CPUTIME_ID is probably very wrong: this clock reports the CPU time consumed by that thread, whereas the context switch itself happens in the kernel, not in your thread's user-mode code, so you'd want to use another clock. Also, on multiprocessor systems the clocks can give different values from one processor to another! Thus I suggest you use CLOCK_REALTIME or CLOCK_MONOTONIC instead. However, be warned that even if you read either of these twice in rapid succession, the timestamps will usually be tens of nanoseconds apart already.
As for context switches, there are many kinds. The fastest approach is to switch from one thread to another entirely in software. This just means that you push the old registers on the stack, set the task-switched flag so that SSE/FP registers will be lazily saved, save the stack pointer, load the new stack pointer and return from that function. Since the other thread had done the same, the return from that function happens in the other thread.
This thread-to-thread switch is quite fast; its overhead is about the same as for any system call. Switching from one process to another is much slower: this is because the user-space page tables must be switched by loading the CR3 register, which flushes the TLB; this causes misses in the TLB, which maps virtual addresses to physical ones.
However the <1 ns context switch/system call overhead does not really seem plausible - it is very probable that there is either hyperthreading or 2 CPU cores here, so I suggest that you set the CPU affinity on that process so that Linux only ever runs it on say the first CPU core:
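#define _GNU_SOURCE   /* needed for cpu_set_t and the CPU_* macros */
#include <sched.h>

cpu_set_t mask;
CPU_ZERO(&mask);
CPU_SET(0, &mask);   /* allow only the first CPU core */
/* pid 0 means the calling thread; returns 0 on success, -1 on error */
int result = sched_setaffinity(0, sizeof(mask), &mask);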
Then you should be pretty sure that the time you're measuring comes from a real context switch. Also, to measure the time for switching floating point / SSE stacks (this happens lazily), you should have some floating point variables and do calculations on them prior to context switch, then add say .1 to some volatile floating point variable after the context switch to see if it has an effect on the switching time.
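To make this concrete, here is a rough sketch of the mutex and condition variable ping-pong from the question, pinned to one core and timed with CLOCK_MONOTONIC. All names (ROUNDS, player, ns_per_handoff) are made up for the example. The figure it returns includes the mutex/condvar overhead, the system calls and the thread start-up, so treat it as an upper bound on the switch cost rather than a precise number.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for cpu_set_t, CPU_* and sched_setaffinity() */
#endif
#include <pthread.h>
#include <sched.h>
#include <time.h>

#define ROUNDS 20000  /* hand-offs per thread; an arbitrary choice */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int turn;      /* id (0 or 1) of the thread allowed to run next */

/* Wait for our turn, then pass the turn to the other thread. */
static void *player(void *arg)
{
    int me = *(const int *)arg;
    for (int i = 0; i < ROUNDS; ++i) {
        pthread_mutex_lock(&lock);
        while (turn != me)
            pthread_cond_wait(&cond, &lock);
        turn = !me;
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Hypothetical helper: average wall-clock nanoseconds per hand-off,
 * or -1.0 if the second thread could not be created. */
double ns_per_handoff(void)
{
    static int ids[2] = {0, 1};
    pthread_t other;
    struct timespec start, end;
    cpu_set_t mask;

    /* Pin the caller to CPU 0; the new thread inherits the mask, so both
     * share one core. A failure is ignored here, and the pinning stays in
     * effect after this function returns. */
    CPU_ZERO(&mask);
    CPU_SET(0, &mask);
    (void)sched_setaffinity(0, sizeof(mask), &mask);

    turn = 0;
    clock_gettime(CLOCK_MONOTONIC, &start);
    if (pthread_create(&other, NULL, player, &ids[1]) != 0)
        return -1.0;
    player(&ids[0]);              /* the calling thread is player 0 */
    pthread_join(other, NULL);
    clock_gettime(CLOCK_MONOTONIC, &end);

    double ns = (end.tv_sec - start.tv_sec) * 1e9
              + (end.tv_nsec - start.tv_nsec);
    return ns / (2.0 * ROUNDS);   /* two hand-offs per round */
}
```

Depending on your glibc version you may need to compile with -pthread.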
This is not straightforward, but as usual someone has already done a lot of work on this. (I'm not including the source here because I cannot see any license mentioned.)
https://github.com/tsuna/contextswitch/blob/master/timetctxsw.c
If you copy that file to a linux machine as (context_switch_time.c) you can compile and run it using this
gcc -D_GNU_SOURCE -Wall -O3 -std=c11 context_switch_time.c -lpthread
./a.out
I got the following result on a small VM
2000000 thread context switches in 2178645536ns (1089.3ns/ctxsw)
This question has come up before... for Linux you can find some material here.
Write a C program to measure time spent in context switch in Linux OS
Note, while the user was running the test in the above link they were also hammering the machine with games and compiling which is why the context switches were taking a long time. Some more info here...
how can you measure the time spent in a context switch under java platform
I want to implement a delay function using null loops. But the amount of time needed to complete a loop once is compiler and machine dependent. I want my program to determine the time on its own and delay the program for the specified amount of time. Can anyone give me any idea how to do this?
N. B. There is a function named delay() which suspends the system for the specified milliseconds. Is it possible to suspend the system without using this function?
First of all, you should never sit in a loop doing nothing. Not only does it waste energy (as it keeps your CPU 100% busy counting your loop counter) -- in a multitasking system it also decreases the whole system performance, because your process is getting time slices all the time as it appears to be doing something.
Next point is ... I don't know of any delay() function. This is not standard C. In fact, until C11, there was no standard at all for things like this.
POSIX to the rescue: there is usleep(3) (deprecated) and nanosleep(2). If you're on a POSIX-compliant system, you'll be fine with those. They block (meaning the scheduler of your OS knows the process has nothing to do and schedules it again only after the end of the call), so you don't waste CPU power.
If you're on windows, for a direct delay in code, you only have Sleep(). Note that THIS function takes milliseconds, but has normally only a precision around 15ms. Often good enough, but not always. If you need better precision on windows, you can request more timer interrupts using timeBeginPeriod() ... timeBeginPeriod(1); will request a timer interrupt each millisecond. Don't forget calling timeEndPeriod() with the same value as soon as you don't need the precision any more, because more timer interrupts come with a cost: they keep the system busy, thus wasting more energy.
I had a somewhat similar problem developing a little game recently, I needed constant ticks in 10ms intervals, this is what I came up with for POSIX-compliant systems and for windows. The ticker_wait() function in this code just suspends until the next tick, maybe this is helpful if your original intent was some timing issue.
Unless you're on a real-time operating system, anything you program yourself directly is not going to be accurate. You need to use a system function to sleep for some amount of time like usleep in Linux or Sleep in Windows.
Because the operating system could interrupt the process sooner or later than the exact time expected, you should get the system time before and after you sleep to determine how long you actually slept for.
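A small sketch of that before/after measurement on Linux. It uses CLOCK_MONOTONIC rather than gettimeofday so that wall-clock adjustments don't disturb the result, which is my own choice here, and the helper name is made up.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE       /* for usleep() under strict -std= modes */
#endif
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: sleep for requested_us microseconds (must be
 * below 1000000 for usleep) and return how long the call really took,
 * in microseconds. */
long measured_sleep_us(long requested_us)
{
    struct timespec before, after;
    clock_gettime(CLOCK_MONOTONIC, &before);
    usleep((useconds_t)requested_us);
    clock_gettime(CLOCK_MONOTONIC, &after);

    long long ns = (after.tv_sec - before.tv_sec) * 1000000000LL
                 + (after.tv_nsec - before.tv_nsec);
    return (long)(ns / 1000);
}
```

The difference between the returned value and requested_us is the oversleep you would subtract from the next delay.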
Edit:
On Linux, you can get the current system time with gettimeofday, which has microsecond resolution (whether the actual clock is that accurate is a different story). On Windows, you can do something similar with GetSystemTimeAsFileTime:
int gettimeofday(struct timeval *tv, struct timezone *tz)
{
const unsigned __int64 epoch_diff = 11644473600000000;
unsigned __int64 tmp;
FILETIME t;
if (tv) {
GetSystemTimeAsFileTime(&t);
tmp = 0;
tmp |= t.dwHighDateTime;
tmp <<= 32;
tmp |= t.dwLowDateTime;
tmp /= 10;
tmp -= epoch_diff;
tv->tv_sec = (long)(tmp / 1000000);
tv->tv_usec = (long)(tmp % 1000000);
}
return 0;
}
You could do something like finding the exact time at some point and then sitting in a while loop which rechecks the time until it reaches the time you want. Then it just breaks out and continues executing the rest of your program. I'm not sure I see much of a benefit in looping rather than just using the delay function, though.
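For what it's worth, a sketch of such a time-checking loop on a POSIX system could look like this (helper name made up). It burns CPU for the whole delay, with all the drawbacks mentioned in the other answers.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <time.h>

/* Hypothetical helper: spin until at least `ms` milliseconds have
 * passed according to CLOCK_MONOTONIC. */
void busy_wait_ms(long ms)
{
    struct timespec start, now;
    long long elapsed_ns;

    clock_gettime(CLOCK_MONOTONIC, &start);
    do {
        clock_gettime(CLOCK_MONOTONIC, &now);
        elapsed_ns = (now.tv_sec - start.tv_sec) * 1000000000LL
                   + (now.tv_nsec - start.tv_nsec);
    } while (elapsed_ns < ms * 1000000LL);
}
```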
I'm looking to create a state of uninterruptible sleep for a program I'm writing. Any tips or ideas about how to create this state would be helpful.
So far I've looked into the wait_event() function defined in wait.h, but was having little luck implementing it. When trying to initialize my wait queue the compiler complained
warning: parameter names (without types) in function declaration
static DECLARE_WAIT_QUEUE_HEAD(wq);
Has anyone had any experience with the wait_event() function or creating an uninterruptible sleep?
The functions that you're looking at in include/linux/wait.h are internal to the Linux kernel. They are not available to userspace.
Generally speaking, uninterruptible sleep states are considered undesirable. Under normal circumstances, they cannot be triggered by user applications except by accident (e.g, by attempting to read from a storage device that is not responding correctly, or by causing the system to swap).
You can make sleep 'signal-aware'.
sleep can be interrupted by a signal, in which case the pause stops and sleep returns the amount of time still left. The application can choose to handle the signal and, if needed, resume sleeping for the time left.
Actually, you should use the synchronization objects provided by the operating system you're working on, or simply check the return value of the sleep function. If it returns a value bigger than zero, your sleep was interrupted. In that case, call sleep again with the returned value, which is the time still remaining (probably in a loop, in case further interrupts occur during that interval).
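A minimal sketch of that retry loop, using the POSIX sleep() return value. The helper name and the interruption counter are additions for illustration.

```c
#include <unistd.h>

/* Hypothetical helper: sleep for the full number of seconds even if
 * signals interrupt the call. Returns how many times sleep() was cut
 * short and had to be resumed. */
unsigned int sleep_fully(unsigned int seconds)
{
    unsigned int interruptions = 0;
    while (seconds > 0) {
        seconds = sleep(seconds);   /* returns the seconds left unslept */
        if (seconds > 0)
            ++interruptions;
    }
    return interruptions;
}
```

Note that this only makes the sleep resume after a signal; the process is still in an ordinary interruptible sleep as far as the kernel is concerned.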
On the other hand, if you really want a real-uninterruptible custom sleep function, I may suggest something like the following:
void uninterruptible_sleep(long time, long factor)
{
long i, j;
__asm__("cli"); // close interrupts
for(i=0; i<time; ++i)
for(j=0; j<factor; ++j)
; // custom timer loop
__asm__("sti"); // open interrupts
}
cli and sti are x86 assembly instructions which set the IF (interrupt flag) of the CPU: cli clears it, disabling maskable interrupts, and sti sets it again. Note that both are privileged instructions: in an ordinary user-space process on Linux they will normally fault instead of disabling interrupts, so this only works in code that runs with sufficient I/O privilege (kernel code, for example). Also, on a multiprocessor system other synchronization precautions are needed, because these instructions only affect the processor they run on. Moreover, a function like the one above is very system (CPU) dependent, because the inner loop needs a clock-cycle count (instructions executed per second, depending on the CPU frequency) to measure an exact time interval. So, if you really want to get rid of every possible interrupt, you may use a function like the one above. But be careful: if your program deadlocks while interrupts are disabled, you will need to restart your system.
(The inline assembly syntax I have written is for gcc compiler)
I am trying to write a simple win32 console application in C to simulate stock price ticks, I need to specify the time interval so that every n milliseconds a new price is published.
The end goal is to write test data to a database and stress test an application which is supposed to react to new ticks and perform calcs.
My simple price server will be structured as follows
int main (void)
{
int n = 0;
//set interval to 1 millisecond
while (true) {
printf ("New price...\n");
// Publish price and write to database
SleepExecution (n);
}
return 0;
}
I have not been able to find an API call which will allow me to stop the execution of the above code for the arbitrary n milliseconds. Sleep looks to be the solution but I would prefer not to use it.
Are there any libraries you would recommend using or samples on the web I could draw inspiration from?
CreateWaitableTimer is great for running code periodically. Combined with timeBeginPeriod to increase the system timer rate, and turning up your thread priority so it wakes on time, you should have a solution that's 99.99% effective.
Suspending execution is an operating system-specific operation.
For MS Windows (Win 95 and after), use the function Sleep() where the parameter is the minimum rescheduling time in milliseconds.
For Linux, there are several ways, but int nanosleep(const struct timespec *req, struct timespec *rem); takes its delay with nanosecond resolution, which is enough for most uses. sleep() only offers one-second resolution, so it probably isn't useful for your purposes.
I see that win32 was mentioned. To use Sleep(), the program can probably use a constant value, depending on the requirements, but if high consistency is required, then dynamically compute the delay based on how much time remains until the next update is needed.