I'm having a strange problem. I have the following code:
dbg("condwait: timeout = %d, %d\n",
abs_timeout->tv_sec, abs_timeout->tv_nsec);
ret = pthread_cond_timedwait( &q->q_cond, &q->q_mtx, abs_timeout );
if (ret == ETIMEDOUT)
{
dbg("cond timed out\n");
return -ETIMEDOUT;
}
dbg calls gettimeofday before every line and prepends the line with the time. It results in the following output:
7.991151: condwait: timeout = 5, 705032704
7.991158: cond timed out
As you can see, only 7 microseconds passed in between the two debug lines, yet pthread_cond_timedwait returned ETIMEDOUT. How can this happen? I even tried setting the clock to something else when initializing the cond variable:
int ret;
ret = pthread_condattr_init(&attributes);
if (ret != 0) printf("CONDATTR INIT FAILED: %d\n", ret);
ret = pthread_condattr_setclock(&attributes, CLOCK_REALTIME);
if (ret != 0) printf("SETCLOCK FAILED: %d\n", ret);
ret = pthread_cond_init( &q->q_cond, &attributes );
if (ret != 0) printf("COND INIT FAILED: %d\n", ret);
(none of the error messages are printed out). I tried both CLOCK_REALTIME and CLOCK_MONOTONIC.
This code is part of a blocking queue. I need functionality such that if nothing gets put on this queue in 5 seconds, something else happens. The mutex and the cond are both initialized, as the blocking queue works fine if I don't use pthread_cond_timedwait.
pthread_cond_timedwait takes an absolute time, not a relative time. You need to make your wait time absolute by adding the current time to your timeout value.
Overflow in the timespec's tv_nsec field is usually the culprit for weird timeouts, so also check the return value for EINVAL. A helper that normalizes the result when adding avoids the problem:
void timespec_add(struct timespec* a, struct timespec* b, struct timespec* out)
{
time_t sec = a->tv_sec + b->tv_sec;
long nsec = a->tv_nsec + b->tv_nsec;
sec += nsec / 1000000000L;
nsec = nsec % 1000000000L;
out->tv_sec = sec;
out->tv_nsec = nsec;
}
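For instance, a minimal sketch of how that helper might be used for the 5-second queue timeout in the question (assuming the q->q_cond / q->q_mtx fields from the question, and that the condition variable uses the default CLOCK_REALTIME):
struct timespec now, rel, abs_timeout;
int ret;

clock_gettime(CLOCK_REALTIME, &now);   /* current time on the cond var's clock */
rel.tv_sec = 5;                        /* relative timeout: 5 seconds */
rel.tv_nsec = 0;
timespec_add(&now, &rel, &abs_timeout);

ret = pthread_cond_timedwait(&q->q_cond, &q->q_mtx, &abs_timeout);
if (ret == ETIMEDOUT) {
    /* nothing was queued within 5 seconds */
}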
The condition variable can spuriously unblock. You need to wait on it in a loop, re-checking the predicate each time through. Since the timeout is absolute, the same deadline can be passed on each iteration.
I found some documentation for pthread_cond_timedwait here.
When using condition variables there is always a Boolean predicate involving shared variables associated with each condition wait that is true if the thread should proceed. Spurious wakeups from the pthread_cond_timedwait() or pthread_cond_wait() functions may occur. Since the return from pthread_cond_timedwait() or pthread_cond_wait() does not imply anything about the value of this predicate, the predicate should be re-evaluated upon such return.
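In code, that means something like the following sketch (assuming a hypothetical queue_empty(q) predicate protected by q->q_mtx, and an absolute deadline abs_timeout computed as the current time plus the desired interval):
int ret = 0;

pthread_mutex_lock(&q->q_mtx);
while (queue_empty(q) && ret == 0) {
    /* a return of 0 may be a spurious wakeup, so the predicate is re-tested */
    ret = pthread_cond_timedwait(&q->q_cond, &q->q_mtx, &abs_timeout);
}
if (ret == ETIMEDOUT) {
    /* nothing arrived before the deadline */
}
pthread_mutex_unlock(&q->q_mtx);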
As already mentioned in other answers, you have to use an absolute time. Since C11 you can use timespec_get().
struct timespec time;
timespec_get(&time, TIME_UTC);
time.tv_sec += 5;
pthread_cond_timedwait(&cond, &mutex, &time);
Related
I have a main thread which creates child threads to do various tasks. There is a child thread which is tasked to report on the status every 100 s.
My current mechanism of stopping the thread is to observe a global boolean. Somewhat like this:
Child thread
void* ReportThread(bool* operation)
{
    while (*operation)
    {
        // do its reporting task
        // ........

        int counter = 0;
        while (counter < 100 || operation)
        {
            // let it sleep for 1 seconds and wake up to check
            sleep(1);
            sleepCounter += 1;
        }
    }
}
Parent (Main) Thread:
bool operation = false;

int main() {
    pthread_t tid;
    int ch, err;

    err = pthread_create(&tid, NULL, &ReportThread, &operation);
    printf("Please input esc to end operation \n");
    while ((ch = getchar()) != 27);
    operation = true;

    pthread_join(tid, NULL);
    return 0;
}
The problem:
It seems that when using sleep(n), the number of seconds is very inconsistent. When the program is stopped, this thread takes a while, maybe 10 seconds, to actually stop.
Is there a way to interrupt a sleeping thread? I heard you could use signals. I am coding on Linux.
Can I simply use pthread_cancel(tid) instead of pthread_join(tid)?
Regards
This part
while( counter < 100 || operation )
{
// let it sleep for 1 seconds and wake up to check
sleep(1);
sleepCounter += 1;
}
is wrong.
First, I assume that sleepCounter += 1; is really a typo and that it should be:
while( counter < 100 || operation )
{
// let it sleep for 1 seconds and wake up to check
sleep(1);
counter += 1;
}
Then the problem is that even if operation is set to false by some other thread, the while will not finish until counter reaches 100.
The code should be
while( counter < 100 && operation )
{
// let it sleep for 1 seconds and wake up to check
sleep(1);
counter += 1;
}
Further, in main you never set operation to false. Another typo?
You don't need two while loops. And if you want to set a timer, use time functions for it, because sleep is a cancellation point and it is not guaranteed that sleep actually sleeps that amount of time.
Example:
void* ReportThread(void *args)
{
time_t start = time(NULL);
time_t now;
bool *operation = (bool*) args;
while (*operation) { //while active
now = time(NULL); //get current time
if (now - start >= 100) { //if the threshold is exceeded
start = now; //reset timer
//and probably do other stuff
}
sleep(1); //sleep for one second
}
return NULL;
}
The example above has a maximum lag of one second: if you set operation to false right at the moment the thread has entered the sleep state, you have to wait until sleep returns before the thread recognizes the modified state. The example also has the advantage that you can easily modify the threshold value, since it depends on the 'real' time instead of a counter and a not-so-accurate sleep time.
Btw., the variable operation should be either an atomic boolean or protected by a mutex (since it is accessed from different threads).
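For example, a sketch with a C11 atomic flag (not a drop-in replacement for the code above, just the shape of it):
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool operation = true;          /* shared stop flag */

/* in ReportThread: */
while (atomic_load(&operation)) {
    /* ... do the reporting work, then sleep ... */
}

/* in main, when the user asks to stop: */
atomic_store(&operation, false);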
To answer the questions of your problem:
1. This should be answered by the example above.
2. As I mentioned before, sleep is a cancellation point, which means it gets interrupted if the process handles a signal (see man pthreads, section Cancellation points).
3. See man pthread_cancel, section Notes:
On Linux, cancellation is implemented using signals. Under the NPTL threading implementation, the first real-time signal (i.e., signal 32) is used for this purpose. On LinuxThreads, the second real-time signal is used, if real-time signals are available, otherwise SIGUSR2 is used.
You cannot use pthread_cancel instead of pthread_join! You have to use pthread_join in either case (described in detail in the man page).
I don't know if this will fix all your problems, but it's a bit too much for a comment. One problem: your ReportThread function signature is wrong. It should be:
void* ReportThread(void* args);
And then in that function you need to do something like:
void* ReportThread(void* args)
{
bool* operation = (bool*)args;
while(*operation)
{
...
}
}
I'm not sure how it's working right now, but your compiler should at least be issuing a warning about the conversion, since pthread_create expects a start routine that takes a void* argument, not a bool*.
Also be aware of race conditions on operation.
I'm writing an ANSI C application that will be built on both Linux and Windows.
I built the pthread 2.9.1 library for Windows and all works fine.
The problem is that I can't find the function: pthread_sleep()
I also looked around for that function but it seems that it doesn't exist.
Without that function my code will have to call Sleep() on Windows and sleep() on Linux but that's exactly what I don't want.
Thanks,
Enrico Migliore
The problem is that I can't find the function: pthread_sleep()
No, you wouldn't, since pthreads does not provide such a function. And why would it? There is nothing specific to the pthreads API about making a thread sleep, and genuine POSIX platforms all have a variety of other mechanisms, including sleep(), to make a thread sleep. Not that making threads of a multithreaded program sleep (as opposed to block) is very often a good or reasonable thing to do, anyway.
Without that function my code will have to call Sleep() on Windows and sleep() on Linux but that's exactly what I don't want.
I'm inclined to think that you don't really want to call a platform-agnostic alternative either. But the traditional approach to handling such issues of using various platform-specific functions to accomplish a common goal is to use a conditionally-defined macro or a wrapper function with a conditionally-defined implementation to hide the platform-specific bits. For example,
#if defined(_MSC_VER)
// Sleep() expects an argument in milliseconds
#define SLEEP(time) Sleep((time) * 1000)
#else
// sleep() expects an argument in seconds
#define SLEEP(time) sleep(time)
#endif
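Call sites then stay platform-neutral, for example:
SLEEP(2);   /* pauses roughly 2 seconds with either implementation */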
For your particular case, however, there is also the possibility of calling a function with a user-specifiable timeout that you expect always to expire. Pthreads does provide functions you could use for that, such as pthread_cond_timedwait(). You could write a sleep function with that, without any conditional compilation. For example,
int my_sleep(long milliseconds) {
    static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
    int rval = 0;

    if (milliseconds > 0) {
        struct timespec time;

        rval = timespec_get(&time, TIME_UTC);
        // ... handle any error ...

        time.tv_nsec += (milliseconds % 1000) * 1000000;
        time.tv_sec += milliseconds / 1000 + time.tv_nsec / 1000000000;
        time.tv_nsec %= 1000000000;

        rval = pthread_mutex_lock(&mutex);
        if (rval != 0) {
            // handle error ...
        } else {
            // The loop handles spurious wakeups
            do {
                rval = pthread_cond_timedwait(&cv, &mutex, &time); // expects ETIMEDOUT
            } while (rval == 0);
            if (rval != ETIMEDOUT) {
                // handle error ...
            }
            rval = pthread_mutex_unlock(&mutex); // THIS MUST NOT BE SKIPPED
            // ... handle any error ...
        }
    }
    return rval;
}
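Usage is then simply, for example:
int err = my_sleep(1500);   /* block the calling thread for about 1.5 seconds */
if (err != 0) {
    // handle error ...
}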
As I understand it, pthread_cond_timedwait is to be used by taking the current time and then calculating the absolute time at which pthread_cond_timedwait should exit if the condition is not signalled.
Is there a simple way to use this function to reliably perform a periodic task (the problem being changes to the clock between the point where the current time is taken and the call to pthread_cond_timedwait)?
I have a periodic task that should run ~every second.
do {
pthread_mutex_lock(mutex);
tx = gettimeofday + 1 second;
do_some_simple_periodic_task();
pthread_cond_timedwait(condition, mutex, tx);
pthread_mutex_unlock(mutex);
} while (!some_exit_condition);
Condition is signalled if context (including some_exit_condition) is updated.
Is there a way to use a monotonic timer or something similar with this? If not, what is the use case for pthread_cond_timedwait at all? Only for cases where you don't care about an additional hour of delay?
I have seen a solution where another thread signals this one periodically (How to make pthread_cond_timedwait() robust against system clock manipulations?), but it seems like a hack.
Is there a different and better way to make a thread sleep for some interval (not prone to wall-clock changes) but respond immediately to an external condition?
You can set the clock type used by pthread_cond_timedwait() by setting attributes when initializing the condition variable:
pthread_condattr_t cond_attr;
pthread_cond_t cond;
errno = pthread_condattr_init (&cond_attr);
if (errno) {
perror ("pthread_condattr_init");
return -1;
}
errno = pthread_condattr_setclock (&cond_attr, CLOCK_MONOTONIC);
if (errno) {
perror ("pthread_condattr_setclock");
return -1;
}
errno = pthread_cond_init (&cond, &cond_attr);
if (errno) {
perror ("pthread_cond_init");
return -1;
}
And then use time from CLOCK_MONOTONIC for timeout:
struct timespec timeout;
clock_gettime (CLOCK_MONOTONIC, &timeout);
timeout.tv_sec += 1;
// pthread_cond_timedwait (..., timeout);
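On top of that, a periodic loop for the task in the question might look like this sketch (assuming the cond variable initialized above, a companion mutex, and the some_exit_condition / do_some_simple_periodic_task from the question; the deadline is advanced from the previous deadline, so the period does not drift):
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
struct timespec deadline;
int rc;

clock_gettime(CLOCK_MONOTONIC, &deadline);

pthread_mutex_lock(&mutex);
while (!some_exit_condition) {
    do_some_simple_periodic_task();

    deadline.tv_sec += 1;              /* next wakeup: one second after the previous one */

    rc = 0;
    while (!some_exit_condition && rc == 0)
        rc = pthread_cond_timedwait(&cond, &mutex, &deadline);  /* wakes early if signalled */
}
pthread_mutex_unlock(&mutex);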
I've got the following function that gets called from a pthread_create. This function does some work, sets a timer, does some other work and then waits for the timer to expire before doing the loop again. However, on the first run of the timer, after it expires the program quits and I'm not totally sure why. It should never leave the infinite while loop. The main thread accesses nothing from this thread and vice versa (for now).
My guess is I might not have something set up correctly with the thread, or the timer is not calling the handler function correctly. Perhaps changing the IDLE global variable from the thread causes a problem.
I would like to call the handler without signals, hence the use of SIGEV_THREAD_ID. I'm using the SIGUSRx signals in the main thread anyway. Any thoughts about what I've started here what could be wrong?
#ifndef sigev_notify_thread_id
#define sigev_notify_thread_id _sigev_un._tid
#endif
volatile sig_atomic_t IDLE = 0;
timer_t timer_id;
struct sigevent sev;
void handler() {
printf("Timer expired.\n");
IDLE = 0;
}
void *thread_worker() {
struct itimerspec ts;
/* setup the handler for timer event */
memset (&sev, 0, sizeof(struct sigevent));
sev.sigev_notify = SIGEV_THREAD_ID;
sev.sigev_value.sival_ptr = NULL;
sev.sigev_notify_function = handler;
sev.sigev_notify_attributes = NULL;
sev.sigev_signo = SIGRTMIN + 1;
sev.sigev_notify_thread_id = syscall(SYS_gettid);
/* setup "idle" timer */
ts.it_value.tv_sec = 55;
ts.it_value.tv_nsec = 0;
ts.it_interval.tv_sec = 0;
ts.it_interval.tv_nsec = 0;
if (timer_create(0, &sev, &timer_id) == -1) {
printf("timer_create failed: %d: %s\n", errno, strerror(errno));
exit(3);
}
while (1) {
// do work here before timer gets started that takes 5 seconds
while (IDLE); /* wait here until timer_id expires */
/* setup timer */
if (timer_settime(timer_id, 0, &ts, NULL) == -1) {
printf("timer_settime failed: %d\n", errno);
exit(3);
}
IDLE = 1;
// do work here while timer is running but that does not take 10 seconds
}
}
As far as I can tell, you haven't installed a signal handler for the timer signal (SIGRTMIN + 1 in the code shown), so by the default action it kills the process when it's acted upon.
In any case, the whole thing strikes me as extraordinarily bad design:
The while loop will give you 100% cpu load while waiting for the timer to expire.
This is not the way you use SIGEV_THREAD_ID, and in fact SIGEV_THREAD_ID isn't really set up to be usable by applications. Rather it's for the libc to use internally for implementing SIGEV_THREAD.
You really don't want to be using signals. They're messy.
If you have threads, why aren't you just calling clock_nanosleep in a loop? Timers are mainly useful when you can't do this, e.g. when you can't use threads.
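For illustration, a sketch of that approach for the 55-second idle period from the question, using an absolute deadline on CLOCK_MONOTONIC so the work done between wakeups does not shift the period:
struct timespec next;

clock_gettime(CLOCK_MONOTONIC, &next);
while (1) {
    /* do the periodic work here */

    next.tv_sec += 55;    /* absolute time of the next wakeup */

    /* sleep until the deadline; restart if interrupted by a signal */
    while (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL) == EINTR)
        ;
}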
I have a state machine implementation in a library which runs on Linux. The main loop of the program is to simply wait until enough time has passed to require the next execution of the state machine.
At the moment I have a loop which is similar to the following pseudo-code:
while( 1 )
{
while( StateTicks() > 0 )
StateMachine();
Pause( 10ms );
}
Where StateTicks may return a tick every 50 ms or so. The shorter I make the Pause(), the more CPU time I use in the program.
Is there a better way to test for a period of time passing, perhaps based on signals? I'd prefer to halt execution until StateTicks() > 0 rather than have the Pause() call at all.
Under the hood, the state machine implementation's StateTicks uses clock_gettime(PFT_CLOCK ...), which works well. I'm keen to keep that timekeeping, because if a StateMachine() call takes longer than a state machine tick this implementation will catch up.
Pause uses nanosleep to achieve a reasonably accurate pause time.
Perhaps this is already the best way, but it doesn't seem particularly graceful.
Create a periodic timer using timer_create(), and have it call sem_post() on a "timer tick semaphore".
To avoid losing ticks, I recommend using a real-time signal, perhaps SIGRTMIN+0 or SIGRTMAX-0. sem_post() is async-signal-safe, so you can safely use it in a signal handler.
Your state machine simply waits on the semaphore; no other timekeeping needed. If you take too long to process a tick, the following sem_wait() will not block, but return immediately. Essentially, the semaphore counts "lost" ticks.
Example code (untested!):
#define _POSIX_C_SOURCE 200809L
#include <semaphore.h>
#include <signal.h>
#include <errno.h>
#include <time.h>
#define TICK_SIGNAL (SIGRTMIN+0)
static timer_t tick_timer;
static sem_t tick_semaphore;
static void tick_handler(int signum, siginfo_t *info, void *context)
{
if (info && info->si_code == SI_TIMER) {
const int saved_errno = errno;
sem_post((sem_t *)info->si_value.sival_ptr);
errno = saved_errno;
}
}
static int tick_setup(const struct timespec interval)
{
struct sigaction act;
struct sigevent evt;
struct itimerspec spec;
if (sem_init(&tick_semaphore, 0, 0))
return errno;
sigemptyset(&act.sa_mask);
act.sa_sigaction = tick_handler;  /* three-argument handler, so use sa_sigaction */
act.sa_flags = SA_SIGINFO;        /* and request siginfo_t delivery */
if (sigaction(TICK_SIGNAL, &act, NULL))
return errno;
evt.sigev_notify = SIGEV_SIGNAL;
evt.sigev_signo = TICK_SIGNAL;
evt.sigev_value.sival_ptr = &tick_semaphore;
if (timer_create(CLOCK_MONOTONIC, &evt, &tick_timer))
return errno;
spec.it_interval = interval;
spec.it_value = interval;
if (timer_settime(tick_timer, 0, &spec, NULL))
return errno;
return 0;
}
with the tick loop being simply
if (tick_setup(some_interval))
/* failed, see errno; abort */
while (!sem_wait(&tick_semaphore)) {
/* process tick */
}
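One caveat: if the real-time signal happens to be delivered to the thread blocked in sem_wait(), the call can fail with errno set to EINTR, so a more defensive loop would retry in that case, for example:
for (;;) {
    if (sem_wait(&tick_semaphore) == -1) {
        if (errno == EINTR)
            continue;          /* interrupted by a signal; just retry */
        break;                 /* real error */
    }
    /* process tick */
}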
If you support more than one concurrent state, the one signal handler suffices. Your state typically would include
timer_t timer;
sem_t semaphore;
struct timespec interval;
and the only tricky thing is to make sure there is no pending timer signal when destroying a state that the signal handler would access.
Because signal delivery will interrupt any blocking I/O in the thread used for the signal delivery, you might wish to set up a special thread in your library to handle the timer tick realtime signals, with the realtime signal blocked in all other threads. You can mark your library initialization function __attribute__((constructor)), so that it is automatically executed prior to main().
Optimally, you should use the same thread that does the tick processing for the signal delivery. Otherwise there will be some small jitter or latency in the tick processing, if the signal was delivered using a different CPU core than the one that is running the tick processing.
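A sketch of that arrangement, assuming the TICK_SIGNAL and tick_semaphore defined above (error handling omitted): the constructor blocks the signal in the calling thread, then starts a dedicated thread that unblocks it for itself and runs the tick loop, so the timer signal is always delivered there.
static pthread_t tick_thread;

static void *tick_worker(void *arg)
{
    sigset_t mask;

    /* Unblock the tick signal in this thread only, so the kernel delivers
       the timer signal here instead of interrupting other threads. */
    sigemptyset(&mask);
    sigaddset(&mask, TICK_SIGNAL);
    pthread_sigmask(SIG_UNBLOCK, &mask, NULL);

    while (!sem_wait(&tick_semaphore)) {
        /* process tick */
    }
    return NULL;
}

static void __attribute__((constructor)) tick_thread_init(void)
{
    sigset_t mask;

    /* Block the signal here; threads created afterwards inherit this mask. */
    sigemptyset(&mask);
    sigaddset(&mask, TICK_SIGNAL);
    pthread_sigmask(SIG_BLOCK, &mask, NULL);

    pthread_create(&tick_thread, NULL, tick_worker, NULL);
}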
Basile Starynkevitch's answer jogged my memory about the latencies involved in waiting and signal delivery: If you use nanosleep() and clock_gettime(CLOCK_MONOTONIC,), you can adjust the sleep times to account for the typical latencies.
Here's a quick test program using clock_gettime(CLOCK_MONOTONIC,) and nanosleep():
#define _POSIX_C_SOURCE 200809L
#include <sys/select.h>
#include <time.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
static const long tick_latency = 75000L; /* 0.75 ms */
static const long tick_adjust = 75000L; /* 0.75 ms */
typedef struct {
struct timespec next;
struct timespec tick;
} state;
void state_init(state *const s, const double ticks_per_sec)
{
if (ticks_per_sec > 0.0) {
const double interval = 1.0 / ticks_per_sec;
s->tick.tv_sec = (time_t)interval;
s->tick.tv_nsec = (long)(1000000000.0 * (interval - (double)s->tick.tv_sec));
if (s->tick.tv_nsec < 0L)
s->tick.tv_nsec = 0L;
else
if (s->tick.tv_nsec > 999999999L)
s->tick.tv_nsec = 999999999L;
} else {
s->tick.tv_sec = 0;
s->tick.tv_nsec = 0L;
}
clock_gettime(CLOCK_MONOTONIC, &s->next);
}
static unsigned long count;
double state_tick(state *const s)
{
struct timespec now, left;
/* Next tick. */
s->next.tv_sec += s->tick.tv_sec;
s->next.tv_nsec += s->tick.tv_nsec;
if (s->next.tv_nsec >= 1000000000L) {
s->next.tv_nsec -= 1000000000L;
s->next.tv_sec++;
}
count = 0UL;
while (1) {
/* Get current time. */
clock_gettime(CLOCK_MONOTONIC, &now);
/* Past tick time? */
if (now.tv_sec > s->next.tv_sec ||
(now.tv_sec == s->next.tv_sec &&
now.tv_nsec >= s->next.tv_nsec - tick_latency))
return (double)(now.tv_sec - s->next.tv_sec)
+ (double)(now.tv_nsec - s->next.tv_nsec) / 1000000000.0;
/* Calculate duration to wait */
left.tv_sec = s->next.tv_sec - now.tv_sec;
left.tv_nsec = s->next.tv_nsec - now.tv_nsec - tick_adjust;
if (left.tv_nsec >= 1000000000L) {
left.tv_nsec -= 1000000000L;
left.tv_sec++;
} else
if (left.tv_nsec < -1000000000L) {
left.tv_nsec += 2000000000L;
left.tv_sec += 2;
} else
if (left.tv_nsec < 0L) {
left.tv_nsec += 1000000000L;
left.tv_sec--;
}
count++;
nanosleep(&left, NULL);
}
}
int main(int argc, char *argv[])
{
double rate, jitter;
long ticks, i;
state s;
char dummy;
if (argc != 3 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
fprintf(stderr, "\n");
fprintf(stderr, "Usage: %s [ -h | --help ]\n", argv[0]);
fprintf(stderr, " %s TICKS_PER_SEC TICKS\n", argv[0]);
fprintf(stderr, "\n");
return 1;
}
if (sscanf(argv[1], " %lf %c", &rate, &dummy) != 1 || rate <= 0.0) {
fprintf(stderr, "%s: Invalid tick rate.\n", argv[1]);
return 1;
}
if (sscanf(argv[2], " %ld %c", &ticks, &dummy) != 1 || ticks < 1L) {
fprintf(stderr, "%s: Invalid tick count.\n", argv[2]);
return 1;
}
state_init(&s, rate);
for (i = 0L; i < ticks; i++) {
jitter = state_tick(&s);
if (jitter > 0.0)
printf("Tick %9ld: Delayed %9.6f ms, %lu sleeps\n", i+1L, +1000.0 * jitter, count);
else
if (jitter < 0.0)
printf("Tick %9ld: Premature %9.6f ms, %lu sleeps\n", i+1L, -1000.0 * jitter, count);
else
printf("Tick %9ld: Exactly on time, %lu sleeps\n", i+1L, count);
fflush(stdout);
}
return 0;
}
Above, tick_latency is the number of nanoseconds you're willing to accept a "tick" in advance, and tick_adjust is the number of nanoseconds you subtract from each sleep duration.
The best values for those are highly configuration-specific, and I haven't got a robust method for estimating them. Hardcoding them (to 0.75ms as above) does not sound too good to me either; perhaps using command-line options or environment values to let users control it, and default to zero would be better.
Anyway, compiling the above as
gcc -O2 test.c -lrt -o test
and running a 500-tick test at 50Hz tick rate,
./test 50 500 | sort -k 4
shows that on my machine, the ticks are accepted within 0.051 ms (51 µs) of the desired moment. Even reducing the priority does not seem to affect it much. A test using 5000 ticks at 5kHz rate (0.2ms per tick),
nice -n 19 ./test 5000 5000 | sort -k 4
yields similar results -- although I did not bother to check what happens if the machine load changes during a run.
In other words, preliminary tests on a single machine indicates it might be a viable option, so you might wish to test the scheme on different machines and under different loads. It is much more precise than I expected on my own machine (Ubuntu 3.11.0-24-generic on x86_64, running on an AMD Athlon II X4 640 CPU).
This approach has the interesting property that you can easily use a single thread to maintain multiple states, even if they use different tick rates. You only need to check which state has the next tick (earliest ->next time), nanosleep() if that occurs in the future, and process the tick, advancing that state to the next tick.
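For example, deciding which state is due next only needs a small comparison helper (hypothetical, not part of the test program above):
/* Returns nonzero if *a is an earlier point in time than *b. */
static int timespec_before(const struct timespec *a, const struct timespec *b)
{
    return (a->tv_sec < b->tv_sec) ||
           (a->tv_sec == b->tv_sec && a->tv_nsec < b->tv_nsec);
}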
Questions?
In addition to Nominal Animal's answer:
If the Pause time is several milliseconds, you might use poll(2) or perhaps nanosleep(2) (you might compute the remaining time to sleep, e.g. using clock_gettime(2) with CLOCK_REALTIME ...)
If you care about the fact that StateMachine may take several milliseconds (or a large fraction of a millisecond) and you want exactly a 10 millisecond period, consider perhaps using a poll based event loop which uses the Linux specific timerfd_create(2)
See also time(7), and this and that answers (to questions about poll, etc.).