As I understand it, pthread_cond_timedwait is used by taking the current time and then calculating the absolute time at which pthread_cond_timedwait should return if the condition is not signalled.
Is there a simple way to use this function to reliably perform a periodic task (the problem being changes in time between the point where current time is taken and call to pthread_cond_timedwait)?
I have a periodic task that should run ~every second.
do {
pthread_mutex_lock(mutex);
tx = gettimeofday + 1 second;
do_some_simple_periodic_task();
pthread_cond_timedwait(condition, mutex, tx);
pthread_mutex_unlock(mutex);
} while (!some_exit_condition);
Condition is signalled if context (including some_exit_condition) is updated.
Is there a way to use a monotonic timer or something similar with this? If not, what is the use case for pthread_cond_timedwait at all? Only cases where you don't care about an additional hour of delay?
I have seen a solution where another thread signals this one periodically (How to make pthread_cond_timedwait() robust against system clock manipulations?), but it seems like a hack.
Is there a different and better way to make a thread sleep for some interval (not prone to wall-clock changes) but respond immediately to an external condition?
You can set the clock type used by pthread_cond_timedwait() by setting attributes when initializing the condition variable:
pthread_condattr_t cond_attr;
pthread_cond_t cond;
errno = pthread_condattr_init (&cond_attr);
if (errno) {
perror ("pthread_condattr_init");
return -1;
}
errno = pthread_condattr_setclock (&cond_attr, CLOCK_MONOTONIC);
if (errno) {
perror ("pthread_condattr_setclock");
return -1;
}
errno = pthread_cond_init (&cond, &cond_attr);
if (errno) {
perror ("pthread_cond_init");
return -1;
}
And then use time from CLOCK_MONOTONIC for timeout:
struct timespec timeout;
clock_gettime (CLOCK_MONOTONIC, &timeout);
timeout.tv_sec += 1;
// pthread_cond_timedwait (..., &timeout);
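Putting the two pieces together, a minimal sketch of the periodic wait could look like the following. The names periodic_ctx, periodic_ctx_init, periodic_wait and periodic_notify are made up for illustration, and the sketch assumes a platform where pthread_condattr_setclock accepts CLOCK_MONOTONIC (Linux with glibc does). The wait re-checks a flag in a loop, so spurious wakeups and notifications sent before the wait started are both handled.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical context shared by the periodic thread and its notifier. */
struct periodic_ctx {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;     /* bound to CLOCK_MONOTONIC by periodic_ctx_init() */
    bool            updated;  /* predicate protected by mutex */
};

/* Returns 0 on success, otherwise a pthread error number. */
int periodic_ctx_init(struct periodic_ctx *ctx)
{
    pthread_condattr_t attr;
    int rc = pthread_condattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
    if (rc == 0)
        rc = pthread_cond_init(&ctx->cond, &attr);
    pthread_condattr_destroy(&attr);
    if (rc != 0)
        return rc;
    ctx->updated = false;
    return pthread_mutex_init(&ctx->mutex, NULL);
}

/* Called by any thread that changed the shared context. */
void periodic_notify(struct periodic_ctx *ctx)
{
    pthread_mutex_lock(&ctx->mutex);
    ctx->updated = true;
    pthread_cond_signal(&ctx->cond);
    pthread_mutex_unlock(&ctx->mutex);
}

/* Waits until the context was updated or `seconds` elapsed on the
 * monotonic clock. Returns true if woken by an update, false on timeout. */
bool periodic_wait(struct periodic_ctx *ctx, time_t seconds)
{
    struct timespec deadline;
    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_sec += seconds;

    pthread_mutex_lock(&ctx->mutex);
    int rc = 0;
    while (!ctx->updated && rc == 0)
        rc = pthread_cond_timedwait(&ctx->cond, &ctx->mutex, &deadline);
    bool woken = ctx->updated;
    ctx->updated = false;   /* consume the notification */
    pthread_mutex_unlock(&ctx->mutex);
    return woken;
}
```

The periodic thread would call its task and then periodic_wait(&ctx, 1) in a loop. Changes to the wall clock do not affect the deadline because both the condition variable and the deadline use CLOCK_MONOTONIC.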
I have a main thread which creates child threads to do various tasks. One child thread is tasked with reporting the status every 100 s.
My current mechanism for stopping the thread is to observe a global boolean, somewhat like this:
Child thread
void* ReportThread(bool* operation)
{
while(*operation)
{
// do its reporting task
// ........
int counter = 0;
while( counter < 100 && operation )
{
// let it sleep for 1 seconds and wake up to check
sleep(1);
sleepCounter += 1;
}
}
}
Parent (Main) Thread:
bool operation = false;
int main(){
pthread_t tid;
err = pthread_create(&tid, NULL, &ReportThread, &operation);
printf("Please input esc to end operation \n");
while ((ch = getchar()) != 27);
operation =true;
pthread_join(tid,NULL);
return 0;
}
The problem:
With sleep(n), the number of seconds seems very inconsistent. When the program is stopped, this thread takes a while, maybe 10 seconds, to actually stop.
Is there a way to interrupt a sleeping thread? I heard you could use a signal. I am coding on Linux.
Can I simply use pthread_cancel(tid) instead of pthread_join(tid)?
This part
while( counter < 100 || operation )
{
// let it sleep for 1 seconds and wake up to check
sleep(1);
sleepCounter += 1;
}
is wrong.
First I assume that sleepCounter += 1; is really a typo and that it should be:
while( counter < 100 || operation )
{
// let it sleep for 1 seconds and wake up to check
sleep(1);
counter += 1;
}
Then the problem is that even if operation is set to false by some other thread, the while loop will not finish until counter reaches 100.
The code should be (note that operation is a pointer inside the thread function, so it has to be dereferenced)
while( counter < 100 && *operation )
{
// let it sleep for 1 seconds and wake up to check
sleep(1);
counter += 1;
}
Further, in main you never set operation to false. Another typo?
You don't need two while loops. If you want a timer, measure elapsed time with the time functions rather than counting sleep calls: sleep can return early (for example when a signal is handled) and can also oversleep, so it is not guaranteed to sleep for exactly the requested time.
Example:
void* ReportThread(void *args)
{
time_t start = time(NULL);
time_t now;
bool *operation = (bool*) args;
while (*operation) { //while active
now = time(NULL); //get current time
if (now - start >= 100) { //if the threshold is exceeded
start = now; //reset timer
//and probably do other stuff
}
sleep(1); //sleep for one second
}
return NULL;
}
The example above has a maximum lag of one second: if you set operation to false just as the thread enters sleep, the thread only notices the change once sleep returns. The example also has the advantage that you can easily change the threshold, since it depends on real elapsed time instead of a counter and an inaccurate sleep duration.
Btw. the variable operation should be either an atomic boolean or protected by a mutex (since it is accessed from different threads).
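A minimal sketch of the atomic variant, using C11 <stdatomic.h>. The names keep_running, request_stop and should_run are hypothetical and not part of the code above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stop flag shared between main and the report thread. */
static atomic_bool keep_running = true;

/* Called from the main thread when the user asks to quit. */
void request_stop(void)
{
    atomic_store(&keep_running, false);
}

/* Polled by the report thread, e.g. as its while-loop condition. */
bool should_run(void)
{
    return atomic_load(&keep_running);
}
```

This removes the data race on the flag itself. It does not shorten the lag caused by sleep; the thread still only sees the change when it next checks the flag.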
To answer your questions:
1. This should be answered by the example above.
2. As mentioned before, sleep is a cancellation point (see man pthreads, section Cancellation points), so a pending cancellation request is acted upon while the thread sleeps. Separately, sleep returns early if a signal handler runs in the calling thread.
3. See man pthread_cancel, section Notes:
On Linux, cancellation is implemented using signals. Under the NPTL threading implementation, the first real-time signal (i.e., signal 32) is used for this purpose. On LinuxThreads, the second real-time signal is used, if real-time signals are available, otherwise SIGUSR2 is used.
pthread_cancel is not a substitute for pthread_join. You have to call pthread_join in either case (described in detail in the man page).
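To illustrate that last point, here is a small sketch (the function names sleeper and cancel_and_join_demo are invented for this example) that cancels a sleeping thread and still joins it afterwards. It assumes the default deferred cancellation type, in which the request takes effect at the next cancellation point, here sleep.

```c
#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

/* Thread body: sleeps forever; sleep() is a cancellation point. */
static void *sleeper(void *arg)
{
    (void)arg;
    for (;;)
        sleep(1);
    return NULL;   /* not reached */
}

/* Returns true if the thread was cancelled and then joined successfully. */
bool cancel_and_join_demo(void)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, sleeper, NULL) != 0)
        return false;
    if (pthread_cancel(tid) != 0)           /* only requests cancellation */
        return false;
    if (pthread_join(tid, &result) != 0)    /* waits and releases the thread's resources */
        return false;
    return result == PTHREAD_CANCELED;
}
```

A thread that holds locks or other resources when it is cancelled needs cleanup handlers, which is one reason a stop flag is often preferred over cancellation.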
I don't know if this will fix all your problems, but it's a bit too much for a comment. One problem, your ReportThread function signature is wrong. It should be:
void* ReportThread(void* args);
And then in that function you need to do something like:
void* ReportThread(void* args)
{
bool* operation = (bool*)args;
while(*operation)
{
...
}
}
I'm not sure how it's working right now, but your compiler should at least be issuing a warning trying to convert a bool* type to a bool.
Also be aware of race conditions on operation
I've created a Timer pseudo class in C that has call back capability and can be cancelled. I come from the .NET/C# world where this is all done by the framework and I'm not an expert with pthreads.
In .NET there are cancellation tokens which you can wait on which means I don't need to worry so much about the nuts and bolts.
However using pthreads is a bit more low level than I am used to so my question is:
Are there any issues with the way I have implemented this?
Thanks in anticipation for any comments you may have.
Timer struct:
typedef struct _timer
{
pthread_cond_t Condition;
pthread_mutex_t ConditionMutex;
bool IsRunning;
pthread_mutex_t StateMutex;
pthread_t Thread;
int TimeoutMicroseconds;
void * Context;
void (*Callback)(bool isCancelled, void * context);
} TimerObject, *Timer;
C Module:
static void *
TimerTask(Timer timer)
{
struct timespec timespec;
struct timeval now;
int returnValue = 0;
clock_gettime(CLOCK_REALTIME, &timespec);
timespec.tv_sec += timer->TimeoutMicroseconds / 1000000;
timespec.tv_nsec += (timer->TimeoutMicroseconds % 1000000) * 1000000;
pthread_mutex_lock(&timer->StateMutex);
timer->IsRunning = true;
pthread_mutex_unlock(&timer->StateMutex);
pthread_mutex_lock(&timer->ConditionMutex);
returnValue = pthread_cond_timedwait(&timer->Condition, &timer->ConditionMutex, &timespec);
pthread_mutex_unlock(&timer->ConditionMutex);
if (timer->Callback != NULL)
{
(*timer->Callback)(returnValue != ETIMEDOUT, timer->Context);
}
pthread_mutex_lock(&timer->StateMutex);
timer->IsRunning = false;
pthread_mutex_unlock(&timer->StateMutex);
return 0;
}
void
Timer_Initialize(Timer timer, void (*callback)(bool isCancelled, void * context))
{
pthread_mutex_init(&timer->ConditionMutex, NULL);
timer->IsRunning = false;
timer->Callback = callback;
pthread_mutex_init(&timer->StateMutex, NULL);
pthread_cond_init(&timer->Condition, NULL);
}
bool
Timer_IsRunning(Timer timer)
{
pthread_mutex_lock(&timer->StateMutex);
bool isRunning = timer->IsRunning;
pthread_mutex_unlock(&timer->StateMutex);
return isRunning;
}
void
Timer_Start(Timer timer, int timeoutMicroseconds, void * context)
{
timer->Context = context;
timer->TimeoutMicroseconds = timeoutMicroseconds;
pthread_create(&timer->Thread, NULL, TimerTask, (void *)timer);
}
void
Timer_Stop(Timer timer)
{
void * returnValue;
pthread_mutex_lock(&timer->StateMutex);
if (!timer->IsRunning)
{
pthread_mutex_unlock(&timer->StateMutex);
return;
}
pthread_mutex_unlock(&timer->StateMutex);
pthread_cond_broadcast(&timer->Condition);
pthread_join(timer->Thread, &returnValue);
}
void
Timer_WaitFor(Timer timer)
{
void * returnValue;
pthread_join(timer->Thread, &returnValue);
}
Example use:
void
TimerExpiredCallback(bool cancelled, void * context)
{
fprintf(stderr, "TimerExpiredCallback %s with context %s\n",
cancelled ? "Cancelled" : "Timed Out",
(char *)context);
}
void
ThreadedTimerExpireTest()
{
TimerObject timerObject;
Timer_Initialize(&timerObject, TimerExpiredCallback);
Timer_Start(&timerObject, 5 * 1000000, "Threaded Timer Expire Test");
Timer_WaitFor(&timerObject);
}
void
ThreadedTimerCancelTest()
{
TimerObject timerObject;
Timer_Initialize(&timerObject, TimerExpiredCallback);
Timer_Start(&timerObject, 5 * 1000000, "Threaded Timer Cancel Test");
Timer_Stop(&timerObject);
}
Overall, it seems pretty solid work for someone who ordinarily works in different languages and who has little pthreads experience. The idea seems to revolve around pthread_cond_timedwait() to achieve a programmable delay with a convenient cancellation mechanism. That's not unreasonable, but there are, indeed, a few problems.
For one, your condition variable usage is non-idiomatic. The conventional and idiomatic use of a condition variable associates with each wait a condition for whether the thread is clear to proceed. This is tested, under protection of the mutex, before waiting. If the condition is satisfied then no wait is performed. It is tested again after each wakeup, because there is a variety of scenarios in which a thread may return from waiting even though it is not actually clear to proceed. In these cases, it loops back and waits again.
I see at least two such possibilities with your timer:
The timer is cancelled very quickly, before its thread starts to wait. Condition variables do not queue signals, so in this case the cancellation would be ineffective. This is a form of race condition.
Spurious wakeup. This is always a possibility that must be considered. Spurious wakeups are rare under most circumstances, but they really do happen.
It seems natural to me to address that by generalizing your IsRunning to cover more states, perhaps something more like
enum { NEW, RUNNING, STOPPING, FINISHED, ERROR } State;
instead.
Of course, you still have to test that under protection of the appropriate mutex, which brings me to my next point: one mutex should suffice. That one can and should serve both to protect shared state and as the mutex associated with the CV wait. This, too, is idiomatic. It would lead to code in TimerTask() more like this:
// ...
pthread_mutex_lock(&timer->StateMutex);
// Responsibility for setting the state to RUNNING transferred to Timer_Start()
while (timer->State == RUNNING) {
returnValue = pthread_cond_timedwait(&timer->Condition, &timer->StateMutex, &timespec);
switch (returnValue) {
case 0:
if (timer->State == STOPPING) {
timer->State = FINISHED;
}
break;
case ETIMEDOUT:
timer->State = FINISHED;
break;
default:
timer->State = ERROR;
break;
}
}
pthread_mutex_unlock(&timer->StateMutex);
// ...
The accompanying Timer_Start() and Timer_Stop() would be something like this:
void Timer_Start(Timer timer, int timeoutMicroseconds, void * context) {
timer->Context = context;
timer->TimeoutMicroseconds = timeoutMicroseconds;
pthread_mutex_lock(&timer->StateMutex);
timer->State = RUNNING;
// start the thread before releasing the mutex so that no one can see state
// RUNNING before the thread is actually running
pthread_create(&timer->Thread, NULL, TimerTask, (void *)timer);
pthread_mutex_unlock(&timer->StateMutex);
}
void Timer_Stop(Timer timer) {
_Bool should_join = 0;
pthread_mutex_lock(&timer->StateMutex);
switch (timer->State) {
case NEW:
timer->State = FINISHED;
break;
case RUNNING:
timer->State = STOPPING;
should_join = 1;
break;
case STOPPING:
should_join = 1;
break;
default:
// FINISHED or ERROR: no action
break;
}
pthread_mutex_unlock(&timer->StateMutex);
// Harmless if the timer has already stopped:
pthread_cond_broadcast(&timer->Condition);
if (should_join) {
pthread_join(timer->Thread, NULL);
}
}
A few other, smaller adjustments would be needed elsewhere.
Additionally, although the example code above omits it for clarity, you really should ensure that you test the return values of all the functions that provide status information that way, unless you don't care whether they succeeded. That includes almost all standard library and Pthreads functions. What you should do in the event that that one fails is highly contextual, but pretending (or assuming) that it succeeded, instead, is rarely a good choice.
An alternative
Another approach to a cancellable delay would revolve around select() or pselect() with a timeout. To arrange for cancellation, you set up a pipe and have select() listen to the read end. Writing anything to the write end will then wake select().
This is in several ways easier to code, because you don't need any mutexes or condition variables. Also, data written to a pipe persists until it is read (or the pipe is closed), which smooths out some of the timing-related issues that the CV-based approach has to code around.
With select, however, you need to be prepared to deal with signals (at minimum by blocking them), and the timeout is a duration, not an absolute time.
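A rough sketch of that alternative follows. The function name cancellable_delay and its return convention are assumptions made for this example, and it deliberately does not retry on EINTR, so a caller that has not blocked signals would see -1 when a signal interrupts the wait.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Waits up to timeout_us microseconds for data on cancel_read_fd, the read
 * end of a pipe created by the caller.
 * Returns 1 if cancelled (data is readable), 0 on timeout, -1 on error. */
int cancellable_delay(int cancel_read_fd, long timeout_us)
{
    fd_set readfds;
    struct timeval tv;

    FD_ZERO(&readfds);
    FD_SET(cancel_read_fd, &readfds);
    tv.tv_sec  = timeout_us / 1000000;
    tv.tv_usec = timeout_us % 1000000;

    int rc = select(cancel_read_fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}
```

To cancel, another thread writes a single byte to the write end of the pipe. Because the byte stays in the pipe until it is read, a cancellation issued before the delay starts is not lost.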
pthread_mutex_lock(&timer->StateMutex);
timer->IsRunning = true;
pthread_mutex_unlock(&timer->StateMutex);
pthread_mutex_lock(&timer->ConditionMutex);
returnValue = pthread_cond_timedwait(&timer->Condition, &timer->ConditionMutex, &timespec);
pthread_mutex_unlock(&timer->ConditionMutex);
if (timer->Callback != NULL)
{
(*timer->Callback)(returnValue != ETIMEDOUT, timer->Context);
}
You have two bugs here:
1. A cancellation can slip in after IsRunning is set to true and before pthread_cond_timedwait gets called. In this case, you'll wait out the entire timer. This bug exists because ConditionMutex doesn't protect any shared state. To use a condition variable properly, the mutex associated with the condition variable must protect the shared state. You can't trade the right mutex for the wrong mutex and then call pthread_cond_timedwait because that creates a race condition. The entire point of a condition variable is to provide an atomic "unlock and wait" operation to prevent this race condition and your code goes to effort to break that logic.
2. You don't check the return value of pthread_cond_timedwait. If neither the timeout has expired nor cancellation has been requested, you call the callback anyway. Condition variables are stateless. It is your responsibility to track and check state, the condition variable will not do this for you. You need to call pthread_cond_timedwait in a loop until either the state is set to STOPPING or the timeout is reached. Note that the mutex associated with the condition variable, as in 1 above, must protect the shared state -- in this case state.
I think you have a fundamental misunderstanding about how condition variables work and what they're for. They are used when you have a mutex that protects shared state and you want to wait for that shared state to change. The mutex associated with the condition variable must protect the shared state to avoid the classic race condition where the state changes after you released the lock but before you managed to start waiting.
UPDATE:
To provide some more useful information, let me briefly explain what a condition variable is for. Say you have some shared state protected by a mutex. And say some thread can't make forward progress until that shared state changes.
You have a problem. You have to hold the mutex that protects the shared state to see what the state is. When you see that it's in the wrong state, you need to wait. But you also need to release the mutex or no other thread can change the shared state.
But if you unlock the mutex and then wait (which is what your code does above!) you have a race condition. After you unlock the mutex but before you wait, another thread can acquire the mutex and change the shared state such that you no longer want to wait. So you need an atomic "unlock the mutex and wait" operation.
That is the purpose, and the only purpose, of condition variables: you can atomically release the mutex that protects some shared state and wait for a signal, with no chance for the signal to be lost between when you released the mutex and when you started waiting.
Another important point -- condition variables are stateless. They have no idea what you are waiting for. You must never call pthread_cond_wait or pthread_cond_timedwait and make assumptions about the state. You must check it yourself. Your code releases the mutex after pthread_cond_timedwait returns. You only want to do that if the call times out.
If pthread_cond_timedwait doesn't timeout (or, in any case, when pthread_cond_wait returns), you don't know what happened until you check the state. That's why these functions re-acquire the mutex -- so you can check the state and decide what to do. This is why these functions are almost always called in a loop -- if the thing you're waiting for still hasn't happened (which you determine by checking the shared state that you are responsible for), you need to keep waiting.
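As a sketch of the pattern described above, with one mutex protecting the state and the wait done in a loop, consider the following. The type cancellable_timer and the ctimer_ functions are hypothetical names, not the asker's API, and the deadline uses the condition variable's default clock, CLOCK_REALTIME.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical timer state: a single mutex protects stop_requested and is
 * also the mutex used for the condition wait. */
struct cancellable_timer {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    bool            stop_requested;
};

#define CANCELLABLE_TIMER_INIT \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false }

/* Requests cancellation; safe to call before the wait has started. */
void ctimer_request_stop(struct cancellable_timer *t)
{
    pthread_mutex_lock(&t->mutex);
    t->stop_requested = true;
    pthread_cond_broadcast(&t->cond);
    pthread_mutex_unlock(&t->mutex);
}

/* Waits until a stop is requested or timeout_us microseconds have passed.
 * Returns true if cancelled, false if the timeout expired. */
bool ctimer_wait(struct cancellable_timer *t, long timeout_us)
{
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_us / 1000000;
    deadline.tv_nsec += (timeout_us % 1000000) * 1000;  /* microseconds to nanoseconds */
    if (deadline.tv_nsec >= 1000000000L) {              /* keep tv_nsec below one second */
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&t->mutex);
    int rc = 0;
    /* Check the state before waiting and after every wakeup. */
    while (!t->stop_requested && rc == 0)
        rc = pthread_cond_timedwait(&t->cond, &t->mutex, &deadline);
    bool cancelled = t->stop_requested;
    pthread_mutex_unlock(&t->mutex);
    return cancelled;
}
```

Because the flag is checked under the mutex before the first wait, a stop request that arrives before the waiter gets going is still seen, and a spurious wakeup simply goes around the loop again with the same deadline.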
I've got the following function that gets called from a pthread_create. This function does some work, sets a timer, does some other work and then waits for the timer to expire before doing the loop again. However, on the first run of the timer, after it expires the program quits and I'm not totally sure why. It should never leave the infinite while loop. The main thread accesses nothing from this thread and vice versa (for now).
My guess is that I might not have something set up correctly with the thread, or the timer is not calling the handler function correctly. Perhaps changing the IDLE global variable from the thread causes a problem.
I would like to call the handler without signals, hence the use of SIGEV_THREAD_ID. I'm using the SIGUSRx signals in the main thread anyway. Any thoughts about what I've started here what could be wrong?
#ifndef sigev_notify_thread_id
#define sigev_notify_thread_id _sigev_un._tid
#endif
volatile sig_atomic_t IDLE = 0;
timer_t timer_id;
struct sigevent sev;
void handler() {
printf("Timer expired.\n");
IDLE = 0;
}
void *thread_worker() {
struct itimerspec ts;
/* setup the handler for timer event */
memset (&sev, 0, sizeof(struct sigevent));
sev.sigev_notify = SIGEV_THREAD_ID;
sev.sigev_value.sival_ptr = NULL;
sev.sigev_notify_function = handler;
sev.sigev_notify_attributes = NULL;
sev.sigev_signo = SIGRTMIN + 1;
sev.sigev_notify_thread_id = syscall(SYS_gettid);
/* setup "idle" timer */
ts.it_value.tv_sec = 55;
ts.it_value.tv_nsec = 0;
ts.it_interval.tv_sec = 0;
ts.it_interval.tv_nsec = 0;
if (timer_create(0, &sev, &timer_id) == -1) {
printf("timer_create failed: %d: %s\n", errno, strerror(errno));
exit(3);
}
while (1) {
// do work here before timer gets started that takes 5 seconds
while (IDLE); /* wait here until timer_id expires */
/* setup timer */
if (timer_settime(timer_id, 0, &ts, NULL) == -1) {
printf("timer_settime failed: %d\n", errno);
exit(3);
}
IDLE = 1;
// do work here while timer is running but that does not take 10 seconds
}
}
As far as I can tell, you haven't installed a handler for the signal the timer delivers (SIGRTMIN + 1 in the code shown), so when it arrives its default action is taken, which terminates the process.
In any case, the whole thing strikes me as extraordinarily bad design:
The while loop will give you 100% cpu load while waiting for the timer to expire.
This is not the way you use SIGEV_THREAD_ID, and in fact SIGEV_THREAD_ID isn't really set up to be usable by applications. Rather it's for the libc to use internally for implementing SIGEV_THREAD.
You really don't want to be using signals. They're messy.
If you have threads, why aren't you just calling clock_nanosleep in a loop? Timers are mainly useful when you can't do this, e.g. when you can't use threads.
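A minimal sketch of the clock_nanosleep approach, assuming the work in each iteration takes less than one period. The names run_periodic and count_call are made up for this example; count_call only stands in for the real work.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Example work function: counts how often it was called. */
static void count_call(void *arg)
{
    ++*(int *)arg;
}

/* Calls work(arg) `iterations` times, one call per period_ns nanoseconds,
 * sleeping until absolute deadlines on CLOCK_MONOTONIC so the period does
 * not drift with the duration of the work. Returns 0 on success, -1 on error. */
int run_periodic(void (*work)(void *), void *arg, long period_ns, int iterations)
{
    struct timespec next;
    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;

    for (int i = 0; i < iterations; i++) {
        work(arg);

        /* Advance the deadline by one period and normalize tv_nsec. */
        next.tv_nsec += period_ns;
        while (next.tv_nsec >= 1000000000L) {
            next.tv_sec  += 1;
            next.tv_nsec -= 1000000000L;
        }

        /* clock_nanosleep returns the error number directly, not via errno. */
        int rc;
        do {
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        } while (rc == EINTR);
        if (rc != 0)
            return -1;
    }
    return 0;
}
```

The thread sleeps in the kernel between iterations, so it uses no CPU while waiting, and no signal handler or timer object is involved.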
I'm building a multithreaded application with pthreads and need a thread to periodically check some stuff. During the time in between this thread shouldn't use any CPU. Is this possible with usleep()? Is usleep() not busy waiting? Or is there a better solution?
The function usleep has been removed from SUSv4. You should probably use nanosleep instead or timers (setitimer, etc).
As R.. notes in the comments, should the sleep be implemented as a busy wait:
The thread would continue to use the CPU
Other (lower-priority) threads wouldn't get a chance to run
Thus:
Some might use signals (I think SUSv3 mentioned SIGALRM?)
Some might use fancy timers
(usleep is not part of the C standard, but of an ancient POSIX standard. But see below.)
No, the POSIX specification of usleep clearly states
The usleep() function will cause the calling thread to be suspended
from execution ...
so this clearly requires that it suspends execution and leaves the resources to other processes or threads.
As already mentioned by others, the POSIX function nanosleep now replaces usleep and you should use that. C (since C11) has a function thrd_sleep that is modeled after nanosleep.
Just be aware that both usleep() and nanosleep() can be interrupted by a signal. nanosleep() lets you pass in an extra timespec pointer where the remaining time will be stored if that happens. So if you really need to guarantee your delay times, you'll probably want to write a simple wrapper around nanosleep().
Beware that this is not tested, but something along these lines:
#include <errno.h>
#include <time.h>

int myNanoSleep(time_t sec, long nanosec)
{
    /* Set up the requested sleep time */
    struct timespec req;
    req.tv_sec = sec;
    req.tv_nsec = nanosec;
    /* Loop until we've slept long enough */
    do
    {
        /* Store the remainder back on top of the original requested time */
        if( 0 != nanosleep( &req, &req ) )
        {
            /* If any error other than a signal interrupt occurs, return an error */
            if(errno != EINTR)
                return -1;
        }
        else
        {
            /* nanosleep succeeded, so exit the loop */
            break;
        }
    } while( req.tv_sec > 0 || req.tv_nsec > 0 );
    return 0; /* Return success */
}
And if you ever need to wake the thread for something other than a periodic timeout, take a look at condition variables and pthread_cond_timedwait().
On Linux, it is implemented with the nanosleep system call which is not a busy wait.
Using strace, I can see that a call to usleep(1) is translated into nanosleep({0, 1000}, NULL).
usleep() is a C runtime library function built upon system timers.
nanosleep() is a system call.
Only MS-DOS and its ilk implement the sleep functions as busy waits. Any actual operating system which offers multitasking can easily provide a sleep function as a simple extension of mechanisms for coordinating tasks and processes.
I'm having a strange problem. I have the following code:
dbg("condwait: timeout = %d, %d\n",
abs_timeout->tv_sec, abs_timeout->tv_nsec);
ret = pthread_cond_timedwait( &q->q_cond, &q->q_mtx, abs_timeout );
if (ret == ETIMEDOUT)
{
dbg("cond timed out\n");
return -ETIMEDOUT;
}
dbg calls gettimeofday before every line and prepends the line with the time. It results in the following output:
7.991151: condwait: timeout = 5, 705032704
7.991158: cond timed out
As you can see, only 7 microseconds passed in between the two debug lines, yet pthread_cond_timedwait returned ETIMEDOUT. How can this happen? I even tried setting the clock to something else when initializing the cond variable:
int ret;
ret = pthread_condattr_init(&attributes);
if (ret != 0) printf("CONDATTR INIT FAILED: %d\n", ret);
ret = pthread_condattr_setclock(&attributes, CLOCK_REALTIME);
if (ret != 0) printf("SETCLOCK FAILED: %d\n", ret);
ret = pthread_cond_init( &q->q_cond, &attributes );
if (ret != 0) printf("COND INIT FAILED: %d\n", ret);
(none of the error messages are printed out). I tried both CLOCK_REALTIME and CLOCK_MONOTONIC.
This code is part of a blocking queue. I need functionality such that if nothing gets put on this queue in 5 seconds, something else happens. The mutex and the cond are both initialized, as the blocking queue works fine if I don't use pthread_cond_timedwait.
pthread_cond_timedwait takes an absolute time, not a relative time. You need to make your wait time absolute by adding the current time to your timeout value.
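A small sketch of that conversion follows. The helper name abs_deadline is hypothetical, and it assumes the relative timeout is already normalized (tv_nsec below one second) and that the condition variable uses its default clock, CLOCK_REALTIME.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical helper: *out = current CLOCK_REALTIME time + *rel.
 * Returns 0 on success, -1 if the clock could not be read. */
int abs_deadline(const struct timespec *rel, struct timespec *out)
{
    struct timespec now;
    if (clock_gettime(CLOCK_REALTIME, &now) != 0)
        return -1;

    out->tv_sec  = now.tv_sec + rel->tv_sec;
    out->tv_nsec = now.tv_nsec + rel->tv_nsec;
    if (out->tv_nsec >= NSEC_PER_SEC) {   /* carry into seconds */
        out->tv_sec  += 1;
        out->tv_nsec -= NSEC_PER_SEC;
    }
    return 0;
}
```

The result can be passed directly as the abstime argument of pthread_cond_timedwait. A value such as { 5, 705032704 } passed without this conversion refers to a moment shortly after the epoch, which is why the wait in the question times out immediately.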
Overflow in timespec is usually the culprit for weird timeouts.
Check for EINVAL:
void timespec_add(struct timespec* a, struct timespec* b, struct timespec* out)
{
time_t sec = a->tv_sec + b->tv_sec;
long nsec = a->tv_nsec + b->tv_nsec;
sec += nsec / 1000000000L;
nsec = nsec % 1000000000L;
out->tv_sec = sec;
out->tv_nsec = nsec;
}
The condition variable can spuriously unblock. You need to wait in a loop and re-check the condition each time through. Since the timeout is an absolute time, the same deadline can be passed on every iteration; only a relative timeout would need to be recomputed.
The documentation for pthread_cond_timedwait says:
When using condition variables there is always a Boolean predicate involving shared variables associated with each condition wait that is true if the thread should proceed. Spurious wakeups from the pthread_cond_timedwait() or pthread_cond_wait() functions may occur. Since the return from pthread_cond_timedwait() or pthread_cond_wait() does not imply anything about the value of this predicate, the predicate should be re-evaluated upon such return.
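As other answers already mentioned, you have to use an absolute time. Since C11 you can get the current time with timespec_get(); TIME_UTC is based on the system (wall-clock) time, which on typical implementations corresponds to the condition variable's default clock, CLOCK_REALTIME.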
struct timespec time;
timespec_get(&time, TIME_UTC);
time.tv_sec += 5;
pthread_cond_timedwait(&cond, &mutex, &time);