In the (pseudo-)code below, cond might wake up while it shouldn't, for whatever reason. So I put a while loop there. When it does wake up, it will still consume the lock, so it is guaranteed that in out() only one thread is doing its job.
But what happens if, while there is a spurious wake-up in out(), at the same time in() signals to out(), however at that very moment out() is already locked because of the spurious wake-up. So what happens if the cond signals to a locked thread?
in()
inLock.lock()
isEmpty = false
cond.signal()
inLock.unlock()
out()
outLock.lock()
while isEmpty
cond.wait(outLock)
isEmpty = true
outLock.unlock()
NOTE
Well, to be 100% safe, I know I can use a single mutex for both in() and out(), but the data structure I'm using is 100% safe when input and output happens at the same time; it is a type of a queue. And I think it is a performance compromise to block anything reading out from the queue while filling in some new data or vice versa.
I did consider using semaphores, but the problems is that so many C and C++ libraries don't implement semaphores for whatever reason.
You have to use the same mutex when the in() thread sets isEmpty = false and the out() thread tests while (isEmpty). Otherwise, this can happen:
out() thread tests isEmpty, finds it is true;
in() thread sets isEmpty to false and signals the condition variable (but no-one wakes up, beacuse no-one is waiting yet);
out() thread calls cond.wait() and blocks forever, despite the fact that the queue is not empty anymore.
Note that in this sequence there hasn't been a spurious wakeup - it's just a plain old race condition.
As long as you update isEmpty with the same mutex held as when you test isEmpty, this interleaving can't happen.
So what happens if the cond signals to a locked thread?
The signal is lost forever. If no threads are waiting for the signal when pthread_cond_signal is called, then pthread_cond_signal does nothing.
Since isEmpty is being read and modified by two different threads, it is an error to access it unprotected. This is essentially what you are doing when you allow in and out to use different lock instances.
Using different lock instances on the same condition variable is a violation of the POSIX API for pthread_cond_wait() (emphasis mine).
The effect of using more than one mutex for concurrent pthread_cond_wait() or pthread_cond_timedwait() operations on the same condition variable is undefined; that is, a condition variable becomes bound to a unique mutex when a thread waits on the condition variable, and this (dynamic) binding ends when the wait returns.
Related
I'm trying to wrap my head around pthread condition variables.
I've seen some code examples that use pthread_cond_wait and pthread_cond_signal and all of them look like this:
while (condition)
{
// Assume that the mutex is locked before the following call
pthread_cond_wait(&cond, &mutex);
}
Is there a reason for using a while loop on the condition? why not just use a single if statement?
Spurious wakeups.
See Why does pthread_cond_wait have spurious wakeups? and also https://en.wikipedia.org/wiki/Spurious_wakeup:
Spurious wakeup describes a complication in the use of condition variables as provided by certain multithreading APIs such as POSIX
Threads and the Windows API.
Even after a condition variable appears to have been signaled from a
waiting thread's point of view, the condition that was awaited may
still be false. One of the reasons for this is a spurious wakeup; that
is, a thread might be awoken from its waiting state even though no
thread signaled the condition variable. For correctness it is
necessary, then, to verify that the condition is indeed true after the
thread has finished waiting. Because spurious wakeup can happen
repeatedly, this is achieved by waiting inside a loop that terminates
when the condition is true ...
I believe this is because threads may be spuriously woken up before the condition is met. Man, talk about "gotcha's".
From the Open Group documentation:
"Spurious wakeups from the pthread_cond_wait() or pthread_cond_timedwait() functions may occur."
Source:
http://pubs.opengroup.org/onlinepubs/007908775/xsh/pthread_cond_wait.html
Is there a reason for using a while loop on the condition? why not just use a single if statement?
The idea behind condition variables is to suspend thread execution until a given condition is satisfied. The condition is not built into the variable, however -- it must be provided by the programmer.
If the relevant condition is already satisfied, then there is no need to suspend operation. The key here, however, is that when the thread resumes, the condition might still not be satisfied, either because something changed between the condition variable being signaled and the thread being able to proceed, or because the thread awoke spurriously (this is allowed to happen, though it rarely does). The program therefore checks the condition again upon every wakeup to see whether it must resume waiting.
When reading a book on concurrency, the author says a semaphore is different than a condition variable in the way signal() works. The semaphore keeps track of the number of calls to signal() while the condition variable does not. "Calling pthread_cond_signal while no one is waiting has no effect", it says. Why is this detail important (I have seen it repeated many times in different places)? What are the implications to usage? Thank you
Conceptually, a semaphore is equivalent to a mutex, condition variable, and integer counter protected by the mutex. Under this analogy, posting a semaphore is equivalent to locking the mutex, incrementing the counter, signaling the condition variable, and unlocking the mutex. Even if there is no waiter, state is still modified.
Under this analogy, waiters for the semaphore are doing the equivalent of:
Lock mutex.
While count is non-positive, wait on condition variable.
Decrement count.
Unlock mutex.
Of course if you're talking about the specific case of POSIX, the analogy does not correspond fully to the reality, because semaphores have additional async-signal-safety properties that preclude implementing them using a mutex/condvar/count triple.
The implications are that you have to have a 'condition' that the condition variable is associated with. The way you have to use the condition variable is:
acquire the condition's mutex
while (!condition) {
wait on the condition variable
}
do whatever you need to do while holding the mutex
release the mutex
Correspondingly, whenever the condition associated with the condition variable is updated, it has to be done while holding the mutex. That way, when blocking on the condition variable, the condition cannot have changed until the system is prepared to actually unblock the waiting thread.
Suppose a condition variable is used in a situation where the signaling thread modifies the state affecting the truth value of the predicate and calls pthread_cond_signal without holding the mutex associated with the condition variable? Is it true that this type of usage is always subject to race conditions where the signal may be missed?
To me, there seems to always be an obvious race:
Waiter evaluates the predicate as false, but before it can begin waiting...
Another thread changes state in a way that makes the predicate true.
That other thread calls pthread_cond_signal, which does nothing because there are no waiters yet.
The waiter thread enters pthread_cond_wait, unaware that the predicate is now true, and waits indefinitely.
But does this same kind of race condition always exist if the situation is changed so that either (A) the mutex is held while calling pthread_cond_signal, just not while changing the state, or (B) so that the mutex is held while changing the state, just not while calling pthread_cond_signal?
I'm asking from a standpoint of wanting to know if there are any valid uses of the above not-best-practices usages, i.e. whether a correct condition-variable implementation needs to account for such usages in avoiding race conditions itself, or whether it can ignore them because they're already inherently racy.
The fundamental race here looks like this:
THREAD A THREAD B
Mutex lock
Check state
Change state
Signal
cvar wait
(never awakens)
If we take a lock EITHER on the state change OR the signal, OR both, then we avoid this; it's not possible for both the state-change and the signal to occur while thread A is in its critical section and holding the lock.
If we consider the reverse case, where thread A interleaves into thread B, there's no problem:
THREAD A THREAD B
Change state
Mutex lock
Check state
( no need to wait )
Mutex unlock
Signal (nobody cares)
So there's no particular need for thread B to hold a mutex over the entire operation; it just need to hold the mutex for some, possible infinitesimally small interval, between the state change and signal. Of course, if the state itself requires locking for safe manipulation, then the lock must be held over the state change as well.
Finally, note that dropping the mutex early is unlikely to be a performance improvement in most cases. Requiring the mutex to be held reduces contention over the internal locks in the condition variable, and in modern pthreads implementations, the system can 'move' the waiting thread from waiting on the cvar to waiting on the mutex without waking it up (thus avoiding it waking up only to immediately block on the mutex).
As pointed out in the comments, dropping the mutex may improve performance in some cases, by reducing the number of syscalls needed. Then again it could also lead to extra contention on the condition variable's internal mutex. Hard to say. It's probably not worth worrying about in any case.
Note that the applicable standards require that pthread_cond_signal be safely callable without holding the mutex:
The pthread_cond_signal() or pthread_cond_broadcast() functions may be called by a thread whether or not it currently owns the mutex that threads calling pthread_cond_wait() or pthread_cond_timedwait() have associated with the condition variable during their waits [...]
This usually means that condition variables have an internal lock over their internal data structures, or otherwise use some very careful lock-free algorithm.
The state must be modified inside a mutex, if for no other reason than the possibility of spurious wake-ups, which would lead to the reader reading the state while the writer is in the middle of writing it.
You can call pthread_cond_signal anytime after the state is changed. It doesn't have to be inside the mutex. POSIX guarantees that at least one waiter will awaken to check the new state. More to the point:
Calling pthread_cond_signal doesn't guarantee that a reader will acquire the mutex first. Another writer might get in before a reader gets a chance to check the new status. Condition variables don't guarantee that readers immediately follow writers (After all, what if there are no readers?)
Calling it after releasing the lock is actually better, since you don't risk having the just-awoken reader immediately going back to sleep trying to acquire the lock that the writer is still holding.
EDIT: #DietrichEpp makes a good point in the comments. The writer must change the state in such a way that the reader can never access an inconsistent state. It can do so either by acquiring the mutex used in the condition-variable, as I indicate above, or by ensuring that all state-changes are atomic.
The answer is, there is a race, and to eliminate that race, you must do this:
/* atomic op outside of mutex, and then: */
pthread_mutex_lock(&m);
pthread_mutex_unlock(&m);
pthread_cond_signal(&c);
The protection of the data doesn't matter, because you don't hold the mutex when calling pthread_cond_signal anyway.
See, by locking and unlocking the mutex, you have created a barrier. During that brief moment when the signaler has the mutex, there is a certainty: no other thread has the mutex. This means no other thread is executing any critical regions.
This means that all threads are either about to get the mutex to discover the change you have posted, or else they have already found that change and ran off with it (releasing the mutex), or else have not found they are looking for and have atomically given up the mutex to gone to sleep (and are guaranteed to be waiting nicely on the condition).
Without the mutex lock/unlock, you have no synchronization. The signal will sometimes fire as threads which didn't see the changed atomic value are transitioning to their atomic sleep to wait for it.
So this is what the mutex does from the point of view of a thread which is signaling. You can get the atomicity of access from something else, but not the synchronization.
P.S. I have implemented this logic before. The situation was in the Linux kernel (using my own mutexes and condition variables).
In my situation, it was impossible for the signaler to hold the mutex for the atomic operation on shared data. Why? Because the signaler did the operation in user space, inside a buffer shared between the kernel and user, and then (in some situations) made a system call into the kernel to wake up a thread. User space simply made some modifications to the buffer, and then if some conditions were satisfied, it would perform an ioctl.
So in the ioctl call I did the mutex lock/unlock thing, and then hit the condition variable. This ensured that the thread would not miss the wake up related to that latest modification posted by user space.
At first I just had the condition variable signal, but it looked wrong without the involvement of the mutex, so I reasoned about the situation a little bit and realized that the mutex must simply be locked and unlocked to conform to the synchronization ritual which eliminates the lost wakeup.
The typical pattern I've seen for using pthread_cond_wait is:
pthread_mutex_lock(&lock);
while (!test)
pthread_cond_wait(&condition, &lock);
pthread_mutex_unlock(&lock);
Why can't an if statement be used instead of a while loop.
pthread_mutex_lock(&lock);
if (!test)
pthread_cond_wait(&condition, &lock);
pthread_mutex_unlock(&lock);
http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_cond_timedwait.html
When using condition variables there is always a Boolean predicate involving shared variables associated with each condition wait that is true if the thread should proceed. Spurious wakeups from the pthread_cond_timedwait() or pthread_cond_wait() functions may occur. Since the return from pthread_cond_timedwait() or pthread_cond_wait() does not imply anything about the value of this predicate, the predicate should be re-evaluated upon such return.
The while loop is a very standard way of re-evaluating the predicate, as required by POSIX.
The if is possible but rarely used. You just would have to be sure that the state of your application is as you come out of the wait. Usually you would have to ensure that you'd only use pthread_cond_signal at the other end (and not broadcast), e.g.
There are two reasons. One is spurious wakeups, as mentioned in other answers. The other is real wakeups. Imagine this:
You have two threads, A and B, blocked on the same condition variable that protects a queue, the queue is empty. Some thread puts a job on the queue, waking thread A. Another thread puts a job on the queue, waking thread B. Thread A starts running first, and does the first job. It gets back to the if and sees that the condition is still true (since there's a second job), so it does that job too. Now thread B gets the processor, returning from pthread_cond_wait, but the predicate is not false, since the queue is empty.
You only know that the predicate was true when some other thread asked that you be woken up. You have no way to know that it will still be true when you finally do get scheduled and are able to re-acquire the mutex.
I have encountered a problem while implementing wait and signal conditions on multiple threads.
A thread needs to lock a mutex and wait on a condition variable until some other thread signals it. In the meanwhile, another thread locks the same mutex and waits on the same condition variable. Now, the thread which is running concurrently throughout the process signals the condition variable but I want only the first thread that is waiting must be signalled and not the others.
If two threads wait on the same condition variable, they must be prepared to handle the same conditions, or you must carefully construct your program so they are never waiting on the condition variable at the same time.
Why does this notification have to be handled by the first thread and not the second?
You may be better off with two separate condition variables.
Use pthread_cond_signal() to wake up one of the threads.
However, more than one might be awoken; this is termed spurious wakeup. You need a variable to track your application state, as described in the manual page linked above.
Your requirement is impossible. You say "... I want only the first thread that is waiting must be signalled and not the others." But condition variables never, ever provide any way to ensure a thread isn't signaled. So if you have a requirement that a thread must not be signaled, you cannot use condition variables.
You must always use a condition variable like this:
while(NotSupposedToRun)
pthread_cond_wait(...);
So if the thread wakes up when it's not supposed to, the while is still false and the thread just goes back to sleep. This is mandatory because POSIX does not ever provide any guarantee that a thread won't be woken. An implementation is perfectly free to implement pthread_cond_signal as a call to pthread_cond_broadcast and unblock all threads on every signal if it wants to.
Because condition variables are stateless, the implementation never knows whether a thread is supposed to be woken or not for sure. It is your job to call pthread_cond_wait always, and only, when a thread should not be running.
See
http://en.wikipedia.org/wiki/Spurious_wakeup
for more details.
If you cannot precisely specify the wakeup conditions for each thread in a while loop like the one above, you should not be using condition variables.