Why is a while loop needed around pthread wait conditions? - c

I'm learning pthread and wait conditions. As far as I can tell a typical waiting thread is like this:
pthread_mutex_lock(&m);
while(!condition)
pthread_cond_wait(&cond, &m);
// Thread stuff here
pthread_mutex_unlock(&m);
What I can't understand is why the line while(!condition) is necessary even if I use pthread_cond_signal() to wake up the thread.
I can understand that if I use pthread_cond_broadcast() I need to test the condition, because I wake up all waiting threads and one of them can make the condition false again before unlocking the mutex (thus transferring execution to another woken-up thread, which should not execute at that point).
But if I use pthread_cond_signal() I wake up just one thread so the condition must be true. So the code could look like this:
pthread_mutex_lock(&m);
pthread_cond_wait(&cond, &m);
// Thread stuff here
pthread_mutex_unlock(&m);
I read something about spurious signals that may happen. Is this (and only this) the reason? Why would I get spurious signals? Or is there something else I don't get?
I assume the signal code is like this:
pthread_mutex_lock(&m);
condition = true;
pthread_cond_signal(&cond); // Should wake up *one* thread
pthread_mutex_unlock(&m);

The real reason you should put pthread_cond_wait in a while loop is not because of spurious wakeup. Even if your condition variable did not have spurious wakeup, you would still need the loop to catch a common type of error. Why? Consider what can happen if multiple threads wait on the same condition:
Thread 1                         Thread 2           Thread 3
check condition (fails)
(in cond_wait) unlock mutex
(in cond_wait) wait
                                 lock mutex
                                 set condition
                                 signal condvar
                                 unlock mutex
                                                    lock mutex
                                                    check condition (succeeds)
                                                    do stuff
                                                    unset condition
                                                    unlock mutex
(in cond_wait) wake up
(in cond_wait) lock mutex
<thread is awake, but condition
 is unset>
The problem here is that the thread must release the mutex before waiting, potentially allowing another thread to 'steal' whatever that thread was waiting for. Unless it is guaranteed that only one thread can wait on that condition, it is incorrect to assume that the condition is valid when a thread wakes up.
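To make the "stolen" wakeup concrete, here is a minimal sketch with several consumers sharing one counter. The names items, produced_all, consumed and run_stolen_wakeup_demo are made up for illustration, not taken from the question. Each consumer re-tests the predicate after every return from pthread_cond_wait, so a consumer that wakes up after another one has already taken the item simply goes back to waiting.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_CONSUMERS 4
#define NUM_ITEMS 1000

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int items = 0;        /* shared state, guarded by m */
static int produced_all = 0; /* set once the producer is finished */
static int consumed = 0;     /* total items taken, guarded by m */

static void *consumer(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&m);
        /* Re-check after every wakeup: another consumer may have
         * taken the item between the signal and our wakeup. */
        while (items == 0 && !produced_all)
            pthread_cond_wait(&cond, &m);
        if (items == 0) {    /* nothing left and the producer is done */
            pthread_mutex_unlock(&m);
            return NULL;
        }
        items--;
        consumed++;
        pthread_mutex_unlock(&m);
    }
}

/* Hypothetical driver: returns the number of items consumed. */
int run_stolen_wakeup_demo(void)
{
    pthread_t t[NUM_CONSUMERS];
    for (int i = 0; i < NUM_CONSUMERS; i++) {
        int rc = pthread_create(&t[i], NULL, consumer, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_ITEMS; i++) {
        pthread_mutex_lock(&m);
        items++;
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&m);
    }
    pthread_mutex_lock(&m);
    produced_all = 1;
    pthread_cond_broadcast(&cond); /* let every consumer see the end */
    pthread_mutex_unlock(&m);

    for (int i = 0; i < NUM_CONSUMERS; i++)
        pthread_join(t[i], NULL);
    return consumed;
}
```

With the while loop, items can never go below zero and every item is consumed exactly once. Replacing the while with an if would let a late waker decrement an empty counter.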

Suppose you don't check the condition. Then usually you can't avoid the following bad thing happening (at least, you can't avoid it in one line of code):
Sender                             Receiver
locks mutex
sets condition
signals condvar, but nothing
  is waiting so has no effect
releases mutex
                                   locks mutex
                                   waits. Forever.
Of course your second code example could avoid this by doing:
pthread_mutex_lock(&m);
if (!condition) pthread_cond_wait(&cond, &m);
// Thread stuff here
pthread_mutex_unlock(&m);
Then it would certainly be the case that if there is only ever at most one receiver, and if cond_signal were the only thing that could wake it up, then it would only ever wake up when the condition was set and hence would not need a loop. nos covers why the second "if" isn't true.
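A small sketch of the sender-first ordering, assuming a hypothetical driver run_early_signal_demo and a waits counter that I added only to make the behaviour observable. The sender signals before the receiver even exists. Because the receiver tests the predicate before waiting, it never calls pthread_cond_wait at all and so cannot block forever.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool condition = false;

static void *receiver(void *arg)
{
    int *waits = arg;          /* counts calls to pthread_cond_wait */
    pthread_mutex_lock(&m);
    while (!condition) {       /* already true in this demo */
        (*waits)++;
        pthread_cond_wait(&cond, &m);
    }
    /* Thread stuff here */
    pthread_mutex_unlock(&m);
    return NULL;
}

/* Hypothetical driver: signals first, starts the receiver afterwards,
 * and returns how many times the receiver had to wait. */
int run_early_signal_demo(void)
{
    pthread_t t;
    int waits = 0;

    pthread_mutex_lock(&m);
    condition = true;
    pthread_cond_signal(&cond); /* nobody is waiting: no effect */
    pthread_mutex_unlock(&m);

    int rc = pthread_create(&t, NULL, receiver, &waits);
    assert(rc == 0);
    (void)rc;
    pthread_join(t, NULL);
    return waits;
}
```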

Related

Where does the execution happen after a thread has called pthread_wait is now signalled?

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
void thread_1() {
    pthread_mutex_lock(&mutex);
    some_cond = true;
    pthread_cond_signal(&cv);
    pthread_mutex_unlock(&mutex);
}
void thread_2() {
    pthread_mutex_lock(&mutex);
    while (!some_cond)
        pthread_cond_wait(&cv, &mutex);
    printf("test"); // After signaling from thread_1, does this get ran after?
    pthread_mutex_unlock(&mutex);
}
Let's say that thread_2 calls pthread_cond_wait.
Thread_1 comes along then does pthread_cond_signal.
I understand that thread_2 will be blocked when pthread_cond_wait is called, and unlock its mutex.
However, I am confused about which line of code will run in thread_2 after thread_1 calls signal.
When thread_2 is woken up, does it start from the beginning, where it now has access to the mutex, locks it, checks the while condition again, sees that it's true now, and prints test?
Or does thread_2 get access to its mutex, lock it, and then run printf("test") (ignoring the while condition)?
There is no specific line of code that is run in another thread when one calls pthread_cond_signal. If you want specific lines to be run in a specific order, you must put all those lines into the path of one thread.
When pthread_cond_signal is called, the other thread could be doing almost anything. One thing we know is that, because the signal call is made while holding the mutex, the other thread is not holding the mutex. We can label the places where the other thread can be:
void thread_2() {
    // (A) Either here, or earlier.
    pthread_mutex_lock(&mutex);
    while (!some_cond)
        pthread_cond_wait(&cv, &mutex); // (B) Or here.
    printf("test"); // After signaling from thread_1, does this get ran after?
    pthread_mutex_unlock(&mutex);
    // (C) Or else here, or farther
}
The other thread can be at C only if some_cond was already true. If some_cond is assumed false, we can forget about C.
If A is the case, the thread is either executing code before the pthread_mutex_lock (we can call that A1), or else it has hit the lock and is now waiting for the mutex (A2).
The thread which calls pthread_cond_signal owns the mutex, and continues to do so after making this call. So it's possible that the other thread is in A1, and proceeds to A2 (waiting on the mutex).
If the thread is in B: waiting on the condition variable, it's possible that the signal will wake it up. Before returning from pthread_cond_wait, it has to re-acquire the mutex, so it can get stuck waiting there. In any case, in the B state, the other thread cannot return from the pthread_cond_wait call until the first thread does the pthread_mutex_unlock.
It's possible that the other thread is in the A1 state (not yet reached the mutex), and the signaler completes everything: sets the variable, signals, and releases the mutex. Then the other thread will grab the mutex without waiting, see that the condition is true, and leave the mutex. The signaling is then irrelevant, since pthread_cond_wait is never called.
If you're doing programming with shared variables and explicit synchronization primitives like mutexes and conditions, you have to reason about all the cases which can happen: all the relevant states in which the other thread(s) can be.
The several other examples are all good, but they also all seem to gloss over what I take to be the OP's key point of confusion.
I am confused about which line of code will run in thread_2 after thread_1 calls signal.
There is no magic here. pthread_cond_wait() is a function, and it behaves like one. When its wait is over, and it has reacquired the mutex, it returns to its caller. Control proceeds normally from there, just as it would after any other function call returns.
In this particular case, the function call is the sole statement in the body of a while loop, so the next thing that happens after the call returns will be the while condition being reevaluated.
Note that pthread_cond_wait's caller can and should check the return value to catch and handle cases where it indicates that an error has occurred, just as should be done with other functions that indicate error conditions via their return values.
However, I am confused about which line of code will run in thread_2 after thread_1 calls signal.
Actually, right after thread_1 calls pthread_cond_signal, nothing visible happens in thread_2, because thread_1 still holds the mutex. The next instruction in thread_1 is pthread_mutex_unlock. thread_2 will then stop blocking in pthread_cond_wait, reacquire the mutex, and return. The while loop then rechecks the condition, exits, and thread_2 prints "test".
regarding:
void thread_1() {
and
void thread_2() {
Those are NOT valid signatures for thread functions! Perhaps you meant:
void *thread_1( void *arg )
and
void *thread_2( void *arg )
Also, a thread function declared to return void * should not run off the end without returning a value. I suggest the last statement in each thread be:
pthread_exit( NULL );
There is no guarantee as to which thread runs first. If thread_2() runs first, it locks the mutex and blocks in pthread_cond_wait(), which releases the mutex while it waits, so thread_1() can then lock the mutex, set some_cond, and signal. If thread_1() runs first, thread_2() finds some_cond already true and never waits.
In general, pthread_join() is called in the same part of the program that called pthread_create(). It blocks the calling code until the given thread exits.
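For reference, a minimal sketch of the two functions from the question with the void *(*)(void *) signature that pthread_create expects. The driver run_signal_demo and the choice to return the string through pthread_join instead of printing it are my own assumptions, made so the result can be checked.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
static bool some_cond = false;

static void *thread_1(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex);
    some_cond = true;
    pthread_cond_signal(&cv);
    pthread_mutex_unlock(&mutex);
    return NULL;
}

static void *thread_2(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex);
    while (!some_cond)
        pthread_cond_wait(&cv, &mutex); /* mutex is released while blocked */
    pthread_mutex_unlock(&mutex);
    return (void *)"test";              /* string literal has static storage */
}

/* Hypothetical driver: returns 1 if thread_2 got past the wait. */
int run_signal_demo(void)
{
    pthread_t t1, t2;
    void *result = NULL;

    int rc = pthread_create(&t2, NULL, thread_2, NULL);
    assert(rc == 0);
    rc = pthread_create(&t1, NULL, thread_1, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_join(t1, NULL);
    pthread_join(t2, &result);          /* receives thread_2's return value */
    return result != NULL && strcmp(result, "test") == 0;
}
```

Whichever thread the scheduler happens to run first, both threads finish and the joins return.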

How exactly does the wait function work wrt to condition variables

Background
I am somewhat confused about my understanding of how condition variables operate in conjunction with concurrent access to shared data. The following is pseudo code to depict my current issue.
// Thread 1: Producer
void cakeMaker()
{
    lock(some_lock);
    while(number_of_cakes == MAX_CAKES)
        wait(rack_has_space);
    number_of_cakes++;
    signal(rack_has_cakes);
    unlock(some_lock);
}
// Thread 2: Consumer
void cakeEater()
{
    lock(some_lock);
    while(number_of_cakes == 0)
        wait(rack_has_cakes);
    number_of_cakes--;
    signal(rack_has_space);
    unlock(some_lock);
}
Let's consider the scenario where the value of number_of_cakes is 0. As a result, Thread 2 is blocked at wait(rack_has_cakes). When Thread 1 runs and increments the value of number_of_cakes to 1, it signals rack_has_cakes. However, Thread 2 wakes up before Thread 1 releases the lock on some_lock, causing it to go back to sleep and miss the signal.
I am unclear about the operation of wait and signal. Are they like a toggle switch that gets set to 1 when signal is called and 0 when wait succeeds? Can someone explain what is happening behind the scenes?
Question
Can someone walk me through one iteration of the above code step-by-step, with a strong emphasis on the events that occur during the signal and wait method calls?
thread 2 wakes up before Thread 1 calls unlock(some_lock), so it goes
back to sleep again and the signal has been missed.
No, that's not how it works. I will use C++ std::condition_variable for my citations, but POSIX threads and most run-of-the-mill implementations of mutexes and condition variables work the same way. The underlying concepts are the same.
Thread 2 has the mutex locked, when it starts waiting on a condition variable. The wait() operation unlocks the mutex and waits on the condition variable atomically:
Atomically releases lock, blocks the current executing thread, and
adds it to the list of threads waiting on *this.
This operation is considered "atomic"; in other words, indivisible.
Then, when the condition variable is signaled, the thread re-locks the mutex:
When unblocked, regardless of the reason, lock is reacquired and wait
exits.
The thread does not "go back to sleep" before the other thread "calls unlock". If the mutex has not yet been unlocked when the thread is woken by the signal, the thread simply waits until it succeeds in locking the mutex again. This is unconditional. Then, and only then, does wait() return, with the mutex locked. So, the sequence of events is:
One thread has the mutex locked, sets some counter, variable, or any kind of mutex-protected data to the state that the other thread is waiting for. After doing so the thread signals the condition variable, and then unlocks the mutex at its leisure.
The other thread has locked the mutex before it wait()s on the condition variable. One of wait()'s prerequisites is that the mutex must be locked before wait()ing on the linked condition variable. So, the wait() operation unlocks the mutex "atomically". That is, there is no instant when the mutex is unlocked and the thread is not yet waiting on the condition variable. When wait() unlocks the mutex, you are guaranteed that the thread will be waiting, and it will wake up. You can take it to the bank.
Once the condition variable is signaled, the wait()ing thread does not return from wait() until it can re-lock the mutex. Having received a signal from the condition variable is just the first step, the mutex must be locked again, by thread, in the final step of the wait() operation. Which, of course, only happens after the signaling thread unlocks the mutex.
When a thread gets signaled by a condition variable, it will return from wait(). But not immediately, it must wait until the thread locks the mutex again, however long it takes. It will not go "back to sleep", but wait until it has the mutex locked again, and then return. You are guaranteed that a received condition variable signal will cause the thread to return from wait(), and the mutex will be re-locked by the thread. And because the original unlock-then-wait operation was atomic, you are guaranteed to receive the condition variable signal.
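Here is the question's pseudocode turned into a runnable pthreads sketch, so the sequence described above can be tried out. MAX_CAKES = 3, the rounds parameter and the driver run_bakery are assumptions of mine; the lock and condition names come from the question. The signal is sent while the lock is still held, and nothing is lost: the woken thread just finishes pthread_cond_wait once it can relock some_lock.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_CAKES 3

static pthread_mutex_t some_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t rack_has_space = PTHREAD_COND_INITIALIZER;
static pthread_cond_t rack_has_cakes = PTHREAD_COND_INITIALIZER;
static int number_of_cakes = 0;

static void *cake_maker(void *arg)
{
    int rounds = *(int *)arg;
    for (int i = 0; i < rounds; i++) {
        pthread_mutex_lock(&some_lock);
        while (number_of_cakes == MAX_CAKES)
            pthread_cond_wait(&rack_has_space, &some_lock);
        number_of_cakes++;
        pthread_cond_signal(&rack_has_cakes); /* lock still held here */
        pthread_mutex_unlock(&some_lock);
    }
    return NULL;
}

static void *cake_eater(void *arg)
{
    int rounds = *(int *)arg;
    for (int i = 0; i < rounds; i++) {
        pthread_mutex_lock(&some_lock);
        while (number_of_cakes == 0)
            pthread_cond_wait(&rack_has_cakes, &some_lock);
        number_of_cakes--;
        pthread_cond_signal(&rack_has_space);
        pthread_mutex_unlock(&some_lock);
    }
    return NULL;
}

/* Hypothetical driver: bakes and eats `rounds` cakes, then returns
 * how many cakes are left on the rack. */
int run_bakery(int rounds)
{
    pthread_t maker, eater;
    int rc = pthread_create(&maker, NULL, cake_maker, &rounds);
    assert(rc == 0);
    rc = pthread_create(&eater, NULL, cake_eater, &rounds);
    assert(rc == 0);
    (void)rc;
    pthread_join(maker, NULL);
    pthread_join(eater, NULL);
    return number_of_cakes;
}
```

If a signal could be missed in the way the question fears, one of the two threads would eventually block forever and run_bakery would never return.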
Lets say we currently have number_of_cakes = 0, so Thread 2 is currently stuck on wait(rack_has_cakes). Thread 1 runs and increments number_of_cakes by 1. Then it calls signal(rack_has_cakes) - this wakes up Thread 2, unfortunately Thread 2 wakes up before Thread 1 calls unlock(some_lock), so it goes back to sleep again and the signal has been missed.
You are right, that might happen, because the order of your signal and unlock calls was not correct.
In both Producer and Consumer, you have set the following order of commands:
signal(rack_has_cakes);
unlock(some_lock);
But the order should be:
unlock(some_lock);
signal(rack_has_cakes);
You first have to unlock the mutex and then signal the other thread.
Since the condition variable's wait() and signal() operations are thread safe, you need not worry about releasing the lock before signaling.
But this step is very important, as it gives the other thread a chance to lock the mutex.

Which thread owns the associated mutex after pthread_cond_broadcast?

This question concerns the pthread API for Posix systems.
My understanding is that when waiting for a conditional variable, or more specifically a pthread_cond_t, the flow goes something like this.
// imagine the mutex is named mutex and the conditional variable is named cond
// first we lock the mutex to prevent race conditions
pthread_mutex_lock(&mutex);
// then we wait for the conditional variable, releasing the mutex
pthread_cond_wait(&cond, &mutex);
// after we're done waiting we own the mutex again and have to release it
pthread_mutex_unlock(&mutex);
In this example we stop waiting for the mutex when some other thread follows a procedure like this.
// lock the mutex to prevent race conditions
pthread_mutex_lock(&mutex);
// signal the conditional variable, giving up control of the mutex
pthread_cond_signal(&cond);
My understanding is that if multiple threads are waiting some kind of scheduling policy will be applied, and whichever thread is unblocked also gets back the associated mutex.
Now what I don't understand is what happens when some thread calls pthread_cond_broadcast(&cond) to awake all of the threads waiting on the conditional variable.
Does only one thread get to own the mutex? Do I need to wait in a fundamentally different manner when waiting for a broadcast than when waiting for a signal (i.e. by not calling pthread_mutex_unlock unless I can confirm this thread acquired the mutex)? Or am I wrong in my whole understanding of how the mutex/cond relationship works?
Most importantly, if (as I think is probably the case) pthread_cond_broadcast causes threads to compete for the associated mutex as if they had all tried to lock it, does that mean only one thread will really wake up?
When some thread calls pthread_cond_broadcast while holding the mutex, it holds the mutex. Which means that once pthread_cond_broadcast returns, it still owns the mutex.
The other threads will all wake up, try to lock the mutex, then go to sleep to wait for the mutex to become available.
If you call pthread_cond_broadcast while not holding the mutex, then one of the other threads will be able to lock the mutex immediately. All the others will have to wait to lock the mutex.
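A short sketch of the broadcast case, under the assumption of five waiters and a made-up go flag as the predicate (released and run_broadcast_demo are also hypothetical names). Every waiter is woken by the single broadcast, but each one returns from pthread_cond_wait only when it has the mutex, so they pass through one at a time.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NUM_WAITERS 5

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool go = false;  /* the predicate the waiters are waiting for */
static int released = 0; /* waiters that got past the wait */

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex);
    while (!go)
        pthread_cond_wait(&cond, &mutex);
    /* pthread_cond_wait has returned, so this thread holds the mutex. */
    released++;
    pthread_mutex_unlock(&mutex); /* lets the next woken thread proceed */
    return NULL;
}

/* Hypothetical driver: returns how many waiters were released. */
int run_broadcast_demo(void)
{
    pthread_t t[NUM_WAITERS];
    for (int i = 0; i < NUM_WAITERS; i++) {
        int rc = pthread_create(&t[i], NULL, waiter, NULL);
        assert(rc == 0);
        (void)rc;
    }

    pthread_mutex_lock(&mutex);
    go = true;
    pthread_cond_broadcast(&cond); /* the broadcaster still owns the mutex */
    pthread_mutex_unlock(&mutex);

    for (int i = 0; i < NUM_WAITERS; i++)
        pthread_join(t[i], NULL);
    return released;
}
```

So all threads do "really wake up" eventually; they just take turns owning the mutex, and each one unlocks it exactly as it would after a signal.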

C Multithread: Wait Until (expression); [duplicate]

I’m reading up on pthread.h; the condition variable related functions (like pthread_cond_wait(3)) require a mutex as an argument. Why? As far as I can tell, I’m going to be creating a mutex just to use as that argument? What is that mutex supposed to do?
It's just the way that condition variables are (or were originally) implemented.
The mutex is used to protect the condition variable itself. That's why you need it locked before you do a wait.
The wait will "atomically" unlock the mutex, allowing others access to the condition variable (for signalling). Then when the condition variable is signalled or broadcast to, one or more of the threads on the waiting list will be woken up and the mutex will be magically locked again for that thread.
You typically see the following operation with condition variables, illustrating how they work. The following example is a worker thread which is given work via a signal to a condition variable.
thread:
    initialise.
    lock mutex.
    while thread not told to stop working:
        wait on condvar using mutex.
        if work is available to be done:
            do the work.
    unlock mutex.
    clean up.
    exit thread.
The work is done within this loop provided that there is some available when the wait returns. When the thread has been flagged to stop doing work (usually by another thread setting the exit condition then kicking the condition variable to wake this thread up), the loop will exit, the mutex will be unlocked and this thread will exit.
The code above is a single-consumer model as the mutex remains locked while the work is being done. For a multi-consumer variation, you can use, as an example:
thread:
    initialise.
    lock mutex.
    while thread not told to stop working:
        wait on condvar using mutex.
        if work is available to be done:
            copy work to thread local storage.
            unlock mutex.
            do the work.
            lock mutex.
    unlock mutex.
    clean up.
    exit thread.
which allows other consumers to receive work while this one is doing work.
The condition variable relieves you of the burden of polling some condition instead allowing another thread to notify you when something needs to happen. Another thread can tell that thread that work is available as follows:
lock mutex.
flag work as available.
signal condition variable.
unlock mutex.
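The multi-consumer pseudocode and the notifier above could look roughly like this in C. This is only a sketch: pending, stop, done_count, submit_work, request_stop and run_worker_demo are names I made up, and unlike the pseudocode the worker tests its predicate before the first wait, so work queued before it starts is not missed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t condvar = PTHREAD_COND_INITIALIZER;
static int pending = 0;    /* work units flagged as available */
static bool stop = false;  /* set when workers should exit */
static int done_count = 0; /* work units completed */

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex);
    for (;;) {
        while (pending == 0 && !stop)
            pthread_cond_wait(&condvar, &mutex);
        if (pending == 0)  /* told to stop and nothing left to do */
            break;
        pending--;         /* take one unit ("copy work to local storage") */
        pthread_mutex_unlock(&mutex);
        /* ... do the work here, without holding the mutex ... */
        pthread_mutex_lock(&mutex);
        done_count++;
    }
    pthread_mutex_unlock(&mutex);
    return NULL;
}

static void submit_work(void)
{
    pthread_mutex_lock(&mutex);
    pending++;                        /* flag work as available */
    pthread_cond_signal(&condvar);
    pthread_mutex_unlock(&mutex);
}

static void request_stop(void)
{
    pthread_mutex_lock(&mutex);
    stop = true;
    pthread_cond_broadcast(&condvar); /* wake every worker so it can exit */
    pthread_mutex_unlock(&mutex);
}

/* Hypothetical driver: two workers process `units` work units. */
int run_worker_demo(int units)
{
    pthread_t t[2];
    for (int i = 0; i < 2; i++) {
        int rc = pthread_create(&t[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < units; i++)
        submit_work();
    request_stop();
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return done_count;
}
```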
The vast majority of what are often erroneously called spurious wakeups was generally always because multiple threads had been signalled within their pthread_cond_wait call (broadcast), one would return with the mutex, do the work, then re-wait.
Then the second signalled thread could come out when there was no work to be done. So you had to have an extra variable indicating that work should be done (this was inherently mutex-protected with the condvar/mutex pair here - other threads needed to lock the mutex before changing it however).
It was technically possible for a thread to return from a condition wait without being kicked by another process (this is a genuine spurious wakeup) but, in all my many years working on pthreads, both in development/service of the code and as a user of them, I never once received one of these. Maybe that was just because HP had a decent implementation :-)
In any case, the same code that handled the erroneous case also handled genuine spurious wakeups as well since the work-available flag would not be set for those.
A condition variable would be quite limited if you could only signal a condition; usually you need to handle some data related to the condition that was signalled. Signalling and wakeup have to be done atomically with respect to that data to avoid race conditions or excessive complexity.
pthreads can also give you, for rather technical reasons, a spurious wakeup. That means you need to check a predicate, so you can be sure the condition actually was signalled, and distinguish that from a spurious wakeup. Checking such a predicate while waiting for it needs to be guarded, so a condition variable needs a way to atomically wait/wake up while unlocking/locking a mutex guarding that condition.
Consider a simple example where you're notified that some data are produced. Maybe another thread made some data that you want, and set a pointer to that data.
Imagine a producer thread giving some data to a consumer thread through a 'some_data' pointer.
while(1) {
    pthread_cond_wait(&cond); //imagine cond_wait did not have a mutex
    char *data = some_data;
    some_data = NULL;
    handle(data);
}
You'd naturally get a lot of race conditions. What if the other thread did some_data = new_data right after you got woken up, but before you did data = some_data?
You cannot really create your own mutex to guard this case either, e.g.:
while(1) {
    pthread_cond_wait(&cond); //imagine cond_wait did not have a mutex
    pthread_mutex_lock(&mutex);
    char *data = some_data;
    some_data = NULL;
    pthread_mutex_unlock(&mutex);
    handle(data);
}
This will not work; there's still a chance of a race condition between waking up and grabbing the mutex. Placing the mutex lock before the pthread_cond_wait doesn't help you, as you will now hold the mutex while waiting, i.e. the producer will never be able to grab the mutex.
(note, in this case you could create a second condition variable to signal the producer that you're done with some_data - though this will become complex, especially so if you want many producers/consumers.)
Thus you need a way to atomically release/grab the mutex when waiting/waking up from the condition. That's what pthread condition variables do, and here's what you'd do:
while(1) {
    pthread_mutex_lock(&mutex);
    while(some_data == NULL) { // predicate to account for spurious wakeups, would also
                               // make it robust if there were several consumers
        pthread_cond_wait(&cond, &mutex); // atomically unlocks the mutex while waiting, relocks before returning
    }
    char *data = some_data;
    some_data = NULL;
    pthread_mutex_unlock(&mutex);
    handle(data);
}
(the producer would naturally need to take the same precautions, always guarding 'some_data' with the same mutex, and making sure it doesn't overwrite some_data if some_data is currently != NULL)
POSIX condition variables are stateless. So it is your responsibility to maintain the state. Since the state will be accessed by both threads that wait and threads that tell other threads to stop waiting, it must be protected by a mutex. If you think you can use condition variables without a mutex, then you haven't grasped that condition variables are stateless.
Condition variables are built around a condition. Threads that wait on a condition variable are waiting for some condition. Threads that signal condition variables change that condition. For example, a thread might be waiting for some data to arrive. Some other thread might notice that the data has arrived. "The data has arrived" is the condition.
Here's the classic use of a condition variable, simplified:
while(1)
{
    pthread_mutex_lock(&work_mutex);
    while (work_queue_empty())       // wait for work
        pthread_cond_wait(&work_cv, &work_mutex);
    work = get_work_from_queue();    // get work
    pthread_mutex_unlock(&work_mutex);
    do_work(work);                   // do that work
}
See how the thread is waiting for work. The work is protected by a mutex. The wait releases the mutex so that another thread can give this thread some work. Here's how it would be signalled:
void AssignWork(WorkItem work)
{
    pthread_mutex_lock(&work_mutex);
    add_work_to_queue(work);         // put work item on queue
    pthread_cond_signal(&work_cv);   // wake worker thread
    pthread_mutex_unlock(&work_mutex);
}
Notice that you need the mutex to protect the work queue. Notice that the condition variable itself has no idea whether there's work or not. That is, a condition variable must be associated with a condition, that condition must be maintained by your code, and since it's shared among threads, it must be protected by a mutex.
Not all condition variable functions require a mutex: only the waiting operations do. The signal and broadcast operations do not require a mutex. A condition variable also is not permanently associated with a specific mutex; the external mutex does not protect the condition variable. If a condition variable has internal state, such as a queue of waiting threads, this must be protected by an internal lock inside the condition variable.
The wait operations bring together a condition variable and a mutex, because:
a thread has locked the mutex, evaluated some expression over shared variables and found it to be false, such that it needs to wait.
the thread must atomically move from owning the mutex, to waiting on the condition.
For this reason, the wait operation takes as arguments both the mutex and condition: so that it can manage the atomic transfer of a thread from owning the mutex to waiting, so that the thread does not fall victim to the lost wake up race condition.
A lost wakeup race condition will occur if a thread gives up a mutex, and then waits on a stateless synchronization object, but in a way which is not atomic: there exists a window of time when the thread no longer has the lock, and has not yet begun waiting on the object. During this window, another thread can come in, make the awaited condition true, signal the stateless synchronization and then disappear. The stateless object doesn't remember that it was signaled (it is stateless). So then the original thread goes to sleep on the stateless synchronization object, and does not wake up, even though the condition it needs has already become true: lost wakeup.
The condition variable wait functions avoid the lost wake up by making sure that the calling thread is registered to reliably catch the wakeup before it gives up the mutex. This would be impossible if the condition variable wait function did not take the mutex as an argument.
I do not find the other answers to be as concise and readable as this page. Normally the waiting code looks something like this:
mutex.lock()
while(!check())
    condition.wait(mutex) # atomically unlocks mutex and sleeps. Calls
                          # mutex.lock() once the thread wakes up.
mutex.unlock()
There are three reasons to wrap the wait() in a mutex:
without a mutex another thread could signal() before the wait() and we'd miss this wake up.
normally check() is dependent on modification from another thread, so you need mutual exclusion on it anyway.
to ensure that the highest priority thread proceeds first (the queue for the mutex allows the scheduler to decide who goes next).
The third point is not always a concern - historical context is linked from the article to this conversation.
Spurious wake-ups are often mentioned with regard to this mechanism (i.e. the waiting thread is awoken without signal() being called). However, such events are handled by the looped check().
Condition variables are associated with a mutex because it is the only way it can avoid the race that it is designed to avoid.
// incorrect usage:
// thread 1:
while (notDone) {
    pthread_mutex_lock(&mutex);
    bool ready = protectedReadyToRunVariable;
    pthread_mutex_unlock(&mutex);
    if (ready) {
        doWork();
    } else {
        pthread_cond_wait(&cond1); // invalid syntax: this SHOULD have a mutex
    }
}
// signalling thread
// thread 2:
prepareToRunThread1();
pthread_mutex_lock(&mutex);
protectedReadyToRunVariable = true;
pthread_mutex_unlock(&mutex);
pthread_cond_signal(&cond1);
Now, let's look at a particularly nasty interleaving of these operations
// thread 1                                 // thread 2
pthread_mutex_lock(&mutex);
bool ready = protectedReadyToRunVariable;   // reads false
pthread_mutex_unlock(&mutex);
                                            pthread_mutex_lock(&mutex);
                                            protectedReadyToRunVariable = true;
                                            pthread_mutex_unlock(&mutex);
                                            pthread_cond_signal(&cond1);
if (ready) {
    ...
} else {
    pthread_cond_wait(&cond1); // uh oh!
At this point, there is no thread which is going to signal the condition variable, so thread1 will wait forever, even though the protectedReadyToRunVariable says it's ready to go!
The only way around this is for condition variables to atomically release the mutex while simultaneously starting to wait on the condition variable. This is why the cond_wait function requires a mutex
// correct usage:
// thread 1:
while (notDone) {
    pthread_mutex_lock(&mutex);
    bool ready = protectedReadyToRunVariable;
    if (ready) {
        pthread_mutex_unlock(&mutex);
        doWork();
    } else {
        pthread_cond_wait(&cond1, &mutex); // condition variable first, then mutex
        pthread_mutex_unlock(&mutex);      // wait returns with the mutex held
    }
}
// signalling thread
// thread 2:
prepareToRunThread1();
pthread_mutex_lock(&mutex);
protectedReadyToRunVariable = true;
pthread_cond_signal(&cond1);               // signal takes only the condition variable
pthread_mutex_unlock(&mutex);
The mutex is supposed to be locked when you call pthread_cond_wait; the call atomically unlocks the mutex and blocks on the condition. Once the condition is signaled, the call relocks the mutex and then returns.
This allows the implementation of predictable scheduling if desired, in that the thread that would be doing the signalling can wait until the mutex is released to do its processing and then signal the condition.
It appears to be a specific design decision rather than a conceptual need.
Per the pthreads docs the reason that the mutex was not separated is that there is a significant performance improvement by combining them and they expect that because of common race conditions if you don't use a mutex, it's almost always going to be done anyway.
https://linux.die.net/man/3/pthread_cond_wait
Features of Mutexes and Condition Variables
It had been suggested that the mutex acquisition and release be
decoupled from condition wait. This was rejected because it is the
combined nature of the operation that, in fact, facilitates realtime
implementations. Those implementations can atomically move a
high-priority thread between the condition variable and the mutex in a
manner that is transparent to the caller. This can prevent extra
context switches and provide more deterministic acquisition of a mutex
when the waiting thread is signaled. Thus, fairness and priority
issues can be dealt with directly by the scheduling discipline.
Furthermore, the current condition wait operation matches existing
practice.
There are tons of explanations of this, but I want to summarize it with the following example.
1 void thr_child() {
2 done = 1;
3 pthread_cond_signal(&c);
4 }
5 void thr_parent() {
6 if (done == 0)
7 pthread_cond_wait(&c);
8 }
What's wrong with this code snippet? Ponder it a little before reading on.
The issue is genuinely subtle. If the parent invokes thr_parent() and then checks the value of done, it will see that it is 0 and thus try to go to sleep. But just before it calls wait to go to sleep, the parent is interrupted between lines 6 and 7, and the child runs. The child changes the state variable done to 1 and signals, but no thread is waiting and thus no thread is woken. When the parent runs again, it sleeps forever.
What if each of them is carried out while holding a lock?
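One way to close that window, sketched under my own assumptions (the mutex m and the driver run_parent_child are not in the snippet above): both sides touch done only while holding the same mutex, and the parent passes that mutex to pthread_cond_wait so that the check and the sleep cannot be separated.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t c = PTHREAD_COND_INITIALIZER;
static int done = 0;

static void *thr_child(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&m);
    done = 1;                /* state change is made under the lock */
    pthread_cond_signal(&c);
    pthread_mutex_unlock(&m);
    return NULL;
}

static void thr_parent(void)
{
    pthread_mutex_lock(&m);
    /* The child cannot run its locked section between this test and
     * the wait, because the mutex is only released inside the wait. */
    while (done == 0)
        pthread_cond_wait(&c, &m);
    pthread_mutex_unlock(&m);
}

/* Hypothetical driver: returns the final value of done. */
int run_parent_child(void)
{
    pthread_t t;
    int rc = pthread_create(&t, NULL, thr_child, NULL);
    assert(rc == 0);
    (void)rc;
    thr_parent();
    pthread_join(t, NULL);
    return done;
}
```

If the child runs first, the parent sees done == 1 and skips the wait. If the parent runs first, it is already registered as a waiter by the time the child can acquire the mutex.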
I did an exercise in class, if you want a real example of a condition variable:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

int compteur = 0;
pthread_cond_t varCond = PTHREAD_COND_INITIALIZER;
pthread_mutex_t mutex_compteur;

/* Waits until the counter reaches the threshold of 10. */
void *attenteSeuil(void *arg)
{
    pthread_mutex_lock(&mutex_compteur);
    while (compteur < 10)
    {
        printf("Compteur : %d<10 so I am waiting...\n", compteur);
        pthread_cond_wait(&varCond, &mutex_compteur);
    }
    printf("I waited nicely and now the compteur = %d\n", compteur);
    pthread_mutex_unlock(&mutex_compteur);
    pthread_exit(NULL);
}

/* Increments the counter up to 10, then signals the waiting thread. */
void *incrementCompteur(void *arg)
{
    while (1)
    {
        pthread_mutex_lock(&mutex_compteur);
        if (compteur == 10)
        {
            printf("Compteur = 10\n");
            pthread_cond_signal(&varCond);
            pthread_mutex_unlock(&mutex_compteur);
            pthread_exit(NULL);
        }
        else
        {
            printf("Compteur ++\n");
            compteur++;
        }
        pthread_mutex_unlock(&mutex_compteur);
    }
}

int main(int argc, char const *argv[])
{
    pthread_t threads[2];
    pthread_mutex_init(&mutex_compteur, NULL);
    pthread_create(&threads[0], NULL, incrementCompteur, NULL);
    pthread_create(&threads[1], NULL, attenteSeuil, NULL);
    pthread_exit(NULL); /* lets the two threads finish after main returns */
}

What happens to a thread that got woken up by pthread_cond_signal() but lost competition for a mutex

Regarding this:
How To Use Condition Variable
Say we have number of consumer threads that execute such code (copied from the referenced page):
while (TRUE) {
s = pthread_mutex_lock(&mtx);
while (avail == 0) { /* Wait for something to consume */
s = pthread_cond_wait(&cond, &mtx);
}
while (avail > 0) { /* Consume all available units */
avail--;
}
s = pthread_mutex_unlock(&mtx);
}
I assume the scenario here is: the main thread calls pthread_cond_signal() to tell the consumer threads to do some work.
As I understand it, each consumer thread in turn calls pthread_mutex_lock() and then pthread_cond_wait() (which atomically unlocks the mutex). At this point none of the consumer threads holds the mutex; they are all waiting in pthread_cond_wait().
When the main thread calls pthread_cond_signal(), following the manpage, at least one thread is woken up. When any of them returns from pthread_cond_wait() it automatically claims the mutex.
So my question is: what happens now regarding the provided example code? Namely, what does the thread that lost the contest for the mutex do now?
(AFAICT the thread that won the mutex should run the rest of the code and release the mutex. The one that lost should be stuck waiting on the mutex, somewhere in the first nested while loop, while the winner holds it. After the mutex has been released, it should go back to blocking in pthread_cond_wait(), because the while (avail == 0) condition will be true again by then. Am I correct?)
Note that pthread_cond_signal() is generally intended to wake up only one waiting thread (that's all that it guarantees). But it could wake more 'accidentally'. The while (avail > 0) loop performs two functions:
it allows the one thread guaranteed to be woken up to consume all queued work units
it prevents additional 'accidentally' awakened threads from assuming that there's work to be done, when there might not be since the initial thread would have handled all of them.
It also prevents a race condition where a work unit might have been placed on the queue after the while (avail > 0) loop has completed, but before the worker thread has waited on the condition again. That race is also handled by the while (avail == 0) test just before the call to pthread_cond_wait().
Basically when a thread is awakened, all it knows is that there might be work units for it to consume, but there might not (another thread might have consumed them).
So the sequence of events that occurs when pthread_cond_signal() is called is:
the system will wake one or more threads waiting on the condition
all the threads that are awakened will then try to acquire the mutex - only one of them can acquire it at any particular moment, since that's the purpose of a mutex
that thread will then proceed, perform the work in the while (avail > 0) loop, then will release the mutex
at that point one of the other threads that were previously woken up will acquire the mutex and work the same loop, then release the mutex. Generally, there will be no work units available anymore (since the first thread would have consumed all of them), but if another thread had added an additional unit (or more), then this thread would handle that work
the next thread will acquire the mutex and perform that same set of logic
pthread_cond_wait() has to reacquire the given mutex once it has been signaled or woken up. If another thread wins that race, the function blocks until the mutex is released. So from the application's point of view, it doesn't return until the current thread holds the mutex. The wait is always done in a loop (while (avail == 0) { ... above) to make sure that the application condition we are waiting for actually holds (buffer not empty, more work available, etc.).
Hope this helps.
The thread that lost the contest wakes up once the mutex is unlocked, checks the condition again, then goes to sleep on the condition variable.
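Here is a rough sketch of that behaviour with several consumers. It is not the code from the referenced page: the finished flag, the consumed counter, the constants and run_consumer_demo() are all assumptions of mine, added so the program terminates and the result can be checked. The part that matters is the inner while: a consumer that was woken but lost the race for the mutex finds avail == 0 when it finally gets the lock, and simply waits again.

```c
#include <assert.h>
#include <pthread.h>

#define UNITS 1000    /* hypothetical amount of work */
#define CONSUMERS 3   /* hypothetical number of consumer threads */

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int avail = 0;    /* units waiting to be consumed */
static int finished = 0; /* assumed flag: producer is done */
static int consumed = 0; /* total units consumed by all threads */

static void *consumer(void *arg) {
    (void)arg;
    pthread_mutex_lock(&mtx);
    for (;;) {
        /* Re-check after every wakeup: another consumer may have
         * taken the mutex first and consumed everything. */
        while (avail == 0 && !finished)
            pthread_cond_wait(&cond, &mtx);
        if (avail == 0) /* finished and nothing left to consume */
            break;
        while (avail > 0) { /* consume all available units */
            avail--;
            consumed++;
        }
    }
    pthread_mutex_unlock(&mtx);
    return NULL;
}

/* Hypothetical driver: produces UNITS units, returns the total consumed. */
int run_consumer_demo(void) {
    pthread_t t[CONSUMERS];
    for (int i = 0; i < CONSUMERS; i++) {
        int rc = pthread_create(&t[i], NULL, consumer, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < UNITS; i++) {
        pthread_mutex_lock(&mtx);
        avail++;
        pthread_cond_signal(&cond); /* wakes at least one waiter, if any */
        pthread_mutex_unlock(&mtx);
    }
    pthread_mutex_lock(&mtx);
    finished = 1;
    pthread_cond_broadcast(&cond); /* let every consumer see the flag */
    pthread_mutex_unlock(&mtx);
    for (int i = 0; i < CONSUMERS; i++)
        pthread_join(t[i], NULL);
    return consumed;
}
```

However the threads interleave, every unit is consumed exactly once, because avail is only ever touched with the mutex held and no consumer assumes anything about it after waking up.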
"When any of them returns from pthread_cond_wait() it automatically claims the mutex."
Ah, but it doesn't. Not "automatically", that is, depending on what "automatically" means. You might be confused by the "atomic" semantics of pthread_cond_wait; but that semantics is played out on the entry side: a thread is somehow registered for waiting on the condition before giving up the mutex, so that there isn't any window during which the thread no longer has the mutex, and is not yet waiting on the variable.
Each thread which returns from pthread_cond_wait has to acquire the mutex and therefore contend for it. Those which lose the race for the mutex have to block on the mutex, just as if they had called pthread_mutex_lock.
The way the mutex is acquired on exit from pthread_cond_wait can be modeled as a regular pthread_mutex_lock operation. Essentially, the threads have to queue up on the mutex in order to exit. Each thread which acquires the mutex then returns from the function; the others have to wait until that thread gives up the mutex before they are allowed to return.
No thread woken up by the signal gets the mutex "automatically", in the sense of somehow being transferred ownership due to special eligibility. Firstly, on a multiprocessor, a woken thread can lose the race to a thread already running on another processor which snatches the mutex, if it is available, or else queue to wait on the mutex ahead of the thread which received the signal. Secondly, the thread which calls pthread_cond_signal may itself not have given up the mutex, and may continue to hold it indefinitely, which means that all the woken threads will queue up on a mutex lock operation and none will emerge from pthread_mutex_lock until that thread gives up the mutex.
All that is "automatic" is that the pthread_cond_wait operation doesn't return until acquiring the mutex again, and so the application doesn't have to take the step to acquire the mutex.
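A small experiment can illustrate the second point. This is only a sketch under my own assumptions: the started condition variable, the waiting, released and observed variables, and run_handoff_demo() are all invented for the demonstration. The signaling thread keeps the mutex for a while after calling pthread_cond_signal() and sets released just before unlocking. The waiter records released as soon as pthread_cond_wait() returns.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;    /* ready changed */
static pthread_cond_t started = PTHREAD_COND_INITIALIZER; /* waiter in place */
static int waiting = 0;   /* waiter holds m and is about to wait */
static int ready = 0;     /* the condition the waiter sleeps on */
static int released = 0;  /* set by the signaler just before it unlocks */
static int observed = -1; /* value of released seen right after the wait */

static void *waiter(void *arg) {
    (void)arg;
    pthread_mutex_lock(&m);
    waiting = 1;
    pthread_cond_signal(&started);
    while (!ready)
        pthread_cond_wait(&cond, &m);
    /* We hold m again here, so the signaler must already have unlocked. */
    observed = released;
    pthread_mutex_unlock(&m);
    return NULL;
}

/* Hypothetical driver: returns what the waiter observed. */
int run_handoff_demo(void) {
    pthread_t t;
    int rc = pthread_create(&t, NULL, waiter, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&m);
    while (!waiting) /* make sure the waiter reached its wait loop */
        pthread_cond_wait(&started, &m);
    ready = 1;
    pthread_cond_signal(&cond);
    /* Still holding m: give the woken thread a chance to run. It can
     * only queue up on the mutex, it cannot return from the wait yet. */
    for (int i = 0; i < 100; i++)
        sched_yield();
    released = 1;
    pthread_mutex_unlock(&m);

    pthread_join(t, NULL);
    return observed;
}
```

The waiter always observes released == 1: being woken by the signal is not enough for it to proceed, it also has to win the mutex, and that cannot happen before the signaling thread unlocks.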

Resources