atomicity of pthread_cond_wait()'s unlock-and-wait?

How atomic is the unlock-and-wait of pthread_cond_wait() call?
Is there a window where mutex is already unlocked but the thread is not yet in the part where it is actually waiting and able to receive notifications?
IOW, is there a potential for missed wakeups in the pthread_cond_wait() function itself?

The POSIX.1-2008 specification of pthread_cond_wait addresses this question in its second paragraph:
These functions [pthread_cond_wait and pthread_cond_timedwait] atomically release mutex and cause the calling thread to block on the condition variable cond; atomically here means "atomically with respect to access by another thread to the mutex and then the condition variable". That is, if another thread is able to acquire the mutex after the about-to-block thread has released it, then a subsequent call to pthread_cond_broadcast() or pthread_cond_signal() in that thread shall behave as if it were issued after the about-to-block thread has blocked.
So, to first order, the answer is "yes, it's atomic", but pay careful attention to the last sentence. There is no missed-wakeup window as long as the call to pthread_cond_broadcast or pthread_cond_signal comes from a thread that successfully acquired the mutex after the sleeping thread released it, and then, perhaps, released it again. If the call comes from a thread that has not acquired the mutex at least once since it was released, the wakeup might get lost.
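As a minimal sketch of the safe pattern described above, the signaling thread below acquires the mutex before changing the predicate and signaling, so the wakeup cannot fall into a gap. All names (hs_m, hs_c, hs_ready, hs_waiter, run_handshake) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
static pthread_mutex_t hs_m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  hs_c = PTHREAD_COND_INITIALIZER;
static bool hs_ready = false;          /* predicate, protected by hs_m */

static void *hs_waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&hs_m);
    while (!hs_ready)                       /* re-check after every wakeup */
        pthread_cond_wait(&hs_c, &hs_m);    /* releases hs_m and blocks as one step */
    assert(hs_ready);                       /* hs_m is held again here */
    pthread_mutex_unlock(&hs_m);
    return NULL;
}

/* Returns 0 once the waiter has been woken and joined, -1 on error. */
int run_handshake(void)
{
    pthread_t t;

    if (pthread_create(&t, NULL, hs_waiter, NULL) != 0)
        return -1;

    /* Taking the mutex first means the signal is issued either before the
     * waiter checks the predicate or after it has blocked, never in between. */
    pthread_mutex_lock(&hs_m);
    hs_ready = true;
    pthread_cond_signal(&hs_c);
    pthread_mutex_unlock(&hs_m);

    return pthread_join(t, NULL) == 0 ? 0 : -1;
}
```

Calling run_handshake() from main should return 0 regardless of which thread reaches the mutex first, because the waiter either sees hs_ready already set or is blocked by the time the signal arrives.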

Related

pthread_cond_broadcast before and after one pthread_mutex_unlock

For the code below, the mutex will not be available by the time the second cond_broadcast is executed (assuming multiple threads are already waiting on the condition). In that situation, does the broadcast select a thread (waiting on the condition) to hand the mutex to and wait for the mutex to be unlocked by some other thread, or is the second cond_broadcast just ignored?
void* func(void* arg){
    pthread_mutex_lock(&m);
    while(condition){
        pthread_cond_wait(&c,&m);
    }
    pthread_cond_broadcast(&c);  /* first broadcast: mutex held */
    pthread_mutex_unlock(&m);
    pthread_cond_broadcast(&c);  /* second broadcast: mutex released */
    return NULL;
}

For the code below, the mutex will not be available by the time the second
cond_broadcast is executed (assuming multiple threads are already waiting
on the condition).
I think you mean that the mutex will not be available to the thread calling pthread_cond_broadcast() at the second call to that function, but that's irrelevant. Calling pthread_cond_broadcast() is independent of holding any mutex.
Or perhaps you mean that one of the previously blocked threads will have acquired the mutex by the time the second broadcast happens, but (1) that's not certain, and (2) if it does happen, that has no particular significance with respect to the broadcast.
In that situation, does the broadcast select a thread (waiting on the
condition) to hand the mutex to and wait for the mutex to be unlocked by
some other thread, or is the second cond_broadcast just ignored?
Neither. pthread_cond_broadcast() and pthread_cond_signal() have no role in locking or transferring control of any mutex. They just wake threads blocked on the associated CV. That each such thread must acquire the mutex before returning from the call is a separate consideration -- they all contend normally to lock the mutex, and they do not return from pthread_cond_wait() until they do. They also do not go back to waiting without first returning from their wait and then calling pthread_cond_wait() again.
But that does not mean that the second pthread_cond_broadcast() in your code necessarily will have no effect. One of the just-woken threads might loop around and wait on the CV again between the two calls, or some other thread might arrive at the CV. That becomes possible as soon as the first thread releases the mutex, and the fact that the first thing that thread tries to do is another broadcast does not ensure that the broadcast happens before another thread can start waiting.
It is unlikely that you want two broadcasts one after the other like that, but which one you retain has little, if any, effect on the overall semantics of the program.
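To illustrate that a broadcast only wakes waiters and that each one then contends for the mutex on its own, here is a small sketch. The names (bc_m, bc_c, bc_go, bc_woken, run_broadcast) and the waiter count are assumptions made for the example, not part of the question's code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

enum { BC_NWAITERS = 4 };              /* hypothetical number of waiters */

static pthread_mutex_t bc_m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  bc_c = PTHREAD_COND_INITIALIZER;
static bool bc_go = false;             /* predicate, protected by bc_m */
static int  bc_woken = 0;              /* protected by bc_m */

static void *bc_waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&bc_m);
    while (!bc_go)
        pthread_cond_wait(&bc_c, &bc_m);
    bc_woken++;                        /* each waiter gets here holding bc_m */
    pthread_mutex_unlock(&bc_m);
    return NULL;
}

/* Returns the number of waiters that got past the wait, or -1 on error. */
int run_broadcast(void)
{
    pthread_t t[BC_NWAITERS];

    for (int i = 0; i < BC_NWAITERS; i++)
        if (pthread_create(&t[i], NULL, bc_waiter, NULL) != 0)
            return -1;

    pthread_mutex_lock(&bc_m);
    bc_go = true;                      /* change the predicate under the mutex */
    pthread_mutex_unlock(&bc_m);
    pthread_cond_broadcast(&bc_c);     /* broadcasting without the mutex is allowed */

    for (int i = 0; i < BC_NWAITERS; i++)
        if (pthread_join(t[i], NULL) != 0)
            return -1;

    return bc_woken;                   /* safe to read: all waiters have exited */
}
```

A single broadcast is enough here because the predicate was changed while holding the mutex. Threads that had not yet started waiting see bc_go already set and skip the wait entirely.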

How to implement thread condition with kthread?

I know that we can use pthread_cond_init, pthread_cond_wait, and pthread_cond_broadcast to implement thread condition signaling in user space. But how can we implement it in a kernel module that is using kthread?
linux/mutex.h provides locking as discussed in this question. But it does not seem to be associated with any condition variables.
linux/wait.h seems to have condition waiting, but it has no documentation, so there are too many concerns around it. For example, wake_up_interruptible could have a race condition when used as suggested in examples with no mutex. It seems unlikely that the wait queues could be combined with mutex waits in a way that would implement semantics like pthread_cond_wait, because the thread scheduling functions implementing these different APIs need to be integrated to provide atomic semantics.
How can a kernel module get atomic sleep-and-unlock semantics, like in a normal user-space API, like pthread_cond_wait?
The specific semantics I need are (from the pthread_cond_wait* manpage):
These functions atomically release mutex and cause the calling thread
to block on the condition variable cond; atomically here means
"atomically with respect to access by another thread to the mutex and
then the condition variable". That is, if another thread is able to
acquire the mutex after the about-to-block thread has released it,
then a subsequent call to pthread_cond_broadcast() or
pthread_cond_signal() in that thread shall behave as if it were issued
after the about-to-block thread has blocked.
The important requirements are:
The condition is a shared structure, and can only be accessed when the mutex is held.
Atomicity prevents the condition from becoming true in between the mutex release and thread blocking. Without this, the thread could check the condition (when it's false), then release the mutex, then the condition becomes true (and a signal is generated and propagated), and then the thread falls asleep and misses the signal.
(Here is how pthread_cond_wait is used in a program).

Do we need to unlock a mutex after receiving a signal from a condition variable?

I am making an app in C for educational purposes with mutexes and condition variables.
Short example here:
while (!(is_done == 0))
    pthread_cond_wait(&cond, &mutex);
pthread_mutex_unlock(&mutex);
Can you explain to me why, after the "while" statement (and after receiving the signal from the second thread), we need to unlock the mutex, given that the signal causes the mutex to be locked?
I got this from a lot of examples on the web, but when I want to work with the mutex after receiving the signal, do I need to unlock it? Of course, I want to keep working on this thread after receiving the signal.

Can you explain to me why, after the "while" statement (and after receiving the signal from the second thread), we need to unlock the mutex, given that the signal causes the mutex to be locked?
When pthread_cond_wait returns the mutex is locked so that you can examine the shared state protected by it safely:
These functions atomically release mutex and cause the calling thread to block on the condition variable cond; atomically here means "atomically with respect to access by another thread to the mutex and then the condition variable". That is, if another thread is able to acquire the mutex after the about-to-block thread has released it, then a subsequent call to pthread_cond_broadcast() or pthread_cond_signal() in that thread shall behave as if it were issued after the about-to-block thread has blocked.
Upon successful return, the mutex shall have been locked and shall be owned by the calling thread. If mutex is a robust mutex where an owner terminated while holding the lock and the state is recoverable, the mutex shall be acquired even though the function returns an error code.
Generally, a mutex should be unlocked as soon as possible to reduce contention.
In your case, you wait for the shared state is_done to change. Once it has changed and you are done accessing the shared state protected by the mutex, the mutex must be unlocked.
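A minimal sketch of that sequence follows. It assumes, unlike the snippet in the question, that is_done starts at 0 and becomes nonzero when the work finishes; the worker thread, the result variable, and the value 42 are hypothetical additions for illustration.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond  = PTHREAD_COND_INITIALIZER;
static int is_done = 0;    /* protected by mutex; assumed to become 1 when finished */
static int result  = 0;    /* protected by mutex; hypothetical payload */

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex);
    result  = 42;                       /* publish the shared state ... */
    is_done = 1;                        /* ... and mark it ready */
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Returns the value published by the worker, or -1 on error. */
int wait_for_result(void)
{
    pthread_t t;
    int copy;

    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;

    pthread_mutex_lock(&mutex);
    while (!is_done)
        pthread_cond_wait(&cond, &mutex);
    copy = result;                      /* mutex is locked again: safe to read */
    pthread_mutex_unlock(&mutex);       /* finished with shared state, so unlock */

    assert(copy == 42);
    if (pthread_join(t, NULL) != 0)
        return -1;
    return copy;                        /* keep working without holding the mutex */
}
```

The thread copies what it needs while the mutex is held and then unlocks, so it can continue its own work without blocking other threads that need the shared state.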

What happens to a thread that got woken up by pthread_cond_signal() but lost competition for a mutex

Regarding this:
How To Use Condition Variable
Say we have number of consumer threads that execute such code (copied from the referenced page):
while (TRUE) {
    s = pthread_mutex_lock(&mtx);
    while (avail == 0) { /* Wait for something to consume */
        s = pthread_cond_wait(&cond, &mtx);
    }
    while (avail > 0) { /* Consume all available units */
        avail--;
    }
    s = pthread_mutex_unlock(&mtx);
}
I assume that scenario here is: main thread calls pthread_cond_signal() to tell consumer threads to do some work.
As I understand it - subsequent threads call pthread_mutex_lock() and then pthread_cond_wait() (which atomically unlocks the mutex). By now none of the consumer threads is claiming the mutex, they all wait on pthread_cond_wait().
When the main thread calls pthread_cond_signal(), following the manpage, at least one thread is woken up. When any of them returns from pthread_cond_wait() it automatically claims the mutex.
So my question is: what happens now regarding the provided example code? Namely, what does the thread that lost the contest for the mutex do now?
(AFAICT the thread that won the mutex should run the rest of the code and release the mutex. The one that lost should be stuck waiting on the mutex, somewhere in the first nested while loop, while the winner holds it, and after it has been released it should start blocking on pthread_cond_wait() again because the while (avail == 0) condition will be satisfied by then. Am I correct?)

Note that pthread_cond_signal() is generally intended to wake up only one waiting thread (that's all that it guarantees). But it could wake more 'accidentally'. The while (avail > 0) loop performs two functions:
it allows the one thread guaranteed to be woken up to consume all queued work units
it prevents additional 'accidentally' awakened threads from assuming that there's work to be done, when there might not be since the initial thread would have handled all of them.
It also prevents a race condition where a work unit might have been placed on the queue after the while (avail > 0) has completed, but before the worker thread has waited on the condition again - but that race is also handled by the while (avail == 0) test just before calling pthread_cond_wait().
Basically when a thread is awakened, all it knows is that there might be work units for it to consume, but there might not (another thread might have consumed them).
So the sequence of events that occurs when pthread_cond_signal() is called is:
the system will wake one or more threads waiting on the condition
all the threads that are awakened will then try to acquire the mutex - only one of them can acquire it at any particular moment, since that's the purpose of a mutex
that thread will then proceed, perform the work in the while (avail > 0) loop, then will release the mutex
at that point one of the other threads that were previously woken up will acquire the mutex and work the same loop, then release the mutex. Generally, there will be no work units available anymore (since the first thread would have consumed all of them), but if another thread had added an additional unit (or more), then this thread would handle that work
the next thread will acquire the mutex and perform that same set of logic
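The sequence above can be sketched with several consumers sharing one counter. This is only an illustration under stated assumptions: the names (pc_mtx, pc_cond, pc_avail, pc_consumed, pc_closed, run_consumers), the thread and unit counts, and the shutdown flag are hypothetical and are not part of the referenced example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

enum { PC_NCONSUMERS = 3, PC_NUNITS = 100 };    /* hypothetical sizes */

static pthread_mutex_t pc_mtx  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  pc_cond = PTHREAD_COND_INITIALIZER;
static int  pc_avail    = 0;       /* protected by pc_mtx */
static int  pc_consumed = 0;       /* protected by pc_mtx */
static bool pc_closed   = false;   /* protected by pc_mtx; hypothetical shutdown flag */

static void *pc_consumer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&pc_mtx);
    for (;;) {
        /* A thread that lost the race finds pc_avail == 0 and waits again. */
        while (pc_avail == 0 && !pc_closed)
            pthread_cond_wait(&pc_cond, &pc_mtx);
        while (pc_avail > 0) {     /* consume everything that is queued */
            pc_avail--;
            pc_consumed++;
        }
        if (pc_closed)
            break;
    }
    pthread_mutex_unlock(&pc_mtx);
    return NULL;
}

/* Returns the total number of units consumed, or -1 on error. */
int run_consumers(void)
{
    pthread_t t[PC_NCONSUMERS];

    for (int i = 0; i < PC_NCONSUMERS; i++)
        if (pthread_create(&t[i], NULL, pc_consumer, NULL) != 0)
            return -1;

    for (int i = 0; i < PC_NUNITS; i++) {
        pthread_mutex_lock(&pc_mtx);
        pc_avail++;                        /* produce one unit */
        pthread_mutex_unlock(&pc_mtx);
        pthread_cond_signal(&pc_cond);     /* wakes at least one waiter, if any */
    }

    pthread_mutex_lock(&pc_mtx);
    pc_closed = true;
    pthread_mutex_unlock(&pc_mtx);
    pthread_cond_broadcast(&pc_cond);      /* let every consumer notice shutdown */

    for (int i = 0; i < PC_NCONSUMERS; i++)
        if (pthread_join(t[i], NULL) != 0)
            return -1;

    assert(pc_avail == 0);
    return pc_consumed;
}
```

However the wakeups are distributed among the consumers, every unit is consumed exactly once, because the counters are only touched while the mutex is held and each woken thread re-checks pc_avail before acting.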

pthread_cond_wait() has to reacquire the given mutex once it is signaled or woken up. If another thread wins that race, the function blocks until the mutex is released. So from the application's point of view, it doesn't return until the current thread holds the mutex. The wait is always done in a loop (while (avail == 0) { ... } above) to make sure that the application condition we are waiting for still holds (buffer not empty, more work available, etc.).
Hope this helps.

The thread that lost the contest wakes up once the mutex is unlocked, checks the condition again, then goes to sleep on the condition variable.

When any of them returns from pthread_cond_wait() it automatically claims the mutex.
Ah, but it doesn't. Not "automatically", that is, depending on what "automatically" means. You might be confused by the "atomic" semantics of pthread_cond_wait; but that semantics is played out on the entry side: a thread is somehow registered for waiting on the condition before giving up the mutex, so that there isn't any window during which the thread no longer has the mutex, and is not yet waiting on the variable.
Each thread which returns from pthread_cond_wait has to acquire the mutex and therefore contend for it. Those which lose the race for the mutex have to block on the mutex, just as if they had called pthread_mutex_lock.
The way the mutex is acquired on exit from pthread_cond_wait can be modeled as a regular pthread_mutex_lock operation. Essentially, the threads have to queue up on the mutex in order to exit. Each thread which acquires the mutex then returns from the function; the others have to wait until that thread gives up the mutex before they are allowed to return.
No thread woken up by the signal gets the mutex "automatically", in the sense of somehow being transferred ownership due to special eligibility. Firstly, on a multiprocessor, a woken thread can lose the race to a thread already running on another processor which snatches the mutex, if it is available, or else queue to wait on the mutex ahead of the thread which received the signal. Secondly, the thread which calls pthread_cond_signal may itself not have given up the mutex, and may continue to hold it indefinitely, which means that all the woken threads will queue up on a mutex lock operation and none will emerge from pthread_mutex_lock until that thread gives up the mutex.
All that is "automatic" is that the pthread_cond_wait operation doesn't return until acquiring the mutex again, and so the application doesn't have to take the step to acquire the mutex.
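The point that a woken thread cannot return until the signaling thread gives up the mutex can be checked with a small sketch. The names (ho_lk, ho_cv, ho_stage, ho_seen, run_hold_after_signal) are hypothetical; the signaler deliberately keeps the mutex after signaling and changes the shared state once more before unlocking.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t ho_lk = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ho_cv = PTHREAD_COND_INITIALIZER;
static int ho_stage = 0;     /* protected by ho_lk */
static int ho_seen  = -1;    /* value the waiter observed on return */

static void *ho_sleeper(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&ho_lk);
    while (ho_stage == 0)
        pthread_cond_wait(&ho_cv, &ho_lk);  /* returns only with ho_lk held */
    ho_seen = ho_stage;
    pthread_mutex_unlock(&ho_lk);
    return NULL;
}

/* Returns the stage observed by the waiter, or -1 on error. */
int run_hold_after_signal(void)
{
    pthread_t t;

    if (pthread_create(&t, NULL, ho_sleeper, NULL) != 0)
        return -1;

    pthread_mutex_lock(&ho_lk);
    ho_stage = 1;
    pthread_cond_signal(&ho_cv);   /* a waiter may wake, but it must queue on ho_lk */
    ho_stage = 2;                  /* still holding ho_lk, so the waiter cannot see 1 */
    pthread_mutex_unlock(&ho_lk);  /* only now can the waiter leave pthread_cond_wait */

    if (pthread_join(t, NULL) != 0)
        return -1;
    return ho_seen;
}
```

The waiter can only ever observe stage 2: whether it was already blocked on the condition variable or had not yet taken the mutex, it reads ho_stage only after the signaling thread has released the mutex.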

What happens to Mutex when the thread which acquired it exits?

Suppose there are two threads, the main thread and thread B (created by main). Suppose B acquires a mutex (say, a pthread mutex) and then calls pthread_exit without unlocking it. What happens to the mutex? Does it become free?

No. The mutex remains locked. What actually happens to such a lock depends on its type; you can read about that here or here.

If you created a robust mutex by setting up the right attributes before calling pthread_mutex_init, the mutex will enter a special state when the thread that holds the lock terminates, and the next thread to attempt to acquire the mutex will obtain an error of EOWNERDEAD. It is then responsible for cleaning up whatever state the mutex protects and calling pthread_mutex_consistent to make the mutex usable again, or calling pthread_mutex_unlock (which will make the mutex permanently unusable; further attempts to use it will return ENOTRECOVERABLE).
For non-robust mutexes, the mutex is permanently unusable if the thread that locked it terminates without unlocking it. Per the standard (see the resolution to issue 755 on the Austin Group tracker), the mutex remains locked and its formal ownership continues to belong to the thread that exited, and any thread that attempts to lock it will deadlock. If another thread attempts to unlock it, that's normally undefined behavior, unless the mutex was created with the PTHREAD_MUTEX_ERRORCHECK attribute, in which case an error will be returned.
On the other hand, many (most?) real-world implementations don't actually follow the requirements of the standard. An attempt to lock or unlock the mutex from another thread might spuriously succeed, since the thread id (used to track ownership) might have been reused and may now refer to a different thread (possibly the one making the new lock/unlock request). At least glibc's NPTL is known to exhibit this behavior.
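The robust-mutex recovery path described above can be sketched as follows. This assumes a platform that implements robust mutexes as specified by POSIX.1-2008; the names (rb_m, rb_die_holding_lock, run_robust) are hypothetical, and error checking on the setup calls is omitted for brevity.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L    /* for the robust mutex declarations */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t rb_m;       /* initialized as robust in run_robust() */

static void *rb_die_holding_lock(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&rb_m);
    return NULL;                   /* thread terminates while still owning rb_m */
}

/* Returns the result of the first lock attempt after the owner died. */
int run_robust(void)
{
    pthread_mutexattr_t attr;
    pthread_t t;
    int rc;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(&rb_m, &attr);
    pthread_mutexattr_destroy(&attr);

    if (pthread_create(&t, NULL, rb_die_holding_lock, NULL) != 0)
        return -1;
    pthread_join(t, NULL);         /* the owner is gone once this returns */

    rc = pthread_mutex_lock(&rb_m);
    if (rc == EOWNERDEAD) {
        /* We now hold the lock. Repair the protected state here, then
         * mark the mutex consistent so it stays usable after unlock. */
        pthread_mutex_consistent(&rb_m);
    }
    if (rc == 0 || rc == EOWNERDEAD)
        pthread_mutex_unlock(&rb_m);

    pthread_mutex_destroy(&rb_m);
    return rc;
}
```

If pthread_mutex_consistent were skipped before the unlock, later lock attempts would be expected to fail with ENOTRECOVERABLE, as described above.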