Is it possible for one pthread_cond_wait() to consume multiple pthread_cond_signal() calls?

I've tested this scenario in some environments, and I got the following flow:
However, from the man pages ( http://linux.die.net/man/3/pthread_cond_wait ) or ( http://linux.die.net/man/3/pthread_cond_signal ), I cannot find any guarantee that the following scenario cannot happen:
That is, two threads that each call signal can both run before any waiting thread has the chance to run (a scheduling possibility).
[Now, I know that if this was done with semaphores, the second scenario would never happen... however in my case I really need to do this with cond-vars!]
In my case every post increments the predicate, so when the waiting Thread2 wakes up it checks the predicate (which in this case was incremented by 2), stops sleeping, and decrements the predicate by 1 (meaning that one post was consumed).
If this scenario can happen, it would imply that the "Thread1" might not wake up until a further post happens, although the predicate was incremented twice (post) and decremented only once (the Thread2 wait).
Even worse, a 3rd wait might never block as it would consume the previous-pending predicate increment.
I have not yet been able to trigger this problem, but does anyone know if this is a possible scenario?
NOTE: To overcome this possibility I've replaced pthread_cond_signal() with pthread_cond_broadcast(), so both Thread1 and Thread2 are guaranteed to wake up and consume the 2 increments. However, this solution decreases performance a bit (maybe not even significantly), and I bet it is not obvious to anyone looking at the code why we are using broadcasts here.

No, it is not possible for one pthread_cond_wait() to consume two signals.
pthread_cond_signal() is guaranteed to wake up at least one thread that is currently waiting on the condition variable. Once a thread has been signalled, it is no longer waiting on the condition variable (though it may still be waiting on the associated mutex), so a subsequent pthread_cond_signal() must awaken a different waiting thread (if there are any).
(In your second diagram, the second signal must target a thread other than Thread2, because Thread2 is no longer waiting on the condition variable at that point).
The exact wording in the POSIX spec for pthread_cond_signal is:
The pthread_cond_signal() function shall unblock at least one of the
threads that are blocked on the specified condition variable cond
(if any threads are blocked on cond).
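The counting predicate described in the question can be sketched as follows. This is only a minimal illustration: the names counter_cv, counter_cv_post(), counter_cv_wait() and counter_cv_demo() are hypothetical and not taken from the asker's code. Every post increments the predicate and signals, and every wait loops on the predicate before decrementing it, so each wait consumes exactly one post no matter how the threads are scheduled.

```c
#include <pthread.h>

/* Hypothetical counting "post/wait" pair built on a condition variable. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             pending;   /* the predicate: posts not yet consumed */
} counter_cv;

void counter_cv_init(counter_cv *c) {
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->pending = 0;
}

/* Post: increment the predicate, then wake at least one current waiter. */
void counter_cv_post(counter_cv *c) {
    pthread_mutex_lock(&c->lock);
    c->pending++;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* Wait: sleep until the predicate is non-zero, then consume one post. */
void counter_cv_wait(counter_cv *c) {
    pthread_mutex_lock(&c->lock);
    while (c->pending == 0)
        pthread_cond_wait(&c->cond, &c->lock);
    c->pending--;
    pthread_mutex_unlock(&c->lock);
}

static void *counter_cv_waiter(void *arg) {
    counter_cv_wait((counter_cv *)arg);
    return NULL;
}

/* Start two waiters, post twice, and return the number of unconsumed posts. */
int counter_cv_demo(void) {
    counter_cv c;
    pthread_t t1, t2;
    int left;

    counter_cv_init(&c);
    pthread_create(&t1, NULL, counter_cv_waiter, &c);
    pthread_create(&t2, NULL, counter_cv_waiter, &c);
    counter_cv_post(&c);
    counter_cv_post(&c);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    left = c.pending;
    pthread_cond_destroy(&c.cond);
    pthread_mutex_destroy(&c.lock);
    return left;
}
```

Both joins return whether the posts happen before or after the waiters start waiting, because a waiter that arrives late finds the predicate already non-zero and never blocks.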

Related

pthread_cond_broadcast before and after one pthread_mutex_unlock

For the code below, the mutex will not be available by the time the second cond_broadcast is executed (assuming multiple threads are already waiting on the condition). In such a situation, does the broadcast select the thread (waiting on the condition) to hand the mutex to and wait for the mutex to be unlocked by some other thread, or is the second cond_broadcast just ignored?
void* func(void* arg){
    pthread_mutex_lock(&m);
    while(condition){
        pthread_cond_wait(&c,&m);
    }
    pthread_cond_broadcast(&c);   /* first broadcast: mutex still held */
    pthread_mutex_unlock(&m);
    pthread_cond_broadcast(&c);   /* second broadcast: mutex released */
    return NULL;
}
For the code below, the mutex will not available by the time second
cond_broadcast is executed(assuming multiple threads already waiting
on the condition).
I think you mean that the mutex will not be available to the thread calling pthread_cond_broadcast() at the second call to that function, but that's irrelevant. Calling pthread_cond_broadcast() is independent of holding any mutex.
Or perhaps you mean that one of the previously blocked threads will have acquired the mutex by the time the second broadcast happens, but (1) that's not certain, and (2) if it does happen, that has no particular significance with respect to the broadcast.
In such situation, does the broadcast select the
thread(waiting on the condition) to hand the mutex to and wait for the
mutex to be unlocked by some other thread or the second cond_broadcast
is just ignored?
Neither. pthread_cond_broadcast() and pthread_cond_signal() have no role in locking or transferring control of any mutex. They just wake threads blocked on the associated CV. That each such thread must acquire the mutex before returning from the call is a separate consideration -- they all contend normally to lock the mutex, and they do not return from pthread_cond_wait() until they do. They also do not go back to waiting without first returning from their wait and then calling pthread_cond_wait() again.
But that does not mean that the second pthread_cond_broadcast() in your code necessarily will have no effect. One of the just-woken threads might loop around and wait on the CV again between the two calls, or some other thread might arrive at the CV. That becomes possible as soon as the first thread releases the mutex, and the fact that the first thing that thread tries to do is another broadcast does not ensure that the broadcast happens before another thread can start waiting.
It is unlikely that you want two broadcasts one after the other like that, but which one you retain has little, if any, effect on the overall semantics of the program.
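As a rough sketch of the single-broadcast version, assuming a simple "gate" predicate (the names gate_lock, gate_cond, gate_released, gate_open() and gate_demo() are made up for this illustration): the state change happens under the mutex, and one broadcast is issued afterwards without the mutex held, which is permitted.

```c
#include <pthread.h>

static pthread_mutex_t gate_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  gate_cond = PTHREAD_COND_INITIALIZER;
static int gate_released = 0;  /* predicate, protected by gate_lock */
static int gate_passed   = 0;  /* waiters that got through, protected by gate_lock */

static void *gate_waiter(void *arg) {
    (void)arg;
    pthread_mutex_lock(&gate_lock);
    while (!gate_released)              /* re-check after every wakeup */
        pthread_cond_wait(&gate_cond, &gate_lock);
    gate_passed++;
    pthread_mutex_unlock(&gate_lock);
    return NULL;
}

/* Change the predicate under the mutex, then broadcast exactly once. */
void gate_open(void) {
    pthread_mutex_lock(&gate_lock);
    gate_released = 1;
    pthread_mutex_unlock(&gate_lock);
    pthread_cond_broadcast(&gate_cond); /* the caller does not need to hold the mutex */
}

/* Start up to 8 waiters, open the gate, and count how many got through. */
int gate_demo(int n) {
    pthread_t t[8];
    int i;

    if (n > 8)
        n = 8;
    for (i = 0; i < n; i++)
        pthread_create(&t[i], NULL, gate_waiter, NULL);
    gate_open();
    for (i = 0; i < n; i++)
        pthread_join(t[i], NULL);
    return gate_passed;
}
```

The woken threads contend for gate_lock one at a time; the broadcast itself never hands the mutex to anyone.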

What would happen if pthread_cond_wait was not atomic?

Scenario 1: release mutex then wait
Scenario 2: wait and then release mutex
Trying to understand conceptually what it does.
If the mutex were released before the calling thread is considered "blocked" on the condition variable, then another thread could lock the mutex, change the state that the predicate is based on, and call pthread_cond_signal without the waiting thread ever waking up (since it's not yet blocked). That's the problem.
Scenario 2, waiting then releasing the mutex, is internally how any real-world implementation has to work, since there's no such thing as an atomic implementation of the necessary behavior. But from the application's perspective, there's no way to observe the thread being part of the blocked set without the mutex also being released, so in the sense of the "abstract machine", it's atomic.
Edit: To go into more detail, the real-world implementation of a condition variable wait generally looks like:
1. Modify some internal state of the condition variable object such that the caller is considered to be part of the blocked set for it.
2. Unlock the mutex.
3. Perform a blocking wait operation, with the special property that it will return immediately if the state of the condition variable object from step 1 has changed due to a signal from any other thread.
Thus, the act of "blocking" is split between two steps, one of which happens before the mutex is unlocked (gaining membership in the blocked set) and the other of which happens after the mutex is unlocked (possibly sleeping and yielding control to other threads). It's this split that's able to make the "condition wait" operation "atomic" in the abstract machine.
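The three steps can be imitated with a deliberately simplified toy, shown here only to make the ordering concrete. It is not how any real library implements condition variables: toy_cond and its functions are hypothetical, step 1 is reduced to taking a snapshot of a generation counter instead of registering in a waiter list, and step 3 spins with sched_yield() where a real implementation would sleep in the kernel (on Linux, typically via futex). Every signal wakes all toy waiters, which is allowed because callers must re-check their predicate anyway.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Toy condition variable: just a generation counter. */
typedef struct {
    atomic_uint seq;
} toy_cond;

void toy_cond_wait(toy_cond *cv, pthread_mutex_t *m) {
    /* Step 1: while still holding m, record the current generation. */
    unsigned seen = atomic_load(&cv->seq);

    /* Step 2: release the mutex so that signalling threads can proceed. */
    pthread_mutex_unlock(m);

    /* Step 3: "block" until the generation changes. If a signal arrived
     * between steps 2 and 3, the loop exits immediately, so no wakeup is lost. */
    while (atomic_load(&cv->seq) == seen)
        sched_yield();

    pthread_mutex_lock(m);
}

void toy_cond_signal(toy_cond *cv) {
    atomic_fetch_add(&cv->seq, 1);
}

static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;
static toy_cond toy_cv;          /* static storage: counter starts at 0 */
static int toy_ready = 0;        /* predicate, protected by toy_lock */

static void *toy_waiter(void *arg) {
    (void)arg;
    pthread_mutex_lock(&toy_lock);
    while (!toy_ready)
        toy_cond_wait(&toy_cv, &toy_lock);
    pthread_mutex_unlock(&toy_lock);
    return NULL;
}

/* Start one waiter, set the predicate, signal, and wait for it to finish. */
int toy_demo(void) {
    pthread_t t;

    pthread_create(&t, NULL, toy_waiter, NULL);
    pthread_mutex_lock(&toy_lock);
    toy_ready = 1;
    pthread_mutex_unlock(&toy_lock);
    toy_cond_signal(&toy_cv);
    pthread_join(t, NULL);
    return toy_ready;
}
```

If the snapshot in step 1 were taken after the unlock instead, a signal could slip in between and the waiter would spin on a generation that never changes again, which is the lost-wakeup problem of Scenario 1.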

Breaking a condition variable deadlock

I have a situation where thread 1 is waiting on a condition variable A, which should be woken up by thread 2. Now thread 2 is waiting on a condition variable B, which should be woken up by thread 1. In the scenario where I am using the condition variables, I cannot avoid such a deadlock situation. I detect the cycle (deadlock) and terminate one of the threads that participates in the deadlock.
Now, what I am not sure about is how to simply terminate a thread, say thread 1, which is waiting on a condition variable.
Would be grateful for some pointers.
Thanks
Condition variables aren't like mutexes. By that I mean they aren't only usable by a single thread controlling them. The mutex that protects the condition variable is treated that way but that's only locked for short periods of time, unlocked manually by a thread after kicking (signalling) the condition variable, and automatically by a thread waiting for such a kick.
You can have a totally separate thread (like your deadlock detector, let's call it thread 3) simply kick one of the condition variables and it will wake up the thread waiting for it.
The usual use case for condition variables is for threads to wait for the kick then check to ensure you have work anyway (don't assume there is work simply because the variable was kicked). That's to take care of spurious wake-ups.
One possibility is to have a "global" deadlock_occurred flag which thread 3 sets when it detects deadlock, then also have thread 3 kick all the condition variables.
The first thing that threads 1 and 2 should do after being woken, should be to check that flag and take appropriate action (probably exit the thread).
You'll find you get into a lot less deadlock-type trouble if you architect your applications so that threads are responsible for their own lifetime. It's too easy to externally kill threads when they're not in a state amenable to being terminated. Don't get me wrong, there are other ways to handle it (such as with cancel points), but my tried and tested solution is by far the easiest I've ever found.
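A minimal sketch of the flag-and-kick approach might look like this. It shows only thread 1 and condition variable A; B would be handled the same way. The single shared mutex dl_lock and the names a_ready, report_deadlock() and deadlock_demo() are assumptions made for the example, not part of the question.

```c
#include <pthread.h>
#include <stdint.h>

static pthread_mutex_t dl_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond_a  = PTHREAD_COND_INITIALIZER;
static int a_ready = 0;             /* normal wakeup reason for thread 1 */
static int deadlock_occurred = 0;   /* "global" flag set by the detector */

/* Thread 1: returns 1 if it was given work, 0 if it gave up due to deadlock. */
static void *thread1_main(void *arg) {
    int got_work;

    (void)arg;
    pthread_mutex_lock(&dl_lock);
    while (!a_ready && !deadlock_occurred)
        pthread_cond_wait(&cond_a, &dl_lock);
    got_work = a_ready;             /* first thing after waking: check why */
    pthread_mutex_unlock(&dl_lock);
    return (void *)(intptr_t)got_work;
}

/* Thread 3 (the detector): set the flag, then kick the condition variable. */
void report_deadlock(void) {
    pthread_mutex_lock(&dl_lock);
    deadlock_occurred = 1;
    pthread_cond_broadcast(&cond_a);
    pthread_mutex_unlock(&dl_lock);
}

/* Start thread 1, report a deadlock, and return thread 1's result. */
int deadlock_demo(void) {
    pthread_t t;
    void *res;

    pthread_create(&t, NULL, thread1_main, NULL);
    report_deadlock();
    pthread_join(t, &res);
    return (int)(intptr_t)res;
}
```

Thread 1 exits on its own once it sees the flag, so nothing has to kill it from outside.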

concurrent threads in C programming

I have encountered a problem while implementing wait and signal conditions on multiple threads.
A thread needs to lock a mutex and wait on a condition variable until some other thread signals it. In the meanwhile, another thread locks the same mutex and waits on the same condition variable. Now, the thread which is running concurrently throughout the process signals the condition variable but I want only the first thread that is waiting must be signalled and not the others.
If two threads wait on the same condition variable, they must be prepared to handle the same conditions, or you must carefully construct your program so they are never waiting on the condition variable at the same time.
Why does this notification have to be handled by the first thread and not the second?
You may be better off with two separate condition variables.
Use pthread_cond_signal() to wake up one of the threads.
However, more than one might be awoken; this is termed spurious wakeup. You need a variable to track your application state, as described in the manual page linked above.
Your requirement is impossible. You say "... I want only the first thread that is waiting must be signalled and not the others." But condition variables never, ever provide any way to ensure a thread isn't signaled. So if you have a requirement that a thread must not be signaled, you cannot use condition variables.
You must always use a condition variable like this:
while(NotSupposedToRun)
    pthread_cond_wait(...);
So if the thread wakes up when it's not supposed to, the while condition is still true and the thread just goes back to sleep. This is mandatory because POSIX does not ever provide any guarantee that a thread won't be woken. An implementation is perfectly free to implement pthread_cond_signal as a call to pthread_cond_broadcast and unblock all threads on every signal if it wants to.
Because condition variables are stateless, the implementation never knows whether a thread is supposed to be woken or not for sure. It is your job to call pthread_cond_wait always, and only, when a thread should not be running.
See http://en.wikipedia.org/wiki/Spurious_wakeup for more details.
If you cannot precisely specify the wakeup conditions for each thread in a while loop like the one above, you should not be using condition variables.
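One way to get the effect of "wake only this thread" is to give each thread its own predicate and let everyone else go back to sleep. The sketch below assumes a hypothetical turn variable that holds the id of the thread allowed to proceed; the names turn, wake_only() and turn_demo() are invented for the illustration.

```c
#include <pthread.h>

static pthread_mutex_t turn_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  turn_cond = PTHREAD_COND_INITIALIZER;
static int turn = -1;     /* id of the thread allowed to run; -1 means nobody */
static int ran[2];        /* ran[id] is set once thread id has proceeded */

static void *turn_worker(void *arg) {
    int id = *(int *)arg;

    pthread_mutex_lock(&turn_lock);
    while (turn != id)                  /* each thread has its own wakeup condition */
        pthread_cond_wait(&turn_cond, &turn_lock);
    ran[id] = 1;
    pthread_mutex_unlock(&turn_lock);
    return NULL;
}

/* Every waiter may wake up, but only the one whose predicate holds proceeds. */
void wake_only(int id) {
    pthread_mutex_lock(&turn_lock);
    turn = id;
    pthread_cond_broadcast(&turn_cond);
    pthread_mutex_unlock(&turn_lock);
}

/* Returns 1 if thread 1 stayed asleep while only thread 0 was released. */
int turn_demo(void) {
    pthread_t t[2];
    int ids[2] = {0, 1};
    int other_ran;

    pthread_create(&t[0], NULL, turn_worker, &ids[0]);
    pthread_create(&t[1], NULL, turn_worker, &ids[1]);

    wake_only(0);
    pthread_join(t[0], NULL);

    pthread_mutex_lock(&turn_lock);
    other_ran = ran[1];                 /* must still be 0 at this point */
    pthread_mutex_unlock(&turn_lock);

    wake_only(1);
    pthread_join(t[1], NULL);
    return other_ran == 0 && ran[0] == 1 && ran[1] == 1;
}
```

The other waiter may well be woken by the broadcast, but its predicate is false, so it simply waits again; two separate condition variables would avoid that extra wakeup.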

Semaphore queues

I'm extending the functionality of a semaphore. I ran into a roadblock when I realized I don't know the implementation of an actual semaphore and to make sure my code ran correctly, I needed to know this.
I know a semaphore works by blocking threads that are waiting on it when they call sem_wait() and another thread currently has it locked. The thread is then blocked and then put into a wait list for that semaphore.
My question relates to what happens on a sem_post(). Is the next thread pulled off the waiting list, set as the locking thread, and allowed to be unblocked? Or is the scheme for posting completely different?
Thanks!
The next thread to unblock on its sem_wait() will be whatever thread the OS decides is the next one to context switch into. Nobody makes any guarantee of ordering; it depends on your OS's scheduling strategy. It might be the thread that has been off the CPU for the longest, or the one that has been assigned the highest "priority", or the one that has historically had certain resource-usage statistics, or whatever.
Most likely, your current thread (the one that called sem_post()) will continue running for a while, until it either starts waiting for user input, blocks on another semaphore, or runs out of its OS-allotted time slice. Then, the OS will switch in some totally unrelated process to run for a fraction of a second (probably Firefox or something), then go off and handle some network traffic, get itself a cup of tea, and, finally, when it gets around to it, pick whichever of your other threads it feels like, based on something like whether past history suggests that the particular thread is more CPU-bound or I/O-bound.
In many OSes, priority is given to I/O-bound processes that haven't been around for very long. The theory is that new processes might be short-lived (if it's been around for five hours already, odds are it won't be finishing up in the next 1ms) so we might as well get them over with. I/O-bound processes are likely to continue to be I/O-bound, which means that chances are they are going to switch off the CPU shortly while waiting for other resources. Basically, the OS wants to find the process that it's going to be able to be done with ASAP, so it can get back to sipping its tea and running your malware.
Semaphores have two operations:
P() To acquire the semaphore (you seem to call this sem_wait)
V() To release the semaphore (you seem to call this sem_post)
Semaphores also have an integer associated to them, which is the number of concurrent threads allowed to pass P() without blocking. Other calls to P() will block until V() is called to free up spots.
That is the classic definition of a semaphore.
Edit: Semaphores do not make any guarantee of order. They don't have to actually use a queue or other FIFO structure. When only one thread is allowed at a time, when it calls V(), another (possibly random) thread will then return from its P() call and continue.
According to the IEEE standard, sem_post() on a POSIX semaphore behaves as follows:
If the semaphore value resulting from this operation is positive, then no threads were blocked waiting for the semaphore to become unlocked; the semaphore value is simply incremented.
If the value of the semaphore resulting from this operation is zero, then one of the threads blocked waiting for the semaphore shall be allowed to return successfully from its call to sem_wait(). If the Process Scheduling option is supported, the thread to be unblocked shall be chosen in a manner appropriate to the scheduling policies and parameters in effect for the blocked threads. In the case of the schedulers SCHED_FIFO and SCHED_RR, the highest priority waiting thread shall be unblocked, and if there is more than one highest priority thread blocked waiting for the semaphore, then the highest priority thread that has been waiting the longest shall be unblocked. If the Process Scheduling option is not defined, the choice of a thread to unblock is unspecified.
If the Process Sporadic Server option is supported, and the scheduling policy is SCHED_SPORADIC, the semantics are as per SCHED_FIFO above.
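The counting behaviour can be checked from a single thread with a short sketch. It assumes unnamed semaphores are available (sem_init() works on Linux but is not supported on every platform), and sem_count_demo() is a hypothetical helper name.

```c
#include <errno.h>
#include <semaphore.h>

/* Post `posts` times to a semaphore that starts at 0, then count how many
 * non-blocking acquisitions succeed. Returns -1 on an unexpected error. */
int sem_count_demo(int posts) {
    sem_t s;
    int taken = 0;
    int drained;
    int i;

    if (sem_init(&s, 0, 0) != 0)      /* not shared between processes, value 0 */
        return -1;

    for (i = 0; i < posts; i++)
        sem_post(&s);                 /* no waiters: the value is simply incremented */

    while (sem_trywait(&s) == 0)      /* each success decrements the value by one */
        taken++;

    drained = (errno == EAGAIN);      /* EAGAIN: the value is zero, a wait would block */
    sem_destroy(&s);
    return drained ? taken : -1;
}
```

Unlike a condition variable, the semaphore remembers posts that arrive while nobody is waiting, which is why every post here is later matched by exactly one successful wait.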
