I have written this code in C, and two pthreads use it and try to access the mutex "firstSection" (in both of them we are sure that the mutex passed to the function is the same). The code is supposed to check two mutexes and, if both are available, perform some actions that take place in the function safeUnlockTwoMutexes(). If it fails to acquire at least one of them, it has to wait for two seconds and try again. (The "intersection" mutex is the main lock used to safely check the state of the other mutexes.)
void twoSectionRoute(pthread_mutex_t firstSection, pthread_mutex_t secondSection){
bool pathClear = false;
while (!pathClear){
pthread_mutex_lock(&intersection);
if (pthread_mutex_trylock(&firstSection) == 0){
if (pthread_mutex_trylock(&secondSection) == 0){
pathClear = true;
pthread_mutex_unlock(&intersection);
} else {
pthread_mutex_unlock(&firstSection);
pthread_mutex_unlock(&intersection);
sleep(2);
}
} else {
pthread_mutex_unlock(&intersection);
sleep(2);
}
}
safeUnlockTwoMutexes(firstSection, secondSection, 1);
}
Now the problem with this code is that both threads are able to lock the mutex "firstSection" at almost the same time, and I don't know why. (Maybe because its type is a recursive mutex? I've used "PTHREAD_MUTEX_INITIALIZER" at the beginning of the file, on global variables.)
How can I fix this issue so that the threads access these sections one after another?
Your function signature passes pthread_mutex_t values firstSection and secondSection by value. You need to pass mutexes by pointer.
void twoSectionRoute(pthread_mutex_t* firstSection, pthread_mutex_t* secondSection){
Then, within the function use just firstSection and secondSection rather than &firstSection and &secondSection.
If you pass the mutex by value (as here), and it compiles, then the mutex itself is copied, so you end up with undefined behaviour and the mutex locks do not operate on the same state.
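A minimal sketch of the pointer-based version, assuming a global guard mutex named intersection as in the question. The helper name try_lock_both is hypothetical, and the retry loop and sleep(2) are left to the caller.

```c
#include <pthread.h>
#include <stdbool.h>

/* Guard mutex, as in the question, statically initialized with default attributes. */
static pthread_mutex_t intersection = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical helper: acquire both section mutexes, or neither.
 * The mutexes are passed by pointer, so every caller operates on the
 * same mutex objects rather than on private copies. */
bool try_lock_both(pthread_mutex_t *first, pthread_mutex_t *second)
{
    bool ok = false;

    pthread_mutex_lock(&intersection);
    if (pthread_mutex_trylock(first) == 0) {
        if (pthread_mutex_trylock(second) == 0) {
            ok = true;                      /* caller now owns both */
        } else {
            pthread_mutex_unlock(first);    /* back out: second was busy */
        }
    }
    pthread_mutex_unlock(&intersection);
    return ok;
}
```

A caller would invoke it as try_lock_both(&sectionA, &sectionB), where sectionA and sectionB are the global mutexes, and sleep before retrying when it returns false.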
I've created a Timer pseudo class in C that has call back capability and can be cancelled. I come from the .NET/C# world where this is all done by the framework and I'm not an expert with pthreads.
In .NET there are cancellation tokens which you can wait on which means I don't need to worry so much about the nuts and bolts.
However using pthreads is a bit more low level than I am used to so my question is:
Are there any issues with the way I have implemented this?
Thanks in anticipation for any comments you may have.
Timer struct:
typedef struct _timer
{
pthread_cond_t Condition;
pthread_mutex_t ConditionMutex;
bool IsRunning;
pthread_mutex_t StateMutex;
pthread_t Thread;
int TimeoutMicroseconds;
void * Context;
void (*Callback)(bool isCancelled, void * context);
} TimerObject, *Timer;
C Module:
static void *
TimerTask(Timer timer)
{
struct timespec timespec;
struct timeval now;
int returnValue = 0;
clock_gettime(CLOCK_REALTIME, &timespec);
timespec.tv_sec += timer->TimeoutMicroseconds / 1000000;
timespec.tv_nsec += (timer->TimeoutMicroseconds % 1000000) * 1000000;
pthread_mutex_lock(&timer->StateMutex);
timer->IsRunning = true;
pthread_mutex_unlock(&timer->StateMutex);
pthread_mutex_lock(&timer->ConditionMutex);
returnValue = pthread_cond_timedwait(&timer->Condition, &timer->ConditionMutex, &timespec);
pthread_mutex_unlock(&timer->ConditionMutex);
if (timer->Callback != NULL)
{
(*timer->Callback)(returnValue != ETIMEDOUT, timer->Context);
}
pthread_mutex_lock(&timer->StateMutex);
timer->IsRunning = false;
pthread_mutex_unlock(&timer->StateMutex);
return 0;
}
void
Timer_Initialize(Timer timer, void (*callback)(bool isCancelled, void * context))
{
pthread_mutex_init(&timer->ConditionMutex, NULL);
timer->IsRunning = false;
timer->Callback = callback;
pthread_mutex_init(&timer->StateMutex, NULL);
pthread_cond_init(&timer->Condition, NULL);
}
bool
Timer_IsRunning(Timer timer)
{
pthread_mutex_lock(&timer->StateMutex);
bool isRunning = timer->IsRunning;
pthread_mutex_unlock(&timer->StateMutex);
return isRunning;
}
void
Timer_Start(Timer timer, int timeoutMicroseconds, void * context)
{
timer->Context = context;
timer->TimeoutMicroseconds = timeoutMicroseconds;
pthread_create(&timer->Thread, NULL, TimerTask, (void *)timer);
}
void
Timer_Stop(Timer timer)
{
void * returnValue;
pthread_mutex_lock(&timer->StateMutex);
if (!timer->IsRunning)
{
pthread_mutex_unlock(&timer->StateMutex);
return;
}
pthread_mutex_unlock(&timer->StateMutex);
pthread_cond_broadcast(&timer->Condition);
pthread_join(timer->Thread, &returnValue);
}
void
Timer_WaitFor(Timer timer)
{
void * returnValue;
pthread_join(timer->Thread, &returnValue);
}
Example use:
void
TimerExpiredCallback(bool cancelled, void * context)
{
fprintf(stderr, "TimerExpiredCallback %s with context %s\n",
cancelled ? "Cancelled" : "Timed Out",
(char *)context);
}
void
ThreadedTimerExpireTest()
{
TimerObject timerObject;
Timer_Initialize(&timerObject, TimerExpiredCallback);
Timer_Start(&timerObject, 5 * 1000000, "Threaded Timer Expire Test");
Timer_WaitFor(&timerObject);
}
void
ThreadedTimerCancelTest()
{
TimerObject timerObject;
Timer_Initialize(&timerObject, TimerExpiredCallback);
Timer_Start(&timerObject, 5 * 1000000, "Threaded Timer Cancel Test");
Timer_Stop(&timerObject);
}
Overall, it seems pretty solid work for someone who ordinarily works in different languages and who has little pthreads experience. The idea seems to revolve around pthread_cond_timedwait() to achieve a programmable delay with a convenient cancellation mechanism. That's not unreasonable, but there are, indeed, a few problems.
For one, your condition variable usage is non-idiomatic. The conventional and idiomatic use of a condition variable associates with each wait a condition for whether the thread is clear to proceed. This is tested, under protection of the mutex, before waiting. If the condition is satisfied then no wait is performed. It is tested again after each wakeup, because there is a variety of scenarios in which a thread may return from waiting even though it is not actually clear to proceed. In these cases, it loops back and waits again.
I see at least two such possibilities with your timer:
The timer is cancelled very quickly, before its thread starts to wait. Condition variables do not queue signals, so in this case the cancellation would be ineffective. This is a form of race condition.
Spurious wakeup. This is always a possibility that must be considered. Spurious wakeups are rare under most circumstances, but they really do happen.
It seems natural to me to address that by generalizing your IsRunning to cover more states, perhaps something more like
enum { NEW, RUNNING, STOPPING, FINISHED, ERROR } State;
, instead.
Of course, you still have to test that under protection of the appropriate mutex, which brings me to my next point: one mutex should suffice. That one can and should serve both to protect shared state and as the mutex associated with the CV wait. This, too, is idiomatic. It would lead to code in TimerTask() more like this:
// ...
pthread_mutex_lock(&timer->StateMutex);
// Responsibility for setting the state to RUNNING transferred to Timer_Start()
while (timer->State == RUNNING) {
returnValue = pthread_cond_timedwait(&timer->Condition, &timer->StateMutex, &timespec);
switch (returnValue) {
case 0:
if (timer->State == STOPPING) {
timer->State = FINISHED;
}
break;
case ETIMEDOUT:
timer->State = FINISHED;
break;
default:
timer->State = ERROR;
break;
}
}
pthread_mutex_unlock(&timer->StateMutex);
// ...
The accompanying Timer_Start() and Timer_Stop() would be something like this:
void Timer_Start(Timer timer, int timeoutMicroseconds, void * context) {
timer->Context = context;
timer->TimeoutMicroseconds = timeoutMicroseconds;
pthread_mutex_lock(&timer->StateMutex);
timer->State = RUNNING;
// start the thread before releasing the mutex so that no one can see state
// RUNNING before the thread is actually running
pthread_create(&timer->Thread, NULL, TimerTask, (void *)timer);
pthread_mutex_unlock(&timer->StateMutex);
}
void Timer_Stop(Timer timer) {
_Bool should_join = 0;
pthread_mutex_lock(&timer->StateMutex);
switch (timer->State) {
case NEW:
timer->State = FINISHED;
break;
case RUNNING:
timer->State = STOPPING;
should_join = 1;
break;
case STOPPING:
should_join = 1;
break;
// else no action
}
pthread_mutex_unlock(&timer->StateMutex);
// Harmless if the timer has already stopped:
pthread_cond_broadcast(&timer->Condition);
if (should_join) {
pthread_join(timer->Thread, NULL);
}
}
A few other, smaller adjustments would be needed elsewhere.
Additionally, although the example code above omits it for clarity, you really should ensure that you test the return values of all the functions that provide status information that way, unless you don't care whether they succeeded. That includes almost all standard library and Pthreads functions. What you should do in the event that that one fails is highly contextual, but pretending (or assuming) that it succeeded, instead, is rarely a good choice.
An alternative
Another approach to a cancellable delay would revolve around select() or pselect() with a timeout. To arrange for cancellation, you set up a pipe and have select() listen to the read end. Writing anything to the write end will then wake select().
This is in several ways easier to code, because you don't need any mutexes or condition variables. Also, data written to a pipe persists until it is read (or the pipe is closed), which smooths out some of the timing-related issues that the CV-based approach has to code around.
With select, however, you need to be prepared to deal with signals (at minimum by blocking them), and the timeout is a duration, not an absolute time.
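A minimal sketch of the pipe-based idea, assuming the caller has created the pipe with pipe() and passes the read end. The helper name wait_or_cancel is hypothetical. The sketch does not retry on EINTR, so a real version would need the signal handling mentioned above.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_us microseconds.
 * Returns 1 if cancel_fd became readable (cancelled),
 *         0 if the timeout expired,
 *        -1 if select() failed (including interruption by a signal). */
int wait_or_cancel(int cancel_fd, long timeout_us)
{
    fd_set readfds;
    struct timeval tv;

    FD_ZERO(&readfds);
    FD_SET(cancel_fd, &readfds);

    /* select() takes a duration, not an absolute deadline. */
    tv.tv_sec  = timeout_us / 1000000;
    tv.tv_usec = timeout_us % 1000000;

    int n = select(cancel_fd + 1, &readfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(cancel_fd, &readfds)) ? 1 : 0;
}
```

To cancel, another thread writes a single byte to the write end of the pipe. Because the byte stays in the pipe until it is read, a cancellation issued before the wait starts is still seen.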
pthread_mutex_lock(&timer->StateMutex);
timer->IsRunning = true;
pthread_mutex_unlock(&timer->StateMutex);
pthread_mutex_lock(&timer->ConditionMutex);
returnValue = pthread_cond_timedwait(&timer->Condition, &timer->ConditionMutex, &timespec);
pthread_mutex_unlock(&timer->ConditionMutex);
if (timer->Callback != NULL)
{
(*timer->Callback)(returnValue != ETIMEDOUT, timer->Context);
}
You have two bugs here.
1. A cancellation can slip in after IsRunning is set to true and before pthread_cond_timedwait gets called. In this case, you'll wait out the entire timer. This bug exists because ConditionMutex doesn't protect any shared state. To use a condition variable properly, the mutex associated with the condition variable must protect the shared state. You can't trade the right mutex for the wrong mutex and then call pthread_cond_timedwait because that creates a race condition. The entire point of a condition variable is to provide an atomic "unlock and wait" operation to prevent this race condition and your code goes to effort to break that logic.
2. You don't check the return value of pthread_cond_timedwait. If neither the timeout has expired nor cancellation has been requested, you call the callback anyway. Condition variables are stateless. It is your responsibility to track and check state, the condition variable will not do this for you. You need to call pthread_cond_timedwait in a loop until either the state is set to STOPPING or the timeout is reached. Note that the mutex associated with the condition variable, as in 1 above, must protect the shared state -- in this case state.
I think you have a fundamental misunderstanding about how condition variables work and what they're for. They are used when you have a mutex that protects shared state and you want to wait for that shared state to change. The mutex associated with the condition variable must protect the shared state to avoid the classic race condition where the state changes after you released the lock but before you managed to start waiting.
UPDATE:
To provide some more useful information, let me briefly explain what a condition variable is for. Say you have some shared state protected by a mutex. And say some thread can't make forward progress until that shared state changes.
You have a problem. You have to hold the mutex that protects the shared state to see what the state is. When you see that it's in the wrong state, you need to wait. But you also need to release the mutex or no other thread can change the shared state.
But if you unlock the mutex and then wait (which is what your code does above!) you have a race condition. After you unlock the mutex but before you wait, another thread can acquire the mutex and change the shared state such that you no longer want to wait. So you need an atomic "unlock the mutex and wait" operation.
That is the purpose, and the only purpose, of condition variables: they let you atomically release the mutex that protects some shared state and wait for a signal, with no chance for the signal to be lost between when you released the mutex and when you started waiting.
Another important point -- condition variables are stateless. They have no idea what you are waiting for. You must never call pthread_cond_wait or pthread_cond_timedwait and make assumptions about the state. You must check it yourself. Your code releases the mutex after pthread_cond_timedwait returns. You only want to do that if the call times out.
If pthread_cond_timedwait doesn't timeout (or, in any case, when pthread_cond_wait returns), you don't know what happened until you check the state. That's why these functions re-acquire the mutex -- so you can check the state and decide what to do. This is why these functions are almost always called in a loop -- if the thing you're waiting for still hasn't happened (which you determine by checking the shared state that you are responsible for), you need to keep waiting.
I tried to build a program which should create threads and assign a Print function to each one of them, while the main process should use printf function directly.
Firstly, I made it without any synchronization means and expected to get a randomized output.
Later I tried to add a mutex to the Print function which was assigned to the threads and expected to get a chronological output, but it seems the mutex had no effect on the output.
Should I use a mutex on the printf function in the main process as well?
Thanks in advance
My code:
#include <stdio.h>
#include <pthread.h>
#include <errno.h>
pthread_t threadID[20];
pthread_mutex_t lock;
void* Print(void* _num);
int main(void)
{
int num = 20, indx = 0, k = 0;
if (pthread_mutex_init(&lock, NULL))
{
perror("err pthread_mutex_init\n");
return errno;
}
for (; indx < num; ++indx)
{
if (pthread_create(&threadID[indx], NULL, Print, &indx))
{
perror("err pthread_create\n");
return errno;
}
}
for (; k < num; ++k)
{
printf("%d from main\n", k);
}
indx = 0;
for (; indx < num; ++indx)
{
if (pthread_join(threadID[indx], NULL))
{
perror("err pthread_join\n");
return errno;
}
}
pthread_mutex_destroy(&lock);
return 0;
}
void* Print(void* _indx)
{
pthread_mutex_lock(&lock);
printf("%d from thread\n", *(int*)_indx);
pthread_mutex_unlock(&lock);
return NULL;
}
All questions of program bugs notwithstanding, pthreads mutexes provide only mutual exclusion, not any guarantee of scheduling order. This is typical of mutex implementations. Similarly, pthread_create() only creates and starts threads; it does not make any guarantee about scheduling order, such as would justify an assumption that the threads reach the pthread_mutex_lock() call in the same order that they were created.
Overall, if you want to order thread activities based on some characteristic of the threads, then you have to manage that yourself. You need to maintain a sense of which thread's turn it is, and provide a mechanism sufficient to make a thread notice when its turn arrives. In some circumstances, with some care, you can do this by using semaphores instead of mutexes. The more general solution, however, is to use a condition variable together with your mutex, and some shared variable that serves to indicate whose turn it currently is.
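A minimal sketch of the condition-variable approach, assuming each thread is given its index by value and must print in index order. The names current_turn, ordered_print and run_in_order are hypothetical, and error handling is reduced to a bare minimum.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

#define NTHREADS 4

static pthread_mutex_t turn_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  turn_cond  = PTHREAD_COND_INITIALIZER;
static int current_turn = 0;   /* whose turn it is; protected by turn_mutex */

static void *ordered_print(void *arg)
{
    int id = (int)(intptr_t)arg;          /* index passed by value */

    pthread_mutex_lock(&turn_mutex);
    while (current_turn != id)            /* not our turn yet: wait */
        pthread_cond_wait(&turn_cond, &turn_mutex);
    printf("%d from thread\n", id);
    current_turn++;                       /* hand the turn to the next thread */
    pthread_cond_broadcast(&turn_cond);   /* wake everyone; only one will proceed */
    pthread_mutex_unlock(&turn_mutex);
    return NULL;
}

/* Hypothetical driver: returns 0 if all threads ran, -1 on a creation failure. */
int run_in_order(void)
{
    pthread_t threads[NTHREADS];
    int created = 0, rc = 0;

    current_turn = 0;
    for (; created < NTHREADS; created++) {
        if (pthread_create(&threads[created], NULL, ordered_print,
                           (void *)(intptr_t)created) != 0) {
            rc = -1;
            break;
        }
    }
    for (int i = 0; i < created; i++)
        pthread_join(threads[i], NULL);
    return rc;
}
```

The broadcast is used rather than a single signal because the thread whose turn comes next is not necessarily the one a signal would wake.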
The code passes the address of the same local variable to all threads. Meanwhile, this variable gets updated by the main thread.
Instead pass it by value cast to void*.
Fix:
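// requires #include <stdint.h>; casting through intptr_t avoids int/pointer size-mismatch warnings
pthread_create(&threadID[indx], NULL, Print, (void*)(intptr_t)indx);
// ...
printf("%d from thread\n", (int)(intptr_t)_indx);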
Now, since there is no data shared between the threads, you can remove that mutex.
All the threads created in the for loop have different value of indx. Because of the operating system scheduler, you can never be sure which thread will run. Therefore, the values printed are in random order depending on the randomness of the scheduler. The second for-loop running in the parent thread will run immediately after creating the child threads. Again, the scheduler decides the order of what thread should run next.
Every major operating system uses interrupts to preempt running threads. While the for-loop in the parent thread is running, an interrupt might occur and let the scheduler decide which thread runs next. Therefore, the lines printed by the parent for-loop can be interleaved unpredictably with the lines printed by the child threads, because all threads run "concurrently".
Joining a thread means waiting for a thread. If you want to make sure you print all numbers in the parent for loop in chronological order, without letting child thread interrupt it, then relocate the for-loop section to be after the thread joining.
The following code solves (I think) the producer-consumer problem with two threads using only one semaphore.
sem_t sem; //init to 1
int arr[100];
void producer()
{
while(;;) {
sem_wait(sem)
if it is fully filled {
sem_post(sem);
} else {
run 100 times and fill the items
sem_post(sem);
}
sleep(2);
}
}
void consumer()
{
while(;;) {
sem_wait(sem)
if it is empty {
sem_post(sem);
} else {
run 100 times and read the items
reset the start index to 0 so producer could fill again
sem_post(sem);
}
sleep(2);
}
}
int main()
{
//create thread 1 calling consumer
//create thread 2 calling producer
}
The question is: why are two semaphores (empty and full) used? Can't the problem be solved with one semaphore?
The reason you need two semaphores is that the producer cannot do anything when the "bin" or whatever the producer and consumer are sharing is full, but the consumer cannot do anything when the bin is empty.
Therefore, the producer needs to have a semaphore for full and the consumer one for empty.
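As a rough illustration of the two-semaphore scheme, here is a bounded-buffer sketch. It assumes unnamed POSIX semaphores (sem_init) are available, as on Linux; all names are hypothetical, and return values are not checked. One semaphore counts free slots and the other counts filled slots, so each side blocks instead of polling.

```c
#include <pthread.h>
#include <semaphore.h>

#define SLOTS 8
#define ITEMS 100

static int buffer[SLOTS];
static int head, tail;
static sem_t empty_slots;   /* free slots: the producer waits on this */
static sem_t full_slots;    /* filled slots: the consumer waits on this */
static pthread_mutex_t buf_mutex = PTHREAD_MUTEX_INITIALIZER;

static void *producer(void *arg)
{
    (void)arg;
    for (int i = 1; i <= ITEMS; i++) {
        sem_wait(&empty_slots);              /* blocks while the buffer is full */
        pthread_mutex_lock(&buf_mutex);
        buffer[tail] = i;
        tail = (tail + 1) % SLOTS;
        pthread_mutex_unlock(&buf_mutex);
        sem_post(&full_slots);               /* one more item available */
    }
    return NULL;
}

static void *consumer(void *arg)
{
    long *sum = arg;
    for (int i = 0; i < ITEMS; i++) {
        sem_wait(&full_slots);               /* blocks while the buffer is empty */
        pthread_mutex_lock(&buf_mutex);
        *sum += buffer[head];
        head = (head + 1) % SLOTS;
        pthread_mutex_unlock(&buf_mutex);
        sem_post(&empty_slots);              /* one more free slot */
    }
    return NULL;
}

/* Hypothetical driver: returns the sum of all consumed items. */
long run_producer_consumer(void)
{
    pthread_t p, c;
    long sum = 0;

    head = tail = 0;
    sem_init(&empty_slots, 0, SLOTS);
    sem_init(&full_slots, 0, 0);
    pthread_create(&c, NULL, consumer, &sum);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    sem_destroy(&empty_slots);
    sem_destroy(&full_slots);
    return sum;
}
```

The mutex only guards the buffer indices. With exactly one producer and one consumer it could arguably be dropped, but it keeps the sketch valid if more threads are added.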
The question is: why are two semaphores (empty and full) used? Can't the problem be solved with one semaphore?
Generally speaking, two condition-like variables (empty and full) are needed to solve the problem. These variables are used with wait() and notify() semantics.
A semaphore can serve as one such variable in the given problem, so two semaphores are sufficient.
Another way is to use two actual condition variables with a single monitor/mutex (see the "Using monitors" section at http://en.wikipedia.org/wiki/Producer–consumer_problem).
Your variant is similar to the last case, but you use a busy-wait (sleep(2)) instead of actually waiting on a condition variable (wait()).
I have a hash table implementation in C where each location in the table is a linked list (to handle collisions). These linked lists are inherently thread safe and so no additional thread-safe code needs to be written at the hash table level if the table is a constant size - the hash table is thread-safe.
However, I would like the hash table to dynamically expand as values were added so as to maintain a reasonable access time. For the table to expand though, it needs additional thread-safety.
For the purposes of this question, procedures which can safely occur concurrently are 'benign' and the table resizing procedure (which cannot occur concurrently) is 'critical'. Threads currently using the list are known as 'users'.
My first solution to this was to put 'preamble' and 'postamble' code in every critical function, which locks a mutex and then waits until no current users remain. Then I added preamble and postamble code to the benign functions to check whether a critical function is waiting and, if so, to wait at the same mutex until the critical section is done.
In pseudocode the pre/post-amble functions SHOULD look like:
benignPreamble(table) {
if (table->criticalIsRunning) {
waitUntilSignal;
}
incrementUserCount(table);
}
benignPostamble(table) {
decrementUserCount(table);
}
criticalPreamble(table) {
table->criticalIsRunning = YES;
waitUntilZero(table->users);
}
criticalPostamble(table) {
table->criticalIsRunning = NO;
signalCriticalDone();
}
My actual code is shown at the bottom of this question and uses (perhaps unnecessarily) caf's PriorityLock from this SO question. My implementation, quite frankly, smells awful. What is a better way to handle this situation? At the moment I'm looking for a way to signal to a mutex that it is irrelevant and 'unlock all waiting threads' simultaneously, but I keep thinking there must be a simpler way. I am trying to code it in such a way that any thread-safety mechanisms are 'ignored' if the critical process is not running.
Current Code
void startBenign(HashTable *table) {
// Ignores if critical process can't be running (users >= 1)
if (table->users == 0) {
// Blocks if critical process is running
PriorityLockLockLow(&(table->lock));
PriorityLockUnlockLow(&(table->lock));
}
__sync_add_and_fetch(&(table->users), 1);
}
void endBenign(HashTable *table) {
// Decrement user count (baseline is 1)
__sync_sub_and_fetch(&(table->users), 1);
}
int startCritical(HashTable *table) {
// Get the lock
PriorityLockLockHigh(&(table->lock));
// Decrement user count BELOW baseline (1) to hit zero eventually
__sync_sub_and_fetch(&(table->users), 1);
// Wait for all concurrent threads to finish
while (table->users != 0) {
usleep(1);
}
// Once we have zero users (any new ones will be
// held at the lock) we can proceed.
return 0;
}
void endCritical(HashTable *table) {
// Increment back to baseline of 1
__sync_add_and_fetch(&(table->users), 1);
// Unlock
PriorityLockUnlockHigh(&(table->lock));
}
It looks like you're trying to reinvent the reader-writer lock, which I believe pthreads provides as a primitive. Have you tried using that?
More specifically, your benign functions should be taking a "reader" lock, while your critical functions need a "writer" lock. The end result will be that as many benign functions can execute as desired, but when a critical function starts executing it will wait until no benign functions are in process, and will block additional benign functions until it has finished. I think this is what you want.
This question is based on:
When is it safe to destroy a pthread barrier?
and the recent glibc bug report:
http://sourceware.org/bugzilla/show_bug.cgi?id=12674
I'm not sure about the semaphores issue reported in glibc, but presumably it's supposed to be valid to destroy a barrier as soon as pthread_barrier_wait returns, as per the above linked question. (Normally, the thread that got PTHREAD_BARRIER_SERIAL_THREAD, or a "special" thread that already considered itself "responsible" for the barrier object, would be the one to destroy it.) The main use case I can think of is when a barrier is used to synchronize a new thread's use of data on the creating thread's stack, preventing the creating thread from returning until the new thread gets to use the data; other barriers probably have a lifetime equal to that of the whole program, or controlled by some other synchronization object.
In any case, how can an implementation ensure that destruction of the barrier (and possibly even unmapping of the memory it resides in) is safe as soon as pthread_barrier_wait returns in any thread? It seems the other threads that have not yet returned would need to examine at least some part of the barrier object to finish their work and return, much like how, in the glibc bug report cited above, sem_post has to examine the waiters count after having adjusted the semaphore value.
I'm going to take another crack at this with an example implementation of pthread_barrier_wait() that uses mutex and condition variable functionality as might be provided by a pthreads implementation. Note that this example doesn't try to deal with performance considerations (specifically, when the waiting threads are unblocked, they are all re-serialized when exiting the wait). I think that using something like Linux Futex objects could help with the performance issues, but Futexes are still pretty much out of my experience.
Also, I doubt that this example handles signals or errors correctly (if at all in the case of signals). But I think proper support for those things can be added as an exercise for the reader.
My main fear is that the example may have a race condition or deadlock (the mutex handling is more complex than I like). Also note that it is an example that hasn't even been compiled. Treat it as pseudo-code. Also keep in mind that my experience is mainly in Windows - I'm tackling this more as an educational opportunity than anything else. So the quality of the pseudo-code may well be pretty low.
However, disclaimers aside, I think it may give an idea of how the problem asked in the question could be handled (ie., how can the pthread_barrier_wait() function allow the pthread_barrier_t object it uses to be destroyed by any of the released threads without danger of using the barrier object by one or more threads on their way out).
Here goes:
/*
* Since this is a part of the implementation of the pthread API, it uses
* reserved names that start with "__" for internal structures and functions
*
* Functions such as __mutex_lock() and __cond_wait() perform the same function
* as the corresponding pthread API.
*/
// struct __barrier_waitdata is intended to hold all the data
// that `pthread_barrier_wait()` will need after releasing
// waiting threads. This will allow the function to avoid
// touching the passed in pthread_barrier_t object after
// the wait is satisfied (since any of the released threads
// can destroy it)
struct __barrier_waitdata {
struct __mutex cond_mutex;
struct __cond cond;
unsigned waiter_count;
int wait_complete;
};
struct __barrier {
unsigned count;
struct __mutex waitdata_mutex;
struct __barrier_waitdata* pwaitdata;
};
typedef struct __barrier pthread_barrier_t;
int __barrier_waitdata_init( struct __barrier_waitdata* pwaitdata)
{
int rc;
pwaitdata->waiter_count = 0;
pwaitdata->wait_complete = 0;
// as with the pthread API, a non-zero return code indicates failure
rc = __mutex_init( &pwaitdata->cond_mutex, NULL);
if (rc) {
return rc;
}
rc = __cond_init( &pwaitdata->cond, NULL);
if (rc) {
__mutex_destroy( &pwaitdata->cond_mutex);
return rc;
}
return 0;
}
int pthread_barrier_init(pthread_barrier_t *barrier, const pthread_barrierattr_t *attr, unsigned int count)
{
int rc;
rc = __mutex_init( &barrier->waitdata_mutex, NULL);
if (rc) return rc;
barrier->pwaitdata = NULL;
barrier->count = count;
//TODO: deal with attr
return 0;
}
int pthread_barrier_wait(pthread_barrier_t *barrier)
{
int rc;
struct __barrier_waitdata* pwaitdata;
unsigned target_count;
// potential waitdata block (only one thread's will actually be used)
struct __barrier_waitdata waitdata;
// nothing to do if we only need to wait for one thread...
if (barrier->count == 1) return PTHREAD_BARRIER_SERIAL_THREAD;
rc = __mutex_lock( &barrier->waitdata_mutex);
if (rc) return rc;
if (!barrier->pwaitdata) {
// no other thread has claimed the waitdata block yet -
// we'll use this thread's
rc = __barrier_waitdata_init( &waitdata);
if (rc) {
__mutex_unlock( &barrier->waitdata_mutex);
return rc;
}
barrier->pwaitdata = &waitdata;
}
pwaitdata = barrier->pwaitdata;
target_count = barrier->count;
// all data necessary for handling the return from a wait is pointed to
// by `pwaitdata`, and `pwaitdata` points to a block of data on the stack of
// one of the waiting threads. We have to make sure that the thread that owns
// that block waits until all others have finished with the information
// pointed to by `pwaitdata` before it returns. However, after the 'big' wait
// is completed, the `pthread_barrier_t` object that's passed into this
// function isn't used. The last operation done to `*barrier` is to set
// `barrier->pwaitdata = NULL` to satisfy the requirement that this function
// leaves `*barrier` in a state as if `pthread_barrier_init()` had been called - and
// that operation is done by the thread that signals the wait condition
// completion before the completion is signaled.
// note: we're still holding `barrier->waitdata_mutex`;
rc = __mutex_lock( &pwaitdata->cond_mutex);
pwaitdata->waiter_count += 1;
if (pwaitdata->waiter_count < target_count) {
// need to wait for other threads
__mutex_unlock( &barrier->waitdata_mutex);
do {
// TODO: handle the return code from `__cond_wait()` to break out of this
// if a signal makes that necessary
__cond_wait( &pwaitdata->cond, &pwaitdata->cond_mutex);
} while (!pwaitdata->wait_complete);
}
else {
// this thread satisfies the wait - unblock all the other waiters
pwaitdata->wait_complete = 1;
// 'release' our use of the passed in pthread_barrier_t object
barrier->pwaitdata = NULL;
// unlock the barrier's waitdata_mutex - the barrier is
// ready for use by another set of threads
__mutex_unlock( &barrier->waitdata_mutex);
// finally, unblock the waiting threads
__cond_broadcast( &pwaitdata->cond);
}
// at this point, barrier->waitdata_mutex is unlocked, the
// barrier->pwaitdata pointer has been cleared, and no further
// use of `*barrier` is permitted...
// however, each thread still has a valid `pwaitdata` pointer - the
// thread that owns that block needs to wait until all others have
// dropped the pwaitdata->waiter_count
// also, at this point the `pwaitdata->cond_mutex` is locked, so
// we're in a critical section
rc = 0;
pwaitdata->waiter_count--;
if (pwaitdata == &waitdata) {
// this thread owns the waitdata block - it needs to hang around until
// all other threads are done
// as a convenience, this thread will be the one that returns
// PTHREAD_BARRIER_SERIAL_THREAD
rc = PTHREAD_BARRIER_SERIAL_THREAD;
while (pwaitdata->waiter_count!= 0) {
__cond_wait( &pwaitdata->cond, &pwaitdata->cond_mutex);
};
__mutex_unlock( &pwaitdata->cond_mutex);
__cond_destroy( &pwaitdata->cond);
__mutex_destroy( &pwaitdata->cond_mutex);
}
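else {
if (pwaitdata->waiter_count == 0) {
// last thread out - wake the owner of the waitdata block
__cond_signal( &pwaitdata->cond);
}
// every non-owner thread must release the mutex before returning
__mutex_unlock( &pwaitdata->cond_mutex);
}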
return rc;
}
17 July 2011: Update in response to a comment/question about process-shared barriers
I forgot completely about the situation with barriers that are shared between processes. And as you mention, the idea I outlined will fail horribly in that case. I don't really have experience with POSIX shared memory use, so any suggestions I make should be tempered with scepticism.
To summarize (for my benefit, if no one else's):
When any of the threads gets control after pthread_barrier_wait() returns, the barrier object needs to be in the 'init' state (however the most recent pthread_barrier_init() on that object set it). Also implied by the API is that once any of the threads return, one or more of the following things could occur:
another call to pthread_barrier_wait() to start a new round of synchronization of threads
pthread_barrier_destroy() on the barrier object
the memory allocated for the barrier object could be freed or unshared if it's in a shared memory region.
These things mean that before the pthread_barrier_wait() call allows any thread to return, it pretty much needs to ensure that all waiting threads are no longer using the barrier object in the context of that call. My first answer addressed this by creating a 'local' set of synchronization objects (a mutex and an associated condition variable) outside of the barrier object that would block all the threads. These local synchronization objects were allocated on the stack of the thread that happened to call pthread_barrier_wait() first.
I think that something similar would need to be done for barriers that are process-shared. However, in that case simply allocating those sync objects on a thread's stack isn't adequate (since the other processes would have no access). For a process-shared barrier, those objects would have to be allocated in process-shared memory. I think the technique I listed above could be applied similarly:
the waitdata_mutex that controls the 'allocation' of the local sync variables (the waitdata block) would be in process-shared memory already by virtue of it being in the barrier struct. Of course, when the barrier is set to PTHREAD_PROCESS_SHARED, that attribute would also need to be applied to the waitdata_mutex
when __barrier_waitdata_init() is called to initialize the local mutex & condition variable, it would have to allocate those objects in shared memory instead of simply using the stack-based waitdata variable.
when the 'cleanup' thread destroys the mutex and the condition variable in the waitdata block, it would also need to clean up the process-shared memory allocation for the block.
in the case where shared memory is used, there needs to be some mechanism to ensure that the shared memory object is opened at least once in each process, and closed the correct number of times in each process (but not closed entirely before every thread in the process is finished using it). I haven't thought through exactly how that would be done...
I think these changes would allow the scheme to operate with process-shared barriers. The last bullet point above is a key item to figure out. Another is how to construct a name for the shared memory object that will hold the 'local' process-shared waitdata. There are certain attributes you'd want for that name:
you'd want the storage for the name to reside in the struct pthread_barrier_t structure so all processes have access to it; that means a known limit to the length of the name
you'd want the name to be unique to each 'instance' of a set of calls to pthread_barrier_wait() because it might be possible for a second round of waiting to start before all threads have gotten all the way out of the first round waiting (so the process-shared memory block set up for the waitdata might not have been freed yet). So the name probably has to be based on things like process id, thread id, address of the barrier object, and an atomic counter.
I don't know whether or not there are security implications to having the name be 'guessable'. If so, some randomization needs to be added - no idea how much. Maybe you'd also need to hash the data mentioned above along with the random bits. Like I said, I really have no idea if this is important or not.
As far as I can see there is no need for pthread_barrier_destroy to be an immediate operation. You could have it wait until all threads that are still in their wakeup phase are woken up.
E.g., you could have an atomic counter, awakening, that is initially set to the number of threads that are woken up. Each thread would then decrement it as its last action before pthread_barrier_wait returns. pthread_barrier_destroy could then simply spin until that counter falls to 0.
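A rough sketch of this idea, built on a plain mutex and condition variable with C11 atomics. It is a variation, not the exact scheme described: the counter (here called inside) is incremented by each thread on entry rather than set once on wakeup, which keeps overlapping rounds simple. All names are hypothetical, error checking is omitted, and the sketch does not address unmapping the memory without calling destroy first.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define NTHREADS 4

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    unsigned        count;       /* threads needed to release a round */
    unsigned        arrived;     /* arrivals in the current round */
    unsigned        generation;  /* incremented when a round completes */
    atomic_uint     inside;      /* threads currently inside demo_barrier_wait() */
} demo_barrier;

void demo_barrier_init(demo_barrier *b, unsigned count)
{
    pthread_mutex_init(&b->mutex, NULL);
    pthread_cond_init(&b->cond, NULL);
    b->count = count;
    b->arrived = 0;
    b->generation = 0;
    atomic_init(&b->inside, 0);
}

/* Returns 1 in exactly one thread per round, 0 in the others. */
int demo_barrier_wait(demo_barrier *b)
{
    int serial = 0;

    atomic_fetch_add(&b->inside, 1);
    pthread_mutex_lock(&b->mutex);
    unsigned gen = b->generation;
    if (++b->arrived == b->count) {
        b->arrived = 0;
        b->generation++;                 /* releases this round's waiters */
        serial = 1;
        pthread_cond_broadcast(&b->cond);
    } else {
        while (gen == b->generation)
            pthread_cond_wait(&b->cond, &b->mutex);
    }
    pthread_mutex_unlock(&b->mutex);
    atomic_fetch_sub(&b->inside, 1);     /* last access to *b in this call */
    return serial;
}

/* Spins until every thread has finished leaving demo_barrier_wait(). */
void demo_barrier_destroy(demo_barrier *b)
{
    while (atomic_load(&b->inside) != 0)
        sched_yield();
    pthread_cond_destroy(&b->cond);
    pthread_mutex_destroy(&b->mutex);
}

static demo_barrier barrier;
static atomic_int serial_count;

static void *worker(void *arg)
{
    (void)arg;
    if (demo_barrier_wait(&barrier)) {
        atomic_fetch_add(&serial_count, 1);
        demo_barrier_destroy(&barrier);  /* called while others may still be leaving */
    }
    return NULL;
}

/* Hypothetical driver: returns how many threads saw the "serial" result. */
int run_barrier_demo(void)
{
    pthread_t t[NTHREADS];

    atomic_store(&serial_count, 0);
    demo_barrier_init(&barrier, NTHREADS);
    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    return atomic_load(&serial_count);
}
```

The serial thread destroys the barrier as soon as its own wait returns, and the spin in demo_barrier_destroy is what keeps that safe while the remaining threads are still on their way out.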