Implementation of condition variables - c

To understand the code of pthread condition variables, I have written my own version. Does it look correct? I am using it in a program, its working, but working surprisingly much faster. Originally the program takes around 2.5 seconds and with my version of condition variables it takes only 0.8 seconds, and the output of the program is correct too. However, I'm not sure, if my implementation is correct.
struct cond_node_t
{
sem_t s;
cond_node_t * next;
};
struct cond_t
{
cond_node_t * q; // Linked List
pthread_mutex_t qm; // Lock for the Linked List
};
int my_pthread_cond_init( cond_t * cond )
{
cond->q = NULL;
pthread_mutex_init( &(cond->qm), NULL );
}
int my_pthread_cond_wait( cond_t* cond, pthread_mutex_t* mutex )
{
cond_node_t * self;
pthread_mutex_lock(&(cond->qm));
self = (cond_node_t*)calloc( 1, sizeof(cond_node_t) );
self->next = cond->q;
cond->q = self;
sem_init( &self->s, 0, 0 );
pthread_mutex_unlock(&(cond->qm));
pthread_mutex_unlock(mutex);
sem_wait( &self->s );
free( self ); // Free the node
pthread_mutex_lock(mutex);
}
int my_pthread_cond_signal( cond_t * cond )
{
pthread_mutex_lock(&(cond->qm));
if (cond->q != NULL)
{
sem_post(&(cond->q->s));
cond->q = cond->q->next;
}
pthread_mutex_unlock(&(cond->qm));
}
int my_pthread_cond_broadcast( cond_t * cond )
{
pthread_mutex_lock(&(cond->qm));
while ( cond->q != NULL)
{
sem_post( &(cond->q->s) );
cond->q = cond->q->next;
}
pthread_mutex_unlock(&(cond->qm));
}

Apart from the missing return value checks, there are some more issues that should be fixable:
sem_destroy is not called.
Signal/broadcast touch the cond_node_t after waking the target thread, potentially resulting in a use-after-free.
Further comments:
The omitted destroy operation may require changes to the other operations so it is safe to destroy the condition variable when POSIX says it shall be safe. Not supporting destroy or imposing stronger restrictions on when it may be called will simplify things.
A production implementation would handle thread cancellation.
Backing out of a wait (such as required for thread cancellation and pthread_cond_timedwait timeouts) may lead to complications.
Your implementation queues threads in userland, which is done in some production implementations for performance reasons; I do not understand exactly why.
Your implementation always queues threads in LIFO order. This is often faster (such as because of cache effects) but may lead to starvation. Production implementation may use FIFO order sometimes to avoid starvation.

Basically your strategy looks ok, but you have one major danger, some undefined behavior, and a nit pick:
your are not inspecting the return values of your POSIX functions. In particular sem_wait is interruptible so under heavy load or bad luck your thread will be woken up spuriously. You'd have to carefully catch all that
none of your functions returns a value. If some user of the functions will decide to use the return values some day, this is undefined behavior. Carefully analyse the error codes that the condition functions are allowed to return and do just that.
don't cast the return of malloc or calloc
Edit: Actually, you don't need malloc/free at all. A local variable would do as well.

You don't seem to respect this requirement:
These functions atomically release mutex and cause the calling thread to block on the condition variable cond; atomically here means
"atomically with respect to access by another thread to the mutex and then the condition variable". That is, if another thread is able to acquire the mutex
after the about-to-block thread has released it, then a subsequent call to pthread_cond_broadcast() or pthread_cond_signal() in that thread shall
behave as if it were issued after the about-to-block thread has blocked.
You unlock and then wait. Another thread can do lots of things between these operations.
P.S. I'm not sure myself if I'm interpreting this paragraph correctly, feel free to point out my error.

Related

Use while loop to make a thread wait till the lock variable is set to avoid race condition in C prgramming

#include <stdio.h>
#include <pthread.h>
long mails = 0;
int lock = 0;
void *routine()
{
printf("Thread Start\n");
for (long i = 0; i < 100000; i++)
{
while (lock)
{
}
lock = 1;
mails++;
lock = 0;
}
printf("Thread End\n");
}
int main(int argc, int *argv[])
{
pthread_t p1, p2;
if (pthread_create(&p1, NULL, &routine, NULL) != 0)
{
return 1;
}
if (pthread_create(&p2, NULL, &routine, NULL) != 0)
{
return 2;
}
if (pthread_join(p1, NULL) != 0)
{
return 3;
}
if (pthread_join(p2, NULL) != 0)
{
return 4;
}
printf("Number of mails: %ld \n", mails);
return 0;
}
In the above code each thread runs a for loop to increase the value
of mails by 100000.
To avoid race condition is used lock variable
along with while loop.
Using while loop in routine function does not
help to avoid race condition and give correct output for mails
variable.
In C, the compiler can safely assume a (global) variable is not modified by other threads unless in few cases (eg. volatile variable, atomic accesses). This means the compiler can assume lock is not modified and while (lock) {} can be replaced with an infinite loop. In fact, this kind of loop cause an undefined behaviour since it does not have any visible effect. This means the compiler can remove it (or generate a wrong code). The compiler can also remove the lock = 1 statement since it is followed by lock = 0. The resulting code is bogus. Note that even if the compiler would generate a correct code, some processor (eg. AFAIK ARM and PowerPC) can reorder instructions resulting in a bogus behaviour.
To make sure accesses between multiple threads are correct, you need at least atomic accesses on lock. The atomic access should be combined with proper memory barriers for relaxed atomic accesses. The thing is while (lock) {} will result in a spin lock. Spin locks are known to be a pretty bad solution in many cases unless you really know what you are doing and all the consequence (in doubt, don't use them).
Generally, it is better to uses mutexes, semaphores and wait conditions in this case. Mutexes are generally implemented using an atomic boolean flag internally (with right memory barriers so you do not need to care about that). When the flag is mark as locked, an OS sleeping function is called. The sleeping function wake up when the lock has been released by another thread. This is possible since the thread releasing a lock can send a wake up signal. For more information about this, please read this. In old C, you can use pthread for that. Since C11, you can do that directly using this standard API. For pthread, it is here (do not forget the initialization).
If you really want a spinlock, you need something like:
#include <stdatomic.h>
atomic_flag lock = ATOMIC_FLAG_INIT;
void *routine()
{
printf("Thread Start\n");
for (long i = 0; i < 100000; i++)
{
while (atomic_flag_test_and_set(&lock)) {}
mails++;
atomic_flag_clear(&lock);
}
printf("Thread End\n");
}
However, since you are already using pthreads, you're better off using a pthread_mutex
Jérôme Richard told you about ways in which the compiler could optimize the sense out of your code, but even if you turned all the optimizations off, you still would be left with a race condition. You wrote
while (lock) { }
lock=1;
...critical section...
lock=0;
The problem with that is, suppose lock==0. Two threads racing toward that critical section at the same time could both test lock, and they could both find that lock==0. Then they both would set lock=1, and they both would enter the critical section...
...at the same time.
In order to implement a spin lock,* you need some way for one thread to prevent other threads from accessing the lock variable in between when the first thread tests it, and when the first thread sets it. You need an atomic (i.e., indivisible) "test and set" operation.
Most computer architectures have some kind of specialized op-code that does what you want. It has names like "test and set," "compare and exchange," "load-linked and store-conditional," etc. Chris Dodd's answer shows you how to use a standard C library function that does the right thing on whatever CPU you happen to be using...
...But don't forget what Jérôme said.*
* Jérôme told you that spin locks are a bad idea.

C Timer with cancellation and call back

I've created a Timer pseudo class in C that has call back capability and can be cancelled. I come from the .NET/C# world where this is all done by the framework and I'm not an expert with pthreads.
In .NET there are cancellation tokens which you can wait on which means I don't need to worry so much about the nuts and bolts.
However using pthreads is a bit more low level than I am used to so my question is:
Are there any issues with the way I have implemented this?
Thanks in anticipation for any comments you may have.
Timer struct:
typedef struct _timer
{
pthread_cond_t Condition;
pthread_mutex_t ConditionMutex;
bool IsRunning;
pthread_mutex_t StateMutex;
pthread_t Thread;
int TimeoutMicroseconds;
void * Context;
void (*Callback)(bool isCancelled, void * context);
} TimerObject, *Timer;
C Module:
static void *
TimerTask(Timer timer)
{
struct timespec timespec;
struct timeval now;
int returnValue = 0;
clock_gettime(CLOCK_REALTIME, &timespec);
timespec.tv_sec += timer->TimeoutMicroseconds / 1000000;
timespec.tv_nsec += (timer->TimeoutMicroseconds % 1000000) * 1000000;
pthread_mutex_lock(&timer->StateMutex);
timer->IsRunning = true;
pthread_mutex_unlock(&timer->StateMutex);
pthread_mutex_lock(&timer->ConditionMutex);
returnValue = pthread_cond_timedwait(&timer->Condition, &timer->ConditionMutex, &timespec);
pthread_mutex_unlock(&timer->ConditionMutex);
if (timer->Callback != NULL)
{
(*timer->Callback)(returnValue != ETIMEDOUT, timer->Context);
}
pthread_mutex_lock(&timer->StateMutex);
timer->IsRunning = false;
pthread_mutex_unlock(&timer->StateMutex);
return 0;
}
void
Timer_Initialize(Timer timer, void (*callback)(bool isCancelled, void * context))
{
pthread_mutex_init(&timer->ConditionMutex, NULL);
timer->IsRunning = false;
timer->Callback = callback;
pthread_mutex_init(&timer->StateMutex, NULL);
pthread_cond_init(&timer->Condition, NULL);
}
bool
Timer_IsRunning(Timer timer)
{
pthread_mutex_lock(&timer->StateMutex);
bool isRunning = timer->IsRunning;
pthread_mutex_unlock(&timer->StateMutex);
return isRunning;
}
void
Timer_Start(Timer timer, int timeoutMicroseconds, void * context)
{
timer->Context = context;
timer->TimeoutMicroseconds = timeoutMicroseconds;
pthread_create(&timer->Thread, NULL, TimerTask, (void *)timer);
}
void
Timer_Stop(Timer timer)
{
void * returnValue;
pthread_mutex_lock(&timer->StateMutex);
if (!timer->IsRunning)
{
pthread_mutex_unlock(&timer->StateMutex);
return;
}
pthread_mutex_unlock(&timer->StateMutex);
pthread_cond_broadcast(&timer->Condition);
pthread_join(timer->Thread, &returnValue);
}
void
Timer_WaitFor(Timer timer)
{
void * returnValue;
pthread_join(timer->Thread, &returnValue);
}
Example use:
void
TimerExpiredCallback(bool cancelled, void * context)
{
fprintf(stderr, "TimerExpiredCallback %s with context %s\n",
cancelled ? "Cancelled" : "Timed Out",
(char *)context);
}
void
ThreadedTimerExpireTest()
{
TimerObject timerObject;
Timer_Initialize(&timerObject, TimerExpiredCallback);
Timer_Start(&timerObject, 5 * 1000000, "Threaded Timer Expire Test");
Timer_WaitFor(&timerObject);
}
void
ThreadedTimerCancelTest()
{
TimerObject timerObject;
Timer_Initialize(&timerObject, TimerExpiredCallback);
Timer_Start(&timerObject, 5 * 1000000, "Threaded Timer Cancel Test");
Timer_Stop(&timerObject);
}
Overall, it seems pretty solid work for someone who ordinarily works in different languages and who has little pthreads experience. The idea seems to revolve around pthread_cond_timedwait() to achieve a programmable delay with a convenient cancellation mechanism. That's not unreasonable, but there are, indeed, a few problems.
For one, your condition variable usage is non-idiomatic. The conventional and idiomatic use of a condition variable associates with each wait a condition for whether the thread is clear to proceed. This is tested, under protection of the mutex, before waiting. If the condition is satisfied then no wait is performed. It is tested again after each wakeup, because there is a variety of scenarios in which a thread may return from waiting even though it is not actually clear to proceed. In these cases, it loops back and waits again.
I see at least two such possibilities with your timer:
The timer is cancelled very quickly, before its thread starts to wait. Condition variables do not queue signals, so in this case the cancellation would be ineffective. This is a form of race condition.
Spurious wakeup. This is always a possibility that must be considered. Spurious wakeups are rare under most circumstances, but they really do happen.
It seems natural to me to address that by generalizing your IsRunning to cover more states, perhaps something more like
enum { NEW, RUNNING, STOPPING, FINISHED, ERROR } State;
, instead.
Of course, you still have to test that under protection of the appropriate mutex, which brings me to my next point: one mutex should suffice. That one can and should serve both to protect shared state and as the mutex associated with the CV wait. This, too, is idiomatic. It would lead to code in TimerTask() more like this:
// ...
pthread_mutex_lock(&timer->StateMutex);
// Responsibility for setting the state to RUNNING transferred to Timer_Start()
while (timer->State == RUNNING) {
returnValue = pthread_cond_timedwait(&timer->Condition, &timer->StateMutex, &timespec);
switch (returnValue) {
case 0:
if (timer->State == STOPPING) {
timer->State = FINISHED;
}
break;
case ETIMEDOUT:
timer->State = FINISHED;
break;
default:
timer->State = ERROR;
break;
}
}
pthread_mutex_unlock(&timer->StateMutex);
// ...
The accompanying Timer_Start() and Timer_Stop() would be something like this:
void Timer_Start(Timer timer, int timeoutMicroseconds, void * context) {
timer->Context = context;
timer->TimeoutMicroseconds = timeoutMicroseconds;
pthread_mutex_lock(&timer->StateMutex);
timer->state = RUNNING;
// start the thread before releasing the mutex so that no one can see state
// RUNNING before the thread is actually running
pthread_create(&timer->Thread, NULL, TimerTask, (void *)timer);
pthread_mutex_unlock(&timer->StateMutex);
}
void Timer_Stop(Timer timer) {
_Bool should_join = 0;
pthread_mutex_lock(&timer->StateMutex);
switch (timer->State) {
case NEW:
timer->state = FINISHED;
break;
case RUNNING:
timer->state = STOPPING;
should_join = 1;
break;
case STOPPING:
should_join = 1;
break;
// else no action
}
pthread_mutex_unlock(&timer->StateMutex);
// Harmless if the timer has already stopped:
pthread_cond_broadcast(&timer->Condition);
if (should_join) {
pthread_join(timer->Thread, NULL);
}
}
A few other, smaller adjustments would be needed elsewhere.
Additionally, although the example code above omits it for clarity, you really should ensure that you test the return values of all the functions that provide status information that way, unless you don't care whether they succeeded. That includes almost all standard library and Pthreads functions. What you should do in the event that that one fails is highly contextual, but pretending (or assuming) that it succeeded, instead, is rarely a good choice.
An alternative
Another approach to a cancellable delay would revolve around select() or pselect() with a timeout. To arrange for cancellation, you set up a pipe, and have select() to listen to the read end. Writing anything to the write end will then wake select().
This is in several ways easier to code, because you don't need any mutexes or condition variables. Also, data written to a pipe persists until it is read (or the pipe is closed), which smooths out some of the timing-related issues that the CV-based approach has to code around.
With select, however, you need to be prepared to deal with signals (at minimum by blocking them), and the timeout is a duration, not an absolute time.
pthread_mutex_lock(&timer->StateMutex);
timer->IsRunning = true;
pthread_mutex_unlock(&timer->StateMutex);
pthread_mutex_lock(&timer->ConditionMutex);
returnValue = pthread_cond_timedwait(&timer->Condition, &timer->ConditionMutex, &timespec);
pthread_mutex_unlock(&timer->ConditionMutex);
if (timer->Callback != NULL)
{
(*timer->Callback)(returnValue != ETIMEDOUT, timer->Context);
}
You have two bugs here.
A cancellation can slip in after IsRunning is set to true and before pthread_cond_timedwait gets called. In this case, you'll wait out the entire timer. This bug exists because ConditionMutex doesn't protect any shared state. To use a condition variable properly, the mutex associated with the condition variable must protect the shared state. You can't trade the right mutex for the wrong mutex and then call pthread_cond_timedwait because that creates a race condition. The entire point of a condition variable is to provide an atomic "unlock and wait" operation to prevent this race condition and your code goes to effort to break that logic.
You don't check the return value of pthread_cond_timedwait. If neither the timeout has expired nor cancellation has been requested, you call the callback anyway. Condition variables are stateless. It is your responsibility to track and check state, the condition variable will not do this for you. You need to call pthread_cond_timedwait in a loop until either the state is set to STOPPING or the timeout is reached. Note that the mutex associated with the condition variable, as in 1 above, must protect the shared state -- in this case state.
I think you have a fundamental misunderstanding about how condition variable work and what they're for. They are used when you a mutex that protects shared state and you want to wait for that shared state to change. The mutex associated with the condition variable must protect the shared state to avoid the classic race condition where the state changes after you released the lock but before you managed to start waiting.
UPDATE:
To provide some more useful information, let me briefly explain what a condition variable is for. Say you have some shared state protected by a mutex. And say some thread can't make forward progress until that shared state changes.
You have a problem. You have to hold the mutex that protects the shared state to see what the state is. When you see that it's in the wrong state, you need to wait. But you also need to release the mutex or no other thread can change the shared state.
But if you unlock the mutex and then wait (which is what your code does above!) you have a race condition. After you unlock the mutex but before you wait, another thread can acquire the mutex and change the shared state such that you no longer want to wait. So you need an atomic "unlock the mutex and wait" operation.
That is the purpose, and the only purpose, of condition variables. So you can atomically release the mutex that protects some shared state and wait for a sign with no change for the signal to be lost in-between when you released the mutex and when you waited.
Another important point -- condition variables are stateless. They have no idea what you are waiting for. You must never call pthread_cond_wait or pthread_cond_timedwait and make assumptions about the state. You must check it yourself. Your code releases the mutex after pthread_cond_timedwait returns. You only want to do that if the call times out.
If pthread_cond_timedwait doesn't timeout (or, in any case, when pthread_cond_wait returns), you don't know what happened until you check the state. That's why these functions re-acquire the mutex -- so you can check the state and decide what to do. This is why these functions are almost always called in a loop -- if the thing you're waiting for still hasn't happened (which you determine by checking the shared state that you are responsible for), you need to keep waiting.

Two threads accessing a locked mutex simultaneously

I have written this code in C language and there are two pthreads that are using this code and trying to access the mutex "firstSection" (in both of them we are sure that the mutex passed to function is the same). The code suppose to check two mutexes, and if both of them were available, performs some actions which take place in function safeUnlockTwoMutexes(), and if failed to acquire at least one of them, it has to wait for two seconds and tries again. ("intersection" mutex is the main-lock to safe check the situation of the other mutexes)
void twoSectionRoute(pthread_mutex_t firstSection, pthread_mutex_t secondSection){
bool pathClear = false;
while (!pathClear){
pthread_mutex_lock(&intersection);
if (pthread_mutex_trylock(&firstSection) == 0){
if (pthread_mutex_trylock(&secondSection) == 0){
pathClear = true;
pthread_mutex_unlock(&intersection);
} else {
pthread_mutex_unlock(&firstSection);
pthread_mutex_unlock(&intersection);
sleep(2);
}
} else {
pthread_mutex_unlock(&intersection);
sleep(2);
}
}
safeUnlockTwoMutexes(firstSection, secondSection, 1);
}
Now the problem with this code is both threads are able to lock the mutex "firstSectio" at almost same time and I don't know why. (maybe because its type is recursive mutex?! I've used "PTHREAD_MUTEX_INITIALIZER" in the beginning of the file as global variables)
I'm wondering how can I fix this issue, and the threads access this sections one after another?
Your function signature passes pthread_mutex_t values firstSection and secondSection by value. You need to pass mutexes by pointer.
void twoSectionRoute(pthread_mutex_t* firstSection, pthread_mutex_t* secondSection){
Then, within the function use just firstSection and secondSection rather than &firstSection and &secondSection.
If you pass the mutex by value (as here), and it compiles, then the mutex itself is copied, so you end up with undefined behaviour and the mutex locks do not operate on the same state.

Pthread - setting scheduler parameters

I wanted to use read-writer locks from pthread library in a way, that writers have priority over readers. I read in my man pages that
If the Thread Execution Scheduling option is supported, and the threads involved in the lock are executing with the scheduling policies SCHED_FIFO or SCHED_RR, the calling thread shall not acquire the lock if a writer holds the lock or if writers of higher or equal priority are blocked on the lock; otherwise, the calling thread shall acquire the lock.
so I wrote small function that sets up thread scheduling options.
void thread_set_up(int _thread)
{
struct sched_param *_param=malloc(sizeof (struct sched_param));
int *c=malloc(sizeof(int));
*c=sched_get_priority_min(SCHED_FIFO)+1;
_param->__sched_priority=*c;
long *a=malloc(sizeof(long));
*a=syscall(SYS_gettid);
int *b=malloc(sizeof(int));
*b=SCHED_FIFO;
if (pthread_setschedparam(*a,*b,_param) == -1)
{
//depending on which thread calls this functions, few thing can happen
if (_thread == MAIN_THREAD)
client_cleanup();
else if (_thread==ACCEPT_THREAD)
{
pthread_kill(params.main_thread_id,SIGINT);
pthread_exit(NULL);
}
}
}
sorry for those a,b,c but I tried to malloc everything, still I get SIGSEGV on the call to pthread_setschedparam, I am wondering why?
I don't know if these are the exact causes of your problems but they should help you hone in on it.
(1) pthread_setschedparam returns a 0 on success and a positive number otherwise. So
if (pthread_setschedparam(*a,*b,_param) == -1)
will never execute. It should be something like:
if ((ret = pthread_setschedparam(*a, *b, _param)) != 0)
{ //yada yada
}
As an aside, it isn't 100% clear what you are doing but pthread_kill looks about as ugly a way to do it as possible.
(2) syscall(SYS_gettid) gets the OS threadID. pthread__setschedparam expects the pthreads thread id, which is different. The pthreads thread id is returned by pthread_create and pthread_self in the datatype pthread_t. Change the pthread__setschedparam to use this type and the proper values instead and see if things improve.
(3) You need to run as a priviledge user to change the schedule. Try running the program as root or sudo or whatever.

How can barriers be destroyable as soon as pthread_barrier_wait returns?

This question is based on:
When is it safe to destroy a pthread barrier?
and the recent glibc bug report:
http://sourceware.org/bugzilla/show_bug.cgi?id=12674
I'm not sure about the semaphores issue reported in glibc, but presumably it's supposed to be valid to destroy a barrier as soon as pthread_barrier_wait returns, as per the above linked question. (Normally, the thread that got PTHREAD_BARRIER_SERIAL_THREAD, or a "special" thread that already considered itself "responsible" for the barrier object, would be the one to destroy it.) The main use case I can think of is when a barrier is used to synchronize a new thread's use of data on the creating thread's stack, preventing the creating thread from returning until the new thread gets to use the data; other barriers probably have a lifetime equal to that of the whole program, or controlled by some other synchronization object.
In any case, how can an implementation ensure that destruction of the barrier (and possibly even unmapping of the memory it resides in) is safe as soon as pthread_barrier_wait returns in any thread? It seems the other threads that have not yet returned would need to examine at least some part of the barrier object to finish their work and return, much like how, in the glibc bug report cited above, sem_post has to examine the waiters count after having adjusted the semaphore value.
I'm going to take another crack at this with an example implementation of pthread_barrier_wait() that uses mutex and condition variable functionality as might be provided by a pthreads implementation. Note that this example doesn't try to deal with performance considerations (specifically, when the waiting threads are unblocked, they are all re-serialized when exiting the wait). I think that using something like Linux Futex objects could help with the performance issues, but Futexes are still pretty much out of my experience.
Also, I doubt that this example handles signals or errors correctly (if at all in the case of signals). But I think proper support for those things can be added as an exercise for the reader.
My main fear is that the example may have a race condition or deadlock (the mutex handling is more complex than I like). Also note that it is an example that hasn't even been compiled. Treat it as pseudo-code. Also keep in mind that my experience is mainly in Windows - I'm tackling this more as an educational opportunity than anything else. So the quality of the pseudo-code may well be pretty low.
However, disclaimers aside, I think it may give an idea of how the problem asked in the question could be handled (ie., how can the pthread_barrier_wait() function allow the pthread_barrier_t object it uses to be destroyed by any of the released threads without danger of using the barrier object by one or more threads on their way out).
Here goes:
/*
* Since this is a part of the implementation of the pthread API, it uses
* reserved names that start with "__" for internal structures and functions
*
* Functions such as __mutex_lock() and __cond_wait() perform the same function
* as the corresponding pthread API.
*/
// struct __barrier_wait data is intended to hold all the data
// that `pthread_barrier_wait()` will need after releasing
// waiting threads. This will allow the function to avoid
// touching the passed in pthread_barrier_t object after
// the wait is satisfied (since any of the released threads
// can destroy it)
struct __barrier_waitdata {
struct __mutex cond_mutex;
struct __cond cond;
unsigned waiter_count;
int wait_complete;
};
struct __barrier {
unsigned count;
struct __mutex waitdata_mutex;
struct __barrier_waitdata* pwaitdata;
};
typedef struct __barrier pthread_barrier_t;
int __barrier_waitdata_init( struct __barrier_waitdata* pwaitdata)
{
waitdata.waiter_count = 0;
waitdata.wait_complete = 0;
rc = __mutex_init( &waitdata.cond_mutex, NULL);
if (!rc) {
return rc;
}
rc = __cond_init( &waitdata.cond, NULL);
if (!rc) {
__mutex_destroy( &pwaitdata->waitdata_mutex);
return rc;
}
return 0;
}
int pthread_barrier_init(pthread_barrier_t *barrier, const pthread_barrierattr_t *attr, unsigned int count)
{
int rc;
rc = __mutex_init( &barrier->waitdata_mutex, NULL);
if (!rc) return rc;
barrier->pwaitdata = NULL;
barrier->count = count;
//TODO: deal with attr
}
int pthread_barrier_wait(pthread_barrier_t *barrier)
{
int rc;
struct __barrier_waitdata* pwaitdata;
unsigned target_count;
// potential waitdata block (only one thread's will actually be used)
struct __barrier_waitdata waitdata;
// nothing to do if we only need to wait for one thread...
if (barrier->count == 1) return PTHREAD_BARRIER_SERIAL_THREAD;
rc = __mutex_lock( &barrier->waitdata_mutex);
if (!rc) return rc;
if (!barrier->pwaitdata) {
// no other thread has claimed the waitdata block yet -
// we'll use this thread's
rc = __barrier_waitdata_init( &waitdata);
if (!rc) {
__mutex_unlock( &barrier->waitdata_mutex);
return rc;
}
barrier->pwaitdata = &waitdata;
}
pwaitdata = barrier->pwaitdata;
target_count = barrier->count;
// all data necessary for handling the return from a wait is pointed to
// by `pwaitdata`, and `pwaitdata` points to a block of data on the stack of
// one of the waiting threads. We have to make sure that the thread that owns
// that block waits until all others have finished with the information
// pointed to by `pwaitdata` before it returns. However, after the 'big' wait
// is completed, the `pthread_barrier_t` object that's passed into this
// function isn't used. The last operation done to `*barrier` is to set
// `barrier->pwaitdata = NULL` to satisfy the requirement that this function
// leaves `*barrier` in a state as if `pthread_barrier_init()` had been called - and
// that operation is done by the thread that signals the wait condition
// completion before the completion is signaled.
// note: we're still holding `barrier->waitdata_mutex`;
rc = __mutex_lock( &pwaitdata->cond_mutex);
pwaitdata->waiter_count += 1;
if (pwaitdata->waiter_count < target_count) {
// need to wait for other threads
__mutex_unlock( &barrier->waitdata_mutex);
do {
// TODO: handle the return code from `__cond_wait()` to break out of this
// if a signal makes that necessary
__cond_wait( &pwaitdata->cond, &pwaitdata->cond_mutex);
} while (!pwaitdata->wait_complete);
}
else {
// this thread satisfies the wait - unblock all the other waiters
pwaitdata->wait_complete = 1;
// 'release' our use of the passed in pthread_barrier_t object
barrier->pwaitdata = NULL;
// unlock the barrier's waitdata_mutex - the barrier is
// ready for use by another set of threads
__mutex_unlock( barrier->waitdata_mutex);
// finally, unblock the waiting threads
__cond_broadcast( &pwaitdata->cond);
}
// at this point, barrier->waitdata_mutex is unlocked, the
// barrier->pwaitdata pointer has been cleared, and no further
// use of `*barrier` is permitted...
// however, each thread still has a valid `pwaitdata` pointer - the
// thread that owns that block needs to wait until all others have
// dropped the pwaitdata->waiter_count
// also, at this point the `pwaitdata->cond_mutex` is locked, so
// we're in a critical section
rc = 0;
pwaitdata->waiter_count--;
if (pwaitdata == &waitdata) {
// this thread owns the waitdata block - it needs to hang around until
// all other threads are done
// as a convenience, this thread will be the one that returns
// PTHREAD_BARRIER_SERIAL_THREAD
rc = PTHREAD_BARRIER_SERIAL_THREAD;
while (pwaitdata->waiter_count!= 0) {
__cond_wait( &pwaitdata->cond, &pwaitdata->cond_mutex);
};
__mutex_unlock( &pwaitdata->cond_mutex);
__cond_destroy( &pwaitdata->cond);
__mutex_destroy( &pwaitdata_cond_mutex);
}
else if (pwaitdata->waiter_count == 0) {
__cond_signal( &pwaitdata->cond);
__mutex_unlock( &pwaitdata->cond_mutex);
}
return rc;
}
17 July 20111: Update in response to a comment/question about process-shared barriers
I forgot completely about the situation with barriers that are shared between processes. And as you mention, the idea I outlined will fail horribly in that case. I don't really have experience with POSIX shared memory use, so any suggestions I make should be tempered with scepticism.
To summarize (for my benefit, if no one else's):
When any of the threads gets control after pthread_barrier_wait() returns, the barrier object needs to be in the 'init' state (however, the most recent pthread_barrier_init() on that object set it). Also implied by the API is that once any of the threads return, one or more of the the following things could occur:
another call to pthread_barrier_wait() to start a new round of synchronization of threads
pthread_barrier_destroy() on the barrier object
the memory allocated for the barrier object could be freed or unshared if it's in a shared memory region.
These things mean that before the pthread_barrier_wait() call allows any thread to return, it pretty much needs to ensure that all waiting threads are no longer using the barrier object in the context of that call. My first answer addressed this by creating a 'local' set of synchronization objects (a mutex and an associated condition variable) outside of the barrier object that would block all the threads. These local synchronization objects were allocated on the stack of the thread that happened to call pthread_barrier_wait() first.
I think that something similar would need to be done for barriers that are process-shared. However, in that case simply allocating those sync objects on a thread's stack isn't adequate (since the other processes would have no access). For a process-shared barrier, those objects would have to be allocated in process-shared memory. I think the technique I listed above could be applied similarly:
the waitdata_mutex that controls the 'allocation' of the local sync variables (the waitdata block) would be in process-shared memory already by virtue of it being in the barrier struct. Of course, when the barrier is set to THEAD_PROCESS_SHARED, that attribute would also need to be applied to the waitdata_mutex
when __barrier_waitdata_init() is called to initialize the local mutex & condition variable, it would have to allocate those objects in shared memory instead of simply using the stack-based waitdata variable.
when the 'cleanup' thread destroys the mutex and the condition variable in the waitdata block, it would also need to clean up the process-shared memory allocation for the block.
in the case where shared memory is used, there needs to be some mechanism to ensured that the shared memory object is opened at least once in each process, and closed the correct number of times in each process (but not closed entirely before every thread in the process is finished using it). I haven't thought through exactly how that would be done...
I think these changes would allow the scheme to operate with process-shared barriers. the last bullet point above is a key item to figure out. Another is how to construct a name for the shared memory object that will hold the 'local' process-shared waitdata. There are certain attributes you'd want for that name:
you'd want the storage for the name to reside in the struct pthread_barrier_t structure so all process have access to it; that means a known limit to the length of the name
you'd want the name to be unique to each 'instance' of a set of calls to pthread_barrier_wait() because it might be possible for a second round of waiting to start before all threads have gotten all the way out of the first round waiting (so the process-shared memory block set up for the waitdata might not have been freed yet). So the name probably has to be based on things like process id, thread id, address of the barrier object, and an atomic counter.
I don't know whether or not there are security implications to having the name be 'guessable'. if so, some randomization needs to be added - no idea how much. Maybe you'd also need to hash the data mentioned above along with the random bits. Like I said, I really have no idea if this is important or not.
As far as I can see there is no need for pthread_barrier_destroy to be an immediate operation. You could have it wait until all threads that are still in their wakeup phase are woken up.
E.g you could have an atomic counter awakening that initially set to the number of threads that are woken up. Then it would be decremented as last action before pthread_barrier_wait returns. pthread_barrier_destroy then just could be spinning until that counter falls to 0.

Resources