If a thread calls pthread_cond_wait(cond_ptr, mutex_ptr) with a null cond_ptr, is it guaranteed not to fall asleep?
According to http://pubs.opengroup.org/onlinepubs/007908799/xsh/pthread_cond_wait.html,
a null cond_ptr just means pthread_cond_wait() may (not an emphatic will) fail, so I guess threads can then fall asleep on null condition variables?
I can't see a valid use case for this and I'm wondering why it would ever matter. You shouldn't be calling pthread_cond_wait with an invalid condition variable.
If you're worried about it, just change your code from:
pthread_cond_wait (pcv, pmutex);
to something like:
if (pcv != NULL) pthread_cond_wait (pcv, pmutex);
and it won't be called with a NULL.
I suspect it was put in as "may" simply because there was an implementation of pthreads (perhaps even the original DEC threads itself) which didn't return a failure code for that circumstance.
But, since the alternative is almost certainly that the whole thing fell in a screaming heap, I wouldn't be relying on it :-)
If you're worried about the atomicity of that code, you needn't be. Simply use the same mutex that protects the condition variable to protect the CV pointer being held in your list:
claim mutex A
somenode->cv = NULL
release mutex A
and, in your looping code:
claim mutex A
if loopnode->cv != null:
wait on condvar loopnode->cv using mutex A
// mutex A is locked again
: : :
The fact that the mutex is locked during both the if and the call to pthread_cond_wait means that no race condition can exist. Nothing can change the node condition variables until the looping thread releases the mutex within the pthread_cond_wait call. And by that time, the call is using its own local copy of the pointer, so changing the one in the list will have no effect.
And, if the node-changing code grabs the mutex, the if and pthread_cond_wait can't execute until that mutex is released.
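A minimal sketch of that pattern in C, assuming hypothetical names (mutex_a, struct node, somenode, loopnode) purely for illustration:

#include <pthread.h>
#include <stddef.h>

struct node {
    pthread_cond_t *cv;   /* may be set to NULL by another thread */
    /* ... */
};

static pthread_mutex_t mutex_a = PTHREAD_MUTEX_INITIALIZER;

/* Thread that clears the condition variable pointer. */
void clear_cv(struct node *somenode)
{
    pthread_mutex_lock(&mutex_a);
    somenode->cv = NULL;
    pthread_mutex_unlock(&mutex_a);
}

/* Looping thread: the same mutex protects both the pointer check and the
 * wait, so the pointer cannot change between the two. */
void wait_on_node(struct node *loopnode)
{
    pthread_mutex_lock(&mutex_a);
    if (loopnode->cv != NULL)
        pthread_cond_wait(loopnode->cv, &mutex_a);  /* mutex re-acquired on return */
    pthread_mutex_unlock(&mutex_a);
}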
I am trying to create a multi-threaded app in C. At some point the program waits forever when trying to acquire a lock on mutexQueue, but I don't know why. This happens after the mutex is recreated.
for(int i = 80; i<= 8080; i++)
{
pthread_mutex_init(&mutexQueue,NULL);
...
pthread_mutex_lock(&mutexQueue); <= here it waits forever, after the first iteration (when i=81)
...
pthread_mutex_destroy(&mutexQueue);
}
The first time it passes pthread_mutex_lock, so it can acquire the lock; the second time it cannot.
Is there a problem with destroying the mutex and then re-initializing it afterwards?
Full program execution in real time : https://onlinegdb.com/T5kzCaFUA
EDIT:
As @John Carter suggested, reading the current pthread documentation (https://pubs.opengroup.org/onlinepubs/007904875/functions/pthread_mutex_destroy.html), it says:
In cases where default mutex attributes are appropriate, the macro
PTHREAD_MUTEX_INITIALIZER can be used to initialize mutexes that are
statically allocated. The effect shall be equivalent to dynamic
initialization by a call to pthread_mutex_init() with parameter attr
specified as NULL, except that no error checks are performed.
I also sometimes get the error __pthread_mutex_cond_lock_adjust: Assertion `(mutex->__data.__kind & 128) == 0' failed. after a long run.
So the error should be somewhere around here; I'm still searching for it.
Thank you.
Are you unlocking the mutex? Destroying a locked mutex results in undefined behaviour:
The pthread_mutex_destroy() function shall destroy the mutex object referenced by mutex; the mutex object becomes, in effect, uninitialized. An implementation may cause pthread_mutex_destroy() to set the object referenced by mutex to an invalid value. A destroyed mutex object can be reinitialized using pthread_mutex_init(); the results of otherwise referencing the object after it has been destroyed are undefined.
pthread_mutex_destroy
Is there a problem with destroying the mutex and then re-initializing it afterwards?
If something might still be using it, yes.
The code you showed is crazy: a shared queue's mutex should live as long as the queue it protects.
Further, the code you show acquires a lock and never unlocks it. That doesn't make much sense either.
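A sketch of the usual lifetime, assuming the loop body is the critical section (the name mutexQueue is taken from the question):

pthread_mutex_t mutexQueue;

pthread_mutex_init(&mutexQueue, NULL);   /* once, before any thread uses it */

for (int i = 80; i <= 8080; i++)
{
    pthread_mutex_lock(&mutexQueue);
    /* ... work on the shared queue ... */
    pthread_mutex_unlock(&mutexQueue);   /* every lock paired with an unlock */
}

pthread_mutex_destroy(&mutexQueue);      /* once, after all threads are done with it */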
I don't know exactly what your problem really is, but I think you have some misunderstanding about mutexes and threads, so I'll explain some basics about mutexes.
A mutex, in its most fundamental form, is just an integer in memory.
This integer can have a few different values depending on the state of
the mutex. When we speak of mutexes, we also need to speak about the
locking operations. The integer in memory is not intriguing, but the
operations around it are.
There are two fundamental operations which a mutex must provide:
lock
unlock
A thread wishing to use the mutex, must first call lock, then
eventually call unlock to release it
So, looking at your code:
for(int i = 80; i<= 8080; i++)
{
pthread_mutex_init(&mutexQueue,NULL);
...
pthread_mutex_lock(&mutexQueue); // here you lock 'mutexQueue', so 'mutexQueue' needs
                                 // to be unlocked so the lock can be acquired again
                                 // the second time; otherwise it will stop here forever.
...
pthread_mutex_destroy(&mutexQueue);
}
Where you call pthread_mutex_lock(&mutexQueue);, your code will normally wait forever until pthread_mutex_unlock(&mutexQueue); is called in the other thread; then this thread can proceed and lock the mutex again, release it for the other threads, and so on.
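A tiny sketch of that pairing between two threads (the worker function name is just for illustration); each thread's lock is matched by its own unlock, which is what lets the other thread's pthread_mutex_lock return:

#include <pthread.h>
#include <stddef.h>

pthread_mutex_t mutexQueue = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg)
{
    pthread_mutex_lock(&mutexQueue);
    /* ... touch the shared queue ... */
    pthread_mutex_unlock(&mutexQueue);   /* lets the other thread's lock() proceed */
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);

    pthread_mutex_lock(&mutexQueue);     /* blocks only while worker holds the mutex */
    /* ... touch the shared queue ... */
    pthread_mutex_unlock(&mutexQueue);

    pthread_join(t, NULL);
    return 0;
}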
You can check this website, which has some good information about threads: https://mortoray.com/how-does-a-mutex-work-what-does-it-cost/
As for me, I worked on a project called dining_philosopher_problem and it helped me a lot to learn about threads/mutexes/shared memory/processes/semaphores/...
You can check it here: https://github.com/mittous/philosopher
In the Linux man page for pthread_mutex_destroy, there is the following code snippet.
One thing I don't understand about this procedure for destroying a mutex is: how do we know that, between pthread_mutex_unlock and pthread_mutex_destroy, no other thread tries to acquire a lock on said mutex?
Typically, how should this be handled? 1) Should an additional mutex be used to ensure that this cannot happen? 2) Or is it the client's responsibility not to try to increase the reference count after it hits 0?
obj_done(struct obj *op)
{
pthread_mutex_lock(&op->om);
if (--op->refcnt == 0) {
pthread_mutex_unlock(&op->om);
(A) pthread_mutex_destroy(&op->om);
(B) free(op);
} else
(C) pthread_mutex_unlock(&op->om);
}
Something should be done to ensure the mutex isn’t going to get another lock attempt while you’re destroying it, yes. In the example case, with a reference count going to 0 involved, it's reasonable to expect that the thread holding the mutex is also the last thread with a pointer to the object. All the other threads that were using the object are finished with it, have decremented the reference count, and have forgotten about the object. So no thread will be attempting to lock the mutex when pthread_mutex_destroy is executed.
That's the typical design pattern. You don't destroy a mutex until all threads are done with it. The natural lifetime of a mutex means you don’t have to synchronize destroying them.
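For illustration, the matching acquire side might look like the following (obj_hold is a hypothetical name; the key rule is that a thread may only call it through a pointer for which it already holds a reference, so the count can never go from 0 back to 1 and nobody can reach the mutex after the last reference is dropped):

void obj_hold(struct obj *op)
{
    pthread_mutex_lock(&op->om);
    op->refcnt++;                    /* caller already owns a reference */
    pthread_mutex_unlock(&op->om);
}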
When reading a book on concurrency, the author says a semaphore is different from a condition variable in the way signal() works. The semaphore keeps track of the number of calls to signal(), while the condition variable does not. "Calling pthread_cond_signal while no one is waiting has no effect", it says. Why is this detail important (I have seen it repeated many times in different places)? What are the implications for usage? Thank you.
Conceptually, a semaphore is equivalent to a mutex, condition variable, and integer counter protected by the mutex. Under this analogy, posting a semaphore is equivalent to locking the mutex, incrementing the counter, signaling the condition variable, and unlocking the mutex. Even if there is no waiter, state is still modified.
Under this analogy, waiters for the semaphore are doing the equivalent of:
Lock mutex.
While count is non-positive, wait on condition variable.
Decrement count.
Unlock mutex.
Of course if you're talking about the specific case of POSIX, the analogy does not correspond fully to the reality, because semaphores have additional async-signal-safety properties that preclude implementing them using a mutex/condvar/count triple.
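A minimal sketch of that analogy in C (a toy type, not how POSIX sem_t is actually implemented, for the reason just given):

#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    unsigned        count;
} toy_sem_t;

void toy_sem_post(toy_sem_t *s)
{
    pthread_mutex_lock(&s->lock);
    s->count++;                      /* state changes even if nobody is waiting */
    pthread_cond_signal(&s->cond);
    pthread_mutex_unlock(&s->lock);
}

void toy_sem_wait(toy_sem_t *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->count == 0)            /* an earlier post is "remembered" in count */
        pthread_cond_wait(&s->cond, &s->lock);
    s->count--;
    pthread_mutex_unlock(&s->lock);
}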
The implications are that you have to have a 'condition' that the condition variable is associated with. The way you have to use the condition variable is:
acquire the condition's mutex
while (!condition) {
wait on the condition variable
}
do whatever you need to do while holding the mutex
release the mutex
Correspondingly, whenever the condition associated with the condition variable is updated, it has to be done while holding the mutex. That way, when blocking on the condition variable, the condition cannot have changed until the system is prepared to actually unblock the waiting thread.
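In C the pattern looks like this, using a hypothetical ready flag as the condition:

pthread_mutex_t m  = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;
int ready = 0;                       /* the condition, protected by m */

void waiter(void)
{
    pthread_mutex_lock(&m);
    while (!ready)                   /* re-check the condition after every wakeup */
        pthread_cond_wait(&cv, &m);
    /* ... do whatever you need to do while holding the mutex ... */
    pthread_mutex_unlock(&m);
}

void updater(void)
{
    pthread_mutex_lock(&m);
    ready = 1;                       /* update the condition while holding the mutex */
    pthread_cond_signal(&cv);        /* the signal itself carries no memory */
    pthread_mutex_unlock(&m);
}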
Suppose a condition variable is used in a situation where the signaling thread modifies the state affecting the truth value of the predicate and calls pthread_cond_signal without holding the mutex associated with the condition variable. Is it true that this type of usage is always subject to race conditions where the signal may be missed?
To me, there seems to always be an obvious race:
Waiter evaluates the predicate as false, but before it can begin waiting...
Another thread changes state in a way that makes the predicate true.
That other thread calls pthread_cond_signal, which does nothing because there are no waiters yet.
The waiter thread enters pthread_cond_wait, unaware that the predicate is now true, and waits indefinitely.
But does this same kind of race condition always exist if the situation is changed so that either (A) the mutex is held while calling pthread_cond_signal, just not while changing the state, or (B) so that the mutex is held while changing the state, just not while calling pthread_cond_signal?
I'm asking from a standpoint of wanting to know if there are any valid uses of the above not-best-practices usages, i.e. whether a correct condition-variable implementation needs to account for such usages in avoiding race conditions itself, or whether it can ignore them because they're already inherently racy.
The fundamental race here looks like this:
THREAD A                      THREAD B
Mutex lock
Check state
                              Change state
                              Signal
cvar wait
(never awakens)
If we take a lock EITHER on the state change OR the signal, OR both, then we avoid this; it's not possible for both the state-change and the signal to occur while thread A is in its critical section and holding the lock.
If we consider the reverse case, where thread A interleaves into thread B, there's no problem:
THREAD A                      THREAD B
                              Change state
Mutex lock
Check state
( no need to wait )
Mutex unlock
                              Signal (nobody cares)
So there's no particular need for thread B to hold a mutex over the entire operation; it just needs to hold the mutex for some, possibly infinitesimally small, interval between the state change and the signal. Of course, if the state itself requires locking for safe manipulation, then the lock must be held over the state change as well.
Finally, note that dropping the mutex early is unlikely to be a performance improvement in most cases. Requiring the mutex to be held reduces contention over the internal locks in the condition variable, and in modern pthreads implementations, the system can 'move' the waiting thread from waiting on the cvar to waiting on the mutex without waking it up (thus avoiding it waking up only to immediately block on the mutex).
As pointed out in the comments, dropping the mutex may improve performance in some cases, by reducing the number of syscalls needed. Then again it could also lead to extra contention on the condition variable's internal mutex. Hard to say. It's probably not worth worrying about in any case.
Note that the applicable standards require that pthread_cond_signal be safely callable without holding the mutex:
The pthread_cond_signal() or pthread_cond_broadcast() functions may be called by a thread whether or not it currently owns the mutex that threads calling pthread_cond_wait() or pthread_cond_timedwait() have associated with the condition variable during their waits [...]
This usually means that condition variables have an internal lock over their internal data structures, or otherwise use some very careful lock-free algorithm.
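A sketch of the minimal safe variant described above, holding the mutex for the state change and issuing the signal after unlocking (the done flag is illustrative):

pthread_mutex_t m  = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;
int done = 0;

void signaler(void)
{
    pthread_mutex_lock(&m);
    done = 1;                    /* state change inside the critical section */
    pthread_mutex_unlock(&m);
    pthread_cond_signal(&cv);    /* allowed without holding the mutex */
}

void waiter(void)
{
    pthread_mutex_lock(&m);
    while (!done)
        pthread_cond_wait(&cv, &m);
    pthread_mutex_unlock(&m);
}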
The state must be modified inside a mutex, if for no other reason than the possibility of spurious wake-ups, which would lead to the reader reading the state while the writer is in the middle of writing it.
You can call pthread_cond_signal anytime after the state is changed. It doesn't have to be inside the mutex. POSIX guarantees that at least one waiter will awaken to check the new state. More to the point:
Calling pthread_cond_signal doesn't guarantee that a reader will acquire the mutex first. Another writer might get in before a reader gets a chance to check the new status. Condition variables don't guarantee that readers immediately follow writers (After all, what if there are no readers?)
Calling it after releasing the lock is actually better, since you don't risk having the just-awoken reader immediately going back to sleep trying to acquire the lock that the writer is still holding.
EDIT: #DietrichEpp makes a good point in the comments. The writer must change the state in such a way that the reader can never access an inconsistent state. It can do so either by acquiring the mutex used in the condition-variable, as I indicate above, or by ensuring that all state-changes are atomic.
The answer is, there is a race, and to eliminate that race, you must do this:
/* atomic op outside of mutex, and then: */
pthread_mutex_lock(&m);
pthread_mutex_unlock(&m);
pthread_cond_signal(&c);
The protection of the data doesn't matter, because you don't hold the mutex when calling pthread_cond_signal anyway.
See, by locking and unlocking the mutex, you have created a barrier. During that brief moment when the signaler has the mutex, there is a certainty: no other thread has the mutex. This means no other thread is executing any critical regions.
This means that all threads are either about to get the mutex to discover the change you have posted, or else they have already found that change and run off with it (releasing the mutex), or else they have not found what they are looking for and have atomically given up the mutex and gone to sleep (and are guaranteed to be waiting nicely on the condition).
Without the mutex lock/unlock, you have no synchronization. The signal will sometimes fire as threads which didn't see the changed atomic value are transitioning to their atomic sleep to wait for it.
So this is what the mutex does from the point of view of a thread which is signaling. You can get the atomicity of access from something else, but not the synchronization.
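For completeness, a sketch of the waiter side this pattern pairs with, using a C11 atomic flag as the shared state (names are illustrative):

#include <pthread.h>
#include <stdatomic.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  c = PTHREAD_COND_INITIALIZER;
atomic_int flag = 0;

void waiter(void)
{
    pthread_mutex_lock(&m);
    while (!atomic_load(&flag))     /* check the atomic while holding the mutex */
        pthread_cond_wait(&c, &m);  /* atomically releases m and sleeps */
    pthread_mutex_unlock(&m);
}

void signaler(void)
{
    atomic_store(&flag, 1);         /* atomic op outside of mutex, and then: */
    pthread_mutex_lock(&m);         /* the lock/unlock "barrier" from above */
    pthread_mutex_unlock(&m);
    pthread_cond_signal(&c);
}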
P.S. I have implemented this logic before. The situation was in the Linux kernel (using my own mutexes and condition variables).
In my situation, it was impossible for the signaler to hold the mutex for the atomic operation on shared data. Why? Because the signaler did the operation in user space, inside a buffer shared between the kernel and user, and then (in some situations) made a system call into the kernel to wake up a thread. User space simply made some modifications to the buffer, and then if some conditions were satisfied, it would perform an ioctl.
So in the ioctl call I did the mutex lock/unlock thing, and then hit the condition variable. This ensured that the thread would not miss the wake up related to that latest modification posted by user space.
At first I just had the condition variable signal, but it looked wrong without the involvement of the mutex, so I reasoned about the situation a little bit and realized that the mutex must simply be locked and unlocked to conform to the synchronization ritual which eliminates the lost wakeup.
I'm implementing pthread condition variables (based on Linux futexes) and I have an idea for avoiding the "stampede effect" on pthread_cond_broadcast with process-shared condition variables. For non-process-shared cond vars, futex requeue operations are traditionally (i.e. by NPTL) used to requeue waiters from the cond var's futex to the mutex's futex without waking them up, but this is in general impossible for process-shared cond vars, because pthread_cond_broadcast might not have a valid pointer to the associated mutex. In a worst case scenario, the mutex might not even be mapped in its memory space.
My idea for overcoming this issue is to have pthread_cond_broadcast only directly wake one waiter, and have that waiter perform the requeue operation when it wakes up, since it does have the needed pointer to the mutex.
Naturally there are a lot of ugly race conditions to consider if I pursue this approach, but if they can be overcome, are there any other reasons such an implementation would be invalid or undesirable? One potential issue I can think of that might not be able to be overcome is the race where the waiter (a separate process) responsible for the requeue gets killed before it can act, but it might be possible to overcome even this by putting the condvar futex in the robust mutex list so that the kernel performs a wake on it when the process dies.
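For concreteness, a rough sketch of the requeue step the woken waiter might perform, using the raw futex(2) system call with FUTEX_CMP_REQUEUE (the wrapper, the 'expected' argument and the error handling are assumptions for illustration, not a worked-out implementation):

#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <limits.h>
#include <errno.h>
#include <stdint.h>

static long futex(uint32_t *uaddr, int op, uint32_t val,
                  uint32_t val2, uint32_t *uaddr2, uint32_t val3)
{
    return syscall(SYS_futex, uaddr, op, val, (unsigned long)val2, uaddr2, val3);
}

/* The woken waiter knows the mutex address in its own address space, so it
 * can move the remaining waiters from the condvar futex to the mutex futex
 * without waking them. 'expected' is the condvar futex value the waiter
 * expects; if it has changed, the requeue fails with EAGAIN and the caller
 * must re-examine the state. */
static void requeue_remaining(uint32_t *cond_futex, uint32_t expected,
                              uint32_t *mutex_futex)
{
    if (futex(cond_futex, FUTEX_CMP_REQUEUE, 0, INT_MAX,
              mutex_futex, expected) < 0 && errno == EAGAIN) {
        /* someone signaled/broadcast concurrently; handle this in real code */
    }
}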
There may be waiters belonging to multiple address spaces, each of which has mapped the mutex associated with the futex at a different address in memory. I'm not sure if FUTEX_REQUEUE is safe to use when the requeue point may not be mapped at the same address in all waiters; if it is, then this isn't a problem.
There are other problems that won't be detected by robust futexes; for example, if your chosen waiter is busy in a signal handler, you could be kept waiting an arbitrarily long time. [As discussed in the comments, these are not an issue]
Note that with robust futexes, you must set the value of the futex (masked with 0x3FFFFFFF) to the TID of the thread to be woken up; you must also set the FUTEX_WAITERS bit if you want a wakeup. This means that you must choose which thread to awaken from the broadcasting thread, or you will be unable to deal with thread death immediately after the FUTEX_WAKE. You'll also need to deal with the possibility of the thread dying immediately before the waker thread writes its TID into the state variable; perhaps having a 'pending master' field that is also registered in the robust mutex system would be a good idea.
I see no reason why this can't work, then, as long as you make sure to deal with the thread exit issues carefully. That said, it may be best to simply define in the kernel an extension to FUTEX_WAIT that takes a requeue point and comparison value as an argument, and let the kernel handle this in a simple, race-free manner.
I just don't see why you assume that the corresponding mutex might not be known. It is clearly stated
The effect of using more than one mutex for concurrent pthread_cond_timedwait() or pthread_cond_wait() operations on the same condition variable is undefined; that is, a condition variable becomes bound to a unique mutex when a thread waits on the condition variable, and this (dynamic) binding shall end when the wait returns.
So even for process shared mutexes and conditions this must hold, and any user space process must always have mapped the same and unique mutex that is associated to the condition.
Allowing users to associate different mutexes with a condition variable at the same time is not something I would support.