Can a process call "down" on two semaphores at once? - c

Let's say two semaphores are protecting a critical piece of code, and you only want a critical piece of code to execute if both of them are available. Is there a pattern for writing this?
In other words, is there a statement that reads, "If semaphore a and b are available, then run... otherwise sleep"?

The simplest way to implement this is to use a single pthread_mutex_t to protect some state, and a single pthread_cond_t to notify all threads when the state has changed. If you always broadcast on the condvar, then you will always wake all waiting threads. The threads can then perform arbitrarily complex tests and updates to the shared state.
Of course, this is not the most efficient solution since it potentially wakes threads when the state does not satisfy the condition they are waiting for (and they have to go back to sleep). It could also lead to starvation since a thread may always find itself at the back of the queue whenever it waits on the condvar, and never find an acceptable state when it awakens.
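A minimal sketch of that approach, assuming the two "semaphores" can be represented as plain counters guarded by one mutex. The type `sem_pair` and all function names are hypothetical, invented for illustration only.

```c
#include <pthread.h>

/* Hypothetical type: two counting "semaphores" kept as plain counters
   under one mutex, so both can be tested and decremented in one step. */
typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  changed;
    int a;   /* available units of resource A */
    int b;   /* available units of resource B */
} sem_pair;

void sem_pair_init(sem_pair *p, int a, int b) {
    pthread_mutex_init(&p->mu, NULL);
    pthread_cond_init(&p->changed, NULL);
    p->a = a;
    p->b = b;
}

/* Blocks until both counters are positive, then takes one unit of each. */
void sem_pair_down_both(sem_pair *p) {
    pthread_mutex_lock(&p->mu);
    while (p->a == 0 || p->b == 0)          /* re-check after every wakeup */
        pthread_cond_wait(&p->changed, &p->mu);
    p->a--;
    p->b--;
    pthread_mutex_unlock(&p->mu);
}

/* Non-blocking variant: returns 1 on success, 0 if either is unavailable. */
int sem_pair_try_down_both(sem_pair *p) {
    int ok = 0;
    pthread_mutex_lock(&p->mu);
    if (p->a > 0 && p->b > 0) {
        p->a--;
        p->b--;
        ok = 1;
    }
    pthread_mutex_unlock(&p->mu);
    return ok;
}

/* "up" operations: change the state, then wake every waiter so each one
   can re-evaluate its own condition. */
void sem_pair_up_a(sem_pair *p) {
    pthread_mutex_lock(&p->mu);
    p->a++;
    pthread_cond_broadcast(&p->changed);
    pthread_mutex_unlock(&p->mu);
}

void sem_pair_up_b(sem_pair *p) {
    pthread_mutex_lock(&p->mu);
    p->b++;
    pthread_cond_broadcast(&p->changed);
    pthread_mutex_unlock(&p->mu);
}
```

A thread would call sem_pair_down_both() before the critical section and sem_pair_up_a() / sem_pair_up_b() afterwards. The broadcast is what causes the spurious wakeups described above.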
Without knowing more details of the problem you are trying to solve, it is hard to give an airtight answer.
pthreads does not allow you to acquire multiple locks/semaphores atomically; however, as pointed out by @Greg, you can avoid deadlock by assigning an order to the locks/semaphores, and having the threads always acquire them in that order. Of course, you have to know which locks you intend to acquire before you start to acquire any of them. It will not work if you cannot determine the next lock to acquire until you have acquired the current one, since you may be required to take a lock out of order. If you release all of the locks and start over, you may find the state has changed, requiring you to acquire a different set of locks, which could lead to livelock.
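One way to sketch the lock-ordering idea, assuming the order is simply the address of each mutex (any fixed global order would do). The helpers `lock_both` and `unlock_both` are hypothetical names, not part of pthreads.

```c
#include <pthread.h>
#include <stdint.h>

/* Lock two mutexes in a fixed global order (here: by address), so that
   two threads asking for the same pair can never each hold one of them
   while waiting for the other. */
void lock_both(pthread_mutex_t *m1, pthread_mutex_t *m2) {
    if ((uintptr_t)m1 > (uintptr_t)m2) {
        pthread_mutex_t *tmp = m1;
        m1 = m2;
        m2 = tmp;
    }
    pthread_mutex_lock(m1);
    if (m2 != m1)                 /* tolerate the same mutex passed twice */
        pthread_mutex_lock(m2);
}

/* The unlock order does not matter for deadlock avoidance. */
void unlock_both(pthread_mutex_t *m1, pthread_mutex_t *m2) {
    pthread_mutex_unlock(m1);
    if (m2 != m1)
        pthread_mutex_unlock(m2);
}
```

Callers can pass the two mutexes in either argument order; the helper normalizes it. As noted above, this only works when the full set of locks is known up front.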

Related

How to make a thread wait for multiple conditional signals from different threads?

Does the pthread library (in C/C++) have a function like pthread_cond_wait(&cond, &mutex) that can wait for conditional signals (such as pthread_cond_signal(&cond1), pthread_cond_signal(&cond2) and so on...) from multiple different threads?
If this is not possible, then what is the most effective way to implement this strategy?
What you're describing is not about different threads -- pthread_cond_wait() / pthread_cond_signal() already handle that just fine, as indeed they need to do to serve their intended purpose. If a given thread is waiting on condition variable cond1 then it can be awakened by any thread that signals or broadcasts to cond1. And ultimately, that may be the direction you want to go.
But what you actually asked about is whether a thread can block on multiple condition variables at the same time, and no, pthreads makes no provision for that. Nor would that make sense for the usage model for which condition variables are designed.
I suspect that you are looking at CV waiting / signaling as purely a notification mechanism, but this is altogether the wrong view. A thread uses a condition variable to suspend execution until another thread performs work that causes some condition to be satisfied (otherwise, why wait at all?). That's normally a condition that can be evaluated based on data protected by the associated mutex, so that the prospective waiting thread can:
avoid data races involving the data in question (by locking the mutex at the beginning, before accessing them, and relying on other threads to do the same);
be assured (because it holds the mutex locked) that the condition will not change unexpectedly after it is evaluated;
avoid waiting at all when the condition is already satisfied;
verify when it resumes from waiting that the condition in fact is satisfied (MANDATORY).
When the condition indeed is satisfied, the thread moves on with whatever action it wanted to perform that required the condition to be satisfied. That's important. The thread waited to be able to perform a specific thing, so what does it matter if the conditions are then right for performing some other thing, too? And if it wants to perform that other thing next, then it can do it without waiting if the conditions for that still hold.
On the side of the signal sender, you should not think about it as notifying a specific thread. Instead, it is announcing to any thread that cares that something it is presently interested in may have changed.
I appreciate that this is all fairly abstract. Multithreaded programming is challenging. But this is why students are given exercises such as producer / consumer problems or the Dining Philosophers problem. Learn and apply the correct idioms for CV usage. Think about what they are achieving for you. It will become clearer in time.
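To make the idiom concrete, here is a minimal sketch in which one condition variable and one mutex serve a thread that is interested in either of two events. The flags `event1` / `event2` and the function names are assumptions made up for this example; the point is that the waiter tests shared state in a loop rather than counting signals.

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t ev_mu = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ev_cv = PTHREAD_COND_INITIALIZER;
static bool event1 = false;   /* shared state, protected by ev_mu */
static bool event2 = false;

/* Called by any thread: change the state under the mutex, then announce
   that something waiters care about may have changed. */
void post_event(int which) {
    pthread_mutex_lock(&ev_mu);
    if (which == 1)
        event1 = true;
    else
        event2 = true;
    pthread_cond_broadcast(&ev_cv);
    pthread_mutex_unlock(&ev_mu);
}

/* Waits until at least one event is pending, consumes it, and returns
   its number. Does not wait at all if an event is already pending. */
int wait_any_event(void) {
    int which;
    pthread_mutex_lock(&ev_mu);
    while (!event1 && !event2)            /* mandatory re-check on wakeup */
        pthread_cond_wait(&ev_cv, &ev_mu);
    if (event1) {
        event1 = false;
        which = 1;
    } else {
        event2 = false;
        which = 2;
    }
    pthread_mutex_unlock(&ev_mu);
    return which;
}
```

Different threads can call post_event(1) and post_event(2); the waiter never needs to block on two condition variables at once because the condition it waits for is "event1 or event2", evaluated under the one mutex.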

Lock that handles a high-contention, high-frequency situation

I am looking for a lock implementation that degrades gracefully in the situation where you have two threads that constantly try to release and re-acquire the same lock, at a very high frequency.
Of course it is clear that in this case the two threads won't significantly progress in parallel. Theoretically, the best result would be achieved by running the whole thread 1, and then the whole thread 2, without any switching---because switching just creates massive overhead here. So I am looking for a lock implementation that would handle this situation gracefully by keeping the same thread running for a while before switching, instead of constantly switching.
Long version of the question
As I would myself be tempted to answer this question by "your program is broken, don't do that", here is some justification about why we end up in this kind of situation.
The lock is a "single global lock", i.e. a very coarse lock. (It is the Global Interpreter Lock (GIL) inside PyPy, but the question is about how to do it in general, say if you have a C program.)
We have the following situation:
There is constantly contention. That's expected in this case: the lock is a global lock that needs to be acquired for most threads to progress. So we expect that a large fraction of them are waiting for the lock. Only one of these threads can progress.
The thread that holds the lock might sometimes do bursts of short releases. A typical example would be if this thread does repeated calls to "something external", e.g. many short writes to a file. Each of these writes is usually completed very quickly. The lock still has to be released just in case this external thing turns out to take longer than expected (e.g. if the write actually needs to wait for disk I/O), so that another thread can acquire the lock in this case.
If we use some standard mutex for the lock, then the lock will often switch to another thread as soon as the owner releases the lock. But the problem is what if the program runs several threads that each wants to do a long burst of short releases. The program ends up spending most of its time switching the lock between CPUs.
It is much faster to run the same thread for a while before switching, at least as long as the lock is released for very short periods of time. (E.g. on Linux/pthread a release immediately followed by an acquire will sometimes re-acquire the lock instantly even if there are other waiting threads; but we'd like this result in a large majority of cases, not just sometimes.)
Of course, as soon as the lock is released for a longer period of time, then it becomes a good idea to transfer ownership of the lock to a different thread.
So I'm looking for general ideas about how to do that. I guess it should exist already somewhere---in a paper, or in some multithreading library?
For reference, PyPy tries to implement something like this by polling: the lock is just a global variable, with synchronized compare-and-swap but no OS calls; one of the waiting threads is given the role of "stealer"; that "stealer" thread wakes up every 100 microseconds to check the variable. This is not horribly bad (it costs maybe 1-2% of CPU time in addition to the 100% consumed by the running thread). This actually implements what I'm asking for here, but the problem is that this is a hack that doesn't cleanly support more traditional cases of locks: for example, if thread 1 tries to send a message to thread 2 and wait for the answer, the two thread switches will take on average 100 microseconds each---which is far too much if the message is processed quickly.
For reference, let me describe how we finally implemented it. I was unsure about it as it still feels like a hack, but it seems to work for PyPy's use case in practice.
We did it as described in the last paragraph of the question, with one addition: the "stealer" thread, which checks some global variable every 100 microseconds, does this by calling pthread_cond_timedwait or WaitForSingleObject with a regular, system-provided mutex, with a timeout of 100 microseconds. This gives a "composite lock" with both the global variable and the regular mutex. The "stealer" will succeed in stealing the "lock" either if it notices a value 0 in the global variable (checked every 100 microseconds), or immediately if the regular mutex is released by another thread.
It's then a matter of choosing how to release the composite lock on a case-by-case basis. Most external functions (writes to files, etc.) are expected to generally complete quickly, and so we release and re-acquire the composite lock by writing to the global variable. Only in a few specific function cases---like sleep() or lock_acquire()---do we expect the calling thread to often block; around these functions, we release the composite lock by actually releasing the mutex instead.
If I understand the problem statement, you are asking the kernel scheduler to do an educated guess on whether your userspace application "hot" thread will try to reacquire the lock in the very near future, to avoid implicitly preempting it by allowing a "not-so-hot" thread to acquire the mutex.
I wouldn't know how the kernel could do that. The only two things that come to my mind:
Do not release the mutex unless the hot thread is actually transitioning to idle (an application-specific condition). On Linux you can use clock_gettime() with CLOCK_MONOTONIC_COARSE to try to reduce the overhead of reading the clock when implementing some sort of timer.
Increase the hot thread's priority. This is more of a mitigation strategy, in an attempt to reduce the amount of preemption of the hot thread. If the "hot" thread can be identified, you could do something like:
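// Needs <pthread.h>, <sched.h>, <stdio.h> and <string.h>
pthread_t thread = pthread_self();
// Request the maximum priority under the SCHED_FIFO real-time policy.
// Note: this usually requires elevated privileges.
struct sched_param params;
params.sched_priority = sched_get_priority_max(SCHED_FIFO);
int rv = pthread_setschedparam(thread, SCHED_FIFO, &params);
if (rv != 0) {
    // pthread functions return the error number instead of setting errno
    fprintf(stderr, "pthread_setschedparam: %s\n", strerror(rv));
}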
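For the first suggestion (keep the mutex until some time budget has elapsed), a coarse clock check could look like the sketch below. The helper names `coarse_now_ns` and `budget_expired` are hypothetical; CLOCK_MONOTONIC_COARSE is Linux-specific, so the sketch falls back to CLOCK_MONOTONIC where it is not defined.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE               /* expose Linux-specific clock IDs */
#endif
#include <stdint.h>
#include <time.h>

#ifndef CLOCK_MONOTONIC_COARSE
#define CLOCK_MONOTONIC_COARSE CLOCK_MONOTONIC   /* non-Linux fallback */
#endif

/* Current monotonic time in nanoseconds, read from the coarse clock.
   The coarse clock trades resolution for a cheaper read. */
static int64_t coarse_now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC_COARSE, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Returns 1 once at least budget_ns nanoseconds have passed since
   start_ns (a value previously obtained from coarse_now_ns()). */
int budget_expired(int64_t start_ns, int64_t budget_ns) {
    return coarse_now_ns() - start_ns >= budget_ns;
}
```

The hot thread would record coarse_now_ns() when it takes the lock and genuinely release it only once budget_expired() returns 1 or it is about to go idle.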
A spinlock might work better in your case. Spinlocks avoid context switching and are highly efficient if the threads are likely to hold the lock only for a short duration of time.
For this very reason, they are widely used in OS kernels.
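A minimal user-space spinlock can be sketched with C11's atomic_flag. The names `spinlock`, `spin_lock`, `spin_trylock` and `spin_unlock` are made up for this example. In user space, unlike in a kernel, the lock holder can be descheduled while others spin, so whether this helps depends on the workload.

```c
#include <stdatomic.h>

typedef struct {
    atomic_flag busy;             /* set while some thread holds the lock */
} spinlock;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Busy-wait until the flag was previously clear: no system call and no
   context switch, at the cost of burning CPU while waiting. */
static inline void spin_lock(spinlock *l) {
    while (atomic_flag_test_and_set_explicit(&l->busy, memory_order_acquire))
        ;                         /* spin */
}

/* Single attempt: returns 1 if the lock was acquired, 0 otherwise. */
static inline int spin_trylock(spinlock *l) {
    return !atomic_flag_test_and_set_explicit(&l->busy, memory_order_acquire);
}

static inline void spin_unlock(spinlock *l) {
    atomic_flag_clear_explicit(&l->busy, memory_order_release);
}
```

A lock is declared as `spinlock l = SPINLOCK_INIT;` and used as spin_lock(&l); ... spin_unlock(&l); around a very short critical section.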

Threads - Access resource avoiding starvation

I know that this isn't a "homework helper website", but I have been going insane over the last few days because I have to implement access to a resource while avoiding starvation, and I can't figure out how to do that. Can anyone help me with some application examples or documentation? The assignment is: a resource may be used by 2 types of processes: black and white. When the resource is used by the white processes, it cannot be used by the black processes and vice versa. Implement the access to the resource avoiding starvation. Is this a producer-consumer case?
Let's make a few assumptions (for the sake of discussion):
Our processes will be threads -- not actual software processes, there's a difference which may be important in your assignment.
White processes are Readers.
Black processes are Writers.
Our common resource is a particular variable.
Mutual exclusion locks (mutex):
A mutex is a type of exclusive lock, it has a binary state, it's either locked or unlocked. You can lock it, unlock it or check to see if it's locked or not.
Threads can lock each other out using mutex (mutual exclusion locks) just as processes can lock each other out using semaphores.
When you want to protect a variable from being used by two threads at once you create a mutex for that variable and write every thread so that it attempts to lock the mutex before attempting to use the variable and unlock it after they're done.
This makes any first thread lock the mutex and any subsequent thread block until the first thread unlocks the mutex basically forcing all of these threads to line up and operate on that particular variable sequentially.
This is a bit ineffective when you just want to read the variable, not change its value, because two threads reading the same content doesn't create any conflict or invalid data. Two threads writing at the same time might however corrupt the data.
Readers/Writers locks (RWL):
Most implementations of Readers/Writers locks will use a shared lock and an exclusive lock, but they expose a simple usage approach: if you want to read, grab a "read lock"; if you want to write, grab a "write lock".
"Read locks" are not exclusive and they allow multiple readers to be reading at one particular time (without blocking).
"Write locks" are exclusive and only one writer can be writing at one particular time (without blocking).
Starvation:
First step: with a Readers/Writers lock, a first (read) thread grabs a "read lock" on the variable; a second (write) thread then tries to grab a "write lock" but is blocked until all readers finish reading.
Second step: before the first thread finishes reading, a third (read) thread grabs a "read lock" on the variable; this means the second (write) thread has to wait for this third thread to finish.
Repeat the second step until starvation is achieved.
Avoiding starvation with Seqlock:
A seqlock is implemented with one mutex and some counters. It always allows reading, even while the writers are writing to the variable but it gives the readers a means of checking if the data has been written to during the time it was being read, if so it may be corrupt so the readers will have to reread the data and check for consistency again.
The "read & consistency check" phase runs in a loop until the check confirms consistency of the data, at which point the reader can continue with its usual task.
The writers use the mutex to grab exclusive access so they never overlap their operations.
This is good for high read low write situations. If there would be too many writers the readers would continuously loop rereading the data.
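The seqlock described above can be sketched as follows, assuming the protected data is a pair of integers. The type `seq_point` and its functions are hypothetical. To keep the sketch free of data races in C11 terms, the data fields are atomics and every operation uses the default sequentially consistent ordering; real implementations often use weaker orderings.

```c
#include <pthread.h>
#include <stdatomic.h>

typedef struct {
    pthread_mutex_t writer_mu;    /* writers exclude each other */
    atomic_uint seq;              /* even = stable, odd = write in progress */
    atomic_int x;                 /* the protected data */
    atomic_int y;
} seq_point;

void seq_point_init(seq_point *p) {
    pthread_mutex_init(&p->writer_mu, NULL);
    atomic_init(&p->seq, 0);
    atomic_init(&p->x, 0);
    atomic_init(&p->y, 0);
}

void seq_point_write(seq_point *p, int x, int y) {
    pthread_mutex_lock(&p->writer_mu);
    atomic_fetch_add(&p->seq, 1);         /* counter becomes odd */
    atomic_store(&p->x, x);
    atomic_store(&p->y, y);
    atomic_fetch_add(&p->seq, 1);         /* counter becomes even again */
    pthread_mutex_unlock(&p->writer_mu);
}

/* Readers never block writers: they read, then check whether a write
   overlapped with the read, and retry if so. */
void seq_point_read(seq_point *p, int *x, int *y) {
    unsigned before, after;
    do {
        before = atomic_load(&p->seq);
        *x = atomic_load(&p->x);
        *y = atomic_load(&p->y);
        after = atomic_load(&p->seq);
    } while ((before & 1u) || before != after);
}
```

The retry loop is the "read & consistency check" phase: a reader only returns once it has seen the same even counter value before and after reading both fields.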
Your particular situation:
If black processes need to be able to share the resource among themselves and white processes need to be able to share the resource among themselves but white processes can't share the resource with black processes then the solution will not be either RWL or Seqlock.
A variation on the Seqlock algorithm might be your solution.
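As a different route from a seqlock variation, the black/white requirement can also be sketched directly with one mutex and one condition variable. This is only one possible design, with hypothetical names (`color_lock`, `color_enter`, `color_leave`): threads of the same color share the resource, newcomers stop joining as soon as the other color is waiting, and the turn flips each time the resource drains. It keeps one color from starving the other, but does not guarantee FIFO order among threads of the same color.

```c
#include <assert.h>
#include <pthread.h>

enum { WHITE = 0, BLACK = 1 };

typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  changed;
    int active;       /* threads currently using the resource */
    int color;        /* color of the active threads (valid if active > 0) */
    int waiting[2];   /* number of waiting threads per color */
    int turn;         /* color preferred when the resource is free */
} color_lock;

void color_lock_init(color_lock *l) {
    pthread_mutex_init(&l->mu, NULL);
    pthread_cond_init(&l->changed, NULL);
    l->active = 0;
    l->color = WHITE;
    l->waiting[WHITE] = l->waiting[BLACK] = 0;
    l->turn = WHITE;
}

/* Entry rule; the caller must hold l->mu. */
static int may_enter(const color_lock *l, int c) {
    int other = 1 - c;
    if (l->active == 0)           /* free: yield to the other color if it
                                     is waiting and it is its turn */
        return l->waiting[other] == 0 || l->turn == c;
    /* In use: join only our own color, and only while nobody of the
       other color is queued (otherwise they could starve). */
    return l->color == c && l->waiting[other] == 0;
}

void color_enter(color_lock *l, int c) {
    pthread_mutex_lock(&l->mu);
    l->waiting[c]++;
    while (!may_enter(l, c))
        pthread_cond_wait(&l->changed, &l->mu);
    l->waiting[c]--;
    l->active++;
    l->color = c;
    pthread_mutex_unlock(&l->mu);
}

void color_leave(color_lock *l) {
    pthread_mutex_lock(&l->mu);
    assert(l->active > 0);
    if (--l->active == 0)
        l->turn = 1 - l->color;   /* next turn goes to the other color */
    pthread_cond_broadcast(&l->changed);
    pthread_mutex_unlock(&l->mu);
}
```

A white thread would wrap its use of the resource in color_enter(&l, WHITE) and color_leave(&l); a black thread does the same with BLACK.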
Generally, this is a problem of controlling access to a shared resource (for example, with a mutex).
If you have two objects of the same class, both running as threads:
pseudo-code:
loop
    if shared_resource is free
        lock shared_resource
        do something
        free shared_resource
This is in VERY broad terms!

suspend pthread?

I want to implement a mutex lock.
From my understanding, mutex.lock() should work like
1) check lock owner
2) if lock is owned, put thread in waiting queue
3) suspend this thread until another thread sends a wake-up signal
However, there is nothing like pthread_suspend(), then how do I do suspend?
I found someone saying to use pthread_cond_wait(), but it seems that if I want to use that function, I have to set up a pthread_mutex lock first, and it doesn't make sense to use pthread_mutex inside my mutex.
Well, if my understanding of mutex is wrong, please correct me.
Thanks.
Mutexes, locks, and wait conditions are all different, distinct things. You need a mutex variable in order to implement both a lock and a wait condition.
A lock is a simple mechanism that prevents more than one thread from executing the same code at once by making all but one thread wait for the lock to become unlocked.
A wait condition is a slightly more complex structure that allows a thread to monitor a condition (usually a boolean flag) and only wake up when the flag has changed favourably.
In both cases, when a thread blocks (i.e. sleeps), the operating system's scheduling primitives automatically take care of descheduling the thread and using the available computing time elsewhere. Thread and task scheduling is not something you would normally have to worry about manually.
You can only make things that are at least as complex as the simplest pieces you have. If the simplest pieces you have are mutexes, then you can't make mutexes from the pieces you have. You can only make things at least as complex as a mutex or more so. If you have any pieces simpler than a mutex, tell us what they are, and we can tell you how to make a mutex out of them.
I suppose, if you want, you can make your own mutex out of pthread mutexes and condition variables. I'm not sure what the point is, but it's trivial to do. As you noted, you can use pthread_cond_wait to wait on your own kind of mutex.
The reason the pthreads standard gives you a mutex is because it's about the most flexible of the possible synchronization primitives.
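For illustration, a mutex built out of a pthread mutex and a condition variable, as mentioned above, might look like this sketch. The type `my_mutex` and its functions are hypothetical names. The inner pthread mutex is only held for a few instructions; the long wait happens inside pthread_cond_wait.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t guard;     /* protects the 'locked' flag only */
    pthread_cond_t  unlocked;  /* signalled when 'locked' becomes false */
    bool locked;
} my_mutex;

void my_mutex_init(my_mutex *m) {
    pthread_mutex_init(&m->guard, NULL);
    pthread_cond_init(&m->unlocked, NULL);
    m->locked = false;
}

void my_mutex_lock(my_mutex *m) {
    pthread_mutex_lock(&m->guard);
    while (m->locked)                      /* sleep until it is free */
        pthread_cond_wait(&m->unlocked, &m->guard);
    m->locked = true;
    pthread_mutex_unlock(&m->guard);
}

void my_mutex_unlock(my_mutex *m) {
    pthread_mutex_lock(&m->guard);
    assert(m->locked);                     /* unlocking a free mutex is a bug */
    m->locked = false;
    pthread_cond_signal(&m->unlocked);     /* wake one waiter */
    pthread_mutex_unlock(&m->guard);
}
```

The kernel does the actual suspending and resuming inside pthread_cond_wait and pthread_cond_signal, which is why no pthread_suspend() is needed.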
mutex.lock() should work like:
1) check lock owner
2) if lock is owned, put thread in waiting queue
3) suspend this thread until THE THREAD THAT OWNS THE LOCK sends a wake up signal. No other thread can release the lock.
These steps should be performed as an atomic operation so that the correct behaviour is followed for all threads acquiring/releasing the mutex, no matter how such calls may be interrupted and reentered from other threads.
'However, there is nothing like pthread_suspend(), then how do I do suspend?' - usually, you don't. The OS kernel provides synchronization primitives that can block threads that should not run. To implement a 'suspend' in user space, you can only spin-wait, which is a good strategy in a few cases (an underloaded multi-core box where the lock is only held for a very short time), but certainly not all (and it can lead to spectacularly disastrous livelocks across whole clusters of machines).
If you want a mutex, use an OS mutex - that's what any cross-platform lib. will do.

Implementing mutex in a user level thread library

I am developing a user level thread library as part of a project. I came up with an approach to implement a mutex. I would like to see your views before going on with it. Basically, I need to implement just 3 functions in my library
mutex_init, mutex_lock and mutex_unlock
I thought my mutex_t structure would look something like
typedef struct
{
    int available; //indicates whether the mutex is locked or unlocked
    queue listofwaitingthreads;
    gtthread_t owningthread;
} mutex_t;
In my mutex_lock function, I will first check if the mutex is available in a while loop. If it is not, I will yield the processor for the next thread to execute.
In my mutex_unlock function, I will check if the owner thread is the current thread. If it is, I will set available to 0.
Is this the way to go about it? Also, what about deadlock? Should I take care of those conditions in my user level library or should I leave the application programmers to write code properly?
This won't work, because you have a race condition. If 2 threads try to take the lock at the same time, both will see available == 0, and both will think they succeeded in taking the mutex.
If you want to do this properly, and without using an already-existing lock, you must use atomic hardware operations like TAS (test-and-set), CAS (compare-and-swap), etc.
There are algorithms that give you mutual exclusion without such hardware support, but they make some assumptions that are many times false. For more details about this, I highly recommend reading Herlihy and Shavit's The art of multiprocessor programming, chapter 7.
You shouldn't worry about deadlocks in this level - mutex locks should be simple enough, and there is some assumption that the programmer using them should use care not to cause deadlocks (advanced mutexes can check for self-deadlock, meaning a thread that calls lock twice without calling unlock in the middle).
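As a sketch of how a CAS closes the race described above, the check and the update of the flag can be made a single atomic step using C11 atomics. The names `cas_mutex` and friends are hypothetical, and sched_yield() stands in for whatever yield function the user level thread library provides.

```c
#include <sched.h>
#include <stdatomic.h>

typedef struct {
    atomic_int locked;            /* 0 = free, 1 = held */
} cas_mutex;

void cas_mutex_init(cas_mutex *m) {
    atomic_init(&m->locked, 0);
}

/* Atomically: if locked == 0, set it to 1. Two threads racing here
   cannot both succeed, unlike a separate "read, then write". */
int cas_mutex_trylock(cas_mutex *m) {
    int expected = 0;
    return atomic_compare_exchange_strong(&m->locked, &expected, 1);
}

void cas_mutex_lock(cas_mutex *m) {
    while (!cas_mutex_trylock(m))
        sched_yield();            /* let another thread run, then retry */
}

void cas_mutex_unlock(cas_mutex *m) {
    atomic_store(&m->locked, 0);
}
```

This version has no waiting queue, so the order in which waiting threads obtain the lock is not controlled.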
Not only do you have to use atomic operations to read and modify the flag (as Eran pointed out), you also have to make sure that your queue can handle concurrent accesses. This is not completely trivial; it is something of a chicken-and-egg problem.
But if you'd really implement this by spinning, you wouldn't even need to have such a queue. The access order to the lock then would be mainly random, though.
Probably just yielding would also not be enough, this can be quite costly if you have threads holding the lock for more than some processor cycles. Consider using nanosleep with a low time value for the wait.
In general, a mutex implementation should look like:
Lock:
    while (trylock()==failed) {
        atomic_inc(waiter_cnt);
        atomic_sleep_if_locked();
        atomic_dec(waiter_cnt);
    }
Trylock:
    return atomic_swap(&lock, 1);
Unlock:
    atomic_store(&lock, 0);
    if (waiter_cnt) wakeup_sleepers();
Things get more complex if you want recursive mutexes, mutexes that can synchronize their own destruction (i.e. freeing the mutex is safe as soon as you get the lock), etc.
Note that atomic_sleep_if_locked and wakeup_sleepers correspond to FUTEX_WAIT and FUTEX_WAKE ops on Linux. The other atomics are probably CPU instructions, but could be system calls or kernel-assisted userspace function code, as in the case of Linux/ARM and the 0xffff0fc0 atomic compare-and-swap call.
You do not need atomic instructions for a user level thread library, because all the threads are going to be user level threads of the same process. So actually when your process is given the time slice to execute, you are running multiple threads during that time slice but on the same processor. So, no two threads are going to be in the library function at the same time. Considering that the functions for mutex are already in the library, mutual exclusion is guaranteed.

Resources