I have a prioritization problem in my homework. Some threads have higher priority than others, and the lower-priority threads have to wait to access a file until all the high-priority threads have finished their job. We are not allowed to use busy waiting. Is there another way to solve this? Thanks.
Yeah, OK, like my comment.
The low-priority threads must wait on the semaphore (initialized to zero) before accessing the file. The high-priority threads, when they have finished with the file, must acquire the mutex, count down the high count, check for zero, and release the mutex. The last high-priority thread, the one that finds zero, should then issue [low count] units to the semaphore, releasing all the low-priority threads.
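The scheme above could be sketched roughly as follows. This is only an illustration under assumptions: the struct and function names (prio_gate, prio_gate_high_done, prio_gate_low_wait) are made up here, and it assumes the number of high- and low-priority threads is known up front.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

/* Hypothetical names; the answer describes the scheme only in prose. */
struct prio_gate {
    pthread_mutex_t lock;   /* protects high_remaining */
    sem_t low_gate;         /* starts at 0, so low-priority threads block here */
    int high_remaining;     /* high-priority threads that have not finished yet */
    int low_count;          /* number of low-priority threads to release */
};

int prio_gate_init(struct prio_gate *g, int high_count, int low_count) {
    g->high_remaining = high_count;
    g->low_count = low_count;
    if (pthread_mutex_init(&g->lock, NULL) != 0)
        return -1;
    return sem_init(&g->low_gate, 0, 0);
}

/* Called by each high-priority thread once it is done with the file. */
void prio_gate_high_done(struct prio_gate *g) {
    pthread_mutex_lock(&g->lock);
    assert(g->high_remaining > 0);
    int last = (--g->high_remaining == 0);
    pthread_mutex_unlock(&g->lock);

    /* The last high-priority thread posts one unit per low-priority thread. */
    if (last)
        for (int i = 0; i < g->low_count; i++)
            sem_post(&g->low_gate);
}

/* Called by each low-priority thread before it touches the file.
   The thread sleeps in sem_wait(), so there is no busy waiting. */
void prio_gate_low_wait(struct prio_gate *g) {
    while (sem_wait(&g->low_gate) == -1 && errno == EINTR)
        ;  /* retry if a signal handler interrupted the wait */
}
```

The low-priority threads would still need their own mutex around the file itself if they must not access it concurrently; the gate only orders the two groups.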
Actually, a mutex-protected struct with state data can handle just about any weird locking/sequencing/prioritizing scheme that your prof can come up with, no matter how much tequila it consumes first:)
I have created a program in C that creates 2 buffers. The buffer indices hold single characters, 'A' or 'b' etc... In order to learn more about multithreading, I created a set of semaphores based on the producer/consumer problem to produce characters and consume characters from the buffers. I have 3 producer threads for each buffer and 10 consumer threads. The consumers take one item from each buffer, then report it (freeing the memory of the consumed item also). Now, from what I've read, sem_wait() is supposed to signal the "longest waiting thread" when it comes out of a blocking state (I read this in a book and in an online POSIX library).
Now, is this actually true?
The application I have made should have both consumers and producers waiting at the same sem_wait() gate, but the producers get into the critical section more than double the time of any consumer. The consumers do have an extra semaphore to wait for, but that shouldn't make that huge of a difference. I can't seem to figure out why it's happening, so I'm hoping someone else does. If I sleep(1) on the producer threads, the consumers get in just fine and the buffers hover around 0 items...like I would think would happen otherwise.
Also, should thread creation order play any role in how I structure the program for fairness?
I.e., create one thread of each type in round-robin fashion until all of them are created and running.
Are there any methods anyone can describe to me to institute a more fair system of thread access? I've read that creating a FIFO queue system might be one solution, where the longest waiting thread has the highest priority (which is what I thought sem_wait() would do anyways).
Just wondering what methods are out there for both rudimentary and higher level threading.
The POSIX standard actually says that "the highest priority thread that has been waiting the longest shall be unblocked" only when the SCHED_FIFO or SCHED_RR scheduling policy applies to the blocked thread.
If you're not using one of those two realtime scheduling policies, then the semaphore does not have to be "fair".
I'm trying to figure out a way to put some threads into a passive waiting mode as they arrive at a barrier and wake them up later. A fixed number of threads should arrive.
I first thought about a semaphore initialised to 0 so that it would block, but then the threads would be released in a random order. I would like to implement a system that releases the threads in the order they arrived at the synchronisation barrier, like a FIFO.
I also thought about using two semaphores: one that would block, release a thread, and check its position. If it is the right thread, it just proceeds; if not, it is blocked by the second semaphore. However, this system seems long and tedious.
Does anyone have an idea or suggestion that would help me?
Thank you very much :)
On Linux, you can just use a condition variable and a mutex to block and unblock threads in the same FIFO order.
This is because all waiters on a condition variable are appended, in order, to the futex wait queue in the kernel, and waking the waiters happens in the same FIFO order, as long as you keep the mutex locked while signaling the condition variable.
However, as commenters mentioned, it is a poor idea to depend on thread execution order.
Suppose I have multiple threads blocking on a call to pthread_mutex_lock(). When the mutex becomes available, does the first thread that called pthread_mutex_lock() get the lock? That is, are calls to pthread_mutex_lock() in FIFO order? If not, what, if any, order are they in? Thanks!
When the mutex becomes available, does the first thread that called pthread_mutex_lock() get the lock?
No. One of the waiting threads gets a lock, but which one gets it is not determined.
FIFO order?
FIFO mutex is rather a pattern already. See Implementing a FIFO mutex in pthreads
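One common way to build such a FIFO mutex is a ticket scheme on top of an ordinary mutex and condition variable. The sketch below is an assumption about what that pattern might look like, not a copy of the linked answer; the names fifo_mutex, fifo_mutex_lock and fifo_mutex_unlock are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical ticket-based FIFO lock built on plain pthreads. */
struct fifo_mutex {
    pthread_mutex_t lock;        /* protects the two counters */
    pthread_cond_t turn_changed; /* broadcast whenever now_serving advances */
    unsigned long next_ticket;   /* next ticket to hand out */
    unsigned long now_serving;   /* ticket currently allowed to hold the lock */
};

#define FIFO_MUTEX_INITIALIZER \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 }

void fifo_mutex_lock(struct fifo_mutex *m) {
    pthread_mutex_lock(&m->lock);
    /* Tickets are handed out in arrival order, which defines the FIFO order. */
    unsigned long my_ticket = m->next_ticket++;
    while (my_ticket != m->now_serving)
        pthread_cond_wait(&m->turn_changed, &m->lock);
    pthread_mutex_unlock(&m->lock);
}

void fifo_mutex_unlock(struct fifo_mutex *m) {
    pthread_mutex_lock(&m->lock);
    assert(m->now_serving != m->next_ticket); /* someone must hold the lock */
    m->now_serving++;
    /* Broadcast: every waiter re-checks its ticket, only one matches. */
    pthread_cond_broadcast(&m->turn_changed);
    pthread_mutex_unlock(&m->lock);
}
```

The broadcast wakes every waiter just so that one of them can proceed, which is part of why FIFO locks tend to be slower than the default mutex.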
"If there are threads blocked on the mutex object referenced by mutex when pthread_mutex_unlock() is called, resulting in the mutex becoming available, the scheduling policy shall determine which thread shall acquire the mutex."
Aside from that, the answer to your question isn't specified by the POSIX standard. It may be random, or it may be in FIFO or LIFO or any other order, according to the choices made by the implementation.
FIFO ordering is about the least efficient mutex wake order possible. Only a truly awful implementation would use it. The thread that ran most recently may be able to run again without a context switch, and the more recently a thread ran, the more of its data and code will be hot in the cache. Reasonable implementations try to give the mutex to the thread that held it most recently, most of the time.
Consider two threads that do this:
1. Acquire a mutex.
2. Adjust some data.
3. Release the mutex.
4. Go to step 1.
Now imagine two threads running this code on a single core CPU. It should be clear that FIFO mutex behavior would result in one "adjust some data" per context switch -- the worst possible outcome.
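For concreteness, the loop in those steps might look like the sketch below. The names (adjust_loop, run_two_adjusters, ITERATIONS) are illustrative assumptions, and the code only demonstrates the locking pattern; it does not measure context switches.

```c
#include <assert.h>
#include <pthread.h>

#define ITERATIONS 100000

static pthread_mutex_t data_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter = 0;

/* Each thread repeats: acquire the mutex, adjust some data, release it. */
static void *adjust_loop(void *arg) {
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&data_lock);
        shared_counter++;               /* "adjust some data" */
        pthread_mutex_unlock(&data_lock);
    }
    return NULL;
}

/* Runs two such threads to completion and returns the final counter. */
long run_two_adjusters(void) {
    pthread_t a, b;
    int rc = pthread_create(&a, NULL, adjust_loop, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, adjust_loop, NULL);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_counter;
}
```

With a non-FIFO mutex, the thread that just released the lock can often re-acquire it immediately and run many iterations per time slice; a strictly FIFO mutex would force a hand-off to the other thread on every iteration.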
Of course, reasonable implementations generally do give some nod to fairness. We don't want one thread to make no forward progress. But that hardly justifies a FIFO implementation!
In a past question, I asked about implementing pthread barriers without destruction races:
How can barriers be destroyable as soon as pthread_barrier_wait returns?
and received a perfect solution from Michael Burr for process-local barriers, but one that fails for process-shared barriers. We later worked through some ideas, but never reached a satisfactory conclusion, and didn't even begin to get into resource failure cases.
Is it possible on Linux to make a barrier that meets these conditions:
Process-shared (can be created in any shared memory).
Safe to unmap or destroy the barrier from any thread immediately after the barrier wait function returns.
Cannot fail due to resource allocation failure.
Michael's attempt at solving the process-shared case (see the linked question) has the unfortunate property that some kind of system resource must be allocated at wait time, meaning the wait can fail. And it's unclear what a caller could reasonably do when a barrier wait fails, since the whole point of the barrier is that it's unsafe to proceed until the remaining N-1 threads have reached it...
A kernel-space solution might be the only way, but even that's difficult due to the possibility of a signal interrupting the wait with no reliable way to resume it...
This is not possible with the Linux futex API, and I think this can be proven as well.
We have here essentially a scenario in which N processes must be reliably awoken by one final process, and further no process may touch any shared memory after the final awakening (as it may be destroyed or reused asynchronously). While we can awaken all processes easily enough, the fundamental race condition is between the wakeup and the wait; if we issue the wakeup before the wait, the straggler never wakes up.
The usual solution to something like this is to have the straggler check a status variable atomically with the wait; this allows it to avoid sleeping at all if the wakeup has already occurred. However, we cannot do this here - as soon as the wakeup becomes possible, it is unsafe to touch shared memory!
One other approach is to actually check if all processes have gone to sleep yet. However, this is not possible with the Linux futex API; the only indication of number of waiters is the return value from FUTEX_WAKE; if it returns less than the number of waiters you expected, you know some weren't asleep yet. However, even if we find out we haven't woken enough waiters, it's too late to do anything - one of the processes that did wake up may have destroyed the barrier already!
So, unfortunately, this kind of immediately-destroyable primitive cannot be constructed with the Linux futex API.
Note that in the specific case of one waiter, one waker, it may be possible to work around the problem; if FUTEX_WAKE returns zero, we know nobody has actually been awoken yet, so you have a chance to recover. Making this into an efficient algorithm, however, is quite tricky.
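To make the FUTEX_WAKE return value concrete, here is a small Linux-only sketch. glibc provides no futex() wrapper, so the call goes through syscall(); the helper names futex_wake and woke_everyone are assumptions made up for this illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <limits.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Wake up to max_waiters threads sleeping on addr.
   Returns the number of waiters actually woken, or -1 on error. */
long futex_wake(uint32_t *addr, int max_waiters) {
    assert(addr != NULL);
    return syscall(SYS_futex, addr, FUTEX_WAKE, max_waiters, NULL, NULL, 0);
}

/* Returns nonzero if exactly expected_waiters threads were asleep and woken.
   A smaller count means some threads had not reached their wait yet, but by
   the time the caller learns this, a woken thread may already have moved on. */
int woke_everyone(uint32_t *addr, int expected_waiters) {
    long woken = futex_wake(addr, INT_MAX);
    return woken == expected_waiters;
}
```

The count is only available after the wakeup has happened, which is exactly the problem described above: there is no way to ask beforehand how many threads are asleep.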
It's tricky to add a robust extension to the futex model that would fix this. The basic problem is, we need to know when N threads have successfully entered their wait, and atomically awaken them all. However, any of those threads may leave the wait to run a signal handler at any time; indeed, the waker thread may leave the wait for signal handlers as well.
One possible way that may work, however, is an extension to the keyed event model in the NT API. With keyed events, threads are released from the lock in pairs; if you have a 'release' without a 'wait', the 'release' call blocks for the 'wait'.
This in itself isn't enough due to the issues with signal handlers; however, if we allow for the 'release' call to specify a number of threads to be awoken atomically, this works. You simply have each thread in the barrier decrement a count, then 'wait' on a keyed event on that address. The last thread 'releases' N - 1 threads. The kernel doesn't allow any wake event to be processed until all N-1 threads have entered this keyed event state; if any thread leaves the futex call due to signals (including the releasing thread), this prevents any wakeups at all until all threads are back.
After a long discussion with bdonlan on SO chat, I think I have a solution. Basically, we break the problem down into the two self-synchronized deallocation issues: the destroy operation and unmapping.
Handling destruction is easy: Simply make the pthread_barrier_destroy function wait for all waiters to stop inspecting the barrier. This can be done by having a usage count in the barrier, atomically incremented/decremented on entry/exit to the wait function, and having the destroy function spin waiting for the count to reach zero. (It's also possible to use a futex here, rather than just spinning, if you stick a waiter flag in the high bit of the usage count or similar.)
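A minimal sketch of the usage-count idea, using C11 atomics and the spinning variant (no futex). The struct and function names are hypothetical, and the barrier's real state is omitted.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical wrapper: only the usage count is shown. */
struct counted_barrier {
    atomic_int users;   /* threads currently inside the wait function */
    /* ... the barrier's actual state would live here ... */
};

/* Call on entry to the wait function. */
void barrier_enter(struct counted_barrier *b) {
    atomic_fetch_add(&b->users, 1);
}

/* Call as the very last access to the barrier in the wait function. */
void barrier_exit(struct counted_barrier *b) {
    int prev = atomic_fetch_sub(&b->users, 1);
    assert(prev > 0);
    (void)prev;
}

/* Destroy waits until no waiter is still inspecting the barrier. */
void barrier_destroy_wait(struct counted_barrier *b) {
    while (atomic_load(&b->users) != 0)
        sched_yield();  /* spin politely; a futex wait could replace this */
}
```

This covers destruction only; it does nothing for the unmapping case, since a waiter may still be about to execute barrier_exit() when another process unmaps the memory.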
Handling unmapping is also easy, but non-local: ensure that munmap or mmap with the MAP_FIXED flag cannot occur while barrier waiters are in the process of exiting, by adding locking to the syscall wrappers. This requires a specialized sort of reader-writer lock. The last waiter to reach the barrier should grab a read lock on the munmap rw-lock, which will be released when the final waiter exits (when decrementing the usage count results in a count of 0). munmap and mmap can be made reentrant (as some programs might expect, even though POSIX doesn't require it) by making the writer lock recursive. Actually, a sort of lock where readers and writers are entirely symmetric, and each type of lock excludes the opposite type of lock but not the same type, should work best.
Well, I think I can do it with a clumsy approach...
Have the "barrier" be its own process listening on a socket. Implement barrier_wait as:
open connection to barrier process
send message telling barrier process I am waiting
block in read() waiting for reply
Once N threads are waiting, the barrier process tells all of them to proceed. Each waiter then closes its connection to the barrier process and continues.
Implement barrier_destroy as:
open connection to barrier process
send message telling barrier process to go away
close connection
Once all connections are closed and the barrier process has been told to go away, it exits.
[Edit: Granted, this allocates and destroys a socket as part of the wait and release operations. But I think you can implement the same protocol without doing so; see below.]
First question: Does this protocol actually work? I think it does, but maybe I do not understand the requirements.
Second question: If it does work, can it be simulated without the overhead of an extra process?
I believe the answer is "yes". You can have each thread "take the role of" the barrier process at the appropriate time. You just need a master mutex, held by whichever thread is currently "taking the role" of the barrier process. Details, details... OK, so the barrier_wait might look like:
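lock(master_mutex);
++waiter_count;
if (waiter_count < N)
    /* NB: real code must re-check a predicate in a loop here,
       because condition waits can wake up spuriously */
    cond_wait(master_condition_variable, master_mutex);
else
    cond_broadcast(master_condition_variable);
--waiter_count;
/* last one out turns off the lights */
bool do_release = time_to_die && waiter_count == 0;
unlock(master_mutex);
if (do_release)
    release_resources();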
Here master_mutex (a mutex), master_condition_variable (a condition variable), waiter_count (an unsigned integer), N (another unsigned integer), and time_to_die (a Boolean) are all shared state allocated and initialized by barrier_init. waiter_count is initialized to zero, time_to_die to false, and N to the number of threads the barrier is waiting for.
Then barrier_destroy would be:
lock(master_mutex);
time_to_die = true;
bool do_release = waiter_count == 0;
unlock(master_mutex);
if (do_release)
    release_resources();
Not sure about all the details concerning signal handling etc... But the basic idea of "last one out turns off the lights" is workable, I think.
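For illustration, the pseudocode above could be turned into real pthread calls roughly as follows. This sketch makes several assumptions that are not in the answer: the names are hypothetical, the barrier is single-use, a released flag is added so waiters can loop on a predicate, and the barrier is heap-allocated and process-local, so it does not meet the process-shared or no-allocation requirements from the question.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

struct lo_barrier {              /* "last one out" barrier, single use */
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    unsigned waiter_count;       /* threads currently inside lo_barrier_wait */
    unsigned n;                  /* threads the barrier is waiting for */
    bool released;               /* added here: predicate for the wait loop */
    bool time_to_die;            /* set by lo_barrier_destroy */
};

struct lo_barrier *lo_barrier_create(unsigned n) {
    struct lo_barrier *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    pthread_mutex_init(&b->mutex, NULL);
    pthread_cond_init(&b->cond, NULL);
    b->waiter_count = 0;
    b->n = n;
    b->released = false;
    b->time_to_die = false;
    return b;
}

static void release_resources(struct lo_barrier *b) {
    pthread_cond_destroy(&b->cond);
    pthread_mutex_destroy(&b->mutex);
    free(b);
}

void lo_barrier_wait(struct lo_barrier *b) {
    pthread_mutex_lock(&b->mutex);
    if (++b->waiter_count < b->n) {
        while (!b->released)     /* loop guards against spurious wakeups */
            pthread_cond_wait(&b->cond, &b->mutex);
    } else {
        b->released = true;
        pthread_cond_broadcast(&b->cond);
    }
    assert(b->waiter_count > 0);
    --b->waiter_count;
    bool do_release = b->time_to_die && b->waiter_count == 0;
    pthread_mutex_unlock(&b->mutex);
    if (do_release)
        release_resources(b);    /* last one out turns off the lights */
}

void lo_barrier_destroy(struct lo_barrier *b) {
    pthread_mutex_lock(&b->mutex);
    b->time_to_die = true;
    bool do_release = b->waiter_count == 0;
    pthread_mutex_unlock(&b->mutex);
    if (do_release)
        release_resources(b);
}
```

As in the original pseudocode, lo_barrier_destroy must only be called by a thread that has already returned from lo_barrier_wait (or after all waits have completed); otherwise threads that have not yet arrived would touch freed memory.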
My current understanding of condition variables is that all blocked (waiting) threads are inserted into a basic FIFO queue, the first item of which is awakened when signal() is called.
Is there any way to modify this queue (or create a new structure) to perform as a priority queue instead? I've been thinking about it for a while, but most solutions I have end up being hampered by the existing queue structure inherent to C.V.'s and mutexes.
Thanks!
I think you should rethink what you're trying to do. If you're trying to optimize your performance, you're probably barking up the wrong tree.
pthread_cond_signal() isn't even guaranteed to unblock exactly one thread -- it's guaranteed to unblock at least one thread, so your code had better be able to handle the situation where multiple threads are unblocked simultaneously. The typical way to do this is for each thread to re-check the condition after becoming unblocked, and, if false, return to waiting again.
You could implement some sort of scheme where you kept your own priority queue of threads waiting, and each thread added itself to that queue immediately before it was to begin waiting, and then it would check the queue when unblocking, but this would add a lot of complexity and a lot of potential for serious problems (race conditions, deadlocks, etc.). It would also add a non-trivial amount of overhead.
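The re-check pattern typically looks like the sketch below. The shared state (an items counter) and the function names are assumptions chosen for illustration.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t items_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t items_cond = PTHREAD_COND_INITIALIZER;
static int items = 0;   /* the condition being waited on: items > 0 */

/* Blocks until an item is available, then consumes it. */
void take_item(void) {
    pthread_mutex_lock(&items_lock);
    /* A while loop, not an if: if several threads are unblocked by one
       signal, or a wakeup is spurious, the losers see items == 0 and wait again. */
    while (items == 0)
        pthread_cond_wait(&items_cond, &items_lock);
    assert(items > 0);  /* holds because we own the mutex here */
    items--;
    pthread_mutex_unlock(&items_lock);
}

/* Makes one item available and signals a waiter. */
void put_item(void) {
    pthread_mutex_lock(&items_lock);
    items++;
    pthread_cond_signal(&items_cond);
    pthread_mutex_unlock(&items_lock);
}
```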
Also, what happens if a higher-priority thread starts waiting on a condition variable at the same moment that condition variable is being signalled? Who gets unblocked, the newly arrived high-priority thread or the former highest priority thread?
The order that threads get unblocked in is entirely dependent on the kernel's thread scheduler, so you are at its mercy. I wouldn't even assume FIFO ordering, either.
Since condition variables are basically just a barrier and you have no control over the queue of waiting threads there's no real way to apply priorities. It's invalid to assume waiting threads will act in a FIFO manner.
With a combination of atomics, additional condition variables, and pre-knowledge of the threads/priorities involved you could construct a solution where a signaled thread will re-signal the master CV and then re-block on a priority CV but it certainly wouldn't be a generic solution. That's also off the top of my head so might also have some other flaw.
It's the scheduler that determines which thread will run. You can look at pthread_setschedparam and pthread_getschedparam and fiddle with the policies (SCHED_OTHER, SCHED_FIFO, or SCHED_RR) and the priorities. But it probably won't get you to where I suspect you want to go.
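As a rough sketch of what that fiddling looks like, the helpers below (hypothetical names) request SCHED_FIFO for a thread and read back its current policy. On Linux, switching to a realtime policy normally requires privileges, so the call commonly fails with EPERM for an ordinary user.

```c
#include <assert.h>
#include <errno.h>    /* EPERM, for callers that inspect the result */
#include <pthread.h>
#include <sched.h>

/* Tries to put the thread under SCHED_FIFO at the lowest realtime priority.
   Returns 0 on success or an errno-style code such as EPERM. */
int try_make_fifo(pthread_t thread) {
    struct sched_param param = {
        .sched_priority = sched_get_priority_min(SCHED_FIFO)
    };
    return pthread_setschedparam(thread, SCHED_FIFO, &param);
}

/* Returns the thread's current policy (SCHED_OTHER, SCHED_FIFO, SCHED_RR, ...). */
int current_policy(pthread_t thread) {
    int policy;
    struct sched_param param;
    int rc = pthread_getschedparam(thread, &policy, &param);
    assert(rc == 0);  /* should only fail for an invalid thread ID */
    (void)rc;
    return policy;
}
```

A caller would check the return value of try_make_fifo(pthread_self()) and fall back to the default policy if it is nonzero.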
It sounds as if you want to make something predictable from the inherently non-deterministic. As Andrew notes, you might hack something together, but my guess is that this will lead to heartache or a lot of code you will hate yourself for writing in six months (or both).