I'm a beginner in C and multithreaded programming.
My textbook describes a readers-writers problem that favors readers: it requires that no reader be kept waiting unless a writer has already been granted permission to use the object. In other words, no reader should wait simply because a writer is waiting. Below is the code
where
void P(sem_t *s); /* Wrapper function for sem_wait */
void V(sem_t *s); /* Wrapper function for sem_post */
and
The w semaphore controls access to the critical sections that access the shared object. The mutex semaphore protects access to the shared readcnt variable, which counts the number of readers currently in the critical section.
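For reference, the reader-favoring scheme described above can be sketched roughly as follows. This is a minimal sketch, not the textbook's listing: P, V, mutex, w and readcnt follow the description, while the shared counter, MAX_THREADS and run_readers_writers are hypothetical names added for illustration, and a Linux-like POSIX system with unnamed semaphores is assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

#define MAX_THREADS 16

static sem_t mutex;   /* protects readcnt */
static sem_t w;       /* protects the shared object */
static int readcnt;   /* readers currently in the critical section */
static long shared;   /* hypothetical shared object: a simple counter */

/* Wrappers in the spirit of the textbook's P and V. */
static void P(sem_t *s) { while (sem_wait(s) == -1 && errno == EINTR) { } }
static void V(sem_t *s) { sem_post(s); }

static void *reader(void *arg)
{
    long *seen = arg;

    P(&mutex);
    if (++readcnt == 1)  /* first reader in: lock out writers */
        P(&w);
    V(&mutex);

    *seen = shared;      /* critical section: reading only */

    P(&mutex);
    if (--readcnt == 0)  /* last reader out: let writers in */
        V(&w);
    V(&mutex);
    return NULL;
}

static void *writer(void *arg)
{
    (void)arg;
    P(&w);
    shared++;            /* critical section: writing */
    V(&w);
    return NULL;
}

/* Starts n readers and n writers; returns the final value of shared. */
long run_readers_writers(int n)
{
    pthread_t rd[MAX_THREADS], wr[MAX_THREADS];
    long seen[MAX_THREADS];
    int i, rc;

    assert(n > 0 && n <= MAX_THREADS);
    sem_init(&mutex, 0, 1);
    sem_init(&w, 0, 1);
    readcnt = 0;
    shared = 0;
    for (i = 0; i < n; i++) {
        rc = pthread_create(&wr[i], NULL, writer, NULL);
        assert(rc == 0);
        rc = pthread_create(&rd[i], NULL, reader, &seen[i]);
        assert(rc == 0);
    }
    for (i = 0; i < n; i++) {
        pthread_join(wr[i], NULL);
        pthread_join(rd[i], NULL);
    }
    sem_destroy(&mutex);
    sem_destroy(&w);
    return shared;
}
```

Only the first reader ever waits on w. Readers arriving while other readers are inside touch mutex alone, which is why a waiting writer cannot hold them up.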
I don't quite understand what the textbook means. It seems to me that if there is a reader, then the writer won't be able to update the shared object. But my textbook says 'no reader should wait simply because a writer is waiting'. When a writer is writing, it just locks w, which doesn't do anything to stop readers except for the first reader?
Let's say there are four threads: T1, T2, T3, T4.
T1 is the writer and T2, T3, T4 are readers.
Now let's assume that T1 gets scheduled first and locks the semaphore w, and once done it releases w.
After this, let's assume T2 gets scheduled. It locks the semaphore mutex, increments the global variable readcnt and, since it is the first reader, locks the semaphore w as well. It then releases the semaphore mutex and enters the critical section.
Now if T3 gets scheduled, it acquires the semaphore mutex, increments the global variable readcnt, releases the semaphore mutex, and then enters the critical section.
Now if T1 gets scheduled again, it cannot acquire w since it is held by the reader thread T2. T1 cannot acquire w until the last reader thread exits. If the writer T1 is waiting and at the same time T4 gets scheduled, T4 will execute, as it does not need to lock w.
So the textbook says no reader should wait simply because a writer is waiting.
But as Leonard said in the other answer, if the writer is already writing, you can't interrupt that process. So T2, T3, T4 have to wait for T1 to release w.
As in the first comment made, if the writer is already writing, you can't interrupt that process. The case you need to deal with is if something is happening (reading or writing) and then a writer request arrives, and then a reader request arrives while the original lock is being held, you need to ensure that the reader request gets serviced first. This is a tiny bit complex, but you're trying to learn, so thinking about how to do this is really the assignment. Go for it! Try some code. If you are truly stuck, ask again.
How to implement the Reader Writer problem, where only one reader is allowed at a time, and only if no writer wants to modify the shared structure?
Reader:
wait(mutex)
wait(w)
// Read
signal(w)
signal(mutex)
Writer:
wait(w)
wait(mutex)
// Write
signal(w)
signal(mutex)
Does this solution make any sense?
Thread priorities are your friend here, plus if you're being strict about it the PREEMPT_RT kernel patch set too. Make the writer a higher priority than the readers.
I'm presuming you have two semaphores to a) guard access to the structure (mutex), and b) flag that the structure has been updated (w). In which case you don't need to wait for w in the writer, and you don't need to signal w in the reader. The reader should wait for w, and then wait for mutex, read, then post mutex. The writer should wait for mutex, write, and then signal mutex and w.
The priority of the writer thread and the PREEMPT_RT kernel (which resolves priority inversions) mean that the writer will be given mutex ASAP, no matter what the reader is doing (in fact the priority of the reader will be temporarily bumped upwards to ensure that it gets to the point of signalling mutex as quickly as possible).
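The two-semaphore arrangement described in this answer might look roughly like the sketch below. It assumes mutex starts at 1 and w starts at 0. The names structure, writer_thread, reader_thread and signal_update_demo are hypothetical, and thread priorities and PREEMPT_RT are deliberately left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t mutex;    /* guards the structure, initial value 1 */
static sem_t w;        /* "structure has been updated" flag, initial value 0 */
static int structure;  /* hypothetical shared structure: a single int */

static void wait_on(sem_t *s)
{
    while (sem_wait(s) == -1 && errno == EINTR) { }
}

static void *writer_thread(void *arg)
{
    wait_on(&mutex);
    structure = *(int *)arg;  /* write */
    sem_post(&mutex);
    sem_post(&w);             /* flag the update for the reader */
    return NULL;
}

static void *reader_thread(void *arg)
{
    wait_on(&w);              /* sleep until an update is flagged */
    wait_on(&mutex);
    *(int *)arg = structure;  /* read */
    sem_post(&mutex);
    return NULL;
}

/* Runs one reader and one writer; returns the value the reader saw. */
int signal_update_demo(int value)
{
    pthread_t rd, wr;
    int seen = -1;
    int rc;

    sem_init(&mutex, 0, 1);
    sem_init(&w, 0, 0);
    rc = pthread_create(&rd, NULL, reader_thread, &seen);
    assert(rc == 0);
    rc = pthread_create(&wr, NULL, writer_thread, &value);
    assert(rc == 0);
    pthread_join(rd, NULL);
    pthread_join(wr, NULL);
    sem_destroy(&mutex);
    sem_destroy(&w);
    return seen;
}
```

The reader is started first on purpose: it simply sleeps on w until the writer has finished, so the order in which the threads are scheduled does not matter.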
I am learning about POSIX threads and my professor has started teaching about the first readers-writers problem. This is the pseudocode I have about solving the problem (only for the first case: reader's preference).
semaphore rw_mutex = 1; /* semaphore common to both reader & writer */
semaphore mutex = 1; /* semaphore for reading (reader lock) */
int read_count = 0; /* track number of readers in CS */
Writer:
do {
lock(rw_mutex);
/* ensure no writer or reader can enter */
...
/* writing is performed */
...
unlock(rw_mutex);
/* release lock */
} while (true);
Reader:
do
{
lock(mutex);
/* first update read_count atomically */
read_count++;
if (read_count == 1) {
lock(rw_mutex);
/* ensure no writer can enter */
}
unlock(mutex);
/* allow other readers to access */
...
/* reading is performed */
...
lock(mutex);
read_count--;
if (read_count == 0) unlock(rw_mutex);
/* allow writers after last reader has left the CS */
unlock(mutex);
/* release lock */
} while(true);
First of all this is my understanding of mutex locks: Once we create a lock and unlock pair, the code between these two entities can only be accessed by a single thread at a time.
Now if my understanding is right, then I can pretty much understand what's happening in the Writer section of the above pseudocode. We are locking and then writing to the shared resource and in the meanwhile, no one can access the shared resource since it's locked and then we simply unlock it.
But I have problems understanding the reader part. If we lock once, it means that it's locked for good until we unlock it again right? In that case, what's the use of locking twice in reader's section?
My main question is this:
What does locking mean? and what's the difference between say lock(rw_mutex) and lock(mutex) in the above pseudocode? If once we call a lock, the program should lock it regardless of what parameter we pass in right? So what do these parameters: rw_mutex and mutex mean here? How does multiple mutex locking work?
The way to think about mutexes is like this: a mutex is like a token that at any point in time can either be held by one thread, or available for any thread to take.
When a thread calls lock() on a mutex, it is attempting to take the mutex: if the mutex is available ("unlocked") then it will take it straight away, otherwise if it is currently held by another thread ("locked") then it will wait until it is available.
When a thread calls unlock() on a mutex, it is returning a mutex that it currently holds so that it is available for another thread to take.
If you have more than one mutex, each mutex is independent: a thread can hold neither, one or both of them.
In your Reader, a thread first acquires mutex. While mutex is owned by the thread, no other thread can acquire mutex, so no other thread can be executing between either of the lock(mutex); / unlock(mutex); pairs (one at the top of the Reader function and one further down). Because read_count is only ever accessed within such a pair (while mutex is held), we know that only one thread will access read_count at a time.
If the Reader has just incremented read_count from zero to one, it will also acquire the rw_mutex mutex. This prevents any other thread from acquiring that mutex until it has been released, which has the effect of preventing Writer from proceeding into its critical section.
This code effectively passes ownership of the rw_mutex from the thread that locked it in Reader to any remaining readers in the critical section, when that thread leaves the critical section. This is just a matter of the code logic - no actual call is required to do this (and it's only possible because it is using a semaphore to implement rw_mutex, and not for example a pthreads mutex, which must be released by the thread that locked it).
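The point about ownership can be checked with a small experiment. In this sketch (the names last_reader_out and handoff_demo are made up for illustration) one thread takes a POSIX semaphore and a different thread releases it, which is what happens when the first reader in and the last reader out are not the same thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t rw_mutex;

/* Plays the last reader to leave: it posts a semaphore it never waited on. */
static void *last_reader_out(void *arg)
{
    (void)arg;
    sem_post(&rw_mutex);
    return NULL;
}

/* Returns the semaphore value after the cross-thread release. */
int handoff_demo(void)
{
    pthread_t t;
    int value = -1;
    int rc;

    sem_init(&rw_mutex, 0, 1);
    rc = sem_wait(&rw_mutex);   /* "first reader": value goes 1 -> 0 */
    assert(rc == 0);
    rc = pthread_create(&t, NULL, last_reader_out, NULL);
    assert(rc == 0);
    pthread_join(t, NULL);
    sem_getvalue(&rw_mutex, &value);  /* 1 again: a writer could now take it */
    sem_destroy(&rw_mutex);
    return value;
}
```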
I have read the posts on this topic on Stackoverflow but couldn't understand the gist. Maybe we can limit their difference to a specific example.
There is a toilet with a lock.
Mutex: One thread takes the key and goes in. If any other threads need to enter the toilet, they wait. The current owner comes out and gives the key to the guard (OS kernel), who gives ownership of the toilet to another person.
Problem Statement: I see that all the people agree that the shared resource must be unlocked by the same mutex in that thread that locked it. But for a binary semaphore, it can be unlocked in any other thread as well.
Now please consider the implementation of a semaphore.
First person reaches the toilet, executes the wait statement, and the value of the semaphore structure goes from 1 to 0. Now if any other person(other thread) comes and executes the wait statement, it will block because the 'value = 0'. So why is it always said that any other thread can unlock the toilet/Critical section specially when no other thread can enter the critical section?
A mutex has thread-affinity. Only the thread that acquired the mutex can release it. A semaphore doesn't have affinity. This is a nice property of mutex, it avoids accidents and can tell you when you got it wrong. A mutex can also be recursive, allowing the same thread to acquire it more than once. A countermeasure against accidental deadlock.
Useful properties, you need all the help you can get when writing concurrent code. But sure, a semaphore can get the job done too.
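With pthreads, both properties can be observed directly, assuming the mutex is created with the matching type attribute (a default mutex gives no such guarantees). The function names below are hypothetical; the sketch only shows an error-checking mutex refusing an unlock from a non-owner and a recursive mutex accepting a second lock from its owner.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t m;

static void make_mutex(int type)
{
    pthread_mutexattr_t attr;
    int rc;

    pthread_mutexattr_init(&attr);
    rc = pthread_mutexattr_settype(&attr, type);
    assert(rc == 0);
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);
}

static void *try_unlock(void *arg)
{
    *(int *)arg = pthread_mutex_unlock(&m);  /* caller is not the owner */
    return NULL;
}

/* Returns what a non-owner gets back from pthread_mutex_unlock. */
int foreign_unlock_result(void)
{
    pthread_t t;
    int result = -1;

    make_mutex(PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_lock(&m);
    pthread_create(&t, NULL, try_unlock, &result);
    pthread_join(t, NULL);
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return result;
}

/* Returns 1 if the owner could lock a recursive mutex twice. */
int recursive_lock_twice(void)
{
    int first, second;

    make_mutex(PTHREAD_MUTEX_RECURSIVE);
    first = pthread_mutex_lock(&m);
    second = pthread_mutex_lock(&m);  /* same thread again: no deadlock */
    pthread_mutex_unlock(&m);
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return first == 0 && second == 0;
}
```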
A binary semaphore is a regular semaphore that can only have a value of 0 (resource is unavailable) or 1 (resource is available), there is no difference between this and a mutex (lock).
After a process enters the toilet the semaphore is decremented and any other people who wait on the semaphore are blocked from entering the critical section. The blocked processes/threads/whatever are usually held in a sort of queue ; when the process leaves the toilet the first process waiting will be awoken.
I'm not sure where you read that threads can unlock the critical section when the semaphore is 0.
(note, there could be implementation differences)
There really is no difference whatsoever between a binary semaphore and a mutex.
The only conceptual difference between those is a mutex can represent only one key, while the semaphore may represent multiple (numbered) keys.
Take this example.
There are 5 toilets:
When you request a semaphore you get a key to a specific toilet.
When you exit that toilet is lined up with the remaining free ones.
If all toilets are full then no one can enter.
If you were to solve this with a mutex then the mutex would be protecting the keybox and you would have to continuously check if a key is available, but a semaphore can represent the keyset.
While reading about binary semaphore and mutex I found the following difference:
Both can have value 0 and 1, but mutex can be unlocked by the same
thread which has acquired the mutex lock. A thread which acquires
mutex lock can have priority inversion in case a higher priority
process wants to acquire the same mutex whereas this is not the case
with binary semaphore.
So where should I use binary semaphores? Can anyone cite an example?
EDIT: I think I have figured out the working of both. Basically binary semaphore offer synchronization whereas mutex offer locking mechanism. I read some examples from Galvin OS book to make it more clear.
One typical situation where I find binary semaphores very useful is for thread initialization where the thread will read from a structure owned by the parent thread. The parent thread needs to wait for the new thread to read the shared data from the structure before it can let the structure's lifetime end (by leaving its scope, for instance). With a binary semaphore, all you have to do is initialize the semaphore value to zero and have the child post it while the parent waits on it. Without semaphores, you'd need a mutex and condition variable and much uglier program logic for using them.
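A sketch of that start-up handshake with POSIX semaphores could look like this. The struct layout and the names start_args, child, spawn_child and child_saw_id are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

struct start_args {
    int id;        /* data the child needs to read */
    sem_t copied;  /* starts at 0; the child posts once it has its copy */
};

static int child_saw_id;  /* only here so the result can be inspected */

static void *child(void *p)
{
    struct start_args *a = p;
    int my_id = a->id;      /* copy out of the parent's stack frame */

    sem_post(&a->copied);   /* from here on, *a must not be touched */
    child_saw_id = my_id;
    return NULL;
}

/* Starts a child with arguments that live only on this function's stack. */
pthread_t spawn_child(int id)
{
    struct start_args args;
    pthread_t t;
    int rc;

    args.id = id;
    sem_init(&args.copied, 0, 0);
    rc = pthread_create(&t, NULL, child, &args);
    assert(rc == 0);
    while (sem_wait(&args.copied) == -1 && errno == EINTR) { }
    sem_destroy(&args.copied);
    return t;               /* args goes out of scope; the child has its copy */
}
```

The caller still has to join or detach the returned thread. The semaphore only covers the lifetime of the argument structure.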
In almost all cases I use binary semaphore to signal other thread without locking.
Simple example of usage for synchronous request:
Thread 1:
Semaphore sem;
request_to_thread2(&sem); // Function sending request to thread2 in any fashion
sem.wait(); // Waiting request complete
Thread 2:
Semaphore *sem;
process_request(sem); // Process request from thread 1
sem->post(); // Signal thread 1 that request is completed
Note: before posting the semaphore, thread 2 can safely write to thread 1's data without any additional synchronization.
The canonical example for using a counted semaphore instead of a binary mutex is when you have a limited number of resources available that are a) interchangeable and b) more than one.
For instance, if you want to allow a maximum of 10 readers to access a database at once, you can use a counted semaphore initialized to 10 to limit access to the resource. Each reader must acquire the semaphore before accessing the resource, decrementing the available count. Once the count reaches 0 (i.e. 10 readers have gained access to, and are still using, the database), all other readers are locked out. Once a reader finishes, it bumps the semaphore count back up by one to indicate that it is no longer using the resource, and some other reader may now obtain the semaphore and gain access in its stead.
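As a rough illustration of the 10-reader limit, the sketch below starts more threads than there are slots and records the highest number that were inside at once. The database access is faked with a short sleep, and all names (slots, db_reader, peak_concurrent_readers) are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <time.h>

#define MAX_READERS 10
#define MAX_THREADS 64

static sem_t slots;        /* counting semaphore, starts at MAX_READERS */
static atomic_int inside;  /* readers currently using the "database" */
static atomic_int peak;    /* highest value inside ever reached */

static void *db_reader(void *arg)
{
    struct timespec pause = { 0, 1000000L };  /* 1 ms of pretend work */
    int now, old;

    (void)arg;
    while (sem_wait(&slots) == -1 && errno == EINTR) { }  /* take a slot */
    now = atomic_fetch_add(&inside, 1) + 1;
    old = atomic_load(&peak);
    while (now > old && !atomic_compare_exchange_weak(&peak, &old, now)) { }
    nanosleep(&pause, NULL);
    atomic_fetch_sub(&inside, 1);
    sem_post(&slots);                                     /* give it back */
    return NULL;
}

/* Runs nthreads readers; returns the peak number inside at the same time. */
int peak_concurrent_readers(int nthreads)
{
    pthread_t t[MAX_THREADS];
    int i, rc;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    sem_init(&slots, 0, MAX_READERS);
    atomic_store(&inside, 0);
    atomic_store(&peak, 0);
    for (i = 0; i < nthreads; i++) {
        rc = pthread_create(&t[i], NULL, db_reader, NULL);
        assert(rc == 0);
    }
    for (i = 0; i < nthreads; i++)
        pthread_join(t[i], NULL);
    sem_destroy(&slots);
    return atomic_load(&peak);
}
```

However the threads are scheduled, the returned peak cannot exceed MAX_READERS, because a thread only counts itself as inside while it holds a slot.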
However, the counted semaphore, just like all other synchronization primitives, has many use cases and it's just a matter of thinking outside the box. You may find that many problems you are used to solving with a mutex plus additional logic can be more-easily and more-straightforwardly implemented with a semaphore. A mutex is a subset of the semaphore, that is to say, anything you can do with a mutex can be done with a semaphore (simply set the count to one), but that there are things that can be done with a semaphore alone that cannot be done with just a mutex.
At the end of the day, any one synchronization primitive is generally enough to do anything (think of it as being "turing-complete" for thread synchronization, to bastardize that word). However, each is tailor-fit to a different application, and while you may be able to force one to do your bidding with some customization and additional glue, it is possible that a different synchronization primitive is better-fit for the job.
The Story
There is a writer thread, periodically gathering data from somewhere (in real-time, but that doesn't matter much in the question). There are many readers then reading from these data. The usual solution for this is with two reader-writer locks and two buffers, like this:
Writer (case 1):
acquire lock 0
loop
    write to current buffer
    acquire other lock
    free this lock
    swap buffers
    wait for next period
Or
Writer (case 2):
acquire lock 0
loop
    acquire other lock
    free this lock
    swap buffers
    write to current buffer
    wait for next period
The Problem
In both methods, if the acquire other lock operation fails, no swap is done and the writer would overwrite its previous data (because the writer is real-time, it can't wait for readers). So in this case, all readers would lose that frame of data.
This is not such a big deal though, the readers are my own code and they are short, so with double buffer, this problem is solved, and if there was a problem I could make it triple buffer (or more).
The problem is the delay that I want to minimize. Imagine case 1:
writer writes to buffer0                reader is reading buffer1
writer can't acquire lock1 because reader is still reading buffer1
|                                       |
|                                       reader finishes reading,
| (writer waiting for next period) <- **this point**
|
|
writer wakes up, and again writes to buffer0
At **this point**, other readers in theory could have read data of buffer0 if only the writer could do the swap after the reader finishes instead of waiting for its next period. What happened in this case is that just because one reader was a bit late, all readers missed one frame of data, while the problem could have been totally avoided.
Case 2 is similar:
writer writes to buffer0                reader is idle
|                                       |
|                                       reader finishes reading,
| (writer waiting for next period)
|
|                                       reader starts reading buffer1
writer wakes up                         |
it can't acquire lock1 because reader is still reading buffer1
overwrites buffer0
I tried mixing the solutions, so the writer tries swapping buffers immediately after writing, and if not possible, just after waking up in the next period. So something like this:
Writer (case 3):
acquire lock 0
loop
    if last buffer swap failed
        acquire other lock
        free this lock
        swap buffers
    write to current buffer
    acquire other lock
    free this lock
    swap buffers
    wait for next period
Now the problem with delay still holds:
writer writes to buffer0                reader is reading buffer1
writer can't acquire lock1 because reader is still reading buffer1
|                                       |
|                                       reader finishes reading,
| (writer waiting for next period) <- **this point**
|
|
writer wakes up
swaps buffers
writes to buffer1
Again at **this point**, all the readers could start reading buffer0, which is a short delay after buffer0 has been written, but instead they have to wait until the next period of the writer.
The Question
The question is, how do I handle this? If I want the writer to execute precisely at desired period, it needs to wait for the period using RTAI function and I can't do it like
Writer (case 4):
acquire lock 0
loop
    write to current buffer
    loop a few times or until the buffer has been swapped
        sleep a little
        acquire other lock
        free this lock
        swap buffers
    wait for next period
This introduces jitter, because the "few times" could happen to become longer than the "wait for next period", so the writer might miss the start of its period.
Just to be more clear, here's what I want to happen:
writer writes to buffer0                reader is reading buffer1
|                                       |
|                                       reader finishes reading,
| (writer waiting for next period)      As soon as all readers finish reading,
|                                       the buffer is swapped
|                                       readers start reading buffer0
writer wakes up                         |
writes to buffer1
What I Found Already
I found read-copy-update which, as far as I understood, keeps allocating memory for buffers and frees them once the readers are done with them, which is impossible for me for many reasons. First, the threads are shared between kernel and user space. Second, with RTAI, you can't allocate memory in a real-time thread, because then your thread would be calling Linux's system calls and hence break the real-time-itivity! (Not to mention that using Linux's own RCU implementation is useless for the same reasons.)
I also thought about having an extra thread that at a higher frequency tries swapping buffers, but that doesn't sound like such a good idea. First, it would itself need to synchronize with the writer, and second, well I have many of these writer-readers working in different parts in parallel and one extra thread for each writer just seems too much. One thread for all writers seems very complicated regarding synchronization with each writer.
What API are you using for reader-writer locks? Do you have a timed lock, like pthread_rwlock_timedwrlock? If yes, I think it's a solution to your problem, as in the following code:
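void *buf[2];

/* Writer: holds the write lock on the buffer it is filling and tries to
   move on to the other buffer, giving up when the next period starts. */
void
writer (void)
{
  int lock = 0, next = 1;

  write_lock (lock);
  while (1)
    {
      abs_time tm = now () + period;   /* absolute start of the next period */

      fill (buf[lock]);
      if (timed_write_lock (next, tm)) /* true if acquired before tm */
        {
          unlock (lock);
          lock = next;
          next = (next + 1) & 1;
        }
      wait_period (tm);
    }
}

/* Reader: alternates between the two buffers, one read lock at a time. */
void
reader (void)
{
  int lock = 0;

  while (1)
    {
      read_lock (lock);
      process (buf[lock]);
      unlock (lock);
      lock = (lock + 1) & 1;
    }
}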
What happens here, is that it does not really matter for the writer whether it waits for a lock or for the next period, as long as it is sure to wake up before the next period has come. The absolute timeout ensures this.
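If the locks are POSIX reader-writer locks, the timed acquisition could be sketched as below. This assumes a Linux-like system where pthread_rwlock_timedwrlock is available and takes an absolute CLOCK_REALTIME deadline. Whether anything like it can be used from an RTAI thread is not something this sketch addresses, and deadline_in_ms, slow_reader and timed_swap_demo are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <time.h>

/* Absolute CLOCK_REALTIME deadline ms milliseconds from now. */
static struct timespec deadline_in_ms(long ms)
{
    struct timespec ts;

    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_sec += ms / 1000;
    ts.tv_nsec += (ms % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {
        ts.tv_sec++;
        ts.tv_nsec -= 1000000000L;
    }
    return ts;
}

static pthread_rwlock_t next_lock = PTHREAD_RWLOCK_INITIALIZER;
static sem_t reader_in, reader_may_leave;

/* A reader that holds the "other" buffer's lock until told to let go. */
static void *slow_reader(void *arg)
{
    (void)arg;
    pthread_rwlock_rdlock(&next_lock);
    sem_post(&reader_in);
    while (sem_wait(&reader_may_leave) == -1 && errno == EINTR) { }
    pthread_rwlock_unlock(&next_lock);
    return NULL;
}

/* Tries the timed write lock while a reader holds the lock, then again
   after the reader has left, and stores both return codes. */
void timed_swap_demo(int *while_reading, int *after_reading)
{
    pthread_t t;
    struct timespec tm;
    int rc;

    sem_init(&reader_in, 0, 0);
    sem_init(&reader_may_leave, 0, 0);
    rc = pthread_create(&t, NULL, slow_reader, NULL);
    assert(rc == 0);
    while (sem_wait(&reader_in) == -1 && errno == EINTR) { }

    tm = deadline_in_ms(50);  /* stands in for "start of next period" */
    *while_reading = pthread_rwlock_timedwrlock(&next_lock, &tm);

    sem_post(&reader_may_leave);
    pthread_join(t, NULL);

    tm = deadline_in_ms(50);
    *after_reading = pthread_rwlock_timedwrlock(&next_lock, &tm);
    if (*after_reading == 0)
        pthread_rwlock_unlock(&next_lock);
    sem_destroy(&reader_in);
    sem_destroy(&reader_may_leave);
}
```

The first attempt gives up at the deadline instead of blocking indefinitely, which is the property the writer relies on to start its next period on time.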
Isn't this exactly the problem triple buffering is supposed to solve? So you have 3 buffers; let's call them write1, write2, and read. The write thread alternates between writing to write1 and write2, ensuring that it never blocks, and that the last complete frame is always available. Then in the read threads, at some appropriate point (say, just before or after reading a frame), the read buffer is flipped with the available write buffer.
While this would ensure that writers never block (the buffer flipping can be done very quickly just by flipping two pointers, perhaps even with a CAS atomic instead of a lock), there is still the issue of readers having to wait for other readers to finish with the read buffer before flipping. I suppose this could be solved slightly RCU-esque with a pool of read buffers where an available one can be flipped.
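One common way to do the flip with a single atomic exchange is sketched below using C11 atomics. It assumes exactly one writer and one reader, so it does not solve the multiple-reader issue mentioned above. The payload is reduced to one int per buffer, and all names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DIRTY 4u  /* set in middle when it holds a frame not yet read */

struct triple {
    int frame[3];        /* hypothetical payload: one int per buffer */
    atomic_uint middle;  /* index of the spare buffer, possibly | DIRTY */
    unsigned back;       /* buffer the writer is filling (writer only) */
    unsigned front;      /* buffer the reader is using (reader only) */
};

void triple_init(struct triple *t)
{
    t->frame[0] = t->frame[1] = t->frame[2] = 0;
    t->back = 0;
    t->front = 1;
    atomic_init(&t->middle, 2u);
}

/* Writer: fill the back buffer, then swap it with the spare. Never blocks. */
void triple_publish(struct triple *t, int value)
{
    assert(t->back < 3);
    t->frame[t->back] = value;
    t->back = atomic_exchange(&t->middle, t->back | DIRTY) & 3u;
}

/* Reader: switch to the newest complete frame if there is one.
   Returns true when *out is a frame that was not seen before. */
bool triple_latest(struct triple *t, int *out)
{
    bool fresh = (atomic_load(&t->middle) & DIRTY) != 0;

    if (fresh)
        t->front = atomic_exchange(&t->middle, t->front) & 3u;
    *out = t->frame[t->front];
    return fresh;
}
```

If the writer publishes twice before the reader looks, the older frame is silently dropped and the reader gets the newer one.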
Use a Queue (FIFO linked list)
The real-time writer will always append (enqueue) to the end of the queue
The readers will always remove (dequeue) from the beginning of the queue
The readers will block if the queue is empty
edit to avoid dynamic allocation
I would probably use a circular queue...
I would use the built in __sync atomic operations.
http://gcc.gnu.org/onlinedocs/gcc-4.1.0/gcc/Atomic-Builtins.html#Atomic-Builtins
Circular queue (FIFO 2d array)
ex: byte[][] Array = new byte[MAX_SIZE][BUFFER_SIZE];
Start and End index pointers
Writer overwrites buffer at Array[End][]
Writer can increment Start if it ends up looping all the way around
Reader gets buffer from Array[Start][]
Reader blocks if Start == End
If you don't want the writer to wait, perhaps it shouldn't acquire a lock that anybody else might hold. I would have it perform some sort of synchronisation, though, to make sure that what it writes really is written out - typically, most synchronisation calls will cause a memory flush or barrier instruction to be executed, but the details will depend on the memory model of your cpu and the implementation of your threads package.
I would have a look to see if there is any other synchronisation primitive around that fits things better, but if push comes to shove I would have the writer lock and unlock a lock that nobody else ever uses.
Readers must then be prepared to miss things now and then, and must be able to detect when they have missed stuff. I would associate a validity flag and a long sequence count with each buffer, and have the writer do something like "clear validity flag, increment sequence count, sync, write to buffer, increment sequence count, set validity flag, sync." If the reader reads a sequence count, syncs, sees the validity flag true, reads the data out, syncs, and re-reads the same sequence count, then perhaps there is some hope that it did not get garbled data.
If you are going to do this, I would test it exhaustively. It looks plausible to me, but it might not work with your particular implementation of everything from compiler to memory model.
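The sequence-count idea is close to what is usually called a seqlock. The sketch below is a simplified variant rather than the exact protocol above: it folds the validity flag into the parity of the sequence count (odd means a write is in progress), assumes a single writer, and uses C11 atomics in place of generic sync calls. The names frame, frame_write and frame_try_read are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define FRAME_WORDS 4

struct frame {
    atomic_uint seq;               /* odd while a write is in progress */
    atomic_int data[FRAME_WORDS];  /* payload, accessed with relaxed atomics */
};

/* Single writer: never waits for readers. */
void frame_write(struct frame *f, const int *src)
{
    unsigned s = atomic_load_explicit(&f->seq, memory_order_relaxed);
    int i;

    assert((s & 1u) == 0);  /* no write may already be in progress */
    atomic_store_explicit(&f->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    for (i = 0; i < FRAME_WORDS; i++)
        atomic_store_explicit(&f->data[i], src[i], memory_order_relaxed);
    atomic_store_explicit(&f->seq, s + 2, memory_order_release);
}

/* Reader: returns true only if dst holds one complete, consistent frame.
   On false the caller retries or accepts having missed this frame. */
bool frame_try_read(struct frame *f, int *dst)
{
    unsigned s0 = atomic_load_explicit(&f->seq, memory_order_acquire);
    unsigned s1;
    int i;

    for (i = 0; i < FRAME_WORDS; i++)
        dst[i] = atomic_load_explicit(&f->data[i], memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    s1 = atomic_load_explicit(&f->seq, memory_order_relaxed);
    return s0 == s1 && (s0 & 1u) == 0;
}
```

Comparing the sequence number returned across successive reads would also let a reader notice how many frames it skipped, though this sketch does not expose it.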
Another idea, or a way to check this one, is to add a checksum to your buffer and write it last of all.
See also searches on lock free algorithms such as http://www.rossbencina.com/code/lockfree
To go with this, you probably want a way for the writer to signal to sleeping readers. You might be able to use Posix semaphores for this - e.g. have the reader ask the writer to call sem_post() on a particular semaphore when it reaches a given sequence number, or when a buffer becomes valid.
Another option is to stick with locking, but ensure that readers never hang too long holding a lock. Readers can keep the time taken holding a lock short and predictable by doing nothing else while they hold that lock but copying the data from the writer's buffer. The only problem then is that a low priority reader can be interrupted by a higher priority task halfway through a write, and the cure for that is http://en.wikipedia.org/wiki/Priority_ceiling_protocol.
Given this, if the writer thread has a high priority, the worst case work to be done per buffer is for the writer thread to fill the buffer and for each reader thread to copy the data out of that buffer to another buffer. If you can afford that in each cycle, then the writer thread and some amount of reader data copying will always be completed, while readers processing the data they have copied may or may not get their work done. If they do not, they will lag behind and will notice this when they next grab a lock and look round to see which buffer they want to copy.
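The copy-only-under-the-lock discipline might be sketched as follows. A plain pthread mutex stands in for whatever lock is actually used, the priority ceiling protocol is not configured here, and shared_buf, writer_fill and reader_snapshot are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define FRAME_BYTES 64

struct shared_buf {
    pthread_mutex_t lock;
    unsigned long seq;                /* frame number, bumped by the writer */
    unsigned char data[FRAME_BYTES];
};

/* Writer: the lock is held only while the buffer is being filled. */
void writer_fill(struct shared_buf *b, const unsigned char *src)
{
    assert(b != NULL && src != NULL);
    pthread_mutex_lock(&b->lock);
    memcpy(b->data, src, FRAME_BYTES);
    b->seq++;
    pthread_mutex_unlock(&b->lock);
}

/* Reader: does nothing under the lock except copy the frame out.
   Returns the frame number so the caller can tell whether it has lagged. */
unsigned long reader_snapshot(struct shared_buf *b, unsigned char *dst)
{
    unsigned long seq;

    assert(b != NULL && dst != NULL);
    pthread_mutex_lock(&b->lock);
    memcpy(dst, b->data, FRAME_BYTES);
    seq = b->seq;
    pthread_mutex_unlock(&b->lock);
    return seq;  /* all processing of dst happens after this, lock-free */
}
```

A reader that sees the returned frame number jump by more than one since its last snapshot knows it fell behind and missed frames.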
FWIW, my experience with reading real time code (when required to show that the bugs are there, and not in our code) is that it is incredibly and deliberately simple-minded, very clearly laid out, and not necessarily any more efficient than it needs to be to meet its deadlines, so some apparently pointless data-copying in order to get straightforward locking to work might be a good deal, if you can afford it.