I have a threaded server that can add/append/read files and relay data to the client.
If a file is being added, no other thread can append/read it. If a file is being appended, no threads can append/read it. If a file is being read, no other thread can append to it. However, if a file is being read, other threads can read it.
Currently I have a mutex system that will do this, except it won't allow multiple reads.
To fix this, in the read method, I will change:
pthread_mutex_lock(&(fm->mutex));//LOCK
//do some things
...
pthread_mutex_unlock(&(fm->mutex));
to
pthread_mutex_trylock(&(fm->mutex));//TRYLOCK [NonBlocking, so the thread can continue the read]
//do some things
...
pthread_mutex_unlock(&(fm->mutex));
Question
How can I unlock the file without allowing the other methods (just append really) to begin writing to the file before all the other read()'s have finished?
Example
For example, if the reading thread that originally locked the file completes and unlocks the file and there are still other threads trying to read the file, then an appending thread gets the chance to lock the file and begin appending while the others are still reading, which is a no-no.
Idea
I want to keep a count of the number of threads currently reading a file. When a thread finishes, reduce the count. If the count is 0, meaning no threads are still reading, unlock the file. But I'm worried that this would not be thread safe. If this is a viable solution, how could I make it thread safe? Another concern: I believe only the thread that originally locked the mutex can successfully unlock it.
It sounds like you may be looking for a read-write lock, which is provided by pthreads. It allows two modes of locking: a shared/read-lock mode, which can be locked by multiple threads at once, and an exclusive/write-lock mode, where the lock call won't return until all other threads (readers and writers) have given up their hold on the lock.
You could use a semaphore instead of the mutex (see this link about the differences). The semaphore does thread-safe synchronized counting for you.
You can live without an additional mutex to lock the file for writing if you limit the number of simultaneous read accesses to a (sufficiently large) number N and require a writer to take all N units of the semaphore for write access. This way you can only gain write access if the number of readers is zero, and all other readers will be locked out until your writer has finished.
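A minimal sketch of the semaphore idea above, assuming POSIX unnamed semaphores. The names `MAX_READERS`, `slots`, `read_begin()` and so on are made up for illustration. The `writer_gate` mutex is an addition that the answer does not require: with more than one writer, two writers could each hold part of the N units and block each other forever, so the sketch serializes writers before they start collecting units.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

#define MAX_READERS 16  /* assumed upper bound N on simultaneous readers */

static sem_t slots;     /* counts free reader slots, starts at N */
static pthread_mutex_t writer_gate = PTHREAD_MUTEX_INITIALIZER;

/* sem_wait() can be interrupted by a signal; retry in that case. */
static void sem_wait_retry(sem_t *s)
{
    while (sem_wait(s) == -1 && errno == EINTR)
        ;
}

int rw_sem_init(void)
{
    return sem_init(&slots, 0, MAX_READERS);
}

/* A reader takes one unit, so up to N readers can be inside at once. */
void read_begin(void) { sem_wait_retry(&slots); }
void read_end(void)   { sem_post(&slots); }

/* A writer takes all N units: it gets in only when no reader is inside,
 * and no reader can get in until it gives the units back. */
void write_begin(void)
{
    pthread_mutex_lock(&writer_gate);   /* one writer collects units at a time */
    for (int i = 0; i < MAX_READERS; i++)
        sem_wait_retry(&slots);
}

void write_end(void)
{
    for (int i = 0; i < MAX_READERS; i++)
        sem_post(&slots);
    pthread_mutex_unlock(&writer_gate);
}
```

A read path would call `read_begin()` before touching the file and `read_end()` afterwards; the append path would use `write_begin()` and `write_end()` the same way.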
Note that the POSIX documentation for pthread_mutex_lock() says:
If successful, the pthread_mutex_lock(), pthread_mutex_trylock(), and pthread_mutex_unlock() functions shall return zero; otherwise, an error number shall be returned to indicate the error.
Since you don't show your code testing the return values, you don't know whether your lock operations (in particular) succeeded or not.
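To make the point about return values concrete, here is a small sketch. The helper name `try_acquire()` is hypothetical. It shows that `pthread_mutex_trylock()` reports a busy mutex by returning `EBUSY`, in which case the caller does not own the lock and must not unlock it.

```c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: try to take the mutex without blocking.
 * Returns 1 if the caller now holds the mutex, 0 if it was busy,
 * and -1 on any other error (reported on stderr). */
int try_acquire(pthread_mutex_t *m)
{
    int rc = pthread_mutex_trylock(m);

    if (rc == 0)
        return 1;   /* we own the mutex and must unlock it later */
    if (rc == EBUSY)
        return 0;   /* someone holds it; we do NOT own it */

    /* pthread functions return the error number instead of setting errno. */
    fprintf(stderr, "pthread_mutex_trylock: %s\n", strerror(rc));
    return -1;
}
```

Code that ignores the result and goes on to read the file, then calls `pthread_mutex_unlock()`, is running unprotected whenever the result would have been 0.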
Separately, since you want a read/write lock, why not use one:
pthread_rwlock_rdlock()
pthread_rwlock_wrlock()
pthread_rwlock_unlock()
pthread_rwlock_init()
pthread_rwlock_destroy()
There are four pthread_rwlockattr_*() functions and a total of 9 pthread_rwlock_*() functions; I only listed the most important functions in the family.
Related
I am writing a program in which a memory array is modified by one thread under 2 possible operations (modify the array content, or dealloc the array and replace it by allocating a new array). The memory array can be read by many threads except when the array is modified or deallocated and replaced.
I know how to use mutex lock to allow the memory to be modified by only one thread at all time. How can I use it (or other multithreading tools in c) to allow arbitrary number of read threads to access the memory, except when the write thread modifies the memory?
The best solution to achieve this is using read-write locks, i.e. pthread_rwlock_*, as already answered in the comments above. Here is a little more detail about them.
Read-write locks are used for shared read access or exclusive write access. A thread that needs read access can't continue while any thread currently has write access. A thread that needs write access can't continue while any other thread has either write access or read access. When both readers and writers are waiting for access at the same time, the implementation has a default rule for which side gets precedence, and this rule can be changed.
The read-write lock functions and their arguments are explained much more clearly here:
https://docs.oracle.com/cd/E19455-01/806-5257/6je9h032u/index.html
There is a post on Stack Overflow itself about the same:
concurrent readers and mutually excluding writers in C using pthreads
Read-write locks may cause the writer thread to starve if the precedence is not properly defined and the implementation does not handle the case where many readers keep arriving while a writer is waiting. So read about that too:
How to prevent writer starvation in a read write lock in pthreads
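One way to see what "precedence" means is to build a small lock by hand. The sketch below is not the pthreads implementation; it is a hand-rolled, writer-preferring lock made from one mutex, one condition variable and a few counters, with invented names (`wp_rwlock`, `wp_read_lock()` and so on). New readers hold back as soon as any writer is waiting, so a steady stream of readers cannot starve the writer. The trade-off is that a steady stream of writers can now starve readers.

```c
#include <pthread.h>

struct wp_rwlock {
    pthread_mutex_t mu;      /* protects the counters below */
    pthread_cond_t  cv;      /* signalled whenever the lock state changes */
    int active_readers;      /* readers currently inside */
    int waiting_writers;     /* writers blocked in wp_write_lock() */
    int writer_active;       /* 1 while a writer is inside */
};

#define WP_RWLOCK_INIT \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0, 0 }

void wp_read_lock(struct wp_rwlock *l)
{
    pthread_mutex_lock(&l->mu);
    /* Writer preference: wait even if a writer is only queued. */
    while (l->writer_active || l->waiting_writers > 0)
        pthread_cond_wait(&l->cv, &l->mu);
    l->active_readers++;
    pthread_mutex_unlock(&l->mu);
}

void wp_read_unlock(struct wp_rwlock *l)
{
    pthread_mutex_lock(&l->mu);
    if (--l->active_readers == 0)
        pthread_cond_broadcast(&l->cv);   /* last reader out wakes writers */
    pthread_mutex_unlock(&l->mu);
}

void wp_write_lock(struct wp_rwlock *l)
{
    pthread_mutex_lock(&l->mu);
    l->waiting_writers++;
    while (l->writer_active || l->active_readers > 0)
        pthread_cond_wait(&l->cv, &l->mu);
    l->waiting_writers--;
    l->writer_active = 1;
    pthread_mutex_unlock(&l->mu);
}

void wp_write_unlock(struct wp_rwlock *l)
{
    pthread_mutex_lock(&l->mu);
    l->writer_active = 0;
    pthread_cond_broadcast(&l->cv);       /* wake both readers and writers */
    pthread_mutex_unlock(&l->mu);
}
```

The reader count is safe here because it is only ever touched while `mu` is held, and any thread may release the lock because the "lock" is just counter state, not ownership of a mutex.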
I am trying to develop a program with POSIX threads in which I have a child thread which will be updating the content of a file and the database at certain intervals, and there will be other child threads which read data from the file and database all the time. I don't want any thread to read the file or database while they are being written by the single updater thread. So my idea is to make all the other child threads sleep, triggered from the child thread which updates the file and database. But sleep() makes only the calling thread sleep. Is there any way the above scenario can be implemented?
EDIT:
I have two different functions for reading and writing the file. Most of the threads access the read method so they aren't vulnerable but they might be if they try to read in between while the periodic thread which accesses the write method is updating the file's contents.
You do not want to use sleep for this at all. Instead, use a reader/writer lock. The updater thread must acquire the lock (in write mode) before it modifies the data. And the other threads must acquire the lock (in read mode) before reading the data.
Note that if your reader threads are reading continuously, the writer will get starved and never acquire the lock. So you will need some separate mechanism such as a flag the updater can set that tells the readers to please stop reading and release their locks. If the readers only read occasionally this shouldn't be such an issue (unless there are tons of readers in which case you may have an architectural problem).
I know that this isn't a "homework helper website", but I've been going insane over the last few days because I have to implement access to a resource while avoiding starvation, and I can't figure out how to do that. Can anyone help me with some application examples or documentation? The assignment is: a resource may be used by 2 types of processes: black and white. When the resource is used by the white processes, it cannot be used by the black processes and vice versa. Implement the access to the resource avoiding starvation. Is this a producer-consumer case?
Let's make a few assumptions (for the sake of discussion):
Our processes will be threads -- not actual software processes, there's a difference which may be important in your assignment.
White processes are Readers.
Black processes are Writers.
Our common resource is a particular variable.
Mutual exclusion locks (mutex):
A mutex is a type of exclusive lock, it has a binary state, it's either locked or unlocked. You can lock it, unlock it or check to see if it's locked or not.
Threads can lock each other out using mutex (mutual exclusion locks) just as processes can lock each other out using semaphores.
When you want to protect a variable from being used by two threads at once you create a mutex for that variable and write every thread so that it attempts to lock the mutex before attempting to use the variable and unlock it after they're done.
This makes any first thread lock the mutex and any subsequent thread block until the first thread unlocks the mutex basically forcing all of these threads to line up and operate on that particular variable sequentially.
This is a bit inefficient when you just want to read the variable, not change its value, because two threads reading the same content don't create any conflict or invalid data. Two threads writing at the same time might however corrupt the data.
Readers/Writers locks (RWL):
Most implementations of Readers/Writers locks will use a shared lock and an exclusive lock, but they expose a simple usage approach: if you want to read, grab a "read lock"; if you want to write, grab a "write lock".
"Read locks" are not exclusive and they allow multiple readers to be reading at one particular time (without blocking).
"Write locks" are exclusive and only one writer can be writing at one particular time (without blocking).
Starvation:
First step: with a Readers/Writers lock, a first (read) thread grabs a "read lock" on the variable; a second (write) thread tries to grab a "write lock" but is blocked until all readers finish reading.
Second step: before the first thread finishes reading, a third (read) thread grabs a "read lock" on the variable; this means the second (write) thread has to wait for this third thread to finish.
Repeat the second step until starvation is achieved.
Avoiding starvation with Seqlock:
A seqlock is implemented with one mutex and some counters. It always allows reading, even while the writers are writing to the variable but it gives the readers a means of checking if the data has been written to during the time it was being read, if so it may be corrupt so the readers will have to reread the data and check for consistency again.
The "read & consistency check" phase runs in a loop until the check confirms consistency of the data, at which point the reader can continue with its usual task.
The writers use the mutex to grab exclusive access so they never overlap their operations.
This is good for high-read, low-write situations. If there were too many writers, the readers would continuously loop rereading the data.
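The description above can be sketched roughly as follows, assuming C11 atomics and a protected value made of two integers. All names (`seq_point`, `seq_write()`, `seq_read()`) are invented. The fields are declared atomic so that a reader overlapping a writer is not a data race in the C sense; production seqlocks typically use weaker memory orderings than the default sequentially consistent ones used here.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

struct seq_point {
    pthread_mutex_t writer_mu;  /* serializes writers only */
    atomic_uint seq;            /* odd while a write is in progress */
    atomic_int x, y;            /* the protected data */
};

void seq_init(struct seq_point *p)
{
    pthread_mutex_init(&p->writer_mu, NULL);
    atomic_init(&p->seq, 0);
    atomic_init(&p->x, 0);
    atomic_init(&p->y, 0);
}

void seq_write(struct seq_point *p, int x, int y)
{
    pthread_mutex_lock(&p->writer_mu);
    atomic_fetch_add(&p->seq, 1);   /* counter becomes odd: write started */
    atomic_store(&p->x, x);
    atomic_store(&p->y, y);
    atomic_fetch_add(&p->seq, 1);   /* counter becomes even: write finished */
    pthread_mutex_unlock(&p->writer_mu);
}

/* Readers never block; they retry until they see a consistent pair. */
void seq_read(struct seq_point *p, int *x, int *y)
{
    unsigned before, after;

    do {
        before = atomic_load(&p->seq);
        *x = atomic_load(&p->x);
        *y = atomic_load(&p->y);
        after = atomic_load(&p->seq);
        /* Retry if a write was in progress (odd) or one happened meanwhile. */
    } while ((before & 1u) || before != after);
}
```

The reader's loop is the "read & consistency check" phase: if the counter was odd or changed between the two loads, the pair may mix old and new values, so it is read again.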
Your particular situation:
If black processes need to be able to share the resource among themselves and white processes need to be able to share the resource among themselves but white processes can't share the resource with black processes then the solution will not be either RWL or Seqlock.
A variation on the Seqlock algorithm might be your solution.
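Generally, this is a problem of access to a shared resource (in other words, mutual exclusion).
If you have two objects of the same class, both running as threads, each one does:
pseudo-code:
    loop
        if shared_resource is free
            lock shared_resource
            do something
            free shared_resource
This is in VERY broad terms! In real code the "is free" check and the lock must be a single atomic operation (a try-lock), otherwise both threads can see the resource as free.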
I want to create a program, using POSIX threads, having n threads running at different priorities.
There are files (say m files) which are shared among these n threads. If one thread is using a file (assume that it is writing to the file), no other thread will be allowed to use it. The code should maintain a table that tells, for each thread, which files it has acquired and for which files its requests are pending.
Also, we need a Monitor Thread to check for deadlocks ; any implementations hints/ideas?
You don't need to check for deadlocks. You have to write code that makes it impossible to run into a deadlock scenario. For that reason, I'd recommend you use a try-lock approach to lock down a chain of files, and unlock them again should any of the lock acquisitions fail.
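A minimal sketch of that try-lock-and-back-off idea, assuming each file is guarded by its own `pthread_mutex_t`. The function names are hypothetical. Because a thread never blocks while holding part of the set, two threads cannot end up waiting on each other's locks.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Try to lock every mutex in locks[0..n-1]. On any failure, release the
 * ones already taken. Returns 1 if all are held, 0 if none are held. */
int try_lock_all(pthread_mutex_t **locks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (pthread_mutex_trylock(locks[i]) != 0) {
            while (i > 0)                       /* back off: undo our locks */
                pthread_mutex_unlock(locks[--i]);
            return 0;
        }
    }
    return 1;
}

/* Keep retrying until the whole set is acquired, yielding between tries. */
void lock_all(pthread_mutex_t **locks, size_t n)
{
    while (!try_lock_all(locks, n))
        sched_yield();
}

void unlock_all(pthread_mutex_t **locks, size_t n)
{
    while (n > 0)
        pthread_mutex_unlock(locks[--n]);
}
```

The retry loop polls, so under heavy contention threads can spin for a while; a random pause between attempts is a common refinement.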
Also, if you are using C buffered I/O, I'd recommend you stick with ftrylockfile and funlockfile APIs. Otherwise use a synchronization mechanism that is most appropriate for your case, be that futex API or locks implemented using atomic instructions.
The standard unix way to accomplish this is: spool directories.
file operations, such as rename / link / unlink are atomic
have one central input spool-dir, where input files can be placed
a process / thread that wants to process a file, starts by moving it to another name, or better: to another (work) directory (using the thread_id or process number as directory name is obvious.)
(since this move is atomic there is no possible race condition!)
after processing, the finished files can be moved to an output directory
the scoreboard function is simply a readdir(+stat), maybe even inotify, on the work directories
process starvation will always be a problem. Incompletely processed files will live forever in the work directories. Having a stamp / pid file in the work directories could help cleanup / restart.
if designed well, this structure could work even after machine failure. The workers would have to maintain their own backup / log /stamp-file mechanism.
if you haven't noticed yet: no locking will be needed.
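The claiming step can be sketched like this. The function name `claim_file()` and the paths are placeholders. It relies on `rename()` being atomic, which holds only when both paths are on the same filesystem; across filesystems `rename()` fails instead of moving the file.

```c
#include <errno.h>
#include <stdio.h>

/* Try to claim a spooled file by moving it into this worker's directory.
 * Returns 1 if we now own the file, 0 if another worker already took it,
 * and -1 on any other error (errno is left set by rename()). */
int claim_file(const char *spool_path, const char *work_path)
{
    if (rename(spool_path, work_path) == 0)
        return 1;               /* the move succeeded: the file is ours */

    if (errno == ENOENT)
        return 0;               /* source is gone: someone else won the race */

    return -1;
}
```

A worker would scan the input spool directory, call `claim_file()` on each candidate, and process only the files for which it returned 1.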
I hate C. I have to try and think of a way to do this without classes:(
OK, a 'Sfile' struct to represent each file. Has name, path, file fd/handle, everything to do with one file, plus an 'inUse' boolean.
A 'waitingThreads' array for those threads waiting for a set of files.
A 'Sfiles' struct with an array of *Sfile to hold all the files, a waitingThreads array and a lock, (mutex/futex/criticalSection).
Each thread should have an event/semaphore/something that it can wait on until its files all become available and some way to access to the set of files that it needs and somewhere to store the fds/handles/whatever for the files.
OK, off we go:
Any thread that wants files locks up the Sfiles and iterates the *Sfile array, checking if every file it needs is free to use. If they all are, it sets the 'inUse' boolean, loads itself up with the fd/handles, unlocks and runs on - it has all its files. If any file it needs is in use, it pushes itself onto the waitingThreads array and waits on its event/sema.
When a thread is done with its files, it locks the Sfiles and clears the 'inUse' boolean for the files it was using. It then iterates the waitingThreads array - if the array is empty, it just unlocks and exits. If the array is not empty, it tries to find threads that can now run with the files that are now free. If it finds none, it just unlocks and returns. If it does find one, it loads that thread up with the fd/handles, sets the inUse boolean and signals its event/sema - that thread will then run with its desired set of files. The thread continues to iterate the waitingThreads array to the end, looking for more threads that it can load up and signal with the remaining free files. When it reaches the end of the array, it returns.
That, or something like it, will ensure that the threads always run with their complete set of files, prevent any deadlocks due to threads locking partial sets of files and does not require any polling.
If you really, really need that table thingy, you can build it inside the lock every time a thread enters or leaves the lock. I would suggest mallocing a suitable struct, loading it up with all the details of the free files and waiting threads, and queueing it off to another thread. You could just have some 'monitoring' thread that periodically locks up the Sfiles, dumps all the info and unlocks, but that keeps the Sfiles locked for the entire 'dump' time - you may not want that overhead - it's up to you.
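Here is a simplified sketch of the all-or-nothing idea. It swaps in a different wake-up mechanism from the one described: instead of a waitingThreads array with one event/semaphore per thread, it uses a single condition variable and lets every waiter recheck its own file set after a release. The names (`NFILES`, `acquire_files()`, `release_files()`) are invented, and files are identified by an index into a fixed table.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NFILES 4    /* assumed number of shared files */

static bool in_use[NFILES];     /* the 'inUse' flag of each file */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t files_freed = PTHREAD_COND_INITIALIZER;

/* Caller must hold table_lock. */
static bool all_free(const int *ids, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (in_use[ids[i]])
            return false;
    return true;
}

/* Block until every requested file is free, then take them all at once.
 * A thread never holds a partial set, so it cannot deadlock on files. */
void acquire_files(const int *ids, size_t n)
{
    pthread_mutex_lock(&table_lock);
    while (!all_free(ids, n))
        pthread_cond_wait(&files_freed, &table_lock);
    for (size_t i = 0; i < n; i++)
        in_use[ids[i]] = true;
    pthread_mutex_unlock(&table_lock);
}

void release_files(const int *ids, size_t n)
{
    pthread_mutex_lock(&table_lock);
    for (size_t i = 0; i < n; i++)
        in_use[ids[i]] = false;
    pthread_cond_broadcast(&files_freed);   /* waiters recheck their sets */
    pthread_mutex_unlock(&table_lock);
}
```

The broadcast wakes every waiter on every release, which is simpler than the per-thread signalling described above but does more work when many threads wait. It also does nothing about priority ordering among waiters.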
Edit:
OH - forgot the priority thingy. The OS thread priority is probably useless for your purpose. Have each thread expose a priority enum/int and keep the 'waitingThreads' array sorted by that priority, so giving the higher priority threads the first bite at whatever files are returned.
Is that good enough for your homework assignment?
While reading about binary semaphore and mutex I found the following difference:
Both can have value 0 and 1, but a mutex can be unlocked only by the
thread which has acquired the mutex lock. A thread which acquires a
mutex lock can have priority inversion in case a higher priority
process wants to acquire the same mutex, whereas this is not the case
with a binary semaphore.
So where should I use binary semaphores? Can anyone cite an example?
EDIT: I think I have figured out the working of both. Basically binary semaphores offer synchronization whereas mutexes offer a locking mechanism. I read some examples from the Galvin OS book to make it more clear.
One typical situation where I find binary semaphores very useful is for thread initialization where the thread will read from a structure owned by the parent thread. The parent thread needs to wait for the new thread to read the shared data from the structure before it can let the structure's lifetime end (by leaving its scope, for instance). With a binary semaphore, all you have to do is initialize the semaphore value to zero and have the child post it while the parent waits on it. Without semaphores, you'd need a mutex and condition variable and much uglier program logic for using them.
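A sketch of that hand-off, assuming POSIX unnamed semaphores (`sem_init()` is not available on every platform). The struct and function names are made up. The argument block lives on the parent's stack, so the parent must not return before the child has copied what it needs.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>
#include <stdint.h>

struct start_args {
    int value;      /* data the child needs */
    sem_t copied;   /* posted by the child once it has its own copy */
};

static void *child_main(void *p)
{
    struct start_args *args = p;
    int local = args->value;    /* copy everything we need first */

    sem_post(&args->copied);    /* parent may now let args go away */
    /* args must not be touched after this point. */

    return (void *)(intptr_t)local;
}

/* Start a child thread and return only once it no longer needs args. */
int start_child(int value, pthread_t *tid)
{
    struct start_args args;     /* lives only until this function returns */

    args.value = value;
    if (sem_init(&args.copied, 0, 0) != 0)   /* starts at zero: "not yet" */
        return -1;
    if (pthread_create(tid, NULL, child_main, &args) != 0) {
        sem_destroy(&args.copied);
        return -1;
    }
    while (sem_wait(&args.copied) == -1 && errno == EINTR)
        ;                                    /* retry if interrupted */
    sem_destroy(&args.copied);
    return 0;
}
```

Note that the child, not the parent, posts the semaphore, which is exactly what a mutex would not allow.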
In almost all cases I use a binary semaphore to signal another thread without locking.
Simple example of usage for synchronous request:
Thread 1:
Semaphore sem;
request_to_thread2(&sem); // Function sending request to thread2 in any fashion
sem.wait(); // Waiting request complete
Thread 2:
Semaphore *sem;
process_request(sem); // Process request from thread 1
sem->post(); // Signal thread 1 that request is completed
Note: before thread 2 posts the semaphore, it can safely write to thread 1's data without any additional synchronization.
The canonical example for using a counted semaphore instead of a binary mutex is when you have a limited number of resources available that are a) interchangeable and b) more than one.
For instance, if you want to allow a maximum of 10 readers to access a database at once, you can use a counted semaphore initialized to 10 to limit access to the resource. Each reader must acquire the semaphore before accessing the resource, decrementing the available count. Once the count reaches 0 (i.e. 10 readers have gained access to, and are still using the database), all other readers are locked out. Once a reader finishes, they bump semaphore count back up by one to indicate that they are no longer using the resource and some other reader may now obtain the semaphore lock and gain access in their stead.
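The database example might look roughly like this with POSIX semaphores. The names `db_slots`, `db_enter()` and friends are placeholders, and the non-blocking `db_try_enter()` is an extra added here to make the limit easy to observe.

```c
#include <errno.h>
#include <semaphore.h>

#define MAX_DB_READERS 10   /* at most this many readers at once */

static sem_t db_slots;

int db_slots_init(void)
{
    return sem_init(&db_slots, 0, MAX_DB_READERS);
}

/* Blocking entry: waits until one of the slots is free. */
void db_enter(void)
{
    while (sem_wait(&db_slots) == -1 && errno == EINTR)
        ;   /* retry if interrupted by a signal */
}

/* Non-blocking entry: returns 1 if a slot was taken, 0 if all slots
 * are in use, and -1 on any other error. */
int db_try_enter(void)
{
    if (sem_trywait(&db_slots) == 0)
        return 1;
    return errno == EAGAIN ? 0 : -1;
}

/* Leaving gives the slot back so another reader can get in. */
void db_leave(void)
{
    sem_post(&db_slots);
}
```

Every successful `db_enter()` or `db_try_enter()` must be paired with exactly one `db_leave()`, otherwise the available count drifts.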
However, the counted semaphore, just like all other synchronization primitives, has many use cases and it's just a matter of thinking outside the box. You may find that many problems you are used to solving with a mutex plus additional logic can be implemented more easily and more straightforwardly with a semaphore. A mutex is a subset of the semaphore, that is to say, anything you can do with a mutex can be done with a semaphore (simply set the count to one), but there are things that can be done with a semaphore alone that cannot be done with just a mutex.
At the end of the day, any one synchronization primitive is generally enough to do anything (think of it as being "turing-complete" for thread synchronization, to bastardize that word). However, each is tailor-fit to a different application, and while you may be able to force one to do your bidding with some customization and additional glue, it is possible that a different synchronization primitive is better-fit for the job.