concurrent readers and mutually excluding writers in C using pthreads

I was hoping someone could point me to, or show me, a C program that allows multiple concurrent readers but mutually exclusive writers. I have searched extensively and could not find a single example that shows this behavior using coarse-grained locking. I know I can use pthread_rwlock_init, pthread_rwlock_rdlock, etc.; I just don't know how to use them. I learn by example, which is why I'm here.
Suppose I have a region of code (not a shared variable) and I want multiple readers but only a single writer. This is what I don't know how to accomplish using pthreads rwlocks. I don't understand how the code will know whether it is currently being written to or being read.
Thanks.

You can take a look at page 24 of Peter Chapin's Pthread Tutorial for an example. I added it below.
#include <pthread.h>
int shared;
pthread_rwlock_t lock;
void *thread_function(void *arg)
{
    (void)arg;
    pthread_rwlock_rdlock(&lock);
    // Read from the shared resource.
    pthread_rwlock_unlock(&lock);
    return NULL;
}
int main(void)
{
    pthread_rwlock_init(&lock, NULL);
    // Start threads here.
    pthread_rwlock_wrlock(&lock);
    // Write to the shared resource.
    pthread_rwlock_unlock(&lock);
    // Join with threads here.
    pthread_rwlock_destroy(&lock);
    return 0;
}

Related

Is this an appropriate use case for a recursive mutex?

I've heard from various sources (1, 2) that one should avoid using recursive
mutexes, as they may be a sign of a hack or bad design. Sometimes, however, I
presume they may be necessary. In light of that, is the following an
appropriate use case for a recursive mutex?
// main.c
// gcc -Wall -Wextra -Wpedantic main.c -pthread
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif /* _GNU_SOURCE */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
typedef struct synchronized_counter
{
    int count;
    pthread_mutex_t mutex;
    pthread_mutexattr_t mutexattr;
} synchronized_counter;
synchronized_counter* create_synchronized_counter()
{
    synchronized_counter* sc_ptr = malloc(sizeof(synchronized_counter));
    assert(sc_ptr != NULL);
    sc_ptr->count = 0;
    assert(pthread_mutexattr_init(&sc_ptr->mutexattr) == 0);
    assert(pthread_mutexattr_settype(&sc_ptr->mutexattr,
                                     PTHREAD_MUTEX_RECURSIVE) == 0);
    assert(pthread_mutex_init(&sc_ptr->mutex, &sc_ptr->mutexattr) == 0);
    return sc_ptr;
}
void synchronized_increment(synchronized_counter* sc_ptr)
{
    assert(pthread_mutex_lock(&sc_ptr->mutex) == 0);
    sc_ptr->count++;
    assert(pthread_mutex_unlock(&sc_ptr->mutex) == 0);
}
int main()
{
    synchronized_counter* sc_ptr = create_synchronized_counter();
    // I need to increment this counter three times in succession without
    // having another thread increment it in between. Therefore, I acquire a
    // lock before beginning.
    assert(pthread_mutex_lock(&sc_ptr->mutex) == 0);
    synchronized_increment(sc_ptr);
    synchronized_increment(sc_ptr);
    synchronized_increment(sc_ptr);
    assert(pthread_mutex_unlock(&sc_ptr->mutex) == 0);
    return 0;
}
EDIT:
I wanted to ask the question with a simple example, but perhaps it came across as too simple. This is what I had imagined would be a more realistic scenario: I have a stack data structure that will be accessed by multiple threads. In particular, sometimes a thread will be popping n elements from the stack, but it must do so all at once (without another thread pushing or popping from the stack in between). The crux of the design issue is if I should have the client manage locking the stack themselves with a non-recursive mutex, or have the stack provide synchronized, simple methods along with a recursive mutex which the client can use to make multiple atomic transactions that are also synchronized.

Both of your examples - the original synchronized_counter and the stack in your edit - are correct examples of using a recursive mutex, but they would be considered poor API design if you were building a data structure. I'll try to explain why.
Exposing internals - the caller is required to use the same lock that protects internal access to members of the data structure. This opens the possibility to misusing that lock for purposes other than accessing the data structure. That could lead to lock contention - or worse - deadlock.
Efficiency - it's often more efficient to implement specialized bulk operations like increment_by(n) or pop_many(n).
First, it allows the data structure to optimize the operations - perhaps the counter can just do count += n or the stack could remove n items from a linked list in one operation. [1]
Second, you save time by not having to lock/unlock the mutex for every operation.[2]
Perhaps a better example for using a recursive mutex would be as follows:
I have a class with two methods Foo and Bar.
The class was designed to be single-threaded.
Sometimes Foo calls Bar.
I want to make the class thread-safe, so I add a mutex to the class and lock it inside Foo and Bar. Now I need to make sure that Bar can lock the mutex when called from Foo.
One way to solve this without a recursive mutex is to create a private unsynchronized_bar and have both Foo and Bar call it after locking the mutex.
This can get tricky if Foo is a virtual method that can be implemented by a sub-class and used to call Bar, or if Foo calls out to some other part of the program that can call back into Bar. However, if you have code inside critical sections (code protected by a mutex) calling into other arbitrary code, then the behaviour of the program will be hard to understand, and it is easy to cause deadlocks between different threads, even if you use a recursive mutex.
The best advice is to solve concurrency problems through good design rather than fancy synchronization primitives.
[1] There are some trickier patterns like "pop an item, look at it, decide if I pop another one", but those can be implemented by supplying a predicate to the bulk operation.
[2] Practically speaking locking a mutex you already own should be pretty cheap, but in your example it requires at least a call to an external library function, which cannot be inlined.

The logic that you describe is not, in fact, a recursive mutex, nor is it an appropriate case for one.
And, if you actually need to ensure that another thread won't increment your counter, I'm sorry to tell you that your logic as-written won't ensure that.
I therefore suggest that you step back from this, clear your head, and re-consider your actual use-case. I think that confusion about recursive mutexes has led you astray. It may well be the case that the logic which you now have in synchronized_increment ... in fact, the need for the entire method ... is unnecessary, and the logic which you show in main is all you really need, and it's just a simple variable after all.

Self-written Mutex for 2+ Threads

I have written the following code, and so far in all my tests it seems as if I have written a working Mutex for my 4 Threads, but I would like to get someone else's opinion on the validity of my solution.
#include <stdio.h>
typedef struct Mutex{
    int turn;
    int * waiting;
    int num_processes;
} Mutex;
void enterLock(Mutex * lock, int id){
    int i;
    for(i = 0; i < lock->num_processes; i++){
        lock->waiting[id] = 1;
        if (i != id && lock->waiting[i])
            i = -1;
        lock->waiting[id] = 0;
    }
    printf("ID %d Entered\n",id);
}
void leaveLock(Mutex * lock, int id){
    printf("ID %d Left\n",id);
    lock->waiting[id] = 0;
}
void foo(Mutex * lock, int id){
    enterLock(lock,id);
    // do stuff now that i have access
    leaveLock(lock,id);
}

I feel compelled to write an answer here because the question is a good one: it could help others understand the general problem with mutual exclusion. In your case, you went a long way to hide this problem, but you can't avoid it. It boils down to this:
01 /* pseudo-code */
02 if (! mutex.isLocked())
03 mutex.lock();
You always have to expect a thread switch between lines 02 and 03. So it is possible that two threads both find the mutex unlocked and are interrupted right after that, only to resume later and each lock the mutex. You will then have two threads entering the critical section at the same time.
What you definitely need to implement reliable mutual exclusion is therefore an atomic operation that tests a condition and at the same time sets a value, without any chance of being interrupted in between.
01 /* pseudo-code */
02 while (! test_and_lock(mutex));
As long as this test_and_lock function cannot be interrupted, your implementation is safe. Until C11, C didn't provide anything like this, so implementations of pthreads needed to use e.g. assembly or special compiler intrinsics. With C11, there is finally a "standard" way to write atomic operations like this, but I can't give an example here, because I don't have experience doing that. For general use, the pthreads library will give you what you need.
edit: of course, this is still simplified -- in a multi-processor scenario, you need to ensure that even memory accesses are mutually exclusive.

The problem I see in your code:
The idea behind a mutex is to provide mutual exclusion, meaning that while thread_a is in the critical section, thread_b must wait for thread_a (if it also wants to enter).
This waiting should be implemented in the enterLock function. But what you have is a for loop which might end well before thread_a has left the critical section, so thread_b could also enter; hence you don't have mutual exclusion.
Way to fix it:
Take a look, for example, at Peterson's algorithm or Dekker's (more complicated). What they do is called busy waiting, which is basically a while loop that says:
while(i can't enter) { do nothing and wait...}

You are totally ignoring the topic of memory models. Unless you are on a machine with a sequentially consistent memory model (which none of today's PC CPUs are), your code is incorrect, as a store executed by one thread is not necessarily immediately visible to other CPUs. Yet your code seems to assume exactly that.
Bottom line: use the existing synchronization primitives provided by the OS or a runtime library, such as the POSIX or Win32 API, and don't try to be smart and implement this yourself. Unless you have years of experience in parallel programming as well as in-depth knowledge of CPU architecture, chances are quite good that you will end up with an incorrect implementation. And debugging parallel programs can be hell...

After enterLock() returns, the state of the Mutex object is the same as before the function was called. Hence it will not prevent a second thread from entering the same Mutex object before the first one has released it by calling leaveLock(). There is no mutual exclusion.

Multithreading and mutexes

I'm currently beginning development on an indie game in C using the Allegro cross-platform library. I figured that I would separate things like input, sound, game engine, and graphics into their own separate threads to increase the program's robustness. Having no experience in multithreading whatsoever, my question is:
If I have a section of data in memory (say, a pointer to a data structure), is it okay for one thread to write to it at will and another to read from it at will, or would each thread have to use a mutex to lock the memory, then read or write, then unlock?
In particular, I was thinking about the interaction between the game engine and the video renderer. (This is in 2D.) My plan was for the engine to process user input, then spit out the appropriate audio and video to be fed to the speakers and monitor. I was thinking that I'd have a global pointer to the next bitmap to be drawn on the screen, and the code for the game engine and the renderer would be something like this:
ALLEGRO_BITMAP *nextBitmap;
boolean using;
void GameEngine ()
{
    ALLEGRO_BITMAP *oldBitmap;
    while (ContinueGameEngine())
    {
        ALLEGRO_BITMAP *bitmap = al_create_bitmap (width, height);
        MakeTheBitmap (bitmap);
        while (using) ; //The other thread is using the bitmap. Don't mess with it!
        al_destroy_bitmap (nextBitmap);
        nextBitmap = bitmap;
    }
}
void Renderer ()
{
    while (ContinueRenderer())
    {
        ALLEGRO_BITMAP *bitmap = al_clone_bitmap (nextBitmap);
        DrawBitmapOnScreen (bitmap);
    }
}
This seems unstable... maybe something would happen in the call to al_clone_bitmap but I am not quite certain how to handle something like this. I would use a mutex on the bitmap, but mutexes seem like they take time to lock and unlock and I'd like both of these threads (especially the game engine thread) to run as fast as possible. I also read up on something called a condition, but I have absolutely no idea how a condition would be applicable or useful, although I'm sure they are. Could someone point me to a tutorial on mutexes and conditions (preferably POSIX, not Windows), so I can try to figure all this out?

If I have a section of data in memory (say, a pointer to a data
structure), is it okay for one thread to write to it at will and
another to read from it at will
The answer is "it depends" which usually means "no".
Depending on what you're writing/reading, and depending on the logic of your program, you could wind up with wild results or corruption if you try writing and reading with no synchronization and you're not absolutely sure that writes and reads are atomic.
So you should just use a mutex unless:
You're absolutely sure that writes and reads are atomic, and you're absolutely sure that one thread is only reading (ideally you'd use some kind of specific support for atomic operations such as the Interlocked family of functions from WinAPI).
You absolutely need the tiny performance gain from not locking.
Also worth noting that your while (using); construct would be a lot more reliable, correct, and would probably even perform better if you used a spin lock (again if you're absolutely sure you need a spin lock, rather than a mutex).

The tool that you need is called atomic operations, which ensure that the reader thread only reads whole data as written by the other thread. If you don't use such operations, the data may be read only partially, so what is read may make no sense at all in terms of your application.
The new C11 standard has these operations, but it is not yet widely implemented. Many compilers have extensions that implement them, though. E.g. gcc has a series of builtin functions that start with a __sync prefix.

There are a lot of manual pages and tutorials that you can find with Google; search for them. I found http://www.yolinux.com/TUTORIALS/LinuxTutorialPosixThreads.html after a few minutes of searching.
Also, start with a very small example and increase the difficulty gradually: first thread creation and termination, then thread return values, then thread synchronization. Continue with POSIX mutexes and condition variables, and make sure you understand all of these terms.
An important source of documentation is the Linux man and info pages.
Good luck

If I have a section of data in memory (say, a pointer to a data structure), is it okay for one thread to write to it at will and another to read from it at will, or would each thread have to use a mutex to lock the memory, then read or write, then unlock?
If you have a section of data in memory that two different threads read and write, the code that accesses it is called a critical section, and this is a common issue in producer-consumer designs.
There are many resources that speak to this issue:
https://docs.oracle.com/cd/E19455-01/806-5257/sync-31/index.html
https://stackoverflow.com/questions/tagged/producer-consumer
But yes, if you are going to use two different threads to read and write, you will have to use mutexes or another form of locking.

How are read/write locks implemented in pthread?

How are they implemented especially in case of pthreads. What pthread synchronization APIs do they use under the hood? A little bit of pseudocode would be appreciated.

I haven't done any pthreads programming for a while, but when I did, I never used POSIX read/write locks. The problem is that most of the time a mutex will suffice: ie. your critical section is small, and the region isn't so performance critical that the double barrier is worth worrying about.
In those cases where performance is an issue, using atomic operations (generally available as a compiler extension) is normally a better option (i.e. the extra barrier is the problem, not the size of the critical section).
By the time you eliminate all these cases, you are left with cases where you have specific performance/fairness/rw-bias requirements that require a true rw-lock; and that is when you discover that all the relevant performance/fairness parameters of POSIX rw-lock are undefined and implementation specific. At this point you are generally better off implementing your own so you can ensure the appropriate fairness/rw-bias requirements are met.
The basic algorithm is to keep a count of how many of each are in the critical section, and if a thread isn't allowed access yet, to shunt it off to an appropriate queue to wait. Most of your effort will be in implementing the appropriate fairness/bias between servicing the two queues.
The following C-like pthreads-like pseudo-code illustrates what I'm trying to say.
struct rwlock {
    mutex admin; // used to serialize access to other admin fields, NOT the critical section.
    int count; // threads in critical section +ve for readers, -ve for writers.
    fifoDequeue dequeue; // acts like a cond_var with fifo behaviour and both append and prepend operations.
    void *data; // represents the data covered by the critical section.
}
void read(struct rwlock *rw, void (*readAction)(void *)) {
    lock(rw->admin);
    if (rw->count < 0) {
        append(rw->dequeue, rw->admin);
    }
    while (rw->count < 0) {
        prepend(rw->dequeue, rw->admin); // Used to avoid starvation.
    }
    rw->count++;
    // Wake the new head of the dequeue, which may be a reader.
    // If it is a writer it will put itself back on the head of the queue and wait for us to exit.
    signal(rw->dequeue);
    unlock(rw->admin);
    readAction(rw->data);
    lock(rw->admin);
    rw->count--;
    signal(rw->dequeue); // Wake the new head of the dequeue, which is probably a writer.
    unlock(rw->admin);
}
void write(struct rwlock *rw, void *(*writeAction)(void *)) {
    lock(rw->admin);
    if (rw->count != 0) {
        append(rw->dequeue, rw->admin);
    }
    while (rw->count != 0) {
        prepend(rw->dequeue, rw->admin);
    }
    rw->count--;
    // As we only allow one writer in at a time, we don't bother signaling here.
    unlock(rw->admin);
    // NOTE: This is the critical section, but it is not covered by the mutex!
    // The critical section is rather, covered by the rw-lock itself.
    rw->data = writeAction(rw->data);
    lock(rw->admin);
    rw->count++;
    signal(rw->dequeue);
    unlock(rw->admin);
}
Something like the above code is a starting point for any rwlock implementation. Give some thought to the nature of your problem and replace the dequeue with the appropriate logic that determines which class of thread should be woken up next. It is common to allow a limited number/period of readers to leapfrog writers or vice versa, depending on the application.
Of course my general preference is to avoid rw-locks altogether; generally by using some combination of atomic operations, mutexes, STM, message-passing, and persistent data-structures. However there are times when what you really need is a rw-lock, and when you do it is useful to know how they work, so I hope this helped.
EDIT - In response to the (very reasonable) question, where do I wait in the pseudo-code above:
I have assumed that the dequeue implementation contains the wait, so that somewhere within append(dequeue, mutex) or prepend(dequeue, mutex) there is a block of code along the lines of:
while(!readyToLeaveQueue()) {
    wait(dequeue->cond_var, mutex);
}
which was why I passed in the relevant mutex to the queue operations.

Each implementation can be different, but normally they have to favor readers by default due to the requirement by POSIX that a thread be able to obtain the read-lock on an rwlock multiple times. If they favored writers, then whenever a writer was waiting, the reader would deadlock on the second read-lock attempt unless the implementation could determine the reader already has a read lock, but the only way to determine that is storing a list of all threads that hold read locks, which is very inefficient in time and space requirements.

Tips to write thread-safe UNIX code?

What are the guidelines to write thread-safe UNIX code in C and C++?
I know only a few:
Don't use globals
Don't use static local storage
What others are there?

The simple thing to do is read a little. The following list contains some stuff to look at and research.
Spend time reading the Open Group Base Specification, particularly the General Information section and the subsection on threads. This is the basic information for multithreading under most UN*X-alike systems.
Learn the difference between a mutex and a semaphore
Realize that everything that is shared MUST be protected. This applies to global variables, static variables, and any shared dynamically allocated memory.
Replace global state flags with condition variables. These are implemented using pthread_cond_init and related functions.
Once you understand the basics, learn about the common problems so that you can identify them when they occur:
Lock inversion deadlocks
Priority inversion - if you are interested in a real life scenario, then read this snippet about the Mars Pathfinder

It really comes down to shared state; globals and static locals are examples of shared state. If you don't share state, you won't have a problem. Other examples of shared state include multiple threads writing to a file or socket.
Any shared resource will need to be managed properly - that might mean making something mutex protected, opening another file, or intelligently serializing requests.
If two threads are reading and writing from the same struct, you'll need to handle that case.

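Beware of the sem_t functions: sem_wait and its relatives may return without having acquired the semaphore when the thread is interrupted by a signal (SIGCHLD, I/O signals, etc.), failing with EINTR. If you use them, be sure to always handle that case.
The pthread_mutex_t and pthread_cond_t functions are safe with respect to EINTR.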

A good open book about concurrency in general can be found here: Little Book of Semaphores
It presents various problems that are solved step-by step and include solutions to common concurrency issues like starvation, race conditions etc.
It is not language-specific but contains short chapters about implementing the solutions in C with the Pthread-Library or Python.
