Can atomic variables replace pthread_rwlock? Can it be lock-free? (C)

I have some threads that write a resource and some that read it, but pthread_rwlock causes a lot of context switches. So I devised a way to avoid them, but I'm not sure whether it is safe.
This is the code:
sig_atomic_t slot = 0;

struct resource {
    sig_atomic_t in_use;   /* counter; nonzero while in use */
    .....
} xxx[2];

int read_thread()
{
    int i = slot;          /* snapshot slot so it cannot change mid-read */
    xxx[i].in_use++;
    read(xxx[i]);
    xxx[i].in_use--;
}
int write_thread()
{
    mutex_lock;                       /* mutex between write threads */
    if (slot == 0) {
        while (xxx[1].in_use != 0);   /* wait for the last reader of slot 1 */
        clear(xxx[1]);
        write(xxx[1]);
        slot = 1;
    } else if (slot == 1) {
        while (xxx[0].in_use != 0);
        clear(xxx[0]);
        write(xxx[0]);
        slot = 0;
    }
    mutex_unlock;
}
Will that work? The cost is twice the storage and three atomic variables.
Thanks a lot!

Your algorithm is not lock-free; the writers use a spin lock.
Is it really necessary to do double-buffering and spin locks? Could you instead use (slot ^ 1) as the writing slot and slot as the reading slot? After writing, the writer would atomically change the value of slot, thus "publishing" its write. You may read the same slot many times consecutively this way, but if that's not the semantics you want then you should be using a queue.
By the way, a sig_atomic_t does not provide the type of atomicity you need for multiple threads. At a minimum, you should declare slot as volatile sig_atomic_t, and use memory barriers when reading and writing.
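For illustration, here is a minimal sketch of that publishing scheme using C11 <stdatomic.h> (assuming a C11 compiler; sig_atomic_t predates these atomics). It shows only the publication step: before reusing a slot, the writer must still wait out the readers, as the in_use counter in the question does.

#include <stdatomic.h>

struct resource { /* ... payload ... */ } xxx[2];
_Atomic int slot = 0;          /* index of the slot readers should use */

void reader(void)
{
    /* acquire pairs with the writer's release below, so the slot's
       contents are fully visible before we read them */
    int i = atomic_load_explicit(&slot, memory_order_acquire);
    /* read(xxx[i]); */
}

void writer(void)              /* serialized by the writers' mutex */
{
    int i = atomic_load_explicit(&slot, memory_order_relaxed) ^ 1;
    /* ... wait until xxx[i] has no readers, then write(xxx[i]) ... */
    atomic_store_explicit(&slot, i, memory_order_release);  /* publish */
}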

Your strategy is to have writers write to a different slot than what the readers are reading from. And you are switching the reading slot number after a write is completed. However, you will have a race.
slot   reader            writer1        writer2
----   ------            -------        -------
 0                       mutex_lock
 0     i = 0
 0     ...               slot = 1
 1                       mutex_unlock   mutex_lock
 1     ...                              clear(xxx[0])
 1     xxx[0].in_use++
 1     read(xxx[0])                     write(xxx[0])
In general, though, this strategy could lead to starvation of writers (that is, a writer may spin forever).
However, if you are willing to tolerate that, it would be safer to let xxx[] be an array of 2 pointers to resource. Let the reader always read from xxx[0], and let the writers contend for updates on xxx[1]. When a writer is finished updating xxx[1], it uses CAS on xxx[0] and xxx[1].
struct resource {
    sig_atomic_t in_use;   /* counter; nonzero while in use */
    sig_atomic_t writer;
    .....
} *xxx[2];

void read_thread()
{
    struct resource *p = xxx[0];
    p->in_use++;
    while (p->writer) {    /* a writer claimed this slot; chase the new xxx[0] */
        p->in_use--;
        p = xxx[0];
        p->in_use++;
    }
    read(*p);
    p->in_use--;
}

void write_thread()
{
    struct resource *p;
    mutex_lock;                        /* mutex between write threads */
    xxx[1]->writer = 1;
    while (xxx[1]->in_use != 0);       /* wait for the last reader of slot 1 */
    clear(xxx[1]);
    write(xxx[1]);
    xxx[1] = CAS(&xxx[0], p = xxx[0], xxx[1]);  /* publish: swap the two slots */
    assert(p == xxx[1]);
    xxx[0]->writer = 0;
    mutex_unlock;
}
If you want to avoid writer starvation, but you want the performance of spinlocks, you are looking at implementing your own reader/writer locks using spinlocks instead of mutex locks. A google search for "read write spinlock implementation" pointed to this page which I found to be an interesting read.
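I don't know what is on that page, but as a rough illustration (my own sketch, not the linked code), a bare-bones reader/writer spinlock can be built on a single C11 atomic counter, where -1 means a writer holds the lock and a positive value counts the readers:

#include <sched.h>
#include <stdatomic.h>

typedef struct { atomic_int state; } rwspin_t;   /* 0 = free */

void rwspin_rdlock(rwspin_t *l) {
    for (;;) {
        int s = atomic_load_explicit(&l->state, memory_order_relaxed);
        if (s >= 0 &&
            atomic_compare_exchange_weak_explicit(&l->state, &s, s + 1,
                    memory_order_acquire, memory_order_relaxed))
            return;            /* joined the readers */
        sched_yield();         /* a writer holds it; back off */
    }
}

void rwspin_rdunlock(rwspin_t *l) {
    atomic_fetch_sub_explicit(&l->state, 1, memory_order_release);
}

void rwspin_wrlock(rwspin_t *l) {
    for (;;) {
        int expected = 0;      /* no readers, no writer */
        if (atomic_compare_exchange_weak_explicit(&l->state, &expected, -1,
                memory_order_acquire, memory_order_relaxed))
            return;
        sched_yield();
    }
}

void rwspin_wrunlock(rwspin_t *l) {
    atomic_store_explicit(&l->state, 0, memory_order_release);
}

Note that this naive version still lets a steady stream of readers starve a writer; avoiding that requires tracking pending writers, as discussed above.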

Related

Reader-Writer Problem with Writer Priority

I came across this problem as I am learning more about operating systems. In my code, I tried making the reader have priority and it worked, so next I modified it a bit to give the writer priority. When I ran the code, the output was exactly the same and it seemed like the writer did not have priority. Here is the code with comments. I am not sure what I've done wrong, since I modified a lot of the code but the output remains the same as if I had not changed it at all.
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

/*
This program provides a possible solution for the first readers-writers
problem using a mutex and semaphores. I have used 10 readers and 5 writers
to demonstrate the solution. You can always play with these values.
*/

// Semaphore initialization for writer and reader
sem_t wrt;
sem_t rd;

// Mutex 1 blocks other readers, mutex 2 blocks other writers
pthread_mutex_t mutex1;
pthread_mutex_t mutex2;

// Value the writer is changing; we simply multiply it by 2
int cnt = 2;

int numreader = 0;
int numwriter = 0;

void *writer(void *wno)
{
    pthread_mutex_lock(&mutex2);
    numwriter++;
    if (numwriter == 1) {
        sem_wait(&rd);    /* first writer blocks new readers */
    }
    pthread_mutex_unlock(&mutex2);

    sem_wait(&wrt);
    // Writing Section
    cnt = cnt * 2;
    printf("Writer %d modified cnt to %d\n", (*((int *)wno)), cnt);
    sem_post(&wrt);

    pthread_mutex_lock(&mutex2);
    numwriter--;
    if (numwriter == 0) {
        sem_post(&rd);    /* last writer lets readers back in */
    }
    pthread_mutex_unlock(&mutex2);
    return NULL;
}

void *reader(void *rno)
{
    sem_wait(&rd);
    pthread_mutex_lock(&mutex1);
    numreader++;
    if (numreader == 1) {
        sem_wait(&wrt);   /* first reader blocks writers */
    }
    pthread_mutex_unlock(&mutex1);
    sem_post(&rd);

    // Reading Section
    printf("Reader %d: read cnt as %d\n", *((int *)rno), cnt);

    pthread_mutex_lock(&mutex1);
    numreader--;
    if (numreader == 0) {
        sem_post(&wrt);   /* last reader lets writers back in */
    }
    pthread_mutex_unlock(&mutex1);
    return NULL;
}

int main()
{
    pthread_t read[10], write[5];

    pthread_mutex_init(&mutex1, NULL);
    pthread_mutex_init(&mutex2, NULL);
    sem_init(&wrt, 0, 1);
    sem_init(&rd, 0, 1);

    int a[10] = {1,2,3,4,5,6,7,8,9,10};  // Just used for numbering the writers and readers

    for (int i = 0; i < 5; i++) {
        pthread_create(&write[i], NULL, writer, (void *)&a[i]);
    }
    for (int i = 0; i < 10; i++) {
        pthread_create(&read[i], NULL, reader, (void *)&a[i]);
    }
    for (int i = 0; i < 5; i++) {
        pthread_join(write[i], NULL);
    }
    for (int i = 0; i < 10; i++) {
        pthread_join(read[i], NULL);
    }

    pthread_mutex_destroy(&mutex1);
    pthread_mutex_destroy(&mutex2);
    sem_destroy(&wrt);
    sem_destroy(&rd);

    return 0;
}
Output (the same for both versions; I think if the writer had priority it would modify cnt first, and then it would be read):
Alternative Semantics
Much of what you want to do can probably be accomplished with less overhead. For example, in the classic reader-writer problem, readers shouldn’t need to block other readers.
You might be able to replace the reader-writer pattern with a publisher-consumer pattern that manages pointers to blocks of data with acquire-consume memory ordering. You only need locking at all if one thread needs to update the same block of memory after it was originally written.
POSIX and Linux have an implementation of reader-writer locks in the system library, which were designed to avoid starvation. This is most likely the high-level construct you want.
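For example, a minimal sketch of the POSIX construct (the writer-preference attribute is a non-portable glibc extension, so treat that part as an assumption about your platform):

#define _GNU_SOURCE
#include <pthread.h>

pthread_rwlock_t rwlock;
int shared_value;

void setup(void)
{
    pthread_rwlockattr_t attr;
    pthread_rwlockattr_init(&attr);
#ifdef __GLIBC__
    /* glibc extension: prefer waiting writers so they cannot starve */
    pthread_rwlockattr_setkind_np(&attr,
            PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
#endif
    pthread_rwlock_init(&rwlock, &attr);
    pthread_rwlockattr_destroy(&attr);
}

void read_value(void)
{
    pthread_rwlock_rdlock(&rwlock);  /* many readers may hold this at once */
    /* ... read shared_value ... */
    pthread_rwlock_unlock(&rwlock);
}

void write_value(void)
{
    pthread_rwlock_wrlock(&rwlock);  /* exclusive */
    shared_value *= 2;
    pthread_rwlock_unlock(&rwlock);
}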
If you still want to implement your own, one implementation would use a count of current readers, a count of pending writers and a flag that indicates whether a write is in progress. It packs all these values into an atomic bitfield that it updates with a compare-and-swap.
Reader threads would retrieve the value, check whether there are any starving writers waiting, and if not, increment the count of readers. If there are writers, it backs off (perhaps spinning and yielding the CPU, perhaps sleeping on a condition variable). If there is a write in progress, it waits for that to complete. If it sees only other reads in progress, it goes ahead.
Writer threads would check if there are any reads or writes in progress. If so, they increment the count of waiting writers, and wait. If not, they set the write-in-progress bit and proceed.
Packing all these fields into the same atomic bitfield guarantees that no thread will think it’s safe to use the buffer while another thread thinks it’s safe to write: if two threads try to update the state at the same time, one will always fail.
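A sketch of that packed-state approach with C11 atomics might look like this (the field widths and the yield-based backoff are my own choices, not a reference implementation):

#include <sched.h>
#include <stdatomic.h>

/* Hypothetical layout: bits 0-13 reader count, bits 14-27 waiting
 * writers, bit 28 write-in-progress. */
#define READERS(s)   ((s) & 0x3FFFu)
#define WAITERS(s)   (((s) >> 14) & 0x3FFFu)
#define WRITING(s)   ((s) & (1u << 28))
#define ONE_READER   1u
#define ONE_WAITER   (1u << 14)
#define WRITE_BIT    (1u << 28)

static _Atomic unsigned rw_state = 0;

void rd_lock(void) {
    for (;;) {
        unsigned s = atomic_load_explicit(&rw_state, memory_order_relaxed);
        /* back off if a write is in progress or writers are waiting */
        if (!WRITING(s) && WAITERS(s) == 0 &&
            atomic_compare_exchange_weak_explicit(&rw_state, &s,
                    s + ONE_READER,
                    memory_order_acquire, memory_order_relaxed))
            return;
        sched_yield();
    }
}

void rd_unlock(void) {
    atomic_fetch_sub_explicit(&rw_state, ONE_READER, memory_order_release);
}

void wr_lock(void) {
    /* register as a waiting writer so new readers back off */
    atomic_fetch_add_explicit(&rw_state, ONE_WAITER, memory_order_relaxed);
    for (;;) {
        unsigned s = atomic_load_explicit(&rw_state, memory_order_relaxed);
        if (READERS(s) == 0 && !WRITING(s) &&
            atomic_compare_exchange_weak_explicit(&rw_state, &s,
                    (s - ONE_WAITER) | WRITE_BIT,
                    memory_order_acquire, memory_order_relaxed))
            return;     /* exactly one waiting writer wins this CAS */
        sched_yield();
    }
}

void wr_unlock(void) {
    atomic_fetch_and_explicit(&rw_state, ~WRITE_BIT, memory_order_release);
}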
If You Stick With Semaphores
You can still have reader threads check sem_getvalue() on the writer semaphore and back off if they see any starved writers waiting. One method would be to wait on a condition variable that threads signal when they are done with the buffer. A reader that finds itself holding the mutex while writers are waiting can wake up one writer thread and go back to sleep, and a reader that sees only other readers waiting can wake up a reader, which will wake up the next reader, and so on.

Reader/Writer implementation in C

I'm currently learning about concurrency at my University. In this context I have to implement the reader/writer problem in C, and I think I'm on the right track.
My thinking on the problem is that we need two locks, rd_lock and wr_lock. When a writer thread wants to change our global variable, it tries to grab both locks, writes to the global, and unlocks. When a reader wants to read the global, it checks whether wr_lock is currently locked and then reads the value; however, one of the reader threads should grab rd_lock, while the other readers should not care whether rd_lock is locked.
I am not allowed to use the implementation already in the pthread library.
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct counter_st {
    int value;
} counter_t;

counter_t *counter;
pthread_t *threads;

int readers_tnum;
int writers_tnum;

pthread_mutex_t rd_lock;
pthread_mutex_t wr_lock;

void *reader_thread() {
    while (true) {
        pthread_mutex_lock(&rd_lock);
        pthread_mutex_trylock(&wr_lock);
        int value = counter->value;
        printf("%d\n", value);
        pthread_mutex_unlock(&rd_lock);
    }
}

void *writer_thread() {
    while (true) {
        pthread_mutex_lock(&wr_lock);
        pthread_mutex_lock(&rd_lock);
        // TODO: increment value of counter->value here.
        counter->value += 1;
        pthread_mutex_unlock(&rd_lock);
        pthread_mutex_unlock(&wr_lock);
    }
}

int main(int argc, char **args) {
    readers_tnum = atoi(args[1]);
    writers_tnum = atoi(args[2]);

    pthread_mutex_init(&rd_lock, 0);
    pthread_mutex_init(&wr_lock, 0);

    // Initialize our global variable
    counter = malloc(sizeof(counter_t));
    counter->value = 0;

    pthread_t *threads = malloc((readers_tnum + writers_tnum) * sizeof(pthread_t));
    int started_threads = 0;

    // Spawn reader threads
    for (int i = 0; i < readers_tnum; i++) {
        int code = pthread_create(&threads[started_threads], NULL, reader_thread, NULL);
        if (code != 0) {
            printf("Could not spawn a thread.");
            exit(-1);
        } else {
            started_threads++;
        }
    }

    // Spawn writer threads
    for (int i = 0; i < writers_tnum; i++) {
        int code = pthread_create(&threads[started_threads], NULL, writer_thread, NULL);
        if (code != 0) {
            printf("Could not spawn a thread.");
            exit(-1);
        } else {
            started_threads++;
        }
    }
}
Currently it just prints a lot of zeroes when run with 1 reader and 1 writer, which means that it never actually executes the code in the writer thread. I know that this is not going to work as intended with multiple readers; however, I don't understand what is wrong when running it with one of each.
Don't think of the locks as "reader lock" and "writer lock".
Because you need to allow multiple concurrent readers, readers cannot hold a mutex. (If they do, they are serialized; only one can hold a mutex at the same time.) They can take one for a short duration (before they begin the access, and after they end the access), to update state, but that's it.
Split the timeline for having a rwlock into three parts: "grab rwlock", "do work", "release rwlock".
For example, you could use one mutex, one condition variable, and a counter. The counter holds the number of active readers. The condition variable is signaled on by the last reader, and by writers just before they release the mutex, to wake up a waiting writer. The mutex protects both, and is held by writers for the whole duration of their write operation.
So, in pseudocode, you might have
Function rwlock_rdlock:
    Take mutex
    Increment counter
    Release mutex
End Function

Function rwlock_rdunlock:
    Take mutex
    Decrement counter
    If counter == 0, Then:
        Signal_on cond
    End If
    Release mutex
End Function

Function rwlock_wrlock:
    Take mutex
    While counter > 0:
        Wait_on cond
    End While
End Function

Function rwlock_wrunlock:
    Signal_on cond
    Release mutex
End Function
Remember that whenever you wait on a condition variable, the mutex is atomically released for the duration of the wait, and automatically grabbed when the thread wakes up. So, for waiting on a condition variable, a thread will have the mutex both before and after the wait, but not during the wait itself.
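In C with pthreads, that pseudocode becomes something like the following sketch (initialization and error checking omitted):

#include <pthread.h>

typedef struct {
    pthread_mutex_t mutex;   /* held by writers for their whole write */
    pthread_cond_t  cond;    /* signaled when a writer may proceed */
    unsigned        readers; /* number of active readers */
} my_rwlock_t;

void my_rwlock_rdlock(my_rwlock_t *rw) {
    pthread_mutex_lock(&rw->mutex);
    rw->readers++;
    pthread_mutex_unlock(&rw->mutex);
}

void my_rwlock_rdunlock(my_rwlock_t *rw) {
    pthread_mutex_lock(&rw->mutex);
    if (--rw->readers == 0)
        pthread_cond_signal(&rw->cond);   /* wake a waiting writer */
    pthread_mutex_unlock(&rw->mutex);
}

void my_rwlock_wrlock(my_rwlock_t *rw) {
    pthread_mutex_lock(&rw->mutex);       /* keep holding it while writing */
    while (rw->readers > 0)
        pthread_cond_wait(&rw->cond, &rw->mutex);
}

void my_rwlock_wrunlock(my_rwlock_t *rw) {
    pthread_cond_signal(&rw->cond);       /* next writer races new readers */
    pthread_mutex_unlock(&rw->mutex);
}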
Now, the above approach is not the only one you might implement.
In particular, you might note that in the above scheme, there is a different "unlock" operation you must use depending on whether you took a read or a write lock on the rwlock. In the POSIX pthread_rwlock_t implementation, there is just one pthread_rwlock_unlock().
Whatever scheme you design, it is important to examine whether it works correctly in all situations: a lone read-locker, a lone write-locker, several read-lockers, several write-lockers, a lone write-locker and one read-locker, a lone write-locker and several read-lockers, several write-lockers and a lone read-locker, and several read- and write-lockers.
For example, let's consider the case when there are several active readers, and a writer wants to write-lock the rwlock.
The writer grabs the mutex. It then notices that the counter is nonzero, so it starts waiting on the condition variable. When the last reader -- note how the order of the readers exiting does not matter, since a simple counter is used! -- unlocks its readlock on the rwlock, it signals on the condition variable, which wakes up the writer. The writer then grabs the mutex, sees the counter is zero, and proceeds to do its work. During that time, the mutex is held by the writer, so all new readers will block, until the writer releases the mutex. Because the writer will also signal on the condition variable when it releases the mutex, it is a race between other waiting writers and waiting readers, who gets to go next.

How should I simulate sem_wait with a count?

I'm using semaphore.h and would like to acquire n slots of a semaphore at once, instead of just one. POSIX does not provide this natively. How can I work around that? I'm bound to using semaphores; no other means of synchronization are possible.
I'm pondering using a binary semaphore with a separate counter variable, but that would, in my opinion, kind of defeat its purpose.
Since you have multiple threads contending for slots of the semaphore (else you wouldn't need semaphores at all), you need to protect against deadlock. For example, if your semaphore has four slots, and each of two threads is trying to acquire three, then they will deadlock if each manages to acquire two. It follows that you must protect access to the process of acquiring semaphore slots.
A binary semaphore protecting a counter is not sufficient to prevent the deadlock scenario described above. Moreover, if not enough slots are available at any given time then you must have some synchronous means to wait for more slots to become available. You can do the job with two semaphores, though, one to protect access to the semaphore acquisition process, and another carrying the actual slots being acquired. Something like this, for example:
#include <errno.h>
#include <semaphore.h>

#define DO_OR_RETURN(x) do { int _r; if ((_r = (x))) return _r; } while (0)

typedef struct multi_sem {
    sem_t sem_acquire_sem;   /* serializes multi-slot acquisitions */
    sem_t multislot_sem;     /* carries the actual slots */
} multisem;

int multisem_init(multisem *ms, unsigned int slots) {
    DO_OR_RETURN(sem_init(&ms->sem_acquire_sem, 0, 1));
    return sem_init(&ms->multislot_sem, 0, slots);
}

int multisem_wait(multisem *ms, unsigned int slots_to_acquire) {
    int result = 0;
    DO_OR_RETURN(sem_wait(&ms->sem_acquire_sem));
    while (slots_to_acquire) {
        result = sem_wait(&ms->multislot_sem);
        if (result == 0) {
            slots_to_acquire -= 1;
        } else if (errno == EINTR) {
            /* interrupted by a signal; try again */
        } else {
            /* undocumented error - should never happen */
            /* insert appropriate apocalypse response here */
            slots_to_acquire = 0; /* bail out */
        }
    }
    if (sem_post(&ms->sem_acquire_sem)) {
        /* big oops - no recovery possible - should never happen */
        /* insert appropriate apocalypse response here */
    }
    return result;
}

int multisem_post(multisem *ms, unsigned int slots_to_post) {
    while (slots_to_post) {
        DO_OR_RETURN(sem_post(&ms->multislot_sem));
        slots_to_post -= 1;
    }
    return 0;
}
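For example, a thread needing three of four slots would do:

multisem ms;
multisem_init(&ms, 4);    /* a multisem with four slots */

multisem_wait(&ms, 3);    /* block until three slots can be taken */
/* ... use the three resources ... */
multisem_post(&ms, 3);    /* return them */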
Do note that that is still susceptible to deadlock in the event that a thread tries to acquire slots of the multisem when it already holds at least one (among other ways). I think that risk is inherent in the problem.

Why read/write locks require a lock?

I came across the implementation of read/write locks below. Notice that at the beginning of each function there is a pthread_mutex_lock call. If it uses pthread_mutex_lock once anyway, what is the benefit of using read/write locks? How is it better than simply using pthread_mutex_lock?
int pthread_rwlock_rlock_np(pthread_rwlock_t *rwlock)
{
    pthread_mutex_lock(&(rwlock->mutex));

    rwlock->r_waiting++;
    while (rwlock->r_wait > 0)
    {
        pthread_cond_wait(&(rwlock->r_ok), &(rwlock->mutex));
    }
    rwlock->reading++;
    rwlock->r_waiting--;

    pthread_mutex_unlock(&(rwlock->mutex));
    return 0;
}

int pthread_rwlock_wlock_np(pthread_rwlock_t *rwlock)
{
    pthread_mutex_lock(&(rwlock->mutex));

    if (pthread_mutex_trylock(&(rwlock->w_lock)) == 0)
    {
        rwlock->r_wait = 1;
        rwlock->w_waiting++;
        while (rwlock->reading > 0)
        {
            pthread_cond_wait(&(rwlock->w_ok), &(rwlock->mutex));
        }
        rwlock->w_waiting--;

        pthread_mutex_unlock(&(rwlock->mutex));
        return 0;
    }
    else
    {
        rwlock->wu_waiting++;
        while (pthread_mutex_trylock(&(rwlock->w_lock)) != 0)
        {
            pthread_cond_wait(&(rwlock->w_unlock), &(rwlock->mutex));
        }
        rwlock->wu_waiting--;

        rwlock->r_wait = 1;
        rwlock->w_waiting++;
        while (rwlock->reading > 0)
        {
            pthread_cond_wait(&(rwlock->w_ok), &(rwlock->mutex));
        }
        rwlock->w_waiting--;

        pthread_mutex_unlock(&(rwlock->mutex));
        return 0;
    }
}
The rwlock->mutex mutex is used to protect the state of the rwlock structure itself, not the state of whatever the reader/writer lock may be protecting in your target program. This mutex is held only during the time the lock is acquired or released. It is entered into just briefly, to avoid corrupting the state that is needed for "bookkeeping" of the reader/writer lock itself. In contrast, the reader/writer lock may be held for an extended period of time by the callers performing the actual reads and writes on the structure the lock protects.
In both functions, rwlock->mutex is released before returning. That means that just because you hold an rwlock as reader or writer, doesn't mean you hold the mutex.
Half the point of a rwlock is that multiple readers can operate simultaneously, so that's the immediate advantage over just using a mutex. Those readers only hold the mutex briefly, in order to acquire the reader lock. They don't hold the mutex while they do their actual work.
It allows multiple reads OR a single write at a time, which is better than one read OR one write operation at a time.
How is it better than simply using pthread_mutex_lock?
Even with this implementation, the readers will run simultaneously in the critical section.
There could be a better (less portable) implementation that would use atomics in the "fast path" (locking would still be needed when waiting).

Concurrent Queue, C

So, I am trying to implement a concurrent queue in C. I have split the methods into "read methods" and "write methods". When accessing the write methods, like push() and pop(), I acquire a writer lock, and the same for the read methods. Also, we can have several readers but only one writer.
In order to get this to work in code, I have a mutex lock for the entire queue. And two condition locks - one for the writer and the other for the reader. I also have two integers keeping track of the number of readers and writers currently using the queue.
So my main question is - how to implement several readers accessing the read methods at the same time?
At the moment this is my general read method code (in pseudocode, not C; I am actually using pthreads):
mutex.lock();
while (nwriter > 0) {
    wait(&reader);
    mutex.unlock();
}
nreader++;
// Critical code
nreader--;
if (nreader == 0) {
    signal(&writer);
}
mutex.unlock();
So, imagine we have a reader which holds the mutex. Now any other reader which comes along and tries to get the mutex would not be able to. Wouldn't it block? Then how can many readers access the read methods at the same time?
Is my reasoning correct? If yes, how to solve the problem?
If this is not for an exercise, use read-write lock from pthreads (pthread_rwlock_* functions).
Also note that protecting individual calls with a lock still might not provide the necessary correctness guarantees. For example, typical code for popping an element from an STL queue is
if (!queue.empty()) {
    data = queue.top();
    queue.pop();
}
And this will fail in concurrent code even if locks are used inside the queue methods, because conceptually this code must be an atomic transaction, but the implementation does not provide such guarantees. A thread may pop a different element than the one it read with top(), or attempt to pop from an empty queue, etc.
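The same point in C, as a sketch: the emptiness test, the read, and the removal must all happen under one lock acquisition (the queue_*_unlocked helpers here are hypothetical):

int queue_try_pop(queue_t *q, item_t *out)
{
    int ok = 0;
    pthread_mutex_lock(&q->lock);
    if (!queue_empty_unlocked(q)) {      /* check ... */
        *out = queue_head_unlocked(q);   /* ... read ... */
        queue_remove_head_unlocked(q);   /* ... and remove, as one transaction */
        ok = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}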
Please find the following read/write functions.
In my functions, I used canRead and canWrite mutexes, and nReaders for the number of readers:
Write function:

lock(canWrite)      // Wait if the mutex is not free
// Write
unlock(canWrite)

Read function:

lock(canRead)       // This mutex protects nReaders
nReaders++          // Initial value should be 0 (no readers)
if (nReaders == 1)  // No other readers
{
    lock(canWrite)  // No writers can enter the critical section
}
unlock(canRead)

// Read

lock(canRead)
nReaders--
if (nReaders == 0)  // No more readers
{
    unlock(canWrite)  // A writer can enter the critical section
}
unlock(canRead)
A classic solution is multiple-readers, single-writer.
A data structure begins with no readers and no writers.
You permit any number of concurrent readers.
When a writer comes along, you block him till all current readers complete; then you let him go (any new readers and writers which come along while the writer is blocked queue up behind him, in order).
You may try the lfqueue library; it is native C, lock-free, and suitable for cross-platform use.
For example:
int *int_data;
int i = 0;    /* value counter for the example */
lfqueue_t my_queue;

if (lfqueue_init(&my_queue) == -1)
    return -1;

/** Wrap this scope in other threads **/
int_data = (int *) malloc(sizeof(int));
assert(int_data != NULL);
*int_data = i++;
/* Enqueue */
while (lfqueue_enq(&my_queue, int_data) == -1) {
    printf("ENQ Full ?\n");
}

/** Wrap this scope in other threads **/
/* Dequeue */
while ((int_data = lfqueue_deq(&my_queue)) == NULL) {
    printf("DEQ EMPTY ..\n");
}
// printf("%d\n", *(int *) int_data);
free(int_data);
/** End **/

lfqueue_destroy(&my_queue);
