I came across this problem while learning more about operating systems. In my code, I first gave readers priority, and it worked, so I then modified it to give writers priority instead. When I ran the modified code, the output was exactly the same, as if the writer did not have priority at all. Here is the code with comments. I am not sure what I've done wrong, since I changed a good deal of the code, yet the output is identical to the unmodified version's.
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
/*
This program provides a possible solution to the readers-writers problem using
mutexes and semaphores. It uses 10 readers and 5 writers to demonstrate the
solution; you can always play with these values.
*/
// Semaphore initialization for writer and reader
sem_t wrt;
sem_t rd;
// Mutex 1 blocks other readers, mutex 2 blocks other writers
pthread_mutex_t mutex1;
pthread_mutex_t mutex2;
// Value the writer is changing, we are simply multiplying this value by 2
int cnt = 2;
int numreader = 0;
int numwriter = 0;
void *writer(void *wno)
{
    pthread_mutex_lock(&mutex2);
    numwriter++;
    if (numwriter == 1) {
        sem_wait(&rd);          // First writer locks the readers out
    }
    pthread_mutex_unlock(&mutex2);

    sem_wait(&wrt);
    // Writing section
    cnt = cnt * 2;
    printf("Writer %d modified cnt to %d\n", *((int *)wno), cnt);
    sem_post(&wrt);

    pthread_mutex_lock(&mutex2);
    numwriter--;
    if (numwriter == 0) {
        sem_post(&rd);          // Last writer lets the readers back in
    }
    pthread_mutex_unlock(&mutex2);
    return NULL;
}
void *reader(void *rno)
{
    sem_wait(&rd);
    pthread_mutex_lock(&mutex1);
    numreader++;
    if (numreader == 1) {
        sem_wait(&wrt);         // First reader locks the writers out
    }
    pthread_mutex_unlock(&mutex1);
    sem_post(&rd);

    // Reading section
    printf("Reader %d: read cnt as %d\n", *((int *)rno), cnt);

    pthread_mutex_lock(&mutex1);
    numreader--;
    if (numreader == 0) {
        sem_post(&wrt);         // Last reader lets the writers back in
    }
    pthread_mutex_unlock(&mutex1);
    return NULL;
}
int main()
{
    pthread_t read[10], write[5];
    pthread_mutex_init(&mutex1, NULL);
    pthread_mutex_init(&mutex2, NULL);
    sem_init(&wrt, 0, 1);
    sem_init(&rd, 0, 1);

    int a[10] = {1,2,3,4,5,6,7,8,9,10}; // Just used for numbering the writers and readers

    for (int i = 0; i < 5; i++) {
        pthread_create(&write[i], NULL, writer, (void *)&a[i]);
    }
    for (int i = 0; i < 10; i++) {
        pthread_create(&read[i], NULL, reader, (void *)&a[i]);
    }
    for (int i = 0; i < 5; i++) {
        pthread_join(write[i], NULL);
    }
    for (int i = 0; i < 10; i++) {
        pthread_join(read[i], NULL);
    }

    pthread_mutex_destroy(&mutex1);
    pthread_mutex_destroy(&mutex2);
    sem_destroy(&wrt);
    sem_destroy(&rd);
    return 0;
}
The output for both versions is the same. I think that if the writer had priority, the value would be modified first and only then read.
Alternative Semantics
Much of what you want to do can probably be accomplished with less overhead. For example, in the classic reader-writer problem, readers shouldn’t need to block other readers.
You might be able to replace the reader-writer pattern with a publisher-consumer pattern that manages pointers to blocks of data with acquire-consume memory ordering. You only need locking at all if one thread needs to update the same block of memory after it was originally written.
POSIX and Linux have an implementation of reader-writer locks in the system library, which were designed to avoid starvation. This is most likely the high-level construct you want.
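For instance, a minimal sketch of the POSIX rwlock API; the writer-preference attribute shown is a glibc extension (the _np suffix means non-portable), so treat its availability as an assumption about your platform:

#include <pthread.h>

pthread_rwlock_t rwlock;

void rwlock_setup(void) {
    pthread_rwlockattr_t attr;
    pthread_rwlockattr_init(&attr);
#ifdef __GLIBC__
    /* glibc extension: prefer writers so they cannot starve */
    pthread_rwlockattr_setkind_np(&attr, PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
#endif
    pthread_rwlock_init(&rwlock, &attr);
    pthread_rwlockattr_destroy(&attr);
}

void reader_side(void) {
    pthread_rwlock_rdlock(&rwlock);
    /* read the shared data */
    pthread_rwlock_unlock(&rwlock);
}

void writer_side(void) {
    pthread_rwlock_wrlock(&rwlock);
    /* modify the shared data */
    pthread_rwlock_unlock(&rwlock);
}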
If you still want to implement your own, one implementation would use a count of current readers, a count of pending writers and a flag that indicates whether a write is in progress. It packs all these values into an atomic bitfield that it updates with a compare-and-swap.
Reader threads would retrieve the value and check whether any starving writers are waiting. If there are, the reader backs off (perhaps spinning and yielding the CPU, perhaps sleeping on a condition variable). If there is a write in progress, it waits for that to complete. If it sees only other reads in progress, it increments the count of readers and goes ahead.
Writer threads would check if there are any reads or writes in progress. If so, they increment the count of waiting writers, and wait. If not, they set the write-in-progress bit and proceed.
Packing all these fields into the same atomic bitfield guarantees that no thread will think it’s safe to use the buffer while another thread thinks it’s safe to write: if two threads try to update the state at the same time, one will always fail.
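A minimal sketch of that idea in C11 atomics, spinning with sched_yield() instead of sleeping on a condition variable; the field layout, widths, and names are illustrative assumptions, not a canonical design:

#include <stdatomic.h>
#include <sched.h>

/* Bit 31: write in progress; bits 16-30: waiting writers; bits 0-15: active readers. */
#define WRITE_ACTIVE (1u << 31)
#define WRITER_ONE   (1u << 16)
#define WRITERS_MASK (0x7fffu << 16)
#define READERS_MASK 0xffffu

static _Atomic unsigned rw_state = 0;

static void read_lock(void) {
    for (;;) {
        unsigned s = atomic_load(&rw_state);
        /* Back off if a write is running or any writers are starving. */
        if ((s & (WRITE_ACTIVE | WRITERS_MASK)) != 0) { sched_yield(); continue; }
        if (atomic_compare_exchange_weak(&rw_state, &s, s + 1))
            return;                       /* one more active reader */
    }
}

static void read_unlock(void) {
    atomic_fetch_sub(&rw_state, 1);
}

static void write_lock(void) {
    atomic_fetch_add(&rw_state, WRITER_ONE);   /* announce a pending writer */
    for (;;) {
        unsigned s = atomic_load(&rw_state);
        if ((s & (WRITE_ACTIVE | READERS_MASK)) != 0) { sched_yield(); continue; }
        if (atomic_compare_exchange_weak(&rw_state, &s,
                                         (s - WRITER_ONE) | WRITE_ACTIVE))
            return;                       /* swapped ourselves from pending to writing */
    }
}

static void write_unlock(void) {
    atomic_fetch_and(&rw_state, ~WRITE_ACTIVE);
}

Because every state transition goes through a single compare-and-swap on the same word, a reader can never slip in between a writer's check and its update.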
If You Stick With Semaphores
You can still have reader threads check sem_getvalue() on the writer semaphore, and back off if they see any starved writers waiting. One method would be to wait on a condition variable that threads signal when they are done with the buffer. A reader that wakes up while writers are waiting can wake one writer thread and go back to sleep, and a reader that sees only other readers waiting can wake up a reader, which will wake up the next reader, and so on.
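As a rough sketch of the check itself, on the reader's entry path (note that when threads are blocked on a semaphore, POSIX allows sem_getvalue() to report either 0 or a negative waiter count, and Linux reports 0, so this is a heuristic at best):

int val;
if (sem_getvalue(&wrt, &val) == 0 && val <= 0) {
    /* the writer semaphore is taken or has waiters: back off */
    sched_yield();   /* or sleep on the condition variable described above */
} else {
    sem_wait(&rd);   /* proceed into the reader entry section as before */
}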
Related
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <semaphore.h>

#define N 10

sem_t empty, full, mutex;

void *producerThread(void *arg) {
    int i = 0;
    while (1) {
        sem_wait(&empty);
        sem_wait(&mutex);
        buff[i] = rand();
        i = (i + 1) % N;
        sem_post(&mutex);
        sem_post(&full);
    }
}

void *consumerThread(void *arg) {
    int i = 0;
    while (1) {
        sem_wait(&full);
        sem_wait(&mutex);
        printf("%d\n", buff[i]);
        i = (i + 1) % N;
        sem_post(&mutex);
        sem_post(&empty);
    }
}

int main(void) {
    pthread_t producer, consumer;
    sem_init(&empty, 0, N);
    sem_init(&full, 0, 0);
    sem_init(&mutex, 0, 1);
    pthread_create(&producer, NULL, producerThread, NULL);
    pthread_create(&consumer, NULL, consumerThread, NULL);
    pthread_join(producer, NULL);
    pthread_join(consumer, NULL);
    sem_destroy(&empty);
    sem_destroy(&full);
    sem_destroy(&mutex);
    return 0;
}
I have the following question. This code is the well-known producer-consumer problem from multi-threading courses, but I do not understand why we need the additional semaphore (mutex) in this case. Can't we do everything with the full and empty semaphores alone, with no risk of the producer producing into a slot the consumer hasn't yet consumed, or vice versa? As far as I can tell, the mutex only adds extra baggage to the code. Can someone point out to me why we need 3 semaphores instead of 2?
I've tried running this code on my computer, and everything works the same with and without the additional semaphore, so I do not understand why the author chose 3 semaphores in this instance.
The empty and full semaphores take values in the range [0..N] (empty starts at N, full starts at 0, and each element in flight moves one unit between them), allowing the producer to run ahead of the consumer by up to N elements.
The mutex semaphore only bounces between the values 0 and 1, and enforces a critical section ensuring that only one thread is touching any part of the buffer memory at a time. However, the separate computation of i on each thread and the empty/full handshake ensure there can be no data race on individual elements of buff, so that critical section is probably overkill.
You don't show the definition of buff. For a sufficiently narrow element type (like individual bytes), some architectures may exhibit word tearing on concurrent writes to adjacent elements. However, in your example only one thread performs writes, so even in the presence of word tearing, the concurrent adjacent reads are unlikely to observe a problem.
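To illustrate, a sketch of the producer with the critical section removed, assuming a definition like int buff[N]; and exactly one producer and one consumer:

void *producerThread(void *arg) {
    int i = 0;
    while (1) {
        sem_wait(&empty);     /* reserve a free slot */
        buff[i] = rand();     /* sole writer of this slot */
        i = (i + 1) % N;
        sem_post(&full);      /* hand the slot to the consumer */
    }
}

The consumer mirrors this with full and empty swapped. With more than one producer or consumer, the index i would become shared state, and the mutex (or per-side locks) would be needed again.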
I have a problem with the readers-writers problem. I want to write a writers-preference solution using mutexes. So far I have written this:
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <pthread.h>
#include <memory.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>
int NO_READERS;
int NO_WRITERS;
int NO_READERS_READING = 0; // How many readers need shared resources
int NO_WRITERS_WRITING = 0; // How many writers need shared resources
pthread_mutex_t resourceMutex = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t tryResourceMutex = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t readerMutex = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t writerMutex = PTHREAD_MUTEX_INITIALIZER;
void *readerJob(void *arg) {
    int *id = (int *)arg;
    while (1) {
        pthread_mutex_lock(&tryResourceMutex); // Indicate reader is trying to enter
        pthread_mutex_lock(&readerMutex);
        NO_READERS_READING++; // Indicate that you are needing the shared resource (one more reader)
        if (NO_READERS_READING == 1) {
            pthread_mutex_lock(&resourceMutex);
        }
        pthread_mutex_unlock(&readerMutex);
        pthread_mutex_unlock(&tryResourceMutex);
        printf("READER ID %d WALKED IN \n", *id);
        printf("ReaderQ: %d , WriterQ: %d [in: R:%d W:%d]\n",
               NO_READERS - NO_READERS_READING,
               NO_WRITERS - NO_WRITERS_WRITING,
               NO_READERS_READING,
               NO_WRITERS_WRITING);
        sleep(1);
        pthread_mutex_lock(&readerMutex);
        NO_READERS_READING--;
        if (NO_READERS_READING == 0) { // Check if you are the last reader
            pthread_mutex_unlock(&resourceMutex);
        }
        pthread_mutex_unlock(&readerMutex);
    }
    return 0;
}
void *writerJob(void *arg) {
    int *id = (int *)arg;
    while (1) {
        pthread_mutex_lock(&writerMutex);
        NO_WRITERS_WRITING++;
        if (NO_WRITERS_WRITING == 1) {
            pthread_mutex_lock(&tryResourceMutex); // If there are no other writers, lock the readers out
        }
        pthread_mutex_unlock(&writerMutex);
        pthread_mutex_lock(&resourceMutex);
        printf("WRITER ID %d WALKED IN \n", *id);
        printf("ReaderQ: %d , WriterQ: %d [in: R:%d W:%d]\n",
               NO_READERS - NO_READERS_READING,
               NO_WRITERS - NO_WRITERS_WRITING,
               NO_READERS_READING,
               NO_WRITERS_WRITING);
        sleep(1);
        pthread_mutex_unlock(&resourceMutex);
        pthread_mutex_lock(&writerMutex);
        NO_WRITERS_WRITING--;
        if (NO_WRITERS_WRITING == 0) {
            pthread_mutex_unlock(&tryResourceMutex); // If there are no writers left, unlock the readers
        }
        pthread_mutex_unlock(&writerMutex);
    }
    return 0;
}
int main(int argc, char *argv[]) {
    NO_READERS = atoi(argv[1]);
    NO_WRITERS = atoi(argv[2]);
    // Initialize arrays of thread IDs
    pthread_t *readersThreadsIds = malloc(NO_READERS * sizeof(pthread_t));
    pthread_t *writersThreadsIds = malloc(NO_WRITERS * sizeof(pthread_t));
    // Initialize shared memory (array) with random numbers
    // Create readers threads
    for (int i = 0; i < NO_READERS; ++i) {
        int *id = (int *)malloc(sizeof(int));
        *id = i;
        pthread_create(&readersThreadsIds[i], NULL, readerJob, (void *)id);
    }
    // Create writers threads
    for (int i = 0; i < NO_WRITERS; ++i) {
        int *id = (int *)malloc(sizeof(int));
        *id = i;
        pthread_create(&writersThreadsIds[i], NULL, writerJob, (void *)id);
    }
    // Wait for readers to finish
    for (int i = 0; i < NO_READERS; ++i) {
        pthread_join(readersThreadsIds[i], NULL);
    }
    // Wait for writers to finish
    for (int i = 0; i < NO_WRITERS; ++i) {
        pthread_join(writersThreadsIds[i], NULL);
    }
    free(readersThreadsIds);
    free(writersThreadsIds);
    pthread_mutex_destroy(&resourceMutex);
    pthread_mutex_destroy(&tryResourceMutex);
    pthread_mutex_destroy(&readerMutex);
    pthread_mutex_destroy(&writerMutex);
    return 0;
}
And I'm not sure if this works the way it should. Can anyone check this for me? I want to see information about which reader or writer is going in or out. It seems to get stuck at some point, but I don't know why.
It seems to do what you want, that is, give preference to the writers. Because your threads loop, acquiring and releasing the lock, if you have more than one writer, the writers will take turns passing it between themselves and starve the readers. That is, every time one releases resourceMutex, another writer is already waiting on it, so NO_WRITERS_WRITING will never hit zero.
To see it operating as intended, add a delay at the top of the while loop of each thread:
usleep((rand() % 10000) * 10000);
That will permit the readers to periodically get access, whenever all the writers are in the usleep().
At the beginning all readers are coming in,
By "coming in", I take you to mean executing the printf() calls in the readerJob loop. It's not surprising that the readers all come in first, because you start them first, and in the likely event that the first reader thread to attempt to lock tryResourceMutex does so before any writer thread does, it will then lock resourceMutex(), too, preventing any writer from "coming in". But that does not prevent writers from incrementing NO_WRITERS_WRITING. And it also does not preventing one of them from locking tryResourceMutex and holding it locked.
The sleep() call in the reader will then (probably) cause resourceMutex to be held continuously long enough that all the readers come in before any of the writers do, since each writer needs to acquire resourceMutex to come in.
then also writers which shouldn't be possible at the same time.
I don't see that in my tests. But I do see what I already described: the writer count increases from zero, even though the writers are prevented from coming in while any readers are inside. In effect, the name of your variable NO_WRITERS_WRITING is inconsistent with your actual usage -- it indicates how many writers are writing or waiting to write.
When the readers leave they are blocked from reentering right away because one of the writers holds tryResourceMutex. Eventually, then, the last reader will exit and release the resourceMutex. This will allow the writers to proceed, one at a time, but with the sleep() call positioned where it is in the writer loop, it is extremely unlikely that the number of writers will ever fall to zero to allow any of the readers to re-enter. If it did, however, then very likely the same cycle would repeat: all of the readers would enter, once, while all the writers queue up.
Then all readers are gone but there are more than one writer at the same time in library.
Again, no. Only one writer is inside at a time, but the others are queued most of the time, so NO_WRITERS_WRITING will almost always be equal to NO_WRITERS.
Bottom line, then: you have confused yourself. You are using variable NO_WRITERS_WRITING primarily to represent the number of writers that are ready to write, but your messaging uses it as if it were the number actually writing. The same does not apply to NO_READERS_READING because once a thread acquires the mutex needed to modify that variable, nothing else prevents it from proceeding on into the room.
One more thing: to make the simulation interesting -- i.e. to keep the writers from taking permanent control -- you should implement a delay, preferably a random one, after each thread leaves the room, before it tries to reenter. And the delay for writers should probably be substantially longer than the delay for readers.
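For example, something like this at the bottom of each loop iteration (the exact values are illustrative assumptions, not tuned):

usleep(100000 + rand() % 400000);    /* reader: pause 0.1 - 0.5 s before re-entering */
usleep(500000 + rand() % 1500000);   /* writer: pause 0.5 - 2.0 s before re-entering */

(rand() and usleep() come from <stdlib.h> and <unistd.h>, which the program already includes.)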
I'm currently learning about concurrency at my University. In this context I have to implement the reader/writer problem in C, and I think I'm on the right track.
My thought on the problem is that we need two locks, rd_lock and wr_lock. When a writer thread wants to change our global variable, it tries to grab both locks, writes to the global, and unlocks. When a reader wants to read the global, it checks whether wr_lock is currently locked and then reads the value; however, only one of the reader threads should grab rd_lock, while the other readers should not care whether rd_lock is locked.
I am not allowed to use the implementation already in the pthread library.
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <pthread.h>

typedef struct counter_st {
    int value;
} counter_t;

counter_t *counter;
pthread_t *threads;
int readers_tnum;
int writers_tnum;
pthread_mutex_t rd_lock;
pthread_mutex_t wr_lock;

void *reader_thread(void *arg) {
    while (true) {
        pthread_mutex_lock(&rd_lock);
        pthread_mutex_trylock(&wr_lock); // result ignored; wr_lock is never unlocked here
        int value = counter->value;
        printf("%d\n", value);
        pthread_mutex_unlock(&rd_lock);
    }
}

void *writer_thread(void *arg) {
    while (true) {
        pthread_mutex_lock(&wr_lock);
        pthread_mutex_lock(&rd_lock);
        // TODO: increment value of counter->value here.
        counter->value += 1;
        pthread_mutex_unlock(&rd_lock);
        pthread_mutex_unlock(&wr_lock);
    }
}
int main(int argc, char **args) {
    readers_tnum = atoi(args[1]);
    writers_tnum = atoi(args[2]);
    pthread_mutex_init(&rd_lock, 0);
    pthread_mutex_init(&wr_lock, 0);
    // Initialize our global variable
    counter = malloc(sizeof(counter_t));
    counter->value = 0;
    pthread_t *threads = malloc((readers_tnum + writers_tnum) * sizeof(pthread_t));
    int started_threads = 0;
    // Spawn reader threads
    for (int i = 0; i < readers_tnum; i++) {
        int code = pthread_create(&threads[started_threads], NULL, reader_thread, NULL);
        if (code != 0) {
            printf("Could not spawn a thread.");
            exit(-1);
        } else {
            started_threads++;
        }
    }
    // Spawn writer threads
    for (int i = 0; i < writers_tnum; i++) {
        int code = pthread_create(&threads[started_threads], NULL, writer_thread, NULL);
        if (code != 0) {
            printf("Could not spawn a thread.");
            exit(-1);
        } else {
            started_threads++;
        }
    }
}
Currently it just prints a lot of zeroes when run with 1 reader and 1 writer, which means that it never actually executes the code in the writer thread. I know that this is not going to work as intended with multiple readers, but I don't understand what is wrong when running it with one of each.
Don't think of the locks as "reader lock" and "writer lock".
Because you need to allow multiple concurrent readers, readers cannot hold a mutex. (If they do, they are serialized; only one can hold a mutex at the same time.) They can take one for a short duration (before they begin the access, and after they end the access), to update state, but that's it.
Split the timeline for having a rwlock into three parts: "grab rwlock", "do work", "release rwlock".
For example, you could use one mutex, one condition variable, and a counter. The counter holds the number of active readers. The condition variable is signaled on by the last reader, and by writers just before they release the mutex, to wake up a waiting writer. The mutex protects both, and is held by writers for the whole duration of their write operation.
So, in pseudocode, you might have
Function rwlock_rdlock:
    Take mutex
    Increment counter
    Release mutex
End Function

Function rwlock_rdunlock:
    Take mutex
    Decrement counter
    If counter == 0, Then:
        Signal_on cond
    End If
    Release mutex
End Function

Function rwlock_wrlock:
    Take mutex
    While counter > 0:
        Wait_on cond
    End While
    (mutex stays held for the duration of the write)
End Function

Function rwlock_wrunlock:
    Signal_on cond
    Release mutex
End Function
Remember that whenever you wait on a condition variable, the mutex is atomically released for the duration of the wait, and automatically grabbed when the thread wakes up. So, for waiting on a condition variable, a thread will have the mutex both before and after the wait, but not during the wait itself.
Now, the above approach is not the only one you might implement.
In particular, you might note that in the above scheme, there is a different "unlock" operation you must use, depending on whether you took a read or a write lock on the rwlock. In the POSIX pthread_rwlock_t implementation, there is just one pthread_rwlock_unlock().
Whatever scheme you design, it is important to examine whether it works correctly in all situations: a lone read-locker, a lone write-locker, several read-lockers, several write-lockers, a lone write-locker and one read-locker, a lone write-locker and several read-lockers, several write-lockers and a lone read-locker, and several read- and write-lockers.
For example, let's consider the case when there are several active readers, and a writer wants to write-lock the rwlock.
The writer grabs the mutex. It then notices that the counter is nonzero, so it starts waiting on the condition variable. When the last reader -- note how the order of the readers exiting does not matter, since a simple counter is used! -- unlocks its readlock on the rwlock, it signals on the condition variable, which wakes up the writer. The writer then grabs the mutex, sees the counter is zero, and proceeds to do its work. During that time, the mutex is held by the writer, so all new readers will block, until the writer releases the mutex. Because the writer will also signal on the condition variable when it releases the mutex, it is a race between other waiting writers and waiting readers, who gets to go next.
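For concreteness, here is a direct C rendering of the pseudocode above -- a sketch only, with the caveats already mentioned: writers can starve if readers keep arriving, and the write-side unlock must be the one that releases the mutex:

#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  c = PTHREAD_COND_INITIALIZER;
static int counter = 0;            /* number of active readers */

void rwlock_rdlock(void) {
    pthread_mutex_lock(&m);
    counter++;
    pthread_mutex_unlock(&m);
}

void rwlock_rdunlock(void) {
    pthread_mutex_lock(&m);
    if (--counter == 0)
        pthread_cond_signal(&c);   /* last reader wakes a waiting writer */
    pthread_mutex_unlock(&m);
}

void rwlock_wrlock(void) {         /* returns with the mutex held */
    pthread_mutex_lock(&m);
    while (counter > 0)
        pthread_cond_wait(&c, &m);
}

void rwlock_wrunlock(void) {       /* releases the mutex taken in rwlock_wrlock */
    pthread_cond_signal(&c);
    pthread_mutex_unlock(&m);
}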
This question is a follow-up to a question which happened to be more complex than I had initially thought it would be. In a program I'm writing, the main thread takes care of GUI-driven data updates, a producer thread (with a number of sub-threads, because the producer task is "embarrassingly parallel") writes to the circular buffer, and the real-time consumer thread reads from it. The original development platform was OSX/Darwin, but I'd like to make the code more portable, UNIX source compatible. Everything can easily be written in POSIX, except for the following OSX-specific GCD call, for which I can't find a POSIX equivalent, if any. It launches the producer thread, from which its subthreads are launched programmatically, depending on the number of available logical CPU cores:
void dproducer (bool on, int cpuNum, uData* data)
{
if (on == true)
{
data->state = starting;
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0), ^{
producerSum(on, cpuNum, data);
});
}
return;
}
[Block diagram of the program omitted.]
For clarity I'm also adding the producerSum code. It's an infinite loop whose execution can either wait for the consumer thread to do the work, or get interrupted by changing data->state, which has global scope:
void producerSum(bool on, int cpuNum, uData *data)
{
    int rc;
    pthread_t threads[cpuNum];  // subthreads
    tData thread_args[cpuNum];
    void *resulT;
    static float frames[4096];
    while (on) {
        memset(frames, 0, 4096 * sizeof(float));
        if ((fbuffW = (float **)calloc(cpuNum + 1, sizeof(float *))) != NULL)
            for (int i = 0; i < cpuNum; ++i) {
                fbuffW[i] = (float *)calloc(data->frames, sizeof(float));
                thread_args[i].tid = i;            // ordinal number of thread
                thread_args[i].cpuCount = cpuNum;  // counter increment step
                thread_args[i].data = data;
                rc = pthread_create(&threads[i], NULL, producerTN, (void *)&thread_args[i]);
                if (rc != 0) printf("rc = %s\n", strerror(rc));
            }
        for (int i = 0; i < cpuNum; ++i) rc = pthread_join(threads[i], &resulT);
        // each subthread writes to fbuffW[i] and results get summed
        for (UInt32 samp = 0; samp < data->frames; samp++)
            for (int i = 0; i < cpuNum; i++)
                frames[samp] += fbuffW[i][samp];
        switch (data->state) { ... }  // graceful interruption, resuming and termination mechanism
        { … }                         // simple code for copying frames[] to the circular buffer
        pthread_cond_wait(&cond, &mutex);  // wait for the real-time consumer
        for (int i = 0; i < cpuNum; i++) free(fbuffW[i]);
        free(fbuffW);
    } // end while(on)
    return;
}
The syncing inside the producer thread is handled by pthread_create() and pthread_join(), while the necessary coordination between the producer and consumer threads is handled by a pthread_mutex_t variable and a pthread_cond_t variable (with the corresponding locking, unlocking, broadcasting and waiting calls). uData is a program-defined struct (or class instance). Any direction to look in would help indeed.
Thanks for reading this post!
A dispatch queue is just what it sounds like: a queue, as in the standard FIFO list data structure. It holds tasks. Those tasks can be represented by Objective-C Blocks as in your code or by function pointers and context pointer values. You'll presumably need to avoid Blocks if you're aiming for cross-platform compatibility. In fact, since you only ever dispatch one task, your tasks can just encapsulate the parameters (on, cpuNum, and data) and not the code (the call to producerSum()).
The queues are serviced by threads from a thread pool. GCD manages the threads and pool. At least on OS X, there's integration with the OS such that the pool's size is governed by overall system load, which you won't be able to reproduce in a cross-platform manner.
Operations on a dispatch queue are thread-safe. This includes adding tasks to them and the worker threads removing tasks from them.
You're going to have to implement all of this. It's definitely possible, but it will be a bother. In many ways, the queues and the thread pool constitute a producer-consumer architecture. Basically, your GCD-based solution was a bit of a cheat because you just used a producer-consumer API to implement your producer-consumer design. Now, you're going to have to really implement a producer-consumer design without the crutch.
There's basically no more to it than the thread-creation and POSIX condition variables you're already using.
dispatch_async() is basically just locking the mutex for the queue of tasks, adding the task to the queue, signalling the condition variable, and unlocking the mutex. Each worker thread will just wait on the condition variable and, when it wakes, lock the mutex, pop a task off the queue if there's one, unlock the mutex, and run the task if it got one. You probably also need a mechanism to signal the worker thread that it's time to gracefully terminate.
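In rough C, the core of such a queue might look like this; the names task_t, queue_push and worker_main are invented for illustration, and graceful shutdown is omitted:

#include <pthread.h>
#include <stdlib.h>

typedef struct task {
    void (*fn)(void *);   /* function pointer instead of a Block */
    void *ctx;
    struct task *next;
} task_t;

static task_t *head = NULL, *tail = NULL;
static pthread_mutex_t qmutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t qcond = PTHREAD_COND_INITIALIZER;

void queue_push(void (*fn)(void *), void *ctx) {   /* the dispatch_async() stand-in */
    task_t *t = malloc(sizeof *t);
    t->fn = fn;
    t->ctx = ctx;
    t->next = NULL;
    pthread_mutex_lock(&qmutex);
    if (tail) tail->next = t; else head = t;
    tail = t;
    pthread_cond_signal(&qcond);                   /* wake one worker */
    pthread_mutex_unlock(&qmutex);
}

void *worker_main(void *unused) {
    for (;;) {
        pthread_mutex_lock(&qmutex);
        while (head == NULL)
            pthread_cond_wait(&qcond, &qmutex);    /* sleep until work arrives */
        task_t *t = head;
        head = t->next;
        if (head == NULL) tail = NULL;
        pthread_mutex_unlock(&qmutex);
        t->fn(t->ctx);                             /* run the task outside the lock */
        free(t);
    }
    return NULL;                                   /* unreachable without a shutdown flag */
}

Spawn as many worker_main threads as you want pool workers; pushing a small wrapper around producerSum() then replaces the dispatch_async() call.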
I am working on a project with a user-defined number of threads; I am using 7 at the moment. I have a while loop that runs in each thread, but I need all of the threads to wait for each other at the end of the while loop. The tricky thing is that the threads do not all end on the same number of passes through the loop.
void *entryFunc(void *param)
{
    int *i = (int *)param;
    int nextPrime;
    int p = latestPrime;
    while (latestPrime < testLim)
    {
        sem_wait(&sem);
        nextPrime = findNextPrime(latestPrime);
        if (nextPrime != -1)
        {
            latestPrime = nextPrime;
            p = latestPrime;
        }
        else
        {
            sem_post(&sem);
            break;
        }
        sem_post(&sem);
        if (p < 46341)
        {
            incrementNotPrimes(p);
        }
        /*
        sem_wait(&sem2);
        doneCount++;
        sem_post(&sem2);
        while(go != 1);
        sem_wait(&sem2);
        doneCount--;
        //sem_post(&sem3);
        sem_post(&sem2);
        */
    }
    return NULL;
}
The commented-out chunk of code is part of my last attempt at solving this problem. That is where the threads all need to wait for each other. I have a feeling I am missing something simple.
If your problem is that the while loop has a different number of iterations on each thread, and some threads never reach the synchronization point after exiting the loop, you could use a barrier (a minimal sketch follows below).
However, you need to decrease the number of threads at the barrier after each thread exits. Waiting at the barrier ends once count threads have reached that point.
So you need to update the barrier object each time a thread finishes, and make sure you do this atomically.
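For reference, here is a minimal fixed-count sketch of the POSIX barrier API. NTHREADS and ROUNDS are illustrative values, and this sketch assumes every thread runs the same number of rounds; as noted above, threads that exit early would require adjusting the barrier's count, which this sketch does not attempt:

#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define ROUNDS   3

static pthread_barrier_t barrier;

static void *worker(void *arg) {
    int id = *(int *)arg;
    for (int r = 0; r < ROUNDS; r++) {
        printf("thread %d finished round %d\n", id, r);
        pthread_barrier_wait(&barrier);   /* nobody proceeds until all arrive */
    }
    return NULL;
}

int main(void) {
    pthread_t t[NTHREADS];
    int ids[NTHREADS];
    pthread_barrier_init(&barrier, NULL, NTHREADS);
    for (int i = 0; i < NTHREADS; i++) {
        ids[i] = i;
        pthread_create(&t[i], NULL, worker, &ids[i]);
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    pthread_barrier_destroy(&barrier);
    return 0;
}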
As I mentioned in the comments, you should use a barrier instead of a semaphore for this kind of situation, since it should be simpler to implement (barriers were designed exactly to solve this problem). However, you may still use a semaphore with a little bit of arithmetic: your goal is to have all threads execute the same code path, but somehow the last thread to finish its task should wake all the other threads up. One way to achieve that is to have an atomic counter at the end of the function, which each thread decrements; the thread that brings the counter to 0 calls sem_post as many times as necessary to release all the waiting threads, instead of issuing a sem_wait like the others.
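A sketch of this first method using C11 atomics; the helper name wait_for_all is mine, THREAD_COUNT is assumed to match your thread count, and each thread would call the helper once at the end of entryFunc:

#include <semaphore.h>
#include <stdatomic.h>

#define THREAD_COUNT 7                       /* assumed; match your thread count */

static _Atomic int remaining = THREAD_COUNT;
static sem_t sem;                            /* initialized with sem_init(&sem, 0, 0) */

static void wait_for_all(void) {
    if (atomic_fetch_sub(&remaining, 1) == 1) {
        /* last thread through: release the THREAD_COUNT - 1 waiters */
        for (int j = 0; j < THREAD_COUNT - 1; j++)
            sem_post(&sem);
    } else {
        sem_wait(&sem);                      /* not last: block until released */
    }
}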
A second method, this time using only the semaphore, is also possible. Since we cannot differentiate the last thread from the others, all the threads must perform the same operations on the semaphore, i.e. try to release everyone, but also wait for the last one. The idea is to initialize the semaphore to (1-n)*(n+1), so that each of the first n-1 threads fails to wake the others with its n+1 calls to sem_post, but still works toward bringing the semaphore to exactly 0. The last thread then does the same, pushing the semaphore value to n+1, thus releasing the locked threads and leaving room for it to also perform its own sem_wait and be released immediately.
void *entryFunc(void *param)
{
    int *i = (int *)param;
    int nextPrime;
    int p = latestPrime, j;
    while (latestPrime < testLim) {
        nextPrime = findNextPrime(latestPrime);
        if (nextPrime != -1)
        {
            latestPrime = nextPrime;
            p = latestPrime;
        }
        if (p < 46341)
        {
            incrementNotPrimes(p);
        }
    }
    for (j = 0; j <= THREAD_COUNT; j++)
        sem_post(&sem);
    sem_wait(&sem);
    return NULL;
}
The problem with this approach is that it doesn't deal with how the semaphore should be reset in between uses (if your program needs to repeat this mechanism, it will need to reset the semaphore value, since it will end up being 1 after this code has been executed successfully).