how to avoid polling in pthreads - c

I've got some code that currently looks like this (simplified)
/* instance in global var *mystruct, count initialized to 0 */
typedef struct {
volatile unsigned int count;
} mystruct_t;
pthread_mutex_t mymutex; // is initialized
/* one thread, goal: block while mystruct->count == 0 */
void x(void *n) {
while(1) {
pthread_mutex_lock(&mymutex);
if(mystruct->count != 0) break;
pthread_mutex_unlock(&mymutex);
}
pthread_mutex_unlock(&mymutex);
printf("count no longer equals zero");
pthread_exit((void*) 0)
}
/* another thread */
void y(void *n) {
sleep(10);
pthread_mutex_lock(&mymutex);
mystruct->count = 10;
pthread_mutex_unlock(&mymutex);
}
This seems inefficient and wrong to me--but I don't know a better way of doing it. Is there a better way, and if so, what is it?

Condition variables allow you to wait for a certain event, and have a different thread signal that condition variable.
You could have a thread that does this:
for (;;)
{
if (avail() > 0)
do_work();
else
pthread_cond_wait();
}
and a different thread that does this:
for (;;)
{
put_work();
pthread_cond_signal();
}
Very simplified of course. :) You'll need to look up how to use it properly, there are some difficulties working with condition variables due to race conditions.
However, if you are certain that the thread will block for a very short time (in order of µs) and rarely, using a spin loop like that is probably more efficient.

A general solution is to use a POSIX semaphore. These are not part of the pthread library but work with pthreads just the same.
Since semaphores are provided in most other multi-threading APIs, it is a general technique that may be applied perhaps more portably; however perhaps more appropriate in this instance is a condition variable, which allows a thread to pend on the conditional value of a variable without polling, which seems to be exactly what you want.

Condition variables are the solution to this problem. Your code can be fairly easily modified to use them:
pthread_mutex_t mymutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t mycond = PTHREAD_COND_INITIALIZER;
/* one thread, goal: block while mystruct->count == 0 */
void x(void *n) {
pthread_mutex_lock(&mymutex);
while (mystruct->count == 0)
pthread_cond_wait(&mycond, &mymutex);
printf("count no longer equals zero");
pthread_mutex_unlock(&mymutex);
pthread_exit((void*) 0)
}
/* another thread */
void y(void *n) {
sleep(10);
pthread_mutex_lock(&mymutex);
mystruct->count = 10;
pthread_cond_signal(&mycond);
pthread_mutex_unlock(&mymutex);
}
Note that you should generally keep the mutex locked while you act on the result - "consume" the event or similar. That's why I moved the pthread_mutex_unlock() to a point after the printf(), even though it doesn't really matter in this toy case.
(Also, in real code it might well make sense to put the mutex and condition variable inside mystruct).

You could use barriers.

You may also use semaphores to synchronize threads.

Related

can pthread_cond_signal make more than one thread to wake up?

I'm studying on condition variables of Pthread. When I'm reading the explanation of pthread_cond_signal, I see the following.
The pthread_cond_signal() function shall unblock at least one of
the
threads that are blocked on the specified condition variable cond (if
any threads are blocked on cond).
Till now I knew pthread_cond_signal() would make only one thread to wake up at a time. But, the quoted explanation says at least one. What does it mean? Can it make more than one thread wake up? If yes, why is there pthread_cond_broadcast()?
En passant, I wish the following code taken from UNIX Systems Programming book of Robbins is also related to my question. Is there any reason the author's pthread_cond_broadcast() usage instead of pthread_cond_signal() in waitbarrier function? As a minor point, why is !berror checking needed too as a part of the predicate? When I try both of them by changing, I cannot see any difference.
/*
The program implements a thread-safe barrier by using condition variables. The limit
variable specifies how many threads must arrive at the barrier (execute the
waitbarrier) before the threads are released from the barrier.
The count variable specifies how many threads are currently waiting at the barrier.
Both variables are declared with the static attribute to force access through
initbarrier and waitbarrier. If successful, the initbarrier and waitbarrier
functions return 0. If unsuccessful, these functions return a nonzero error code.
*/
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
static pthread_cond_t bcond = PTHREAD_COND_INITIALIZER;
static pthread_mutex_t bmutex = PTHREAD_MUTEX_INITIALIZER;
static int count = 0;
static int limit = 0;
int initbarrier(int n) { /* initialize the barrier to be size n */
int error;
if (error = pthread_mutex_lock(&bmutex)) /* couldn't lock, give up */
return error;
if (limit != 0) { /* barrier can only be initialized once */
pthread_mutex_unlock(&bmutex);
return EINVAL;
}
limit = n;
return pthread_mutex_unlock(&bmutex);
}
int waitbarrier(void) { /* wait at the barrier until all n threads arrive */
int berror = 0;
int error;
if (error = pthread_mutex_lock(&bmutex)) /* couldn't lock, give up */
return error;
if (limit <= 0) { /* make sure barrier initialized */
pthread_mutex_unlock(&bmutex);
return EINVAL;
}
count++;
while ((count < limit) && !berror)
berror = pthread_cond_wait(&bcond, &bmutex);
if (!berror) {
fprintf(stderr,"soner %d\n",
(int)pthread_self());
berror = pthread_cond_broadcast(&bcond); /* wake up everyone */
}
error = pthread_mutex_unlock(&bmutex);
if (berror)
return berror;
return error;
}
/* ARGSUSED */
static void *printthread(void *arg) {
fprintf(stderr,"This is the first print of thread %d\n",
(int)pthread_self());
waitbarrier();
fprintf(stderr,"This is the second print of thread %d\n",
(int)pthread_self());
return NULL;
}
int main(void) {
pthread_t t0,t1,t2;
if (initbarrier(3)) {
fprintf(stderr,"Error initilizing barrier\n");
return 1;
}
if (pthread_create(&t0,NULL,printthread,NULL))
fprintf(stderr,"Error creating thread 0.\n");
if (pthread_create(&t1,NULL,printthread,NULL))
fprintf(stderr,"Error creating thread 1.\n");
if (pthread_create(&t2,NULL,printthread,NULL))
fprintf(stderr,"Error creating thread 2.\n");
if (pthread_join(t0,NULL))
fprintf(stderr,"Error joining thread 0.\n");
if (pthread_join(t1,NULL))
fprintf(stderr,"Error joining thread 1.\n");
if (pthread_join(t2,NULL))
fprintf(stderr,"Error joining thread 2.\n");
fprintf(stderr,"All threads complete.\n");
return 0;
}
Due to spurious wake-ups pthread_cond_signal could wake up more than one thread.
Look for word "spurious" in pthread_cond_wait.c from glibc.
In waitbarrier it must wake up all threads when they all have arrived to that point, hence it uses pthread_cond_broadcast.
Can [pthread_cond_signal()] make more than one thread wake up?
That's not guaranteed. On some operating system, on some hardware platform, under some circumstances it could wake more than one thread. It is allowed to wake more than one thread because that gives the implementer more freedom to make it work in the most efficient way possible for any given hardware and OS.
It must wake at least one waiting thread, because otherwise, what would be the point of calling it?
But, if your applicaton needs a signal that is guaranteed to wake all of the waiting threads, then that is what pthread_cond_broadcast() is for.
Making efficient use of a multi-processor system is hard. https://www.e-reading.club/bookreader.php/134637/Herlihy,Shavit-_The_art_of_multiprocessor_programming.pdf
Most programming language and library standards allow similar freedoms in the behavior of multi-threaded programs, for the same reason: To allow programs to achieve high performance on a variety of different platforms.

unexpected behavior when using conditional variables (c, gcc)

I am trying to learn how to use conditional variables properly in C.
As an exercise for myself I am trying to make a small program with 2 threads that print "Ping" followed by "Pong" in an endless loop.
I have written a small program:
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
void* T1(){
printf("thread 1 started\n");
while(1)
{
pthread_mutex_lock(&lock);
sleep(0.5);
printf("ping\n");
pthread_cond_signal(&cond);
pthread_mutex_unlock(&lock);
pthread_cond_wait(&cond,&lock);
}
}
void* T2(){
printf("thread 2 started\n");
while(1)
{
pthread_cond_wait(&cond,&lock);
pthread_mutex_lock(&lock);
sleep(0.5);
printf("pong\n");
pthread_cond_signal(&cond);
pthread_mutex_unlock(&lock);
}
}
int main(void)
{
int i = 1;
pthread_t t1;
pthread_t t2;
printf("main\n");
pthread_create(&t1,NULL,&T1,NULL);
pthread_create(&t2,NULL,&T2,NULL);
while(1){
sleep(1);
i++;
}
return EXIT_SUCCESS;
}
And when running this program the output I get is:
main
thread 1 started
thread 2 started
ping
Any idea what is the reason the program does not execute as expected?
Thanks in advance.
sleep takes an integer, not a floating point. Not sure what sleep(0) does on your system, but it might be one of your problems.
You need to hold the mutex while calling pthread_cond_wait.
Naked condition variables (that is condition variables that don't indicate that there is a condition to read somewhere else) are almost always wrong. A condition variable indicates that something we are waiting for might be ready to be consumed, they are not for signalling (not because it's illegal, but because it's pretty hard to get them right for pure signalling). So in general a condition will look like this:
/* consumer here */
pthread_mutex_lock(&something_mutex);
while (something == 0) {
pthread_cond_wait(&something_cond, &something_mutex);
}
consume(something);
pthread_mutex_unlock(&something_mutex);
/* ... */
/* producer here. */
pthread_mutex_lock(&something_mutex);
something = 4711;
pthread_cond_signal(&something_cond, &something_mutex);
pthread_mutex_unlock(&something_mutex);
It's a bad idea to sleep while holding locks.
T1 and T2 are not valid functions to use as functions to pthread_create they are supposed to take arguments. Do it right.
You are racing yourself in each thread between cond_signal and cond_wait, so it's not implausible that each thread might just signal itself all the time. (correctly holding the mutex in the calls to pthread_cond_wait may help here, or it may not, that's why I said that getting naked condition variables right is hard, because it is).
First of all you should never use sleep() to synchronize threads (use nanosleep() if you need to reduce output speed). You may need (it's a common use) a shared variable ready to let each thread know that he can print the message. Before you make a pthread_cond_wait() you must acquire the lock because the pthread_cond_wait() function shall block on a condition variable. It shall be called with mutex locked by the calling thread or undefined behavior results.
Steps are:
Acquire the lock
Use wait in a while with a shared variable in guard[*]
Do stuffs
Change the value of shared variable for synchronize (if you've one) and signal/broadcast that you finished to work
Release the lock
Steps 4 and 5 can be reversed.
[*]You use pthread_cond_wait() to release the mutex and block the thread on the condition variable and when using condition variables there is always a Boolean predicate involving shared variables associated with each condition wait that is true if the thread should proceed because spurious wakeups may occur. watch more here
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
int ready = 0;
void* T1(){
printf("thread 1 started\n");
while(1)
{
pthread_mutex_lock(&lock);
while(ready == 1){
pthread_cond_wait(&cond,&lock);
}
printf("ping\n");
ready = 1;
pthread_cond_signal(&cond);
pthread_mutex_unlock(&lock);
}
}
void* T2(){
printf("thread 2 started\n");
while(1)
{
pthread_mutex_lock(&lock);
while(ready == 0){
pthread_cond_wait(&cond,&lock);
}
printf("pong\n");
ready = 0;
pthread_cond_signal(&cond);
pthread_mutex_unlock(&lock);
}
}
int main(void)
{
int i = 1;
pthread_t t1;
pthread_t t2;
printf("main\n");
pthread_create(&t1,NULL,&T1,NULL);
pthread_create(&t2,NULL,&T2,NULL);
pthread_join(t1,NULL);
pthread_join(t2,NULL);
return EXIT_SUCCESS;
}
You should also use pthread_join() in main instead of a while(1)

Main thread and worker thread initialization

I'm creating a multi-thread program in C and I've some troubles.
There you have the function which create the threads :
void create_thread(t_game_data *game_data)
{
size_t i;
t_args *args = malloc(sizeof(t_args));
i = 0;
args->game = game_data;
while (i < 10)
{
args->initialized = 0;
args->id = i;
printf("%zu CREATION\n", i);//TODO: Debug
pthread_create(&game_data->object[i]->thread_id, NULL, &do_action, args);
i++;
while (args->initialized == 0)
continue;
}
}
Here you have my args struct :
typedef struct s_args
{
t_game_data *object;
size_t id;
int initialized;
}args;
And finally, the function which handle the created threads
void *do_action(void *v_args)
{
t_args *args;
t_game_data *game;
size_t id;
args = v_args;
game = args->game;
id = args->id;
args->initialized = 1;
[...]
return (NULL);
}
The problem is :
The main thread will create new thread faster than the new thread can init his variables :
args = v_args;
game = args->game;
id = args->id;
So, sometime, 2 different threads will get same id from args->id.
To solve that, I use an variable initialized as a bool so make "sleep" the main thread during the new thread's initialization.
But I think that is really sinful.
Maybe there is a way to do that with a mutex? But I heard it wasn't "legal" to unlock a mutex which does not belong his thread.
Thanks for your answers!
The easiest solution to this problem would be to pass a different t_args object to each new thread. To do that, move the allocation inside the loop, and make each thread responsible for freeing its own argument struct:
void create_thread(t_game_data *game_data) {
for (size_t i = 0; i < 10; i++) {
t_args *args = malloc(sizeof(t_args));
if (!args) {
/* ... handle allocation error ... */
} else {
args->game = game_data;
args->id = i;
printf("%zu CREATION\n", i);//TODO: Debug
if (pthread_create(&game_data->object[i]->thread_id, NULL,
&do_action, args) != 0) {
// thread creation failed
free(args);
// ...
}
}
}
}
// ...
void *do_action(void *v_args) {
t_args *args = v_args;
t_game_data *game = args->game;
size_t id = args->id;
free(v_args);
args = v_args = NULL;
// ...
return (NULL);
}
But you also write:
To solve that, I use an variable initialized as a bool so make "sleep"
the main thread during the new thread's initialization.
But I think that is really sinful. Maybe there is a way to do that
with a mutex? But I heard it wasn't "legal" to unlock a mutex which
does not belong his thread.
If you nevertheless wanted one thread to wait for another thread to modify some data, as your original strategy requires, then you must employ either atomic data or some kind of synchronization object. Your code otherwise contains a data race, and therefore has undefined behavior. In practice, you cannot assume in your original code that the main thread will ever see the new thread's write to args->initialized. "Sinful" is an unusual way to describe that, but maybe appropriate if you belong to the Church of the Holy C.
You could solve that problem with a mutex by protecting just the test of args->initialized in your loop -- not the whole loop -- with a mutex, and protecting the threads' write to that object with the same mutex, but that's nasty and ugly. It would be far better to wait for the new thread to increment a semaphore (not a busy wait, and the initialized variable is replaced by the semaphore), or to set up and wait on a condition variable (again not a busy wait, but the initialized variable or an equivalent is still needed).
The problem is that in create_thread you are passing the same t_args structure to each thread. In reality, you probably want to create your own t_args structure for each thread.
What's happening is your 1st thread is starting up with the args passed to it. Before that thread can run do_action the loop is modifying the args structure. Since thread2 and thread1 will both be pointing to the same args structure, when they run do_action they will have the same id.
Oh, and don't forget to not leak your memory
Your solution should work in theory except for a couple of major problems.
The main thread will sit spinning in the while loop that checks the flag using CPU cycles (this is the least bad problem and can be OK if you know it won't have to wait long)
Compiler optimisers can get trigger happy with respect to empty loops. They are also often unaware that a variable may get modified by other threads and can make bad decisions on that basis.
On multi core systems, the main thread may never see the change to args->initiialzed or at least not until much later if the change is in the cache of another core that hasn't been flushed back to main memory yet.
You can use John Bollinger's solution that mallocs a new set of args for each thread and it is fine. The only down side is a malloc/free pair for each thread creation. The alternative is to use "proper" synchronisation functions like Santosh suggests. I would probably consider this except I would use a semaphore as being a bit simpler than a condition variable.
A semaphore is an atomic counter with two operations: wait and signal. The wait operation decrements the semaphore if its value is greater than zero, otherwise it puts the thread into a wait state. The signal operation increments the semaphore, unless there are threads waiting on it. If there are, it wakes one of the threads up.
The solution is therefore to create a semaphore with an initial value of 0, start the thread and wait on the semaphore. The thread then signals the semaphore when it is finished with the initialisation.
#include <semaphore.h>
// other stuff
sem_t semaphore;
void create_thread(t_game_data *game_data)
{
size_t i;
t_args args;
i = 0;
if (sem_init(&semaphore, 0, 0) == -1) // third arg is initial value
{
// error
}
args.game = game_data;
while (i < 10)
{
args.id = i;
printf("%zu CREATION\n", i);//TODO: Debug
pthread_create(&game_data->object[i]->thread_id, NULL, &do_action, args);
sem_wait(&semaphore);
i++;
}
sem_destroy(&semaphore);
}
void *do_action(void *v_args) {
t_args *args = v_args;
t_game_data *game = args->game;
size_t id = args->id;
sem_post(&semaphore);
// Rest of the thread work
return NULL;
}
Because of the synchronisation, I can reuse the args struct safely, in fact, I don't even need to malloc it - it's small so I declare it local to the function.
Having said all that, I still think John Bollinger's solution is better for this use-case but it's useful to be aware of semaphores generally.
You should consider using condition variable for this. You can find an example here http://maxim.int.ru/bookshelf/PthreadsProgram/htm/r_28.html.
Basically wait in the main thread and signal in your other threads.

Scheduling two threads in linux two print a pattern of numbers

I want to print numbers using two threads T1 and T2. T1 should print numbers like 1,2,3,4,5 and then T2 should print 6,7,8,9,10 and again T1 should start and then T2 should follow. It should print from 1....100. I have two questions.
How can I complete the task using threads and one global variable?
How can I schedule threads in desired order in linux?
How can I schedule threads in desired order in linux?
You need to use locking primitives, such as mutexes or condition variables to influence scheduling order of threads. Or you have to make your threads work independently on the order.
How can I complete the task using threads and one global variable?
If you are allowed to use only one variable, then you can't use mutex (it will be the second variable). So the first thing you must do is to declare your variable atomic. Otherwise compiler may optimize your code in such a way that one thread will not see changes made by other thread. And for such simple code that you want, it will do so by caching variable on register. Use std::atomic_int. You may find an advice to use volatile int, but nowdays std::atomic_int is a more direct approach to specify what you want.
You can't use mutexes, so you can't make your threads wait. They will be constantly running and wasting CPU. But that's seems OK for the task. So you will need to write a spinlock. Threads will wait in a loop constantly checking the value. If value % 10 < 5 then first thread breaks the loop and does incrementing, otherwise second thread does the job.
Because the question looks like a homework I don't show here any code samples, you need to write them yourself.
1: the easiest way is to use mutexes.
this is a basic implementation with a unfair/undefined sheduling
int counter=1;
pthread_mutex_t mutex; //needs to be initialised
void incrementGlobal() {
for(int i=0;i<5;i++){
counter++;
printf("%i\n",counter);
}
}
T1/T2:
pthread_mutex_lock(&mutex);
incrementGlobal();
pthread_mutex_unlock(&mutex);
2: the correct order can be archieved with conditional-variables:
(but this needs more global-variables)
global:
int num_thread=2;
int current_thread_id=0;
pthread_cond_t cond; //needs to be initialised
T1/T2:
int local_thread_id; // the ID of the thread
while(true) {
phread_mutex_lock(&mutex);
while (current_thread_id != local_thread_id) {
pthread_cond_wait(&cond, &mutex);
}
incrementGlobal();
current_thread_id = (current_thread_id+1) % num_threads;
pthread_cond_broadcast(&cond);
pthread_mutex_unlock(&mutex);
}
How can I complete the task using threads and one global variable?
If you are using Linux, you can use POSIX library for threading in C programming Language. The library is <pthread.h>. However, as an alternative, a quite portable and relatively non-intrusive library, well supported on gcc and g++ and (with an old version) on MSVC is openMP.
It is not standard C and C++, but OpenMP itself is a standard.
How can I schedule threads in desired order in linux?
To achieve your desired operation of printing, you need to have a global variable which can be accessed by two threads of yours. Both threads take turns to access the global variable variable and perform operation (increment and print). However, to achieve desired order, you need to have a mutex. A mutex is a mutual exclusion semaphore, a special variant of a semaphore that only allows one locker at a time. It can be used when you have an instance of resource (global variable in your case) and that resource is being shared by two threads. A thread after locking that mutex can have exclusive access to an instance of resource and after completing it's operation, a thread should release the mutex for other threads.
You can start with threads and mutex in <pthread.h> from here.
The one of the possible solution to your problem could be this program is given below. However, i would suggest you to find try it yourself and then look at my solution.
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
pthread_mutex_t lock;
int variable=0;
#define ONE_TIME_INC 5
#define MAX 100
void *thread1(void *arg)
{
while (1) {
pthread_mutex_lock(&lock);
printf("Thread1: \n");
int i;
for (i=0; i<ONE_TIME_INC; i++) {
if (variable >= MAX)
goto RETURN;
printf("\t\t%d\n", ++variable);
}
printf("Thread1: Sleeping\n");
pthread_mutex_unlock(&lock);
usleep(1000);
}
RETURN:
pthread_mutex_unlock(&lock);
return NULL;
}
void *thread2(void *arg)
{
while (1) {
pthread_mutex_lock(&lock);
printf("Thread2: \n");
int i;
for (i=0; i<ONE_TIME_INC; i++) {
if (variable >= MAX)
goto RETURN;
printf("%d\n", ++variable);
}
printf("Thread2: Sleeping\n");
pthread_mutex_unlock(&lock);
usleep(1000);
}
RETURN:
pthread_mutex_unlock(&lock);
return NULL;
}
int main()
{
if (pthread_mutex_init(&lock, NULL) != 0) {
printf("\n mutex init failed\n");
return 1;
}
pthread_t pthread1, pthread2;
if (pthread_create(&pthread1, NULL, thread1, NULL))
return -1;
if (pthread_create(&pthread2, NULL, thread2, NULL))
return -1;
pthread_join(pthread1, NULL);
pthread_join(pthread2, NULL);
pthread_mutex_destroy(&lock);
return 0;
}

How can I wait for any/all pthreads to complete?

I just want my main thread to wait for any and all my (p)threads to complete before exiting.
The threads come and go a lot for different reasons, and I really don't want to keep track of all of them - I just want to know when they're all gone.
wait() does this for child processes, returning ECHILD when there are no children left, however wait does not (appear to work with) (p)threads.
I really don't want to go through the trouble of keeping a list of every single outstanding thread (as they come and go), then having to call pthread_join on each.
As there a quick-and-dirty way to do this?
Do you want your main thread to do anything in particular after all the threads have completed?
If not, you can have your main thread simply call pthread_exit() instead of returning (or calling exit()).
If main() returns it implicitly calls (or behaves as if it called) exit(), which will terminate the process. However, if main() calls pthread_exit() instead of returning, that implicit call to exit() doesn't occur and the process won't immediately end - it'll end when all threads have terminated.
http://pubs.opengroup.org/onlinepubs/007908799/xsh/pthread_exit.html
Can't get too much quick-n-dirtier.
Here's a small example program that will let you see the difference. Pass -DUSE_PTHREAD_EXIT to the compiler to see the process wait for all threads to finish. Compile without that macro defined to see the process stop threads in their tracks.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <time.h>
static
void sleep(int ms)
{
struct timespec waittime;
waittime.tv_sec = (ms / 1000);
ms = ms % 1000;
waittime.tv_nsec = ms * 1000 * 1000;
nanosleep( &waittime, NULL);
}
void* threadfunc( void* c)
{
int id = (int) c;
int i = 0;
for (i = 0 ; i < 12; ++i) {
printf( "thread %d, iteration %d\n", id, i);
sleep(10);
}
return 0;
}
int main()
{
int i = 4;
for (; i; --i) {
pthread_t* tcb = malloc( sizeof(*tcb));
pthread_create( tcb, NULL, threadfunc, (void*) i);
}
sleep(40);
#ifdef USE_PTHREAD_EXIT
pthread_exit(0);
#endif
return 0;
}
The proper way is to keep track of all of your pthread_id's, but you asked for a quick and dirty way so here it is. Basically:
just keep a total count of running threads,
increment it in the main loop before calling pthread_create,
decrement the thread count as each thread finishes.
Then sleep at the end of the main process until the count returns to 0.
.
volatile int running_threads = 0;
pthread_mutex_t running_mutex = PTHREAD_MUTEX_INITIALIZER;
void * threadStart()
{
// do the thread work
pthread_mutex_lock(&running_mutex);
running_threads--;
pthread_mutex_unlock(&running_mutex);
}
int main()
{
for (i = 0; i < num_threads;i++)
{
pthread_mutex_lock(&running_mutex);
running_threads++;
pthread_mutex_unlock(&running_mutex);
// launch thread
}
while (running_threads > 0)
{
sleep(1);
}
}
If you don't want to keep track of your threads then you can detach the threads so you don't have to care about them, but in order to tell when they are finished you will have to go a bit further.
One trick would be to keep a list (linked list, array, whatever) of the threads' statuses. When a thread starts it sets its status in the array to something like THREAD_STATUS_RUNNING and just before it ends it updates its status to something like THREAD_STATUS_STOPPED. Then when you want to check if all threads have stopped you can just iterate over this array and check all the statuses.
Don't forget though that if you do something like this, you will need to control access to the array so that only one thread can access (read and write) it at a time, so you'll need to use a mutex on it.
you could keep a list all your thread ids and then do pthread_join on each one,
of course you will need a mutex to control access to the thread id list. you will
also need some kind of list that can be modified while being iterated on, maybe a std::set<pthread_t>?
int main() {
pthread_mutex_lock(&mutex);
void *data;
for(threadId in threadIdList) {
pthread_mutex_unlock(&mutex);
pthread_join(threadId, &data);
pthread_mutex_lock(&mutex);
}
printf("All threads completed.\n");
}
// called by any thread to create another
void CreateThread()
{
pthread_t id;
pthread_mutex_lock(&mutex);
pthread_create(&id, NULL, ThreadInit, &id); // pass the id so the thread can use it with to remove itself
threadIdList.add(id);
pthread_mutex_unlock(&mutex);
}
// called by each thread before it dies
void RemoveThread(pthread_t& id)
{
pthread_mutex_lock(&mutex);
threadIdList.remove(id);
pthread_mutex_unlock(&mutex);
}
Thanks all for the great answers! There has been a lot of talk about using memory barriers etc - so I figured I'd post an answer that properly showed them used for this.
#define NUM_THREADS 5
unsigned int thread_count;
void *threadfunc(void *arg) {
printf("Thread %p running\n",arg);
sleep(3);
printf("Thread %p exiting\n",arg);
__sync_fetch_and_sub(&thread_count,1);
return 0L;
}
int main() {
int i;
pthread_t thread[NUM_THREADS];
thread_count=NUM_THREADS;
for (i=0;i<NUM_THREADS;i++) {
pthread_create(&thread[i],0L,threadfunc,&thread[i]);
}
do {
__sync_synchronize();
} while (thread_count);
printf("All threads done\n");
}
Note that the __sync macros are "non-standard" GCC internal macros. LLVM supports these too - but if your using another compiler, you may have to do something different.
Another big thing to note is: Why would you burn an entire core, or waste "half" of a CPU spinning in a tight poll-loop just waiting for others to finish - when you could easily put it to work? The following mod uses the initial thread to run one of the workers, then wait for the others to complete:
thread_count=NUM_THREADS;
for (i=1;i<NUM_THREADS;i++) {
pthread_create(&thread[i],0L,threadfunc,&thread[i]);
}
threadfunc(&thread[0]);
do {
__sync_synchronize();
} while (thread_count);
printf("All threads done\n");
}
Note that we start creating the threads starting at "1" instead of "0", then directly run "thread 0" inline, waiting for all threads to complete after it's done. We pass &thread[0] to it for consistency (even though it's meaningless here), though in reality you'd probably pass your own variables/context.

Resources