I am running this in C++, however I think the same code would work in C. My understanding of a barrier is that when you pthread_barrier_init() you give it a number. That number represents how many threads must make a call to pthread_barrier_wait() before any of them get unblocked. So basically if the number is 4 and you have 3 threads who have executed that wait() line so far, all 3 of those threads will be blocked until a 4th thread comes along and calls pthread_barrier_wait().
I am trying to get all of the threads to begin execution at the same time.
#include <pthread.h>
#include <stdio.h>
void* thread_func(void* args) {
pthread_barrier_t *barrier = (pthread_barrier_t*)args;
printf("waiting for barrier\n");
pthread_barrier_wait(barrier);
printf("passed barrier\n");
return NULL;
}
int main() {
int const num_threads = 4;
pthread_t threads[num_threads];
pthread_barrier_t *barrier;
pthread_barrier_init(barrier, NULL, num_threads);
int i;
for (i = 0; i < num_threads; i++) {
pthread_create(&threads[i], NULL, thread_func, (void*)barrier);
}
for (i = 0; i < num_threads; i++) {
pthread_join(threads[i], NULL);
}
return 0;
}
output I receive:
waiting for barrier
waiting for barrier
waiting for barrier
passed barrier
passed barrier
passed barrier
waiting for barrier
passed barrier
output I expect:
waiting for barrier
waiting for barrier
waiting for barrier
waiting for barrier
passed barrier
passed barrier
passed barrier
passed barrier
Another very strange thing occurring is that in my pthread_barrier_init(barrier, NULL, num_threads) call, I can change the number to (num_threads+40) and the program still runs. I would think that in that case, all of the threads would be sitting at their wait() calls forever in that case since there would never be num_threads+40 threads waiting.
What am I missing?
pthread_barrier_t *barrier;
pthread_barrier_init(barrier, NULL, num_threads);
So barrier is a pointer that's never made to point to anything in particular. You never assign barrier a value, yet you pass its value to pthread_barrier_init. So pthread_barrier_init gets a garbage value, as do your threads.
Somewhere, you need to create an actual barrier, not just a pointer to one.
You could do this:
pthread_barrier_t actual_barrier;
pthread_barrier_t *barrier = &actual_barrier;
pthread_barrier_init(barrier, NULL, num_threads);
This actually does create a barrier and passes its address to pthread_barrier_init so the barrier you actually created can be initialized.
The answer #DavidSchwartz gave adds an extra line to make it clear how the pointer form would need to point to an actual instance of a pthread_barrier_t. However more typical for very localized usage would be:
pthread_barrier_t barrier;
pthread_barrier_init(&barrier, NULL, num_threads);
Related
I am learning the basics of POSIX threads. I want to create a program that prints "Hello World!" 10 times with a delay of a second between each printout. I've used a for loop to print it 10 times, but I am stuck on how to implement the time delay part.
This is my code so far:
#define MAX 10
void* helloFunc(void* tid)
{
printf("Hello World!\n", (int)(intptr_t)tid);
}
int main(int ac, char * argv)
{
pthread_t hej[MAX];
for (int i = 0; i < MAX; i++)
{
pthread_create(&hej[i], NULL, helloFunc, (void*)(intptr_t)i);
pthread_join(&hej[i], NULL);
}
pthread_exit(NULL);
return(0);
}
Thanks in advance!
There are two major problems with your code:
First of all you must wait for the threads to finish. You do that by joining them with pthread_join. And for that to work you must save the pthread_t value from each and every thread (for example in an array).
If you don't wait for the threads then the exit call will end the process, and that will also unexpectedly kill and end all threads in the process.
For all threads to run in parallel you should wait in a separate loop after you have created them:
pthread_t hej[MAX];
for (int i = 0; i < MAX; i++)
{
pthread_create(&hej[i], ...);
}
for (int i = 0; i < MAX; i++)
{
pthread_join(&hej[i], NULL);
}
The second problem is that you pass a pointer to i to the thread, so tid inside the thread functions will be all be the same (and a very large and weird value). To pass a value you must first cast it to intptr_t and then to void *:
pthread_create(..., (void *) (intptr_t) i);
And in the thread function you do the opposite casting:
printf("Hello World %d!\n", (int) (intptr_t) tid);
Note that this is an exception to the rule that one should never pass values as pointers (or opposite).
Finally for the "delay" bit... On POSIX systems there are many ways to delay execution, or to sleep. The natural and simple solution would be to use sleep(1) which sleeps one second.
The problem is where do to this sleep(1) call. If you do it in the thread functions after the printf then all threads will race to print the message and then all will sleep at the same time.
If you do it in the loop where you create the threads, then the threads won't really run in parallel, but really in serial where one thread prints it message and exits, then the main thread will wait one second before creating the next thread. It makes the threads kind of useless.
As a possible third solution, use the value passed to the thread function to use as the sleep time, so the thread that is created first (when i == 0) will primt immediately, the second thread (when i == 1) will sleep one second. And so on, until the tenth thread is created and will sleep nine seconds before printing the message.
Could be done as:
void* helloFunc(void* tid)
{
int value = (int) (intptr_t) tid;
sleep(value);
printf("Hello World %d!\n", value);
// Must return a value, as the function is declared as such
return NULL;
}
So I know that you can create barriers in C to control the flow of a threaded program. You can initialize the barrier, have your threads use it, and then destroy it. However, I am unsure whether or not the same barrier can be reused (say if it were in a loop). Or must you use a new barrier for a second wait point? As an example, is the below code correct (reusing the same barrier)?
#include <pthread.h>
pthread_barrier_t barrier;
void* thread_func (void *not_used) {
//some code
pthread_barrier_wait(&barrier);
//some more code
pthread_barrier_wait(&barrier);
//even more code
}
int main() {
pthread_barrier_init (&barrier, NULL, 2);
pthread_t tid[2];
pthread_create (&tid[0], NULL, thread_func, NULL);
pthread_create (&tid[1], NULL, thread_func, NULL);
pthread_join(tid[0], NULL);
pthread_join(tid[1], NULL);
pthread_barrier_destroy(&barrier);
}
Yes, they are reusable. The man page says:
When the required number of threads have called pthread_barrier_wait()...the barrier shall be
reset to the state it had as a result of the most recent
pthread_barrier_init() function that referenced it.
My command line tool keeps throwing the bus error: 10 message. Xcode debugger shows EXC_BAD_ACCESS message and highlights the function call that creates the thread. Manual debugging shows that the execution flow breaks at random positions inside the thread flow. I tried another compiler (gcc), but it ended up the same. Disabling pthread_mutex_lock() and pthread_mutex_unlock() doesn't help. I wrote this small example that reproduces the error.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
typedef struct thread_args {
pthread_mutex_t* mutex;
} thread_args;
void* test(void* t_args) {
printf("Thread initiated\n");
thread_args* args = (thread_args* )t_args;
printf("Args casted\n");
pthread_mutex_lock(args->mutex);
printf("Mutex locked\n");
pthread_mutex_unlock(args->mutex);
printf("Mutex unlocked\n");
pthread_exit(NULL);
}
int main() {
pthread_mutex_t mutex1;
pthread_mutex_init(&mutex1, NULL);
thread_args args;
args.mutex = &mutex1;
pthread_t* thread;
printf("Initiating a thread\n");
pthread_create(thread, NULL, test, &args);
return(0);
}
I think, in your case,
pthread_create(thread, NULL, test, &args);
at this call, thread is a pointer and not allocated memory. So, essentially pthread_create() tries to write into uninitialized memory, which creates undefined behavior.
Referring the man page of pthread_create()
Before returning, a successful call to pthread_create() stores the ID of the new thread in the buffer pointed to by thread;....
Instead, you can do
pthread_t thread;
...
pthread_create(&thread, NULL, test, &args);
You're using an uninitialized pointer to your pthread_t. The actual storage of the pthread_t needs to be somewhere!
Try :
int main() {
pthread_mutex_t mutex1;
pthread_mutex_init(&mutex1, NULL);
thread_args args;
args.mutex = &mutex1;
pthread_t thread;
printf("Initiating a thread\n");
pthread_create(&thread, NULL, test, &args);
return(0);
}
As other answers pointed out, you need to initialize your pointer thread which you can simply do with:
pthread_t thread;
pthread_create(&thread, NULL, test, &args);
Well, then I'll have to allocate memory dynamically, because different
threads are spawned inside many different functions, hence I can't use
local variables, because I'm not going to join the threads. Then, how
can I free the allocated memory without waiting for the thread to
finish, i.e. without calling join?
No. You don't need to dynamically allocate just because you are going to spawn multiple threads. The thread identifier is no longer needed once a thread has been created So whether it's a local variable or malloced is not important. It's only needed when you need to join or change some characteristics of the thread -- for which you need the ID. Otherwise, you can even reuse the same thread for creating multiple threads. For example,
pthread_t thread;
for( i = 0; i<8; i++)
pthread_create(&thread, NULL, thread_func, NULL);
is perfectly fine. A thread can always get its own ID by calling pthread_self() if needed. But you can't pass a local variable mutex1 to thread functions as once main thread exits, the mutex1 no longer exits as thread created continues to use it. So you either need malloc mutex1 or make it a global variable.
Another thing to do is that if you decide to let the main thread exit then you should call pthread_exit(). Otherwise, when the main thread exits (either by calling exit or simply return) then the whole process will die, meaning, all the threads will die too.
I am trying to write a code that does not block main() when pthread_join() is called:
i.e. basically trying to implement my previous question mentioned below:
https://stackoverflow.com/questions/24509500/pthread-join-and-main-blocking-multithreading
And the corresponding explanation at:
pthreads - Join on group of threads, wait for one to exit
As per suggested answer:
You'd need to create your own version of it - e.g. an array of flags (one flag per thread) protected by a mutex and a condition variable; where just before "pthread_exit()" each thread acquires the mutex, sets its flag, then does "pthread_cond_signal()". The main thread waits for the signal, then checks the array of flags to determine which thread/s to join (there may be more than one thread to join by then).
I have tried as below:
My status array which keeps a track of which threads have finished:
typedef struct {
int Finish_Status[THREAD_NUM];
int signalled;
pthread_mutex_t mutex;
pthread_cond_t FINISHED;
}THREAD_FINISH_STATE;
The thread routine, it sets the corresponding array element when the thread finishes and also signals the condition variable:
void* THREAD_ROUTINE(void* arg)
{
THREAD_ARGUMENT* temp=(THREAD_ARGUMENT*) arg;
printf("Thread created with id %d\n",temp->id);
waitFor(5);
pthread_mutex_lock(&(ThreadFinishStatus.mutex));
ThreadFinishStatus.Finish_Status[temp->id]=TRUE;
ThreadFinishStatus.signalled=TRUE;
if(ThreadFinishStatus.signalled==TRUE)
{
pthread_cond_signal(&(ThreadFinishStatus.FINISHED));
printf("Signal that thread %d finished\n",temp->id);
}
pthread_mutex_unlock(&(ThreadFinishStatus.mutex));
pthread_exit((void*)(temp->id));
}
I am not able to write the corresponding parts pthread_join() and pthread_cond_wait() functions. There are a few things which I am not able to implement.
1) How to write corresponding part pthread_cond_wait() in my main()?
2) I am trying to write it as:
pthread_mutex_lock(&(ThreadFinishStatus.mutex));
while((ThreadFinishStatus.signalled != TRUE){
pthread_cond_wait(&(ThreadFinishStatus.FINISHED), &(ThreadFinishStatus.mutex));
printf("Main Thread signalled\n");
ThreadFinishStatus.signalled==FALSE; //Reset signalled
//check which thread to join
}
pthread_mutex_unlock(&(ThreadFinishStatus.mutex));
But it does not enter the while loop.
3) How to use pthread_join() so that I can get the return value stored in my arg[i].returnStatus
i.e. where to put below statement in my main:
`pthread_join(T[i],&(arg[i].returnStatus));`
COMPLETE CODE
#include <stdio.h>
#include <pthread.h>
#include <time.h>
#define THREAD_NUM 5
#define FALSE 0
#define TRUE 1
void waitFor (unsigned int secs) {
time_t retTime;
retTime = time(0) + secs; // Get finishing time.
while (time(0) < retTime); // Loop until it arrives.
}
typedef struct {
int Finish_Status[THREAD_NUM];
int signalled;
pthread_mutex_t mutex;
pthread_cond_t FINISHED;
}THREAD_FINISH_STATE;
typedef struct {
int id;
void* returnStatus;
}THREAD_ARGUMENT;
THREAD_FINISH_STATE ThreadFinishStatus;
void initializeState(THREAD_FINISH_STATE* state)
{
int i=0;
state->signalled=FALSE;
for(i=0;i<THREAD_NUM;i++)
{
state->Finish_Status[i]=FALSE;
}
pthread_mutex_init(&(state->mutex),NULL);
pthread_cond_init(&(state->FINISHED),NULL);
}
void destroyState(THREAD_FINISH_STATE* state)
{
int i=0;
for(i=0;i<THREAD_NUM;i++)
{
state->Finish_Status[i]=FALSE;
}
pthread_mutex_destroy(&(state->mutex));
pthread_cond_destroy(&(state->FINISHED));
}
void* THREAD_ROUTINE(void* arg)
{
THREAD_ARGUMENT* temp=(THREAD_ARGUMENT*) arg;
printf("Thread created with id %d\n",temp->id);
waitFor(5);
pthread_mutex_lock(&(ThreadFinishStatus.mutex));
ThreadFinishStatus.Finish_Status[temp->id]=TRUE;
ThreadFinishStatus.signalled=TRUE;
if(ThreadFinishStatus.signalled==TRUE)
{
pthread_cond_signal(&(ThreadFinishStatus.FINISHED));
printf("Signal that thread %d finished\n",temp->id);
}
pthread_mutex_unlock(&(ThreadFinishStatus.mutex));
pthread_exit((void*)(temp->id));
}
int main()
{
THREAD_ARGUMENT arg[THREAD_NUM];
pthread_t T[THREAD_NUM];
int i=0;
initializeState(&ThreadFinishStatus);
for(i=0;i<THREAD_NUM;i++)
{
arg[i].id=i;
}
for(i=0;i<THREAD_NUM;i++)
{
pthread_create(&T[i],NULL,THREAD_ROUTINE,(void*)&arg[i]);
}
/*
Join only if signal received
*/
pthread_mutex_lock(&(ThreadFinishStatus.mutex));
//Wait
while((ThreadFinishStatus.signalled != TRUE){
pthread_cond_wait(&(ThreadFinishStatus.FINISHED), &(ThreadFinishStatus.mutex));
printf("Main Thread signalled\n");
ThreadFinishStatus.signalled==FALSE; //Reset signalled
//check which thread to join
}
pthread_mutex_unlock(&(ThreadFinishStatus.mutex));
destroyState(&ThreadFinishStatus);
return 0;
}
Here is an example of a program that uses a counting semaphore to watch as threads finish, find out which thread it was, and review some result data from that thread. This program is efficient with locks - waiters are not spuriously woken up (notice how the threads only post to the semaphore after they've released the mutex protecting shared state).
This design allows the main program to process the result from some thread's computation immediately after the thread completes, and does not require the main wait for all threads to complete. This would be especially helpful if the running time of each thread varied by a significant amount.
Most importantly, this program does not deadlock nor race.
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>
#include <stdio.h>
#include <queue>
void* ThreadEntry(void* args );
typedef struct {
int threadId;
pthread_t thread;
int threadResult;
} ThreadState;
sem_t completionSema;
pthread_mutex_t resultMutex;
std::queue<int> threadCompletions;
ThreadState* threadInfos;
int main() {
int numThreads = 10;
int* threadResults;
void* threadResult;
int doneThreadId;
sem_init( &completionSema, 0, 0 );
pthread_mutex_init( &resultMutex, 0 );
threadInfos = new ThreadState[numThreads];
for ( int i = 0; i < numThreads; i++ ) {
threadInfos[i].threadId = i;
pthread_create( &threadInfos[i].thread, NULL, &ThreadEntry, &threadInfos[i].threadId );
}
for ( int i = 0; i < numThreads; i++ ) {
// Wait for any one thread to complete; ie, wait for someone
// to queue to the threadCompletions queue.
sem_wait( &completionSema );
// Find out what was queued; queue is accessed from multiple threads,
// so protect with a vanilla mutex.
pthread_mutex_lock(&resultMutex);
doneThreadId = threadCompletions.front();
threadCompletions.pop();
pthread_mutex_unlock(&resultMutex);
// Announce which thread ID we saw finish
printf(
"Main saw TID %d finish\n\tThe thread's result was %d\n",
doneThreadId,
threadInfos[doneThreadId].threadResult
);
// pthread_join to clean up the thread.
pthread_join( threadInfos[doneThreadId].thread, &threadResult );
}
delete threadInfos;
pthread_mutex_destroy( &resultMutex );
sem_destroy( &completionSema );
}
void* ThreadEntry(void* args ) {
int threadId = *((int*)args);
printf("hello from thread %d\n", threadId );
// This can safely be accessed since each thread has its own space
// and array derefs are thread safe.
threadInfos[threadId].threadResult = rand() % 1000;
pthread_mutex_lock( &resultMutex );
threadCompletions.push( threadId );
pthread_mutex_unlock( &resultMutex );
sem_post( &completionSema );
return 0;
}
Pthread conditions don't have "memory"; pthread_cond_wait doesn't return if pthread_cond_signal is called before pthread_cond_wait, which is why it's important to check the predicate before calling pthread_cond_wait, and not call it if it's true. But that means the action, in this case "check which thread to join" should only depend on the predicate, not on whether pthread_cond_wait is called.
Also, you might want to make the while loop actually wait for all the threads to terminate, which you aren't doing now.
(Also, I think the other answer about "signalled==FALSE" being harmless is wrong, it's not harmless, because there's a pthread_cond_wait, and when that returns, signalled would have changed to true.)
So if I wanted to write a program that waited for all threads to terminate this way, it would look more like
pthread_mutex_lock(&(ThreadFinishStatus.mutex));
// AllThreadsFinished would check that all of Finish_Status[] is true
// or something, or simpler, count the number of joins completed
while (!AllThreadsFinished()) {
// Wait, keeping in mind that the condition might already have been
// signalled, in which case it's too late to call pthread_cond_wait,
// but also keeping in mind that pthread_cond_wait can return spuriously,
// thus using a while loop
while (!ThreadFinishStatus.signalled) {
pthread_cond_wait(&(ThreadFinishStatus.FINISHED), &(ThreadFinishStatus.mutex));
}
printf("Main Thread signalled\n");
ThreadFinishStatus.signalled=FALSE; //Reset signalled
//check which thread to join
}
pthread_mutex_unlock(&(ThreadFinishStatus.mutex));
Your code is racy.
Suppose you start a thread and it finishes before you grab the mutex in main(). Your while loop will never run because signalled was already set to TRUE by the exiting thread.
I will echo #antiduh's suggestion to use a semaphore that counts the number of dead-but-not-joined threads. You then loop up to the number of threads spawned waiting on the semaphore. I'd point out that the POSIX sem_t is not like a pthread_mutex in that sem_wait can return EINTR.
Your code appears fine. You have one minor buglet:
ThreadFinishStatus.signalled==FALSE; //Reset signalled
This does nothing. It tests whether signalled is FALSE and throws away the result. That's harmless though since there's nothing you need to do. (You never want to set signalled to FALSE because that loses the fact that it was signalled. There is never any reason to unsignal it -- if a thread finished, then it's finished forever.)
Not entering the while loop means signalled is TRUE. That means the thread already set it, in which case there is no need to enter the loop because there's nothing to wait for. So that's fine.
Also:
ThreadFinishStatus.signalled=TRUE;
if(ThreadFinishStatus.signalled==TRUE)
There's no need to test the thing you just set. It's not like the set can fail.
FWIW, I would suggest re-architecting. If the existing functions like pthread_join don't do exactly what you want, just don't use them. If you're going to have structures that track what work is done, then totally separate that from thread termination. Since you will already know what work is done, what different does it make when and how threads terminate? Don't think of this as "I need a special way to know when a thread terminates" and instead think of this "I need to know what work is done so I can do other things".
Writing my basic programs on multi threading and I m coming across several difficulties.
In the program below if I give sleep at position 1 then value of shared data being printed is always 10 while keeping sleep at position 2 the value of shared data is always 0.
Why this kind of output is coming ?
How to decide at which place we should give sleep.
Does this mean that if we are placing a sleep inside the mutex then the other thread is not being executed at all thus the shared data being 0.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include<unistd.h>
pthread_mutex_t lock;
int shared_data = 0;
void * function(void *arg)
{
int i ;
for(i =0; i < 10; i++)
{
pthread_mutex_lock(&lock);
shared_data++;
pthread_mutex_unlock(&lock);
}
pthread_exit(NULL);
}
int main()
{
pthread_t thread;
void * exit_status;
int i;
pthread_mutex_init(&lock, NULL);
i = pthread_create(&thread, NULL, function, NULL);
for(i =0; i < 10; i++)
{
sleep(1); //POSITION 1
pthread_mutex_lock(&lock);
//sleep(1); //POSITION 2
printf("Shared data value is %d\n", shared_data);
pthread_mutex_unlock(&lock);
}
pthread_join(thread, &exit_status);
pthread_mutex_destroy(&lock);
}
When you sleep before you lock the mutex, then you're giving the other thread plenty of time to change the value of the shared variable. That's why you're seeing a value of "10" with the 'sleep' in position #1.
When you grab the mutex first, you're able to lock it fast enough that you can print out the value before the other thread has a chance to modify it. The other thread sits and blocks on the pthread_mutex_lock() call until your main thread has finished sleeping and unlocked it. At that point, the second thread finally gets to run and alter the value. That's why you're seeing a value of "0" with the 'sleep' at position #2.
This is a classic case of a race condition. On a different machine, the same code might not display "0" with the sleep call at position #2. It's entirely possible that the second thread has the opportunity to alter the value of the variable once or twice before your main thread locks the mutex. A mutex can ensure that two threads don't access the same variable at the same time, but it doesn't have any control over the order in which the two threads access it.
I had a full explanation here but ended up deleting it. This is a basic synchronization problem and you should be able to trace and identify it before tackling anything more complicated.
But I'll give you a hint: It's only the sleep() in position 1 that matters; the other one inside the lock is irrelevant as long as it doesn't change the code outside the lock.