The case is: there are multiple (say n^2) pthreads, all created from the same thread function.
The pthread with id zero is the master.
Each pthread (other than zero) has a for loop which runs n times:
for (j = 0; j < n; j++) {
    pthread_mutex_lock(&conditionmutex);
    pthread_cond_wait(&cv, &conditionmutex);
    pthread_mutex_unlock(&conditionmutex);
    // do something
}
For thread zero it is:
for (j = 0; j < n; j++) {
    pthread_cond_broadcast(&cv);
    // do something else
}
I expected that each pthread (other than zero) would wait at the beginning of the for loop.
Thread zero would broadcast, then all the other threads would 'do something',
and then wait again at the beginning of the loop with an incremented value of j.
My program gets stuck right at the beginning, and I can't find the cause.
Thanks.
PS: cv is a condition variable initialized with NULL attributes; similarly for the mutex variable conditionmutex:
pthread_cond_t cv;
pthread_mutex_t conditionmutex;
pthread_cond_init(&cv, NULL);
pthread_mutex_init(&conditionmutex, NULL);
I have put a barrier for n^2 threads before the for loop in the thread function.
This ensures that all threads are created and waiting before the for loop starts.
I want to know: is my method ok?
In my program the main thread starts multiple child threads, locks a mutex for a counter, and then sleeps until the counter reaches N.
In an infinite for loop, the other threads do some work and each increases the counter in every iteration. Once counter == N, one of them has to signal the main thread and all of the child threads have to end.
My problem is that I'm not sure when I should lock the counter mutex and signal in the thread function. Once the counter became N I managed to wake up the main thread and exit one of the child threads, but the other threads keep trying to work when they should all be terminating.
My question is: how can I check whether counter == N, but send a signal from only one of the threads, with the others just returning without any signalling?
Can this be done by locking the mutex only once in each iteration, just for the time of checking its value (== N)? If yes, how?
void *thread_function(void *arg) {
    int *id = (int *) arg;
    for (;;) {
        pthread_mutex_lock(&mutex_counter);
        counter++;
        if (counter == N) {
            pthread_cond_signal(&cond);
            pthread_mutex_unlock(&mutex_counter);
            return NULL;
        }
        if (counter > N) {
            pthread_mutex_unlock(&mutex_counter);
            return NULL;
        }
        pthread_mutex_unlock(&mutex_counter);
        sleep(random());
        // do some work here
        // ...
        // finish work
    }
}
The problem with the above code is that, although all threads terminate, they execute one iteration of work less than needed, because the counter is increased before the if clause (so the counter reaches N, but N loops of work were not actually done).
However, if I move the counter increment to the end of the function, the threads do more work than needed, which messes up my program.
Why do you lock the counter mutex in the main thread? And why do you need to send a signal from the other threads to main? If you have only one global counter and all threads increase it, you can just start all the threads from the main thread, remember their thread ids, and then busy-wait with a while (counter < N); loop. After that, just kill all the threads with pthread_cancel.
I'm currently programming an algorithm where a main thread Thread_0 manages a pool of threads {Thread_1, Thread_2, ..., Thread_m} that are used to execute an ordered sequence of tasks {Task_1, Task_2, Task_3, ..., Task_n}. Each task Task_i is executed by all threads Thread_i, i > 0, and the completion of Task_i by every thread is the necessary and sufficient condition for the execution of Task_i+1. My problem is the synchronization between the pool of threads and the main thread.
First, when each thread is created, the main thread must wait for the initialization of each thread before creating the next one, because the data pointed to by the pointer passed to the thread function must be copied by the thread before its content is modified. Once all threads are initialized, the main thread must compute some values which are necessary to execute Task_0, and the other threads must wait for a broadcast signal to start Task_0. Once Task_0 is completed, the main thread must compute other values which are necessary to execute Task_1, and the other threads must wait for a broadcast signal to start Task_1. And so on.
My questions are the following:
Does this algorithm/problem have a specific name?
It seems that waiting for thread initialization is common;
is there a standard procedure for that?
Must this kind of algorithm be implemented using barriers?
I have an idea to solve this problem using a mutex for each thread,
but I want to avoid that.
I have tried to solve this with two condition variables cv0 and cv1 and one mutex mx0, but this prevents the threads from working in parallel:
void* worker(void* data_in) {
    lock mx0
    copy input data
    flag = 1
    signal cv0
    wait on cv1 with mx0
    // Initialize Task_0
    read DataForTask0
    // do Task_0
    flag = 1
    signal cv0
    wait on cv1 with mx0
    // Initialize Task_1
    // ...
    // ...
}
int main() {
    flag = 0
    for (i = 0; i < TNum; i++) {
        lock mx0
        data = CurrentValue(i)
        pthread_create(&threads[i], NULL, worker, (void*)&data)
        while (!flag)
            wait on cv0 with mx0
        flag = 0
        unlock mx0
    }
    // Here all threads must be waiting on cv1
    lock mx0
    Initialize DataForTask0
    broadcast cv1
    counter = 0
    do {
        flag = 0
        while (!flag)
            wait on cv0 with mx0
        counter++
    } while (counter < TNum)
    // ...
}
I have a function createWorkerPool which spawns n worker threads. Each thread takes as input a file specified in the args to pthread_create, reads the file, modifies a shared variable under a mutex, and waits at a barrier until all threads are done modifying their shared variable. This operation happens in a loop a large number of times.
The problem I am facing: consider two files, file1 and file2, where file2 is much bigger than file1. The barrier synchronization works until file1 is fully processed, but since that thread then finishes execution it no longer reaches the barrier, and the thread for file2 is stuck at the barrier forever.
My question is: is there a way to dynamically change the number of threads the barrier is waiting on when a thread exits? In the above case, when file1 finishes early it would decrement the barrier count from 2 to 1 so file2 can proceed with its execution. I tried looking at the man page but don't see any such function. Example code:
pthread_mutex_t m1;
pthread_barrier_t b1;

// Common function executed by each worker thread
void* doWork(void* arg) {
    const char* file = arg;
    while (1) {
        pthread_mutex_lock(&m1);
        // Modify shared variable
        pthread_mutex_unlock(&m1);
        // Wait for all threads to finish modifying the shared variable
        pthread_barrier_wait(&b1);
        // Once all threads reach the barrier, check the state of the shared
        // variable and do some task based on it
        // Check for a condition; if true, break out of the loop
    }
    return 0;
}
So basically thread1, manipulating file1, finishes first and thread2 is stuck at the barrier forever.
You can't really change the barrier count while the barrier is in use like that.
Presumably the problem is that the condition that is tested to break out of the loop is not true for all files at the same time, i.e., each thread might execute a different number of loops.
If this is the case, one solution is to have each thread that finishes early continue to loop around, but do nothing but wait on the barrier in each loop. Then arrange for the threads to all exit together - something like this:
void* doWork(void* arg)
{
    const char* file = arg;
    int work_done = 0;
    while (1) {
        if (work_done) {
            // all_threads_done is shared state (e.g. a count of finished
            // threads, protected by m1) that becomes true once every
            // thread has set its own work_done flag
            if (all_threads_done)
                break;
            pthread_barrier_wait(&b1);
            continue;
        }
        pthread_mutex_lock(&m1);
        // Modify shared variable
        pthread_mutex_unlock(&m1);
        // Wait for all threads to finish modifying the shared variable
        pthread_barrier_wait(&b1);
        // Once all threads reach the barrier, check the state of the shared
        // variable and do some task based on it
        if (finish_condition)
            work_done = 1;
    }
    return 0;
}
I have the following code running in N threads, with the shared variable count initially 0. Every variable is initialized before the threads start working. I am trying to allow at most MAX threads at a time into the critical section of the code.
void *tmain() {
    while (1) {
        pthread_mutex_lock(&count_mutex);
        count++;
        if (count > MAX) {
            pthread_cond_wait(&count_threshold_cv, &count_mutex);
        }
        pthread_mutex_unlock(&count_mutex);
        /*
         * some code not associated with count_mutex or count_threshold_cv
         */
        pthread_mutex_lock(&count_mutex);
        count--;
        pthread_cond_signal(&count_threshold_cv);
        pthread_mutex_unlock(&count_mutex);
    }
}
But after running for some time the threads get blocked at pthread_cond_signal(). I am unable to understand why this is occurring. Any help is appreciated.
This code has one weak point that may lead to a blocking problem.
More precisely, it is not protected against so-called spurious wakeups,
meaning that the pthread_cond_wait() function may return even when no signal was delivered explicitly by calling either pthread_cond_signal() or pthread_cond_broadcast().
Therefore, the following lines from the code do not guarantee that the thread wakes up only when the count variable is less than or equal to MAX:
if (count > MAX) {
    pthread_cond_wait(&count_threshold_cv, &count_mutex);
}
Let's see what may happen when one thread wakes up while count is still greater than MAX:
immediately after that, the mutex is unlocked.
Now another thread can enter the critical section and increment the count variable more than expected:
pthread_mutex_lock(&count_mutex);
count++;
How do you protect code against spurious wakeups?
Treat the pthread_cond_wait() wakeup only as a hint to re-check the predicate (count > MAX).
If it still holds, we need to continue to wait on the condition variable.
Try to fix your code by changing the if statement to a while statement (and, as remarked by @alk, change the tmain() signature):
while (count > MAX) {
    pthread_cond_wait(&count_threshold_cv, &count_mutex);
}
Now, if a spurious wakeup occurs and count is still greater than MAX,
the flow will wait on the condition variable again. The flow escapes the waiting loop only when a wakeup is accompanied by the predicate change.
The reason your code blocks is that you place count++ before the wait:
count++;
if (count > MAX) {
    pthread_cond_wait(&count_threshold_cv, &count_mutex);
}
Instead you should write:
while (count >= MAX) {
    pthread_cond_wait(&count_threshold_cv, &count_mutex);
}
count++;
The reason is that count should be the number of working threads.
A thread must only increment count when it is done waiting.
Your count variable, on the other hand, counts the number of working threads plus the number of waiting threads. This count is too large and makes the condition count > MAX true, which blocks.
You should also replace "if" with "while", as MichaelGoren writes. Using "if" instead of "while" does not lead to blocking, but rather to too many threads running simultaneously: the woken thread starts working even if count > MAX.
The reason you need "while" is that pthread_cond_signal unblocks one of the waiting threads. The unblocked thread, however, must still reacquire the mutex, and it is not necessarily scheduled to run immediately either. Only when the awoken thread finally acquires the mutex does the call to pthread_cond_wait return. In the meantime, between pthread_cond_signal and the return of pthread_cond_wait, other threads could have owned the mutex. So you must check the condition again, which is what "while" does.
Also, because count++ now comes after the wait, the condition becomes count >= MAX instead of count > MAX: you should wait even if the number of workers is exactly MAX.
Alternatively, you could have used semaphores for this problem.
I am trying to learn the basics of pthread_cond_wait. In all the usages, I see either

if (cond is false)
    pthread_cond_wait(...)

or

while (cond is false)
    pthread_cond_wait(...)

My question is: we want to cond_wait only because the condition is false, so why should I take the pain of explicitly putting an if/while check around it? I can understand that without any if/while check before cond_wait we would hit the wait directly and it would never return. Is the condition check solely for that purpose, or does it have any other significance? And if it is for avoiding an unnecessary condition wait, isn't putting a condition check and skipping the cond_wait similar to polling? I am using cond_wait like this:
void* proc_add(void *name) {
    struct vars *my_data = (struct vars*)name;
    printf("In thread Addition and my id = %d\n", pthread_self());
    while (1) {
        pthread_mutex_lock(&mutexattr);
        while (!my_data->ipt) {                            // If no input, get in
            pthread_cond_wait(&mutexaddr_add, &mutexattr); // Wait till signalled
            my_data->opt = my_data->a + my_data->b;
            my_data->ipt = 1;
            pthread_cond_signal(&mutexaddr_opt);
        }
        pthread_mutex_unlock(&mutexattr);
        if (my_data->end)
            pthread_exit((void *)0);
    }
}
The logic is: I am asking the input thread to process the data whenever an input is available, and to signal the output thread to print it.
You need a while loop because the thread that called pthread_cond_wait might wake up even when the condition you are waiting for hasn't been reached. This phenomenon is called a "spurious wakeup".
This is not a bug; it is the way condition variables are implemented.
This can also be found in man pages:
Spurious wakeups from the pthread_cond_timedwait() or
pthread_cond_wait() functions may occur. Since the return from
pthread_cond_timedwait() or pthread_cond_wait() does not imply
anything about the value of this predicate, the predicate should be
re-evaluated upon such return.
Update regarding the actual code:
void* proc_add(void *name)
{
    struct vars *my_data = (struct vars*)name;
    printf("In thread Addition and my id = %d\n", pthread_self());
    while (1) {
        pthread_mutex_lock(&mutexattr);
        while (!my_data->ipt) {                            // If no input, wait
            pthread_cond_wait(&mutexaddr_add, &mutexattr); // Wait till signalled
        }
        my_data->opt = my_data->a + my_data->b;
        my_data->ipt = 1;
        pthread_cond_signal(&mutexaddr_opt);
        pthread_mutex_unlock(&mutexattr);
        if (my_data->end)
            pthread_exit((void *)0);
    }
}
You must test the condition under the mutex before waiting because signals of the condition variable are not queued (condition variables are not semaphores). That is, if a thread calls pthread_cond_signal() when no threads are blocked in pthread_cond_wait() on that condition variable, then the signal does nothing.
This means that if you had one thread set the condition:
pthread_mutex_lock(&m);
cond = true;
pthread_cond_signal(&c);
pthread_mutex_unlock(&m);
and then another thread unconditionally waited:
pthread_mutex_lock(&m);
pthread_cond_wait(&c, &m);
/* cond now true */
this second thread would block forever. This is avoided by having the second thread check for the condition:
pthread_mutex_lock(&m);
if (!cond)
    pthread_cond_wait(&c, &m);
/* cond now true */
Since cond is only modified with the mutex m held, this means that the second thread waits if and only if cond is false.
The reason a while () loop is used in robust code instead of an if () is because pthread_cond_wait() does not guarantee that it will not wake up spuriously. Using a while () also means that signalling the condition variable is always perfectly safe - "extra" signals don't affect the program's correctness, which means that you can do things like move the signal outside of the locked section of code.