I have the following code running for N threads with count=0 initially as shared variable. Every variable is initialised before the working of the threads. I am trying to execute the critical section of code only for MAX number of threads.
void *tmain(){
while(1){
pthread_mutex_lock(&count_mutex);
count++;
if(count>MAX){
pthread_cond_wait(&count_threshold_cv, &count_mutex);
}
pthread_mutex_unlock(&count_mutex);
/*
some code not associated with count_mutex or count_threshold_cv
*/
pthread_mutex_lock(&count_mutex);
count--;
pthread_cond_signal(&count_threshold_cv);
pthread_mutex_unlock(&count_mutex);
}
}
But after running for some time the threads get blocked at pthread_cond_signal(). I am unable to understand why this is occuring. Any help is appreciated.
This code has one weak point that may lead to a blocking problem.
More precisely, it is not protected against so called spurious wakes up,
meaning that the pthread_cond_wait() function may return when no signal were delivered explicitly by calling either pthread_cond_signal() or pthread_cond_broadcast().
Therefore, the following lines from the code do not guarantee that the thread wakes up when the count variable is less or equal MAX
if(count>MAX){
pthread_cond_wait(&count_threshold_cv, &count_mutex);
}
Let's see what may happen when one thread wakes up when the count still greater than MAX:
immediately after that the mutex is unlocked.
Now other thread can enter the critical session and increment the count variable more than expected:
pthread_mutex_lock(&count_mutex);
count++;
How to protect code against spurious signals?
The pthread_cond_wait wake up is a recommendation to check the predicate (count>MAX).
If it is still false, we need to continue to wait on the conditional variable.
Try to fix your code by changing the if statement to the while statement (and, as remarked by #alk, change the tmain() signature):
while(count>MAX)
{
pthread_cond_wait(&count_threshold_cv, &count_mutex);
}
Now, if a spurious wake up occurs and the count still greater than MAX,
the flow will wait on the conditional variable again. The flow will escape the waiting loop only when a wake up is accompanied by the predicate change.
The reason your code blocks is because you place count++ before the wait:
count++;
if(count>MAX){
pthread_cond_wait(&count_threshold_cv, &count_mutex);
}
Instead you should write
while (count >= MAX) {
pthread_cond_wait(&count_threshold_cv, &count_mutex);
}
count++;
The reason is that count should be the number of working threads.
A thread must only increment count when it is done waiting.
Your count variable, on the other hand, counts the number of working threads plus the number of waiting threads. This count is too large and leads to the condition count > MAX being true which blocks.
You should also replace "if" with "while" as MichaelGoren writes. Using "if" instead of "while" does not lead to blocking, but rather to too many threads running simultaneously; the woken thread starts working even if count > MAX.
The reason that you need "while" is that pthread_cond_signal unblocks one of the waiting threads. The unblocked thread, however, is still waiting for the mutex, and it is not necessarily scheduled to run either. When the awoken thread finally acquires the mutex and starts running, the call to pthread_cond_wait returns. In the mean time, between pthread_cond_signal and return of pthread_cond_wait, other threads could have owned the mutex. So you must check the condition again, which is what "while" does.
Also, because count++ is now after wait, the condition becomes count >= MAX instead of count > MAX. You should wait even if the number of workers is MAX.
Alternatively, you could have used semaphores for this problem.
Related
In my program the main thread starts multiple child threads, locks a mutex for a counter and then sleeps until the counter reaches N.
In an infinite for loop, the other threads do some work and they increase the counter in each iteration alternately. Once the counter == N one of them has to signal the main thread and all of the child threads have to end.
My problem is that I'm not sure when I should lock the counter mutex and make a signal in the thread function, because once the counter becomes N I managed to wake up the main thread and exit one of the child threads, but the other threads will keep on trying to work when they should all be terminating.
My question is, how can I check if counter == N but send a signal by only one of the threads, and the others just return without any signalling?
Can this be done by only locking the mutex once in each iteration, for the time of checking its value (== N)? If yes, how?
void *thread_function(void *arg) {
int *id = (int *) arg;
for (;;) {
pthread_mutex_lock(&mutex_counter);
counter++;
if (counter == N) {
pthread_cond_signal(&cond);
pthread_mutex_unlock(&mutex_counter);
return NULL;
}
if (counter > N) {
pthread_mutex_unlock(&mutex_counter);
return NULL;
}
pthread_mutex_unlock(&mutex_counter);
sleep(random());
// do some work here
// ...
// finish work
}
The problem with above code is that, despite all threads terminating, they execute the loop one less iteration than needed because the counter is increased before the if clause (so counter will be N, but there were no N loops of work done).
However, if I put the counter to the end of the function, the threads will do more work than needed which messes up my program.
why you lock counter-mutex in main thread? Why you need to send signal from other treads to main? If you have only one global counter and all threads increasing it, you can just start in main thread all threads, remember their thread ids and then wait using while(counter<N); instruction. After all, just kill all threads with pthread_cancel.
Case is : There are multiple (say n^2) pthreads all derived from same function.
Pthread with id zero is master .
Each pthread (other than zero) has a for loop which runs n times
for(j=0;j<n;j++)
{ pthread_mutex_lock(&conditionmutex);
pthread_cond_wait(&cv,&conditionmutex);
pthread_mutex_unlock(&conditionmutex);
do something;
}
For thread zero it is
for(j=0;j<n;j++)
{ pthread_cond_broadcast(&cv);
do something else;
}
I expected that each pthread (other than zero) will wait at the beginning of for loop.
Thread zero will signal then all other threads will 'do something'
and then wait again at beginning of loop with incremented value of j.
My program gets stuck at beginning itself.
Can't find out the cause.
Thanks.
PS:cv is condition variable initialized to NULL , similiarly for mutex variable 'conditionmutex'.
pthread_cond_t cv;
pthread_cond_init(&cv, NULL);
pthread_mutex_init(&conditionmutex, NULL);
I have put up a barrier of n^2 threads before for loop in function.
This ensures that all threads are created and waiting before for loop.
I want to know - if my method is ok ?
I have a thread running in while with condition and it has sleep of 2 minutes.
(i.e.
while (condition) {
//do something
sleep(120);
}
)
To terminate the thread gracefully, I used pthread_join() and made while condition to false (e.g. someflag = 0)
And its working to terminate the thread, but if the thread is sleeping, it doesn't terminate until it finishes sleeping.
This is the problem I need avoid; I need to make thread come out early even if it is in sleep.
None of the above. Instead of while (condition) sleep(120); you should be using a condition variable:
while (condition) {
...
pthread_cond_timedwait(&condvar, &mutex, &abstime);
...
}
I chose pthread_cond_timedwait assuming you actually need to wake up and do something every 120 seconds even if nobody signals you, but if not you could just use pthread_cond_wait instead. The signaling thread needs to call pthread_cond_signal(&condvar) after changing the condition, and of course all access (reads and writes) to the state the condition depends on need to be protected by a mutex, mutex. You have to hold the mutex while calling pthread_cond_[timed]wait. If you have further questions on how to use condition variables, search the existing questions/answers (there are lots) or ask a follow-up.
This may not be the right answer, however I can suggest a work around to break sleep() of 120 sec into smaller time such as 2 seconds and put that in loop. Every time the loop executes, you can check for condition e.g.
while (condition)
{
//do something
int i = 0;
while(condition && (60 > i))
{
sleep (2);
i++;
}
}
I hope someone will surely paste better answer.
I am trying to learn basics of pthread_cond_wait. In all the usages, I see either
if(cond is false)
pthread_cond_wait
or
while(cond is false)
pthread_cond_wait
My question is, we want to cond_wait only because condition is false. Then why should i take the pain of explicitly putting an if/while loop. I can understand that without any if/while check before cond_wait we will directly hit that and it wont return at all. Is the condition check solely for solving this purpose or does it have anyother significance. If it for solving an unnecessary condition wait, then putting a condition check and avoiding the cond_wait is similar to polling?? I am using cond_wait like this.
void* proc_add(void *name){
struct vars *my_data = (struct vars*)name;
printf("In thread Addition and my id = %d\n",pthread_self());
while(1){
pthread_mutex_lock(&mutexattr);
while(!my_data->ipt){ // If no input get in
pthread_cond_wait(&mutexaddr_add,&mutexattr); // Wait till signalled
my_data->opt = my_data->a + my_data->b;
my_data->ipt=1;
pthread_cond_signal(&mutexaddr_opt);
}
pthread_mutex_unlock(&mutexattr);
if(my_data->end)
pthread_exit((void *)0);
}
}
The logic is, I am asking the input thread to process the data whenever an input is available and signal the output thread to print it.
You need a while loop because the thread that called pthread_cond_wait might wake up even when the condition you are waiting for isn't reached. This phenomenon is called "spurious wakeup".
This is not a bug, it is the way the conditional variables are implemented.
This can also be found in man pages:
Spurious wakeups from the pthread_cond_timedwait() or
pthread_cond_wait() functions may occur. Since the return from
pthread_cond_timedwait() or pthread_cond_wait() does not imply
anything about the value of this predicate, the predicate should be
re-evaluated upon such return.
Update regarding the actual code:
void* proc_add(void *name)
{
struct vars *my_data = (struct vars*)name;
printf("In thread Addition and my id = %d\n",pthread_self());
while(1) {
pthread_mutex_lock(&mutexattr);
while(!my_data->ipt){ // If no input get in
pthread_cond_wait(&mutexaddr_add,&mutexattr); // Wait till signalled
}
my_data->opt = my_data->a + my_data->b;
my_data->ipt=1;
pthread_cond_signal(&mutexaddr_opt);
pthread_mutex_unlock(&mutexattr);
if(my_data->end)
pthread_exit((void *)0);
}
}
}
You must test the condition under the mutex before waiting because signals of the condition variable are not queued (condition variables are not semaphores). That is, if a thread calls pthread_cond_signal() when no threads are blocked in pthread_cond_wait() on that condition variable, then the signal does nothing.
This means that if you had one thread set the condition:
pthread_mutex_lock(&m);
cond = true;
pthread_cond_signal(&c);
pthread_mutex_unlock(&m);
and then another thread unconditionally waited:
pthread_mutex_lock(&m);
pthread_cond_wait(&c, &m);
/* cond now true */
this second thread would block forever. This is avoided by having the second thread check for the condition:
pthread_mutex_lock(&m);
if (!cond)
pthread_cond_wait(&c, &m);
/* cond now true */
Since cond is only modified with the mutex m held, this means that the second thread waits if and only if cond is false.
The reason a while () loop is used in robust code instead of an if () is because pthread_cond_wait() does not guarantee that it will not wake up spuriously. Using a while () also means that signalling the condition variable is always perfectly safe - "extra" signals don't affect the program's correctness, which means that you can do things like move the signal outside of the locked section of code.
I have a doubt regarding pthread_cond_wait and pthread_cond_signal functions. I was not able to understand after reading the man pages also.
Please consider the following code.
void* thread_handler(){
... // counts till COUNT_LIMIT is reached
if (count == COUNT_LIMIT) {
pthread_cond_signal(&count_threshold_cv);
printf("inc_count(): thread %ld, count = %d Threshold reached.\n",
my_id, count);
}
printf("inc_count(): thread %ld, count = %d, unlocking mutex\n",
my_id, count);
...
}
void* thread_handler1(){
... // waits till the previous thread has finished counting
pthread_mutex_lock(&count_mutex);
while (count<COUNT_LIMIT) {
pthread_cond_wait(&count_threshold_cv, &count_mutex);
printf("watch_count(): thread %ld Condition signal received.\n", my_id);
}
pthread_mutex_unlock(&count_mutex);
pthread_exit(NULL);
}
The code is working as per expectations. I am trying to understand the code. Here is how the program works
Enter thread_handler1 and do cond_wait. From the man page i understood that cond_wait will immediatly release the lock atomically. So why are they releasing the lock again below in thread_handler1
After first thread has satisfied the condition and hits the condition signal I expected the thread which was blocking to execute its steps. Instead i got the printfs below the thread which executed the cond_signal. Why is this happening
Overall why do we need to take a lock before waiting and signalling. Cant this be done without a lock.
For a briefer look at the program, please see the Complete program here. You can find it under the section "Using Conditional Variables"
Thanks in advance
Chidambaram
Enter thread_handler1 and do cond_wait. From the man page i understood
that cond_wait will immediatly release the lock atomically. So why are
they releasing the lock again below in thread_handler1
Because the thread that is supposed to wait will release the lock when calling wait, but will then reacquire after it gets signaled (when it becomes available). That's why you need to rerelease it explicitly later.
After first thread has satisfied the condition and hits the condition
signal I expected the thread which was blocking to execute its steps.
Instead i got the printfs below the thread which executed the
cond_signal. Why is this happening
Because calling signal does not switch the thread from the CPU. It will continue to run normally.