I work on an embedded system with eCos:
I have 2 threads within the same process and 1 semaphore.
Thread A initializes a semaphore to 0 so that the 1st attempt to take it will block.
Thread A sends a command to Thread B, providing a callback.
Thread A waits on semaphore with sem_timedwait
Thread B process the command and increments the semaphore
Thread A should be woken up but is still blocked
Here is the code:
Thread A
static sem_t semaphore;
void callback()
{
// Do some stuff
int ret = sem_post(&semaphore);
// print confirmation message
}
void foo()
{
int ret = sem_init(&semaphore, 0, 0);
if(ret != 0)
{
// print errno
}
struct timespec ts;
clock_gettime(CLOCK_REALTIME,&ts); // Get current date
ts.tv_sec += 2; // Add 2s for the deadline
send_command_to_thread_B(&callback);
ret = sem_timedwait(&semaphore, &ts);
if(ret != 0)
{
// print errno
}
// print waking up message
}
What is in Thread B is not relevant.
For debug I tried the following:
Using sem_wait rather than sem_timedwait works: Thread A is blocked, then unlocked after the callback. But I don't want to use it because if there is a failure in the callback process that prevent the semaphore to be incremented, Thread A will wait forever.
If I don't add the 2s to the timespec struct, sem_timedwait returns immediatly and errno is set to ETIMEDOUT (seems legit). The callback is called but it is too late for Thread A.
I put traces in the callback call to ensure that the semaphore is indeed incremented from 0 to 1: all the process is done, the callback exits but Thread A is still blocked.
Do you guys have any clue ? Am I missing something ?
Ok so actually everything is fine with this code, the issue was elsewhere : I had a re-entrance problem that caused a deadlock.
Moral: Carefully protect your resources and addresses in a multi-thread context
Related
So I'm trying to understand exactly how pthread_mutex_lock works.
My current understanding is that it unlocks the mutex and puts whatever thread is going though it to sleep. Sleep meaning that the thread is inactive and consuming no resources.
It then waits for a signal to go from asleep to blocked, meaning that the thread can no longer change any variables.
thread 1:
pthread_mutex_lock(&mutex);
while (!condition){
printf("Thread wating.\n");
pthread_cond_wait(&cond, &mutex);
printf("Thread awakened.\n");
fflush(stdout);
}
pthread_mutex_unlock(&mutex);
pthread_cond_signal(&condVar);
pthread_mutex_unlock(&mutex);
So basically in the sample above, the loop runs and runs and each iteration pthread_cond_wait checks if the condition of the loop is true. If it is then the cond_signal is sent and the thread is blocked so it can't manipulate any more data.
I'm really having trouble wrapping my head around this, I'd appreciate some input and feedback about how this works and whether or not I am beginning to understand this based on what I have above.
I've gone over this post but am still having trouble
First, a summary:
pthread_mutex_lock(&mutex):
If mutex is free, then this thread grabs it immediately.
If mutex is grabbed, then this thread waits until the mutex becomes free, and then grabs it.
pthread_mutex_trylock(&mutex):
If mutex is free, then this thread grabs it.
If mutex is grabbed, then the call returns immediately with EBUSY.
pthread_mutex_unlock(&mutex):
Releases mutex.
pthread_cond_signal(&cond):
Wake up one thread waiting on the condition variable cond.
pthread_cond_broadcast(&cond):
Wake up all threads waiting on the condition variable cond.
pthread_cond_wait(&cond, &mutex):
This must be called with mutex grabbed.
The calling thread will temporarily release mutex and wait on cond.
When cond is broadcast on, or signaled on and this thread happens to be the one woken up, then the calling thread will first re-grab the mutex, and then return from the call.
It is important to note that at all times, the calling thread either has mutex grabbed, or is waiting on cond. There is no interval in between.
Let's look at a practical, running example code. We'll create it along the lines of OP's code.
First, we'll use a structure to hold the parameters for each worker function. Since we'll want the mutex and the condition variable to be shared between threads, we'll use pointers.
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <pthread.h>
#include <limits.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
/* Worker function work. */
struct work {
pthread_t thread_id;
pthread_mutex_t *lock; /* Pointer to the mutex to use */
pthread_cond_t *wait; /* Pointer to the condition variable to use */
volatile int *done; /* Pointer to the flag to check */
FILE *out; /* Stream to output to */
long id; /* Identity of this thread */
unsigned long count; /* Number of times this thread iterated. */
};
The thread worker function receives a pointer to the above structure. Each thread iterates the loop once, then waits on the condition variable. When woken up, if the done flag is still zero, the thread iterates the loop. Otherwise, the thread exits.
/* Example worker function. */
void *worker(void *workptr)
{
struct work *const work = workptr;
pthread_mutex_lock(work->lock);
/* Loop as long as *done == 0: */
while (!*(work->done)) {
/* *(work->lock) is ours at this point. */
/* This is a new iteration. */
work->count++;
/* Do the work. */
fprintf(work->out, "Thread %ld iteration %lu\n", work->id, work->count);
fflush(work->out);
/* Wait for wakeup. */
pthread_cond_wait(work->wait, work->lock);
}
/* *(work->lock) is still ours, but we've been told that all work is done already. */
/* Release the mutex and be done. */
pthread_mutex_unlock(work->lock);
return NULL;
}
To run the above, we'll need a main() as well:
#ifndef THREADS
#define THREADS 4
#endif
int main(void)
{
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t wait = PTHREAD_COND_INITIALIZER;
volatile int done = 0;
struct work w[THREADS];
char *line = NULL, *p;
size_t size = 0;
ssize_t len = 0;
unsigned long total;
pthread_attr_t attrs;
int i, err;
/* The worker functions require very little stack, but the default stack
size is huge. Limit that, to reduce the (virtual) memory use. */
pthread_attr_init(&attrs);
pthread_attr_setstacksize(&attrs, 2 * PTHREAD_STACK_MIN);
/* Grab the mutex so the threads will have to wait to grab it. */
pthread_mutex_lock(&lock);
/* Create THREADS worker threads. */
for (i = 0; i < THREADS; i++) {
/* All threads use the same mutex, condition variable, and done flag. */
w[i].lock = &lock;
w[i].wait = &wait;
w[i].done = &done;
/* All threads output to standard output. */
w[i].out = stdout;
/* The rest of the fields are thread-specific. */
w[i].id = i + 1;
w[i].count = 0;
err = pthread_create(&(w[i].thread_id), &attrs, worker, (void *)&(w[i]));
if (err) {
fprintf(stderr, "Cannot create thread %d of %d: %s.\n", i+1, THREADS, strerror(errno));
exit(EXIT_FAILURE); /* Exits the entire process, killing any other threads as well. */
}
}
fprintf(stderr, "The first character on each line controls the type of event:\n");
fprintf(stderr, " e, q exit\n");
fprintf(stderr, " s signal\n");
fprintf(stderr, " b broadcast\n");
fflush(stderr);
/* Let each thread grab the mutex now. */
pthread_mutex_unlock(&lock);
while (1) {
len = getline(&line, &size, stdin);
if (len < 1)
break;
/* Find the first character on the line, ignoring leading whitespace. */
p = line;
while ((p < line + len) && (*p == '\0' || *p == '\t' || *p == '\n' ||
*p == '\v' || *p == '\f' || *p == '\r' || *p == ' '))
p++;
/* Do the operation mentioned */
if (*p == 'e' || *p == 'E' || *p == 'q' || *p == 'Q')
break;
else
if (*p == 's' || *p == 'S')
pthread_cond_signal(&wait);
else
if (*p == 'b' || *p == 'B')
pthread_cond_broadcast(&wait);
}
/* It is time for the worker threads to be done. */
pthread_mutex_lock(&lock);
done = 1;
pthread_mutex_unlock(&lock);
/* To ensure all threads see the state of that flag,
we wake up all threads by broadcasting on the condition variable. */
pthread_cond_broadcast(&wait);
/* Reap all threds. */
for (i = 0; i < THREADS; i++)
pthread_join(w[i].thread_id, NULL);
/* Output the thread statistics. */
total = 0;
for (i = 0; i < THREADS; i++) {
total += w[i].count;
fprintf(stderr, "Thread %ld: %lu events.\n", w[i].id, w[i].count);
}
fprintf(stderr, "Total: %lu events.\n", total);
return EXIT_SUCCESS;
}
If you save the above as example.c, you can compile it to example using e.g. gcc -Wall -O2 example.c -lpthread -o example.
To get the correct intuitive grasp of the operations, run the example in a terminal, with the source code in a window next to it, and see how the execution progresses as you provide input.
You can even run commands like printf '%s\n' s s s b q | ./example to run a sequence of events in a quick succession, or printf 's\ns\ns\nb\nq\n' | ./example with even less time in between events.
After some experimentation, you'll hopefully find out that not all input events cause their respective action. This is because the exit event (q above) is not synchronous: it does not wait for all pending work to be done, but tells the threads to exit right then and there. That is why the number of events may vary even for the exact same input.
(Also, if you signal on the condition variable, and immediately broadcast on it, the threads tend to only get woken up once.)
You can mitigate that by delaying the exit, using e.g. (printf '%s\n' s s b s s s ; sleep 1 ; printf 'q\n' ) | ./example.
However, there are better ways. A condition variable is not suitable for countable events; it is really flag-like. A semaphore would work better, but then you should be careful to not overflow the semaphore; it can only be from 0 to SEM_VALUE_MAX, inclusive. (So, you could use a semaphore to represent the number of pending job, but probably not for the number of iterations done by each/all thread workers.) A queue for the work to do, in thread pool fashion, is the most common approach.
pthread_cond_wait() simply means that the current thread shall release the mutex and then waits on a condition. The trick here is that both happens atomically, so it cannot happen, that the thread has released the mutex and is not yet waiting on the condition or is already waiting on the condition and has not yet released the mutex. Either both has happened or none has happened.
pthread_cond_signal() simply wakes up any thread that is currently waiting on the signaled condition. The first thing the woken up thread will do is obtaining the mutex again, if it cannot obtain it (e.g. as the signaling thread is currently owning the mutex), it will block until it can. If multiple threads are waiting on the condition, pthread_cond_signal() just wakes up one of them, which one is not defined. If you want to wake up all the waiting threads, you must use pthread_cond_broadcast() instead; but of course they won't run at the same time as now each of them first requires to obtain the mutex and that will only be possible one after another.
pthread_cond_t has no state. If you signal a condition no thread is waiting for, then nothing will happen. It's not like this will set a flag internally and if later on some thread calls pthread_cond_wait(), it will be woken up immediately as there is a pending signal. pthread_cond_signal() only wakes up threads that are already waiting, that means these threads must have called pthread_cond_wait() prior to you calling pthread_cond_signal().
Here's some simple sample code. First a reader thread:
// === Thread 1 ===
// We want to process an item from a list.
// To make sure the list is not altered by one
// thread while another thread is accessing it,
// it is protected by a mutex.
pthread_mutex_lock(&listLock);
// Now nobody but us is allowed to access the list.
// But what if the list is empty?
while (list->count == 0) {
// As long as we hold the mutex, no other thread
// thread can add anything to the list. So we
// must release it. But we want to know as soon
// as another thread has changed it.
pthread_cond_wait(&listCondition, &listLock);
// When we get here, somebody has signaled the
// condition and we have the mutex again and
// thus are allowed to access the list. The list
// may however still be empty, as another thread
// may have already consumed the new item in case
// there are multiple readers and all are woken
// up, thus the while-loop. If the list is still
// empty, we just go back to sleep and wait again.
}
// If we get here, the list is not empty.
processListItem(list);
// Finally we release the mutex again.
pthread_mutex_unlock(&listLock);
And then a writer thread:
// === Thread 2 ===
// We want to add a new item to the list.
// To make sure that nobody is accessing the
// list while we do, we need to obtain the mutex.
pthread_mutex_lock(&listLock);
// Now nobody but us is allowed to access the list.
// Check if the list is empty.
bool listWasEmpty = (list->count == 0);
// We add our item.
addListItem(list, newItem);
// If the list was empty, one or even multiple
// threads may be waiting for us adding an item.
// So we should wake them up here.
if (listWasEmpty) {
// If any thread is waiting for that condition,
// wake it up as now there is an item to process.
pthread_cond_signal(&listCondition);
}
// Finally we must release the mutex again.
pthread_mutex_unlock(&listLock);
The code is written so that there can be any number of reader/writer threads. Signaling only if the list was empty (listWasEmpty) is just a performance optimization, the code would also work correctly if you always signal the condition after adding an item.
Would it be possible to wake up a thread that is waiting on a futex lock? I tried
using a signal mechanism but it does not seem to work. Are there any other approches
I could try out? Below, I've added in an example that might be similar to what I'm
trying to achieve.
I have a thread A that acquires a futex lock "lockA" as follows :-
ret = syscall(__NR_futex, &lockA, FUTEX_LOCK_PI, 1, 0, NULL, 0);
I have a thread B that tries to acquire the futex lock "lockA", and blocks in the kernel,
as thread A has acquired the lock.
ret = syscall(__NR_futex, &lockA, FUTEX_LOCK_PI, 1, 0, NULL, 0);
If thread B does acquire lockA, another thread, thread C will know about it. If thread B
does not acquire the lock, thread C would like thread B to stop waiting for the lock, and
do something else.
So basically, at this point I'm trying to figure out if I can make thread C "signal" thread B
so that it won't block in the kernel anymore. In order to do that, I set a signal handler in
thread B as follows :-
struct sigaction act;
act.sa_handler = handler;
sigemptyset(&act.sa_mask);
act.sa_flags = 0;
act.sa_restorer = NULL;
sigaction(SIGSYS, &act, NULL);
...
...
void handler() {
fprintf(stderr, "Inside the handler, outta the kernel\n");
}
From thread C I try to send the signal as :-
pthread_kill(tid_of_B, SIGSYS);
What am I doing wrong? Can thread B be woken up at all? If so, should I use another approach?
[EDIT]
Based on a comment below, I tried checking the return value from pthread_kill and realised that the call was not returning.
A few things.
You're using FUTEX_LOCK_PI which is not in the man page. I've just looked at the kernel source and a document and it appears this version is only for use inside the kernel itself. It's used to implement a "PI mutex" as a replacement for a kernel spinlock.
If you use a futex, you must implement the semantics on data in the address it points to.
Here is a crude, pseudocode, and possibly/probably wrong example:
int mysem = 1;
void
lock(void)
{
// atomic_dec returns new value
while (1) {
if (atomic_dec(&mysem) == 0)
break;
futex(&mysem,FUTEX_WAIT,...)
}
}
void
unlock(void)
{
// non_atomic_swap returns old value
if (non_atomic_swap(&mysem,1) != 0)
futex(&mysem,FUTEX_WAKE,...)
}
This code plays a sound clip by creating a thread to do it. When bleep() runs, it sets the global variable bleep_playing to TRUE. In its main loop, if it notices that bleep_playing has been set to FALSE, it terminates that loop, cleans up (closing files, freeing buffers), and exits. I don't know the correct way to wait for a detached thread to finish. pthread_join() doesn't do the job. The while loop here continually checks bleep_id to see if it's valid. When it isn't, execution continues. Is this the correct and portable way to tell a thread to clean up and terminate before the next thread is allowed to be created?
if (bleep_playing) {
bleep_playing = FALSE;
while (pthread_kill(bleep_id, 0) == 0) {
/* nothing */
}
}
err = pthread_create(&bleep_id, &attr, (void *) &bleep, &effect);
I
Hmm... pthread_join should do the job. As far as I remember the thread has to call pthread_exit...?
/* bleep thread */
void *bleep(void *)
{
/* do bleeping */
pthread_exit(NULL);
}
/* main thread */
if (pthread_create(&thread, ..., bleep, ...) == 0)
{
/*
** Try sleeping for some ms !!!
** I've had some issues on multi core CPUs requiring a sleep
** in order for the created thread to really "exist"...
*/
pthread_join(&thread, NULL);
}
Anyway if it isn't doing its thing you shouldn't poll a global variable since it will eat up your CPU. Instead create a mutex (pthread_mutex_*-functions) which is initially locked and freed by the "bleep thread". In your main thread you can wait for that mutex which makes your thread sleep until the "bleep thread" frees the mutex.
(or quick & and dirty: sleep for a small amount of time while waiting for bleep_playing becoming FALSE)
I am trying to figure out how to get rid of a reliance on the pthread_timedjoin_np because I am trying to build some code on OSX.
Right now I have a Queue of threads that I am popping from, doing that pthread_timedjoin_np and if they dont return, they get pushed back on the queue.
The end of the thread_function that is called for each thread does a pthread_exit(0); so that the recieving thread can check for a return value of zero.
I thought i might try to use pthread_cond_timedwait() to achieve a similar effect, however I think i am missing a step.
I thought I would be able to make worker Thread A signal a condition AND pthread_exit() within a mutex, , and worker Thread B could wake up on the signal, and then pthread_join(). The problem is, Thread B doesn't know which thread threw the conditional signal. Do I need to explicitly pass that as part of the conditonal signal or what?
Thanks
Derek
Here is a portable implementation of pthread_timedjoin_np. It's a bit costly, but it's a full drop-in replacement:
struct args {
int joined;
pthread_t td;
pthread_mutex_t mtx;
pthread_cond_t cond;
void **res;
};
static void *waiter(void *ap)
{
struct args *args = ap;
pthread_join(args->td, args->res);
pthread_mutex_lock(&args->mtx);
args->joined = 1;
pthread_mutex_unlock(&args->mtx);
pthread_cond_signal(&args->cond);
return 0;
}
int pthread_timedjoin_np(pthread_t td, void **res, struct timespec *ts)
{
pthread_t tmp;
int ret;
struct args args = { .td = td, .res = res };
pthread_mutex_init(&args.mtx, 0);
pthread_cond_init(&args.cond, 0);
pthread_mutex_lock(&args.mtx);
ret = pthread_create(&tmp, 0, waiter, &args);
if (!ret)
do ret = pthread_cond_timedwait(&args.cond, &args.mtx, ts);
while (!args.joined && ret != ETIMEDOUT);
pthread_mutex_unlock(&args.mtx);
pthread_cancel(tmp);
pthread_join(tmp, 0);
pthread_cond_destroy(&args.cond);
pthread_mutex_destroy(&args.mtx);
return args.joined ? 0 : ret;
}
There may be small errors since I wrote this on the spot and did not test it, but the concept is sound.
Producer-consumer queue. Have the threads queue *themselves, and so their results,(if any), to the queue before they exit. Wait on the queue.
No polling, no latency.
With your current design, you would have to join() the returned threads get the valueptr and to ensure that they are destroyed.
Maybe you could sometime move to a real threadpool, where task items are queued to threads that never terminate, (so eliminating thread create/terminate/destroy overhead)?
solution with alarm.
pthread should enable cancel, so it can stop by external.(even with pthread_timedjoin_np).
pthread_timedjoin_np return with ETIMEOUT after waited time.
set alarm, use alarm also can give "TIMEOUT" signal.
In handler, just pthread_cancel it. (only timeout run this).
pthread_join it in main thread.
reset alarm
I write test code in here:github
I seem to be running in to a possible deadlock with a pthreads conditional variable.
Here is the code
thread function(){
for (condition){
do work
/* should the thread continue? */
if (exit == 1){
break; /* exit for */
}
} /* end for */
pthread_mutex_lock(&mtxExit);
exit = 0;
pthread_cond_signal(&condVar);
pthread_mutex_unlock(&mtxExit);
}
The main function is as follows:
function main(){
if (thread is still active){
pthread_mutex_lock(&mtxExit);
exit = 1;
pthread_mutex_unlock(&mtxExit);
} /* end if */
while (exit == 1){
pthread_mutex_lock(&mtxExit);
/* check again */
if (exit == 1)
pthread_cond_wait(&condVar, &mtxExit);
pthread_mutex_unlock(&mtxExit);
}
create new thread()
....
}
The code is always getting stuck at cond_wait. :(
EDIT:
Let me add some clarification to the thread to explain what I am doing.
At any given time, I need only one thread running. I have a function that starts the thread, tells it what to do and the main thread continues it work.
The next time the main thread decides it needs to spawn another thread, it has to make sure the thread that was previously started has exited. I cannot have two threads alive at the same time as they will interfere with each other. This is by design and by definition of the problem I am working on.
That is where I am running in to problems.
This is my approach:
Start the thread, let it do its job.
the thread checks in every step of its job to see if it is still relevant. This is where "exit" comes in to picture. The main thread sets "exit" to 1, if it needs to tell the thread that it is no longer relevant.
In most cases, the thread will exit before the main thread decides to spawn another thread. But I still need to factor in the case that the thread is still alive by the time the main thread is ready to start another one.
So the main thread sets the value of "exit" and needs to wait for the thread to exit. I dont want to use pthread_kill with 0 as signal because then main thread will be in a loop wasting CPU cycles. I need the main thread to relinquish control and sleep/wait till the thread exits.
Since I only need one thread at a time, I dont need to worry about scaling to more threads. The solution will never have more than one thread. I just need a reliable mechanism to test if my thread is still alive, if it is, signal it to exit, wait for it to exit and start the next one.
From my testing, it looks like, the main thread is still entering the conditional variable even if the thread may have exited or that the signal is not getting delivered to the main thread at all. And its waiting there forever. And is some cases, in debugger I see that the value of exit is set to 0 and still the main thread is waiting at signal. There seems to be a race condition some where.
I am not a fan of how I set up the code right now, its too messy. Its only a proof of concept right now, I will move to a better solution soon. My challenge is to reliably signal the thread to exit, wait on it to exit.
I appreciate your time.
Did you forget to initialize your condition variable?
pthread_cond_init(&condVar, NULL)
while (exit == 1) {
In the code you quote, the way you quote I do not see any particular problem. It is not clean, but it appears functional. What leads me to believe that somewhere else you are setting exit to 0 without signaling that. Or the thread is getting stuck somewhere doing the work.
But considering the comments which hint that you try to signal one thread to terminate before starting another thread, I think you are doing it wrong. Generally pthread condition signaling shouldn't be relied upon if a signal may not be missed. Though it seems that state variable exit covers that, it is still IMO wrong application of the pthread conditions.
In the case you can try to use a semaphores. While terminating, the thread increments the termination semaphore so that main can wait (decrement) the semaphore.
thread function()
{
for (condition)
{
do work
/* should the thread continue? */
if (exit == 1) {
break; /* exit for */
}
} /* end for */
sem_post(&termSema);
}
function main()
{
if (thread is still active)
{
exit = 1;
sem_wait(&termSema);
exit = 0;
}
create new thread()
....
}
As a general remark, I can suggest to look for some thread pool implementations. Because using a state variable to sync threads is still wrong and doesn't scale to more than one thread. And error prone.
When the code is stuck in pthread_cond_wait, is exit 1 or 0? If exit is 1, it should be stuck.
If exit is 0, one of two things are most likely the case:
1) Some code set exit to 0 but didn't signal the condition variable.
2) Some thread blocked on pthread_cond_wait, consumed a signal, but didn't do whatever it is you needed done.
You have all sorts of timing problems with your current implementation (hence the problems).
To ensure that the thread has finished (and its resources have been released), you should call pthread_join().
There is no need for a pthread_cond_t here.
It might also make more sense to use pthread_cancel() to notify the thread that it is no longer required, rather than a flag like you are currently doing.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
void *thread_func(void *arg) {
int i;
for (i = 0; i < 10; i++) {
/* protect any regions that must not be cancelled... */
pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, NULL);
/* very important work */
printf("%d\n", i);
pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);
/* force a check to see if we're finished */
pthread_testcancel();
/* sleep (for clarity in the example) */
sleep(1);
}
return NULL;
}
void main(void) {
int ret;
pthread_t tid;
ret = pthread_create(&tid, NULL, thread_func, NULL);
if (ret != 0) {
printf("pthread_create() failed %d\n", ret);
exit(1);
}
sleep(5);
ret = pthread_cancel(tid);
if (ret != 0) {
printf("pthread_cancel() failed %d\n", ret);
exit(1);
}
ret = pthread_join(tid, NULL);
if (ret != 0) {
printf("pthread_join() failed %d\n", ret);
exit(1);
}
printf("finished...\n");
}
It's also worth noting:
exit() is a library function - you should not declare anything with the same name as something else.
Depending on your specific situation, it might make sense to keep a single thread alive always, and provide it with jobs to do, rather than creating / cancelling threads continuously (research 'thread pools')