How do I stop threads stalling on pthread_join? - c

I've got a project where I'm adding jobs to a queue and I have multiple threads taking jobs, and calculating their own independent results.
My program handles the SIGINT signal and I'm attempting to join the threads to add up the results, print to screen, and then exit. My problem is is that the threads either seem to stop functioning when I send the signal, or they're getting blocked on the mutex_lock. Here are the important parts of my program so as to be concise.
main.c
//the thread pool has a queue of jobs inside
//called jobs (which is a struct)
struct thread_pool * pool;
void signal_handler(int signo) {
pool->jobs->running = 0; //stop the thread pool
pthread_cond_broadcast(pool->jobs->cond);
for (i = 0; i < tpool->thread_count; i++) {
pthread_join(tpool->threads[i], retval);
//do stuff with retval
}
//print results then exit
exit(EXIT_SUCCESS);
}
int main() {
signal(SIGINT, signal_handler);
//set up threadpool and jobpool
//start threads (they all run the workerThread function)
while (1) {
//send jobs to the job pool
}
return 0;
}
thread_stuff.c
void add_job(struct jobs * j) {
if (j->running) {
pthread_mutex_lock(j->mutex);
//add job to queue and update count and empty
pthread_cond_signal(j->cond);
pthread_mutex_unlock(j->mutex);
}
}
struct job * get_job(struct jobs * j) {
pthread_mutex_lock(j->mutex);
while (j->running && j->empty)
pthread_cond_wait(j->cond, j->mutex);
if (!j->running || j->empty) return NULL;
//get the next job from the queue
//unlock mutex and send a signal to other threads
//waiting on the condition
pthread_cond_signal(j->cond);
pthread_mutex_unlock(j->mutex);
//return new job
}
void * workerThread(void * arg) {
struct jobs * j = (struct jobs *) arg;
int results = 0;
while (j->running) {
//get next job and process results
}
return results;
}
Thanks for your help, this is giving me a real headache!

You should not call pthread_cond_wait or pthread_join from a signal handler which handles asynchronously generated signals such as SIGINT. Instead, you should block SIGINT for all threads, spawn a dedicated thread, and call sigwait there. This means that you detect the arrival of the SIGINT signal outside of a signal handler context, so that you are not restricted to async-signal-safe functions. You also avoid the risk of self-deadlock in case the signal is delivered to one of the worker threads.
At this point, you just need to shut down your work queue/thread pool in an orderly manner. Depending on the details, your existing approach with the running flag might even work unchanged.

Related

How to wake the sleeping threads and exit the main thread?

I am creating 10 threads. Each thread will do some task. There are 7 tasks to be done. Since number of tasks are less than the number of threads, there would always be 3 threads that sleep and do nothing.
My main thread must wait for the tasks to get completed and exit only when all the tasks are done (i.e when the thread exits). I am waiting in a for loop and calling pthread_join but with the 3 threads that are sleeping, how can I wake them up and make them exit?
Here is what I am doing now.
// thread handler function, called when the thread is created
void* handler_func(void* arg) {
while(true){
pthread_mutex_lock(&my_queue_mutex);
while(my_queue_is_empty()) {
pthread_cond_wait(&my_cond_var, &my_queue_mutex);
}
item = get_item_from_queue();
pthread_mutex_unlock(&my_queue_mutex);
}
pthread_exit(NULL);
}
int total_threads_to_create = 10;
int total_requests_to_make = 7;
pthread_t threads[total_threads_to_create];
for(int i = 0; i < total_threads_to_create; i++) {
// create threads
pthread_create(&threads[i], NULL, handler_func, NULL);
}
for(int i=0;i<total_requests_to_make;i++){
// fill up the task queue
add_task_to_my_queue(i + 100);
}
for(int i = 0; i< total_threads_to_create; i++) {
// wait for threads to finish
pthread_join(threads[i], NULL);
}
The simplest thing to do would be to enqueue dummy tasks that indicate "all done" to the workers, since you know the number of workers in advance:
for(int i=0;i<total_threads_to_create;i++){
// a task of -1 means "no more work"
add_task_to_my_queue(-1);
}
Alternatively, you can have a "breakable" queue. This queue wakes up waiting consumers with a compound predicate: either non-empty or finished. Perl's Thread::Queue objects can be ended, for example, and Python's queues can track completed tasks.
Another option would be to keep track of the number of tasks completed yourself, with its own condition variable and mutex or a "countdown latch" or whatever, and not care about the worker threads. They'll be vaporized when the program exits.

The version of pthread_join() that does not block main(): POSIX

I am trying to write a code that does not block main() when pthread_join() is called:
i.e. basically trying to implement my previous question mentioned below:
https://stackoverflow.com/questions/24509500/pthread-join-and-main-blocking-multithreading
And the corresponding explanation at:
pthreads - Join on group of threads, wait for one to exit
As per suggested answer:
You'd need to create your own version of it - e.g. an array of flags (one flag per thread) protected by a mutex and a condition variable; where just before "pthread_exit()" each thread acquires the mutex, sets its flag, then does "pthread_cond_signal()". The main thread waits for the signal, then checks the array of flags to determine which thread/s to join (there may be more than one thread to join by then).
I have tried as below:
My status array which keeps a track of which threads have finished:
typedef struct {
int Finish_Status[THREAD_NUM];
int signalled;
pthread_mutex_t mutex;
pthread_cond_t FINISHED;
}THREAD_FINISH_STATE;
The thread routine, it sets the corresponding array element when the thread finishes and also signals the condition variable:
void* THREAD_ROUTINE(void* arg)
{
THREAD_ARGUMENT* temp=(THREAD_ARGUMENT*) arg;
printf("Thread created with id %d\n",temp->id);
waitFor(5);
pthread_mutex_lock(&(ThreadFinishStatus.mutex));
ThreadFinishStatus.Finish_Status[temp->id]=TRUE;
ThreadFinishStatus.signalled=TRUE;
if(ThreadFinishStatus.signalled==TRUE)
{
pthread_cond_signal(&(ThreadFinishStatus.FINISHED));
printf("Signal that thread %d finished\n",temp->id);
}
pthread_mutex_unlock(&(ThreadFinishStatus.mutex));
pthread_exit((void*)(temp->id));
}
I am not able to write the corresponding parts pthread_join() and pthread_cond_wait() functions. There are a few things which I am not able to implement.
1) How to write corresponding part pthread_cond_wait() in my main()?
2) I am trying to write it as:
pthread_mutex_lock(&(ThreadFinishStatus.mutex));
while((ThreadFinishStatus.signalled != TRUE){
pthread_cond_wait(&(ThreadFinishStatus.FINISHED), &(ThreadFinishStatus.mutex));
printf("Main Thread signalled\n");
ThreadFinishStatus.signalled==FALSE; //Reset signalled
//check which thread to join
}
pthread_mutex_unlock(&(ThreadFinishStatus.mutex));
But it does not enter the while loop.
3) How to use pthread_join() so that I can get the return value stored in my arg[i].returnStatus
i.e. where to put below statement in my main:
`pthread_join(T[i],&(arg[i].returnStatus));`
COMPLETE CODE
#include <stdio.h>
#include <pthread.h>
#include <time.h>
#define THREAD_NUM 5
#define FALSE 0
#define TRUE 1
void waitFor (unsigned int secs) {
time_t retTime;
retTime = time(0) + secs; // Get finishing time.
while (time(0) < retTime); // Loop until it arrives.
}
typedef struct {
int Finish_Status[THREAD_NUM];
int signalled;
pthread_mutex_t mutex;
pthread_cond_t FINISHED;
}THREAD_FINISH_STATE;
typedef struct {
int id;
void* returnStatus;
}THREAD_ARGUMENT;
THREAD_FINISH_STATE ThreadFinishStatus;
void initializeState(THREAD_FINISH_STATE* state)
{
int i=0;
state->signalled=FALSE;
for(i=0;i<THREAD_NUM;i++)
{
state->Finish_Status[i]=FALSE;
}
pthread_mutex_init(&(state->mutex),NULL);
pthread_cond_init(&(state->FINISHED),NULL);
}
void destroyState(THREAD_FINISH_STATE* state)
{
int i=0;
for(i=0;i<THREAD_NUM;i++)
{
state->Finish_Status[i]=FALSE;
}
pthread_mutex_destroy(&(state->mutex));
pthread_cond_destroy(&(state->FINISHED));
}
void* THREAD_ROUTINE(void* arg)
{
THREAD_ARGUMENT* temp=(THREAD_ARGUMENT*) arg;
printf("Thread created with id %d\n",temp->id);
waitFor(5);
pthread_mutex_lock(&(ThreadFinishStatus.mutex));
ThreadFinishStatus.Finish_Status[temp->id]=TRUE;
ThreadFinishStatus.signalled=TRUE;
if(ThreadFinishStatus.signalled==TRUE)
{
pthread_cond_signal(&(ThreadFinishStatus.FINISHED));
printf("Signal that thread %d finished\n",temp->id);
}
pthread_mutex_unlock(&(ThreadFinishStatus.mutex));
pthread_exit((void*)(temp->id));
}
int main()
{
THREAD_ARGUMENT arg[THREAD_NUM];
pthread_t T[THREAD_NUM];
int i=0;
initializeState(&ThreadFinishStatus);
for(i=0;i<THREAD_NUM;i++)
{
arg[i].id=i;
}
for(i=0;i<THREAD_NUM;i++)
{
pthread_create(&T[i],NULL,THREAD_ROUTINE,(void*)&arg[i]);
}
/*
Join only if signal received
*/
pthread_mutex_lock(&(ThreadFinishStatus.mutex));
//Wait
while((ThreadFinishStatus.signalled != TRUE){
pthread_cond_wait(&(ThreadFinishStatus.FINISHED), &(ThreadFinishStatus.mutex));
printf("Main Thread signalled\n");
ThreadFinishStatus.signalled==FALSE; //Reset signalled
//check which thread to join
}
pthread_mutex_unlock(&(ThreadFinishStatus.mutex));
destroyState(&ThreadFinishStatus);
return 0;
}
Here is an example of a program that uses a counting semaphore to watch as threads finish, find out which thread it was, and review some result data from that thread. This program is efficient with locks - waiters are not spuriously woken up (notice how the threads only post to the semaphore after they've released the mutex protecting shared state).
This design allows the main program to process the result from some thread's computation immediately after the thread completes, and does not require the main wait for all threads to complete. This would be especially helpful if the running time of each thread varied by a significant amount.
Most importantly, this program does not deadlock nor race.
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>
#include <stdio.h>
#include <queue>
void* ThreadEntry(void* args );
typedef struct {
int threadId;
pthread_t thread;
int threadResult;
} ThreadState;
sem_t completionSema;
pthread_mutex_t resultMutex;
std::queue<int> threadCompletions;
ThreadState* threadInfos;
int main() {
int numThreads = 10;
int* threadResults;
void* threadResult;
int doneThreadId;
sem_init( &completionSema, 0, 0 );
pthread_mutex_init( &resultMutex, 0 );
threadInfos = new ThreadState[numThreads];
for ( int i = 0; i < numThreads; i++ ) {
threadInfos[i].threadId = i;
pthread_create( &threadInfos[i].thread, NULL, &ThreadEntry, &threadInfos[i].threadId );
}
for ( int i = 0; i < numThreads; i++ ) {
// Wait for any one thread to complete; ie, wait for someone
// to queue to the threadCompletions queue.
sem_wait( &completionSema );
// Find out what was queued; queue is accessed from multiple threads,
// so protect with a vanilla mutex.
pthread_mutex_lock(&resultMutex);
doneThreadId = threadCompletions.front();
threadCompletions.pop();
pthread_mutex_unlock(&resultMutex);
// Announce which thread ID we saw finish
printf(
"Main saw TID %d finish\n\tThe thread's result was %d\n",
doneThreadId,
threadInfos[doneThreadId].threadResult
);
// pthread_join to clean up the thread.
pthread_join( threadInfos[doneThreadId].thread, &threadResult );
}
delete threadInfos;
pthread_mutex_destroy( &resultMutex );
sem_destroy( &completionSema );
}
void* ThreadEntry(void* args ) {
int threadId = *((int*)args);
printf("hello from thread %d\n", threadId );
// This can safely be accessed since each thread has its own space
// and array derefs are thread safe.
threadInfos[threadId].threadResult = rand() % 1000;
pthread_mutex_lock( &resultMutex );
threadCompletions.push( threadId );
pthread_mutex_unlock( &resultMutex );
sem_post( &completionSema );
return 0;
}
Pthread conditions don't have "memory"; pthread_cond_wait doesn't return if pthread_cond_signal is called before pthread_cond_wait, which is why it's important to check the predicate before calling pthread_cond_wait, and not call it if it's true. But that means the action, in this case "check which thread to join" should only depend on the predicate, not on whether pthread_cond_wait is called.
Also, you might want to make the while loop actually wait for all the threads to terminate, which you aren't doing now.
(Also, I think the other answer about "signalled==FALSE" being harmless is wrong, it's not harmless, because there's a pthread_cond_wait, and when that returns, signalled would have changed to true.)
So if I wanted to write a program that waited for all threads to terminate this way, it would look more like
pthread_mutex_lock(&(ThreadFinishStatus.mutex));
// AllThreadsFinished would check that all of Finish_Status[] is true
// or something, or simpler, count the number of joins completed
while (!AllThreadsFinished()) {
// Wait, keeping in mind that the condition might already have been
// signalled, in which case it's too late to call pthread_cond_wait,
// but also keeping in mind that pthread_cond_wait can return spuriously,
// thus using a while loop
while (!ThreadFinishStatus.signalled) {
pthread_cond_wait(&(ThreadFinishStatus.FINISHED), &(ThreadFinishStatus.mutex));
}
printf("Main Thread signalled\n");
ThreadFinishStatus.signalled=FALSE; //Reset signalled
//check which thread to join
}
pthread_mutex_unlock(&(ThreadFinishStatus.mutex));
Your code is racy.
Suppose you start a thread and it finishes before you grab the mutex in main(). Your while loop will never run because signalled was already set to TRUE by the exiting thread.
I will echo #antiduh's suggestion to use a semaphore that counts the number of dead-but-not-joined threads. You then loop up to the number of threads spawned waiting on the semaphore. I'd point out that the POSIX sem_t is not like a pthread_mutex in that sem_wait can return EINTR.
Your code appears fine. You have one minor buglet:
ThreadFinishStatus.signalled==FALSE; //Reset signalled
This does nothing. It tests whether signalled is FALSE and throws away the result. That's harmless though since there's nothing you need to do. (You never want to set signalled to FALSE because that loses the fact that it was signalled. There is never any reason to unsignal it -- if a thread finished, then it's finished forever.)
Not entering the while loop means signalled is TRUE. That means the thread already set it, in which case there is no need to enter the loop because there's nothing to wait for. So that's fine.
Also:
ThreadFinishStatus.signalled=TRUE;
if(ThreadFinishStatus.signalled==TRUE)
There's no need to test the thing you just set. It's not like the set can fail.
FWIW, I would suggest re-architecting. If the existing functions like pthread_join don't do exactly what you want, just don't use them. If you're going to have structures that track what work is done, then totally separate that from thread termination. Since you will already know what work is done, what different does it make when and how threads terminate? Don't think of this as "I need a special way to know when a thread terminates" and instead think of this "I need to know what work is done so I can do other things".

Wait termination of threads work before close a server after receiving SIGINT or SIGTERM (Posix thread in C)

I'm developing a client / server in C using pthread.
It's a card game where server is the dealer between two clients that play.
I have this situation
server.c:
main:
int main (int argc, char * argv[])
using as a thread dispatcher for listening eventually entering connection on accept
pthread_create(&sig_thread, NULL, &thread_signal_handler, &server_Socket );
thread to catch signals (maskered conveniently) created inside main function
pthread_create(&tid[i++], NULL, &worker, (void*) arr_args)
every client is associated to a thread created (tid is the array of thread ID)
pthread_join(sig_thread, NULL)
to join thread for signals
pthread_join(tid[i], NULL)
to join thread created for every client connected.
worker:
void * worker(void * args)
do his job and terminated
thread signal handler:
void * thread_signal_handler(void * arg)
void * thread_signal_handler(void * arg) {
sigset_t set;
int sig, n;
int * socket = (int*) arg;
sigemptyset(&set);
sigaddset(&set, SIGINT);
sigaddset(&set,SIGTERM);
while (server_UP)
{
/* wait for a signal */
if (sigwait(&set, &sig))
perror("Sigwait");
/* received SIGINT or SIGTERM - closing server */
if ((sig == SIGINT || sig == SIGTERM)) {
printf(TERM_SERVER"\n");
server_UP = 0; /* global variable */
/* close socket blocked on MAIN */
shutdown( *socket, SHUT_RDWR );
}
return (void *) EXIT_SUCCESS;
}
In this way, the signal is catched well and the server is terminated immediately.
I need, if there are matches in progress, to waiting that all the workers finish their jobs properly, so I want to know how to say to thread handler "did you receive a SIGINT (or SIGTERM)? Ok, waiting the termination of all workers".
Any suggests? If you need more code, i'll edit this post.
When you create them you can save all of the threads in an array or a list and at the end call pthread_join() on all of them.
while (pthread_list) {
pthread_join(pthread_list->th_id, NULL);
pthread_list = pthread_list->next;
}
If you don't want to keep track of all threads you can keep a count of running threads, increment it before calling pthread_create, decrement the count as each thread finishes.
Then in the handler you'll sleep for some seconds iteratively until the count becomes 0.

Pthreads and signals

I'm having a little trouble with pthreads. Basically, I want to catch a SIGINT and have all threads cleanup and exit. What I have (skeleton code):
main.c:
sig_atomic_t running;
void handler(int signal_number)
{
running = 0;
}
int main(void)
{
queue job_queue = new_job_queue();
running = 1;
struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sa.sa_handler = &handler;
sigaction(SIGINT, &sa, NULL);
/* create a bunch of threads */
init_threads(&job_queue);
while(running) {
/* do stuff */
}
cleanup();
return (0);
}
threads.c
extern sig_atomic_t running;
pthread_mutex_t queue_mutex = PTHREAD_MUTEX_INITIALIZER;
sem_t queue_count;
void init_threads(queue *q)
{
int numthreads = 12; /* say */
sem_init (&queue_count, 0, 0);
pthread_t worker_threads[numthreads];
int i;
for(i=0;i<numthreads;i++)
pthread_create(&worker_threads[i], NULL, &thread_function, q);
}
void * thread_function(void *args)
{
pthread_detatch(pthread_self());
queue *q = (queue *)args;
while(running) {
job *j = NULL;
sem_wait(&queue_count);
pthread_mutex_lock(&queue_mutex);
j = first_job_in_queue(q);
pthread_mutex_unlock(&queue_mutex);
if(j) {
/*do something*/
}
}
return (NULL);
}
I am having little luck with this. Since you're not guarenteed which thread gets the signal I thought this was a good way to go. But I am having a problem where sem_wait() in threads.c is hanging, which is expected but not desired. The while(running) loop in threads.c seems redundant. Should I maybe do a pthread_kill() to all the threads from main? Any obvious problems with the above skeleton code? Is there a better/easier way to go about doing this?
Thanks.
What you can do is to call sem_post() from the handler until all threads are unlocked. In the thread function, immediately after sem_wait() you should check the value of the running variable and if it's zero break breom the while.
The code in the handler could be something like the following:
int sval;
sem_getvalue(&queue_count, &sval);
while (sval < 0) {
sem_post(&queue_count);
sem_getvalue(&queue_count, &sval);
}
Of course return values should be verified for errors
You can catch SIGINT in one thread, and use pthread_sigmask() to block SIGINT in all other threads, if SIGINT generated by some way, the signal will be delivered to the specified thread, that thread can call pthread_cancel() to cancel all other threads.
You may want to consider calling pthread_join after each call to pthread_create. This will allow for your main thread to wait until all threads are done executing.
But maybe I'm misunderstanding slightly... Do you want to wait for all threads to finish, or simply wait for one to finish, and then stop all others immediately?
You shouldn't do a pthread_kill() if you don't have to. I'm not to familiar with pthread_detatch() but if you are wanting your main() function to wait for the threads to finish, it would probably be better if your cleanup() function did a pthread_join() on each thread id returned from pthread_create() to wait for each thread to exit normally.
Also, as far as I can tell, sem_wait() is hanging because your semaphore value is initialized to 0. If you want say at most 5 threads to access the shared resource at a time, initialize the semaphore to 5, i.e. sem_init(&queue_count, 0, 5).

WaitForSingleObject and WaitForMultipleObjects equivalent in Linux?

I am migrating an applciation from windows to linux. I am facing problem with respect to WaitForSingleObject and WaitForMultipleObjects interfaces.
In my application I spawn multiple threads where all threads wait for events from parent process or periodically run for every t seconds.
I have checked pthread_cond_timedwait, but we have to specify absolute time for this.
How can I implement this in Unix?
Stick to pthread_cond_timedwait and use clock_gettime. For example:
struct timespec ts;
clock_gettime(CLOCK_REALTIME, &ts);
ts.tv_sec += 10; // ten seconds
while (!some_condition && ret == 0)
ret = pthread_cond_timedwait(&cond, &mutex, &ts);
Wrap it in a function if you wish.
UPDATE: complementing the answer based on our comments.
POSIX doesn't have a single API to wait for "all types" of events/objects as Windows does. Each one has its own functions. The simplest way to notify a thread for termination is using atomic variables/operations. For example:
Main thread:
// Declare it globally (argh!) or pass by argument when the thread is created
atomic_t must_terminate = ATOMIC_INIT(0);
// "Signal" termination by changing the initial value
atomic_inc(&must_terminate);
Secondary thread:
// While it holds the default value
while (atomic_read(&must_terminate) == 0) {
// Keep it running...
}
// Do proper cleanup, if needed
// Call pthread_exit() providing the exit status
Another alternative is to send a cancellation request using pthread_cancel. The thread being cancelled must have called pthread_cleanup_push to register any necessary cleanup handler. These handlers are invoked in the reverse order they were registered. Never call pthread_exit from a cleanup handler, because it's undefined behaviour. The exit status of a cancelled thread is PTHREAD_CANCELED. If you opt for this alternative, I recommend you to read mainly about cancellation points and types.
And last but not least, calling pthread_join will make the current thread block until the thread passed by argument terminates. As bonus, you'll get the thread's exit status.
For what it's worth, we (NeoSmart Technologies) have just released an open source (MIT licensed) library called pevents which implements WIN32 manual and auto-reset events on POSIX, and includes both WaitForSingleObject and WaitForMultipleObjects clones.
Although I'd personally advise you to use POSIX multithreading and signaling paradigms when coding on POSIX machines, pevents gives you another choice if you need it.
I realise this is an old question now, but for anyone else who stumbles across it, this source suggests that pthread_join() does effectively the same thing as WaitForSingleObject():
http://www.ibm.com/developerworks/linux/library/l-ipc2lin1/index.html
Good luck!
For WaitForMultipleObjects with false WaitAll try this:
#include <unistd.h>
#include <pthread.h>
#include <stdio.h>
using namespace std;
pthread_cond_t condition;
pthread_mutex_t signalMutex;
pthread_mutex_t eventMutex;
int finishedTask = -1;
void* task(void *data)
{
int num = *(int*)data;
// Do some
sleep(9-num);
// Task finished
pthread_mutex_lock(&eventMutex); // lock until the event will be processed by main thread
pthread_mutex_lock(&signalMutex); // lock condition mutex
finishedTask = num; // memorize task number
pthread_cond_signal(&condition);
pthread_mutex_unlock(&signalMutex); // unlock condtion mutex
}
int main(int argc, char *argv[])
{
pthread_t thread[10];
pthread_cond_init(&condition, NULL);
pthread_mutex_init(&signalMutex, NULL); // First mutex locks signal
pthread_mutex_init(&eventMutex, NULL); // Second mutex locks event processing
int numbers[10];
for (int i = 0; i < 10; i++) {
numbers[i] = i;
printf("created %d\n", i); // Creating 10 asynchronous tasks
pthread_create(&thread[i], NULL, task, &numbers[i]);
}
for (int i = 0; i < 10;)
{
if (finishedTask >= 0) {
printf("Task %d finished\n", finishedTask); // handle event
finishedTask = -1; // reset event variable
i++;
pthread_mutex_unlock(&eventMutex); // unlock event mutex after handling
} else {
pthread_cond_wait(&condition, &signalMutex); // waiting for event
}
}
return 0;
}

Resources