So reading this topic, the common way to exit would be using a flag. My question is, how is the waiting handled? Say the thread is only to run every 30s, how would you wait those 30s properly?
Using sem_timedwait() isn't ideal as it relies on the system clock and any change to the clock can severely impact your application. This topic explains using condition variables instead. The problem is, it relies on a mutex. You can't safely use pthread_mutex_lock() and pthread_mutex_unlock() in a signal handler. So in terms of my example above of 30s, if you want to exit immediately, who is handling the mutex unlock?
My guess would be another thread that sole purpose is to check the exit flag and if true it would unlock the mutex. However, what is that thread like? Would it not be wasted resources to just sit there constantly checking a flag? Would you use sleep() and check every 1s for example?
I don't believe my guess is a good one. It seems very inefficient and I run into similar "how do I wait" type of question. I feel like I'm missing something, but my searching is leading to topics similar to what I linked where it talks about flags, but nothing on waiting.
#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>
pthread_mutex_t my_mutex;
volatile sig_atomic_t exitRequested = 0;
void signal_handler(int signum) {
exitRequested = 1;
}
bool my_timedwait(pthread_mutex_t *mutex, int seconds) {
pthread_condattr_t attr;
pthread_condattr_init(&attr);
pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
pthread_cond_t cond;
pthread_cond_init(&cond, &attr);
struct timespec ts;
clock_gettime(CLOCK_MONOTONIC, &ts);
ts.tv_sec += seconds;
int status = pthread_cond_timedwait(&cond, mutex, &ts);
if (status == 0) {
return false; // mutex unlocked
}
if ((status < 0) && (status != ETIMEDOUT)) {
// error, do something
return false;
}
return true; // timedout
}
void *exitThread(void *ptr) {
// constant check???
while (1) {
if (exitRequested) {
pthread_mutex_unlock(&my_mutex);
break;
}
}
}
void *myThread(void *ptr) {
while (1) {
// do work
printf("test\n");
// wait and check for exit (how?)
if (!my_timedwait(&my_mutex, 30)) {
// exiting
break;
}
}
}
int main(void) {
// init and setup signals
struct sigaction sa;
sa.sa_handler = signal_handler;
sigaction(SIGINT, &sa, NULL);
// init the mutex and lock it
pthread_mutex_init(&my_mutex, NULL);
pthread_mutex_lock(&my_mutex);
// start exit thread
pthread_t exitHandler;
pthread_create(&exitHandler, NULL, exitThread, NULL);
// start thread
pthread_t threadHandler;
pthread_create(&threadHandler, NULL, myThread, NULL);
// wait for thread to exit
pthread_join(threadHandler, NULL);
pthread_join(exitHandler, NULL);
return EXIT_SUCCESS;
}
The solution is simple. Instead of having the first thread block in pthread_join, block that thread waiting for signals. That will ensure that a SIGINT can be handled synchronously.
You need a global structure protected by a mutex. It should count the number of outstanding threads and whether or not a shutdown is requested.
When a thread finishes, have it acquire the mutex, decrement the number of outstanding threads and, if it's zero, send a SIGINT. The main thread can loop waiting for a signal. If it's from the thread count going to zero, let the process terminate. If it's from an external signal, set the shutdown flag, broadcast the condition variable, unlock the mutex, and continue waiting for the thread count to hit zero.
Here's a start:
pthread_mutex_t my_mutex; // protects shared state
pthread_cond_t my_cond; // allows threads to wait some time
bool exitRequested = 0; // protected by mutex
int threadsRunning = 0; // protected by mutex
pthread_t main_thread; // set in main
bool my_timedwait(int seconds)
{
struct timespec ts;
clock_gettime(CLOCK_MONOTONIC, &ts);
ts.tv_sec += seconds;
pthread_mutex_lock (&my_mutex);
while (exitRequested == 0)
{
int status = pthread_cond_timedwait(&my_cond, &my_mutex, &ts);
if (status == ETIMEDOUT) // we waited as long as supposed to
break;
}
bool ret = ! exitRequested;
pthread_mutex_unlock (&my_mutex);
return ret; // timedout
}
bool shuttingDown()
{
pthread_mutex_lock (&my_mutex);
bool ret = exitRequested;
pthread_mutex_unlock (&my_mutex);
return ret;
}
void requestShutdown()
{
// call from the main thread if a SIGINT is received
pthread_mutex_lock (&my_mutex);
exitRequested = 1;
pthread_cond_broadcast (&my_cond);
pthread_mutex_unlock (&my_mutex);
}
void threadDone()
{
// call when a thread is done
pthread_mutex_lock (&my_mutex);
if (--threadsRunning == 0)
pthread_kill(main_thread, SIGINT); // make the main thread end
pthread_mutex_unlock (&my_mutex);
}
Related
I have two threads and one CPU.
I want each of the two threads that reached line A earlier in their program to wait for the other thread to reach line A, after which both threads continue to run their program. I have done this as follows, But I want both threads of line A to run their program exactly at the same time.
How can I accomplish this?
My code:
//headers
static volatile bool waitFlag[2];
void *threadZero(void*){
//some codes
waitFlag[1] = true;
while(!waitFlag[0]);
//line A of thread zero
//some codes
}
void *threadOne(void*){
// some codes
waitFlag[0] = true;
while(!waitFlag[1]);
//line A of thread one
//some codes
}
int main(){
waitFlag[0] = waitFlag[1] = false;
//Creates two threads and waits for them to finish.
}
A busy loop is inefficient for this purpose. What you want is a condition, a mutex and a simple counter:
Lock the mutex.
Increase the counter.
If the counter is 2, broadcast on the condition.
Otherwise wait on the condition.
Unlock the mutex.
This logic can be easily adapted to any number of threads by changing the threshold for the counter. The last thread to increment the counter (protected by the mutex) will broadcast on the condition and will unlock itself and all the other threads simultaneously. If you want to synchronize multiple times you can also reset the counter.
Here's an example:
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/random.h>
pthread_cond_t cond;
pthread_mutex_t cond_mutex;
unsigned int waiting;
// Sleep a random amount between 0 and 3s.
// This is just for testing, you don't actually neeed it.
void waste_time(void) {
unsigned us;
getrandom(&us, sizeof(us), 0);
us %= 3000000;
fprintf(stderr, "[%lx] Sleeping %u us...\n", pthread_self(), us);
usleep(us);
}
void synchronize(void) {
pthread_mutex_lock(&cond_mutex);
if (++waiting == 2) {
pthread_cond_broadcast(&cond);
} else {
while (waiting != 2)
pthread_cond_wait(&cond, &cond_mutex);
}
pthread_mutex_unlock(&cond_mutex);
}
void *threadZero(void *_) {
waste_time();
// ...
synchronize();
fprintf(stderr, "[%lx] Resuming.\n", pthread_self());
// ...
return NULL;
}
void *threadOne(void *_) {
waste_time();
// ...
synchronize();
fprintf(stderr, "[%lx] Resuming.\n", pthread_self());
// ...
return NULL;
}
int main(void) {
pthread_t zero, one;
pthread_create(&zero, NULL, threadZero, NULL);
pthread_create(&one, NULL, threadOne, NULL);
// ...
pthread_join(zero, NULL);
pthread_join(one, NULL);
return 0;
}
I have a threadpool of workers. Each worker executes this routine:
void* worker(void* args){
...
pthread_mutex_lock(&mtx);
while (queue == NULL && stop == 0){
pthread_cond_wait(&cond, &mtx);
}
el = pop(queue);
pthread_mutex_unlock(&mtx);
...
}
main thread:
int main(){
...
while (stop == 0){
...
pthread_mutex_lock(&mtx);
insert(queue, el);
pthread_cond_signal(&cond);
pthread_mutex_unlock(&mtx);
...
}
...
}
Then I have a signal handler that executes this code when it receives a signal:
void exit_handler(){
stop = 1;
pthread_mutex_lock(&mtx);
pthread_cond_broadcast(&cond);
pthread_mutex_unlock(&mtx);
}
I have omitted declarations and initialization, but the original code has them.
After a signal is received most of the time it's all ok, but sometimes it seems that some worker threads stay in the wait loop because they don't see that the variable stop is changed and/or they are not waken up by the broadcast.
So the threads never end.
What I am missing?
EDIT: stop=1 moved inside the critical section in exit_handler. The issue remains.
EDIT2: I was executing the program on a VM with Ubuntu. Since the code appears to be totally right I tried to change VM and OS (XUbuntu) and now it seems to work correctly. Still don't know why, anyone has an idea?
Some guessing here, but it's too long for a comment, so if this is wrong, I will delete. I think you may have a misconception about how pthread_cond_broadcast works (at least something I've been burned with in the past). From the man page:
The pthread_cond_broadcast() function shall unblock all threads
currently blocked on the specified condition variable cond.
Ok, that make sense, _broadcast awakens all threads currently blocked on cond. However, only one of the awakened threads will then be able to lock the mutex after they're all awoken. Also from the man page:
The thread(s) that are unblocked shall contend for the mutex according
to the scheduling policy (if applicable), and as if each had called
pthread_mutex_lock().
So this means that if 3 threads are blocked on cond and _broadcast is called, all 3 threads will wake up, but only 1 can grab the mutex. The other 2 will still be stuck in pthread_cond_wait, waiting on a signal. Because of this, they don't see stop set to 1, and exit_handler (I'm assuming a Ctrl+c software signal?) is done signaling, so the remaining threads that lost the _broadcast competition are stuck in limbo, waiting on a signal that will never come, and unable to read that the stop flag has been set.
I think there are 2 options to work-around/fix this:
Use pthread_cond_timedwait. Even without being signaled, this will return from waiting at the specified time interval, see that stop == 1, and then exit.
Add pthread_cond_signal or pthread_cond_broadcast at the end of your worker function. This way, right before a thread exits, it will signal the cond variable allowing any other waiting threads to grab the mutex and finish processing. There is no harm in signaling a conditional variable if no threads are waiting on it, so this should be fine even for the last thread.
EDIT: Here is an MCVE that proves (as far as I can tell) that my answer above is wrong, heh. As soon as I press Ctrl+c, the program exits "immediately", which says to me all the threads are quickly acquiring the mutex after the broadcast, seeing that stop is false, and exiting. Then main joins on the threads and it's process over.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <stdbool.h>
#include <signal.h>
#include <unistd.h>
#define NUM_THREADS 3
#define STACK_SIZE 10
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t c = PTHREAD_COND_INITIALIZER;
volatile bool stop = false;
int stack[STACK_SIZE] = { 0 };
int sp = 0; // stack pointer,, also doubles as the current stack size
void SigHandler(int sig)
{
if (sig == SIGINT)
{
stop = true;
}
else
{
printf("Received unexcepted signal %d\n", sig);
}
}
void* worker(void* param)
{
long tid = (long)(param);
while (stop == false)
{
// acquire the lock
pthread_mutex_lock(&m);
while (sp <= 0) // sp should never be < 0
{
// there is no data in the stack to consume, wait to get signaled
// this unlocks the mutex when it is called, and locks the
// mutex before it returns
pthread_cond_wait(&c, &m);
}
// when we get here we should be guaranteed sp >= 1
printf("thread %ld consuming stack[%d] = %d\n", tid, sp-1, stack[sp-1]);
sp--;
pthread_mutex_unlock(&m);
int sleepVal = rand() % 10;
printf("thread %ld sleeping for %d seconds...\n", tid, sleepVal);
sleep(sleepVal);
}
pthread_exit(NULL);
}
int main(void)
{
pthread_t threads[NUM_THREADS];
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
srand(time(NULL));
for (long i=0; i<NUM_THREADS; i++)
{
int rc = pthread_create(&threads[i], &attr, worker, (void*)i);
if (rc != 0)
{
fprintf(stderr, "Failed to create thread %ld\n", i);
}
}
while (stop == false)
{
// produce data in bursts
int numValsToInsert = rand() % (STACK_SIZE - sp);
printf("main producing %d values\n", numValsToInsert);
// acquire the lock
pthread_mutex_lock(&m);
for (int i=0; i<numValsToInsert; i++)
{
// produce values for the stack
int val = rand() % 10000;
// I think this should already be guaranteed..?
if (sp+1 < STACK_SIZE)
{
printf("main pushing stack[%d] = %d\n", sp, val);
stack[sp++] = val;
// signal the workers that data is ready
//printf("main signaling threads...\n");
//pthread_cond_signal(&c);
}
else
{
printf("stack full!\n");
}
}
pthread_mutex_unlock(&m);
// signal the workers that data is ready
printf("main signaling threads...\n");
pthread_cond_broadcast(&c);
int sleepVal = 1;//rand() % 5;
printf("main sleeping for %d seconds...\n", sleepVal);
sleep(sleepVal);
}
for (long i=0; i<NUM_THREADS; i++)
{
pthread_join(threads[i], NULL);
}
return 0;
}
In the context of an existing multi-threaded application I want to suspend a list of threads for a specific duration then resume their normal execution. I know some of you wil say that I should not do that but I know that and I don't have a choice.
I came up with the following code that sort of work but randomly failed. For each thread I want to suspend, I send a signal and wait for an ack via a semaphore. The signal handler when invoked, post the semaphore and sleep for the specified duration.
The problem is when the system is fully loaded, the call to sem_timedwait sometimes fails with ETIMEDOUT and I am left with an inconsistent logic with semaphore used for the ack: I don't know if the signal has been dropped or is just late.
// compiled with: gcc main.c -o test -pthread
#include <pthread.h>
#include <stdio.h>
#include <signal.h>
#include <errno.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/syscall.h>
#define NUMTHREADS 40
#define SUSPEND_SIG (SIGRTMIN+1)
#define SUSPEND_DURATION 80 // in ms
static sem_t sem;
void checkResults(const char *msg, int rc) {
if (rc == 0) {
//printf("%s success\n", msg);
} else if (rc == ESRCH) {
printf("%s failed with ESRCH\n", msg);
} else if (rc == EINVAL) {
printf("%s failed with EINVAL\n", msg);
} else {
printf("%s failed with unknown error: %d\n", msg, rc);
}
}
static void suspend_handler(int signo) {
sem_post(&sem);
usleep(SUSPEND_DURATION*1000);
}
void installSuspendHandler() {
struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sigemptyset(&sa.sa_mask);
sa.sa_flags = 0;
sa.sa_handler = suspend_handler;
int rc = sigaction(SUSPEND_SIG, &sa, NULL);
checkResults("sigaction SUSPEND", rc);
}
void *threadfunc(void *param) {
int tid = *((int *) param);
free(param);
printf("Thread %d entered\n", tid);
// this is an example workload, the real app is doing many things
while (1) {
int rc = sleep(30);
if (rc != 0 && errno == EINTR) {
//printf("Thread %d got a signal delivered to it\n", tid);
} else {
//printf("Thread %d did not get expected results! rc=%d, errno=%d\n", tid, rc, errno);
}
}
return NULL;
}
int main(int argc, char **argv) {
pthread_t threads[NUMTHREADS];
int i;
sem_init(&sem, 0, 0);
installSuspendHandler();
for(i=0; i<NUMTHREADS; ++i) {
int *arg = malloc(sizeof(*arg));
if ( arg == NULL ) {
fprintf(stderr, "Couldn't allocate memory for thread arg.\n");
exit(EXIT_FAILURE);
}
*arg = i;
int rc = pthread_create(&threads[i], NULL, threadfunc, arg);
checkResults("pthread_create()", rc);
}
sleep(3);
printf("Will start to send signals...\n");
while (1) {
printf("***********************************************\n");
for(i=0; i<NUMTHREADS; ++i) {
int rc = pthread_kill(threads[i], SUSPEND_SIG);
checkResults("pthread_kill()", rc);
printf("Waiting for Semaphore for thread %d ...\n", i);
// compute timeout abs timestamp for ack
struct timespec ts;
clock_gettime(CLOCK_REALTIME, &ts);
const int TIMEOUT = SUSPEND_DURATION*1000*1000; // in nano-seconds
ts.tv_nsec += TIMEOUT; // timeout to receive ack from signal handler
// normalize timespec
ts.tv_sec += ts.tv_nsec / 1000000000;
ts.tv_nsec %= 1000000000;
rc = sem_timedwait(&sem, &ts); // try decrement semaphore
if (rc == -1 && errno == ETIMEDOUT) {
// timeout
// semaphore is out of sync
printf("Did not received signal handler sem_post before timeout of %d ms for thread %d", TIMEOUT/1000000, i);
abort();
}
checkResults("sem_timedwait", rc);
printf("Received Semaphore for thread %d.\n", i);
}
sleep(1);
}
for(i=0; i<NUMTHREADS; ++i) {
int rc = pthread_join(threads[i], NULL);
checkResults("pthread_join()\n", rc);
}
printf("Main completed\n");
return 0;
}
Questions?
Is it possible for a signal to be dropped and never delivered?
What causes the timeout on the semaphore at random time when the system is loaded?
usleep() is not among the async-signal-safe functions (though sleep() is, and there are other async-signal-safe functions by which you can produce a timed delay). A program that calls usleep() from a signal handler is therefore non-conforming. The specifications do not describe what may happen -- neither with such a call itself nor with the larger program execution in which it occurs. Your questions can be answered only for a conforming program; I do that below.
Is it possible for a signal to be dropped and never delivered?
It depends on what exactly you mean:
If a normal (not real-time) signal is delivered to a thread that already has that signal queued then no additional instance is queued.
A thread can die with signals still queued for it; those signals will not be handled.
A thread can change a given signal's disposition (to SIG_IGN, for example), though this is a per-process attribute, not a per-thread one.
A thread can block a signal indefinitely. A blocked signal is not dropped -- it remains queued for the thread and will eventually be received some time after it is unblocked, if that ever happens.
But no, having successfully queued a signal via the kill() or raise() function, that signal will not be randomly dropped.
What causes the timeout on the semaphore at random time when the system is loaded?
A thread can receive a signal only when it is actually running on a core. On a system with more runnable processes than cores, some runnable processes must be suspended, without a timeslice on any core, at any given time. On a heavily-loaded system, that's the norm. Signals are asynchronous, so you can send one to a thread that is currently waiting for a timeslice without the sender blocking. It is entirely possible, then, that the thread you have signaled does not get scheduled to run before the timeout expires. If it does run, it may have the signal blocked for one reason or another, and not get around to unblocking it before it uses up its timeslice.
Ultimately, you can use your semaphore-based approach to check whether the target thread handled the signal within any timeout of your choice, but you cannot predict in advance how long it will take for the thread to handle the signal, nor even whether it will do so in any finite amount of time (for example, it could die for one reason or another before doing so).
I am running the following program which implements a timer. When a thread awake after receiving a signal on condition variable from the previous running thread, it creates a timer and send a signal to the next thread on timer expiration. I want it to run for some time, but the timer stops ticking after some runs.
//Import
#define _POSIX_C_SOURCE 199309
#include <sched.h>
#include <unistd.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include <time.h>
#include <sys/time.h>
#include <signal.h>
#include <errno.h>
#define NUM_THREADS 10
#define CLOCKID CLOCK_REALTIME
#define SIG SIGUSR1
timer_t timerid;
pthread_cond_t condA[NUM_THREADS+1] = PTHREAD_COND_INITIALIZER;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_t tid[NUM_THREADS];
int state = 0;
static void handler(int sig, siginfo_t *si, void *uc)
{
if(si->si_value.sival_ptr != &timerid){
printf("Stray signal\n");
} else {
//printf("Caught signal %d from timer\n", sig);
}
pthread_cond_signal(&condA[state]);
}
void *threadA(void *data_)
{
int i = 0, s;
long int loopNum, j;
int turn = (intptr_t)data_;
struct timeval tval_result;
// Timer's part starts
struct sigevent sev;
struct itimerspec its;
long long freq_nanosecs;
sigset_t mask;
struct sigaction sa;
// TImer'spart ends
while(1)
{
/* Wait for state A */
pthread_mutex_lock(&mutex);
for (;state != turn;)
{
s = pthread_cond_wait(&condA[turn], &mutex);
if (s != 0)
perror("pthread_cond_wait");
// printf("main(): state = %d\n", state);
}
pthread_mutex_unlock(&mutex);
//do stuff
for(j=0;j<10000;j++)
{//some dummy time consuming works}
sa.sa_flags = SA_SIGINFO;
sa.sa_sigaction = handler;
sigemptyset(&sa.sa_mask);
sigaction(SIG, &sa, NULL);
sev.sigev_notify = SIGEV_SIGNAL;
sev.sigev_signo = SIG;
sev.sigev_value.sival_ptr = &timerid;
timer_create(CLOCKID, &sev, &timerid);
/* Start the timer */
its.it_value.tv_sec = 0;
its.it_value.tv_nsec = 2000;
its.it_interval.tv_sec = 0;
its.it_interval.tv_nsec = 0;
timer_settime(timerid, 0, &its, NULL);
pthread_mutex_lock(&mutex);
state = (state +1)%NUM_THREADS;
//pthread_cond_signal(&condA[state]);
pthread_mutex_unlock(&mutex);
// Timer's code ends
}
}
int main(int argc, char *argv[])
{
int data = 0;
int err;
while(data < NUM_THREADS)
{
//create our threads
err = pthread_create(&tid[data], NULL, threadA, (void *)(intptr_t)data);
if(err != 0)
printf("\ncan't create thread :[%s]", strerror(err));
else
// printf("\n Thread created successfully\n");
data++;
}
pthread_exit(NULL);
}
Although no printf statements are executing, why is it freezing after some time?
If no. of timers are limited, what other strategy should I use to redress this issue?
POSIX says:
It is not safe to use the pthread_cond_signal() function in a signal handler that is invoked asynchronously.
Most likely you end up corrupting the state of pthread_cond_wait/pthread_cond_signal and anything can happen.
Don't mix threads and signal handlers, it leads only to madness. There are very few things you're allowed to do inside a signal handler, even fewer that are thread related, it's very hard to ensure that the right thread ends up handling the right signal, etc.
If you're doing threads anyway implement a timer in one thread that calculates how much time it needs to sleep to deliver the next event (don't just hardcode it to your timer period since that will make your timer drift), sleep that much and call pthread_cond_signal.
Also, it's bad form to have naked pthread_cond_signal calls and most often a bug. You might get unlucky and call it just before the other thread does the pthread_cond_wait and your signal will get lost. The normal thing to do is to set a variable (protected by a mutex, that's why pthread_cond_signal wants a mutex) and then signal that the variable is set.
If you think this is too much work, condition variables are probably not the right mechanism in this case and you should use semaphores instead. Incidentally sem_post is legal to call from a signal handler according to POSIX, but I still think it's a bad idea to mix threads with signals.
If I setup and signal handler for SIGABRT and meanwhile I have a thread that waits on sigwait() for SIGABRT to come (I have a blocked SIGABRT in other threads by pthread_sigmask).
So which one will be processed first ? Signal handler or sigwait() ?
[I am facing some issues that sigwait() is get blocked for ever. I am debugging it currently]
main()
{
sigset_t signal_set;
sigemptyset(&signal_set);
sigaddset(&signal_set, SIGABRT);
sigprocmask(SIG_BLOCK, &signal_set, NULL);
// Dont deliver SIGABORT while running this thread and it's kids.
pthread_sigmask(SIG_BLOCK, &signal_set, NULL);
pthread_create(&tAbortWaitThread, NULL, WaitForAbortThread, NULL);
..
Create all other threads
...
}
static void* WaitForAbortThread(void* v)
{
sigset_t signal_set;
int stat;
int sig;
sigfillset( &signal_set);
pthread_sigmask( SIG_BLOCK, &signal_set, NULL ); // Dont want any signals
sigemptyset(&signal_set);
sigaddset(&signal_set, SIGABRT); // Add only SIGABRT
// This thread while executing , will handle the SIGABORT signal via signal handler.
pthread_sigmask(SIG_UNBLOCK, &signal_set, NULL);
stat= sigwait( &signal_set, &sig ); // lets wait for signal handled in CatchAbort().
while (stat == -1)
{
stat= sigwait( &signal_set, &sig );
}
TellAllThreadsWeAreGoingDown();
sleep(10);
return null;
}
// Abort signal handler executed via sigaction().
static void CatchAbort(int i, siginfo_t* info, void* v)
{
sleep(20); // Dont return , hold on till the other threads are down.
}
Here at sigwait(), i will come to know that SIGABRT is received. I will tell other threads about it. Then will hold abort signal handler so that process is not terminated.
I wanted to know the interaction of sigwait() and the signal handler.
From sigwait() documentation :
The sigwait() function suspends execution of the calling thread until
one of the signals specified in the signal set becomes pending.
A pending signal means a blocked signal waiting to be delivered to one of the thread/process. Therefore, you need not to unblock the signal like you did with your pthread_sigmask(SIG_UNBLOCK, &signal_set, NULL) call.
This should work :
static void* WaitForAbortThread(void* v){
sigset_t signal_set;
sigemptyset(&signal_set);
sigaddset(&signal_set, SIGABRT);
sigwait( &signal_set, &sig );
TellAllThreadsWeAreGoingDown();
sleep(10);
return null;
}
I got some information from this <link>
It says :
To allow a thread to wait for asynchronously generated signals, the threads library provides the sigwait subroutine. The sigwait subroutine blocks the calling thread until one of the awaited signals is sent to the process or to the thread. There must not be a signal handler installed on the awaited signal using the sigwait subroutine.
I will remove the sigaction() handler and try only sigwait().
From the code snippet you've posted, it seems you got the use of sigwait() wrong. AFAIU, you need WaitForAbortThread like below:
sigemptyset( &signal_set); // change it from sigfillset()
for (;;) {
stat = sigwait(&signal_set, &sig);
if (sig == SIGABRT) {
printf("here's sigbart.. do whatever you want.\n");
pthread_kill(tid, signal); // thread id and signal
}
}
I don't think pthread_sigmask() is really needed. Since you only want to handle SIGABRT, first init signal_set as empty then simply add SIGABRT, then jump into the infinite loop, sigwait will wait for the particular signal that you're looking for, you check the signal if it's SIGABRT, if yes - do whatever you want. NOTE the uses of pthread_kill(), use it to sent any signal to other threads specified via tid and the signal you want to sent, make sure you know the tid of other threads you want to sent signal. Hope this will help!
I know this question is about a year old, but I often use a pattern, which solves exactly this issue using pthreads and signals. It is a little length but takes care of any issues I am aware of.
I recently used in combination with a library wrapped with SWIG and called from within Python. An annoying issue was that my IRQ thread waiting for SIGINT using sigwait never received the SIGINT signal. The same library worked perfectly when called from Matlab, which didn't capture the SIGINT signal.
The solution was to install a signal handler
#define _NTHREADS 8
#include <signal.h>
#include <pthread.h>
#include <unistd.h>
#include <sched.h>
#include <linux/unistd.h>
#include <sys/signal.h>
#include <sys/syscall.h>
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h> // strerror
#define CallErr(fun, arg) { if ((fun arg)<0) \
FailErr(#fun) }
#define CallErrExit(fun, arg, ret) { if ((fun arg)<0) \
FailErrExit(#fun,ret) }
#define FailErrExit(msg,ret) { \
(void)fprintf(stderr, "FAILED: %s(errno=%d strerror=%s)\n", \
msg, errno, strerror(errno)); \
(void)fflush(stderr); \
return ret; }
#define FailErr(msg) { \
(void)fprintf(stderr, "FAILED: %s(errno=%d strerror=%s)\n", \
msg, errno, strerror(errno)); \
(void)fflush(stderr);}
typedef struct thread_arg {
int cpu_id;
int thread_id;
} thread_arg_t;
static jmp_buf jmp_env;
static struct sigaction act;
static struct sigaction oact;
size_t exitnow = 0;
pthread_mutex_t exit_mutex;
pthread_attr_t attr;
pthread_t pids[_NTHREADS];
pid_t tids[_NTHREADS+1];
static volatile int status[_NTHREADS]; // 0: suspended, 1: interrupted, 2: success
sigset_t mask;
static pid_t gettid( void );
static void *thread_function(void *arg);
static void signalHandler(int);
int main() {
cpu_set_t cpuset;
int nproc;
int i;
thread_arg_t thread_args[_NTHREADS];
int id;
CPU_ZERO( &cpuset );
CallErr(sched_getaffinity,
(gettid(), sizeof( cpu_set_t ), &cpuset));
nproc = CPU_COUNT(&cpuset);
for (i=0 ; i < _NTHREADS ; i++) {
thread_args[i].cpu_id = i % nproc;
thread_args[i].thread_id = i;
status[i] = 0;
}
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
pthread_mutex_init(&exit_mutex, NULL);
// We pray for no locks on buffers and setbuf will work, if not we
// need to use filelock() on on FILE* access, tricky
setbuf(stdout, NULL);
setbuf(stderr, NULL);
act.sa_flags = SA_NOCLDSTOP | SA_NOCLDWAIT;
act.sa_handler = signalHandler;
sigemptyset(&act.sa_mask);
sigemptyset(&mask);
sigaddset(&mask, SIGINT);
if (setjmp(jmp_env)) {
if (gettid()==tids[0]) {
// Main Thread
printf("main thread: waiting for clients to terminate\n");
for (i = 0; i < _NTHREADS; i++) {
CallErr(pthread_join, (pids[i], NULL));
if (status[i] == 1)
printf("thread %d: terminated\n",i+1);
}
// On linux this can be done immediate after creation
CallErr(pthread_attr_destroy, (&attr));
CallErr(pthread_mutex_destroy, (&exit_mutex));
return 0;
}
else {
// Should never happen
printf("worker thread received signal");
}
return -1;
}
// Install handler
CallErr(sigaction, (SIGINT, &act, &oact));
// Block SIGINT
CallErr(pthread_sigmask, (SIG_BLOCK, &mask, NULL));
tids[0] = gettid();
srand ( time(NULL) );
for (i = 0; i < _NTHREADS; i++) {
// Inherits main threads signal handler, they are blocking
CallErr(pthread_create,
(&pids[i], &attr, thread_function,
(void *)&thread_args[i]));
}
if (pthread_sigmask(SIG_UNBLOCK, &mask, NULL)) {
fprintf(stderr, "main thread: can't block SIGINT");
}
printf("Infinite loop started - CTRL-C to exit\n");
for (i = 0; i < _NTHREADS; i++) {
CallErr(pthread_join, (pids[i], NULL));
//printf("%d\n",status[i]);
if (status[i] == 2)
printf("thread %d: finished succesfully\n",i+1);
}
// Clean up and exit
CallErr(pthread_attr_destroy, (&attr));
CallErr(pthread_mutex_destroy, (&exit_mutex));
return 0;
}
static void signalHandler(int sig) {
int i;
pthread_t id;
id = pthread_self();
for (i = 0; i < _NTHREADS; i++)
if (pids[i] == id) {
// Exits if worker thread
printf("Worker thread caught signal");
break;
}
if (sig==2) {
sigaction(SIGINT, &oact, &act);
}
pthread_mutex_lock(&exit_mutex);
if (!exitnow)
exitnow = 1;
pthread_mutex_unlock(&exit_mutex);
longjmp(jmp_env, 1);
}
void *thread_function(void *arg) {
cpu_set_t set;
thread_arg_t* threadarg;
int thread_id;
threadarg = (thread_arg_t*) arg;
thread_id = threadarg->thread_id+1;
tids[thread_id] = gettid();
CPU_ZERO( &set );
CPU_SET( threadarg->cpu_id, &set );
CallErrExit(sched_setaffinity, (gettid(), sizeof(cpu_set_t), &set ),
NULL);
int k = 8;
// While loop waiting for exit condition
while (k>0) {
sleep(rand() % 3);
pthread_mutex_lock(&exit_mutex);
if (exitnow) {
status[threadarg->thread_id] = 1;
pthread_mutex_unlock(&exit_mutex);
pthread_exit(NULL);
}
pthread_mutex_unlock(&exit_mutex);
k--;
}
status[threadarg->thread_id] = 2;
pthread_exit(NULL);
}
static pid_t gettid( void ) {
pid_t pid;
CallErr(pid = syscall, (__NR_gettid));
return pid;
}
I run serveral tests and the conbinations and results are:
For all test cases, I register a signal handler by calling sigaction in the main thread.
main thread block target signal, thread A unblock target signal by calling pthread_sigmask, thread A sleep, send target signal.
result: signal handler is executed in thread A.
main thread block target signal, thread A unblock target signal by calling pthread_sigmask, thread A calls sigwait, send target signal.
result: sigwait is executed.
main thread does not block target signal, thread A does not block target signal, thread A calls sigwait, send target signal.
result: main thread is chosen and the registered signal handler is executed in the main thread.
As you can see, conbination 1 and 2 are easy to understand and conclude.
It is:
If a signal is blocked by a thread, then the process-wide signal handler registered by sigaction just can't catch or even know it.
If a signal is not blocked, and it's sent before calling sigwait, the process-wide signal handler wins. And that's why APUE the books require us to block the target signal before calling sigwait. Here I use sleep in thread A to simulate a long "window time".
If a signal is not blocked, and it's sent when sigwait has already been waiting, sigwait wins.
But you should notice that for test case 1 and 2, main thread is designed to block the target signal.
At last for test case 3, when main thread is not blocked the target signal, and sigwait in thread A is also waiting, the signal handler is executed in the main thread.
I believe the behaviour of test case 3 is what APUE talks about:
From APUE ยง12.8:
If a signal is being caught (the process has established a signal
handler by using sigaction, for example) and a thread is waiting for
the same signal in a call to sigwait, it is left up to the
implementation to decide which way to deliver the signal. The
implementation could either allow sigwait to return or invoke the
signal handler, but not both.
Above all, if you want to accomplish one thread <-> one signal model, you should:
block all signals in the main thread with pthread_sigmask (subsequent thread created in main thread inheris the signal mask)
create threads and call sigwait(target_signal) with target signal.
test code
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
FILE* file;
void* threadA(void* argv){
fprintf(file, "%ld\n", pthread_self());
sigset_t m;
sigemptyset(&m);
sigaddset(&m, SIGUSR1);
int signo;
int err;
// sigset_t q;
// sigemptyset(&q);
// pthread_sigmask(SIG_SETMASK, &q, NULL);
// sleep(50);
fprintf(file, "1\n");
err = sigwait(&m, &signo);
if (err != 0){
fprintf(file, "sigwait error\n");
exit(1);
}
switch (signo)
{
case SIGUSR1:
fprintf(file, "SIGUSR1 received\n");
break;
default:
fprintf(file, "?\n");
break;
}
fprintf(file, "2\n");
}
void hello(int signo){
fprintf(file, "%ld\n", pthread_self());
fprintf(file, "hello\n");
}
int main(){
file = fopen("daemon", "wb");
setbuf(file, NULL);
struct sigaction sa;
sigemptyset(&sa.sa_mask);
sa.sa_handler = hello;
sigaction(SIGUSR1, &sa, NULL);
sigset_t n;
sigemptyset(&n);
sigaddset(&n, SIGUSR1);
// pthread_sigmask(SIG_BLOCK, &n, NULL);
pthread_t pid;
int err;
err = pthread_create(&pid, NULL, threadA, NULL);
if(err != 0){
fprintf(file, "create thread error\n");
exit(1);
}
pause();
fprintf(file, "after pause\n");
fclose(file);
return 0;
}
run with ./a.out & (run in the background), and use kill -SIGUSR1 pid to test. Do not use raise. raise, sleep, pause are thread-wide.