I have a thread that sits in a blocking recv() loop and I want to terminate (assume this can't be changed to select() or any other asynchronous approach).
I also have a signal handler that catches SIGINT and theoretically it should make recv() return with error and errno set to EINTR.
But it doesn't, which I assume has something to do with the fact that the application is multi-threaded. There is also another thread, which is meanwhile waiting on a pthread_join() call.
What's happening here?
EDIT:
OK, now I explicitly deliver the signal to all blocking recv() threads via pthread_kill() from the main thread (which results in the same global SIGINT signal handler installed, though multiple invocations are benign). But recv() call is still not unblocked.
EDIT:
I've written a code sample that reproduces the problem.
Main thread connects a socket to a misbehaving remote host that won't let the connection go.
All signals blocked.
Read thread thread is started.
Main unblocks and installs handler for SIGINT.
Read thread unblocks and installs handler for SIGUSR1.
Main thread's signal handler sends a SIGUSR1 to the read thread.
Interestingly, if I replace recv() with sleep() it is interrupted just fine.
PS
Alternatively you can just open a UDP socket instead of using a server.
client
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <memory.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <errno.h>
static void
err(const char *msg)
{
perror(msg);
abort();
}
static void
blockall()
{
sigset_t ss;
sigfillset(&ss);
if (pthread_sigmask(SIG_BLOCK, &ss, NULL))
err("pthread_sigmask");
}
static void
unblock(int signum)
{
sigset_t ss;
sigemptyset(&ss);
sigaddset(&ss, signum);
if (pthread_sigmask(SIG_UNBLOCK, &ss, NULL))
err("pthread_sigmask");
}
void
sigusr1(int signum)
{
(void)signum;
printf("%lu: SIGUSR1\n", pthread_self());
}
void*
read_thread(void *arg)
{
int sock, r;
char buf[100];
unblock(SIGUSR1);
signal(SIGUSR1, &sigusr1);
sock = *(int*)arg;
printf("Thread (self=%lu, sock=%d)\n", pthread_self(), sock);
r = 1;
while (r > 0)
{
r = recv(sock, buf, sizeof buf, 0);
printf("recv=%d\n", r);
}
if (r < 0)
perror("recv");
return NULL;
}
int sock;
pthread_t t;
void
sigint(int signum)
{
int r;
(void)signum;
printf("%lu: SIGINT\n", pthread_self());
printf("Killing %lu\n", t);
r = pthread_kill(t, SIGUSR1);
if (r)
{
printf("%s\n", strerror(r));
abort();
}
}
int
main()
{
pthread_attr_t attr;
struct sockaddr_in addr;
printf("main thread: %lu\n", pthread_self());
memset(&addr, 0, sizeof addr);
sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
if (socket < 0)
err("socket");
addr.sin_family = AF_INET;
addr.sin_port = htons(8888);
if (inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr) <= 0)
err("inet_pton");
if (connect(sock, (struct sockaddr *)&addr, sizeof addr))
err("connect");
blockall();
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
if (pthread_create(&t, &attr, &read_thread, &sock))
err("pthread_create");
pthread_attr_destroy(&attr);
unblock(SIGINT);
signal(SIGINT, &sigint);
if (sleep(1000))
perror("sleep");
if (pthread_join(t, NULL))
err("pthread_join");
if (close(sock))
err("close");
return 0;
}
server
import socket
import time
s = socket.socket(socket.AF_INET)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(('127.0.0.1',8888))
s.listen(1)
c = []
while True:
(conn, addr) = s.accept()
c.append(conn)
Normally signals do not interrupt system calls with EINTR. Historically there were two possible signal delivery behaviors: the BSD behavior (syscalls are automatically restarted when interrupted by a signal) and the Unix System V behavior (syscalls return -1 with errno set to EINTR when interrupted by a signal). Linux (the kernel) adopted the latter, but the GNU C library developers (correctly) deemed the BSD behavior to be much more sane, and so on modern Linux systems, calling signal (which is a library function) results in the BSD behavior.
POSIX allows either behavior, so it's advisable to always use sigaction where you can choose to set the SA_RESTART flag or omit it depending on the behavior you want. See the documentation for sigaction here:
http://www.opengroup.org/onlinepubs/9699919799/functions/sigaction.html
In a multi-threaded application, normal signals can be delivered to any thread arbitrarily. Use pthread_kill to send the signal to the specific thread of interest.
Does signal handler invoked in same thread which waits in recv()?
You may need to explicitly mask SIGINT in all other threads via pthread_sigmask()
As alluded to in the post by <R..>, it is indeed possible to change the signal activities.
I often create my own "signal" function that makes use of sigaction. Here's what I use
typedef void Sigfunc(int);
static Sigfunc*
_signal(int signum, Sigfunc* func)
{
struct sigaction act, oact;
act.sa_handler = func;
sigemptyset(&act.sa_mask);
act.sa_flags = 0;
if (signum != SIGALRM)
act.sa_flags |= SA_NODEFER; //SA_RESTART;
if (sigaction(signum, &act, &oact) < 0)
return (SIG_ERR);
return oact.sa_handler;
}
The attribute in question above is the 'or'ing of the sa_flags field. This is from the man page for 'sigaction': SA_RESTART provides the BSD-like behavior of allowing system calls to be restartable across signals. SA_NODEFER means allow the signal to be received from within its own signal handler.
When the signal calls are replaced with "_signal", the thread is interrupted. The output prints out "interrupted system call" and recv returned a -1 when SIGUSR1 was sent. The program stopped altogether with the same output when SIGINT was sent, but the abort was called at the end.
I did not write the server portion of the code, I just changed the socket type to "DGRAM, UDP" to allow the client to start.
You can set a timeout on Linux recv: Linux: is there a read or recv from socket with timeout?
When you get a signal, call done on the class doing the receive.
void* signalThread( void* ptr )
{
CapturePkts* cap=(CapturePkts*)ptr;
sigset_t sigSet=cap->getSigSet();
int sig=-1;
sigwait(&sigSet,&sig); //signalThread: signal capture thread enabled;
cout << "signal=" << sig << " caught,ending process" << endl;
cap->setDone();
return 0;
}
class CapturePkts
{
CapturePkts() : _done(false) {}
sigset_t getSigSet() { return _sigSet; }
void setDone() {_done=true;}
bool receive( uint8_t *buffer, int32_t bufSz, int32_t &nbytes)
{
bool ret=true;
while( ! _done ) {
nbytes = ::recv( _sockid, buffer, bufSz, 0 );
if(nbytes < 1 ) {
if (errno == EAGAIN || errno == EWOULDBLOCK) {
nbytes=0; //wait for next read event
else
ret=false;
}
return ret;
}
private:
sigset_t _sigSet;
bool _done;
};
Related
I'm a newbie in c development. Recently, I noticed a problem when I was learning multi-threaded development, when I set a signal in the main thread of Action and when I try to block the signal action set by the main thread in the child thread, I find that it does not work.
Here is a brief description of the code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
#include <pthread.h>
#include <unistd.h>
#include <signal.h>
void *thread_start(void *_arg) {
sleep(2);
sigset_t mask;
sigemptyset(&mask);
sigaddset(&mask, SIGUSR2);
pthread_sigmask(SIG_BLOCK, &mask, NULL);
printf("child-thread executed\n");
while (true) {
sleep(1);
}
return NULL;
}
void sig_handler(int _sig) {
printf("executed\n");
}
int main(int argc, char *argv[]) {
pthread_t t_id;
int s = pthread_create(&t_id, NULL, thread_start, NULL);
if (s != 0) {
char *msg = strerror(s);
printf("%s\n", msg);
}
printf("main-thread executed, create [%lu]\n", t_id);
signal(SIGUSR2, sig_handler);
while (true) {
sleep(1);
}
return EXIT_SUCCESS;
}
The signal mask is a per-thread property, a thread will inherit whatever the parent has at time of thread creation but, after that, it controls its own copy.
In other words, blocking a signal in a thread only affects the delivery of signals for that thread, not for any other.
In any case, even if it were shared (it's not), you would have a potential race condition since you start the child thread before setting up the signal in the main thread. Hence it would be indeterminate as to whether the order was "parent sets up signal, then child blocks" or vice versa. But, as stated, that's irrelevant due to the thread-specific nature of the signal mask.
If you want a thread to control the signal mask of another thread, you will need to use some form of inter-thread communication to let the other thread do it itself.
As I wrote in a comment, any USR1 signal sent to the process will be delivered using the main thread. It's output will not tell you exactly what happened, so it is not really a good way to test threads and signal masks. Additionally, it uses printf() in a signal handler, which may or may not work: printf() is not an async-signal safe function, so it must not be used in a signal handler.
Here is a better example:
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <pthread.h>
#include <limits.h>
#include <string.h>
#include <errno.h>
#include <stdio.h>
/* This function writes a message directly to standard error,
without using the stderr stream. This is async-signal safe.
Returns 0 if success, errno error code if an error occurs.
errno is kept unchanged. */
static int write_stderr(const char *msg)
{
const char *end = msg;
const int saved_errno = errno;
int retval = 0;
ssize_t n;
/* If msg is non-NULL, find the string-terminating '\0'. */
if (msg)
while (*end)
end++;
/* Write the message to standard error. */
while (msg < end) {
n = write(STDERR_FILENO, msg, (size_t)(end - msg));
if (n > 0) {
msg += n;
} else
if (n != 0) {
/* Bug, should not occur */
retval = EIO;
break;
} else
if (errno != EINTR) {
retval = errno;
break;
}
}
/* Paranoid check that exactly the message was written */
if (!retval)
if (msg != end)
retval = EIO;
errno = saved_errno;
return retval;
}
static volatile sig_atomic_t done = 0;
pthread_t main_thread;
pthread_t other_thread;
static void signal_handler(int signum)
{
const pthread_t id = pthread_self();
const char *thread = (id == main_thread) ? "Main thread" :
(id == other_thread) ? "Other thread" : "Unknown thread";
const char *event = (signum == SIGHUP) ? "HUP" :
(signum == SIGUSR1) ? "USR1" :
(signum == SIGINT) ? "INT" :
(signum == SIGTERM) ? "TERM" : "Unknown signal";
if (signum == SIGTERM || signum == SIGINT)
done = 1;
write_stderr(thread);
write_stderr(": ");
write_stderr(event);
write_stderr(".\n");
}
static int install_handler(int signum)
{
struct sigaction act;
memset(&act, 0, sizeof act);
sigemptyset(&act.sa_mask);
act.sa_handler = signal_handler;
act.sa_flags = 0;
if (sigaction(signum, &act, NULL) == -1)
return -1;
return 0;
}
void *other(void *unused __attribute__((unused)))
{
sigset_t mask;
sigemptyset(&mask);
sigaddset(&mask, SIGTERM);
sigaddset(&mask, SIGHUP);
pthread_sigmask(SIG_BLOCK, &mask, NULL);
while (!done)
sleep(1);
return NULL;
}
int main(void)
{
pthread_attr_t attrs;
sigset_t mask;
int result;
main_thread = pthread_self();
other_thread = pthread_self(); /* Just to initialize it to a sane value */
/* Install HUP, USR1, INT, and TERM signal handlers. */
if (install_handler(SIGHUP) ||
install_handler(SIGUSR1) ||
install_handler(SIGINT) ||
install_handler(SIGTERM)) {
fprintf(stderr, "Cannot install signal handlers: %s.\n", strerror(errno));
return EXIT_FAILURE;
}
/* Create the other thread. */
pthread_attr_init(&attrs);
pthread_attr_setstacksize(&attrs, 2*PTHREAD_STACK_MIN);
result = pthread_create(&other_thread, &attrs, other, NULL);
pthread_attr_destroy(&attrs);
if (result) {
fprintf(stderr, "Cannot create a thread: %s.\n", strerror(result));
return EXIT_FAILURE;
}
/* This thread blocks SIGUSR1. */
sigemptyset(&mask);
sigaddset(&mask, SIGUSR1);
pthread_sigmask(SIG_BLOCK, &mask, NULL);
/* Ready to handle signals. */
printf("Send a HUP, USR1, or TERM signal to process %d.\n", (int)getpid());
fflush(stdout);
while (!done)
sleep(1);
pthread_join(other_thread, NULL);
return EXIT_SUCCESS;
}
Save it as e.g. example.c, and compile and run using
gcc -Wall -O2 example.c -pthread -o exprog
./exprog
It will block the USR1 signal in the main thread, and HUP and TERM in the other thread. It will also catch the INT signal (Ctrl+C), which is not blocked in either thread. When you send it the INT or TERM signal, the program will exit.
If you send the program the USR1 signal, you'll see that it will always be delivered using the other thread.
If you send the program a HUP signal, you'll see that it will always be delivered using the main thread.
If you send the program a TERM signal, it too will be delivered using the main thread, but it will also cause the program to exit (nicely).
If you send the program an INT signal, it will be delivered using one of the threads. It depends on several factors whether you'll always see it being delivered using the same thread or not, but at least in theory, it can be delivered using either thread. This signal too will cause the program to exit (nicely).
In the context of an existing multi-threaded application I want to suspend a list of threads for a specific duration then resume their normal execution. I know some of you wil say that I should not do that but I know that and I don't have a choice.
I came up with the following code that sort of work but randomly failed. For each thread I want to suspend, I send a signal and wait for an ack via a semaphore. The signal handler when invoked, post the semaphore and sleep for the specified duration.
The problem is when the system is fully loaded, the call to sem_timedwait sometimes fails with ETIMEDOUT and I am left with an inconsistent logic with semaphore used for the ack: I don't know if the signal has been dropped or is just late.
// compiled with: gcc main.c -o test -pthread
#include <pthread.h>
#include <stdio.h>
#include <signal.h>
#include <errno.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/syscall.h>
#define NUMTHREADS 40
#define SUSPEND_SIG (SIGRTMIN+1)
#define SUSPEND_DURATION 80 // in ms
static sem_t sem;
void checkResults(const char *msg, int rc) {
if (rc == 0) {
//printf("%s success\n", msg);
} else if (rc == ESRCH) {
printf("%s failed with ESRCH\n", msg);
} else if (rc == EINVAL) {
printf("%s failed with EINVAL\n", msg);
} else {
printf("%s failed with unknown error: %d\n", msg, rc);
}
}
static void suspend_handler(int signo) {
sem_post(&sem);
usleep(SUSPEND_DURATION*1000);
}
void installSuspendHandler() {
struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sigemptyset(&sa.sa_mask);
sa.sa_flags = 0;
sa.sa_handler = suspend_handler;
int rc = sigaction(SUSPEND_SIG, &sa, NULL);
checkResults("sigaction SUSPEND", rc);
}
void *threadfunc(void *param) {
int tid = *((int *) param);
free(param);
printf("Thread %d entered\n", tid);
// this is an example workload, the real app is doing many things
while (1) {
int rc = sleep(30);
if (rc != 0 && errno == EINTR) {
//printf("Thread %d got a signal delivered to it\n", tid);
} else {
//printf("Thread %d did not get expected results! rc=%d, errno=%d\n", tid, rc, errno);
}
}
return NULL;
}
int main(int argc, char **argv) {
pthread_t threads[NUMTHREADS];
int i;
sem_init(&sem, 0, 0);
installSuspendHandler();
for(i=0; i<NUMTHREADS; ++i) {
int *arg = malloc(sizeof(*arg));
if ( arg == NULL ) {
fprintf(stderr, "Couldn't allocate memory for thread arg.\n");
exit(EXIT_FAILURE);
}
*arg = i;
int rc = pthread_create(&threads[i], NULL, threadfunc, arg);
checkResults("pthread_create()", rc);
}
sleep(3);
printf("Will start to send signals...\n");
while (1) {
printf("***********************************************\n");
for(i=0; i<NUMTHREADS; ++i) {
int rc = pthread_kill(threads[i], SUSPEND_SIG);
checkResults("pthread_kill()", rc);
printf("Waiting for Semaphore for thread %d ...\n", i);
// compute timeout abs timestamp for ack
struct timespec ts;
clock_gettime(CLOCK_REALTIME, &ts);
const int TIMEOUT = SUSPEND_DURATION*1000*1000; // in nano-seconds
ts.tv_nsec += TIMEOUT; // timeout to receive ack from signal handler
// normalize timespec
ts.tv_sec += ts.tv_nsec / 1000000000;
ts.tv_nsec %= 1000000000;
rc = sem_timedwait(&sem, &ts); // try decrement semaphore
if (rc == -1 && errno == ETIMEDOUT) {
// timeout
// semaphore is out of sync
printf("Did not received signal handler sem_post before timeout of %d ms for thread %d", TIMEOUT/1000000, i);
abort();
}
checkResults("sem_timedwait", rc);
printf("Received Semaphore for thread %d.\n", i);
}
sleep(1);
}
for(i=0; i<NUMTHREADS; ++i) {
int rc = pthread_join(threads[i], NULL);
checkResults("pthread_join()\n", rc);
}
printf("Main completed\n");
return 0;
}
Questions?
Is it possible for a signal to be dropped and never delivered?
What causes the timeout on the semaphore at random time when the system is loaded?
usleep() is not among the async-signal-safe functions (though sleep() is, and there are other async-signal-safe functions by which you can produce a timed delay). A program that calls usleep() from a signal handler is therefore non-conforming. The specifications do not describe what may happen -- neither with such a call itself nor with the larger program execution in which it occurs. Your questions can be answered only for a conforming program; I do that below.
Is it possible for a signal to be dropped and never delivered?
It depends on what exactly you mean:
If a normal (not real-time) signal is delivered to a thread that already has that signal queued then no additional instance is queued.
A thread can die with signals still queued for it; those signals will not be handled.
A thread can change a given signal's disposition (to SIG_IGN, for example), though this is a per-process attribute, not a per-thread one.
A thread can block a signal indefinitely. A blocked signal is not dropped -- it remains queued for the thread and will eventually be received some time after it is unblocked, if that ever happens.
But no, having successfully queued a signal via the kill() or raise() function, that signal will not be randomly dropped.
What causes the timeout on the semaphore at random time when the system is loaded?
A thread can receive a signal only when it is actually running on a core. On a system with more runnable processes than cores, some runnable processes must be suspended, without a timeslice on any core, at any given time. On a heavily-loaded system, that's the norm. Signals are asynchronous, so you can send one to a thread that is currently waiting for a timeslice without the sender blocking. It is entirely possible, then, that the thread you have signaled does not get scheduled to run before the timeout expires. If it does run, it may have the signal blocked for one reason or another, and not get around to unblocking it before it uses up its timeslice.
Ultimately, you can use your semaphore-based approach to check whether the target thread handled the signal within any timeout of your choice, but you cannot predict in advance how long it will take for the thread to handle the signal, nor even whether it will do so in any finite amount of time (for example, it could die for one reason or another before doing so).
I am learning how to use pselect. I took an example code which worked fine and modified it to call the same code from a thread which is spawned from main and it does not work (pselect remains blocked forever)
#include <sys/select.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <arpa/inet.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
/* Flag that tells the daemon to exit. */
static volatile int exit_request = 0;
/* Signal handler. */
static void hdl (int sig)
{
exit_request = 1;
printf("sig=%d\n", sig);
}
/* Accept client on listening socket lfd and close the connection
* immediatelly. */
static void handle_client (int lfd)
{
int sock = accept (lfd, NULL, 0);
if (sock < 0) {
perror ("accept");
exit (1);
}
puts ("accepted client");
close (sock);
}
void *mythread(void *arg __attribute__ ((unused)))
{
int lfd;
struct sockaddr_in myaddr;
int yes = 1;
sigset_t mask;
sigset_t orig_mask;
struct sigaction act;
memset (&act, 0, sizeof(act));
act.sa_handler = hdl;
/* This server should shut down on SIGUSR1. */
if (sigaction(SIGUSR1, &act, 0)) {
perror ("sigaction");
return NULL;
}
sigemptyset (&mask);
sigaddset (&mask, SIGUSR1);
if (pthread_sigmask(SIG_BLOCK, &mask, &orig_mask) < 0) {
perror ("pthread_sigmask");
return NULL;
}
lfd = socket (AF_INET, SOCK_STREAM, 0);
if (lfd < 0) {
perror ("socket");
return NULL;
}
if (setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR,
&yes, sizeof(int)) == -1) {
perror ("setsockopt");
return NULL;
}
memset (&myaddr, 0, sizeof(myaddr));
myaddr.sin_family = AF_INET;
myaddr.sin_addr.s_addr = INADDR_ANY;
myaddr.sin_port = htons (10000);
if (bind(lfd, (struct sockaddr *)&myaddr, sizeof(myaddr)) < 0) {
perror ("bind");
return NULL;
}
if (listen(lfd, 5) < 0) {
perror ("listen");
return NULL;
}
while (!exit_request) {
fd_set fds;
int res;
/* BANG! we can get SIGUSR1 at this point, but it will be
* delivered while we are in pselect(), because now
* we block SIGUSR1.
*/
FD_ZERO (&fds);
FD_SET (lfd, &fds);
res = pselect (lfd + 1, &fds, NULL, NULL, NULL, &orig_mask);
if (res < 0 && errno != EINTR) {
perror ("select");
return NULL;
}
else if (exit_request) {
puts ("exited");
break;
}
else if (res == 0)
continue;
if (FD_ISSET(lfd, &fds)) {
handle_client (lfd);
}
}
return NULL;
}
int main (int argc, char *argv[])
{
void * res;
pthread_t mythr_h;
pthread_create(&mythr_h, (pthread_attr_t *)NULL, mythread, NULL);
pthread_join(mythr_h, &res);
return 0;
}
strong text
After sending SIGUSR1 to this program I see that it remains blocked in the pselect call. When the code in mythread function is moved back into main and not spawning any thread from main, it works perfectly.
After sending SIGUSR1 to this program I see that it remains blocked in
the pselect call. When the code in mythread function is moved back
into main and not spawning any thread from main, it works perfectly.
That's to be expected -- there is no guarantee that a signal will be delivered to the "right" thread, since there is no well-defined notion of what the "right" thread would be.
Signals and multithreading don't mix particularly well, but if you want to do it, I suggest getting rid of the exit_request flag (note: the volatile keyword isn't sufficient to work reliably in multithreaded scenarios anyway), and instead create a connected pair of file descriptors (by calling either the pipe() function or the socketpair() function). All your signal handler function (hdl()) needs to do is write a byte into one of the two file descriptors. Have your thread include the other file descriptor in its read-socket-set (fds) so that when the byte is written that will cause pselect() to return and then your subsequent call to FD_ISSET(theSecondFileDescriptorOfThePair, &fds) will return true, which is how your thread will know it's time to exit now.
The signal is delivered to the main thread, other than the thread blocking on the pselect() call. If there are multiple threads that have the signal unblocked, the signal can be delivered to any one of the threads.
Since you didn't specify your platform, first I'm quoting from the POSIX standard (System Interfaces volume, 2.4.1 Signal Generation and Delivery).
Signals generated for the process shall be delivered to exactly one of those threads within the process which is in a call to a sigwait() function selecting that signal or has not blocked delivery of the signal.
You can also see similar statements in Linux manpage signal(7).
A process-directed signal may be delivered to any
one of the threads that does not currently have the signal blocked.
If more than one of the threads has the signal unblocked, then the
kernel chooses an arbitrary thread to which to deliver the signal.
And FreeBSD manpage sigaction(2).
For signals directed at the process, if the
signal is not currently blocked by all threads then it is delivered to
one thread that does not have it blocked (the selection of which is
unspecified).
So what you can do is to block SIGUSR1 for all the threads in the process except for the one that calls pselect(). Luckily when a new thread is created, it inherits the signal mask from its creator.
From the same POSIX section above,
The signal mask for a thread shall be initialized from that of its parent or creating thread....
Linux pthread_sigmask(3),
A new thread inherits a copy of its creator's signal mask.
FreeBSD sigaction(2),
The signal mask for a thread is initialized from that of its parent (normally empty).
You can make the following changes to your code. In main(), before creating any threads, block SIGUSR1. In the thread, pass a signal mask that has SIGUSR1 unblocked into pselect().
--- old.c Mon Mar 21 22:48:52 2016
+++ new.c Mon Mar 21 22:53:54 2016
## -56,14 +56,14 ##
return NULL;
}
- sigemptyset (&mask);
- sigaddset (&mask, SIGUSR1);
-
- if (pthread_sigmask(SIG_BLOCK, &mask, &orig_mask) < 0) {
+ sigemptyset(&orig_mask);
+ if (pthread_sigmask(SIG_BLOCK, NULL, &orig_mask) < 0) {
perror ("pthread_sigmask");
return NULL;
}
+ sigdelset(&orig_mask, SIGUSR1);
+
lfd = socket (AF_INET, SOCK_STREAM, 0);
if (lfd < 0) {
perror ("socket");
## -126,6 +126,15 ##
{
void * res;
pthread_t mythr_h;
+ sigset_t mask;
+
+ sigemptyset (&mask);
+ sigaddset (&mask, SIGUSR1);
+
+ if (pthread_sigmask(SIG_BLOCK, &mask, NULL) != 0) {
+ return 1;
+ }
+
pthread_create(&mythr_h, (pthread_attr_t *)NULL, mythread, NULL);
pthread_join(mythr_h, &res);
return 0;
Last thing is off topic. printf() is not an async-signal-safe function, so should not be called in the signal handler.
If I setup and signal handler for SIGABRT and meanwhile I have a thread that waits on sigwait() for SIGABRT to come (I have a blocked SIGABRT in other threads by pthread_sigmask).
So which one will be processed first ? Signal handler or sigwait() ?
[I am facing some issues that sigwait() is get blocked for ever. I am debugging it currently]
main()
{
sigset_t signal_set;
sigemptyset(&signal_set);
sigaddset(&signal_set, SIGABRT);
sigprocmask(SIG_BLOCK, &signal_set, NULL);
// Dont deliver SIGABORT while running this thread and it's kids.
pthread_sigmask(SIG_BLOCK, &signal_set, NULL);
pthread_create(&tAbortWaitThread, NULL, WaitForAbortThread, NULL);
..
Create all other threads
...
}
static void* WaitForAbortThread(void* v)
{
sigset_t signal_set;
int stat;
int sig;
sigfillset( &signal_set);
pthread_sigmask( SIG_BLOCK, &signal_set, NULL ); // Dont want any signals
sigemptyset(&signal_set);
sigaddset(&signal_set, SIGABRT); // Add only SIGABRT
// This thread while executing , will handle the SIGABORT signal via signal handler.
pthread_sigmask(SIG_UNBLOCK, &signal_set, NULL);
stat= sigwait( &signal_set, &sig ); // lets wait for signal handled in CatchAbort().
while (stat == -1)
{
stat= sigwait( &signal_set, &sig );
}
TellAllThreadsWeAreGoingDown();
sleep(10);
return null;
}
// Abort signal handler executed via sigaction().
static void CatchAbort(int i, siginfo_t* info, void* v)
{
sleep(20); // Dont return , hold on till the other threads are down.
}
Here at sigwait(), i will come to know that SIGABRT is received. I will tell other threads about it. Then will hold abort signal handler so that process is not terminated.
I wanted to know the interaction of sigwait() and the signal handler.
From sigwait() documentation :
The sigwait() function suspends execution of the calling thread until
one of the signals specified in the signal set becomes pending.
A pending signal means a blocked signal waiting to be delivered to one of the thread/process. Therefore, you need not to unblock the signal like you did with your pthread_sigmask(SIG_UNBLOCK, &signal_set, NULL) call.
This should work :
static void* WaitForAbortThread(void* v){
sigset_t signal_set;
sigemptyset(&signal_set);
sigaddset(&signal_set, SIGABRT);
sigwait( &signal_set, &sig );
TellAllThreadsWeAreGoingDown();
sleep(10);
return null;
}
I got some information from this <link>
It says :
To allow a thread to wait for asynchronously generated signals, the threads library provides the sigwait subroutine. The sigwait subroutine blocks the calling thread until one of the awaited signals is sent to the process or to the thread. There must not be a signal handler installed on the awaited signal using the sigwait subroutine.
I will remove the sigaction() handler and try only sigwait().
From the code snippet you've posted, it seems you got the use of sigwait() wrong. AFAIU, you need WaitForAbortThread like below:
sigemptyset( &signal_set); // change it from sigfillset()
for (;;) {
stat = sigwait(&signal_set, &sig);
if (sig == SIGABRT) {
printf("here's sigbart.. do whatever you want.\n");
pthread_kill(tid, signal); // thread id and signal
}
}
I don't think pthread_sigmask() is really needed. Since you only want to handle SIGABRT, first init signal_set as empty then simply add SIGABRT, then jump into the infinite loop, sigwait will wait for the particular signal that you're looking for, you check the signal if it's SIGABRT, if yes - do whatever you want. NOTE the uses of pthread_kill(), use it to sent any signal to other threads specified via tid and the signal you want to sent, make sure you know the tid of other threads you want to sent signal. Hope this will help!
I know this question is about a year old, but I often use a pattern, which solves exactly this issue using pthreads and signals. It is a little length but takes care of any issues I am aware of.
I recently used in combination with a library wrapped with SWIG and called from within Python. An annoying issue was that my IRQ thread waiting for SIGINT using sigwait never received the SIGINT signal. The same library worked perfectly when called from Matlab, which didn't capture the SIGINT signal.
The solution was to install a signal handler
#define _NTHREADS 8
#include <signal.h>
#include <pthread.h>
#include <unistd.h>
#include <sched.h>
#include <linux/unistd.h>
#include <sys/signal.h>
#include <sys/syscall.h>
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h> // strerror
#define CallErr(fun, arg) { if ((fun arg)<0) \
FailErr(#fun) }
#define CallErrExit(fun, arg, ret) { if ((fun arg)<0) \
FailErrExit(#fun,ret) }
#define FailErrExit(msg,ret) { \
(void)fprintf(stderr, "FAILED: %s(errno=%d strerror=%s)\n", \
msg, errno, strerror(errno)); \
(void)fflush(stderr); \
return ret; }
#define FailErr(msg) { \
(void)fprintf(stderr, "FAILED: %s(errno=%d strerror=%s)\n", \
msg, errno, strerror(errno)); \
(void)fflush(stderr);}
typedef struct thread_arg {
int cpu_id;
int thread_id;
} thread_arg_t;
static jmp_buf jmp_env;
static struct sigaction act;
static struct sigaction oact;
size_t exitnow = 0;
pthread_mutex_t exit_mutex;
pthread_attr_t attr;
pthread_t pids[_NTHREADS];
pid_t tids[_NTHREADS+1];
static volatile int status[_NTHREADS]; // 0: suspended, 1: interrupted, 2: success
sigset_t mask;
static pid_t gettid( void );
static void *thread_function(void *arg);
static void signalHandler(int);
int main() {
cpu_set_t cpuset;
int nproc;
int i;
thread_arg_t thread_args[_NTHREADS];
int id;
CPU_ZERO( &cpuset );
CallErr(sched_getaffinity,
(gettid(), sizeof( cpu_set_t ), &cpuset));
nproc = CPU_COUNT(&cpuset);
for (i=0 ; i < _NTHREADS ; i++) {
thread_args[i].cpu_id = i % nproc;
thread_args[i].thread_id = i;
status[i] = 0;
}
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
pthread_mutex_init(&exit_mutex, NULL);
// We pray for no locks on buffers and setbuf will work, if not we
// need to use filelock() on on FILE* access, tricky
setbuf(stdout, NULL);
setbuf(stderr, NULL);
act.sa_flags = SA_NOCLDSTOP | SA_NOCLDWAIT;
act.sa_handler = signalHandler;
sigemptyset(&act.sa_mask);
sigemptyset(&mask);
sigaddset(&mask, SIGINT);
if (setjmp(jmp_env)) {
if (gettid()==tids[0]) {
// Main Thread
printf("main thread: waiting for clients to terminate\n");
for (i = 0; i < _NTHREADS; i++) {
CallErr(pthread_join, (pids[i], NULL));
if (status[i] == 1)
printf("thread %d: terminated\n",i+1);
}
// On linux this can be done immediate after creation
CallErr(pthread_attr_destroy, (&attr));
CallErr(pthread_mutex_destroy, (&exit_mutex));
return 0;
}
else {
// Should never happen
printf("worker thread received signal");
}
return -1;
}
// Install handler
CallErr(sigaction, (SIGINT, &act, &oact));
// Block SIGINT
CallErr(pthread_sigmask, (SIG_BLOCK, &mask, NULL));
tids[0] = gettid();
srand ( time(NULL) );
for (i = 0; i < _NTHREADS; i++) {
// Inherits main threads signal handler, they are blocking
CallErr(pthread_create,
(&pids[i], &attr, thread_function,
(void *)&thread_args[i]));
}
if (pthread_sigmask(SIG_UNBLOCK, &mask, NULL)) {
fprintf(stderr, "main thread: can't block SIGINT");
}
printf("Infinite loop started - CTRL-C to exit\n");
for (i = 0; i < _NTHREADS; i++) {
CallErr(pthread_join, (pids[i], NULL));
//printf("%d\n",status[i]);
if (status[i] == 2)
printf("thread %d: finished succesfully\n",i+1);
}
// Clean up and exit
CallErr(pthread_attr_destroy, (&attr));
CallErr(pthread_mutex_destroy, (&exit_mutex));
return 0;
}
static void signalHandler(int sig) {
int i;
pthread_t id;
id = pthread_self();
for (i = 0; i < _NTHREADS; i++)
if (pids[i] == id) {
// Exits if worker thread
printf("Worker thread caught signal");
break;
}
if (sig==2) {
sigaction(SIGINT, &oact, &act);
}
pthread_mutex_lock(&exit_mutex);
if (!exitnow)
exitnow = 1;
pthread_mutex_unlock(&exit_mutex);
longjmp(jmp_env, 1);
}
void *thread_function(void *arg) {
cpu_set_t set;
thread_arg_t* threadarg;
int thread_id;
threadarg = (thread_arg_t*) arg;
thread_id = threadarg->thread_id+1;
tids[thread_id] = gettid();
CPU_ZERO( &set );
CPU_SET( threadarg->cpu_id, &set );
CallErrExit(sched_setaffinity, (gettid(), sizeof(cpu_set_t), &set ),
NULL);
int k = 8;
// While loop waiting for exit condition
while (k>0) {
sleep(rand() % 3);
pthread_mutex_lock(&exit_mutex);
if (exitnow) {
status[threadarg->thread_id] = 1;
pthread_mutex_unlock(&exit_mutex);
pthread_exit(NULL);
}
pthread_mutex_unlock(&exit_mutex);
k--;
}
status[threadarg->thread_id] = 2;
pthread_exit(NULL);
}
static pid_t gettid( void ) {
pid_t pid;
CallErr(pid = syscall, (__NR_gettid));
return pid;
}
I run serveral tests and the conbinations and results are:
For all test cases, I register a signal handler by calling sigaction in the main thread.
main thread block target signal, thread A unblock target signal by calling pthread_sigmask, thread A sleep, send target signal.
result: signal handler is executed in thread A.
main thread block target signal, thread A unblock target signal by calling pthread_sigmask, thread A calls sigwait, send target signal.
result: sigwait is executed.
main thread does not block target signal, thread A does not block target signal, thread A calls sigwait, send target signal.
result: main thread is chosen and the registered signal handler is executed in the main thread.
As you can see, conbination 1 and 2 are easy to understand and conclude.
It is:
If a signal is blocked by a thread, then the process-wide signal handler registered by sigaction just can't catch or even know it.
If a signal is not blocked, and it's sent before calling sigwait, the process-wide signal handler wins. And that's why APUE the books require us to block the target signal before calling sigwait. Here I use sleep in thread A to simulate a long "window time".
If a signal is not blocked, and it's sent when sigwait has already been waiting, sigwait wins.
But you should notice that for test case 1 and 2, main thread is designed to block the target signal.
At last for test case 3, when main thread is not blocked the target signal, and sigwait in thread A is also waiting, the signal handler is executed in the main thread.
I believe the behaviour of test case 3 is what APUE talks about:
From APUE ยง12.8:
If a signal is being caught (the process has established a signal
handler by using sigaction, for example) and a thread is waiting for
the same signal in a call to sigwait, it is left up to the
implementation to decide which way to deliver the signal. The
implementation could either allow sigwait to return or invoke the
signal handler, but not both.
Above all, if you want to accomplish one thread <-> one signal model, you should:
block all signals in the main thread with pthread_sigmask (subsequent thread created in main thread inheris the signal mask)
create threads and call sigwait(target_signal) with target signal.
test code
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
FILE* file;
void* threadA(void* argv){
fprintf(file, "%ld\n", pthread_self());
sigset_t m;
sigemptyset(&m);
sigaddset(&m, SIGUSR1);
int signo;
int err;
// sigset_t q;
// sigemptyset(&q);
// pthread_sigmask(SIG_SETMASK, &q, NULL);
// sleep(50);
fprintf(file, "1\n");
err = sigwait(&m, &signo);
if (err != 0){
fprintf(file, "sigwait error\n");
exit(1);
}
switch (signo)
{
case SIGUSR1:
fprintf(file, "SIGUSR1 received\n");
break;
default:
fprintf(file, "?\n");
break;
}
fprintf(file, "2\n");
}
void hello(int signo){
fprintf(file, "%ld\n", pthread_self());
fprintf(file, "hello\n");
}
int main(){
file = fopen("daemon", "wb");
setbuf(file, NULL);
struct sigaction sa;
sigemptyset(&sa.sa_mask);
sa.sa_handler = hello;
sigaction(SIGUSR1, &sa, NULL);
sigset_t n;
sigemptyset(&n);
sigaddset(&n, SIGUSR1);
// pthread_sigmask(SIG_BLOCK, &n, NULL);
pthread_t pid;
int err;
err = pthread_create(&pid, NULL, threadA, NULL);
if(err != 0){
fprintf(file, "create thread error\n");
exit(1);
}
pause();
fprintf(file, "after pause\n");
fclose(file);
return 0;
}
run with ./a.out & (run in the background), and use kill -SIGUSR1 pid to test. Do not use raise. raise, sleep, pause are thread-wide.
I am trying to implement a basic event loop with pselect, so I have blocked some signals, saved the signal mask and used it with pselect so that the signals will only be delivered during that call.
If a signal is sent outside of the pselect call, it is blocked until pselect as it should, however it does not interrupt the pselect call. If a signal is sent while pselect is blocking, it will be handled AND pselect will be interrupted. This behaviour is only present in OSX, in linux it seems to function correctly.
Here is a code example:
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <errno.h>
#include <unistd.h>
#include <signal.h>
int shouldQuit = 0;
void signalHandler(int signal)
{
printf("Handled signal %d\n", signal);
shouldQuit = 1;
}
int main(int argc, char** argv)
{
sigset_t originalSignals;
sigset_t blockedSignals;
sigemptyset(&blockedSignals);
sigaddset(&blockedSignals, SIGINT);
if(sigprocmask(SIG_BLOCK, &blockedSignals, &originalSignals) != 0)
{
perror("Failed to block signals");
return -1;
}
struct sigaction signalAction;
memset(&signalAction, 0, sizeof(struct sigaction));
signalAction.sa_mask = blockedSignals;
signalAction.sa_handler = signalHandler;
if(sigaction(SIGINT, &signalAction, NULL) == -1)
{
perror("Could not set signal handler");
return -1;
}
while(!shouldQuit)
{
fd_set set;
FD_ZERO(&set);
FD_SET(STDIN_FILENO, &set);
printf("Starting pselect\n");
int result = pselect(STDIN_FILENO + 1, &set, NULL, NULL, NULL, &originalSignals);
printf("Done pselect\n");
if(result == -1)
{
if(errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
{
perror("pselect failed");
}
}
else
{
printf("Start Sleeping\n");
sleep(5);
printf("Done Sleeping\n");
}
}
return 0;
}
The program waits until you input something on stdin, then sleeps for 5 seconds. To create the problem, "a" is typed to create data on stdin. Then, while the program is sleeping, an INT signal is sent with Crtl-C.
On Linux:
Starting pselect
a
Done pselect
Start Sleeping
^CDone Sleeping
Starting pselect
Handled signal 2
Done pselect
On OSX:
Starting pselect
a
Done pselect
Start Sleeping
^CDone Sleeping
Starting pselect
Handled signal 2
^CHandled signal 2
Done pselect
Confirmed that it acts that way on OSX, and if you look at the source for pselect (http://www.opensource.apple.com/source/Libc/Libc-320.1.3/gen/FreeBSD/pselect.c), you'll see why.
After sigprocmask() restores the signal mask, the kernel delivers the signal to the process, and your handler gets invoked. The problem here is, that the signal can be delivered before select() gets invoked, so select() won't return with an error.
There's some more discussion about the issue at http://lwn.net/Articles/176911/ - linux used to use a similar userspace implementation that had the same problem.
If you want to make that pattern safe on all platforms, you'll have to either use something like libev or libevent and let them handle the messiness, or use sigprocmask() and select() yourself.
e.g.
sigset_t omask;
if (sigprocmask(SIG_SETMASK, &originalSignals, &omask) < 0) {
perror("sigprocmask");
break;
}
/* Must re-check the flag here with signals re-enabled */
if (shouldQuit)
break;
printf("Starting select\n");
int result = select(STDIN_FILENO + 1, &set, NULL, NULL, NULL);
int save_errno = errno;
if (sigprocmask(SIG_SETMASK, &omask, NULL) < 0) {
perror("sigprocmask");
break;
}
/* Recheck again after the signal is blocked */
if (shouldQuit)
break;
printf("Done pselect\n");
if(result == -1)
{
errno = save_errno;
if(errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
{
perror("pselect failed");
}
}
There are a couple of other things you should do with your code:
declare your 'shouldQuit' variable as volatile sig_atomic_t
volatile sig_atomic_t shouldQuit = 0;
always save errno before calling any other function (such as printf()), since that function may cause errno to be overwritten with another value. Thats why the code above aves errno immediately after the select() call.
Really, I strongly recommend using an existing event loop handling library like libev or libevent - I do, even though I can write my own, because it is so easy to get wrong.