How to properly handle signals when using the worker thread pattern? - c

I have a simple server that looks something like this:
void *run_thread(void *arg) {
// Communicate via a blocking socket
}
int main() {
// Initialization happens here...
// Main event loop
while (1) {
new_client = accept(socket, ...);
pthread_create(&thread, NULL, &run_thread, *thread_data*);
pthread_detach(thread);
}
// Do cleanup stuff:
close(socket);
// Wait for existing threads to finish
exit(0);
)
Thus when a SIGINT or SIGTERM is received I need to break out of the main event loop to get to the clean up code. Moreover most likely the master thread is waiting on the accept() call so it's not able to check some other variable to see if it should break;.
Most of the advice I found was along the lines of this: http://devcry.blogspot.com/2009/05/pthreads-and-unix-signals.html (creating a special signal handling thread to catch all the signals and do processing on those). However, it's the processing portion that I can't really wrap my head around: how can I possibly tell the main thread to return from the accept() call and check on an external variable to see if it should break;?

Usually I am waiting on select(listeninig-socket-here) not on accept(). accept() is usually a method where a program doesn't spend lots of time waiting. And when I wait in select() and the signal SIGTERM is sent to that thread (in your case it is the main thread) I exit from that select and select returns interrupted system call .

I second skwllsp in his opinion that you should be using select call instead of accept. But, my additional suggestion is that you follow the advice of the blog whose link you have posted and create a seperate signal handling thread and ignore signals in all other threads. Then when the signal is received in the signal handling thread, use pthread_cancel to cancel the other threads. When you use pthread_cancel, if the thread which is being cancelled is in cancellation point (select happens to be one), it will come out and enter into your handler and you can cleanup and exit the thread.
You can find more info on this here, here and here

Related

Raising SIGINT, but getting stuck in thread in c

I'm doing a program that utilizes threads. I also have a SIGINT handler that closes everything correctly, for an orderly shutdown. However, since my threads are in a while(1) loop the pthread_join function in my handler gets stuck and i have to press ctrl+c a bunch of times, to close each thread singularly. How can i do this with just 1 click?
Here's my thread worker function:
void *worker(){
struct message msg;
while(1){
if(wr.fnode != NULL){
sem_wait(&tsem);
stats->nptri++;
msg.patient = *(wr.fnode);
wr_deletefnode();
sem_post(&tsem);
sleep((float)(msg.patient.ttime/1000));
msgsnd(mqid,&msg,sizeof(msg)-sizeof(long),0);
}
}
}
It's depend how you are sending signal (SIGINT or any) to a thread. for sending a signal to thread you should use pthread_kill() instead of kill() or raise() because signal handler(signal or sigaction) handles only processes ,not threads.
int pthread_kill(pthread_t thread, int sig);
If you ever try to kill running thread using kill command/function OS will throw warning like Warning: Program '/bin/bash' crashed.
observe running thread using ps -eL | grep pts/0 before and after sending signal.
I hope you got something.
Two things you need to solve:
How to break the endless while loop
How to return from blocking system call (like for example sem_wait())
Referring 1.:
Change
while (1)
to be
while (!exit_flag)
Define exit_flag globally
volatile sig_atomic_t exit_flag = 0;
and set it to 1 on reception of the SIGINT signal.
To catch a signal do not setup a signal handler, but create a separate thread using sigwait() to wait for and receive a SIGINT.
Make sure all other threads are created blocking SIGINT.
Referring 2.:
Keep track of all thread running by memorising their thread-ids.
After the signal-listener thread mentioned under 1. set the exit_flag, then loop over the list of running threads and one by one
enable them to receive SIGINT
signal them using pthread_kill(), in case they were stuck inside a system call, the call would return setting errno to EINTR.
join the thread calling pthread_join().

Persistent signal handling

I'm trying to write a signal handler to catch any number of consecutive SIGINT signals and prevent the program from exiting. The program is a simple file server. The handler sets a global flag which causes the while loop accepting new connections to end, a call to pthread_exit() ensures that main lets current connections finish before exiting. It all goes like clockwork when I hit ctrl-C once but a second time exits the program immediately.
I tried first with signal():
signal(SIGINT, catch_sigint);
...
static void catch_sigint(int signo)
{
...
signal(SIGINT, catch_sigint);
}
I also tried it using sigaction:
struct sigaction sigint_handler;
sigint_handler.sa_handler = catch_sigint;
sigemptyset(&sigint_handler.sa_mask);
sigint_handler.sa_flags = 0;
sigaction(SIGINT, &sigint_handler, NULL);
Unsure how to "reinstall" this one I just duplicated this code in the handler similar to the handler using the signal() method.
Neither one of these works as I expected.
Additional info:
The program is a simple file server. It receives a request from the client which is simply a string consisting of the requested file name. It utilizes pthreads so that transfers can occur simultaneously. Upon receiving SIGINT I wish for the server to exit the while loop and wait for all current transfers to complete then close. As is, no matter how I code the signal handler a second SIGINT terminates the program immediately.
int serverStop = 0;
...
int main()
{
/* set up the server -- socket(), bind() etc. */
struct sigaction sigint_hadler;
sigint_handler.sa_handler = catch_sigint;
sigint_handler.sa_flags = 0;
sigemptyset(&sigint_handler.sa_mask);
sigaction(SIGINT, &sigint_handler, NULL);
/* signal(SIGINT, catch_sigint); */
while(serverStop == 0)
{
/* accept new connections and pthread_create() for each */
}
pthread_exit(NULL);
}
...
static void catch_sigint(int signo)
{
serverStop = 1;
/* signal(SIGINT, catch_sigint) */
}
I don't think any other code could be pertinent but feel free to ask for elaboration
On Linux, you should not have to reinstall the signal handler, using either signal (which implements BSD semantics by default) or sigaction.
when I hit ctrl-C once but a second time exits the program immediately.
That's not because your handler got reset, but likely because your signal handler is doing something it shouldn't.
Here is how I would debug this issue: run the program under GDB and
(gdb) catch syscall exit
(gdb) catch syscall exit_group
(gdb) run
Now wait a bit for the program to start working, and hit Control-C. That will give you (gdb) prompt. Now continue the program as if it has received SIGINT: signal SIGINT (this will invoke your handler). Repeat the 'Control-C/signal SIGINT' sequence again. If you get stopped in either exit or exit_group system call, see where that is coming from (using GDB where command).
Update:
Given the new code you posted, it's not clear exactly where you call pthread_exit to "ensures that main lets current connections finish before exiting". As written, your main thread will exit the loop on first Control-C, and proceed to call exit which would not wait for other threads to finish.
Either you didn't show your actual code, or the "second Control-C" is a red herring and your first Control-C takes you out already (without finishing work in other threads).
NOTE: this is largely guesswork.
I'm pretty sure that calling pthread_exit in the main thread is a bad idea. If the main thread has quit, then the OS may try to send subsequent signals to some other thread.
I recommend that instead of using pthread_exit in the main thread, you just pthread_join() all the other threads, then exit normally.
But it's also important to ensure that the other threads do not get the signals. Normally this is done with sigprocmask (or maybe more correctly pthread_sigmask, which is the same under Linux) to mask the signal out in the worker threads. This ensures that the signal is never delivered to them.
Note that to avoid race conditions, you should use pthread_sigmask in the main thread just before creating a child thread, then set the signal mask back again in the main thread afterwards. This ensures that there is no window, however small, during which a child thread can possibly get unwanted signals.
I'm not sure to understand. A signal handler should usually not re-install any signal handler (including itself), because the signal handler stays in function till another is installed. See also SA_NODEFER flag to sigaction to be able to catch the signal during its handling.
A signal handler should be short. See my answer to this question. It usually mostly sets a volatile sig_atomic_t variable.
What is not working? Don't do complex or long-lasting processing inside signal handlers.
Please show your code...

pthreads: How to handle signals in a main thread that creates other threads? (specific code shown)

I have a main thread, which stays in the main function, i.e. I do not create it specifically as in pthread_create, because it's not necessary. This thread opens a file, then creates other threads, waits for them to finish their work (i.e., does the join), cleans up everything (pointers, semaphores, conditional variables and so on...).
Now, I have to apply this code to block SIGINT:
sigset_t set;
int sig;
sigemptyset(&set);
sigaddset(&set, SIGINT);
pthread_sigmask(SIG_BLOCK, &set, NULL);
while (1) {
sigwait(&set, &sig);
switch (sig) {
case SIGINT:
/* handle interrupts */
break;
default:
/* unexpected signal */
pthread_exit((void *)-1);
}
}
and it says You must use the main() function to launch the N+1 threads and wait for their completion. If a SIGINT signal arrives at the program it should be handled by the main thread in order to shutdown the program and its threads a clean way
My doubt is how should I put this code? Is it wrong to put it on a background thread created in main() ? Because I already have a cicle, with an exit flag, that creates and join all the other threads, so I don't understand if this code goes exactly to the main function where all is done/called to initiate the program. If I put it on a thread, with this code and the handler to clean, is this considerated as busy waiting?
"It says"? What says? The homework assignment?
The first thing you should understand about programming with threads and signals is that you have very little control over which thread a signal is delivered to. If your main thread wants to be the one to get the signal, it should block the signal before creating any new threads and possible unblock it after it finishes creating them, to ensure that the signal is not delivered to them.
However, if you're following best practices for signal handlers, it probably doesn't matter which thread handles the signal. All the signal handler should do is set a global flag or write a byte to a pipe (whichever works best to get the main thread to notice that the signal happened. (Note that you cannot use condition variables or any locking primitives from signal handlers!) As in the code fragment in your question, blocking the signal and using sigwait is also possible (be aware, again, that it needs to be blocked in all threads), but most programs can't afford to stop and wait just for signals; they need to wait for condition variables and/or input from files as well. One way to solve this issue is to make a dedicated thread to call sigwait, but that's rather wasteful. A better solution, if you're already using select, would be to switch to pselect that can wait for signals as well as file descriptor events (at the same time).
Rather than asking us for the answers (which would be hard to give anyway without seeing the full program you're trying to make this work with), you'd be much better off trying to really understand the intricacies of signals with threads.

Adding signal handler to a function in C for a thread library

I am writing a basic user level thread library. The function prototype for thread creation is
thr_create (start_func_pointer,arg)
{
make_context(context_1,start_func)
}
start_func will be user defined and can change depending on user/program
once after creation of thread, if I start executing it using
swapcontext(context_1,context_2)
the function start_func would start running. Now , if a signal comes in , I need to handle it. Unfortunately, I just have the handle to start_func so I cant really define signal action inside the start_func
is there a way I can add a signal handling structure inside the start_function and point it to my code. something like this
thr_create (start_func_pointer,arg)
{
start_func.add_signal_hanlding_Structure = my_signal_handler();
make_context(context_1,start_func)
}
Does anybody know how posix does it ?
If you are talking about catching real signals from the actual operating system you are running on I believe that you are going to have to do this application wide and then pass the signals on down into each thread (more on this later). The problem with this is that it gets complicated if two (or more) of your threads are trying to use alarm which uses SIGALRM -- when the real signal happens you can catch it, but then who do you deliver it to (one or all of the threads?).
If you are talking about sending and catching signals just among the threads within a program using your library then sending a signal to a thread would cause it to be marked ready to run, even if it were waiting on something else previously, and then any signal handling functionality would be called from your thread resume code. If I remember from your previous questions you had a function called thread_yield which was called to allow the next thread to run. If this is the case then thread_yield needs to check a list of pending signals and preform their actions before returning to where ever thread_yield was called (unless one of the signal handlers involved killing the current thread, in which case you have to do something different).
As far as how to implement registering of signal handlers, in POSIX that is done by system calls made by the main function (either directly or indirectly). So you could have:
static int foo_flag = 0;
static void foo_handle(int sig) {
foo_flag = 1;
}
int start_func(void * arg) {
thread_sig_register(SIGFOO, foo_handle);
thread_pause();
// this is a function that you could write that would cause the current thread
// to mark itself as not ready to run and then call thread_yield, so that
// thread_pause() will return only after something else (a signal) causes the
// thread to become ready to run again.
if (foo_flag) {
printf("I got SIGFOO\n");
} else {
printf("I don't know what woke me up\n");
}
return 0;
}
Now, from another thread you can send this thread a SIGFOO (which is just a signal I made up for demonstration purposes).
Each of your thread control blocks (or whatever you are calling them) will have to have a signal handler table (or list, or something) and a pending signal list or a way to mark the signals as pending. The pending signals will be examined (possibly in some priority based order) and the handler action is done for each pending signal before returning to that threads normal code.

How to join a thread that is hanging on blocking IO?

I have a thread running in the background that is reading events from an input device in a blocking fashion, now when I exit the application I want to clean up the thread properly, but I can't just run a pthread_join() because the thread would never exit due to the blocking IO.
How do I properly solve that situation? Should I send a pthread_kill(theard, SIGIO) or a pthread_kill(theard, SIGALRM) to break the block? Is either of that even the right signal? Or is there another way to solve this situation and let that child thread exit the blocking read?
Currently a bit puzzled since none of my googling turned up a solution.
This is on Linux and using pthreads.
Edit: I played around a bit with SIGIO and SIGALRM, when I don't install a signal handler they break the blocking IO up, but give a message on the console ("I/O possible") but when I install a signal handler, to avoid that message, they no longer break the blocking IO, so the thread doesn't terminate. So I am kind of back to step one.
The canonical way to do this is with pthread_cancel, where the thread has done pthread_cleanup_push/pop to provide cleanup for any resources it is using.
Unfortunately this can NOT be used in C++ code, ever. Any C++ std lib code, or ANY try {} catch() on the calling stack at the time of pthread_cancel will potentially segvi killing your whole process.
The only workaround is to handle SIGUSR1, setting a stop flag, pthread_kill(SIGUSR1), then anywhere the thread is blocked on I/O, if you get EINTR check the stop flag before retrying the I/O. In practice, this does not always succeed on Linux, don't know why.
But in any case it's useless to talk about if you have to call any 3rd party lib, because they will most likely have a tight loop that simply restarts I/O on EINTR. Reverse engineering their file descriptor to close it won't cut it either—they could be waiting on a semaphore or other resource. In this case, it is simply impossible to write working code, period. Yes, this is utterly brain-damaged. Talk to the guys who designed C++ exceptions and pthread_cancel. Supposedly this may be fixed in some future version of C++. Good luck with that.
I too would recommend using a select or some other non-signal-based means of terminating your thread. One of the reasons we have threads is to try and get away from signal madness. That said...
Generally one uses pthread_kill() with SIGUSR1 or SIGUSR2 to send a signal to the thread. The other suggested signals--SIGTERM, SIGINT, SIGKILL--have process-wide semantics that you may not be interested in.
As for the behavior when you sent the signal, my guess is that it has to do with how you handled the signal. If you have no handler installed, the default action of that signal are applied, but in the context of the thread that received the signal. So SIGALRM, for instance, would be "handled" by your thread, but the handling would consist of terminating the process--probably not the desired behavior.
Receipt of a signal by the thread will generally break it out of a read with EINTR, unless it is truly in that uninterruptible state as mentioned in an earlier answer. But I think it's not, or your experiments with SIGALRM and SIGIO would not have terminated the process.
Is your read perhaps in some sort of a loop? If the read terminates with -1 return, then break out of that loop and exit the thread.
You can play with this very sloppy code I put together to test out my assumptions--I am a couple of timezones away from my POSIX books at the moment...
#include <stdlib.h>
#include <stdio.h>
#include <pthread.h>
#include <signal.h>
int global_gotsig = 0;
void *gotsig(int sig, siginfo_t *info, void *ucontext)
{
global_gotsig++;
return NULL;
}
void *reader(void *arg)
{
char buf[32];
int i;
int hdlsig = (int)arg;
struct sigaction sa;
sa.sa_handler = NULL;
sa.sa_sigaction = gotsig;
sa.sa_flags = SA_SIGINFO;
sigemptyset(&sa.sa_mask);
if (sigaction(hdlsig, &sa, NULL) < 0) {
perror("sigaction");
return (void *)-1;
}
i = read(fileno(stdin), buf, 32);
if (i < 0) {
perror("read");
} else {
printf("Read %d bytes\n", i);
}
return (void *)i;
}
main(int argc, char **argv)
{
pthread_t tid1;
void *ret;
int i;
int sig = SIGUSR1;
if (argc == 2) sig = atoi(argv[1]);
printf("Using sig %d\n", sig);
if (pthread_create(&tid1, NULL, reader, (void *)sig)) {
perror("pthread_create");
exit(1);
}
sleep(5);
printf("killing thread\n");
pthread_kill(tid1, sig);
i = pthread_join(tid1, &ret);
if (i < 0)
perror("pthread_join");
else
printf("thread returned %ld\n", (long)ret);
printf("Got sig? %d\n", global_gotsig);
}
Your select() could have a timeout, even if it is infrequent, in order to exit the thread gracefully on a certain condition. I know, polling sucks...
Another alternative is to have a pipe for each child and add that to the list of file descriptors being watched by the thread. Send a byte to the pipe from the parent when you want that child to exit. No polling at the cost of a pipe per thread.
Old question which could very well get a new answer as things have evolved and a new technology is now available to better handle signals in threads.
Since Linux kernel 2.6.22, the system offers a new function called signalfd() which can be used to open a file descriptor for a given set of Unix signals (outside of those that outright kill a process.)
// defined a set of signals
sigset_t set;
sigemptyset(&set);
sigaddset(&set, SIGUSR1);
// ... you can add more than one ...
// prevent the default signal behavior (very important)
sigprocmask(SIG_BLOCK, &set, nullptr);
// open a file descriptor using that set of Unix signals
f_socket = signalfd(-1, &set, SFD_NONBLOCK | SFD_CLOEXEC);
Now you can use the poll() or select() functions to listen to the signal along the more usual file descriptor (socket, file on disk, etc.) you were listening on.
The NONBLOCK is important if you want a loop that can check signals and other file descriptors over and over again (i.e. it is also important on your other file descriptor).
I have such an implementation that works with (1) timers, (2) sockets, (3) pipes, (4) Unix signals, (5) regular files. Actually, really any file descriptor plus timers.
https://github.com/m2osw/snapcpp/blob/master/snapwebsites/libsnapwebsites/src/snapwebsites/snap_communicator.cpp
https://github.com/m2osw/snapcpp/blob/master/snapwebsites/libsnapwebsites/src/snapwebsites/snap_communicator.h
You may also be interested by libraries such as libevent
Depends how it's waiting for IO.
If the thread is in the "Uninterruptible IO" state (shown as "D" in top), then there really is absolutely nothing you can do about it. Threads normally only enter this state briefly, doing something such as waiting for a page to be swapped in (or demand-loaded, e.g. from mmap'd file or shared library etc), however a failure (particularly of a NFS server) could cause it to stay in that state for longer.
There is genuinely no way of escaping from this "D" state. The thread will not respond to signals (you can send them, but they will be queued).
If it's a normal IO function such as read(), write() or a waiting function like select() or poll(), signals would be delivered normally.
One solution that occurred to me the last time I had an issue like this was to create a file (eg. a pipe) that existed only for the purpose of waking up blocking threads.
The idea would be to create a file from the main loop (or 1 per thread, as timeout suggests - this would give you finer control over which threads are woken). All of the threads that are blocking on file I/O would do a select(), using the file(s) that they are trying to operate on, as well as the file created by the main loop (as a member of the read file descriptor set). This should make all of the select() calls return.
Code to handle this "event" from the main loop would need to be added to each of the threads.
If the main loop needed to wake up all of the threads it could either write to the file or close it.
I can't say for sure if this works, as a restructure meant that the need to try it vanished.
I think, as you said, the only way would be to send a signal then catch and deal with it appropriately. Alternatives might be SIGTERM, SIGUSR1, SIGQUIT, SIGHUP, SIGINT, etc.
You could also use select() on your input descriptor so that you only read when it is ready. You could use select() with a timeout of, say, one second and then check if that thread should finish.
I always add a "kill" function related to the thread function which I run before join that ensures the thread will be joinable within reasonable time. When a thread uses blocking IO I try to utilize the system to break the lock. For example, when using a socket I would have kill call shutdown(2) or close(2) on it which would cause the network stack to terminate it cleanly.
Linux' socket implementation is thread safe.
I'm surprised that nobody has suggested pthread_cancel. I recently wrote a multi-threaded I/O program and calling cancel() and the join() afterwards worked just great.
I had originally tried the pthread_kill() but ended up just terminating the entire program with the signals I tested with.
If you're blocking in a third-party library that loops on EINTR, you might want to consider a combination of using pthread_kill with a signal (USR1 etc) calling an empty function (not SIG_IGN) with actually closing/replacing the file descriptor in question. By using dup2 to replace the fd with /dev/null or similar, you'll cause the third-party library to get an end-of-file result when it retries the read.
Note that by dup()ing the original socket first, you can avoid needing to actually close the socket.
Signals and thread is a subtle problem on Linux according to the different man pages.
Do you use LinuxThreads, or NPTL (if you are on Linux) ?
I am not sure of this, but I think the signal handler affects the whole process, so either you terminate your whole process or everything continue.
You should use timed select or poll, and set a global flag to terminate your thread.
I think the cleanest approach would have the thread using conditional variables in a loop for continuing.
When an i/o event is fired, the conditional should be signaled.
The main thread could just signal the condition while chaning the loop predicate to false.
something like:
while (!_finished)
{
pthread_cond_wait(&cond);
handleio();
}
cleanup();
Remember with conditional variables to properly handle signals. They can have things such as 'spurious wakeups'. So i would wrap your own function around the cond_wait function.
struct pollfd pfd;
pfd.fd = socket;
pfd.events = POLLIN | POLLHUP | POLLERR;
pthread_lock(&lock);
while(thread_alive)
{
int ret = poll(&pfd, 1, 100);
if(ret == 1)
{
//handle IO
}
else
{
pthread_cond_timedwait(&lock, &cond, 100);
}
}
pthread_unlock(&lock);
thread_alive is a thread specific variable that can be used in combination with the signal to kill the thread.
as for the handle IO section you need to make sure that you used open with the O_NOBLOCK option, or if its a socket there is a similar flag you can set MSG_NOWAIT??. for other fds im not sure

Resources