SIGPOLL (SIGIO) problem: interrupt while executing a handler

I am implementing a client and server that communicate over UDP using sendto() and recvfrom(). I wanted to use SIGPOLL for my client when receiving data from the server. My issue is that while in the handler, another signal arrives and it gets lost. I've read there is a kernel variable (a flag) that gets set that I could check before exiting the handler, but I can't seem to find out which flag that is.
void my_receive_server_data(int sig_num)
{
/* execute recvfrom() */
}
And in main:
setup_action.sa_handler = my_receive_server_data;
if (sigaction(SIGPOLL, &setup_action, NULL) == -1)
perror("Sigaction");
if (fcntl(sock, F_SETOWN, getpid()) < 0) {
perror("fcntl");
}
if (fcntl(sock, F_SETFL, O_RDONLY | FASYNC) < 0) {
perror("fcntl");
}
Currently, if I do not put a sleep() after sendto() in the server, the client only receives the first datagram (in the handler, via recvfrom()). Adding the sleep() fixes the problem, so I strongly believe the handler only runs once because the next datagram arrives while my_receive_server_data is still executing.
What flag should I check before returning from my_receive_server_data to find out whether any input arrived while the handler was executing?
Thank you very much.

It's very hard to create a solid system the way you're trying to. Instead of signals, learn to use blocking call models, and I/O multiplexing with select() or poll().
If you intend to use signals no matter what, try to reduce the code in the signal handler to a minimum -- maybe just sending a byte down a pipe to wake up a sleeping thread or process, or whatever.

The flag is SA_NODEFER passed to sigaction when defining a handler for SIGPOLL.

Related

Using a sig_atomic_t flag together with blocking calls

Say I have a flag to indicate an exit condition that I wish to enable with a signal. Then I can attach the following handler to SIGUSR1, for instance.
volatile sig_atomic_t finished = 0;
void catch_signal(int sig)
{
finished = 1;
}
I then use the flag to determine when a particular loop should end. In this particular case I have a thread running (but I believe my problem applies without threads also, so don't focus on that part).
void *thread_routine(void *arg)
{
while (!finished) {
/* What if the signal happens here? */
if ((clientfd = accept(sockfd, &remote_addr, &addr_size)) == -1) {
if (errno == EINTR)
continue;
/* Error handling */
}
handle_client(clientfd);
}
}
This loop is supposed to continue to run until I raise my SIGUSR1 signal. When it receives the signal I want it to stop gracefully as soon as possible. Since I have a blocking accept call I don't have the loop spinning around wasting CPU cycles, which is good, and the signal can at any moment interrupt the blocking accept and cause the loop to terminate.
The problem is, as shown in the comment in the code, that the signal could be delivered right after the while condition but before the accept call. Then the signal handler will set finished to true, but after the execution resumes, accept will be called and block indefinitely. How can I avoid this condition and make sure that I always will be able to terminate the loop with my signal?
Assuming I still want to use a signal to control this, I can think of two possible solutions. The first one is to turn on some alarm that re-raises a signal after a while if the signal was missed the first time. The second one is to put a timeout on the socket so that accept returns after some amount of time and the flag can be examined again. But these solutions are more like workarounds (especially since I change the blocking behaviour of accept in my second solution), and if there is some cleaner and more straightforward solution I'd like to use that instead.

The Self-Pipe Trick can be used in such cases.
You open a pipe and use select to wait both on the pipefd and sockfd. The handler writes a char to the pipe. After the select, checking fd set helps you determine if you can go for accept or not.

I realize this question is over a year old now, but pselect() was designed for exactly this type of situation. You can give pselect() (and select() generally) the file descriptors of listening sockets, and those functions will return when there is an accept()able connection available.
The general approach is to block all relevant signals and then call pselect() with a signal mask that unblocks them. pselect() will atomically:
Unblock the signal(s)
Wait until a descriptor is ready (for a listening socket, until accept() would not block) or a signal arrives
Block the signal(s) again before it returns
so you can essentially guarantee that the only time that signal will actually be delivered and handled is while pselect() is running, and you don't have to worry about it being caught after you check finished but before you start waiting. In other words, you make sure that whenever that signal is delivered, it always interrupts pselect(), which then fails with errno set to EINTR, so that's the only place you have to check for it.

Is there a version of the wait() system call that sets a timeout?

Is there any way to use the wait() system call with a timeout, besides using a busy-waiting or busy-sleeping loop?
I've got a parent process that forks itself and execs a child executable. It then waits for the child to finish, grabs its output by whatever means appropriate, and performs further processing. If the process does not finish within a certain period of time, it assumes that its execution timed out, and does something else. Unfortunately, this timeout detection is necessary given the nature of the problem.

There's not a wait call that takes a timeout.
What you can do instead is install a signal handler that sets a flag for SIGCHLD, and use select() to implement a timeout. select() will be interrupted by a signal.
static volatile sig_atomic_t punt;
static void sig_handler(int sig)
{
    punt = 1;
}
...
struct timeval timeout = {10,0};
int rc;
signal(SIGCHLD, sig_handler);
/* fork/exec stuff */
//select will get interrupted by a signal
rc = select(0, NULL,NULL,NULL, &timeout );
if (punt) {
    //child terminated (check this first: the child may have exited before select() started)
} else if (rc == 0) {
    // timed out
}
More logic is needed if you have other signals to handle as well, though.

You can use waitpid together with the WNOHANG option and a sleep.
while(waitpid(pid, &status, WNOHANG) == 0) {
sleep(1);
}
But this will be an active sleeping. However I see no other way using the wait type of functions.

On Linux, you can also solve this problem using signalfd. signalfd essentially takes a set of signals and creates an fd which you can read; each block you read corresponds to a signal which has fired. (You should block these signals with sigprocmask so that they are not delivered in the usual way.)
The advantage of signalfd is that you can use the fd with select, poll, or epoll, all of which allow for timeouts, and all of which allow you to wait for other things as well.
One note: if the same signal fires twice before the corresponding struct signalfd_siginfo is read, you'll only receive a single indication. So when you get a SIGCHLD indication, you need to call waitpid(-1, &status, WNOHANG) repeatedly until it returns 0 (no further child has exited yet) or -1 (no children remain).
On FreeBSD, you can achieve the same effect rather more directly using kqueue and a kevent of type EVFILT_PROC. (You can also kqueue a SIGCHLD event, but EVFILT_PROC lets you specify the events by child pid instead of globally for all children.) This should also work on Mac OS X, but I've never tried it.

How to cleanly interrupt a thread blocking on a recv call?

I have a multithreaded server written in C, with each client thread looking something like this:
ssize_t n;
struct request request;
// Main loop: receive requests from the client and send responses.
while(running && (n = recv(sockfd, &request, sizeof(request), 0)) == sizeof(request)) {
// Process request and send response.
}
if(n == -1)
perror("Error receiving request from client");
else if(n != sizeof(request))
fprintf(stderr, "Error receiving request from client: Incomplete data\n");
// Clean-up code.
At some point, a client meets a certain criterion where it must be disconnected. If the client is regularly sending requests, this is fine because it can be informed of the disconnection in the responses. However, sometimes the clients take a long time to send a request, so the client threads end up blocking in the recv call, and the client does not get disconnected until the next request/response.
Is there a clean way to disconnect the client from another thread while the client thread is blocking in the recv call? I tried close(sockfd) but that causes the error Error receiving request from client: Bad file descriptor to occur, which really isn't accurate.
Alternatively, is there a better way for me to be handling errors here?

So you have at least these possibilities:
(1) pthread_kill will blow the thread out of recv with errno == EINTR and you can clean up and exit the thread on your own. Some people think this is nasty. Depends, really.
(2) Make your client socket(s) non-blocking and use select to wait on input for a specific period of time before checking whether a switch shared between the threads has been set to indicate they should shut down.
(3) In combination with (2), have each thread share a pipe with the master thread. Add it to the select. If it becomes readable and contains a shutdown request, the thread shuts itself down.
(4) Look into the pthread_cancel mechanism if none of the above (or variations thereof) meets your needs.

Shutdown the socket for input from another thread. That will cause the reading thread to receive an EOS, which should cause it to close the socket and terminate if it is correctly written.
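A small sketch of how the shutdown() suggestion might look, assuming the controlling thread knows the client's socket descriptor. The names request_disconnect, recv_outcome and classify_recv are hypothetical. After the shutdown, recv() in the client thread returns 0, which the thread can treat as an orderly end of stream rather than an error.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Called by the controlling thread: shut down the read side so a
 * recv() on this socket returns 0 (end of stream). The descriptor
 * stays open, so its number cannot be reused underneath the reader. */
int request_disconnect(int sockfd)
{
    return shutdown(sockfd, SHUT_RD);
}

enum recv_outcome { RECV_OK, RECV_CLOSED, RECV_SHORT, RECV_ERROR };

/* Called by the client thread on the value returned by recv(). */
enum recv_outcome classify_recv(ssize_t n, size_t expected)
{
    if (n == -1)
        return RECV_ERROR;        /* inspect errno */
    if (n == 0)
        return RECV_CLOSED;       /* peer closed or request_disconnect() ran */
    if ((size_t)n != expected)
        return RECV_SHORT;        /* incomplete request */
    return RECV_OK;
}
```

The client thread remains responsible for calling close() on the socket once it sees RECV_CLOSED.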

To interrupt the thread, make the socket non-blocking (set O_NONBLOCK using fcntl) and then signal the thread with pthread_kill. This way, recv will fail with either EINTR if it was sleeping, or EAGAIN or EWOULDBLOCK if it wasn’t (also maybe if SA_RESTART is in effect, didn’t check). Note that the socket doesn’t need to, and actually should not, be non-blocking before that. (And of course the signal needs to be handled; empty handler is sufficient).
To be sure to catch the stop-signal but not anything else, use a flag; there are things that may go wrong. For example, recv may fail with EINTR on some spurious signal. Or it may succeed if there was some data available, effectively ignoring the stop request.
And what not to do:
Don’t use pthread_kill alone or with any plain check. It may arrive right before issuing the recv syscall, too early to interrupt it but after all the checks.
Don’t close the socket. That may not even work, and as @R.. pointed out, is dangerous as the socket file descriptor may be reused between close and recv (unless you’re sure nothing opens file descriptors).

c / fork / signals / best practice closing open sockets in different processes

Hello everyone,
Two days ago I asked about threads and fork. I have now ended up using fork.
I create a second process. Parent and child execute different code, but both end up in a while loop, because one sends packets through a socket forever and the other listens on a socket forever. Now I want them to clean up when ctrl-c is pressed, i.e. both should close their open sockets before returning.
I have three files. The first, the main file, creates the processes. The second contains the parent code and the third the child code. You can find more information (code snippets) here: c / interrupted system call / fork vs. thread
Now my question: where do I have to put the signal handler, or do I have to specify two of them, one for each process? It seems like a simple question, but somehow not for me. I tried different ways, but could only get one of the two to clean up successfully before returning (sorry for my bad English). Both have to do different things, which is the problem for me, so one handler wouldn't be enough, right?
struct sigaction new_action;
new_action.sa_handler = termination_handler_1;
sigemptyset (&new_action.sa_mask);
new_action.sa_flags = 0;
sigaction(SIGINT, &new_action, NULL);
....more code here ...
/* will run until crtl-c is pressed */
while(keep_going) {
recvlen = recvfrom(sockfd_in, msg, itsGnMaxSduSize_MIB, 0, (struct sockaddr *) &incoming, &ilen);
if(recvlen < 0) {
perror("something went wrong / incoming\n");
exit(1);
}
buflen = strlen(msg);
sentlen = ath_sendto(sfd, &athinfo, &addrnwh, &nwh, buflen, msg, &selpv2, &depv);
if(sentlen == E_ERR) {
perror("Failed to send network header packet.\n");
exit(1);
}
}
close(sockfd_in);
/* Close network header socket */
gnwh_close(sfd);
/* Terminate network header library */
gnwh_term();
printf("pc2wsu: signal received, closed all sockets and so on!\n");
return 0;
}
void termination_handler_1(wuint32 signum) {
keep_going = 0;
}
As you can see, handling the signal in my case is just changing the loop condition "keep_going". After exiting the loop, each process should clean up.
Thanks in advance for your help.
nyyrikki

There is no reason to close the sockets. When a process exits (as is the default action for SIGINT), all its file descriptors are inherently closed. Unless you have other essential cleanup to do (like saving data to disk) then forget about handling the signal at all. It's almost surely the wrong thing to do.

Your code suffers from a race condition. You test for keep_going and then enter recvfrom, but it might have gotten the signal between then. That is pretty unlikely, so we will ignore it.
It sounds like the sender and receiver were started by the same process and that process was started from the shell. If you have not done anything, they will be in the same process group and all three processes will receive SIGINT when you hit ^C. Thus it would be best if both processes handled SIGINT if you want to run cleanup code (note closing FDs isn't a good reason...the fds will be autoclosed when the process exits). If these are TCP sockets between the two, closing one side would eventually cause the other side to close (but for sender, not until they try to send again).

How to join a thread that is hanging on blocking IO?

I have a thread running in the background that is reading events from an input device in a blocking fashion, now when I exit the application I want to clean up the thread properly, but I can't just run a pthread_join() because the thread would never exit due to the blocking IO.
How do I properly solve that situation? Should I send a pthread_kill(thread, SIGIO) or a pthread_kill(thread, SIGALRM) to break the block? Is either of those even the right signal? Or is there another way to solve this situation and let that child thread exit the blocking read?
Currently a bit puzzled since none of my googling turned up a solution.
This is on Linux and using pthreads.
Edit: I played around a bit with SIGIO and SIGALRM, when I don't install a signal handler they break the blocking IO up, but give a message on the console ("I/O possible") but when I install a signal handler, to avoid that message, they no longer break the blocking IO, so the thread doesn't terminate. So I am kind of back to step one.

The canonical way to do this is with pthread_cancel, where the thread has done pthread_cleanup_push/pop to provide cleanup for any resources it is using.
Unfortunately this can NOT be used in C++ code, ever. Any C++ std lib code, or ANY try {} catch() on the calling stack at the time of pthread_cancel, will potentially segfault, killing your whole process.
The only workaround is to handle SIGUSR1, setting a stop flag, pthread_kill(SIGUSR1), then anywhere the thread is blocked on I/O, if you get EINTR check the stop flag before retrying the I/O. In practice, this does not always succeed on Linux, don't know why.
But in any case it's useless to talk about if you have to call any 3rd party lib, because they will most likely have a tight loop that simply restarts I/O on EINTR. Reverse engineering their file descriptor to close it won't cut it either—they could be waiting on a semaphore or other resource. In this case, it is simply impossible to write working code, period. Yes, this is utterly brain-damaged. Talk to the guys who designed C++ exceptions and pthread_cancel. Supposedly this may be fixed in some future version of C++. Good luck with that.

I too would recommend using a select or some other non-signal-based means of terminating your thread. One of the reasons we have threads is to try and get away from signal madness. That said...
Generally one uses pthread_kill() with SIGUSR1 or SIGUSR2 to send a signal to the thread. The other suggested signals--SIGTERM, SIGINT, SIGKILL--have process-wide semantics that you may not be interested in.
As for the behavior when you sent the signal, my guess is that it has to do with how you handled the signal. If you have no handler installed, the default action of that signal is applied, but in the context of the thread that received the signal. So SIGALRM, for instance, would be "handled" by your thread, but the handling would consist of terminating the process, which is probably not the desired behavior.
Receipt of a signal by the thread will generally break it out of a read with EINTR, unless it is truly in that uninterruptible state as mentioned in an earlier answer. But I think it's not, or your experiments with SIGALRM and SIGIO would not have terminated the process.
Is your read perhaps in some sort of a loop? If the read terminates with -1 return, then break out of that loop and exit the thread.
You can play with this very sloppy code I put together to test out my assumptions--I am a couple of timezones away from my POSIX books at the moment...
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <pthread.h>
#include <signal.h>
/* Written from the signal handler, so it must be volatile sig_atomic_t. */
volatile sig_atomic_t global_gotsig = 0;
/* SA_SIGINFO handlers return void. */
void gotsig(int sig, siginfo_t *info, void *ucontext)
{
    global_gotsig++;
}
void *reader(void *arg)
{
    char buf[32];
    ssize_t i;
    int hdlsig = (int)(intptr_t)arg;
    struct sigaction sa;
    /* sa_handler and sa_sigaction may share storage, so set only one.
       SA_RESTART is deliberately not set, so read() fails with EINTR. */
    sa.sa_sigaction = gotsig;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(hdlsig, &sa, NULL) < 0) {
        perror("sigaction");
        return (void *)-1;
    }
    i = read(fileno(stdin), buf, 32);
    if (i < 0) {
        perror("read");
    } else {
        printf("Read %zd bytes\n", i);
    }
    return (void *)(intptr_t)i;
}
int main(int argc, char **argv)
{
    pthread_t tid1;
    void *ret;
    int i;
    int sig = SIGUSR1;
    if (argc == 2) sig = atoi(argv[1]);
    printf("Using sig %d\n", sig);
    if (pthread_create(&tid1, NULL, reader, (void *)(intptr_t)sig)) {
        perror("pthread_create");
        exit(1);
    }
    sleep(5);
    printf("killing thread\n");
    pthread_kill(tid1, sig);
    /* pthread_join returns an error number; it does not set errno. */
    i = pthread_join(tid1, &ret);
    if (i != 0)
        fprintf(stderr, "pthread_join: error %d\n", i);
    else
        printf("thread returned %ld\n", (long)(intptr_t)ret);
    printf("Got sig? %d\n", (int)global_gotsig);
    return 0;
}

Your select() could have a timeout, even if it is infrequent, in order to exit the thread gracefully on a certain condition. I know, polling sucks...
Another alternative is to have a pipe for each child and add that to the list of file descriptors being watched by the thread. Send a byte to the pipe from the parent when you want that child to exit. No polling at the cost of a pipe per thread.

Old question which could very well get a new answer as things have evolved and a new technology is now available to better handle signals in threads.
Since Linux kernel 2.6.22, the system offers a new function called signalfd() which can be used to open a file descriptor for a given set of Unix signals (outside of those that outright kill a process.)
// defined a set of signals
sigset_t set;
sigemptyset(&set);
sigaddset(&set, SIGUSR1);
// ... you can add more than one ...
// prevent the default signal behavior (very important)
sigprocmask(SIG_BLOCK, &set, nullptr);
// open a file descriptor using that set of Unix signals
f_socket = signalfd(-1, &set, SFD_NONBLOCK | SFD_CLOEXEC);
Now you can use the poll() or select() functions to listen to the signal along the more usual file descriptor (socket, file on disk, etc.) you were listening on.
The NONBLOCK is important if you want a loop that can check signals and other file descriptors over and over again (i.e. it is also important on your other file descriptor).
I have such an implementation that works with (1) timers, (2) sockets, (3) pipes, (4) Unix signals, (5) regular files. Actually, really any file descriptor plus timers.
https://github.com/m2osw/snapcpp/blob/master/snapwebsites/libsnapwebsites/src/snapwebsites/snap_communicator.cpp
https://github.com/m2osw/snapcpp/blob/master/snapwebsites/libsnapwebsites/src/snapwebsites/snap_communicator.h
You may also be interested in libraries such as libevent.

Depends how it's waiting for IO.
If the thread is in the "Uninterruptible IO" state (shown as "D" in top), then there really is absolutely nothing you can do about it. Threads normally only enter this state briefly, doing something such as waiting for a page to be swapped in (or demand-loaded, e.g. from mmap'd file or shared library etc), however a failure (particularly of a NFS server) could cause it to stay in that state for longer.
There is genuinely no way of escaping from this "D" state. The thread will not respond to signals (you can send them, but they will be queued).
If it's a normal IO function such as read(), write() or a waiting function like select() or poll(), signals would be delivered normally.

One solution that occurred to me the last time I had an issue like this was to create a file (eg. a pipe) that existed only for the purpose of waking up blocking threads.
The idea would be to create a file from the main loop (or 1 per thread, as timeout suggests - this would give you finer control over which threads are woken). All of the threads that are blocking on file I/O would do a select(), using the file(s) that they are trying to operate on, as well as the file created by the main loop (as a member of the read file descriptor set). This should make all of the select() calls return.
Code to handle this "event" from the main loop would need to be added to each of the threads.
If the main loop needed to wake up all of the threads it could either write to the file or close it.
I can't say for sure if this works, as a restructure meant that the need to try it vanished.

I think, as you said, the only way would be to send a signal then catch and deal with it appropriately. Alternatives might be SIGTERM, SIGUSR1, SIGQUIT, SIGHUP, SIGINT, etc.
You could also use select() on your input descriptor so that you only read when it is ready. You could use select() with a timeout of, say, one second and then check if that thread should finish.
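A rough sketch of the select()-with-timeout idea, assuming the stop request is a C11 atomic_bool shared with the main thread. The function name read_until_stopped and the choice of ECANCELED to report a stop are assumptions for illustration.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Wait for fd to become readable, waking every interval_ms to check *stop.
 * Returns the result of read() once data is available, or -1 with
 * errno == ECANCELED when the stop flag was set. */
ssize_t read_until_stopped(int fd, void *buf, size_t len,
                           atomic_bool *stop, int interval_ms)
{
    while (!atomic_load(stop)) {
        fd_set rfds;
        struct timeval tv;
        int rc;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        /* select() may modify tv, so it is rebuilt on every pass. */
        tv.tv_sec = interval_ms / 1000;
        tv.tv_usec = (interval_ms % 1000) * 1000;

        rc = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (rc > 0)
            return read(fd, buf, len);
        if (rc == -1 && errno != EINTR)
            return -1;
        /* rc == 0: timeout, loop around and re-check the flag */
    }
    errno = ECANCELED;
    return -1;
}
```

The main thread would set the flag with atomic_store() and then call pthread_join(); the worst-case delay before the thread notices is one interval.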

I always add a "kill" function related to the thread function which I run before join that ensures the thread will be joinable within reasonable time. When a thread uses blocking IO I try to utilize the system to break the lock. For example, when using a socket I would have kill call shutdown(2) or close(2) on it which would cause the network stack to terminate it cleanly.
Linux' socket implementation is thread safe.

I'm surprised that nobody has suggested pthread_cancel. I recently wrote a multi-threaded I/O program, and calling cancel() and then join() afterwards worked just great.
I had originally tried the pthread_kill() but ended up just terminating the entire program with the signals I tested with.

If you're blocking in a third-party library that loops on EINTR, you might want to consider a combination of using pthread_kill with a signal (USR1 etc) calling an empty function (not SIG_IGN) with actually closing/replacing the file descriptor in question. By using dup2 to replace the fd with /dev/null or similar, you'll cause the third-party library to get an end-of-file result when it retries the read.
Note that by dup()ing the original socket first, you can avoid needing to actually close the socket.
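The descriptor swap could look something like the sketch below. The helper name swap_fd_for_devnull is hypothetical. It keeps a duplicate of the original descriptor, as the note above suggests, so the underlying socket or file is not actually closed.

```c
#include <fcntl.h>
#include <unistd.h>

/* Replace fd with /dev/null so that the next read() on it returns
 * end-of-file. Returns a duplicate of the original descriptor (which
 * the caller should eventually close), or -1 on error. */
int swap_fd_for_devnull(int fd)
{
    int saved, nullfd;

    saved = dup(fd);              /* keep the original object alive */
    if (saved == -1)
        return -1;
    nullfd = open("/dev/null", O_RDONLY);
    if (nullfd == -1) {
        close(saved);
        return -1;
    }
    /* dup2() atomically closes fd and makes it refer to /dev/null. */
    if (dup2(nullfd, fd) == -1) {
        close(nullfd);
        close(saved);
        return -1;
    }
    close(nullfd);
    return saved;
}
```

A read() that is already blocked on the old descriptor is not woken by the swap, which is why the answer pairs it with pthread_kill: the signal interrupts the call, and the library's retry then hits /dev/null.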

Signals and threads are a subtle combination on Linux, according to the various man pages.
Do you use LinuxThreads or NPTL (if you are on Linux)?
I am not sure of this, but I think the signal handler affects the whole process, so either you terminate your whole process or everything continues.
You should use timed select or poll, and set a global flag to terminate your thread.

I think the cleanest approach would be to have the thread use a condition variable in its loop to decide whether to continue.
When an I/O event fires, the condition should be signaled.
The main thread could then just signal the condition while changing the loop predicate to false.
Something like:
pthread_mutex_lock(&mutex);
while (!_finished)
{
    /* pthread_cond_wait needs the mutex that protects _finished */
    pthread_cond_wait(&cond, &mutex);
    handleio();
}
pthread_mutex_unlock(&mutex);
cleanup();
Remember to handle wakeups properly with condition variables. They can have things such as 'spurious wakeups', so I would wrap the cond_wait call in a function of your own.
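One way such a wrapper might look is sketched below. The struct stop_gate and the gate_* functions are hypothetical names. The wait re-checks its predicate in a loop, so a spurious wakeup simply goes back to sleep. This only helps if some other thread performs the actual I/O readiness detection and posts the event.

```c
#include <pthread.h>
#include <stdbool.h>

struct stop_gate {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool io_ready;                /* set when an I/O event was posted */
    bool finished;                /* set when the thread should exit */
};

/* Block until an I/O event is posted or shutdown is requested.
 * Returns true if there is I/O to handle, false if the thread should stop. */
bool gate_wait(struct stop_gate *g)
{
    bool ready;

    pthread_mutex_lock(&g->lock);
    while (!g->io_ready && !g->finished)      /* guards against spurious wakeups */
        pthread_cond_wait(&g->cond, &g->lock);
    ready = g->io_ready && !g->finished;
    g->io_ready = false;
    pthread_mutex_unlock(&g->lock);
    return ready;
}

/* Called by whoever detects the I/O event. */
void gate_post_io(struct stop_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->io_ready = true;
    pthread_cond_signal(&g->cond);
    pthread_mutex_unlock(&g->lock);
}

/* Called by the main thread before pthread_join(). */
void gate_finish(struct stop_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->finished = true;
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->lock);
}
```

The worker loop then becomes `while (gate_wait(&gate)) handleio();` followed by cleanup.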

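struct pollfd pfd;
struct timespec deadline;
pfd.fd = socket;
pfd.events = POLLIN | POLLHUP | POLLERR;
pthread_mutex_lock(&lock);
while(thread_alive)
{
    int ret = poll(&pfd, 1, 100);   /* wait up to 100 ms for input */
    if(ret == 1)
    {
        //handle IO
    }
    else
    {
        /* pthread_cond_timedwait takes an absolute deadline: now + 100 ms */
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_nsec += 100 * 1000000L;
        if (deadline.tv_nsec >= 1000000000L) {
            deadline.tv_sec++;
            deadline.tv_nsec -= 1000000000L;
        }
        pthread_cond_timedwait(&cond, &lock, &deadline);
    }
}
pthread_mutex_unlock(&lock);
thread_alive is a thread-specific variable that can be used in combination with the signal to kill the thread.
As for the handle IO section, you need to make sure that you opened the descriptor with the O_NONBLOCK option or, if it's a socket, pass the similar MSG_DONTWAIT flag to recv(). For other fds I'm not sure.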
