C / fork / signals: best practice for closing open sockets in different processes

Hello everyone,
two days ago I was asking about threads and fork. I ended up using fork.
I create a second process; parent and child execute different code, but both end up in a while loop, because one is forever sending packets through a socket and the other is forever listening on a socket. Now I want them to clean up when ctrl-c is pressed, i.e. both should close their open sockets before returning.
I have three files: the first, the main file, creates the processes; the parent code is in the second file, the child code in the third. Some more information (code snippets) can be found here: c / interrupted system call / fork vs. thread
Now my question: where do I have to put the signal handler, or do I have to specify two of them, one for each process? It seems like a simple question, but somehow not for me. I tried different ways, but could only get one of the two to clean up before returning (sorry for my English). The problem for me is that both have to do different things, so one handler wouldn't be enough, right?
struct sigaction new_action;
new_action.sa_handler = termination_handler_1;
sigemptyset(&new_action.sa_mask);
new_action.sa_flags = 0;
sigaction(SIGINT, &new_action, NULL);

/* ... more code here ... */

/* will run until ctrl-c is pressed */
while (keep_going) {
    recvlen = recvfrom(sockfd_in, msg, itsGnMaxSduSize_MIB, 0,
                       (struct sockaddr *) &incoming, &ilen);
    if (recvlen < 0) {
        perror("something went wrong / incoming");
        exit(1);
    }
    buflen = strlen(msg);
    sentlen = ath_sendto(sfd, &athinfo, &addrnwh, &nwh, buflen, msg, &selpv2, &depv);
    if (sentlen == E_ERR) {
        perror("Failed to send network header packet.");
        exit(1);
    }
}

close(sockfd_in);
/* Close network header socket */
gnwh_close(sfd);
/* Terminate network header library */
gnwh_term();
printf("pc2wsu: signal received, closed all sockets and so on!\n");
return 0;
}

void termination_handler_1(int signum)
{
    keep_going = 0;
}
As you can see, handling the signal in my case is just changing the loop condition "keep_going". After exiting the loop, each process should clean up.
Thanks in advance for your help.
nyyrikki

There is no reason to close the sockets. When a process exits (as is the default action for SIGINT), all its file descriptors are closed automatically. Unless you have other essential cleanup to do (like saving data to disk), forget about handling the signal at all. It's almost surely the wrong thing to do.

Your code has a race condition: you test keep_going and then call recvfrom(), but the signal might arrive between the test and the call. That is pretty unlikely, so we will ignore it.
It sounds like the sender and receiver were started by the same process and that process was started from the shell. If you have not done anything special, all three processes will be in the same process group and will all receive SIGINT when you hit ^C. So it would be best for both processes to handle SIGINT if you want to run cleanup code (note that closing FDs isn't a good reason: the FDs are closed automatically when the process exits). If these are TCP sockets between the two, closing one side would eventually cause the other side to close as well (but the sender won't notice until it tries to send again).
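A minimal sketch of that arrangement, reusing the keep_going flag and handler name from the question (the actual socket calls are only hinted at): since signal dispositions are inherited across fork(), installing the handler once before fork() covers both processes, and each process runs its own cleanup after its loop ends.

#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t keep_going = 1;

static void termination_handler_1(int signum)
{
    (void)signum;
    keep_going = 0;                 /* async-signal-safe: only flip the flag */
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = termination_handler_1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                /* no SA_RESTART, so blocking calls return EINTR */
    sigaction(SIGINT, &sa, NULL);

    pid_t pid = fork();
    if (pid == 0) {
        while (keep_going) {
            /* recvfrom(...) here; on EINTR the loop re-checks keep_going */
        }
        /* close(sockfd_in); plus any other child-side cleanup */
    } else if (pid > 0) {
        while (keep_going) {
            /* ath_sendto(...) here */
        }
        /* gnwh_close(sfd); gnwh_term(); plus any other parent-side cleanup */
    }
    return 0;
}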

Related

SigHandler causing program to not terminate

Currently I am trying to create a signal handler that, when it receives a SIGTERM signal, it closes open network sockets and file descriptors.
Here is my SigHandler function
static void SigHandler(int signo)
{
    if (signo == SIGTERM) {
        log_trace("SIGTERM received - handling signal");
        CloseSockets();
        log_trace("SIGTERM received - All sockets closed");
        if (closeFile() == -1)
            log_trace("SIGTERM received - No File associated with XXX open - continuing with shutdown");
        else
            log_trace("SIGTERM received - Closed File Descriptor for XXX - continuing with shutdown");
        log_trace("Gracefully shutting down XXX Service");
    } else {
        log_trace("%d received - incompatible signal", signo);
        return;
    }
    exit(0);
}
This code below sits in main
if (sigemptyset(&set) == SIGEMPTYSET_ERROR) {
    log_error("Signal handling initialization failed");
} else {
    if (sigaddset(&set, SIGTERM) == SIGADDSET_ERROR) {
        log_error("Signal SIGTERM not valid");
    }
    action.sa_flags = 0;
    action.sa_mask = set;
    action.sa_handler = &SigHandler;
    if (sigaction(SIGTERM, &action, NULL) == SIGACTION_ERROR) {
        log_error("SIGTERM handler initialization error");
    }
}
When I send kill -15 PID, nothing happens. The process doesn't terminate, nor does it become a zombie process (not that it should anyway). I do see the traces printing within the SigHandler function however, so I know it is reaching that point in the code. It just seems that when it comes to exit(0), that doesn't work.
When I send SIGKILL (kill -9 PID) it kills the process just fine.
Apologies if this is vague, I'm still quite new to C and UNIX etc so I'm quite unfamiliar with most of how this works at a low level.
Your signal handler routine is conceptually wrong (it does not use only async-signal-safe functions). Read signal(7) and signal-safety(7) carefully to understand why. Your handler could appear to work most of the time and still be undefined behavior.
The usual trick is to set (in your signal handler) some volatile sig_atomic_t variable and test that variable outside of the signal handler.
Another possible trick is the pipe(7) to self trick (the Qt documentation explains it well), with your signal handler just doing a write(2) (which is async-signal-safe) to some global file descriptor obtained by e.g. pipe(2) (or perhaps the Linux specific eventfd(2)...) at program initialization before installing that signal handler.
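A hedged sketch of that self-pipe trick (the names here are made up, not from any particular library):

#include <errno.h>
#include <signal.h>
#include <unistd.h>

static int self_pipe[2];               /* filled by pipe() at program start */

static void on_sigterm(int signo)
{
    int saved_errno = errno;
    (void)signo;
    (void)write(self_pipe[1], "x", 1); /* write(2) is async-signal-safe */
    errno = saved_errno;
}

/* At initialization, before sigaction(SIGTERM, ...):
       pipe(self_pipe);
   Then poll()/select() on self_pipe[0] alongside your sockets; when it becomes
   readable, drain it and run the real (non-async-signal-safe) shutdown code there. */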
A Linux specific way is to use signalfd(2) for SIGTERM and handle that in your own event loop (based upon poll(2)). That trick is conceptually a variant of the pipe to self one. But signalfd has some shortcomings, that a web search will find you easily.
Signals are conceptually hard to use (some view them as a design mistake in Unix), especially in multi-threaded programs.
You might want to read the old ALP book. It has some good explanations related to your issue.
PS. If your system is QNX you should read its documentation.
You should be using _exit from the signal handler instead; this also closes all the files.
Also read (very carefully) Basile's answer and take a long hard look at the list of async safe functions which you are allowed to use in signal handlers.
His advice about just setting a flag and testing it in your code is the best approach if you need to do something that isn't allowed in a signal handler. Note that all blocking POSIX calls can be interrupted by signals, so testing your atomic variable when a blocking call (say, read) fails is a sure way to know whether you have received a signal.
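In code, that test looks roughly like this (stop_requested stands in for your volatile sig_atomic_t flag, and the handler must be installed without SA_RESTART so the blocking call actually returns):

ssize_t n = read(fd, buf, sizeof buf);
if (n < 0 && errno == EINTR) {
    if (stop_requested) {
        /* a termination signal arrived: clean up and leave the loop */
    }
    /* otherwise some other signal interrupted us: just retry the read */
}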

Using signals and sigpipe

I'm working on an assignment that involves writing a program to process data (calculate pi) using fork (processes), signals and select.
I'm working right now on the signals part, and what I think I want to do is use SIGPIPE: if the program catches it, it tries to write to the pipe again. (If a process tries to write to a pipe that has no reader, it is sent SIGPIPE.)
I use fork() in main() to assign each process the same work by sending them to the worker function.
void worker(int id) {
    /* ... (this piece of code is not relevant) ... */
    if (write(pfd[id][1], &c, sizeof(c)) == -1)
        printf("Error occurred: %s\n", strerror(errno));
}
How can I implement signals in this function to catch SIGPIPE and make it write to the pipe again?
Thank you!
Typically, instead of catching SIGPIPE one ignores it, which causes write to fail with EPIPE instead of silently terminating your program.
However: If you are getting a SIGPIPE when you write to a pipe, then do not try again. It will never work. SIGPIPE means that the pipe has no reader -- and if the pipe has no reader now, it will never have a reader. (Think about it this way: how would a pipe with no reader get one? It is impossible!)
Your problem is that you are closing the other end of the pipe. Fix that, and don't worry about SIGPIPE. SIGPIPE is just the symptom.
Edit: There are two questions to answer here. If you can't answer both of these questions, then don't bother handling SIGPIPE.
What would cause my program to receive SIGPIPE? The only way to receive SIGPIPE is for the reading end of the pipe to get closed. This happens if the reading process crashes, or if it is programmed to close the pipe. If you are writing a network server, or communicating with an unknown process, this might be common. However, if you wrote both programs and both run locally, then it probably indicates a programming error.
What would my program do when it catches SIGPIPE? If you are writing a client process that uses a pipe to communicate with a server, then what are you supposed to do with SIGPIPE? You can't try again, and clients usually can't restart the server they're connected to. Just do the sensible, default thing and let SIGPIPE terminate your program. However, if the server is sending data to a client it controls and gets SIGPIPE, it could restart the client. But this might be a very bad idea -- for example, if the client is deterministic, it will just crash again, and you will end up with an infinite loop rather than a simple crash.
So the general maxim here is "Only catch errors you are prepared to handle." Don't catch errors just for the sake of completeness. Just let them crash your program, or cause the operation to fail, and you can go back and debug it later.
Code snippet: This is a snippet of code from one of my projects. If you run it, SIGPIPE will not terminate your process. Instead, write will generate an EPIPE error. If you are writing a network server, then EPIPE is one possible way that a client might suddenly disconnect.
#include <err.h>
#include <signal.h>
#include <string.h>

void
ignore_sigpipe(void)
{
    struct sigaction act;
    int r;

    memset(&act, 0, sizeof(act));
    act.sa_handler = SIG_IGN;
    act.sa_flags = SA_RESTART;
    r = sigaction(SIGPIPE, &act, NULL);
    if (r)
        err(1, "sigaction");
}
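With SIGPIPE ignored like that, the writing side just checks errno; a rough usage sketch (fd, buf and len are placeholders, not names from the project above):

ignore_sigpipe();                      /* once, at program start */

ssize_t n = write(fd, buf, len);
if (n < 0) {
    if (errno == EPIPE) {
        /* the reader is gone: close fd and clean up, don't retry */
    } else {
        /* some other write error */
    }
}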

SIGPOLL (SIGIO) problem: interrupt while executing a handler

I am implementing a client and server that communicate over UDP using sendto() and recvfrom(). I wanted to use SIGPOLL for my client when receiving data from the server. My issue is that while in the handler, another signal arrives and it gets lost. I've read there is a kernel variable (a flag) that gets set that I could check before exiting the handler, but I can't seem to find out which flag that is.
void my_receive_server_data(int sig_num)
{
    /* execute recvfrom() */
}
And in main:
setup_action.sa_handler = my_receive_server_data;
if (sigaction(SIGPOLL, &setup_action, NULL) == -1)
    perror("Sigaction");
if (fcntl(sock, F_SETOWN, getpid()) < 0) {
    perror("fcntl");
}
if (fcntl(sock, F_SETFL, O_RDONLY | FASYNC) < 0) {
    perror("fcntl");
}
Currently, if I do not put a sleep() after sendto() in the server, only the first sendto() is seen by the client (in the handler, as a recvfrom()). Putting in the sleep() fixes the problem, so I strongly suspect the handler only gets run once (the next datagram arrives while my_receive_server_data is still executing).
I would like to know which flag I should check before returning from my_receive_server_data to see whether any input arrived while the handler was executing.
Thank you very much.
It's very hard to create a solid system the way you're trying to. Instead of signals, learn to use blocking call models, and I/O multiplexing with select() or poll().
If you intend to use signals no matter what, try to reduce the code in the signal handler to a minimum -- maybe just sending a byte down a pipe to wake up a sleeping thread or process, or whatever.
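For instance, a sketch of the same client with no SIGPOLL at all, just blocking in poll() on the socket from the question (error handling trimmed; the function name is made up):

#include <errno.h>
#include <poll.h>
#include <stdio.h>

static void receive_loop(int sock)       /* sock: the UDP socket from the question */
{
    struct pollfd pfd;
    pfd.fd = sock;
    pfd.events = POLLIN;

    for (;;) {
        int n = poll(&pfd, 1, -1);       /* block until a datagram (or a signal) arrives */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            perror("poll");
            break;
        }
        if (pfd.revents & POLLIN) {
            /* recvfrom() here; nothing is lost, the kernel queues the datagrams */
        }
    }
}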
The flag you want is SA_NODEFER, passed in sa_flags to sigaction when installing the handler for SIGPOLL; it prevents SIGPOLL from being blocked while your handler is running.

Converting to Multi-Threaded Socket Application

As I am currently doing this project in C only, I've up until this point only used my webserver as a single-threaded application. However, I don't want that anymore! So I have the following code that handles my work.
void BeginListen()
{
    CreateSocket();
    BindSocket();
    ListenOnSocket();

    while ( 1 )
    {
        ProcessConnections();
    }
}
Now I've added fork(); before the start of ProcessConnections(), which helps me allow multiple connections! However, when I add the code for daemonizing the application found in this answer, I've encountered a little problem: using fork() creates a copy of my whole running app, which is the purpose of fork(). So I'd like to solve this problem.
My ProcessConnections() looks like this
void ProcessConnections()
{
    fork();
    addr_size = sizeof(connector);
    connecting_socket = accept(current_socket, (struct sockaddr *)&connector, &addr_size);
    if ( connecting_socket < 0 )
    {
        perror("Accepting sockets");
        exit(-1);
    }
    HandleCurrentConnection(connecting_socket);
    DisposeCurrentConnection();
}
What would I do to simply add a couple of lines above or after connecting_socket = accept(...) in order to make it accept more than one connection at a time? Can I use fork()? But when it comes down to DisposeCurrentConnection(), I want to kill that process and have only the parent process keep running.
I'm not 100% sure what it is that you're trying to do, but off the top of my head, I'd prefer to do the fork after the accept, and simply exit() when you're done. Keep in mind, though, that you need to react to the SIGCHLD signal when the child process exits; otherwise you'll have a ton of zombie processes hanging around, waiting to deliver their exit status to the parent process. C pseudo-code:
for (;;) {
    connecting_socket = accept(server_socket);
    if (connecting_socket < 0)
    {
        if (errno == EINTR)
            continue;
        else
        {
            // handle error
            break;
        }
    }

    if (!(child_pid = fork()))
    {
        // child process, do work with connecting socket
        exit(0);
    }
    else if (child_pid > 0)
    {
        // parent process, keep track of child_pid if necessary.
    }
    else
    {
        // fork failed, unable to service request, send 503 or equivalent.
    }
}
The child_pid is needed (as already mentioned) to kill the child process, but also if you wish to use waitpid to collect the exit status.
Concerning the zombie processes: if you're not interested in what happened to the child, you can install a signal handler for SIGCHLD and just loop on waitpid with -1 until there are no more exited children, like this
while (waitpid(-1, NULL, WNOHANG) > 0)
    /* no loop body */ ;
The waitpid function returns the pid of the child that exited, so if you wish you can correlate this with other information about the connection (if you did keep track of the pid). Keep in mind that accept will probably fail with errno set to EINTR, without a valid connection, if a SIGCHLD is caught, so remember to check for this on accept's return.
EDIT:
Don't forget to check for error conditions, i.e. fork returns -1.
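A sketch of installing that SIGCHLD reaper (a sketch only; whether you add SA_RESTART decides if accept() sees EINTR, as noted above):

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

static void reap_children(int signo)
{
    int saved_errno = errno;            /* waitpid may clobber errno */
    (void)signo;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        /* no loop body */ ;
    errno = saved_errno;
}

/* in main(), before the accept loop: */
struct sigaction sa;
memset(&sa, 0, sizeof sa);
sa.sa_handler = reap_children;
sigemptyset(&sa.sa_mask);
sa.sa_flags = 0;                        /* omit SA_RESTART if you want accept() to return EINTR */
if (sigaction(SIGCHLD, &sa, NULL) < 0)
    perror("sigaction");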
Talking about fork() and threads on unix is not strictly correct. Fork creates a whole new process, which has no shared address space with the parent.
I think you are trying to achieve a process-per-request model, much like a traditional unix web server such as NCSA httpd or Apache 1.x, or possibly build a multi-threaded server with shared global memory:
Process-per-request servers:
When you call fork(), the system creates a clone of the parent process, including file descriptors. This means that you can accept the socket request and then fork. The child process has the socket request, which it can reply to and then terminate.
This is relatively efficient on unix, as the memory of the process is not physically copied - the pages are shared between the process. The system uses a mechanism called copy-on-write to make copies on a page-by-page basis when the child process writes to memory. Thus, the overhead of a process-per-request server on unix is not that great, and many systems use this architecture.
It is better to use the select() function, which lets you listen for and handle several connections in one program. It avoids blocking, whereas forking creates a new address space for a copy of the program, which is less memory-efficient.
select(max_descr, read_set, write_set, exception_set, time_out);
i.e. you can do something like
fd_set read_set;
struct timeval timeout;

listen(server_fd, BACKLOG);

while (1)
{
    FD_ZERO(&read_set);
    FD_SET(server_fd, &read_set);
    timeout.tv_sec = 1;
    timeout.tv_usec = 0;

    if (select(server_fd + 1, &read_set, NULL, NULL, &timeout) > 0)
    {
        if (FD_ISSET(server_fd, &read_set))
        {
            int client = accept(server_fd, NULL, NULL);
            /* handle the client here, or hand it to a thread:
               pthread_create(&tid, NULL, func, &client); */
        }
    }
    /* else: timeout or error */
}
Check the return value of fork(). If it is zero, you are the child process, and you can exit() after doing your work. If it is a positive number then it's the process ID of the newly created process. This can let you kill() the child processes if they are hanging around too long for some reason.
As per my comment, this server is not really multi-threaded, it is multi-process.
If you want a simple way to make it accept multiple connections (and you don't care too much about performance) then you can make it work with inetd. This leaves the work of spawning the processes and being a daemon to inetd, and you just need to write a program that handles and processes a single connection. edit: or if this is a programming exercise for you, you could grab the source of inetd and see how it does it
You can also do what you want to do without either threads or new processes, using select.
Here's an article that explains how to use select (pretty low overhead compared to fork or threads - here's an example of a lightweight web server written this way)
Also if you're not wedded to doing this in C, and C++ is OK, you might consider porting your code to use ACE. That is also a good place to look for design patterns of how to do this as I believe it supports pretty much any connection handling model and is very portable.

How to join a thread that is hanging on blocking IO?

I have a thread running in the background that is reading events from an input device in a blocking fashion. Now, when I exit the application I want to clean up the thread properly, but I can't just run pthread_join() because the thread would never exit due to the blocking IO.
How do I properly solve that situation? Should I send a pthread_kill(thread, SIGIO) or a pthread_kill(thread, SIGALRM) to break the block? Is either of those even the right signal? Or is there another way to solve this situation and let that child thread exit the blocking read?
Currently a bit puzzled since none of my googling turned up a solution.
This is on Linux and using pthreads.
Edit: I played around a bit with SIGIO and SIGALRM. When I don't install a signal handler, they break the blocking IO up, but print a message on the console ("I/O possible"); when I do install a signal handler, to avoid that message, they no longer break the blocking IO, so the thread doesn't terminate. So I am kind of back to step one.
The canonical way to do this is with pthread_cancel, where the thread has done pthread_cleanup_push/pop to provide cleanup for any resources it is using.
Unfortunately this can NOT be used in C++ code, ever. Any C++ standard library code, or ANY try {} catch() on the calling stack at the time of pthread_cancel, can potentially segfault and kill your whole process.
The only workaround is to handle SIGUSR1, setting a stop flag, pthread_kill(SIGUSR1), then anywhere the thread is blocked on I/O, if you get EINTR, check the stop flag before retrying the I/O. In practice, this does not always succeed on Linux; I don't know why.
But in any case it's useless to talk about if you have to call any 3rd-party lib, because they will most likely have a tight loop that simply restarts I/O on EINTR. Reverse-engineering their file descriptor in order to close it won't cut it either; they could be waiting on a semaphore or other resource. In that case it is simply impossible to write working code, period. Yes, this is utterly brain-damaged. Talk to the guys who designed C++ exceptions and pthread_cancel. Supposedly this may be fixed in some future version of C++. Good luck with that.
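A sketch of that SIGUSR1 workaround (all names here are made up, and note the inherent race the answer alludes to: if the signal lands before the thread re-enters read(), nothing wakes it):

#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t stop_flag = 0;

static void wake_handler(int sig) { (void)sig; }   /* exists only to interrupt syscalls */

static void *io_thread(void *arg)
{
    int fd = *(int *)arg;
    char buf[256];

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n < 0 && errno == EINTR) {
            if (stop_flag)
                break;                 /* interrupted on purpose: exit the thread */
            continue;                  /* some other signal: retry */
        }
        if (n <= 0)
            break;                     /* real error or end of file */
        /* ... process n bytes ... */
    }
    return NULL;
}

/* To stop it: install wake_handler for SIGUSR1 with sa_flags = 0 (no SA_RESTART), then
       stop_flag = 1;
       pthread_kill(tid, SIGUSR1);
       pthread_join(tid, NULL);
*/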
I too would recommend using a select or some other non-signal-based means of terminating your thread. One of the reasons we have threads is to try and get away from signal madness. That said...
Generally one uses pthread_kill() with SIGUSR1 or SIGUSR2 to send a signal to the thread. The other suggested signals--SIGTERM, SIGINT, SIGKILL--have process-wide semantics that you may not be interested in.
As for the behavior when you sent the signal, my guess is that it has to do with how you handled it. If you have no handler installed, the default action of that signal is applied, but in the context of the thread that received the signal. So SIGALRM, for instance, would be "handled" by your thread, but the handling would consist of terminating the process, which is probably not the desired behavior.
Receipt of a signal by the thread will generally break it out of a read with EINTR, unless it is truly in that uninterruptible state as mentioned in an earlier answer. But I think it's not, or your experiments with SIGALRM and SIGIO would not have terminated the process.
Is your read perhaps in some sort of a loop? If the read terminates with -1 return, then break out of that loop and exit the thread.
You can play with this very sloppy code I put together to test out my assumptions--I am a couple of timezones away from my POSIX books at the moment...
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

int global_gotsig = 0;

void gotsig(int sig, siginfo_t *info, void *ucontext)
{
    (void)sig; (void)info; (void)ucontext;
    global_gotsig++;
}

void *reader(void *arg)
{
    char buf[32];
    int i;
    int hdlsig = (int)(intptr_t)arg;
    struct sigaction sa;

    sa.sa_sigaction = gotsig;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(hdlsig, &sa, NULL) < 0) {
        perror("sigaction");
        return (void *)-1;
    }
    i = read(fileno(stdin), buf, 32);
    if (i < 0) {
        perror("read");
    } else {
        printf("Read %d bytes\n", i);
    }
    return (void *)(intptr_t)i;
}

int main(int argc, char **argv)
{
    pthread_t tid1;
    void *ret;
    int i;
    int sig = SIGUSR1;

    if (argc == 2) sig = atoi(argv[1]);
    printf("Using sig %d\n", sig);
    if (pthread_create(&tid1, NULL, reader, (void *)(intptr_t)sig)) {
        perror("pthread_create");
        exit(1);
    }
    sleep(5);
    printf("killing thread\n");
    pthread_kill(tid1, sig);
    i = pthread_join(tid1, &ret);
    if (i != 0)
        fprintf(stderr, "pthread_join: error %d\n", i);
    else
        printf("thread returned %ld\n", (long)ret);
    printf("Got sig? %d\n", global_gotsig);
    return 0;
}
Your select() could have a timeout, even if it is infrequent, in order to exit the thread gracefully on a certain condition. I know, polling sucks...
Another alternative is to have a pipe for each child and add that to the list of file descriptors being watched by the thread. Send a byte to the pipe from the parent when you want that child to exit. No polling at the cost of a pipe per thread.
Old question which could very well get a new answer as things have evolved and a new technology is now available to better handle signals in threads.
Since Linux kernel 2.6.22, the system offers a new function called signalfd() which can be used to open a file descriptor for a given set of Unix signals (outside of those that outright kill a process.)
// defined a set of signals
sigset_t set;
sigemptyset(&set);
sigaddset(&set, SIGUSR1);
// ... you can add more than one ...
// prevent the default signal behavior (very important)
sigprocmask(SIG_BLOCK, &set, nullptr);
// open a file descriptor using that set of Unix signals
f_socket = signalfd(-1, &set, SFD_NONBLOCK | SFD_CLOEXEC);
Now you can use the poll() or select() functions to listen to the signal along the more usual file descriptor (socket, file on disk, etc.) you were listening on.
The NONBLOCK is important if you want a loop that can check signals and other file descriptors over and over again (i.e. it is also important on your other file descriptor).
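A hedged sketch of the consuming side, continuing the fragment above (data_fd stands in for whatever other descriptor you are watching; SIGUSR1 matches the set built above):

#include <poll.h>
#include <sys/signalfd.h>
#include <unistd.h>

struct pollfd fds[2];
fds[0].fd = f_socket;                      /* the signalfd opened above */
fds[0].events = POLLIN;
fds[1].fd = data_fd;                       /* your socket / pipe / timerfd (assumed name) */
fds[1].events = POLLIN;

if (poll(fds, 2, -1) > 0) {
    if (fds[0].revents & POLLIN) {
        struct signalfd_siginfo si;
        if (read(f_socket, &si, sizeof si) == (ssize_t)sizeof si && si.ssi_signo == SIGUSR1) {
            /* run the clean-shutdown path here, outside any signal handler */
        }
    }
    if (fds[1].revents & POLLIN) {
        /* normal I/O on data_fd */
    }
}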
I have such an implementation that works with (1) timers, (2) sockets, (3) pipes, (4) Unix signals, (5) regular files. Actually, really any file descriptor plus timers.
https://github.com/m2osw/snapcpp/blob/master/snapwebsites/libsnapwebsites/src/snapwebsites/snap_communicator.cpp
https://github.com/m2osw/snapcpp/blob/master/snapwebsites/libsnapwebsites/src/snapwebsites/snap_communicator.h
You may also be interested in libraries such as libevent.
Depends how it's waiting for IO.
If the thread is in the "uninterruptible IO" state (shown as "D" in top), then there really is absolutely nothing you can do about it. Threads normally only enter this state briefly, while doing something such as waiting for a page to be swapped in (or demand-loaded, e.g. from an mmap'd file or shared library, etc.); however, a failure (particularly of an NFS server) could cause the thread to stay in that state for longer.
There is genuinely no way of escaping from this "D" state. The thread will not respond to signals (you can send them, but they will be queued).
If it's a normal IO function such as read(), write() or a waiting function like select() or poll(), signals would be delivered normally.
One solution that occurred to me the last time I had an issue like this was to create a file (eg. a pipe) that existed only for the purpose of waking up blocking threads.
The idea would be to create a file from the main loop (or 1 per thread, as timeout suggests - this would give you finer control over which threads are woken). All of the threads that are blocking on file I/O would do a select(), using the file(s) that they are trying to operate on, as well as the file created by the main loop (as a member of the read file descriptor set). This should make all of the select() calls return.
Code to handle this "event" from the main loop would need to be added to each of the threads.
If the main loop needed to wake up all of the threads it could either write to the file or close it.
I can't say for sure if this works, as a restructure meant that the need to try it vanished.
I think, as you said, the only way would be to send a signal then catch and deal with it appropriately. Alternatives might be SIGTERM, SIGUSR1, SIGQUIT, SIGHUP, SIGINT, etc.
You could also use select() on your input descriptor so that you only read when it is ready. You could use select() with a timeout of, say, one second and then check if that thread should finish.
I always add a "kill" function paired with the thread function, which I run before the join to ensure the thread will be joinable within a reasonable time. When a thread uses blocking IO I try to utilize the system to break the lock. For example, when using a socket I have the kill function call shutdown(2) or close(2) on it, which causes the network stack to terminate the connection cleanly.
Linux's socket implementation is thread-safe.
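For a socket, a minimal sketch of such a kill helper might look like this (sockfd and stop_flag are assumed names, not from the answer):

#include <signal.h>
#include <sys/socket.h>

/* Called by the thread that wants the worker to stop. */
static void kill_worker(int sockfd, volatile sig_atomic_t *stop_flag)
{
    *stop_flag = 1;
    shutdown(sockfd, SHUT_RDWR);   /* wakes a thread blocked in recv()/accept() on sockfd */
    /* the blocked call now returns 0 or -1; the worker checks *stop_flag,
       leaves its loop, and pthread_join() then completes promptly */
}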
I'm surprised that nobody has suggested pthread_cancel. I recently wrote a multi-threaded I/O program and calling cancel() and the join() afterwards worked just great.
I had originally tried the pthread_kill() but ended up just terminating the entire program with the signals I tested with.
If you're blocking in a third-party library that loops on EINTR, you might want to consider a combination of using pthread_kill with a signal (USR1 etc) calling an empty function (not SIG_IGN) with actually closing/replacing the file descriptor in question. By using dup2 to replace the fd with /dev/null or similar, you'll cause the third-party library to get an end-of-file result when it retries the read.
Note that by dup()ing the original socket first, you can avoid needing to actually close the socket.
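A rough sketch of that combination (fd and tid are assumed names; an empty, non-SIG_IGN handler for SIGUSR1, installed without SA_RESTART, is assumed to be in place already):

#include <fcntl.h>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

static void unstick_library_read(int fd, pthread_t tid)
{
    int spare = dup(fd);               /* optional: keep the real socket usable elsewhere */
    int devnull = open("/dev/null", O_RDONLY);

    if (devnull >= 0) {
        dup2(devnull, fd);             /* further reads on fd now hit /dev/null (EOF) */
        close(devnull);
    }
    pthread_kill(tid, SIGUSR1);        /* the blocked read fails with EINTR, and the
                                          library's retry then sees end-of-file */
    (void)spare;                       /* dup back or close(spare) later as needed */
}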
Signals and threads are a subtle problem on Linux, according to the various man pages.
Do you use LinuxThreads or NPTL (if you are on Linux)?
I am not sure of this, but I think the signal handler affects the whole process, so either you terminate your whole process or everything continues.
You should use a timed select or poll, and set a global flag to terminate your thread.
I think the cleanest approach would have the thread use condition variables in a loop to decide whether to continue.
When an I/O event fires, the condition should be signaled.
The main thread could just signal the condition while changing the loop predicate to false.
something like:
while (!_finished)
{
    pthread_cond_wait(&cond, &mutex);
    handleio();
}
cleanup();
Remember to handle condition variables properly; they are subject to things such as 'spurious wakeups', so I would wrap your own function around the cond_wait call.
struct pollfd pfd;
pfd.fd = socket;
pfd.events = POLLIN | POLLHUP | POLLERR;

pthread_mutex_lock(&lock);
while (thread_alive)
{
    int ret = poll(&pfd, 1, 100);          /* 100 ms timeout */
    if (ret == 1)
    {
        // handle IO
    }
    else
    {
        /* wait up to 100 ms to be woken for shutdown */
        struct timespec ts;
        clock_gettime(CLOCK_REALTIME, &ts);
        ts.tv_nsec += 100 * 1000000L;
        if (ts.tv_nsec >= 1000000000L) { ts.tv_sec++; ts.tv_nsec -= 1000000000L; }
        pthread_cond_timedwait(&cond, &lock, &ts);
    }
}
pthread_mutex_unlock(&lock);
thread_alive is a thread-specific variable that can be used in combination with a signal (or the condition variable above) to kill the thread.
As for the handle-IO section, you need to make sure you opened the descriptor with the O_NONBLOCK option; if it's a socket you can instead pass the MSG_DONTWAIT flag to recv()/send(). For other fds I'm not sure.
