Problems with signal handling - c

After a fork call, I have a parent process that must send SIGUSR1 or SIGUSR2 (depending on the value of the 'cod' variable) to its child. The child has to install the proper handlers before receiving SIGUSR1 or SIGUSR2. To ensure this, I pause the parent until the child signals that it has finished installing its handlers. The parent is signaled with SIGUSR1, and the handler for this signal is installed before the fork call. However, the parent never seems to return from pause(), which makes me think it never actually calls the SIGUSR1 handler.
[...]
typedef enum{FALSE, TRUE} boolean;
boolean sigusr1setted = FALSE;
boolean sigusr2setted = FALSE;
void
sigusr1_handler0(int signo){
return;
}
void
sigusr1_handler(int signo){
sigusr1setted = TRUE;
}
void
sigusr2_handler(int signo){
sigusr2setted = TRUE;
}
int main(int argc, char *argv[]){
[...]
if(signal(SIGUSR1, sigusr1_handler0) == SIG_ERR){
perror("signal 0 error");
exit(EXIT_FAILURE);
}
pid = fork();
if (pid == 0){
if(signal(SIGUSR1, sigusr1_handler) == SIG_ERR){
perror("signal 1 error");
exit(EXIT_FAILURE);
}
if(signal(SIGUSR2, sigusr2_handler) == SIG_ERR){
perror("signal 2 error");
exit(EXIT_FAILURE);
}
kill(SIGUSR1, getppid()); // wake up parent by signaling him with sigusr1
// Wait for the parent to send the signals...
pause();
if(sigusr1setted){
if(execl("Prog1", "Prog1", (char*)0) < 0){
perror("exec P1 error");
exit(EXIT_FAILURE);
}
}
if(sigusr2setted){
if(execl("Prog2", "Prog2", (char*)0) < 0){
perror("exec P2 error");
exit(EXIT_FAILURE);
}
}
// Shouldn't reach this point : something went wrong...
exit(EXIT_FAILURE);
}else if (pid > 0){
// The father must wake only after the child has done with the handlers installation
pause();
// Never reaches this point ...
if (cod == 1)
kill(SIGUSR1, pid);
else
kill(SIGUSR2, pid);
// Wait for the child to complete..
if(wait(NULL) == -1){
perror("wait 2 error");
exit(EXIT_FAILURE);
}
[...]
}else{
perror("fork 2 error");
exit(EXIT_FAILURE);
}
[...]
exit(EXIT_SUCCESS);
}

Assembling a plausible answer from the comments - so this is Community Wiki from the outset. (If Oli provides an answer, up-vote that instead of this!)
Oli Charlesworth gave what is probably the core of the problem:
I suspect you have produced a race condition in the opposite direction to what you anticipated. The child sent SIGUSR1 to the parent before the parent reached the pause().
ouah noted accurately:
An object shared between the signal handler and the non-handler code (your boolean objects) must have a volatile sig_atomic_t type otherwise the code is undefined.
That said, POSIX allows a little more laxity than standard C does for what can be done inside a signal handler. We might also note that C99 provides <stdbool.h> to define the bool type.
The original poster commented:
I don't see how can I make sure that the parent goes in the pause() call first without using sleep() in the child (which guarantees nothing). Any ideas?
Suggestion: Use usleep() (µ-sleep, or sleep in microseconds), or nanosleep() (sleep in nanoseconds)?
Or use a different synchronization mechanism, such as:
parent process creates FIFO;
fork();
child opens FIFO for writing (blocking until there is a reader);
parent opens FIFO for reading (blocking until there is a writer);
when unblocked because the open() calls return, both processes simply close the FIFO;
the parent removes the FIFO.
Note that there is no data communication between the two processes via the FIFO; the code is simply relying on the kernel to block the processes until there is a reader and a writer, so both processes are ready to go.
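The FIFO steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the answerer's code: the helper name fifo_rendezvous() and the FIFO path passed to it are made up for the example, the child exits instead of pausing and exec'ing, and error handling is kept short.

```c
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: parent and child meet at a FIFO.
 * Returns 0 once both sides got past open(), -1 on error. */
int fifo_rendezvous(const char *path)
{
    int fd, status;
    pid_t pid;

    if (mkfifo(path, 0600) == -1)       /* parent creates the FIFO */
        return -1;

    pid = fork();
    if (pid == -1) {
        unlink(path);
        return -1;
    }

    if (pid == 0) {
        /* Child: install the signal handlers here, then open for
         * writing; open() blocks until the parent opens for reading. */
        fd = open(path, O_WRONLY);
        if (fd == -1)
            _exit(EXIT_FAILURE);
        close(fd);
        /* The real child would now pause() and later exec. */
        _exit(EXIT_SUCCESS);
    }

    /* Parent: open() blocks until the child has opened for writing. */
    fd = open(path, O_RDONLY);
    unlink(path);                       /* the parent removes the FIFO */
    if (fd == -1) {
        kill(pid, SIGKILL);             /* do not leave the child stuck */
        waitpid(pid, NULL, 0);
        return -1;
    }
    close(fd);

    /* At this point the child's handlers are installed, so the
     * parent could safely send SIGUSR1 or SIGUSR2. */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

A caller would invoke it with a path in a writable directory, e.g. fifo_rendezvous("/tmp/sync.fifo"); the path must not already exist, otherwise mkfifo() fails with EEXIST.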
Another possibility is that the parent process could try if (sigusr1setted == FALSE) pause(); (with the parent's own SIGUSR1 handler setting that flag) to reduce the window for the race condition. However, it only reduces the window; it does not guarantee that the race condition cannot occur. That is, Murphy's Law applies and the signal could arrive between the time the test is complete and the time the pause() is executed.
All of this goes to show that signals are not a very good IPC mechanism. They can be used for IPC, but they should seldom actually be used for synchronization.
Incidentally, there's no need to test the return value of any of the exec*() family of functions. If the system call returns, it failed.
And the questioner asked again:
Wouldn't it be better to use POSIX semaphores shared between processes?
Semaphores would certainly be another valid mechanism for synchronizing the two processes. Since I'd certainly have to look at the manual pages for semaphores whereas I can remember how to use FIFOs without looking, I'm not sure that I'd actually use them, but creating and removing a FIFO has its own set of issues so it is not clear that it is in any way 'better' (or 'worse'); just different. It's mkfifo(), open(), close(), unlink() for FIFOs versus sem_open() (or sem_init()), sem_post(), sem_wait(), sem_close(), and maybe sem_unlink() (or sem_destroy()) for semaphores. You might want to think about registering a 'FIFO removal' or 'semaphore cleanup' function with atexit() to make sure the FIFO or semaphore is destroyed under as many circumstances as possible. However, that's probably OTT for a test program.

Related

How can waitpid() reap more than one child?

In this example from the CSAPP book chap.8:
#include "csapp.h"
/* WARNING: This code is buggy! */
void handler1(int sig)
{
int olderrno = errno;
if ((waitpid(-1, NULL, 0)) < 0)
sio_error("waitpid error");
Sio_puts("Handler reaped child\n");
Sleep(1);
errno = olderrno;
}
int main()
{
int i, n;
char buf[MAXBUF];
if (signal(SIGCHLD, handler1) == SIG_ERR)
unix_error("signal error");
/* Parent creates children */
for (i = 0; i < 3; i++) {
if (Fork() == 0) {
printf("Hello from child %d\n", (int)getpid());
exit(0);
}
}
/* Parent waits for terminal input and then processes it */
if ((n = read(STDIN_FILENO, buf, sizeof(buf))) < 0)
unix_error("read");
printf("Parent processing input\n");
while (1)
;
exit(0);
}
It generates the following output:
......
Hello from child 14073
Hello from child 14074
Hello from child 14075
Handler reaped child
Handler reaped child //more than one child reaped
......
The single if-guarded waitpid() call is meant to demonstrate a mistake: called this way, waitpid() is not able to reap all children. I understand that waitpid() should be put in a while() loop to ensure all children are reaped. What I don't understand is why, with only one waitpid() call in the handler, more than one child was reaped (note that the output shows the handler reaping more than one child). According to this answer: Why does waitpid in a signal handler need to loop?
waitpid() is only able to reap one child.
Thanks!
update:
this is irrelevant, but the handler is corrected in the following way(also taken from the CSAPP book):
void handler2(int sig)
{
int olderrno = errno;
while (waitpid(-1, NULL, 0) > 0) {
Sio_puts("Handler reaped child\n");
}
if (errno != ECHILD)
Sio_error("waitpid error");
Sleep(1);
errno = olderrno;
}
Running this code on my linux computer.
The signal handler you designated runs every time the signal you assigned to it (SIGCHLD in this case) is received. While it is true that waitpid is only executed once per signal received, the handler still executes it multiple times because it gets called every time a child terminates.
Child n terminates (SIGCHLD), the handler springs into action and uses waitpid to "reap" the just exited child.
Child n+1 terminates and its behaviour follows the same as Child n. This goes on for every child there is.
There is no need to loop it as it gets called only when needed in the first place.
Edit: As pointed out below, the reason as to why the book later corrects it with the intended loop is because if multiple children send their termination signal at the same time, the handler may only end up getting one of them.
signal(7):
Standard signals do not queue. If multiple instances of a
standard signal are generated while that signal is blocked, then
only one instance of the signal is marked as pending (and the
signal will be delivered just once when it is unblocked).
Looping waitpid assures the reaping of all exited children and not just one of them as is the case right now.
Why is looping solving the issue of multiple signals?
Picture this: you are currently inside the handler, handling a SIGCHLD signal you have received, and whilst you are doing that, you receive more signals from other children that have terminated in the meantime. These signals cannot queue up. By looping over waitpid, you make sure that even if the handler itself is not invoked once per signal, waitpid still picks up every terminated child, rather than reaping only one child per handler invocation, which may or may not work as intended depending on whether signals have been merged or not.
waitpid still exits correctly once there are no more children to reap. It is important to understand that the loop is only there to catch signals that are sent when you are already in the signal handler and not during normal code execution as in that case the signal handler will take care of it as normal.
If you are still in doubt, try reading these two answers to your question.
How to make sure that `waitpid(-1, &stat, WNOHANG)` collect all children processes
Why does waitpid in a signal handler need to loop? (first two paragraphs)
The first one uses flags such as WNOHANG, but this only makes waitpid return immediately instead of waiting, if there is no child process ready to be reaped.

Why does waitpid in a signal handler need to loop?

I read in an ebook that waitpid(-1, &status, WNOHANG) should be put in a while loop so that if multiple child processes exit simultaneously, they all get reaped.
I tried this by creating and terminating 2 child processes at the same time and reaping them with waitpid WITHOUT a loop, and they were both reaped.
The question is: is it really necessary to put waitpid in a loop?
#include<stdio.h>
#include<sys/wait.h>
#include<signal.h>
int func(int pid)
{
if(pid < 0)
return 0;
func(pid - 1);
}
void sighand(int sig)
{
int i=45;
int stat, pid;
printf("Signal caught\n");
//while( (
pid = waitpid(-1, &stat, WNOHANG);
//) > 0){
printf("Reaped process %d----%d\n", pid, stat);
func(pid);
}
int main()
{
int i;
signal(SIGCHLD, sighand);
pid_t child_id;
if( (child_id=fork()) == 0 ) //child process
{
printf("Child ID %d\n",getpid());
printf("child exiting ...\n");
}
else
{
if( (child_id=fork()) == 0 ) //child process
{
printf("Child ID %d\n",getpid());
printf("child exiting ...\n");
}
else
{
printf("------------Parent with ID %d \n",getpid());
printf("parent exiting ....\n");
sleep(10);
sleep(10);
}
}
}
Yes.
Okay, I'll elaborate.
Each call to waitpid reaps one, and only one, child. Since you put the call inside the signal handler, there is no guarantee that the second child will exit before you finish executing the first signal handler. For two processes that is okay (the pending signal will be handled when you finish), but for more, it might be that two children will finish while you're still handling another one. Since signals are not queued, you will miss a notification.
If that happens, you will not reap all children. To avoid that problem, the loop recommendation was introduced. If you want to see it happen, try running your test with more children. The more you run, the more likely you'll see the problem.
With that out of the way, let's talk about some other issues.
First, your signal handler calls printf. That is a major no-no. Very few functions are signal handler safe, and printf definitely isn't one. You can try and make your signal handler safer, but a much saner approach is to install a signal handler that merely sets a flag, and then do the actual wait call in your main program's flow.
Since your main flow is, typically, to call select/epoll, make sure to look up pselect and epoll_pwait, and to understand what they do and why they are needed.
Even better (but Linux specific), look up signalfd. You might not need the signal handler at all.
Edited to add:
The loop does not change the fact that two signal deliveries are merged into one handler call. What it does do is that this one call handles all pending events.
Of course, once that's the case, you must use WNOHANG. The same artifacts that cause signals to be merged might also cause you to handle an event for which a signal is yet to be delivered.
If that happens, then once your first signal handler exits, it will get called again. This time, however, there will be no pending events (as the events were already extracted by the loop). If you do not specify WNOHANG, your wait will block, and the program will be stuck indefinitely.

Calling kill on a child process with SIGTERM terminates parent process, but calling it with SIGKILL keeps the parent alive

This is a continuation of How to prevent SIGINT in child process from propagating to and killing parent process?
In the above question, I learned that SIGINT wasn't being bubbled up from child to parent, but rather, is issued to the entire foreground process group, meaning I needed to write a signal handler to prevent the parent from exiting when I hit CTRL + C.
I tried to implement this, but here's the problem. Regarding specifically the kill syscall I invoke to terminate the child, if I pass in SIGKILL, everything works as expected, but if I pass in SIGTERM, it also terminates the parent process, showing Terminated: 15 in the shell prompt later.
Even though SIGKILL works, I want to use SIGTERM because it seems like a better idea in general: from what I've read, it gives the process being signaled a chance to clean up after itself.
The below code is a stripped down example of what I came up with
#include <stdio.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>
pid_t CHILD = 0;
void handle_sigint(int s) {
(void)s;
if (CHILD != 0) {
kill(CHILD, SIGTERM); // <-- SIGKILL works, but SIGTERM kills parent
CHILD = 0;
}
}
int main() {
// Set up signal handling
char str[2];
struct sigaction sa = {
.sa_flags = SA_RESTART,
.sa_handler = handle_sigint
};
sigaction(SIGINT, &sa, NULL);
for (;;) {
printf("1) Open SQLite\n"
"2) Quit\n"
"-> "
);
scanf("%1s", str);
if (str[0] == '1') {
CHILD = fork();
if (CHILD == 0) {
execlp("sqlite3", "sqlite3", NULL);
printf("exec failed\n");
} else {
wait(NULL);
printf("Hi\n");
}
} else if (str[0] == '2') {
break;
} else {
printf("Invalid!\n");
}
}
}
My educated guess as to why this is happening would be something intercepts the SIGTERM, and kills the entire process group. Whereas, when I use SIGKILL, it can't intercept the signal so my kill call works as expected. That's just a stab in the dark though.
Could someone explain why this is happening?
As a side note, I'm not thrilled with my handle_sigint function. Is there a more standard way of killing an interactive child process?
You have too many bugs in your code (from not clearing the signal mask on the struct sigaction) for anyone to explain the effects you are seeing.
Instead, consider the following working example code, say example.c:
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
/* Child process PID, and atomic functions to get and set it.
* Do not access the internal_child_pid, except using the set_ and get_ functions.
*/
static pid_t internal_child_pid = 0;
static inline void set_child_pid(pid_t p) { __atomic_store_n(&internal_child_pid, p, __ATOMIC_SEQ_CST); }
static inline pid_t get_child_pid(void) { return __atomic_load_n(&internal_child_pid, __ATOMIC_SEQ_CST); }
static void forward_handler(int signum, siginfo_t *info, void *context)
{
const pid_t target = get_child_pid();
if (target != 0 && info->si_pid != target)
kill(target, signum);
}
static int forward_signal(const int signum)
{
struct sigaction act;
memset(&act, 0, sizeof act);
sigemptyset(&act.sa_mask);
act.sa_sigaction = forward_handler;
act.sa_flags = SA_SIGINFO | SA_RESTART;
if (sigaction(signum, &act, NULL))
return errno;
return 0;
}
int main(int argc, char *argv[])
{
int status;
pid_t p, r;
if (argc < 2 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
fprintf(stderr, "\n");
fprintf(stderr, "Usage: %s [ -h | --help ]\n", argv[0]);
fprintf(stderr, " %s COMMAND [ ARGS ... ]\n", argv[0]);
fprintf(stderr, "\n");
return EXIT_FAILURE;
}
/* Install signal forwarders. */
if (forward_signal(SIGINT) ||
forward_signal(SIGHUP) ||
forward_signal(SIGTERM) ||
forward_signal(SIGQUIT) ||
forward_signal(SIGUSR1) ||
forward_signal(SIGUSR2)) {
fprintf(stderr, "Cannot install signal handlers: %s.\n", strerror(errno));
return EXIT_FAILURE;
}
p = fork();
if (p == (pid_t)-1) {
fprintf(stderr, "Cannot fork(): %s.\n", strerror(errno));
return EXIT_FAILURE;
}
if (!p) {
/* Child process. */
execvp(argv[1], argv + 1);
fprintf(stderr, "%s: %s.\n", argv[1], strerror(errno));
return EXIT_FAILURE;
}
/* Parent process. Ensure signals are reflected. */
set_child_pid(p);
/* Wait until the child we created exits. */
while (1) {
status = 0;
r = waitpid(p, &status, 0);
/* Error? */
if (r == -1) {
/* EINTR is not an error. Occurs more often if
SA_RESTART is not specified in sigaction flags. */
if (errno == EINTR)
continue;
fprintf(stderr, "Error waiting for child to exit: %s.\n", strerror(errno));
status = EXIT_FAILURE;
break;
}
/* Child p exited? */
if (r == p) {
if (WIFEXITED(status)) {
if (WEXITSTATUS(status))
fprintf(stderr, "Command failed [%d]\n", WEXITSTATUS(status));
else
fprintf(stderr, "Command succeeded [0]\n");
} else
if (WIFSIGNALED(status))
fprintf(stderr, "Command exited due to signal %d (%s)\n", WTERMSIG(status), strsignal(WTERMSIG(status)));
else
fprintf(stderr, "Command process died from unknown causes!\n");
break;
}
}
/* This is a poor hack, but works in many (but not all) systems.
Instead of returning a valid code (EXIT_SUCCESS, EXIT_FAILURE)
we return the entire status word from the child process. */
return status;
}
Compile it using e.g.
gcc -Wall -O2 example.c -o example
and run using e.g.
./example sqlite3
You'll notice that Ctrl+C does not interrupt sqlite3 -- but then again, it does not even if you were to run sqlite3 directly --; instead, you just see ^C on screen. This is because sqlite3 sets up the terminal in such a way that Ctrl+C does not cause a signal, and is just interpreted as normal input.
You can exit from sqlite3 using the .quit command, or pressing Ctrl+D at the start of a line.
You'll see that the original program will output a Command ... [] line afterwards, before returning you to the command line. Thus, the parent process is not killed/harmed/bothered by the signals.
You can use ps f to look at a tree of your terminal processes, and that way find out the PIDs of the parent and child processes, and send signals to either one to observe what happens.
Note that because SIGSTOP signal cannot be caught, blocked, or ignored, it would be nontrivial to reflect the job control signals (as in when you use Ctrl+Z). For proper job control, the parent process would need to set up a new session and a process group, and temporarily detach from the terminal. That too is quite possible, but a bit beyond the scope here, as it involves quite detailed behaviour of sessions, process groups, and terminals, to manage correctly.
Let's deconstruct the above example program.
The example program itself first installs some signal reflectors, then forks a child process, and that child process executes the command sqlite3. (You can specify any executable and any parameter strings to the program.)
The internal_child_pid variable, and set_child_pid() and get_child_pid() functions, are used to manage the child process atomically. The __atomic_store_n() and __atomic_load_n() are compiler-provided built-ins; for GCC, see the GCC documentation on the __atomic built-ins for details. They avoid the problem of a signal occurring while the child pid is only partially assigned. On some common architectures this cannot occur, but this is intended as a careful example, so atomic accesses are used to ensure only a complete (old or new) value is ever seen. We could avoid using these completely, if we blocked the related signals temporarily during the transition instead. Again, I decided the atomic accesses are simpler, and might be interesting to see in practice.
The forward_handler() function obtains the child process PID atomically, then verifies it is nonzero (that we know we have a child process), and that we are not forwarding a signal sent by the child process (just to ensure we don't cause a signal storm, the two bombarding each other with signals). The various fields in the siginfo_t structure are listed in the man 2 sigaction man page.
The forward_signal() function installs the above handler for the specified signal signum. Note that we first use memset() to clear the entire structure to zeros. Clearing it this way ensures future compatibility, if some of the padding in the structure is converted to data fields.
The .sa_mask field in the struct sigaction is an unordered set of signals. The signals set in the mask are blocked from delivery in the thread that is executing the signal handler. (For the above example program, we can safely say that these signals are blocked while the signal handler is run; it's just that in multithreaded programs, the signals are only blocked in the specific thread that is used to run the handler.)
It is important to use sigemptyset(&act.sa_mask) to clear the signal mask. Simply setting the structure to zero does not suffice, even if it works (probably) in practice on many machines. (I don't know; I haven't even checked. I prefer robust and reliable over lazy and fragile any day!)
The flags used include SA_SIGINFO because the handler uses the three-argument form (and uses the si_pid field of the siginfo_t). SA_RESTART flag is only there because the OP wished to use it; it simply means that if possible, the C library and the kernel try to avoid failing with errno == EINTR if a signal is delivered to a thread currently blocked in a syscall (like wait()). You can remove the SA_RESTART flag, and add a debugging fprintf(stderr, "Hey!\n"); in a suitable place in the loop in the parent process, to see what happens then.
The sigaction() function will return 0 if there is no error, or -1 with errno set otherwise. The forward_signal() function returns 0 if the forward_handler was assigned successfully, but a nonzero errno number otherwise. Some do not like this kind of return value (they prefer just returning -1 for an error, rather than the errno value itself), but I've for some unreasonable reason grown fond of this idiom. Change it if you want, by all means.
Now we get to main().
If you run the program without parameters, or with a single -h or --help parameter, it'll print a usage summary. Again, doing it this way is just something I'm fond of -- getopt() and getopt_long() are more commonly used to parse command-line options. For this kind of trivial program, I just hardcoded the parameter checks.
In this case, I intentionally left the usage output very short. It would really be much better with an additional paragraph about exactly what the program does. These kinds of texts -- and especially comments in the code (explaining the intent, the idea of what the code should do, rather than describing what the code actually does) -- are very important. It's been well over two decades since the first time I got paid to write code, and I'm still learning how to comment -- describe the intent of -- my code better, so I think the sooner one starts working on that, the better.
The fork() part ought to be familiar. If it returns -1, the fork failed (probably due to limits or some such), and it is a very good idea to print out the errno message then. The return value will be 0 in the child, and the child process ID in the parent process.
The execvp() function takes two arguments: the name of the binary file (the directories specified in the PATH environment variable will be used to search for such a binary), as well as an array of pointers to the arguments to that binary. The first of those arguments will be argv[0] in the new binary, i.e. the command name itself.
The execvp(argv[1], argv + 1); call is actually quite simple to parse, if you compare it to the above description. argv[1] names the binary to be executed. argv + 1 is basically equivalent to (char **)(&argv[1]), i.e. it is an array of pointers that start with argv[1] instead of argv[0]. Once again, I'm simply fond of the execvp(argv[n], argv + n) idiom, because it allows one to execute another command specified on the command line without having to worry about parsing a command line, or executing it through a shell (which is sometimes downright undesirable).
The man 7 signal man page explains what happens to signal handlers at fork() and exec(). In short, the signal handlers are inherited over a fork(), but reset to defaults at exec(). Which is, fortunately, exactly what we want, here.
If we were to fork first, and then install the signal handlers, we'd have a window during which the child process already exists, but the parent still has default dispositions (mostly termination) for the signals.
Instead, we could just block these signals using e.g. sigprocmask() in the parent process before forking. Blocking a signal means it is made to "wait"; it will not be delivered until the signal is unblocked. In the child process, the original signal mask should be restored just before the exec(): the signal dispositions are reset to defaults over an exec(), but the set of blocked signals is preserved, so the new program would otherwise start with these signals blocked. In the parent process, we could then install the signal handlers, and finally unblock the signals. This way we would not need the atomic stuff, nor even check if the child pid is zero, since the child pid will be set to its actual value well before any signal can be delivered!
The while loop is basically just a loop around the waitpid() call, until the exact child process we started exits, or something funny happens (the child process vanishes somehow). This loop contains pretty careful error checking, as well as the correct EINTR handling if the signal handlers were to be installed without the SA_RESTART flag.
If the child process we forked exits, we check the exit status and/or reason it died, and print a diagnostic message to standard error.
Finally, the program ends with a horrible hack: instead of returning EXIT_SUCCESS or EXIT_FAILURE, we return the entire status word we obtained with waitpid when the child process exited. The reason I left this in, is because it is sometimes used in practice, when you want to return the same or as similar exit status code as a child process returned with. So, it's for illustration. If you ever find yourself to be in a situation when your program should return the same exit status as a child process it forked and executed, this is still better than setting up machinery to have the process kill itself with the same signal that killed the child process. Just put a prominent comment there if you ever need to use this, and a note in the installation instructions so that those who compile the program on architectures where that might be unwanted, can fix it.

C: fork() inform parent when child process disconnects

I am doing a simple server/client program in C which listens on a network interface and accepts clients. Each client is handled in a forked process.
The goal I have is to let the parent process know, once a client has disconnected from the child process.
Currently my main loop looks like this:
for (;;) {
/* 1. [network] Wait for new connection... (BLOCKING CALL) */
fd_listen[client] = accept(fd_listen[server], (struct sockaddr *)&cli_addr, &clilen);
if (fd_listen[client] < 0) {
perror("ERROR on accept");
exit(1);
}
/* 2. [process] Call socketpair */
if ( socketpair(AF_LOCAL, SOCK_STREAM, 0, fd_comm) != 0 ) {
perror("ERROR on socketpair");
exit(1);
}
/* 3. [process] Call fork */
pid = fork();
if (pid < 0) {
perror("ERROR on fork");
exit(1);
}
/* 3.1 [process] Inside the Child */
if (pid == 0) {
printf("[child] num of clients: %d\n", num_client+1);
printf("[child] pid: %ld\n", (long) getpid());
close(fd_comm[parent]); // Close the parent socket file descriptor
close(fd_listen[server]); // Close the server socket file descriptor
// Tasks that the child process should be doing for the connected client
child_processing(fd_listen[client]);
exit(0);
}
/* 3.2 [process] Inside the Parent */
else {
num_client++;
close(fd_comm[child]); // Close the child socket file descriptor
close(fd_listen[client]); // Close the client socket file descriptor
printf("[parent] num of clients: %d\n", num_client);
while ( (w = waitpid(-1, &status, WNOHANG)) > 0) {
printf("[EXIT] child %d terminated\n", w);
num_client--;
}
}
}/* end of while */
It all works well, the only problem I have is (probably) due to the blocking accept call.
When I connect to the above server, a new child process is created and child_processing is called.
However when I disconnect with that client, the main parent process does not know about it and does NOT output printf("[EXIT] child %d terminated\n", w);
But, when I connect with a second client after the first client has disconnected, the main loop is able to finally process the while ( (w = waitpid(-1, &status, WNOHANG)) > 0) part and tell me that the first client has disconnected.
If there will be only ever one client connecting and disconnecting afterwards, my main parent process will never be able to tell if it has disconnected or not.
Is there any way to tell the parent process that my client already left?
UPDATE
As I am a real beginner with c, it would be nice if you provide some short snippets to your answer so I can actually understand it :-)
Your waitpid usage is not correct. You have a non-blocking call, so if the child has not finished yet, the call returns 0:
waitpid(): on success, returns the process ID of the child whose state
has changed; if WNOHANG was specified and one or more child(ren)
specified by pid exist, but have not yet changed state, then 0 is
returned. On error, -1 is returned.
So you leave the while loop immediately. Of course, the termination is caught later, when the first child has terminated and a second connection makes you run the waitpid loop again.
As you need to have a non-blocking call to wait I can suggest you not to manage termination directly but through SIGCHLD signal that will let you catch termination of any children and then appropriately call waitpid in the handler:
void handler(int signal) {
while (waitpid(...)) { // find an adequate condition and parameters for your needs
}
}
...
struct sigaction act;
act.sa_flags = 0;
sigemptyset(&(act.sa_mask));
act.sa_handler = handler;
sigaction(SIGCHLD,&act,NULL);
... // now ready to receive SIGCHLD when at least one child changes its state
If I understand correctly, you want to be able to service multiple clients at once, and therefore your waitpid call is correct in that it does not block if no child has terminated.
However, the problem you then have is that you need to be able to process asynchronous child termination while waiting for new clients via accept.
Assuming that you're dealing with a POSIXy system, merely having a SIGCHLD handler established and having the signal unmasked (via sigprocmask, though IIRC it is unmasked by default), should be enough to cause accept to fail with EINTR if a child terminates while you are waiting for a new client to connect - and you can then handle EINTR appropriately.
The reason for this is that a SIGCHLD signal will be automatically sent to the parent process when a child process terminates. In general, system calls such as accept will return an error of EINTR ("interrupted") if a signal is received while they are waiting.
However, there would still be a race condition, where a child terminates just before you call accept (i.e. between your existing waitpid loop and the accept call). There are two main possibilities to overcome this:
Do all the child termination processing in your SIGCHLD handler, instead of the main loop. This may not be feasible, however, since there are significant limits to what you are allowed to do within a signal handler. You may not call printf for example (though you may use write).
I do not suggest you go down this path. Although it may seem simpler at first, it is the least flexible option and may prove unworkable later.
Write to one end of a non-blocking pipe in your SIGCHLD signal handler. Within the main loop, instead of calling accept directly, use poll (or select) to look for readiness on both the socket and the read end of the pipe, and handle each appropriately.
On Linux (and OpenBSD; I'm not sure about others) you can use ppoll (man page) to avoid the need to create a pipe. In this case you should leave the signal masked and have it unmasked only during the poll operation; if ppoll fails with EINTR, you know that a signal was received, and you should call waitpid. You still need to set a signal handler for SIGCHLD, but it doesn't need to do anything.
Another option on Linux is to use signalfd (man page) to avoid both the need to create a pipe and set up a signal handler (I think). You should mask the SIGCHLD signal (using sigprocmask) if you use this. When poll (or equivalent) indicates that the signalfd is active, read the signal data from it (which clears the signal) and then call waitpid to reap the child.
On various BSD systems you can use kqueue (OpenBSD man page) instead of poll and watch for signals without needing to establish a signal handler.
On other POSIX systems you may be able to use pselect (documentation) in a similar way to ppoll as described above.
There is also the option of using a library such as libevent to abstract away the OS-specifics.
The Glibc manual has an example of using select. Consult the manual pages for poll, ppoll, pselect for more information about those functions. There is an online book on using Libevent.
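As a rough illustration of the pselect variant mentioned above, here is a minimal sketch. It assumes a single listening descriptor, and the names sigchld_noop, block_sigchld and pselect_wait are hypothetical. SIGCHLD stays blocked except while pselect is waiting, so a child that exits just before the call is not missed: the pending signal is delivered as soon as pselect swaps in the mask, and the call fails with EINTR.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Empty handler: it only has to exist so that SIGCHLD interrupts pselect(). */
static void sigchld_noop(int signo) { (void)signo; }

/* Install the handler and block SIGCHLD. On success, *wait_mask holds the
   mask to hand to pselect(): the previous mask with SIGCHLD removed.
   Returns 0 on success, -1 on error. */
int block_sigchld(sigset_t *wait_mask)
{
    struct sigaction act;
    sigset_t block;

    memset(&act, 0, sizeof act);
    act.sa_handler = sigchld_noop;
    sigemptyset(&act.sa_mask);
    if (sigaction(SIGCHLD, &act, NULL) < 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, wait_mask) < 0)
        return -1;
    return sigdelset(wait_mask, SIGCHLD);
}

/* Wait until listen_fd is readable or a signal arrives.
   Returns 1 if listen_fd is readable, 0 if the wait was interrupted
   (any exited children are reaped and counted in *reaped), -1 on error. */
int pselect_wait(int listen_fd, const sigset_t *wait_mask, int *reaped)
{
    fd_set readfds;

    FD_ZERO(&readfds);
    FD_SET(listen_fd, &readfds);
    /* SIGCHLD can be delivered only while pselect() is waiting. */
    if (pselect(listen_fd + 1, &readfds, NULL, NULL, NULL, wait_mask) < 0) {
        if (errno != EINTR)
            return -1;
        while (waitpid(-1, NULL, WNOHANG) > 0)
            (*reaped)++;
        return 0;
    }
    return 1;
}
```

In the main loop you would call block_sigchld once at startup, then call accept whenever pselect_wait returns 1.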
Rough example for using select, borrowed from Glibc documentation (and modified):
/* Set up a pipe and set signal handler for SIGCHLD */
int pipefd[2]; /* must be a global variable */
pipe(pipefd); /* TODO check for error return */
fcntl(pipefd[1], F_SETFL, O_NONBLOCK); /* set write end non-blocking */
/* signal handler */
void sigchld_handler(int signum)
{
char a = 0; /* write anything, doesn't matter what */
write(pipefd[1], &a, 1);
}
/* set up signal handler */
signal(SIGCHLD, sigchld_handler);
Where you currently have accept, you need to check the status of the server socket and the read end of the pipe:
fd_set set, outset;
/* Initialize the file descriptor set. */
FD_ZERO(&set);
FD_SET(fdlisten[server], &set);
FD_SET(pipefd[0], &set);
for (;;) {
    outset = set; /* select() overwrites its argument, so pass a copy */
    if (select(FD_SETSIZE, &outset, NULL, NULL, NULL /* no timeout */) < 0) {
        if (errno == EINTR) /* needs <errno.h> */
            continue; /* interrupted by a signal: just retry */
        /* TODO handle other errors */
        break;
    }
    if (FD_ISSET(fdlisten[server], &outset)) {
        /* now do accept() etc */
    }
    if (FD_ISSET(pipefd[0], &outset)) {
        /* now read a byte from the pipe, and do waitpid() */
    }
}
Using other mechanisms is generally simpler, so I leave those as an exercise :)

ctrl-c killing my background processes in my shell [duplicate]

I have one simple program that's using Qt Framework.
It uses QProcess to execute RAR and compress some files. In my program I am catching SIGINT and doing something in my code when it occurs:
signal(SIGINT, &unix_handler);
When SIGINT occurs, I check if RAR process is done, and if it isn't I will wait for it ... The problem is that (I think) RAR process also gets SIGINT that was meant for my program and it quits before it has compressed all files.
Is there a way to run RAR process so that it doesn't receive SIGINT when my program receives it?
Thanks
If you are generating the SIGINT with Ctrl+C on a Unix system, then the signal is being sent to the entire process group.
You need to use setpgid or setsid to put the child process into a different process group so that it will not receive the signals generated by the controlling terminal.
[Edit:]
Be sure to read the RATIONALE section of the setpgid page carefully. It is a little tricky to plug all of the potential race conditions here.
To guarantee 100% that no SIGINT will be delivered to your child process, you need to do something like this:
#define CHECK(x) if(!(x)) { perror(#x " failed"); abort(); /* or whatever */ }
/* Block SIGINT. */
sigset_t mask, omask;
sigemptyset(&mask);
sigaddset(&mask, SIGINT);
CHECK(sigprocmask(SIG_BLOCK, &mask, &omask) == 0);
/* Spawn child. */
pid_t child_pid = fork();
CHECK(child_pid >= 0);
if (child_pid == 0) {
/* Child */
CHECK(setpgid(0, 0) == 0);
execl(...);
abort();
}
/* Parent */
if (setpgid(child_pid, child_pid) < 0 && errno != EACCES)
abort(); /* or whatever */
/* Unblock SIGINT */
CHECK(sigprocmask(SIG_SETMASK, &omask, NULL) == 0);
Strictly speaking, every one of these steps is necessary. You have to block the signal in case the user hits Ctrl+C right after the call to fork. You have to call setpgid in the child in case the execl happens before the parent has time to do anything. You have to call setpgid in the parent in case the parent runs and someone hits Ctrl+C before the child has time to do anything.
The sequence above is clumsy, but it does handle 100% of the race conditions.
What are you doing in your handler? There are only certain Qt functions that you can call safely from a Unix signal handler. This page in the documentation identifies which ones they are.
The main problem is that the handler will execute outside of the main Qt event thread. That page also proposes a method to deal with this. I prefer getting the handler to "post" a custom event to the application and handle it that way. I posted an answer describing how to implement custom events here.
Just make the subprocess ignore SIGINT:
child_pid = fork();
if (child_pid == 0) {
/* child process */
signal(SIGINT, SIG_IGN);
execl(...);
}
man sigaction:
During an execve(2), the dispositions of handled signals are reset to the default;
the dispositions of ignored signals are left unchanged.
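To see that an ignored SIGINT really survives the exec, here is a small sketch. It assumes /bin/sh is available, and run_shell is a hypothetical helper, not part of any library. The child sets its SIGINT disposition, then execs a shell command that sends SIGINT to itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run cmd through /bin/sh in a child whose SIGINT disposition is set
   just before the exec. Returns the wait status, or -1 on error. */
int run_shell(const char *cmd, int ignore_sigint)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* child process */
        signal(SIGINT, ignore_sigint ? SIG_IGN : SIG_DFL);
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127); /* reached only if the exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

With run_shell("kill -INT $$; exit 7", 1) the shell should survive its own SIGINT and exit normally with status 7. With ignore_sigint set to 0 it should not get as far as the exit 7.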

Resources