how to clear stdout after CTRL - C in linux c - c

We dont want anything to be printed after user interrupt via CTRL-C. We have tried adding __fpurge as well fflush inside sigInt signal handler, but it is not working.
How can I clear buffered stdout values immediately? I have came across few similar thread but no where i could able to find a working solution .
Few additional info's:
Inside sigInt signal handler even after adding exit(0) , buffer content are getting printed but the processor is killed .
added exit(0) to narrow down the issue , i dont want to kill the processor
I know the above is expected behavior , not sure how to avoid it .

Consider this edited example -- edited; this one does not exit the process:
#define _POSIX_C_SOURCE 200809L /* For nanosleep() */
#include <unistd.h>
#include <stdlib.h>
#include <termios.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <errno.h>
#include <time.h>
#include <stdio.h>
static void exit_handler(int signum)
{
int fd, result;
/* If the standard streams are connected to a tty,
* tell the kernel to discard already buffered data.
* (That is, in kernel buffers. Not C library buffers.)
*/
if (isatty(STDIN_FILENO))
tcflush(STDIN_FILENO, TCIOFLUSH);
if (isatty(STDOUT_FILENO))
tcflush(STDOUT_FILENO, TCIOFLUSH);
if (isatty(STDERR_FILENO))
tcflush(STDERR_FILENO, TCIOFLUSH);
/* Redirect standard streams to /dev/null,
* so that nothing further is output.
* This is a nasty thing to do, and a code analysis program
* may complain about this; it is suspicious behaviour.
*/
do {
fd = open("/dev/null", O_RDWR);
} while (fd == -1 && errno == EINTR);
if (fd != -1) {
if (fd != STDIN_FILENO)
do {
result = dup2(fd, STDIN_FILENO);
} while (result == -1 && (errno == EINTR || errno == EBUSY));
if (fd != STDOUT_FILENO)
do {
result = dup2(fd, STDOUT_FILENO);
} while (result == -1 && (errno == EINTR || errno == EBUSY));
if (fd != STDERR_FILENO)
do {
result = dup2(fd, STDERR_FILENO);
} while (result == -1 && (errno == EINTR || errno == EBUSY));
if (fd != STDIN_FILENO && fd != STDOUT_FILENO && fd != STDERR_FILENO)
close(fd);
}
}
static int install_exit_handler(const int signum)
{
struct sigaction act;
memset(&act, 0, sizeof act);
sigemptyset(&act.sa_mask);
act.sa_handler = exit_handler;
act.sa_flags = 0;
if (sigaction(signum, &act, NULL) == -1)
return errno;
return 0;
}
int main(void)
{
if (install_exit_handler(SIGINT)) {
fprintf(stderr, "Cannot install signal handler: %s.\n", strerror(errno));
return EXIT_FAILURE;
}
while (1) {
struct timespec t = { .tv_sec = 0, .tv_nsec = 200000000L };
printf("Output\n");
fflush(stdout);
nanosleep(&t, NULL);
}
/* Never reached. */
return EXIT_SUCCESS;
}
When the process receives a SIGINT signal, it will first flush whatever is in kernel terminal buffer, then redirect the standard streams to /dev/null (i.e., nowhere).
Note that you'll need to kill the process by sending it the TERM or KILL signal (i.e. killall ./yourprogname in another terminal).
When you are running the verbose process over a remote connection, quite a lot of information may be in flight at all times. Both the local machine and the remote machine running the process will have their socket buffers nearly full, so the latency may be much larger than ordinarily -- I've seen several second latencies in this case even on fast (GbE) local networks.
This means that propagating the signal from the local machine to the remote machine will take a measurable time; in worst cases on the order of seconds. Only then will the remote process stop outputting data. All pending data will still have to be transmitted from the remote machine to the local machine, and that may take quite a long time. (Typically, the bottleneck is the terminal itself; in most cases it is faster to minimize the terminal, so that it does not try to render any of the text it receives, only buffers it internally.)
This is why Ctrl+C does not, and cannot, stop remote output instantaneously.
In most cases, you'll be using an SSH connection to the remote machine. The protocol does not have a "purge" feature, either, that might help here. Many, myself included, have thought about it -- at least my sausage fingers have accidentally tabbed to the executable file instead of the similarly named output file, and not only gotten the terminal full of garbage, but the special characters in binary files sometimes set the terminal state (see e.g. xterm control sequences, ANSI escape codes) to something unrecoverable (i.e., Ctrl+Z followed by reset Enter does not reset the terminal back to a working state; if it did, kill -KILL %- ; fg would stop the errant command in Bash, and get you your terminal back), and you need to break the connection, which will also terminate all processes started from the same terminal running remotely in the background.
The solution here is to use a terminal multiplexer, like GNU screen, which allows you to connect to and disconnect from the remote machine, without interrupting an existing terminal connection. (To put it simply, screen is your terminal avatar on the remote machine.)

First up, a quote from the C11 standard, emphasis mine:
7.14.1.1 The signal function
5 If the signal occurs other than as the result of calling the abort or raise function, the behaviour is undefined if [...] the signal handler calls any function in the standard library other than the abort function, the _Exit function, the quick_exit function, or the signal function with the first argumentt equal to the signal number corresponding to the signal that caused the invocation of the handler.
This means calling fflush is undefined behaviour.
Looking at the functions you may call, abort and _Exit both leave the flushing of buffers implementation-defined, and quick_exit calls _Exit, so you are out of luck as far as far as the standard is concerned since I could not find the implementation's definition on their behaviour for Linux. (Surprise. Not.)
The only other "terminating" function, exit, does flush the buffers, and you may not call it from the handler in the first place.
So you have to look at Linux-specific functionality. The man page to _exit makes no statement on buffers. The close man page warns against closing file descriptors that may be in use by system calls from other threads, and states that "it is not common for a filesystem to flush the buffers when the stream is closed", meaning that it could happen (i.e. close not guaranteeing that unwritten buffer contents are actually discarded).
At this point, if I were you, I would ask myself "is this such a good idea after all"...

The problem is that neither Posix nor Linux library declares that fpurge nor __fpurge to be safe in a signal handler function. As explained by DevSolar, C language itsel does not declare many safe functions for standard library (at least _Exit, but Posix explicitely allows close and write. So, you can always close the underlying file descriptor which should be 1:
void handler(int sig) {
static char msg[] = "Interrupted";
write(2, msg, sizeof(msg) - 1); // carefully use stderr here
close(1); // foo is displayed if this line is commented out
_Exit(1);
}
int main() {
signal(SIGINT, handler);
printf("bar");
sleep(15);
return 0;
}
When I type Ctrl-C during the sleep it gives as expected:
$ ./foo
^CInterrupted with 2
$
The close system call should be enough, because as it closes the underlying file descriptor. So even if there are later attemps to flush stdout buffer, they will write on a closed file descriptor as as such have no effect at all. The downside is that stdout has been redirected, the program should store the new value of the underlying file descriptor in a global variable.

If you do kill(getpid(), SIGKILL); with in the signal handler (which is async-safe), you would get killed immediately by the OS (as you wanted to exit(0) anyway). Further output is not to be expected any more.
Only problem: you won't be able to clean up poperly, exit gracefully afterwards in the main thread. If you can afford that...

Related

Why does this alarm() syscall does not interrupt read()?

I have found a problem where it asks to explain the behaviour of the following program:
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
void sig_alrm(int n) {
write(2, "ALARM!\n", 7);
return;
}
int main() {
int fd[2], n;
char message[6], *s;
s = "HELLO\n";
signal(SIGALRM, sig_alrm);
pipe(fd);
if (fork() == 0) {
close(fd[1]);
alarm(3);
while ((n = read(fd[0], message, 6)) > 0);
alarm(0);
exit(0);
}
close(1);
dup(fd[1]);
close(fd[0]);
close(fd[1]);
while (1)
write(1, s, 6);
}
It's basically a parent process with a shared pipe sending HELLO\n constantly via pipe to a child. The child sets up a SIGALRM in three seconds which will be caught by the sig_alrm function, and then proceeds to read indefinitely from the pipe.
If I understand correctly, SIGALRM should interrupt the read() system call, causing it to error out upon arriving, which would in turn cause the child to exit with an instant alarm with default behavior, and the parent to end too due to a SIGPIPE.
The problem is that I attempted to run the code, and both processes continue to read and write from the pipe happily after the SIGALRM arrives to the child.
Is there something I misunderstood from signal behaviors?
The behavior of signal varies from platform to platform, and you should instead use its successor sigaction, which was standardized in POSIX-1.1988.
On your platform and with your build instructions, signal() installs user handlers in a "restartable" fashion, meaning that interruptable system calls will simply be restarted rather than fail with EINTR. Your read is interrupted by the handler, but then resumes. Some platforms and/or build semantics do not do this.
sigaction resolves this ambiguity by providing a flag, SA_RESTART, which controls whether or not interrupted syscalls are, in fact, restarted. It standardizes other historically divergent behaviors, too.
For what it's worth, when I gcc-compile your code on an older Linux system without specifying any feature macros, an strace reveals that signal is, in fact, implemented in terms of a (real-time) sigaction call with SA_RESTART, which explains the behavior you see:
$ strace -fe trace=\!write,read ./so65742182
....
munmap(0x7f8ddffaf000, 70990) = 0
rt_sigaction(SIGALRM, {0x013370, [ALRM], SA_RESTORER|SA_RESTART, ...
# ^^^^^^^^^^
pipe([3, 4])
....

pselect() on Linux does not deliver signals if events are pending

I'm trying to add a signal handler for proper cleanup to my event-driven application.
My signal handler for SIGINT only changes the value of a global flag variable, which is then checked in the main loop. To avoid races, the signal is blocked at all times, except during the pselect() call. This should cause pending signals to be delivered only during the pselect() call, which should be interrupted and fail with EINTR.
This usually works fine, except if there are already events pending on the monitored file descriptors (e.g. under heavy load, when there's always activity on the file descriptors).
This sample program reproduces the problem:
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>
volatile sig_atomic_t stop_requested = 0;
void handle_signal(int sig)
{
// Use write() and strlen() instead of printf(), which is not async-signal-safe
const char * out = "Caught stop signal. Exiting.\n";
size_t len = strlen (out);
ssize_t writelen = write(STDOUT_FILENO, out, len);
assert(writelen == (ssize_t) len);
stop_requested = 1;
}
int main(void)
{
int ret;
// Install signal handler
{
struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sa.sa_handler = handle_signal;
ret = sigaction(SIGINT, &sa, NULL);
assert(ret == 0);
}
// Block SIGINT
sigset_t old_sigmask;
{
sigset_t blocked;
sigemptyset(&blocked);
sigaddset(&blocked, SIGINT);
ret = sigprocmask(SIG_BLOCK, &blocked, &old_sigmask);
assert(ret == 0);
}
ret = raise(SIGINT);
assert(ret == 0);
// Create pipe and write data to it
int pipefd[2];
ret = pipe(pipefd);
assert(ret == 0);
ssize_t writelen = write(pipefd[1], "foo", 3);
assert(writelen == 3);
while (stop_requested == 0)
{
printf("Calling pselect().\n");
fd_set fds;
FD_ZERO(&fds);
FD_SET(pipefd[0], &fds);
struct timespec * timeout = NULL;
int ret = pselect(pipefd[0] + 1, &fds, NULL, NULL, timeout, &old_sigmask);
assert(ret >= 0 || errno == EINTR);
printf("pselect() returned %d.\n", ret);
if (FD_ISSET(pipefd[0], &fds))
printf("pipe is readable.\n");
sleep(1);
}
printf("Event loop terminated.\n");
}
This program installs a handler for SIGINT, then blocks SIGINT, sends SIGINT to itself (which will not be delivered yet because SIGINT is blocked), creates a pipe and writes some data into the pipe, and then monitors the read end of the pipe for readability.
This readability monitoring is done using pselect(), which is supposed to unblock SIGINT, which should then interrupt the pselect() and call the signal handler.
However, on Linux (I tested on 5.6 and 4.19), the pselect() call returns 1 instead and indicates readability of the pipe, without calling the signal handler. Since this test program does not read the data that was written to the pipe, the file descriptor will never cease to be readable, and the signal handler is never called. In real programs, a similar situation might arise under heavy load, where a lot of data might be available for reading on different file descriptors (e.g. sockets).
On the other hand, on FreeBSD (I tested on 12.1), the signal handler is called, and then pselect() returns -1 and sets errno to EINTR. This is what I expected to happen on Linux as well.
Am I misunderstanding something, or am I using these interfaces incorrectly? Or should I just fall back to the old self-pipe trick, which (I believe) would handle this case better?
This is a type of resource starvation caused by always checking for active resources in the same order. When resources are always checked in the same order, if the resources checked first are busy enough the resources checked later may never get any attention.
See What is starvation?.
The Linux implementation of pselect() apparently checks file descriptors before checking for signals. The BSD implementation does the opposite.
For what it's worth, the POSIX documentation for pselect() states:
If none of the selected descriptors are ready for the requested operation, the pselect() or select() function shall block until at least one of the requested operations becomes ready, until the timeout occurs, or until interrupted by a signal.
A strict reading of that description requires checking the descriptors first. If any descriptor is active, pselect() will return that instead of failing with errno set to EINTR.
In that case, if the descriptors are so busy that one is always active, the signal processing gets starved.
The BSD implementation likely starves active descriptors if signals come in too fast.
One common solution is to always process all active resources every time a select() call or similar returns. But you can't do that with your current design that mixes signals with descriptors because pselect() doesn't even get to checking for a pending signal if there are active descriptors. As #Shawn mentioned in the comments, you can map signals to file descriptors using signalfd(). Then add the descriptor from signalfd() to the file descriptor set passed to pselect().

POSIX partial write() and Signal Interrupts

From the man page of write()
Note that a successful write() may transfer fewer than count bytes.
Such partial writes can occur for various reasons; for example,
because there was insufficient space on the disk device to write all
of the requested bytes, or because a blocked write() to a socket,
pipe, or similar was interrupted by a signal handler after it had
transferred some, but before it had transferred all of the requested
bytes. In the event of a partial write, the caller can make another
write() call to transfer the remaining bytes. The subsequent call
will either transfer further bytes or may result in an error (e.g.,
if the disk is now full).
I have the following questions
1) In the case of write() being interrupted by signal handler after a partial transfer, will write() set the errno to EINTR ?
2) If errno is not set, is there a way to identify such an event with out extra piece of code (Like installing signal disposition and setting a flag value to true) ?
Note : The further write() calls are successful in transferring the remaining bytes after the event of signal interrupt.
To answer your individual numbered questions:
errno is only meaningful after one of the standard functions returns a value indicating an error - for write, -1 - and before any other standard function or application code that might clobber it is called. So no, if write returns a short write, errno will not be set to anything meaningful. If it's equal to EINTR, it just happens to be; this is not something meaningful you can interpret.
The way you identify such an event is by the return value being strictly less than the nbytes argument. This doesn't actually tell you the cause of the short write, so it could be something else like running out of space. If you need to know, you need to arrange for the signal handler to inform you. But in almost all cases you don't actually need to know.
Regarding the note, if write is returning the full nbytes after a signal arriving, the signal handler was non-interrupting. This is the default on Linux with any modern libc (glibc, musl, anything but libc5 basically), and it's almost always the right thing. If you actually want interrupting signals you have to install the signal handler with sigaction and the SA_RESTART flag clear. (And conversely if you're installing signal handlers you want to have the normal, reasonable, non-interrupting behavior, for portability you should use sigaction and set the SA_RESTART flag rather than using the legacy function signal).
Let's try it and see:
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <signal.h>
static void handle_sigalrm(int sig) {
}
int main(void) {
struct sigaction act;
memset(&act, 0, sizeof act);
act.sa_handler = handle_sigalrm;
sigaction(SIGALRM, &act, NULL);
int fds[2];
pipe(fds);
int bufsize = fcntl(fds[1], F_GETPIPE_SZ) + 10;
char *buf = calloc(bufsize, 1);
ssize_t written;
printf("will attempt to write %d bytes and EINTR is %d\n", bufsize, EINTR);
alarm(1);
errno = 0;
written = write(fds[1], buf, bufsize);
printf("write returned %td and errno is %d\n", written, errno);
return 0;
}
That program makes a pipe that nothing will ever read from, does a write to it that's bigger than the kernel's buffer, and arranges for a signal handler to run while the write is blocking. On my system, it prints this:
will attempt to write 65546 bytes and EINTR is 4
write returned 65536 and errno is 0
Thus, the answer to "In the case of write() being interrupted by signal handler after a partial transfer, will write() set the errno to EINTR?" is "no, it won't".

Calling kill on a child process with SIGTERM terminates parent process, but calling it with SIGKILL keeps the parent alive

This is a continuation of How to prevent SIGINT in child process from propagating to and killing parent process?
In the above question, I learned that SIGINT wasn't being bubbled up from child to parent, but rather, is issued to the entire foreground process group, meaning I needed to write a signal handler to prevent the parent from exiting when I hit CTRL + C.
I tried to implement this, but here's the problem. Regarding specifically the kill syscall I invoke to terminate the child, if I pass in SIGKILL, everything works as expected, but if I pass in SIGTERM, it also terminates the parent process, showing Terminated: 15 in the shell prompt later.
Even though SIGKILL works, I want to use SIGTERM is because it seems just like a better idea in general from what I've read about it giving the process it's signaling to terminate a chance to clean itself up.
The below code is a stripped down example of what I came up with
#include <stdio.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>
pid_t CHILD = 0;
void handle_sigint(int s) {
(void)s;
if (CHILD != 0) {
kill(CHILD, SIGTERM); // <-- SIGKILL works, but SIGTERM kills parent
CHILD = 0;
}
}
int main() {
// Set up signal handling
char str[2];
struct sigaction sa = {
.sa_flags = SA_RESTART,
.sa_handler = handle_sigint
};
sigaction(SIGINT, &sa, NULL);
for (;;) {
printf("1) Open SQLite\n"
"2) Quit\n"
"-> "
);
scanf("%1s", str);
if (str[0] == '1') {
CHILD = fork();
if (CHILD == 0) {
execlp("sqlite3", "sqlite3", NULL);
printf("exec failed\n");
} else {
wait(NULL);
printf("Hi\n");
}
} else if (str[0] == '2') {
break;
} else {
printf("Invalid!\n");
}
}
}
My educated guess as to why this is happening would be something intercepts the SIGTERM, and kills the entire process group. Whereas, when I use SIGKILL, it can't intercept the signal so my kill call works as expected. That's just a stab in the dark though.
Could someone explain why this is happening?
As I side note, I'm not thrilled with my handle_sigint function. Is there a more standard way of killing an interactive child process?
You have too many bugs in your code (from not clearing the signal mask on the struct sigaction) for anyone to explain the effects you are seeing.
Instead, consider the following working example code, say example.c:
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
/* Child process PID, and atomic functions to get and set it.
* Do not access the internal_child_pid, except using the set_ and get_ functions.
*/
static pid_t internal_child_pid = 0;
static inline void set_child_pid(pid_t p) { __atomic_store_n(&internal_child_pid, p, __ATOMIC_SEQ_CST); }
static inline pid_t get_child_pid(void) { return __atomic_load_n(&internal_child_pid, __ATOMIC_SEQ_CST); }
static void forward_handler(int signum, siginfo_t *info, void *context)
{
const pid_t target = get_child_pid();
if (target != 0 && info->si_pid != target)
kill(target, signum);
}
static int forward_signal(const int signum)
{
struct sigaction act;
memset(&act, 0, sizeof act);
sigemptyset(&act.sa_mask);
act.sa_sigaction = forward_handler;
act.sa_flags = SA_SIGINFO | SA_RESTART;
if (sigaction(signum, &act, NULL))
return errno;
return 0;
}
int main(int argc, char *argv[])
{
int status;
pid_t p, r;
if (argc < 2 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
fprintf(stderr, "\n");
fprintf(stderr, "Usage: %s [ -h | --help ]\n", argv[0]);
fprintf(stderr, " %s COMMAND [ ARGS ... ]\n", argv[0]);
fprintf(stderr, "\n");
return EXIT_FAILURE;
}
/* Install signal forwarders. */
if (forward_signal(SIGINT) ||
forward_signal(SIGHUP) ||
forward_signal(SIGTERM) ||
forward_signal(SIGQUIT) ||
forward_signal(SIGUSR1) ||
forward_signal(SIGUSR2)) {
fprintf(stderr, "Cannot install signal handlers: %s.\n", strerror(errno));
return EXIT_FAILURE;
}
p = fork();
if (p == (pid_t)-1) {
fprintf(stderr, "Cannot fork(): %s.\n", strerror(errno));
return EXIT_FAILURE;
}
if (!p) {
/* Child process. */
execvp(argv[1], argv + 1);
fprintf(stderr, "%s: %s.\n", argv[1], strerror(errno));
return EXIT_FAILURE;
}
/* Parent process. Ensure signals are reflected. */
set_child_pid(p);
/* Wait until the child we created exits. */
while (1) {
status = 0;
r = waitpid(p, &status, 0);
/* Error? */
if (r == -1) {
/* EINTR is not an error. Occurs more often if
SA_RESTART is not specified in sigaction flags. */
if (errno == EINTR)
continue;
fprintf(stderr, "Error waiting for child to exit: %s.\n", strerror(errno));
status = EXIT_FAILURE;
break;
}
/* Child p exited? */
if (r == p) {
if (WIFEXITED(status)) {
if (WEXITSTATUS(status))
fprintf(stderr, "Command failed [%d]\n", WEXITSTATUS(status));
else
fprintf(stderr, "Command succeeded [0]\n");
} else
if (WIFSIGNALED(status))
fprintf(stderr, "Command exited due to signal %d (%s)\n", WTERMSIG(status), strsignal(WTERMSIG(status)));
else
fprintf(stderr, "Command process died from unknown causes!\n");
break;
}
}
/* This is a poor hack, but works in many (but not all) systems.
Instead of returning a valid code (EXIT_SUCCESS, EXIT_FAILURE)
we return the entire status word from the child process. */
return status;
}
Compile it using e.g.
gcc -Wall -O2 example.c -o example
and run using e.g.
./example sqlite3
You'll notice that Ctrl+C does not interrupt sqlite3 -- but then again, it does not even if you were to run sqlite3 directly --; instead, you just see ^C on screen. This is because sqlite3 sets up the terminal in such a way that Ctrl+C does not cause a signal, and is just interpreted as normal input.
You can exit from sqlite3 using the .quit command, or pressing Ctrl+D at the start of a line.
You'll see that the original program will output a Command ... [] line afterwards, before returning you to the command line. Thus, the parent process is not killed/harmed/bothered by the signals.
You can use ps f to look at a tree of your terminal processes, and that way find out the PIDs of the parent and child processes, and send signals to either one to observe what happens.
Note that because SIGSTOP signal cannot be caught, blocked, or ignored, it would be nontrivial to reflect the job control signals (as in when you use Ctrl+Z). For proper job control, the parent process would need to set up a new session and a process group, and temporarily detach from the terminal. That too is quite possible, but a bit beyond the scope here, as it involves quite detailed behaviour of sessions, process groups, and terminals, to manage correctly.
Let's deconstruct the above example program.
The example program itself first installs some signal reflectors, then forks a child process, and that child process executes the command sqlite3. (You can speficy any executable and any parameters strings to the program.)
The internal_child_pid variable, and set_child_pid() and get_child_pid() functions, are used to manage the child process atomically. The __atomic_store_n() and __atomic_load_n() are compiler-provided built-ins; for GCC, see here for details. They avoid the problem of a signal occurring while the child pid is only partially assigned. On some common architectures this cannot occur, but this is intended as a careful example, so atomic accesses are used to ensure only a completely (old or new) value is ever seen. We could avoid using these completely, if we blocked the related signals temporarily during the transition instead. Again, I decided the atomic accesses are simpler, and might be interesting to see in practice.
The forward_handler() function obtains the child process PID atomically, then verifies it is nonzero (that we know we have a child process), and that we are not forwarding a signal sent by the child process (just to ensure we don't cause a signal storm, the two bombarding each other with signals). The various fields in the siginfo_t structure are listed in the man 2 sigaction man page.
The forward_signal() function installs the above handler for the specified signal signum. Note that we first use memset() to clear the entire structure to zeros. Clearing it this way ensures future compatibility, if some of the padding in the structure is converted to data fields.
The .sa_mask field in the struct sigaction is an unordered set of signals. The signals set in the mask are blocked from delivery in the thread that is executing the signal handler. (For the above example program, we can safely say that these signals are blocked while the signal handler is run; it's just that in multithreaded programs, the signals are only blocked in the specific thread that is used to run the handler.)
It is important to use sigemptyset(&act.sa_mask) to clear the signal mask. Simply setting the structure to zero does not suffice, even if it works (probably) in practice on many machines. (I don't know; I haven't even checked. I prefer robust and reliable over lazy and fragile any day!)
The flags used includes SA_SIGINFO because the handler uses the three-argument form (and uses the si_pid field of the siginfo_t). SA_RESTART flag is only there because the OP wished to use it; it simply means that if possible, the C library and the kernel try to avoid returning errno == EINTR error if a signal is delivered using a thread currently blocking in a syscall (like wait()). You can remove the SA_RESTART flag, and add a debugging fprintf(stderr, "Hey!\n"); in a suitable place in the loop in the parent process, to see what happens then.
The sigaction() function will return 0 if there is no error, or -1 with errno set otherwise. The forward_signal() function returns 0 if the forward_handler was assigned successfully, but a nonzero errno number otherwise. Some do not like this kind of return value (they prefer just returning -1 for an error, rather than the errno value itself), but I'm for some unreasonable reason gotten fond of this idiom. Change it if you want, by all means.
Now we get to main().
If you run the program without parameters, or with a single -h or --help parameter, it'll print an usage summary. Again, doing this this way is just something I'm fond of -- getopt() and getopt_long() are more commonly used to parse command-line options. For this kind of trivial program, I just hardcoded the parameter checks.
In this case, I intentionally left the usage output very short. It would really be much better with an additional paragraph about exactly what the program does. These kinds of texts -- and especially comments in the code (explaining the intent, the idea of what the code should do, rather than describing what the code actually does) -- are very important. It's been well over two decades since the first time I got paid to write code, and I'm still learning how to comment -- describe the intent of -- my code better, so I think the sooner one starts working on that, the better.
The fork() part ought to be familiar. If it returns -1, the fork failed (probably due to limits or some such), and it is a very good idea to print out the errno message then. The return value will be 0 in the child, and the child process ID in the parent process.
The execlp() function takes two arguments: the name of the binary file (the directories specified in the PATH environment variable will be used to search for such a binary), as well as an array of pointers to the arguments to that binary. The first argument will be argv[0] in the new binary, i.e. the command name itself.
The execlp(argv[1], argv + 1); call is actually quite simple to parse, if you compare it to the above description. argv[1] names the binary to be executed. argv + 1 is basically equivalent to (char **)(&argv[1]), i.e. it is an array of pointers that start with argv[1] instead of argv[0]. Once again, I'm simply fond of the execlp(argv[n], argv + n) idiom, because it allows one to execute another command specified on the command line without having to worry about parsing a command line, or executing it through a shell (which is sometimes downright undesirable).
The man 7 signal man page explains what happens to signal handlers at fork() and exec(). In short, the signal handlers are inherited over a fork(), but reset to defaults at exec(). Which is, fortunately, exactly what we want, here.
If we were to fork first, and then install the signal handlers, we'd have a window during which the child process already exists, but the parent still has default dispositions (mostly termination) for the signals.
Instead, we could just block these signals using e.g. sigprocmask() in the parent process before forking. Blocking a signal means it is made to "wait"; it will not be delivered until the signal is unblocked. In the child process, the signals could stay blocked, as the signal dispositions are reset to defaults over an exec() anyway. In the parent process, we could then -- or before forking, it does not matter -- install the signal handlers, and finally unblock the signals. This way we would not need the atomic stuff, nor even check if the child pid is zero, since the child pid will be set to its actual value well before any signal can be delivered!
The while loop is basically just a loop around the waitpid() call, until the exact child process we started exits, or something funny happens (the child process vanishes somehow). This loop contains pretty careful error checking, as well as the correct EINTR handing if the signal handlers were to be installed without the SA_RESTART flags.
If the child process we forked exits, we check the exit status and/or reason it died, and print a diagnostic message to standard error.
Finally, the program ends with a horrible hack: instead of returning EXIT_SUCCESS or EXIT_FAILURE, we return the entire status word we obtained with waitpid when the child process exited. The reason I left this in, is because it is sometimes used in practice, when you want to return the same or as similar exit status code as a child process returned with. So, it's for illustration. If you ever find yourself to be in a situation when your program should return the same exit status as a child process it forked and executed, this is still better than setting up machinery to have the process kill itself with the same signal that killed the child process. Just put a prominent comment there if you ever need to use this, and a note in the installation instructions so that those who compile the program on architectures where that might be unwanted, can fix it.

Why isn't write(2) returning EINTR?

I've been reading about EINTR on write(2) etc, and trying to determine whether I need to check for it in my program. As a sanity check, I tried to write a program that would run into it. The program loops forever, writing repeatedly to a file.
Then, in a separate shell, I run:
while true; do pkill -HUP test; done
However, the only output I see from test.c is the .s from the signal handler. Why isn't the SIGHUP causing write(2) to fail?
test.c:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
void hup_handler(int sig)
{
printf(".");
fflush(stdout);
}
int main()
{
struct sigaction act;
act.sa_handler = hup_handler;
act.sa_flags = 0;
sigemptyset(&act.sa_mask);
sigaction(SIGHUP, &act, NULL);
int fd = open("testfile", O_WRONLY);
char* buf = malloc(1024*1024*128);
for (;;)
{
if (lseek(fd, 0, SEEK_SET) == -1)
{
printf("lseek failed: %s\n", strerror(errno));
}
if (write(fd, buf, sizeof(buf)) != sizeof(buf))
{
printf("write failed: %s\n", strerror(errno));
}
}
}
Linux tends to avoid EINTR on writes to/reads from files; see discussion here. While a process is blocking on a disk write, it may be placed in an uninterruptible sleep state (process code D) which indicates that it cannot be interrupted at that time. This depends on the device driver; the online copy of Linux Device Drivers, 3rd Edition is a good reference for how this appears from the kernel side.
You still need to handle EINTR for other platforms which may not behave the same, or for pipes and sockets where EINTR definitely can occur.
Note that you're only writing sizeof(void *) bytes at a time:
char* buf = malloc(1024*1024*128);
if (write(fd, buf, sizeof(buf)) != sizeof(buf))
This should be
const size_t BUF_SIZE = 1024*1024*128;
char* buf = malloc(BUF_SIZE);
if (write(fd, buf, BUF_SIZE) != BUF_SIZE)
There are 2 possibilities:
You're writing very few bytes, since you're misusing the sizeof operator. Thus the write happens instantaneously and it never gets interrupted - you're only writing 4 or 8 bytes at a time
Somehow the syscall gets restarted, as if you applied SA_RESTART to sigaction
In your code, since buf is a pointer, sizeof(buf) yields the size of the pointer on your machine, not the (much bigger) allocated space
If you check the manual page for EINTR
The call was interrupted by a signal before any data was written
Also from the signal(7) manual page:
read(2), readv(2), write(2), writev(2), and ioctl(2) calls on "slow" devices. A "slow" device is one where the I/O call may block for an indefinite time, for example, a terminal, pipe, or socket. (A disk is not a slow device according to this definition.) If an I/O call on a slow device has already transferred some data by the time it is interrupted by a signal handler, then the call will return a success status (normally, the number of bytes transferred).
Taking these two together, if writing to a file on a disk, and write has started to write (even if only one single byte has been written) the return from that write call will be a success.

Resources