Treating signals correctly inside system() - c

I have been reading Chapter 27, "Program Execution", of "The Linux Programming Interface".
I understand that the author demonstrates how we could implement system() using exec and fork. However, the challenging part is handling signals. In particular, I am confused by the following text:
The first signal to consider is SIGCHLD. Suppose that the program
calling system() is also directly creating children, and has
established a handler for SIGCHLD that performs its own wait(). In
this situation, when a SIGCHLD signal is generated by the termination
of the child created by system(), it is possible that the signal
handler of the main program will be invoked and collect the child’s
status before system() has a chance to call waitpid(). (This is an
example of a race condition.)
The following is the code example without signal handling:
#include <unistd.h>
#include <sys/wait.h>
#include <sys/types.h>
int system(char *command)
{
    int status;
    pid_t childPid;
    switch (childPid = fork())
    {
    case -1: /* Error */
        return -1;
    case 0: /* Child */
        execl("/bin/sh", "sh", "-c", command, (char *) NULL);
        _exit(127); /* If we reach this line, then execl failed */
    default: /* Parent */
        if (waitpid(childPid, &status, 0) == -1)
            return -1;
        else
            return status;
    }
}
I know what a race condition is, but I don't understand the whole scenario the author describes. In particular, I don't understand what "the program calling system()" might be. What is the "main program"? Which process creates the child processes?
Could someone please explain, with examples in C or pseudocode, how the race condition can arise?

You could have a SIGCHLD handler installed that does int ws; wait(&ws);.
If such a SIGCHLD handler is allowed to run in response to a SIGCHLD, it will race with the waitpid done in system, preventing system from successfully retrieving the exit status of the child if the handler wins the race.
For this reason, POSIX prescribes that SIGCHLD be blocked in system.
You could still have races with wait calls done in other signal handlers or other threads, but that would be a design error that POSIX system won't help you with.
#include <unistd.h>
#include <signal.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <errno.h>
#include <stdio.h>
int system(char *command)
{
    int status;
    pid_t childPid;
    switch (childPid = fork())
    {
    case -1: /* Error */
        return -1;
    case 0: /* Child */
        execl("/bin/sh", "sh", "-c", command, (char *) NULL);
        _exit(127); /* If we reach this line, then execl failed */
    default: /* Parent */
        /*usleep(1);*/
        if (waitpid(childPid, &status, 0) == -1)
            return -1;
        else
            return status;
    }
}
/* The handler reaps whichever child is available, including system()'s child */
void sigchld(int Sig){ int er=errno; wait(0); errno=er; }
int main()
{
    /*main program*/
    //main program has a sigchld handler
    struct sigaction sa;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = sigchld;
    sigaction(SIGCHLD, &sa, 0);
    for (;;) {
        //the handler may occasionally steal the child status
        if (0 > system("true") && errno == ECHILD)
            puts("Child status stolen!");
    }
}

Related

How to cancel waitpid if child has no status change?

Disclaimer: Absolute newbie in C, i was mostly using Java before.
In many C beginner tutorials, waitpid is used in process management examples to wait for child processes to finish (or to have a status change, using options like WUNTRACED). However, I couldn't find any information about how to continue if no such status change occurs, triggered either by direct user input or programmatically (e.g. a timeout). So what is a good way to undo waitpid? Something like SIGCONT for stopped processes, but for processes blocked in waitpid instead.
Alternatively if the idea makes no sense, it would be interesting to know why.
How about if I suggest using alarm()? alarm() delivers SIGALRM after the countdown passes (See alarm() man page for more details). But from the signals man page, SIGALRM default disposition is to terminate the process. So, you need to register a signal handler for handling the SIGALRM. Code follows like this...
#include <unistd.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
void sigalrm(int signo)
{
    (void)signo;
    return; // Do nothing !
}
int main()
{
    struct sigaction act, oldact;
    act.sa_handler = sigalrm; // Set the signal handler
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
#ifdef SA_INTERRUPT // If defined, set it to prevent the automatic restart of the system call
    act.sa_flags |= SA_INTERRUPT;
#endif
    sigaction(SIGALRM, &act, &oldact);
    pid_t fk_return = fork();
    if (fk_return == 0) { // Child never returns
        for ( ; ; );
    }
    unsigned int wait_sec = 5;
    alarm(wait_sec); // Request SIGALRM
    time_t start = time(NULL);
    pid_t waited = waitpid(-1, NULL, 0);
    int tmp_errno = (waited == -1) ? errno : 0; // errno is only meaningful on failure; save it before other calls modify it
    time_t end = time(NULL);
    alarm(0); // Clear a pending alarm
    sigaction(SIGALRM, &oldact, NULL);
    if (tmp_errno == EINTR) {
        printf("Child Timeout, waited for %ld sec\n", (long)(end - start));
        kill(fk_return, SIGINT);
        exit(1);
    }
    else if (tmp_errno != 0) // Some other fatal error
        exit(1);
    /* Proceed further */
    return 0;
}
OUTPUT
Child Timeout, waited for 5 sec
Note: You don't need to worry about SIGCHLD because its default disposition is to ignore.
EDIT
For completeness: it is guaranteed that SIGALRM is not delivered to the child. This is from the man page of alarm():
Alarms created by alarm() are preserved across execve(2) and are not inherited by children created via fork(2).
EDIT 2
I don't know why it didn't strike me at first. A simple approach would be to block SIGCHLD and call sigtimedwait() which supports timeout option. The code goes like this...
#include <unistd.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main()
{
    sigset_t sigmask;
    sigemptyset(&sigmask);
    sigaddset(&sigmask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &sigmask, NULL);
    pid_t fk_return = fork();
    if (fk_return == 0) { // Child never returns
        for ( ; ; );
    }
    if (sigtimedwait(&sigmask, NULL, &((struct timespec){5, 0})) < 0) {
        if (errno == EAGAIN) {
            printf("Timeout\n");
            kill(fk_return, SIGINT);
            exit(1);
        }
    }
    waitpid(fk_return, NULL, 0); // Child should have terminated by now.
    /* Proceed further */
    return 0;
}
OUTPUT
Timeout
The third argument to waitpid takes a set of flags. You want to include the WNOHANG flag, which tells waitpid to return immediately if no child process has exited.
After adding this option, you would sit in a loop, sleep for some period of time, and try again if nothing has exited. Repeat until either a child has exited or your timeout has passed.
Waiting for a process to die on a typical Unix system is an absolute PITA. The portable way would be to use various signals to interrupt the wait function: SIGALRM for a timeout, SIGTERM/SIGINT and others for a "user input" event. This relies on process-wide state and thus might not be possible in every program.
The non-portable way would be to use pidfd_open with poll/epoll on Linux, or kqueue with an EVFILT_PROC filter on BSDs.
Note that on Linux this only lets you wait for a process to terminate; you will still have to retrieve its status via waitid with P_PIDFD.
If you still want to mix in "user events", add signalfd to the list of descriptors on Linux or EVFILT_SIGNAL filter of kqueue on BSDs.
Another possible solution is to spawn a "process reaper" thread which is responsible for reaping of all processes and setting some event in a process object of your choice: futex word, eventfd etc. Waiting on such objects can be done with a timeout. This requires everyone to agree to use the same interface for process spawning which might or might not be reasonable. Afaik Java implementations use this strategy.

Return value of waitpid

I am using waitpid(2) to check and mark the status of the processes in my job control program. I am using the WUNTRACED option to catch useful signals like SIGTSTP.
The problem is that when I press CTRL-Z (SIGTSTP), the PID returned by waitpid(2) is correct (>0), but when I kill the child with CTRL-C (SIGINT), the PID returned is -1. Why is that? How can I mark the status of my process then, since waitpid returns an invalid PID and sets errno to ECHILD?
#include <sys/types.h>
#include <stdbool.h>
#include <termios.h>
#include <signal.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
int main(void)
{
    pid_t pid;
    pid_t ret;
    int stat_loc;
    if ((!(pid = fork())))
    {
        execve("/bin/ls", (char *const []){"ls", "-Rl", "/", NULL}, NULL);
    }
    else if (pid > 0)
    {
        signal(SIGINT, SIG_IGN);
        signal(SIGQUIT, SIG_IGN);
        signal(SIGTSTP, SIG_IGN);
        signal(SIGTTIN, SIG_IGN);
        signal(SIGTTOU, SIG_IGN);
        signal(SIGCHLD, SIG_IGN);
        ret = waitpid(-1, &stat_loc, WUNTRACED);
        printf("\nwaitpid returned %d\n", ret);
    }
    return (0);
}
EDIT: Problem solved, see the trick with SIGCHLD when you ignore it.
You're ignoring SIGCHLD:
signal(SIGCHLD, SIG_IGN);
Per POSIX:
Status information for a process shall be generated (made available to
the parent process) when the process stops, continues, or terminates
except in the following case:
If the parent process sets the action for the SIGCHLD signal to SIG_IGN, or if the parent sets the SA_NOCLDWAIT flag for the
SIGCHLD signal action, process termination shall not generate new
status information but shall cause any existing status information for
the process to be discarded.
If you want to wait() on a child process, you can't ignore SIGCHLD.

Unix and Signal Handlers (C)

I'm a student and I am trying to understand signals within a course about Unix programming.
To start, I wanted to test a simple example: a process makes a child and needs a confirmation of the actual creation.
I fork, and within the child I send a SIGUSR1 to the father with
kill(getppid(), SIGUSR1);
Then, within the father, I wrote a pause() call to block the process until a signal is received, and after that I wrote the
sigaction(SIGUSR1, &sa, NULL) check.
Problem is, the signal is sent and the program stops, with no handler execution.
I compile it with
$gcc -Wall -o test test.c
I get no warnings, and the output is
$I am the father
$I am the child
$User defined signal 1
I know I could do this in other ways (with sleep sys call, etc.), but I just want to understand why this code doesn't work.
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
void new_child (int sig){
    write(1, "I made a new child\n", 19); /* length must match the string */
}
int main(){
    pid_t child;
    struct sigaction sa;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sa.sa_handler = new_child;
    switch(child = fork()){
    case -1:
        printf("Error.\n");
        exit(EXIT_FAILURE);
    case 0:
        printf("I am the child\n");
        kill(getppid(), SIGUSR1);
        break;
    default:
        printf("I am the father\n");
        pause();
        if (sigaction(SIGUSR1, &sa, NULL) == -1){
            printf("Signal error.\n");
            exit(EXIT_FAILURE);
        }
        printf("I'm done.");
        kill (child, SIGKILL);
        exit(EXIT_SUCCESS);
    }
}
I know this question has already been asked, but I cannot seem to find a solution that works in my case.
You must install the signal handler at the beginning of the code so that it can execute as soon as SIGUSR1 is received – Anjaneyulu
"You must change the signal at the beginning of the code ..." this restriction is a bit tight. The signal handler needs to be installed, at the latest, just before calling fork().
– alk

Parent process is firing signals at a child process but the child's signal handling needs to be improved

I'm doing some practice questions for an exam, and one of the questions gives two pieces of code called parent.c and child.c. The parent creates a child and fires signals at it, and the child displays a message every time it receives a signal. The child spends the rest of its time printing a message from main. The question is to describe what is wrong with the signal handling in child.c and to rewrite the code to correct it.
I get the general idea of signals but have a lot of difficulty implementing them. I'm not sure if sigprocmask in child.c is working properly. I can't see why you'd put NULL as the last parameter, so maybe that's part of why it's wrong? Could someone please point me in the right direction and give me an idea of what part of the code is wrong and why?
Parent.c
#include <unistd.h>
#include <signal.h>
int
main(int argc, char *argv[])
{
    pid_t pid;
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    sigprocmask(SIG_BLOCK, &set, NULL);
    pid = fork();
    if (pid == 0) {
        execlp("./child", "./child", NULL);
    }
    while (1) {
        kill(pid, SIGUSR1);
    }
    return (0);
}
Child.c
#define _XOPEN_SOURCE 500
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>
static void
handler(int signo)
{
    printf("This is the SIGUSR1 signal handler!\n");
}
int
main(void)
{
    sigset_t set;
    sigemptyset(&set);
    sigset(SIGUSR1, handler);
    sigprocmask(SIG_SETMASK, &set, NULL);
    while (1) {
        printf("This is main()!\n");
    }
    return (0);
}
int sigprocmask(int how, const sigset_t *set, sigset_t *oldset);
The last parameter is used to store the old signal mask. When it's NULL, it means we don't need to store the old signal mask.
Note that you shouldn't use printf in a signal handler, because it's not reentrant, see How to avoid using printf in a signal handler?
Also, the usage of execlp is not strictly correct, because NULL may be defined as a plain 0, which a variadic function receives as an int rather than as a null pointer.
execlp("./child", "./child", NULL);
The last argument should be (char *)0, like this:
execlp("./child", "./child", (char *)0);
As man sigset states sigset is obsolete and you should use sigaction. Here is an example on how to use it.

Test cases in C for WIFSIGNALED, WIFSTOPPED, WIFCONTINUED

I'm playing with waitpid() and signal() and I'm looking for reliable test cases for returning WIFSIGNALED(status) = WIFSTOPPED(status) = WIFCONTINUED (status) = true but can't find any...
Care to tell me how can I make sure those return true so I can debug my code?
Also, a few hints about what signals should I catch with signal() to test those macros would be helpful...
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>
#define NELEMS(x) (sizeof (x) / sizeof (x)[0])
static void testsignaled(void) {
    kill(getpid(), SIGINT);
}
static void teststopped(void) {
    kill(getpid(), SIGSTOP);
}
static void testcontinued(void) {
    kill(getpid(), SIGSTOP);
    /* Busy-work to keep us from exiting before the parent waits.
     * This is a race.
     */
    alarm(1);
    while(1) {}
}
int main(void) {
    void (*test[])(void) = {testsignaled, teststopped, testcontinued};
    pid_t pid[NELEMS(test)];
    int i, status;
    for(i = 0; i < sizeof test / sizeof test[0]; ++i) {
        pid[i] = fork();
        if(0 == pid[i]) {
            test[i]();
            return 0;
        }
    }
    /* Pause to let the child processes do their thing.
     * This is a race.
     */
    sleep(1);
    /* Observe the stoppage of the third process and continue it. */
    wait4(pid[2], &status, WUNTRACED, 0);
    kill(pid[2], SIGCONT);
    /* Wait for the child processes. */
    for(i = 0; i < NELEMS(test); ++i) {
        wait4(pid[i], &status, WCONTINUED | WUNTRACED, 0);
        printf("%d%s%s%s\n", i,
               WIFCONTINUED(status) ? " CONTINUED" : "",
               WIFSIGNALED(status) ? " SIGNALED" : "",
               WIFSTOPPED(status) ? " STOPPED" : "");
    }
    return 0;
}
Handling WIFSIGNALED is easy. The child process can commit suicide with the kill() system call. You can also check for core dumps - some signals create them (SIGQUIT, IIRC); some signals do not (SIGINT).
Handling WIFSTOPPED may be harder. The simple step to try is for the child to send itself SIGSTOP with the kill() system call again. Actually, I think that should work. Note that you may want to check on SIGTTIN, SIGTTOU and SIGTSTP - I believe they count for WIFSTOPPED. (There's also a chance that SIGSTOP only works sanely when sent by a debugger to a process it is running via the non-POSIX system call, ptrace().)
Handling WIFCONTINUED is something that I think the parent has to do; after you detect a process has been stopped, your calling code should make it continue by sending it a SIGCONT signal (kill() again). The child can't deliver this itself; it has been stopped. Again, I'm not sure whether there are extra wrinkles to worry about - probably.
A framework something like the below will allow you check the results of the wait() and waitpid() calls.
pid_t pid = fork();
if (pid == 0) {
    /* child */
    sleep(200);
}
else {
    /* parent */
    kill(pid, SIGSTOP);
    /* do wait(), waitpid() stuff */
}
You do not actually have to catch the signals (using signal() or a related function) that are sent. signal() installs a handler that overrides the default behavior for the specific signal - so if you want to check for a signal terminating your process, pick one that has that default behavior - "man -s7 signal" will give you details of each signal's default behavior.
For the macros you have mentioned use SIGSTOP for WIFSTOPPED(status), SIGCONT for WIFCONTINUED (status) and SIGINT for WIFSIGNALED(status)
If you want more flexibility for testing, you could use kill (see "man kill") to send signals to your process. kill -l will list all the signals that can be sent.
In your tests, can you fork() and send a specific signal to your child processes? In this scenario, your child processes are the test cases.
EDIT
My answer is about coding a test in C. You fork, get the pid of your child process (the process
with the signal handlers installed), and then you can send a signal to it using kill(2).
In this way you can test the exit status.

Resources