How can waitpid() reap more than one child? - c

In this example from the CSAPP book chap.8:
\#include "csapp.h"
/* WARNING: This code is buggy! \*/
void handler1(int sig)
{
int olderrno = errno;
if ((waitpid(-1, NULL, 0)) < 0)
sio_error("waitpid error");
Sio_puts("Handler reaped child\n");
Sleep(1);
errno = olderrno;
}
int main()
{
int i, n;
char buf[MAXBUF];
if (signal(SIGCHLD, handler1) == SIG_ERR)
unix_error("signal error");
/* Parent creates children */
for (i = 0; i < 3; i++) {
if (Fork() == 0) {
printf("Hello from child %d\n", (int)getpid());
exit(0);
}
}
/* Parent waits for terminal input and then processes it */
if ((n = read(STDIN_FILENO, buf, sizeof(buf))) < 0)
unix_error("read");
printf("Parent processing input\n");
while (1)
;
exit(0);
}
It generates the following output:
......
Hello from child 14073
Hello from child 14074
Hello from child 14075
Handler reaped child
Handler reaped child //more than one child reaped
......
The if block used for waitpid() is used to generate a mistake that waitpid() is not able to reap all children. While I understand that waitpid() is to be put in a while() loop to ensure reaping all children, what I don't understand is that why only one waitpid() call is made, yet was able to reap more than one children(Note in the output more than one child is reaped by handler)? According to this answer: Why does waitpid in a signal handler need to loop?
waitpid() is only able to reap one child.
Thanks!
update:
this is irrelevant, but the handler is corrected in the following way(also taken from the CSAPP book):
void handler2(int sig)
{
int olderrno = errno;
while (waitpid(-1, NULL, 0) > 0) {
Sio_puts("Handler reaped child\n");
}
if (errno != ECHILD)
Sio_error("waitpid error");
Sleep(1);
errno = olderrno;
}
Running this code on my linux computer.

The signal handler you designated runs every time the signal you assigned to it (SIGCHLD in this case) is received. While it is true that waitpid is only executed once per signal receival, the handler still executes it multiple times because it gets called every time a child terminates.
Child n terminates (SIGCHLD), the handler springs into action and uses waitpid to "reap" the just exited child.
Child n+1 terminates and its behaviour follows the same as Child n. This goes on for every child there is.
There is no need to loop it as it gets called only when needed in the first place.
Edit: As pointed out below, the reason as to why the book later corrects it with the intended loop is because if multiple children send their termination signal at the same time, the handler may only end up getting one of them.
signal(7):
Standard signals do not queue. If multiple instances of a
standard signal are generated while that signal is blocked, then
only one instance of the signal is marked as pending (and the
signal will be delivered just once when it is unblocked).
Looping waitpid assures the reaping of all exited children and not just one of them as is the case right now.
Why is looping solving the issue of multiple signals?
Picture this: you are currently inside the handler, handling a SIGCHLD signal you have received and whilst you are doing that, you receive more signals from other children that have terminated in the meantime. These signals cannot queue up. By constantly looping waitpid, you are making sure that even if the handler itself can't deal with the multiple signals being sent, waitpid still picks them up as it's constantly running, rather than only running when the handler activates, which can or can't work as intended depending on whether signals have been merged or not.
waitpid still exits correctly once there are no more children to reap. It is important to understand that the loop is only there to catch signals that are sent when you are already in the signal handler and not during normal code execution as in that case the signal handler will take care of it as normal.
If you are still in doubt, try reading these two answers to your question.
How to make sure that `waitpid(-1, &stat, WNOHANG)` collect all children processes
Why does waitpid in a signal handler need to loop? (first two paragraphs)
The first one uses flags such as WNOHANG, but this only makes waitpid return immediately instead of waiting, if there is no child process ready to be reaped.

Related

Get returned value of a child process without holding parent execution

I need to be able to get the returned value from a child process without having to hold the execution of the parent for it.
Notice the a runtime error could happen in the child process.
Here is my program that I'm trying to make:
//In parent process:
do
{
read memory usage from /proc/ID/status
if(max_child_memory_usage > memory_limit)
{
kill(proc, SIGKILL);
puts("Memory limit exceeded");
return -5; // MLE
}
getrusage(RUSAGE_SELF,&r_usage);
check time and memory consumption
if(memory limit exceeded || time limit exceeded)
{
kill(proc, SIGKILL);
return fail;
}
/*
need to catch the returned value from the child somehow with
this loop working.
Notice the a runtime error could happen in the child process.
*/
while(child is alive);
The waitpid function has an option called WNOHANG which causes it to return immediately if the given child has not yet returned:
pid_t rval;
int status;
do {
...
rval = waitpid(proc, &status, WNOHANG);
} while (rval == 0);
if (rval == proc) {
if (WIFEXITED(status)) {
printf("%d exited normal with status %d\n", WEXITSTATUS(status));
} else {
printf("%d exited abnormally\n");
}
}
See the man page for waitpid for more details on checking various abnormal exit conditions.
The solutions using the WNOHANG flag would work only if you need to check just once for the exit status of the child. If however you would like to procure the exit status when the child exits, no matter how late that is, a better solution is to set a signal handler for the SIGCHLD signal.
When the child process terminates whether normally or abnormally, SIGCHLD will be sent to the parent process. Within this signal handler, you can call wait to reap the exit status of the child.
void child_exit_handler(int signo){
int exit_status;
int pid = wait(&exit_status);
// Do things ...
}
// later in the code, before forking and creating the child
signal(SIGCHLD, child_exit_handler);
Depending on other semantics of your program, you may want to use waitpid instead. (SIGCHLD may also be called if the program was stopped, not terminated. The man page for wait(2) describes macros to check for this.)

How to print parent process when all child exited in C [duplicate]

In my program I am forking (in parallel) child processes in a finite while loop and doing exec on each of them. I want the parent process to resume execution (the point after this while loop ) only after all children have terminated. How should I do that?
i have tried several approaches. In one approach, I made parent pause after while loop and sent some condition from SIGCHLD handler only when waitpid returned error ECHILD(no child remaining) but the problem I am facing in this approach is even before parent has finished forking all processes, retStat becomes -1
void sigchld_handler(int signo) {
pid_t pid;
while((pid= waitpid(-1,NULL,WNOHANG)) > 0);
if(errno == ECHILD) {
retStat = -1;
}
}
**//parent process code**
retStat = 1;
while(some condition) {
do fork(and exec);
}
while(retStat > 0)
pause();
//This is the point where I want execution to resumed only when all children have finished
Instead of calling waitpid in the signal handler, why not create a loop after you have forked all the processes as follows:
while (pid = waitpid(-1, NULL, 0)) {
if (errno == ECHILD) {
break;
}
}
The program should hang in the loop until there are no more children. Then it will fall out and the program will continue. As an additional bonus, the loop will block on waitpid while children are running, so you don't need a busy loop while you wait.
You could also use wait(NULL) which should be equivalent to waitpid(-1, NULL, 0). If there's nothing else you need to do in SIGCHLD, you can set it to SIG_DFL.
I think you should use the waitpid() call. It allows you to wait for "any child process", so if you do that the proper number of times, you should be golden.
If that fails (not sure about the guarantees), you could do the brute-force approach sitting in a loop, doing a waitpid() with the NOHANG option on each of your child PIDs, and then delaying for a while before doing it again.

Why does waitpid in a signal handler need to loop?

I read in an ebook that waitpid(-1, &status, WNOHANG) should be put under a while loop so that if multiple child process exits simultaniously , they are all get reaped.
I tried this concept by creating and terminating 2 child processes at the same time and reaping it by waitpid WITHOUT using loop. And the are all been reaped .
Question is , is it very necessary to put waitpid under a loop ?
#include<stdio.h>
#include<sys/wait.h>
#include<signal.h>
int func(int pid)
{
if(pid < 0)
return 0;
func(pid - 1);
}
void sighand(int sig)
{
int i=45;
int stat, pid;
printf("Signal caught\n");
//while( (
pid = waitpid(-1, &stat, WNOHANG);
//) > 0){
printf("Reaped process %d----%d\n", pid, stat);
func(pid);
}
int main()
{
int i;
signal(SIGCHLD, sighand);
pid_t child_id;
if( (child_id=fork()) == 0 ) //child process
{
printf("Child ID %d\n",getpid());
printf("child exiting ...\n");
}
else
{
if( (child_id=fork()) == 0 ) //child process
{
printf("Child ID %d\n",getpid());
printf("child exiting ...\n");
}
else
{
printf("------------Parent with ID %d \n",getpid());
printf("parent exiting ....\n");
sleep(10);
sleep(10);
}
}
}
Yes.
Okay, I'll elaborate.
Each call to waitpid reaps one, and only one, child. Since you put the call inside the signal handler, there is no guarantee that the second child will exit before you finish executing the first signal handler. For two processes that is okay (the pending signal will be handled when you finish), but for more, it might be that two children will finish while you're still handling another one. Since signals are not queued, you will miss a notification.
If that happens, you will not reap all children. To avoid that problem, the loop recommendation was introduced. If you want to see it happen, try running your test with more children. The more you run, the more likely you'll see the problem.
With that out of the way, let's talk about some other issues.
First, your signal handler calls printf. That is a major no-no. Very few functions are signal handler safe, and printf definitely isn't one. You can try and make your signal handler safer, but a much saner approach is to put in a signal handler that merely sets a flag, and then doing the actual wait call in your main program's flow.
Since your main flow is, typically, to call select/epoll, make sure to look up pselect and epoll_pwait, and to understand what they do and why they are needed.
Even better (but Linux specific), look up signalfd. You might not need the signal handler at all.
Edited to add:
The loop does not change the fact that two signal deliveries are merged into one handler call. What it does do is that this one call handles all pending events.
Of course, once that's the case, you must use WNOHANG. The same artifacts that cause signals to be merged might also cause you to handle an event for which a signal is yet to be delivered.
If that happens, then once your first signal handler exists, it will get called again. This time, however, there will be no pending events (as the events were already extracted by the loop). If you do not specify WNOHANG, your wait block, and the program will be stuck indefinitely.

How can I be sure I'm not losing signals when using pause()?

I'm writing a program that uses fork to create child processes and count them when they're done.
How can I be sure I'm not losing signals?
what will happen if a child sends the signal while the main program still handles the previous signal? is the signal "lost"? how can I avoid this situation?
void my_prog()
{
for(i = 0; i<numberOfDirectChildrenGlobal; ++i) {
pid = fork();
if(pid > 0)//parent
//do parent thing
else if(0 == pid) //child
//do child thing
else
//exit with error
}
while(numberOfDirectChildrenGlobal > 0) {
pause(); //waiting for signal as many times as number of direct children
}
kill(getppid(),SIGUSR1);
exit(0);
}
void sigUsrHandler(int signum)
{
//re-register to SIGUSR1
signal(SIGUSR1, sigUsrHandler);
//update number of children that finished
--numberOfDirectChildrenGlobal;
}
It's recommended to use sigaction instead of signal, but in both cases it won't provide what you need. If a child sends a signal while the previous signal is still being handled, it will become a pending signal, but if more signals are sent they will be discarded (on systems that are not blocking incoming signals, the signals can be delivered before reestablishment of the handler and again resulting in missing signals). There is no workaround for this.
What one usually does is to assume that some signals are missing, and lets the handler take care of exiting children.
In your case, instead of sending a signal from your children, just let the children terminate. Once they terminate, the parent's SIGCHLD handler should be used to reap them. Using waitpid with WNOHANG option ensures that the parent will catch all the children even if they all terminate at the same time.
For example, a SIGCHLD handler that counts the number of exited children can be :
pid_t pid;
while((pid = waitpid(-1, NULL, WNOHANG)) > 0) {
nrOfChildrenHandled++;
}
To avoid this situation you can use the posix real-time signals.
Use sigaction instead of signal to register your handlers, and the delivery of the signals is assured.

Waiting for all child processes before parent resumes execution UNIX

In my program I am forking (in parallel) child processes in a finite while loop and doing exec on each of them. I want the parent process to resume execution (the point after this while loop ) only after all children have terminated. How should I do that?
i have tried several approaches. In one approach, I made parent pause after while loop and sent some condition from SIGCHLD handler only when waitpid returned error ECHILD(no child remaining) but the problem I am facing in this approach is even before parent has finished forking all processes, retStat becomes -1
void sigchld_handler(int signo) {
pid_t pid;
while((pid= waitpid(-1,NULL,WNOHANG)) > 0);
if(errno == ECHILD) {
retStat = -1;
}
}
**//parent process code**
retStat = 1;
while(some condition) {
do fork(and exec);
}
while(retStat > 0)
pause();
//This is the point where I want execution to resumed only when all children have finished
Instead of calling waitpid in the signal handler, why not create a loop after you have forked all the processes as follows:
while (pid = waitpid(-1, NULL, 0)) {
if (errno == ECHILD) {
break;
}
}
The program should hang in the loop until there are no more children. Then it will fall out and the program will continue. As an additional bonus, the loop will block on waitpid while children are running, so you don't need a busy loop while you wait.
You could also use wait(NULL) which should be equivalent to waitpid(-1, NULL, 0). If there's nothing else you need to do in SIGCHLD, you can set it to SIG_DFL.
I think you should use the waitpid() call. It allows you to wait for "any child process", so if you do that the proper number of times, you should be golden.
If that fails (not sure about the guarantees), you could do the brute-force approach sitting in a loop, doing a waitpid() with the NOHANG option on each of your child PIDs, and then delaying for a while before doing it again.

Resources