When a pipe is established between two processes, is it more error prone for them to have a brother-to-brother relationship rather than a father-child one?
The question arose while I was studying the code example below:
#include <stdlib.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h> /* pipe, fork, dup2, close, execvp */
void runpipe(int pfd[]);
int
main(int argc, char **argv)
{
int pid, status;
int fd[2];
pipe(fd);
switch (pid = fork())
{
case 0: /* child */
runpipe(fd);
exit(0);
default: /* parent */
while ((pid = wait(&status)) != -1) {
fprintf(stderr, "process %d exits with %d\n", pid, WEXITSTATUS(status));
exit(0);
}
case -1:
perror("fork");
exit(1);
}
exit(0);
}
char *cmd1[] = { "ls", "-al", "/", 0 };
char *cmd2[] = { "tr", "a-z", "A-Z", 0 };
void
runpipe(int pfd[])
{
int pid;
switch (pid = fork())
{
case 0: /* child */
dup2(pfd[0], 0);
close(pfd[1]); /* the child does not need this end of the pipe */
execvp(cmd2[0], cmd2);
perror(cmd2[0]);
exit(1); /* exec failed: do not fall through into the parent's case */
default: /* parent */
dup2(pfd[1], 1);
close(pfd[0]); /* the parent does not need this end of the pipe */
execvp(cmd1[0], cmd1);
perror(cmd1[0]);
exit(1); /* exec failed: do not fall through into the error case */
case -1:
perror("fork");
exit(1);
}
}
In the example above, the parent (grandpa) forks a child (dad), which then forks another child (grandson). Grandpa waits for dad, but dad does not wait for the grandson because both of them call execvp. What happens if child finishes earlier than dad (zombie) or dad finishes earlier than child (orphan) ? On the other hand, if we had two brothers connected by the pipe and one father waiting for both of them (three processes in total), then even if both brothers called execvp, one's exit would not harm the other.
When a pipe is established between two processes, is it more error prone for them to have a brother-to-brother relationship rather than a father-child one?
As far as the pipe is concerned, everything depends on the I/O operations that each process performs. If the process at the read end tries to read data that the process at the other end is not prepared to write, then it will block until the writer writes or exits. In the latter case, once every copy of the write end has been closed, the read will return whatever data remains (possibly less than requested) and then 0 to signal end-of-file.
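To make the blocking and end-of-file behaviour concrete, here is a minimal sketch. The function name read_until_writer_exits is hypothetical (it is not from the question), and the sketch assumes the message is small enough to fit in the pipe buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a writer that sends `len` bytes of `msg` and
 * exits, then reads the pipe until end-of-file. Returns the number of bytes
 * read, or -1 on error. *last_read receives the result of the final read(). */
ssize_t read_until_writer_exits(const char *msg, size_t len, ssize_t *last_read)
{
    int fd[2];
    char buf[256];
    ssize_t n, total = 0;
    pid_t pid;

    assert(msg != NULL && last_read != NULL);
    if (pipe(fd) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {                     /* writer */
        close(fd[0]);
        if (write(fd[1], msg, len) < 0)
            _exit(1);
        _exit(0);                       /* exiting closes the writer's end */
    }

    close(fd[1]);   /* the reader must drop its own copy of the write end,
                       otherwise end-of-file can never be reported */
    while ((n = read(fd[0], buf, sizeof buf)) > 0)
        total += n;                     /* blocks until data or EOF arrives */
    *last_read = n;                     /* 0 means EOF, -1 means error */
    close(fd[0]);
    waitpid(pid, NULL, 0);              /* collect the writer */
    return n < 0 ? -1 : total;
}
```

The final read() returns 0 only because no process still holds the write end open, which is why the reader closes fd[1] before it starts reading.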
What happens if child finishes earlier than dad (zombie) or dad finishes earlier than child (orphan) ?
If the father calls an exec() function after forking a child and before collecting it via wait() or waitpid(), as in the example code, then it is unlikely ever to wait on the child.
Regardless, child and dad each become zombies when they terminate. This is true of the child whether or not it is orphaned first. If dad never collects child (as it won't in your example), then once dad terminates, the child (whether live or zombie) is inherited by process 1 (init), which can be relied upon to clean up all its zombie children. Similarly, if grandpa never collects dad, then init eventually will.
Under certain circumstances it is possible for zombies to build up uncollected. This is a form of resource leak, but it will ultimately be cleaned up when the zombies are inherited by init. That is slightly exacerbated by the grandpa -> parent -> child topology you've set up, but I wouldn't characterize it as "error prone."
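For comparison, the "two brothers and one father" layout from the question could be sketched roughly as follows. The names spawn and run_sibling_pipeline are hypothetical, and the sketch assumes the father has nothing else to do but wait for both children.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork one child that execs argv. in_fd and out_fd are
 * installed as stdin and stdout; -1 means "leave unchanged". */
static pid_t spawn(char *const argv[], int in_fd, int out_fd, int pfd[2])
{
    pid_t pid = fork();
    if (pid == 0) {
        if (in_fd != -1)
            dup2(in_fd, STDIN_FILENO);
        if (out_fd != -1)
            dup2(out_fd, STDOUT_FILENO);
        close(pfd[0]);                  /* the duplicates are enough */
        close(pfd[1]);
        execvp(argv[0], argv);
        perror(argv[0]);
        _exit(127);                     /* exec failed */
    }
    return pid;                         /* child's pid, or -1 on failure */
}

/* Runs `left | right` as two siblings and waits for both of them.
 * Returns 0 on success and stores the raw wait statuses, -1 on failure. */
int run_sibling_pipeline(char *const left[], char *const right[],
                         int *left_status, int *right_status)
{
    int pfd[2];
    pid_t lpid, rpid;

    assert(left && right && left_status && right_status);
    if (pipe(pfd) == -1)
        return -1;

    lpid = spawn(left, -1, pfd[1], pfd);
    rpid = spawn(right, pfd[0], -1, pfd);

    close(pfd[0]);                      /* the father uses neither end */
    close(pfd[1]);

    if (lpid == -1 || rpid == -1) {
        if (lpid > 0) waitpid(lpid, NULL, 0);
        if (rpid > 0) waitpid(rpid, NULL, 0);
        return -1;
    }
    if (waitpid(lpid, left_status, 0) == -1)
        return -1;
    if (waitpid(rpid, right_status, 0) == -1)
        return -1;
    return 0;
}
```

Because the father survives both children and waits for each, neither child is left as a long-lived zombie or as an orphan, and both exit statuses are available to the caller.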
What happens if child finishes earlier than dad (zombie)...
It will be a zombie process. Once the parent finishes without waiting on the child, the child will be re-parented to init. init will then wait on the child, retrieving its exit code and allowing its process-table entry to finally be removed.
...or dad finishes earlier than child (orphan) ?
Orphaned processes are re-parented to init. The process will then be the same as above.
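The re-parenting can be observed from inside the orphan with getppid(). The following is only a sketch: observe_reparenting is a hypothetical name, and the new parent is usually pid 1 but does not have to be (on Linux a "subreaper" process can adopt orphans instead), so the code only reports what it sees.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: grandpa forks dad, dad forks a child and exits at
 * once without waiting. The orphan polls getppid() until it changes and
 * reports the new value through a pipe. Returns 0 on success, -1 on error. */
int observe_reparenting(pid_t *dad_pid, pid_t *new_ppid)
{
    int fd[2];
    pid_t dad;
    ssize_t n;

    assert(dad_pid != NULL && new_ppid != NULL);
    if (pipe(fd) == -1)
        return -1;

    dad = fork();
    if (dad == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (dad == 0) {
        pid_t me = getpid();
        pid_t child = fork();
        if (child == 0) {               /* the future orphan */
            struct timespec ts = {0, 1000000L};     /* 1 ms */
            pid_t pp;
            int tries = 5000;
            close(fd[0]);
            while ((pp = getppid()) == me && tries-- > 0)
                nanosleep(&ts, NULL);
            if (write(fd[1], &pp, sizeof pp) != (ssize_t)sizeof pp)
                _exit(1);
            _exit(0);
        }
        _exit(child == -1 ? 1 : 0);     /* dad exits, never waits */
    }

    close(fd[1]);
    *dad_pid = dad;
    waitpid(dad, NULL, 0);              /* grandpa collects dad only */
    n = read(fd[0], new_ppid, sizeof *new_ppid);
    close(fd[0]);
    return n == (ssize_t)sizeof *new_ppid ? 0 : -1;
}
```

Grandpa cannot wait for the orphan, since it is not grandpa's child; whoever adopted it is responsible for collecting it.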
Related
I want to catch all child processes forked by a parent process, then collect the last child's exit status. To that end, I called sigsuspend() to wait for a SIGCHLD signal. When I receive the SIGCHLD signal, then the handler will call waitpid in a loop until it indicates there are no children left to reap. The exit status will be set, and the main will break out of the loop and terminate.
However, I noticed that this is not correct, as all the children aren't always reaped. How can I fix this?
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <sys/wait.h>
volatile sig_atomic_t exit_stat;
// Signal Handler
void sigchld_handler(int sig) {
pid_t pid;
int status;
while(1) {
pid = waitpid(-1, &status, WNOHANG);
if(pid <= 0) {break;}
if(WIFEXITED(status)) {
printf("%s", "Exited correctly.");
}
else {
printf("%s", "Bad exit.");
}
}
exit_stat = status;
}
// Executing code.
int main() {
signal(SIGCHLD, sigchld_handler);
sigset_t mask_child;
sigset_t old_mask;
sigemptyset(&mask_child);
sigaddset(&mask_child, SIGCHLD);
sigprocmask(SIG_BLOCK, &mask_child, &old_mask);
for(int i = 0; i < 5; i++) {
int child_pid = fork();
if(child_pid == 0) { /* only the child should exec */
//Perform execvp call.
char* argv[] = {"echo", "hi", NULL};
execvp(argv[0], argv);
}
}
while(!exit_stat) {
sigsuspend(&old_mask);
}
return 0;
}
Transferring lightly modified comments into an answer.
The WNOHANG option to waitpid() means "return immediately if there are no children left, OR if there are children left but they're still running". If you really want to wait for all children to exit, either omit the WNOHANG option to waitpid() or simply use wait() instead. Note that if there were tasks launched in the background, they may not terminate for a very long time, if ever. It also depends on the context whether 'the last child to die' is the correct one to report on. It is possible to imagine scenarios where that is not appropriate.
You're right, in this instance, I meant that "the last child to die" is the last child that was forked. Can I fix this by adding a simple condition to check if the returned pid of wait == the pid of the last forked child?
If you're interested in the last child in the most recent pipeline (e.g. ls | grep … | sort … | wc and you want to wait for wc), then you know the PID for wc, and you can use waitpid(wc_pid, &status, 0) to wait for that process specifically to die. Or you can use your loop to collect bodies until you either find the body of wc or get 'no dead processes left'. At that point, you can decide to wait specifically for the wc PID, or (better) use waitpid() without WNOHANG (or use wait()) until some process dies — and again you can decide whether it was wc or not, and if not, repeat the WNOHANG corpse collection process to collect any zombies. Repeat until you do find the corpse of wc.
And also, you said that background tasks may not terminate for a long time. By this, do you mean that waitpid(-1, &status, 0) will completely suspend all processes until a child is ready to be reaped?
waitpid(-1, &status, 0); will make the parent process wait indefinitely until some child process dies, or it will return because there are no children left to wait for (which indicates there was a housekeeping error; children should not die without the parent knowing).
Note that using a 'wait for any child' loop avoids leaving zombies around (children that have died but not been waited for). This is generally a good idea. But capturing when the child you're currently interested in dies ensures that your shell doesn't hang around waiting when it wasn't necessary. So, you need to capture both the PID and the exit status of the dead child processes.
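One possible repair of the sigsuspend() loop from the question is to count live children instead of testing the exit status, since a status of 0 would never end the original loop. This is a sketch under assumptions: run_and_reap and the static variable names are hypothetical, the children simply exit with a known code instead of calling execvp, and a wait status is assumed to fit in sig_atomic_t.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t live_children;
static volatile sig_atomic_t target_status;
static volatile pid_t target_pid;   /* written only while SIGCHLD is blocked */

static void on_sigchld(int sig)
{
    int saved_errno = errno;
    int status;
    pid_t pid;

    (void)sig;
    /* One SIGCHLD may stand for several dead children, so loop. */
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        if (pid == target_pid)
            target_status = status;     /* status of the last forked child */
        live_children--;
    }
    errno = saved_errno;
}

/* Forks n children (child i exits with code i + 1), reaps all of them, and
 * returns the raw wait status of the last child forked, or -1 on failure. */
int run_and_reap(int n)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;
    int i;

    assert(n > 0);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, &old_sa) == -1)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old_mask);
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGCHLD);     /* mask used while suspended */

    live_children = 0;
    target_status = -1;
    for (i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(i + 1);
        if (pid > 0) {
            live_children++;
            target_pid = pid;
        }
    }

    while (live_children > 0)
        sigsuspend(&wait_mask);         /* unblocks SIGCHLD only while asleep */

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGCHLD, &old_sa, NULL);
    return target_status;
}
```

The handler keeps WNOHANG so that it never blocks, and the main loop keeps suspending until the counter reaches zero, so children that are still running when the first SIGCHLD arrives are not forgotten.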
I'm trying to create a zombie process with the kill function but it simply kills the child and returns 0.
int main ()
{
pid_t child_pid;
child_pid = fork ();
if (child_pid > 0) {
kill(getpid(),SIGKILL);
}
else {
exit (0);
}
return 0;
}
When I check the status of the process there is no z in the status column.
Here is a simple recipe which should create a zombie:
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
int main()
{
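int pid = fork();
if(pid < 0) {
/* fork failed: kill(-1, SIGKILL) would signal every process we may signal */
perror("fork");
return 1;
} else if(pid == 0) {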
/* child */
while(1) pause();
} else {
/* parent */
sleep(1);
kill(pid, SIGKILL);
printf("pid %d should be a zombie\n", pid);
while(1) pause();
}
}
The key is that the parent -- i.e. this program -- keeps running but does not do a wait() on the dying child.
Zombies are dead children that have not been waited for. If this program waited for its dead child, it would go away and not be a zombie. If this program exited, the zombie child would be inherited by somebody else (probably init), which would probably do the wait, and the child would go away and not be a zombie.
As far as I know, the whole reason for zombies is that the dead child exited with an exit status, which somebody might want. But where Unix stores the exit status is in the empty husk of the dead process, and how you fetch a dead child's exit status is by waiting for it. So Unix is keeping the zombie around just to keep its exit status around just in case the parent wants it but hasn't gotten around to calling wait yet.
So it's actually kind of poetic: Unix's philosophy here is basically that no child's death should go unnoticed.
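The fact that the exit status stays with the zombie until somebody waits for it can be checked with waitid() and the WNOWAIT flag, which reports the status but leaves the child in its waitable (zombie) state. A minimal sketch, with peek_exit_code as a hypothetical name:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: blocks until child `pid` has terminated, but does
 * not reap it (WNOWAIT), so it remains a zombie afterwards. Returns the
 * exit code, or -1 if the child was killed by a signal or is not ours. */
int peek_exit_code(pid_t pid)
{
    siginfo_t info;

    assert(pid > 0);
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) == -1)
        return -1;
    if (info.si_code != CLD_EXITED)
        return -1;                      /* terminated by a signal */
    return info.si_status;              /* the stored exit code */
}
```

Calling peek_exit_code() twice on the same dead child gives the same answer both times; only an ordinary wait() or waitpid() afterwards makes the zombie go away.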
I have this code that requires a parent to fork 3 children.
How do you know (and) where to put the "wait()" statement to kill
zombie processes?
What is the command to view zombie processes if you have Linux
virtual box?
main(){
pid_t child;
printf("-----------------------------------\n");
about("Parent");
printf("Now .. Forking !!\n");
child = fork();
int i=0;
for (i=0; i<3; i++){
if (child < 0) {
perror ("Unable to fork");
break;
}
else if (child == 0){
printf ("creating child #%d\n", (i+1));
about ("Child");
break;
}
else{
child = fork();
}
}
}
void about(char * msg){
pid_t me;
pid_t oldone;
me = getpid();
oldone = getppid();
printf("***[%s] PID = %d PPID = %d.\n", msg, me, oldone);
}
How do you know (and) where to put the "wait()" statement to kill
zombie processes?
If your parent spawns only a small, fixed number of children; does not care when or whether they stop, resume, or finish; and itself exits quickly, then you do not need to use wait() or waitpid() to clean up the child processes. The init process (pid 1) takes responsibility for orphaned child processes, and will clean them up when they finish.
Under any other circumstances, however, you must wait() for child processes. Doing so frees up resources, ensures that the child has finished, and allows you to obtain the child's exit status. Via waitpid() you can also be notified when a child is stopped or resumed by a signal, if you so wish.
As for where to perform the wait,
You must ensure that only the parent wait()s.
You should wait at or before the earliest point where you need the child to have finished (but not before forking), OR
if you don't care when or whether the child finishes, but you need to clean up resources, then you can periodically call waitpid(-1, NULL, WNOHANG) to collect a zombie child if there is one, without blocking if there isn't any.
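The last option above, periodic non-blocking collection, might look like the sketch below. The names reap_zombies and reap_periodically are hypothetical, and the nanosleep() call merely stands in for whatever real work the parent does between polls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Collect every child that has already terminated. Never blocks.
 * Returns the number of children collected by this call. */
int reap_zombies(void)
{
    int count = 0;

    while (waitpid(-1, NULL, WNOHANG) > 0)
        count++;
    return count;
}

/* Illustration of "periodically": poll every 10 ms until `expected`
 * children have been collected or `max_polls` polls have been made.
 * Returns the total number of children collected. */
int reap_periodically(int expected, int max_polls)
{
    struct timespec pause_time = {0, 10 * 1000000L};
    int total = 0;

    assert(expected >= 0 && max_polls >= 0);
    while (total < expected && max_polls-- > 0) {
        total += reap_zombies();
        if (total < expected)
            nanosleep(&pause_time, NULL);   /* placeholder for real work */
    }
    return total;
}
```

waitpid() with WNOHANG returns 0 when children exist but none has terminated yet, and -1 when there are no children at all; the loop in reap_zombies() stops in both cases.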
In particular, you must not wait() (unconditionally) immediately after fork()ing because parent and child run the same code. You must use the return value of fork() to determine whether you are in the child (return value == 0), or in the parent (any other return value). Furthermore, the parent must wait() only if forking was successful, in which case fork() returns the child's pid, which is always greater than zero. A return value less than zero indicates failure to fork.
Your program doesn't really need to wait() because it spawns exactly four (not three) children, then exits. However, if you wanted the parent to have at most one live child at any time, then you could write it like this:
int main() {
pid_t child;
int i;
printf("-----------------------------------\n");
about("Parent");
for (i = 0; i < 3; i++) {
printf("Now .. Forking !!\n");
child = fork();
if (child < 0) {
perror ("Unable to fork");
break;
} else if (child == 0) {
printf ("In child #%d\n", (i+1));
about ("Child");
break;
} else {
/* in parent */
if (waitpid(child, NULL, 0) < 0) {
perror("Failed to collect child process");
break;
}
}
}
return 0;
}
If the parent exits before one or more of its children, which can happen if it does not wait, then the child will thereafter see its parent process being pid 1.
Others have already answered how to get a zombie process list via the ps command. You may also be able to see zombies via top. With your original code you are unlikely to catch a glimpse of zombies, however, because the parent process exits very quickly, and init will then clean up the zombies it leaves behind.
How do you know (and) where to put the "wait()" statement to kill
zombie processes?
You can use wait() anywhere in the parent process, and when the child process terminates it'll be removed from the system. Where to put it is up to you; in your specific case you probably want to put it immediately after the child = fork(); line so that the parent process won't resume its execution until its child has exited.
What is the command to view zombie processes if you have Linux virtual box?
You can use the ps aux command to view all processes in the system (including zombie processes), and the STAT column will be equal to Z if the process is a zombie. An example output would be:
USER    PID   %CPU %MEM  VSZ  RSS TTY  STAT START  TIME COMMAND
daniel  1000   0.0  0.0    0    0 ??   Z    17:15  0:00 command
How do you know (and) where to put the "wait()" statement to kill
zombie processes?
You can register a signal handler for SIGCHLD that sets a global volatile sig_atomic_t flag = 0 variable to 1. Then, at some convenient place in your program, test whether flag is set to 1, and, if so, set it back to 0 and afterwards (for otherwise you might miss a signal) call waitpid(-1, NULL, WNOHANG) in a loop until it tells you that no more processes are to be waited for. Note that the signal will interrupt system calls with EINTR, which is a good condition to check for the value of flag. If you use an indefinitely blocking system call like select(), you might want to specify a timeout after which you check for flag, since otherwise you might miss a signal that was raised after your last waitpid() call but before entering the indefinitely blocking system call. An alternative to this kludge is to use pselect().
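A sketch of the flag-plus-pselect() variant follows. The names child_flag, note_sigchld, reap_if_flagged and event_loop_reap are hypothetical, and the loop watches no file descriptors at all; a real program would pass its own descriptor sets to pselect().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t child_flag = 0;

static void note_sigchld(int sig)
{
    (void)sig;
    child_flag = 1;                     /* the handler only sets the flag */
}

/* Clear the flag first, then reap, so a signal arriving in between is
 * not lost. Returns the number of children collected. */
static int reap_if_flagged(void)
{
    int n = 0;

    if (!child_flag)
        return 0;
    child_flag = 0;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        n++;
    return n;
}

/* Forks n children that exit at once and collects them from a
 * pselect()-based loop. Returns the number collected, or -1 on error. */
int event_loop_reap(int n)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;
    int i, started = 0, reaped = 0;

    assert(n >= 0);
    sa.sa_handler = note_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGCHLD, &sa, &old_sa) == -1)
        return -1;

    /* Keep SIGCHLD blocked except while pselect() is sleeping. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old_mask);
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGCHLD);

    child_flag = 0;
    for (i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);
        if (pid > 0)
            started++;
    }

    while (reaped < started) {
        /* pselect() installs wait_mask atomically for the duration of the
         * call, so no timeout is needed to avoid a missed signal. */
        if (pselect(0, NULL, NULL, NULL, NULL, &wait_mask) == -1
            && errno != EINTR)
            break;
        reaped += reap_if_flagged();
    }

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGCHLD, &old_sa, NULL);
    return reaped;
}
```

Since SIGCHLD is only ever delivered while pselect() is sleeping, the window between checking the flag and blocking, which the timeout workaround tries to paper over, does not exist here.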
Use:
ps -e -opid,ppid,pgid,stat,etime,cmd | grep defunct
to see your zombies. The ppid and pgid columns show the parent process ID and the process group ID, and etime shows the elapsed (wall-clock, not CPU) time since the process was started. The parent ID is useful for sending signals to the parent process.
If the parent process is correctly coded to catch and handle the SIGCHLD signal, and does what is expected (i.e., waits for and reaps its zombies), then you can run:
kill -CHLD <parent_pid>
to tell the parent to reap all of its zombies.
I wrote the following program to understand the way fork works when called without wait() or waitpid().
int main()
{
pid_t childpid;
int retval = 0;
int i = 0; /* counter must start at zero */
while(1){
//usleep(1);
childpid = fork();
if (childpid >= 0)
{
i++;
if (childpid == 0)
{
exit(retval);
}
else
{
//printf("childpid is %d\n", childpid);
}
}
else
{
printf("total no. of processes created = %d\n", i);
perror("fork");
exit(0);
}
}
}
Here's the output I get:
total no. of processes created = 64901
fork: Cannot allocate memory
I expected the program to go on as I'm exiting the child process instantly and fork() should reuse the pids after pid > pid_max. Why doesn't this happen?
The exited child processes remain in the process table as zombies. A zombie process exists until its parent calls wait or waitpid to obtain its exit status. Its process ID is also kept reserved, so that no newly created process can reuse it.
In your case, the process table becomes too large and the system rejects the creation of new processes.
Forking processes and then not retrieving their exit status can be regarded as a resource leak. When the parent exits, they will be adopted by the init process and then reaped, but if the parent stays alive for too long, there is no way for the system to just remove some of the zombies, because it is assumed that the parent should get interested in them at some point via wait or waitpid.
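For illustration, here is a variant of the loop from the question that collects each child before forking the next one, so no zombies accumulate. It is only a sketch: fork_and_reap is a hypothetical name, and the loop is bounded by n rather than running forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks up to n short-lived children, waiting for each one before the
 * next fork. Returns the number of children created and collected. */
int fork_and_reap(int n)
{
    int created = 0;
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == -1) {
            perror("fork");
            break;
        }
        if (pid == 0)
            _exit(0);                   /* child exits immediately */
        /* Collecting the child frees its process-table entry and pid. */
        if (waitpid(pid, NULL, 0) == pid)
            created++;
    }
    return created;
}
```

With the waitpid() call in place, at most one child exists at any moment, so the process count stays flat no matter how large n is.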
Exited child processes still hold some resources, such as the small amount of kernel memory used for their process-table entry. These are not released because the parent never handles the SIGCHLD signal that is sent when a child exits, and never waits for the child.
Those child processes become zombies.
You can use "ps aux" to list them.
It seems that if I create a process, fork it, and send a SIGHUP from the parent to the child, the child dies but its "/proc/PID" dir doesn't disappear until the parent also dies.
(See code below).
What is the right way to let the parent check if the child is dead ?
#include <stdio.h>
#include <unistd.h>
#include <sys/stat.h>
#include <errno.h>
#include <signal.h>
void testprocdir(pid_t pid) {
struct stat sb;
char path[1024];
sprintf(path,"/proc/%d",pid);
if(stat(path, &sb)==-1 && errno == ENOENT) {
printf("%s does not exist\n", path);
} else {
printf("%s exists\n", path);
}
}
int main(int argc,char **argv) {
pid_t parent,child;
parent=getpid();
printf("I am %d\n",parent);
child=fork();
switch(child) {
case -1:
printf("Forking failed\n");
return 2;
case 0:
parent=getppid();
child=getpid();
printf("I am the child (%d) and my parent is %d\n", child, parent);
while(1) { sleep(1); printf("I am the child and I have slept 1s\n");}
printf("This line should not be visible\n");
}
sleep(1); //make sure kid is in the while loop
printf("I am the parent (%d) and my kid is %d\n", parent, child);
kill(child,SIGHUP);
testprocdir(parent);
printf("Waiting 5s before testing if the procdir of the child (/proc/%d) is removed\n",child);
sleep(5);
testprocdir(child);
return 0;
}
You could use the wait family of system-calls.
fork returns the PID of the child process in the parent process, and 0 in the child process.
Beyond that, man waitpid should provide more than enough direction to call waitpid in the parent. It lets you check on one child process or on all child processes, and it lets the parent either continue executing while the child is still alive (with WNOHANG) or stop until the child is dead.
I will start with some concepts:
The OS will keep a child process' entry in the process table (including exit status) around until the parent calls waitpid (or another wait-family function) or until the parent exits (at which point the status is collected by the init process). This is what a "zombie" process is: a process that has exited but is still resident in the process table for exactly this purpose. The process' entry in the table should go away after the first call to waitpid.
Also, from the man page :
A child that terminates, but has not been waited for becomes a "zombie". The kernel maintains a minimal set of information about the zombie process (PID, termination status, resource usage information) in order to allow the parent to later perform a wait to obtain information about the child.
So, by using the wait family of functions you can examine the status of child process.
There are also some macros that can be used with the wait family of functions to examine the status of the child process, such as WEXITSTATUS, WIFSIGNALED, and WIFEXITED.
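Applied to the question, the parent could poll the child with waitpid() and WNOHANG instead of looking at /proc, and then decode the status with those macros. The sketch below uses hypothetical names (child_has_died, describe_status) and is one way to do it, not the only one.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if child `pid` has terminated (it is reaped and *status is
 * filled in), 0 if it is still running, -1 on error (e.g. not our child
 * or already reaped). Never blocks. */
int child_has_died(pid_t pid, int *status)
{
    pid_t r;

    assert(pid > 0 && status != NULL);
    r = waitpid(pid, status, WNOHANG);
    if (r == 0)
        return 0;                       /* alive: nothing to collect yet */
    return r == pid ? 1 : -1;
}

/* Writes a short human-readable description of a wait status into buf. */
void describe_status(int status, char *buf, size_t len)
{
    if (WIFEXITED(status))
        snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
    else
        snprintf(buf, len, "unexpected status 0x%x", (unsigned)status);
}
```

In the program from the question, calling child_has_died(child, &status) after the sleep(5) would both report the death caused by SIGHUP and remove the zombie, after which the /proc entry for the child disappears as well.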