does linux never end child process until the parent ends? - c

Please consider this code in c:
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main(void)
{
pid_t cpid;
cpid = fork();
if (cpid == -1)
{
perror("fork");
return 0;
}
if (cpid == 0)
{
printf("I'm child\n");
_exit(0);
}
else
{
while(1)
{
printf("I'm parent\n");
sleep(1);
}
}
return 0;
}
After running the code, I expect the child to run and exit once it's done.
But when I run
pgrep executable_name
or
ps fax
it shows the child process ID, and I don't know whether it's just a leftover entry of a finished process or whether the child process really has not terminated.
thanks in advance

The child will remain until its parent dies or the parent cleans it up with the wait system calls. (In the time between the child terminating and it being cleaned up, it is referred to as a zombie process.)
The reason is that the parent might be interested in the child's return value or final output, so the process entry stays active until that information is queried.
edit:
Example code for using the sigchld handler to immediately clean up processes when they die without blocking:
http://arsdnet.net/child.c
Be mindful of the fact that system calls (like sleep, select, or file read/writes) can be interrupted by signals. This is a normal thing you should handle anyway in unix - they fail and set errno to EINTR. When this happens, you can just try again to finish the operation. This is why my example code calls sleep twice in the parent - the first long sleep is interrupted by the child dying, then the second, shorter sleep lets us confirm the process is actually cleaned up before the parent dies.
BTW signal handlers usually shouldn't do much, they should return as soon as possible and avoid things that aren't thread safe; printfing in them is usually discouraged. I did it here just so you can watch everything as it happens.

You need to call wait() in the parent, otherwise the child process will never be reaped (it becomes a zombie).*
* Unless the parent itself also exits.
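To make the reaping step concrete, here is a minimal sketch of a parent that forks a child and collects it with waitpid(). The function name fork_and_reap is made up for this illustration and is not part of the original question's code.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that exits immediately with exit_code,
 * then reaps it. Returns the child's exit status, or -1 on error. */
int fork_and_reap(int exit_code)
{
    pid_t cpid = fork();
    if (cpid == -1) {
        perror("fork");
        return -1;
    }
    if (cpid == 0) {
        _exit(exit_code);   /* child: terminate right away */
    }

    int status;
    /* Until this call returns, the terminated child stays in the
     * process table as a zombie. */
    if (waitpid(cpid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Once waitpid() has returned, the child no longer shows up in ps or pgrep. In the question's program the parent loops forever without waiting, which is why the child entry stays visible.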

Related

Why does a process create a zombie if execv fails, but not if execv is successful and terminates?

So I am confused by the behavior of my C program. I am using the construct,
int pid = fork();
if (pid == 0) {
if(file_upload_script_path) {
rc = execv(file_upload_script_path, args);
if(rc == -1) {
printf("Error has occured when starting file_upload.exp!\n");
exit(0);
}
} else {
printf("Error with memory allocation!\n");
}
}
else {
printf("pid=%d\n", pid);
}
To fork the process and run a script for doing file upload. The script will by itself terminate safely, either by finishing the upload or failing.
Now, there was a problem with the script path, causing execv to fail. Here I noticed that the child process terminates cleanly if execv succeeds, but if it fails (rc == -1) and I exit the process, it becomes a zombie. Does anyone know why this happens?
Note that I know why the child process becomes a zombie. What confuses me is why the process does not become a zombie if execv works.
EDIT:
I got a question about errno and the cause of the error. The cause of the error is known: there was a problem with the build process, so the script ended up at a different path than expected.
However, this may happen again, and I want to make sure my program does not start spawning zombies when it does. The behavior where zombies are created in some situations and not in others is very confusing.
BR
Patrik
If you don't want to create zombies, your program has to reap its child processes whether or not they call execv and whether or not the execv call succeeds. To reap zombie processes "automagically", handle the SIGCHLD signal:
#include <errno.h>
#include <signal.h>
#include <sys/wait.h>
void handle_sigchld(int sig) {
int saved_errno = errno;
while (waitpid((pid_t)(-1), 0, WNOHANG) > 0) {}
errno = saved_errno;
}
int main() {
signal(SIGCHLD, handle_sigchld);
// rest of your program....
}
Inspired (no... ripped off) from: this link.
Or maybe you want to reap only this specific child, because later you want to call fork() again and handle that child's return value. Then pass the pid returned from fork() in your parent to the signal handler and wait on this pid in the SIGCHLD handler if needed (with some checking, e.g. if the pid has already finished, then ignore future SIGCHLD signals, etc.).
In this scenario, when the execv fails, the child process is killed. The fun part, I think is what happens when you call exec family of functions.
The exec family of functions replaces the current image of the process with the new image of the binary you are about to exec.
So, whatever code was will not remain - and the error in your script would cause its death.
Here, the parent needs to listen on the death of the child process using wait flavour of functions (read: waitpid).
When you say that there's problem in the script, it means that the execv actually succeeded in creating the new image; but the latter failed of its own accord.
This is what I think is happening...
If the printf of if (rc==-1) is being executed, then perhaps changing exit(0) to _exit(0) should take care of it.
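Whichever explanation applies, the parent can treat both outcomes the same way by waiting for the child. The following is a minimal sketch under that assumption. The name run_and_reap and the exit code 127 for a failed execv() are choices made for this example (127 follows the shell convention) and do not come from the question.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs path with argv in a child and waits for it.
 * Returns the child's exit status, 127 if execv() failed in the child,
 * or -1 on error or if the child was killed by a signal. */
int run_and_reap(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execv(path, argv);
        /* Only reached if execv() failed. */
        perror("execv");
        _exit(127);
    }

    int status;
    /* Reaps the child in both cases: exec succeeded or exec failed. */
    if (waitpid(pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this shape, a wrong script path shows up as a return value of 127 instead of a lingering zombie.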

Is there any way to figure out that child process was killed with SIGKILL by the kernel when the parent process isn't root

I have a situation where there is a non-root parent process (so I can't read kernel logs) and its child, and the child may be killed with SIGKILL by the kernel for consuming too much memory. When that happens, the parent should ideally know that the child was killed for exceeding the memory limit, but I don't even know whether I can figure out that it was killed by SIGKILL, let alone the reason. So I need to determine, from the parent's side, whether the child was killed with SIGKILL and, if so, why (but that is a secondary issue).
Can someone give me advice? Thank you.
You need to wait(2) on the child and use the macro WIFSIGNALED to check if it was terminated by a signal.
int status = 0;
// wait for child to exit
pid_t child_pid = wait(&status);
if (WIFEXITED(status))
{
printf("exited with %d\n", WEXITSTATUS(status));
}
else if (WIFSIGNALED(status))
{
printf("Signaled with %d\n", WTERMSIG(status));
}
If you have multiple child processes, you can use a loop to wait for them all.
WTERMSIG(status) would return the signal number. To figure out the signal, you could check:
if (WTERMSIG(status) == SIGKILL) {
...
} else if (WTERMSIG(status) == SIGTERM) {
...
}
There's no way to figure out exactly who sent a kill (whether by the OOM killer or something else e.g., one could do kill -9 PID from the shell).
It's reasonable to assume that signals are not sent indiscriminately on a system and that it's usually the kernel itself (OOM killer) that sends SIGKILL.
The status provided by the waitXXX() functions (see man page) makes it possible to determine that the child has been killed by a signal:
First check by calling WIFSIGNALED(wstatus) if that happened, then you can call WTERMSIG(wstatus) to determine the signal number. However you can't determine if the process was killed by the kernel or by another process calling kill().
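As a self-contained illustration of the WIFSIGNALED/WTERMSIG check, the sketch below forks a child that sends itself a signal and reports what the parent observes. The function name child_termination_signal is hypothetical. Having the child raise the signal on itself only stands in for the kernel or another process doing so.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that sends itself sig.
 * Returns the number of the signal that terminated the child,
 * 0 if the child exited normally, or -1 on error. */
int child_termination_signal(int sig)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        raise(sig);
        _exit(0);   /* reached only if sig did not terminate the child */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return 0;
}
```

A result equal to SIGKILL tells the parent that the child was killed, but, as noted above, not who sent the signal.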

How do you kill zombie process using wait()

I have this code that requires a parent to fork 3 children.
How do you know (and) where to put the "wait()" statement to kill
zombie processes?
What is the command to view zombie processes if you have Linux
virtual box?
main(){
pid_t child;
printf("-----------------------------------\n");
about("Parent");
printf("Now .. Forking !!\n");
child = fork();
int i=0;
for (i=0; i<3; i++){
if (child < 0) {
perror ("Unable to fork");
break;
}
else if (child == 0){
printf ("creating child #%d\n", (i+1));
about ("Child");
break;
}
else{
child = fork();
}
}
}
void about(char * msg){
pid_t me;
pid_t oldone;
me = getpid();
oldone = getppid();
printf("***[%s] PID = %d PPID = %d.\n", msg, me, oldone);
}
How do you know (and) where to put the "wait()" statement to kill
zombie processes?
If your parent spawns only a small, fixed number of children; does not care when or whether they stop, resume, or finish; and itself exits quickly, then you do not need to use wait() or waitpid() to clean up the child processes. The init process (pid 1) takes responsibility for orphaned child processes, and will clean them up when they finish.
Under any other circumstances, however, you must wait() for child processes. Doing so frees up resources, ensures that the child has finished, and allows you to obtain the child's exit status. Via waitpid() you can also be notified when a child is stopped or resumed by a signal, if you so wish.
As for where to perform the wait,
You must ensure that only the parent wait()s.
You should wait at or before the earliest point where you need the child to have finished (but not before forking), OR
if you don't care when or whether the child finishes, but you need to clean up resources, then you can periodically call waitpid(-1, NULL, WNOHANG) to collect a zombie child if there is one, without blocking if there isn't any.
In particular, you must not wait() (unconditionally) immediately after fork()ing because parent and child run the same code. You must use the return value of fork() to determine whether you are in the child (return value == 0), or in the parent (any other return value). Furthermore, the parent must wait() only if forking was successful, in which case fork() returns the child's pid, which is always greater than zero. A return value less than zero indicates failure to fork.
Your program doesn't really need to wait() because it spawns exactly four (not three) children, then exits. However, if you wanted the parent to have at most one live child at any time, then you could write it like this:
int main() {
pid_t child;
int i;
printf("-----------------------------------\n");
about("Parent");
for (i = 0; i < 3; i++) {
printf("Now .. Forking !!\n");
child = fork();
if (child < 0) {
perror ("Unable to fork");
break;
} else if (child == 0) {
printf ("In child #%d\n", (i+1));
about ("Child");
break;
} else {
/* in parent */
if (waitpid(child, NULL, 0) < 0) {
perror("Failed to collect child process");
break;
}
}
}
return 0;
}
If the parent exits before one or more of its children, which can happen if it does not wait, then the child will thereafter see its parent process being pid 1.
Others have already answered how to get a zombie process list via the ps command. You may also be able to see zombies via top. With your original code you are unlikely to catch a glimpse of zombies, however, because the parent process exits very quickly, and init will then clean up the zombies it leaves behind.
How do you know (and) where to put the "wait()" statement to kill
zombie processes?
You can use wait() anywhere in the parent process, and when the child process terminates it'll be removed from the system. Where to put it is up to you, in your specific case you probably want to put it immediately after the child = fork(); line so that the parent process won't resume its execution until its child has exited.
What is the command to view zombie processes if you have Linux virtual box?
You can use the ps aux command to view all processes in the system (including zombie processes), and the STAT column will be equal to Z if the process is a zombie. An example output would be:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
daniel 1000 0.0 0.0 0 0 ?? Z 17:15 0:00 command
How do you know (and) where to put the "wait()" statement to kill
zombie processes?
You can register a signal handler for SIGCHLD that sets a global volatile sig_atomic_t flag = 0 variable to 1. Then, at some convenient place in your program, test whether flag is set to 1, and, if so, set it back to 0 and afterwards (for otherwise you might miss a signal) call waitpid(-1, NULL, WNOHANG) in a loop until it tells you that no more processes are to be waited for. Note that the signal will interrupt system calls with EINTR, which is a good condition to check for the value of flag. If you use an indefinitely blocking system call like select(), you might want to specify a timeout after which you check for flag, since otherwise you might miss a signal that was raised after your last waitpid() call but before entering the indefinitely blocking system call. An alternative to this kludge is to use pselect().
Use:
ps -e -opid,ppid,pgid,stat,etime,cmd | grep defunct
to see your zombies, along with the ppid and pgid columns for the parent ID and process group ID, and etime for the elapsed (wall-clock) time since the zombie process was started. The parent ID is useful for sending custom signals to the parent process.
If the parent process is correctly coded to catch and handle the SIGCHLD signal and does what is expected (i.e., wait for/reap the zombies), then you can run:
kill -CHLD <parent_pid>
to tell the parent to reap all its zombies.

Make child process wait for parent

I have to write a program in C that will fork a new process and then use that process's pid for another function. However, I need to call this function before the child process can run, and I don't know how to do this.
Here's some pseudo code of what I'm trying to do.
pid_t pid = fork();
if(in_child){ //In the child process
//launch child application
//somehow stop the child application before it actually executes any code
}
else{
//call my function with the child's pid
//resume the child process
//do other stuff
}
If you need any additional info please ask. Thanks.
Edit: I do not have access to the code for the child. I'm just wanting to run an executable.
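If you mean any code at all, that can be difficult. You can use clone with CLONE_STOPPED instead of fork to start the application in a stopped state (needing SIGCONT to get it going again). Note that CLONE_STOPPED was deprecated in Linux 2.6.25 and removed in 2.6.38, so this only applies to older kernels.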
However, if you simply mean specific code in the child and you can modify the child code, you can, as the first thing in main, simply set up a handler for a USR1 signal (any IPC would probably do but a signal seems the simplest in this particular case) and then wait for it to fire before carrying on.
That way, the process itself will be running but won't be doing anything yet.
You then have the parent weave whatever magic it needs to do, then send a SIGUSR1 to the child.
But since, according to a comment, you don't have access to the child code, the first option may be the best, assuming that SIGCONT won't actually cause problems with the child. That will require testing.
Of course, one thing to keep in mind is that neither clone() nor fork() will actually load your new program into the child process, that has to be done with an exec-type call after the split. This is a result of the UNIX split between fork and exec functionality, detailed here.
That means that, while you don't control the child program, you do control the child process, so your code can wait for whatever signal it wants before loading up the new child program. Hence it's doable even with just fork().
Unfortunately, that also means that neither clone nor fork can stop your process after the new program has been loaded with exec (at least not deterministically) so, if the fiddling you want to do is to the new program (such as manipulating its variables by attaching to its memory), you can't do it.
The best you can do is to fiddle with the new process while it still has a copy of the old program (before the exec).
There's a simpler way, assuming your OS will let you share the address space before the child execs. Pseudo-code follows.
volatile int barrier;
int safe_fork(routine_to_call)
{
pid_t pid;
barrier = 0;
pid = fork();
if (pid > 0) {
/* parent */
routine_to_call();
barrier = 1;
} else if (pid == 0) {
/* child: spin until the parent has finished */
while (barrier == 0)
; /* or sleep if it's a slow routine */
exec();
//if we get here, exec failed; exit with failure code
} else {
/* return failure */
}
/* must be parent; return success */
}
You may need to do something special to get the sharing behaviour, rather than having them both start with independent copies. I know it's doable on FreeBSD. In linux, check out the CLONE_VM flag to clone(); it looks like it should let you do what you need here.
What you are looking for is an interprocess condition variable.
https://en.wikipedia.org/wiki/Monitor_(synchronization)
The way it would work (roughly) :-
Before forking you set a variable asking child to wait :- child_continue = false
1.) CHILD process begins to execute (or parent, doesn't matter)
If the variable child_continue == false
Sleep on a condition variable and wait for signal from parent
2.) Parent process waits for its chance to run (note the order of run doesn't matter). When the parent process is ready to run, it does whatever it wants with the child PID (or something else) and signals the child process to continue.
In order to do this, you'd need interprocess mutex and interprocess condition variable.
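//#include <pthread.h> in main file
//create IPC MUTEX which can be shared by both child and parent.
//NOTE: after fork() each process has its own copy of ordinary variables, so for this to work
//mtx, cond and child_continue must be placed in shared memory (e.g. from mmap() with MAP_SHARED).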
pthread_mutexattr_t mutex_attr;
pthread_condattr_t cond_attr;
pthread_mutex_t mtx;
pthread_cond_t cond;
if (0!= pthread_mutexattr_init(&mutex_attr))
{
//errror handling
}
if (0!= pthread_condattr_init(&cond_attr))
{
//errror handling
}
if (0 != pthread_condattr_setpshared(&cond_attr,PTHREAD_PROCESS_SHARED))
{
//error handling
}
if (0 != pthread_mutexattr_setpshared(&mutex_attr,PTHREAD_PROCESS_SHARED))
{
//error handling
}
if (0 !=pthread_mutex_init(&mtx,&mutex_attr))
{
//error handling
}
if (0 !=pthread_cond_init(&cond,&cond_attr))
{
//error handling
}
bool child_continue = false; //needs <stdbool.h>
//now fork !!
pid_t pi = fork();
if (pi ==0) //child
{
if (0 !=pthread_mutex_lock(&mtx))
{
//error handling
}
while (!child_continue) //wait until we receive signal from parent.
{
if (0 !=pthread_cond_wait(&cond,&mtx))
{
//error handling
}
}
if (0 !=pthread_mutex_unlock(&mtx))
{
//error handling
}
//Parent is done!! either we woke up by condition variable or, parent was done before hand
//in which case, child_continue was true already.
}
else
{
//in parent process do whatever you want with child pid (pi variable)
//once you are done, set child_continue to true and wake up child.
if (0 !=pthread_mutex_lock(&mtx))
{
//error handling
}
child_continue = true;
if (0 !=pthread_cond_signal(&cond))
{
//error handling
}
if (0 !=pthread_mutex_unlock(&mtx))
{
//error handling
}
}

How to make child process die after parent exits?

Suppose I have a process which spawns exactly one child process. Now when the parent process exits for whatever reason (normally or abnormally, by kill, ^C, assert failure or anything else) I want the child process to die. How to do that correctly?
Some similar question on stackoverflow:
(asked earlier) How can I cause a child process to exit when the parent does?
(asked later) Are child processes created with fork() automatically killed when the parent is killed?
Some similar question on stackoverflow for Windows:
How do I automatically destroy child processes in Windows?
Kill child process when parent process is killed
Child can ask kernel to deliver SIGHUP (or other signal) when parent dies by specifying option PR_SET_PDEATHSIG in prctl() syscall like this:
prctl(PR_SET_PDEATHSIG, SIGHUP);
See man 2 prctl for details.
Edit: This is Linux-only
I'm trying to solve the same problem, and since my program must run on OS X, the Linux-only solution didn't work for me.
I came to the same conclusion as the other people on this page -- there isn't a POSIX-compatible way of notifying a child when a parent dies. So I kludged up the next-best thing -- having the child poll.
When a parent process dies (for any reason) the child's parent process becomes process 1. If the child simply polls periodically, it can check if its parent is 1. If it is, the child should exit.
This isn't great, but it works, and it's easier than the TCP socket/lockfile polling solutions suggested elsewhere on this page.
I have achieved this in the past by running the "original" code in the "child" and the "spawned" code in the "parent" (that is: you reverse the usual sense of the test after fork()). Then trap SIGCHLD in the "spawned" code...
May not be possible in your case, but cute when it works.
Under Linux, you can install a parent death signal in the child, e.g.:
#include <sys/prctl.h> // prctl(), PR_SET_PDEATHSIG
#include <signal.h> // signals
#include <unistd.h> // fork()
#include <stdio.h> // perror()
#include <stdlib.h> // exit()
// ...
pid_t ppid_before_fork = getpid();
pid_t pid = fork();
if (pid == -1) { perror(0); exit(1); }
if (pid) {
; // continue parent execution
} else {
int r = prctl(PR_SET_PDEATHSIG, SIGTERM);
if (r == -1) { perror(0); exit(1); }
// test in case the original parent exited just
// before the prctl() call
if (getppid() != ppid_before_fork)
exit(1);
// continue child execution ...
Note that storing the parent process id before the fork and testing it in the child after prctl() eliminates a race condition between prctl() and the exit of the process that forked the child.
Also note that the parent death signal of the child is cleared in newly created children of its own. It is preserved across an execve(), except when executing a set-user-ID or set-group-ID binary.
That test can be simplified if we are certain that the system process who is in charge of adopting all orphans has PID 1:
pid_t pid = fork();
if (pid == -1) { perror(0); exit(1); }
if (pid) {
; // continue parent execution
} else {
int r = prctl(PR_SET_PDEATHSIG, SIGTERM);
if (r == -1) { perror(0); exit(1); }
// test in case the original parent exited just
// before the prctl() call
if (getppid() == 1)
exit(1);
// continue child execution ...
Relying on that system process being init and having PID 1 isn't portable, though. POSIX.1-2008 specifies:
The parent process ID of all of the existing child processes and zombie processes of the calling process shall be set to the process ID of an implementation-defined system process. That is, these processes shall be inherited by a special system process.
Traditionally, the system process adopting all orphans is PID 1, i.e. init - which is the ancestor of all processes.
On modern systems like Linux or FreeBSD another process might have that role. For example, on Linux, a process can call prctl(PR_SET_CHILD_SUBREAPER, 1) to establish itself as system process that inherits all orphans of any of its descendants (cf. an example on Fedora 25).
If you're unable to modify the child process, you can try something like the following:
int pipes[2];
pipe(pipes);
if (fork() == 0) {
close(pipes[1]); /* Close the writer end in the child */
dup2(pipes[0], STDIN_FILENO); /* Use reader end as stdin (fixed per maxschlepzig) */
execl("/bin/sh", "sh", "-c", "set -o monitor; child_process & read dummy; kill %1", (char *)NULL);
}
close(pipes[0]); /* Close the reader end in the parent */
This runs the child from within a shell process with job control enabled. The child process is spawned in the background. The shell waits for a newline (or an EOF) then kills the child.
When the parent dies--no matter what the reason--it will close its end of the pipe. The child shell will get an EOF from the read and proceed to kill the backgrounded child process.
For completeness sake. On macOS you can use kqueue:
void noteProcDeath(
CFFileDescriptorRef fdref,
CFOptionFlags callBackTypes,
void* info)
{
// LOG_DEBUG(#"noteProcDeath... ");
struct kevent kev;
int fd = CFFileDescriptorGetNativeDescriptor(fdref);
kevent(fd, NULL, 0, &kev, 1, NULL);
// take action on death of process here
unsigned int dead_pid = (unsigned int)kev.ident;
CFFileDescriptorInvalidate(fdref);
CFRelease(fdref); // the CFFileDescriptorRef is no longer of any use in this example
int our_pid = getpid();
// when our parent dies we die as well..
LOG_INFO(#"exit! parent process (pid %u) died. no need for us (pid %i) to stick around", dead_pid, our_pid);
exit(EXIT_SUCCESS);
}
void suicide_if_we_become_a_zombie(int parent_pid) {
// int parent_pid = getppid();
// int our_pid = getpid();
// LOG_ERROR(#"suicide_if_we_become_a_zombie(). parent process (pid %u) that we monitor. our pid %i", parent_pid, our_pid);
int fd = kqueue();
struct kevent kev;
EV_SET(&kev, parent_pid, EVFILT_PROC, EV_ADD|EV_ENABLE, NOTE_EXIT, 0, NULL);
kevent(fd, &kev, 1, NULL, 0, NULL);
CFFileDescriptorRef fdref = CFFileDescriptorCreate(kCFAllocatorDefault, fd, true, noteProcDeath, NULL);
CFFileDescriptorEnableCallBacks(fdref, kCFFileDescriptorReadCallBack);
CFRunLoopSourceRef source = CFFileDescriptorCreateRunLoopSource(kCFAllocatorDefault, fdref, 0);
CFRunLoopAddSource(CFRunLoopGetMain(), source, kCFRunLoopDefaultMode);
CFRelease(source);
}
Inspired by another answer here, I came up with the following all-POSIX solution. The general idea is to create an intermediate process between the parent and the child, that has one purpose: Notice when the parent dies, and explicitly kill the child.
This type of solution is useful when the code in the child can't be modified.
int p[2];
pipe(p);
pid_t child = fork();
if (child == 0) {
close(p[1]); // close write end of pipe
setpgid(0, 0); // prevent ^C in parent from stopping this process
child = fork();
if (child == 0) {
close(p[0]); // close read end of pipe (don't need it here)
exec(...child process here...);
exit(1);
}
char c;
read(p[0], &c, 1); // returns when parent exits for any reason
kill(child, 9);
exit(1);
}
There are two small caveats with this method:
If you deliberately kill the intermediate process, then the child won't be killed when the parent dies.
If the child exits before the parent, then the intermediate process will try to kill the original child pid, which could now refer to a different process. (This could be fixed with more code in the intermediate process.)
As an aside, the actual code I'm using is in Python. Here it is for completeness:
def run(*args):
(r, w) = os.pipe()
child = os.fork()
if child == 0:
os.close(w)
os.setpgid(0, 0)
child = os.fork()
if child == 0:
os.close(r)
os.execl(args[0], *args)
os._exit(1)
os.read(r, 1)
os.kill(child, 9)
os._exit(1)
os.close(r)
Does the child process have a pipe to/from the parent process? If so, you'd receive a SIGPIPE if writing, or get EOF when reading - these conditions could be detected.
I don't believe it's possible to guarantee that using only standard POSIX calls. Like real life, once a child is spawned, it has a life of its own.
It is possible for the parent process to catch most possible termination events, and attempt to kill the child process at that point, but there's always some that can't be caught.
For example, no process can catch a SIGKILL. When the kernel handles this signal it will kill the specified process with no notification to that process whatsoever.
To extend the analogy - the only other standard way of doing it is for the child to commit suicide when it finds that it no longer has a parent.
There is a Linux-only way of doing it with prctl(2) - see other answers.
This solution worked for me:
Pass stdin pipe to child - you don't have to write any data into the stream.
Child reads indefinitely from stdin until EOF. An EOF signals that the parent has gone.
This is a foolproof and portable way to detect when the parent has gone. Even if the parent crashes, the OS will close the pipe.
This was for a worker-type process whose existence only made sense when the parent was alive.
Some posters have already mentioned pipes and kqueue. In fact you can also create a pair of connected Unix domain sockets by the socketpair() call. The socket type should be SOCK_STREAM.
Let us suppose you have the two socket file descriptors fd1, fd2. Now fork() to create the child process, which will inherit the fds. In the parent you close fd2 and in the child you close fd1. Now each process can poll() the remaining open fd on its own end for the POLLIN event. As long as each side doesn't explicitly close() its fd during its normal lifetime, you can be fairly sure that a POLLHUP flag indicates the other's termination (whether clean or not). Upon being notified of this event, the child can decide what to do (e.g. to die).
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <poll.h>
#include <stdio.h>
int main(int argc, char ** argv)
{
int sv[2]; /* sv[0] for parent, sv[1] for child */
socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
pid_t pid = fork();
if ( pid > 0 ) { /* parent */
close(sv[1]);
fprintf(stderr, "parent: pid = %d\n", getpid());
sleep(100);
exit(0);
} else { /* child */
close(sv[0]);
fprintf(stderr, "child: pid = %d\n", getpid());
struct pollfd mon;
mon.fd = sv[1];
mon.events = POLLIN;
poll(&mon, 1, -1);
if ( mon.revents & POLLHUP )
fprintf(stderr, "child: parent hung up\n");
exit(0);
}
}
You can try compiling the above proof-of-concept code, and run it in a terminal like ./a.out &. You have roughly 100 seconds to experiment with killing the parent PID by various signals, or it will simply exit. In either case, you should see the message "child: parent hung up".
Compared with the method using SIGPIPE handler, this method doesn't require trying the write() call.
This method is also symmetric, i.e. the processes can use the same channel to monitor each other's existence.
This solution calls only the POSIX functions. I tried this in Linux and FreeBSD. I think it should work on other Unixes but I haven't really tested.
See also:
unix(7) of Linux man pages, unix(4) for FreeBSD, poll(2), socketpair(2), socket(7) on Linux.
Install a trap handler to catch SIGINT, which kills off your child process if it's still alive, though other posters are correct that it won't catch SIGKILL.
Open a .lockfile with exclusive access and have the child poll on it trying to open it - if the open succeeds, the child process should exit
As other people have pointed out, relying on the parent pid to become 1 when the parent exits is non-portable. Instead of waiting for a specific parent process ID, just wait for the ID to change:
pid_t pid = getpid();
switch (fork())
{
case -1:
{
abort(); /* or whatever... */
}
default:
{
/* parent */
exit(0);
}
case 0:
{
/* child */
/* ... */
}
}
/* Wait for parent to exit */
while (getppid() != pid)
;
Add a micro-sleep as desired if you don't want to poll at full speed.
This option seems simpler to me than using a pipe or relying on signals.
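The polling loop can be sketched as a small helper with the micro-sleep built in. The function name wait_until_orphaned and its interval parameter are made up for this example. The caller has to record the parent's pid before fork(), as in the snippet above.

```c
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: blocks until the calling process's parent is no
 * longer original_parent, checking every interval_ms milliseconds. */
void wait_until_orphaned(pid_t original_parent, long interval_ms)
{
    struct timespec pause = {
        interval_ms / 1000,                 /* seconds */
        (interval_ms % 1000) * 1000000L     /* nanoseconds */
    };

    /* getppid() changes as soon as the original parent has exited and
     * the process has been adopted by another one. */
    while (getppid() == original_parent)
        nanosleep(&pause, NULL);
}
```

A child would typically run this in a helper thread, or call getppid() from its own main loop, and exit once the function returns.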
I think a quick and dirty way is to create a pipe between child and parent. When the parent exits, the children will receive a SIGPIPE the next time they write to the pipe.
Another way to do this that is Linux-specific is to have the parent be created in a new PID namespace. It will then be PID 1 in that namespace, and when it exits, all of its children will be immediately killed with SIGKILL.
Unfortunately, in order to create a new PID namespace you have to have CAP_SYS_ADMIN. But, this method is very effective and requires no real change to the parent or the children beyond the initial launch of the parent.
See clone(2), pid_namespaces(7), and unshare(2).
Under POSIX, the exit(), _exit() and _Exit() functions are defined to:
If the process is a controlling process, the SIGHUP signal shall be sent to each process in the foreground process group of the controlling terminal belonging to the calling process.
So, if you arrange for the parent process to be a controlling process for its process group, the child should get a SIGHUP signal when the parent exits. I'm not absolutely sure that happens when the parent crashes, but I think it does. Certainly, for the non-crash cases, it should work fine.
Note that you may have to read quite a lot of fine print - including the Base Definitions (Definitions) section, as well as the System Services information for exit() and setsid() and setpgrp() - to get the complete picture. (So would I!)
If you send a signal to the pid 0, using for instance
kill(0, 2); /* SIGINT */
that signal is sent to the entire process group, thus effectively killing the child.
You can test it easily with something like:
(cat && kill 0) | python
If you then press ^D, you'll see the text "Terminated" as an indication that the Python interpreter has indeed been killed, instead of just having exited because stdin was closed.
In case it is relevant to anyone else, when I spawn JVM instances in forked child processes from C++, the only way I could get the JVM instances to terminate properly after the parent process completed was to do the following. Hopefully someone can provide feedback in the comments if this wasn't the best way to do this.
1) Call prctl(PR_SET_PDEATHSIG, SIGHUP) in the forked child process, as suggested, before launching the Java app via execv, and
2) Add a shutdown hook to the Java application that polls until its parent PID equals 1, then does a hard Runtime.getRuntime().halt(0). The polling is done by launching a separate shell that runs the ps command (See: How do I find my PID in Java or JRuby on Linux?).
EDIT 130118:
It seems that was not a robust solution. I'm still struggling a bit to understand the nuances of what's going on, but I was still sometimes getting orphan JVM processes when running these applications in screen/SSH sessions.
Instead of polling for the PPID in the Java app, I simply had the shutdown hook perform cleanup followed by a hard halt as above. Then I made sure to invoke waitpid in the C++ parent app on the spawned child process when it was time to terminate everything. This seems to be a more robust solution, as the child process ensures that it terminates, while the parent uses existing references to make sure that its children terminate. Compare this to the previous solution which had the parent process terminate whenever it pleased, and had the children try to figure out if they had been orphaned before terminating.
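For reference, the prctl(PR_SET_PDEATHSIG, SIGHUP) call from step 1 can be sketched in C roughly as follows. This is only an illustration under some assumptions: the helper name pdeathsig_demo, the pipes, the 'H' marker byte and the 5 second alarm are all mine, added so the effect can be observed from outside. A real child would typically call execv where this one calls pause(), and would not need the handler. The getppid() comparison after prctl() covers the case where the parent has already died before the request was registered.

```c
#define _POSIX_C_SOURCE 200809L  /* expose POSIX declarations under strict -std modes */
#include <signal.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;

/* Runs in the child when the kernel delivers the parent-death signal. */
static void on_parent_death(int sig)
{
    (void)sig;
    char c = 'H';
    ssize_t r = write(report_fd, &c, 1);   /* write() is async-signal-safe */
    (void)r;
    _exit(0);
}

/*
 * Hypothetical demo: returns 1 if the child was sent SIGHUP when its parent
 * exited, 0 if it was not, and -1 on a setup error. Linux only.
 */
int pdeathsig_demo(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t parent = fork();
    if (parent == -1)
        return -1;

    if (parent == 0) {                     /* the "parent" that is going to die */
        pid_t parent_pid = getpid();
        int ready[2];
        if (pipe(ready) == -1)
            _exit(1);

        pid_t child = fork();
        if (child == -1)
            _exit(1);
        if (child == 0) {
            close(fds[0]);
            close(ready[0]);
            report_fd = fds[1];
            signal(SIGHUP, on_parent_death);
            if (prctl(PR_SET_PDEATHSIG, SIGHUP) == -1)
                _exit(1);
            if (getppid() != parent_pid)   /* parent died before prctl() took effect */
                _exit(1);
            close(ready[1]);               /* tell the parent we are armed */
            alarm(5);                      /* safety net so the demo cannot hang */
            for (;;)
                pause();                   /* a real child would execv() here */
        }

        close(ready[1]);
        char c;
        if (read(ready[0], &c, 1) < 0)     /* EOF once the child is armed */
            _exit(1);
        _exit(0);                          /* parent exits; kernel signals the child */
    }

    close(fds[1]);
    waitpid(parent, NULL, 0);
    char c = 0;
    ssize_t n = read(fds[0], &c, 1);
    close(fds[0]);
    return (n == 1 && c == 'H') ? 1 : 0;
}
```

Keep in mind that the parent-death signal is tied to the thread that called fork(), so a multithreaded parent can trigger it earlier than expected, and a child that ignores or handles SIGHUP will not terminate on its own.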
I found two solutions, neither of them perfect.
1. Kill all children with kill(-pid) when a SIGTERM signal is received.
Obviously, this solution cannot handle "kill -9", but it works in most cases and is very simple because it does not need to keep track of all child processes.
var childProc = require('child_process').spawn('tail', ['-f', '/dev/null'], {stdio:'ignore'});
var counter=0;
setInterval(function(){
console.log('c '+(++counter));
},1000);
if (process.platform.slice(0,3) != 'win') {
function killMeAndChildren() {
/*
* On Linux/Unix (including Mac OS X), kill(-pid) kills the whole process group,
* usually the process itself and its children.
* On Windows, a job object has been applied to the current process and its children,
* so all children are terminated if the current process dies for any reason.
*/
console.log('kill process group');
process.kill(-process.pid, 'SIGKILL');
}
/*
* When you use "kill pid_of_this_process", this callback will be called
*/
process.on('SIGTERM', function(err){
console.log('SIGTERM');
killMeAndChildren();
});
}
In the same way, you can install an 'exit' handler like the one above if you call process.exit somewhere.
Note: Ctrl+C and sudden crashes are already handled by the OS, which kills the process group, so nothing more is needed for them here.
2. Use chjj/pty.js to spawn your process with a controlling terminal attached.
When the current process is killed by any means, even kill -9, all child processes are automatically killed too (by the OS?). I guess that is because the current process holds the other side of the terminal, so if the current process dies, the child process gets SIGPIPE and dies.
var pty = require('pty.js');
//var term =
pty.spawn('any_child_process', [/*any arguments*/], {
name: 'xterm-color',
cols: 80,
rows: 30,
cwd: process.cwd(),
env: process.env
});
/*optionally you can install data handler
term.on('data', function(data) {
process.stdout.write(data);
});
term.write(.....);
*/
Even though 7 years have passed, I've just run into this issue: I'm running a SpringBoot application that needs to start webpack-dev-server during development and kill it when the backend process stops.
I tried to use Runtime.getRuntime().addShutdownHook, but it worked on Windows 10 and not on Windows 7.
I've changed it to use a dedicated thread that waits for the process to quit or for an InterruptedException, which seems to work correctly on both Windows versions.
private void startWebpackDevServer() {
String cmd = isWindows() ? "cmd /c gradlew webPackStart" : "gradlew webPackStart";
logger.info("webpack dev-server " + cmd);
Thread thread = new Thread(() -> {
ProcessBuilder pb = new ProcessBuilder(cmd.split(" "));
pb.redirectOutput(ProcessBuilder.Redirect.INHERIT);
pb.redirectError(ProcessBuilder.Redirect.INHERIT);
pb.directory(new File("."));
Process process = null;
try {
// Start the node process
process = pb.start();
// Wait for the node process to quit (blocking)
process.waitFor();
// Ensure the node process is killed
process.destroyForcibly();
System.setProperty(WEBPACK_SERVER_PROPERTY, "true");
} catch (InterruptedException | IOException e) {
// Ensure the node process is killed.
// InterruptedException is thrown when the main process exits.
logger.info("killing webpack dev-server", e);
if (process != null) {
process.destroyForcibly();
}
}
});
thread.start();
}
Historically, since UNIX v7, the process system has detected orphaned processes by checking a process's parent ID. Historically, too, the init(8) system process is special for one reason only: it cannot die. It cannot die because the kernel algorithm for assigning a new parent process ID depends on this fact. When a process executes its exit(2) call (either through its own system call or because of an external event such as a signal being sent to it), the kernel reassigns the init process's ID as the parent process ID of all of that process's children. This leads to the easiest and most portable test for whether a process has been orphaned: check the result of the getppid(2) system call, and if it is the process ID of the init(8) process, then the process was orphaned before the system call.
Two problems emerge from this approach:
First, the init process can be replaced by any user process, so how can we be sure that the init process will always be the parent of all orphaned processes? Well, in the exit system call code there is an explicit check to see whether the process executing the call is the init process (the process with PID 1), and if so, the kernel panics (it would no longer be able to maintain the process hierarchy), so the init process is not permitted to make an exit(2) call.
Second, there's a race condition in the basic test described above. The init process's ID is historically assumed to be 1, but that is not guaranteed by POSIX, which states (as explained in another answer) only that a system process ID is reserved for that purpose. In practice almost no POSIX implementation departs from this, and on systems derived from the original UNIX you can assume that getting 1 back from getppid(2) is enough to conclude that the process is an orphan. Another way to check is to call getppid(2) just after the fork and compare that value with the result of a later call. This does not work in all cases, because the two calls are not atomic together, and the parent process can die after the fork(2) and before the first getppid(2) call. The parent process ID only changes once, when the parent makes an exit(2) call, so checking whether the getppid(2) result changed between calls should be enough to see that the parent process has exited. This test is not valid for actual children of the init process, because they are always children of init(8), but you can safely treat these processes as having no parent either (except when you replace the init process on a system).
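As a rough illustration of the "did getppid(2) change?" test, here is a small C sketch. It uses one variation that is my own assumption rather than part of the description above: the parent's PID is recorded with getpid() before the fork, which sidesteps the window between fork(2) and the first getppid(2) call. The names parent_has_exited and orphan_poll_demo, the pipe, the 'O' marker byte, the 10 ms polling interval and the 5 second alarm are all hypothetical and exist only to make the behaviour observable.

```c
#define _POSIX_C_SOURCE 200809L  /* expose nanosleep() under strict -std modes */
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* True once the calling process has been reparented away from original_parent. */
int parent_has_exited(pid_t original_parent)
{
    return getppid() != original_parent;
}

/*
 * Hypothetical demo: a parent forks a child and exits at once; the child
 * polls until it notices the reparenting and then reports through a pipe.
 * Returns 1 if the child detected the parent's exit, 0 if not, -1 on error.
 */
int orphan_poll_demo(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t parent = fork();
    if (parent == -1)
        return -1;

    if (parent == 0) {
        pid_t original_parent = getpid();  /* recorded before fork(): no race */
        if (fork() == 0) {
            close(fds[0]);
            alarm(5);                      /* safety net so the demo cannot hang */
            struct timespec delay = { 0, 10 * 1000 * 1000 };  /* 10 ms */
            while (!parent_has_exited(original_parent))
                nanosleep(&delay, NULL);
            char c = 'O';
            ssize_t r = write(fds[1], &c, 1);
            (void)r;
            _exit(0);
        }
        _exit(0);                          /* parent exits, orphaning the child */
    }

    close(fds[1]);
    waitpid(parent, NULL, 0);
    char c = 0;
    ssize_t n = read(fds[0], &c, 1);
    close(fds[0]);
    return (n == 1 && c == 'O') ? 1 : 0;
}
```

Comparing against the recorded PID rather than against 1 also keeps the check meaningful on systems where orphans are adopted by some process other than PID 1.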
I passed the parent PID to the child through the environment,
then had the child periodically check whether /proc/$ppid still exists.
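A minimal C sketch of that check could look like the following. The environment variable name PARENT_PID and the function name parent_alive_from_env are made up for the example, and it assumes a Linux-style /proc filesystem is mounted.

```c
#define _POSIX_C_SOURCE 200809L  /* expose POSIX declarations under strict -std modes */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Looks up a PID stored in the environment variable `var` (hypothetical
 * convention: the parent sets it before spawning the child).
 * Returns 1 if /proc/<pid> exists, 0 if it does not,
 * and -1 if the variable is unset or does not hold a positive number.
 */
int parent_alive_from_env(const char *var)
{
    const char *value = getenv(var);
    if (value == NULL || *value == '\0')
        return -1;

    char *end;
    long pid = strtol(value, &end, 10);
    if (*end != '\0' || pid <= 0)
        return -1;

    char path[64];
    snprintf(path, sizeof path, "/proc/%ld", pid);
    return access(path, F_OK) == 0 ? 1 : 0;
}
```

The child would call something like parent_alive_from_env("PARENT_PID") on a timer and exit when it returns 0. One caveat: PIDs can be reused, so after a long enough time /proc/$ppid may exist again for an unrelated process.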
I managed to do a portable, non-polling solution with 3 processes by abusing terminal control and sessions.
The trick is:
process A is started
process A creates a pipe P (and never reads from it)
process A forks into process B
process B creates a new session
process B allocates a virtual terminal for that new session
process B installs SIGCHLD handler to die when the child exits
process B sets a SIGPIPE handler
process B forks into process C
process C does whatever it needs (e.g. exec()s the unmodified binary or runs whatever logic)
process B writes to pipe P (and blocks that way)
process A wait()s on process B and exits when it dies
That way:
if process A dies: process B gets a SIGPIPE and dies
if process B dies: process A's wait() returns and A dies, process C gets a SIGHUP (because when the session leader of a session with a terminal attached dies, all processes in the foreground process group get a SIGHUP)
if process C dies: process B gets a SIGCHLD and dies, so process A dies
Shortcomings:
process C can't handle SIGHUP
process C will be run in a different session
process C can't use session/process group API because it'll break the brittle setup
creating a terminal for every such operation is not the best idea ever
If the parent dies, the PPID of its orphans changes to 1, so you only need to check your own PPID.
In a way, this is the polling mentioned above.
Here is a shell snippet for that:
check_parent () {
parent=`ps -f|awk '$2=='$PID'{print $3 }'`
echo "parent:$parent"
let parent=$parent+0
if [[ $parent -eq 1 ]]; then
echo "parent is dead, exiting"
exit;
fi
}
PID=$$
cnt=0
while [[ 1 = 1 ]]; do
check_parent
... something
done

Resources