Differences between fork and exec - c

What are the differences between fork and exec?

The use of fork and exec exemplifies the spirit of UNIX in that it provides a very simple way to start new processes.
The fork call basically makes a duplicate of the current process, identical in almost every way. Not everything is copied over (for example, resource limits in some implementations) but the idea is to create as close a copy as possible.
The new process (child) gets a different process ID (PID) and has the PID of the old process (parent) as its parent PID (PPID). Because the two processes are now running exactly the same code, they can tell which is which by the return code of fork - the child gets 0, the parent gets the PID of the child. This is all, of course, assuming the fork call works - if not, no child is created and the parent gets an error code.
The exec call is a way to basically replace the entire current process with a new program. It loads the program into the current process space and runs it from the entry point.
So, fork and exec are often used in sequence to get a new program running as a child of a current process. Shells typically do this whenever you try to run a program like find - the shell forks, then the child loads the find program into memory, setting up all command line arguments, standard I/O and so forth.
But they're not required to be used together. It's perfectly acceptable for a program to fork itself without execing if, for example, the program contains both parent and child code (you need to be careful what you do, each implementation may have restrictions). This was used quite a lot (and still is) for daemons which simply listen on a TCP port and fork a copy of themselves to process a specific request while the parent goes back to listening.
Similarly, programs that know they're finished and just want to run another program don't need to fork, exec and then wait for the child. They can just load the child directly into their process space.
Some UNIX implementations have an optimized fork which uses what they call copy-on-write. This is a trick to delay the copying of the process space in fork until the program attempts to change something in that space. This is useful for those programs using only fork and not exec in that they don't have to copy an entire process space.
If the exec is called following fork (and this is what happens mostly), that causes a write to the process space and it is then copied for the child process.
Note that there is a whole family of exec calls (execl, execle, execve and so on) but exec in context here means any of them.
The following diagram illustrates the typical fork/exec operation where the bash shell is used to list a directory with the ls command:
+--------+
| pid=7 |
| ppid=4 |
| bash |
+--------+
|
| calls fork
V
+--------+ +--------+
| pid=7 | forks | pid=22 |
| ppid=4 | ----------> | ppid=7 |
| bash | | bash |
+--------+ +--------+
| |
| waits for pid 22 | calls exec to run ls
| V
| +--------+
| | pid=22 |
| | ppid=7 |
| | ls |
V +--------+
+--------+ |
| pid=7 | | exits
| ppid=4 | <---------------+
| bash |
+--------+
|
| continues
V

fork() splits the current process into two processes. Or in other words, your nice linear easy to think of program suddenly becomes two separate programs running one piece of code:
int pid = fork();
if (pid == 0)
{
printf("I'm the child");
}
else
{
printf("I'm the parent, my child is %i", pid);
// here we can kill the child, but that's not very parently of us
}
This can kind of blow your mind. Now you have one piece of code with pretty much identical state being executed by two processes. The child process inherits all the code and memory of the process that just created it, including starting from where the fork() call just left off. The only difference is the fork() return code to tell you if you are the parent or the child. If you are the parent, the return value is the id of the child.
exec is a bit easier to grasp, you just tell exec to execute a process using the target executable and you don't have two processes running the same code or inheriting the same state. Like #Steve Hawkins says, exec can be used after you forkto execute in the current process the target executable.

I think some concepts from "Advanced Unix Programming" by Marc Rochkind were helpful in understanding the different roles of fork()/exec(), especially for someone used to the Windows CreateProcess() model:
A program is a collection of instructions and data that is kept in a regular file on disk. (from 1.1.2 Programs, Processes, and Threads)
.
In order to run a program, the kernel is first asked to create a new process, which is an environment in which a program executes. (also from 1.1.2 Programs, Processes, and Threads)
.
It’s impossible to understand the exec or fork system calls without fully understanding the distinction between a process and a program. If these terms are new to you, you may want to go back and review Section 1.1.2. If you’re ready to proceed now, we’ll summarize the distinction in one sentence: A process is an execution environment that consists of instruction, user-data, and system-data segments, as well as lots of other resources acquired at runtime, whereas a program is a file containing instructions and data that are used to initialize the instruction and user-data segments of a process. (from 5.3 exec System Calls)
Once you understand the distinction between a program and a process, the behavior of fork() and exec() function can be summarized as:
fork() creates a duplicate of the current process
exec() replaces the program in the current process with another program
(this is essentially a simplified 'for dummies' version of paxdiablo's much more detailed answer)

Fork creates a copy of a calling process.
generally follows the structure
int cpid = fork( );
if (cpid = = 0)
{
//child code
exit(0);
}
//parent code
wait(cpid);
// end
(for child process text(code),data,stack is same as calling process)
child process executes code in if block.
EXEC replaces the current process with new process's code,data,stack.
generally follows the structure
int cpid = fork( );
if (cpid = = 0)
{
//child code
exec(foo);
exit(0);
}
//parent code
wait(cpid);
// end
(after exec call unix kernel clears the child process text,data,stack and fills with foo process related text/data)
thus child process is with different code (foo's code {not same as parent})

They are use together to create a new child process. First, calling fork creates a copy of the current process (the child process). Then, exec is called from within the child process to "replace" the copy of the parent process with the new process.
The process goes something like this:
child = fork(); //Fork returns a PID for the parent process, or 0 for the child, or -1 for Fail
if (child < 0) {
std::cout << "Failed to fork GUI process...Exiting" << std::endl;
exit (-1);
} else if (child == 0) { // This is the Child Process
// Call one of the "exec" functions to create the child process
execvp (argv[0], const_cast<char**>(argv));
} else { // This is the Parent Process
//Continue executing parent process
}

The main difference between fork() and exec() is that,
The fork() system call creates a clone of the currently running program. The original program continues execution with the next line of code after the fork() function call. The clone also starts execution at the next line of code.
Look at the following code that i got from http://timmurphy.org/2014/04/26/using-fork-in-cc-a-minimum-working-example/
#include <stdio.h>
#include <unistd.h>
int main(int argc, char **argv)
{
printf("--beginning of program\n");
int counter = 0;
pid_t pid = fork();
if (pid == 0)
{
// child process
int i = 0;
for (; i < 5; ++i)
{
printf("child process: counter=%d\n", ++counter);
}
}
else if (pid > 0)
{
// parent process
int j = 0;
for (; j < 5; ++j)
{
printf("parent process: counter=%d\n", ++counter);
}
}
else
{
// fork failed
printf("fork() failed!\n");
return 1;
}
printf("--end of program--\n");
return 0;
}
This program declares a counter variable, set to zero, before fork()ing. After the fork call, we have two processes running in parallel, both incrementing their own version of counter. Each process will run to completion and exit. Because the processes run in parallel, we have no way of knowing which will finish first. Running this program will print something similar to what is shown below, though results may vary from one run to the next.
--beginning of program
parent process: counter=1
parent process: counter=2
parent process: counter=3
child process: counter=1
parent process: counter=4
child process: counter=2
parent process: counter=5
child process: counter=3
--end of program--
child process: counter=4
child process: counter=5
--end of program--
The exec() family of system calls replaces the currently executing code of a process with another piece of code. The process retains its PID but it becomes a new program. For example, consider the following code:
#include <stdio.h>
#include <unistd.h>
main() {
char program[80],*args[3];
int i;
printf("Ready to exec()...\n");
strcpy(program,"date");
args[0]="date";
args[1]="-u";
args[2]=NULL;
i=execvp(program,args);
printf("i=%d ... did it work?\n",i);
}
This program calls the execvp() function to replace its code with the date program. If the code is stored in a file named exec1.c, then executing it produces the following output:
Ready to exec()...
Tue Jul 15 20:17:53 UTC 2008
The program outputs the line ―Ready to exec() . . . ‖ and after calling the execvp() function, replaces its code with the date program. Note that the line ― . . . did it work‖ is not displayed, because at that point the code has been replaced. Instead, we see the output of executing ―date -u.‖

fork() creates a copy of the current process, with execution in the new child starting from just after the fork() call. After the fork(), they're identical, except for the return value of the fork() function. (RTFM for more details.) The two processes can then diverge still further, with one unable to interfere with the other, except possibly through any shared file handles.
exec() replaces the current process with a new one. It has nothing to do with fork(), except that an exec() often follows fork() when what's wanted is to launch a different child process, rather than replace the current one.

fork():
It creates a copy of running process. The running process is called parent process & newly created process is called child process. The way to differentiate the two is by looking at the returned value:
fork() returns the process identifier (pid) of the child process in the parent
fork() returns 0 in the child.
exec():
It initiates a new process within a process. It loads a new program into the current process, replacing the existing one.
fork() + exec():
When launching a new program is to firstly fork(), creating a new process, and then exec() (i.e. load into memory and execute) the program binary it is supposed to run.
int main( void )
{
int pid = fork();
if ( pid == 0 )
{
execvp( "find", argv );
}
//Put the parent to sleep for 2 sec,let the child finished executing
wait( 2 );
return 0;
}

The prime example to understand the fork() and exec() concept is the shell,the command interpreter program that users typically executes after logging into the system.The shell interprets the first word of command line as a command name
For many commands,the shell forks and the child process execs the command associated with the name treating the remaining words on the command line as parameters to the command.
The shell allows three types of commands. First, a command can be an
executable file that contains object code produced by compilation of source code (a C program for example). Second, a command can be an executable file that
contains a sequence of shell command lines. Finally, a command can be an internal shell command.(instead of an executable file ex->cd,ls etc.)

Related

How can the multi-core cpu run the program interleaved?

The output of the program are not obviously contents from the printf()s in teh code. Instead it looks like characters in irregular sequence. I know the reason is because the parent process and child process are running
at the same time, but in this program I only see pid=fork(), which I think means pid is only the id of child process.
So why can the parent process print?
How do the two processes run together?
// fork.c: create a new process
#include "kernel/types.h"
#include "user/user.h"
int
main()
{
int pid;
pid = fork();
printf("fork() returned %d\n", pid);
if(pid == 0){
printf("child\n");
} else {
printf("parent\n");
}
exit(0);
}
output:
ffoorrkk(()) rreettuurrnende d 0
1c9h
ilpda
rent
I focus my answer on showing how the observed output can result from the shown program. I think that it will already clear things up for you.
This is your output.
I edited it to use a good guess of what is parent (p) and child (c):
ffoorrkk(()) rreettuurrnende d 0\n
cpcpcpcpcpcpcpcpcpcpcpcpccpcpcppccc
1 c9h\n
pccpcpp
ilpda\n
ccpcpcc
rent
pppp
If you only use the chars with a "c" beneath, you get
fork() returned 0
child
If you only use the chars with a "p" beneath, you get
fork() returned 19
parent
Split that way, it should match what you know about how fork() works.
Comments already provided the actual answer to the three "?"-adorned questions in title and body of your question post.
Lundin:
It creates two processes and they are executed just as any other process, decided by the OS scheduler.
Yourself:
each time fork() is called it will return twice, the parent process will return the id of child process, and child process will return 0
Maybe for putting a more obvious point on it:
The parent process receives the child ID and also continues executing the program after the fork().
That is why the output occurs twice, similarily, interleaved, with differences in PID value and the selected if branch.
Relevant is also that in the given situation there is no line buffering. Otherwise there would be no character-by-character interleaving and everthing would be much more readable.

How to know if a process is a parent or a child

How does one identify if a process is a child/grandchild of another process using its pid?
Process IDs: Child- and parent processes
All running programs have a unique process ID. The
process ID, a
non-negative integer, is the only identifier of a process that is
always unique. But, process IDs are reused.
As a process terminates its ID becomes available for reuse. Certain
systems delay reuse so that newly created processes are not confused
with old ones.
Certain IDs are "reserved" in the sense that they are being used by
system processes, such as the scheduler process. Another example is
the init process that always occupies PID 1. Depending on the system
the ID might be actively reserved.
Running the commands
> ps -eaf | head -n 5
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 11:49 ? 00:00:02 /sbin/init splash
root 2 0 0 11:49 ? 00:00:00 [kthreadd]
root 3 2 0 11:49 ? 00:00:00 [ksoftirqd/0]
root 5 2 0 11:49 ? 00:00:00 [kworker/0:0H]
and
> pidof init
1
will allow you to independently verify this.1
In C we can use the following functions to get the process ID of the
calling process and the parent process ID of the calling process,
#include <unistd.h>
pid_t getpid(void);
pid_t getppid(void);
A process can create other processes. The created processes are called
"child processes" and we refer to the process that created them as
the "parent process".
Creating a new process using fork()
To create a child process we use the system call
fork()
#include <unistd.h>
pid_t fork(void);
The function is called once, by the parent process, but it returns
twice. The return value in the child process is 0, and the return
value in the parent process is the process ID of the new child.1
A process can have multiple child processes but there is no system
call for a process to get the process IDs of all of its children, so
the parent observes the return value of the child process and can use
these identifiers to manage them.
A process can only have a single parent process, which is always
obtainable by calling getppid.
The child is a copy of the parent, it gets a copy of the parent's
data space, heap and stack. They do not share these portions of
memory! 2
We will compile and execute the following code snippet to see how
this works,
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/syscall.h>
int main(void) {
int var = 42; // This variable is created on the stack
pid_t pid;
// Two processes are created here
// v~~~~~~~~~~|
if ((pid = fork()) < 0) {
perror("Fork failed");
} else if (pid == 0) { // <- Both processes continue executing here
// This variable gets copied
var++;
printf("This is the child process:\n"
"\t my pid=%d\n"
"\t parent pid=%d\n"
"\t var=%d\n", getpid(), getppid(), var);
} else {
printf("This is the parent process:\n"
"\t my pid=%d\n"
"\t child pid=%d\n"
"\t var=%d\n", getpid(), pid, var);
}
return 0;
}
We will see when we execute the program that there are no guarantees
as to which process gets to execute first. They may even operate
simultaneously, effectively interleaving their output. 3
$ # Standard compilation
$ gcc -std=c99 -Wall fork_example1.c -o fork_example1
$ # Sometimes the child executes in its entirety first
$ ./fork_example1
This is the child process:
my pid=26485
parent pid=26484
var=43
This is the parent process:
my pid=26484
child pid=26485
var=42
$ # and sometimes the parent executes in its entirety first
$ ./fork_example1
This is the parent process:
my pid=26461
child pid=26462
var=42
This is the child process:
my pid=26462
parent pid=26461
var=43
$ # At times the two might interleave
$ ./fork_example1
This is the parent process:
my pid=26455
This is the child process:
my pid=26456
parent pid=26455
var=43
child pid=26456
var=42
1 PID stands for Process ID and PPID stands for
Parent Process ID.
2 The process ID 0 is reserved for use by the kernel, so
it is not possible for 0 to be the process ID of a child.
3 Many systems do not perform a complete copy of these
memory segments and instead only creates a copy when either process
performs a write. Initially, the shared regions are marked by the
kernel as "read-only" and whenever a process tries to modify these
regions the kernel awards each process their own copy of that memory.
4 Standard out is buffered so it's not a perfect example.
Use getpid() and getppid() function to get process id and parent process id.

At what point does a fork() child process actually begin?

Does the process begin when fork() is declared? Is anything being killed here?
pid_t child;
child = fork();
kill (child, SIGKILL);
Or do you need to declare actions for the fork process in order for it to actually "begin"?
pid_t child;
child = fork();
if (child == 0) {
// do something
}
kill (child, SIGKILL);
I ask because what I am trying to do is create two children, wait for the first to complete, and then kill the second before exiting:
pid_t child1;
pid_t child2;
child1 = fork();
child2 = fork();
int status;
if (child1 == 0) { //is this line necessary?
}
waitpid(child1, &status, 0);
kill(child2, SIGKILL);
The C function fork is defined in the standard C library (glibc on linux). When you call it, it performs an equivalent system call (on linux its name is clone) by the means of a special CPU instruction (on x86 sysenter). This causes the CPU to switch to a privileged mode and start executing instructions of the kernel. The kernel then creates a new process (a record in a list and accompanying structures), which inherits a copy of memory mappings of the original process (text, heap, stack, and others), file descriptors and more.
The memory areas are marked as non-writable, so that when the new or the original process tries to overwrite them, the kernel gets to handle a CPU exception and perform a copy-on-write (therefore delaying the need to copy a memory page until absolutely necessary). That's because the mappings initially point to the same pages (pieces of physical memory) in both processes.
The kernel then gives execution to the scheduler, which decides which process to run next. It could be the original process, the child process, or any other process running in the system.
Note: The Linux kernel actually puts the child process in front of the parent process in the run queue, so it is run earlier than the parent. This is deemed to give better performance when the child calls exec right after forking.
When execution is given to the original process, the CPU is switched back to nonprivileged mode and starts executing the next instruction. In this case it continues with the fork function of the standard library, which returns the PID of the child process (as returned by the clone system call).
Similarly, the child process continues execution in the fork function, but here it returns 0 to the calling function.
After that, the program continues in both cases normally. The child process has the original process as the parent (this is noted in a structure in the kernel). When it exists, the parent process is supposed to do the cleanup (receiving the exit status of the child) by calling wait.
Note: The clone system call is rather complicated, because it unifies fork with the creation of threads, as well as linux namespaces. Other operating systems have different implementation of fork, e.g. FreeBSD has fork system call by itself.
Disclaimer: I am not a kernel developer. If you know better, please correct the answer.
See Also
clone (2)
The Design and Implementation of the FreeBSD Operating System (Google Books)
Understanding the Linux Kernel (Google Books)
Is it true that fork() calls clone() internally?
"Declare" is the wrong word to use in this context; C uses that word to talk about constructs that merely assert the existence of something, e.g.
extern int fork(void);
is a declaration of the function fork. Writing that in your code (or having it written for you as a consequence of #include <unistd.h>) does not cause fork to be called.
Now, the statement in your sample code, child = fork(); when written inside a function body, does (generate code to) make a call to the function fork. That function, assuming it is in fact the system primitive fork(2) on your operating system, and assuming it succeeds, has the special behavior of returning twice, once in the original process and once in a new process, with different return values in each so you can tell which is which.
So the answer to your question is that in both of the code fragments you showed, assuming the things I mentioned in the previous paragraph, all of the code after the child = fork(); line is at least potentially executed twice, once by the child and once by the parent. The if (child == 0) { ... } construct (again, this is not a "declaration") is the standard idiom for making parent and child do different things.
EDIT: In your third code sample, yes, the child1 == 0 block is necessary, but not to ensure that the child is created. Rather, it is there to ensure that whatever you want child1 to do is done only in child1. Moreover, as written (and, again, assuming all calls succeed) you are creating three child processes, because the second fork call will be executed by both parent and child! You probably want something like this instead:
pid_t child1, child2;
int status;
child1 = fork();
if (child1 == -1) {
perror("fork");
exit(1);
}
else if (child1 == 0) {
execlp("program_to_run_in_child_1", (char *)0);
/* if we get here, exec failed */
_exit(127);
}
child2 = fork();
if (child2 == -1) {
perror("fork");
kill(child1, SIGTERM);
exit(1);
}
else if (child2 == 0) {
execlp("program_to_run_in_child_2", (char *)0);
/* if we get here, exec failed */
_exit(127);
}
/* control reaches this point only in the parent and only when
both fork calls succeeded */
if (waitpid(child1, &status, 0) != child1) {
perror("waitpid");
kill(child1, SIGTERM);
}
/* only use SIGKILL as a last resort */
kill(child2, SIGTERM);
FYI, this is only a skeleton. If I were writing code to do this for real (which I have: see for instance https://github.com/zackw/tbbscraper/blob/master/scripts/isolate.c ) there would be a whole bunch more code just to comprehensively detect and report errors, plus the additional logic required to deal with file descriptor management in the children and a few other wrinkles.
The fork process spawns a new process identical to the old one and returns in both functions.
This happens automatically so you don't have to take any actions.
But nevertheless, it is cleaner to check if the call indeed succeeded:
A value below 0 indicates failure. In this case, it is not good to call kill().
A value == 0 indicates that we are the child process. In this case, it is not very clean to call kill().
A value > 0 indicates that we are the parent process. In this case, the return value is our child. Here it is safe to call kill().
In your case, you even end up with 4 processes:
Your parent calls fork(), being left with 2 processes.
Both of them call fork() again, resulting in a new child process for each of them.
You should move the 2nd fork() process into the branch where the parent code runs.
The child process begins some time after fork() has been called (there is some setup which happens in the context of the child).
You can be sure that the child is running when fork() returns.
So the code
pid_t child = fork();
kill (child, SIGKILL);
will kill the child. The child might execute kill(0, SIGKILL) which does nothing and returns an error.
There is no way to tell whether the child might ever live long enough to execute it's kill. Most likely, it won't since the Linux kernel will set up the process structure for the child and let the parent continue. The child will just be waiting in the ready list of the processes. The kill will then remove it again.
EDIT If fork() returns a value <= 0, then you shouldn't wait or kill.

Where does code Execution start in a child process?

Consider the code:
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>
/* main --- do the work */
int main(int argc, char **argv)
{
pid_t child;
if ((child = fork()) < 0) {
fprintf(stderr, "%s: fork of child failed: %s\n",
argv[0], strerror(errno));
exit(1);
} else if (child == 0) {
// do something in child
}
} else {
// do something in parent
}
}
My question is from where does in the code the child process starts executing, i.e. which line is executed first??
If it executes the whole code, it will also create its own child process and thing will go on happening continuously which does not happen for sure!!!
If it starts after the fork() command, how does it goes in if statement at first??
It starts the execution of the child in the return of the fork function. Not in the start of the code. The fork returns the pid of the child in the parent process, and return 0 in the child process.
When you execute a fork() the thread is duplicated into memory.
So what effectively happens is that you will have two threads that executes the snippet you posted but their fork() return values will be different.
For the child thread fork() will return 0, so the other branch of the if won't be executed, same thing happens for the father thread.
When fork() is called the operating system assigns a new address space to the new thread that is going to spawn, then starts it, they will both share the same code segment but since the return value will be different they'll execute different parts of the code (if correctly split, like in your example)
The child starts by executing the next instruction (not line) after fork. So in your case it is the assignment of the fork's return value to the child variable.
Well, if i understand your question correctly, i can say to you that your code will run as a process already.When you run a code,it is already a process , so that this process goes if statement anyway. After fork(), you will have another process(child process).
In Unix, a process can create another process, that's why that happens.
Code execution in a child process starts from the next instruction following the fork() system call.
fork() system call just creates a seperate address space for the child process therefore it is a cloned copy of the parent process and the child process has all the memory elements of it's parent's process.
Thus, after spawning a child process through fork(), both processes (the parent process and the child process) resumes the execution right from the next instruction following the fork() system call.

How to make child process die after parent exits?

Suppose I have a process which spawns exactly one child process. Now when the parent process exits for whatever reason (normally or abnormally, by kill, ^C, assert failure or anything else) I want the child process to die. How to do that correctly?
Some similar question on stackoverflow:
(asked earlier) How can I cause a child process to exit when the parent does?
(asked later) Are child processes created with fork() automatically killed when the parent is killed?
Some similar question on stackoverflow for Windows:
How do I automatically destroy child processes in Windows?
Kill child process when parent process is killed
Child can ask kernel to deliver SIGHUP (or other signal) when parent dies by specifying option PR_SET_PDEATHSIG in prctl() syscall like this:
prctl(PR_SET_PDEATHSIG, SIGHUP);
See man 2 prctl for details.
Edit: This is Linux-only
I'm trying to solve the same problem, and since my program must run on OS X, the Linux-only solution didn't work for me.
I came to the same conclusion as the other people on this page -- there isn't a POSIX-compatible way of notifying a child when a parent dies. So I kludged up the next-best thing -- having the child poll.
When a parent process dies (for any reason) the child's parent process becomes process 1. If the child simply polls periodically, it can check if its parent is 1. If it is, the child should exit.
This isn't great, but it works, and it's easier than the TCP socket/lockfile polling solutions suggested elsewhere on this page.
I have achieved this in the past by running the "original" code in the "child" and the "spawned" code in the "parent" (that is: you reverse the usual sense of the test after fork()). Then trap SIGCHLD in the "spawned" code...
May not be possible in your case, but cute when it works.
Under Linux, you can install a parent death signal in the child, e.g.:
#include <sys/prctl.h> // prctl(), PR_SET_PDEATHSIG
#include <signal.h> // signals
#include <unistd.h> // fork()
#include <stdio.h> // perror()
// ...
pid_t ppid_before_fork = getpid();
pid_t pid = fork();
if (pid == -1) { perror(0); exit(1); }
if (pid) {
; // continue parent execution
} else {
int r = prctl(PR_SET_PDEATHSIG, SIGTERM);
if (r == -1) { perror(0); exit(1); }
// test in case the original parent exited just
// before the prctl() call
if (getppid() != ppid_before_fork)
exit(1);
// continue child execution ...
Note that storing the parent process id before the fork and testing it in the child after prctl() eliminates a race condition between prctl() and the exit of the process that called the child.
Also note that the parent death signal of the child is cleared in newly created children of its own. It is not affected by an execve().
That test can be simplified if we are certain that the system process who is in charge of adopting all orphans has PID 1:
pid_t pid = fork();
if (pid == -1) { perror(0); exit(1); }
if (pid) {
; // continue parent execution
} else {
int r = prctl(PR_SET_PDEATHSIG, SIGTERM);
if (r == -1) { perror(0); exit(1); }
// test in case the original parent exited just
// before the prctl() call
if (getppid() == 1)
exit(1);
// continue child execution ...
Relying on that system process being init and having PID 1 isn't portable, though. POSIX.1-2008 specifies:
The parent process ID of all of the existing child processes and zombie processes of the calling process shall be set to the process ID of an implementation-defined system process. That is, these processes shall be inherited by a special system process.
Traditionally, the system process adopting all orphans is PID 1, i.e. init - which is the ancestor of all processes.
On modern systems like Linux or FreeBSD another process might have that role. For example, on Linux, a process can call prctl(PR_SET_CHILD_SUBREAPER, 1) to establish itself as system process that inherits all orphans of any of its descendants (cf. an example on Fedora 25).
If you're unable to modify the child process, you can try something like the following:
int pipes[2];
pipe(pipes)
if (fork() == 0) {
close(pipes[1]); /* Close the writer end in the child*/
dup2(pipes[0], STDIN_FILENO); /* Use reader end as stdin (fixed per  maxschlepzig */
exec("sh -c 'set -o monitor; child_process & read dummy; kill %1'")
}
close(pipes[0]); /* Close the reader end in the parent */
This runs the child from within a shell process with job control enabled. The child process is spawned in the background. The shell waits for a newline (or an EOF) then kills the child.
When the parent dies--no matter what the reason--it will close its end of the pipe. The child shell will get an EOF from the read and proceed to kill the backgrounded child process.
For completeness sake. On macOS you can use kqueue:
void noteProcDeath(
CFFileDescriptorRef fdref,
CFOptionFlags callBackTypes,
void* info)
{
// LOG_DEBUG(#"noteProcDeath... ");
struct kevent kev;
int fd = CFFileDescriptorGetNativeDescriptor(fdref);
kevent(fd, NULL, 0, &kev, 1, NULL);
// take action on death of process here
unsigned int dead_pid = (unsigned int)kev.ident;
CFFileDescriptorInvalidate(fdref);
CFRelease(fdref); // the CFFileDescriptorRef is no longer of any use in this example
int our_pid = getpid();
// when our parent dies we die as well..
LOG_INFO(#"exit! parent process (pid %u) died. no need for us (pid %i) to stick around", dead_pid, our_pid);
exit(EXIT_SUCCESS);
}
void suicide_if_we_become_a_zombie(int parent_pid) {
// int parent_pid = getppid();
// int our_pid = getpid();
// LOG_ERROR(#"suicide_if_we_become_a_zombie(). parent process (pid %u) that we monitor. our pid %i", parent_pid, our_pid);
int fd = kqueue();
struct kevent kev;
EV_SET(&kev, parent_pid, EVFILT_PROC, EV_ADD|EV_ENABLE, NOTE_EXIT, 0, NULL);
kevent(fd, &kev, 1, NULL, 0, NULL);
CFFileDescriptorRef fdref = CFFileDescriptorCreate(kCFAllocatorDefault, fd, true, noteProcDeath, NULL);
CFFileDescriptorEnableCallBacks(fdref, kCFFileDescriptorReadCallBack);
CFRunLoopSourceRef source = CFFileDescriptorCreateRunLoopSource(kCFAllocatorDefault, fdref, 0);
CFRunLoopAddSource(CFRunLoopGetMain(), source, kCFRunLoopDefaultMode);
CFRelease(source);
}
Inspired by another answer here, I came up with the following all-POSIX solution. The general idea is to create an intermediate process between the parent and the child, that has one purpose: Notice when the parent dies, and explicitly kill the child.
This type of solution is useful when the code in the child can't be modified.
int p[2];
pipe(p);
pid_t child = fork();
if (child == 0) {
close(p[1]); // close write end of pipe
setpgid(0, 0); // prevent ^C in parent from stopping this process
child = fork();
if (child == 0) {
close(p[0]); // close read end of pipe (don't need it here)
exec(...child process here...);
exit(1);
}
read(p[0], 1); // returns when parent exits for any reason
kill(child, 9);
exit(1);
}
There are two small caveats with this method:
If you deliberately kill the intermediate process, then the child won't be killed when the parent dies.
If the child exits before the parent, then the intermediate process will try to kill the original child pid, which could now refer to a different process. (This could be fixed with more code in the intermediate process.)
As an aside, the actual code I'm using is in Python. Here it is for completeness:
def run(*args):
(r, w) = os.pipe()
child = os.fork()
if child == 0:
os.close(w)
os.setpgid(0, 0)
child = os.fork()
if child == 0:
os.close(r)
os.execl(args[0], *args)
os._exit(1)
os.read(r, 1)
os.kill(child, 9)
os._exit(1)
os.close(r)
Does the child process have a pipe to/from the parent process? If so, you'd receive a SIGPIPE if writing, or get EOF when reading - these conditions could be detected.
I don't believe it's possible to guarantee that using only standard POSIX calls. Like real life, once a child is spawned, it has a life of its own.
It is possible for the parent process to catch most possible termination events, and attempt to kill the child process at that point, but there's always some that can't be caught.
For example, no process can catch a SIGKILL. When the kernel handles this signal it will kill the specified process with no notification to that process whatsoever.
To extend the analogy - the only other standard way of doing it is for the child to commit suicide when it finds that it no longer has a parent.
There is a Linux-only way of doing it with prctl(2) - see other answers.
This solution worked for me:
Pass stdin pipe to child - you don't have to write any data into the stream.
Child reads indefinitely from stdin until EOF. An EOF signals that the parent has gone.
This is foolproof and portable way to detect when the parent has gone. Even if parent crashes, OS will close the pipe.
This was for a worker-type process whose existence only made sense when the parent was alive.
Some posters have already mentioned pipes and kqueue. In fact you can also create a pair of connected Unix domain sockets by the socketpair() call. The socket type should be SOCK_STREAM.
Let us suppose you have the two socket file descriptors fd1, fd2. Now fork() to create the child process, which will inherit the fds. In the parent you close fd2 and in the child you close fd1. Now each process can poll() the remaining open fd on its own end for the POLLIN event. As long as each side doesn't explicitly close() its fd during normal lifetime, you can be fairly sure that a POLLHUP flag should indicate the other's termination (no matter clean or not). Upon notified of this event, the child can decide what to do (e.g. to die).
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <poll.h>
#include <stdio.h>
int main(int argc, char ** argv)
{
int sv[2]; /* sv[0] for parent, sv[1] for child */
socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
pid_t pid = fork();
if ( pid > 0 ) { /* parent */
close(sv[1]);
fprintf(stderr, "parent: pid = %d\n", getpid());
sleep(100);
exit(0);
} else { /* child */
close(sv[0]);
fprintf(stderr, "child: pid = %d\n", getpid());
struct pollfd mon;
mon.fd = sv[1];
mon.events = POLLIN;
poll(&mon, 1, -1);
if ( mon.revents & POLLHUP )
fprintf(stderr, "child: parent hung up\n");
exit(0);
}
}
You can try compiling the above proof-of-concept code, and run it in a terminal like ./a.out &. You have roughly 100 seconds to experiment with killing the parent PID by various signals, or it will simply exit. In either case, you should see the message "child: parent hung up".
Compared with the method using SIGPIPE handler, this method doesn't require trying the write() call.
This method is also symmetric, i.e. the processes can use the same channel to monitor each other's existence.
This solution calls only the POSIX functions. I tried this in Linux and FreeBSD. I think it should work on other Unixes but I haven't really tested.
See also:
unix(7) of Linux man pages, unix(4) for FreeBSD, poll(2), socketpair(2), socket(7) on Linux.
Install a trap handler to catch SIGINT, which kills off your child process if it's still alive, though other posters are correct that it won't catch SIGKILL.
Open a .lockfile with exclusive access and have the child poll on it trying to open it - if the open succeeds, the child process should exit
As other people have pointed out, relying on the parent pid to become 1 when the parent exits is non-portable. Instead of waiting for a specific parent process ID, just wait for the ID to change:
pit_t pid = getpid();
switch (fork())
{
case -1:
{
abort(); /* or whatever... */
}
default:
{
/* parent */
exit(0);
}
case 0:
{
/* child */
/* ... */
}
}
/* Wait for parent to exit */
while (getppid() != pid)
;
Add a micro-sleep as desired if you don't want to poll at full speed.
This option seems simpler to me than using a pipe or relying on signals.
I think a quick and dirty way is to create a pipe between child and parent. When parent exits, children will receive a SIGPIPE.
Another way to do this that is Linux specific is to have the parent be created in a new PID namespace. It will then be PID 1 in that namespace, and when it exits it all of it's children will be immediately killed with SIGKILL.
Unfortunately, in order to create a new PID namespace you have to have CAP_SYS_ADMIN. But, this method is very effective and requires no real change to the parent or the children beyond the initial launch of the parent.
See clone(2), pid_namespaces(7), and unshare(2).
Under POSIX, the exit(), _exit() and _Exit() functions are defined to:
If the process is a controlling process, the SIGHUP signal shall be sent to each process in the foreground process group of the controlling terminal belonging to the calling process.
So, if you arrange for the parent process to be a controlling process for its process group, the child should get a SIGHUP signal when the parent exits. I'm not absolutely sure that happens when the parent crashes, but I think it does. Certainly, for the non-crash cases, it should work fine.
Note that you may have to read quite a lot of fine print - including the Base Definitions (Definitions) section, as well as the System Services information for exit() and setsid() and setpgrp() - to get the complete picture. (So would I!)
If you send a signal to the pid 0, using for instance
kill(0, 2); /* SIGINT */
that signal is sent to the entire process group, thus effectively killing the child.
You can test it easily with something like:
(cat && kill 0) | python
If you then press ^D, you'll see the text "Terminated" as an indication that the Python interpreter have indeed been killed, instead of just exited because of stdin being closed.
In case it is relevant to anyone else, when I spawn JVM instances in forked child processes from C++, the only way I could get the JVM instances to terminate properly after the parent process completed was to do the following. Hopefully someone can provide feedback in the comments if this wasn't the best way to do this.
1) Call prctl(PR_SET_PDEATHSIG, SIGHUP) on the forked child process as suggested before launching the Java app via execv, and
2) Add a shutdown hook to the Java application that polls until its parent PID equals 1, then do a hard Runtime.getRuntime().halt(0). The polling is done by launching a separate shell that runs the ps command (See: How do I find my PID in Java or JRuby on Linux?).
EDIT 130118:
It seems that was not a robust solution. I'm still struggling a bit to understand the nuances of what's going on, but I was still sometimes getting orphan JVM processes when running these applications in screen/SSH sessions.
Instead of polling for the PPID in the Java app, I simply had the shutdown hook perform cleanup followed by a hard halt as above. Then I made sure to invoke waitpid in the C++ parent app on the spawned child process when it was time to terminate everything. This seems to be a more robust solution, as the child process ensures that it terminates, while the parent uses existing references to make sure that its children terminate. Compare this to the previous solution which had the parent process terminate whenever it pleased, and had the children try to figure out if they had been orphaned before terminating.
I found 2 solutions, both not perfect.
1.Kill all children by kill(-pid) when received SIGTERM signal.
Obviously, this solution can not handle "kill -9", but it do work for most case and very simple because it need not to remember all child processes.
var childProc = require('child_process').spawn('tail', ['-f', '/dev/null'], {stdio:'ignore'});
var counter=0;
setInterval(function(){
console.log('c '+(++counter));
},1000);
if (process.platform.slice(0,3) != 'win') {
function killMeAndChildren() {
/*
* On Linux/Unix(Include Mac OS X), kill (-pid) will kill process group, usually
* the process itself and children.
* On Windows, an JOB object has been applied to current process and children,
* so all children will be terminated if current process dies by anyway.
*/
console.log('kill process group');
process.kill(-process.pid, 'SIGKILL');
}
/*
* When you use "kill pid_of_this_process", this callback will be called
*/
process.on('SIGTERM', function(err){
console.log('SIGTERM');
killMeAndChildren();
});
}
By same way, you can install 'exit' handler like above way if you call process.exit somewhere.
Note: Ctrl+C and sudden crash have automatically been processed by OS to kill process group, so no more here.
2.Use chjj/pty.js to spawn your process with controlling terminal attached.
When you kill current process by anyway even kill -9, all child processes will be automatically killed too (by OS?). I guess that because current process hold another side of the terminal, so if current process dies, the child process will get SIGPIPE so dies.
var pty = require('pty.js');
//var term =
pty.spawn('any_child_process', [/*any arguments*/], {
name: 'xterm-color',
cols: 80,
rows: 30,
cwd: process.cwd(),
env: process.env
});
/*optionally you can install data handler
term.on('data', function(data) {
process.stdout.write(data);
});
term.write(.....);
*/
Even though 7 years have passed I've just run into this issue as I'm running SpringBoot application that needs to start webpack-dev-server during development and needs to kill it when the backend process stops.
I try to use Runtime.getRuntime().addShutdownHook but it worked on Windows 10 but not on Windows 7.
I've change it to use a dedicated thread that waits for the process to quit or for InterruptedException which seems to work correctly on both Windows versions.
private void startWebpackDevServer() {
String cmd = isWindows() ? "cmd /c gradlew webPackStart" : "gradlew webPackStart";
logger.info("webpack dev-server " + cmd);
Thread thread = new Thread(() -> {
ProcessBuilder pb = new ProcessBuilder(cmd.split(" "));
pb.redirectOutput(ProcessBuilder.Redirect.INHERIT);
pb.redirectError(ProcessBuilder.Redirect.INHERIT);
pb.directory(new File("."));
Process process = null;
try {
// Start the node process
process = pb.start();
// Wait for the node process to quit (blocking)
process.waitFor();
// Ensure the node process is killed
process.destroyForcibly();
System.setProperty(WEBPACK_SERVER_PROPERTY, "true");
} catch (InterruptedException | IOException e) {
// Ensure the node process is killed.
// InterruptedException is thrown when the main process exit.
logger.info("killing webpack dev-server", e);
if (process != null) {
process.destroyForcibly();
}
}
});
thread.start();
}
Historically, from UNIX v7, the process system has detected orphanity of processes by checking a process' parent id. As I say, historically, the init(8) system process is a special process by only one reason: It cannot die. It cannot die because the kernel algorithm to deal with assigning a new parent process id, depends on this fact. when a process executes its exit(2) call (by means of a process system call or by external task as sending it a signal or the like) the kernel reassigns all children of this process the id of the init process as their parent process id. This leads to the most easy test, and most portable way of knowing if a process has got orphan. Just check the result of the getppid(2) system call and if it is the process id of the init(2) process then the process got orphan before the system call.
Two issues emerge from this approach that can lead to issues:
first, we have the possibility of changing the init process to any user process, so How can we assure that the init process will always be parent of all orphan processes? Well, in the exit system call code there's a explicit check to see if the process executing the call is the init process (the process with pid equal to 1) and if that's the case, the kernel panics (It should not be able anymore to maintain the process hierarchy) so it is not permitted for the init process to do an exit(2) call.
second, there's a race condition in the basic test exposed above. Init process' id is assumed historically to be 1, but that's not warranted by the POSIX approach, that states (as exposed in other response) that only a system's process id is reserved for that purpose. Almost no posix implementation does this, and you can assume in original unix derived systems that having 1 as response of getppid(2) system call is enough to assume the process is orphan. Another way to check is to make a getppid(2) just after the fork and compare that value with the result of a new call. This simply doesn't work in all cases, as both call are not atomic together, and the parent process can die after the fork(2) and before the first getppid(2) system call. The processparent id only changes once, when its parent does anexit(2)call, so this should be enough to check if thegetppid(2)result changed between calls to see that parent process has exit. This test is not valid for the actual children of the init process, because they are always children ofinit(8)`, but you can assume safely these processes as having no parent either (except when you substitute in a system the init process)
I've passed parent pid using environment to the child,
then periodically checked if /proc/$ppid exists from the child.
I managed to do a portable, non-polling solution with 3 processes by abusing terminal control and sessions.
The trick is:
process A is started
process A creates a pipe P (and never reads from it)
process A forks into process B
process B creates a new session
process B allocates a virtual terminal for that new session
process B installs SIGCHLD handler to die when the child exits
process B sets a SIGPIPE handler
process B forks into process C
process C does whatever it needs (e.g. exec()s the unmodified binary or runs whatever logic)
process B writes to pipe P (and blocks that way)
process A wait()s on process B and exits when it dies
That way:
if process A dies: process B gets a SIGPIPE and dies
if process B dies: process A's wait() returns and dies, process C gets a SIGHUP (because when the session leader of a session with a terminal attached dies, all processes in the foreground process group get a SIGHUP)
if process C dies: process B gets a SIGCHLD and dies, so process A dies
Shortcomings:
process C can't handle SIGHUP
process C will be run in a different session
process C can't use session/process group API because it'll break the brittle setup
creating a terminal for every such operation is not the best idea ever
If parent dies, PPID of orphans change to 1 - you only need to check your own PPID.
In a way, this is polling, mentioned above.
here is shell piece for that:
check_parent () {
parent=`ps -f|awk '$2=='$PID'{print $3 }'`
echo "parent:$parent"
let parent=$parent+0
if [[ $parent -eq 1 ]]; then
echo "parent is dead, exiting"
exit;
fi
}
PID=$$
cnt=0
while [[ 1 = 1 ]]; do
check_parent
... something
done

Resources