Linux Fork: pid reuse - c

I wrote the following program to understand the way fork works when called without wait() or waitpid().
int main()
{
pid_t childpid;
int retval = 0;
int i;
while(1){
//usleep(1);
childpid = fork();
if (childpid >= 0)
{
i++;
if (childpid == 0)
{
exit(retval);
}
else
{
//printf("childpid is %d\n", childpid);
}
}
else
{
printf("total no. of processes created = %d\n", i);
perror("fork");
exit(0);
}
}
}
Here's the output I get->
total no. of processes created = 64901
fork: Cannot allocate memory
I expected the program to go on as I'm exiting the child process instantly and fork() should reuse the pids after pid > pid_max. Why doesn't this happen?

The exited child processes do remain in the process table as zombies. Zombie processes exist until their parent calls wait or waitpid to obtain their exit status. Also, the corresponding process id is kept, to prevent other newly created processes of duplicating it.
In your case, the process table becomes too large and the system rejects the creation of new processes.
Forking processes and then not retrieving their exit status can be regarded as a resource leak. When the parent exits, they will be adopted by the init process and then reaped, but if the parent stays alive for too long, there is no way for the system to just remove some of the zombies, because it is assumed that the parent should get interested in them at some point via wait or waitpid.

Child processes also hold some resource like memory. But they are not released because parent process can not process SIGCHLD signal, which will be sent by child processes when they exit.
Those child processes will become zombie.
You can use "ps -aux" to dump those fd.

Related

Fork() and execv() not running in parallel, why?

I'm trying to run a series of commands through execv() and forking a new process in C to run each one, and yet for some reason they aren't running in parallel. The following code is run for each process, with "full" being the filepath and "args" being the arguments. I know that the execv() part isn't the issue, it has to do with the way I'm forking and waiting.
int status;
pid_t pid = fork();
if (pid == 0) {
execv(full, args);
//perror("execv");
} else if (pid < 0) {
printf("%s\n", "Failed to fork");
status = -1;
} else {
if (waitpid(pid, &status, 0) != pid) {
status = -1;
return status;
}
}
When running this code, the forked commands simply run one after the other. I don't know how this could be happening.
If you don't want to wait for each child process, don't call waitpid immediately; as written, you fork a child, then immediately stop all processing in the parent until the child process exits, preventing you from forking any further children. If you want to launch multiple children without leaving zombie processes lying around (and possibly monitoring them all at some point to figure out their exit status), you can do one of:
Store off the pids from each fork in an array, and call waitpid on them one by one after you've launched all the processes you need to launch
Store a count of successfully launched child processes and call wait that many times to wait on them in whatever order they complete.
Ignore the SIGCHLD from the child processes entirely, for when you don't care when they exit, don't need to know their status, etc.

How to fork and create a specific number of children that perform the same task?

I need to write a C program that calls fork() a given number of times. Each child process needs to perform the same task (adding some random numbers until a given sum is reached). The parent process waits until all of the child processes have exited. I have written the following code but my output shows that another child doesn't start executing its code until the first one is done.
for (i = 0; i < num_forks; i++) {
child_pid = fork();
if (child_pid < 0) {
perror("fork\n");
} else if (child_pid == 0) {
childProcess(i, goal);
} else {
parentProcess();
}
}
EDIT: The objective is to have all of the child processes run simultaneously. The parent waits for any of the child processes to exit. As soon as any one child process exits, the parent process prints the pid of the child that exited. The remaining child processes continue to run simulatenously until another child exits and so on. If I call parentProcess() outside of the loop, the parent only prints the exiting child pid when the last child process exits.
You need to move the call to parentProcess() outside the loop:
for (i = 0; i < num_forks; i++) {
child_pid = fork();
if (child_pid < 0) {
perror("fork\n");
} else if (child_pid == 0) {
childProcess(i, goal);
}
}
parentProcess();
Otherwise, the parent waits for each child in turn before running the next.
You should be using wait() or waitpid() in a loop inside parentProcess() to collect all the children as they die — and you may do well to provide num_forks as an argument to parentProcess(). Or you need to redefine what you want done. The question suggests that you want the children running simultaneously, while the parent waits for them all to die. That means you have to launch all the children before waiting for them — subject to not running out of processes (so num_forks is a sane number like 20, not an insane number like 2,000 or 2,000,000).
So, your code in parentProcess() should be roughly like this, as a basic minimum:
void parentProcess(void)
{
int status;
int corpse;
while ((corpse = wait(&status)) != -1)
printf("%5d: 0x%.4X\n", corpse, status);
}
And this should be called outside the loop. If called inside the loop, the parent will launch one child, wait for it to finish, then repeat the process. I assume that the childProcess() function never returns; it will lead to chaos if it does return.

How do you kill zombie process using wait()

I have this code that requires a parent to fork 3 children.
How do you know (and) where to put the "wait()" statement to kill
zombie processes?
What is the command to view zombie processes if you have Linux
virtual box?
main(){
pid_t child;
printf("-----------------------------------\n");
about("Parent");
printf("Now .. Forking !!\n");
child = fork();
int i=0;
for (i=0; i<3; i++){
if (child < 0) {
perror ("Unable to fork");
break;
}
else if (child == 0){
printf ("creating child #%d\n", (i+1));
about ("Child");
break;
}
else{
child = fork();
}
}
}
void about(char * msg){
pid_t me;
pid_t oldone;
me = getpid();
oldone = getppid();
printf("***[%s] PID = %d PPID = %d.\n", msg, me, oldone);
}
How do you know (and) where to put the "wait()" statement to kill
zombie processes?
If your parent spawns only a small, fixed number of children; does not care when or whether they stop, resume, or finish; and itself exits quickly, then you do not need to use wait() or waitpid() to clean up the child processes. The init process (pid 1) takes responsibility for orphaned child processes, and will clean them up when they finish.
Under any other circumstances, however, you must wait() for child processes. Doing so frees up resources, ensures that the child has finished, and allows you to obtain the child's exit status. Via waitpid() you can also be notified when a child is stopped or resumed by a signal, if you so wish.
As for where to perform the wait,
You must ensure that only the parent wait()s.
You should wait at or before the earliest point where you need the child to have finished (but not before forking), OR
if you don't care when or whether the child finishes, but you need to clean up resources, then you can periodically call waitpid(-1, NULL, WNOHANG) to collect a zombie child if there is one, without blocking if there isn't any.
In particular, you must not wait() (unconditionally) immediately after fork()ing because parent and child run the same code. You must use the return value of fork() to determine whether you are in the child (return value == 0), or in the parent (any other return value). Furthermore, the parent must wait() only if forking was successful, in which case fork() returns the child's pid, which is always greater than zero. A return value less than zero indicates failure to fork.
Your program doesn't really need to wait() because it spawns exactly four (not three) children, then exits. However, if you wanted the parent to have at most one live child at any time, then you could write it like this:
int main() {
pid_t child;
int i;
printf("-----------------------------------\n");
about("Parent");
for (i = 0; i < 3; i++) {
printf("Now .. Forking !!\n");
child = fork();
if (child < 0) {
perror ("Unable to fork");
break;
} else if (child == 0) {
printf ("In child #%d\n", (i+1));
about ("Child");
break;
} else {
/* in parent */
if (waitpid(child, NULL, 0) < 0) {
perror("Failed to collect child process");
break;
}
}
}
return 0;
}
If the parent exits before one or more of its children, which can happen if it does not wait, then the child will thereafter see its parent process being pid 1.
Others have already answered how to get a zombie process list via th ps command. You may also be able to see zombies via top. With your original code you are unlikely to catch a glimpse of zombies, however, because the parent process exits very quickly, and init will then clean up the zombies it leaves behind.
How do you know (and) where to put the "wait()" statement to kill
zombie processes?
You can use wait() anywhere in the parent process, and when the child process terminates it'll be removed from the system. Where to put it is up to you, in your specific case you probably want to put it immediately after the child = fork(); line so that the parent process won't resume its execution until its child has exited.
What is the command to view zombie processes if you have Linux virtual box?
You can use the ps aux command to view all processes in the system (including zombie processes), and the STAT column will be equal to Z if the process is a zombie. An example output would be:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
daniel 1000 0.0 0.0 0 0 ?? Z 17:15 0:00 command
How do you know (and) where to put the "wait()" statement to kill
zombie processes?
You can register a signal handler for SIGCHLD that sets a global volatile sig_atomic_t flag = 0 variable to 1. Then, at some convenient place in your program, test whether flag is set to 1, and, if so, set it back to 0 and afterwards (for otherwise you might miss a signal) call waitpid(-1, NULL, WNOHANG) in a loop until it tells you that no more processes are to be waited for. Note that the signal will interrupt system calls with EINTR, which is a good condition to check for the value of flag. If you use an indefinitely blocking system call like select(), you might want to specify a timeout after which you check for flag, since otherwise you might miss a signal that was raised after your last waitpid() call but before entering the indefinitely blocking system call. An alternative to this kludge is to use pselect().
Use:
ps -e -opid,ppid,pgid,stat,etime,cmd | grep defunct
to see your zombies, also the ppid and pgid to see the parent ID and process group ID. The etime to see the elapsed (cpu) time your zombie has been alive. The parent ID is useful to send custom signals to the parent process.
If the parent process is right coded to catch and handle the SIGCHLD signal, and to what expected (i.e., wait/reap the zombies), then you can submit:
kill -CHLD <parent_pid>
to tell the parent to reap all their zombies.

using fork to do something while the child is running

So basically what i need is:
pid = fork();
if (pid == -1)
exit(1);
if (pid == 0)
{
// do stuff in child
}
else
{
// ONLY do stuff while child is running
}
would I need to create a tmp file right before the child exits saying that it is no longer running so the parent knows the child has exited when that file exists, or is there a simpler way to do this?
You can use waitpid to know if a child process is still running:
int status;
if (waitpid(pid, &status, WNOHANG) == 0) {
// still running
}
With WNOHANG, waitpid returns immediately so that the program can do something else.
When you have nothing to do other than waiting for the child process to terminate, call waitpid without WNOHANG.
The standard way to know that the child has terminated (and get its exit code) is to use the waitpid() system call.
Check wait () and waitpid () : http://linux.die.net/man/2/wait
Here is some more resource: http://users.actcom.co.il/~choo/lupg/tutorials/multi-process/multi-process.html#child_death_wait_syscall
There's a bunch of ways to do it. If you don't need to do anything with the child output, you can set a SIGCHLD handler to reap the child when it exits, and then forget about it in your main thread of execution. You can use the SIGCHLD handler to flag the exit of the child process via an IPC mechanism.
Or you can add a while loop that checks waitpid in your else clause. You would be doing discrete units of work between polls of the child state and you wouldn't get interrupted immediately on child exit.
Use the system wait() call if you just need to check if the child has stopped running.
int pid;
int status;
while (true)
{
pid = wait(&status);
if (pid < 0)
//keep waiting
else if (pid == 0)
//child is done
}

What exactly does fork return?

On success, the PID of the child
process is returned in the parent’s
thread of execution, and a 0 is
returned in the child’s thread of execution.
p = fork();
I'm confused at its manual page,is p equal to 0 or PID?
I'm not sure how the manual can be any clearer! fork() creates a new process, so you now have two identical processes. To distinguish between them, the return value of fork() differs. In the original process, you get the PID of the child process. In the child process, you get 0.
So a canonical use is as follows:
p = fork();
if (0 == p)
{
// We're the child process
}
else if (p > 0)
{
// We're the parent process
}
else
{
// We're the parent process, but child couldn't be created
}
p = fork();
/* assume no errors */
/* you now have two */
/* programs running */
--------------------
if (p > 0) { | if (p == 0) {
printf("parent\n"); | printf("child\n");
... | ...
Processes are structured in a directed tree where you only know your single-parent (getppid()). In short, fork() returns -1 on error like many other system functions, non-zero value is useful for initiator of the fork call (the parent) to know its new-child pid.
Nothing is as good as example:
/* fork/getpid test */
#include <sys/types.h>
#include <unistd.h> /* fork(), getpid() */
#include <stdio.h>
int main(int argc, char* argv[])
{
int pid;
printf("Entry point: my pid is %d, parent pid is %d\n",
getpid(), getppid());
pid = fork();
if (pid == 0) {
printf("Child: my pid is %d, parent pid is %d\n",
getpid(), getppid());
}
else if (pid > 0) {
printf("Parent: my pid is %d, parent pid is %d, my child pid is %d\n",
getpid(), getppid(), pid);
}
else {
printf("Parent: oops! can not create a child (my pid is %d)\n",
getpid());
}
return 0;
}
And the result (bash is pid 2249, in this case):
Entry point: my pid is 16051, parent pid is 2249
Parent: my pid is 16051, parent pid is 2249, my child pid is 16052
Child: my pid is 16052, parent pid is 16051
If you need to share some resources (files, parent pid, etc.) between parent and child, look at clone() (for GNU C library, and maybe others)
Once fork is executed, you have two processes. The call returns different values to each process.
If you do something like this
int f;
f = fork();
if (f == 0) {
printf("I am the child\n");
} else {
printf("I am the parent and the childs pid is %d\n",f);
}
You will see both the messages printed. They're being printed by two separate processes. This is they way you can differentiate between the two processes created.
This is the cool part. It's equal to BOTH.
Well, not really. But once fork returns, you now have two copies of your program running! Two processes. You can sort of think of them as alternate universes. In one, the return value is 0. In the other, it's the ID of the new process!
Usually you will have something like this:
p = fork();
if (p == 0){
printf("I am a child process!\n");
//Do child things
}
else {
printf("I am the parent process! Child is number %d\n", p);
//Do parenty things
}
In this case, both strings will get printed, but by different processes!
fork() is invoked in the parent process. Then a child process is spawned. By the time the child process spawns, fork() has finished its execution.
At this point, fork() is ready to return, but it returns a different value depending on whether it's in the parent or child. In the child process, it returns 0, and in the parent process/thread, it returns the child's process ID.
Fork creates a duplicate process and a new process context. When it returns a 0 value it means that a child process is running, but when it returns another value that means a parent process is running. We usually use wait statement so that a child process completes and parent process starts executing.
I think that it works like this:
when pid = fork(), the code should be executed two times, one is in current process, one is in child process.
So it explains why if/else both execute.
And the order is, first current process, and then execute the child.

Resources