How does the program below work and create a zombie process under Linux?
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
int main ()
{
    pid_t child_pid;
    child_pid = fork ();
    if (child_pid > 0) {
        sleep (60);
    }
    else {
        exit (0);
    }
    return 0;
}
It creates a child and doesn't wait for it (with one of the wait* system calls). Zombies are just that: children that the parent hasn't waited for yet. The kernel has to keep some information about them, mainly the exit status, in order to be able to return it to the parent.
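A minimal sketch of the step the program leaves out, assuming a hypothetical helper named spawn_and_reap (not part of the original program): the parent calls waitpid(), which returns the exit status and lets the kernel drop the zombie's process table entry.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, then reap it.
 * Returns the child's exit code, or -1 on error or abnormal termination. */
int spawn_and_reap(int code)
{
    pid_t child_pid = fork();
    if (child_pid == -1)
        return -1;                 /* fork failed */
    if (child_pid == 0)
        _exit(code);               /* child: terminate immediately */

    int status;
    /* Parent: waitpid() collects the exit status and removes the zombie. */
    if (waitpid(child_pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Replacing the waitpid() call with sleep(60) gives back the original behaviour: the child stays a zombie for as long as the parent sleeps.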
A missing setsid() call is not the cause here; the parent simply never reaps its child.
Every *nix process produces an exit status that must be reaped. The parent process is supposed to reap it with a wait() call if the child terminates first.
setsid() only puts the caller into a new session and process group; it does not change the process's parent. A child is re-parented to init only when its parent terminates first, and init then reaps it.
Not even root can remove zombies from the process list with kill -9, because a zombie is already dead; it goes away once its parent reaps it or the parent itself terminates. Inexperienced programmers sometimes omit the wait() call, which leaves zombies behind in the process table.
In days of old, the system administrator would use zombies to identify inexperienced programmers that need additional training to produce good code.
On some systems, a program that the kernel terminates prematurely (for example, with a fatal signal) is recorded in the system logs. That record, like the exit status, helps identify the nature of the bug that caused the early termination (error conditions not handled by the programmer).
Where such records are kept, they become part of the syslog or kernel log files, which are commonly used to debug code.
I tried to create a zombie process, using the ps command for verification. Although the solution is good, it is not very suggestive for identifying the child as a zombie. Can anyone help me with some improvements?
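This is my code:
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main()
{
    pid_t pid = fork();
    if (pid > 0)
    {
        printf("in parent process\n");
        fflush(stdout);  /* exec discards unflushed stdio buffers */
        sleep(30);       /* the child has exited by now and is a zombie */
        execlp("ps", "ps", (char *)NULL);
    }
    else if (pid == 0)
    {
        printf("in child process\n");
    }
    return 0;
}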
A zombie process is nothing more than a dead process that its parent has not yet wait(2)ed for. It exists only so the kernel can pass its accounting details (such as CPU time spent, or exit code) back to the parent, which is what makes the wait(2) system call reliable.
There's nothing else. Depending on the operating system, you cannot even get the command-line parameters used to call it, or the name of the process that it represents.
The only thing that can be done with a zombie process is for its parent to wait(2) for it, which passes the accounting details, exit code and other information up to the parent. No memory is assigned to it, no system resources are dedicated or locked, and all its file descriptors are already closed. Only its pid, process group ID, session ID, and accumulated system and user CPU times (its own and those accumulated from its children) are stored in the process table, so that the wait(2) system call can account for them when the parent process wait(2)s for it or exit(2)s.
As you have probably noticed, you cannot kill(2) a zombie process, as it is already dead. (You can kill(2) its parent, though; the zombie is then inherited by init, which reaps it.)
To identify the process better, you have to gather that information in the parent process (the parent receives all of it from the wait(2) family of system calls). You'll get the pid_t process ID, so you'll know which of your child processes you have wait(2)ed for. As you (the parent) created it, you already know everything you need to know about your children (you got this pid from the fork(2) system call when you created it).
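For the original goal of showing that the child really is a zombie, the parent can check this itself instead of eyeballing ps output. The sketch below is Linux-specific and makes two assumptions that are not in the question: the helper names proc_state and observe_zombie are hypothetical, and /proc must be mounted. It uses waitid() with WNOWAIT to wait for the child's termination without reaping it, reads the state letter from /proc/<pid>/stat, and only then reaps.

```c
#define _POSIX_C_SOURCE 200809L   /* for waitid() under strict compilers */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: return the state letter from /proc/<pid>/stat
 * (Linux only), or '?' if it cannot be read. */
char proc_state(pid_t pid)
{
    char path[64], buf[512];
    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return '?';
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    /* The state letter follows the closing ')' of the command name. */
    char *p = strrchr(buf, ')');
    return (p != NULL && p[1] == ' ' && p[2] != '\0') ? p[2] : '?';
}

/* Fork a child that exits at once, observe it as a zombie, then reap it.
 * Returns the state seen before reaping ('Z' is expected on Linux). */
char observe_zombie(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return '?';
    if (pid == 0)
        _exit(0);

    siginfo_t info;
    /* WNOWAIT: wait for termination but leave the child unreaped. */
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) == -1)
        return '?';
    char state = proc_state(pid);   /* child is still a zombie here */
    waitpid(pid, NULL, 0);          /* now actually reap it */
    return state;
}
```

The same state letter is what ps reports in its STAT column, so this is only a programmatic version of the ps check.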
I am writing a shell in C and I am trying to add signal handling. In the shell, fork() is called and the child process executes a shell command. The child process is put into its own process group. This way, if Ctrl-C is pressed when a child process is in the foreground, it closes all of the processes that share the same process group id. The shell executes the commands as expected.
The problem is the signals. When, for example, I execute "sleep 5", and then I press Ctrl-C for SIGINT, the "shell>" prompt comes up as expected but the process is still running in the background. If I quickly run "ps" after I press Ctrl-C, the sleep call is still there. Then after the 5 seconds are up and I run "ps" again, it's gone. The same thing happens when I press Ctrl-Z (SIGTSTP). With SIGTSTP, the process goes to the background, as expected, but it doesn't pause execution. It keeps running until it's finished.
Why are these processes being sent to the background like this and continuing to run?
Here is the gist of my code...
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
int status;
void sig_handler_parent()
{
    printf("\n");
}
void sig_handler_sigchild(int signum)
{
    waitpid(-1, &status, WNOHANG);
}
int main()
{
    signal(SIGCHLD, sig_handler_sigchild);
    signal(SIGINT, sig_handler_parent);
    signal(SIGQUIT, sig_handler_parent);
    signal(SIGTERM, sig_handler_parent);
    signal(SIGCONT, sig_handler_parent);
    signal(SIGTSTP, sig_handler_parent);
    while (1)
    {
        printf("shell> ");
        // GET COMMAND INPUT HERE
        pid = fork();
        if (pid == 0)
        {
            setpgid(getpid(), getpid());
            execvp(cmd[0], cmd);
            printf("%s: unknown command\n", cmd[0]);
            exit(1);
        }
        else
            waitpid(0, &status, WUNTRACED);
    }
    return 0;
}
p.s. I have already tried setting all of the signal handlers to SIG_DFL before the exec command.
The code you provide does not compile, and an attempt to fix it shows
that you omitted a lot. I am only guessing.
To help you move forward, I'll point out a number of facts that
you might have misunderstood. Together with a couple of documentation
links, I hope this is helpful.
Error Handling
First: please make a habit of handling errors, especially when you
know there's something that you don't understand. For example, the
parent (your shell) waits until the child terminates,
waitpid(0, &status, WUNTRACED);
You say,
When, for example, I execute "sleep 5", and then I press Ctrl-C for
SIGINT, the "shell>" prompt comes up as expected but the process is
still running in the background.
What actually happens is that once you press Ctrl-C, the parent (not the
child; see below for why) receives SIGINT (the kernel's terminal
subsystem handles keyboard input, sees that someone holds "Ctrl" and
"C" at the same time, and concludes that all processes in the
terminal's foreground process group must be sent SIGINT).
Change the parent branch to,
pid_t waited = waitpid(0, &status, WUNTRACED);
if (waited == -1)
    perror("waitpid");
With this, you'd see perror() print something like:
waitpid: Interrupted system call
You want SIGINT to go to the child, so something must be wrong.
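Independently of where SIGINT should end up, a shell usually does not want a signal handler to cut its wait short. A common pattern, sketched here with a hypothetical helper name (waitpid_retry is not from the question), is to restart waitpid() whenever it fails with EINTR:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: like waitpid(), but restarts the call when a
 * signal handler interrupts it (errno == EINTR). */
pid_t waitpid_retry(pid_t pid, int *status, int options)
{
    pid_t r;
    do {
        r = waitpid(pid, status, options);
    } while (r == -1 && errno == EINTR);
    return r;
}
```

Any other failure (for example ECHILD) is still returned as -1, so the caller can report it with perror() as shown above.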
Signal Handlers, fork(), and exec()
Next, what happens to your signal handlers across fork() and
exec()?
The signal overview man page states,
A child created via fork(2) inherits a copy of its parent's signal
dispositions. During an execve(2), the dispositions of handled
signals are reset to the default; the dispositions of ignored
signals are left unchanged.
So, ideally, what this means is that:
The parent (shell) sees SIGINT, as observed above, and prints
"interrupted system call".
The child's signal handlers are reset back to their defaults. For SIGINT,
this means to terminate.
You do not fiddle with the controlling terminal, so the child
inherits the controlling terminal of the parent. This means that
SIGINT is delivered to both parent and child. Given that the
child's SIGINT behavior is to terminate, I'd bet that no process is
left running.
Except when you use setpgid() to create a new process group.
Process Groups, Sessions, and Controlling Terminal
Someone once called me a UNIX greybeard. While this is true from a
visual point of view, I must reject that compliment because I rarely
hang around in one of the darkest corners of UNIX - the terminal
subsystem. Shell writers have to understand that too though.
The relevant documentation here is the "NOTES" section of the setpgid()
man page. I suggest you read that, especially where it says,
At any time, one (and only one) of the process groups in the session
can be the foreground process group for the terminal; (...)
The shell (bash maybe) from which you start your shell program has
done so for the foreground invocation of your program, and marked that
as "foreground process group". Effectively this means, "Please, dear
terminal, whenever someone presses Ctrl-C, send a SIGINT to all
processes in that group. I (your parent) just sit and wait (waitpid()) until all is over, and will take control again then.".
You create a process group for the child, but don't tell the terminal
about it. What you want is to
Detach the parent from the terminal.
Set the child process group as the foreground process group of the terminal.
Wait for the child (you already do).
Regain terminal foreground.
Further down in the "NOTES" section of said man page, they give links
to how that is done. Follow those, read thoroughly, try out things,
and make sure you handle errors. In most cases, such errors are signs
of misunderstanding. And, in most cases, such errors are fixed by
re-reading the documentation.
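The foreground handoff described above can be sketched roughly as follows. This is an illustration under assumptions, not the asker's shell: run_foreground is a hypothetical name, the terminal is assumed to be on standard input, errors from setpgid() and tcsetpgrp() are deliberately ignored, and the handoff is skipped entirely when standard input is not a terminal.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv as a foreground job and return its
 * wait status, or -1 on error. */
int run_foreground(char *const argv[])
{
    int have_tty = isatty(STDIN_FILENO);
    pid_t shell_pgid = getpgrp();

    /* The shell must not be stopped when it takes the terminal back. */
    signal(SIGTTOU, SIG_IGN);

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        setpgid(0, 0);                       /* child: own process group */
        if (have_tty)
            tcsetpgrp(STDIN_FILENO, getpid());
        signal(SIGTTOU, SIG_DFL);            /* default again for the job */
        execvp(argv[0], argv);
        _exit(127);                          /* exec failed */
    }

    setpgid(pid, pid);                       /* also in parent: avoids a race */
    if (have_tty)
        tcsetpgrp(STDIN_FILENO, pid);        /* job gets the terminal */

    int status;
    pid_t r;
    do {                                     /* wait for exit or stop */
        r = waitpid(pid, &status, WUNTRACED);
    } while (r == -1 && errno == EINTR);

    if (have_tty)
        tcsetpgrp(STDIN_FILENO, shell_pgid); /* shell regains foreground */
    return r == -1 ? -1 : status;
}
```

With the job's group in the foreground, Ctrl-C and Ctrl-Z are delivered to the job rather than to the shell, and WUNTRACED lets the shell notice a stopped job. Note that the sketch waits on the child's PID, since a child in its own process group is not matched by waitpid(0, ...).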
Are you sure that your child process is actually receiving the signals from your tty? I believe you need to make a call to tcsetpgrp to actually tell the controlling terminal to send signals to the process group of your child process.
For example, after you call fork, and before exec, try this from within your child.
tcsetpgrp(STDIN_FILENO, getpid())
Here is the man page for tcsetpgrp(3)
I am writing a project for class that finds zombies and reaps them in a Linux kernel.
I have found code that will create a single zombie, which gets reaped after a wait(), but my program must reap many, on the order of 1000.
I am very new to kernel manipulation/multi-threading and the resources I have found online dealing with zombies are either too technical, or ambiguous.
This is the code I am using:
pid_t child_pid;
child_pid = fork ();
if (child_pid > 0) {
    sleep (60);
} else {
    exit (0);
}
Once again, my question is: How should I go about creating multiple zombies, for my program to reap?
Much thanks -Jared
A zombie is nothing more than a terminated process whose parent hasn't read its exit status (in a nutshell: the parent didn't call wait() after the child exited). It no longer holds memory or other resources, but it still occupies a slot in the process table.
To achieve what you need, just fork a lot of processes (in a loop, for example), have each child exit immediately, and never call wait() in the parent.
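A minimal sketch of that loop, together with a matching reaper. Both helper names (spawn_zombies and reap_all) are hypothetical, and how many children can actually be created depends on the system's per-user process limit.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork up to n children that exit immediately.
 * Until the parent waits, each one stays in the process table as a
 * zombie. Returns the number of children actually created. */
int spawn_zombies(int n)
{
    int created = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == -1)
            break;            /* e.g. process limit reached */
        if (pid == 0)
            _exit(0);         /* child: die at once */
        created++;
    }
    return created;
}

/* Hypothetical helper: reap every child of the caller, blocking until
 * none are left. Returns how many were reaped. */
int reap_all(void)
{
    int reaped = 0;
    for (;;) {
        pid_t r = waitpid(-1, NULL, 0);
        if (r > 0)
            reaped++;
        else if (r == -1 && errno == EINTR)
            continue;         /* interrupted by a signal: try again */
        else
            break;            /* ECHILD: no children remain */
    }
    return reaped;
}
```

Calling spawn_zombies(1000), sleeping for a while, and then calling reap_all() leaves a window in which ps should list the children with state Z.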
I know I can use the trick if (fork()) exit(0); to change the pid of the current process. So the following program has a pid that changes very quickly. How can I kill a process like this? Is there a better method than running killall procname over and over until one of the attempts manages to call kill() before the process forks again? I know it is not really one 'process', but many processes that each run for a few microseconds.
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
int main()
{
    pid_t self = getpid();
    while (1)
    {
        if (fork()) exit(0);
        if (self + 10000 < getpid()) break; // Just to kill it after some time
        usleep(1000);
    }
    return 0;
}
Also the only way I found to list the process was executing ps -A | grep procname a few times until one showed some output. Why isn't the process always listed?
Such a process is called a "comet" by systems administrators.
The process group ID (PGID) doesn't change on fork, so you can kill it (or SIGSTOP it) by sending a signal to the process group (you pass a negated PGID instead of a PID to kill).
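A small sketch of signalling a whole process group, using hypothetical helper names (start_group and signal_group) and a child that merely pauses as a stand-in for the re-forking program:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start a child in its own process group and
 * return that group's ID (equal to the child's PID), or -1 on error. */
pid_t start_group(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        setpgid(0, 0);     /* child becomes leader of a new group */
        for (;;)
            pause();       /* stand-in for the re-forking workload */
    }
    setpgid(pid, pid);     /* also set in the parent to avoid a race */
    return pid;
}

/* Send sig to every process in group pgid: note the negated ID. */
int signal_group(pid_t pgid, int sig)
{
    return kill(-pgid, sig);
}
```

From a shell the equivalent is kill -KILL -- -PGID, where the PGID can be read from the pgid column of ps -o pid,pgid,comm.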
The only reason I can see why you wouldn't see it is that the forked child has not been created yet, but the parent has progressed far enough in its death that it is no longer listed.
Unfortunately I don't think it's possible to kill this kind of process by pid without some guessing. To do so would require knowing the next pid in advance. You can guess the next pid, but you cannot be certain that no other process gets that pid assigned first.
Nagios lets me configure child_processes_fork_twice=<0/1>.
The documentation says
This option determines whether or not Nagios will fork() child processes twice when it executes host and service checks. By default, Nagios fork()s twice. However, if the use_large_installation_tweaks option is enabled, it will only fork() once.
As far as I know fork() will spawn a new child process. Why would I want to do that twice?
All right, so first of all: what is a zombie process? It's a process that is dead, but whose parent was busy doing other work and hence could not collect the child's exit status. In some cases the child runs for a very long time, the parent cannot wait that long, and so it continues with its own work (note that the parent doesn't die; it carries on with its remaining tasks without caring about the child). This is how a zombie process is created.
Now let's get down to business. How does forking twice help here? The important thing to note is that the grandchild does the work that the parent process wants its child to do. The first time fork is called, the first child simply forks again and exits. This way, the parent doesn't have to wait long to collect the child's exit status (since the child's only job is to create another child and exit), so the first child doesn't become a zombie.
As for the grandchild, its parent has already died. Hence the grandchild is adopted by the init process, which always collects the exit status of all its child processes. So the parent doesn't have to wait very long, and no zombie process is created. There are other ways to avoid a zombie process; this is just a common technique. Hope this helps!
In Linux, a daemon is typically created by forking twice, with the intermediate process exiting after forking the grandchild. This has the effect of orphaning the grandchild process. As a result, it becomes the responsibility of the OS (init) to clean up after it when it terminates. The reason has to do with what are known as zombie processes, which linger in the process table after exiting because their parent, which would normally be responsible for the cleanup, has not yet waited for them.
Also from the documentation,
Normally Nagios will fork() twice when it executes host and service checks. This is done to (1) ensure a high level of resistance against plugins that go awry and segfault and (2) make the OS deal with cleaning up the grandchild process once it exits.
Unix Programming FAQ §1.6.2:
1.6.2 How do I prevent them from occuring?
You need to ensure that your parent process calls wait() (or
waitpid(), wait3(), etc.) for every child process that terminates;
or, on some systems, you can instruct the system that you are
uninterested in child exit states.
Another approach is to fork() twice, and have the immediate child
process exit straight away. This causes the grandchild process to be
orphaned, so the init process is responsible for cleaning it up. For
code to do this, see the function fork2() in the examples section.
To ignore child exit states, you need to do the following (check your
system's manpages to see if this works):
struct sigaction sa;
sa.sa_handler = SIG_IGN;
#ifdef SA_NOCLDWAIT
sa.sa_flags = SA_NOCLDWAIT;
#else
sa.sa_flags = 0;
#endif
sigemptyset(&sa.sa_mask);
sigaction(SIGCHLD, &sa, NULL);
If this is successful, then the wait() functions are prevented from
working; if any of them are called, they will wait until all child
processes have terminated, then return failure with errno == ECHILD.
The other technique is to catch the SIGCHLD signal, and have the
signal handler call waitpid() or wait3(). See the examples section
for a complete program.
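The FAQ's own example program is not reproduced here. As a rough sketch of the SIGCHLD technique it describes, with hypothetical names (on_sigchld, install_sigchld_reaper, reaped_so_far) and a counter added only so the effect can be observed:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Count of reaped children; sig_atomic_t is safe to update in a handler. */
static volatile sig_atomic_t reaped_count = 0;

/* SIGCHLD handler: reap every child that has already terminated.
 * One signal may stand for several exits, hence the loop. */
static void on_sigchld(int signum)
{
    (void)signum;
    int saved_errno = errno;          /* waitpid() may clobber errno */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped_count++;
    errno = saved_errno;
}

/* Hypothetical helper: install the handler; returns 0 on success. */
int install_sigchld_reaper(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    /* Restart interrupted calls; ignore stop/continue notifications. */
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Hypothetical helper: number of children reaped so far. */
int reaped_so_far(void)
{
    return (int)reaped_count;
}
```

Only async-signal-safe functions such as waitpid() should be called from the handler; printing from it with printf() is not safe.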
This code demonstrates how to use the double fork method to allow the grandchild process to become adopted by init, without risk of zombie processes.
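#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
int main()
{
    pid_t p1 = fork();
    if (p1 != 0)
    {
        /* Parent: reap the short-lived child, then list processes. */
        printf("p1 process id is %d\n", (int)getpid());
        wait(NULL);
        system("ps");
    }
    else
    {
        /* Child: fork the grandchild, then exit right away. */
        pid_t p2 = fork();
        pid_t pid = getpid();
        if (p2 != 0)
        {
            printf("p2 process id is %d\n", (int)pid);
        }
        else
        {
            /* Grandchild: orphaned once p2 exits, later reaped by init. */
            printf("p3 process id is %d\n", (int)pid);
        }
        exit(0);
    }
    return 0;
}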
The parent will fork the new child process, and then wait for it to finish. The child will fork a grandchild process, and then exit(0).
In this case, the grandchild doesn't do anything except exit(0), but could be made to do whatever you'd like the daemon process to do. The grandchild may live long and will be reclaimed by the init process, when it is complete.