kill - does it kill the process right away?
I found my answer and I set up a signal handler for SIGCHLD and introduced wait in that handler. That way, whenever parent process kills a child process, this handler is called and it calls wait to reap the child. - motive is to clear process table entry.
I am still seeing some child processes going for a few seconds even without its parent process dying. - how is this possible?
I am seeing this via ps. Precisely ps -o user,pid,ppid,command -ax and greping for parent process, child process and defunct.
A process goes defunct (zombie) immediately upon exiting (from a signal, call to exit, return from main, whatever). It stays zombie until wait'd on by its parent.
So, all processes at least briefly become zombies upon exit.
If the parent process takes a bit (because it was doing other work, or just because the scheduler hasn't given it CPU time yet) before calling wait, then you'll see the zombie for a bit. If the parent never calls wait, then when it eventually exits, init (pid 1) will adopt its zombied children, and call wait on them.
A child process goes defunct (becomes a zombie) only when its parent process hasn't died and hasn't yet waited for it. If the original parent died, then the child's parent becomes process ID 1, and that process's main task is to wait for its (inherited) children to die and remove them from the process list, so that they are not zombies. (Note: an orphaned child, or a daemon, is automatically inherited by PID 1; it does not get assigned to grandparents or great-grandparents up the hierarchy of processes.)
Between the time that the child dies and the parent collects the exit information via wait() (or waitpid(), or waitid() or any of the other variants), it is a zombie in the process list, shown as defunct by ps.
But to answer your question's title:
Yes, a process can go defunct without its parent dying.
(And it can only go defunct if its parent has not died.)
Related
I have just had a lecture that sums reaping as:
Reaping
Performed by parent on terminated child (using wait or waitpid)
Parent is given exit status informaton
Kernel then deletes zombie child process
So I understand that reaping is done by calling wait or waitpid from the parent process after which the kernel deletes the zombie process. If this actually is the case, that reaping is done only when calling wait or waitpid, why do the child processes actually go away after returning in theor entry function - I mean that indeed does seem as if the child processes have been reaped and thus no resources are wasted even though the parent process may not be waiting.
So is "reaping" only possible when calling wait or waitpid? Is processes are "reaped" as long as they return and exit from their entry function (which I assume all processes do) - what is the point of talking about "reaping" as if it was something special?
The child process does not fully "go away" when it exits. It ceases to exist as a running process, and most/all of its resources (memory, open files, etc.) are released, but it still remains in the process table. It remains in the process table because that's where its exit status is stored, so that the parent can retrieve it by calling one of the wait variants. If the parent fails to call wait, the process table entry sticks around — and that's what makes it a "zombie".
I said that most/all of its resources are released, but the one resource that's definitely still consumed is that process table slot.
As long as the (dead) child's parent exists, the kernel doesn't know that the parent isn't going to call wait eventually, so the process table slot has to stay there, so that the eventual call to wait (if there is one) can return the proper exit status.
If the parent eventually exits (without ever calling wait), the child will be inherited by the grandparent, which is usually a "master" process like the shell, or init, that does routinely call wait and that will finally "reap" the poor young zombie.
So, yes, it really is true that the only way for the parent to properly "reap" the child is, just as was said in your lecture, to call one of the wait functions. (Or to exit, but that's not an option if the parent is long-running.)
Footnote: I said "the child will be inherited by the grandparent", but I think I was wrong, there. Under Unix and Linux, orphaned processes are generally always inherited by pid 1, aka init.
The purpose of the wait*() call is to allow the child process to report a status back to the parent process. When the child process exits, the operating system holds that status data in a little data structure until the parent reads it. Reaping in that sense is cleaning out that little data structure.
If the parent does not care about waiting for status from the child, the code could be written in a way to allow the parent to ignore the status, and so the reaping occurs semi-automatically. One way is to ignore the SIGCHLD signal.
Another way is to perform a double-fork to create a grandchild process instead. When doing this, the "parent" does a blocking wait() after a call to fork(). Then, the child performs another fork() to create the grandchild and then immediately exits, causing the parent to unblock. The grandchild now does the real work, and is automatically reaped by the init process.
I have just had a lecture that sums reaping as:
Reaping
Performed by parent on terminated child (using wait or waitpid)
Parent is given exit status informaton
Kernel then deletes zombie child process
So I understand that reaping is done by calling wait or waitpid from the parent process after which the kernel deletes the zombie process. If this actually is the case, that reaping is done only when calling wait or waitpid, why do the child processes actually go away after returning in theor entry function - I mean that indeed does seem as if the child processes have been reaped and thus no resources are wasted even though the parent process may not be waiting.
So is "reaping" only possible when calling wait or waitpid? Is processes are "reaped" as long as they return and exit from their entry function (which I assume all processes do) - what is the point of talking about "reaping" as if it was something special?
The child process does not fully "go away" when it exits. It ceases to exist as a running process, and most/all of its resources (memory, open files, etc.) are released, but it still remains in the process table. It remains in the process table because that's where its exit status is stored, so that the parent can retrieve it by calling one of the wait variants. If the parent fails to call wait, the process table entry sticks around — and that's what makes it a "zombie".
I said that most/all of its resources are released, but the one resource that's definitely still consumed is that process table slot.
As long as the (dead) child's parent exists, the kernel doesn't know that the parent isn't going to call wait eventually, so the process table slot has to stay there, so that the eventual call to wait (if there is one) can return the proper exit status.
If the parent eventually exits (without ever calling wait), the child will be inherited by the grandparent, which is usually a "master" process like the shell, or init, that does routinely call wait and that will finally "reap" the poor young zombie.
So, yes, it really is true that the only way for the parent to properly "reap" the child is, just as was said in your lecture, to call one of the wait functions. (Or to exit, but that's not an option if the parent is long-running.)
Footnote: I said "the child will be inherited by the grandparent", but I think I was wrong, there. Under Unix and Linux, orphaned processes are generally always inherited by pid 1, aka init.
The purpose of the wait*() call is to allow the child process to report a status back to the parent process. When the child process exits, the operating system holds that status data in a little data structure until the parent reads it. Reaping in that sense is cleaning out that little data structure.
If the parent does not care about waiting for status from the child, the code could be written in a way to allow the parent to ignore the status, and so the reaping occurs semi-automatically. One way is to ignore the SIGCHLD signal.
Another way is to perform a double-fork to create a grandchild process instead. When doing this, the "parent" does a blocking wait() after a call to fork(). Then, the child performs another fork() to create the grandchild and then immediately exits, causing the parent to unblock. The grandchild now does the real work, and is automatically reaped by the init process.
what does kill exactly do?
I have a parent process which is creating 100 (as an example) child processes one after another. At the end of any child's job, I kill the child with kill(pid_of_child, SIGKILL) and I cannot see that in ps output. But if something goes wrong with the parent process and I exit from the parent process with exit(1) (at this point only 1 child is there - I can check tht in ps), at that point I see a lot of <defunct> processes whose ppid is pid of parent process.
How is that possible? did kill not kill the child processes entirely?
kill doesn't kill anything. It sends signals to the target process. SIGKILL is just a signal. Now, the standard action for SIGKILL -- the only action, actually, since SIGKILL can't be handled or ignored by a process -- is to exit, that's true.
The "<defunct>" process is a child that hasn't been reaped, meaning that the parent hasn't called wait() to retrieve the exit status of the child. Until the parent calls wait(), the defunct (or "zombie") process will hang around.
Whenever a process ends, no matter how it ends (kill or otherwise), it will stay in the kernel's process table until its parent process retrieves its exit status (with wait and friends). Leaving it in the process table avoids a number of nasty race conditions.
If your parent process has exited, the children should get reassigned to init, which periodically reaps its children.
Yes, SIGKILL terminates the process, but in any case (either normal exit or terminated), processes have an exit status, which needs to be kept around for potential readers - as such an entry in the process table may remain until this is done. See http://en.wikipedia.org/wiki/Zombie_process .
My Linux process has 4 children. After some execution time all children adopted by the init process. How do we prevent this situation? (this is not the case with Zombie processes).
The process is written in C and the OS is Linux. My code calls waitpid! What might be the problem? In 99,99% we don't have this problem.
Last update: What if someone executes "kill -9 "? This terminates immediately the parent process and leaves the children orphan.
If your processes are being reparented by init, that means that their parent process has died. When a process' parent dies, init adopts it so that it can reap the zombie by wait()ing on the child when it (that is, init) receives SIGCHLD.
If you do not want init to become the parent of your children, you will have to ensure that your process lives until all of your children have died and been reaped by your program.
Wait for the children to exit before exiting yourself. See the wait(2) man page for more details.
Check from main page for your waitpid API parameters, and make sure your parent process should not be over before all child processes are finished.
Can you post your code here?
I have a parent and a child process written in C language. Somewhere in the parent process HUP signal is sent to the child. I want my parent process to detect if the child is dead. But when I send SIGHUP, the child process becomes a zombie. How can I detect if the child is a zombie in the parent process? I try the code below, but it doesn't return me the desired result since the child process is still there but it is defunct.
kill(childPID, 0);
One more question; can I kill the zombie child without killing the parent?
Thanks.
from wikipedia:
On Unix and Unix-like computer operating systems, a zombie process or defunct process is a process that has completed execution but still has an entry in the process table. This entry is still needed to allow the process that started the (now zombie) process to read its exit status.
If the parent fetches the exit status by calling wait, waitpid or the like, the zombie should disappear.
You can detect whether a process is alive through the wait functions (man wait).