How is wait implemented in C?

How is wait implemented in C? - c

When I create two child processes, I can't use SIGCHLD to tell whether both child processes have terminated since once a signal is delivered, future signals of the same type are discarded. When I receive a SIGCHLD signal and handle that signal, I cannot be sure whether that means both child processes have terminated and sent the SIGCHLD signal or just one of them has terminated. In other words, signals are not queued. However, with the function wait(), if I have two child processes, then I can call wait() twice to reap both child processes, and I wonder how it is implemented under the hood. It seems it is not using SIGCHLD as signals are not queued. So how is it able to handle terminated child processes one by one?

In cases like this, you'd instead call waitpid() with the WNOHANG flag in a loop until it returns 0 to indicate none of the specified processes have changed state (or -1 to indicate an error).
int status;
pid_t p;
while ((p = waitpid(-1, &status, WNOHANG)) > 0) {
// Inspect status and do whatever
}
As for how they work under the hood? The kernel keeps exited but not yet waited for processes in its list of processes (so called zombie processes). wait() just returns the status of one of the caller's exited children, or blocks until there is one (or an error), at which point the calling process is scheduled to run again.

Related

Kill child process spawned with execl without making it zombie

I need to spawn a long-running child process and then kill it from the parent code. At the moment I do like this:
int PID = fork();
if (PID == 0) {
execl("myexec", "myexec", nullptr);
perror("ERROR");
exit(1);
}
// do something in between and then:
kill(PID, SIGKILL);
This does the job in the sense that the child process is stopped, but then it remains as a zombie. I tried to completely remove it by adding:
kill(getpid(), SIGCHLD);
to no avail. I must be doing something wrong but I can't figure out what, so any help would be greatly appreciated. Thanks.

signal(SIGCHLD, SIG_IGN);
kill(getpid(), SIGCHLD);
Presto. No zombie.
By ignoring SIGCHLD we tell the kernel we don't care about exit codes so the zombies just go away immediately.

You have been answered with:
signal(SIGCHLD, SIG_IGN);
to ignore the signal sent to the parent when a child dies. This is an old mechanism to avoid zombies, but zombies are your friends, as my answer will explain.
The zombies are not a bug, but a feature of the system. They are there to complete the fork(2), wait(2), exit(2), kill(2) group of system calls.
When you wait(2) for a child to die, the kernel tests if there's a child running with the characteristics you state in the wait(2). If it exists, the wait(2) will block, because the wait(2) system call is the one used in unix to give the parent the exit status of the waited child. If you use wait() and you have done no fork() a new child previously, wait() should give you an error, because you are calling wait with no fork (i'll stop boldening the system calls in this discussion from here on) but what happens if the parent did a fork but the child died before the parent was capable of making a wait. Should this be taken as an error? No. The system maintains the process table entry for the child proces, until one of two things happen: The parent dies (then all children processess get orphaned, being adopted by process id 1 ---init/systemd--- which is continously blocked in wait calls; or the parent does a wait, in which case the status of one (or the one requested) of the children is reported.
So in a proper usage of the system, it is possible (or necessary) to make a wait call for each fork you make. if you do more waits than forks, you get errors... if you make more forks than waits, you get zombies. In order to compensate this, your code should be changed to make the appropiate wait call.
kill(PID, SIGINT); /* don't use SIGKILL in the first time, give time to your process to arrange its last will before dying */
res = waitpid(PID, &status, 0);
And this will allow the child to die normally. The child is going to die, because you killed it (except if the child has decided to ignore the signal you send to it)
The reason for no race condition here (the child could die before is is wait()ed for) is the zombie process. Zombie processes are not proper processes, they don't accept signals, it is impossible to kill them, because they are dead already (no pun here ;)). They only occupy the process table slot, but no resource is allocated to them. When a parent does a wait call, if there's a zombie, it will be freed and the accounting info will be transferred to the parent (this is how the accounting is done), including the exit status, and if there isn't (because it died prematurely and you had invoked the above behaviour) you will get an error from wait, and the accounting info will be tranferred to init/systemd, which will cope for this. If you decide to ignore the SIGCHLD signal, you are cancelling the production of zombies, but the accounting is being feed in the wron way to init/systemd, and not accounted in the parent. (no such process can be waited for) you cannot distinguish if the wait fails because the child process died or because you didn't spawn it correctly. More is to come.
Let's say that the child cannot exec a new program and it dies (calling exit()). When you kill it, nothing happens, as there's no target process (well, you should receive an error from kill call, but I assume you are not interested in the kill result, as it is irrelevant for this case, you are interested in the status of the child, or how did the child died. This means you need to wait for it. if you get a normal exit, with a status of 1, (as you did an exit in case exec() fails) you will know that the child was not able to exec (you still need to distinguish if the 1 exit code was produced by the child or by the program later run by the child). If you successfully killed the child, you should get a status telling you that the child was killed with signal (the one you sent) and you will know that your code is behaving properly.
In case you don't want to block your parent process in the wait system call (well, your child program could have decided to ignore signals and the kill had no effect), then you can substitute the above by this:
kill(PID, SIGINT);
res = waitpid(PID, &status, WNOHANG);
that will not block the parent, in the case the child program has decided to ignore the signal you send to it. In this case, if wait returns -1 and errno value EINTR, then you know that your child has decided to ignore the signal you sent to it, and you need help from the operator (or be more drastic, e.g. killing it with SIGKILL).
A good approach should be
void alarm_handler()
{
}
...
kill(PID, SIGINT); /* kill it softly (it's your child, man!!) */
void *saved = signal(SIGALRM, alarm_handler);
alarm(3); /* put an awakener, you will be interrupted in 3s. */
res = waitpid(PID, &status, 0);
signal(SIGALRM, saaved); /* restore the previous signal handler */
if (res == -1 && errno == EINTR) {
/* we where interrupted by the alarm, and child didn't die. */
kill(PID, SIGKILL); /* be more rude */
}

Why to use non-blocking waitpid instead of blocking wait?

I am reading https://www.amazon.com/Unix-Network-Programming-Sockets-Networking/dp/0131411551, and there the author handle the sigchld in handler that calls waitpid rather then wait.
In Figure 5.7, we cannot call wait in a loop, because there is no way to prevent wait from blocking if there are
running children that have not yet terminated.
The handler is as follows:
void sig_chld(int signo)
{
pid_t pid;
int stat;
// while ((pid = wait(&stat)) > 0)
while ((pid = waitpid(-1, &stat, WNOHANG)) > 0)
{
printf("child %d terminated\n", pid);
}
}
The question is, even if I use the blocking version of wait (as is commented out), the child are terminated anyway (which is what I want in order to not have zombies), so why to even bother whether it is in blocking way or non-blocking?
I assume when it is non-blocking way (i.e. with waitpid), then I can call the handler multiple times? (when some childs are terminated, and other are still running). But still I can just block and wait in that handler for all child to terminate. So no difference between calling the handler multiple times or just once. Or is there any other reason for non-blocking and calling the handler multiple times?

The while loop condition will run one more time than there are zombie child processes that need to be waited for. So if you use wait() instead of waitpid() with the WNOHANG flag, it'll potentially block forever if you have another still running child - as wait() only returns early with an ECHLD error if there are no child processes at all. A robust generic handler will use waitpid() to avoid that.
Picture a case where the parent process starts multiple children to do various things, and periodically sends them instructions about what to do. When the first one exits, using wait() in a loop in the SIGCHLD handler will cause it to block forever while the other child processes are hanging around waiting for more instructions that they'll never receive.
Or, say, an inetd server that listens for network connections and forks off a new process to handle each one. Some services finish quickly, some can run for hours or days. If it uses a signal handler to catch exiting children, it won't be able to do anything else until the long-lived one exits if you use wait() in a loop once that handler is triggered by a short-lived service process exiting.

It took me a second to understand what the problem was, so let me spell it out. As everyone else has pointed out, wait(2) may block. waitpid(2) will not block if the WNOHANG option is specified.
You are correct to say that the first call to wait(2) should not block since the the signal handler will only be called if a child exited. However, when the signal handler is called there may be several children that may have exited. Signals are delivered asynchronously, and can be consolidated. So, if two children exit at close to the same time, it's possible that the operating system only sends one signal to the parent process. Thus, a loop must be used to iteratively check if more than one child has exited.
Now that we've established that the loop is necessary for checking if multiple children have exited, it is clear that we can't use wait(2) since it will block on the second iteration if there is a child that has not exited.
TL;DR The loop is necessary, and hence using waitpid(2) is necessary.

How do I get the PID of a set process in the handler of its own SIGCHLD?

I'm writting a toy shell for university and I have to update the status of a background process when it ends. So I came up with the idea of making the handler of SIGCHLD do that, since that signal is sent when the process ends. The problem is that in order to implement the command jobs, I have to update the status from "running" to "terminated" and first I have to find that specific process in the array I have dedicated to it, and one way to do it is by searching by pid, since the array stores the information that is display in jobs. Each entry stores the process pid, the status (which is a string) and the command itself.
Now the question is:
Is there a way to get the pid of the process that called the signal when it ended?
Right now this is what my handler function looks like:
void handler(int sig){
int child_pid;
child_pid = wait(NULL);
//finds the process with a pid identical
//to child_pid in the list and updates its status
...
}
Since wait(NULL) returns the pid of the first process that ends since it's called, the status is only updated when another background process ends and therefore the wrong process status is updated.
We haven't been tought many things from the wait() and waitpid()functions apart from that they waits for a process to end, so any insight may be helpful.

While it is not a good idea to use wait in a signal handler, you can do the following to accomplish what you are trying to do.
void handler(int sig){
int child_pid;
int status;
child_pid = waitpid(-1, &status, WUNTRACED | WNOHANG);
if(child_pid > 0)
{
// your code
// make sure to deal with the cases of child_pid < 0 and child_pid == 0
}
}
What you are doing is not technically wrong, however, it would be better to use waitpid. When you use waitpid(-1,...) it works similarly to if you used wait(...) and will not only wait for a specified process but any process that terminates. The main difference is that you can specify WUNTRACED which will suspend execution until a process in the wait set becomes either terminated or stopped. The WNOHANG will tell waitpid to not suspend the execution of the process. You don't want your handler to be suspended.
If multiple signals are sent to the same process i.e. due to multiple children terminating at the same time, then it will seem like only one signal was sent because the signal handler will not be executed again. This is due to how signals are sent; when a signal is created it is put into an exception table for the process to then "receive" it; signals do not use queues. To account for this, you will need to iterate over the waitpid(-1,...) call until it returns 0 to make sure you are reaping all of the terminated children.
Additionally, do watch out for where else you are reaping the child (note that if the child has already been reaped then waitpid will return 0 if you use the WNOHANG flag). I would assume that this is what is causing the behavior you are seeing of the status only updating when another background process ends. For example, because you are making a toy shell I assume that you are waiting for the foreground processes somewhere, and if you use the wait function there as well as in your handler you could get 1 of two things to happen. The child gets reaped in the 'wait for foreground process' method and then when the handler is executed there is nothing for it to reap. And 2, the child gets reaped in the handler method, and then the 'wait for foreground process' never exits.

Is waitpid() an atomic operation in Linux?

For example, in the parent process, I forked a child process and wait on the child process:
int main() {
setSignal(SIGCHLD, sigchld_handler)
while(1) {
// fork some child processes
myForkFunction()
waitpid(-1, &status, 0)
}
}
Moreover, I have a SIGCHLD signal handler:
void
sigchld_handler(int sig) {
while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
// Reap zombie processes
}
}
As can be seen, waitpid() appears both in the main() function and in the sigchld_handler() function. I was wondering whether waitpid can be interrupted by SIGCHLD. If it can be interrupted by SIGCHLD, what will happen then?
Does anyone have any ideas about this?

The POSIX specification for waitpid() says in part:
If _POSIX_REALTIME_SIGNALS is defined, and the implementation queues the SIGCHLD signal, then if wait() or waitpid() returns because the status of a child process is available, any pending SIGCHLD signal associated with the process ID of the child process shall be discarded. Any other pending SIGCHLD signals shall remain pending.
Otherwise, if SIGCHLD is blocked, if wait() or waitpid() return because the status of a child process is available, any pending SIGCHLD signal shall be cleared unless the status of another child process is available.
For all other conditions, it is unspecified whether child status will be available when a SIGCHLD signal is delivered.
The third of the quoted paragraphs seems to imply that you're treading on thin ice. It doesn't mention 'implementation defined' or similar — unspecified means that the standard says nothing about what shall happen and you may or may not get any information from the implementation-specific documentation.
There is a lot of (very densely worded) information in the POSIX specification. There are also some examples, and a rationale — which mentions sigwait() and sigwaitinfo(). It is worth reading the whole of the waipid() page. You should probably also read about Signal concepts too — more dense reading. (One of these days, I'll do it, too — when I need to know about bits of signals that I haven't covered before.)
Why are you using WUNTRACED instead of 0 or WNOHANG? WUNTRACED is a very specialized condition — POSIX says:
WUNTRACED
The status of any child processes specified by pid that are stopped, and whose status has not yet been reported since they stopped, shall also be reported to the requesting process.
Similar comments apply to WCONTINUED. Those two flags are useful when you need them, but you very seldom need them.
I suggest you should normally use either 0 or WNOHANG in the third argument to waitpid().

Yes, in the sense that only one of them can succeed for a given child process; if the signal handler interrupts the one in main, then after the signal handler returns, the child will already have been reaped and the call in main will fail.
With that said, however, it's bad practice to write code like this. There should be a single place you handle reaping of a given child process, and usually a signal handler is a very bad choice because it's global and it would have to be aware of all possible child processes your program might have finishing, and have a way to communicate those results to the proper parts of your program.
Instead, it's generally better to monitor the termination of child processes via poll on a pipe to/from the child process, and only waitpid after you know it's terminated, or to perform blocking waitpid from a thread whose only job is to wait for the child.

Does a KILL signal exit a process immediately?

I'm working on a server code that uses fork() and exec to create child processes. The PID of the child is registered when fork() succeeds and cleaned up when the CHILD signal has been caught.
If the server needs to stop, all programs are killed, eventually with a KILL signal. Now, this works by means of iterating through all registered PIDs and waiting for the CHILD signal handler to remove the PIDs. This will fail if child program did not exit properly. Therefore I want to use kill in combination with waitpid to ensure that PID list is cleaned up and log and do some other stuff otherwise.
Consider the next code sample:
kill(pid, SIGKILL);
waitpid(pid, NULL, WNOHANG);
Excerpt from waitpid(2):
waitpid(): on success, returns the process ID of the child whose state has changed; if WNOHANG was specified and one or more child(ren)
specified by pid exist, but have not yet changed state, then 0 is returned. On error, -1 is returned.
Is the process given by pid always gone before the next function kicks in? Will waitpid always return -1 in the above case?

Is the process given by pid always gone before the next function kicks in?
There is no guarantee for that. On a multiprocessor your process might be on CPU 0 while the cleanup in the kernel for the killed process takes place on CPU 1. That's a classical race-condition. Even on singlecore processors there is no guarantee for that.
Will waitpid always return -1 in the above case?
Since it is a race condition - in most cases it perhaps will. But there is no guarantee.
Since you are not interested in the status, this semicode might be more appropriate in your case:
// kill all childs
foreach(pid from pidlist)
kill(pid, SIGKILL);
// gather results - remove zombies
while( not_empty(pidlist) )
pid = waitpid(-1, NULL, WNOHANG);
if( pid > 0 )
remove_list_item(pidlist, pid);
else if( pid == 0 )
sleep(1);
else
break;

The KILL signal handler will run during the killed processes CPU time. This is potentially much later than your waitpid call, especially on a loaded system, so waitpid can very well return 0.

JFYI there is also the option of pidfd_send_signal(). From the manual:
The pidfd_send_signal() system call allows the avoidance of race
conditions that occur when using traditional interfaces (such as
kill(2)) to signal a process. The problem is that the
traditional interfaces specify the target process via a process
ID (PID), with the result that the sender may accidentally send a
signal to the wrong process if the originally intended target
process has terminated and its PID has been recycled for another
process. By contrast, a PID file descriptor is a stable
reference to a specific process; if that process terminates,
pidfd_send_signal() fails with the error ESRCH.
You can see an example on how to use it here.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight