I'm writing a program that has to execute external processes. Right now the program launches each process's command line via popen, reads any output, and then gets the exit status from pclose.
What is happening, however, is that for fast-running processes (e.g. the launched process errors out quickly) the pclose call cannot get the exit status (pclose returns -1, errno is ECHILD).
Is there a way for me to mimic the popen/pclose type behavior, except in a manner that guarantees capturing the process end "event" and the resultant return code? How do I avoid the inherent race condition with pclose and the termination of the launched process?
fork/exec/wait
popen is just a wrapper to simplify the fork/exec calls. If you want to acquire the output of the child, you'll need to create a pipe, call fork, dup the child's file descriptors to the pipe, and then exec. The parent can read the output from the pipe and call wait to get the child's exit status.
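The steps above can be sketched roughly as follows. The function name run_and_capture and the exit code 127 for a failed exec are illustrative choices, not part of any standard API. Because the child remains a zombie until the parent calls waitpid, its exit status is not lost no matter how quickly it terminates. The sketch assumes SIGCHLD is left at its default disposition and that nothing else in the program reaps the child; if SIGCHLD is ignored, waiting fails with ECHILD, which may be what is happening with pclose here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] with the given arguments, copies the
 * child's stdout into buf (at most cap-1 bytes, NUL-terminated) and returns
 * the child's exit code. Returns -1 on setup failure or if the child was
 * killed by a signal. */
int run_and_capture(char *const argv[], char *buf, size_t cap)
{
    int fds[2];
    if (cap == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: route stdout into the pipe, then drop both original ends. */
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);               /* reached only if exec itself failed */
    }

    /* Parent: close the write end, or read() below would never see EOF. */
    close(fds[1]);

    size_t used = 0;
    char tmp[4096];
    ssize_t n;
    while ((n = read(fds[0], tmp, sizeof tmp)) > 0) {
        /* Keep what fits; keep draining so the child never blocks on write. */
        for (ssize_t i = 0; i < n && used + 1 < cap; i++)
            buf[used++] = tmp[i];
    }
    buf[used] = '\0';
    close(fds[0]);

    /* Reap the child and translate its wait status into an exit code. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with an argument vector such as { "sh", "-c", "echo hi; exit 3", NULL }, this should return 3 and leave "hi\n" in the buffer.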
You can use vfork() and execv().
I have a pipe which I opened with FILE *telnet = popen("telnet server", "w"). If telnet exits after a while because the server is not found, the pipe is closed from the other end.
Then I would expect some error, either in fprintf(telnet, ...) or fflush(telnet) calls, but instead, my program suddenly dies at fflush(telnet) without reporting the error. Is this normal behaviour? Why is it?
Converting (expanded) comments into an answer.
If you write to a pipe when there's no process at the other end of the pipe to read the data, you get a SIGPIPE signal to let you know, and the default behaviour for SIGPIPE is to exit (no core dump, but exit with prejudice).
If you examine the exit status in the shell, you should see $? is 141 (128 + SIGPIPE, which is normally 13).
If you don't mind that the process exits, you need do nothing. Alternatively, you can set the signal handler for SIGPIPE to SIG_IGN, in which case your writing operation should fail with an error, rather than terminating the process. Or you can set up more elaborate signal handling.
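A minimal sketch of the SIG_IGN option, using a bare pipe with no reader in place of the telnet process (the function name write_to_closed_pipe is made up for illustration):

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Writes one byte into a pipe that has no reader, with SIGPIPE ignored.
 * Returns the errno value of the failed write, 0 if the write unexpectedly
 * succeeded, or -1 if the setup failed. */
int write_to_closed_pipe(void)
{
    int fds[2];

    /* With SIGPIPE ignored, write() reports the problem instead of the
     * process being terminated. */
    if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
        return -1;
    if (pipe(fds) == -1)
        return -1;

    close(fds[0]);                      /* no reader is left */

    ssize_t n = write(fds[1], "x", 1);
    int err = (n == -1) ? errno : 0;    /* expected: EPIPE */

    close(fds[1]);
    return err;
}
```

With a stdio stream such as the one returned by popen, the same failure should surface as fflush returning EOF with errno set to EPIPE.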
Note that one of the reasons you need to be careful to close unused pipe file descriptors is that a process writing to a pipe while it also holds the read end open will not get SIGPIPE. Instead, it may block once the pipe is full: it cannot write more until some process reads from the pipe, but the only process that can read from the pipe is the one that is trying to write to it.
I have a while loop that reads data from a child process using blocking I/O by redirecting stdout of the child process to the parent process. Normally, as soon as the child process exits, a blocking read() in this case will return since the pipe that is read from is closed by the child process.
Now I have a case where the read() call does not exit for a child process that finishes. The child process ends up in a zombie state, since the operating system is waiting for my code to reap it, but instead my code is blocking on the read() call.
The child process itself does not have any child processes running at the time of the hang, and I do not see any file descriptors listed when looking in /proc/<child process PID>/fd. The child process did however fork two daemon processes, whose purpose seems to be to monitor the child process (the child process is a proprietary application I do not have any control over, so it is hard to say for sure).
When run from a terminal, the child process I try to read() from exits automatically, and in turn the daemon processes it forked terminate as well.
Linux version is 4.19.2.
What could be the reason of read() not returning in this case?
Follow-up: How to avoid read() from hanging in the following situation?
The child process did however fork two daemon processes ... What could be the reason of read() not returning in this case?
The forked daemon processes still have the file descriptor open when the child terminates. Hence the read call never returns 0.
Those daemon processes should close all file descriptors and open files for logging.
A possible reason (the most common) for read(2) blocking on a pipe with a dead child, is that the parent has not closed the writing side of the pipe, so there's still an open (for writing) descriptor for that pipe. Close the writing side of the pipe in the parent process before reading from it. The child is dead (you said zombie) so it cannot be the process with the writing side of the pipe open. And don't forget to wait(2) for the child in the parent, or you'll get a system full of zombies :)
Remember, you have to do two closes in your code:
One in the parent process, to close the writing side of the pipe, leaving the parent process with only a reading descriptor.
One in the child process (just before exec(2)ing) closing the reading side of the pipe, leaving the child process only with a writing descriptor.
In case you want to use the pipe(2) to send information to the child, swap reading and writing in the above two points.
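As a sketch of the reversed direction (parent writes, child reads), with the two closes marked. The function name send_to_child and the idea of reporting the byte count through the exit code are illustrative only:

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sends msg to a forked child over a pipe. The child counts the bytes it
 * reads until EOF and reports the count (modulo 256) as its exit code.
 * Returns that exit code, or -1 on failure. */
int send_to_child(const char *msg)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Close #1 (child): drop the write end. While the child holds it,
         * its own read() can never return 0. */
        close(fds[1]);

        char buf[256];
        size_t total = 0;
        ssize_t n;
        while ((n = read(fds[0], buf, sizeof buf)) > 0)
            total += (size_t)n;
        _exit((int)(total & 0xff));
    }

    /* Close #2 (parent): drop the read end, keep only the write end. */
    close(fds[0]);

    size_t len = strlen(msg);
    ssize_t written = write(fds[1], msg, len);

    /* Closing the last write end is what delivers EOF to the child. */
    close(fds[1]);

    int status;
    if (waitpid(pid, &status, 0) == -1 || written != (ssize_t)len)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Removing the close(fds[1]) in the child should make the child hang in read(), and the parent hang in waitpid() with it.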
Why does the parent process need to close both file descriptors of a pipe before calling wait()?
I have a C program which does:
Parent creates child_a, which executes ls -l using execvp, and writes to the pipe (after closing read end of pipe).
Parent creates another child (without closing any file descriptor for the pipe), called child_b, which executes 'wc' by reading from the pipe (after closing the write end of the pipe).
Parent waits for both children to complete by calling wait() twice.
I noticed that the program blocks if the parent does not close both file descriptors of the pipe before calling the wait() syscall. After reading a few questions already posted online, it looks like this is the general rule and needs to be done, but I could not find the reason why.
Why does wait() not return if the parent does not close the file descriptors of the pipe?
I was thinking that, in the worst case, if the parent does not close the file descriptors of the pipe, the only consequence would be that the pipe keeps existing (which is a waste of resources). But I never thought this would block the execution of a child process (as can be seen because wait() does not return).
Also remember, parent is not using the pipe at all. It is child_a writing in the pipe, and child_b reading from the pipe.
If the parent process doesn't close the write ends of the pipes, the child processes never get EOF (zero bytes read) because there's a process that might (but won't) write to the pipe. The child process must also close the write end of the pipe for the same reason — if it doesn't, there's a process (itself) that might (but won't) write to the pipe, so the read won't return EOF.
If you duplicate one end of a pipe to standard output or standard error, you should close both ends of that pipe. It is a common mistake not to have enough calls to close() in multiprocess code using pipes. Occasionally, you get away with being sloppy, but the details vary by case and usually you don't.
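A sketch of the ls -l | wc arrangement from the question with every close in place; the function name run_ls_wc and the 127 exit code for a failed exec are illustrative choices. It assumes ls and wc can be found on PATH.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs "ls -l | wc" and returns wc's exit code, or -1 on failure. */
int run_ls_wc(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t child_a = fork();
    if (child_a == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child_a == 0) {
        /* child_a: stdout becomes the write end; close both originals. */
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execlp("ls", "ls", "-l", (char *)NULL);
        _exit(127);
    }

    pid_t child_b = fork();
    if (child_b == 0) {
        /* child_b: stdin becomes the read end; close both originals,
         * including the write end, or wc would never see EOF. */
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]);
        close(fds[1]);
        execlp("wc", "wc", (char *)NULL);
        _exit(127);
    }

    /* Parent: it uses neither end. If it kept fds[1] open, wc would wait
     * forever for more input and the wait below would not return. */
    close(fds[0]);
    close(fds[1]);

    int status_a, status_b;
    waitpid(child_a, &status_a, 0);
    if (child_b == -1 || waitpid(child_b, &status_b, 0) == -1)
        return -1;
    return WIFEXITED(status_b) ? WEXITSTATUS(status_b) : -1;
}
```

Commenting out the parent's close(fds[1]) should reproduce the hang described in the question: ls exits, but wc keeps waiting for a writer that never writes.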
I am using fork to create child and parent processes, and a pipe to send and receive messages between them. I run the parent process with some inputs and it starts the child process. If the parent process finishes successfully, the child process is killed automatically. But if I press Ctrl+C while the parent process is running, or if there is a segmentation fault, the parent process is killed and the child process is not.
Can anybody post the logic to kill the child process when the parent process terminates abruptly?
Since you already use a pipe for communication, this should be easy.
Pipe reads block until there's data to read; if the parent process is the only writer and it dies, the reader gets EOF immediately. So as long as the parent never closes its write end while it is alive (and the child closes its own inherited copy of that end), EOF is a reliable way to detect the parent's death.
Pipe writes raise SIGPIPE when there are no readers; if the signal is ignored, the write call fails with EPIPE instead.
In the child, select (if you can block) on the pipe's fd and terminate the process at the appropriate time. There is no equivalent of SIGCHLD for a parent dying.
man 7 pipe for a good overview. Excerpt:
If all file descriptors referring to the write end of a pipe have been closed, then an attempt to read(2) from
the pipe will see end-of-file (read(2) will return 0). If all file descriptors referring to the read end of a
pipe have been closed, then a write(2) will cause a SIGPIPE signal to be generated for the calling process. If
the calling process is ignoring this signal, then write(2) fails with the error EPIPE. An application that
uses pipe(2) and fork(2) should use suitable close(2) calls to close unnecessary duplicate file descriptors;
this ensures that end-of-file and SIGPIPE/EPIPE are delivered when appropriate.
Let's suppose we have code doing something like this:
int pipes[2];
pipe(pipes);
pid_t p = fork();
if(0 == p)
{
    dup2(pipes[1], STDOUT_FILENO);
    execv("/path/to/my/program", NULL);
    ...
}
else
{
    //... parent process stuff
}
As you can see, it's creating a pipe, forking and using the pipe to read the child's output (I can't use popen here, because I also need the PID of the child process for other purposes).
The question is, what should happen if execv fails in the above code? Should I call exit() or abort()? As far as I know, those functions close the open file descriptors. Since the forked process inherits the parent's file descriptors, does that mean the file descriptors used by the parent process will become unusable?
Update
I want to emphasize that the question is not about the executable loaded by exec() failing, but about exec itself failing, e.g. when the file referred to by the first argument is not found or is not executable.
You should use exit(int), since the low byte of the argument can be read by the parent process using waitpid(). This lets you handle the error appropriately in the parent process. Depending on what your program does, you may want to use _exit instead of exit. The difference is that _exit will not run functions registered with atexit, nor will it flush stdio streams.
There are about a dozen reasons execv() can fail and you might want to handle each differently.
The child failing is not going to affect the parent's file descriptors. They are, in effect, reference counted.
You should call _exit(). It terminates the process and reports the status just as exit() does, but it does not run registered atexit() functions or flush stdio buffers, which the child inherited from the parent. Calling _exit() means that the parent will be able to get your failed child's exit status and take any necessary steps.
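A minimal sketch of that pattern. The function name spawn_and_wait is made up, and the 127/126 exit codes follow the shell convention for "not found" and "found but could not execute"; any distinct values agreed between parent and child would do.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Tries to exec path in a forked child and waits for it.
 * Returns the child's exit code (127 or 126 if execv itself failed),
 * or -1 if fork/waitpid failed or the child was killed by a signal. */
int spawn_and_wait(const char *path)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        char *const argv[] = { (char *)path, NULL };
        execv(path, argv);

        /* Only reached if execv failed. _exit skips atexit handlers and
         * does not flush stdio buffers copied from the parent. */
        _exit(errno == ENOENT ? 127 : 126);
    }

    /* The parent's own descriptors are unaffected by the child exiting. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For a path that does not exist, the parent should see 127 and can report the failure without any of its own file descriptors having been touched.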