My book on C programming under Linux says that if a process creates a child with fork(), then the pipe created between them follows this principle:
It is important to notice that both the parent process and the child process initially close their unused ends of the pipe.
If both processes start with their pipe ends closed, how do they know when the other is free to communicate? Is there perhaps an intermediate buffer between the processes?
Pipes on computers work very much like pipes in real life. There are two ends: you put something into one end and it comes out the other end.
When using pipes in a program, you usually want only the input end, where you write data, or only the output end, where data is read from. If the parent process only wants to write to the child process, and the child process only reads from the parent process, then the parent process can close the read end after the fork, and the child process can close the write end.
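To make the "close the unused end" advice concrete, here is a minimal sketch. The helper name send_to_child and the trick of returning the byte count through the child's exit status are my own assumptions for illustration, not something from the book being discussed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the parent writes msg into a pipe and the child
 * reads it. Returns how many bytes the child read, or -1 on error.
 * msg is limited to 200 bytes because the count comes back through the
 * child's exit status (0..255). */
int send_to_child(const char *msg)
{
    int fd[2];                      /* fd[0]: read end, fd[1]: write end */

    assert(msg != NULL);
    size_t len = strlen(msg);
    if (len > 200 || pipe(fd) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (pid == 0) {                 /* child: only reads */
        close(fd[1]);               /* close the unused write end */
        char buf[64];
        int total = 0;
        ssize_t n;
        while ((n = read(fd[0], buf, sizeof buf)) > 0)
            total += (int)n;
        close(fd[0]);
        _exit(n == 0 ? total : 255);    /* n == 0 means end-of-file */
    }

    /* parent: only writes */
    close(fd[0]);                   /* close the unused read end */
    ssize_t written = write(fd[1], msg, len);
    close(fd[1]);                   /* last write end closed: child sees EOF */

    int status;
    if (waitpid(pid, &status, 0) == -1 || written != (ssize_t)len)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) == 255)
        return -1;
    return WEXITSTATUS(status);
}
```

Calling send_to_child("hello") should report 5 bytes. Note that the child's read loop only terminates because every write end (the child's own copy and the parent's) gets closed.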
A pipe is an interprocess communication mechanism provided by the kernel. A process writing to the pipe does not have to wait for the other process to be ready to read: the kernel buffers the data in transit, so the communication is asynchronous (as long as the buffer is not full and some process still has the read end open).
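The kernel buffer can be seen even inside a single process: a write completes although nobody is reading yet, and the data is still there later. This is only a sketch; the helper name through_kernel_buffer and the 512-byte limit (chosen so the message surely fits in the pipe buffer) are my assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write msg into a fresh pipe, then read it back in
 * the same process. Returns the number of bytes read, or -1 on error.
 * msg must be short (at most 512 bytes here); a write larger than the
 * free space in the pipe buffer would block, since nobody is reading. */
ssize_t through_kernel_buffer(const char *msg, char *out, size_t cap)
{
    int fd[2];

    assert(msg != NULL && out != NULL && cap > 0);
    size_t len = strlen(msg);
    if (len > 512 || len >= cap || pipe(fd) == -1)
        return -1;

    /* No reader is waiting, yet the write returns at once:
     * the kernel stores the bytes in the pipe's buffer. */
    ssize_t written = write(fd[1], msg, len);
    close(fd[1]);

    ssize_t n = -1;
    if (written == (ssize_t)len) {
        n = read(fd[0], out, cap - 1);      /* fetch the buffered bytes */
        if (n >= 0)
            out[n] = '\0';
    }
    close(fd[0]);
    return n;
}
```

So the answer to "is there an intermediate buffer" is yes: neither side has to know when the other is ready, the read simply blocks while the buffer is empty and the write blocks while it is full.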
I have a while loop that reads data from a child process using blocking I/O by redirecting stdout of the child process to the parent process. Normally, as soon as the child process exits, a blocking read() in this case will return since the pipe that is read from is closed by the child process.
Now I have a case where the read() call does not exit for a child process that finishes. The child process ends up in a zombie state, since the operating system is waiting for my code to reap it, but instead my code is blocking on the read() call.
The child process itself does not have any child processes running at the time of the hang, and I do not see any file descriptors listed when looking in /proc/<child process PID>/fd. The child process did however fork two daemon processes, whose purpose seems to be to monitor the child process (the child process is a proprietary application I do not have any control over, so it is hard to say for sure).
When run from a terminal, the child process I try to read() from exits automatically, and in turn the daemon processes it forked terminate as well.
Linux version is 4.19.2.
What could be the reason for read() not returning in this case?
Follow-up: How to avoid read() from hanging in the following situation?
The child process did however fork two daemon processes ... What could be the reason of read() not returning in this case?
The forked daemon processes still have the write end of the pipe open after the child terminates. Hence the read call never returns 0 (end-of-file).
Those daemon processes should close all inherited file descriptors and open their own files for logging.
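The effect is easy to reproduce. In this sketch (all names are made up, and a grandchild that sleeps for one second stands in for the proprietary daemons), the parent measures how long it waits for end-of-file after the child has already exited:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical experiment: the child forks a stand-in "daemon" and exits
 * immediately. Returns how many seconds the parent waited for EOF on the
 * pipe, or -1.0 on error. */
double seconds_until_eof(int daemon_closes_pipe)
{
    int fd[2];

    assert(daemon_closes_pipe == 0 || daemon_closes_pipe == 1);
    if (pipe(fd) == -1)
        return -1.0;

    double start = now_seconds();
    pid_t child = fork();
    if (child == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1.0;
    }
    if (child == 0) {
        close(fd[0]);
        if (fork() == 0) {              /* grandchild: the "daemon" */
            if (daemon_closes_pipe)
                close(fd[1]);           /* drop the inherited write end */
            sleep(1);
            _exit(0);
        }
        _exit(0);                       /* the child itself exits right away */
    }

    close(fd[1]);                       /* parent keeps only the read end */
    char c;
    ssize_t n;
    while ((n = read(fd[0], &c, 1)) > 0)
        ;                               /* nothing is ever written here */
    double waited = now_seconds() - start;
    close(fd[0]);
    waitpid(child, NULL, 0);            /* reap the child */
    return n == 0 ? waited : -1.0;
}
```

With daemon_closes_pipe set to 1 the read returns almost immediately; with 0 it returns only after the grandchild exits about a second later. A daemon that never exits would keep the read blocked forever, which matches the hang described in the question.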
A possible reason (the most common) for read(2) blocking on a pipe with a dead child is that the parent has not closed the writing side of the pipe, so there is still a descriptor open for writing on that pipe. Close the writing side of the pipe in the parent process before reading from it. The child is dead (you said zombie), so it cannot be the process with the writing side of the pipe open. And don't forget to wait(2) for the child in the parent, or you'll get a system full of zombies :)
Remember, you have to do two closes in your code:
One in the parent process, to close the writing side of the pipe, leaving the parent process with only a reading descriptor.
One in the child process (just before exec(2)ing) closing the reading side of the pipe, leaving the child process only with a writing descriptor.
In case you want to use the pipe(2) to send information to the child instead, swap reading and writing in the above two points.
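The two closes can be sketched like this. The helper name capture_stdout is hypothetical, and running echo in the usage note assumes that program is on the PATH:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its stdout redirected into a pipe
 * and collect up to cap - 1 bytes of output in buf (NUL-terminated).
 * Returns the number of bytes stored, or -1 on error. */
ssize_t capture_stdout(char *const argv[], char *buf, size_t cap)
{
    int fd[2];

    assert(argv != NULL && argv[0] != NULL && buf != NULL && cap > 0);
    if (pipe(fd) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (pid == 0) {
        close(fd[0]);                   /* child: close the reading side */
        if (fd[1] != STDOUT_FILENO) {
            if (dup2(fd[1], STDOUT_FILENO) == -1)
                _exit(127);
            close(fd[1]);               /* stdout is now the only copy */
        }
        execvp(argv[0], argv);
        _exit(127);                     /* only reached if exec failed */
    }

    close(fd[1]);                       /* parent: close the writing side */
    size_t used = 0;
    ssize_t n = 0;
    while (used < cap - 1 &&
           (n = read(fd[0], buf + used, cap - 1 - used)) > 0)
        used += (size_t)n;
    buf[used] = '\0';
    close(fd[0]);
    waitpid(pid, NULL, 0);              /* reap the child: no zombie left */
    return n < 0 ? -1 : (ssize_t)used;
}
```

For example, with char *cmd[] = { "echo", "hello", NULL }, the call capture_stdout(cmd, out, sizeof out) should fill out with "hello" plus a newline. If you comment out the parent's close(fd[1]), the read loop never sees end-of-file.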
I know there's another thread with the same name, but this is actually a different question.
When a process forks multiple times, does the parent finish executing before the children? Vice versa? Concurrently?
Here's an example. Let's say I have a for loop that forks 1 parent process into 4 children. At the end of that for loop, I want the parent process to feed some data to the children via pipes. The data is written to each child process's respective stdin.
Will the parent send the data first, before any of the children execute their code? This is important, because we don't want it to start working from an invalid stdin.
The order of execution is determined by the specific OS scheduling policy and is not guaranteed by anything. To synchronize the processes there are special inter-process communication (IPC) facilities designed for this purpose. The pipes you mention are one example: a read on an empty pipe makes the reading process wait until the other process writes to it, creating a (one-way) synchronization point. Other examples are FIFOs and sockets. For simpler tasks the wait() family of functions or signals can be used.
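A small sketch of that synchronization point: the parent deliberately writes late, and the child still receives the right value because its read() blocks until data arrives. The function name and the artificial delay are assumptions for illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: the parent sleeps for delay_ms and only then sends
 * value to the child. The child exits with the value it read, which is
 * returned here (or -1 on error). value must be in 0..200. */
int child_waits_for_parent(int value, long delay_ms)
{
    int fd[2];

    assert(value >= 0 && value <= 200 && delay_ms >= 0 && delay_ms < 1000);
    if (pipe(fd) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {
        close(fd[1]);
        int got = 255;
        /* Blocks here until the parent writes, however late that is. */
        if (read(fd[0], &got, sizeof got) != (ssize_t)sizeof got)
            got = 255;
        close(fd[0]);
        _exit(got);
    }

    close(fd[0]);
    struct timespec delay = { .tv_sec = 0, .tv_nsec = delay_ms * 1000000L };
    nanosleep(&delay, NULL);            /* let the child run first */
    ssize_t written = write(fd[1], &value, sizeof value);
    close(fd[1]);

    int status;
    if (waitpid(pid, &status, 0) == -1 || written != (ssize_t)sizeof value)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) == 255)
        return -1;
    return WEXITSTATUS(status);
}
```

Whichever process the scheduler happens to run first, child_waits_for_parent(42, 100) should return 42, so the child cannot start working from an "invalid stdin" as long as it reads before it works.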
When a process forks multiple times, does the parent finish executing before the children? Vice versa? Concurrently?
Concurrently. The order depends on the scheduler and is unpredictable.
Using pipe to pass integer values between parent and child
This link explains in detail about sharing data between parent process and child.
Since you have four child processes, you may need to create a separate pipe for each child process.
Each byte of data written to a pipe will be read exactly once. It isn't duplicated to every process with the read end of the pipe open.
Multiple child processes reading/writing on the same pipe
Alternatively you can try shared memory for the data transfer.
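For the one-pipe-per-child variant, a minimal sketch could look like the following. The helper name feed_children and the use of exit statuses to report back are assumptions made to keep the example short.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 16

/* Hypothetical helper: fork n children, each with its own pipe, then send
 * values[i] to child i. Each child exits with the value it received, and
 * the parent stores that in results[i]. Values must be in 0..200 because
 * they travel back in the exit status. Returns 0 on success, -1 on error. */
int feed_children(const int values[], int results[], int n)
{
    int wfd[MAX_CHILDREN];              /* parent's write end per child */
    pid_t pids[MAX_CHILDREN];
    int started = 0, ok = 1;

    assert(values != NULL && results != NULL && n > 0 && n <= MAX_CHILDREN);

    for (int i = 0; i < n; i++) {
        int fd[2];
        if (pipe(fd) == -1) { ok = 0; break; }
        pid_t pid = fork();
        if (pid == -1) { close(fd[0]); close(fd[1]); ok = 0; break; }
        if (pid == 0) {
            close(fd[1]);
            for (int j = 0; j < i; j++)
                close(wfd[j]);          /* write ends inherited from siblings' pipes */
            int got = 255;
            if (read(fd[0], &got, sizeof got) != (ssize_t)sizeof got)
                got = 255;
            _exit(got);
        }
        close(fd[0]);
        wfd[i] = fd[1];
        pids[i] = pid;
        started++;
    }

    /* All children exist now; feed each one through its own pipe. */
    for (int i = 0; i < started; i++) {
        if (ok && write(wfd[i], &values[i], sizeof values[i])
                      != (ssize_t)sizeof values[i])
            ok = 0;
        close(wfd[i]);
    }
    for (int i = 0; i < started; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == -1 || !WIFEXITED(status) ||
            WEXITSTATUS(status) == 255)
            ok = 0;
        else
            results[i] = WEXITSTATUS(status);
    }
    return ok ? 0 : -1;
}
```

With a shared pipe, any child could pick up any value, since each byte is read exactly once. Separate pipes make sure child i gets values[i]. Note that later children inherit the write ends of earlier pipes, which is why each child closes them.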
They will execute concurrently. This is basically the point of processes.
Look into mutexes or other ways to deal with concurrency.
I am attempting to write a program which forks and waits for its child to finish; the child then does some work on an input and forks the same way its parent did, and so on.
Now, I know that forking copies the array of file descriptors to the child and that I should close the ones associated with the parent, but I can't figure out which ones are the parent's. Do I need to give my child its parent's PID?
I've been trying to wrap my head around it for the better part of an hour and I think I have some kind of a mind block because I can't come to a conclusion.
TL;DR: As a child process how do I know which file descriptors belong to my parent?
Just after the fork (and before any exec function) your child process has the same state as its parent process (except for the result of the fork, which is 0 only in the child). So you know which file descriptors are open, since you have coded the program running in both parent and child. On Linux you might also read the /proc/self/fd/ directory, see proc(5).
You might close most file descriptors after the fork and before the exec; you could code something like
for (int fd=3; fd<64; fd++) (void) close(fd);
We start from 3, which comes right after STDERR_FILENO (which is 2), and we stop arbitrarily at 64. The cast to (void) on the close call tells the reader that we don't care whether close fails. Of course, if you have e.g. some pipe(7)-s to communicate between parent and child, you'll be careful to avoid closing their relevant file descriptor(s).
(However, a closing loop like the one above is in poor taste and old-fashioned.)
In general, you'll be careful in your program to set the close-on-exec flag on most file descriptors (e.g. with fcntl(2), using the F_SETFD operation and the FD_CLOEXEC flag, or directly at open(2) time with O_CLOEXEC); then the execve(2) (done in most child processes after the fork) will close them.
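Setting the flag takes only a few lines. This is a sketch with hypothetical helper names; the fcntl(2) and open(2) calls themselves are the standard POSIX ones:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Set FD_CLOEXEC on an already open descriptor. Returns 0 or -1. */
int set_cloexec(int fd)
{
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}

/* Returns 1 if the descriptor will be closed by a successful exec. */
int has_cloexec(int fd)
{
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFD);
    return flags != -1 && (flags & FD_CLOEXEC) != 0;
}

/* Ask for the flag directly when opening a file. */
int open_cloexec(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDONLY | O_CLOEXEC);
}

/* Create a pipe whose two ends both carry the flag. Returns 0 or -1. */
int make_cloexec_pipe(int fd[2])
{
    assert(fd != NULL);
    if (pipe(fd) == -1)
        return -1;
    if (set_cloexec(fd[0]) == -1 || set_cloexec(fd[1]) == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    return 0;
}
```

Descriptors prepared this way stay open across fork() but disappear in the exec'ed program, so the child does not need a closing loop. A descriptor that the new program should keep (for example one moved onto stdin with dup2) does not carry the flag, because dup2 clears it on the new descriptor.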
I have a process that forks, performs computation and writes out data to stdout. The child process also writes out data to stdout after performing some computation. Currently, the output from the parent and the child come out separately. However, I'm concerned that the output from the child may be printed mixed with the output from the parent.
I.e. I have this line in both the child and the parent:
fprintf(stdout, "%s\n", do_computation());
Is there any neat way to prevent the writes from being interleaved? It hasn't happened so far, however I'm concerned that it may.
This is the standard multitasking issue, and it is solved the same way as for any other shared resource. It's your responsibility to create and manage semaphores so the processes can negotiate periods of exclusive access to shared resources such as these streams, or to arrange a similarly safe mechanism for them to communicate amongst themselves. For example, the child processes could respond not on stdout but via one pipe per process back to the parent, and the parent could poll those pipes and report their results as complete messages become available.
There should be plenty of good tutorials on the web on multiprocess programming in C.
Conceptually, you want to achieve before-or-after atomicity for the fprintf calls in the different processes. In the parent process, you can call waitpid for the child process right before the fprintf call, so that the call in the parent is guaranteed to execute after the child terminates, without reducing the parallelism of the computation.
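The waitpid idea could be sketched as follows. To keep the example free of stdio buffering effects I have swapped fprintf for a single write(2) per line; that substitution, the function names and the placeholder strings are my assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Emit a whole line with one write() call. Returns 0 or -1. */
static int write_line(int fd, const char *line)
{
    size_t len = strlen(line);
    return write(fd, line, len) == (ssize_t)len ? 0 : -1;
}

/* Hypothetical helper: child and parent both report a result on out_fd,
 * and the parent's line always comes after the child's. Returns 0 or -1. */
int ordered_output(int out_fd)
{
    assert(out_fd >= 0);

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* child: its computation would go here */
        _exit(write_line(out_fd, "child result\n") == 0 ? 0 : 1);
    }

    /* parent: its own computation would run here, in parallel with the child */

    int status;
    if (waitpid(pid, &status, 0) == -1)     /* child has finished its output */
        return -1;
    if (write_line(out_fd, "parent result\n") == -1)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0 ? 0 : -1;
}
```

Calling ordered_output(STDOUT_FILENO) prints the child's line first and the parent's line second. If you stay with fprintf, it is worth calling fflush(stdout) before the fork so that buffered text is not emitted twice.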
Almost all the pipe examples I've seen advise closing the unused write/read ends. Also, man clearly states that pipe() creates a pipe, a unidirectional data channel. But I've tried reading and writing to both ends of the pipe in both the parent and the child, and everything seems to be OK.
So my question is: why do we need two pipes if two processes both have to read from and write to each other, and why not do it using a single pipe?
If you use the same pipe, how does the child separate its messages from the parent's messages and vice versa?
For example:
Parent writes to pipe
Parent reads from pipe hoping to get message from child but gets its own message :(
It is much easier to use one pipe for child->parent and another pipe for parent->child.
Even if you have some protocol for reading/writing it is quite easy to deadlock the parent and child process.
After the fork both processes hold both descriptors, so either of them can write to the write end and read from the read end. Unidirectional means that there is only a single stream of data, flowing from the write end to the read end. With one pipe the two processes must take turns: one has to finish writing, and the other has to finish reading that message, before the roles can be swapped, otherwise a process may read back its own data. With two pipes each direction has its own channel, so both processes can send and receive independently. In layman's terms, a single pipe carries traffic in only one direction at any point in time.
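A minimal sketch of the two-pipe layout, with one pipe per direction. The request/reply "protocol" (the child doubles an integer) and the function name are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: send x to the child over one pipe and receive
 * 2 * x back over a second pipe. Returns 0 on success, -1 on error. */
int ask_child_to_double(int x, int *reply)
{
    int to_child[2], to_parent[2];

    assert(reply != NULL && x > INT_MIN / 2 && x < INT_MAX / 2);
    if (pipe(to_child) == -1)
        return -1;
    if (pipe(to_parent) == -1) {
        close(to_child[0]);
        close(to_child[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(to_child[0]);  close(to_child[1]);
        close(to_parent[0]); close(to_parent[1]);
        return -1;
    }
    if (pid == 0) {
        close(to_child[1]);             /* child reads requests ...   */
        close(to_parent[0]);            /* ... and writes replies     */
        int v, rc = 1;
        if (read(to_child[0], &v, sizeof v) == (ssize_t)sizeof v) {
            v *= 2;
            if (write(to_parent[1], &v, sizeof v) == (ssize_t)sizeof v)
                rc = 0;
        }
        _exit(rc);
    }

    close(to_child[0]);                 /* parent writes requests ... */
    close(to_parent[1]);                /* ... and reads replies      */
    int ok = write(to_child[1], &x, sizeof x) == (ssize_t)sizeof x;
    close(to_child[1]);

    int answer = 0;
    if (ok)
        ok = read(to_parent[0], &answer, sizeof answer)
                 == (ssize_t)sizeof answer;
    close(to_parent[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0)
        ok = 0;
    if (!ok)
        return -1;
    *reply = answer;
    return 0;
}
```

Because the parent only ever reads from to_parent, it cannot receive its own request by accident, which is exactly the failure described in the "Parent reads from pipe hoping to get message from child" example.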