My problem is a bit hard to explain properly as I do not understand fully the behavior behind it.
I have been working on pipe and pipelines in C, and I noticed some behavior that is a bit mysterious to me.
Let's take a few examples. Let's try to pipe yes into head (yes | head). Even though I coded the behavior in a custom program, I don't understand how the pipe knows when to stop piping. It seems two underlying phenomena may be causing this: SIGPIPE and/or the internal size of a pipe. How does the pipe stop piping? Is it when it's full? But the size of a pipe is far larger than 10 "y\n" lines, no? And SIGPIPE only applies when the read or write end is closed, no?
Also let's take another example, for example cat and ls: cat | ls or even cat | cat | ls.
It seems the stdin of the pipe is waiting for input, but how does it know when to stop, i.e. after one input? What is the mechanism that permits this behavior?
Also, can anyone provide me with other examples of these very specific behaviors in pipes and pipelines, if there are any, so I can get a good overview of these mechanisms?
In my own implementation, I managed to replicate that behavior using waitpid. However, how does the child process itself know when to stop? Is it command specific?
The write operation will block when the pipe buffer is full, the read operation will block when the buffer is empty.
When the write end of the pipe is closed, the reading process will get an EOF indication after reading all data from the buffer. Many programs will terminate in this case.
When the read end of the pipe is closed, the writing process will get a SIGPIPE. This will also terminate most programs.
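The two closing rules above can be checked in isolation. The following is a minimal sketch, not a definitive implementation; the function names are made up for illustration. SIGPIPE is temporarily ignored in the second function so that the failed write can be inspected instead of killing the process.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Returns what read() reports on an empty pipe whose write end is closed.
   With no writer left, read() returns 0 (EOF) instead of blocking. */
ssize_t read_after_writer_closed(void)
{
    int fds[2];
    char byte;
    if (pipe(fds) == -1)
        return -1;
    close(fds[1]);                          /* no writer left */
    ssize_t n = read(fds[0], &byte, 1);
    close(fds[0]);
    return n;
}

/* Returns the errno set by write() on a pipe whose read end is closed.
   SIGPIPE is ignored for the duration of the call; with the default
   disposition the process would be terminated by the signal instead. */
int errno_after_reader_closed(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;
    void (*old)(int) = signal(SIGPIPE, SIG_IGN);
    close(fds[0]);                          /* no reader left */
    ssize_t n = write(fds[1], "x", 1);
    int saved = (n == -1) ? errno : 0;
    close(fds[1]);
    signal(SIGPIPE, old);                   /* restore previous disposition */
    return saved;
}
```

Called from a main function, the first helper should return 0 and the second should return EPIPE.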
When you run cat | ls, STDOUT of cat is connected to STDIN of ls, but ls does not read from STDIN. On the system where I checked this, ls simply ignores STDIN and the file descriptor will be closed when ls terminates.
You will see the output of ls, and cat will be waiting for input.
cat will not write anything to STDOUT before it has read enough data from STDIN, so it will not notice that the other end of the pipe has been closed.
cat will terminate when it detects EOF on STDIN which can be done by pressing CTRL+D or by redirecting STDIN from /dev/null, or when it gets SIGPIPE after trying to write to the pipe which will happen when you (type something and) press ENTER.
You can see the behavior with strace.
cat terminates after EOF on input which is shown as read(0, ...) returning 0.
strace cat < /dev/null | ls
cat killed by SIGPIPE.
strace cat < /dev/zero | ls
How does the pipe stop piping
The pipe stops piping when either end is closed.
If the input(write) end of the pipe is closed, then any data in the pipe is held until it is read from the output end. Once the buffer is emptied, anyone subsequently reading from the output end will get an EOF.
If the output(read) end of the pipe is closed, any data in the pipe will be discarded. Anyone subsequently writing to the input end will get a SIGPIPE/EPIPE. Note that a process merely holding open the input but not actively writing to it will not be signalled.
So when you type cat | ls you get a cat program with stdout connected to the input of the pipe and ls with stdin connected to the output. ls runs and outputs some stuff (to its stdout, which is still the terminal) and never reads from stdin. Once done it exits, which closes the output of the pipe. Meanwhile cat is waiting for input from its stdin (the terminal). When it gets it (you type a line), it writes it to stdout, gets a SIGPIPE/EPIPE and exits (discarding the data, as there's no one to write it to). This closes the input of the pipe, so the pipe goes away now that both ends have been closed.
Now let's look at what happens with cat | cat | ls. You now have two pipes and two cat programs. As before, ls runs and exits, closing the output of the second pipe. Now you type a line and the first cat reads it and copies it to the first pipe (still fully open), where the second cat reads it and copies it to the second pipe (which has its output closed), so the second cat gets a SIGPIPE/EPIPE and exits (which closes the output of the first pipe). At this point the first cat is still waiting for input, so if you type a second line, it copies that to the first pipe, whose output is now closed, gets a SIGPIPE/EPIPE and exits.
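The same mechanism explains yes | head: head exits after ten lines, which closes the read end, and the next write by yes raises SIGPIPE. Below is a rough sketch of that situation inside one program, assuming a forked child plays the role of yes and the parent plays the role of head. The function name run_yes_head is hypothetical.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a writer that emits "y\n" forever, reads `lines` lines from it,
   then closes the read end. Returns the number of the signal that
   terminated the writer, 0 if it exited normally, or -1 on error. */
int run_yes_head(int lines)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                      /* child: behaves like yes */
        signal(SIGPIPE, SIG_DFL);        /* make sure the default action applies */
        close(fds[0]);
        for (;;) {
            /* Blocks while the pipe is full; raises SIGPIPE once no reader is left. */
            if (write(fds[1], "y\n", 2) == -1)
                _exit(1);                /* reached only if SIGPIPE did not kill us */
        }
    }

    close(fds[1]);                       /* parent: behaves like head */
    int seen = 0;
    char c;
    while (seen < lines && read(fds[0], &c, 1) == 1)
        if (c == '\n')
            seen++;
    close(fds[0]);                       /* last reader gone */

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

The writer never decides to stop on its own. It is either blocked on a full pipe or killed by the signal, which is why waitpid in the parent reports termination by SIGPIPE.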
How does the pipe stop piping, is it when it's full ?
A pipe has several states:
1. If you obtain the pipe through a call to pipe(2) (an unnamed pipe), both file descriptors are already open, so this state doesn't apply (you start at point 2 below). A pipe has two sides, the writer side and the reader side. When you open a named pipe, your open(2) call attaches you to one side or both, depending on whether you open it with O_RDONLY, O_WRONLY, or O_RDWR. Up to here, the pipe blocks any open(2) call until both sides have at least one process attached to them. So, if you open a pipe for reading, your open will block until another process has opened it for writing.
2. Once both ends are open, the readers (the processes issuing a read(2) call) block when the pipe is empty, and the writers (the processes issuing a write(2) call) block whenever the write call cannot be satisfied because it would completely fill the pipe. Old implementations of pipes used the filesystem to hold the data, and the data was stored only in the directly addressed disk blocks. This meant (as there are 10 such blocks in an inode) that you normally had space in the pipe to hold 10 blocks; after that, the writers were blocked. Later, pipes were implemented using the socket infrastructure in BSD systems, which allowed you to control the buffer size with ioctl(2) calls. Today, as far as I know, pipes use a common implementation that is separate from sockets.
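How much a pipe holds before writers block depends on the system, so it is easier to measure than to assume. Here is a small sketch that does so by switching the write end to non-blocking mode and counting bytes until write(2) refuses more. The function name pipe_capacity is made up for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Fills a fresh pipe one byte at a time and returns how many bytes it
   accepted before a non-blocking write failed, or -1 on error. */
long pipe_capacity(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    /* Non-blocking mode: a full pipe makes write() fail with EAGAIN
       instead of putting the caller to sleep. */
    int flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    long total = 0;
    char byte = 'x';
    for (;;) {
        ssize_t n = write(fds[1], &byte, 1);
        if (n == 1) {
            total++;
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;                    /* interrupted, try again */
        break;                           /* EAGAIN: the pipe is full */
    }

    close(fds[0]);
    close(fds[1]);
    return total;
}
```

Whatever value this reports, it is much larger than ten short lines, which is why a full buffer is not what stops yes | head.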
3. As processes close the pipe, it continues to work as described in point 2 above until the number of readers or writers drops to zero. Once there are no writers, the pipe gives an end-of-file condition to all readers (the read(2) syscall returns 0 bytes without blocking); once there are no readers, writers get an error (EPIPE, cannot write to pipe). In addition, the kernel sends SIGPIPE to any process that attempts to write to a pipe with no readers. If you have not ignored that signal or installed a signal handler for it, your process will die. In this state, it's impossible to reopen the pipe until all processes have closed it.
A common error occurs when you call pipe() or open a pipe with O_RDWR, the other process closes its file descriptor, and you get no indication of the other's close call. This happens because both sides of the pipe are still open (in your own process), so the reader receives nothing: as far as the kernel is concerned, someone can still write to the pipe.
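This pitfall can be reproduced in a single process. In the sketch below, a duplicated write descriptor stands in for the write end that a process forgot to close, and the read end is made non-blocking so the demonstration reports "would block" rather than hanging. The function name is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if read() reports "no data yet" while a stray write
   descriptor is still open and EOF once it is closed, 0 if the
   behaviour differs, -1 on setup error. */
int eof_needs_all_writers_closed(void)
{
    int fds[2];
    char byte;
    if (pipe(fds) == -1)
        return -1;

    int spare = dup(fds[1]);             /* the write end left open by mistake */
    int flags = fcntl(fds[0], F_GETFL);
    if (spare == -1 || flags == -1 ||
        fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        if (spare != -1)
            close(spare);
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    close(fds[1]);                       /* the "other side" closes its copy */

    ssize_t before = read(fds[0], &byte, 1);   /* a writer still exists */
    int before_errno = errno;

    close(spare);                        /* now every write descriptor is closed */
    ssize_t after = read(fds[0], &byte, 1);    /* EOF */

    close(fds[0]);
    return before == -1 &&
           (before_errno == EAGAIN || before_errno == EWOULDBLOCK) &&
           after == 0;
}
```

With a blocking descriptor the first read would simply hang, which is the symptom usually seen after fork when the reading process keeps its inherited write end open.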
Any other kind of misbehaviour could be explained if you had posted some code, but you didn't, so IMHO this answer is still incomplete. The number of different scenarios is too large to enumerate, so I'll watch for any update to your question with some faulty code (or code needing explanation).
In a Windows program with two threads: thread1 and thread2.
When thread1 is blocked in the call fread(buffer, 1, 10, stdin) waiting for input, is it possible to do something from thread2 to force fread to return?
So far I tried calling fclose(stdin) from thread2, but it doesn't seem to work. The program gets stuck at the fclose call until some input is available in the stdin stream.
What I'm trying to achieve is to terminate thread1 gracefully instead of just killing it with TerminateThread, because there's some work that thread1 has to do at the end.
Another thing to consider is that stdin is one end of a named pipe. I don't have control over the program at the other end of the pipe.
What I need is just to disconnect my program from its end of the pipe (stdin in this case).
Calling fclose(stdin) is a very bad idea; it causes undefined behavior if it happens before the fread (which it's not ordered with respect to) or if the thread calling fread does anything else with stdin after fread returns, and it does not unblock the fread since fclose cannot proceed until it obtains a lock on stdin, which the in-progress fread is excluding.
stdio is just fundamentally unsuitable for what you want to do here. You could patch it up via forwarding through a second pipe, with yet another thread reading from stdin and writing into your new pipe. Then you could abort an fread on the read end of your pipe by closing the write end of it. The extra thread would still be stuck, but that doesn't really matter if you're going to be terminating anyway. Alternatively (and this is probably cleaner) you would use file descriptors (or Windows file handles) instead of stdio and poll (or the Windows equivalent) to determine whether there's input to be read. You could combine these approaches to put the Windows-specific file handle logic in the extra thread (thus being able to terminate it cleanly) and continue to use portable stdio in your program logic thread.
If you can't use CancelSynchronousIo you can close the underlying file handle with CloseHandle, like this:
CloseHandle((HANDLE)_get_osfhandle(_fileno(stdin))) ;
That should cause fread to return.
I'm trying to write to stdin and read from stdout ( and stderr ) from an external program, without changing the code.
I've tried using named pipes, but stdout doesn't show until the program is terminated, and stdin only works on the first input (then cin is null).
I've tried using /proc/[pid]/fd, but that only writes to and reads from the terminal and not the program.
I've tried writing a character device file for this and it worked, but only for one program at a time (this needs to work for multiple programs at a time).
At this point, to my knowledge, I could extend the driver that worked to multiplex the I/O across multiple programs, but I don't think that's the "right" solution.
The main purpose of this is to view a feed of a program through a web interface. I'm sure there has to be some way to do this. Is there anything I haven't tried that's been done before?
The typical way of doing this is:
Create anonymous pipes (not named pipes) with the pipe(2) system call for the new process's standard streams
Call fork(2) to spawn the child process
close(2) the appropriate ends of the pipes in both the parent and the child (e.g. for the stdin pipe, close the read end in the parent and close the write end in the child; vice-versa for the stdout and stderr pipes)
Use dup2(2) in the child to copy the pipe file descriptors onto file descriptors 0, 1, and 2, and then close(2) the remaining old descriptors
exec(3) the external application in the child process
In the parent process, simultaneously write to the child's stdin pipe and read from the child's stdout and stderr pipes. However, depending on how the child behaves, this can easily lead to deadlock if you're not careful. One way to avoid deadlock is to spawn separate threads to handle each of the 3 streams; another way is to use the select(2) system call to wait until one of the streams can be read from/written to without blocking, and then process that stream.
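The steps above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not a complete solution: it wires up only stdin and stdout (stderr is left alone), it runs cat as a stand-in for the external program, and it avoids the deadlock issue by sending one small input and closing the child's stdin before reading. The name run_cat is made up.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Feeds `input` to a cat child and stores what it prints in `out`
   (NUL-terminated). Returns the number of bytes read, or -1 on error. */
ssize_t run_cat(const char *input, char *out, size_t outsz)
{
    int to_child[2], from_child[2];
    if (outsz == 0 || pipe(to_child) == -1)
        return -1;
    if (pipe(from_child) == -1) {
        close(to_child[0]); close(to_child[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(to_child[0]); close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        return -1;
    }
    if (pid == 0) {                              /* child */
        dup2(to_child[0], STDIN_FILENO);         /* pipe read end becomes stdin */
        dup2(from_child[1], STDOUT_FILENO);      /* pipe write end becomes stdout */
        close(to_child[0]); close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        execlp("cat", "cat", (char *)NULL);
        _exit(127);                              /* exec failed */
    }

    close(to_child[0]);                          /* parent keeps only its own ends */
    close(from_child[1]);

    /* Small input only: a large write here could block while the child
       is itself blocked writing output that nobody is reading yet. */
    ssize_t written = write(to_child[1], input, strlen(input));
    close(to_child[1]);                          /* child sees EOF on stdin */

    size_t total = 0;
    ssize_t n;
    while (total < outsz - 1 &&
           (n = read(from_child[0], out + total, outsz - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(from_child[0]);

    waitpid(pid, NULL, 0);
    return written == -1 ? -1 : (ssize_t)total;
}
```

For a long-running child that is written to and read from repeatedly, the single write-then-read sequence would have to be replaced by threads or a select(2) loop, as described above.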
Even if you do this all correctly, you may still not see your program's output right away. This is typically due to buffering stdout. Normally, when stdout is going to a terminal, it's line-buffered—it gets flushed after every newline gets written. But when stdout is a pipe (or anything else that's not a terminal, like a file or a socket), it's fully buffered, and it only gets written to when the program has outputted a full buffer's worth of data (e.g. 4 KB).
Many programs have command line options to change their buffering behavior. For example, grep(1) has the --line-buffered flag to force it to line-buffer its output even when stdout isn't a terminal. If your external program has such an option, you should probably use it. If not, it's still possible to change the buffering behavior, but you have to use some sneaky tricks—see this question and this question for how to do that.
I have a scenario where two pipes are used for IPC between child and parent. The child process uses execvp to execute a remote program. The parent process takes care of writing data to the pipe. The remote program's stdin is duplicated onto the read end of one pipe. The parent writes data to the write end of the same pipe. The remote program has a simple getchar() in one of the functions, which is called twice in the remote program's main function.
The parent writes data in the following sequence.
writes data to the pipe. Closes all the required handles. (say wrote 1)
after some time writes data again to the pipe. Closes handle (say wrote 2)
The getchar in the remote program reads "1" in the proper fashion. But the problem comes while reading "2". The getchar is reading garbage values.
I have debugged using GDB and the program exits normally. No "signals" are raised while debugging.
I have used the fork(), dup2() and pipe() functions and need to stick to it.
I am trying to execute a program I created with popen and then collect the output. The program I execute takes 30 seconds (there are 30 sleep(1) calls) and after each second it sends a chunk of output. What puzzles me is that when I call pipe = popen("test -flag", "r") it finishes immediately and the FILE stream pipe is empty. Am I right to assume that the program will halt and wait for test to finish executing before continuing, or does it start the command and immediately continue? If it is the latter, is there any way to pause the program until the pipe has all the output before continuing?
Thanks!
The call to popen() is supposed to be relatively quick, so your program can get on with reading the output from the program. Certainly, popen() itself does not wait for the invoked program to finish. Once the popen() returns, your program should be able to read from the file stream; it will hang until there is input waiting, or until the other process closes the pipe (e.g. terminates), at which point it will get an EOF indication. You can then pclose() the stream you were reading from.
How does your test program behave when you run it directly with its output sent to a pipe? Note that standard I/O typically behaves differently when the output is a pipe (full buffering) as against a terminal (line buffering).
Read pipe until EOF is returned.
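Reading until EOF might look like the sketch below. It assumes a POSIX system; the helper name capture and the command used in the usage note are placeholders, not the asker's program. The read loop is what waits for the command: each fread blocks until output arrives or the command closes its end of the pipe.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

/* Runs `cmd` through popen(), stores up to outsz-1 bytes of its output in
   `out` (NUL-terminated), and returns the command's exit status, or -1. */
int capture(const char *cmd, char *out, size_t outsz)
{
    if (outsz == 0)
        return -1;
    FILE *p = popen(cmd, "r");           /* returns without waiting for cmd */
    if (p == NULL)
        return -1;

    size_t total = 0, n;
    char chunk[256];
    /* fread returns 0 only at EOF (or on error), i.e. once the command
       has closed its stdout, normally by exiting. */
    while ((n = fread(chunk, 1, sizeof chunk, p)) > 0) {
        size_t room = outsz - 1 - total;
        if (n > room)
            n = room;                    /* drop what does not fit */
        memcpy(out + total, chunk, n);
        total += n;
    }
    out[total] = '\0';

    int status = pclose(p);              /* waits for the command to finish */
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For instance, capture("printf 'a\nb\n'", buf, sizeof buf) should fill buf with the two lines and return 0. With a command that prints once per second, the call returns only after the command has exited.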