Pipe is not receiving all output from child process - c

I wanted to open up a pipe to a program and read output from it. My initial inclination was to use popen(), but the program takes a number of options, and rather than fighting with shell quoting/escaping, I decided to use a combination of pipe(), fork(), dup() to tie the ends of the pipe to stdin/stdout in the parent/child, and execv() to replace the child with an invocation of the program, passing all of the options it expects as an array.
The program outputs many lines of data (and flushes stdout after each line). The parent code sets stdin to non-blocking and reads from it in a loop using fgets(). The loop runs while fgets() returns non-NULL or stdin has an error condition that is EAGAIN or EWOULDBLOCK.
It receives most of the lines successfully, but towards the end it seems to drop off, with the last fgets() failing with an odd error of "No such file or directory."
Does anyone know what I might have done wrong here?

I found the problem. I stupidly was not resetting errno to zero each iteration. I guess I just assumed fgets() would take care of it or something... My stupid mistake. Now it is working fine. Always reset errno!
Thanks for the help anyway.

Not sure, but there is a handy POSIX function called posix_spawn (example here http://www.opengroup.org/onlinepubs/000095399/xrat/xsh_chap03.html#tag_03_03_01_02) that sometimes makes it easier to set up pipes. This sounds like a possible blocking issue with the pipe, though.

Make sure you open a pipe to STDERR as well. Most programs write error data there instead of STDOUT.
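A rough sketch combining the two suggestions above: posix_spawn() with file actions, and a separate pipe for the child's stderr. The function name spawn_with_pipes and its parameters are invented for this example. It assumes the pipe descriptors do not collide with 0, 1 or 2, which holds when the parent's standard descriptors are open.

```c
#include <assert.h>
#include <errno.h>
#include <spawn.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Start path with argv (no shell involved); on success store the child's
 * pid and the read ends of its stdout and stderr pipes, and return 0. */
int spawn_with_pipes(const char *path, char *const argv[],
                     pid_t *pid, int *out_fd, int *err_fd)
{
    int out[2], err[2], rc;
    posix_spawn_file_actions_t fa;

    assert(path != NULL && argv != NULL && argv[0] != NULL);
    if (pipe(out) != 0)
        return -1;
    if (pipe(err) != 0) {
        close(out[0]); close(out[1]);
        return -1;
    }

    posix_spawn_file_actions_init(&fa);
    /* In the child: stdout/stderr become the write ends of the pipes. */
    posix_spawn_file_actions_adddup2(&fa, out[1], STDOUT_FILENO);
    posix_spawn_file_actions_adddup2(&fa, err[1], STDERR_FILENO);
    /* The child must not keep the original pipe descriptors open. */
    posix_spawn_file_actions_addclose(&fa, out[0]);
    posix_spawn_file_actions_addclose(&fa, out[1]);
    posix_spawn_file_actions_addclose(&fa, err[0]);
    posix_spawn_file_actions_addclose(&fa, err[1]);

    rc = posix_spawn(pid, path, &fa, NULL, argv, environ);
    posix_spawn_file_actions_destroy(&fa);

    /* The parent only reads; keeping write ends open would hide EOF. */
    close(out[1]);
    close(err[1]);
    if (rc != 0) {
        close(out[0]); close(err[0]);
        errno = rc;                 /* posix_spawn returns the error number */
        return -1;
    }
    *out_fd = out[0];
    *err_fd = err[0];
    return 0;
}
```

The caller reads from both descriptors until EOF and then calls waitpid() on the pid. If the child can produce a lot of output on both streams, read them with select() or poll() instead of one after the other, otherwise the child may block on a full pipe.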

Related

Capturing stdout/stderr separately and simultaneously from child process results in wrong total order (libc/unix)

I'm writing a library that should execute a program in a child process, capture the output, and make the output available in a line by line (string vector) way. There is one vector for STDOUT, one for STDERR, and one for "STDCOMBINED", i.e. all output in the order it was printed by the program. The child process is connected via two pipes to a parent process. One pipe for STDOUT and one for STDERR. In the parent process I read from the read-ends of the pipes, in the child process I dup2()'ed STDOUT/STDERR to the write ends of the pipes.
My problem:
I'd like to capture STDOUT, STDERR, and "STDCOMBINED" (=both in the order they appeared). But the order in the combined vector is different to the original order.
My approach:
I iterate until both pipes show EOF and the child process exited. At each iteration I read exactly one line (or EOF) from STDOUT and exactly one line (or EOF) from STDERR. This works so far. But when I capture out the lines as they come in the parent process, the order of STDOUT and STDERR is not the same as if I execute the program in a shell and look at the output.
Why is this so and how can I fix this? Is this possible at all? I know in the child process I could redirect STDOUT and STDERR both to a single pipe but I need STDOUT and STDERR separately, and "STDCOMBINED".
PS: I'm familiar with libc/unix system calls, like dup2(), pipe(), etc. Therefore I didn't post code. My question is about the general approach and not a coding problem in a specific language. I'm doing it in Rust against the raw libc bindings.
PPS: I made a simple test program, that has a mixup of 5 stdout and 5 stderr messages. That's enough to reproduce the problem.
At each iteration I read exactly one line (or EOF) from STDOUT and exactly one line (or EOF) from STDERR.
This is the problem. This will only capture the correct order if that was exactly the order of output in the child process.
You need to capture the asynchronous nature of the beast: make your pipe endpoints nonblocking, select* on the pipes, and read whatever data is present, as soon as select returns. Then you'll capture the correct order of the output. Of course now you can't be reading "exactly one line": you'll have to read whatever data is available and no more, so that you won't block, and maintain a per-pipe buffer where you append new data, extract any lines that are present, shove the unprocessed output to the beginning, and repeat. You could also use a circular buffer to save a little bit of memcpy-ing, but that's probably not very important.
Since you're doing this in Rust, I presume there's already a good asynchronous reaction pattern that you could leverage (I'm spoiled with go, I guess, and project the hopes on the unsuspecting).
*Always prefer platform-specific higher-performance primitives like epoll on Linux, /dev/poll on Solaris, pollset &c. on AIX
Another possibility is to launch the target process with LD_PRELOAD, with a dedicated library that takes over glibc's POSIX write, detects writes to the pipes, and encapsulates such writes (and only those) in a packet by prepending a header that contains an (atomically updated) process-wide incrementing counter, as well as the size of the write. Such headers can be easily decoded on the other end of the pipe to reorder the writes with a higher chance of success.
I think it's not possible to strictly do what you want to do.
If you think about how it's done when running a command in an interactive shell, what happens is that both stdout and stderr point to the same file descriptor (the TTY), so the total ordering is correct by means of synchronization against the same file.
To illustrate, imagine what happens if the child process has 2 completely independent threads, one only writing to stderr, and the other only writing to stdout. The total ordering would depend on however the scheduler decided to schedule these threads, and if you wanted to capture that, you'd need to synchronize those threads against something.
And of course, something can write thousands of lines to stdout before writing anything to stderr.
There are 2 ways to relax your requirements into something workable:
Have the user pass a flag waiving separate stdout and stderr streams in favor of a correct stdcombined, and then redirect both to a single file descriptor. You might need to change the buffering settings (like stdbuf does) before you execute the process.
Assume that stdout and stderr are "reasonably interleaved", an assumption pointed out by @Nate Eldredge, in which case you can use @Unslander Monica's answer.
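For the first option, a minimal sketch of redirecting both streams into one pipe with fork(), dup2() and execv(). The function name spawn_combined is hypothetical. Because both descriptors refer to the same pipe, the reader sees the writes in the order the child issued them.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start path with argv, with the child's stdout AND stderr going to one pipe.
 * Returns the read end of that pipe, or -1 on failure. */
int spawn_combined(const char *path, char *const argv[], pid_t *pid)
{
    int p[2];

    assert(path != NULL && argv != NULL && pid != NULL);
    if (pipe(p) != 0)
        return -1;

    *pid = fork();
    if (*pid < 0) {
        close(p[0]); close(p[1]);
        return -1;
    }
    if (*pid == 0) {                    /* child */
        close(p[0]);
        dup2(p[1], STDOUT_FILENO);      /* both streams share one descriptor */
        dup2(p[1], STDERR_FILENO);
        close(p[1]);
        execv(path, argv);
        _exit(127);                     /* only reached if execv failed */
    }
    close(p[1]);                        /* parent keeps only the read end */
    return p[0];
}
```

This preserves the order of the write() calls, not of the printf() calls. A child that uses stdio will typically fully buffer stdout when it is a pipe while leaving stderr unbuffered, which is why the buffering settings may need changing as mentioned above.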

How to detect pclose inside client

How can a C-client started by popen and writing to stdout properly detect that the calling process has called pclose.
I am sending binary data from a small client program written in C to Matlab. To this end, Matlab is starting the process by calling popen inside an API written in C. The client is continuously writing binary data to stdout using fwrite. When Matlab is stopping, the API apparently calls pclose on the client's handle, but that does not stop the client process. I guess the fwrite will not raise an error, as the data gets buffered by the OS. So what is the appropriate way to detect the pclose inside the client?
BTW I will run into the same problem again when trying to write to some C-client from within Matlab.
Three ways:
If you are reading from the pipe, and the other end goes away, you should get end-of-file, and you can detect that, and stop.
If you are reading from the pipe, and you control the protocol, the other end can send you a "quit" command over the pipe, and you can read that, and stop.
If you are writing to the pipe, and the other end goes away, you should get a SIGPIPE signal, which by default should kill your process.
Note that #1 only works if the other end is the only process that has the pipe open. If any other process has the writing end of pipe open, you won't get EOF, and you won't know to stop. If you're calling pipe(), then exec(), it's easy for you to have the writing end of the pipe open. This is a common mistake. (Since you're calling popen(), though, it's less likely you're having this problem.)

Using STDIN and STDOUT subsequently

I'm writing a chat in C that can be used in terminal...
For receiving text messages, I have a thread that will print each message out on STDOUT.
Another thread is reading from stdin...
The problem is, if a new message is printed to stdout while typing, it will be printed in the middle of what I typed.
I researched for several hours and experimented with GNU readline to prevent this problem. I thought the "Redisplay" functions would help me here, but I could not compile my program on Mac OSX if I used certain redisplay functions (it said ld: undefined symbols) whereas other functions worked properly. I compiled this program on an Ubuntu machine and there it worked. I really have no idea why...
Nevertheless, how can I make sure that everything written to stdout appears above the text I'm currently typing?
You have basically two solutions.
The first is to use something that helps you divide your screen into different pieces, and as @Banthar said, ncurses is the standard solution.
The second is to synchronize your writes and reads. The thread that is reading from the network and writing to the console may simply postpone the messages until you have entered something from the keyboard; at that time you can flush your message buffer by writing all the messages at once. Caveat: this solution may cause your buffer to overflow, so you may either discard the oldest messages or flush the buffer when it is full.
If your requirement is to use only stdin and stdout (i.e. a dumb terminal), you will first have to configure your console input as not line buffered (line buffering is the default; use stty -icanon on Unix-like systems). Unfortunately I could not find a portable way to do that programmatically, but you will find more on that in this other question on SO: How to avoid press enter with any getchar().
Then, you will have to collect the next outgoing message character by character. So when an incoming message is ready to be displayed while the user is in the middle of typing an outgoing one, you jump to a new line, write the incoming message, (possibly jump one more line or do whatever is normally done for prompting) and rewrite the input buffer so the user sees exactly the characters he has already typed.
You will have to use a kind of mutual exclusion to avoid that the input thread makes any access to the input buffer while the output thread does all that work.

C check before writing to closed pipe

Is there an easy way to check if a pipe is closed before writing to it in C? I have a child and parent process, and the parent has a pipe to write to the child. However, if the child closes the pipe and the parent tries to write, I get a broken pipe error.
So how can I check to make sure I can write to the pipe, so I can handle it as an error if I can't? Thanks!
A simple way to check would be to do a 0 byte write(2) to the pipe and check the return. If you're catching SIGPIPE or checking for EPIPE, you get the error. But that's just the same as if you go ahead and do your real write, checking for the error return. So, just do the write and handle an error either in a signal handler (SIGPIPE) or, if the signal is ignored, by checking the error return from write.
How about just try to write and deal with the error? The same way you would for a write to a file or a database. I see no value in the idiom:
check if *this* is going to work
do *this*
You merely introduce a smaller, and harder to catch in testing, window of opportunity:
check if *this* is going to work
child thinks "Ha, fooled you, I'm off now!"
do *this*, which now fails!
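A minimal sketch of "just write and deal with the error", assuming SIGPIPE has been set to ignored (for example with signal(SIGPIPE, SIG_IGN) at startup) so that the failure is reported through errno. The helper name write_all is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Write all len bytes to fd, retrying on partial writes and EINTR.
 * Returns 0 on success, -1 on error. With SIGPIPE ignored, a closed
 * read end shows up as errno == EPIPE. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            return -1;          /* EPIPE here means the reader is gone */
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

The caller then treats a -1 with errno == EPIPE like any other I/O failure: close its end of the pipe, reap the child with waitpid(), and report the error. No separate check before the write is needed.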

Find out if pipe's read end is currently blocking

I'm trying to find out if a child process is waiting for user input (without parsing its output). Is it possible, in C on Unix, to determine if a pipe's read end currently has a read() call blocking?
The thing is, I have no control over the programs exec'd in the child processes. They print all kinds of verbose garbage which I would usually want to redirect to /dev/null. Occasionally though one will prompt the user for something. (With the prompt having no reliable format.) So my idea was:
In a loop:
Drain child's stdout, append it to a temporary buffer.
Check (no idea how) if the child is asking for user input, in which case the buffer is printed to stdout.
When the child exits, throw away the buffer.
You have these options:
if you know that the child will need certain input (such as shell that will read a command), just write to a pipe
if you assume the child won't read anything usually, but may do it sometimes, you probably need something like job control in the shell (use a terminal for communication with the child, use process groups and TIOCSPGRP ioctl on the terminal to get the child to the background; the child will get SIGTTIN when it tries to read from the terminal, and you can wait() for that). This is how bash handles things like "(sleep 10; read a;)&"
if you don't know what to write, or you have more possibilities, you will have to parse the output
That sounds as if you were trying to supervise dpkg where occasionally some post-inst script queries the admin whether it may override some config file.
Anyway, you may want to look at how strace works:
strace -f -etrace=read your.program
Of course you need to keep track of which fds are the pipes you write about, but you probably need only stdin, anyway.
I don't think that's true: For example, right before calling read() on the reader side, the pipe would have a reader that isn't actually reading.
You would typically just write to the pipe, or use select or poll. If you need a handshake mechanism you can do that out of band in various ways, or come up with an in-band protocol.
I don't know if there is a built-in way to know if a reader on the other end is blocking. Why do you need to know this?
If I recall correctly, you cannot have a pipe with no reader, which means that you have either a read(2) or a select(2) syscall pending at all times.
