How can I ensure a child process eventually writes data in C? - c

In C, I'd like to fork off a child process, and map its STDIN and STDOUT to pipes. The parent then communicates with the child by writing to or reading from the child's STDIN and STDOUT.
The MWE code below is apparently successful. The parent process receives the string "Sending some message", and I can send arbitrary messages to the parent process by writing to stdout. I can also freely read messages from the parent using, e.g., scanf.
The problem is that, once execl is called by the child, the output seems to stop coming through. I know that without the call to setvbuf to unbuffer stdout, this code will hang indefinitely, so I suppose that the call to execl re-buffers stdout. Since the child program ./a.out is itself interactive, we end up in a deadlock: the child will not write (because of the buffering) and blocks waiting for input, while the parent blocks waiting for the child to write before producing input for the child.
Is there a nice way to avoid this? In particular, is there a way to use exec that doesn't overwrite the attributes of stdin, stdout, etc.?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char* argv[]){
    int mgame_read_pipe[2];   /* child -> parent */
    int mgame_write_pipe[2];  /* parent -> child */
    pipe(mgame_read_pipe);
    pipe(mgame_write_pipe);
    pid_t is_child = fork();
    if(is_child == -1){
        perror("Error while forking.");
        exit(1);
    }
    if(is_child == 0){
        dup2(mgame_read_pipe[1], STDOUT_FILENO);
        printf("Sending some message.\n");
        dup2(mgame_write_pipe[0], STDIN_FILENO);
        setvbuf(stdin, NULL, _IONBF, 0);
        setvbuf(stdout, NULL, _IONBF, 0);
        close(mgame_read_pipe[0]);
        close(mgame_write_pipe[1]);
        execl("./a.out", "./a.out", NULL);
    }
    else{
        close(mgame_read_pipe[1]);
        close(mgame_write_pipe[0]);
        int status;
        do{
            printf("SYSTEM: Waiting for inferior process op.\n");
            char buf[BUFSIZ];
            ssize_t n = read(mgame_read_pipe[0], buf, BUFSIZ - 1);
            if(n < 0)
                break;
            buf[n] = '\0';    /* read() does not NUL-terminate */
            printf("%s", buf);
            scanf("%s", buf);
            printf("SYSTEM: Waiting for inferior process ip.\n");
            write(mgame_write_pipe[1], buf, strlen(buf));
        } while( !waitpid(is_child, &status, WNOHANG) );
    }
}
EDIT: For completeness, here's an (untested) example a.out:
#include <stdio.h>

int main(){
    printf("I'm alive!");
    int parent_msg;
    scanf("%d", &parent_msg);
    printf("I got %d\n", parent_msg);
}
}

Your buffering problems stem from the fact that the buffering is being performed by the C standard library in the program that you are exec-ing, not at the kernel / file descriptor level (as observed by @Claris). There is nothing you can do to affect the buffering in another program's own code (unless you modify that program).
This is actually a common problem encountered by anyone trying to automate interaction with a program.
One solution is to use a pseudo-tty, which makes the program think it is actually talking to an interactive terminal, which alters its buffering behaviour, amongst other things.
This article provides a good introduction. There is an example program there showing exactly how to achieve what you are trying to do.
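A minimal sketch of that approach, using the non-standard but widely available forkpty() from <pty.h> (glibc/BSD; link with -lutil, and note the header is <util.h> or <libutil.h> on the BSDs). Error handling is trimmed, ./a.out stands in for the interactive child from the question, and bear in mind the slave tty echoes input back through the master unless you turn that off with tcsetattr():
#include <pty.h>      /* forkpty(); link with -lutil on Linux */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    int master;                                  /* parent's end of the pseudo-tty */
    pid_t pid = forkpty(&master, NULL, NULL, NULL);
    if (pid == -1) {
        perror("forkpty");
        exit(1);
    }
    if (pid == 0) {
        /* Child: stdin/stdout/stderr are now the pty slave, so the exec'd
         * program believes it is talking to a terminal and line-buffers. */
        execl("./a.out", "./a.out", (char *)NULL);
        perror("execl");
        _exit(1);
    }

    /* Parent: talk to the child through the master end. */
    char buf[BUFSIZ];
    ssize_t n = read(master, buf, sizeof(buf) - 1);
    if (n > 0) {
        buf[n] = '\0';
        printf("child said: %s", buf);
    }
    write(master, "42\n", 3);                    /* send the child some input */
    return 0;
}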

The setvbuf options you are setting apply to stdio streams, not file descriptors. They only affect the streams of the process that calls setvbuf (which is why your own pre-exec printf needed it), and they do not survive execl, so they have no effect on the program you exec.
The read/write system calls are not buffered (aside from caching, which is different and which might exist in the kernel), so you don't need to worry about disabling a buffer or any other such stuff. They go directly to where they need to go.
That being said, they are blocking: a read on an empty pipe blocks at the OS level until some data arrives or until end-of-file (every write end has been closed). As soon as at least one byte is available it returns, possibly with less data than you asked for, so a short read from a pipe is normal rather than an error; with non-blocking IO enabled it returns immediately instead of waiting.
You may be able to enable non-blocking IO on a descriptor using the fcntl interface. Calls then return immediately, but this is not supported for every kind of file descriptor. Async IO (for files) is supported through the AIO interface.
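For example, a sketch of flipping a descriptor (such as the parent's mgame_read_pipe[0] from the question) into non-blocking mode with fcntl:
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Switch a descriptor (e.g. the parent's mgame_read_pipe[0]) to non-blocking mode.
 * Afterwards, read() on an empty pipe returns -1 with errno == EAGAIN instead of
 * blocking, so the caller can do other work and retry later. */
static int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}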

Related

Unexpected behavior of pipes with scanf()

It's been a while since I last programmed in C, and I'm having trouble making pipes work. (For the sake of clarity, I'm using Cygwin on Windows 7.) In particular, I need help understanding the behavior of the following example:
/* test.c */
#include <stdio.h>
#include <unistd.h>

int main() {
    char c;
    //scanf("%c", &c); // this is problematic

    int p[2];
    pipe(p);
    int out = dup(STDOUT_FILENO);

    // from now on, implicitly read from and write on pipe
    dup2(p[0], STDIN_FILENO);
    dup2(p[1], STDOUT_FILENO);
    printf("hello");
    fflush(stdout);

    // restore stdout
    dup2(out, STDOUT_FILENO);

    // should read from pipe and write on stdout
    putchar(getchar());
    putchar(getchar());
    putchar(getchar());
}
If I invoke:
echo abcde | ./test.exe
I get the following output:
hel
However, if I uncomment the scanf call, I get:
bcd
Which I can't explain. This is actually a very simplified version of a more complex program with a fork/exec structure that started behaving very badly. Despite not having cycles, it somehow began spawning infinite children in an endless loop. So, rules permitting, I'll probably need to extend the question with a more concrete use case. Many thanks.
The stream I/O functions such as scanf generally perform buffering to improve performance. Thus, if you call scanf on the standard input then it will probably read more characters than needed to satisfy the request, and the extra will be waiting, buffered, for the next read.
Swapping out the underlying file descriptor does not affect previously buffered data. When you subsequently read the stream again, you first get the data buffered the first time, until that is exhausted, and only then do you get fresh data from the new underlying file.
If you wish, you can turn off buffering of a stream via the setvbuf() function, before any I/O operations have been performed on it:
int result = setvbuf(stdin, NULL, _IONBF, 0);
if (result != 0) {
    // handle error ...
}
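Applied to the test.c above, the call would go at the top of main, before the scanf, so that scanf consumes only the single byte it needs (a sketch):
#include <stdio.h>
#include <unistd.h>

int main() {
    /* Unbuffer stdin before any I/O on it: scanf("%c") will then read exactly
     * one byte, so 'b', 'c', 'd', 'e' stay on the original stdin instead of
     * being pulled into a stdio buffer. */
    if (setvbuf(stdin, NULL, _IONBF, 0) != 0) {
        // handle error ...
    }

    char c;
    scanf("%c", &c);   // consumes only 'a'

    int p[2];
    pipe(p);
    int out = dup(STDOUT_FILENO);
    dup2(p[0], STDIN_FILENO);
    dup2(p[1], STDOUT_FILENO);
    printf("hello");
    fflush(stdout);
    dup2(out, STDOUT_FILENO);

    // now reads "hel" from the pipe, just like the commented-out case
    putchar(getchar());
    putchar(getchar());
    putchar(getchar());
}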
This is actually a very simplified version of a more complex program with a fork/exec structure that started behaving very badly. Despite not having cycles, it somehow began spawning infinite children in an endless loop.
I don't see how that behavior would be related to what you've asked here.
So, rules permitting, I'll probably need to extend the question with a more concrete use case.
That would be a separate question.

understanding pipe() function

I'm trying to understand how the pipe() function works, and I have the following example program
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>

int main(void)
{
    int fd[2], nbytes;
    pid_t childpid;
    char string[] = "Hello, world!\n";
    char readbuffer[80];

    pipe(fd);

    if((childpid = fork()) == -1)
    {
        perror("fork");
        exit(1);
    }

    if(childpid == 0)
    {
        /* Child process closes up input side of pipe */
        close(fd[0]);

        /* Send "string" through the output side of pipe */
        write(fd[1], string, (strlen(string)+1));
        exit(0);
    }
    else
    {
        /* Parent process closes up output side of pipe */
        close(fd[1]);

        /* Read in a string from the pipe */
        nbytes = read(fd[0], readbuffer, sizeof(readbuffer));
        printf("Received string: %s", readbuffer);
    }

    return(0);
}
My first question is: what benefit do we get from closing the file descriptors with close(fd[0]) and close(fd[1]) in the child and parent processes? Second, we use write in the child and read in the parent, but what if the parent process reaches read before the child reaches write and tries to read from a pipe which has nothing in it? Thanks!
Daniel Jour gave you 99% of the answer already, in a very succinct and easy to understand manner:
Closing: Because it's good practice to close what you don't need. For the second question: These are potentially blocking functions. So reading from an empty pipe will just block the reader process until something gets written into the pipe.
I'll try to elaborate.
Closing:
When a process is forked, its open files are duplicated.
Each process has a limit on how many file descriptors it's allowed to have open. As stated in the documentation, each side of the pipe is a single fd, meaning a pipe requires two file descriptors, and in your example each process is only using one.
By closing the file descriptor you don't use, you're releasing resources that are in limited supply and which you might need further on down the road.
e.g., if you were writing a server, that extra fd means you can handle one more client.
Also, although releasing resources on exit is "optional", it's good practice. Resources that weren't properly released should be handled by the OS...
...but the OS was also written by us programmers, and we do make mistakes. So it only makes sense that the one who claimed a resource and knows about it will be kind enough to release the resource.
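Besides the resource argument, closing the end you don't use also matters for correctness: a reader only sees end-of-file once every write end of the pipe has been closed. A small sketch of that effect (error handling omitted):
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fd[2];
    char buf[80];
    ssize_t n;

    pipe(fd);
    if (fork() == 0) {
        close(fd[0]);                        /* child: reading side unused */
        write(fd[1], "Hello, world!\n", 14);
        close(fd[1]);
        _exit(0);
    }
    close(fd[1]);                            /* parent: writing side unused */

    /* read() returns 0 (end-of-file) only after *every* write end is closed;
     * if the parent kept fd[1] open, this loop would block forever after the
     * child exits instead of terminating. */
    while ((n = read(fd[0], buf, sizeof(buf))) > 0)
        fwrite(buf, 1, (size_t)n, stdout);
    wait(NULL);
    return 0;
}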
Race conditions (read before write):
POSIX defines a few behaviors that make read, write and pipes a good choice for thread and process concurrency synchronization. You can read more about it in the Rationale section for write, but here's a quick rundown:
By default, pipes (and sockets) are created in what is known as "blocking mode".
This means that the application will hang until the IO operation is performed.
Also, IO operations are atomic, meaning that:
You will never be reading and writing at the same time. A read operation will wait until a write operation completes before reading from the pipe (and vice-versa)
if two threads call read at the same time, each will get a serial (not parallel) response, reading sequentially from the pipe (or socket) - this makes pipes great tools for concurrency handling.
In other words, when your application calls:
read(fd[0], readbuffer, sizeof(readbuffer));
Your application will wait for some data to be available and for the read operation to complete, which it does as soon as at least one byte can be read (it returns up to sizeof(readbuffer), i.e. 80 bytes), or when read returns 0 to signal end-of-file after every write end of the pipe has been closed.
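In practice that means checking read()'s return value instead of assuming the buffer comes back full; a small sketch of a helper that copes with partial reads and interrupted calls:
#include <errno.h>
#include <unistd.h>

/* Read up to len bytes from fd. Returns the number of bytes actually read
 * (possibly fewer than len), 0 at end-of-file (all write ends closed), or -1
 * on a real error. Retries automatically if a signal interrupts the call. */
ssize_t read_some(int fd, char *buf, size_t len)
{
    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;
        if (errno != EINTR)
            return -1;
    }
}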

Is dup2() necessary for execl

Is it necessary to replace stdin with a pipe end when using pipes?
I have an application that:
Creates a pipe,
Forks a child process, and then
execl()s a new process image within the new child process.
But I'm running into two conceptual issues.
Is it necessary to use dup() or dup2() to replace stdin? It would obviously be easier to just use the fd from the pipe. (I need a little insight about this.)
If you can just use the fd from the pipe, how do you pass an integer fd using execl() when execl takes char * arguments?
I'm having trouble figuring out exactly what remains open after execl() is performed, and how to access that information from the newly execl'd process.
It depends on the commands you're running. However, many Unix commands read from standard input and write to standard output, so if the pipes are not set up so that the write end is the output of one command and the read end is the input of the next command, nothing happens (or, more accurately, programs read from places where the input isn't coming from, or write to places which will not be read from, or hang waiting for you to type the input at the terminal, or otherwise do not work as intended).
If your pipe is on file descriptors 3 and 4, the commands you execute must know to read from 3 and write to 4. You could handle that with shell, but it is moderately grotesque overkill to do so compared with using dup2().
No; you're not obliged to use dup2(), but it is generally easier to do so. You could close standard output and then use plain dup() instead of dup2().
If you use dup2() for a pipe, don't forget to close both of the original file descriptors.
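For comparison, a sketch of the two equivalent ways of wiring a pipe's write end onto standard output; the close()-before-dup() variant works because dup() always returns the lowest unused descriptor number:
#include <unistd.h>

/* Attach the write end of pipefd[] to standard output, then drop the
 * original pipe descriptors. use_dup2 selects which variant to use. */
void attach_stdout(int pipefd[2], int use_dup2)
{
    if (use_dup2) {
        dup2(pipefd[1], STDOUT_FILENO);     /* explicit target descriptor */
    } else {
        close(STDOUT_FILENO);               /* free descriptor 1 ... */
        dup(pipefd[1]);                     /* ... so dup() lands on it */
    }
    close(pipefd[0]);                       /* don't forget these, as noted above */
    close(pipefd[1]);
}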
You are probably trying to feed data to an existing program on the system, but on the off chance that you are also writing the child process yourself: no, you don't need to use dup() and stdin.
execl() keeps all open file descriptors from the parent process open, so you could:
int fd[2];
pipe(fd);
if (fork() == 0)
{
    char tmp[20];
    close(fd[1]);
    /* Pass the read end's descriptor number as a command-line argument. */
    snprintf(tmp, sizeof(tmp), "%d", fd[0]);
    execl("client", "client", tmp, (char *)NULL);
    exit(1);
}
and in the code for client:
#include <stdlib.h>

int main(int argc, char** argv)
{
    int fd = (int)strtol(argv[1], NULL, 10);   /* descriptor number passed by the parent */
    /* Read from fd */
}

C close STDOUT running forever

I am writing some C code that involves the use of pipes. To make a child process use my pipe instead of STDOUT for output, I used the following lines:
close(STDOUT);
dup2(leftup[1], STDOUT);
However, it seems to go into some sort of infinite loop or hang on those lines. When I get rid of close, it hangs on dup2.
Curiously, the same idea works in the immediately preceding line for STDIN:
close(STDIN);
dup2(leftdown[0], STDIN);
What could be causing this behavior?
Edit: Just to be clear...
#define STDIN 0
#define STDOUT 1
Edit 2: Here is a stripped-down example:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>

#define STDIN 0
#define STDOUT 1

int main(void){
    pid_t child1 = 0;

    int leftdown[2];
    if (pipe(leftdown) != 0)
        printf("ERROR");
    int leftup[2];
    if (pipe(leftup) != 0)
        printf("ERROR");

    printf("MADE PIPES");

    child1 = fork();
    if (child1 == 0){
        close(STDOUT);
        printf("TEST 1");
        dup2(leftup[1], STDOUT);
        printf("TEST 2");
        exit(0);
    }

    return(0);
}
The "TEST 1" line is never reached. The only output is "MADE PIPES".
At a minimum, you should ensure that the dup2 function returns the new file descriptor rather than -1.
There's always a possibility that it will give you an error (for example, if the pipe() call failed previously). In addition, be absolutely certain that you're using the right indexes (0 and 1) - I've been bitten by that before and it depends on whether you're in the parent or child process.
Based on your edit, I'm not the least bit surprised that MADE PIPES is the last thing printed.
When you try to print TEST 1, you have already closed the STDOUT descriptor so that will go nowhere.
When you try to print TEST 2, you have duped the STDOUT descriptor so that will go to the parent but your parent doesn't read it.
If you change your forking code to:
child1 = fork();
if (child1 == 0){
    int count;

    close(STDOUT);
    count = printf("TEST 1\n");
    dup2(leftup[1], STDOUT);
    printf("TEST 2 (%d)\n", count);
    exit(0);
} else {
    char buff[80];

    read (leftup[0], buff, 80);
    printf ("%s\n", buff);
    sleep (2);
}
you'll see that the TEST 2 (-1) line is output by the parent because it read it via the pipe. The -1 in there is the return code from the printf you attempted in the child after you closed the STDOUT descriptor (but before you duped it), meaning that it failed.
From ISO C11 7.20.6.3 The printf function:
The printf function returns the number of characters transmitted, or a negative value if an output or encoding error occurred.
Multiple things to mention.
When you use fork, it creates an almost complete copy of the parent process. That also includes the buffer set up for the stdout standard output stream. The stdout stream holds data until the buffer is full or a flush is explicitly requested, so at the moment of the fork, "MADE PIPES" is still sitting in that buffer.
When you close the STDOUT fd and then use printf, nothing is written out anywhere: printf merely copies "TEST 1" and "TEST 2" into the stdout buffer, and that doesn't cause any error or crash (there is enough buffer space). Thus even after duplicating the pipe's write end onto STDOUT, the buffered printf output hasn't even touched the pipe.
Most important, stick to one set of APIs at a time, i.e. either the *NIX file descriptor calls or the standard C library functions. Make sure you understand the libraries well, as they often play tricks for the sake of optimization.
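A sketch of the stripped-down example with flushes at the right points: flush before the fork so the child's copy of the buffer is empty, and flush again after printing into the pipe:
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int leftup[2];
    pipe(leftup);

    printf("MADE PIPES");
    fflush(stdout);                       /* flush before fork(): the child must not
                                           * inherit "MADE PIPES" in its stdio buffer */
    if (fork() == 0) {
        dup2(leftup[1], STDOUT_FILENO);   /* re-point fd 1 before printing */
        printf("TEST 1\n");
        fflush(stdout);                   /* push it into the pipe */
        _exit(0);
    }

    char buf[80];
    ssize_t n = read(leftup[0], buf, sizeof(buf) - 1);
    if (n > 0) {
        buf[n] = '\0';
        printf("\nparent read: %s", buf);
    }
    return 0;
}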
Now, another thing to mention: make sure that you close the appropriate ends of the pipe in the appropriate process. That is, if, say, pipe-1 is used to communicate from parent to child, then close its read end in the parent and its write end in the child. Otherwise your program may hang: because of the reference counts associated with file descriptors, you may think that closing the read end in the child means the pipe's read end is closed, but as long as the parent also holds the read end open there is an extra reference to it, and the pipe will never be fully closed.
There are many other things to say about your coding style; it's better to get a hold on it early. :)
The sooner you learn it, the more time it will save you. :)
Error checking is absolutely important; use at least assert to ensure that your assumptions are correct.
If you use printf statements for logging or debugging while you are changing the terminal FDs (STDOUT / STDIN / STDERR), it's better to open a log file with the *NIX open call and write the errors / log entries to it.
Finally, the strace utility will be a great help. It lets you track the system calls executed while your code runs. It is very straightforward and simple, and you can even attach it to a running process, provided you have the right permissions.

C fork/exec with non-blocking pipe IO

This seems to be a fairly common thing to do, and I've managed to teach myself everything that I need to make it work, except that I now have a single problem, which is defying my troubleshooting.
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int nonBlockingPOpen(char *const argv[]){
    int inpipe;
    pid_t pid;

    /* open both ends of pipe nonblockingly */
    pid = fork();

    switch(pid){
    case 0: /*child*/
        sleep(1); /*child should open after parent has opened for reading*/
        /*redirect stdout to opened pipe*/
        int outpipe = open("./fifo", O_WRONLY);
        /*SHOULD BLOCK UNTIL MAIN PROCESS OPENS FOR READING*/
        dup2(outpipe, 1);
        fcntl(1, F_SETFL, fcntl(1, F_GETFL) | O_NONBLOCK);
        printf("HELLO WORLD I AM A CHILD PROCESS\n");
        /*This seems to be written to the pipe immediately, blocking or not.*/
        execvp(*argv, argv);
        /*All output from this program, which outputs "one", sleeps for 1 second,
         *outputs "two", sleeps for a second, etc., is captured only after the
         *exec'd program exits!
         */
        break;
    default: /*parent*/
        inpipe = open("./fifo", O_RDONLY | O_NONBLOCK);
        sleep(2);
        /*no need to do anything special here*/
        break;
    }

    return inpipe;
}
Why won't the child process write its stdout to the pipe each time a line is generated? Is there something I'm missing in the way execvp or dup2 work? I'm aware that my approach to all this is a bit strange, but I can't find another way to capture the output of closed-source binaries programmatically.
I would guess you only get the exec'd program's output after it exits because it does not flush after each message. If so, there is nothing you can do from the outside.
I am not quite sure how this is supposed to relate to the choice between blocking and nonblocking I/O in your question. A non-blocking write may fail completely or partially: instead of blocking the program until room is available in the pipe, the call returns immediately and says that it was not able to write everything it should have. Non-blocking I/O neither makes the buffer larger nor forces output to be flushed, and it may be badly supported by some programs.
You cannot force the binary-only program that you are exec'ing to flush. If you thought that non-blocking I/O was a solution to that problem, sorry, but I'm afraid it is quite orthogonal.
EDIT: Well, if the exec'd program only uses the buffering provided by libc (does not implement its own) and is dynamically linked, you could force it to flush by linking it against a modified libc that flushes every write. This would be a desperate measure, to try only if everything else failed.
When a process is started (via execvp() in your example), the behaviour of standard output depends on whether the output device is a terminal or not. If it is not (and a FIFO is not a terminal), then the output will be fully buffered, rather than line buffered. There is nothing you can do about that; the (Standard) C library does that.
If you really want to make it work line buffered, then you will have to provide the program with a pseudo-terminal as its standard output. That gets into interesting realms - pseudo-terminals or ptys are not all that easy to handle. For the POSIX functions, see:
grantpt() - grant access to the slave pseudo-terminal device
posix_openpt() - open a pseudo-terminal device
ptsname() - get name of the slave pseudo-terminal device
unlockpt() - unlock a pseudo-terminal master/slave pair
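A minimal sketch of how those calls fit together (error handling trimmed; the command to run is taken from the command line, and the child makes the slave side its controlling terminal and standard streams before exec'ing):
#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    if (argc < 2)
        return 1;

    int master = posix_openpt(O_RDWR | O_NOCTTY);   /* open the master side */
    grantpt(master);                                /* fix ownership/mode of the slave */
    unlockpt(master);                               /* allow the slave to be opened */

    if (fork() == 0) {
        setsid();                                   /* new session, so the pty can
                                                     * become the controlling terminal */
        int slave = open(ptsname(master), O_RDWR);
        dup2(slave, STDIN_FILENO);                  /* the child now "sees" a terminal, */
        dup2(slave, STDOUT_FILENO);                 /* so its stdout is line buffered   */
        dup2(slave, STDERR_FILENO);
        close(slave);
        close(master);
        execvp(argv[1], &argv[1]);
        _exit(127);
    }

    /* Parent: the child's output arrives line by line on the master side. */
    char buf[256];
    ssize_t n;
    while ((n = read(master, buf, sizeof(buf))) > 0)
        write(STDOUT_FILENO, buf, (size_t)n);
    return 0;
}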
Why won't the child process write its stdout to the pipe each time a line is generated?
How do you know that? You do not even try to read the output from the fifo.
N.B. by the file name I presume that you are using the fifo. Or is it a plain file?
And the minor bug in the child: after dup2(), you need to close(outpipe).
fcntl(1, F_SETFL, fcntl(1, F_GETFL) | O_NONBLOCK);
Depending on what program you exec(), you might either lose some output or cause the program to fail, since a write to stdout may now fail with EWOULDBLOCK.
IIRC, FIFOs have the same buffer size as pipes. Per POSIX the minimum is 512 bytes; commonly it is 4K or 8K.
You probably want to explain why you need that at all. Non-blocking IO has different semantics compared to blocking IO, and unless your child process expects it, you will run into various problems.
printf("HELLO WORLD I AM A CHILD PROCESS\n");
stdout is buffered; I would add fflush(stdout) after that. (I can't find documentation on whether exec() on its own flushes stdout or not.)
Is there something I'm missing in the way execvp or dup2 work? I'm aware that my approach to all this is a bit strange, but I can't find another way to capture output of closed-source binaries programatically.
I wouldn't toy with non-blocking IO - and leave it as it is in blocking mode.
And I would use pipe() instead of the fifo. Linux's man pipe has a convenient example with the fork().
Otherwise, that is a pretty normal practice.
The sleep()s do not guarantee that the parent will open the pipe first - as Dummy00001 says, you should be using a pipe() pipe, not a named pipe. You should also check for execvp() and fork() failing, and you shouldn't be setting the child side to non-blocking - that's a decision for the child process to make.
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int nonBlockingPOpen(char *const argv[])
{
    int childpipe[2];
    pid_t pid;

    pipe(childpipe);

    pid = fork();

    if (pid == 0)
    {
        /* child */

        /* redirect stdout to opened pipe */
        dup2(childpipe[1], 1);

        /* close leftover pipe file descriptors */
        close(childpipe[0]);
        close(childpipe[1]);

        execvp(*argv, argv);

        /* Only reached if execvp fails */
        perror("execvp");
        exit(1);
    }

    /* parent */

    /* Close leftover pipe file descriptor */
    close(childpipe[1]);

    /* Check for fork() failing */
    if (pid < 0)
    {
        close(childpipe[0]);
        return -1;
    }

    /* Set file descriptor non-blocking */
    fcntl(childpipe[0], F_SETFL, fcntl(childpipe[0], F_GETFL) | O_NONBLOCK);

    return childpipe[0];
}
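From the caller's side it could be used like this (a sketch; it assumes the nonBlockingPOpen() above is in scope, and ./a.out is just a stand-in for whatever program's output you want to capture). With O_NONBLOCK set, read() returns -1 with errno == EAGAIN whenever the child hasn't produced anything yet:
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

int nonBlockingPOpen(char *const argv[]);      /* defined above */

int main(void)
{
    char *args[] = { "./a.out", NULL };        /* stand-in child program */
    int fd = nonBlockingPOpen(args);
    if (fd < 0)
        return 1;

    char buf[256];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            fwrite(buf, 1, (size_t)n, stdout); /* forward whatever arrived */
        } else if (n == 0) {
            break;                             /* EOF: child closed its end */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            usleep(100 * 1000);                /* nothing yet: try again shortly */
        } else {
            perror("read");
            break;
        }
    }
    close(fd);
    return 0;
}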
