Forks and pipes implementation in Linux (C)

I have the following code taken from the “Pipes” section of Beej’s Guide to Unix IPC.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    int pfds[2];

    pipe(pfds);

    if (!fork()) {
        close(1);       /* close normal stdout */
        dup(pfds[1]);   /* make stdout same as pfds[1] */
        close(pfds[0]); /* we don't need this */
        execlp("ls", "ls", NULL);
    } else {
        close(0);       /* close normal stdin */
        dup(pfds[0]);   /* make stdin same as pfds[0] */
        close(pfds[1]); /* we don't need this */
        execlp("wc", "wc", "-l", NULL);
    }

    return 0;
}
This code lets the user see how many files are in a specific directory. How can I edit this code to implement the longer pipeline cat /etc/passwd | cut -f1 -d: | sort? I am completely stuck, so any help would be appreciated.

Feels like homework, so I'll just give you some pointers:
The longer pipeline has two pipes, so you'll need to call pipe() twice. (I'd also check pipe's return value whilst I was at it.)
There are three processes, which means two forks. Again, check fork()'s return value properly: it's tri-state: parent, child or failure, and your program should test all three cases.
If you call pipe() twice up front, think carefully about which file descriptors (i.e. which ends of pipes) are which in each process, and hence which ones to close before invoking execlp(). I'd draw a picture.
I'd prefer dup2() to dup(), since you're explicitly setting the target file descriptor, and so it makes sense to specify it in the call. Also avoids silly bugs.
dup and execlp can fail, so I'd check their return values too...

You need some pipes, depending on the length of the command list. But at most a process in the middle of the pipeline needs two pipe fd pairs; the first and last processes each need only one. Be really sure to close the pipe fds that are not needed - if you don't, the child processes might never see EOF and never finish.
And (as user3392484 stated): check all system calls for error conditions and report them to the caller. This will make life much easier.
I implemented something like this in the last few days; maybe you want to have a look at it: pipexec.c.
Kind regards - Andreas


Pipes in C (Linux) to connect 3 processes to execute a command

My task is to write a C program that executes the command "ls -l /bin/?? | grep rwxr-xr-x | sort". There are 3 child processes, each of which executes one of the commands separately and sends the result through a pipe to the next child process. I'm using a Swedish-localized version of Debian, so the error message is in Swedish, but translated it's something along the lines of: sort: failed to stat -: unknown file identifier.
Maybe it's my pipes that don't work as intended; I'm not too sure about the close() calls. I'm pretty sure the error comes from the pipes. I'd be grateful if someone could run the program and get the English error message.
#include <stdio.h>
#include <sys/types.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <assert.h>
#include <errno.h>
#include <string.h>

int main()
{
    int ret;
    int fds1[2], fds2[2], fds3[2];
    char buf[20];
    pid_t pid;

    /* initiating pipes */
    ret = pipe(fds1);
    if (ret == -1) {
        perror("could not pipe");
        exit(1);
    }
    ret = pipe(fds2);
    if (ret == -1) {
        perror("could not pipe");
        exit(1);
    }
    ret = pipe(fds3);
    if (ret == -1) {
        perror("could not pipe");
        exit(1);
    }

    pid = fork();
    if (pid == -1) {
        fprintf(stderr, "fork failed");
        exit(0);
    }
    if (pid == 0) {
        /* CHILD 1 */
        close(1);
        dup(fds1[1]);
        close(fds1[0]);
        close(fds1[1]);
        close(0);
        execlp("/bin/sh", "bin/sh", "ls-l /bin/??", (char *)NULL);
    }
    else {
        wait(0);
    }

    pid = fork();
    if (pid == -1) {
        fprintf(stderr, "fork failed");
        exit(0);
    }
    if (pid == 0) {
        close(0);
        dup(fds1[0]);
        close(fds1[0]);
        close(fds1[1]);
        close(1);
        dup(fds2[1]);
        close(fds2[0]);
        close(fds2[1]);
        execlp("/usr/share/grep/", "grep", "rwxr-xr-x", NULL);
    }
    else {
        wait(0);
    }

    close(fds1[0]);
    close(fds1[1]);

    pid = fork();
    if (pid == -1) {
        fprintf(stderr, "fork failed");
        exit(0);
    }
    if (pid == 0) {
        close(0);
        dup(fds2[0]);
        close(fds2[0]);
        close(fds2[1]);
        execlp("sort", "sort", NULL);
    }
    else {
        wait(0);
    }

    close(fds2[0]);
    close(fds2[1]);
}
Your code has several problems, but before I discuss them, let me introduce you to a flavor of one of my favorite preprocessor macros:
#define DO_OR_DIE(x, s) do { \
    if ((x) < 0) {           \
        perror(s);           \
        exit(1);             \
    }                        \
} while (0)
Using this macro where it is applicable can clarify your code by replacing all the boilerplate error checking. For example, this:
ret = pipe(fds1);
if (ret == -1) {
    perror("could not pipe");
    exit(1);
}
becomes just
DO_OR_DIE(pipe(fds1), "pipe");
That makes it a lot easier to see and focus on the key parts of the code, and it's easier to type, too. As a result, it also reduces the temptation to skip error checks, such as those for your calls to dup().
Now, as to your code. For me, it exhibits not just the one misbehavior you now describe in your question, but three:
It emits an error message "bin/sh: ls-l /bin/??: No such file or directory".
It emits the error message you describe, "sort: stat failed: -: Bad file descriptor".
It does not terminate.
The first error message pertains to multiple problems in the arguments to your first execlp() call. If you want to launch a shell and specify a command for it to run, as opposed to a file from which to read commands, then you must pass the -c option to it. Additionally, you've omitted mandatory whitespace between the ls and its arguments. It looks like you want this:
execlp("/bin/sh","sh", "-c", "ls -l /bin/??", (char *)NULL);
Setting aside the second problem for the moment, let's turn to the failure to terminate. You have several problems in this area, falling into these categories:
Holding pipe ends open where you should ensure them closed
Calling wait() at the wrong points
When you set up a pipe between two processes, you generally want to make sure that there are no open file descriptors on either end of the pipe other than one on the write end held by one process, and one on the read end held by the other process. Each end should be open exactly once, in exactly one process. Since the processes being connected invariably inherit these file descriptors from their parent, it is essential that the parent close its copies (except that the parent will want to keep one open in the event that it itself is one of the communicating processes).
The process on the read end of a pipe will not see EOF on that pipe until all open file descriptors on the write end are closed. Child processes running programs such as grep and sort that read their input to its end will hang indefinitely if the write end of the pipe is not completely closed.
That can be a particularly perverse problem when the child reading the pipe also has a copy of the write end of that pipe, unused, or if one of its siblings does.
Additionally, the whole point of a pipeline is that the processes involved run concurrently. If you wait() after starting one before starting the next, then at minimum you prevent such concurrency. Worse, however, that can also cause your program to hang, because a pipe has finite buffer capacity. If the child is writing output to a pipe, but no one is reading it, then the pipe's buffer can fill to capacity, at which point the child blocks. If the parent is waiting for the child to finish before launching the process that will drain the pipe, then you have a deadlock. Therefore, you should start all the processes in the pipeline first, then wait for them all.
Having fixed such problems in your code, I find that the program emits a different error for me:
execlp: No such file or directory
(The specifics of this message derive from the nature of my fixes.) This should be especially concerning, because if execlp() fails then it returns in the process in which it was called. In your cases, control will then fall right out of your if statement, into the code intended only for the parent to execute. For this reason, it is essential to handle errors from execlp(). At minimum, add a call to exit() or _Exit() immediately after.
But what's failing? Well, it's the grep this time. Note that you specify the command to execute as "/usr/share/grep/" -- that trailing / is erroneous, and the path itself is suspect. On my system, the correct path is /usr/bin/grep, but since we're using execlp, which resolves the executable in the path, we might as well omit the path altogether:
execlp("grep", "grep", "rwxr-xr-x", (char *) NULL);
Et voilà! After making that correction as well, your program runs for me.
Additional advice: do not use dup() when you care what file descriptor number you want the duplicate to have, such as when you're trying to dup onto one of the standard streams. Use dup2() for that, which has the additional advantage that you don't need to close the specified file descriptor first.

Beej's guide pipe example explanation

The following code is the pipe implementation given in beej's guide:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    int pfds[2];

    pipe(pfds);

    if (!fork()) {
        close(1);       /* close normal stdout */
        dup(pfds[1]);   /* make stdout same as pfds[1] */
        close(pfds[0]); /* we don't need this */
        execlp("ls", "ls", NULL);
    } else {
        close(0);       /* close normal stdin */
        dup(pfds[0]);   /* make stdin same as pfds[0] */
        close(pfds[1]); /* we don't need this */
        execlp("wc", "wc", "-l", NULL);
    }

    return 0;
}
I wanted to ask:
Is it possible that close(0) is executed before dup(pfds[1])? If yes, then in that case the program will not behave as expected.
What is the use of the following lines of code:
close(pfds[0]); /* we don't need this */
close(pfds[1]); /* we don't need this */
And what would change if these lines were not there?
Is it possible that close(0) is executed before dup(pfds[1])? If yes,
then in that case the program will not behave as expected.
Yes, it is possible to have the parent successfully complete close(0) before the child calls dup(pfds[1]). However, this is not a problem. When you fork a new process, the new process gets an entire copy of the parent's memory address space, including open file descriptors (except those marked with the O_CLOEXEC flag - see fcntl(2)). So, essentially each process has its own private copy of the file descriptors and is isolated and free to do whatever it wants with that copy.
Thus, when the parent calls close(0), it is only closing its copy of file descriptor 0 (stdin); it does not affect the child in any way, which still has a reference to stdin and can use it if needed (even though in this example it won't).
What is the use of the following lines of code:
close(pfds[0]); /* we don't need this */
close(pfds[1]); /* we don't need this */
Best practices mandate that you should close file descriptors that you don't use - this is the case for close(pfds[0]). Unused open file descriptors eat up space and resources; why keep one open if you're not going to use it?
close(pfds[1]) is a little more subtle though. Pipes report end of file only when there is no more data in the pipe buffer and there are no active writers, i.e., no live processes that have the pipe open for writing. If you do not close pfds[1] in the parent, the program will hang forever because wc(1) will never see the end of input, since there is a process (wc(1) itself) that has the pipe opened for writing and as such could (but won't) write more data.
TL;DR: close(pfds[0]) is just good practice but not mandatory; close(pfds[1]) is absolutely necessary to ensure program correctness.
Question 1:
Yes it is entirely possible that "close(0);" (in the parent) is executed before "dup(pfds[1]);" (in the child). But since this happens in different processes, the child will still have fd 0 open.
Question 2:
It is good bookkeeping practice to close the end of the pipe that a process is not going to use. That way, you can avoid bugs further down the road in more complex programs. In the above scenario, the child process should only ever read from the pipe. If you close the write end in the child, any attempt to write to it will cause an error; otherwise you might have a bug that is hard to detect.

Can a single pipe be used for 2 way communication between parent and a child?

Suppose I declare pipefdn[2] and call pipe() on it; can bidirectional communication be implemented using a single pipe, or do you need 2 pipes?
Though this can appear to work in some cases, it is not a recommended approach, especially in production code. pipe() by default doesn't provide any sync mechanism, and moreover a read() can hang forever if there is no data, or if read() is called before the write() from the other process.
The recommended way is to always use 2 pipes, pipe1[2] and pipe2[2], for two-way communication.
For more info please refer the following video description.
https://www.youtube.com/watch?v=8Q9CPWuRC6o&list=PLfqABt5AS4FkW5mOn2Tn9ZZLLDwA3kZUY&index=11
No sorry. Linux pipe() is unidirectional. See the man page, and also pipe(7) & fifo(7). Consider also AF_UNIX sockets, see unix(7).
Correct me if I am wrong, but I think you can. The problem is that you probably don't want to do that. First of all, create a simple program:
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/wait.h>

int pd[2];
int num = 2;

int main(){
    /* create a pipe */
    if(pipe(pd)==-1){
        printf("error in pipe");
        exit(3);
    }
    /* create a child process */
    pid_t t = fork();
    if(t<0){
        printf("error in fork");
        exit(1);
    }
    else if(t==0){
        //close(pd[1]); // child closes its writing end
        int r = read(pd[0], &num, sizeof(num));
        if(r<0){
            printf("error while reading");
            exit(2);
        }
        printf("i am the child and i read %d\n",num);
        // close(pd[0]);
        exit(0);
    }
    /* parent process */
    //close(pd[0]); /* parent closes its reading end */
    if(write(pd[1],&num,sizeof(num))<0){
        printf("error in writing");
        exit(4);
    }
    //close(pd[1]);
    /* parent waits for its child to terminate */
    int status;
    wait(&status);
    printf("my child ended with status: %d\n",status);
    return 0;
}
Try to play with close(): skip a call by commenting it out, or include it. You will find that, for this program to run, the only close() that is really needed is the one before the child reads. I found an answer here on Stack Overflow saying that "because the write end is open, the system waits because a potential write could occur". Personally, I tried running it without that close and discovered that the program would not terminate. The other close() calls, although good practice, don't influence the execution. (I am not sure why that happens; maybe someone more experienced can help us.)
Now let's examine what you asked. I can see some problems here. If two processes write to the same channel, you may have race conditions, since they write to the same file descriptor at the same time:
What if one process reads its own writes instead of those of the process it tries to communicate with? How will you know where in the stream you should read?
What if one process writes "over" the writes of the other?
Yes it can; I've done that before. I had a parent and child send each other different messages through the same pipe and receive them correctly. Just make sure you're always reading from the first file descriptor and writing to the second.

C close STDOUT running forever

I am writing some C code that involves the use of pipes. To make a child process use my pipe instead of STDOUT for output, I used the following lines:
close(STDOUT);
dup2(leftup[1], STDOUT);
However, it seems to go into some sort of infinite loop or hang on those lines. When I get rid of close, it hangs on dup2.
Curiously, the same idea works in the immediately preceding line for STDIN:
close(STDIN);
dup2(leftdown[0], STDIN);
What could be causing this behavior?
Edit: Just to be clear...
#define STDIN 0
#define STDOUT 1
Edit 2: Here is a stripped-down example:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>

#define STDIN 0
#define STDOUT 1

main(){
    pid_t child1 = 0;
    int leftdown[2];
    if (pipe(leftdown) != 0)
        printf("ERROR");
    int leftup[2];
    if (pipe(leftup) != 0)
        printf("ERROR");
    printf("MADE PIPES");
    child1 = fork();
    if (child1 == 0){
        close(STDOUT);
        printf("TEST 1");
        dup2(leftup[1], STDOUT);
        printf("TEST 2");
        exit(0);
    }
    return(0);
}
The "TEST 1" line is never reached. The only output is "MADE PIPES".
At a minimum, you should ensure that the dup2 function returns the new file descriptor rather than -1.
There's always a possibility that it will give you an error (for example, if the pipe() call failed previously). In addition, be absolutely certain that you're using the right indexes (0 and 1) - I've been bitten by that before and it depends on whether you're in the parent or child process.
Based on your edit, I'm not the least bit surprised that MADE PIPES is the last thing printed.
When you try to print TEST 1, you have already closed the STDOUT descriptor so that will go nowhere.
When you try to print TEST 2, you have duped the STDOUT descriptor so that will go to the parent but your parent doesn't read it.
If you change your forking code to:
child1 = fork();
if (child1 == 0){
    int count;
    close(STDOUT);
    count = printf("TEST 1\n");
    dup2(leftup[1], STDOUT);
    printf("TEST 2 (%d)\n", count);
    exit(0);
} else {
    char buff[80];
    read (leftup[0], buff, 80);
    printf ("%s\n", buff);
    sleep (2);
}
you'll see that the TEST 2 (-1) line is output by the parent because it read it via the pipe. The -1 in there is the return code from the printf you attempted in the child after you closed the STDOUT descriptor (but before you duped it), meaning that it failed.
From ISO C11 7.20.6.3 The printf function:
The printf function returns the number of characters transmitted, or a negative value if an output or encoding error occurred.
Multiple thing to mention,
When you use fork, it causes almost a complete copy of parent process. That also includes the buffer that is set up for stdout standard output stream as well. The stdout stream will hold the data till buffer is full or explicitly requested to flush the data from buffer/stream. Now because of this , now you have "MADE PIPES" sitting in buffer. When you close the STDOUT fd and use printf for writing data out to terminal, it does nothing but transfers your "TEST 1" and "TEST 2" into the stdout buffer and doesn't cause any error or crash (due to enough buffer). Thus even after duplicating pipe fd on STDOUT, due to buffered output printf hasn't even touched pipe write end. Most important, please use only one set of APIs i.e. either *NIX or standard C lib functions. Make sure you understand the libraries well, as they often play tricks for some sort of optimization.
Now, another thing to mention: make sure that you close the appropriate ends of the pipe in the appropriate process. That is, if pipe-1 is used to communicate from parent to child, make sure you close the read end in the parent and the write end in the child. Otherwise your program may hang. Because of the reference counts associated with file descriptors, you may think that closing the read end in the child means the pipe's read end is closed; but if you don't also close the read end in the parent, there is an extra reference to it, and the read end is never really closed.
There are a few more points about coding style; the sooner you pick these up, the more time they will save you.
Error checking is absolutely important; use at least assert to ensure that your assumptions are correct.
If you use printf statements to log errors or to debug while you are changing the terminal FDs (STDOUT / STDIN / STDERR), it's better to open a log file with *NIX open and write your errors/log entries to it.
Finally, the strace utility will be a great help. It lets you trace the system calls executed while your code runs. It is very straightforward and simple, and you can even attach it to a running process, provided you have the right permissions.

C fork/exec with non-blocking pipe IO

This seems to be a fairly common thing to do, and I've managed to teach myself everything that I need to make it work, except that I now have a single problem, which is defying my troubleshooting.
int nonBlockingPOpen(char *const argv[]){
    int inpipe;
    pid_t pid;
    /* open both ends of pipe nonblockingly */
    pid = fork();
    switch(pid){
        case 0: /* child */
            sleep(1); /* child should open after parent has opened for reading */
            /* redirect stdout to opened pipe */
            int outpipe = open("./fifo", O_WRONLY);
            /* SHOULD BLOCK UNTIL MAIN PROCESS OPENS FOR READING */
            dup2(outpipe, 1);
            fcntl(1, F_SETFL, fcntl(1, F_GETFL) | O_NONBLOCK);
            printf("HELLO WORLD I AM A CHILD PROCESS\n");
            /* This seems to be written to the pipe immediately, blocking or not. */
            execvp(*argv, argv);
            /* All output from this program, which outputs "one", sleeps for 1 second,
             * outputs "two", sleeps for a second, etc, is captured only after the
             * exec'd program exits!
             */
            break;
        default: /* parent */
            inpipe = open("./fifo", O_RDONLY | O_NONBLOCK);
            sleep(2);
            /* no need to do anything special here */
            break;
    }
    return inpipe;
}
Why won't the child process write its stdout to the pipe each time a line is generated? Is there something I'm missing in the way execvp or dup2 work? I'm aware that my approach to all this is a bit strange, but I can't find another way to capture the output of closed-source binaries programmatically.
I would guess you only get the exec'd program's output after it exits because it does not flush after each message. If so, there is nothing you can do from the outside.
I am not quite sure how this is supposed to relate to the choice between blocking and nonblocking I/O in your question. A non-blocking write may fail completely or partially: instead of blocking the program until room is available in the pipe, the call returns immediately and says that it was not able to write everything it should have. Non-blocking I/O neither makes the buffer larger nor forces output to be flushed, and it may be badly supported by some programs.
You cannot force the binary-only program that you are exec'ing to flush. If you thought that non-blocking I/O was a solution to that problem, sorry, but I'm afraid it is quite orthogonal.
EDIT: Well, if the exec'd program only uses the buffering provided by libc (does not implement its own) and is dynamically linked, you could force it to flush by linking it against a modified libc that flushes every write. This would be a desperate measure, to try only if everything else fails.
When a process is started (via execvp() in your example), the behaviour of standard output depends on whether the output device is a terminal or not. If it is not (and a FIFO is not a terminal), then the output will be fully buffered, rather than line buffered. There is nothing you can do about that; the (Standard) C library does that.
If you really want to make it work line buffered, then you will have to provide the program with a pseudo-terminal as its standard output. That gets into interesting realms - pseudo-terminals or ptys are not all that easy to handle. For the POSIX functions, see:
grantpt() - grant access to the slave pseudo-terminal device
posix_openpt() - open a pseudo-terminal device
ptsname() - get name of the slave pseudo-terminal device
unlockpt() - unlock a pseudo-terminal master/slave pair
Why won't the child process write its stdout to the pipe each time a line is generated?
How do you know that? You do not even try to read the output from the fifo.
N.B. by the file name I presume that you are using the fifo. Or is it a plain file?
And the minor bug in the child: after dup2(), you need to close(outpipe).
fcntl(1, F_SETFL, fcntl(1, F_GETFL) | O_NONBLOCK);
Depending on what program you exec(), you might either lose some output or cause the program to fail since write to stdout now might fail with EWOULDBLOCK.
IIRC, FIFOs have the same buffer size as pipes. Per POSIX the minimum is 512 bytes; commonly it is 4K or 8K.
You probably want to explain why you need that at all. Non-blocking IO has different semantics compared to blocking IO, and unless your child process expects it, you will run into various problems.
printf("HELLO WORLD I AM A CHILD PROCESS\n");
stdout is buffered; I would add fflush(stdout) after that. (I can't find documentation on whether exec() on its own flushes stdout or not.)
Is there something I'm missing in the way execvp or dup2 work? I'm aware that my approach to all this is a bit strange, but I can't find another way to capture output of closed-source binaries programatically.
I wouldn't toy with non-blocking IO - and leave it as it is in blocking mode.
And I would use pipe() instead of the fifo. Linux's man pipe has a convenient example with the fork().
Otherwise, that is a pretty normal practice.
The sleep()s do not guarantee that the parent will open the pipe first - as Dummy00001 says, you should be using a pipe() pipe, not a named pipe. You should also check for execvp() and fork() failing, and you shouldn't be setting the child side to non-blocking - that's a decision for the child process to make.
int nonBlockingPOpen(char *const argv[])
{
    int childpipe[2];
    pid_t pid;

    pipe(childpipe);

    pid = fork();
    if (pid == 0)
    {
        /* child */
        /* redirect stdout to opened pipe */
        dup2(childpipe[1], 1);

        /* close leftover pipe file descriptors */
        close(childpipe[0]);
        close(childpipe[1]);

        execvp(*argv, argv);

        /* Only reached if execvp fails */
        perror("execvp");
        exit(1);
    }

    /* parent */
    /* Close leftover pipe file descriptor */
    close(childpipe[1]);

    /* Check for fork() failing */
    if (pid < 0)
    {
        close(childpipe[0]);
        return -1;
    }

    /* Set file descriptor non-blocking */
    fcntl(childpipe[0], F_SETFL, fcntl(childpipe[0], F_GETFL) | O_NONBLOCK);

    return childpipe[0];
}
