I'm trying to do a little C program that realize a pipeline of two bash commands : echo $arithmeticOperation | bc
$arithmeticOperation is a string taken as input.
The program works fine executing first command, but when i run the second one, i get the right output but the child process executing bc remains stuck preventing the child from ending.
So in this line father process is blocked :
waitpid(pid2,NULL,0);
Where do you think the problem may be ?
Sorry if i asked the question incorrectly, it's my first one. Thanks.
#define SYSCALL(r,c,e) if((r=c)==-1) { perror(e);exit(EXIT_FAILURE);}
int main(){
char buf[128];
int pfd[2],err;
pid_t pid1,pid2;
SYSCALL(err,pipe(pfd),"pipe");
switch (pid1=fork()) {
case -1: { perror("fork"); exit(EXIT_FAILURE);}
case 0 : {
scanf("%s",buf);
SYSCALL(err,dup2(pfd[1],1),"dup");
close(pfd[1]);
close(pfd[0]);
execl("/bin/echo","echo",buf,(char *)NULL);
return 1;
}
}
switch (pid2=fork() ){
case -1 : { perror("fork"); exit(EXIT_FAILURE);}
case 0 : {
SYSCALL(err,dup2(pfd[0],0),"dup");
close(pfd[1]);
close(pfd[0]);
// execl("/usr/bin/bc","bc",(char *)NULL);
execlp("bc","bc",(char *)NULL);
return 1;
}
}
printf("waiting . . . \n");
waitpid(pid1,NULL,0);
printf("wait\n");
waitpid(pid2,NULL,0);
close(pfd[1]);
close(pfd[0]);
return 0;
}
So if i digit "1+1" as a input string i get the right output but then the process executing bc never exit
As I noted in a comment, your parent process must close the file descriptors for the pipe before waiting for bc (and you've agreed that this fixes the problem).
This arises because bc has the pipe open for reading, and the parent has the pipe open for writing, and the kernel thinks that the parent could therefore send data to bc. It won't, but it could.
You have to be very careful when managing pipes. You carefully avoided the usual problem of not closing enough file descriptors in the children.
Rule of thumb: If you
dup2()
one end of a pipe to standard input or standard output, close both of the
original file descriptors returned by
pipe()
as soon as possible.
In particular, you should close them before using any of the
exec*()
family of functions.
The rule also applies if you duplicate the descriptors with either
dup()
or
fcntl()
with F_DUPFD
I need to extend that to cover parent processes too.
If the parent process is not going to communicate with any of its children via the pipe, it must ensure that it closes both ends of the pipe so that its children can receive EOF indications on read (or get SIGPIPE signals or write errors on write), rather than blocking indefinitely. The parent should normally close at least one end of the pipe — it would be extremely unusual for a program to read and write on both ends of a single pipe.
Related
I've been working on this school assignment forever now, and I'm super close to finishing.
The assignment is to create a bash shell in C, which sounds basic enough, but it has to support piping, IO redirect, and flags within the piped commands. I have it all working except for one thing; the | piping child isn't getting any of the data written to the pipe by the user command process child. If I were to remove the child fork for pipechild, and have everything from if(pipe_cmd[0] != '\0') run as the parent, it would work just fine (minus ending the program because of execlp). If I were to use printf() inside the pipe section, the output would be in the right file or terminal, which just leaves the input from the user command process child not getting to where it needs to be as a culprit.
Does anyone see an issue on how I'm using the pipe? It all felt 100% normal to me, given the definition of a pipe.
int a[2];
pipe(a);
//assume file_name is something like file.txt
strcat(file_name, "file.txt");
strcat(pipe_cmd, "wc");
if(!fork())
{
if(pipe_cmd[0] != '\0') // if there's a pipe
{
close(1); //close normal stdout
dup(a[1]); // making stdout same as a[1]
close(a[0]); // closing other end of pipe
execlp("ls","ls",NULL);
}
else if(file_name[0] != '\0') // if just a bare command with a file redirect
{
int rootcmd_file = open(file_name, O_APPEND|O_WRONLY|O_CREAT, 0644);
dup2(rootcmd_file, STDOUT_FILENO);
execlp("ls","ls",NULL); // writes ls to the filename
}
// if no pipe or file name write...
else if(rootcmd_flags[0] != '\0') execlp("ls","ls",NULL)
else execlp("ls","ls",NULL);
} else wait(0);
if(pipe_cmd[0] != '\0') // parent goes here, if pipe.
{
pipechild = fork();
if(pipechild != 0) // *PROBLEM ARISES HERE- IF THIS IS FORKED, IT WILL HAVE NO INFO TAKEN IN.
{
close(0); // closing normal stdin
dup(a[0]); // making our input come from the child above
close(a[1]); // close other end of pipe
if(file_name[0] != '\0') // if a filename does exist, we must reroute the output to the pipe
{
close(1); // close normal stdout
int fileredir_pipe = open(file_name, O_APPEND|O_WRONLY|O_CREAT, 0644);
dup2(fileredir_pipe, STDOUT_FILENO); //redirects STDOUT to file
execlp("wc","wc",NULL); // this outputs nothing
}
else
{
// else there is no file.
// executing the pipe in stdout using execlp.
execlp("wc","wc",NULL); // this outputs nothing
}
}
else wait(0);
}
Thanks in advance. I apologize for some of the code being withheld. This is still an active assignment and I don't want any cases of academic dishonesty. This post was risky enough.
} else wait(0);
The shown code forks the first child process and then waits for it to terminate, at this point.
The first child process gets set up with a pipe on its standard output. The pipe will be connected to the second child process's standard input. The fatal flaw in this scheme is that the second child process isn't even started yet, and won't get started until the first process terminates.
Pipes have limited internal buffering. If the first process generates very little output chances are that its output will fit inside the tiny pipe buffer, it'll write its output and then quietly terminate, none the wiser.
But if the pipe buffer becomes full, the process will block and wait until something reads from the pipe and clears it. It will wait as long as it takes for that to happen. And wait, and wait, and wait. And since the second child process hasn't been started yet, and the parent process is waiting for the first process to terminate it will wait, in vain, forever.
This overall logic is fatally flawed for this reason. The correct logic is to completely fork and execute all child processes, close the pipe descriptors in the parent (this is also important), and then wait for all child processes to terminate. wait must be the very last thing that happens here, otherwise things will break in various amazing and mysterious ways.
I am confused as to how to properly use close to close pipes in C. I am fairly new to C so I apologize if this is too elementary but I cannot find any explanations elsewhere.
#include <stdio.h>
int main()
{
int fd[2];
pipe(fd);
if(fork() == 0) {
close(0);
dup(fd[0]);
close(fd[0]);
close(fd[1]);
} else {
close(fd[0]);
write(fd[1], "hi", 2);
close(fd[1]);
}
wait((int *) 0);
exit(0);
}
My first question is: In the above code, the child process will close the write side of fd. If we first reach close(fd[1]), then the parent process reach write(fd[1], "hi", 2), wouldn't fd[1] already been closed?
int main()
{
char *receive;
int[] fd;
pipe(fd);
if(fork() == 0) {
while(read(fd[0], receive, 2) != 0){
printf("got u!\n");
}
} else {
for(int i = 0; i < 2; i++){
write(fd[1], 'hi', 2);
}
close(fd[1]);
}
wait((int *) 0);
exit(0);
}
The second question is: In the above code, would it be possible for us to reach close(fd[1]) in the parent process before the child process finish receiving all the contents? If yes, then what is the correct way to communicate between parent and child. My understanding here is that if we do not close fd[1] in the parent, then read will keep being blocked, and the program won't exit either.
First of all note that, after fork(), the file descriptors fd would also get copied over to the child process. So basically, a pipe acts like a file with each process having its own references to the read and write end of the pipe. Essentially there are 2 read and 2 write file descriptors, one for each process.
My first question is: In the above code, the child process will close
the write side of fd. If we first reach close(fd[1]), then the parent
process reach write(fd[1], "hi", 2), wouldn't fd[1] already been
closed?
Answer: No. The fd[1] in parent process is the parent's write end. The child has forsaken its right to write on the pipe by closing its fd[1], which does not stop the parent from writing into it.
Before answering the second question, I fixed your code to actually run it and produce some results.
int main()
{
char receive[10];
int fd[2];
pipe(fd);
if(fork() == 0) {
close(fd[1]); <-- Close UNUSED write end
while(read(fd[0], receive, 2) != 0){
printf("got u!\n");
receive[2] = '\0';
printf("%s\n", receive);
}
close(fd[0]); <-- Close read end after reading
} else {
close(fd[0]); <-- Close UNUSED read end
for(int i = 0; i < 2; i++){
write(fd[1], "hi", 2);
}
close(fd[1]); <-- Close write end after writing
wait((int *) 0);
}
exit(0);
}
Result:
got u!
hi
got u!
hi
Note: We (seemingly) lost one hi because we are reading it into same array receive which essentially overrides the first hi. You can use 2D char arrays to retain both the messages.
The second question is: In the above code, would it be possible for us
to reach close(fd[1]) in the parent process before the child process
finish receiving all the contents?
Answer: Yes. Writing to a pipe() is non-blocking (unless otherwise specified) until the pipe buffer is full.
If yes, then what is the correct
way to communicate between parent and child. My understanding here is
that if we do not close fd[1] in the parent, then read will keep being
blocked, and the program won't exit either.
If we close fd[1] in parent, it will signal that parent has closed its write end. However, if the child did not close its fd[1] earlier, it will block on read() as the pipe will not send EOF until all the write ends are closed. So the child will be left expecting itself to write to the pipe, while reading from it simultaneously!
Now what happens if the parent does not close its unused read end? If the file had only one read descriptor (say the one with the child), then once the child closes it, the parent will receive some signal or error while trying to write further to the pipe as there are no readers.
However in this situation, parent also has a read descriptor open and it will be able to write to the buffer until it gets filled, which may cause problems to the next write call, if any.
This probably won't make much sense now, but if you write a program where you need to pass values through pipe again and again, then not closing unused ends will fetch you frustrating bugs often.
what is the correct way to communicate between parent and child[?]
The parent creates the pipe before forking. After the the fork, parent and child each close the pipe end they are not using (pipes should be considered unidirectional; create two if you want bidirectional communication). The processes each have their own copy of each pipe-end file descriptor, so these closures do not affect the other process's ability to use the pipe. Each process then uses the end it holds open appropriately for its directionality -- writing to the write end or reading from the read end.
When the writer finishes writing everything it intends ever to write to the pipe, it closes its end. This is important, and sometimes essential, because the reader will not perceive end-of-file on the read end of the pipe as long as any process has the write end open. This is also one reason why it is important for each process to close the end it is not using, because if the reader also has the write end open then it can block indefinitely trying to read from the pipe, regardless of what any other process does.
Of course, the reader should also close the read end when it is done with it (or terminate, letting the system handle that). Failing to do so constitutes excess resource consumption, but whether that is a serious problem depends on the circumstances.
I'm trying to implement a Linux pipe chain in C. For example:
grep file | ls | wc
So, there is a code that splits the arguments into tokens with the pipe as the separator, and sends each part to the following function with an integer specifying whether it precedes a pipe or not:
int control_flow(char** args, int precedes){
int stdin_copy = dup(0);
int stdout_copy = dup(1);
// if the command and its args precedes a pipe
if (precedes){
int fd[2];
if (pipe(fd) == -1){
fprintf(stderr, "pipe failed\n");
}
if (dup2(fd[1], 1)!=1)
perror("dup2 error 1 to p_in\n"); // 1 points to pipe's input
status = turtle_execute(args); // executes the argument list, output should go into the pipe
// Code stops running here
if (dup2(fd[0], 0)!=0)
perror("dup2 error 0 to p_out\n"); // 0 points to pipe's output, any process that reads next will read from the pipe
if (dup2(stdout_copy, 1)!=1)
perror("dup2 error 1 to stdout_copy\n"); // 1 points back to stdout
}
// if the command does not precede a pipe
else{
status = turtle_execute(args); // input to this is coming from pipe
if (dup2(stdin_copy, 0)!=0) // 0 points back to stdin
perror("dup2 error 1 to stdin_copy");
}
return 0;
}
My code stops running after the first command executes. I suspect it is necessary to fork a process before using this pipe, why is that? If so, how do I do that in my code without changing what I intend to do?
Edit:
This is roughly what turtle_execute does:
turtle_execute(args){
if (args[0] is cd or ls or pwd or echo)
// Implement by calling necessary syscalls
else
// Do fork and exec the process
So wherever I have used exec, I have first used fork, so process getting replaced shouldn't be a problem.
The exec system call replaces the current process with the program you are executing. So your process naturally stops working after the turtle_execute, since it was replaced with the new process.
To execute a new process you normally fork to create a copy of the current process and then execute in the copy.
When you are in the shell, normally each command you type is forked and executed. Try typing exec followed by a command into a shell and you will find that the shell terminates once that command has finished executing, since it does not fork in that case.
Edit
I suggest you have a look at the example on the pipe(2) man page (http://man7.org/linux/man-pages/man2/pipe.2.html#EXAMPLE). It shows the usual way of using a pipe:
Calling pipe to get the create the pipe
Calling fork to fork the process
Depending on whether it is child or parent close one end of the pipe and use the other
I think your problem might be that you make the writing end of your pipe the stdout before forking, causing both the parent and the child to have an open writing end. That could prevent an EOF to be sent since one writing end is still open.
I can only guess what happens in most of turtle_execute, but if you fork, exec on one process, and wait for it on the other, without consuming data from the pipe, it might fill the pipe and to the point where writing is blocked. You should always consume data from the pipe while you write to it. It is a pipe after all and not a water tank. For more information have a look at the pipe(7) man page under the 'Pipe capacity' section.
I want to do simple thing: my_process | proc2 | proc3, but programatically - without using shell, that can do this pretty easy. Is this possible? I cannot find anything :(
EDIT:
Well, without code, nobody will know, what problem I'm trying to resolve. Actually, no output is going out (I'm using printfs)
int pip1[2];
pipe(pip1);
dup2(pip1[1], STDOUT_FILENO);
int fres = fork();
if (fres == 0) {
close(pip1[1]);
dup2(pip1[0], STDIN_FILENO);
execlp("wc", "wc", (char*)0);
}
else {
close(pip1[0]);
}
Please learn about file descriptors and the pipe system call. Also, check read and write.
Your 'one child' code has some major problems, most noticeably that you configure the wc command to write to the pipe, not to your original standard output. It also doesn't close enough file descriptors (a common problem with pipes), and isn't really careful enough if the fork() fails.
You have:
int pip1[2];
pipe(pip1);
dup2(pip1[1], STDOUT_FILENO); // The process will write to the pipe
int fres = fork(); // Both the parent and the child will…
// Should handle fork failure
if (fres == 0) {
close(pip1[1]);
dup2(pip1[0], STDIN_FILENO); // Should close pip1[0] too
execlp("wc", "wc", (char*)0);
}
else { // Should duplicate pipe to stdout here
close(pip1[0]); // Should close pip1[1] too
}
You need:
fflush(stdout); // Print any pending output before forking
int pip1[2];
pipe(pip1);
int fres = fork();
if (fres < 0)
{
/* Failed to create child */
/* Report problem */
/* Probably close both ends of the pipe */
close(pip1[0]);
close(pip1[1]);
}
else if (fres == 0)
{
dup2(pip1[0], STDIN_FILENO);
close(pip1[0]);
close(pip1[1]);
execlp("wc", "wc", (char*)0);
}
else
{
dup2(pip1[1], STDOUT_FILENO);
close(pip1[0]);
close(pip1[1]);
}
Note that the amended code follows the:
Rule of thumb: If you use dup2() to duplicate one end of a pipe to standard input or standard output, you should close both ends of the original pipe.
This also applies if you use dup() or fcntl() with F_DUPFD.
The corollary is that if you don't duplicate one end of the pipe to a standard I/O channel, you typically don't close both ends of the pipe (though you usually still close one end) until you're finished communicating.
You might need to think about saving your original standard output before running the pipeline if you ever want to reinstate things.
As Alex answered, you'll need syscalls like pipe(2), dup2(2), perhaps poll(2) and some other syscalls(2) etc.
Read Advanced Linux Programming, it explains that quite well...
Also, play with strace(1) and study the source code of some simple free software shell.
See also popen(3) -which is not enough in your case-
Recall that stdio(3) streams are buffered. You probably need to fflush(3) at appropriate places (e.g. before fork(2))
Given the following code:
int main(int argc, char *argv[])
{
int pipefd[2];
pid_t cpid;
char buf;
if (argc != 2) {
fprintf(stderr, "Usage: %s \n", argv[0]);
exit(EXIT_FAILURE);
}
if (pipe(pipefd) == -1) {
perror("pipe");
exit(EXIT_FAILURE);
}
cpid = fork();
if (cpid == -1) {
perror("fork");
exit(EXIT_FAILURE);
}
if (cpid == 0) { /* Child reads from pipe */
close(pipefd[1]); /* Close unused write end */
while (read(pipefd[0], &buf, 1) > 0)
write(STDOUT_FILENO, &buf, 1);
write(STDOUT_FILENO, "\n", 1);
close(pipefd[0]);
_exit(EXIT_SUCCESS);
} else { /* Parent writes argv[1] to pipe */
close(pipefd[0]); /* Close unused read end */
write(pipefd[1], argv[1], strlen(argv[1]));
close(pipefd[1]); /* Reader will see EOF */
wait(NULL); /* Wait for child */
exit(EXIT_SUCCESS);
}
return 0;
}
Whenever the child process wants to read from the pipe, it must first close the pipe's side from writing. When I remove that line close(pipefd[1]); from the child process's if,
I'm basically saying that "okay, the child can read from the pipe, but I'm allowing the parent to write to the pipe at the same time"?
If so, what would happen when the pipe is open for both reading & writing? No mutual exclusion?
Whenever the child process wants to read from the pipe, it must first close the pipe's side from writing.
If the process — parent or child — is not going to use the write end of a pipe, it should close that file descriptor. Similarly for the read end of a pipe. The system will assume that a write could occur while any process has the write end open, even if the only such process is the one that is currently trying to read from the pipe, and the system will not report EOF, therefore. Further, if you overfill a pipe and there is still a process with the read end open (even if that process is the one trying to write), then the write will hang, waiting for the reader to make space for the write to complete.
When I remove that line close(pipefd[1]); from the child's process IF, I'm basically saying that "okay, the child can read from the pipe, but I'm allowing the parent to write to the pipe at the same time"?
No; you're saying that the child can write to the pipe as well as the parent. Any process with the write file descriptor for the pipe can write to the pipe.
If so, what would happen when the pipe is open for both reading and writing — no mutual exclusion?
There isn't any mutual exclusion ever. Any process with the pipe write descriptor open can write to the pipe at any time; the kernel ensures that two concurrent write operations are in fact serialized. Any process with the pipe read descriptor open can read from the pipe at any time; the kernel ensures that two concurrent read operations get different data bytes.
You make sure a pipe is used unidirectionally by ensuring that only one process has it open for writing and only one process has it open for reading. However, that is a programming decision. You could have N processes with the write end open and M processes with the read end open (and, perish the thought, there could be processes in common between the set of N and set of M processes), and they'd all be able to work surprisingly sanely. But you'd not readily be able to predict where a packet of data would be read after it was written.
fork() duplicates the file handles, so you will have two handles for each end of the pipe.
Now, consider this. If the parent doesn't close the unused end of the pipe, there will still be two handles for it. If the child dies, the handle on the child side goes away, but there's still the open handle held by the parent -- thus, there will never be a "broken pipe" or "EOF" arriving because the pipe is still perfectly valid. There's just nobody putting data into it anymore.
Same for the other direction, of course.
Yes, the parent/child could still use the handle to write into their own pipe; I don't remember a use-case for this, though, and it still gives you synchronization problems.
When the pipe is created it is having two ends the read end and write end. These are entries in the User File descriptor table.
Similarly there will be two entries in the File table with 1 as reference count for both the read end and the write end.
Now when you fork, a child is created that is the file descriptors are duplicated and thus the reference count of both the ends in the file table becomes 2.
Now "When I remove that line close(pipefd[1])" -> In this case even if the parent has completed writing, your while loop below this line will block for ever for the read to return 0(ie EOF). This happens since even if the parent has completed writing and closed the write end of the pipe, the reference count of the write end in the File table is still 1 (Initially it was 2) and so the read function still is waiting for some data to arrive which will never happen.
Now if you have not written "close(pipefd[0]);" in the parent, this current code may not show any problem, since you are writing once in the parent.
But if you write more than once then ideally you would have wanted to get an error (if the child is no longer reading),but since the read end in the parent is not closed, you will not be getting the error (Even if the child is no more there to read).
So the problem of not closing the unused ends become evident when we are continuously reading/writing data. This may not be evident if we are just reading/writing data once.
Like if instead of the read loop in the child, you are using only once the line below, where you are getting all the data in one go, and not caring to check for EOF, your program will work even if you are not writing "close(pipefd[1]);" in the child.
read(pipefd[0], buf, sizeof(buf));//buf is a character array sufficiently large
man page for pipe() for SunOS :-
Read calls on an empty pipe (no buffered data) with only one
end (all write file descriptors closed) return an EOF (end
of file).
A SIGPIPE signal is generated if a write on a pipe with only
one end is attempted.