Implement a pipe in C: Is it necessary to fork? - c

I'm trying to implement a Linux pipe chain in C. For example:
grep file | ls | wc
So, I have code that splits the arguments into tokens with the pipe as the separator, and sends each part to the following function with an integer specifying whether it precedes a pipe or not:
int control_flow(char** args, int precedes){
int stdin_copy = dup(0);
int stdout_copy = dup(1);
// if the command and its args precedes a pipe
if (precedes){
int fd[2];
if (pipe(fd) == -1){
fprintf(stderr, "pipe failed\n");
}
if (dup2(fd[1], 1)!=1)
perror("dup2 error 1 to p_in\n"); // 1 points to pipe's input
status = turtle_execute(args); // executes the argument list, output should go into the pipe
// Code stops running here
if (dup2(fd[0], 0)!=0)
perror("dup2 error 0 to p_out\n"); // 0 points to pipe's output, any process that reads next will read from the pipe
if (dup2(stdout_copy, 1)!=1)
perror("dup2 error 1 to stdout_copy\n"); // 1 points back to stdout
}
// if the command does not precede a pipe
else{
status = turtle_execute(args); // input to this is coming from pipe
if (dup2(stdin_copy, 0)!=0) // 0 points back to stdin
perror("dup2 error 1 to stdin_copy");
}
return 0;
}
My code stops running after the first command executes. I suspect it is necessary to fork a process before using this pipe, why is that? If so, how do I do that in my code without changing what I intend to do?
Edit:
This is roughly what turtle_execute does:
turtle_execute(args){
if (args[0] is cd or ls or pwd or echo)
// Implement by calling necessary syscalls
else
// Do fork and exec the process
So wherever I have used exec, I have first used fork, so process getting replaced shouldn't be a problem.

The exec system call replaces the current process with the program you are executing. So your process naturally stops working after the turtle_execute, since it was replaced with the new process.
To execute a new process you normally fork to create a copy of the current process and then execute in the copy.
When you are in the shell, normally each command you type is forked and executed. Try typing exec followed by a command into a shell and you will find that the shell terminates once that command has finished executing, since it does not fork in that case.
Edit
I suggest you have a look at the example on the pipe(2) man page (http://man7.org/linux/man-pages/man2/pipe.2.html#EXAMPLE). It shows the usual way of using a pipe:
Calling pipe to create the pipe
Calling fork to fork the process
Depending on whether it is child or parent close one end of the pipe and use the other
I think your problem might be that you make the writing end of your pipe the stdout before forking, causing both the parent and the child to have an open writing end. That could prevent an EOF from ever being delivered to the reader, since one writing end is still open.
I can only guess what happens in most of turtle_execute, but if you fork, exec in one process, and wait for it in the other without consuming data from the pipe, the pipe might fill up to the point where writing blocks. You should always consume data from the pipe while you write to it. It is a pipe after all and not a water tank. For more information have a look at the pipe(7) man page under the 'Pipe capacity' section.

Related

How to use stderr with execve [duplicate]

I'm writing a C program where I fork(), exec(), and wait(). I'd like to take the output of the program I exec'ed to write it to file or buffer.
For example, if I exec ls I want to write file1 file2 etc to buffer/file. I don't think there is a way to read stdout, so does that mean I have to use a pipe? Is there a general procedure here that I haven't been able to find?
For sending the output to another file (I'm leaving out error checking to focus on the important details):
if (fork() == 0)
{
// child
int fd = open(file, O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
dup2(fd, 1); // make stdout go to file
dup2(fd, 2); // make stderr go to file - you may choose to not do this
// or perhaps send stderr to another file
close(fd); // fd no longer needed - the dup'ed handles are sufficient
exec(...);
}
For sending the output to a pipe so you can then read the output into a buffer:
int pipefd[2];
pipe(pipefd);
if (fork() == 0)
{
close(pipefd[0]); // close reading end in the child
dup2(pipefd[1], 1); // send stdout to the pipe
dup2(pipefd[1], 2); // send stderr to the pipe
close(pipefd[1]); // this descriptor is no longer needed
exec(...);
}
else
{
// parent
char buffer[1024];
close(pipefd[1]); // close the write end of the pipe in the parent
while (read(pipefd[0], buffer, sizeof(buffer)) > 0) // stop on EOF (0) or on error (-1)
{
}
}
You need to decide exactly what you want to do - and preferably explain it a bit more clearly.
Option 1: File
If you know which file you want the output of the executed command to go to, then:
Ensure that the parent and child agree on the name (parent decides name before forking).
Parent forks - you have two processes.
Child reorganizes things so that file descriptor 1 (standard output) goes to the file.
Usually, you can leave standard error alone; you might redirect standard input from /dev/null.
Child then execs relevant command; said command runs and any standard output goes to the file (this is the basic shell I/O redirection).
Executed process then terminates.
Meanwhile, the parent process can adopt one of two main strategies:
Open the file for reading, and keep reading until it reaches an EOF. It then needs to double check whether the child died (so there won't be any more data to read), or hang around waiting for more input from the child.
Wait for the child to die and then open the file for reading.
The advantage of the first is that the parent can do some of its work while the child is also running; the advantage of the second is that you don't have to diddle with the I/O system (repeatedly reading past EOF).
Option 2: Pipe
If you want the parent to read the output from the child, arrange for the child to pipe its output back to the parent.
Use popen() to do this the easy way. It will run the process and send the output to your parent process. Note that the parent must be active while the child is generating the output, since pipes have a limited buffer size (historically about 4 KiB; 64 KiB by default on current Linux), and if the child generates more data than that while the parent is not reading, the child will block until the parent reads. If the parent is waiting for the child to die, you have a deadlock.
Use pipe() etc to do this the hard way. Parent calls pipe(), then forks. The child sorts out the plumbing so that the write end of the pipe is its standard output, and ensures that all other file descriptors relating to the pipe are closed. This might well use the dup2() system call. It then executes the required process, which sends its standard output down the pipe.
Meanwhile, the parent also closes the unwanted ends of the pipe, and then starts reading. When it gets EOF on the pipe, it knows the child has finished and closed the pipe; it can close its end of the pipe too.
Since it looks like you're going to be using this in a Linux/Cygwin environment, you want to use popen. It's like opening a file, only you'll get the executing program's stdout, so you can use your normal fscanf, fread etc.
After forking, use dup2(2) to duplicate the file's FD into stdout's FD, then exec.
You could also run the shell (sh) and pass it a command that includes the redirection (this snippet is C++, since it uses std::string):
string cmd = "/bin/ls > " + filepath;
execl("/bin/sh", "sh", "-c", cmd.c_str(), (char *)NULL);
For those such as myself who like a complete example with includes, here's this fantastic answer with a runnable example (still without error handling, left as an exercise):
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>
int main() {
if (fork() == 0) { // child
int fd = open("test.txt", O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
dup2(fd, 1); // make stdout go to file
dup2(fd, 2); // make stderr go to file - you may choose to not do this
// or perhaps send stderr to another file
close(fd); // fd no longer needed - the dup'ed handles are sufficient
execlp("ls", "ls", (char *)NULL);
_exit(127); // reached only if execlp fails
}
else {
while (wait(NULL) > 0) {} // wait for each child process
}
return 0;
}

Child not reading output from another child that put it in the pipe

I've been working on this school assignment forever now, and I'm super close to finishing.
The assignment is to create a bash shell in C, which sounds basic enough, but it has to support piping, IO redirect, and flags within the piped commands. I have it all working except for one thing; the | piping child isn't getting any of the data written to the pipe by the user command process child. If I were to remove the child fork for pipechild, and have everything from if(pipe_cmd[0] != '\0') run as the parent, it would work just fine (minus ending the program because of execlp). If I were to use printf() inside the pipe section, the output would be in the right file or terminal, which just leaves the input from the user command process child not getting to where it needs to be as a culprit.
Does anyone see an issue on how I'm using the pipe? It all felt 100% normal to me, given the definition of a pipe.
int a[2];
pipe(a);
//assume file_name is something like file.txt
strcat(file_name, "file.txt");
strcat(pipe_cmd, "wc");
if(!fork())
{
if(pipe_cmd[0] != '\0') // if there's a pipe
{
close(1); //close normal stdout
dup(a[1]); // making stdout same as a[1]
close(a[0]); // closing other end of pipe
execlp("ls","ls",NULL);
}
else if(file_name[0] != '\0') // if just a bare command with a file redirect
{
int rootcmd_file = open(file_name, O_APPEND|O_WRONLY|O_CREAT, 0644);
dup2(rootcmd_file, STDOUT_FILENO);
execlp("ls","ls",NULL); // writes ls to the filename
}
// if no pipe or file name write...
else if(rootcmd_flags[0] != '\0') execlp("ls","ls",NULL);
else execlp("ls","ls",NULL);
} else wait(0);
if(pipe_cmd[0] != '\0') // parent goes here, if pipe.
{
pipechild = fork();
if(pipechild != 0) // *PROBLEM ARISES HERE- IF THIS IS FORKED, IT WILL HAVE NO INFO TAKEN IN.
{
close(0); // closing normal stdin
dup(a[0]); // making our input come from the child above
close(a[1]); // close other end of pipe
if(file_name[0] != '\0') // if a filename does exist, we must reroute the output to the pipe
{
close(1); // close normal stdout
int fileredir_pipe = open(file_name, O_APPEND|O_WRONLY|O_CREAT, 0644);
dup2(fileredir_pipe, STDOUT_FILENO); //redirects STDOUT to file
execlp("wc","wc",NULL); // this outputs nothing
}
else
{
// else there is no file.
// executing the pipe in stdout using execlp.
execlp("wc","wc",NULL); // this outputs nothing
}
}
else wait(0);
}
Thanks in advance. I apologize for some of the code being withheld. This is still an active assignment and I don't want any cases of academic dishonesty. This post was risky enough.
} else wait(0);
At this point, the shown code forks the first child process and then waits for it to terminate.
The first child process gets set up with a pipe on its standard output. The pipe will be connected to the second child process's standard input. The fatal flaw in this scheme is that the second child process isn't even started yet, and won't get started until the first process terminates.
Pipes have limited internal buffering. If the first process generates very little output, chances are that its output will fit inside the pipe buffer; it'll write its output and then quietly terminate, none the wiser.
But if the pipe buffer becomes full, the process will block and wait until something reads from the pipe and clears it. It will wait as long as it takes for that to happen. And wait, and wait, and wait. And since the second child process hasn't been started yet, and the parent process is waiting for the first process to terminate, it will wait, in vain, forever.
This overall logic is fatally flawed for this reason. The correct logic is to completely fork and execute all child processes, close the pipe descriptors in the parent (this is also important), and then wait for all child processes to terminate. wait must be the very last thing that happens here, otherwise things will break in various amazing and mysterious ways.

Closing at the right time and waiting process in C

I have a function which creates two child processes. In the first child process I am writing to a file, and in the second one I am also writing to a file, which is different from the first one.
In the parent process I am executing the function execvp.
What I need is the stdout and stderr of the program run by execvp, so that the two child processes can write to files what comes out of stderr and stdout. Then at the end I am merging the two files.
I would like to know where I should close the pipes and where I should use wait, so that I don't have problems using read and write and don't end up in an infinite loop. I didn't implement the functions that create the files, the function that merges the files, or the function that runs the shell instruction, because I am just wondering if this is the best structure for the function createTwoChild.
As agreed in the comments, I'm providing a skeleton here (simplified to just stdout):
if (pipe(fd)<0) goto my_sys_error; // just an example to get out of here
if ((pid_child=fork())<0) {
close(fd[0]); close(fd[1]);
goto my_sys_error; // you can also use something like e.g. "return -1" to handle the error
}
if (!pid_child) {
// the child process with exec() of which we want to get the output
close(fd[0]); close(0);
dup2(fd[1],1); close(fd[1]);
execXX(...); // some of exec() family also spawn a shell here
close(1);
_exit(127); // This must not happen
}
// master/parent
close(fd[1]); // master doesn't need, only child writes to it
i = read(fd[0],p,PIPEBUF_SIZE);
if (i>0) {
// usual handling, write to file, do whatever you like
// should be while() instead of if(), just simplified
} else {
// handle it, e.g. print "no data from extcmd"
}
close(fd[0]); // close the last fd
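waitpid(pid_child, &status_child, 0);
if (!WIFEXITED(status_child)) {
// the child was terminated by a signal and has already been reaped,
// so report the failure here; there is nothing left to kill()
}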
Some notes:
This is actually a simplified implementation of popen().
When the child exits, you get an EOF on read, so in this simple approach no signal handler for SIGCHLD is required.
Other signal handling not covered.
stderr can be simply added with an additional pipe.

Can not understand the pipe() in my own shell

This is the code I found for my own shell. It works fine, but the part I can't understand is the pipe section of the code.
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <sys/wait.h>
char* cmndtkn[256];
char buffer[256];
char* path=NULL;
char pwd[128];
int main(){
//setting path variable
char *env;
env=getenv("PATH");
putenv(env);
system("clear");
printf("\t MY OWN SHELL !!!!!!!!!!\n ");
printf("_______________________________________\n\n");
while(1){
fflush(stdin);
getcwd(pwd,128);
printf("[MOSH~%s]$",pwd);
fgets(buffer,sizeof(buffer),stdin);
buffer[sizeof(buffer)-1] = '\0';
//tokenize the input command line
char* tkn = strtok(buffer," \t\n");
int i=0;
int indictr=0;
// loop for every part of the command
while(tkn!=NULL)
{
if(strcoll(tkn,"exit")==0 ){
exit(0);
}
else if(strcoll(buffer,"cd")==0){
path = buffer;
chdir(path+=3);
}
else if(strcoll(tkn,"|")==0){
indictr=i;
}
cmndtkn[i++] = tkn;
tkn = strtok(NULL," \t\n");
}cmndtkn[i]=NULL;
// execute when command has pipe. when | command is found indictr is greater than 0.
if(indictr>0){
char* leftcmnd[indictr+1];
char* rightcmnd[i-indictr];
int a,b;
for(b=0;b<indictr;b++)
leftcmnd[b]=cmndtkn[b];
leftcmnd[indictr]=NULL;
for(a=0;a<i-indictr-1;a++)
rightcmnd[a]=cmndtkn[a+indictr+1];
rightcmnd[i-indictr-1]=NULL; // last valid index: the array holds i-indictr entries
if(!fork())
{
fflush(stdout);
int pfds[2];
pipe(pfds);
if(!fork()){
close(1);
dup(pfds[1]);
close(pfds[0]);
execvp(leftcmnd[0],leftcmnd);
}
else{
close(0);
dup(pfds[0]);
close(pfds[1]);
execvp(rightcmnd[0],rightcmnd);
}
}else
wait(NULL);
//command not include pipe
}else{
if(!fork()){
fflush(stdout);
execvp(cmndtkn[0],cmndtkn);
}else
wait(NULL);
}
}
}
What is the purpose of the calls to close() with parameters of 0 and 1, and what does the call to dup() do?
On Unix, the dup() call uses the lowest numbered unused file descriptor. So, the close(1) before the call to dup() is to coerce dup() to use file descriptor 1. Similarly for close(0).
So, the aliasing is to get the process to use the write end of the pipe for stdout (file descriptor 1 is used for console output), and the read end of the pipe for stdin (file descriptor 0 is used for console input).
The code may have been more clearly expressed with dup2() instead.
dup2(fd[1], 1); /* alias fd[1] to 1 */
From your question about how ls | sort works, your question is not limited to why the dup() system call is being made. Your question is actually how pipes in Unix work, and how a shell command pipeline works.
A pipe in Unix is a pair of file descriptors that are related in that writing data on the writable descriptor allows that data to be read from the readable descriptor. The pipe() call returns this pair in an array, where the first array element is readable, and the second array element is writable.
In Unix, a fork() followed by some kind of exec() is the traditional way to produce a new process running a different program (library calls such as system() or popen() also create processes, but they typically fork() and exec() under the hood). A fork() produces a child process. The child process sees a return value of 0 from the call, while the parent sees a non-zero return value that is either the PID of the child process, or -1 indicating that an error has occurred.
The child process is a duplicate of the parent. This means that when a child modifies a variable, it is modifying a copy of the variable that resides in its own process. The parent does not see the modification occur, as the parent has the original copy. However, a duplicated pair of file descriptors that form a pipe can be used to allow a child process and its parent to communicate with each other.
So, ls | sort means that there are two processes being spawned, and the output written by ls is being read as input by sort. Two processes means two calls to fork() to create two child processes. One child process will exec() the ls command, the other child process will exec() the sort command. A pipe is used between them to allow the processes to talk to each other. The ls process writes to the writable end of the pipe, the sort process reads from the readable end of the pipe.
The ls process is coerced into writing into the writable end of the pipe with the dup() call after issuing close(1). The sort process is coerced into reading the readable end of the pipe with the dup() call after close(0).
In addition, the close() calls that close the pipe file descriptors are used to make sure that the ls process is the only process to have an open reference to the writable fd, and the sort process is the only process to have an open reference to the readable fd. That step is important because after ls exits, its writable end of the pipe is closed, and the sort process will expect to see an EOF as a result. However, this will not occur if some other process still has the writable fd open.
http://en.wikipedia.org/wiki/Standard_streams#Standard_input_.28stdin.29
stdin is file descriptor 0.
stdout is file descriptor 1.
In the !fork section, the process closes stdout then calls dup on pfds[1] which according to:
http://linux.die.net/man/2/dup
Creates a duplicate of the specified file descriptor at the lowest available position, which will be 1, since it was just closed (and stdin hasn't been closed yet). This means everything sent to stdout will really go to pfds[1].
So, basically, it's setting up the two new processes to talk to each other. The !fork section is for the new child, which will send data to stdout (file descriptor 1); the parent (the else block) closes stdin, so it really reads from pfds[0] when it tries to read from stdin.
Each process has to close the file descriptor in pfds it's not using, as there are two open handles to the file now that the process has forked. Each process now execs to left/right-cmnd, but the new stdin and stdout mappings remain for the new processes.
Forking twice is explained here: Why fork() twice
