Pipe and Process management - c

I am working on a tiny shell (tsh) implemented in C (it's an assignment). One part of the assignment is piping: I have to pipe one command's output into another command, e.g. ls -l | sort.
When I run the shell, every command I execute is processed by a child process that the shell spawns. After the child finishes, the result is returned. For piping, I wanted to implement a hardcoded example first to see how it works. I wrote a function that partially works. The problem is that when I run the pipe command, the whole program quits as soon as the child process finishes! Obviously I am not handling the child process signal properly (code below).
My questions:
How does process management with pipe() work? If I run ls -l | sort, does the shell create one child process for ls -l and another for sort? The piping examples I have seen so far create only one process (a single fork()).
When the second command (sort in our example) is run, how can I get its process ID?
EDIT: Also, when I run this code I get the result twice. I don't know why it runs twice; there is no loop in there.
Here is my code:
pid_t pipeIt(void){
    pid_t pid;
    int pipefd[2];
    if(pipe(pipefd)){
        unix_error("pipe");
        return -1;
    }
    if((pid = fork()) <0){
        unix_error("fork");
        return -1;
    }
    if(pid == 0){
        close(pipefd[0]);
        dup2(pipefd[1],1);
        close(pipefd[1]);
        if(execl("/bin/ls", "ls", (char *)NULL) < 0){
            unix_error("/bin/ls");
            return -1;
        }// End of if command wasn't successful
    }// End of pid == 0
    else{
        close(pipefd[1]);
        dup2(pipefd[0],0);
        close(pipefd[0]);
        if(execl("/usr/bin/tr", "tr", "e", "f", (char *)NULL) < 0){
            unix_error("/usr/bin/tr");
            return -1;
        }
    }
    return pid;
}// End of pipeIt

Yes, the shell must fork to exec each subprocess. Remember that when you call one of the execve() family of functions, it replaces the current process image with the exec'ed one. Your shell cannot continue to process further commands if it directly execs a subprocess, because thereafter it no longer exists (except as the subprocess).
To fix it, fork() a second time in the parent (the else branch) and exec the tr command in that second child, so that the shell itself never calls exec. Remember to wait() for both (all) child processes if you don't mean the pipeline to be executed asynchronously.

Yes, you do need to call fork at least twice, once for each program in the pipeline. Remember that exec replaces the program image of the current process, so your shell stops existing the moment you start running sort (or tr).
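A minimal sketch of the two-fork structure both answers describe, using the same hard-coded ls | tr e f pipeline as the question. The function name pipe_ls_tr and the use of perror() in place of the question's unix_error() helper are assumptions made for illustration. Because the shell itself calls fork() once per stage, the return values of those calls give it both process IDs, including that of the second command.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the hard-coded pipeline "ls | tr e f" and waits for both stages.
 * Returns the exit status of the last stage (tr), or -1 on error.
 * pipe_ls_tr is a hypothetical name, not part of the assignment code. */
int pipe_ls_tr(void)
{
    int pipefd[2];
    int status = 0;
    pid_t ls_pid, tr_pid;

    if (pipe(pipefd) < 0) {
        perror("pipe");
        return -1;
    }

    if ((ls_pid = fork()) < 0) {
        perror("fork");
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    if (ls_pid == 0) {                  /* first child: ls writes into the pipe */
        close(pipefd[0]);
        dup2(pipefd[1], STDOUT_FILENO);
        close(pipefd[1]);
        execl("/bin/ls", "ls", (char *)NULL);
        perror("/bin/ls");              /* reached only if execl() failed */
        _exit(127);
    }

    if ((tr_pid = fork()) < 0) {        /* tr_pid is the second command's PID */
        perror("fork");
        close(pipefd[0]);
        close(pipefd[1]);
        waitpid(ls_pid, NULL, 0);
        return -1;
    }
    if (tr_pid == 0) {                  /* second child: tr reads from the pipe */
        close(pipefd[1]);
        dup2(pipefd[0], STDIN_FILENO);
        close(pipefd[0]);
        execl("/usr/bin/tr", "tr", "e", "f", (char *)NULL);
        perror("/usr/bin/tr");
        _exit(127);
    }

    /* The shell never execs; it closes its copies of both pipe ends
     * so that tr sees EOF once ls exits, then reaps both children. */
    close(pipefd[0]);
    close(pipefd[1]);
    waitpid(ls_pid, NULL, 0);
    waitpid(tr_pid, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called from the shell's command loop, for example as int rc = pipe_ls_tr();, the function returns only after both children have exited, so the shell keeps running afterwards.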

Related

finding process id using program name

My program receives a string (a shell script's path) as input. Now I have to launch that shell script and find out the process id for it.
I'm using the system() function to launch the shell script, and after that I use popen() with ps -aef | grep "ShellScript" to get its PID.
It has been suggested to me that there's a better way to do it. My approach will give the wrong PID if multiple scripts are running at the same time.
What is the correct way to get a PID for a given script name after launching it?
First, you should not use system().
A better approach is fork(), which returns the child's PID (> 0) in the parent process and returns 0 in the child.
Any other return value (that is, -1) is an error, and errno is set accordingly.
In the child process you should exec your command; in the parent you should either wait for the child or handle the SIGCHLD signal, to avoid leaving a zombie process.
Always read the man pages for better insight.
Here is a brief example:
pid_t pid = -1;
if ((pid = fork()) > 0)
{
    /* Parent process: pid holds the child's PID */
    wait(NULL);
}
else if (pid == 0)
{
    /* Child process */
    execv(....);
    /* Reached only if execv() failed */
    perror("execv()");
    _exit(127);
}
else
{
    /* Error */
    perror("fork()");
}
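Applied to the question, the idea can be sketched as below. This is only an illustration under assumptions: the names launch_script and wait_for_script are hypothetical, and the script is assumed to be runnable by /bin/sh. The point is that the parent learns the PID directly from fork(), with no need to search the output of ps.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Starts `sh path` in a child process and returns the child's PID
 * (or -1 if fork() failed). Hypothetical helper for illustration. */
pid_t launch_script(const char *path)
{
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: replace this process with the shell running the script. */
        execl("/bin/sh", "sh", path, (char *)NULL);
        _exit(127);             /* reached only if execl() failed */
    }
    return pid;                 /* parent: exactly the PID that was launched */
}

/* Reaps the child and returns its exit status, or -1 if it did not
 * exit normally. Without this the child would remain a zombie. */
int wait_for_script(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Since the PID comes from the fork() call that created the process, it stays correct even when several copies of the same script are running at once.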

Pipe function in Linux shell write in C

My mini-shell program accepts pipe commands, for example ls -l | wc -l, and uses execvp to execute them.
My problem is that if I do not fork() before execvp, the pipe command works well but the shell terminates afterward. If I do fork() before execvp, the program gets stuck in an infinite loop, and I cannot fix it.
code:
void run_pipe(char **args){
    int ps[2];
    pipe(ps);
    pid_t pid = fork();
    pid_t child_pid;
    int child_status;
    if(pid == 0){ // child process
        close(1);
        close(ps[0]);
        dup2(ps[1], 1);
        //e.g. cmd[0] = "ls", cmd[1] = "-l"
        char ** cmd = split(args[index], " \t");
        //if fork here, program cannot continue with infinite loop somewhere
        if(fork()==0){
            if (execvp(cmd[0],cmd)==-1){
                printf("%s: Command not found.\n", args[0]);
            }
        }
        wait(0);
    }
    else{ // parent process
        close(0);
        close(ps[1]);
        dup2(ps[0],0);
        //e.g. cmd[0] = "wc", cmd[1] = "-l"
        char ** cmd = split(args[index+1], " \t");
        //if fork here, program cannot continue with infinite loop somewhere
        if(fork()==0){
            if (execvp(cmd[0],cmd)==-1){
                printf("%s: Command not found.\n", args[0]);
            }
        }
        wait(0);
        waitpid(pid, &child_status, 0);
    }
}
I know fork() is needed before execvp so that the shell program does not terminate, but I still cannot fix it. Any help will be appreciated, thank you!
How should I make the two children run in parallel?
pid = fork();
if (pid == 0) {
    // child
} else { // parent
    pid1 = fork();
    if (pid1 == 0) {
        // second child
    } else {
        // parent
    }
}
Is this correct?
Yes, execvp() replaces the program in which it is called with a different one. If you want to spawn another program without ending execution of the one that does the spawning (i.e. a shell) then that program must fork() to create a new process, and have the new process perform the execvp().
Your program source exhibits a false parallelism that probably either confuses you or reflects a deeper confusion. You structure the behavior of the first child forked in just the same way as the behavior of the parent process after the fork, but what should be parallel is the behavior of the first child and the behavior of the second child.
One outcome is that your program has too many forks. The initial process should fork exactly twice -- once for each child it wants to spawn -- and neither child should fork because it's already a process dedicated to one of the commands you want to run. In your actual program, however, the first child does fork. That case is probably rescued by the child also wait()ing for the grandchild, but it's messy and poor form.
Another outcome is that when you set up the second child's file descriptors, you manipulate the parent's, prior to forking, instead of manipulating the child's after forking. Those changes will persist in the parent process, which I'm pretty confident is not what you want. This is probably why the shell seems to hang: when run_pipe() returns, the shell's standard input has been changed to the read end of the pipe.
Additionally, the parent process should close both ends of the pipe after the children have both been forked, for more or less the same reason that the children must each close the end they are not using. In the end, there will be exactly one open copy of the file descriptor for each end of the pipe, one in one child and the other in the other. Failing to do this correctly can also cause a hang under some circumstances, as the processes you fork may not terminate.
Here's a summary of what you want the program to do:
The original process sets up the pipe.
The original process forks twice, once for each command.
Each subprocess manipulates its own file descriptors to use the correct end of the pipe as the appropriate standard FD, and closes the other end of the pipe.
Each subprocess uses execvp() (or one of the other functions in that family) to run the requested program.
The parent closes its copies of the file descriptors for both ends of the pipe.
The parent uses wait() or waitpid() to collect the two children.
Note, too, that you should check the return values of all your function calls and provide appropriate handling for errors.
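The steps in the summary can be sketched as follows. This is an illustration under assumptions, not a drop-in replacement for run_pipe(): the names spawn_stage and run_pipe2 are hypothetical, and each command is assumed to be already split into a NULL-terminated argv array (the question's split() helper is not reproduced here).

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks one stage of the pipeline. In the child, the chosen pipe end is
 * duplicated onto stdout (writer) or stdin (reader), both original pipe
 * descriptors are closed, and argv is exec'ed. Returns the child's PID
 * in the parent, or -1 if fork() failed. */
static pid_t spawn_stage(char *const argv[], int ps[2], int is_writer)
{
    pid_t pid = fork();
    if (pid != 0)
        return pid;                     /* parent, or fork() error */

    if (is_writer)
        dup2(ps[1], STDOUT_FILENO);
    else
        dup2(ps[0], STDIN_FILENO);
    close(ps[0]);
    close(ps[1]);
    execvp(argv[0], argv);
    perror(argv[0]);                    /* reached only if execvp() failed */
    _exit(127);
}

/* Runs "left | right". The parent forks exactly twice, never touches its
 * own stdin/stdout, closes both pipe ends, and collects both children.
 * Returns the exit status of `right`, or -1 on error. */
int run_pipe2(char *const left[], char *const right[])
{
    int ps[2];
    int status = 0;

    if (pipe(ps) < 0) {
        perror("pipe");
        return -1;
    }
    pid_t writer = spawn_stage(left, ps, 1);
    pid_t reader = spawn_stage(right, ps, 0);

    /* Without these two close() calls the reader would never see EOF. */
    close(ps[0]);
    close(ps[1]);

    if (writer > 0)
        waitpid(writer, NULL, 0);
    if (reader > 0)
        waitpid(reader, &status, 0);
    if (writer < 0 || reader < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For ls -l | wc -l, the call would be run_pipe2(left, right) with left holding {"ls", "-l", NULL} and right holding {"wc", "-l", NULL}. Both children run concurrently, and the shell's own file descriptors are left unchanged.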

using execvp to execute commands that I have in an array

I have a commands array and I want to execute each command in it, but I can't get it working. This is what I have:
childPid = fork();
for (int i = 0; i < numOfCommands; i++)
{
    if (childPid == 0)
    {
        execvp(commands[i], argv);
        perror("exec failure");
        exit(1);
    }
    else
    {
        wait(&child_status);
    }
}
This executes only the first command in my array and doesn't proceed any further. How would I continue?
And if I want the commands to be executed in random order with their results intermixed, do I have to use fork for that?
You need to use fork in any case, if you want to execute more than one program. From man exec: (emphasis added)
The exec() family of functions replaces the current process image with a new process image.
…
The exec() functions return only if an error has occurred.
By using fork, you create a new process with the same image, and you can replace the image in the child process by calling exec without affecting the parent process, which is then free to fork and exec as many times as it wants to.
Don't forget to wait for the child processes to terminate. Otherwise, when they die they will become zombies. There is a complete example in the wait man page.
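A rough sketch of that pattern, with fork() moved inside the loop so that each command gets its own child. The name run_all and the layout of cmds (one NULL-terminated argv array per command) are assumptions for illustration; the question does not show how its commands array is built. All children are started before any is waited for, so their output may be intermixed.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Starts every command in its own child process, then waits for all of
 * them. Returns how many commands exited with status 0.
 * cmds[i] is assumed to be a NULL-terminated argv array. */
size_t run_all(char *const *cmds[], size_t n)
{
    size_t started = 0, succeeded = 0;

    for (size_t i = 0; i < n; i++) {
        pid_t pid = fork();             /* one fork per command */
        if (pid < 0) {
            perror("fork");
            continue;
        }
        if (pid == 0) {
            execvp(cmds[i][0], cmds[i]);
            perror("exec failure");     /* reached only if execvp() failed */
            _exit(127);
        }
        started++;                      /* parent keeps looping */
    }

    /* Reap every child that was started so none is left as a zombie. */
    for (size_t i = 0; i < started; i++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            succeeded++;
    }
    return succeeded;
}
```

To run the commands strictly one after another instead, the wait() call could be moved into the first loop, directly after the fork().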

popen() alternative

My question is an extension of this one: popen creates an extra sh process
Motives:
1) My program needs to create a child that runs tail on a file. I need to process the output line by line. That is why I am using popen: it returns a FILE *, so I can easily fetch a single line, do what I need to do, and print it.
One problem with popen is that you do not get the PID of the child (the tail command in my case).
2) My program should not exit before its child is done, so I need to wait for it; but without the PID, I cannot do that.
How can I achieve both goals?
A possible (kludge) solution: do execvp("tail -f file > tmpfile") and then keep reading that tmpfile. I am not sure how good this solution is, though.
Why aren't you using the pipe/fork/exec method?
pid_t pid = 0;
int pipefd[2];
FILE* output;
char line[256];
int status;
pipe(pipefd); // create a pipe
pid = fork(); // spawn a child process
if (pid == 0)
{
    // Child. Let's redirect its standard output to our pipe and replace the process with tail
    close(pipefd[0]);
    dup2(pipefd[1], STDOUT_FILENO);
    dup2(pipefd[1], STDERR_FILENO);
    close(pipefd[1]); // the duplicates on fds 1 and 2 are enough
    execl("/usr/bin/tail", "/usr/bin/tail", "-f", "path/to/your/file", (char*) NULL);
    _exit(127); // reached only if execl() failed
}
// Only the parent gets here. Listen to what the tail says
close(pipefd[1]);
output = fdopen(pipefd[0], "r");
while (fgets(line, sizeof(line), output)) // listen to what tail writes to its standard output
{
    // if you need to kill the tail application, just kill it:
    if (something_goes_wrong)
        kill(pid, SIGKILL);
}
// or wait for the child process to terminate
waitpid(pid, &status, 0);
You can use pipe, a function of the exec* family, and fdopen. These are POSIX functions rather than standard C, but so is popen.
You don't need to wait. Just read the pipe up to EOF.
execvp("tail -f file > tmpfile") won't work, redirection is a feature of the shell and you're not running the shell here. Even if it worked it would be an awful solution. Suppose you have read to the end of the file, but the child process has not ended yet. What do you do?
You can use wait, as it doesn't need a PID to wait for but simply waits for any child process to exit. If you have created other child processes you can keep track of them, and if wait returns an unknown PID you can assume it's from your popen process.
I'm not sure why you need the process ID of the child. When the child exits, your pipe read will return an EOF. If you need to terminate the child, just close the pipe.
