//Executing shell command ls -l | sort
int *pipeIN, *pipeOUT;
int runLS(){
    char* parmListLS[] = { "ls", "-l", NULL };
    int pid = fork();
    if(pid==0){
        close(*pipeIN);
        dup2(*pipeOUT, STDOUT_FILENO);
        execvp(parmListLS[0], parmListLS);
    }else return pid;
}
int runSORT(){
    char* parmListSORT[] = { "sort", NULL };
    int pid = fork();
    if(pid==0){
        close(*pipeOUT);
        dup2(*pipeIN, STDIN_FILENO);
        execvp(parmListSORT[0], parmListSORT);
    }else return pid;
}
int main(void){
    int pidLS, pidSort, pipeId[2];
    pipeIN = &pipeId[0], pipeOUT = &pipeId[1];
    pipe(pipeId); //open pipes
    pidLS = runLS();
    pidSort = runSORT();
    printf("PIDS: LS -> %d, Sort -> %d\n", pidLS, pidSort);
    printf("Terminated: %d\n", wait(NULL)); //return pid of 1st exited proc
    printf("Terminated: %d\n", wait(NULL)); //return pid of 2nd exited proc
    printf("Terminated Main Proccess!\n");
}
Hello! I'm having trouble making some simple pipes work on Linux.
I'm trying to emulate the shell command ls -l | sort.
As you can see, I'm starting two child processes,
one for ls -l, and the other for sort.
Those processes run independently, and I'm trying to redirect the output
of ls -l to the input of sort via a pipe. The idea is that the main program
should wait for both of them to terminate. However, wait seems to work
fine for ls -l, but it hangs for sort, which never terminates. So the whole program
hangs. If I remove the two wait lines from the program, the ls -l -> sort
pipe works perfectly, and of course main terminates before the children do.
I suspect this is because sort keeps waiting for input,
even after ls -l has terminated. I don't understand how the termination of
the parent affects the termination of the two child processes.
Would anyone please be so kind as to explain to me what actually happens?
Thank you very much. P.S. I left out most error checking for clarity and brevity.
The sort can't terminate because the parent process still has the write end (and the read end) of the pipe open, so the sort hasn't reached EOF on the pipe, so it hasn't got all the data, so it can't write any output.
Add:
close(pipeId[0]);
close(pipeId[1]);
after the calls to runLS() and runSORT() and things work as expected.
Ideally, each child should close both ends of the pipe once it has duplicated the end it needs. If you duplicate a pipe file descriptor onto standard input or standard output, you should almost always close both original pipe descriptors. In this code, you get away without doing it; that won't always be the case.
Personally, I don't think the two global variables help much. The pipe end labelled pipeIN is the input to sort but the output from ls, for example.
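The end-of-file rule behind this answer can be seen in isolation, without any child processes. The following is a minimal sketch, not the asker's program: drain_after_close() is a hypothetical helper name, and it assumes the message is short enough to fit in the pipe's buffer so that the single write() does not block.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes msg into a fresh pipe, closes the write end,
 * then reads until read() reports end-of-file (returns 0).
 * Assumes msg is short enough to fit in the pipe buffer.
 * Returns the number of bytes read back, or -1 on error. */
long drain_after_close(const char *msg)
{
    int fd[2];
    char buf[256];
    long total = 0;
    ssize_t n;
    size_t len = strlen(msg);

    if (pipe(fd) == -1)
        return -1;
    if (write(fd[1], msg, len) != (ssize_t)len) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    /* This is the only open write end. Without this close(), the loop
     * below would block forever once the data had been consumed,
     * because read() only returns 0 when no write end remains open. */
    close(fd[1]);
    while ((n = read(fd[0], buf, sizeof buf)) > 0)
        total += n;
    close(fd[0]);
    return n < 0 ? -1 : total;
}
```

In the program from the question, the parent plays the role of the forgotten write end: sort keeps reading until the parent's copy of pipeId[1] is closed too.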
I've managed to build an abstract syntax tree for my minishell, but I got stuck
when I tried to execute piped commands.
The first pipe executes and writes its result to stdout (fd 1), while the second one, grep filename, either gets stuck or is not executed at all.
I tried different approaches and got different results, yet none of them works.
I would appreciate any help.
This is what my AST looks like.
ls -la | cat -e | grep filename
t_node *pipe_execution(t_node *node, t_list *blt, t_line *line, int std[2])
{
    int pp[2];
    if (node)
    {
        if (node->kind == NODE_PIPE)
        {
            if (node->and_or_command->left)
            {
                pipe(pp);
                std[1] = pp[1];
                pipe_execution(node->and_or_command->left, blt, line, std);
                close(pp[1]);
            }
            if (node->and_or_command->right)
            {
                std[0] = pp[0];
                std[1] = 1;
                dprintf(2, "right std %d\n", std[1]);
                pipe_execution(node->and_or_command->right, blt, line, std);
                close(std[0]);
            }
        } else if (node->kind == NODE_SIMPLE_COMMAND)
        {
            dprintf(2, "====%s=== and stdin %d stdout %d\n", node->simple_command->head->name, std[0], std[1]);
            execute_shell(blt, line->env, node, std);
        }
    }
    return (node);
}
int execute_shell(t_list *blt, t_list *env, t_node *node, int std[2])
{
    ...
    return (my_fork(path, env, cmds, std));
}
My implementation of the fork process:
int my_fork(char *path, t_list *env, char **cmds, int std[2])
{
    pid_t child;
    char **env_tab;
    int status;
    status = 0;
    env_tab = env_to_tab(env);
    child = fork();
    if (child > 0)
        waitpid(child, &status, 0);
    else if (child == 0)
    {
        dup2(std[0], 0);
        dup2(std[1], 1);
        execve(path, cmds, env_tab);
    }
    return (status);
}
I hope this code makes some sense.
Pipes require concurrent execution
The problem, as far as I can tell from the code snippets you provided, is that my_fork() is blocking. When you execute a process, your shell stops and waits for that process to finish before starting the next one. If you do something simple, like:
/bin/echo Hello | cat
Then the pipe's internal buffer is big enough to store the whole input string Hello. Once the /bin/echo process finishes, you execute cat, which can then read the buffered data from the pipe. However, once it gets more complicated, or when the first process sends a lot more data to the pipe, the pipe's internal buffer fills up and the writing process blocks. Because the shell is still waiting for that process and has not started the reader yet, nothing ever drains the pipe.
The solution is to defer calling waitpid() on the processes you fork until you have spawned all the processes that are part of the command line.
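To illustrate why the wait has to come last, here is a small sketch in which the parent itself plays the reader and a child plays the writer. pump_through_pipe() is a hypothetical name, not part of the asker's shell. The child pushes more data than a pipe buffer typically holds, so it can only finish if somebody is reading at the same time.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: a child writes `total` bytes into a pipe while the
 * parent reads them. Returns the number of bytes received, or -1. */
long pump_through_pipe(long total)
{
    int fd[2], status;
    char buf[4096];
    long received = 0;
    ssize_t n;
    pid_t writer;

    if (pipe(fd) == -1)
        return -1;
    writer = fork();
    if (writer == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (writer == 0) {
        /* Child: the producer. write() blocks whenever the pipe is full. */
        long left = total;
        close(fd[0]);
        memset(buf, 'x', sizeof buf);
        while (left > 0) {
            size_t chunk = left < (long)sizeof buf ? (size_t)left : sizeof buf;
            ssize_t w = write(fd[1], buf, chunk);
            if (w <= 0)
                _exit(1);
            left -= w;
        }
        _exit(0);
    }
    /* Parent: the consumer. Calling waitpid() at this point, before
     * reading, would deadlock once `total` exceeds the pipe's capacity:
     * the child would be stuck in write() and never exit. */
    close(fd[1]);
    while ((n = read(fd[0], buf, sizeof buf)) > 0)
        received += n;
    close(fd[0]);
    if (waitpid(writer, &status, 0) == -1 || n < 0)
        return -1;
    return received;
}
```

Calling pump_through_pipe(1L << 20) moves one mebibyte through the pipe; the same ordering (start every reader first, wait afterwards) is what a shell needs between the stages of a pipeline.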
Create all required pipes before starting processes
Your function pipe_execution() assumes that there is only a single pipe: it starts the first process with file descriptor 0 as its input, and it starts the second process with file descriptor 1 as its output. However, if you have multiple pipes on a single command line, as in ls -la | cat -e | grep filename, then the output of the cat -e process needs to go into the second pipe, not to standard output.
You need to create the second pipe before starting the right-hand command of the first pipe. It's probably simplest to just create all the pipes before starting any of the commands. You could do this by defining multiple phases:
Create pipes
Start commands
Wait for all commands to finish
You can traverse the abstract syntax tree you built multiple times, each time executing one of the phases.
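The three phases could be sketched as follows for a flat list of commands. This is only an illustration under assumptions: run_pipeline() and MAX_STAGES are hypothetical names, the commands are given as plain argv arrays rather than as the asker's AST, and execvp() is used instead of execve() so that no path lookup is needed.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_STAGES 16 /* arbitrary limit for this sketch */

/* Hypothetical helper: runs cmds[0] | cmds[1] | ... | cmds[n-1].
 * Each cmds[i] is a NULL-terminated argv array. Returns the exit status
 * of the last command, or -1 if it could not be determined. */
int run_pipeline(char **cmds[], int n)
{
    int pipes[MAX_STAGES - 1][2];
    pid_t pids[MAX_STAGES];
    int i, j, status, last = -1;

    if (n < 1 || n > MAX_STAGES)
        return -1;

    /* Phase 1: create all n-1 pipes. */
    for (i = 0; i < n - 1; i++) {
        if (pipe(pipes[i]) == -1) {
            perror("pipe");
            for (j = 0; j < i; j++) {
                close(pipes[j][0]);
                close(pipes[j][1]);
            }
            return -1;
        }
    }

    /* Phase 2: start every command, without waiting for any of them. */
    for (i = 0; i < n; i++) {
        pids[i] = fork();
        if (pids[i] == -1) {
            perror("fork");
        } else if (pids[i] == 0) {
            if (i > 0)
                dup2(pipes[i - 1][0], STDIN_FILENO);
            if (i < n - 1)
                dup2(pipes[i][1], STDOUT_FILENO);
            /* The child keeps only its standard descriptors. */
            for (j = 0; j < n - 1; j++) {
                close(pipes[j][0]);
                close(pipes[j][1]);
            }
            execvp(cmds[i][0], cmds[i]);
            perror(cmds[i][0]);
            _exit(127);
        }
    }

    /* The parent must drop its copies, or the readers never see EOF. */
    for (j = 0; j < n - 1; j++) {
        close(pipes[j][0]);
        close(pipes[j][1]);
    }

    /* Phase 3: wait for all commands to finish. */
    for (i = 0; i < n; i++) {
        if (pids[i] == -1 || waitpid(pids[i], &status, 0) == -1)
            continue;
        if (i == n - 1 && WIFEXITED(status))
            last = WEXITSTATUS(status);
    }
    return last;
}
```

With an AST, the same structure applies: one traversal to create the pipes, one to fork the commands, and one to collect the children.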
I am working on a tiny shell (tsh) implemented in C (it's an assignment). One part of the assignment is piping: I have to pipe one command's output to another command, e.g. ls -l | sort.
When I run the shell, every command that I execute in it is processed by a child process that it spawns. After the child finishes, the result is returned. For piping, I wanted to implement a hardcoded example first to check how it works. I wrote a function that partially works. The problem is that when I run the pipe command, after the child process finishes, the whole program quits with it! Obviously I am not handling the child process signal properly (function code below).
My questions:
How does process management with pipe() work? If I run a command ls -l | sort, does it create a child process for ls -l and another process for sort? From the piping examples that I have seen so far, only one process is created (one fork()).
When the second command (sort in our example) is processed, how can I get its process ID?
EDIT: Also, when I run this code I get the result twice. I don't know why it runs twice; there is no loop in there.
Here is my code:
pid_t pipeIt(void){
    pid_t pid;
    int pipefd[2];
    if(pipe(pipefd)){
        unix_error("pipe");
        return -1;
    }
    if((pid = fork()) <0){
        unix_error("fork");
        return -1;
    }
    if(pid == 0){
        close(pipefd[0]);
        dup2(pipefd[1],1);
        close(pipefd[1]);
        if(execl("/bin/ls", "ls", (char *)NULL) < 0){
            unix_error("/bin/ls");
            return -1;
        }// End of if command wasn't successful
    }// End of pid == 0
    else{
        close(pipefd[1]);
        dup2(pipefd[0],0);
        close(pipefd[0]);
        if(execl("/usr/bin/tr", "tr", "e", "f", (char *)NULL) < 0){
            unix_error("/usr/bin/tr");
            return -1;
        }
    }
    return pid;
}// End of pipeIt
Yes, the shell must fork to exec each subprocess. Remember that when you call one of the execve() family of functions, it replaces the current process image with the exec'ed one. Your shell cannot continue to process further commands if it directly execs a subprocess, because thereafter it no longer exists (except as the subprocess).
To fix it, simply fork() again in the pid == 0 branch, and exec the ls command in that child. Remember to wait() for both (all) child processes if you don't mean the pipeline to be executed asynchronously.
Yes, you do need to call fork at least twice, once for each program in the pipeline. Remember that exec replaces the program image of the current process, so your shell stops existing the moment you start running sort (or tr).
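As for the process IDs: fork() returns the child's PID in the parent, so a shell that forks once per command can simply record each return value and later wait for exactly those children. The sketch below shows only that bookkeeping. spawn_and_collect() is a hypothetical helper, and each child calls _exit() with a distinct code as a stand-in for the exec of a real command.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks `count` children. Child i exits immediately
 * with code 10 + i (a stand-in for exec'ing a real command). The parent
 * stores each child's PID in pids[] and its exit code in codes[].
 * Returns 0 on success, -1 if a fork() or waitpid() call failed. */
int spawn_and_collect(int count, pid_t pids[], int codes[])
{
    int i, started = 0, status, rc = 0;

    for (i = 0; i < count; i++) {
        pids[i] = fork();          /* the parent sees the child's PID here */
        if (pids[i] == -1) {
            rc = -1;
            break;
        }
        if (pids[i] == 0)
            _exit(10 + i);         /* child: never returns to the caller */
        started++;
    }

    /* The calling process is still alive and reaps each child by PID. */
    for (i = 0; i < started; i++) {
        if (waitpid(pids[i], &status, 0) != pids[i]) {
            rc = -1;
            continue;
        }
        codes[i] = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }
    return rc;
}
```

Because the parent never calls exec itself, it survives the pipeline and can return to its prompt afterwards.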
I'm still new to processes, pipes and dup2, so I'd like someone to help me figure out what's wrong with a program I've written. This program is supposed to run ls | wc. So far the output I get is:
wc : standard input : Bad file descriptor
0 0 0
ls : write error : Bad file descriptor
After I get this output, the terminal still accepts inputs. It's like wc is still running, although if I put commands like ls first(without any other input before) it runs them and shuts down. I tried running ps before/after and while the program was still running and it didn't show any process being open aside from bash and ps. (I'm running this program in Linux terminal)
Here's my code :
#include<stdio.h>
#include<unistd.h>
#include<sys/types.h>
#include<stdlib.h>
#include<string.h>
#include<sys/wait.h>
#include<errno.h>
int main(int argc, char* argv[]){
    pid_t pid;
    int fd[2];
    char com1[1024] = ("ls");
    char com2[1024] = ("wc");
    pipe(fd);
    pid = fork();
    if(pid == 0){
        open(fd[1]);
        dup2(fd[0],STDOUT_FILENO);
        close(fd[0]);
        execlp(com1, com1, NULL);
    }
    else {
        pid = fork();
        if (pid == 0){
            open(fd[0]);
            dup2(fd[1],STDIN_FILENO);
            close(fd[1]);
            execlp(com2, com2, NULL);
        }
    }
    return 0;
}
Bear in mind that I know some error checks are required (like if(pid<0)exit(0);), but I tried to simplify my code as much as possible in order to see if there's any mistake due to carelessness.
Thank you in advance!
According to the pipe manual page:
pipefd[0] refers to the read end of the pipe. pipefd[1] refers to the write end of the pipe.
Now take this line from the first child, the process that calls the ls command:
dup2(fd[0],STDOUT_FILENO);
Here you duplicate the read end of the pipe to STDOUT_FILENO, i.e. where output is written. If you stop and think a little about it, how would you write to a read-only file-descriptor like fd[0]?
Same with the other child process, where you make the write end of the pipe standard input.
The solution is simple: Swap places of the descriptors you duplicate. Use fd[1] for the first child process, and fd[0] for the second child process.
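The open(fd[1]) and open(fd[0]) calls should go as well: open() takes a path name, not a file descriptor. Close the unused end of the pipe in each child instead.
In the first process where you call the ls command:
close(fd[0]);
dup2(fd[1],STDOUT_FILENO);
close(fd[1]);
execlp(com1, com1, (char *)NULL);
And in the second child process where you call the wc command (closing the write end matters here, otherwise wc never sees end-of-file):
close(fd[1]);
dup2(fd[0],STDIN_FILENO);
close(fd[0]);
execlp(com2, com2, (char *)NULL);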
My mini-shell program accepts pipe commands, for example ls -l | wc -l, and uses execvp to execute them.
My problem is that without a fork() before execvp, the pipe command works well but the shell terminates afterward. With a fork() before execvp, the program gets stuck in an infinite loop, and I cannot fix it.
Code:
void run_pipe(char **args){
    int ps[2];
    pipe(ps);
    pid_t pid = fork();
    pid_t child_pid;
    int child_status;
    if(pid == 0){ // child process
        close(1);
        close(ps[0]);
        dup2(ps[1], 1);
        //e.g. cmd[0] = "ls", cmd[1] = "-l"
        char ** cmd = split(args[index], " \t");
        //if fork here, program cannot continue with infinite loop somewhere
        if(fork()==0){
            if (execvp(cmd[0],cmd)==-1){
                printf("%s: Command not found.\n", args[0]);
            }
        }
        wait(0);
    }
    else{ // parent process
        close(0);
        close(ps[1]);
        dup2(ps[0],0);
        //e.g. cmd[0] = "wc", cmd[1] = "-l"
        char ** cmd = split(args[index+1], " \t");
        //if fork here, program cannot continue with infinite loop somewhere
        if(fork()==0){
            if (execvp(cmd[0],cmd)==-1){
                printf("%s: Command not found.\n", args[0]);
            }
        }
        wait(0);
        waitpid(pid, &child_status, 0);
    }
}
I know fork() is needed before execvp so that the shell program does not terminate, but I still cannot fix it. Any help will be appreciated, thank you!
How should I run the two children in parallel?
pid = fork();
if( pid == 0){
    // child
} else{ // parent
    pid1 = fork();
    if(pid1 == 0){
        // second child
    } else // parent
}
Is this correct?
Yes, execvp() replaces the program in which it is called with a different one. If you want to spawn another program without ending execution of the one that does the spawning (i.e. a shell) then that program must fork() to create a new process, and have the new process perform the execvp().
Your program source exhibits a false parallelism that probably either confuses you or reflects a deeper confusion. You structure the behavior of the first child forked in just the same way as the behavior of the parent process after the fork, but what should be parallel is the behavior of the first child and the behavior of the second child.
One outcome is that your program has too many forks. The initial process should fork exactly twice -- once for each child it wants to spawn -- and neither child should fork because it's already a process dedicated to one of the commands you want to run. In your actual program, however, the first child does fork. That case is probably rescued by the child also wait()ing for the grandchild, but it's messy and poor form.
Another outcome is that when you set up the second child's file descriptors, you manipulate the parent's, prior to forking, instead of manipulating the child's after forking. Those changes will persist in the parent process, which I'm pretty confident is not what you want. This is probably why the shell seems to hang: when run_pipe() returns, the shell's standard input has been changed to the read end of the pipe.
Additionally, the parent process should close both ends of the pipe after the children have both been forked, for more or less the same reason that the children must each close the end they are not using. In the end, there will be exactly one open copy of the file descriptor for each end of the pipe, one in one child and the other in the other. Failing to do this correctly can also cause a hang under some circumstances, as the processes you fork may not terminate.
Here's a summary of what you want the program to do:
The original process sets up the pipe.
The original process forks twice, once for each command.
Each subprocess manipulates its own file descriptors to use the correct end of the pipe as the appropriate standard FD, and closes the other end of the pipe.
Each subprocess uses execvp() (or one of the other functions in that family) to run the requested program.
The parent closes its copies of the file descriptors for both ends of the pipe.
The parent uses wait() or waitpid() to collect two children.
Note, too, that you should check the return values of all your function calls and provide appropriate handling for errors.
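The steps above, with the return values checked, might look roughly like this. It is a sketch under assumptions rather than a drop-in replacement for run_pipe(): run_pair() and exec_stage() are hypothetical names, and the two commands are passed as ready-made argv arrays instead of being produced by split().

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child side: make pipe end fd[use] the standard descriptor `target`,
 * close both original pipe descriptors, then exec. Never returns. */
static void exec_stage(int fd[2], int use, int target, char *const argv[])
{
    if (dup2(fd[use], target) == -1) {
        perror("dup2");
        _exit(126);
    }
    close(fd[0]);
    close(fd[1]);
    execvp(argv[0], argv);
    perror(argv[0]);            /* only reached if execvp() failed */
    _exit(127);
}

/* Hypothetical helper: runs `left | right`. Returns the exit status of
 * `right`, or -1 if it could not be started or did not exit normally. */
int run_pair(char *const left[], char *const right[])
{
    int fd[2], status, result = -1;
    pid_t lpid, rpid;

    if (pipe(fd) == -1) {                       /* set up the pipe */
        perror("pipe");
        return -1;
    }

    lpid = fork();                              /* first child */
    if (lpid == -1)
        perror("fork");
    else if (lpid == 0)
        exec_stage(fd, 1, STDOUT_FILENO, left);

    rpid = fork();                              /* second child */
    if (rpid == -1)
        perror("fork");
    else if (rpid == 0)
        exec_stage(fd, 0, STDIN_FILENO, right);

    close(fd[0]);                               /* parent closes both ends */
    close(fd[1]);

    if (lpid > 0 && waitpid(lpid, &status, 0) == -1)
        perror("waitpid");
    if (rpid > 0) {
        if (waitpid(rpid, &status, 0) == -1)
            perror("waitpid");
        else if (WIFEXITED(status))
            result = WEXITSTATUS(status);
    }
    return result;
}
```

Neither child forks again, and the parent's own standard input and output are never touched, so the shell can keep reading commands after run_pair() returns.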