I've managed to build an abstract syntax tree for my minishell, but I got stuck
when I tried to execute piped commands.
The first pipe executes and writes its result to stdout (fd 1), while the second one (grep filename) either hangs or is not executed at all.
I tried different approaches and got different results, yet none of them works.
I would appreciate any help.
This is what my AST looks like:
ls -la | cat -e | grep filename
t_node *pipe_execution(t_node *node, t_list *blt, t_line *line, int std[2])
{
    int pp[2];
    if (node)
    {
        if (node->kind == NODE_PIPE)
        {
            if (node->and_or_command->left)
            {
                pipe(pp);
                std[1] = pp[1];
                pipe_execution(node->and_or_command->left, blt, line, std);
                close(pp[1]);
            }
            if (node->and_or_command->right)
            {
                std[0] = pp[0];
                std[1] = 1;
                dprintf(2, "right std %d\n", std[1]);
                pipe_execution(node->and_or_command->right, blt, line, std);
                close(std[0]);
            }
        } else if (node->kind == NODE_SIMPLE_COMMAND)
        {
            dprintf(2, "====%s=== and stdin %d stdout %d\n", node->simple_command->head->name, std[0], std[1]);
            execute_shell(blt, line->env, node, std);
        }
    }
    return (node);
}
int execute_shell(t_list *blt, t_list *env, t_node *node, int std[2])
{
    ...
    return (my_fork(path, env, cmds, std));
}
My implementation of the fork process:
int my_fork(char *path, t_list *env, char **cmds, int std[2])
{
    pid_t child;
    char **env_tab;
    int status;
    status = 0;
    env_tab = env_to_tab(env);
    child = fork();
    if (child > 0)
        waitpid(child, &status, 0);
    else if (child == 0)
    {
        dup2(std[0], 0);
        dup2(std[1], 1);
        execve(path, cmds, env_tab);
    }
    return (status);
}
I hope this code makes some sense.
Pipes require concurrent execution
The problem, as far as I can tell from the code snippets you provided, is that my_fork() is blocking. When you execute a process, your shell stops and waits for that process to finish before starting the next one. If you do something simple, like:
/bin/echo Hello | cat
Then the pipe's internal buffer is big enough to store the whole input string Hello. Once the /bin/echo process finishes, you execute cat, which can then read the buffered data from the pipe. However, once it gets more complicated, or when the first process sends a lot more data to the pipe, the pipe's internal buffer fills up and the writing process blocks. Since nothing is reading from the pipe yet, the writer never finishes and the shell waits forever.
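To get a feel for how small that internal buffer is, here is a minimal sketch that fills a pipe with non-blocking writes and reports how many bytes fit. The function name pipe_capacity() is made up for this illustration, and the number it returns depends on the system.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: fill a fresh pipe using non-blocking writes and
 * return how many bytes fit before the kernel reports it full, or -1 on
 * error. Nobody reads from the pipe, just like a shell that has not yet
 * started the right-hand command. */
long pipe_capacity(void)
{
    int fd[2];
    int flags;
    char chunk[512] = {0};
    long total = 0;
    long result;

    if (pipe(fd) == -1)
        return -1;
    flags = fcntl(fd[1], F_GETFL);
    if (flags == -1 || fcntl(fd[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    for (;;) {
        ssize_t n = write(fd[1], chunk, sizeof chunk);
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;
        break;
    }
    /* EAGAIN/EWOULDBLOCK means "full"; a blocking writer would stall here. */
    result = (errno == EAGAIN || errno == EWOULDBLOCK) ? total : -1;
    close(fd[0]);
    close(fd[1]);
    return result;
}
```

Any left-hand command that writes more than this amount before a reader exists will block in write().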
The solution is to defer calling waitpid() on the processes you fork until you have spawned all the processes that are part of the command line.
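As a rough sketch of that idea for a single pipe (not your AST code): both children are forked first, and waitpid() is only called once both are running. The names spawn_nowait() and run_two() are invented for this example, and returning 128 plus the signal number for a killed child is just the usual shell convention, assumed here.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that runs argv with in_fd/out_fd as
 * stdin/stdout. It returns the child's pid right away and does NOT wait. */
static pid_t spawn_nowait(char *const argv[], int in_fd, int out_fd,
                          const int pp[2])
{
    pid_t pid = fork();

    if (pid == 0) {
        if (in_fd != STDIN_FILENO)
            dup2(in_fd, STDIN_FILENO);
        if (out_fd != STDOUT_FILENO)
            dup2(out_fd, STDOUT_FILENO);
        close(pp[0]);           /* the child keeps only the duplicates */
        close(pp[1]);
        execvp(argv[0], argv);
        _exit(127);             /* reached only if execvp() failed */
    }
    return pid;
}

/* Run "left | right" and return the exit status of right, or -1 on error. */
int run_two(char *const left[], char *const right[])
{
    int pp[2];
    int status = 0;
    pid_t l, r;

    if (pipe(pp) == -1)
        return -1;
    l = spawn_nowait(left, STDIN_FILENO, pp[1], pp);
    r = spawn_nowait(right, pp[0], STDOUT_FILENO, pp);
    close(pp[0]);               /* the parent must not keep the pipe open */
    close(pp[1]);
    /* Only now, with both processes running, do we wait. */
    if (l > 0)
        waitpid(l, NULL, 0);
    if (r <= 0 || waitpid(r, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : 128 + WTERMSIG(status);
}
```

With a blocking my_fork(), a left-hand command that writes more than the pipe can hold would never return; here the reader is already draining the pipe while the writer runs.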
Create all required pipes before starting processes
Your function pipe_execution() assumes that there is only a single pipe; it starts the first process with file descriptor 0 as its input, and it starts the second process with file descriptor 1 as its output. However, if you have multiple pipes on a single command line, like in ls -la | cat -e | grep filename, then the output of the cat -e process needs to go into the second pipe, not to standard output.
You need to create the second pipe before starting the right-hand command of the first pipe. It's probably simplest to just create all the pipes before starting any of the commands. You could do this by defining multiple phases:
1. Create pipes
2. Start commands
3. Wait for all commands to finish
You can traverse the abstract syntax tree you built multiple times, each time executing one of the phases.
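The three phases could look roughly like this for a flat list of commands. This is a sketch under assumptions: it takes an array of argv vectors instead of walking your AST, and run_pipeline() and MAX_CMDS are names made up for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CMDS 16  /* arbitrary limit chosen for this sketch */

/* Run cmds[0] | cmds[1] | ... | cmds[n-1].
 * Returns the exit status of the last command, or -1 on error. */
int run_pipeline(char *const *cmds[], int n)
{
    int pipes[MAX_CMDS - 1][2];
    pid_t pids[MAX_CMDS];
    int i, j, st, last = -1;

    if (n < 1 || n > MAX_CMDS)
        return -1;

    /* Phase 1: create every pipe before any process exists. */
    for (i = 0; i < n - 1; i++) {
        if (pipe(pipes[i]) == -1) {
            while (i-- > 0) {
                close(pipes[i][0]);
                close(pipes[i][1]);
            }
            return -1;
        }
    }

    /* Phase 2: start every command without waiting for any of them. */
    for (i = 0; i < n; i++) {
        pids[i] = fork();
        if (pids[i] == 0) {
            if (i > 0)
                dup2(pipes[i - 1][0], STDIN_FILENO);
            if (i < n - 1)
                dup2(pipes[i][1], STDOUT_FILENO);
            for (j = 0; j < n - 1; j++) {   /* drop all original pipe fds */
                close(pipes[j][0]);
                close(pipes[j][1]);
            }
            execvp(cmds[i][0], cmds[i]);
            _exit(127);                     /* only reached if exec failed */
        }
    }
    for (j = 0; j < n - 1; j++) {           /* the parent lets go as well */
        close(pipes[j][0]);
        close(pipes[j][1]);
    }

    /* Phase 3: wait for all commands, keeping the last one's status. */
    for (i = 0; i < n; i++) {
        if (pids[i] > 0 && waitpid(pids[i], &st, 0) == pids[i] && i == n - 1)
            last = WIFEXITED(st) ? WEXITSTATUS(st) : 128 + WTERMSIG(st);
    }
    return last;
}
```

In a tree-based shell, each phase would be one traversal of the AST instead of one loop over an array.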
Here is a C program that looks up specific hardware properties, such as CPU bus info, by calling lshw (to list all hardware with its properties) followed by grep (to select just the relevant line from the lshw output):
char *strCombine(char *str1, char *str2, int n)
{
    int i = strlen(str2);
    int j = 0;
    if((str2 = (char *) realloc(str2, (i + n + 1))) == NULL)
        perror(0);
    while(j < n && str1[j])
    {
        str2[i] = str1[j];
        i++;
        j++;
    }
    str2[i] = 0;
    return (str2);
}
int main()
{
    pid_t parent;
    char buf[1000] = {0};
    char *str;
    char *argv[6] = {"/usr/bin/lshw", "-C", "CPU", "|", "grep", "bus info"};
    int fd[2];
    int ret;
    if(pipe(fd) == -1)
    {
        perror(NULL);
        return -1;
    }
    parent = fork();
    if(parent == 0)
    {
        close(fd[1]);
        while((ret = read(fd[0], buf, 1000)))
            str = strCombine(buf, str, ret);
        close(fd[0]);
    }
    else
    {
        close(fd[0]);
        execv(argv[0], argv);
        close(fd[1]);
        wait(0);
    }
    wait(0);
    printf("%s", str);
    return 0;
}
In this code, grep is expected to run after lshw, since both are supposed to be executed by the execv call. However, the pipeline doesn't work: the lshw usage message is printed in the terminal (running on Ubuntu 18.04 LTS) instead of the bus info I wanted. Why does this program fail to show just the relevant info, and how should I set up the pipeline?
The vertical bar is not an argument you can pass to separate commands: the execve(2) system call loads one program into the virtual address space of one process only, so lshw simply receives "|", "grep" and "bus info" as arguments it does not understand. You need to create two processes, one per command you want to execute, and connect them so that the output of one goes to the input of the other. I think you'll also be interested in the output of the last command, so you need two redirections (one from the first command to the second, and one from the output of the second command to a pipe descriptor), two forks, and two execs to do this.
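A minimal sketch of that manual approach, with two pipes, two forks and two execs, could look like the following. The function capture_pipeline() and its parameters are invented for this illustration, and error handling is kept to a minimum.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run "left | right" and copy what right prints into
 * buf (always NUL-terminated). Returns the number of bytes stored, or -1. */
ssize_t capture_pipeline(char *const left[], char *const right[],
                         char *buf, size_t cap)
{
    int a[2];                   /* left  -> right  */
    int b[2];                   /* right -> parent */
    pid_t p1, p2;
    size_t used = 0;
    ssize_t n;

    if (cap == 0 || pipe(a) == -1)
        return -1;
    if (pipe(b) == -1) {
        close(a[0]);
        close(a[1]);
        return -1;
    }
    p1 = fork();
    if (p1 == 0) {              /* first command: stdout goes into pipe a */
        dup2(a[1], STDOUT_FILENO);
        close(a[0]); close(a[1]); close(b[0]); close(b[1]);
        execvp(left[0], left);
        _exit(127);
    }
    p2 = fork();
    if (p2 == 0) {              /* second command: reads a, writes b */
        dup2(a[0], STDIN_FILENO);
        dup2(b[1], STDOUT_FILENO);
        close(a[0]); close(a[1]); close(b[0]); close(b[1]);
        execvp(right[0], right);
        _exit(127);
    }
    /* The parent keeps only the read end of b. */
    close(a[0]); close(a[1]); close(b[1]);
    while (used < cap - 1 &&
           (n = read(b[0], buf + used, cap - 1 - used)) > 0)
        used += (size_t)n;
    buf[used] = '\0';
    close(b[0]);
    if (p1 > 0)
        waitpid(p1, NULL, 0);
    if (p2 > 0)
        waitpid(p2, NULL, 0);
    return (ssize_t)used;
}
```

For the question's case, left would be {"/usr/bin/lshw", "-C", "CPU", NULL} and right would be {"grep", "bus info", NULL}; note that the "|" never appears in either argument vector.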
First the good news: you can do all of this with a single call to popen(3), without the nitty-gritty of forking and exec'ing while redirecting I/O for the individual commands. Just use this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
    char *cmd = "/usr/bin/lshw -C CPU | grep 'bus info'";
    int n = 0;
    char line[1000];
    /* f will be associated to the output of the pipeline, so you can read from it.
     * this is stated by the "r" of the second parameter */
    FILE *f = popen(cmd, "r");
    if (!f) {
        perror(cmd);
        exit(EXIT_FAILURE);
    }
    /* I read, line by line, and process the output,
     * printing each line with some format string, but
     * you are free here. */
    while (fgets(line, sizeof line, f)) {
        char *l = strtok(line, "\n");
        if (!l) continue;
        printf("line %d: [%s]\n", ++n, l);
    }
    /* once finished, you need to pclose(3) it. This
     * makes program to wait(2) for child to finish and
     * closing descriptor */
    pclose(f);
}
If you need to set up such a pipeline yourself, you'll end up having to do two
redirections (from the first command to the second, and from the second to the
parent process) and fork/exec both processes yourself.
With the popen(3) approach, a subshell does the piping
and redirection work for you, and you just get a FILE * to read from.
(if I find some time, I'll show you a full example of a chain of N commands with redirections to pipe them, but I cannot promise, as I have to write the code)
NOTE
fork() returns the pid of the child process to the parent, and 0 to the child process itself. I don't understand why you have a variable named parent in which you store the value returned by fork(). If it is nonzero (and non-negative), it is the pid of a child process. You need two of them, as you need two processes. In the example I posted, three processes are created (you ask a subshell to set up the pipeline for you, and that subshell creates two more processes to execute your commands). If you had to set all of this up yourself, you'd also have to wait(2) for the children to finish (this is done in the pclose(3) call).
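To make the three possible results of fork() concrete, here is a tiny sketch. The function child_exit_code() is made up for this note: the child exits at once with the given code and the parent reports it.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: fork a child that exits with `code` and return the
 * status the parent collects, or -1 on error. */
int child_exit_code(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)            /* negative: fork failed, no child was created */
        return -1;
    if (pid == 0)           /* zero: this is the child */
        _exit(code);
    /* positive: this is the parent, and pid identifies the child */
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A name such as child or pid fits the variable better than parent, since in the parent it holds the child's pid.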
I have a little program that spawns a process (only one) repeatedly, overprinting its output in the same place. I use it as a kind of htop when I want to watch e.g. the output of ls -l (showing a file growing as it is being filled) or the output of the df command. It starts the program, makes one fork, redirects the child's output to a pipe, and reads the output of the command to count the number of lines printed (so it can emit an escape sequence that puts the cursor back on top of the listing, and a clear-to-end-of-line after each output line, so shorter lines don't get blurred by longer ones). It shows you how to deal with the fork and exec system calls, and you can use it as an example of how to do things the brave way. But I think popen(3) is the solution to your problem. If you want to have a look at my cont program, just find it here.
I am currently writing my own shell implementation in C. I understood the principle behind piping and redirecting the fds. However, some specific behavior with pipes has attracted my attention:
cat | ls (or any command that does not read from stdin as final element of the pipe).
In that case, what happens in the shell is that ls executes and cat asks for a single line before exiting (resulting from a SIGPIPE I guess). I have tried to follow this tutorial to better understand the principle behind multiple pipes: http://web.cse.ohio-state.edu/~mamrak.1/CIS762/pipes_lab_notes.html
Below is some code I have written to try to replicate the behavior I am looking for:
char *cmd1[] = {"/bin/cat", NULL};
char *cmd2[] = {"/bin/ls", NULL};
int pdes[2];
pid_t child;
if (!(child = fork()))
{
    pipe(pdes);
    if (!fork())
    {
        close(pdes[0]);
        dup2(pdes[1], STDOUT_FILENO);
        /* cat command gets executed here */
        execvp(cmd1[0], cmd1);
    }
    else
    {
        close(pdes[1]);
        dup2(pdes[0], STDIN_FILENO);
        /* ls command gets executed here */
        execvp(cmd2[0], cmd2);
    }
}
wait(NULL);
I am aware of the security flaws of that implementation, but this is just for testing. The problem with that code, as I understand it, is that whenever ls gets executed, it just exits and then cat somehow keeps running in the background (and in my case fails, because it tries to read while the zsh prompt is active once my program has exited). I cannot find a way to make it behave as it should, because if I wait for the commands one by one, commands such as cat /dev/random | head -c 10 would run forever...
If anyone has a solution for this issue or at least some guidance it would be greatly appreciated.
After consideration of comments from #thatotherguy here is the solution I found as implemented in my code. Please bear in mind that pipe and fork calls should be checked for errors but this version is meant to be as simple as possible. Extra exit calls are also necessary for some of my built-in commands.
void exec_pipe(t_ast *tree, t_sh *sh)
{
    int pdes[2];
    int status;
    pid_t child_right;
    pid_t child_left;
    pipe(pdes);
    if (!(child_left = fork()))
    {
        close(pdes[READ_END]);
        dup2(pdes[WRITE_END], STDOUT_FILENO);
        /* The duplicate on stdout is enough; drop the original descriptor */
        close(pdes[WRITE_END]);
        /* Execute command to the left of the tree */
        exit(execute_cmd(tree->left, sh));
    }
    if (!(child_right = fork()))
    {
        close(pdes[WRITE_END]);
        dup2(pdes[READ_END], STDIN_FILENO);
        close(pdes[READ_END]);
        /* Recursive call or execution of last command */
        if (tree->right->type == PIPE_NODE)
            exec_pipe(tree->right, sh);
        else
            exit(execute_cmd(tree->right, sh));
    }
    /* Should not forget to close both ends of the pipe */
    close(pdes[WRITE_END]);
    close(pdes[READ_END]);
    /* Reap each child by pid: a bare wait(NULL) could collect child_right
     * first, and the waitpid() below would then fail and leave status unset */
    waitpid(child_left, NULL, 0);
    waitpid(child_right, &status, 0);
    exit(get_status(status));
}
I was confused by the original link I posted and the different ways to handle chained pipes. From the link to the POSIX documentation posted below my original question (http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_09_02) it appears that:
If the pipeline is not in the background (see Asynchronous Lists), the shell shall wait for the last command specified in the pipeline to complete, and may also wait for all commands to complete.
Both behaviors are therefore accepted: waiting for the last command only, or waiting for all of them. I chose to implement the second behavior to stick to what bash/zsh would do.
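For the "wait for all of them" behavior, one possible sketch is a loop that reaps every child and remembers only the status of the last command in the pipeline. Both wait_all() and spawn_exit() are names made up for this example; spawn_exit() exists only to give the loop something to reap.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a real command: a child that exits with `code`. */
pid_t spawn_exit(int code)
{
    pid_t pid = fork();

    if (pid == 0)
        _exit(code);
    return pid;
}

/* Reap every child of the calling process. Returns the exit status of the
 * child whose pid is `last` (the final command of the pipeline), or -1 if
 * that child was never seen. */
int wait_all(pid_t last)
{
    int status;
    int result = -1;
    pid_t pid;

    for (;;) {
        pid = wait(&status);
        if (pid == -1) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, try again */
            break;              /* ECHILD: no children are left */
        }
        if (pid == last)
            result = WIFEXITED(status) ? WEXITSTATUS(status)
                                       : 128 + WTERMSIG(status);
    }
    return result;
}
```

The pipeline's status is then the status of its last command, while no child is left behind as a zombie.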
To build a shell command interpreter, I am trying to execute pipes.
To do so, I use a recursive function in which I call pipe() and set up some redirections with dup2.
Here is my code:
void test_recurs(pid_t pid, char **ae)
{
    char *const arg[2] = {"/bin/ls", NULL};
    char *const arg2[3] = {"/bin/wc", NULL};
    static int limit = 0;
    int check;
    int fd[2];
    if (limit > 5)
        return ;
    if (pipe(fd) == -1)
    {
        printf("pipe failed\n");
        return ;
    }
    pid = fork();
    if(pid != 0)
    {
        printf("père %d\n",getpid());
        close(fd[0]);
        dup2(fd[1], 1);
        close(fd[1]);
        if ((execve("/bin/ls", arg, ae)) == -1)
            exit(125);
        dprintf(2, "execution ls\n");
        wait(&check);
    }
    else
    {
        printf("fils %d\n", getpid());
        close(fd[1]);
        dup2(fd[0], 0);
        close(fd[0]);
        if ((execve("/bin/wc", arg2, ae)) == -1)
            printf("echec execve\n");;
        dprintf(2, "limit[%d]\n", limit);
        limit++;
        test_recurs(pid, ae);
    }
}
The problem is that it only executes "ls | wc" one time and then waits on the standard input. I know that the problem may come from the pipes (and the redirections).
It's a bit unclear how you are trying to use the function you present, but here are some notable points about it:
It's poor form to rely on a static variable to limit recursion depth because it's not thread-safe and because you need to do extra work to manage it (for example, to ensure that any changes are backed out when the function returns). Use a function parameter instead.
As has been observed in comments, the exec-family functions return only on failure. Although you acknowledge that, I'm not sure you appreciate the consequences, for both branches of your fork contain code that will never be executed as a result. The recursive call in particular is dead and will never be executed.
Moreover, the process in which the function is called performs an execve() call itself. The reason that function does not return is that it replaces the process image with that of the new process. That means that function test_recurs() also does not return.
Just as a shell ordinarily must fork / exec to launch a single external command, it ordinarily must fork / exec for each command in a pipeline. If it fails to do so then afterward it is no longer running -- whatever it exec'ed without forking runs instead.
The problem is that it only executes "ls | wc" one time and then waits on the standard input.
Certainly it does not recurse, because the recursive call is in a section of dead code. I suspect you are mistaken in your claim that it afterward waits on standard input, because the process that calls that function execs /bin/ls, which does not read from standard input. When the ls exits, however, leaving you with neither shell nor ls, what you then see might seem to be a wait on stdin.
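To illustrate two of the points above (pass the recursion limit as a parameter, and fork before every exec so that the caller survives), here is a small sketch. The function run_repeated() is made up for the example and runs a plain command rather than a pipeline.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical example: run argv `remaining` times, one fork + exec per run.
 * The remaining depth is a parameter, not a static variable.
 * Returns how many of the runs exited with status 0. */
int run_repeated(char *const argv[], int remaining)
{
    int status;
    pid_t pid;

    if (remaining <= 0)
        return 0;
    pid = fork();
    if (pid == -1)
        return 0;
    if (pid == 0) {
        execvp(argv[0], argv);
        /* execvp() returns only on failure, so this line is not dead code,
         * but nothing after a successful exec would ever run. */
        _exit(127);
    }
    /* Only the child was replaced; the parent is still here and can recurse. */
    if (waitpid(pid, &status, 0) == -1)
        return 0;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0 ? 1 : 0)
           + run_repeated(argv, remaining - 1);
}
```

Because the exec happens in a forked child, the recursive call in the parent is actually reached.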
//Executing shell command ls -l | sort
int *pipeIN, *pipeOUT;
int runLS(){
    char* parmListLS[] = { "ls", "-l", NULL };
    int pid = fork();
    if(pid==0){
        close(*pipeIN);
        dup2(*pipeOUT, STDOUT_FILENO);
        execvp(parmListLS[0], parmListLS);
    }else return pid;
}
int runSORT(){
    char* parmListSORT[] = { "sort", NULL };
    int pid = fork();
    if(pid==0){
        close(*pipeOUT);
        dup2(*pipeIN, STDIN_FILENO);
        execvp(parmListSORT[0], parmListSORT);
    }else return pid;
}
int main(void){
    int pidLS, pidSort, pipeId[2];
    pipeIN = &pipeId[0], pipeOUT = &pipeId[1];
    pipe(pipeId); //open pipes
    pidLS = runLS();
    pidSort = runSORT();
    printf("PIDS: LS -> %d, Sort -> %d\n", pidLS, pidSort);
    printf("Terminated: %d\n", wait(NULL)); //return pid of 1st exited proc
    printf("Terminated: %d\n", wait(NULL)); //return pid of 2nd exited proc
    printf("Terminated Main Proccess!\n");
}
Hello! I'm having trouble making some simple pipes work on Linux.
I'm trying to emulate the shell command ls -l | sort.
As you can see, I'm starting two child processes,
one for ls -l and the other for sort.
Those processes run independently, and I'm trying to redirect the output
of ls -l to the input of sort via a pipe. The idea is that the main program
should wait for both of them to terminate. However, wait seems to work
fine for ls -l, but it hangs and never returns for sort, so the whole program
hangs. If I remove the two wait lines from the program, the ls -l -> sort
pipe works perfectly, and of course main terminates before the children do.
I suspect this is because sort keeps waiting for input
even after ls -l has terminated. I don't understand how the termination of
the parent affects the termination of both child processes.
Would anyone please be so kind as to explain what actually happens?
Thank you very much. P.S. I left out most error checking for clarity and brevity.
The sort can't terminate because the parent process still has the write end (and the read end) of the pipe open, so the sort hasn't reached EOF on the pipe, so it hasn't got all the data, so it can't write any output.
Add:
close(pipeId[0]);
close(pipeId[1]);
after the calls to runLS() and runSORT() and things work as expected.
Ideally, the children should close both ends of the pipe once they've duplicated the end they need. If you duplicate a pipe file descriptor to a standard file stream, you should almost always close both ends of the pipe. In this code, you get away without doing it; that won't always be the case.
Personally, I don't think the two global variables help much. The pipe end labelled pipeIN is the input to sort but the output from ls, for example.
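The EOF rule behind this can be seen in isolation with a sketch like the one below, where the parent plays the role of sort and reads until end of file. The function read_until_eof() is invented for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: a child writes len bytes of msg into a pipe and the
 * parent reads until end of file. Returns the byte count, or -1 on error. */
long read_until_eof(const char *msg, size_t len)
{
    int fd[2];
    char buf[256];
    long total = 0;
    ssize_t n;
    pid_t pid;

    if (pipe(fd) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {                     /* child: the writer */
        close(fd[0]);
        if (write(fd[1], msg, len) != (ssize_t)len)
            _exit(1);
        close(fd[1]);
        _exit(0);
    }
    /* Parent: the reader. read() returns 0 (EOF) only once EVERY copy of
     * the write end is closed, including the parent's own. Without this
     * close() the loop below would block forever, just like sort does. */
    close(fd[1]);
    while ((n = read(fd[0], buf, sizeof buf)) > 0)
        total += n;
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return n == 0 ? total : -1;
}
```

In the ls -l | sort program the reader is a different process, but the rule is the same: the parent's open write end keeps sort from ever seeing EOF.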
I am trying to implement multiple pipes in C, to run several commands the way a shell does.
I have made a linked list (called t_launch in my code) which looks like this if you type "ls | grep src | wc":
wc -- PIPE -- grep src -- PIPE -- ls
Every PIPE node contains an int tab[2] filled in by the pipe() function (of course, there has been one pipe() call for each PIPE node).
Now I am trying to execute these commands:
int execute_launch_list(t_shell *shell, t_launch *launchs)
{
    pid_t pid;
    int status;
    int firstpid;
    firstpid = 0;
    while (launchs != NULL)
    {
        if ((pid = fork()) == -1)
            return (my_error("Unable to fork\n"));
        if (pid == 0)
        {
            if (launchs->prev != NULL)
            {
                close(1);
                dup2(launchs->prev->pipefd[1], 1);
                close(launchs->prev->pipefd[0]);
            }
            if (launchs->next != NULL)
            {
                close(0);
                dup2(launchs->next->pipefd[0], 0);
                close(launchs->next->pipefd[1]);
            }
            execve(launchs->cmdpath, launchs->words, shell->environ);
        }
        else if (firstpid == 0)
            firstpid = pid;
        launchs = launchs->next == NULL ? launchs->next : launchs->next->next;
    }
    waitpid(firstpid, &status, 0);
    return (SUCCESS);
}
But that doesn't work: it looks like the commands never stop reading.
For example, if I type "ls | grep src", "src" is printed by the grep command, but grep keeps reading and never stops. If I type "ls | grep src | wc", nothing is printed. What's wrong with my code?
Thank you.
If I understand your code correctly, you first call pipe in the shell process for every PIPE. You then proceed to fork each process.
While you do close the unused end of each of the child's pipes in the child process, this procedure suffers from two problems:
Every child has every pipe, and doesn't close the ones which don't belong to it
The parent (shell) process has all the pipes open.
Consequently, all the pipes are open, and the children don't get EOFs.
By the way, you need to wait() for all the children, not just the last one. Consider the case where the first child does some long computation after closing stdout. And remember that any computation or side effect after stdout is closed, even a short one, could be sequenced after the sink process terminates, since multiprocessing is essentially non-deterministic.
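One way to avoid both problems is to create each pipe only when it is needed, so that a child never inherits pipes that don't belong to it and the parent closes each end as soon as it has been handed over. The sketch below assumes the commands are given in pipeline order as an array of argv vectors rather than your t_launch list; run_chain() is a made-up name.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run cmds[0] | ... | cmds[n-1], creating one pipe at a time.
 * Returns the exit status of the last command, or -1 on error. */
int run_chain(char *const *cmds[], int n)
{
    int prev_read = -1;     /* read end feeding the next command, if any */
    int pp[2];
    int i, status, last_status = -1;
    pid_t pid, last_pid = -1;

    for (i = 0; i < n; i++) {
        int has_next = (i < n - 1);

        if (has_next && pipe(pp) == -1)
            break;
        pid = fork();
        if (pid == 0) {
            if (prev_read != -1) {
                dup2(prev_read, STDIN_FILENO);
                close(prev_read);
            }
            if (has_next) {
                close(pp[0]);           /* belongs to the next command */
                dup2(pp[1], STDOUT_FILENO);
                close(pp[1]);
            }
            execvp(cmds[i][0], cmds[i]);
            _exit(127);
        }
        /* Parent: release what this child has taken over. */
        if (prev_read != -1)
            close(prev_read);
        prev_read = -1;
        if (has_next) {
            close(pp[1]);
            prev_read = pp[0];
        } else if (pid > 0) {
            last_pid = pid;
        }
    }
    if (prev_read != -1)
        close(prev_read);

    /* Wait for ALL children, remembering the last command's status. */
    for (;;) {
        pid = wait(&status);
        if (pid == -1) {
            if (errno == EINTR)
                continue;
            break;
        }
        if (pid == last_pid)
            last_status = WIFEXITED(status) ? WEXITSTATUS(status)
                                            : 128 + WTERMSIG(status);
    }
    return last_status;
}
```

At any moment the parent holds at most one pipe end, so every reader sees EOF as soon as its writer exits.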