Here is a C program that finds specific hardware properties, such as the CPU bus info, by chaining calls to lshw (to list the hardware with its properties) and grep (to select only the relevant line from lshw's output):
char *strCombine(char *str1, char *str2, int n)
{
    int i = strlen(str2);
    int j = 0;

    if ((str2 = (char *) realloc(str2, (i + n + 1))) == NULL)
        perror(0);
    while (j < n && str1[j])
    {
        str2[i] = str1[j];
        i++;
        j++;
    }
    str2[i] = 0;
    return (str2);
}
int main()
{
    pid_t parent;
    char buf[1000] = {0};
    char *str;
    char *argv[6] = {"/usr/bin/lshw", "-C", "CPU", "|", "grep", "bus info"};
    int fd[2];
    int ret;

    if (pipe(fd) == -1)
    {
        perror(NULL);
        return -1;
    }
    parent = fork();
    if (parent == 0)
    {
        close(fd[1]);
        while ((ret = read(fd[0], buf, 1000)))
            str = strCombine(buf, str, ret);
        close(fd[0]);
    }
    else
    {
        close(fd[0]);
        execv(argv[0], argv);
        close(fd[1]);
        wait(0);
    }
    wait(0);
    printf("%s", str);
    return 0;
}
In this code, grep is expected to run after lshw, since both are passed to execv. However, the pipeline does not work: lshw prints its usage summary to the terminal (Ubuntu 18.04 LTS) instead of the bus info I wanted. Why does this program fail to show only the relevant info, and how should I set up the pipeline?
The vertical bar is not a parameter you can use to separate commands: the execve(2) system call loads one program into the address space of a single process. You need to create two processes, one per command you want to execute, and connect them so the output of one becomes the input of the other. Since you are also interested in the output of the last command, you need two redirections (one from the first command into the second, and one from the output of the second command to a pipe descriptor the parent can read), two forks, and two execs.
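A minimal sketch of that two-fork, two-redirection structure might look like this (command paths taken from your question; error handling kept to a bare minimum, and fork failures not checked):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    int p1[2], p2[2];                 /* p1: lshw -> grep, p2: grep -> parent */

    if (pipe(p1) == -1 || pipe(p2) == -1) {
        perror("pipe");
        exit(EXIT_FAILURE);
    }
    if (fork() == 0) {                /* first child: lshw */
        dup2(p1[1], STDOUT_FILENO);   /* stdout goes into the first pipe */
        close(p1[0]); close(p1[1]);
        close(p2[0]); close(p2[1]);
        execl("/usr/bin/lshw", "lshw", "-C", "CPU", (char *)NULL);
        perror("execl lshw");
        _exit(127);
    }
    if (fork() == 0) {                /* second child: grep */
        dup2(p1[0], STDIN_FILENO);    /* stdin comes from the first pipe */
        dup2(p2[1], STDOUT_FILENO);   /* stdout goes into the second pipe */
        close(p1[0]); close(p1[1]);
        close(p2[0]); close(p2[1]);
        execl("/bin/grep", "grep", "bus info", (char *)NULL);
        perror("execl grep");
        _exit(127);
    }
    /* parent: close unused ends so the children see EOF properly */
    close(p1[0]); close(p1[1]);
    close(p2[1]);

    char buf[1024];
    ssize_t n;
    while ((n = read(p2[0], buf, sizeof buf)) > 0)
        fwrite(buf, 1, (size_t)n, stdout);
    close(p2[0]);
    while (wait(NULL) > 0)            /* reap both children */
        ;
    return 0;
}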
Now the good news: you can do all of this with a single call to popen(3), without the nitty-gritty of forking and execing while redirecting I/O for the individual commands. Just use this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    char *cmd = "/usr/bin/lshw -C CPU | grep 'bus info'";
    int n = 0;
    char line[1000];

    /* f will be associated with the output of the pipeline, so you can
     * read from it; this is what the "r" second parameter requests. */
    FILE *f = popen(cmd, "r");
    if (!f) {
        perror(cmd);
        exit(EXIT_FAILURE);
    }
    /* Read line by line and process the output; here each line is
     * printed with a format string, but you are free to do anything. */
    while (fgets(line, sizeof line, f)) {
        char *l = strtok(line, "\n");
        if (!l) continue;
        printf("line %d: [%s]\n", ++n, l);
    }
    /* Once finished, you need to pclose(3) the stream. This makes the
     * program wait(2) for the child to finish and closes the descriptor. */
    pclose(f);
}
If you had to build such a pipeline yourself, you would end up doing two redirections (from the first command to the second, and from the second to the parent process) and forking/execing both processes on your own. With the popen approach, a subshell does the piping and redirection work for you, and you just get a FILE * to read from.
(if I find some time, I'll show you a full example of a chain of N commands with redirections to pipe them, but I cannot promise, as I have to write the code)
NOTE
fork() returns the pid of the child process to the parent, and 0 to the child process itself, so I don't understand why you have a variable named parent where you store the value received from fork(): if it is nonzero (and non-negative), it is the pid of a child process. You need two forks, because you need two processes. In the example I posted you actually create three processes: you ask a subshell to mount the pipeline for you, and that subshell creates two more processes to execute your commands. If you mounted all this paraphernalia yourself, you would also have to wait(2) for the children to finish (with popen, this is done in the pclose(3) call).
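In other words, the conventional shape is the following (a sketch; the variable is usually named pid or child, not parent):

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

void fork_pattern(void)
{
    pid_t pid = fork();
    if (pid < 0) {            /* fork failed: no child was created */
        perror("fork");
    } else if (pid == 0) {    /* child: fork() returned 0 */
        /* ... exec the command here ... */
        _exit(0);
    } else {                  /* parent: pid is the child's process id */
        waitpid(pid, NULL, 0);
    }
}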
I have a little program that spawns a process (only one) repeatedly, while overprinting its output in the same place. I use it as a kind of htop when I want to watch, e.g., the output of ls -l (showing a file grow as it is being filled) or the output of the df command. It starts the program, forks once, redirects the child's output to a pipe, and reads that output to count the lines emitted (so it can emit an escape sequence to put the cursor back on top of the listing, and a clear-to-end-of-line after each output line, so shorter lines don't get blurred by longer ones). It shows how to deal with fork and the exec system calls, and you can use it as an example of how to do things the brave way. But popen(3), I think, is the solution to your problem. If you want to have a look at my cont program, just find it here.
Related
So my program needs to pipe multiple processes and read the number of bytes each process writes.
The way I implemented it, in a for loop, there are two children per iteration:
Child 1: dups the output and executes the process
Child 2: reads the output and writes it on for the next input
Currently child 1 executes the process and child 2 reads its output, but child 2 doesn't seem to write it to the right place, because on the second loop iteration the output is printed to the screen and the program blocks.
for (int i = 0; i < processes; i++) {
    int result = socketpair(PF_LOCAL, SOCK_STREAM, 0, apipe[i]);
    if (result == -1) {
        error_and_exit();
    }
    int pid;
    int pid2;
    pid = fork_or_die();
    // child one points to STDOUT
    if (pid == FORK_CHILD) {
        if (dup2(apipe[i][1], STDOUT_FILENO) == -1)
            error_and_exit();
        if (close(apipe[i][1]) == -1)
            error_and_exit();
        if (close(apipe[i][0]) == -1)
            error_and_exit();
        if (execlp("/bin/sh", "sh", "-c", tabCommande[i], (char *)NULL) == -1)
            error_and_exit();
    }
    pid2 = fork_or_die();
    // CHILD 2 reads the output and writes it for the next command to use
    if (pid2 == FORK_CHILD) {
        FILE *fp;
        fp = fopen("count", "a");
        close(apipe[i][1]);
        int count = 0;
        char str[4096];
        count = read(apipe[i][0], str, sizeof(str)+1);
        close(apipe[i][0]);
        write(STDIN_FILENO, str, count);
        fprintf(fp, "%d : %d \n ", i, count);
        fclose(fp);
    }
}
Your second child does write(STDIN_FILENO, …); that's not a conventional way of using standard input.
If standard input is a terminal, the device is usually opened for reading and writing, and the three standard I/O channels are then created using dup() or dup2(). Thus you can read from the output channels and write to the input channel, but only when the streams are connected to a login terminal (window). If the input is a pipe, you can't successfully write to it, nor can you read from the output if it is a pipe (similarly if the input is redirected from a file or the output is redirected to a file). This terminal setup is done by the process that creates the terminal window; it is background information explaining why writing to standard input appears on the terminal.
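As an illustration of that setup, the terminal-creating process does something morally equivalent to the following (a sketch; the device name is just an invented example):

#include <fcntl.h>
#include <unistd.h>

/* Sketch: how a terminal emulator might wire up a shell's standard
 * channels. "/dev/pts/3" is a made-up example name. */
void wire_stdio_to_terminal(void)
{
    int fd = open("/dev/pts/3", O_RDWR);  /* one descriptor, read-write */
    dup2(fd, 0);   /* standard input  */
    dup2(fd, 1);   /* standard output */
    dup2(fd, 2);   /* standard error  */
    if (fd > 2)
        close(fd);
}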
Anyway, that's what you're doing. You are writing to the terminal via standard input. Your minimum necessary change is to replace STDIN_FILENO with STDOUT_FILENO.
You are also going to need a loop around the reading and writing code. In general, processes generate lots of output in small chunks. The close on the input pipe will be outside the loop, of course, not between the read() and write() operations. You should check that the write() operations write all the data to the output.
You should also have the second child exit after it closes the output file. In this code, I'd probably open the file after the counting loop (or what will become the counting loop), but that's mostly a stylistic change, keeping the scope of variables to a minimum.
You will probably eventually need to handle signals like SIGPIPE (or ignore it so that the output functions return errors when the pipe is closed early). However, that's a refinement for when you have the basic code working.
Bug: you have:
count = read(apipe[i][0], str, sizeof(str)+1);
This is a request to the o/s to give you a buffer overflow — you ask it to write more data into str than str can hold. Remove the +1!
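That is, the request should be for at most sizeof(str) bytes:

count = read(apipe[i][0], str, sizeof(str));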
Minor note: you don’t need to check the return value from execlp() or any of that family of functions. If the call succeeds, it doesn’t return; if it returns, it failed. Your code is correct to exit after the call to execlp(), though; that's good.
You said:
I replaced STDIN_FILENO with STDOUT_FILENO in the second child, but it doesn't seem to solve the issue. The output is still shown in the terminal, and there's a pipe blockage afterwards.
That observation may well be correct, but it isn't something that can be resolved by studying this code alone. The change to write to an output stream is necessary — and in the absence of any alternative information, writing to STDOUT_FILENO is better than writing to STDIN_FILENO.
That is a necessary change, but it is probably not a sufficient change. There are other changes needed too.
Did you set up the inputs and outputs for the pair of children this code creates correctly? It is very hard to know from the code shown, but given that it is not working as you intended, it's a reasonable inference that you did not get all the plumbing correct. You need to draw a diagram of how the processes are intended to operate in the larger context. At a minimum, you need to know where the standard input of each process comes from and where its standard output goes. Sometimes you need to worry about standard error too; most likely, in this case, you can quietly ignore it.
This is what I think your code could look like — though the comments in it describe numerous possible variants.
#include <sys/socket.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

/* The code needs these declarations and definition to compile */
extern _Noreturn void error_and_exit(void);
extern pid_t fork_or_die(void);
extern void unknown_function(void);
static ssize_t copy_bytes(int fd1, int fd2);

#define FORK_CHILD 0

int processes;
int apipe[20][2];
char *tabCommande[21];

void unknown_function(void)
{
    for (int i = 0; i < processes; i++)
    {
        int result = socketpair(PF_LOCAL, SOCK_STREAM, 0, apipe[i]);
        if (result == -1)
            error_and_exit();
        int pid1 = fork_or_die();
        // child one points to STDOUT
        if (pid1 == FORK_CHILD)
        {
            if (dup2(apipe[i][1], STDOUT_FILENO) == -1)
                error_and_exit();
            if (close(apipe[i][1]) == -1)
                error_and_exit();
            if (close(apipe[i][0]) == -1)
                error_and_exit();
            execlp("/bin/sh", "sh", "-c", tabCommande[i], (char *)NULL);
            error_and_exit();
        }
        // CHILD 2 reads the output and writes it for the next command to use
        int pid2 = fork_or_die();
        if (pid2 == FORK_CHILD)
        {
            close(apipe[i][1]);
            ssize_t count = copy_bytes(apipe[i][0], STDOUT_FILENO);
            FILE *fp = fopen("count", "a");
            if (fp == NULL)
                error_and_exit();
            /*
            ** Using %zd for ssize_t is a reasonable guess at a format to
            ** print ssize_t - but it is a guess. Alternatively, change the
            ** type of count to long long and use %lld. There isn't a
            ** documented, official (fully standardized by POSIX) conversion
            ** specifier for ssize_t AFAIK.
            */
            fprintf(fp, "%d : %zd\n ", i, count);
            fclose(fp);
            exit(EXIT_SUCCESS);
        }
        /*
        ** This is crucial - the parent has all the pipes open, and the
        ** child processes won't get EOF until the parent closes the
        ** write ends of the pipes, and they won't get EOF on the inputs
        ** until the parent closes the read ends of the pipes.
        **
        ** It could be avoided if the first child creates the pipe or
        ** socketpair and then creates the second child as a grandchild
        ** of the main process. That also alters the process structure
        ** and reduces the number of processes that the original parent
        ** process has to wait for. If the first child creates the
        ** pipe, then the apipe array of arrays becomes unnecessary;
        ** you can have a simple int apipe[2]; array that's local to the
        ** two processes. However, you may need the array of arrays so
        ** that you can chain the output of one process (pair of
        ** processes) to the input of the next.
        */
        close(apipe[i][0]);
        close(apipe[i][1]);
    }
}

static ssize_t copy_bytes(int fd1, int fd2)
{
    ssize_t tbytes = 0;
    ssize_t rbytes;
    char buffer[4096];

    while ((rbytes = read(fd1, buffer, sizeof(buffer))) > 0)
    {
        ssize_t wbytes = write(fd2, buffer, rbytes);
        if (wbytes != rbytes)
        {
            /*
            ** There are many possible ways to deal with this. If
            ** wbytes is negative, then the write failed, presumably
            ** irrecoverably. The code could break the loop, reporting
            ** how many bytes were written successfully to the output.
            ** If wbytes is zero (pretty improbable), it isn't clear
            ** what happened. If wbytes is positive, then you could add
            ** the current value to tbytes and try to write the rest in
            ** a loop until everything has been written or an error
            ** occurs. You pays your money and takes your pick.
            */
            error_and_exit();
        }
        tbytes += wbytes;
    }
    if (tbytes == 0 && rbytes < 0)
        tbytes = rbytes;
    return tbytes;
}
You could add #include <signal.h> and signal(SIGPIPE, SIG_IGN); to the code in the second child.
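For example (a sketch of where that call would go in the second child):

#include <signal.h>

static void ignore_sigpipe(void)
{
    /* In the second child, before the copying starts: ignoring SIGPIPE
     * makes write() fail with EPIPE instead of killing the process. */
    signal(SIGPIPE, SIG_IGN);
}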
I've managed to make an abstract syntax tree for my minishell; the thing is, when I tried to execute piped commands I got stuck.
The first pipe executes and writes its result to stdout (fd 1), while the second one, grep filename, either hangs or is not executed at all.
I tried different approaches and got different results, yet none of them works.
I would appreciate any help.
This is the command my AST is built from:
ls -la | cat -e | grep filename
t_node *pipe_execution(t_node *node, t_list *blt, t_line *line, int std[2])
{
    int pp[2];

    if (node)
    {
        if (node->kind == NODE_PIPE)
        {
            if (node->and_or_command->left)
            {
                pipe(pp);
                std[1] = pp[1];
                pipe_execution(node->and_or_command->left, blt, line, std);
                close(pp[1]);
            }
            if (node->and_or_command->right)
            {
                std[0] = pp[0];
                std[1] = 1;
                dprintf(2, "right std %d\n", std[1]);
                pipe_execution(node->and_or_command->right, blt, line, std);
                close(std[0]);
            }
        }
        else if (node->kind == NODE_SIMPLE_COMMAND)
        {
            dprintf(2, "====%s=== and stdin %d stdout %d\n",
                    node->simple_command->head->name, std[0], std[1]);
            execute_shell(blt, line->env, node, std);
        }
    }
    return (node);
}
int execute_shell(t_list *blt, t_list *env, t_node *node, int std[2])
{
    ...
    return (my_fork(path, env, cmds, std));
}
My implementation of the fork process:
int my_fork(char *path, t_list *env, char **cmds, int std[2])
{
    pid_t child;
    char **env_tab;
    int status;

    status = 0;
    env_tab = env_to_tab(env);
    child = fork();
    if (child > 0)
        waitpid(child, &status, 0);
    else if (child == 0)
    {
        dup2(std[0], 0);
        dup2(std[1], 1);
        execve(path, cmds, env_tab);
    }
    return (status);
}
I hope this code makes some sense.
Pipes require concurrent execution
The problem, as far as I can tell from the code snippets you provided, is that my_fork() is blocking: when you execute a process, your shell stops and waits for that process to finish before starting the next one. If you do something simple, like:
/bin/echo Hello | cat
Then the pipe's internal buffer is big enough to store the whole input string Hello. Once the /bin/echo process finishes, you execute cat, which can then read the buffered data from the pipe. However, once it gets more complicated, or when the first process sends a lot more data into the pipe, the pipe's internal buffer will fill up, and the writer will block.
The solution is to defer calling waitpid() on the processes you fork until you have spawned all the processes that are part of the command line.
Create all required pipes before starting processes
Your function pipe_execution() assumes that there is only a single pipe: it starts the first process with file descriptor 0 as its input, and it starts the second process with file descriptor 1 as its output. However, if you have multiple pipes on a single command line, as in ls -la | cat -e | grep filename, then the output of the cat -e process needs to go into the second pipe, not to standard output.
You need to create the second pipe before starting the right-hand command of the first pipe. It's probably simplest to just create all the pipes before starting any of the commands. You could do this by defining multiple phases:
Create pipes
Start commands
Wait for all commands to finish
You can traverse the abstract syntax tree you built multiple times, each time executing one of the phases.
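As a sketch of that structure (the function name run_pipeline and the flat array of argument vectors are my own inventions; your AST traversal would take the place of the simple loops here):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

/* Run cmds[0] | cmds[1] | ... | cmds[n-1].
 * cmds is an array of NULL-terminated argument vectors. */
static void run_pipeline(char **cmds[], int n)
{
    int (*pp)[2] = malloc(sizeof(int[2]) * (n - 1));

    /* Phase 1: create all the pipes. */
    for (int i = 0; i < n - 1; i++)
        if (pipe(pp[i]) == -1) { perror("pipe"); exit(1); }

    /* Phase 2: start all the commands; no waitpid() yet. */
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            if (i > 0)                        /* not first: read previous pipe */
                dup2(pp[i - 1][0], STDIN_FILENO);
            if (i < n - 1)                    /* not last: write to next pipe */
                dup2(pp[i][1], STDOUT_FILENO);
            for (int j = 0; j < n - 1; j++) { /* close every pipe fd in child */
                close(pp[j][0]);
                close(pp[j][1]);
            }
            execvp(cmds[i][0], cmds[i]);
            perror(cmds[i][0]);
            _exit(127);
        }
    }

    /* The parent must close its copies, or the readers never see EOF. */
    for (int i = 0; i < n - 1; i++) {
        close(pp[i][0]);
        close(pp[i][1]);
    }

    /* Phase 3: wait for all the commands to finish. */
    while (wait(NULL) > 0)
        ;
    free(pp);
}

int main(void)
{
    char *ls[] = {"ls", "-la", NULL};
    char *cat[] = {"cat", "-e", NULL};
    char *grep[] = {"grep", "filename", NULL};
    char **cmds[] = {ls, cat, grep};
    run_pipeline(cmds, 3);
    return 0;
}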
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
int main(int argc, char **argv) {
    FILE *file;
    file = fopen(argv[1], "r");
    char buf[600];
    char *pos;
    pid_t parent = fork();
    if (parent == 0) {
        while (fgets(buf, sizeof(buf), file)) {
            pid_t child = fork();
            if (child == 0) {
                /* is there a function I can put here so that it waits
                   once the parent is exited to then run? */
                printf("%s\n", buf);
                return(0);
            }
        }
        return(0);
    }
    wait(NULL);
    return(0);
}
The goal here is to print out the lines of a file all at the same time, in parallel.
For example:
Given a file
a
b
c
$ gcc -Wall above.c
$ ./a.out file
a
c
b
$ ./a.out file
b
c
a
As in, the processes run at the exact same time. I think I could get this to work if there were a wait clause that waits for the parent to exit and only then starts the children running, as shown in the comment above. Once the parent exits, all the processes would start at the print statement, as wanted.
If you had:
int i = 10;
while (i-- > 0)   /* the parent must decrement i, or it would fork forever */
{
    pid_t child = fork();
    if (child == 0) {
        printf("i: %d\n", i);
        exit(0);
    }
}
then the child processes are running concurrently, and depending on the number of cores and your OS scheduler, they might even run literally at the same time. However, printf is buffered, so the order in which the lines appear on screen cannot be determined and will vary between executions of your program. And because printf is buffered, you will most likely not see lines overlapping each other. If you were using write directly on stdout, however, the outputs might overlap.
In your scenario, however, the children die so fast, and reading from a file might take a while to return, that by the time the next fork is executed, the previous child is already dead. But that doesn't change the fact that if the children ran long enough, they would be running concurrently, and the order of the lines on screen could not be determined.
Edit
As Barmar points out in the comments, write is atomic. I looked it up in my man page, and in the BUGS section it says this:
man 2 write
According to POSIX.1-2008/SUSv4 Section XSI 2.9.7 ("Thread Interactions with Regular File Operations"):
All of the following functions shall be atomic with respect to each other in the effects specified in POSIX.1-2008 when they operate on regular files or symbolic links: ...
Among the APIs subsequently listed are write() and writev(2). And among the effects that should be atomic across threads (and processes) are updates of the file offset. However, on Linux before version 3.14, this was not the case: if two processes that share an open file description (see open(2)) perform a write() (or writev(2)) at the same time, then the I/O operations were not atomic with respect to updating the file offset, with the result that the blocks of data output by the two processes might (incorrectly) overlap. This problem was fixed in Linux 3.14.
Several years ago I observed this behaviour of write on stdout with concurrent children printing stuff; that's why I wrote that with write, the lines may overlap.
I am not sure why you have an outer loop. You could rewrite the program as follows. Once you create the child processes, they can run in any order, so in one run you might see the output "in order" and in another run a different order. It depends on the process scheduling by your OS, and for your purposes it's all running in "parallel", so you really don't need to ensure the parent process is dead.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        printf("Incorrect args\n");
        exit(1);
    }
    char buf[1024];
    FILE *file = fopen(argv[1], "r");
    while (fgets(buf, sizeof buf, file)) {
        pid_t child = fork();
        if (child == 0) {
            write(STDOUT_FILENO, buf, strlen(buf));
            _exit(0);
        }
    }
    /* Wait for all child processes. */
    while (wait(NULL) != -1)
        ;
}
In order to write a shell command interpreter, I am trying to execute pipes.
To do it, I use a recursive function in which I call the pipe function and do some redirections with dup2.
Here is my code:
void test_recurs(pid_t pid, char **ae)
{
    char *const arg[2] = {"/bin/ls", NULL};
    char *const arg2[3] = {"/bin/wc", NULL};
    static int limit = 0;
    int check;
    int fd[2];

    if (limit > 5)
        return ;
    if (pipe(fd) == -1)
    {
        printf("pipe failed\n");
        return ;
    }
    pid = fork();
    if (pid != 0)
    {
        printf("père %d\n", getpid());
        close(fd[0]);
        dup2(fd[1], 1);
        close(fd[1]);
        if ((execve("/bin/ls", arg, ae)) == -1)
            exit(125);
        dprintf(2, "execution ls\n");
        wait(&check);
    }
    else
    {
        printf("fils %d\n", getpid());
        close(fd[1]);
        dup2(fd[0], 0);
        close(fd[0]);
        if ((execve("/bin/wc", arg2, ae)) == -1)
            printf("echec execve\n");
        dprintf(2, "limit[%d]\n", limit);
        limit++;
        test_recurs(pid, ae);
    }
}
The problem is that it only executes "ls | wc" once and then waits on standard input. I know the problem may come from the pipes (and the redirections).
It's a bit unclear how you are trying to use the function you present, but here are some notable points about it:
It's poor form to rely on a static variable to limit recursion depth because it's not thread-safe and because you need to do extra work to manage it (for example, to ensure that any changes are backed out when the function returns). Use a function parameter instead.
As has been observed in comments, the exec-family functions return only on failure. Although you acknowledge that, I'm not sure you appreciate the consequences, for both branches of your fork contain code that will never be executed as a result. The recursive call in particular is dead and will never be executed.
Moreover, the process in which the function is called performs an execve() call itself. The reason that function does not return is that it replaces the process image with that of the new process. That means that function test_recurs() also does not return.
Just as a shell ordinarily must fork / exec to launch a single external command, it ordinarily must fork / exec for each command in a pipeline. If it fails to do so, then afterward it is no longer running; whatever it exec'ed without forking runs instead.
The problem is it only execute "ls | wc" one time and then wait on the standard input.
Certainly it does not recurse, because the recursive call is in a section of dead code. I suspect you are mistaken in your claim that it afterward waits on standard input, because the process that calls that function execs /bin/ls, which does not read from standard input. When the ls exits, however, leaving you with neither shell nor ls, what you then see might seem to be a wait on stdin.
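For contrast, here is a minimal sketch (my own, reusing your variable names where possible) in which the calling process forks both commands and survives to wait for them, instead of execing one of them itself:

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

static void run_ls_wc(char **ae)   /* ae: the environment, as in your code */
{
    int fd[2];
    char *const arg[2] = {"/bin/ls", NULL};
    char *const arg2[2] = {"/bin/wc", NULL};

    if (pipe(fd) == -1) { perror("pipe"); return; }

    if (fork() == 0) {             /* first child runs ls */
        close(fd[0]);
        dup2(fd[1], 1);
        close(fd[1]);
        execve("/bin/ls", arg, ae);
        _exit(125);                /* only reached if execve fails */
    }
    if (fork() == 0) {             /* second child runs wc */
        close(fd[1]);
        dup2(fd[0], 0);
        close(fd[0]);
        execve("/bin/wc", arg2, ae);
        _exit(125);
    }
    close(fd[0]);                  /* the shell keeps running, closes its */
    close(fd[1]);                  /* copies of the pipe, and just waits  */
    while (wait(NULL) > 0)
        ;
}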
I need some help emulating the "|" operator in Unix. I need to be able to use the output of the first argument as the input of the second, something as simple as ls and more. I have this code so far, but I'm stuck at this point. Any and all help would be appreciated. Thanks.
#include <sys/types.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **words)
{
    char **args;
    char *cmd1[2] = { words[1], 0 };
    char *cmd2[2] = { words[2], 0 };
    int colon, arg1, i, pid, status;
    int thepipe[2];
    char ch;

    args = (char **) malloc(argc * (sizeof(char *)));
    colon = -1;
    for (i = 0; (i < argc); i = i + 1) {
        if (strcmp(words[i], ":") == 0) {
            colon = i;
        }
        else {}
    }
    pipe(thepipe);
    arg1 = colon;
    arg1 = arg1 - 1;
    for (i = 0; (i < arg1); i = i + 1) {
        args[i] = (char *) (malloc(strlen(words[i + 1]) + 1));
        strcpy(args[i], words[i + 1]);
    }
    args[argc] = NULL;
    pid = fork();
    if (pid == 0) {
        wait(&pid);
        dup2(thepipe[1], 1);
        close(thepipe[0]);
        printf("in new\n");
        execvp(*args, cmd1);
    }
    else {
        close(thepipe[1]);
        printf("in old\n");
        while ((status = read(thepipe[0], &ch, 1)) > 0) {
            execvp(*args, cmd2);
        }
    }
}
Assuming that argv[1] is a single-word command (like ls) and argv[2] is a second single-word command (like more), then:
Parent
Create a pipe.
Fork first child.
Fork second child.
Close both ends of the pipe.
Parent waits for both children to die, reports their exit status, and exits itself.
Child 1
Duplicates write end of pipe to standard output.
Close both ends of the pipe.
Uses execvp() to run the command in argv[1].
Exits, probably with an error message written to standard error (if the execvp() returns).
Child 2
Duplicates read end of pipe to standard input.
Close both ends of the pipe.
Uses execvp() to run the command in argv[2].
Exits, probably with an error message written to standard error (if the execvp() returns).
The only remaining trick is that you need to create a vector such as:
char *cmd1[2] = { argv[1], 0 };
char *cmd2[2] = { argv[2], 0 };
to pass as the second argument to execvp().
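Put together, a sketch of that outline might look like this (single-word commands only; error handling kept minimal):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s cmd1 cmd2\n", argv[0]);
        return 1;
    }
    char *cmd1[2] = { argv[1], 0 };
    char *cmd2[2] = { argv[2], 0 };
    int thepipe[2];

    if (pipe(thepipe) == -1) { perror("pipe"); return 1; }

    if (fork() == 0) {                   /* child 1: writes to the pipe */
        dup2(thepipe[1], 1);
        close(thepipe[0]);
        close(thepipe[1]);
        execvp(cmd1[0], cmd1);
        perror(cmd1[0]);                 /* reached only if execvp fails */
        _exit(127);
    }
    if (fork() == 0) {                   /* child 2: reads from the pipe */
        dup2(thepipe[0], 0);
        close(thepipe[0]);
        close(thepipe[1]);
        execvp(cmd2[0], cmd2);
        perror(cmd2[0]);
        _exit(127);
    }
    close(thepipe[0]);                   /* parent closes both ends ... */
    close(thepipe[1]);
    int status;
    while (wait(&status) > 0)            /* ... and waits for both children */
        fprintf(stderr, "child exited with status 0x%.4X\n", (unsigned)status);
    return 0;
}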
Note that this outline does not break the strings up. If you want to handle an invocation such as:
./execute "ls -la" "wc -wl"
then you will need to split each argument into separate words and create bigger arrays for cmd1 and cmd2 (a sketch of the splitting step follows). If you want to handle more than two commands, you need to think quite carefully about how you're going to manage the extra stages in the pipeline. The first and last commands are different from those in the middle, so a 3-command pipeline uses three different mechanisms, but 4 or more substantially reuse the same mechanism over and over for all except the first and last commands.
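For the splitting step, a sketch using strtok() (the helper name split_words and the cap of 63 words are my own arbitrary choices):

#include <stdlib.h>
#include <string.h>

/* Split a command string like "ls -la" into a NULL-terminated
 * argument vector. Modifies str in place; the caller frees the
 * returned vector (the words point into str itself). */
static char **split_words(char *str)
{
    char **vec = malloc(64 * sizeof *vec);   /* room for 63 words + NULL */
    int n = 0;

    for (char *w = strtok(str, " \t"); w != NULL && n < 63;
         w = strtok(NULL, " \t"))
        vec[n++] = w;
    vec[n] = NULL;
    return vec;
}

Then char **cmd1 = split_words(argv[1]); replaces the two-element array, and likewise for cmd2.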