C Minishell Adding Pipelines

So I'm making a UNIX minishell, and am trying to add pipelines, so I can do things like this:
ps aux | grep dh | grep -v grep | cut -c1-5
However I'm having trouble wrapping my head around the piping part. I replace all the "|" characters with 0, and then run each line as a normal line. However, I am trying to divert the output and input. The input of a command needs to be the output of the previous command, and the output of a command needs to be the input of the next command.
I'm doing this using pipes, however I can't figure out where to call pipe() and where to close them. From the main processing function, processline(), I have this code:
if((pix = findUnquotChar(line_itr, '|')))
{
line_itr[pix++] = 0;
if(pipe (fd) < 0) perror("pipe");
processline(line_itr, inFD, fd[1], pl_flags);
line_itr = &(line_itr[pix]);
while((pix = findUnquotChar(line_itr, '|')) && pix < line_len)
{
line_itr[pix++] = 0;
//? if(pipe (fd) < 0) perror("pipe");
processline(line_itr, fd[0], fd[1], pl_flags);
line_itr = &(line_itr[pix]);
//? close(fd[0]);
//? close(fd[1]);
}
return;
}
So, I'm recursively(the code above is in processline) sending the commands in between the "|" to be processed by processline. You can see where I commented out the code above, I'm not sure how to make it work. The 2nd and 3rd parameter of processline are the inputFD and outputFD respectively, so I need to process a command, write the output to a pipe, and then call processline again on the next command, however this time the output of the previous command is the input. This just doesn't seem like it can work though, because each time I close fd[0] I'm losing the previous output. Do I need two separate pipes, that I can flip flop back and forth with?
I'm just having trouble seeing how this is possible with a single pipe, if you guys need any additional info just ask. Here's the entire processline function in case you want to take a look:
http://pastebin.com/YiEdaYdj
EDIT: If anybody has an example of a shell that implements pipelines I would love a link to the source, I haven't been able to find one on google so far.
EDIT2: Here's an example of my predicament:
echo a | echo b | echo c
So first I would call the shell like this:
processline("echo a", 0, fd[1], flags);
....
processline("echo b", fd[0], NOT_SURE_GOES_HERE[1], flags);
....
processline("echo c", NOT_SURE_GOES_HERE[0], NOT_SURE_EITHER[1], flags);
Each of these occurs once per iteration, and as you can see I can't figure out what to pass for the input-file-descriptors and the output-file-descriptors for the 2nd and 3rd(and so on) iteration.

Here's some moderately generic but simple code to execute pipelines, a program I'm calling pipeline. It's an SSCCE in a single file as presented, though I'd have the files stderr.h and stderr.c as separate files in a library to be linked with all my programs. (Actually, I have a more complex set of functions in my 'real' stderr.c and stderr.h, but this is a good starting point.)
The code operates in two ways. If you supply no arguments, then it runs a built-in pipeline:
who | awk '{print $1}' | sort | uniq -c | sort -n
This counts the number of times each person is logged in on the system, presenting the list in order of increasing number of sessions. Alternatively, you can invoke it with a sequence of arguments that make up the command line you want run; use a quoted pipe '|' (or "|") to separate commands:
Valid:
pipeline
pipeline ls '|' wc
pipeline who '|' awk '{print $1}' '|' sort '|' uniq -c '|' sort -n
pipeline ls
Invalid:
pipeline '|' wc -l
pipeline ls '|' '|' wc -l
pipeline ls '|' wc -l '|'
The last three invocations enforce 'pipes as separators'. The code does not error check every system call; it does error check fork(), execvp() and pipe(), but skips checking on dup2() and close(). It doesn't include diagnostic printing for the commands that are generated; a -x option to pipeline would be a sensible addition, causing it to print out a trace of what it does. It also does not exit with the exit status of the last command in the pipeline.
Note that the code starts with a child being forked. The child will become the last process in the pipeline, but first creates a pipe and forks another process to run the earlier processes in the pipeline. The mutually recursive functions are unlikely to be the only way of sorting things out, but they do leave minimal code repetition (earlier drafts of the code had the content of exec_nth_command() largely repeated in exec_pipeline() and exec_pipe_command()).
The process structure here is such that the original process only knows about the last process in the pipeline. It is possible to redesign things in such a way that the original process is the parent of every process in the pipeline, so the original process can report separately on the status of each command in the pipeline. I've not yet modified the code to allow for that structure; it will be a little more complex, though not hideously so.
/* One way to create a pipeline of N processes */
/* stderr.h */
#ifndef STDERR_H_INCLUDED
#define STDERR_H_INCLUDED
static void err_setarg0(const char *argv0);
static void err_sysexit(char const *fmt, ...);
static void err_syswarn(char const *fmt, ...);
#endif /* STDERR_H_INCLUDED */
/* pipeline.c */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>
/*#include "stderr.h"*/
typedef int Pipe[2];
/* exec_nth_command() and exec_pipe_command() are mutually recursive */
static void exec_pipe_command(int ncmds, char ***cmds, Pipe output);
/* With the standard output plumbing sorted, execute Nth command */
static void exec_nth_command(int ncmds, char ***cmds)
{
assert(ncmds >= 1);
if (ncmds > 1)
{
pid_t pid;
Pipe input;
if (pipe(input) != 0)
err_sysexit("Failed to create pipe");
if ((pid = fork()) < 0)
err_sysexit("Failed to fork");
if (pid == 0)
{
/* Child */
exec_pipe_command(ncmds-1, cmds, input);
}
/* Fix standard input to read end of pipe */
dup2(input[0], 0);
close(input[0]);
close(input[1]);
}
execvp(cmds[ncmds-1][0], cmds[ncmds-1]);
err_sysexit("Failed to exec %s", cmds[ncmds-1][0]);
/*NOTREACHED*/
}
/* Given pipe, plumb it to standard output, then execute Nth command */
static void exec_pipe_command(int ncmds, char ***cmds, Pipe output)
{
assert(ncmds >= 1);
/* Fix stdout to write end of pipe */
dup2(output[1], 1);
close(output[0]);
close(output[1]);
exec_nth_command(ncmds, cmds);
}
/* Execute the N commands in the pipeline */
static void exec_pipeline(int ncmds, char ***cmds)
{
assert(ncmds >= 1);
pid_t pid;
if ((pid = fork()) < 0)
err_syswarn("Failed to fork");
if (pid != 0)
return;
exec_nth_command(ncmds, cmds);
}
/* Collect dead children until there are none left */
static void corpse_collector(void)
{
pid_t parent = getpid();
pid_t corpse;
int status;
while ((corpse = waitpid(0, &status, 0)) != -1)
{
fprintf(stderr, "%d: child %d status 0x%.4X\n",
(int)parent, (int)corpse, status);
}
}
/* who | awk '{print $1}' | sort | uniq -c | sort -n */
static char *cmd0[] = { "who", 0 };
static char *cmd1[] = { "awk", "{print $1}", 0 };
static char *cmd2[] = { "sort", 0 };
static char *cmd3[] = { "uniq", "-c", 0 };
static char *cmd4[] = { "sort", "-n", 0 };
static char **cmds[] = { cmd0, cmd1, cmd2, cmd3, cmd4 };
static int ncmds = sizeof(cmds) / sizeof(cmds[0]);
static void exec_arguments(int argc, char **argv)
{
/* Split the command line into sequences of arguments */
/* Break at pipe symbols as arguments on their own */
char **cmdv[argc/2]; // Way too many
char *args[argc+1];
int cmdn = 0;
int argn = 0;
cmdv[cmdn++] = &args[argn];
for (int i = 1; i < argc; i++)
{
char *arg = argv[i];
if (strcmp(arg, "|") == 0)
{
if (i == 1)
err_sysexit("Syntax error: pipe before any command");
if (args[argn-1] == 0)
err_sysexit("Syntax error: two pipes with no command between");
arg = 0;
}
args[argn++] = arg;
if (arg == 0)
cmdv[cmdn++] = &args[argn];
}
if (args[argn-1] == 0)
err_sysexit("Syntax error: pipe with no command following");
args[argn] = 0;
exec_pipeline(cmdn, cmdv);
}
int main(int argc, char **argv)
{
err_setarg0(argv[0]);
if (argc == 1)
{
/* Run the built in pipe-line */
exec_pipeline(ncmds, cmds);
}
else
{
/* Run command line specified by user */
exec_arguments(argc, argv);
}
corpse_collector();
return(0);
}
/* stderr.c */
/*#include "stderr.h"*/
#include <stdio.h>
#include <stdarg.h>
#include <errno.h>
#include <string.h>
#include <stdlib.h>
static const char *arg0 = "<undefined>";
static void err_setarg0(const char *argv0)
{
arg0 = argv0;
}
static void err_vsyswarn(char const *fmt, va_list args)
{
int errnum = errno;
fprintf(stderr, "%s:%d: ", arg0, (int)getpid());
vfprintf(stderr, fmt, args);
if (errnum != 0)
fprintf(stderr, " (%d: %s)", errnum, strerror(errnum));
putc('\n', stderr);
}
static void err_syswarn(char const *fmt, ...)
{
va_list args;
va_start(args, fmt);
err_vsyswarn(fmt, args);
va_end(args);
}
static void err_sysexit(char const *fmt, ...)
{
va_list args;
va_start(args, fmt);
err_vsyswarn(fmt, args);
va_end(args);
exit(1);
}
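The alternative organization mentioned above, where the original process is the parent of every process in the pipeline, could look roughly like the following. This is only a sketch under stated assumptions: the name run_pipeline_flat() and its status array are made up for illustration and are not part of the pipeline program above. The idea is that the loop remembers the read end of the previous pipe (prev_read) and creates a fresh pipe for each command except the last, so no single pipe ever has to be reused.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run cmds[0] | cmds[1] | ... | cmds[ncmds-1].
 * Every command is a direct child of the caller.  The wait status of
 * command i is stored in status[i].
 * Returns 0 on success, -1 if pipe(), fork() or waitpid() failed. */
static int run_pipeline_flat(int ncmds, char ***cmds, int *status)
{
    assert(ncmds >= 1);
    pid_t pids[ncmds];
    int prev_read = -1;         /* read end of the previous pipe, if any */
    int started = 0;
    int rc = 0;

    for (int i = 0; i < ncmds; i++)
    {
        int fd[2] = { -1, -1 };
        /* Every command except the last writes into a new pipe */
        if (i < ncmds - 1 && pipe(fd) != 0)
        {
            rc = -1;
            break;
        }
        pid_t pid = fork();
        if (pid < 0)
        {
            if (fd[0] >= 0) { close(fd[0]); close(fd[1]); }
            rc = -1;
            break;
        }
        if (pid == 0)
        {
            /* Child: stdin from the previous pipe, stdout to the new one */
            if (prev_read >= 0) { dup2(prev_read, STDIN_FILENO); close(prev_read); }
            if (fd[1] >= 0) { dup2(fd[1], STDOUT_FILENO); close(fd[0]); close(fd[1]); }
            execvp(cmds[i][0], cmds[i]);
            perror(cmds[i][0]);
            _exit(127);
        }
        pids[started++] = pid;
        /* Parent: drop the ends it no longer needs, keep the new read end */
        if (prev_read >= 0) close(prev_read);
        if (fd[1] >= 0) close(fd[1]);
        prev_read = fd[0];
    }
    if (prev_read >= 0)
        close(prev_read);
    /* Each command is our own child, so each status can be collected */
    for (int i = 0; i < started; i++)
        if (waitpid(pids[i], &status[i], 0) < 0)
            rc = -1;
    return rc;
}
```

Called with the built-in cmds array and an int array of ncmds elements, this would let the caller inspect WEXITSTATUS() for each stage separately. The parent must close its copies of every pipe end, otherwise the readers never see EOF.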
Signals and SIGCHLD
The POSIX Signal Concepts section discusses SIGCHLD:
Under SIG_DFL:
If the default action is to ignore the signal, delivery of the signal shall have no effect on the process.
Under SIG_IGN:
If the action for the SIGCHLD signal is set to SIG_IGN, child processes of the calling processes shall not be transformed into zombie processes when they terminate. If the calling process subsequently waits for its children, and the process has no unwaited-for children that were transformed into zombie processes, it shall block until all of its children terminate, and wait(), waitid(), and waitpid() shall fail and set errno to [ECHILD].
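The description of <signal.h> has a table of default actions for signals, and for SIGCHLD, the default action is I (ignore the signal). Note that this is still the SIG_DFL disposition, whose action happens to be 'ignore'; it is not the same thing as explicitly setting the disposition to SIG_IGN.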
I added another function to the code above:
#include <signal.h>
typedef void (*SigHandler)(int signum);
static void sigchld_status(void)
{
const char *handling = "Handler";
SigHandler sigchld = signal(SIGCHLD, SIG_IGN);
signal(SIGCHLD, sigchld);
if (sigchld == SIG_IGN)
handling = "Ignored";
else if (sigchld == SIG_DFL)
handling = "Default";
printf("SIGCHLD set to %s\n", handling);
}
I called it immediately after the call to err_setarg0(), and it reports 'Default' on both Mac OS X 10.7.5 and Linux (RHEL 5, x86/64). I validated its operation by running:
(trap '' CHLD; pipeline)
On both platforms, that reported 'Ignored', and the pipeline command no longer reported the exit status of the child; it didn't get it.
So, if the program is ignoring SIGCHLD, it does not generate any zombies, but does wait until 'all' of its children terminate. That is, until all of its direct children terminate; a process cannot wait on its grandchildren or more distant progeny, nor on its siblings, nor on its ancestors.
On the other hand, if the setting for SIGCHLD is the default, the signal is ignored, and zombies are created.
That's the most convenient behaviour for this program as written. The corpse_collector() function has a loop that collects the status information from any children. There's only one child at a time with this code; the rest of the pipeline is run as a child (of the child, of the child, ...) of the last process in the pipeline.
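The difference between the two dispositions can be checked with a small experiment. The helper below is a sketch written for illustration (wait_with_disposition() is a made-up name, not part of the pipeline program): it forks a child that exits with code 7 and then waits for it, once with SIGCHLD at SIG_DFL and once at SIG_IGN.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with code 7 and wait for it
 * while SIGCHLD has the given disposition (SIG_DFL or SIG_IGN).
 * Returns the child's exit code if its status was collected,
 * -1 if waitpid() failed with ECHILD (no status was available),
 * -2 on any other failure. */
static int wait_with_disposition(void (*disposition)(int))
{
    void (*old)(int) = signal(SIGCHLD, disposition);
    assert(old != SIG_ERR);
    int result = -2;
    pid_t pid = fork();
    if (pid == 0)
        _exit(7);
    if (pid > 0)
    {
        int status;
        pid_t corpse = waitpid(pid, &status, 0);
        if (corpse == pid && WIFEXITED(status))
            result = WEXITSTATUS(status);   /* a zombie existed and was reaped */
        else if (corpse == -1 && errno == ECHILD)
            result = -1;                    /* child vanished without a zombie */
    }
    signal(SIGCHLD, old);                   /* restore the previous disposition */
    return result;
}
```

Following the POSIX text quoted above, wait_with_disposition(SIG_DFL) should return 7, while wait_with_disposition(SIG_IGN) should return -1 because the wait fails with ECHILD once the child has terminated.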
However I'm having trouble with zombies/corpses. My teacher had me implement it the same way you did, as cmd1 isn't the parent of cmd2 in the case of: "cmd1 | cmd2 | cmd3". Unless I tell my shell to wait on each process (cmd1, cmd2, and cmd3), rather than just waiting on the last process (cmd3), the entire pipeline shuts down before the output can reach the end. I'm having trouble figuring out a good way to wait on them; my teacher said to use WNOHANG.
I'm not sure I understand the problem. With the code I provided, cmd3 is the parent of cmd2, and cmd2 is the parent of cmd1 in a 3-command pipeline (and the shell is the parent of cmd3), so the shell can only wait on cmd3. I did state originally:
The process structure here is such that the original process only knows about the last process in the pipeline. It is possible to redesign things in such a way that the original process is the parent of every process in the pipeline, so the original process can report separately on the status of each command in the pipeline. I've not yet modified the code to allow for that structure; it will be a little more complex, though not hideously so.
If you've got your shell able to wait on all three commands in the pipeline, you must be using the alternative organization.
The waitpid() description includes:
The pid argument specifies a set of child processes for which status is requested. The waitpid() function shall only return the status of a child process from this set:
If pid is equal to (pid_t)-1, status is requested for any child process. In this respect, waitpid() is then equivalent to wait().
If pid is greater than 0, it specifies the process ID of a single child process for which status is requested.
If pid is 0, status is requested for any child process whose process group ID is equal to that of the calling process.
If pid is less than (pid_t)-1, status is requested for any child process whose process group ID is equal to the absolute value of pid.
The options argument is constructed from the bitwise-inclusive OR of zero or more of the following flags, defined in the header:
...
WNOHANG
The waitpid() function shall not suspend execution of the calling thread if status is not immediately available for one of the child processes specified by pid.
...
This means that if you're using process groups and the shell knows which process group the pipeline is running in (for example, because the pipeline is put into its own process group by the first process), then the parent can wait for the appropriate children to terminate.
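As a rough illustration of WNOHANG, a shell could call something like the following between prompts to collect whichever children have already finished, without ever blocking. This is a sketch only: reap_finished() is a hypothetical name, and it uses a pid argument of -1 (any child); a shell that puts each pipeline in its own process group would presumably pass the negated process group ID instead, as described in the quoted text.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: collect every child that has already terminated,
 * without blocking.  Returns the number of children reaped by this call
 * (possibly 0 if children exist but are still running), or -1 when the
 * caller has no children left at all. */
static int reap_finished(void)
{
    int reaped = 0;
    int status;
    pid_t corpse;

    /* waitpid() returns >0 for a reaped child, 0 if none is ready yet,
     * and -1 with errno set to ECHILD if there are no children */
    while ((corpse = waitpid(-1, &status, WNOHANG)) > 0)
        reaped++;
    if (corpse == -1 && errno == ECHILD && reaped == 0)
        return -1;
    return reaped;
}
```

The statuses are discarded here for brevity; a real shell would record them per process ID so it can report on each command in the pipeline.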
...rambling... I think there's some useful information here; there probably should be more that I'm writing, but my mind's gone blank.

Related

How to combine `lshw` and `grep` command inside `execv` in C?

Here is a C program which operates finding specific properties like CPU bus info by consecutive calls of lshw (to access total hardware list with respective properties) and grep (to select just a relevant point among lshw results):
char *strCombine(char *str1, char *str2, int n)
{
int i = strlen(str2);
int j = 0;
if((str2 = (char *) realloc(str2, (i + n + 1))) == NULL)
perror(0);
while(j < n && str1[j])
{
str2[i] = str1[j];
i++;
j++;
}
str2[i] = 0;
return (str2);
}
int main()
{
pid_t parent;
char buf[1000] = {0};
char *str;
char *argv[6] = {"/usr/bin/lshw", "-C", "CPU", "|", "grep", "bus info"};
int fd[2];
int ret;
if(pipe(fd) == -1)
{
perror(NULL);
return -1;
}
parent = fork();
if(parent == 0)
{
close(fd[1]);
while((ret = read(fd[0], buf, 1000)))
str = strCombine(buf, str, ret);
close(fd[0]);
}
else
{
close(fd[0]);
execv(argv[0], argv);
close(fd[1]);
wait(0);
}
wait(0);
printf("%s", str);
return 0;
}
In this code grep is expected to follow lshw since both go executed by invoking execv. However, this pipeline doesn't work because lshw usage reference gets printed out in terminal (running on Ubuntu 18.04 LTS) instead of bus info needed originally. What makes this program failed to show just info that matters and what way must I try to set up pipeline?
The vertical bar is not an argument you can use to separate commands: the execve(2) system call loads one program into the address space of one process only. You need to create two processes, one per command you want to execute, and connect them so that the output from one goes to the input of the other. I think you'll also be interested in the output of the last command, so you need to do two redirections (one from the first command to the second, and one from the output of the second command to a pipe descriptor), two forks, and two execs in order to do this.
First the good news, you can do all this stuff with a simple call to popen(3) without the nitty gritties of making forks and execs while redirecting i/o from individual commands. Just use this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char *cmd = "/usr/bin/lshw -C CPU | grep 'bus info'";
int n = 0;
char line[1000];
/* f will be associated to the output of the pipeline, so you can read from it.
* this is stated by the "r" of the second parameter */
FILE *f = popen(cmd, "r");
if (!f) {
perror(cmd);
exit(EXIT_FAILURE);
}
/* I read, line by line, and process the output,
* printing each line with some format string, but
* you are free here. */
while (fgets(line, sizeof line, f)) {
char *l = strtok(line, "\n");
if (!l) continue;
printf("line %d: [%s]\n", ++n, l);
}
/* Once finished, you need to pclose(3) it. This
* makes the program wait(2) for the child to finish
* and closes the descriptor. */
pclose(f);
}
If you need to build such a pipeline yourself, you'll end up having to set up two redirections (from the first command to the second, and from the second to the parent process) and fork/exec both processes yourself.
With the popen(3) approach, you hand the piping and redirection work to a subshell, and you just get a FILE * stream to read from.
(if I find some time, I'll show you a full example of a chain of N commands with redirections to pipe them, but I cannot promise, as I have to write the code)
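For the two-command case, the manual approach described above (two pipes, two forks, two execs) might be sketched as follows. This is an illustration under assumptions, not the promised N-command example: run_pair_capture() is a hypothetical name, and error checking on dup2() and close() is omitted for brevity.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run "first | second" and copy up to size-1 bytes of
 * the second command's standard output into out (NUL-terminated).
 * Returns the number of bytes stored, or -1 on pipe/fork failure. */
static long run_pair_capture(char *const first[], char *const second[],
                             char *out, size_t size)
{
    assert(size > 0);
    int link[2];                /* first command -> second command */
    int back[2];                /* second command -> parent */
    if (pipe(link) != 0)
        return -1;
    if (pipe(back) != 0)
    {
        close(link[0]); close(link[1]);
        return -1;
    }
    pid_t p1 = fork();
    if (p1 == 0)
    {
        dup2(link[1], STDOUT_FILENO);
        close(link[0]); close(link[1]); close(back[0]); close(back[1]);
        execvp(first[0], first);
        perror(first[0]);
        _exit(127);
    }
    pid_t p2 = fork();
    if (p2 == 0)
    {
        dup2(link[0], STDIN_FILENO);
        dup2(back[1], STDOUT_FILENO);
        close(link[0]); close(link[1]); close(back[0]); close(back[1]);
        execvp(second[0], second);
        perror(second[0]);
        _exit(127);
    }
    /* Parent keeps only the read end of the second pipe */
    close(link[0]); close(link[1]); close(back[1]);

    size_t used = 0;
    char chunk[1024];
    ssize_t n;
    while ((n = read(back[0], chunk, sizeof(chunk))) > 0)
    {
        size_t room = size - 1 - used;
        size_t take = (size_t)n < room ? (size_t)n : room;
        memcpy(out + used, chunk, take);    /* output beyond the buffer is dropped */
        used += take;
    }
    out[used] = '\0';
    close(back[0]);
    if (p1 > 0) waitpid(p1, NULL, 0);
    if (p2 > 0) waitpid(p2, NULL, 0);
    return (p1 < 0 || p2 < 0) ? -1 : (long)used;
}
```

For the question's case, the two argument vectors would be { "/usr/bin/lshw", "-C", "CPU", 0 } and { "grep", "bus info", 0 }; note that each vector ends with a null pointer and that "|" appears in neither.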
NOTE
fork() returns the pid of the child process to the parent, and 0 to the child process itself. I don't understand why you have a variable named parent where you store the value received from fork(). If it is nonzero (and non-negative), it is the pid of a child process. You need two of them, as you need two processes. In the example I posted, three processes are created: you ask a subshell to build the pipeline for you, and that subshell creates two more processes to execute your commands. If you had to set all this up yourself, you'd also have to wait(2) for the children to finish (this is done inside the pclose(3) call).
I have a little program to spawn a process (only one) repeatedly, while overprinting its output in the same place. I use it as a kind of htop when I want to watch, for example, the output of ls -l (showing a file growing as it is being filled) or the output of the df command. It starts the program, makes one fork, redirects its output to a pipe, and reads the output of the command to count the number of lines printed (so it can emit an escape sequence that puts the cursor back at the top of the listing, and a clear-to-end-of-line after each output line, so shorter lines don't get blurred by longer ones). It shows you how to deal with the fork and exec system calls, and you can use it as an example of how to do things the brave way. But I think popen(3) is the solution to your problem. If you want to have a look to my cont program, just find it here.

C piping using the command line arguments

I need some help emulating the "|" command in unix. I need to be able to use the output from the first argument as the input of the second, something as simple as ls and more. I got this code so far but I'm just stuck at this point. Any and all help would be helpful.-Thanks.
#include <sys/types.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char ** words)
{
char** args;
char *cmd1[2] = { words[1], 0 };
char *cmd2[2] = { words[2], 0 };
int colon, arg1 ,i, pid, status;
int thepipe[2];
char ch;
args = (char **) malloc(argc*(sizeof(char*)));
colon = -1;
for (i=0;(i<argc); i=i+1){
if (strcmp(words[i],":") == 0) {
colon = i;
}
else {}
}
pipe(thepipe);
arg1 = colon;
arg1 = arg1 - 1;
for (i=0;(i<arg1); i=i+1){
args[i] = (char*) (malloc(strlen(words[i+1])+1));
strcpy(args[i], words[i+1]);
}
args[argc] = NULL;
pid = fork();
if (pid == 0) {
wait(&pid);
dup2(thepipe[1], 1);
close(thepipe[0]);
printf("in new\n");
execvp(*args, cmd1);
}
else {
close(thepipe[1]);
printf("in old\n");
while ((status=read(thepipe[0],&ch,1)) > 0){
execvp(*args, cmd2);
}
}
}
Assuming that argv[1] is a single word command (like ls) and argv[2] is a second single word command (like more), then:
Parent
Create a pipe.
Fork first child.
Fork second child.
Close both ends of the pipe.
Parent waits for both children to die, reports their exit status, and exits itself.
Child 1
Duplicates write end of pipe to standard output.
Close both ends of the pipe.
Uses execvp() to run the command in argv[1].
Exits, probably with an error message written to standard error (if the execvp() returns).
Child 2
Duplicates read end of pipe to standard input.
Close both ends of the pipe.
Uses execvp() to run the command in argv[2].
Exits, probably with an error message written to standard error (if the execvp() returns).
The only remaining trick is that you need to create a vector such as:
char *cmd1[2] = { argv[1], 0 };
char *cmd2[2] = { argv[2], 0 };
to pass as the second argument to execvp().
Note that this outline does not break the strings up. If you want to handle an invocation such as:
./execute "ls -la" "wc -wl"
then you will need to split each argument into separate words and create bigger arrays for cmd1 and cmd2. If you want to handle more than two commands, you need to think quite carefully about how you're going to manage the extra stages in the pipeline. The first and last commands are different from those in the middle (so 3 processes has three different mechanisms, but 4 or more substantially uses the same mechanism over and over for all except the first and last commands).
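Splitting an argument such as "ls -la" into words could be done along these lines. This is a minimal sketch, assuming words are separated only by spaces and tabs and that no quoting needs to be understood; split_words() is a hypothetical name introduced here for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split a string such as "ls -la" on spaces and tabs
 * into a NULL-terminated vector suitable for execvp().
 * The string is modified in place (strtok() writes NUL bytes into it) and
 * the vector points into it, so the string must outlive the vector.
 * Returns NULL if malloc() fails; the caller releases the vector with free(). */
static char **split_words(char *text)
{
    assert(text != NULL);
    /* A string of length n holds at most (n + 1) / 2 words, which happens
     * when one-character words are separated by single blanks */
    size_t max_words = (strlen(text) + 1) / 2;
    char **vec = malloc((max_words + 1) * sizeof(*vec));
    if (vec == NULL)
        return NULL;
    size_t n = 0;
    for (char *word = strtok(text, " \t"); word != NULL; word = strtok(NULL, " \t"))
        vec[n++] = word;
    vec[n] = NULL;              /* execvp() needs the terminating null pointer */
    return vec;
}
```

With this, Child 1 in the outline would call something like split_words(argv[1]) and pass the result, and its first element, to execvp().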

Easiest way to execute linux program and communicate with it via stdin/stdout in C/C++

I have a program that I can't modify, and I need to execute it as is, write some data to its stdin, and get the answer from its stdout programmatically, in an automated way.
What is the simplest way to do this?
I suppose something like this pseudo-C-code
char input_data_buffer[] = "calculate 2 + 2\nsay 'hello world!'";
char output_data_buffer[MAX_BUF];
IPCStream ipcs = executeIPC("./myprogram", "rw");
ipcs.write(input_data_buffer);
ipcs.read(output_data_buffer);
...
PS: I thought of popen, but AFAIK there is only one-way pipes in linux
EDIT:
It is supposed it will be one-message-from-each-side communication. Firstly parent side send input to child process' stdin, then child provides output to its stdout and exits, meanwhile parent reads its stdout. Now about communication termination: I think when child process exits it will send EOF terminator to stdout, so parent will know exactly whether child done, on the other hand it is guaranteed that parent knows what kind of input child expects for.
Generally this program (the parent) is a tester for students' solutions. It takes paths to two other executables from the command line: the first is the student's program to test, the second is a reference program that is known to work correctly and solves the very same problem.
The input/output of the students' programs is strictly specified, so the tester runs both programs and compares their outputs for lots of random inputs; all mismatches will be reported.
Input/output max size is estimated at few hundreds kilobytes
Example: ..implement insertion sort algorithm ... first line there is sequence length ... second line there is sequence of numbers a_i where |a_i| < 2^31 - 1...
output first line must be sum of all elements, the second line must be sorted sequence.
Input:
5
1 3 4 6 2
Expected output:
16
1 2 3 4 6
Read Advanced Linux Programming -which has at least an entire chapter to answer your question- and learn more about execve(2), fork(2), waitpid(2), pipe(2), dup2(2), poll(2) ...
Notice that you'll need (at least in a single-threaded program) to multiplex (with poll) on the input and the output of the program. Otherwise you might have a deadlock: the child process could be blocked writing to your program (because the pipe carrying its output is full and nobody is reading it), while your program is blocked writing to the child (because the pipe carrying its input is full too).
BTW, if your program has some event loop it might help (and actually poll is providing the basis for a simple event loop). And Glib (from GTK) provides function to spawn processes, Qt has QProcess, libevent knows them, etc.
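A poll()-based loop of the kind described might look like this. It is a sketch under assumptions rather than a definitive implementation: filter_through() is a hypothetical name, the child is assumed to read standard input and write standard output, and output that does not fit in the caller's buffer is simply dropped.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0], feed it in_len bytes from in, and collect
 * up to out_size bytes of its output in out (not NUL-terminated).
 * Returns the number of bytes collected, or -1 if setup failed. */
static long filter_through(char *const argv[], const char *in, size_t in_len,
                           char *out, size_t out_size)
{
    assert(argv != NULL && argv[0] != NULL);
    int to_child[2], fr_child[2];
    if (pipe(to_child) != 0)
        return -1;
    if (pipe(fr_child) != 0)
    {
        close(to_child[0]); close(to_child[1]);
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0)
    {
        close(to_child[0]); close(to_child[1]);
        close(fr_child[0]); close(fr_child[1]);
        return -1;
    }
    if (pid == 0)
    {
        dup2(to_child[0], STDIN_FILENO);
        dup2(fr_child[1], STDOUT_FILENO);
        close(to_child[0]); close(to_child[1]);
        close(fr_child[0]); close(fr_child[1]);
        execvp(argv[0], argv);
        perror(argv[0]);
        _exit(127);
    }
    close(to_child[0]);
    close(fr_child[1]);
    int wfd = to_child[1], rfd = fr_child[0];
    /* Non-blocking write end: write() takes what fits instead of blocking */
    fcntl(wfd, F_SETFL, fcntl(wfd, F_GETFL) | O_NONBLOCK);
    /* A child that stops reading should give us EPIPE, not kill us */
    void (*old_sigpipe)(int) = signal(SIGPIPE, SIG_IGN);

    size_t sent = 0, got = 0;
    while (rfd >= 0)
    {
        if (wfd >= 0 && sent == in_len)
        {
            close(wfd);                     /* all input sent: child sees EOF */
            wfd = -1;
        }
        struct pollfd fds[2] = {
            { .fd = rfd, .events = POLLIN },
            { .fd = wfd, .events = POLLOUT },   /* poll() skips a negative fd */
        };
        if (poll(fds, 2, -1) < 0)
        {
            if (errno == EINTR)
                continue;
            break;
        }
        if (fds[0].revents != 0)
        {
            char chunk[4096];
            ssize_t n = read(rfd, chunk, sizeof(chunk));
            if (n <= 0)
            {
                close(rfd);                 /* EOF or error ends the loop */
                rfd = -1;
            }
            else
            {
                size_t room = out_size - got;
                size_t take = (size_t)n < room ? (size_t)n : room;
                memcpy(out + got, chunk, take);
                got += take;
            }
        }
        if (wfd >= 0 && fds[1].revents != 0)
        {
            ssize_t n = write(wfd, in + sent, in_len - sent);
            if (n > 0)
                sent += (size_t)n;
            else if (n < 0 && errno != EAGAIN && errno != EINTR)
            {
                close(wfd);                 /* e.g. EPIPE */
                wfd = -1;
            }
        }
    }
    if (wfd >= 0) close(wfd);
    if (rfd >= 0) close(rfd);
    signal(SIGPIPE, old_sigpipe);
    waitpid(pid, NULL, 0);
    return (long)got;
}
```

Because reading and writing are interleaved, the input can be larger than a pipe buffer without either side getting stuck.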
Given that the processing is simply a question of one message from parent to child (which must be complete before the child responds), and one message from child to parent, then it is easy enough to handle:
Create two pipes, one for communication to child, one for communication to parent.
Fork.
Child process duplicates the relevant ends of the pipes (read end of 'to-child' pipe, write end of 'to-parent' pipe) to standard input, output.
Child closes all pipe file descriptors.
Child execs test program (or prints a message to standard error reporting failure and exits).
Parent closes the irrelevant ends of the pipes.
Parent writes the message to the child and closes the pipe.
Parent reads the response from the child and closes the pipe.
Parent continues on its merry way.
This leaves the child process lying around as a zombie. If the parent is going to do this more than once, or just needs to know the exit status of the child, then after closing the read pipe, it will wait for the child to die, collecting its status.
All this is straight-forward, routine coding. I'm sure you could find examples on SO.
Since apparently there are no suitable examples on Stack Overflow, here is a simple implementation of the code outlined above. There are two source files, basic_pipe.c for the basic piping work, and myprogram.c which is supposed to respond to the prompts shown in the question. The first is almost general purpose; it should probably loop on the read operation (but that hasn't mattered on the machine I tested it on, which is running an Ubuntu 14.04 derivative). The second is very specialized.
System calls
pipe()
fork()
dup2()
execv()
waitpid()
close()
read()
write()
basic_pipe.c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>
static char msg_for_child[] = "calculate 2 + 2\nsay 'hello world!'\n";
static char cmd_for_child[] = "./myprogram";
static void err_syserr(const char *fmt, ...);
static void be_childish(int to_child[2], int fr_child[2]);
static void be_parental(int to_child[2], int fr_child[2], int pid);
int main(void)
{
int to_child[2];
int fr_child[2];
if (pipe(to_child) != 0 || pipe(fr_child) != 0)
err_syserr("Failed to open pipes\n");
assert(to_child[0] > STDERR_FILENO && to_child[1] > STDERR_FILENO &&
fr_child[0] > STDERR_FILENO && fr_child[1] > STDERR_FILENO);
int pid;
if ((pid = fork()) < 0)
err_syserr("Failed to fork\n");
if (pid == 0)
be_childish(to_child, fr_child);
else
be_parental(to_child, fr_child, pid);
printf("Process %d continues and exits\n", (int)getpid());
return 0;
}
static void be_childish(int to_child[2], int fr_child[2])
{
printf("Child PID: %d\n", (int)getpid());
fflush(0);
if (dup2(to_child[0], STDIN_FILENO) < 0 ||
dup2(fr_child[1], STDOUT_FILENO) < 0)
err_syserr("Failed to set standard I/O in child\n");
close(to_child[0]);
close(to_child[1]);
close(fr_child[0]);
close(fr_child[1]);
char *args[] = { cmd_for_child, 0 };
execv(args[0], args);
err_syserr("Failed to execute %s", args[0]);
/* NOTREACHED */
}
static void be_parental(int to_child[2], int fr_child[2], int pid)
{
printf("Parent PID: %d\n", (int)getpid());
close(to_child[0]);
close(fr_child[1]);
int o_len = sizeof(msg_for_child) - 1; // Don't send null byte
if (write(to_child[1], msg_for_child, o_len) != o_len)
err_syserr("Failed to write complete message to child\n");
close(to_child[1]);
char buffer[4096];
int nbytes;
if ((nbytes = read(fr_child[0], buffer, sizeof(buffer))) <= 0)
err_syserr("Failed to read message from child\n");
close(fr_child[0]);
printf("Read: [[%.*s]]\n", nbytes, buffer);
int corpse;
int status;
while ((corpse = waitpid(pid, &status, 0)) != pid && corpse != -1)
err_syserr("Got pid %d (status 0x%.4X) instead of pid %d\n",
corpse, status, pid);
printf("PID %d exited with status 0x%.4X\n", pid, status);
}
static void err_syserr(const char *fmt, ...)
{
int errnum = errno;
va_list args;
va_start(args, fmt);
vfprintf(stderr, fmt, args);
va_end(args);
if (errnum != 0)
fprintf(stderr, "(%d: %s)\n", errnum, strerror(errnum));
exit(EXIT_FAILURE);
}
myprogram.c
#include <stdio.h>
int main(void)
{
char buffer[4096];
char *response[] =
{
"4",
"hello world!",
};
enum { N_RESPONSES = sizeof(response)/sizeof(response[0]) };
for (int line = 0; fgets(buffer, sizeof(buffer), stdin) != 0; line++)
{
fprintf(stderr, "Read line %d: %s", line + 1, buffer);
if (line < N_RESPONSES)
{
printf("%s\n", response[line]);
fprintf(stderr, "Sent line %d: %s\n", line + 1, response[line]);
}
}
fprintf(stderr, "All done\n");
return 0;
}
Example output
Note that there is no guarantee that the child will complete before the parent starts executing the be_parental() function.
Child PID: 19538
Read line 1: calculate 2 + 2
Sent line 1: 4
Read line 2: say 'hello world!'
Sent line 2: hello world!
All done
Parent PID: 19536
Read: [[4
hello world!
]]
PID 19538 exited with status 0x0000
Process 19536 continues and exits
You can use expect to achieve this:
http://en.wikipedia.org/wiki/Expect
This is what a usual expect program would do:
# Start the program
spawn <your_program>
# Send data to the program
send "calculate 2 + 2"
# Capture the output
set results $expect_out(buffer)
Expect can be used inside C programs using expect development library, so you can translate previous commands directly into C function calls. Here you have an example:
http://kahimyang.info/kauswagan/code-blogs/1358/using-expect-script-cc-library-to-manage-linux-hosts
You can also use it from Perl and Python, which are usually easier to program for this type of purpose than C.

fork multiple child processes to run other programs

From a parent program (called daemon), I want to start 5 child processes of a test program with arguments (all 5 in parallel, without waiting for them to finish).
I have the following code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc,char* argv[]){
//missing irrelevant part where argum is set
int status,i;
char cmd[512];
pid_t process_id = 0;
for (i=0; i<=5;i++)
{
process_id = fork();
if (process_id < 0)
{
printf("fork failed - %d!\n",i);
continue;
}
else if(process_id > 0) {
printf("process_id of child process %d \n", process_id);
}
else
{
sprintf(cmd,"./test %s",argum);
status = system(cmd);
exit(0);
}
}
return 0;
}
it starts them but when I run ps -aux to see the processes, besides the good ones (like: ./test [args]) there are some duplicates like: sh -c ./test [args]
How can I get rid of those starting with "sh -c" ?
Instead of calling system() from the child, use a member of the exec*() family of functions.
Calling execXYZ() from the fork()ed off child process replaces the child process by the new process created from what had been passed to the execXYZ() call.
Please note that if execXYZ() succeeds it does not return.
Example for executing /bin/ls -alrt *.c:
The execl*() members of the family expect each white-space separated command line option as a single parameter.
execl("/bin/ls", "ls", "-alrt", "*.c", (char*) 0);
execlp("ls", "ls", "-alrt", "*.c", (char*) 0);
The execv*() members of the family expect the white-space separated command line options as an array, in the way parameters are passed to main():
char * const argv[] = {
"ls",
"-alrt",
"*.c",
NULL,
};
execv("/bin/ls", argv);
execvp("ls", argv);
The exec*p() family members make use of the environment variable PATH to search for the binary to be executed. So for this example (as for the system command ls) the path does not need to be specified.
A test program:
#include <unistd.h>
#include <stdio.h>
/* This should list the current working directory. */
int main(void)
{
execl("/bin/ls", "ls", "-al", "-rt", (char*) 0);
perror("execl() failed");
return 0;
}
The simplest way to lose sight of the sh -c entries is:
sprintf(cmd, "exec ./test %s", argum);
The exec replaces the shell run by system() with the command, instead of having the shell hang around until the ./test process terminates.
The alternative is outlined by alk in his answer — use the exec*() family of functions (system calls).
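Applied to the loop in the question, the exec*() approach might be sketched like this. The helper name spawn_copies() is made up for illustration; the argument vector would be something like { "./test", argum, 0 } in the asker's program, which assumes argum is a single argument rather than several words.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start n copies of argv[0] in parallel, without an
 * intermediate shell.  The process IDs of the children that were started
 * are stored in pids; the return value says how many there are. */
static int spawn_copies(int n, char *const argv[], pid_t *pids)
{
    assert(n >= 0);
    int started = 0;
    for (int i = 0; i < n; i++)
    {
        pid_t pid = fork();
        if (pid < 0)
        {
            perror("fork");
            continue;
        }
        if (pid == 0)
        {
            /* Child: becomes the program itself, so ps shows no "sh -c" */
            execvp(argv[0], argv);
            perror(argv[0]);    /* only reached if execvp() failed */
            _exit(127);
        }
        pids[started++] = pid;  /* parent: remember the child, do not wait */
    }
    return started;
}
```

The parent does not wait inside the loop, so all children run in parallel; keeping the pids lets it call waitpid() later to avoid leaving zombies behind.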

Unix pipe - reading data from stdin in the child descriptor

I'm trying to implement unix piping in c (i.e. execute ls | wc). I have found a related solution to my problem (C Unix Pipes Example) however, I am not sure why a specific portion of the solved code snippet works.
Here's the code:
/* Run WC. */
int filedes[2];
pipe(filedes);
/* Run LS. */
pid_t pid = fork();
if (pid == 0) {
/* Set stdout to the input side of the pipe, and run 'ls'. */
dup2(filedes[1], 1);
char *argv[] = {"ls", NULL};
execv("/bin/ls", argv);
} else {
/* Close the input side of the pipe, to prevent it staying open. */
close(filedes[1]);
}
/* Run WC. */
pid = fork();
if (pid == 0) {
dup2(filedes[0], 0);
char *argv[] = {"wc", NULL};
execv("/usr/bin/wc", argv);
}
In the child process that executes the wc command, though it attaches stdin to a file descriptor, it seems that we are not explicitly reading the output produced by ls in the first child process. Thus, to me it seems that ls and wc are each run independently, as we are not explicitly using the output of ls when executing wc. How then does this code work (i.e. it executes ls | wc)?
The code shown just about works (it cuts a number of corners, but it works) because the forked children ensure that the file descriptor that the executed process will write to (in the case of ls) and read from (in the case of wc) is the appropriate end of the pipe. You don't have to do any more; standard input is file descriptor 0, so wc with no (filename) arguments reads from standard input. ls always writes to standard output, file descriptor 1, unless it is writing an error message.
There are three processes in the code snippet; the parent process and two children, one from each fork().
The parent process should be closing both its ends of the pipe too; it only closes one.
In general, after you do a dup() or dup2() call on a pipe file descriptor, you should close both ends of the pipe. You get away with it here because ls generates data and terminates; you wouldn't in all circumstances.
The comment:
/* Set stdout to the input side of the pipe, and run 'ls'. */
is inaccurate; you're setting stdout to the output side of the pipe, not the input side.
You should have an error exit after the execv() calls; if they fail, they return, and the process can wreak havoc (for example, if the ls fails, you end up with two copies of wc running).
An SSCCE
Note the careful closing of both ends of the pipe in each of the processes. The parent process has no use for the pipe once it has launched both children. I left the code which closes filedes[1] early in place (but removed it from an explicit else block since the following code was also only executed if the else was executed). I might well have kept pairs of closes() in each of the three code paths where files need to be closed.
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>
int main(void)
{
int filedes[2];
int corpse;
int status;
pipe(filedes);
/* Run LS. */
pid_t pid = fork();
if (pid == 0)
{
/* Set stdout to the output side of the pipe, and run 'ls'. */
dup2(filedes[1], 1);
close(filedes[1]);
close(filedes[0]);
char *argv[] = {"ls", NULL};
execv("/bin/ls", argv);
fprintf(stderr, "Failed to execute /bin/ls\n");
exit(1);
}
/* Close the write end of the pipe in the parent, to prevent it staying open. */
close(filedes[1]);
/* Run WC. */
pid = fork();
if (pid == 0)
{
/* Set stdin to the input side of the pipe, and run 'wc'. */
dup2(filedes[0], 0);
close(filedes[0]);
char *argv[] = {"wc", NULL};
execv("/usr/bin/wc", argv);
fprintf(stderr, "Failed to execute /usr/bin/wc\n");
exit(1);
}
close(filedes[0]);
while ((corpse = waitpid(-1, &status, 0)) > 0)
printf("PID %d died 0x%.4X\n", corpse, status);
return(0);
}
Example output:
$ ./pipes-14312939
32 32 389
PID 75954 died 0x0000
PID 75955 died 0x0000
$

Resources