Difference between pipe system call and reading/writing to stdin/out

A pipe connects the stdout of one process to the stdin of another: https://superuser.com/a/277327
Here is a simple program to take input from stdin and print it:
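#include <stdio.h>

int main(void) {
    char str[100];
    /* fgets() replaces gets(), which was removed in C11 because it cannot bound the read */
    if (fgets(str, sizeof str, stdin) != NULL)
        fputs(str, stdout);
    return 0;
}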
I can use a unix pipe to pass the input from another process:
echo "hi" | ./a.out
My question is, what is the difference between the simple code above and using the pipe() system call? Does the system call essentially do the same job without writing to the terminal? More on Pipes: https://tldp.org/LDP/lpg/node11.html

The pipe() system call gives you two file descriptors (one for reading and one for writing) for a channel (a pipe) that streams bytes between processes. Here is an example where a parent process creates a pipe and its child writes to it so the parent can read from it:
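#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int fd[2];
    if (pipe(fd) == -1)
        return 1;
    pid_t pid = fork();
    if (pid == -1)
        return 1;
    if (pid == 0) { // Child:
        close(fd[0]); // Close reading descriptor as it's not needed
        write(fd[1], "Hello", 5);
        close(fd[1]);
    } else { // Parent:
        char buf[5];
        close(fd[1]); // Close writing descriptor as it's not needed
        ssize_t n = read(fd[0], buf, sizeof buf); // Read the data sent by the child through the pipe
        if (n > 0)
            write(1, buf, (size_t)n); // Print the data that's been read to stdout
        close(fd[0]);
        wait(NULL); // Reap the child
    }
    return 0;
}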
When a shell encounters the pipe (|) operator, it does use the pipe() system call, but it also does additional work to redirect the left operand's stdout and the right operand's stdin to the pipe. Here's a simplified example of what the shell would do for the command echo "hi" | ./a.out (keep in mind that dup() returns the lowest-numbered unused file descriptor, which is why closing descriptor 1 or 0 just before calling it makes the copy land there):
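#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int fd[2];
    pipe(fd);
    pid_t pid_echo = fork();
    if (pid_echo == 0) {
        // Close reading descriptor as it's not needed
        close(fd[0]);
        // Close standard output
        close(1);
        // Replace standard output with the pipe by duplicating its writing descriptor
        dup(fd[1]);
        // The original writing descriptor is no longer needed
        close(fd[1]);
        // Execute echo;
        // now when echo prints to stdout it will actually print to the pipe
        // because now file descriptor 1 belongs to the pipe
        execlp("echo", "echo", "hi", (char*)NULL);
        exit(-1);
    }
    pid_t pid_aout = fork();
    if (pid_aout == 0) {
        // Close writing descriptor as it's not needed;
        // while it stays open, a.out can never see end of file on its stdin
        close(fd[1]);
        // Close standard input
        close(0);
        // Replace standard input with the pipe by duplicating its reading descriptor
        dup(fd[0]);
        // The original reading descriptor is no longer needed
        close(fd[0]);
        // Execute a.out;
        // Now when a.out reads from stdin it will actually read from the pipe
        // because now file descriptor 0 belongs to the pipe
        execl("./a.out", "./a.out", (char*)NULL);
        exit(-1);
    }
    // The shell itself uses neither end of the pipe
    close(fd[0]);
    close(fd[1]);
    // Wait for both commands to finish
    waitpid(pid_echo, NULL, 0);
    waitpid(pid_aout, NULL, 0);
    return 0;
}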

A pipe is an inter-process communication mechanism that leverages I/O redirection. However, pipes are not involved in all I/O redirection.
Since child processes may inherit file descriptors from their parent process, a parent process may change what files the child's standard streams point to, unbeknownst to the child process. This is I/O redirection.
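To illustrate redirection without any pipe, here is a minimal sketch in which a program sends a command's stdout to a file. The helper name run_with_stdout_to_file and its return convention are assumptions made for this example; the executed command only ever sees an ordinary file descriptor 1.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its stdout redirected to `path`.
 * Returns the command's exit status, or -1 on failure. */
int run_with_stdout_to_file(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child, before exec: open the file and make it descriptor 1. */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(127);
        if (fd != STDOUT_FILENO) {
            if (dup2(fd, STDOUT_FILENO) < 0)
                _exit(127);
            close(fd);
        }
        execvp(argv[0], argv);
        _exit(127); /* only reached if execvp() failed */
    }

    /* Parent: its own stdout is untouched; just wait for the child. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with the argument vector {"echo", "hi", NULL} should leave the line hi in the named file, which is roughly what a shell does for echo hi > file.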

Related

Using pipes write input to cat command

I am new to using pipes and forking in general. What I want to do is create a program that will execute the "cat" function in bash indirectly such that I can send input to cat through my program and receive the output in a text file.
I am having two problems:
Using the execvp function, is there a way of running "cat" without being forced to interact with the prompts, and instead send input through C?
The other issue is capturing the output from cat and writing it to a text file.
For instance, if I wrote something like
send_cat("hi");
send_cat("hello");
Then in the text file it would read
hi
hello

The solution to the problem you're describing involves the use of the pipe() system call and the dup2 system call.
Basically, you'd set up a pipe() between the parent and child processes, and then use dup2 in the child that runs cat so that its stdin comes from the read end of the pipe the parent writes to. Your solution should do something similar for stdout: use dup2 to redirect the stdout of the execvp child process to wherever the output should go.
Edit: There was a bit of hand-waving done in the above explanation, and you caught me in an extremely generous mood, so such a program structure might look like this:
Edit 2: I first tried writing this example program with cat instead of echo, but then I realized that you'd need some way to signal end of file to the cat process (EOF is not a byte you can send, so writing a '\0' is ineffective).
int pipefd[2];
int result = pipe(pipefd);
if (result < 0) {
// pipe error
perror("pipe failure");
exit(1);
}
// Redirect the program's stdout and stdin to go to and from the pipe, respectively.
// This means that "echo"'s output will go to the pipe, and when "echo" finishes and we return execution to the parent process, we'll be able to read the information that "echo" just output from that pipe
// Save the current stdin and stdout first so they can be restored after the redirection
int savedStdin = dup(0);
int savedStdout = dup(1);
// Redirect stdin to come from the pipe
if ( dup2(pipefd[0], 0) < 0 ) {
perror("dup2 error");
exit(1);
}
// Close the read end of the pipe because the original descriptor was duplicated
close(pipefd[0]);
// Redirect stdout to go to the pipe
if ( dup2(pipefd[1], 1) < 0 ) {
perror("dup2 error");
exit(1);
}
// Close the write end of the pipe because the original descriptor was duplicated
close(pipefd[1]);
if ( fork() == 0 ) {
// Child process, will call "echo" and die
execlp("echo", "echo", "Hello_world!", NULL);
// The program should never ever get to this point, ever
// but if it does, we need to handle it
exit(1);
} else {
// Parent process, we need to wait for "echo" to terminate
wait(NULL);
// At this point stdout and stdin are still coming to/from the pipe, so if we do something like cin >> s, that will read from the pipe
// First, let's restore stdout to what it was before we redirected it, so that we can print the output of "echo" to the terminal
if (dup2(savedStdout, 1) < 0 ) {
perror("dup2 error");
exit(1);
}
close(savedStdout);
string s;
// Now we're going to read from stdin (the pipe) and print to stdout (the terminal, if you're running this from the command-line)
while (cin >> s) printf("%s\n", s.c_str() );
// We've read everything from "echo", let's fix stdin now
if (dup2(savedStdin, 0) < 0 ) {
perror("dup2 error");
exit(1);
}
close(savedStdin);
}
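For the cat case from the question, a sketch along the following lines may work: the parent writes each string into a pipe, and closing the write end is what tells cat that its input has ended. The helper name send_lines_to_cat is hypothetical, and the sketch assumes descriptors 0 to 2 are already open and that the lines are short enough for a single write() each.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: feed `n` strings to cat, one per line, and let cat
 * write them to `path`. Returns 0 on success, -1 on failure. */
int send_lines_to_cat(const char *const lines[], size_t n, const char *path)
{
    int pipefd[2];
    if (pipe(pipefd) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: stdin comes from the pipe, stdout goes to the file. */
        int out = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (out < 0 || dup2(pipefd[0], STDIN_FILENO) < 0 ||
            dup2(out, STDOUT_FILENO) < 0)
            _exit(127);
        close(out);
        close(pipefd[0]);
        close(pipefd[1]); /* otherwise cat would never see end of file */
        execlp("cat", "cat", (char *)NULL);
        _exit(127);
    }

    /* Parent: write the lines, then close the write end to signal EOF. */
    close(pipefd[0]);
    int ok = 0;
    for (size_t i = 0; i < n && ok == 0; i++) {
        if (write(pipefd[1], lines[i], strlen(lines[i])) < 0 ||
            write(pipefd[1], "\n", 1) < 0)
            ok = -1;
    }
    close(pipefd[1]);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return ok;
}
```

Passing the two strings "hi" and "hello" should produce a file containing those two lines, matching the send_cat example in the question.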

Why do we need to call close on pipes before execvp?

I've been trying to implement shell-like functionality with pipes in an application and I'm following this example. I will reproduce the code here for future reference in case the original is removed:
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
/**
* Executes the command "cat scores | grep Villanova | cut -b 1-10".
* This quick-and-dirty version does no error checking.
*
* #author Jim Glenn
* #version 0.1 10/4/2004
*/
int main(int argc, char **argv)
{
int status;
int i;
// arguments for commands; your parser would be responsible for
// setting up arrays like these
char *cat_args[] = {"cat", "scores", NULL};
char *grep_args[] = {"grep", "Villanova", NULL};
char *cut_args[] = {"cut", "-b", "1-10", NULL};
// make 2 pipes (cat to grep and grep to cut); each has 2 fds
int pipes[4];
pipe(pipes); // sets up 1st pipe
pipe(pipes + 2); // sets up 2nd pipe
// we now have 4 fds:
// pipes[0] = read end of cat->grep pipe (read by grep)
// pipes[1] = write end of cat->grep pipe (written by cat)
// pipes[2] = read end of grep->cut pipe (read by cut)
// pipes[3] = write end of grep->cut pipe (written by grep)
// Note that the code in each if is basically identical, so you
// could set up a loop to handle it. The differences are in the
// indices into pipes used for the dup2 system call
// and that the 1st and last only deal with the end of one pipe.
// fork the first child (to execute cat)
if (fork() == 0)
{
// replace cat's stdout with write part of 1st pipe
dup2(pipes[1], 1);
// close all pipes (very important!); end we're using was safely copied
close(pipes[0]);
close(pipes[1]);
close(pipes[2]);
close(pipes[3]);
execvp(*cat_args, cat_args);
}
else
{
// fork second child (to execute grep)
if (fork() == 0)
{
// replace grep's stdin with read end of 1st pipe
dup2(pipes[0], 0);
// replace grep's stdout with write end of 2nd pipe
dup2(pipes[3], 1);
// close all ends of pipes
close(pipes[0]);
close(pipes[1]);
close(pipes[2]);
close(pipes[3]);
execvp(*grep_args, grep_args);
}
else
{
// fork third child (to execute cut)
if (fork() == 0)
{
// replace cut's stdin with read end of 2nd pipe
dup2(pipes[2], 0);
// close all ends of pipes
close(pipes[0]);
close(pipes[1]);
close(pipes[2]);
close(pipes[3]);
execvp(*cut_args, cut_args);
}
}
}
// only the parent gets here and waits for 3 children to finish
close(pipes[0]);
close(pipes[1]);
close(pipes[2]);
close(pipes[3]);
for (i = 0; i < 3; i++)
wait(&status);
}
I have trouble understanding why the pipes are being closed just before calling execvp and reading or writing any data. I believe it has something to do with passing EOF flags to processes so that they can stop reading or writing; however, I don't see how that helps before any actual data is pushed to the pipe. I'd appreciate a clear explanation. Thanks.

I have trouble understanding why the pipes are being closed just before calling execvp and reading or writing any data.
The pipes are not being closed. Rather, some file descriptors associated with the pipe ends are being closed. Each child process is duping pipe-end file descriptors onto one or both of its standard streams, then closing all pipe-end file descriptors that it is not actually going to use, which is all of the ones stored in the pipes array. Each pipe itself remains open and usable as long as each end is open in at least one process, and each child process holds at least one end of one pipe open. Those are closed when the child processes terminate (or at least under the control of the child processes, post execvp()).
One reason to perform such closures is for tidiness and resource management. There is a limit on how many file descriptors a process may have open at once, so it is wise to avoid leaving unneeded file descriptors open.
But also, functionally, a process reading from one of the pipes will not detect end of file until all open file descriptors associated with the write end of the pipe, in any process, are closed. That's what EOF on a pipe means, and it makes sense because as long as the write end is open anywhere, it is possible that more data will be written to it.
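The end-of-file rule can be observed inside a single process. The following sketch (the function names poll_pipe_once and demo_pipe_eof are made up for this illustration) puts the read end in non-blocking mode so that an empty pipe with a live writer reports EAGAIN instead of blocking, then compares that with the result after the last write descriptor is closed.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to read one byte from a non-blocking read end.
 * Returns 1 if a byte was read, 0 on end of file, -1 if the pipe is empty
 * but some write end is still open, -2 on any other error. */
int poll_pipe_once(int read_fd)
{
    char byte;
    ssize_t n = read(read_fd, &byte, 1);
    if (n > 0)
        return 1;
    if (n == 0)
        return 0;
    return (errno == EAGAIN || errno == EWOULDBLOCK) ? -1 : -2;
}

/* Records three observations: out[0] with data pending, out[1] with the
 * pipe empty but the writer open, out[2] after the writer is closed. */
int demo_pipe_eof(int out[3])
{
    int fd[2];
    if (pipe(fd) < 0)
        return -1;

    int flags = fcntl(fd[0], F_GETFL);
    if (flags < 0 || fcntl(fd[0], F_SETFL, flags | O_NONBLOCK) < 0 ||
        write(fd[1], "x", 1) != 1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    out[0] = poll_pipe_once(fd[0]); /* data available */
    out[1] = poll_pipe_once(fd[0]); /* empty, but a writer still exists */
    close(fd[1]);                   /* last write descriptor goes away */
    out[2] = poll_pipe_once(fd[0]); /* now the reader sees end of file */
    close(fd[0]);
    return 0;
}
```

In a blocking setup the middle case is the one that hangs: a reader such as grep simply waits for more data for as long as any process still holds the write end open.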

how does popen2() work in c?

I'm trying to execute the md5sum command in my program using pipe, fork and dup. I found some code that runs successfully, but I can't understand some lines of it. Here is my code:
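#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define READ 0
#define WRITE 1

pid_t popen2(const char *command, int *infp, int *outfp);

int main(void)
{
    int infp, outfp;
    char buf[128];
    if (popen2("md5sum", &infp, &outfp) <= 0)
    {
        printf("Unable to exec md5sum\n");
        exit(1);
    }
    write(infp, "hello\n", 6);
    close(infp); // md5sum sees end of file on its stdin and prints the digest
    ssize_t n = read(outfp, buf, sizeof(buf) - 1);
    buf[n > 0 ? n : 0] = '\0'; // read() does not terminate the buffer
    printf("buf = '%s'\n", buf);
    return 0;
}

pid_t popen2(const char *command, int *infp, int *outfp)
{
    int p_stdin[2], p_stdout[2];
    pid_t pid;
    if (pipe(p_stdin) != 0 || pipe(p_stdout) != 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return pid;
    if (pid == 0)
    {
        close(p_stdin[WRITE]);
        dup2(p_stdin[READ], READ);
        close(p_stdout[READ]);
        dup2(p_stdout[WRITE], WRITE);
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        perror("execl");
        exit(1);
    }
    else
    {
        // The parent uses neither the child's read end nor the child's write end
        close(p_stdin[READ]);
        close(p_stdout[WRITE]);
        if (infp == NULL)
            close(p_stdin[WRITE]);
        else
            *infp = p_stdin[WRITE];
        if (outfp == NULL)
            close(p_stdout[READ]);
        else
            *outfp = p_stdout[READ];
    }
    return pid;
}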
I don't understand the popen2 function. What exactly does this line do?
*infp = p_stdin[WRITE];
How can pipes communicate with each other?

I don't understand the popen2 function.
How can pipes communicate with each other?
pipe(): A pipe is a unidirectional byte-stream buffer in the kernel. Because it is a byte stream, a writer can write an arbitrary number of bytes and a reader can read out an arbitrary number of bytes. Reads are strictly sequential; seeking (as with lseek) is not possible. Data written into the pipe is buffered in the kernel until it is read from the read end, and if the pipe fills up, write blocks.
Suppose fd is an integer array of two file descriptors (int fd[2]). The pipe(fd) system call creates a pipe and returns a pair of file descriptors such that fd[1] is the write end (compare stdout, which is 1) and fd[0] is the read end (compare stdin, which is 0). Unlike a named pipe (a FIFO, which has a name in the file system), an anonymous pipe can be used only between related processes such as parent and child. So fork is called to duplicate these two descriptors into the child; the parent thereby shares the pipe with the child, and either the child writes to the write end while the parent reads from the read end, or the parent writes and the child reads. Each side should take care to close the descriptor it does not use: the unused read end (fd[0]) or the unused write end (fd[1]), depending on the scenario.
popen(): popen lets you invoke another program as a new process and either transmit data to it or receive data from it. The direction of data flow is determined by the second argument. You need not create a child process manually, because popen forks, starts a shell, and executes the command string passed to it. It also sets up the appropriate read or write stream between parent and child based on the type argument.
Thus, popen() simplifies things: it avoids the need to call pipe, fork and exec yourself and sets up the stream between parent and child for you. The downside is that every invocation of popen() creates an extra process, because a shell is started each time in addition to the program being run, which increases resource consumption.
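As a small sketch of the popen() convenience described above, the following function captures a command's stdout in "r" mode. The name read_command_output and the truncation behavior are assumptions of this example, not part of any library.

```c
#define _POSIX_C_SOURCE 200809L /* for popen()/pclose() under strict ISO C modes */
#include <stdio.h>

/* Hypothetical helper: run `cmd` through popen() in "r" mode and copy up to
 * cap-1 bytes of its stdout into `out`, always NUL-terminated. Returns the
 * value of pclose() (the command's wait status), or -1 on failure. */
int read_command_output(const char *cmd, char *out, size_t cap)
{
    if (cap == 0)
        return -1;

    /* popen() creates the pipe, forks, and runs `cmd` via the shell. */
    FILE *fp = popen(cmd, "r");
    if (fp == NULL)
        return -1;

    size_t n = fread(out, 1, cap - 1, fp);
    out[n] = '\0';

    /* pclose() closes the stream and waits for the shell to finish. */
    return pclose(fp);
}
```

For example, read_command_output("echo hello", buf, sizeof buf) should leave "hello\n" in buf. A single popen() stream carries data in one direction only, which is why the popen2 code in the question builds two pipes by hand.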

pipe() and fork() in c

I need to create two child processes. One child needs to run the command "ls -al" and redirect its output to the input of the next child process, which in turn will run the command "sort -r -n -k 5" on its input data. Finally, the parent process needs to read that (data already sorted) and display it in the terminal. The final result in the terminal (when executing the program) should be the same as if I entered the following command directly in the shell: "ls -al | sort -r -n -k 5". For this I need to use the following methods: pipe(), fork(), execlp().
My program compiles, but I don't get the desired output to the terminal. I don't know what is wrong. Here is the code:
#include <sys/types.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
int main()
{
int fd[2];
pid_t ls_pid, sort_pid;
char buff[1000];
/* create the pipe */
if (pipe(fd) == -1) {
fprintf(stderr, "Pipe failed");
return 1;
}
/* create child 2 first */
sort_pid = fork();
if (sort_pid < 0) { // error creating Child 2 process
fprintf(stderr, "\nChild 2 Fork failed");
return 1;
}
else if(sort_pid > 0) { // parent process
wait(NULL); // wait for children termination
/* create child 1 */
ls_pid = fork();
if (ls_pid < 0) { // error creating Child 1 process
fprintf(stderr, "\nChild 1 Fork failed");
return 1;
}
else if (ls_pid == 0) { // child 1 process
close(1); // close stdout
dup2(fd[1], 1); // make stdout same as fd[1]
close(fd[0]); // we don't need this end of pipe
execlp("bin/ls", "ls", "-al", NULL);// executes ls command
}
wait(NULL);
read(fd[0], buff, 1000); // parent reads data
printf(buff); // parent prints data to terminal
}
else if (sort_pid == 0) { // child 2 process
close(0); // close stdin
dup2(fd[0], 0); // make stdin same as fd[0]
close(fd[1]); // we don't need this end of pipe
execlp("bin/sort", "sort", "-r", "-n", "-k", "5", NULL); // executes sort operation
}
return 0;
}

Your parent process waits for the sort process to finish before creating the ls process.
The sort process needs to read its input before it can finish. And its input is coming from the ls that won't be started until after the wait. Deadlock.
You need to create both processes, then wait for both of them.
Also, your file descriptor manipulations aren't quite right. In this pair of calls:
close(0);
dup2(fd[0], 0);
the close is redundant, since dup2 will automatically close the existing fd 0 if there is one. You should do a close(fd[0]) after the dup2, so you only have one file descriptor tied to that end of the pipe. And if you want to be really robust, you should test whether fd[0]==0 already, and in that case skip the dup2 and close.
Apply all of that to the other dup2 also.
Then there's the issue of the parent process holding the pipe open. I'd say you should close both ends of the pipe in the parent after you've passed them on to the children, but you have that weird read from fd[0] after the last wait... I'm not sure why that's there. If the ls|sort pipeline has run correctly, the pipe will be empty afterward, so there will be nothing to read. In any case, you definitely need to close fd[1] in the parent, otherwise the sort process won't finish because the pipe won't indicate EOF until all writers are closed.
After the weird read is a printf that will probably crash, since the read buffer won't be '\0'-terminated.
And the point of using execlp is that it does the $PATH lookup for you so you don't have to specify /bin/. My first test run failed because my sort is in /usr/bin/. Why hardcode paths when you don't have to?
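Putting these points together, a two-stage pipeline might be structured as in the sketch below: both children are created first, the parent closes both ends of the pipe, and only then waits. The function name run_pipeline is hypothetical, and unlike the original assignment this sketch lets the right-hand command write straight to the parent's stdout instead of having the parent read the result.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `left | right`. Returns the exit status of
 * `right`, or -1 if a process could not be created or did not exit normally. */
int run_pipeline(char *const left[], char *const right[])
{
    int fd[2];
    if (pipe(fd) < 0)
        return -1;

    pid_t left_pid = fork();
    if (left_pid == 0) {
        dup2(fd[1], STDOUT_FILENO); /* stdout -> write end */
        close(fd[0]);
        close(fd[1]);
        execvp(left[0], left);      /* execvp does the $PATH lookup */
        _exit(127);
    }

    pid_t right_pid = fork();
    if (right_pid == 0) {
        dup2(fd[0], STDIN_FILENO);  /* stdin <- read end */
        close(fd[0]);
        close(fd[1]);
        execvp(right[0], right);
        _exit(127);
    }

    /* Parent: close both ends so `right` sees EOF once `left` finishes. */
    close(fd[0]);
    close(fd[1]);

    int status = 0;
    int right_status = -1;
    if (left_pid > 0)
        waitpid(left_pid, &status, 0);
    if (right_pid > 0 && waitpid(right_pid, &status, 0) == right_pid &&
        WIFEXITED(status))
        right_status = WEXITSTATUS(status);

    return (left_pid < 0 || right_pid < 0) ? -1 : right_status;
}
```

With left set to {"ls", "-al", NULL} and right set to {"sort", "-r", "-n", "-k", "5", NULL}, this should behave like ls -al | sort -r -n -k 5 typed at the shell.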

Can popen() make bidirectional pipes like pipe() + fork()?

I'm implementing piping on a simulated file system in C++ (with mostly C). It needs to run commands in the host shell but perform the piping itself on the simulated file system.
I could achieve this with the pipe(), fork(), and system() system calls, but I'd prefer to use popen() (which handles creating a pipe, forking a process, and passing a command to the shell). This may not be possible because (I think) I need to be able to write from the parent process of the pipe, read on the child process end, write the output back from the child, and finally read that output from the parent. The man page for popen() on my system says a bidirectional pipe is possible, but my code needs to run on a system with an older version supporting only unidirectional pipes.
With the separate calls above, I can open/close pipes to achieve this. Is that possible with popen()?
For a trivial example, to run ls -l | grep .txt | grep cmds I need to:
Open a pipe and process to run ls -l on the host; read its output back
Pipe the output of ls -l back to my simulator
Open a pipe and process to run grep .txt on the host on the piped output of ls -l
Pipe the output of this back to the simulator (stuck here)
Open a pipe and process to run grep cmds on the host on the piped output of grep .txt
Pipe the output of this back to the simulator and print it
man popen
From Mac OS X:
The popen() function 'opens' a process by creating a bidirectional pipe, forking, and invoking the shell. Any streams opened by previous popen() calls in the parent process are closed in the new child process. Historically, popen() was implemented with a unidirectional pipe; hence, many implementations of popen() only allow the mode argument to specify reading or writing, not both. Because popen() is now implemented using a bidirectional pipe, the mode argument may request a bidirectional data flow.
The mode argument is a pointer to a null-terminated string which must be 'r' for reading, 'w' for writing, or 'r+' for reading and writing.

I'd suggest writing your own function to do the piping/forking/system-ing for you. You could have the function spawn a process and return read/write file descriptors, as in...
typedef void pfunc_t (int rfd, int wfd);
pid_t pcreate(int fds[2], pfunc_t pfunc) {
/* Spawn a process from pfunc, returning its pid. The fds array passed will
* be filled with two descriptors: fds[0] will read from the child process,
* and fds[1] will write to it.
* Similarly, the child process will receive a reading/writing fd set (in
* that same order) as arguments.
*/
pid_t pid;
int pipes[4];
/* Warning: errors from pipe() are not handled here */
pipe(&pipes[0]); /* Parent read/child write pipe */
pipe(&pipes[2]); /* Child read/parent write pipe */
pid = fork();
if (pid < 0) {
    /* fork failed: release all four descriptors */
    close(pipes[0]);
    close(pipes[1]);
    close(pipes[2]);
    close(pipes[3]);
    return -1;
}
if (pid > 0) {
    /* Parent process */
    fds[0] = pipes[0];
    fds[1] = pipes[3];
    close(pipes[1]);
    close(pipes[2]);
    return pid;
}
/* Child process */
close(pipes[0]);
close(pipes[3]);
pfunc(pipes[2], pipes[1]);
exit(0);
}
You can add whatever functionality you need in there.

You seem to have answered your own question. If your code needs to work on an older system that doesn't support popen opening bidirectional pipes, then you won't be able to use popen (at least not the one that's supplied).
The real question would be about the exact capabilities of the older systems in question. In particular, does their pipe support creating bidirectional pipes? If they have a pipe that can create a bidirectional pipe, but a popen that doesn't, then I'd write the main stream of the code to use popen with a bidirectional pipe, and supply an implementation of popen that can use a bidirectional pipe, which gets compiled in and used where needed.
If you need to support systems old enough that pipe only supports unidirectional pipes, then you're pretty much stuck with using pipe, fork, dup2, etc., on your own. I'd probably still wrap this up in a function that works almost like a modern version of popen, but instead of returning one file handle, fills in a small structure with two file handles, one for the child's stdin, the other for the child's stdout.

POSIX stipulates that the popen() call is not designed to provide bi-directional communication:
The mode argument to popen() is a string that specifies I/O mode:
If mode is r, when the child process is started, its file descriptor STDOUT_FILENO shall be the writable end of the pipe, and the file descriptor fileno(stream) in the calling process, where stream is the stream pointer returned by popen(), shall be the readable end of the pipe.
If mode is w, when the child process is started its file descriptor STDIN_FILENO shall be the readable end of the pipe, and the file descriptor fileno(stream) in the calling process, where stream is the stream pointer returned by popen(), shall be the writable end of the pipe.
If mode is any other value, the result is unspecified.
Any portable code will make no assumptions beyond that. The BSD popen() is similar to what your question describes.
Additionally, pipes are different from sockets and each pipe file descriptor is uni-directional. You would have to create two pipes, one configured for each direction.

In one of netresolve backends I'm talking to a script and therefore I need to write to its stdin and read from its stdout. The following function executes a command with stdin and stdout redirected to a pipe. You can use it and adapt it to your liking.
static bool
start_subprocess(char *const command[], int *pid, int *infd, int *outfd)
{
int p1[2], p2[2];
if (!pid || !infd || !outfd)
return false;
if (pipe(p1) == -1)
goto err_pipe1;
if (pipe(p2) == -1)
goto err_pipe2;
if ((*pid = fork()) == -1)
goto err_fork;
if (*pid) {
/* Parent process. */
*infd = p1[1];
*outfd = p2[0];
close(p1[0]);
close(p2[1]);
return true;
} else {
/* Child process. */
dup2(p1[0], 0);
dup2(p2[1], 1);
close(p1[0]);
close(p1[1]);
close(p2[0]);
close(p2[1]);
execvp(*command, command);
/* Error occurred. */
fprintf(stderr, "error running %s: %s\n", *command, strerror(errno));
abort();
}
err_fork:
close(p2[1]);
close(p2[0]);
err_pipe2:
close(p1[1]);
close(p1[0]);
err_pipe1:
return false;
}
https://github.com/crossdistro/netresolve/blob/master/backends/exec.c#L46
(I used the same code in popen simultaneous read and write)

Here's the code (C++, but can be easily converted to C):
#include <unistd.h>
#include <cstdlib>
#include <cstdio>
#include <cstring>
#include <utility>
// Like popen(), but returns two FILE*: child's stdin and stdout, respectively.
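std::pair<FILE *, FILE *> popen2(const char *command)
{
    // pipes[0]: parent writes, child reads (child's stdin)
    // pipes[1]: child writes, parent reads (child's stdout)
    int pipes[2][2];
    if (pipe(pipes[0]) != 0)
        return {nullptr, nullptr};
    if (pipe(pipes[1]) != 0)
    {
        close(pipes[0][0]);
        close(pipes[0][1]);
        return {nullptr, nullptr};
    }
    pid_t pid = fork();
    if (pid < 0)
    {
        // fork failed: release both pipes instead of falling into the child branch
        close(pipes[0][0]);
        close(pipes[0][1]);
        close(pipes[1][0]);
        close(pipes[1][1]);
        return {nullptr, nullptr};
    }
    if (pid > 0)
    {
        // parent
        close(pipes[0][0]);
        close(pipes[1][1]);
        return {fdopen(pipes[0][1], "w"), fdopen(pipes[1][0], "r")};
    }
    // child
    close(pipes[0][1]);
    close(pipes[1][0]);
    dup2(pipes[0][0], STDIN_FILENO);
    dup2(pipes[1][1], STDOUT_FILENO);
    // the originals are no longer needed once duplicated
    close(pipes[0][0]);
    close(pipes[1][1]);
    execl("/bin/sh", "/bin/sh", "-c", command, (char *)NULL);
    _exit(1);
}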
Usage:
int main()
{
auto [p_stdin, p_stdout] = popen2("cat -n");
if (p_stdin == NULL || p_stdout == NULL)
{
printf("popen2() failed\n");
return 1;
}
const char msg[] = "Hello there!";
char buf[32];
printf("I say \"%s\"\n", msg);
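fwrite(msg, 1, strlen(msg), p_stdin); // do not send the terminating '\0'
fclose(p_stdin);
size_t n = fread(buf, 1, sizeof(buf) - 1, p_stdout);
buf[n] = '\0'; // fread() does not terminate the buffer
fclose(p_stdout);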
printf("child says \"%s\"\n", buf);
return 0;
}
Possible Output:
I say "Hello there!"
child says " 1 Hello there!"

No need to create two pipes and waste a filedescriptor in each process. Just use a socket instead. https://stackoverflow.com/a/25177958/894520
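A sketch of that socket-based approach, assuming socketpair() with AF_UNIX is what is meant: one descriptor on each side carries traffic in both directions. The helper name spawn_on_socketpair is invented for this example, and the sketch assumes descriptors 0 to 2 are already open in the caller.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: start argv[0] with both stdin and stdout attached to
 * one end of a socket pair. Stores the child's pid in *pid and returns the
 * parent's end of the pair, or -1 on failure. */
int spawn_on_socketpair(char *const argv[], pid_t *pid)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    *pid = fork();
    if (*pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    if (*pid == 0) {
        /* Child: the same socket serves as stdin and stdout. */
        close(sv[0]);
        if (dup2(sv[1], STDIN_FILENO) < 0 || dup2(sv[1], STDOUT_FILENO) < 0)
            _exit(127);
        close(sv[1]);
        execvp(argv[0], argv);
        _exit(127);
    }

    /* Parent: keep one descriptor for both reading and writing. */
    close(sv[1]);
    return sv[0];
}
```

After writing its input, the parent can call shutdown(fd, SHUT_WR) so the child sees end of file while the parent can still read the reply from the same descriptor; closing the descriptor outright would discard the reply.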
