dup2 parameter order confusion - c

I have written this simple program:
#include<stdio.h>
#include<unistd.h>
#include <fcntl.h>
#include <stdlib.h>
int main(){
int fd = open("theFile.txt", O_CREAT | O_RDWR, 0666);
if(fd<0){
printf("couldn't open file\n");
}
pid_t pid = fork();
if(pid==0){
// dup2(int oldFD, int newFD);
dup2(fd,1);
execlp("/bin/ls","ls","-l", NULL);
}
return 0;
}
What I want is to redirect the output of ls -l to a file called "theFile.txt". The code works as I expect. What confuses me here is the order of the dup2 parameters. I believed the correct order should be dup2(1, fd), considering fd as the newFD and 1 as the oldFD. But the code works when I use dup2(fd, 1), which basically redirects stdout to fd according to some other answers on SO.
How is fd the oldFD here, and how is 1 the newFD? If 1 is newFD, why does this program work in the first place?
Also, execlp overwrites the child's address space after I call dup2. How is dup2 connected to execlp such that the desired result is obtained, i.e. when I do cat theFile.txt I get the current directory listing?
Can I get some explanation here please?

According to [man7]: DUP(2):
int dup2(int oldfd, int newfd);
...
The dup() system call creates a copy of the file descriptor oldfd,
using the lowest-numbered unused file descriptor for the new
descriptor.
...
The dup2() system call performs the same task as dup(), but instead
of using the lowest-numbered unused file descriptor, it uses the file
descriptor number specified in newfd. If the file descriptor newfd
was previously open, it is silently closed before being reused.
When outputting data (e.g. text) to the console, applications use the stdout stream (also stderr, but for simplicity's sake, let's leave that out). stdout's fileno is 1 (it's better to use constants instead of values, as values might change - not very likely in this case, but in general):
cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q048923791$ cat /usr/include/unistd.h | grep STDOUT
#define STDOUT_FILENO 1 /* Standard output. */
In your child process, ls (via execlp) spits its data to stdout (fileno 1). Before that, the dup2 call is made. The current situation, before the dup2 call (for clarity, I'm going to use the defined macro when referring to stdout's fileno):
fd: points to the custom file (opened previously)
STDOUT_FILENO: points to stdout
The dup2 call:
1. dup2(fd, STDOUT_FILENO) (as it is now): closes the current STDOUT_FILENO and duplicates fd to STDOUT_FILENO. Resulting situation:
   fd: points to the custom file
   STDOUT_FILENO: points to the custom file
2. dup2(STDOUT_FILENO, fd): closes the current fd and duplicates STDOUT_FILENO to fd. Resulting situation:
   fd: points to stdout
   STDOUT_FILENO: points to stdout
As seen, for #1., when data is output to stdout, it actually goes to the custom file (as opposed to #2., where it would go to stdout, even when writing to fd).
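One way to check the direction for yourself is a small sketch that leaves stdout alone and uses two ordinary files instead. The function name and the file paths are made up for illustration. After dup2(fd_a, fd_b), the number fd_b is just another name for whatever fd_a refers to, so a write through fd_b lands in the first file:

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: opens two files, makes fd_b a duplicate of fd_a,
 * writes 5 bytes through fd_b and returns the resulting size of file_a
 * (or -1 on error). */
long bytes_in_a_after_dup2(const char *file_a, const char *file_b)
{
    int fd_a = open(file_a, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    int fd_b = open(file_b, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    struct stat st;
    long result = -1;

    if (fd_a >= 0 && fd_b >= 0 &&
        dup2(fd_a, fd_b) == fd_b &&      /* oldfd = fd_a, newfd = fd_b */
        write(fd_b, "hello", 5) == 5 &&  /* goes to file_a, not file_b */
        fstat(fd_a, &st) == 0)
        result = (long)st.st_size;

    if (fd_a >= 0) close(fd_a);
    if (fd_b >= 0) close(fd_b);
    return result;
}
```

Replace fd_a with the descriptor of your file and fd_b with STDOUT_FILENO and you have the call from the question.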
Regarding the 2nd question:
[man7]: EXEC(3):
The exec() family of functions replaces the current process image
with a new process image. The functions described in this manual
page are front-ends for execve(2).
[man7]: EXECVE(2):
By default, file descriptors remain open across an execve().
...
POSIX.1 says that if file descriptors 0, 1, and 2 would
otherwise be closed after a successful execve(), and the process
would gain privilege because the set-user-ID or set-group_ID mode
bit was set on the executed file, then the system may open an
unspecified file for each of these file descriptors. As a general
principle, no portable program, whether privileged or not, can
assume that these three file descriptors will remain closed across
an execve().
The file descriptors will be passed from the child process to ls.
Here's an improved version (minor changes only) of your code (code00.c):
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
int main() {
    int ret = 0, fd = open("thefile.txt", O_CREAT | O_RDWR, 0666);
    if (fd < 0) {
        printf("Couldn't open file: %d\n", errno);
        return 1;  // Nothing to redirect to
    }
    pid_t pid = fork();
    if (pid == 0) {
        // dup2(int oldFD, int newFD);
        if (dup2(fd, STDOUT_FILENO) < 0) {
            fprintf(stderr, "Couldn't redirect stdout: %d\n", errno);
            return 2;
        }
        execlp("/bin/ls", "ls", "-l", NULL);
        // Only reached if execlp failed
        fprintf(stderr, "Couldn't execute ls: %d\n", errno);
        ret = 4;
    } else if (pid < 0) {
        printf("Couldn't spawn child process: %d\n", errno);
        ret = 3;
    }
    return ret;
}

Related

Reading from FIFO after unlink()

I have created a FIFO, wrote to it and unlinked it.
To my surprise I was able to read data from the fifo after unlinking, why is that?
#include <fcntl.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#define MAX_BUF 256
int main()
{
int fd;
char * myfifo = "/tmp/myfifo";
/* create the FIFO (named pipe) */
mkfifo(myfifo, 0666);
int pid = fork();
if (pid != 0)
{
/* write "Hi" to the FIFO */
fd = open(myfifo, O_WRONLY);
write(fd, "Hi", sizeof("Hi"));
close(fd);
/* remove the FIFO */
unlink(myfifo);
}
else
{
wait(NULL);
char buf[MAX_BUF];
/* open, read, and display the message from the FIFO */
fd = open(myfifo, O_RDONLY);
read(fd, buf, MAX_BUF);
printf("Received: %s\n", buf);
close(fd);
return 0;
}
return 0;
}
Unless you pass the O_NONBLOCK flag to open(2), opening a FIFO blocks until the other end is opened. From man 7 fifo:
The FIFO must be opened on both ends (reading and writing) before data
can be passed. Normally, opening the FIFO blocks until the other end
is opened also.
A process can open a FIFO in nonblocking mode. In this case, opening
for read only will succeed even if no-one has opened on the write side
yet, opening for write only will fail with ENXIO (no such device or
address) unless the other end has already been opened.
Which is to say, your parent / child processes are implicitly synchronized upon opening the FIFO. So by the time the parent process calls unlink(2), the child process opened the FIFO long ago. So the child will always find the FIFO object and open it before the parent calls unlink(2) on it.
A note about unlink(2): unlink(2) simply deletes the filename from the filesystem; as long as there is at least one process with the file (FIFO in this case) open, the underlying object will persist. Only after that process terminates or closes the file descriptor will the operating system free the associated resources. FWIW, this is irrelevant in the scope of this question, but it seems worth noting.
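The unlink(2) behaviour can be seen in isolation with a short sketch. It uses a regular file rather than a FIFO (an assumption made to keep the example in a single process; the same rule about names and open descriptors applies), and the function name is invented for illustration:

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: writes "Hi" to a new file, removes the file name,
 * then reads the data back through the descriptor that is still open.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_after_unlink(const char *path, char *out, size_t cap)
{
    int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    ssize_t n = -1;
    if (write(fd, "Hi", 2) == 2 &&
        unlink(path) == 0 &&            /* the name is gone from here on */
        lseek(fd, 0, SEEK_SET) == 0)
        n = read(fd, out, cap);         /* the object is still reachable */

    close(fd);                          /* last reference dropped */
    return n;
}
```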
A couple of other (unrelated) remarks:
Don't call wait(2) in the child. It will return an error (which you promptly ignore), because the child didn't fork any process.
mkfifo(3), fork(2), open(2), read(2), write(2), close(2) and unlink(2) can all fail and return -1. You should gracefully handle possible errors instead of ignoring them. A common strategy for these toy programs is to print a descriptive error message with perror(3) and terminate.
If you just want parent to child communication, use a pipe: it's easier to set up, you don't need to unlink it, and it is not exposed in the filesystem (but you need to create it with pipe(2) before forking, so that the child can access it).
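A minimal sketch of the pipe variant might look like this. The helper name is hypothetical, and reporting the byte count through the child's exit status is only a convenience for the example (it assumes a short message):

```c
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the parent sends msg to a forked child over a pipe.
 * The child reports how many bytes it read through its exit status.
 * Returns that count, or -1 on error. */
int send_to_child(const char *msg)
{
    int p[2];
    if (pipe(p) != 0)                   /* create the pipe before forking */
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {
        char buf[64];
        close(p[1]);                    /* child only reads */
        ssize_t n = read(p[0], buf, sizeof buf);
        _exit(n < 0 ? 255 : (int)n);
    }

    close(p[0]);                        /* parent only writes */
    size_t len = strlen(msg);
    ssize_t written = write(p[1], msg, len);
    close(p[1]);                        /* lets the child see EOF */

    int status;
    if (waitpid(pid, &status, 0) < 0 || written != (ssize_t)len)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Here it is the parent that waits for the child, which is the direction in which wait(2) makes sense.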

How to call UNIX sort command on data in pipe

I am creating a C program and with it I am setting up a pipe between separately forked process for interprocess communication.
The first process has written the data I need into the pipe.
However, with the second process reading from the pipe, I am trying to exec the process to become the UNIX sort command. I want to somehow call sort on the data in the pipe.
How can I call sort on a pipe? On the commandline, I can sort by supplying the filename to sort as a commandline argument e.g. "sort -r MyFileToSort". I know that pipes are essentially considered files, but they are only described by their file descriptor, and as far as I know, sort won't know what to do with a fd.
Thanks for any help/feedback
int p[2];
if (pipe(p) != 0) ...report error and do not continue...
pid_t pid = fork();
if (pid < 0) ...report error, close pipe descriptors, and do not continue...
if (pid == 0)
{
/* Child - becomes sort */
dup2(p[0], 0);
close(p[0]);
close(p[1]);
int fd = open("output-file", O_CREAT | O_EXCL | O_WRONLY, 0644);
if (fd < 0) ...report error and exit...
dup2(fd, 1);
close(fd);
execlp("sort", "sort", (char *)0);
...report error and exit...
}
else
{
/* Parent - writes data to sort */
close(p[0]);
...write data to p[1]...
close(p[1]);
int status;
int corpse;
while ((corpse = wait(&status)) > 0 && corpse != pid)
...consider reporting which child died...
...consider reporting sort status...
...continue with the rest of the program...
}
You can decide whether to report errors related to dup2() failing, or close() failing. There isn't much you can do in either case except report the problem and exit. Unless someone has subjected your program to cruel and unusual punishment by not supplying it with standard input, standard output and standard error (or something elsewhere in the program has closed any of the standard channels), then the pipe and file descriptors can't be the standard I/O descriptors, so the closes are safe. If you're not sure how sick your users are, you might protect the closes:
if (p[0] > STDERR_FILENO)
close(p[0]);
That is normally unnecessarily paranoid (but it can be fun trying programs with missing standard I/O).
You don't need to pass sort any arguments to specify input source or output sink at all in this case. Instead, before execing it, you should attach your pipeline's file descriptors to its stdin (FD 0, if receiving data from a pipe) or stdout (FD 1, if writing data to a pipe), as appropriate.
See the dup2() call, which lets you set the destination to which you're copying a FD, for this purpose. As @JonathanLeffler points out, you'll want to be sure to close the original FDs (after duplicating them to the numbers you want) before your exec call.
Since you've clarified, in comments, that your goal is to write to a file, you would attach FD 1 to that destination file before calling exec, with FD 0 attached to the read end of the pipe carrying the input.
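Putting the pieces together, a compact sketch could look like the following. The helper name and the exit codes 126/127 are choices made for this example, it truncates the output file instead of using O_EXCL, and it assumes descriptors 0-2 are open in the caller and that the data is small enough to write in one call:

```c
#include <fcntl.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: pipes `data` into sort(1), whose stdout goes to
 * `out_path`. Returns sort's exit status, or -1 on a setup error. */
int sort_to_file(const char *data, const char *out_path)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: pipe read end becomes stdin, the file becomes stdout. */
        int fd = open(out_path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
        if (fd < 0 || dup2(p[0], STDIN_FILENO) < 0 ||
            dup2(fd, STDOUT_FILENO) < 0)
            _exit(126);
        close(fd);
        close(p[0]);
        close(p[1]);
        execlp("sort", "sort", (char *)0);
        _exit(127);                     /* only reached if exec failed */
    }

    /* Parent: write the data, then close the write end so sort sees EOF. */
    close(p[0]);
    size_t len = strlen(data);
    ssize_t written = write(p[1], data, len);
    close(p[1]);

    int status;
    if (waitpid(pid, &status, 0) < 0 || written != (ssize_t)len)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

sort never learns that its input is a pipe or that its output is a file; it simply reads FD 0 and writes FD 1.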

Re-opening stdout and stdin file descriptors after closing them

I'm writing a function, which, given an argument, will either redirect the stdout to a file or read the stdin from a file. To do this I close the file descriptor associated with the stdout or stdin, so that when I open the file it opens under the descriptor that I just closed. This works, but the problem is that once this is done, I need to restore the stdout and stdin to what they should really be.
What I can do for stdout is open("/dev/tty",O_WRONLY); But I'm not sure why this works, and more importantly I don't know of an equivalent statement for stdin.
So I have, for stdout
close(1);
if (creat(filePath, 0644) == -1) /* second argument of creat() is the permission mode, not open flags */
{
exit(1);
}
and for stdin
close(0);
if (open(filePath, O_RDONLY) == -1)
{
exit(1);
}
You should use dup() and dup2() to clone a file descriptor.
int stdin_copy = dup(0);
int stdout_copy = dup(1);
close(0);
close(1);
int file1 = open(...);
int file2 = open(...);
/* Do your work. file1 and file2 must be 0 and 1, because open always returns the lowest unused fd. */
close(file1);
close(file2);
dup2(stdin_copy, 0);
dup2(stdout_copy, 1);
close(stdin_copy);
close(stdout_copy);
However, there's a minor detail you might want to be careful with (from man dup):
The two descriptors do not share file descriptor flags (the
close-on-exec flag). The close-on-exec flag (FD_CLOEXEC; see fcntl(2))
for the duplicate descriptor is off.
If this is a problem, you might have to restore the close-on-exec flag, possibly using dup3() instead of dup2() to avoid race conditions.
Also, be aware that if your program is multi-threaded, other threads may accidentally write/read to your remapped stdin/stdout.
I think you can "save" the descriptors before redirecting:
int save_in, save_out;
save_in = dup(STDIN_FILENO);
save_out = dup(STDOUT_FILENO);
Later on you can use dup2 to restore them:
/* Time passes, STDIN_FILENO isn't what it used to be. */
dup2(save_in, STDIN_FILENO);
I am not doing any error checking in that example - you should.
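With the error checking filled in, the save/redirect/restore pattern might be wrapped like this. Both function names are hypothetical, and the fflush() calls are there on the assumption that the program also writes through stdio, so that buffered text ends up on the side of the switch where it was produced:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: points descriptor 1 at `path` and returns a saved
 * copy of the previous stdout (to hand to restore_stdout), or -1. */
int redirect_stdout(const char *path)
{
    fflush(stdout);
    int saved = dup(STDOUT_FILENO);
    if (saved < 0)
        return -1;

    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0 || dup2(fd, STDOUT_FILENO) < 0) {
        if (fd >= 0)
            close(fd);
        close(saved);
        return -1;
    }
    close(fd);                  /* descriptor 1 keeps the file open */
    return saved;
}

/* Hypothetical helper: puts the saved descriptor back on 1 and releases
 * the copy. Returns 0 on success, -1 on error. */
int restore_stdout(int saved)
{
    fflush(stdout);
    int rc = dup2(saved, STDOUT_FILENO);
    close(saved);
    return rc < 0 ? -1 : 0;
}
```

The same two functions work for stdin if STDOUT_FILENO and the open flags are swapped accordingly.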
You could create a child process, and set up the redirection inside the child only. Then wait for the child to terminate, and continue working in the parent process. That way you don't have to worry about reversing your redirection at all.
Just look for examples of code using fork() and wait().

Can popen() make bidirectional pipes like pipe() + fork()?

I'm implementing piping on a simulated file system in C++ (with mostly C). It needs to run commands in the host shell but perform the piping itself on the simulated file system.
I could achieve this with the pipe(), fork(), and system() system calls, but I'd prefer to use popen() (which handles creating a pipe, forking a process, and passing a command to the shell). This may not be possible because (I think) I need to be able to write from the parent process of the pipe, read on the child process end, write the output back from the child, and finally read that output from the parent. The man page for popen() on my system says a bidirectional pipe is possible, but my code needs to run on a system with an older version supporting only unidirectional pipes.
With the separate calls above, I can open/close pipes to achieve this. Is that possible with popen()?
For a trivial example, to run ls -l | grep .txt | grep cmds I need to:
Open a pipe and process to run ls -l on the host; read its output back
Pipe the output of ls -l back to my simulator
Open a pipe and process to run grep .txt on the host on the piped output of ls -l
Pipe the output of this back to the simulator (stuck here)
Open a pipe and process to run grep cmds on the host on the piped output of grep .txt
Pipe the output of this back to the simulator and print it
man popen
From Mac OS X:
The popen() function 'opens' a
process by creating a bidirectional
pipe, forking, and invoking the shell.
Any streams opened by previous popen()
calls in the parent process are closed
in the new child process.
Historically, popen() was implemented
with a unidirectional pipe; hence,
many implementations of popen() only
allow the mode argument to specify
reading or writing, not both. Because
popen() is now implemented using a
bidirectional pipe, the mode argument
may request a bidirectional data flow.
The mode argument is a pointer to a
null-terminated string which must be
'r' for reading, 'w' for writing, or
'r+' for reading and writing.
I'd suggest writing your own function to do the piping/forking/system-ing for you. You could have the function spawn a process and return read/write file descriptors, as in...
typedef void pfunc_t (int rfd, int wfd);
pid_t pcreate(int fds[2], pfunc_t pfunc) {
    /* Spawn a process from pfunc, returning its pid. The fds array passed will
     * be filled with two descriptors: fds[0] will read from the child process,
     * and fds[1] will write to it.
     * Similarly, the child process will receive a reading/writing fd set (in
     * that same order) as arguments.
     */
    pid_t pid;
    int pipes[4];
    /* Warning: I'm not handling possible errors in pipe */
    pipe(&pipes[0]); /* Parent read/child write pipe */
    pipe(&pipes[2]); /* Child read/parent write pipe */
    if ((pid = fork()) > 0) {
        /* Parent process */
        fds[0] = pipes[0];
        fds[1] = pipes[3];
        close(pipes[1]);
        close(pipes[2]);
        return pid;
    } else if (pid == 0) {
        /* Child process */
        close(pipes[0]);
        close(pipes[3]);
        pfunc(pipes[2], pipes[1]);
        exit(0);
    }
    /* fork() failed: release both pipes and report the error */
    close(pipes[0]);
    close(pipes[1]);
    close(pipes[2]);
    close(pipes[3]);
    return -1;
}
You can add whatever functionality you need in there.
You seem to have answered your own question. If your code needs to work on an older system that doesn't support popen opening bidirectional pipes, then you won't be able to use popen (at least not the one that's supplied).
The real question would be about the exact capabilities of the older systems in question. In particular, does their pipe support creating bidirectional pipes? If they have a pipe that can create a bidirectional pipe, but popen that doesn't, then I'd write the main stream of the code to use popen with a bidirectional pipe, and supply an implementation of popen that can use a bidirectional pipe that gets compiled in and used where needed.
If you need to support systems old enough that pipe only supports unidirectional pipes, then you're pretty much stuck with using pipe, fork, dup2, etc., on your own. I'd probably still wrap this up in a function that works almost like a modern version of popen, but instead of returning one file handle, fills in a small structure with two file handles, one for the child's stdin, the other for the child's stdout.
POSIX stipulates that the popen() call is not designed to provide bi-directional communication:
The mode argument to popen() is a string that specifies I/O mode:
If mode is r, when the child process is started, its file descriptor STDOUT_FILENO shall be the writable end of the pipe, and the file descriptor fileno(stream) in the calling process, where stream is the stream pointer returned by popen(), shall be the readable end of the pipe.
If mode is w, when the child process is started its file descriptor STDIN_FILENO shall be the readable end of the pipe, and the file descriptor fileno(stream) in the calling process, where stream is the stream pointer returned by popen(), shall be the writable end of the pipe.
If mode is any other value, the result is unspecified.
Any portable code will make no assumptions beyond that. The BSD popen() is similar to what your question describes.
Additionally, pipes are different from sockets and each pipe file descriptor is uni-directional. You would have to create two pipes, one configured for each direction.
In one of netresolve backends I'm talking to a script and therefore I need to write to its stdin and read from its stdout. The following function executes a command with stdin and stdout redirected to a pipe. You can use it and adapt it to your liking.
static bool
start_subprocess(char *const command[], int *pid, int *infd, int *outfd)
{
int p1[2], p2[2];
if (!pid || !infd || !outfd)
return false;
if (pipe(p1) == -1)
goto err_pipe1;
if (pipe(p2) == -1)
goto err_pipe2;
if ((*pid = fork()) == -1)
goto err_fork;
if (*pid) {
/* Parent process. */
*infd = p1[1];
*outfd = p2[0];
close(p1[0]);
close(p2[1]);
return true;
} else {
/* Child process. */
dup2(p1[0], 0);
dup2(p2[1], 1);
close(p1[0]);
close(p1[1]);
close(p2[0]);
close(p2[1]);
execvp(*command, command);
/* Error occurred. */
fprintf(stderr, "error running %s: %s\n", *command, strerror(errno));
abort();
}
err_fork:
close(p2[1]);
close(p2[0]);
err_pipe2:
close(p1[1]);
close(p1[0]);
err_pipe1:
return false;
}
https://github.com/crossdistro/netresolve/blob/master/backends/exec.c#L46
(I used the same code in popen simultaneous read and write)
Here's the code (C++, but can be easily converted to C):
#include <unistd.h>
#include <cstdlib>
#include <cstdio>
#include <cstring>
#include <utility>
// Like popen(), but returns two FILE*: child's stdin and stdout, respectively.
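std::pair<FILE *, FILE *> popen2(const char *command)
{
    // pipes[0]: parent writes, child reads (child's stdin)
    // pipes[1]: child writes, parent reads (child's stdout)
    int pipes[2][2];
    // Note: errors from pipe() are not handled here
    pipe(pipes[0]);
    pipe(pipes[1]);
    pid_t pid = fork();
    if (pid < 0)
    {
        // fork failed: release the pipes and let the caller see NULLs
        close(pipes[0][0]);
        close(pipes[0][1]);
        close(pipes[1][0]);
        close(pipes[1][1]);
        return {nullptr, nullptr};
    }
    if (pid > 0)
    {
        // parent
        close(pipes[0][0]);
        close(pipes[1][1]);
        return {fdopen(pipes[0][1], "w"), fdopen(pipes[1][0], "r")};
    }
    // child
    close(pipes[0][1]);
    close(pipes[1][0]);
    dup2(pipes[0][0], STDIN_FILENO);
    dup2(pipes[1][1], STDOUT_FILENO);
    // the originals are no longer needed once duplicated
    close(pipes[0][0]);
    close(pipes[1][1]);
    execl("/bin/sh", "/bin/sh", "-c", command, (char *)NULL);
    _exit(127); // only reached if execl() failed
}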
Usage:
int main()
{
    auto [p_stdin, p_stdout] = popen2("cat -n");
    if (p_stdin == NULL || p_stdout == NULL)
    {
        printf("popen2() failed\n");
        return 1;
    }
    const char msg[] = "Hello there!";
    char buf[32];
    printf("I say \"%s\"\n", msg);
    // send the text without its terminating NUL
    fwrite(msg, 1, strlen(msg), p_stdin);
    fclose(p_stdin);
    // leave room for, and add, the terminator that printf("%s") needs
    size_t n = fread(buf, 1, sizeof(buf) - 1, p_stdout);
    buf[n] = '\0';
    fclose(p_stdout);
    printf("child says \"%s\"\n", buf);
    return 0;
}
Possible Output:
I say "Hello there!"
child says " 1 Hello there!"
No need to create two pipes and waste a filedescriptor in each process. Just use a socket instead. https://stackoverflow.com/a/25177958/894520
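A sketch of the socket approach, assuming a Unix-domain socketpair(2) is available on the target system: one descriptor per process carries data in both directions. The helper name and the exit codes are invented for this example.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: starts argv[0] with both stdin and stdout attached
 * to one end of a socket pair. Returns the parent's end (read and write
 * on the same descriptor), or -1 on error. */
int spawn_on_socket(char *const argv[], pid_t *pid_out)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        close(sv[0]);
        if (dup2(sv[1], STDIN_FILENO) < 0 || dup2(sv[1], STDOUT_FILENO) < 0)
            _exit(126);
        if (sv[1] > STDERR_FILENO)
            close(sv[1]);
        execvp(argv[0], argv);
        _exit(127);             /* only reached if exec failed */
    }

    close(sv[1]);
    *pid_out = pid;
    return sv[0];
}
```

Because closing the descriptor would end both directions at once, the parent can call shutdown(fd, SHUT_WR) to signal end of input while it keeps reading the child's output, and should still reap the child with waitpid() afterwards.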

How can I implement 'tee' programmatically in C?

I'm looking for a way in C to programmatically (ie, not using redirection from the command line) implement 'tee' functionality such that my stdout goes to both stdout and a log file. This needs to work for both my code and all linked libraries that output to stdout. Any way to do this?
You could popen() the tee program.
Or you can fork() and pipe stdout through a child process such as this (adapted from a real live program I wrote, so it works!):
void tee(const char* fname) {
int pipe_fd[2];
check(pipe(pipe_fd));
const pid_t pid = fork();
check(pid);
if(!pid) { // our log child
close(pipe_fd[1]); // Close unused write end
FILE* logFile = fname? fopen(fname,"a"): NULL;
if(fname && !logFile)
fprintf(stderr,"cannot open log file \"%s\": %d (%s)\n",fname,errno,strerror(errno));
char ch;
while(read(pipe_fd[0],&ch,1) > 0) {
//### any timestamp logic or whatever here
putchar(ch);
if(logFile)
fputc(ch,logFile);
if('\n'==ch) {
fflush(stdout);
if(logFile)
fflush(logFile);
}
}
putchar('\n');
close(pipe_fd[0]);
if(logFile)
fclose(logFile);
exit(EXIT_SUCCESS);
} else {
close(pipe_fd[0]); // Close unused read end
// redirect stdout and stderr
dup2(pipe_fd[1],STDOUT_FILENO);
dup2(pipe_fd[1],STDERR_FILENO);
close(pipe_fd[1]);
}
}
The "popen() tee" answers were correct. Here is an example program that does exactly that:
#include <stdio.h>
#include <unistd.h>
int main (int argc, const char * argv[])
{
    printf("pre-tee\n");
    fflush(stdout); /* flush before stdout is redirected */
    FILE *tee_pipe = popen("tee out.txt", "w");
    if(tee_pipe == NULL || dup2(fileno(tee_pipe), STDOUT_FILENO) < 0) {
        fprintf(stderr, "couldn't redirect output\n");
        return 1;
    }
    printf("post-tee\n");
    return 0;
}
Explanation:
popen() returns a FILE*, but dup2() expects a file descriptor (fd), so fileno() converts the FILE* to an fd. Then dup2(..., STDOUT_FILENO) says to replace stdout with the fd from popen().
Meaning, you spawn a child process (popen) that copies all its input to stdout and a file, then you port your stdout to that process.
You could use pipe(2) and dup2(2) to connect your standard out to a file descriptor you can read from. Then you can have a separate thread monitoring that file descriptor, writing everything it gets to a log file and the original stdout (saved away to another file descriptor by dup2 before connecting the pipe). But you would need a background thread.
Actually, I think the popen tee method suggested by vatine is probably simpler and safer (as long as you don't need to do anything extra with the log file, such as timestamping or encoding or something).
You can use forkpty() with exec() to execute the monitored program with its parameters. forkpty() returns a file descriptor which is redirected to the program's stdin and stdout. Whatever is written to the file descriptor is the input of the program. Whatever is written by the program can be read from the file descriptor.
The second part is to read in a loop the program's output and write it to a file and also print it to stdout.
Example:
pid = forkpty(&fd, NULL, NULL, NULL);
if (pid<0)
return -1;
if (!pid) /* Child */
{
execl("/bin/ping", "/bin/ping", "-c", "1", "-W", "1", "192.168.3.19", NULL);
}
/* Parent */
waitpid(pid, &status, 0);
return WEXITSTATUS(status);
There's no trivial way of doing this in C. I suspect the easiest would be to call popen(3), with tee as the command and the desired log file as an argument, then dup2(2) the file descriptor of the newly-opened FILE* onto fd 1.
But that looks kinda ugly and I must say that I have NOT tried this.

Resources