I am trying to integrate use of samtools into a C program. This application reads data in a binary format called BAM, e.g. from stdin:
$ cat foo.bam | samtools view -h -
...
(I realize this is a useless use of cat, but I'm just showing how a BAM file's bytes can be piped to samtools on the command line. These bytes could come from other upstream processes.)
Within a C program, I would like to write chunks of unsigned char bytes to the samtools binary, while simultaneously capturing the standard output from samtools after it processes these bytes.
Because I cannot use popen() to simultaneously write to and read from a process, I looked into using publicly-available implementations of popen2(), which appears to be written to support this.
I wrote the following test code, which attempts to write() 4 kB chunks of a BAM file located in the same directory to a samtools process. It then read()s bytes from the samtools output into a line buffer, which is printed to standard error:
#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#define READ 0
#define WRITE 1
pid_t popen2(const char *command, int *infp, int *outfp)
{
int p_stdin[2], p_stdout[2];
pid_t pid;
if (pipe(p_stdin) != 0 || pipe(p_stdout) != 0)
return -1;
pid = fork();
if (pid < 0)
return pid;
else if (pid == 0)
{
close(p_stdin[WRITE]);
dup2(p_stdin[READ], READ);
close(p_stdout[READ]);
dup2(p_stdout[WRITE], WRITE);
execl("/bin/sh", "sh", "-c", command, NULL);
perror("execl");
exit(1);
}
if (infp == NULL)
close(p_stdin[WRITE]);
else
*infp = p_stdin[WRITE];
if (outfp == NULL)
close(p_stdout[READ]);
else
*outfp = p_stdout[READ];
return pid;
}
int main(int argc, char **argv)
{
int infp, outfp;
/* set up samtools to read from stdin */
if (popen2("samtools view -h -", &infp, &outfp) <= 0) {
printf("Unable to exec samtools\n");
exit(1);
}
const char *fn = "foo.bam";
FILE *fp = NULL;
fp = fopen(fn, "r");
if (!fp)
exit(-1);
unsigned char buf[4096];
char line_buf[65536] = {0};
while(1) {
size_t n_bytes = fread(buf, sizeof(buf[0]), sizeof(buf), fp);
fprintf(stderr, "read\t-> %08zu bytes from fp\n", n_bytes);
write(infp, buf, n_bytes);
fprintf(stderr, "wrote\t-> %08zu bytes to samtools process\n", n_bytes);
read(outfp, line_buf, sizeof(line_buf));
fprintf(stderr, "output\t-> \n%s\n", line_buf);
memset(line_buf, '\0', sizeof(line_buf));
if (feof(fp) || ferror(fp)) {
break;
}
}
return 0;
}
(For a local copy of foo.bam, here is a link to the binary file I am using for testing. But any BAM file would do for testing purposes.)
To compile:
$ cc -Wall test_bam.c -o test_bam
The problem is that the procedure hangs after the write() call:
$ ./test_bam
read -> 00004096 bytes from fp
wrote -> 00004096 bytes to samtools process
[bam_header_read] EOF marker is absent. The input is probably truncated.
If I close() the infp variable immediately after the write() call, then the loop goes through one more iteration before hanging:
...
write(infp, buf, n_bytes);
close(infp); /* <---------- added after the write() call */
fprintf(stderr, "wrote\t-> %08zu bytes to samtools process\n", n_bytes);
...
With the close() statement:
$ ./test_bam
read -> 00004096 bytes from fp
wrote -> 00004096 bytes to samtools process
[bam_header_read] EOF marker is absent. The input is probably truncated.
[main_samview] truncated file.
output ->
#HD VN:1.0 SO:coordinate
#SQ SN:seq1 LN:5000
#SQ SN:seq2 LN:5000
#CO Example of SAM/BAM file format.
read -> 00004096 bytes from fp
wrote -> 00004096 bytes to samtools process
With this change, I get some output that I'd otherwise expect to get if I ran samtools on the command line, but as mentioned, the procedure hangs once again.
How does one use popen2() to write and read data in chunks to internal buffers? If this isn't possible, are there alternatives to popen2() that would work better for this task?
As an alternative to a pipe, why not communicate with samtools through a socket? Checking the samtools source, the file knetfile.c indicates that samtools has socket communications available:
#include "knetfile.h"
/* In winsock.h, the type of a socket is SOCKET, which is: "typedef
* u_int SOCKET". An invalid SOCKET is: "(SOCKET)(~0)", or signed
* integer -1. In knetfile.c, I use "int" for socket type
* throughout. This should be improved to avoid confusion.
*
* In Linux/Mac, recv() and read() do almost the same thing. You can see
* in the header file that netread() is simply an alias of read(). In
* Windows, however, they are different and using recv() is mandatory.
*/
That may provide a better option than using popen2.
This problem has nothing to do with the particular implementation of popen2. Also note that on OS X, popen lets you open a bidirectional pipe; this may be true on other BSD systems too. If the code has to be portable, you'd need a configure check (or something equivalent) for whether popen allows bidirectional pipes.
The hang is most likely a deadlock: your read() blocks waiting for output that samtools has not produced yet, while samtools is itself waiting for more input. You need to switch the pipes to non-blocking mode, and alternate between read and write calls in a loop. To avoid wasting CPU while the samtools process is busy, such a loop needs to use select, poll or a similar mechanism that blocks until a file descriptor becomes "available" (more data to read, or ready to accept data for writing).
See this question for some inspiration.
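The loop described above could look roughly like the following sketch. The helper name filter_through() and its buffer-based interface are made up for illustration; they are not part of samtools or of any popen2() implementation. The sketch assumes the whole input is already in memory and that ignoring SIGPIPE process-wide is acceptable.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `command` through /bin/sh, feed it in_len bytes,
 * and collect up to out_cap bytes of its standard output.
 * Returns the number of bytes stored in `out`, or -1 on error. */
ssize_t filter_through(const char *command,
                       const unsigned char *in, size_t in_len,
                       unsigned char *out, size_t out_cap)
{
    int to_child[2], from_child[2];
    if (pipe(to_child) != 0)
        return -1;
    if (pipe(from_child) != 0) {
        close(to_child[0]); close(to_child[1]);
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(to_child[0]); close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        return -1;
    }
    if (pid == 0) {                        /* child: wire up stdin/stdout */
        dup2(to_child[0], STDIN_FILENO);
        dup2(from_child[1], STDOUT_FILENO);
        close(to_child[0]); close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);
    }
    close(to_child[0]);
    close(from_child[1]);
    int wfd = to_child[1], rfd = from_child[0];

    /* Non-blocking writes: write() accepts what fits instead of stalling. */
    fcntl(wfd, F_SETFL, fcntl(wfd, F_GETFL) | O_NONBLOCK);
    /* Simplification: ignore SIGPIPE process-wide, so a child that exits
     * early yields EPIPE from write() rather than killing this process. */
    signal(SIGPIPE, SIG_IGN);

    size_t sent = 0, got = 0;
    int failed = 0;
    if (in_len == 0) { close(wfd); wfd = -1; }
    while (rfd >= 0) {
        struct pollfd fds[2] = {
            { .fd = rfd, .events = POLLIN },
            { .fd = wfd, .events = POLLOUT },  /* poll() skips fd == -1 */
        };
        if (poll(fds, 2, -1) < 0) {
            if (errno == EINTR)
                continue;
            failed = 1;
            break;
        }
        if (fds[0].revents) {              /* data, EOF, or error on stdout */
            unsigned char tmp[4096];
            ssize_t n = read(rfd, tmp, sizeof tmp);
            if (n > 0) {
                size_t keep = (size_t)n;
                if (keep > out_cap - got)
                    keep = out_cap - got;  /* drop what does not fit */
                memcpy(out + got, tmp, keep);
                got += keep;
            } else if (n == 0 || errno != EINTR) {
                close(rfd);
                rfd = -1;                  /* EOF (or error): leave the loop */
            }
        }
        if (wfd >= 0 && fds[1].revents) {  /* room in the pipe, or child gone */
            ssize_t n = write(wfd, in + sent, in_len - sent);
            if (n > 0)
                sent += (size_t)n;
            else if (n < 0 && errno != EAGAIN && errno != EINTR)
                sent = in_len;             /* e.g. EPIPE: give up writing */
            if (sent == in_len) {
                close(wfd);                /* child sees EOF on its stdin */
                wfd = -1;
            }
        }
    }
    if (rfd >= 0) close(rfd);
    if (wfd >= 0) close(wfd);
    waitpid(pid, NULL, 0);
    return failed ? -1 : (ssize_t)got;
}
```

For the samtools case one would pass "samtools view -h -" as the command and the BAM bytes as input. For streams too large to hold in memory, the same loop could refill the input buffer from the upstream source each time it has been fully written, and hand off the output as it arrives instead of accumulating it.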
Related
I'm trying to use posix_openpt on Mac. The issue I'm seeing is that I get a file descriptor back from posix_openpt. I use the file descriptor for reading and create a copy using dup for writing. The issue I'm running into is that when I write to the master file descriptor, I read that data back out from the master. So no data ends up at the slave. I confirmed this by using posix_spawnp to run a program with stdin/stdout/stderr set to the slave file. The program hangs indefinitely waiting for input. Here is my code (note, all error handling was removed for legibility):
int master_fd = posix_openpt(O_RDWR);
grantpt(master_fd);
unlockpt(master_fd);
char *slave_filename_orig = ptsname(master_fd);
size_t slave_filename_len = strlen(slave_filename_orig);
char slave_filename[slave_filename_len + 1];
strcpy(slave_filename, slave_filename_orig);
posix_spawn_file_actions_t fd_actions;
posix_spawn_file_actions_init(&fd_actions);
posix_spawn_file_actions_addopen(&fd_actions, STDIN_FILENO, slave_filename, O_RDONLY, 0644);
posix_spawn_file_actions_addopen(&fd_actions, STDOUT_FILENO, slave_filename, O_WRONLY, 0644);
posix_spawn_file_actions_adddup2(&fd_actions, STDOUT_FILENO, STDERR_FILENO);
pid_t pid;
posix_spawnp(&pid, "wc", &fd_actions, NULL, NULL, NULL);
int master_fd_write = dup(master_fd);
char *data = "hello world";
write(master_fd_write, data, strlen(data));
close(master_fd_write);
char buffer[1024];
read(master_fd, buffer, 1024); // <- Issue Here
// buffer now contains hello world. It should contain the output of `wc`
(Note: The above was only tested on Linux; I don't have a Mac to work on, but I have no reason to believe it's any different in the details here.)
There are several problems with your code:
At least on Linux, calling posix_spawn() with a null pointer causes a crash. You need to provide all the arguments. Even if Macs accept it the way you have it, doing this is a Good Idea.
Next, wc reading from standard input waits until an attempt to read more data hits end-of-file before it prints the statistics it gathers; your code never signals that. With a pty, if you write a specific byte to it (typically the value 4, but it can differ, so it is best to use what the terminal reports instead of hardcoding it), the terminal driver recognizes that as signalling EOF, without your having to close the master as you would with a pipe (which would make it impossible to read the output of wc).
Finally, the terminal's default settings include echoing the input; that's what you're reading.
A cleaned-up version that addresses these issues and more (like yours, with most error checking omitted; real code should check all of these calls for errors):
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <fcntl.h>
#include <spawn.h>
#include <termios.h>
#include <unistd.h>
#include <sys/wait.h>
int main(void) {
int master_fd = posix_openpt(O_RDWR);
grantpt(master_fd);
unlockpt(master_fd);
char *slave_filename_orig = ptsname(master_fd);
size_t slave_filename_len = strlen(slave_filename_orig);
char slave_filename[slave_filename_len + 1];
strcpy(slave_filename, slave_filename_orig);
//printf("slave pty filename: %s\n", slave_filename);
// Open the slave pty in this process
int slave_fd = open(slave_filename, O_RDWR);
// Set up slave pty to not echo input
struct termios tty_attrs;
tcgetattr(slave_fd, &tty_attrs);
tty_attrs.c_lflag &= ~ECHO;
tcsetattr(slave_fd, TCSANOW, &tty_attrs);
posix_spawn_file_actions_t fd_actions;
posix_spawn_file_actions_init(&fd_actions);
// Use adddup2 instead of addopen since we already have the pty open.
posix_spawn_file_actions_adddup2(&fd_actions, slave_fd, STDIN_FILENO);
posix_spawn_file_actions_adddup2(&fd_actions, slave_fd, STDOUT_FILENO);
// Also close the master and original slave fd in the child
posix_spawn_file_actions_addclose(&fd_actions, master_fd);
posix_spawn_file_actions_addclose(&fd_actions, slave_fd);
posix_spawnattr_t attrs;
posix_spawnattr_init(&attrs);
pid_t pid;
extern char **environ;
char *const spawn_argv[] = {"wc" , NULL};
posix_spawnp(&pid, "wc", &fd_actions, &attrs, spawn_argv, environ);
close(slave_fd); // No longer needed in the parent process
const char *data = "hello world\n";
ssize_t len = strlen(data);
if (write(master_fd, data, len) != len) {
perror("write");
}
// Send the terminal's end of file interrupt
cc_t tty_eof = tty_attrs.c_cc[VEOF];
if (write(master_fd, &tty_eof, sizeof tty_eof) != sizeof tty_eof) {
perror("write EOF");
}
// Wait for wc to exit
int status;
waitpid(pid, &status, 0);
char buffer[1024];
ssize_t bytes = read(master_fd, buffer, 1024);
if (bytes > 0) {
fwrite(buffer, 1, bytes, stdout);
}
close(master_fd);
return 0;
}
When compiled and run, outputs
1 2 12
There are two problems with this code.
First, you are seeing "hello world" on master_fd because by default terminals echo. You need to set the terminal to raw mode to suppress that.
Second, wc won't output anything until it sees an EOF, and it will not see an EOF until you close the master. Not just master_fd_write mind you, but all copies of master_fd, including master_fd itself. However, once you close the master, you cannot read from it.
Choose some program other than wc to demonstrate the functionality of posix_openpt.
Edit: It is possible to raise the end-of-file condition on the slave without closing the master by writing ^D (EOT, ASCII 4).
pipe(7) says:
If a process attempts to read from an empty pipe, then read(2) will block until data is available. If a process attempts to write to a full pipe (see below), then write(2) blocks until sufficient data has been read from the pipe to allow the write to complete. Nonblocking I/O is possible by using the fcntl(2) F_SETFL operation to enable the O_NONBLOCK open file status flag.
Below I have two simple C programs compiled on linux with gcc:
reader.c:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#define STACKBUF_SIZE 128
#define FIFO_PATH "/home/bogdan/.imagedata"
signed int main(int argc, char **argv) {
int fifo_fd = open(FIFO_PATH, O_RDONLY); // blocking... - notice no O_NONBLOCK flag
if (fifo_fd != -1) {
fprintf(stdout, "open() call succeeded\n");
}
while (1) {
char buf[STACKBUF_SIZE] = {0};
ssize_t bread = read(fifo_fd, buf, STACKBUF_SIZE);
fprintf(stdout, "%zd - %s\n", bread, buf);
sleep(1);
}
close(fifo_fd);
return EXIT_SUCCESS;
}
writer.c:
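#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#define STACKBUF_SIZE 128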
#define FIFO_PATH "/home/bogdan/.imagedata"
#define DATA "data"
int main(void) {
int fifo_fd = open(FIFO_PATH, O_WRONLY); // blocks until reader opens on the reader end, however we always first open the reader so...
if(fifo_fd != -1) {
ssize_t bwritten = write(fifo_fd, DATA, 5);
fprintf(stdout, "writer wrote %zd bytes\n", bwritten);
}
close(fifo_fd);
return EXIT_SUCCESS;
}
The files are compiled into two separate binaries with gcc writer.c -Og -g -o ./writer, same for the reader.
From the shell I first execute the reader binary, and as expected, the initial open() call blocks until I also execute the writer. I then execute the writer, whose open() call immediately succeeds and it writes 5 bytes to the FIFO (which are correctly displayed by the reader), after which it closes the fd, leaving the FIFO empty (?).
However, the following read() calls in the while loop of the reader don't block at all, and instead just return 0.
Unless I am missing something (I probably am), this conflicts with the semantics outlined by the pipe(7) manpage, as the FIFO fd was opened without the O_NONBLOCK flag in both the reader and the writer.
The section of the manual that you quoted only applies to pipes with open writers. Two paragraphs down, it says this:
If all file descriptors referring to the write end of a pipe have been closed, then an attempt to read(2) from the pipe will see end-of-file (read(2) will return 0).
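The rule quoted above can be observed with an ordinary pipe. The following is a minimal sketch; the function name read_after_writer_closed() is invented for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: returns what the second read() reports once every
 * write end is closed and the buffered data has been consumed. */
ssize_t read_after_writer_closed(void)
{
    int p[2];
    char buf[16];

    if (pipe(p) != 0)
        return -1;
    if (write(p[1], "data", 4) != 4) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    close(p[1]);                                   /* last writer goes away */

    ssize_t first = read(p[0], buf, sizeof buf);   /* buffered bytes still arrive */
    ssize_t second = read(p[0], buf, sizeof buf);  /* then end-of-file, no blocking */
    close(p[0]);
    return first == 4 ? second : -1;
}
```

A reader that wants to keep waiting for future writers can close and reopen the FIFO when read() returns 0. On Linux it can also open the FIFO with O_RDWR so that a write end always exists, although POSIX leaves that behaviour undefined.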
I have two files open in two different processes. There's a pipe connecting the two. Is it possible to write directly from one file to another? Especially if the process reading doesn't know the size of the file it's trying to read?
I was hoping to do something like this
#define length 100
int main(){
int frk = fork();
int pip[2];
pipe(pip);
if (frk==0){ //child
FILE *fp = fopen("file1", "r");
write(pip[1],fp,length);
}
else {
FILE *fp = fopen("file2", "w");
read(pip[0],fp,length);
}
}
Is it possible to write directly from one file to another?
C does not provide any mechanism for that, and it seems like it would require specialized hardware support. The standard I/O paradigm is that data get read from their source into memory or written from memory to their destination. That pesky "memory" in the middle means copying from one file to another cannot be direct.
Of course, you can write a function or program that performs such a copy, hiding the details from you. This is what the cp command does, after all, but the C standard library does not contain a function for that purpose.
Especially if the process reading doesn't know the size of the file it's trying to read?
That bit isn't very important. One simply reads and then writes (only) what one has read, repeating until there is nothing more to read. "Nothing more to read" means that a read attempt indicates by its return value that the end of the file has been reached.
If you want one process to read one file and the other to write that data to another file, using a pipe to convey data between the two, then you need both processes to implement that pattern. One reads from the source file and writes to the pipe, and the other reads from the pipe and writes to the destination file.
Special note: for the process reading from the pipe to detect EOF on that pipe, the other end has to be closed, in both processes. After the fork, each process can and should close the pipe end that it doesn't intend to use. The one using the write end then closes that end when it has nothing more to write to it.
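The read-then-write pattern described above might be sketched as follows. The helper name copy_fd() is made up for this example; it works on file descriptors and also copes with write() accepting fewer bytes than requested.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from `in` to `out` until end-of-file.
 * Returns the total number of bytes copied, or -1 on error. */
ssize_t copy_fd(int in, int out)
{
    unsigned char buf[4096];
    ssize_t total = 0;

    for (;;) {
        ssize_t n = read(in, buf, sizeof buf);
        if (n == 0)
            return total;            /* end of file: nothing more to read */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        ssize_t off = 0;
        while (off < n) {            /* write() may accept fewer bytes */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
}
```

One process would call copy_fd(source_fd, pipe_write_end) and then close the write end; the other would call copy_fd(pipe_read_end, destination_fd), which returns once the writer has closed its end.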
On some other Unix systems, such as BSD, there is a call that connects two file descriptors directly to do what you want, but I don't know whether Linux has a system call for that. In any case, this cannot be done with FILE * pointers, since those are the buffered stream objects the <stdio.h> library uses to represent a file. You can get the file descriptor (as the system knows it) behind a FILE * with the fileno(3) function.
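The standard call for obtaining the descriptor behind a FILE * is fileno(3). A small sketch of using it follows; the wrapper name write_via_fd() is hypothetical. Flushing first matters because stdio may still hold buffered data that would otherwise come out after the raw write.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write `len` bytes of `msg` through the file
 * descriptor underlying `stream`, bypassing the stdio buffer. */
ssize_t write_via_fd(FILE *stream, const char *msg, size_t len)
{
    if (fflush(stream) != 0)       /* push out buffered stdio data first */
        return -1;
    return write(fileno(stream), msg, len);
}
```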
The semantics you are asking for are quite elaborate: you want the system to pass data directly from one file descriptor to another inside the kernel, without any process intervening, and for that the kernel needs a pool of threads to do the work of copying from the reads to the writes.
The traditional way of doing this is to create a thread that does the work of reading from one file descriptor (not a FILE * pointer) and writing to the other.
Another point worth making: the pipe(2) system call gives you two connected descriptors that let you read(2) from one (index 0) what is written with write(2) to the other (index 1). If you fork(2) a second process first and then call pipe(2) in both, you end up with two pipes (two descriptors each), one in each process, with no relationship between them. Each process can then only communicate with itself, not with the other (which knows nothing about the other process's pipe descriptors), so no communication between them is possible. Call pipe(2) before fork(2) so that both processes share the same pipe.
Here is a complete example of what you are trying to do:
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#define length 100
#define FMT(fmt) "pid=%d:"__FILE__":%d:%s: " fmt, getpid(), __LINE__, __func__
#define ERR(fmt, ...) do { \
fprintf(stderr, \
FMT(fmt ": %s (errno = %d)\n"), \
##__VA_ARGS__, \
strerror(errno), errno); \
exit(1); \
} while(0)
void copy(int fdi, int fdo)
{
unsigned char buffer[length];
ssize_t res, nread;
while((nread = res = read(fdi, buffer, sizeof buffer)) > 0) {
res = write(fdo, buffer, nread);
if (res < 0) ERR("write");
} /* while */
if (res < 0) ERR("read");
} /* copy */
int main()
{
int pip[2];
int res;
res = pipe(pip);
if (res < 0) ERR("pipe");
char *filename;
switch (res = fork()) {
case -1: /* error */
ERR("fork");
case 0: /* child */
filename = "file1";
res = open(filename, O_RDONLY);
if (res < 0) ERR("open \"%s\"", filename);
close(pip[0]);
copy(res, pip[1]);
break;
default: /* parent, we got the child's pid in res */
filename = "file2";
res = open(filename, O_CREAT | O_TRUNC | O_WRONLY, 0666);
if (res < 0) ERR("open \"%s\"", filename);
close(pip[1]);
copy(pip[0], res);
int status;
res = wait(&status); /* wait for the child to finish */
if (res < 0) ERR("wait");
fprintf(stderr,
FMT("The child %d finished with exit code %d\n"),
res,
WEXITSTATUS(status));
break;
} /* switch */
exit(0);
} /* main */
Consider this C program:
#include <poll.h>
#include <stdio.h>
#include <unistd.h>
#define TIMEOUT 500 // 0.5 s
#define BUF_SIZE 512
int fd_can_read(int fd, int timeout) {
struct pollfd pfd;
pfd.fd = fd;
pfd.events = POLLIN;
if (poll(&pfd, 1, timeout)) {
if (pfd.revents & POLLIN) {
return 1;
}
}
return 0;
}
int main(int argc, char **argv) {
int fd;
ssize_t bytes_read;
char buffer[BUF_SIZE];
fd = STDIN_FILENO;
while (1) {
if (fd_can_read(fd, TIMEOUT)) {
printf("Can read\n");
bytes_read = read(fd, buffer, sizeof(buffer));
printf("Bytes read: %zd\n", bytes_read);
}
else {
printf("Can't read\n");
}
}
}
It tries to poll given file descriptor (which is the fd of stdin in this case), and tries to read from it when it's available for reading. Here's an example input file called "input":
stuff to be read
So let's say I run the program, give a few inputs and close it:
./a.out
test
Can read
Bytes read: 5
Can't read
Can't read
...
So let's try reading the input from a file by piping/redirecting its contents to the stdin of my program:
cat input | ./a.out # Or ./a.out < input
Bytes read: 0
Can read
Bytes read: 0
Can read
...
Now, the poll returns instantly (does not wait for the timeout to run out), and gives the results I was not expecting. I do know poll() does not work on files correctly, but if I'm not mistaken, I'm not reading from a file.
The problem is that poll (just like select) only tells you that a call to e.g. read will not block. It doesn't tell you whether there is actually anything to read.
And if you read the read manual page you will see that when it returns 0 it means end of file (or connection closed for sockets).
What poll is telling you is that read can be called without blocking, and what read tells you by returning 0 is that there is nothing more to read.
You will get a similar "false positive" by pressing the end-of-file shortcut key (by default Ctrl-D on POSIX systems like Linux) for the non-piped or -redirected input example.
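In practice this means the loop has to treat a return value of 0 from read as the signal to stop polling. A minimal sketch, with a made-up helper name drain_until_eof():

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: poll and read from fd until end-of-file.
 * Returns the total number of bytes read, or -1 on error. */
ssize_t drain_until_eof(int fd, int timeout_ms)
{
    char buffer[512];
    ssize_t total = 0;

    for (;;) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int ready = poll(&pfd, 1, timeout_ms);
        if (ready < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (ready == 0)
            continue;                 /* timeout: nothing to do yet */

        /* poll() reported that read() will not block; that includes EOF. */
        ssize_t n = read(fd, buffer, sizeof buffer);
        if (n < 0)
            return -1;
        if (n == 0)
            return total;             /* end-of-file: stop polling */
        total += n;
    }
}
```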
Throughout my years as a C programmer, I've always been confused about the standard stream file descriptors. Some places, like Wikipedia[1], say:
In the C programming language, the standard input, output, and error streams are attached to the existing Unix file descriptors 0, 1 and 2 respectively.
This is backed up by unistd.h:
/* Standard file descriptors. */
#define STDIN_FILENO 0 /* Standard input. */
#define STDOUT_FILENO 1 /* Standard output. */
#define STDERR_FILENO 2 /* Standard error output. */
However, this code (on any system):
write(0, "Hello, World!\n", 14);
Will print Hello, World! (and a newline) to STDOUT. This is odd because STDOUT's file descriptor is supposed to be 1. write-ing to file descriptor 1 also prints to STDOUT.
Performing an ioctl on file descriptor 0 changes standard input[2], and on file descriptor 1 changes standard output. However, performing termios functions on either 0 or 1 changes standard input[3][4].
I'm very confused about the behavior of file descriptors 1 and 0. Does anyone know why:
write-ing to 1 or 0 writes to standard output?
Performing ioctl on 1 modifies standard output and on 0 modifies standard input, but performing tcsetattr/tcgetattr on either 1 or 0 works for standard input?
I guess it is because on my Linux system, both 0 and 1 are by default opened for reading and writing on /dev/tty, which is the controlling terminal of the process. So it is indeed possible even to read from stdout.
However this breaks as soon as you pipe something in or out:
#include <unistd.h>
#include <errno.h>
#include <stdio.h>
int main() {
errno = 0;
write(0, "Hello world!\n", 13);
perror("write");
}
and run with
% ./a.out
Hello world!
write: Success
% echo | ./a.out
write: Bad file descriptor
termios functions always work on the actual underlying terminal object, so it doesn't matter whether 0 or 1 is used, as long as it is open on a tty.
Let's start by reviewing some of the key concepts involved:
File description
In the operating system kernel, each file, pipe endpoint, socket endpoint, open device node, and so on, has a file description. The kernel uses these to keep track of the position in the file, the flags (read, write, append, close-on-exec), record locks, and so on.
The file descriptions are internal to the kernel, and do not belong to any process in particular (in typical implementations).
File descriptor
From the process viewpoint, file descriptors are integers that identify open files, pipes, sockets, FIFOs, or devices.
The operating system kernel keeps a table of descriptors for each process. The file descriptor used by the process is simply an index to this table.
The entries in the file descriptor table refer to a kernel file description.
Whenever a process uses dup() or dup2() to duplicate a file descriptor, the kernel only duplicates the entry in the file descriptor table for that process; it does not duplicate the file description it keeps to itself.
When a process forks, the child process gets its own file descriptor table, but the entries still point to the exact same kernel file descriptions. (This is essentially a shallow copy, with all file descriptor table entries being references to file descriptions. The references are copied; the referred-to targets remain the same.)
When a process sends a file descriptor to another process via a Unix domain socket ancillary message, the kernel actually allocates a new descriptor in the receiver and makes it refer to the same file description as the transferred descriptor.
It all works very well, although it is a bit confusing that "file descriptor" and "file description" are so similar.
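A quick way to see that duplicated descriptors share one file description is to look at the file offset, which lives in the description. In this sketch (the function name is invented for illustration), bytes written through one descriptor move the offset seen through its duplicate:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: write 5 bytes through one descriptor, then ask for the
 * current offset through a dup()ed copy. Returns that offset, or -1 on error. */
off_t offset_seen_through_dup(void)
{
    FILE *tmp = tmpfile();               /* anonymous scratch file */
    if (!tmp)
        return -1;
    int fd = fileno(tmp);
    int copy = dup(fd);                  /* new table entry, same description */
    if (copy < 0) {
        fclose(tmp);
        return -1;
    }
    off_t pos = -1;
    if (write(fd, "hello", 5) == 5)
        pos = lseek(copy, 0, SEEK_CUR);  /* query the offset via the duplicate */
    close(copy);
    fclose(tmp);
    return pos;
}
```

Had the file been opened a second time with open() instead of duplicated, the second descriptor would have its own description and its offset would still be 0.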
What does all that have to do with the effects the OP is seeing?
Whenever new processes are created, it is common to open the target device, pipe, or socket, and dup2() the descriptor to standard input, standard output, and standard error. This leads to all three standard descriptors referring to the same file description, and thus whatever operation is valid using one file descriptor, is valid using the other file descriptors, too.
This is most common when running programs on the console, as then the three descriptors all definitely refer to the same file description; and that file description describes the slave end of a pseudoterminal character device.
Consider the following program, run.c:
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>
static void wrerrp(const char *p, const char *q)
{
while (p < q) {
ssize_t n = write(STDERR_FILENO, p, (size_t)(q - p));
if (n > 0)
p += n;
else
return;
}
}
static inline void wrerr(const char *s)
{
if (s)
wrerrp(s, s + strlen(s));
}
int main(int argc, char *argv[])
{
int fd;
if (argc < 3) {
wrerr("\nUsage: ");
wrerr(argv[0]);
wrerr(" FILE-OR-DEVICE COMMAND [ ARGS ... ]\n\n");
return 127;
}
fd = open(argv[1], O_RDWR | O_CREAT, 0666);
if (fd == -1) {
const char *msg = strerror(errno);
wrerr(argv[1]);
wrerr(": Cannot open file: ");
wrerr(msg);
wrerr(".\n");
return 127;
}
if (dup2(fd, STDIN_FILENO) != STDIN_FILENO ||
dup2(fd, STDOUT_FILENO) != STDOUT_FILENO) {
const char *msg = strerror(errno);
wrerr("Cannot duplicate file descriptors: ");
wrerr(msg);
wrerr(".\n");
return 126;
}
if (dup2(fd, STDERR_FILENO) != STDERR_FILENO) {
/* We might not have standard error anymore.. */
return 126;
}
/* Close fd, since it is no longer needed. */
if (fd != STDIN_FILENO && fd != STDOUT_FILENO && fd != STDERR_FILENO)
close(fd);
/* Execute the command. */
if (strchr(argv[2], '/'))
execv(argv[2], argv + 2); /* Command has /, so it is a path */
else
execvp(argv[2], argv + 2); /* command has no /, so it is a filename */
/* Whoops; failed. But we have no stderr left.. */
return 125;
}
It takes two or more parameters. The first parameter is a file or device, and the second is the command, with the rest of the parameters supplied to the command. The command is run, with all three standard descriptors redirected to the file or device named in the first parameter. You can compile the above with gcc using e.g.
gcc -Wall -O2 run.c -o run
Let's write a small tester utility, report.c:
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
int main(int argc, char *argv[])
{
char buffer[16] = { "\n" };
ssize_t result;
FILE *out;
if (argc != 2) {
fprintf(stderr, "\nUsage: %s FILENAME\n\n", argv[0]);
return EXIT_FAILURE;
}
out = fopen(argv[1], "w");
if (!out)
return EXIT_FAILURE;
result = write(STDIN_FILENO, buffer, 1);
if (result == -1) {
const int err = errno;
fprintf(out, "write(STDIN_FILENO, buffer, 1) = -1, errno = %d (%s).\n", err, strerror(err));
} else {
fprintf(out, "write(STDIN_FILENO, buffer, 1) = %zd%s\n", result, (result == 1) ? ", success" : "");
}
result = read(STDOUT_FILENO, buffer, 1);
if (result == -1) {
const int err = errno;
fprintf(out, "read(STDOUT_FILENO, buffer, 1) = -1, errno = %d (%s).\n", err, strerror(err));
} else {
fprintf(out, "read(STDOUT_FILENO, buffer, 1) = %zd%s\n", result, (result == 1) ? ", success" : "");
}
result = read(STDERR_FILENO, buffer, 1);
if (result == -1) {
const int err = errno;
fprintf(out, "read(STDERR_FILENO, buffer, 1) = -1, errno = %d (%s).\n", err, strerror(err));
} else {
fprintf(out, "read(STDERR_FILENO, buffer, 1) = %zd%s\n", result, (result == 1) ? ", success" : "");
}
if (ferror(out))
return EXIT_FAILURE;
if (fclose(out))
return EXIT_FAILURE;
return EXIT_SUCCESS;
}
It takes exactly one parameter, a file or device to write to, to report whether writing to standard input, and reading from standard output and error work. (We can normally use $(tty) in Bash and POSIX shells, to refer to the actual terminal device, so that the report is visible on the terminal.) Compile this one using e.g.
gcc -Wall -O2 report.c -o report
Now, we can check some devices:
./run /dev/null ./report $(tty)
./run /dev/zero ./report $(tty)
./run /dev/urandom ./report $(tty)
or on whatever we wish. On my machine, when I run this on a file, say
./run some-file ./report $(tty)
writing to standard input, and reading from standard output and standard error all work -- which is as expected, as the file descriptors refer to the same, readable and writable, file description.
The conclusion, after playing with the above, is that there is no strange behaviour here at all. It all behaves exactly as one would expect, if file descriptors as used by processes are simply references to operating system internal file descriptions, and standard input, output, and error descriptors are duplicates of each other.