How to reserve a file descriptor? - c

I'm writing a curses-based program. In order to make it simpler for me to find errors in this program, I would like to produce debug output. Due to the program already displaying a user interface on the terminal, I cannot put debugging output there.
Instead, I plan to write debugging output to file descriptor 3 unconditionally. You can invoke the program as program 3>/dev/ttyX with /dev/ttyX being a different teletype to see the debugging output. When file descriptor 3 is not opened, write calls fail with EBADF, which I ignore like all errors when writing debugging output.
A problem occurs when I open another file and no debugging output has been requested (i.e. file descriptor 3 has not been opened). In this case, the newly opened file might receive file descriptor 3, causing debugging output to randomly corrupt a file I just opened. This is a bad thing. How can I avoid this? Is there a portable way to mark a file descriptor as “reserved” or such?
Here are a couple of ideas I had and their problems:
I could open /dev/null or a temporary file to file descriptor 3 (e.g. by means of dup2()) before opening any other file. This works but I'm not sure if I can assume this to always succeed as opening /dev/null may not succeed.
I could test if file descriptor 3 is open and not write debugging output if it isn't. This is problematic when I'm attempting to restart the program by calling exec, as a different file descriptor might have been opened (and not closed) prior to the exec call. I could intentionally close file descriptor 3 before calling exec when it has not been opened for debugging, but this feels really ugly.

Why use fd 3? Why not use fd 2 (stderr)? It already has a well-defined "I am logging of some sort" meaning, is always open (not true, but sufficiently true...), and you can redirect it before starting your binary to get the logs where you want.
Another option would be to log messages to syslog, using the LOG_DEBUG level. This entails calling syslog() instead of a normal write function, but that's simply making the logging more explicit.
A simple way of checking if stderr has been redirected or is still pointing at the terminal is by using the isatty function (example code below):
#include <stdio.h>
#include <unistd.h>
int main(void) {
if (isatty(2)) {
printf("stderr is not redirected.\n");
} else {
printf("stderr seems to be redirected.\n");
}
}

In the very beginning of your program, open /dev/null and then assign it to file descriptor 3:
int fd = open ("/dev/null", O_WRONLY);
dup2(fd, 3);
This way, file descriptor 3 won't be taken.
Then, if needed, reuse dup2() to assign file descriptor 3 to your debugging output.
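A minimal sketch of that idea, assuming POSIX and that /dev/null can be opened. The names reserve_debug_fd and DEBUG_FD are made up for illustration. It first checks whether descriptor 3 was already supplied by the caller (for example via 3>/dev/ttyX) and only then parks /dev/null on it:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#define DEBUG_FD 3  /* descriptor number taken from the question */

/* Hypothetical helper. Returns 0 if DEBUG_FD is open afterwards (either
 * inherited from the caller or now pointing at /dev/null), -1 otherwise. */
int reserve_debug_fd(void)
{
    /* fcntl(F_GETFD) fails with EBADF when the descriptor is not open. */
    if (fcntl(DEBUG_FD, F_GETFD) != -1 || errno != EBADF)
        return 0;               /* already open, leave it alone */

    int fd = open("/dev/null", O_WRONLY);
    if (fd == -1)
        return -1;
    if (fd == DEBUG_FD)
        return 0;               /* open() already landed on the free slot */

    int rc = dup2(fd, DEBUG_FD);
    close(fd);                  /* the temporary descriptor is not needed */
    return rc == -1 ? -1 : 0;
}
```

Calling this once at the very start of main(), before any other open(), should keep later files from landing on descriptor 3.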

You claim you can't guarantee you can open /dev/null successfully, which is a little strange, but let's run with it. You should be able to use socketpair() to get a pair of FDs. You can then set the write end of the pair non-blocking, and dup2 it. You claim you are already ignoring errors on writes to this FD, so the data going in the bit-bucket won't bother you. You can of course close the other end of the socketpair.
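Here is a rough sketch of the socketpair() variant, under a few assumptions of mine: the helper name reserve_fd_with_socketpair is invented, and the program is assumed to be fine with ignoring SIGPIPE globally. That last point matters because once the peer is closed, a write to the remaining end fails with EPIPE and would otherwise raise SIGPIPE and kill the process.

```c
#include <fcntl.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: occupies descriptor `target` with one end of a
 * socket pair whose peer is closed. Returns 0 on success, -1 on failure. */
int reserve_fd_with_socketpair(int target)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* socketpair() may already have handed out `target` itself. */
    int keep = (sv[1] == target) ? sv[1] : sv[0];
    int peer = (keep == sv[0]) ? sv[1] : sv[0];

    if (keep != target) {
        if (dup2(keep, target) == -1) {
            close(keep);
            close(peer);
            return -1;
        }
        close(keep);
    }
    close(peer);                /* nobody will ever read the debug data */

    /* Non-blocking, as suggested above, so a write can never stall. */
    int flags = fcntl(target, F_GETFL);
    if (flags != -1)
        fcntl(target, F_SETFL, flags | O_NONBLOCK);

    /* Assumption: ignoring SIGPIPE process-wide is acceptable here. */
    signal(SIGPIPE, SIG_IGN);
    return 0;
}
```

With this in place, write() on the reserved descriptor simply returns -1, which fits a program that already ignores errors on its debug output.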

Don't focus on a specific file descriptor value - you can't control it in a portable manner anyway. If you can control it at all. But you can use an environment variable to control debug output to a file:
int debugFD = getDebugFD();
...
int getDebugFD()
{
const char *debugFile = getenv( "DEBUG_FILE" );
if ( NULL == debugFile )
{
return( -1 );
}
int fd = open( debugFile, O_CREAT | O_APPEND | O_WRONLY, 0644 );
// error checking can be here
return( fd );
}
Now you can write your debug output to debugFD. I assume you know enough to make sure debugFD is visible where you need it, and also how to make sure it's initialized before trying to use it.
If you don't set the DEBUG_FILE environment variable, you get an invalid file descriptor and your debug calls fail, presumably silently.
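For the writing side, something along these lines might do. This is only a sketch: debug_log is a hypothetical helper name, and it relies on vdprintf() from POSIX.1-2008 being available.

```c
#define _POSIX_C_SOURCE 200809L  /* for vdprintf() */
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: formats a message and writes it to fd.
 * Returns the number of bytes written, or a negative value on failure
 * (for example when fd is -1 because DEBUG_FILE was not set). */
int debug_log(int fd, const char *fmt, ...)
{
    if (fd < 0)
        return -1;              /* debugging disabled, nothing to do */

    va_list ap;
    va_start(ap, fmt);
    int n = vdprintf(fd, fmt, ap);
    va_end(ap);
    return n;
}
```

A call would then look like debug_log(debugFD, "cursor at %d,%d\n", y, x), and the return value can be ignored when debugging is off.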

Related

How does mode affect the permission on newly created files in Linux?

I'm new to Linux and still struggling to understand how permission control works. The open function prototype is something like:
int open(char *filename, int flags, mode_t mode);
Let's say I have the following code:
fd = open("foo.txt", O_CREAT|O_RDWR, S_IRUSR);
and let's say the file "foo.txt" didn't exist before, so the above statement will create a file called "foo.txt", and the current process that executes this open statement can read and write this file. But suppose that after this process terminates, another process starts and tries to open this file. Below are my questions:
Q1-Since the file was created with S_IRUSR (owner can read this file) in the first open call, does it mean that even I, as the owner of the file, can only read this file and cannot write to it if I start a new process to open it again? Is my understanding correct?
If my understanding is correct, is it sensible/practicable that owners create something that they cannot have full access to later?
Q2-If my above understanding is correct, then in the second call to open by a new process. I can only call like:
fd = open("foo.txt", O_RDONLY) // uses flags like O_WRONLY or O_RDWR will throw an error?
since the first open specified the mode as S_IRUSR, which maps to O_RDONLY in the subsequent calls, is my understanding correct?
Correct, if you create the file with permissions S_IRUSR (often written in octal as 0400), then you will not be able to open the file for writing. Attempting to do so will fail and set errno to EACCES.
This is quite practical as it gives you a way to protect files you do not want to accidentally overwrite, as long as the permissions stay as they are. However, as the owner, you have the power to change the permissions later, using the chmod() system call. So it's not as though you have permanently lost the ability to write that file; you can give yourself back that ability whenever you want.
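To see this in action, here is a small sketch. The function name demo_owner_read_only and the scratch path passed to it are my own inventions for illustration. It creates a file with S_IRUSR, shows that the creating descriptor is still writable, tries a second open for writing, and then restores write permission with chmod():

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo; `path` is any scratch file the caller may create.
 * Returns 0 if the file could be written again after chmod(), else -1. */
int demo_owner_read_only(const char *path)
{
    unlink(path);               /* start from "file does not exist" */

    /* Create with owner-read only (0400, further limited by the umask). */
    int fd = open(path, O_CREAT | O_EXCL | O_RDWR, S_IRUSR);
    if (fd == -1)
        return -1;
    /* The mode governs later open() calls; this descriptor is writable. */
    if (write(fd, "first\n", 6) != 6) {
        close(fd);
        return -1;
    }
    close(fd);

    /* A fresh open for writing is checked against mode 0400. */
    fd = open(path, O_WRONLY);
    if (fd == -1) {
        printf("open(O_WRONLY) failed: %s\n", strerror(errno));
    } else {
        printf("open(O_WRONLY) succeeded (privileged process?)\n");
        close(fd);
    }

    /* The owner can give write permission back at any time. */
    if (chmod(path, S_IRUSR | S_IWUSR) == -1)
        return -1;
    fd = open(path, O_WRONLY | O_APPEND);
    if (fd == -1)
        return -1;
    ssize_t n = write(fd, "second\n", 7);
    close(fd);
    unlink(path);
    return n == 7 ? 0 : -1;
}
```

For an ordinary user the middle open() should fail with EACCES. A process running as root bypasses the permission check, which is why the sketch only reports that step instead of treating it as an error.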

Can a file descriptor be duplicated multiple times?

I've been looking for quite a while and cannot find the answer to my question.
I'm trying to reproduce a shell in C, with full redirections. In order to do this, I wanted to open the file before executing my command.
For example, in ls > file1 > file2, I use dup2(file1_fd, 1) and dup2(file2_fd, 1) and then I execute ls to fill the files, but it seems a standard output can only be open once so only file2 will be filled, because it was the last one to be duplicated.
Is there a way to redirect standard output to multiple file?
Is there something I am missing? Thanks!
Is there a way to redirect standard output to multiple files?
Multiple file descriptors cannot be merged into one. You need to write to each file descriptor separately. This is what the tee utility does for you.
What you are asking for is the exact reason why the tee command exists (you can take a look at its source code here).
You cannot make a single file descriptor refer to several files by calling dup2() multiple times. As you already saw, the last call overwrites any previous duplication. Therefore you cannot redirect the output of a program to multiple files directly using dup2().
In order to do this, you really need multiple descriptors, and therefore you would have to open both files, launch the command using popen() and then read from the pipe and write to both files.
Here is a very simple example of how you could do it:
#include <stdio.h>
#include <stdlib.h>
#define N 4096
int main(int argc, const char *argv[]) {
FILE *fp1, *fp2, *pipe;
fp1 = fopen("out1.txt", "w");
if (fp1 == NULL) {
perror("fopen out1 failed");
return 1;
}
fp2 = fopen("out2.txt", "w");
if (fp2 == NULL) {
perror("fopen out2 failed");
return 1;
}
// Run `ls -l` just as an example.
pipe = popen("ls -l", "r");
if (pipe == NULL) {
perror("popen failed");
return 1;
}
size_t nread, nwrote;
char buf[N];
while ((nread = fread(buf, 1, N, pipe))) {
nwrote = 0;
while (nwrote < nread)
nwrote += fwrite(buf + nwrote, 1, nread - nwrote, fp1);
nwrote = 0;
while (nwrote < nread)
nwrote += fwrite(buf + nwrote, 1, nread - nwrote, fp2);
}
pclose(pipe);
fclose(fp2);
fclose(fp1);
return 0;
}
The above code is only meant to give a rough idea of how the whole thing works. It doesn't check for errors from fread, fwrite, etc.; you should of course check for errors in your final program.
It's also easy to see how this could be extended to support an arbitrary number of output files (just using an array of FILE *).
Standard output is no different from any other open file; its only special characteristic is that it is file descriptor 1 (so only one open file can occupy index 1 in your process). You can dup(2) file descriptor 1 to get, let's say, file descriptor 6. That is the job of dup(): to give you another file descriptor (with a different number) for the same open file. You can then use any of the duplicates interchangeably for output. The file status flags (such as O_APPEND and O_NONBLOCK) are shared between them, while the close-on-exec flag is kept per descriptor. They also share the file offset, so a write() through any one of them advances the position seen by all the others.
But that is not the idea of redirection. A Unix convention says that every program receives three descriptors already open from its parent process. So, to fork output, you first need to decide on a notation to express that a program will receive more than one (already opened) output stream, so that you can redirect each of them properly before calling the program. The same applies to joining streams. There the problem is more complex, as you need to express how the data flows are merged into one, which makes the merging problem-dependent.
File dup()ping is not a way to make a file descriptor to write in two files... but the reverse, it is a way to make two different file descriptors to reference the same file.
The only way to do what you want is to duplicate write(2) calls on every file descriptor you are going to use.
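Duplicating the write(2) calls could look roughly like this. It is a sketch, not a replacement for tee(1), and the names write_all and copy_to_all are invented for the example:

```c
#include <errno.h>
#include <unistd.h>

/* Writes all len bytes to fd, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: copies everything readable from in_fd to each of
 * the nfds descriptors in out_fds. Returns 0 at end of input, -1 on error. */
int copy_to_all(int in_fd, const int *out_fds, int nfds)
{
    char buf[4096];
    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0)
            return 0;           /* end of input */
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        for (int i = 0; i < nfds; i++)
            if (write_all(out_fds[i], buf, (size_t)n) == -1)
                return -1;
    }
}
```

In a shell, in_fd would typically be the read end of a pipe whose write end is the child's stdout, and out_fds would hold the descriptors of file1 and file2.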
As another answer has noted, the tee(1) command lets you fork the flow of data in a pipe, but not at the file descriptor level: tee(1) just opens a file and write(2)s all of its input there, in addition to write(2)ing it to stdout.
There's no provision for forking data flows in the shell, just as there's no provision for joining data flows (in parallel) on input. I think this is an idea abandoned in the shell design by Steve Bourne, and you'll probably get to the same point.
By the way, consider the shell's general dup2() operator, which is n>&m. For the program being redirected, a command line such as 2>&3 2>&4 2>&5 2>&6 presupposes seven pre-opened file descriptors, 0 to 6. Each redirection makes stderr a duplicate of the named descriptor, and since they are applied left to right, only the last one (2>&6) remains in effect. Alternatively, 2<file_a 3<file_b 4<file_c means your program is executed with file descriptor 2 (stderr) redirected from file_a, and file descriptors 3 and 4 already open on file_b and file_c. Probably some notation would have to be designed (how to devise it doesn't come easily to my mind right now) to allow piping (with the pipe(2) system call) between different processes launched to do some task, but you would need to build a general graph to allow for generality.

how to use the system call dup?

I am trying to understand how the system call dup() works. I am asking this question because I am writing a shell in C and I need to redirect the STDOUT to a file. Is this the right way to do it?
If for example I have the following code:
remember = dup(STDOUT_FILENO);
fileDescriptor = open("file.txt",O_RDONLY);
then everything that writes to the stdout will now write to the opened file?
As soon as the following line is executed:
remember = dup(STDOUT_FILENO);
STDOUT_FILENO is removed from the table of file descriptors leaving the first spot empty. When a new file is opened, the earliest empty file descriptor will be appointed to this new opened file, so in this case 1.
Nope. You just duplicate the file descriptor for stdout.
With the code you have so far, you could now do a write to remember, and the output would go to console, too:
const char *str = "this now goes to console, too!";
write(remember, str, strlen(str));
If you want to redirect console output, you still have to do this:
dup2(fileDescriptor, STDOUT_FILENO);
This will close STDOUT_FILENO (but you have a duplicate in remember to restore it, if need be) and overwrite it with fileDescriptor – and from now on, console output goes to file...
If you never intend to restore output to the console, you can omit the first call to dup entirely...
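Put together, redirecting and later restoring stdout might look like the sketch below. The helper names redirect_stdout and restore_stdout are hypothetical, and the 0644 mode and O_TRUNC flag are choices of mine, not something the question specifies:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: points stdout at `path`. Returns a descriptor that
 * still refers to the original stdout, or -1 on failure. */
int redirect_stdout(const char *path)
{
    fflush(stdout);             /* do not carry buffered text across */

    int saved = dup(STDOUT_FILENO);
    if (saved == -1)
        return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        close(saved);
        return -1;
    }
    if (dup2(fd, STDOUT_FILENO) == -1) {
        close(fd);
        close(saved);
        return -1;
    }
    close(fd);                  /* descriptor 1 now refers to the file */
    return saved;
}

/* Hypothetical helper: undoes redirect_stdout() using its return value. */
int restore_stdout(int saved)
{
    fflush(stdout);
    int rc = dup2(saved, STDOUT_FILENO);
    close(saved);
    return rc == -1 ? -1 : 0;
}
```

In a shell you would normally do the redirection in the child between fork() and exec, in which case no restore is needed at all.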
Edit (in response to your edit):
STDOUT_FILENO is removed from the table of file descriptors leaving the first spot empty. When a new file is opened, the earliest empty file descriptor will be appointed to this new opened file, so in this case 1.
This applies for close(STDOUT_FILENO)!
So, back to the case where you do not want to restore stdout, you could also do:
close(STDOUT_FILENO);
fileDescriptor = open("file.txt", O_WRONLY | O_CREAT, 0644);
// fileDescriptor will be 1 now
By the way: you must open your file with write access enabled (O_WRONLY or O_RDWR), as you want to write to that file (redirect output to it)!
You also need the O_CREAT flag (together with a mode argument) in case the file does not exist yet. If you want to clear an existing file first, add O_TRUNC; if you want to append to it instead, add the O_APPEND flag (see open).
No, dup() is used to duplicate an existing descriptor; it returns a duplicate whose value is the lowest among the free descriptors (some docs say "The new descriptor returned by the call is the lowest numbered descriptor currently not in use by the process."), so:
fileDescriptor = open("file.txt",O_RDONLY); // get new desc.
close(STDIN_FILENO); // close 0
dup(fileDescriptor); // dup new desc to 0 (the lowest free desc).
// here fileDescriptor and 0 are aliases to the same opened file
close(fileDescriptor); // free unused desc.

Why does calling write() with stdin result in output? [duplicate]

I was working on an assignment where a program took a file descriptor as an argument (generally from the parent in an exec call) and read from a file and wrote to a file descriptor, and in my testing, I realized that the program would work from the command-line and not give an error if I used 0, 1 or 2 as the file descriptor. That made sense to me except that I could write to stdin and have it show on the screen.
Is there an explanation for this? I always thought there was some protection on stdin/stdout and you certainly can't fprintf to stdin or fgets from stdout.
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
int main()
{
char message[20];
read(STDOUT_FILENO, message, 20);
write(STDIN_FILENO, message, 20);
return 0;
}
Attempting to write on a file marked readonly or vice-versa would cause write and read to return -1, and fail. In this specific case, stdin and stdout are actually the same file. In essence, before your program executes (if you don't do any redirection) the shell goes:
if(!fork()){
<close all fd's>
int fd = open("/dev/tty1", O_RDWR);
dup(fd);
dup(fd);
execvp("name", argv);
}
So, stdin, out, and err are all duplicates of the same file descriptor, opened for reading and writing.
read(STDIN_FILENO, message, 20);
write(STDOUT_FILENO, message, 20);
Should work. Note: stdout may be a different place from stdin (even on the command line). You can feed the output of another process into your process as stdin, or arrange for stdin/stdout to be files.
fprintf/fgets have a buffer - thus reducing the number of system calls.
Best guess - stdin points to where the input is coming from, your terminal and stdout points to where output should be going, your terminal. Since they both point to the same place they are interchangeable(in this case)?
If you run a program on UNIX
myapp < input > output
You can open /proc/{pid}/fd/1 and read from it, open /proc/{pid}/fd/0 and write to it and for example, copy output to input. (There is possibly a simpler way to do this, but I know it works)
You can do any manner of things which are plain confusing if you put your mind to it. ;)
It's very possible that file descriptors 0, 1, and 2 are all open for both reading and writing (and in fact that they all refer to the same underlying "open file description"), in which case what you're doing will work. But as far as I know, there's no guarantee, so it also might not work. I do believe POSIX somewhere specifies that if stderr is connected to the terminal when a program is invoked by the shell, it's supposed to be readable and writable, but I can't find the reference right off..
Generally, I would recommend against ever reading from stdout or stderr unless you're looking for a terminal to read a password from, and stdin has been redirected (not a tty). And I would recommend never writing to stdin - it's dangerous and you could end up clobbering a file the user did not expect to be written to!
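If you want to check rather than guess, fcntl(F_GETFL) reports the access mode a descriptor was opened with. A small sketch, with fd_access_mode and print_std_modes as made-up helper names:

```c
#include <fcntl.h>
#include <stdio.h>

/* Hypothetical helper: describes the access mode a descriptor was opened
 * with, using the O_ACCMODE mask on the flags from fcntl(F_GETFL). */
const char *fd_access_mode(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return "not open";

    switch (flags & O_ACCMODE) {
    case O_RDONLY: return "read-only";
    case O_WRONLY: return "write-only";
    case O_RDWR:   return "read-write";
    default:       return "unknown";
    }
}

/* Prints the access mode of stdin, stdout and stderr. */
void print_std_modes(void)
{
    for (int fd = 0; fd <= 2; fd++)
        printf("fd %d: %s\n", fd, fd_access_mode(fd));
}
```

Run from an interactive terminal without redirection, I would expect all three to report read-write, and a redirected descriptor (for example with < input) to report the mode the shell opened it with. That matches the explanation above but is not something to rely on.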

Redirecting of stdout in bash vs writing to file in c with fprintf (speed)

I am wondering which option is basically quicker.
What interests me the most is the mechanism of redirection. I suspect the file is opened at the start of the program ./program > file and is closed at the end. Hence every time a program outputs something it should be just written to a file, as simple as it sounds. Is it so? Then I guess both options should be comparable when it comes to speed.
Or maybe it is more complicated process since the operating system has to perform more operations?
There is not much difference between the two options (except that hard-coding the file makes your program less flexible).
To compare both approaches, let's check what lies behind the magical entity FILE*:
So in both cases we have a FILE* object, a file descriptor fd - a gateway to an OS kernel and in-kernel infrastructure that provides access to files or user terminals, which should (unless libc has some special initializer for stdout or kernel specially handles files with fd = 1).
How does bash redirection work compared with fopen()?
When bash redirects to a file:
fork() // new process is created
fd = open("file", ...) // open new file
close(1) // get rid of fd=1 pointing to /dev/pts device
dup2(fd, 1) // make fd=1 point to opened file
close(fd) // get rid of redundant fd
execve("a") // now "a" will have file as its stdout
// in a
stdout = fdopen(1, ...)
When you open the file on your own:
fork() // new process is created
execve("a") // "a" keeps the inherited stdout (e.g. the terminal)
// in a
stdout = fdopen(1, ...)
my_file = fopen("file", ...)
// which internally amounts to:
fd = open("file", ...)
my_file = fdopen(fd, ...)
So as you can see, the main difference with bash is the extra juggling of file descriptors.
Yes, you are right. The speed will be identical. The only difference in the two cases is which program opens and closes the file. When you redirect it using shell, it is the shell that opens the file and makes the handle available as stdout to the program. When the program opens the file, well, the program opens the file. After that, the handle is a file handle in both the cases, so there should be absolutely no difference in speed.
As a side remark, the program which writes to stdout can be used in more general ways. You can for example say
./program | ssh remotehost bash -c "cat > file"
which will cause the output of the program to be written to file on remotehost. Of course in this case there is no comparison like one you are making in the question.
stdout is a FILE handle, fprintf writes to a file handle, so the speed will be very similar in both cases. In fact printf("Some string") is equivalent to fprintf(stdout, "Some string"). I will say no more :)
