I'm trying to create a Linux daemon in C and found some sample code on this page.
I understand all the code except where it tries to redirect STDIN, STDOUT and STDERR (to /dev/null). I also found a number of questions on here related to why these should be redirected (which I understand).
Specifically the section of code my question relates to is:
/* Route I/O connections */
/* Open STDIN */
i = open("/dev/null", O_RDWR);
/* STDOUT */
dup(i);
/* STDERR */
dup(i);
Reading the man page for dup() it implies that dup() simply duplicates a file descriptor.
So I don't understand how this does the redirect. Is the compiler taking hints from the comments on the line above, is some code missing, is it plain wrong, or am I missing something?
It's important to understand the previous bit of the example code you link to:
/* close all descriptors */
for (i = getdtablesize(); i >= 0; --i)
{
close(i);
}
This closes all open file descriptors including STDIN, STDOUT and STDERR.
As the manpage for open() states
The file descriptor returned by a successful call will be the lowest-numbered file descriptor not currently open for the process
So the subsequent call to open() in the example code will redirect file descriptor 0 which is STDIN, to /dev/null.
The subsequent calls to dup() will duplicate the file descriptor using the next lowest numbers. STDOUT is 1, and STDERR is 2.
The manpage for dup() states:
The dup() system call creates a copy of the file descriptor oldfd, using the lowest-numbered unused descriptor for the new descriptor
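To see the "lowest-numbered unused descriptor" rule at work, here is a minimal sketch of the same open/dup/dup sequence. It runs in a forked child so the caller's own streams stay intact, and it closes only descriptors 0, 1 and 2 rather than the whole table as the sample code does. The function name redirect_std_to_devnull_in_child is made up for this illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the close/open/dup/dup sequence in a child process.
 * Returns 0 if /dev/null ended up on descriptors 0, 1 and 2, -1 otherwise. */
int redirect_std_to_devnull_in_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Simplification: only 0, 1 and 2 are closed here. */
        close(STDIN_FILENO);
        close(STDOUT_FILENO);
        close(STDERR_FILENO);

        int in  = open("/dev/null", O_RDWR); /* lowest free descriptor: 0 */
        int out = dup(in);                   /* next lowest: 1 */
        int err = dup(in);                   /* next lowest: 2 */

        _exit(in == 0 && out == 1 && err == 2 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Nothing in the calls names stdin, stdout or stderr; the numbers come out as 0, 1 and 2 only because those were the lowest free slots at the time.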
From the man page of dup:
The dup(oldfd) system call creates a copy of the file descriptor oldfd,
using the lowest-numbered unused descriptor for the new descriptor.
If you see the referenced code, he is first closing all the open file descriptors:
for (i = getdtablesize(); i >= 0; --i)
{
close(i);
}
After that, the call to open() returns the lowest available descriptor, which will be 0 (stdin). Calling dup(i) then copies it to descriptor 1 (stdout), and calling it again copies it to descriptor 2 (stderr). In this way, the stdin, stdout, and stderr of the daemon process are all pointing to /dev/null.
Every process gets three open file descriptors which are the stdin, stdout, and stderr (these descriptors usually have the values 0, 1, and 2 respectively). When you call printf(), for example, it writes to the file pointed to by the stdout descriptor. By pointing this descriptor to another file (such as /dev/null), any output from this process will get redirected to that file. Same logic applies for stdin and stderr.
On the shell, when you run something like ls > ls.out, the shell does the same. It fork()s a new process, opens ls.out for writing, and calls dup (or dup2) to copy the file descriptor of ls.out to this process' stdout.
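A rough sketch of what the shell does for a command like echo hello > file, assuming echo can be found on the PATH. The function name run_echo_redirected is invented for this example, and a real shell does considerably more (error reporting, job control, and so on).

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Rough equivalent of the shell running `echo hello > path`.
 * Returns 0 if the child ran and exited successfully, -1 otherwise. */
int run_echo_redirected(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(127);
        if (dup2(fd, STDOUT_FILENO) < 0)   /* stdout now refers to the file */
            _exit(127);
        if (fd != STDOUT_FILENO)
            close(fd);                     /* the extra descriptor is no longer needed */
        execlp("echo", "echo", "hello", (char *)NULL);
        _exit(127);                        /* only reached if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The redirection happens in the child between fork() and exec, so the program being run never knows its output goes to a file, and the parent's stdout is untouched.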
I feel like this is a topic I've taken for granted. In the past I literally just closed as many file descriptors "because I was told to". Most of the time this worked, but occasionally I ran into some unpredictable behaviour.
Thus, I'd like to ask: what is the rule for closing file descriptors after calling dup / dup2?
Let's say I want to perform cat < in > out.
fd[IN] = open("in", O_RDONLY);
saved_stdin = dup(STDIN_FILENO);
dup2(fd[IN], STDIN_FILENO);
close(fd[IN]);
fd[OUT] = open("out", O_WRONLY | O_CREAT | O_TRUNC, 0644);
saved_stdout = dup(STDOUT_FILENO);
dup2(fd[OUT], STDOUT_FILENO);
close(fd[OUT]);
// Later on when I want to restore stdin and stdout
dup2(saved_stdin, STDIN_FILENO);
close(saved_stdin);
dup2(saved_stdout, STDOUT_FILENO);
close(saved_stdout);
Is this correct or should I be closing more file descriptors?
The rule is indeed quite simple. For both dup() variants, the following holds:
The source fd remains open and will have to be closed once it is no longer needed.
The target file descriptor,
when using dup(), is always an unused one
when using dup2(), is implicitly closed and replaced by a copy of the source fd.
The new target fd has to be closed, when it is no longer needed.
Source fd refers to the file descriptor to be duplicated, while target fd is the new file descriptor.
int new_fd = dup(source_fd);
dup2(source_fd, new_fd);
So yes, your code does the necessary closes, and no unneeded ones.
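The same save, redirect, restore pattern can be written with error checking, as a minimal sketch. It assumes stdout is open when the function is called; the function name write_with_stdout_redirected is hypothetical.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Temporarily points stdout at `path`, writes `msg`, then restores stdout.
 * Returns 0 on success, -1 on failure. */
int write_with_stdout_redirected(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    int saved_stdout = dup(STDOUT_FILENO);   /* new fd: must be closed later */
    if (saved_stdout < 0) {
        close(fd);
        return -1;
    }

    int rc = -1;
    if (dup2(fd, STDOUT_FILENO) >= 0) {      /* old target of fd 1 is closed implicitly */
        size_t len = strlen(msg);
        if (write(STDOUT_FILENO, msg, len) == (ssize_t)len)
            rc = 0;
    }
    close(fd);                               /* source fd is no longer needed */

    if (dup2(saved_stdout, STDOUT_FILENO) < 0)   /* restore the original stdout */
        rc = -1;
    close(saved_stdout);                     /* the saved copy is no longer needed */
    return rc;
}
```

Each descriptor returned by open() or dup() gets exactly one close(), while descriptor 1 itself is never closed explicitly. If you write with printf() instead of write(), call fflush(stdout) before each dup2() so buffered output lands in the intended file.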
The figures come from CSAPP (System-Level I/O):
Figure 2: Before Redirect IO
dup2(4,1);
Figure 1: After Redirect IO
Notice that after the call to dup2, the refcnt of the file that fd 1 originally referred to has dropped to 0.
The description of close in the Linux manual says:
if the file descriptor was the last reference to a file which has been removed using unlink(2), the file is deleted.
close is used to decrease the refcnt of an open file, and creating a new fd with dup increases it. When we call close, it doesn't close the file immediately; it only decrements the file's reference count. The file is closed (and deleted, if it has been unlinked) when the refcnt reaches 0.
So it's really like reference counting in memory management.
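A small sketch of that behaviour, assuming /tmp is writable: a file is opened, its descriptor is duplicated, and one of the two is closed. The file stays open and the remaining descriptor still sees the shared file offset. The function name offset_seen_through_duplicate is invented for the example.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Writes 5 bytes through one descriptor, closes it, and returns the file
 * offset seen through its duplicate (or -1 on failure). */
long offset_seen_through_duplicate(void)
{
    char path[] = "/tmp/dup_refcnt_XXXXXX";
    int fd = mkstemp(path);          /* one reference to the open file */
    if (fd < 0)
        return -1;
    unlink(path);                    /* the name is gone; the open file lives on */

    int copy = dup(fd);              /* second reference to the same open file */
    if (copy < 0) {
        close(fd);
        return -1;
    }

    long result = -1;
    if (write(fd, "hello", 5) == 5) {
        close(fd);                   /* drops one reference; the file stays open */
        fd = -1;
        result = (long)lseek(copy, 0, SEEK_CUR);   /* shared offset */
    }
    if (fd >= 0)
        close(fd);
    close(copy);                     /* last reference: now the file is really closed */
    return result;
}
```

The call returns 5: the write went through fd, but the offset is stored in the shared open file, so copy sees it even after fd has been closed.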
I am completely confused, is it possible that stdin, stdout, and stderr point to the same filedescriptor internally?
Because it makes no difference in C, if I want to read a string from the console, whether I use stdin or stdout as the input.
read(1, buf, 200) works as read(0, buf, 200) how is this possible?
(0 == STDIN_FILENO == fileno(stdin),
1 == STDOUT_FILENO == fileno(stdout))
When the input comes from the console, and the output goes to the console, then all three indeed happen to refer to the same file. (But the console device has quite different implementations for reading and writing.)
Anyway, you should use stdin/stdout/stderr only for their intended purpose; otherwise, redirections like the following would not work:
<inputfile myprogram >outputfile
(Here, stdin and stdout refer to two different files, and stderr refers to the console.)
One thing that some people seem to be overlooking: read is the low-level system call. Its first argument is a Unix file descriptor, not a FILE* like stdin, stdout and stderr. You should be getting a compiler warning about this:
warning: passing argument 1 of ‘read’ makes integer from pointer without a cast [-Wint-conversion]
int r = read(stdout, buf, 200);
^~~~~~
On my system, it doesn't work with either stdin or stdout. read always returns -1, and errno is set to EBADF, which is "Bad file descriptor". It seems unlikely to me that those exact lines work on your system: the pointer would have to point to memory address 0, 1 or 2, which won't happen on a typical machine.
To use read, you need to pass it STDIN_FILENO, STDOUT_FILENO or STDERR_FILENO.
To use a FILE* like stdin, stdout or stderr, you need to use fread instead.
is it possible that stdin, stdout, and stderr point to the same filedescriptor internally?
A file descriptor is an index into the file descriptor table of your process (see also credentials(7)...). By definition STDIN_FILENO is 0, STDOUT_FILENO is 1, and STDERR_FILENO is 2. Read about proc(5) to query information about some process (for example, try ls -l /proc/$$/fd in your interactive shell).
The program (usually, but not always, some shell) which has execve(2)-d your executable might have called dup2(2) to share (i.e. duplicate) some file descriptors.
See also fork(2), intro(2) and read some Linux programming book, such as the old ALP.
Notice that read(2) from STDOUT_FILENO could fail (e.g. with errno(3) being EBADF) in the (common) case where stdout is not readable (e.g. after redirection by the shell). If reading from the console, it could be readable. Read also the Tty Demystified.
There is nothing prohibiting any number of file handles from referring to the same thing in the kernel.
And the default for a terminal program is to have STDIN, STDOUT and STDERR refer to the same terminal.
So, it might look like it doesn't matter which you use, but it will all go wrong if the caller does any handle redirection, which is quite common.
The most common case is piping the output of one program into the input of the next, while keeping stderr out of the pipe.
An example for the shell:
source | filter | sink
Programs such as login and xterm typically open the tty device once when creating a new terminal session, and duplicate the file descriptor two or three times, arranging for file descriptors 0, 1 and 2 to be linked to the open file description of the opened tty device. They typically close all other file descriptors before exec-ing the shell. So if no further redirection is done by the shell or its child processes, the file descriptors, 0, 1 and 2, remain linked to the same file. Because the underlying tty device was opened in read-write mode, all three file descriptors have both read and write access.
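Whether a given descriptor can be read from depends on how the underlying file was opened, and that can be queried with fcntl(F_GETFL). A minimal sketch follows; the helper name fd_is_readable is hypothetical.

```c
#include <fcntl.h>

/* Returns 1 if fd was opened with read access, 0 if not, -1 if fd is invalid. */
int fd_is_readable(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;
    int mode = flags & O_ACCMODE;   /* isolate the access-mode bits */
    return mode == O_RDONLY || mode == O_RDWR;
}
```

In a plain terminal session, fd_is_readable(1) would typically report 1, because the tty was opened read-write. After a redirection such as >outputfile it would typically report 0, since the shell opens the file write-only.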
I am trying to redirect stdout to a socket. I do something like this:
dup2(new_fd, STDOUT_FILENO);
After doing so all stdio functions writing to the stdout fail. I have tried to reopen stdout this way:
fclose(stdout);
stdout = fdopen(STDOUT_FILENO, "wb");
But printf and other functions still don't work.
EDIT:
I am afraid that I misunderstood the problem in the first place. After some more debugging I've figured out that this is the real issue:
printf("Test"); // We get Broken pipe here
// Reconnect new_fd
dup2(new_fd, STDERR_FILENO);
printf("Test"); // This also returns Broken pipe despite that stdout is fine now
Thanks.
1: on dup2(src, dst)
A number of operating systems track open files through the use of file descriptors. dup2 internally duplicates a file descriptor from the src to dst, closing dst if it's already open.
What your first statement is doing is making every write to STDOUT_FILENO go to the object represented by new_fd. I say object because it could be a socket as well as a file.
I don't see anything wrong with your first line of code, but I don't know how new_fd is defined.
2: on reopening stdout
When you close a file descriptor, the OS removes it from its table. However, when you open a file descriptor, the OS sets the smallest available file descriptor as the returned value. Thus, to reopen stdout, all you need to do is reopen the device. I believe the device changes depending on the OS. For example, on my Mac the device is /dev/tty.
Therefore, to reopen the stdout, you want to do the following:
close(1);
open("/dev/tty", O_WRONLY);
I've solved the problem by clearing stdio's error indicator after fixing stdout:
clearerr(stdout);
Thanks for your help.
I'm trying to understand the use of dup2 and dup.
From the man page:
DESCRIPTION
dup and dup2 create a copy of the file descriptor oldfd. After successful return of dup or dup2, the old and new descriptors may be used interchangeably. They share locks, file position pointers and flags; for example, if the file position is modified by using lseek on one of the descriptors, the position is also changed for the other.
The two descriptors do not share the close-on-exec flag, however. dup uses the lowest-numbered unused descriptor for the new descriptor.
dup2 makes newfd be the copy of oldfd, closing newfd first if necessary.
RETURN VALUE
dup and dup2 return the new descriptor, or -1 if an error occurred (in which case, errno is set appropriately).
Why would I need that system call? What is the use of duplicating the file descriptor? If I have the file descriptor, why would I want to make a copy of it? I'd appreciate it if you could explain and give me an example where dup2 / dup is needed.
The dup system call duplicates an existing file descriptor, returning a new one that
refers to the same underlying I/O object.
Dup allows shells to implement commands like this:
ls existing-file non-existing-file > tmp1 2>&1
The 2>&1 tells the shell to give the command a file descriptor 2 that is a duplicate of descriptor 1 (i.e. stderr and stdout refer to the same open file).
Now both the error message from calling ls on the non-existing file and the normal output of ls on the existing file show up in the tmp1 file.
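A sketch of how a shell might set up > file 2>&1 with dup2, run in a child process. To keep it short, the child writes one line to each descriptor itself instead of exec-ing ls. The function name write_out_and_err_to is hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* In a child process, sets up the equivalent of `> path 2>&1`, then writes
 * one line to stdout and one to stderr.
 * Returns 0 if the child succeeded, -1 otherwise. */
int write_out_and_err_to(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0 || dup2(fd, STDOUT_FILENO) < 0)        /* > path */
            _exit(1);
        if (dup2(STDOUT_FILENO, STDERR_FILENO) < 0)       /* 2>&1   */
            _exit(1);
        if (fd > STDERR_FILENO)
            close(fd);

        /* Both descriptors share one file offset, so the second write
         * continues where the first one stopped. */
        int ok = write(STDOUT_FILENO, "out\n", 4) == 4 &&
                 write(STDERR_FILENO, "err\n", 4) == 4;
        _exit(ok ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Had the file been opened twice instead of duplicated, each descriptor would have its own offset and the second write would overwrite the first.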
The following example code runs the program wc with standard input connected
to the read end of a pipe.
int p[2];
char *argv[2];
argv[0] = "wc";
argv[1] = 0;
pipe(p);                 // p[0] is the read end, p[1] is the write end
if(fork() == 0) {
    close(0);            // child closes stdin, freeing descriptor 0
    dup(p[0]);           // copies the read end onto the lowest free fd, i.e. 0 (stdin)
    close(p[0]);
    close(p[1]);
    execv("/bin/wc", argv);  // POSIX spelling; the xv6 original calls exec("/bin/wc", argv)
} else {
    write(p[1], "hello world\n", 12);
    close(p[0]);
    close(p[1]);
}
The child dups the read end onto file descriptor 0, closes the file descriptors in p, and execs wc. When wc reads from its standard input, it reads from the pipe.
This is how shell pipelines are implemented using dup. That is just one use of dup; you can then use pipes to build something else. That's the beauty of system calls: you build one thing after another using tools that are already there, and those tools were in turn built on something else, and so on.
In the end, system calls are the most basic tools the kernel gives you.
Cheers :)
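For anyone who wants to try the wc example on a regular Linux or other POSIX system, here is a self-contained sketch. It differs from the snippet above in a few ways that are my own choices: it uses dup2 and execlp, runs wc -c found via the PATH, and adds a second pipe so the parent can read back what wc printed. It assumes descriptors 0 and 1 are open in the caller and that the text is short enough to fit in the pipe buffer. The function name count_bytes_with_wc is made up.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Feeds `text` to `wc -c` through a pipe and returns the byte count that wc
 * printed, or -1 on failure. */
long count_bytes_with_wc(const char *text)
{
    int to_wc[2], from_wc[2];
    if (pipe(to_wc) < 0)
        return -1;
    if (pipe(from_wc) < 0) {
        close(to_wc[0]);
        close(to_wc[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(to_wc[0]);   close(to_wc[1]);
        close(from_wc[0]); close(from_wc[1]);
        return -1;
    }
    if (pid == 0) {
        dup2(to_wc[0], STDIN_FILENO);      /* read end of the pipe becomes stdin */
        dup2(from_wc[1], STDOUT_FILENO);   /* wc's output goes back to the parent */
        close(to_wc[0]);   close(to_wc[1]);
        close(from_wc[0]); close(from_wc[1]);
        execlp("wc", "wc", "-c", (char *)NULL);
        _exit(127);                        /* only reached if exec failed */
    }

    close(to_wc[0]);                       /* parent keeps only the ends it uses */
    close(from_wc[1]);

    size_t len = strlen(text);
    ssize_t written = write(to_wc[1], text, len);
    close(to_wc[1]);                       /* wc sees EOF on its stdin */

    char buf[64];
    ssize_t n = read(from_wc[0], buf, sizeof buf - 1);
    close(from_wc[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    if (written != (ssize_t)len || n <= 0)
        return -1;
    buf[n] = '\0';
    return strtol(buf, NULL, 10);          /* skips any leading spaces wc prints */
}
```

Closing the unused pipe ends matters: if the child kept the write end of to_wc open, wc would never see end-of-file on its standard input.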
Another reason for duplicating a file descriptor is using it with fdopen. fclose closes the file descriptor that was passed to fdopen, so if you don't want the original file descriptor to be closed, you have to duplicate it with dup first.
dup is used to be able to redirect the output from a process.
For example, if you want to save the output from a process, you duplicate the output descriptor (fd 1) to save it, redirect fd 1 to a file, then fork and execute the process, and when the process finishes, you restore fd 1 from the saved copy.
Some points worth noting about dup/dup2:
dup/dup2 - Technically, the purpose is to share one file table entry within a single process through different handles. (When forking, the descriptors are duplicated in the child process by default, and the file table entries are shared as well.)
That means that, using dup/dup2, we can have more than one file descriptor, possibly with different attributes, for a single open file table entry.
(Though currently the FD_CLOEXEC flag seems to be the only attribute of a file descriptor.)
http://www.gnu.org/software/libc/manual/html_node/Descriptor-Flags.html
dup(fd) is equivalent to fcntl(fd, F_DUPFD, 0);
dup2(fildes, fildes2); is equivalent to
close(fildes2);
fcntl(fildes, F_DUPFD, fildes2);
The differences (for the latter) are: apart from some errno values that differ between dup2 and fcntl,
close followed by fcntl is not atomic and may introduce race conditions, since two function calls are involved.
Details can be checked from
http://pubs.opengroup.org/onlinepubs/009695399/functions/dup.html
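Two small helpers to experiment with these points, as a sketch: one wraps fcntl(F_DUPFD) with a lower bound for the new descriptor, and the other reports the per-descriptor FD_CLOEXEC flag. The names dup_at_least and has_cloexec are hypothetical.

```c
#include <fcntl.h>

/* Duplicates fd onto the lowest free descriptor >= min_fd.
 * With min_fd == 0 this behaves like dup(fd). Returns the new fd or -1. */
int dup_at_least(int fd, int min_fd)
{
    return fcntl(fd, F_DUPFD, min_fd);
}

/* Returns 1 if the close-on-exec flag is set on fd, 0 if not, -1 on error. */
int has_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) != 0;
}
```

If FD_CLOEXEC is set on a descriptor and you then duplicate it with dup_at_least(fd, 10), has_cloexec() on the copy reports 0: the flag belongs to the descriptor and is not copied by F_DUPFD or dup.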
An Example of use -
One interesting example is the implementation of job control in a shell, where the use of dup/dup2 can be seen in the link below:
http://www.gnu.org/software/libc/manual/html_node/Launching-Jobs.html#Launching-Jobs
In Stevens' UNIX Network Programming, he mentions redirecting stdin, stdout and stderr, which is needed when setting up a daemon. He does it with the following C code
/* redirect stdin, stdout, and stderr to /dev/null */
open("/dev/null", O_RDONLY);
open("/dev/null", O_RDWR);
open("/dev/null", O_RDWR);
I'm confused how these three 'know' they are redirecting the three std*. Especially since the last two commands are the same. Could someone explain or point me in the right direction?
Presumably file descriptors 0, 1, and 2 have already been closed when this code executes, and there are no other threads which might be allocating new file descriptors. In this case, since open is required to always allocate the lowest available file descriptor number, these three calls to open will yield file descriptors 0, 1, and 2, unless they fail.
It's because file descriptors 0, 1 and 2 are input, output and error respectively, and open will grab the first file descriptor available. Note that this will only work if file descriptors 0, 1 and 2 are not already being used.
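If you would rather not rely on 0, 1 and 2 being free, a commonly used alternative (not the code from the book) is to open /dev/null once and force it onto each standard descriptor with dup2. A sketch, with hypothetical function names; the second function only exists to run the first in a child process so the caller keeps its own streams.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Points descriptors 0, 1 and 2 at /dev/null whether or not they were open
 * beforehand. Returns 0 on success, -1 on failure. */
int redirect_std_to_devnull(void)
{
    int fd = open("/dev/null", O_RDWR);
    if (fd < 0)
        return -1;

    int rc = 0;
    for (int target = STDIN_FILENO; target <= STDERR_FILENO; ++target) {
        if (dup2(fd, target) < 0) {   /* does nothing when fd == target */
            rc = -1;
            break;
        }
    }
    if (fd > STDERR_FILENO)
        close(fd);                    /* keep it if it already is 0, 1 or 2 */
    return rc;
}

/* Runs the redirection in a child process.
 * Returns 0 if it succeeded there, -1 otherwise. */
int redirect_in_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(redirect_std_to_devnull() == 0 ? 0 : 1);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Because dup2 names its target explicitly, this version does not depend on which descriptors happen to be free, and it is not affected by another thread grabbing descriptor 1 between two calls to open.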
And you should be careful about the terms used: stdin, stdout and stderr are actually file handles (FILE*) rather than file descriptors, although each of them is associated with the corresponding file descriptor.