What is the purpose of file descriptors? [duplicate] - c

This question already has answers here:
What's the difference between a file descriptor and a file pointer?
My understanding is that both fopen() and open() can be used to open files. open() returns a file descriptor. But they should be equivalent in terms of getting a file for reading or writing. What is the purpose of defining file descriptors? It is not clear from the wiki page.
https://en.wikipedia.org/wiki/File_descriptor

fopen returns a FILE * which is a wrapper around the file descriptor (I will ignore the "this is not required by the specification" aspect here, as I am not aware of an implementation that does not do this). At a high level, it looks like this:
application --FILE *--> libc --file descriptor--> kernel
Shells operate directly on file descriptors mainly because they are executing other programs, and you cannot modify another program's FILE * objects. You can, however, modify another program's file descriptors using the dup2 syscall at startup (i.e. between fork and exec). For example:
/bin/cat > foo.txt
This tells the shell to execute the /bin/cat program, but first redirect stdout (file descriptor #1) to a file that it opens. It is implemented roughly like this (error handling omitted):
if (fork() == 0) {
    /* child: point file descriptor #1 at foo.txt, then run cat */
    int fd = open("foo.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    dup2(fd, STDOUT_FILENO);
    execl("/bin/cat", "cat", (char *)NULL);
}
The closest thing you can do with a FILE * is calling freopen, but the FILE * itself, unlike the underlying file descriptor, does not survive exec.
But why do we need FILE * at all, if it's just a wrapper around a file descriptor? One main benefit is having a readahead buffer. For example, consider fgets. It will eventually call the read syscall on the file descriptor associated with the FILE * you pass in. But how does it know how much to read? The kernel has no way to say "give me one line" (line-buffered ttys aside). If a single read grabs more than one line, the extra bytes have to be kept somewhere, or the next call to fgets would see only part of the following line, because the kernel already handed out its beginning in the previous read syscall. The other option would be calling read one character at a time, which is terrible for performance.
So what does libc do? It reads a bunch of characters at once, then stores the extra characters in an internal buffer on the FILE * object. The next time you call fgets, it is able to use the internal buffer. This buffer is also shared with functions like fread, so you can interleave calls to fgets and fread without losing data.
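As a rough illustration of what such a readahead buffer looks like, here is a minimal sketch built directly on the read syscall (buffered_getline and its static buffer are hypothetical names, not the real libc implementation):

#include <unistd.h>

/* Hypothetical readahead buffer, refilled in large chunks. */
static char rbuf[4096];
static size_t rbuf_len = 0, rbuf_pos = 0;

/* Copy one line (at most max-1 bytes) from fd into line; return its length, 0 at EOF. */
size_t buffered_getline(int fd, char *line, size_t max)
{
    size_t n = 0;
    while (n + 1 < max) {
        if (rbuf_pos == rbuf_len) {              /* buffer drained: ask the kernel for more */
            ssize_t r = read(fd, rbuf, sizeof rbuf);
            if (r <= 0)
                break;                           /* EOF or error */
            rbuf_len = (size_t)r;
            rbuf_pos = 0;
        }
        char c = rbuf[rbuf_pos++];               /* hand out bytes from the buffer */
        line[n++] = c;
        if (c == '\n')
            break;                               /* a whole line has been assembled */
    }
    line[n] = '\0';
    return n;
}

The leftover bytes stay in the buffer, so the next call continues where the previous line ended, which is exactly what the FILE * buffer does for fgets and fread.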

The two functions operate at different levels:
open() is a lower-level, POSIX function to open a file. It returns a distinct integer to identify, and enable access to, the file opened. This integer is a file descriptor.
fopen() is a higher-level, portable, C standard-library function to open a file.
On a POSIX system, the portable fopen() probably calls the nonportable open(), but this is an implementation detail.
When in doubt, prefer fopen().
For more information on a Linux system, see man 2 read. The POSIX read() function reads data via the file descriptor returned by open().
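As a sketch of the descriptor-level interface described above (the filename and buffer size are just examples):

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("data.txt", O_RDONLY);        /* file descriptor, or -1 on error */
    if (fd == -1)
        return 1;

    char buf[256];
    ssize_t n = read(fd, buf, sizeof buf);      /* read up to 256 bytes via the descriptor */
    if (n > 0)
        write(STDOUT_FILENO, buf, (size_t)n);   /* echo them to standard output */

    close(fd);
    return 0;
}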

Related

Prevent called function to print output [duplicate]

I want to reopen the stdin and stdout (and perhaps stderr while I'm at it) filehandles, so that future calls to printf() or putchar() or puts() will go to a file, and future calls to getc() and such will come from a file.
1) I don't want to permanently lose standard input/output/error. I may want to reuse them later in the program.
2) I don't want to open new filehandles because these filehandles would have to be either passed around a lot or global (shudder).
3) I don't want to use any open() or fork() or other system-dependent functions if I can't help it.
So basically, does it work to do this:
stdin = fopen("newin", "r");
And, if it does, how can I get the original value of stdin back? Do I have to store it in a FILE * and just get it back later?
Why use freopen()? The C89 specification has the answer in one of the endnotes for the section on <stdio.h>:
116. The primary use of the freopen function is to change the file associated with a standard text stream (stderr, stdin, or stdout), as those identifiers need not be modifiable lvalues to which the value returned by the fopen function may be assigned.
freopen is commonly misused, e.g. stdin = freopen("newin", "r", stdin);. This is no more portable than fclose(stdin); stdin = fopen("newin", "r");. Both expressions attempt to assign to stdin, which is not guaranteed to be assignable.
The right way to use freopen is to omit the assignment: freopen("newin", "r", stdin);
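A small self-contained example of that usage (the filename is illustrative; freopen returns NULL on failure, so it is worth checking):

#include <stdio.h>

int main(void)
{
    if (freopen("newin", "r", stdin) == NULL) {  /* reassociate the stdin stream, no assignment */
        perror("freopen");
        return 1;
    }
    /* getchar(), scanf(), and fgets() on stdin now read from "newin" */
    int c = getchar();
    return (c == EOF) ? 1 : 0;
}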
I think you're looking for something like freopen()
This is a modified version of Tim Post's method; I used /dev/tty instead of /dev/stdout. I don't know why it doesn't work with /dev/stdout (which is a link to /proc/self/fd/1):
freopen("log.txt","w",stdout);
...
...
freopen("/dev/tty","w",stdout);
By using /dev/tty the output is redirected to the terminal from where the app was launched.
Hope this info is useful.
freopen("/my/newstdin", "r", stdin);
freopen("/my/newstdout", "w", stdout);
freopen("/my/newstderr", "w", stderr);
... do your stuff
freopen("/dev/stdin", "r", stdin);
...
...
This pegs the needle on my round-peg-square-hole-o-meter; what are you trying to accomplish?
Edit:
Remember that stdin, stdout and stderr correspond to file descriptors 0, 1 and 2 in every newly created process. freopen() keeps the same streams, and in practice the same descriptor numbers; it just associates them with new files.
So, a good way to ensure that this is actually doing what you want it to do would be:
printf("Stdout is descriptor %d\n", fileno(stdout));
freopen("/tmp/newstdout", "w", stdout);
printf("Stdout is now /tmp/newstdout and hopefully still fd %d\n",
fileno(stdout));
freopen("/dev/stdout", "w", stdout);
printf("Now we put it back, hopefully its still fd %d\n",
fileno(stdout));
I believe this is the expected behavior of freopen(); as you can see, you're still only using three file descriptors (and their associated streams).
This would override any shell redirection, as there would be nothing for the shell to redirect. However, it's probably going to break pipes. You might want to set up a handler for SIGPIPE, in case your program finds itself on the blocking end of a pipe (not a FIFO, a pipe).
So, ./your_program --stdout /tmp/stdout.txt --stderr /tmp/stderr.txt should be easily accomplished with freopen() while keeping the same actual file descriptors. What I don't understand is why you'd need to put them back after changing them? Surely, if someone passed either option, they would want it to persist until the program terminated?
The OS function dup2() should provide what you need (if not references to exactly what you need).
More specifically, you can dup2() the stdin file descriptor to another file descriptor, do other stuff with stdin, and then copy it back when you want.
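A sketch of that approach under POSIX (the name saved_stdin is my own; the file newin comes from the question):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int saved_stdin = dup(STDIN_FILENO);        /* keep a duplicate of the original stdin */
    int fd = open("newin", O_RDONLY);
    if (saved_stdin == -1 || fd == -1)
        return 1;

    dup2(fd, STDIN_FILENO);                     /* descriptor 0 now refers to newin */
    close(fd);

    char line[128];
    if (fgets(line, sizeof line, stdin))        /* stdio reads through descriptor 0 */
        printf("from newin: %s", line);

    dup2(saved_stdin, STDIN_FILENO);            /* put the original stdin back */
    close(saved_stdin);
    clearerr(stdin);                            /* clear any EOF flag left on the stream */
    return 0;
}

Note that stdio may have read ahead into stdin's buffer, so mixing stream-level reads with descriptor-level redirection needs care, as the dup2 question later on this page discusses.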
The dup() function duplicates an open file descriptor. Specifically, it provides an alternate interface to the service provided by the fcntl() function using the F_DUPFD constant command value, with 0 for its third argument. The duplicated file descriptor shares any locks with the original.
On success, dup() returns a new file descriptor that has the following in common with the original:
Same open file (or pipe)
Same file pointer (both file descriptors share one file pointer)
Same access mode (read, write, or read/write)
freopen solves the easy part. Keeping the old stdin around is not hard if you haven't read anything from it and you're willing to use POSIX system calls like dup or dup2. If you've started to read from it, all bets are off.
Maybe you can tell us the context in which this problem occurs?
I'd encourage you to stick to situations where you're willing to abandon old stdin and stdout and can therefore use freopen.
And in the meantime, there's a C source code library that will do all this for you, redirecting stdout or stderr. But the cool part is that it lets you assign as many callback functions as you want to the intercepted streams, allowing you then to very easily send a single message to multiple destinations, a DB, a text file, etc.
On top of that, it makes it trivial to create new streams that look and behave the same as stdout and stderr, where you can redirect these new streams to multiple locations as well.
Look for the U-Streams C library on *oogle.
This is the most readily available, handy, and useful way to do it:
freopen("dir","r",stdin);

dup2 function and program behaviour

If I am writing this code:
#include <stdio.h>
#include <unistd.h>   /* for dup2() */

int main(void)
{
    FILE *fp = fopen("shiv.txt", "w");
    printf("%d", fileno(fp));   /* typically prints 3 */
    dup2(3, 1);                 /* make fd 1 (stdout) refer to the same file as fd 3 */
    fprintf(fp, "hello");
    return 0;
}
As output, the program writes hello3 into the shiv.txt file.
As we can see, printf is called first, yet its output appears after the output of fprintf.
Moreover, dup2 was called after the printf statement, so the output of printf should have been placed on the terminal.
The standard I/O streams are buffered — with the possible exception of the standard error stream, which is only required to not be fully buffered by POSIX. Without a call to fflush(stdout) to flush the output buffer for standard output (or the output of a newline sequence if it is line-buffered), the way things work with respect to the FILE interface is not defined once you call dup2.
Since dup2 works with file descriptors and not FILE pointers, you have a problem: POSIX doesn't specify what to do in this case. The buffer associated with stdout may be discarded, or it may be flushed as if with fclose. The buffer may even remain associated and not flushed/discarded since stdout from the perspective of the FILE interface is still open.
So the behavior isn't necessarily deterministic without syncing the FILE interface with the underlying file description (add an fclose(stdout) call after dup2). Additionally, what happens with, e.g., stderr in addition to stdout with dup2 associated with the file description of the file you open? Is the behavior in order of the dup2 calls as with a queue or in reverse order as with a stack or even in a seemingly random order, the latter of which suggests that a segfault may be possible? And what is the order of output if you dup2(STDERR_FILENO, STDOUT_FILENO), followed by dup2(fileno(fp), STDERR_FILENO)? Do the results of writing to the standard output/error buffers appear before the fprintf results or after or mixed (or sometimes one and sometimes another)? Which appears first — the data written to stderr or the data written to stdout? Can you be certain this will always happen in that order?
The answer probably won't surprise you: No. What happens on one configuration may differ from what happens on another configuration because the interaction between file descriptors, the buffers used by the standard streams, and the FILE interface of the standard streams is left undefined.
As @twalberg commented, there is no guarantee that the file you opened is file descriptor 3, so be careful when hard-coding numbers like that. Also, STDOUT_FILENO is available from <unistd.h>, which is where dup2 is actually declared, so you can use it instead of hard-coding file descriptor 1.
There are rules to follow when manipulating "handles" to open file descriptions.
Per System Interfaces Chapter 2, file descriptors and streams are "handles" to file descriptions and a file description can have multiple handles.
This chapter defines rules when working with both a file descriptor and a stream for the same file description. If they are not followed, then the results are "undefined".
Calling dup2() to replace the stdout file descriptor after calling printf() (on the stdout stream), without a call to fflush(stdout) in between, is a violation of those rules (simplified), hence the undefined behavior.
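A version of the program from the question that follows those rules, with the flush added and the hard-coded descriptor numbers replaced (assuming the goal is simply to make the output order deterministic):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    FILE *fp = fopen("shiv.txt", "w");
    if (fp == NULL)
        return 1;

    printf("%d", fileno(fp));                  /* goes to the terminal... */
    fflush(stdout);                            /* ...because we flush before touching its descriptor */

    dup2(fileno(fp), STDOUT_FILENO);           /* now fd 1 refers to shiv.txt */
    fprintf(fp, "hello");

    fclose(fp);                                /* flushes "hello" into shiv.txt */
    return 0;
}

With this ordering, shiv.txt contains only hello and the descriptor number appears on the terminal.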

application of open() and fopen() [duplicate]

This question already has answers here:
C fopen vs open
There are differences between the open() and fopen() functions. One is a system call and the other is a library function. I tried to figure out what the applications of these two functions are, but I found nothing useful. Can you give some scenarios for where open() should be used and where fopen() should be used?
Sometimes you need a file descriptor; open() gives you one. Sometimes you need a FILE *, in which case use fopen(). You can always turn your FILE * into a file descriptor via fileno(), whereas the opposite transformation requires the POSIX fdopen() and has no portable ISO C equivalent. It mostly depends on what downstream functions you intend to call with the file handle.
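For example, getting the descriptor out of a stream with fileno() and writing through it directly (the filename is illustrative; the fflush matters, for the buffering reasons discussed elsewhere on this page):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    FILE *f = fopen("out.txt", "w");
    if (f == NULL)
        return 1;

    fprintf(f, "via the stream\n");
    fflush(f);                                 /* push stdio's buffer down to the descriptor */

    int fd = fileno(f);                        /* FILE * -> file descriptor */
    write(fd, "via the descriptor\n", 19);     /* bypasses the FILE buffer */

    fclose(f);                                 /* also closes fd */
    return 0;
}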
open() returns a file descriptor, which you use for reading and writing with functions like read() and write(). fopen() returns a FILE * stream, not a file descriptor, and you use stream functions such as fprintf() to write to it and fscanf() to read from it. Whether an existing file is truncated or appended to depends on the flags (O_TRUNC, O_APPEND) or the mode string ("w", "a") you pass, not on which of the two functions you use.

C - Proper way to close files when using both open() and fdopen()

So I'm building a Unix minishell in C, and am implementing input, output, and err redirection, and have come across a problem with files. I open my files in a loop where I find redirection operators, and use open(), which returns an fd. I then assign the child's fd accordingly, and call an execute function.
When my shell is just going out and finding programs, and executing them with execvp(), I don't have much of a problem. The only problem is knowing whether or not I need to call close() on the file descriptors before prompting for the next command line. I'm worried about having an fd leak, but don't exactly understand how it works.
My real problem arises when using builtin commands. I have a builtin command called "read", that takes one argument, an environmental variable name(could be one that doesn't yet exist). Read then prompts for a value, and assigns that value to the variable. Here's an example:
% read TESTVAR
test value test value test value
% echo ${TESTVAR}
test value test value test value
Well, let's say that I try something like this:
% echo here's another test value > f1
% read TESTVAR < f1
% echo ${TESTVAR}
here's another test value
This works great. Keep in mind that read executes inside the parent process; I don't call read with execvp since it's a builtin. Read uses fgets, which requires a stream, not an fd. So after poking around on the IRC forums a bit, I was told to use fdopen to get a stream from the file descriptor. So before calling fgets, I call:
rdStream = fdopen(inFD, "r");
then call
if(fgets(buffer, envValLen, rdStream) != buffer)
{
if(inFD) fclose(rdStream);
return -1;
}
if(inFD) fclose(rdStream);
As you can see, at the moment I'm closing the stream with fclose(), unless the underlying descriptor is stdin (which is 0). Is this necessary? Do I need to close the stream? Or just the file descriptor? Or both? I'm quite confused about which I should close, since they both refer to the same file in different ways. At the moment I'm not closing the fd, though I think I definitely should. I would just like somebody to help make sure my shell isn't leaking any files, as I want it to be able to execute several thousand commands in a single session without leaking memory.
Thanks, if you guys want me to post anymore code just ask.
The standard says:
The fclose() function shall perform the equivalent of a close() on the file descriptor that is associated with the stream pointed to by stream.
So calling fclose is enough; it will also close the descriptor.
FILE is a buffering object from the standard C library. When you call fclose (a standard C function) it will eventually call close (a Unix system function), but only after making sure the C library buffers are flushed. So, I would say, if you use fopen and fwrite then you should use fclose, and not just close, otherwise you risk losing some data.
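For the shell's builtin, a pattern like this keeps the ownership clear: fdopen hands the descriptor to the stream, so a single fclose releases both, and stdin itself is never wrapped or closed. (read_builtin_line and len are illustrative names; inFD, buffer, and rdStream come from the question.)

#include <stdio.h>
#include <unistd.h>

/* Sketch: read one line for the builtin, closing exactly what we opened. */
int read_builtin_line(int inFD, char *buffer, size_t len)
{
    FILE *rdStream = (inFD == STDIN_FILENO) ? stdin : fdopen(inFD, "r");
    if (rdStream == NULL)
        return -1;

    int ok = (fgets(buffer, (int)len, rdStream) != NULL);

    if (rdStream != stdin)
        fclose(rdStream);       /* closes both the stream and inFD */
    return ok ? 0 : -1;
}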

C prog: After append file, read still return 0

I have a new file, opened as read/write. One thread receives from the network and appends binary data to that file; the other thread reads from the same file to process the binary data. But read() always returns 0, so I can't read the data. However, if I use cat on the command line to append data, then the program can read and process it. I don't know why it doesn't notice the new data coming from the network. I'm using open(), read(), and write() in this program.
Use a pipe instead of a file on disk. Depending on your system (which you didn't tell us) there are only minor modifications to your code (which you didn't give us) to do that.
File operations are buffered. Try flushing the stream?
Assuming that your read() and write() functions are the POSIX ones, they share the file position, even when used in different threads. So a read issued after a write starts at the position the write left off, which is the end of the data, and it finds nothing to read. Don't use file I/O to communicate between threads. In most contexts, I'd not even use a pipe or sockets for that (one context where I would is when the reading thread is already using poll/select with other file descriptors), but simple shared memory and a mutex.
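For completeness, here is a minimal sketch of the pipe alternative suggested above (POSIX, compile with -pthread; the message is a stand-in for the bytes received from the network):

#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int pfd[2];                      /* pfd[0] = read end, pfd[1] = write end */

static void *receiver(void *arg)
{
    (void)arg;
    const char *msg = "binary data from the network\n";
    write(pfd[1], msg, strlen(msg));    /* forward what was received */
    close(pfd[1]);                      /* gives the reader EOF */
    return NULL;
}

int main(void)
{
    if (pipe(pfd) == -1)
        return 1;

    pthread_t t;
    pthread_create(&t, NULL, receiver, NULL);

    char buf[128];
    ssize_t n;
    while ((n = read(pfd[0], buf, sizeof buf)) > 0)   /* blocks until data arrives */
        fwrite(buf, 1, (size_t)n, stdout);            /* process the data here */

    pthread_join(t, NULL);
    close(pfd[0]);
    return 0;
}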
