Is there any ordinary reason to use open() instead of fopen()? - c

I'm doing a small project in C after quite a long time away from it. It happens to include some file handling. I noticed in various documentation that there are functions which return FILE * handles and others which return (small integer) descriptors. Both sets of functions offer the same basic services I need, so it really does not matter which I use.
But I'm curious about the collective wisdom: is it better to use fopen() and friends, or open() and friends?
Edit Since someone mentioned buffered vs unbuffered and accessing devices, I should add that one part of this small project will be writing a userspace filesystem driver under FUSE. So the file level access could as easily be on a device (e.g. a CDROM or a SCSI drive) as on a "file" (i.e. an image).

It is better to use open() if you are sticking to unix-like systems and you might like to:
Have more fine-grained control over unix permission bits on file creation.
Use the lower-level functions such as read/write/mmap as opposed to the C buffered stream I/O functions.
Use file descriptor (fd) based IO scheduling (poll, select, etc.) You can of course obtain an fd from a FILE * using fileno(), but care must be taken not to mix FILE * based stream functions with fd based functions.
Open any special device (not a regular file)
It is better to use fopen/fread/fwrite for maximum portability, as these are standard C functions; the functions I've mentioned above are not.
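To illustrate the first point in the list above, here is a minimal sketch of creating a file with explicit permission bits, which fopen() gives you no direct way to request. The helper name create_private_file is made up for this example, and the code assumes a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` readable and writable by the owner only.
 * O_EXCL makes the call fail if the file already exists.
 * Returns a file descriptor, or -1 on error. */
int create_private_file(const char *path)
{
    /* The process umask can still clear bits from the requested mode. */
    return open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
}
```

If you later want stdio on top of the descriptor, fdopen(fd, "w") wraps it in a FILE * without reopening the file.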

The objection that "fopen" is portable and "open" isn't is bogus.
fopen is part of libc, open is a POSIX system call.
Each is as portable as the place they come from.
I/O to fopen'ed files is buffered by libc (you must assume it may be, and for practical purposes it is). File descriptors from open() are not buffered by libc (they may well be, and usually are, buffered in the filesystem, but not everything you open() is a file on a filesystem).
What's the point of fopen'ing, for example, a device node like /dev/sg0, say, or /dev/tty0... What are you going to do? You're going to do an ioctl on a FILE *? Good luck with that.
Maybe you want to open with some flags like O_DIRECT -- makes no sense with fopen().
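As a small sketch of why a descriptor is the natural handle here: ioctl() takes an fd, not a FILE *. The example below uses FIONREAD, which is common on Linux, the BSDs and macOS but is not specified by POSIX, so treat its availability as an assumption. The helper name bytes_available is hypothetical.

```c
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: number of bytes waiting to be read on fd,
 * or -1 on error. FIONREAD is widespread but not part of POSIX. */
int bytes_available(int fd)
{
    int n = 0;
    if (ioctl(fd, FIONREAD, &n) == -1)
        return -1;
    return n;
}
```

With a FILE * you would have to go through fileno() first, and the answer would still ignore whatever stdio has already pulled into its own buffer.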

fopen works at a higher level than open ....
fopen returns you a pointer to FILE stream which is similar to the stream abstraction that you read in C++
open returns you a file descriptor for the file opened ... It does not provide you a stream abstraction and you are responsible for handling the bits and bytes yourself ... This is at a lower level as compared to fopen
Stdio streams are buffered, while open() file descriptors are not. Depends on what you need. You can also create one from the other:
int fileno (FILE * stream) returns the file descriptor for a FILE *, FILE * fdopen(int fildes, const char * mode) creates a FILE * from a file descriptor.
Be careful when intermixing buffered and non-buffered IO: data still sitting in the stdio buffer has not reached the file descriptor until you flush it with fflush(), so operations on the descriptor can see stale or out-of-order data.
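A minimal sketch of mixing the two layers safely, assuming a POSIX system (fileno() is not ISO C). The helper name write_mixed is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write `first` through stdio and `second` through the
 * raw descriptor, keeping them in that order in the file. */
int write_mixed(FILE *fp, const char *first, const char *second)
{
    if (fputs(first, fp) == EOF)
        return -1;
    /* Push the stdio buffer to the kernel before touching the descriptor. */
    if (fflush(fp) == EOF)
        return -1;
    size_t len = strlen(second);
    if (write(fileno(fp), second, len) != (ssize_t)len)
        return -1;
    return 0;
}
```

Without the fflush(), "second" could land in the file before "first". After writing through the descriptor, it is safest to fseek() the stream before using stdio on it again.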

Yes. When you need a low-level handle.
On UNIX operating systems, file descriptors and sockets are generally interchangeable.
Also, low-level handles make for better ABI compatibility than FILE pointers.

read() & write() use unbuffered I/O. (fd: integer file descriptor)
fread() & fwrite() use buffered I/O. (FILE* structure pointer)
Binary data written to a pipe with write() can be read back with fread(), since both move raw bytes; it only becomes a crap-shoot when the writer and the reader disagree about byte order, alignment, or variable sizes.
Most low-level device driver code uses unbuffered I/O calls.
Most application level I/O uses buffered.
Use of FILE* and its associated functions
is OK on a machine-by-machine basis, but portability of binary data can be lost
on other architectures, because fread() and fwrite() copy objects exactly as
they are laid out in memory.
A file written with fwrite() on a 64-bit build may not read back correctly on a
32-bit build, or across Windows and Linux, if type sizes, padding, or byte order differ.
Most OSs have compatibility macros within their own code to prevent this.
read() and write() move the same raw bytes and share the same limitation, so for
portable binary I/O, define the on-disk layout explicitly (fixed-width types, fixed byte order).
The basic thing is to pick one way or the other and be consistent about it,
throughout the binary suite.
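Whichever family you pick, one common way to keep binary data readable across architectures is to serialize each field byte by byte in a fixed order, rather than writing raw structs. A minimal sketch (the helper names are made up for this example):

```c
#include <stdint.h>

/* Store v in buf as four bytes, most significant first, on any host. */
void put_u32_be(unsigned char buf[4], uint32_t v)
{
    buf[0] = (unsigned char)(v >> 24);
    buf[1] = (unsigned char)(v >> 16);
    buf[2] = (unsigned char)(v >> 8);
    buf[3] = (unsigned char)v;
}

/* Rebuild the value from four big-endian bytes. */
uint32_t get_u32_be(const unsigned char buf[4])
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}
```

The resulting four bytes can then go out through fwrite() or write() alike; the file contents no longer depend on the host's byte order or on sizeof(long).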
<stdio.h> // mostly FILE* some fd input/output parameters for compatibility
// gives you a lot of helper functions -->
List of Functions
Function Description
───────────────────────────────────────────────────────────────────
clearerr check and reset stream status
fclose close a stream
fdopen stream open functions //( fd argument, returns FILE*)
feof check and reset stream status
ferror check and reset stream status
fflush flush a stream
fgetc get next character or word from input stream
fgetpos reposition a stream
fgets get a line from a stream
fileno get file descriptor // (FILE* argument, returns fd)
fopen stream open functions
fprintf formatted output conversion
fpurge flush a stream
fputc output a character or word to a stream
fputs output a line to a stream
fread binary stream input/output
freopen stream open functions
fscanf input format conversion
fseek reposition a stream
fsetpos reposition a stream
ftell reposition a stream
fwrite binary stream input/output
getc get next character or word from input stream
getchar get next character or word from input stream
gets get a line from a stream
getw get next character or word from input stream
mktemp make temporary filename (unique)
perror system error messages
printf formatted output conversion
putc output a character or word to a stream
putchar output a character or word to a stream
puts output a line to a stream
putw output a character or word to a stream
remove remove directory entry
rewind reposition a stream
scanf input format conversion
setbuf stream buffering operations
setbuffer stream buffering operations
setlinebuf stream buffering operations
setvbuf stream buffering operations
sprintf formatted output conversion
sscanf input format conversion
strerror system error messages
sys_errlist system error messages
sys_nerr system error messages
tempnam temporary file routines
tmpfile temporary file routines
tmpnam temporary file routines
ungetc un-get character from input stream
vfprintf formatted output conversion
vfscanf input format conversion
vprintf formatted output conversion
vscanf input format conversion
vsprintf formatted output conversion
vsscanf input format conversion
So for basic use I would personally use the above without mixing idioms too much.
By contrast,
<unistd.h> write()
lseek()
close()
pipe()
<sys/types.h>
<sys/stat.h>
<fcntl.h> open()
creat()
fcntl()
all use file descriptors.
These provide fine-grained control over reading and writing bytes
(recommended for special devices and fifos (pipes) ).
So again, use what you need, but keep consistent in your idioms and interfaces.
If most of your code base uses one mode, use that too, unless there is
a real reason not to. Both sets of I/O library functions are extremely reliable
and used millions of times a day.
note-- If you are interfacing C I/O with another language,
(perl, python, java, c#, lua ...) check out what the developers of those languages
recommend before you write your C code and save yourself some trouble.

Usually, you should favor using the standard library (fopen). However, there are occasions where you will need to use open directly.
One example that comes to mind is working around a bug in an older version of Solaris which made fopen fail after 256 files were open. This was because they erroneously used an unsigned char for the fd field in their struct FILE implementation instead of an int. But this was a very specific case.

fopen and its cousins are buffered. open, read, and write are not buffered. Your application may or may not care.
fprintf and fscanf have a richer API that allows you to read and write formatted text files. read and write operate on raw arrays of bytes. Conversions and formatting must be hand crafted.
The difference between file descriptors and (FILE *) is really inconsequential.
Randy

Related

Why the restrictions on C standard I/O streams that interact with sockets?

In book CSAPP section 10.9, it says that there are two restrictions on standard I/O streams that interact badly with restrictions on sockets.
Restriction 1: Input functions following output functions. An input function cannot follow an output function without an intervening call to fflush, fseek, fsetpos, or rewind. The fflush function empties the buffer associated with a stream. The latter three functions use the Unix I/O lseek function to reset the current file position.
Restriction 2: Output functions following input functions. An output function cannot follow an input function without an intervening call to fseek, fsetpos, or rewind, unless the input function encounters an end-of-file.
But I cannot figure out why these restrictions are imposed. So, my question is: what factors lead to the two restrictions?
It also says that "It is illegal to use the lseek function on a socket.", but how can fseek, fsetpos and rewind use lseek to reset the current file position if that is true?
There is a similar question here, but my question is different from that one.
The stdio functions are for buffered file input and output. A socket is not a file, but a socket. It doesn't even have a file position, and the buffer requirements are quite distinct from ordinary files - sockets can have independent input and output buffers, stdio file I/O cannot!
The problem is that file input and file output share the same file position, and the operating system's file position might differ (and on Unix indeed will differ) from the position the C library tracks because of its buffering.
Hence, from C99 rationale
A change of input/output direction on an update file is only allowed following a successful
fsetpos, fseek, rewind, or fflush operation, since these are precisely the functions
which assure that the I/O buffer has been flushed.
Note that all this applies to only files opened with + - with files opened in any other standard modes, it is not possible to mix input and output.
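For an ordinary file opened in update mode, the rule can be sketched like this. The helper name write_read_write is made up for the example; it only uses ISO C calls.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write `msg` to an update-mode stream, read the first
 * line back into `out`, then append `more`. Returns 0 on success. */
int write_read_write(FILE *fp, const char *msg, char *out, size_t outsz,
                     const char *more)
{
    if (fputs(msg, fp) == EOF)
        return -1;
    /* Output -> input: reposition first (rewind also flushes the output). */
    rewind(fp);
    if (fgets(out, (int)outsz, fp) == NULL)
        return -1;
    /* Input -> output: a positioning call is required unless the read hit EOF. */
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    if (fputs(more, fp) == EOF)
        return -1;
    return fflush(fp) == EOF ? -1 : 0;
}
```

Each change of direction is separated by a positioning call, which is exactly the step that has no equivalent on a socket.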
The C standard requires that, when switching from input to output on a FILE *, one of the functions fsetpos, rewind or fseek (which essentially invoke lseek) must succeed before an output function is attempted. (Mind you, calling fflush causes the buffered output to be written; it certainly does not discard the buffered input.) Yet a socket is not seekable, so lseek would always fail. It follows that you cannot take a FILE * that wraps a socket, open it for both reading and writing, and actually use it for both.
It is possible to use fdopen to open a FILE * with stream sockets if you really need to: just open two streams, one "rb" for input and another "wb" for output.
When it says "An input function cannot follow an output function without an intervening call to fflush, fseek, fsetpos, or rewind", what it means is that if you don't, it might not work as you expect. But they're mostly talking about i/o to/from ordinary files.
If you have a FILE * stream connected to a socket, and you want to switch back and forth between writing and reading, I would expect it to work just fine if you called fflush when switching from writing to reading. I would not expect it to be necessary to call anything when switching from reading to writing.
(When working with files, the call to fseek or one of its relatives is necessary in order to update the file position properly, but sockets don't have a file position to update.)
I think the reason is that, in the early days, most implementations shared a single buffer for reading and writing.
The rationale is simple: most streams are used in one direction only, and maintaining two separate buffers for reading and writing would waste space.
If you have only one buffer, you need to deal with it when you change the I/O direction. That's why you need fflush, fseek, fsetpos, or rewind: they either write the buffer to disk or empty it in preparation for the next I/O operation.
I checked one glibc implementation, which uses a single buffer for both reading and writing.
static void init_stream (register FILE *fp) {
  ...
  fp->__buffer = (char *) malloc (fp->__bufsize);
  if (fp->__bufp == NULL)
    {
      /* Set the buffer pointer to the beginning of the buffer. */
      fp->__bufp = fp->__buffer;
      fp->__put_limit = fp->__get_limit = fp->__buffer;
    }
}
Take fseek, for example:
/* Move the file position of STREAM to OFFSET
   bytes from the beginning of the file if WHENCE
   is SEEK_SET, the end of the file if it is SEEK_END,
   or the current position if it is SEEK_CUR. */
int
fseek (stream, offset, whence)
     register FILE *stream;
     long int offset;
     int whence;
{
  ...
  if (stream->__mode.__write && __flshfp (stream, EOF) == EOF)
    return EOF;
  ...
  /* O is now an absolute position, the new target. */
  stream->__target = o;
  /* Set bufp and both end pointers to the beginning of the buffer.
     The next i/o will force a call to the input/output room function. */
  stream->__bufp
    = stream->__get_limit = stream->__put_limit = stream->__buffer;
  ...
}
This implementation flushes the buffer to the disk file if the stream is in write mode.
It also resets the pointers for both reading and writing, which is equivalent to discarding the buffered data for reads.
It matches the C99 (credit to the previous answer)
A change of input/output direction on an update file is only allowed following a successful fsetpos, fseek, rewind, or fflush operation, since these are precisely the functions which assure that the I/O buffer has been flushed.
For more details, check here.

Difference between stream and direct I/O in C?

In C, I believe (correct me if I'm wrong) there are two different types of input/output functions, direct and stream, which result in binary and ASCII files respectively.
What is the difference between stream (ASCII) and direct (Binary) I/O in terms of retrieving (read/write) and printing data?
No, yes, sort of, maybe…
In C, … there are two different types of input/output functions, direct and stream, which result in binary and ASCII files respectively.
In Standard C, there are only file streams, FILE *. In POSIX C, there are what might be termed 'direct' file access functions, mainly using file descriptors instead of file streams. AFAIK, Windows also provides alternative I/O functions, mainly using handles instead of file streams. So "No" — Standard C has one type of I/O function; but POSIX (and Windows) provide alternatives.
In Standard C, you can create binary files and text files using:
FILE *bfp = fopen("binary-file.bin", "wb");
FILE *tfp = fopen("regular-file.txt", "w");
On Windows (and maybe other systems for Windows compatibility), you can be explicit about opening a text file:
FILE *tcp = fopen("regular-file.txt", "wt");
So the standard distinguishes between text and binary files, but file streams can be used to access either type of file. Further, on Unix systems, there is no difference between a text file and a binary file; they will be treated the same. On Windows, a text file will have its CRLF (carriage return, line feed) line endings mapped to newline on input, and newlines mapped to CRLF line endings on output. That translation does not occur with binary files.
Note that there is also a concept 'direct I/O' on Linux, activated using the O_DIRECT flag, which is probably not what you're thinking of. It is a refinement of file descriptor I/O.
What is the difference between stream (ASCII) and direct (Binary) I/O in terms of retrieving (read/write) and printing data?
There are multiple issues.
First, the dichotomy between text files and binary files is separate from the dichotomy between stream I/O and direct I/O.
With stream I/O, line endings are mapped from the native form (e.g. CRLF) to newline when processing text files; no such mapping occurs when processing binary files.
With text I/O, it is assumed that there will be no null bytes, '\0' in the data. Such bytes in the middle of a line mess up text processing code that expects to read up to a null. With binary I/O, all 256 byte values are expected; code that breaks because of a null byte is broken.
Complicating this is the distinction between different code sets for encoding text files. If you have a single-byte code set, such as ISO 8859-15, then null bytes don't generally appear. If you have a multi-byte code set such as UTF-8, again, null bytes don't generally appear. However, if you have a wide character code set such as UTF-16 (whether big-endian or little-endian), then you will often get zero bytes in the body of the file — it is not intended to be read or written as a byte stream but rather as a stream of 16-bit units.
The major difference between stream I/O and direct I/O is that the stream library buffers data for both input and output, unless you override it with setvbuf(). That is, if you repeatedly read a single character in the user code (getchar() for example), the stream library first reads a chunk of data from the file and then doles out one character at a time from the chunk, only going back to the file for more data when the previous chunk has been delivered completely. By contrast, direct I/O reading a single byte at a time will make a system call for each byte. Granted, the kernel will buffer the I/O (it does that for the stream I/O too — so there are multiple layers of buffering here, which is part of what O_DIRECT I/O attempts to avoid whenever possible), but the overhead of a system call per byte is rather substantial.
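To make the buffering difference concrete, here is a sketch of the same task done both ways. With stdio, getc() is served from the stream's buffer; with a descriptor, you supply the buffer yourself so that each read() call fetches a chunk instead of a single byte. The function names and the 4096-byte chunk size are arbitrary choices for this example.

```c
#include <stdio.h>
#include <unistd.h>

/* Count newlines with stdio: getc() is served from the stream's buffer. */
long count_lines_stdio(FILE *fp)
{
    long n = 0;
    int c;
    while ((c = getc(fp)) != EOF)
        if (c == '\n')
            n++;
    return n;
}

/* Count newlines with read(): we provide the buffer ourselves so that each
 * system call fetches up to 4096 bytes rather than one. Returns -1 on error. */
long count_lines_fd(int fd)
{
    char buf[4096];
    long n = 0;
    ssize_t got;
    while ((got = read(fd, buf, sizeof buf)) > 0)
        for (ssize_t i = 0; i < got; i++)
            if (buf[i] == '\n')
                n++;
    return got < 0 ? -1 : n;
}
```

Shrinking buf to a single byte in the second function would give the one-system-call-per-byte behaviour described above.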
Generally, you have more fine-grained control over access with file descriptors; there are operations you can do with file descriptors that are simply not feasible with streams because the stream interface functions simply don't cover the possibility. For example, setting FD_CLOEXEC or O_CLOEXEC on a file descriptor means that the file descriptor will be closed automatically by the system when the program executes another one — the stream library simply doesn't cover the concept, let alone provide control over it. The cost of gaining the fine-grained control is that you have to write more code — or, at least, different code that does what is handled for you by the stream library functions.
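As a sketch of the close-on-exec example, assuming a POSIX system (the helper name set_cloexec is made up here):

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: mark fd close-on-exec, preserving other fd flags.
 * Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}
```

For a stream you would call set_cloexec(fileno(fp)); passing O_CLOEXEC to open() in the first place avoids the window between opening the file and setting the flag.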
Streams are a portable way of reading and writing data. They provide a flexible and efficient means of I/O. A Stream is a file or a physical device (like monitor) which is manipulated with a pointer to the stream.
This is BUFFERED: that is to say, a fixed chunk is read from or written to a file via some temporary storage area (the buffer). Data written to a buffer does not appear in a file (or device) until the buffer is flushed or written out. (On a line-buffered stream, such as stdout attached to a terminal, \n does this.)
In direct or low-level I/O:
This form of I/O is UNBUFFERED in user space: each read/write request results in a system call that fetches or stores a specific number of bytes (the kernel may still cache the data).
There are no formatting facilities: we are dealing with bytes of information.
This means we are now handling raw bytes rather than text, with no line-ending translation.

Is there a guaranteed and safe way to truncate a file from ANSI C FILE pointer?

I know ANSI C defines fopen, fwrite, fread, fclose to modify a file's content. However, when it comes to truncating a file, we have to turn to OS-specific functions, e.g. truncate() on Linux, _chsize_s() on Windows. But before we can call those OS-specific functions, we have to obtain the file handle from the FILE pointer by calling fileno, which is also a non-ANSI-C function.
My question is: Is it reliable to continue using FILE* after truncating the file? I mean, the ANSI C FILE layer has its own buffer and does not know the file has been truncated from beneath it. In case the buffered bytes are beyond the truncation point, will the buffered content be flushed to the file when doing fclose()?
If there is no guarantee, what is the best practice for using file I/O functions together with a truncate operation when writing a Windows-Linux portable program?
Similar question: When querying file size from a file-handle returned by fileno , is it the accurate size when I later call fclose() -- without further fwrite()?
[EDIT 2012-12-11]
Following Joshua's suggestion, I conclude that the best practice currently available is: set the stream to unbuffered mode by calling setbuf(stream, NULL); then truncate() or _chsize_s() can work peacefully with the stream.
Anyhow, no official document seems to explicitly confirm this behavior, whether for the Microsoft CRT or GNU glibc.
The POSIX way....
ftruncate() is what you're looking for, and it's been in POSIX base specifications since 2001, so it should be in every modern POSIX-compatible system by now.
Note that ftruncate() operates on a POSIX file descriptor (despite its potentially misleading name), not a STDIO stream FILE handle. Note also that mixing operations on the STDIO stream and on the underlying OS calls which operate on the file descriptor for the open stream can confuse the internal runtime state of the STDIO library.
So, to use ftruncate() safely with STDIO it may be necessary to first flush any STDIO buffers (with fflush()) if your program may have already written to the stream in question. This will avoid STDIO trying to flush the otherwise unwritten buffer to the file after the truncation has been done.
You can then use fileno() on the STDIO stream's FILE handle to find the underlying file descriptor for the open STDIO stream, and you would then use that file descriptor with ftruncate(). You might consider putting the call to fileno() right in the parameter list for the ftruncate() call so that you don't keep the file descriptor around and accidentally use it in yet other ways which might further confuse the internal state of STDIO. Perhaps like this (say to truncate a file to the current STDIO stream offset):
/*
* NOTE: fflush() is not needed here if there have been no calls to fseek() since
* the last fwrite(), assuming it extended the length of the stream --
* ftello() will account for any unwritten buffers
*/
if (ftruncate(fileno(stdout), ftello(stdout)) == -1) {
        fprintf(stderr, "%s: ftruncate(stdout) failed: %s\n", argv[0], strerror(errno));
        exit(1);
}
/* fseek() is not necessary here since we truncated at the current offset */
Note also that the POSIX definition of ftruncate() says "The value of the seek pointer shall not be modified by a call to ftruncate()", so this means you may also need to use fseek() to set the STDIO layer (and thus indirectly the file descriptor) either to the new end of the file, or perhaps back to the beginning of the file, or somewhere still within the boundaries of the file, as desired. (Note that the fseek() should not be necessary if the truncation point is found using ftello().)
You should not have to make the STDIO stream unbuffered if you follow the procedure above, though of course doing so could be an alternative to using fflush() (but not fseek()).
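Putting the procedure together for an arbitrary length (rather than the current offset), a minimal sketch could look like this. It assumes a POSIX system, and truncate_stream is a made-up helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: cut the file behind fp down to len bytes and leave the
 * stream positioned at the new end. Returns 0 on success, -1 on error. */
int truncate_stream(FILE *fp, off_t len)
{
    if (fflush(fp) == EOF)                   /* empty the stdio buffer first */
        return -1;
    if (ftruncate(fileno(fp), len) == -1)    /* does not move the file offset */
        return -1;
    return fseeko(fp, len, SEEK_SET);        /* resync stdio and the offset */
}
```

The trailing fseeko() is the step that is unnecessary when the truncation point comes from ftello(); here it keeps later writes from landing beyond the new end of the file.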
Without POSIX....
If you need to stick to strict ISO Standard C, say C99, then you have no portable way to truncate a file to a given length other than zero (0) length. The latest draft of C11 that I have says this in Section 7.21.3 (paragraph 2):
Binary files are not truncated, except as defined in 7.21.5.3. Whether a write on a text stream causes the associated file to be truncated beyond that point is implementation-defined.
(and 7.21.5.3 describes the flags to fopen() which allow a file to be truncated to a length of zero)
The caveat about text files is there because on silly systems that have both text and binary files (as opposed to just plain POSIX-style content agnostic files) then it is often possible to write a value to the file which will be stored in the file at the position written and which will be treated as an EOF indicator when the file is next read.
Other types of systems may have different underlying file I/O interfaces that are not compatible with POSIX while still providing a compatible ISO C STDIO library. In theory if such a system offers something similar to fileno() and ftruncate() then a similar procedure could be used with them as well, provided that one took the same care to avoid confusing the internal runtime state of the STDIO library.
With regard to querying file size....
You also asked whether the file size found by querying the file descriptor returned by fileno() would be an accurate representation of the file size after a successful call to fclose(), even without any further calls to fwrite().
The answer is: Don't do that!
As I mentioned above, the POSIX file descriptor for a file opened as a STDIO stream must be used very carefully if you don't want to confuse the internal runtime state of the STDIO library. We can add here that it is important not to confuse yourself with it either.
The most correct way to find the current size of a file opened as a STDIO stream is to seek to the end of it and then ask where the stream pointer is by using only STDIO functions.
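A minimal sketch of that approach using only STDIO calls (stream_size is a made-up name). Note the assumptions: ISO C does not strictly require binary streams to support SEEK_END in a meaningful way, and ftell() on a text stream need not be a byte count, so this is only dependable on POSIX-like systems and for files small enough to fit in a long.

```c
#include <stdio.h>

/* Hypothetical helper: size of the file behind fp in bytes, or -1 on error.
 * The current position is restored before returning. */
long stream_size(FILE *fp)
{
    long here = ftell(fp);
    if (here == -1L)
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    long size = ftell(fp);
    if (fseek(fp, here, SEEK_SET) != 0)
        return -1;
    return size;
}
```

Because fseek() flushes pending output, the result includes data that was still sitting in the stream's buffer.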
Isn't an unbuffered write of zero bytes supposed to truncate the file at that point?
See this question for how to set unbuffered: Unbuffered I/O in ANSI C

Difference between fflush and fsync

I thought fsync() does fflush() internally, so using fsync() on a stream is OK. But I am getting an unexpected result when executed under network I/O.
My code snippet:
FILE* fp = fopen(file, "wb");
/* multiple fputs() calls like: */
fputs(buf, fp);
...
...
fputs(buf.c_str(), fp);
/* get fd of the FILE pointer */
fd = fileno(fp);
#ifndef WIN32
ret = fsync(fd);
#else
ret = _commit(fd);
#endif
fclose(fp);
But it seems _commit() is not flushing the data (I tried on Windows and the data was written on a Linux exported filesystem).
When I changed the code to be:
FILE* fp = fopen(file, "wb");
/* multiple fputs() calls like: */
fputs(buf, fp);
...
...
fputs(buf.c_str(), fp);
/* fflush the data */
fflush(fp);
fclose(fp);
it flushes the data.
I am wondering if _commit() does the same thing as fflush(). Any inputs?
fflush() works on FILE*, it just flushes the internal buffers in the FILE* of your application out to the OS.
fsync works on a lower level, it tells the OS to flush its buffers to the physical media.
OSs heavily cache data you write to a file. If the OS enforced every write to hit the drive, things would be very slow. fsync (among other things) allows you to control when the data should hit the drive.
Furthermore, fsync/commit works on a file descriptor. It has no knowledge of a FILE* and can't flush its buffers. FILE* lives in your application, file descriptors live in the OS kernel, typically.
The standard C function fflush() and the POSIX system call fsync() are conceptually somewhat similar. fflush() operates on C file streams (FILE objects), and is therefore portable.
fsync() operates on POSIX file descriptors.
Both cause buffered data to be sent to a destination.
On a POSIX system, each C file stream has an associated file descriptor, and all the operations on a C file stream will be implemented by delegating, when necessary, to POSIX system calls that operate on the file descriptor.
One might think that a call to fflush on a POSIX system would cause a write of any data in the buffer of the file stream, followed by a call of fsync() for the file descriptor of that file stream. So on a POSIX system there would be no need to follow a call to fflush with a call to fsync(fileno(fp)). But is that the case: is there a call to fsync from fflush?
No, calling fflush on a POSIX system does not imply that fsync will be called.
The C standard for fflush says (emphasis added) it
causes any unwritten data for [the] stream to be delivered to the host environment to be written to the file
Saying that the data is to be written, rather than that it is written, implies that further buffering by the host environment is permitted. That buffering by the "host environment" could include, for a POSIX environment, the internal buffering that fsync flushes. So a close reading of the C standard suggests that the standard does not require the POSIX implementation to call fsync.
The POSIX standard description of fflush does not declare, as an extension of the C semantics, that fsync is called.
fflush() and fsync() can be used to try and ensure data is written to the storage media (but this is not always possible):
first use fflush(fp) on the output stream (fp being a FILE * obtained from fopen or one of the standard streams stdout or stderr) to write the contents of the buffer associated with the stream to the OS.
then use fsync(fileno(fp)) to tell the OS to write its own buffers to the storage media.
Note however that fileno() and fsync() are POSIX functions that might not be available on all systems, notably Microsoft legacy systems where alternatives may be named _fileno(), _fsync() or _commit()...
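The two steps can be sketched as a single helper, assuming a POSIX system (flush_to_disk is a made-up name):

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: push fp's data as far towards the storage device as
 * the OS allows. Returns 0 on success, -1 on error. */
int flush_to_disk(FILE *fp)
{
    if (fflush(fp) == EOF)          /* stdio buffer -> kernel */
        return -1;
    return fsync(fileno(fp));       /* kernel buffers -> storage */
}
```

Calling fsync() alone, as in the question's first snippet, leaves whatever is still in the stdio buffer untouched. fsync() can also fail on descriptors that do not support synchronization, such as pipes.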
I could say that for simplicity:
use fsync() with non-stream files (integer file descriptors)
use fflush() with file streams.
Also here is the help from man:
int fflush(FILE *stream); // flush a stream, FILE* type
int fsync(int fd); // synchronize a file's in-core state with storage device
// int type
To force the commitment of recent changes to disk, use the sync() or fsync() functions.
fsync() will synchronize all of the given file's data and metadata with the permanent storage device. It should be called just before the corresponding file has been closed.
sync() will commit all modified files to disk.
I think the documentation below from Python (https://docs.python.org/2/library/os.html) clarifies it very well.
os.fsync(fd) Force write of file with filedescriptor fd to disk. On
Unix, this calls the native fsync() function; on Windows, the MS
_commit() function.
If you’re starting with a Python file object f, first do f.flush(),
and then do os.fsync(f.fileno()), to ensure that all internal buffers
associated with f are written to disk.
Availability: Unix, and Windows starting in 2.2.3.

C fopen vs open

Is there any reason (other than syntactic ones) that you'd want to use
FILE *fdopen(int fd, const char *mode);
or
FILE *fopen(const char *path, const char *mode);
instead of
int open(const char *pathname, int flags, mode_t mode);
when using C in a Linux environment?
First, there is no particularly good reason to use fdopen if fopen is an option and open is the other possible choice. You shouldn't have used open to open the file in the first place if you want a FILE *. So including fdopen in that list is incorrect and confusing because it isn't very much like the others. I will now proceed to ignore it because the important distinction here is between a C standard FILE * and an OS-specific file descriptor.
There are four main reasons to use fopen instead of open.
fopen provides you with buffering IO that may turn out to be a lot faster than what you're doing with open.
fopen does line ending translation if the file is not opened in binary mode, which can be very helpful if your program is ever ported to a non-Unix environment (though the world appears to be converging on LF-only (except IETF text-based networking protocols like SMTP and HTTP and such)).
A FILE * gives you the ability to use fscanf and other stdio functions.
Your code may someday need to be ported to some other platform that only supports ANSI C and does not support the open function.
In my opinion the line ending translation more often gets in your way than helps you, and the parsing of fscanf is so weak that you inevitably end up tossing it out in favor of something more useful.
And most platforms that support C have an open function.
That leaves the buffering question. In places where you are mainly reading or writing a file sequentially, the buffering support is really helpful and a big speed improvement. But it can lead to some interesting problems in which data does not end up in the file when you expect it to be there. You have to remember to fclose or fflush at the appropriate times.
If you're doing seeks (aka fsetpos or fseek, the second of which is slightly trickier to use in a standards compliant way), the usefulness of buffering quickly goes down.
Of course, my bias is that I tend to work with sockets a whole lot, and there the fact that you really want to be doing non-blocking IO (which FILE * totally fails to support in any reasonable way) with no buffering at all and often have complex parsing requirements really colors my perceptions.
open() is a low-level OS call. fdopen() converts an OS-level file descriptor to the higher-level FILE abstraction of the C language. fopen() calls open() in the background and gives you a FILE pointer directly.
There are several advantages to using FILE objects rather than raw file descriptors, including greater ease of use and technical advantages such as built-in buffering. The buffering in particular generally results in a sizeable performance advantage.
fopen vs open in C
1) fopen is a library function while open is a system call.
2) fopen provides buffered IO, which is often faster than open, which is not buffered.
3) fopen is portable while open is not (open is environment specific).
4) fopen returns a pointer to a FILE structure(FILE *); open returns an integer that identifies the file.
5) A FILE * gives you the ability to use fscanf and other stdio functions.
Unless you're part of the 0.1% of applications where using open is an actual performance benefit, there really is no good reason not to use fopen. As far as fdopen is concerned, if you aren't playing with file descriptors, you don't need that call.
Stick with fopen and its family of methods (fwrite, fread, fprintf, et al) and you'll be very satisfied. Just as importantly, other programmers will be satisfied with your code.
If you have a FILE *, you can use functions like fscanf, fprintf and fgets etc. If you have just the file descriptor, you have limited (but likely faster) input and output routines read, write etc.
open() is a system call and specific to Unix-based systems and it returns a file descriptor. You can write to a file descriptor using write() which is another system call.
fopen() is an ANSI C function call which returns a file pointer and it is portable to other OSes. We can write to a file pointer using fprintf.
In Unix:
You can get a file pointer from the file descriptor using:
fP = fdopen(fD, "a");
You can get a file descriptor from the file pointer using:
fD = fileno (fP);
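As a rough sketch of that round trip (the function name wrap_descriptor and the log text are made up for illustration), you can open with open(), wrap the descriptor with fdopen(), and confirm that fileno() hands back the same descriptor:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Opens `path` with open(), wraps the descriptor in a FILE * via fdopen(),
 * writes one line with fprintf(), and reports both the original descriptor
 * and the one fileno() returns. Hypothetical helper.
 * Returns 0 on success, -1 on error.
 */
int wrap_descriptor(const char *path, int *fd_in, int *fd_out)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd == -1)
        return -1;

    /* The stdio mode must be compatible with the open() flags. */
    FILE *fp = fdopen(fd, "a");
    if (fp == NULL) {
        close(fd);
        return -1;
    }

    *fd_in = fd;
    *fd_out = fileno(fp);   /* same descriptor that was passed to fdopen() */

    int rc = fprintf(fp, "logged via stdio\n") < 0 ? -1 : 0;

    /* fclose() flushes the buffer and also closes the underlying fd. */
    if (fclose(fp) != 0)
        rc = -1;
    return rc;
}
```

Note that after fdopen() succeeds the stream owns the descriptor, so you call fclose() only and do not call close() on the fd as well.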
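Using open, read, write means you have to worry about signal interruptions.
If the call was interrupted by a signal handler, the functions will return -1
and set errno to EINTR.
So the proper way to close a file would be
while (retval = close(fd), retval == -1 && errno == EINTR) ;
Be aware that on Linux the descriptor is released even when close() fails with EINTR, so retrying close() there can close a descriptor that another thread has just reused. The retry pattern is safe for calls such as read() and write().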
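The same EINTR handling applies to write(). Here is a minimal sketch of a retry loop that also copes with short writes; write_all is a made-up helper name, not a standard function:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <unistd.h>

/*
 * Writes all `len` bytes of `buf` to `fd`, retrying when write() is
 * interrupted by a signal (EINTR) and continuing after short writes.
 * Hypothetical helper. Returns 0 on success, -1 on error (errno is
 * left as set by write()).
 */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted before writing anything: retry */
            return -1;          /* real error */
        }
        p += n;                 /* advance past the bytes that were written */
        len -= (size_t)n;
    }
    return 0;
}
```

With fwrite and friends the library takes care of short writes and buffering for you, which is part of the "ease of use" argument for FILE *.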
I changed to open() from fopen() for my application, because fopen was causing double reads every time I ran fopen fgetc . Double reads were disruptive of what I was trying to accomplish. open() just seems to do what you ask of it.
On Unix-like systems, every function in the fopen() family ends up calling open(). open() is a system call, while fopen() and its relatives are wrapper functions provided by the library for ease of use.
It also depends on which flags you need when opening the file. For ordinary reading and writing (and for portability) the f* functions should be used, as argued above.
But if you want to specify more than the standard modes (read, write and append), you will have to use a platform-specific API (like POSIX open) or a library that abstracts these details. The C standard does not have any such flags.
For example, you might want to open a file only if it exists. If you don't specify the create flag (O_CREAT), the file must already exist. If you add the exclusive flag (O_EXCL) to create, the file is created only if it does not exist yet. There are many more.
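A small sketch of those two cases with POSIX open; the wrapper names open_existing and create_exclusive are made up for illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Opens `path` only if it already exists. Without O_CREAT, open() fails
 * with errno == ENOENT when the file is missing. Hypothetical helper.
 */
int open_existing(const char *path)
{
    return open(path, O_RDWR);
}

/*
 * Creates `path` only if it does not exist yet. With O_CREAT | O_EXCL,
 * open() fails with errno == EEXIST when the file is already there.
 * Hypothetical helper.
 */
int create_exclusive(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}
```

Both return a file descriptor, or -1 with errno set, so the caller can tell "missing" and "already exists" apart from other failures.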
For example, on Linux systems there is an LED interface exposed through sysfs. It exposes the brightness of the LED through a file, which you read or write as a number in string form ranging from 0 to 255. Of course you don't want to create that file; you only want to write to it if it exists. The cool thing now: use fdopen to read/write this file using the standard calls.
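Here is a rough sketch of that combination. The function name write_brightness is made up, and the path is left as a parameter because the actual sysfs location depends on the LED:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Writes `value` (0-255) as a decimal string to an existing file.
 * open() is called without O_CREAT, so a missing file is an error and is
 * never created. The descriptor is then wrapped with fdopen() so that
 * fprintf() can be used. Hypothetical helper.
 * Returns 0 on success, -1 on error.
 */
int write_brightness(const char *path, int value)
{
    if (value < 0 || value > 255)
        return -1;

    int fd = open(path, O_WRONLY);  /* no O_CREAT: file must already exist */
    if (fd == -1)
        return -1;

    FILE *fp = fdopen(fd, "w");     /* fdopen() does not truncate the file */
    if (fp == NULL) {
        close(fd);
        return -1;
    }

    int rc = fprintf(fp, "%d\n", value) < 0 ? -1 : 0;

    /* fclose() flushes the buffered text and closes fd. */
    if (fclose(fp) != 0)
        rc = -1;
    return rc;
}
```

A plain fopen(path, "w") would create the file if it were missing, which is exactly what you want to avoid here.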
Opening a file using fopen
Before we can read information from (or write information to) a file on disk, we must open the file. To open the file we call the function fopen, which conceptually does the following:
1. First, it searches the disk for the file to be opened.
2. Then it arranges for the file's data to be loaded from disk into a place in memory called a buffer.
3. It sets up a character pointer that points to the first character of the buffer.
This is how the fopen function behaves.
The buffering step has a cost and can introduce delays. So when comparing fopen (high-level I/O) with the open system call (low-level I/O), open can be the faster and more appropriate choice.

Resources