In what cases does the function fwrite set the error indicator on a stream, such that ferror will return true?
Specifically, I am wondering whether it sets the error indicator when not all of the bytes were successfully written.
Please provide a link as to where you get your information from.
Thank you
If any error occurs, the error indicator for the stream will be set, and will not be cleared until clearerr is called. However, due to buffering, it's difficult for stdio functions to report errors. Often the error will not be seen until a later call, since buffering data never fails, but the subsequent write after the buffer is full might fail. If you're using stdio to write files, the only ways I know to handle write errors robustly are (choose one):
disable buffering (with setbuf or setvbuf - this must be the very first operation performed on the FILE after it's opened or otherwise it has UB)
keep track of the last successful fflush and assume any data after that might be only partially-written
treat any failed write as a completely unrecoverable file and just delete it
Also note that on a POSIX system, fwrite (and other stdio functions) are required to set errno to indicate the type of error, if an error occurs. As far as plain C is concerned, these functions may set errno, or they may not.
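Putting the advice above together, a minimal sketch of a robust write might look like the following. The helper name demo_write is hypothetical; the key points are checking fwrite's short count, checking ferror, and checking the return value of fclose, since an error for buffered data may only surface when the buffer is finally flushed:

```c
#include <stdio.h>

/* Write a buffer to a file and report failure. demo_write is a
   hypothetical helper name, not from the answer above. */
int demo_write(const char *path, const void *buf, size_t len)
{
    FILE *f = fopen(path, "wb");
    if (!f)
        return -1;
    size_t n = fwrite(buf, 1, len, f);
    if (n < len || ferror(f)) {     /* short count means an error for fwrite */
        fclose(f);
        return -1;
    }
    /* Errors on buffered data may only surface at fflush/fclose time. */
    if (fclose(f) == EOF)
        return -1;
    return 0;
}
```

Note that even this cannot tell you *which* bytes reached the file when fclose fails, which is why the answer above suggests tracking the last successful fflush or treating the file as unrecoverable.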
From the fwrite man page on my Linux system:
RETURN VALUE
fread() and fwrite() return the number of items successfully read or written
(i.e., not the number of characters). If an error occurs, or the end-of-file
is reached, the return value is a short item count (or zero).
fread() does not distinguish between end-of-file and error, and callers
must use feof(3) and ferror(3) to determine which occurred.
Just from reading the man page, it doesn't look like it will set errno.
On Windows, fwrite may set the error indicator when you try to write to a stream opened for reading.
For example, this can happen if a call to seekg set the read flag on the stream and seekp was not called before writing.
During read/write operations is it absolutely necessary to check feof()?
The reason I ask is because I have a program that performs read/write ONCE, here is the code below:
while (1) {
data = read_file_data(file);
write_file_data(data, filename);
if (feof(file))
print(read error)
}
This is just pseudocode but is it necessary to check feof() in a case like this where a read will occur once? Currently, I think it is only necessary if you will do ANOTHER read after the one above like this:
while (1) {
data = read_file_data(file);
write_file_data(data, filename);
if (feof(file)) // feof error occurred oops
print(read error)
data = read_file_data(file); // reading after error
}
Lastly, what can the consequences be of reading even after an EOF reached error (reading past EOF) occurs?
During read/write operations is it absolutely necessary to check feof()?
No. During normal operations, the best way of detecting EOF is to check the return value of the particular input call you're using. Input calls can always fail, so their return values should always be checked. This is especially true of scanf and fgets, which many beginning programmers (and unfortunately many beginning programming tutorials) conspicuously neglect to check.
Explicit calls to feof are rarely necessary. You might need to call feof if:
An input call has just returned 0 or EOF, but you'd like to know whether that was due to an actual end-of-file, or a more serious error.
You've used an input call such as getw() that has no way to indicate EOF or error.
(rare, and arguably poor form) You've made a long series of input calls in a row, and you didn't expect EOF from any of them, so you didn't feel like checking all of them explicitly, but decided to check with one catch-all call to feof() at the end.
The other thing to know is that feof() only tells you that you did hit end-of-file — that is, past tense. It does not predict the future; it does not tell you that next input call you try to make will hit EOF. It tells you that the previous call you made did hit EOF. See also Why is “while( !feof(file) )” always wrong?
See also how to detect read/write errors when using fread() and fwrite?
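As a sketch of the pattern recommended above (check the input call's return value first, and consult ferror/feof only to classify the short count), here is a hypothetical helper that counts the bytes in a file:

```c
#include <stdio.h>

/* Count the bytes in a file. The loop is driven by fread's return
   value; ferror is used afterwards only to tell a real I/O error
   apart from ordinary end-of-file. Hypothetical helper name. */
long demo_count_bytes(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;
    char buf[4096];
    long total = 0;
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        total += (long)n;
    /* A short count means either EOF or error; ferror tells them apart.
       If ferror is 0 here, feof(f) will be non-zero. */
    long result = ferror(f) ? -1 : total;
    fclose(f);
    return result;
}
```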
what can the consequences be of reading even after an EOF reached error (reading past EOF) occurs?
That's a good question. The answer is, "it depends," and since unpredictability can be a real problem, your best bet is usually not to try to read past EOF.
When I was first learning C, if you got an EOF, but tried reading some more, and if the EOF had somehow "gone away", your next read might succeed. And it could be quite common for the EOF to "go away" like that, if you were reading from the keyboard, and the user indicated EOF by typing control-D. But if they typed more input after typing control-D, you could go ahead and read it.
But that was in the old days. These days EOF is "sticky", and once the end-of-file flag has been set for a stream, I think any future attempts to read are supposed to immediately return EOF. These days, if the user hits control-D and you want to keep reading, you have to call clearerr().
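The sticky behavior is easy to observe without a terminal; this hypothetical demo uses a temporary file instead of control-D, reads past its end, and shows that the EOF flag stays set until clearerr is called:

```c
#include <stdio.h>

/* Show that the end-of-file flag is sticky until clearerr() clears it.
   Returns 0 if the flag behaved as described. Hypothetical demo. */
int demo_sticky_eof(const char *path)
{
    FILE *f = fopen(path, "w+");
    if (!f)
        return -1;
    fputc('x', f);
    rewind(f);                       /* back to the start for reading */
    fgetc(f);                        /* read the only byte */
    fgetc(f);                        /* hits end-of-file: EOF flag set */
    int was_eof = feof(f) != 0;
    clearerr(f);                     /* clear the sticky EOF flag */
    int still_eof = feof(f) != 0;
    fclose(f);
    return (was_eof && !still_eof) ? 0 : 1;
}
```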
I'm pretty sure everything I just said about feof() and the per-stream EOF flag is also true of ferror() and the per-stream error flag.
In a comment, #0___________ said that "if you ignore I/O errors and continue you invoke Undefined Behaviour". I don't think this is true, but I don't have my copy of the Standard handy to check.
The reason to call feof is to figure out whether the EOF return from an input function was due to some kind of actual I/O error (ferror() would return a non-zero value then but feof() wouldn't), or due to the input being exhausted (which is not an error, but a condition; feof() would return a non-zero value).
For example if your program is to consume all the input and process it, it might be crucial to be able to distinguish that you actually did read all of the input vs someone removed the USB stick from the drive when you'd read just half of the input.
Most IO functions (fopen, fread, etc.) can fail if the system encounters an error, or if a passed argument is invalid (e.g., calling fseek with a whence value other than SEEK_SET, SEEK_CUR, or SEEK_END), but they can also fail for reasons beyond the programmer's control. Checking for success is tedious and might sometimes be unnecessary. Are there any circumstances in which a function will not fail, even though it could fail in other circumstances? Such as fclosing a file opened in read-only mode, where the file pointer was freshly returned by fopen? When do I not have to worry about an IO function failing? Are these such circumstances?
File was opened in READ mode
File pointer was returned by fopen and nullchecked
Read operations
What other circumstances?
I do not see why fclose would fail on a file opened in READ mode: there are no buffers to be written to the file, so all that needs to happen is freeing up memory, and since free cannot fail, neither should fclose. Is this assumption correct?
Also should I avoid rewind because it cannot indicate error?
We know that calls to functions like fprintf or fwrite will not write data to the disk immediately; instead, the data will be buffered until a threshold is reached. My question is: if I call the fseek function, will this buffered data be written to disk before seeking to the new position? Or is the data still in the buffer, and written to the new position?
I'm not aware of any guarantee that the buffer is flushed; it may not be if you seek to a position close enough. However, there is no way that the buffered data will be written to the new position. Buffering is just an optimization, and as such it has to be transparent.
Yes; fseek() ensures that the file will look like it should according to the fwrite() operations you've performed.
The C standard, ISO/IEC 9899:1999 §7.19.9.2 fseek(), says:
The fseek function sets the file position indicator for the stream pointed to by stream.
If a read or write error occurs, the error indicator for the stream is set and fseek fails.
I don't believe that it's specified that the data must be flushed on a fseek but when the data is actually written to disk it must be written at that position that the stream was at when the write function was called. Even if the data is still buffered, that buffer can't be written to a different part of the file when it is flushed even if there has been a subsequent seek.
It seems that your real concern is whether previously-written (but not yet flushed) data would end up in the wrong place in the file if you do an fseek.
No, that won't happen. It'll behave as you'd expect.
I have vague memories of a requirement that you call fflush before fseek, but I don't have my copy of the C standard available to verify. (If you don't, it would be undefined behavior or implementation-defined, or something like that.) The common Unix standard specifies that:
If the most recent operation, other than ftell(), on a given stream is
fflush(), the file offset in the underlying open file description
shall be adjusted to reflect the location specified by fseek().
[...]
If the stream is writable and buffered data had not been written to
the underlying file, fseek() shall cause the unwritten data to be
written to the file and shall mark the st_ctime and st_mtime fields of
the file for update.
This is marked as an extension to the ISO C standard, however, so you can't count on it except on Unix platforms (or other platforms which make similar guarantees).
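As a quick sanity check of the guarantee discussed above, this hypothetical helper writes to an update stream, seeks back without an explicit fflush, and reads the data where it was written; the buffered bytes must not end up at the new position:

```c
#include <stdio.h>
#include <string.h>

/* Buffered writes must land where the stream was positioned when
   fwrite/fputs was called, not where a later fseek moved to.
   Returns 0 on success. Hypothetical demo. */
int demo_fseek_after_write(const char *path)
{
    FILE *f = fopen(path, "w+");
    if (!f)
        return -1;
    fputs("hello", f);          /* may still be sitting in the buffer */
    fseek(f, 0, SEEK_SET);      /* reposition; also satisfies the rule that
                                   update streams need a flush or seek
                                   between a write and a read */
    char buf[6] = {0};
    fread(buf, 1, 5, f);
    fclose(f);
    return strcmp(buf, "hello") == 0 ? 0 : 1;
}
```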
Can anyone please explain to me the difference between fpurge(FILE *stream) and fflush(FILE *stream) in C?
Both fflush() and fpurge() will discard any unwritten or unread data in the buffer.
Please explain the exact difference between these two, and also their pros and cons.
"... both fflush and fpurge will discard any unwritten or unread data in the buffer..." : No.
fflush:
The function fflush forces a write of all buffered data for the given output or update stream via the stream's underlying write function. The open status of the stream is unaffected.
If the stream argument is NULL, fflush flushes all open output streams.
fpurge:
The function fpurge erases any input or output buffered in the given stream. For output streams this discards any unwritten output. For input streams this discards any input read from the underlying object but not yet obtained via getc. This includes any text pushed back via ungetc. (P.S.: there also exists __fpurge, which does the same, but without returning any value).
Besides the obvious effect on buffered data, one place where you would notice the difference is with input streams. You can fpurge such a stream (although it is usually a mistake, possibly a conceptual one). Depending on the environment, you might not be able to fflush an input stream (its behaviour might be undefined; see the man page). In addition to the differences remarked above: 1) the cases where they lead to errors are different, and 2) fflush can work on all output streams with a single statement, as said (this can be very useful).
As for pros and cons, I would not really quote any... they simply work different (mostly), so you should know when to use each.
In addition to the functional difference (what you were asking), there is a portability difference: fflush is a standard function, while fpurge is not (and __fpurge is not either).
Here you have the respective man pages (fflush, fpurge).
To start with, both functions clear the buffers (the types of buffers each operates on are discussed below); the major difference is what happens to the data present in the buffer.
For fflush(), the data is forced to be written to disk.
For fpurge(), data is discarded.
That being said, fflush() is a standard C function, mentioned in the C11, chapter §7.21.5.2.
Whereas, fpurge() is a non-portable and non-standard function. From the man page
These functions are nonstandard and not portable. The function
fpurge() was introduced in 4.4BSD and is not available under Linux.
The function __fpurge() was introduced in Solaris, and is present in
glibc 2.1.95 and later.
That said, the major usage-side difference is,
Calling fflush() with an input stream is undefined behavior.
If stream points to an output stream or an update stream in which the most recent
operation was not input, the fflush function causes any unwritten data for that stream
to be delivered to the host environment to be written to the file; otherwise, the behavior is
undefined.
Calling fpurge() with an input stream is defined.
For input streams
this discards any input read from the underlying object but not yet
obtained via getc(3); this includes any text pushed back via
ungetc(3).
Still, try to stick to fflush().
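Since fpurge is not portable, a common portable substitute for its input-side use case (throwing away pending input up to the end of the line) is simply to read and discard. This hypothetical helper is not equivalent to fpurge: it consumes from the underlying file as well, not just from the stdio buffer, but in practice that is usually what you want:

```c
#include <stdio.h>

/* Read and discard characters up to and including the next newline,
   or until end-of-file. A portable stand-in for the common use of
   fpurge on input streams. Hypothetical helper. */
void demo_discard_line(FILE *f)
{
    int c;
    while ((c = getc(f)) != EOF && c != '\n')
        ;   /* discard until newline or end-of-file */
}
```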
I am struggling to understand the difference between these functions. Which one of them can be used if I want to read one character at a time?
fread()
read()
getc()
Depending on how you want to do it you can use any of those functions.
The easiest to use would probably be fgetc().
fread() : read a block of data from a stream (documentation)
read() : POSIX function that reads bytes from a file descriptor (documentation); it is the lower-level call on top of which fread() is typically implemented, not a POSIX version of fread()
getc() : get a character from a stream (documentation). Please consider using fgetc() (doc) instead, since it is somewhat safer: getc may be implemented as a macro that evaluates its stream argument more than once.
fread() is a standard C function for reading blocks of binary data from a file.
read() is a POSIX function for doing the same.
getc() is a standard C function (a macro, actually) for reading a single character from a file - i.e., it's what you are looking for.
In addition to the other answers, also note that read is an unbuffered way to read from a file, while fread maintains an internal buffer, so reading through it is buffered (the buffer size can be adjusted with setvbuf). Each time you call read, a system call occurs which reads the number of bytes you asked for. With fread, the library reads a chunk into its internal buffer and returns only the bytes you need; on each call it first checks whether it can satisfy the request from the buffer, and if it cannot, it makes a system call (read) to fetch another chunk and returns only the portion you wanted.
Also, read works directly on a file descriptor number, whereas fread needs the file to be opened as a FILE pointer.
The answer depends on what you mean by "one character at a time".
If you want to ensure that only one character is consumed from the underlying file descriptor (which may refer to a non-seekable object like a pipe, socket, or terminal device) then the only solution is to use read with a length of 1. If you use strace (or similar) to monitor a shell script using the shell command read, you'll see that it repeatedly calls read with a length of 1. Otherwise it would risk reading too many bytes (past the newline it's looking for) and having subsequent processes fail to see the data on the "next line".
On the other hand, if the only program that should be performing further reads is your program itself, fread or getc will work just fine. Note that getc should be a lot faster than fread if you're just reading a single byte.
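To make the distinction concrete, here are both approaches to reading a single character; the helper names are hypothetical:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* POSIX read() on a file descriptor: unbuffered, exactly one byte is
   consumed from the underlying object. Hypothetical helper. */
int demo_first_char_read(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    unsigned char c;
    ssize_t n = read(fd, &c, 1);    /* one system call, one byte */
    close(fd);
    return n == 1 ? c : -1;
}

/* Standard C getc() on a FILE stream: stdio may read a whole block
   into its buffer, but hands back one character. Hypothetical helper. */
int demo_first_char_getc(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;
    int c = getc(f);
    fclose(f);
    return c == EOF ? -1 : c;
}
```

Both return the same character here; the difference only matters when another process (or a later exec'd program) is expected to see the bytes your program did not consume.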