During read/write operations is it absolutely necessary to check feof()?
The reason I ask is because I have a program that performs read/write ONCE, here is the code below:
while (1) {
    data = read_file_data(file);
    write_file_data(data, filename);
    if (feof(file))
        print(read error)
}
This is just pseudocode but is it necessary to check feof() in a case like this where a read will occur once? Currently, I think it is only necessary if you will do ANOTHER read after the one above like this:
while (1) {
    data = read_file_data(file);
    write_file_data(data, filename);
    if (feof(file))               // feof error occurred oops
        print(read error)
    data = read_file_data(file);  // reading after error
}
Lastly, what can the consequences be of reading even after an EOF reached error (reading past EOF) occurs?
During read/write operations is it absolutely necessary to check feof()?
No. During normal operations, the best way of detecting EOF is to check the return value of the particular input call you're using. Input calls can always fail, so their return values should always be checked. This is especially true of scanf and fgets, which many beginning programmers (and unfortunately many beginning programming tutorials) conspicuously neglect to check.
Explicit calls to feof are rarely necessary. You might need to call feof if:
An input call has just returned 0 or EOF, but you'd like to know whether that was due to an actual end-of-file, or a more serious error.
You've used an input call such as getw() that has no way to indicate EOF or error.
(rare, and arguably poor form) You've made a long series of input calls in a row, and you didn't expect EOF from any of them, so you didn't feel like checking all of them explicitly, but decided to check with one catch-all call to feof() at the end.
The other thing to know is that feof() only tells you that you did hit end-of-file — that is, past tense. It does not predict the future; it does not tell you that the next input call you try to make will hit EOF. It tells you that the previous call you made did hit EOF. See also Why is “while( !feof(file) )” always wrong?
See also how to detect read/write errors when using fread() and fwrite?
what can the consequences be of reading even after an EOF reached error (reading past EOF) occurs?
That's a good question. The answer is, "it depends," and since unpredictability can be a real problem, your best bet is usually not to try to read past EOF.
When I was first learning C, if you got an EOF but tried reading some more, and the EOF had somehow "gone away", your next read might succeed. And it was quite common for the EOF to "go away" like that if you were reading from the keyboard and the user indicated EOF by typing control-D: if they typed more input after the control-D, you could go ahead and read it.
But that was in the old days. These days EOF is "sticky", and once the end-of-file flag has been set for a stream, I think any future attempts to read are supposed to immediately return EOF. These days, if the user hits control-D and you want to keep reading, you have to call clearerr().
I'm pretty sure everything I just said about feof() and the per-stream EOF flag is also true of ferror() and the per-stream error flag.
In a comment, #0___________ said that "if you ignore I/O errors and continue you invoke Undefined Behaviour". I don't think this is true, but I don't have my copy of the Standard handy to check.
The reason to call feof is to figure out whether the EOF return from an input function was due to some kind of actual I/O error (ferror() would return a non-zero value then but feof() wouldn't), or due to the input being exhausted (which is not an error, but a condition; feof() would return a non-zero value).
For example if your program is to consume all the input and process it, it might be crucial to be able to distinguish that you actually did read all of the input vs someone removed the USB stick from the drive when you'd read just half of the input.
Related
I would like to read characters from stdin until one of the following occurs:
an end-of-line marker is encountered (the normal case, in my thinking),
the EOF condition occurs, or
an error occurs.
How can I guarantee that one of the above events will happen eventually? In other words, how do I guarantee that getchar will eventually return either \n or EOF, provided that no error (in terms of ferror(stdin)) occurs?
// (How) can we guarantee that the LABEL'ed statement will be reached?
int c;
int done = 0;
while (!0)
    if ((c = getchar()) == EOF || ferror(stdin) || c == '\n')
        break;
LABEL: done = !0;
If stdin is connected to a device that always delivers some character other than '\n', none of the above conditions will occur. It seems like the answer will have to do with the properties of the device. Where can those details be found (in the documentation for compiler, device firmware, or device hardware perhaps)?
In particular, I am interested to know if keyboard input is guaranteed to be terminated by an end-of-line marker or end-of-file condition. Similarly for files stored on disc / SSD.
Typical use case: user enters text on the keyboard. Program reads first few characters and discards all remaining characters, up to the end-of-line marker or end-of-file (because some buffer is full or after that everything is comments, etc.).
I am using C89, but I am curious if the answer depends on which C standard is used.
You can't.
Let's say I run your program, then I put a weight on my keyboard's "X" key and go on vacation to Hawaii. On the way there, I get struck by lightning and die.
There will never be any input other than 'x'.
Or, I may decide to type the complete story of Moby Dick, without pressing enter. It will probably take a few days. How long should your program wait before it decides that maybe I won't ever finish typing?
What do you want it to do?
Looking at all the discussion in the comments, it seems you are looking in the wrong place:
It is not a matter of keyboard drivers or wrapping stdin.
It is also not a matter of what programming language you are using.
It is a matter of the purpose of the input in your software.
Basically, it is up to you as a programmer to know how much input you want or need, and then decide when to stop reading input, even if valid input is still available.
Note, that not only are there devices that can send input forever without triggering EOF or end of line condition, but there are also programs that will happily read input forever.
This is by design.
Common examples can be found in POSIX style OS (like Linux) command line tools.
Here is a simple example:
cat /dev/urandom | hexdump
This will print random numbers for as long as your computer is running, or until you hit Ctrl+C.
Though cat will stop when there is nothing more to read (EOF or any read error), it does not expect such an end, so unless there is a bug in the implementation you are using, it should happily run forever.
So the real question is:
When does your program need to stop reading characters and why?
If stdin is connected to a device that always delivers some character other than '\n', none of the above conditions will occur.
A device such as /dev/zero, for example. Yes, stdin can be connected to a device that never provides a newline or reaches EOF, and that is not expected ever to report an error condition.
It seems like the answer will have to do with the properties of the device.
Indeed so.
Where can those details be found (in the documentation for compiler, device firmware, or device hardware perhaps)?
Generally, it's a question of the device driver. And in some cases (such as the /dev/zero example) that's all there is anyway. Generally, drivers do things that are sensible for the underlying hardware, but in principle, they don't have to.
In particular, I am interested to know if keyboard input is guaranteed to be terminated by an end-of-line marker or end-of-file condition.
No. Generally speaking, an end-of-line marker is sent by a terminal device if and only if the <enter> key is pressed. An end-of-file condition might be signaled if the terminal disconnects (but the program continues), or if the user explicitly causes one to be sent (by typing <Ctrl>-<D> on Linux or Mac, for example, or <Ctrl>-<Z> on Windows). Neither of those events need actually happen on any given run of a program, and it is very common for the latter not to.
Similarly for files stored on disc / SSD.
You can generally rely on data read from an ordinary file to contain newlines where they are present in the file itself. If the file is open in text mode, then the system-specific text line terminator will also be translated to a newline, if it differs. It is not necessary for a file to contain any of those, so a program reading from a regular file might never see a newline.
You can rely on EOF being signaled when a read is attempted while the file position is at or past the end of the file's data.
Typical use case: user enters text on the keyboard. Program reads first few characters and discards all remaining characters, up to the end-of-line marker or end-of-file (because some buffer is full or after that everything is comments, etc.).
I think you're trying too hard.
Reading to end-of-line might be a reasonable thing to do in some cases. Expecting a newline to eventually be reached is reasonable if the program is intended to support interactive use. But trying to ensure that invalid data cannot be fed to your program is a losing cause. Your objective should be to accept the widest array of inputs you reasonably can, and to fail gracefully when other inputs are presented.
If you need to read input in a line-by-line mode then by all means do that, and document that you do it. If only the first n characters of each line are significant to the program then document that, too. Then, if your program never terminates when a user connects its input to /dev/zero that's on them, not on you.
On the other hand, try to avoid placing arbitrary constraints, especially on sizes of things. If there is not a natural limit on the size of something, then no artificial limit you introduce will ever be enough.
scanf in the C language: I am a little confused about the return value.
In the instruction, it says:
EOF is returned if the end of input is reached before either the first successful conversion or a matching failure occurs.
EOF is also returned if a read error occurs, in which case the error indicator for the stream is set.
First, I am not sure what they mean by if the end of input is reached before the first successful conversion or before a matching failure occurs. How is that possible?
Second, I am not sure the difference between read error and matching failure?
First, I am not sure what they mean by if the end of input is reached before the first successful conversion or before a matching failure occurs. How is that possible?
Imagine you're trying to read a character from a file and you're at the end of the file. The end of input will be reached before any successful conversion or matching attempt takes place.
Second, I am not sure the difference between read error and matching failure?
A read error means you were unable to read data from the FILE. A matching failure means you were able to read data, but it didn't match what was expected (for example, reading the letter a where %d expects a digit).
The function scanf() returns the number of fields successfully read and converted. But if I type (Windows) Ctrl-Z for the input, that indicates EOF. In Linux, I think that may be Ctrl-D? So if you haven't entered meaningful values, scanf() indicates failure in one way or another.
Typically, you test for the number of inputs required, and this will cover the EOF situation too.
if (1 != scanf("%d", &i))
    printf("No valid input\n");
The manual for the standard C library function fseek says: "A successful call to the fseek() function clears the end-of-file indicator for the stream."
To me it sounds like saying if EOF is at 2 and I call fseek() to get the pointer at 4, that should work and EOF will now be pointing to 5. But when I test this hypothesis, the pointer doesn't advance beyond the current EOF (2 in the above case), and hence my understanding of the line is wrong. What does this line mean then? Thanks!
You have to remember that the EOF flag is not set until you actually try to read from beyond the end of the file.
fseek clears the flag even if you seek to beyond the end of the file, and that works because the flag will simply be set again the next time you read.
That is why it's a bad idea to have loops such as while (!feof(...)), as those will loop once too many times before detecting the actual end-of-file condition.
I was writing a bit of C99 code reading from stdin:
// [...]
fgets(buf, sizeof(buf), stdin);
// [...]
But I am wondering if I should catch errors in this case, since the shell could redirect stdin to anything that may be less robust than plain stdin. But that would also mean that every access to stdin, stdout, and stderr has to be checked for errors, and I seldom see any checks after printf and co.
So, is it recommended to check every stream access for errors?
The above example would then be something like:
// [...]
if (!fgets(buf, sizeof(buf), stdin) && ferror(stdin)) {
    exit(EXIT_FAILURE);
}
// [...]
Thanks in advance!
You always have to check the return value from fgets(), every time you use it. If you don't, you have no idea whether there is useful data in the buffer; it could hold the last line a second time. Similar comments apply to every read operation; you must check whether the read operation returned what you expected.
if (fgets(buf, sizeof(buf), stdin) == 0)
    ...EOF, or some error...
In the handling code, you need to decide what to do. You can legitimately use feof() and ferror() in that code. The correct reaction to the problem depends on your code. Detecting EOF is usually a cause for exiting the loop or exiting the function (break or return, but you only return if the function did not open the file; otherwise you have to close the file at least). Detecting errors on stdin is going to be a rare occurrence; you will have to decide what's appropriate to do.
Detecting errors on write to stderr or stdout is something that's less often done, but it is arguably sloppy programming to omit them. One issue, especially if it is stderr that has the problem, is "how are you going to report the error?" You might need to use syslog() instead, but that's the sort of issue you have to think about.
It depends on the nature of the application you are developing. For example, if you are developing a hard real-time system whose abnormal termination results in severe problems, then you should take precautions to deal with all kinds of data streaming errors. In such situations, use the following code,
if (!fgets(buf, sizeof(buf), stdin) && ferror(stdin)) {
    exit(EXIT_FAILURE);
}
or some construct similar to it. But if your application's rare failures won't have any severe consequences you don't need to check every data streaming operation.
Here's a fun exercise. Find someone's interactive program, run it until it asks for input from the terminal, and press control-D (EOF). Odds are that the author doesn't check feof(stdin), and their gets() calls just keep returning 0 bytes, which the code interprets as a blank line. If it takes that as invalid input and re-prompts, it'll end up in an infinite loop!
In what cases does the function fwrite put an error indicator onto the stream, such that ferror will return true?
Specifically, I am wondering if it puts an error when all the bytes were not successfully written.
Please provide a link as to where you get your information from.
Thank you
If any error occurs, the error indicator for the stream will be set, and will not be cleared until clearerr is called. However, due to buffering, it's difficult for stdio functions to report errors. Often the error will not be seen until a later call, since buffering data never fails, but the subsequent write after the buffer is full might fail. If you're using stdio to write files, the only ways I know to handle write errors robustly are (choose one):
disable buffering (with setbuf or setvbuf - this must be the very first operation performed on the FILE after it's opened or otherwise it has UB)
keep track of the last successful fflush and assume any data after that might be only partially-written
treat any failed write as a completely unrecoverable file and just delete it
Also note that on a POSIX system, fwrite (and other stdio functions) are required to set errno to indicate the type of error, if an error occurs. As far as plain C is concerned, these functions may set errno, or they may not.
From the fwrite man page on my Linux system:
RETURN VALUE
fread() and fwrite() return the number of items successfully read or written
(i.e., not the number of characters). If an error occurs, or the end-of-file
is reached, the return value is a short item count (or zero).
fread() does not distinguish between end-of-file and error, and callers
must use feof(3) and ferror(3) to determine which occurred.
Just from reading the man page, it doesn't look like it will set errno.
On Windows, fwrite may set the error indicator when you try to write to a stream that was last used for reading.
For example, if there was a call to seekg, which sets the read flag on the stream, and seekp was not called before writing.