I am a little confused about the return value of scanf in C.
The documentation says:
EOF is returned if the end of input is reached before either the first successful conversion or a matching failure occurs.
EOF is also returned if a read error occurs, in which case the error indicator for the stream is set.
First, I am not sure what they mean by if the end of input is reached before the first successful conversion or before a matching failure occurs. How is that possible?
Second, I am not sure what the difference is between a read error and a matching failure?
First, I am not sure what they mean by if the end of input is reached before the first successful conversion or before a matching failure occurs. How is that possible?
Imagine you're trying to read a character from a file and you're at the end of the file. The end of input will be reached before any successful conversion or matching attempt takes place.
Second, I am not sure what the difference is between a read error and a matching failure?
A read error means you were unable to read data from the FILE. A matching failure means you were able to read data but it didn't match what was expected (for example, reading the letter a for %d).
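The three outcomes can be told apart from the return value alone. The following is a minimal sketch, assuming sscanf on a string as a stand-in for scanf on a stream; classify_int_scan is a hypothetical helper name, not a library function. A string cannot produce a read error, so only end of input and matching failure are shown.

```c
#include <stdio.h>

/* Hypothetical helper: classify what sscanf reports for one "%d" conversion. */
const char *classify_int_scan(const char *text)
{
    int value;
    int rc = sscanf(text, "%d", &value);

    if (rc == 1)
        return "converted";        /* one successful conversion */
    if (rc == EOF)
        return "end of input";     /* input ran out before any conversion */
    return "matching failure";     /* rc == 0: input was there, but not a number */
}
```

Calling classify_int_scan("42") yields "converted", classify_int_scan("") yields "end of input", and classify_int_scan("a") yields "matching failure".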
The function scanf() returns the number of fields successfully read and converted. On Windows, typing Ctrl-Z signals EOF; on Linux and other Unix-like systems the equivalent is Ctrl-D. So if you haven't entered meaningful values, scanf() indicates failure in one way or another.
Typically, you test for the number of inputs required, and this will cover the EOF situation too.
if (1 != scanf("%d", &i))
    printf("No valid input\n");
During read/write operations is it absolutely necessary to check feof()?
The reason I ask is because I have a program that performs read/write ONCE, here is the code below:
while (1) {
    data = read_file_data(file);
    write_file_data(data, filename);
    if (feof(file))
        print(read error)
}
This is just pseudocode but is it necessary to check feof() in a case like this where a read will occur once? Currently, I think it is only necessary if you will do ANOTHER read after the one above like this:
while (1) {
    data = read_file_data(file);
    write_file_data(data, filename);
    if (feof(file)) // feof error occurred oops
        print(read error)
    data = read_file_data(file); // reading after error
}
Lastly, what can the consequences be of reading even after an EOF reached error (reading past EOF) occurs?
During read/write operations is it absolutely necessary to check feof()?
No. During normal operations, the best way of detecting EOF is to check the return value of the particular input call you're using. Input calls can always fail, so their return values should always be checked. This is especially true of scanf and fgets, which many beginning programmers (and unfortunately many beginning programming tutorials) conspicuously neglect to check.
Explicit calls to feof are rarely necessary. You might need to call feof if:
An input call has just returned 0 or EOF, but you'd like to know whether that was due to an actual end-of-file, or a more serious error.
You've used an input call such as getw() that has no way to indicate EOF or error.
(rare, and arguably poor form) You've made a long series of input calls in a row, and you didn't expect EOF from any of them, so you didn't feel like checking all of them explicitly, but decided to check with one catch-all call to feof() at the end.
The other thing to know is that feof() only tells you that you did hit end-of-file (that is, past tense). It does not predict the future; it does not tell you that the next input call you try to make will hit EOF. It tells you that the previous call you made did hit EOF. See also Why is “while( !feof(file) )” always wrong?
See also how to detect read/write errors when using fread() and fwrite?
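A read loop driven by the input call's return value might look like the sketch below. It assumes line-oriented input read with fgets; count_lines is a hypothetical helper, and feof() is never used as the loop condition.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count the lines in a stream, or return -1 on a read error. */
long count_lines(FILE *fp)
{
    char line[256];
    long lines = 0;
    int partial = 0;   /* set while the last chunk read had no newline yet */

    /* The loop is controlled by the return value of fgets, not by feof(). */
    while (fgets(line, sizeof line, fp) != NULL) {
        if (strchr(line, '\n') != NULL) {
            lines++;
            partial = 0;
        } else {
            partial = 1;   /* long line, or a final line without '\n' */
        }
    }

    /* fgets returned NULL: only now ask whether that was a real error. */
    if (ferror(fp))
        return -1;
    return lines + partial;
}
```

The single ferror() check after the loop is enough to separate a clean end of file from a failed read.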
what can the consequences be of reading even after an EOF reached error (reading past EOF) occurs?
That's a good question. The answer is, "it depends," and since unpredictability can be a real problem, your best bet is usually not to try to read past EOF.
When I was first learning C, if you got an EOF, but tried reading some more, and if the EOF had somehow "gone away", your next read might succeed. And it could be quite common for the EOF to "go away" like that, if you were reading from the keyboard, and the user indicated EOF by typing control-D. But if they typed more input after typing control-D, you could go ahead and read it.
But that was in the old days. These days EOF is "sticky", and once the end-of-file flag has been set for a stream, I think any future attempts to read are supposed to immediately return EOF. These days, if the user hits control-D and you want to keep reading, you have to call clearerr().
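The effect of clearerr() on the end-of-file indicator can be observed directly. This is a small sketch, assuming a stream that reaches end of file without a read error; eof_before_and_after_clear is a hypothetical helper name.

```c
#include <stdio.h>

/* Hypothetical helper: read to end of file, then clear the indicators.
   Stores the feof() result seen before and after clearerr(). */
void eof_before_and_after_clear(FILE *fp, int *before, int *after)
{
    while (getc(fp) != EOF)
        ;                        /* discard everything up to EOF */

    *before = feof(fp) != 0;     /* indicator is set once EOF was hit */
    clearerr(fp);                /* resets both the EOF and error indicators */
    *after = feof(fp) != 0;      /* indicator is clear again */
}
```

Whether a read issued after clearerr() then finds new data depends on the underlying device; on a terminal the user may well have typed more.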
I'm pretty sure everything I just said about feof() and the per-stream EOF flag is also true of ferror() and the per-stream error flag.
In a comment, #0___________ said that "if you ignore I/O errors and continue you invoke Undefined Behaviour". I don't think this is true, but I don't have my copy of the Standard handy to check.
The reason to call feof is to figure out whether the EOF return from an input function was due to some kind of actual I/O error (ferror() would return a non-zero value then but feof() wouldn't), or due to the input being exhausted (which is not an error, but a condition; feof() would return a non-zero value).
For example if your program is to consume all the input and process it, it might be crucial to be able to distinguish that you actually did read all of the input vs someone removed the USB stick from the drive when you'd read just half of the input.
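That distinction can be made once the input function has returned EOF. A minimal sketch, assuming byte-wise reading with getc; the enum and the function drain are hypothetical names chosen for illustration.

```c
#include <stdio.h>

enum stop_reason { STOP_END_OF_INPUT, STOP_READ_ERROR };

/* Hypothetical helper: consume a stream and report why reading stopped. */
enum stop_reason drain(FILE *fp, long *bytes)
{
    int c;   /* int rather than char, so EOF stays distinguishable */

    *bytes = 0;
    while ((c = getc(fp)) != EOF)
        (*bytes)++;

    /* getc returned EOF; the stream indicators now say which case it was. */
    return ferror(fp) ? STOP_READ_ERROR : STOP_END_OF_INPUT;
}
```

A caller that gets STOP_READ_ERROR knows the byte count is only a prefix of the input and can refuse to treat the data as complete.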
As we know, if we try to read a value whose type does not match the conversion specifier (e.g. %d with input z), scanf does not discard the offending input after the failed read, and this usually causes an infinite loop. But since scanf is a library function (so not directly interfaced with kernel operations), shouldn't it use an atomic system call like the read function? If so, why is the input not consumed after scanf fails, while it is after a read?
Also, after we read something with scanf, a '\n' (newline) character always remains in the input buffer. How does scanf know not to consume it?
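One common way to avoid the infinite loop described above is to discard the rest of the line after a matching failure. The sketch below assumes that skipping the whole offending line is acceptable; read_int_skipping_bad_lines is a hypothetical helper, not a standard function.

```c
#include <stdio.h>

/* Hypothetical helper: read the next integer, skipping lines that do not
   start with one. Returns 1 on success, 0 on end of input or read error. */
int read_int_skipping_bad_lines(FILE *in, int *out)
{
    for (;;) {
        int rc = fscanf(in, "%d", out);
        if (rc == 1)
            return 1;            /* converted one integer */
        if (rc == EOF)
            return 0;            /* end of input or read error */

        /* rc == 0: matching failure. The offending characters are still
           unread, so consume them up to and including the newline. */
        int c;
        while ((c = getc(in)) != EOF && c != '\n')
            ;
        if (c == EOF)
            return 0;
    }
}
```

With input consisting of the line abc followed by the line 42, the first call skips abc and stores 42.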
As per title I am trying to understand the exact behavior of Ctrl+D / Ctrl+Z in a while loop with a gets (which I am required to use). The code I am testing is the following:
#include <stdio.h>
#include <stdlib.h>

int main()
{
    char str[80];
    while(printf("Insert string: ") && gets(str) != NULL) {
        puts(str);
    }
    return 0;
}
If my input is simply a Ctrl+D (or Ctrl+Z on Windows) gets returns NULL and the program exits correctly. The unclear situation is when I insert something like house^D^D (Unix) or house^Z^Z\n (Windows).
In the first case my interpretation is a getchar (or something similar inside the gets function) waits for read() to get the input, the first Ctrl+D flushes the buffer which is not empty (hence not EOF) then the second time read() is called EOF is triggered.
In the second case though, I noticed that the first Ctrl+Z is inserted into the buffer while everything that follows is simply ignored. Hence my understanding is the first read() call inserted house^Z and discarded everything else returning 5 (number of characters read). (I say 5 because otherwise I think a simple Ctrl+Z should return 1 without triggering EOF). Then the program waits for more input from the user, hence a second read() call.
I'd like to know what I get right and wrong of the way it works and which part of it is simply implementation dependent, if any.
Furthermore, I noticed that in both Unix and Windows, even after EOF is triggered it seems to reset to false in the following gets() call, and I don't understand why this happens or in which line of the code.
I would really appreciate any kind of help.
(12/20/2016) I heavily edited my question in order to avoid confusion
The CTRL-D and CTRL-Z "end of file" indicators serve a similar purpose on Unix and Windows systems respectively, but are implemented quite differently.
On Unix systems (including Unix clones like Linux) CTRL-D, while officially described as the end-of-file character, is actually a delimiter character. It does almost the same thing as the end-of-line character (usually carriage return or CTRL-M) which is used to delimit lines. Both characters tell the operating system that the input line is finished and to make it available to the program. The only difference is that with the end-of-line character a line feed (CTRL-J) character is inserted at the end of the input buffer to mark the end of the line, while with the end-of-file character nothing is inserted.
This means when you enter house^D^D on Unix the read system call will first return a buffer of length 5 with the 5 characters house in it. When read is called again to obtain more input, it will then return a buffer of length 0 with no characters in it. Since a zero length read on a normal file indicates that the end of file has been reached, the gets library function also interprets this as the end of file and stops reading the input. However, since it filled the buffer with 5 characters, it doesn't return NULL to indicate that it reached the end of the file. And since it hasn't actually reached end of file, as terminal devices aren't actually files, further calls to gets after this will make further calls to read, which will return any subsequent characters that the user types.
On Windows CTRL-Z is handled much differently. The biggest difference is that it's not treated specially by the operating system at all. When you type house^Z^Z^M on Windows only the carriage return character is given special treatment. Just like on Unix, the carriage return makes the typed line available to the program, though in this case a carriage return and a line feed are added to the buffer to mark the end of the line. So the result is that ReadFile function returns a 9 byte long buffer with the 9 characters house^Z^Z^M^J in it.
It is actually the program itself, specifically the C runtime library, that treats CTRL-Z specially. In the case of the Microsoft C runtime library, when it sees the CTRL-Z character in the buffer returned by ReadFile it treats it as an end-of-file marker and ignores everything else after it. Using the example in the previous paragraph, gets ends up calling ReadFile to get more input because the fact that it has seen the CTRL-Z character isn't remembered when reading from the console (or other device) and it hasn't yet seen the end-of-line (which was ignored). If you then press enter again, gets will return with the buffer filled with the 7 bytes house^Z\0 (adding a 0 byte to indicate the end of the string). By default, it does much the same thing when reading from normal files: if a CTRL-Z character appears in a file, it and everything after it is ignored. This is for backward-compatibility with CP/M, which only supported file lengths that were multiples of 128 bytes and used CTRL-Z to mark where text files really were supposed to end.
Note that both the Unix and Windows behaviours described above are only the normal default handling of user input. The Unix handling of CTRL-D only occurs when reading from a terminal device in canonical mode and it's possible to change the "end-of-file" character to something else. On Windows the operating system never treats CTRL-Z specially, but whether the C runtime library does or not depends on whether the FILE stream being read is in text or binary mode. This is why in portable programs you should always include the character b in the mode string when opening binary files (eg. fopen("foo.gif", "rb")).
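The mode string is the only thing that changes between the two behaviours. The sketch below assumes a caller-supplied path and counts how many bytes fread delivers for a given mode; bytes_visible is a hypothetical helper name.

```c
#include <stdio.h>

/* Hypothetical helper: count the bytes fread delivers from a file opened
   with the given mode string. Returns -1 if opening or reading fails. */
long bytes_visible(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return -1;

    unsigned char buf[512];
    long total = 0;
    size_t got;
    while ((got = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long)got;

    int failed = ferror(fp);   /* a short read may be EOF or an error */
    fclose(fp);
    return failed ? -1 : total;
}
```

Opened with "rb", a file containing a CTRL-Z byte (0x1A) is counted in full on any platform. With "r", the count may be smaller on runtimes that treat CTRL-Z as an end-of-file marker in text mode, as described above.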
I'm trying to read from a file in C and after I'm done reading want to write to the same file. I'm trying to use fread() for this. Does anyone know if fread advances the pointer after it encounters "\0"? I mean after I finish reading do I need to advance the pointer or do I need to straight-away start writing into the file using fwrite ?
fread will advance the file position (not pointer) by the number of bytes it reads, stopping only when the requested count has been read, at EOF, or on an error.
However, it will not stop reading simply because it encounters '\0'. In fact, even fgets will only stop reading when it encounters \n, reaches end of file, or fills its buffer. No standard library function I know of stops reading a file at '\0'.
Yes, it does advance the pointer unless you meet EOF or encounter an error:
RETURN VALUES
The functions fread() and fwrite() advance the file position indicator for the stream by the number of bytes read or written. They return
the number of objects read or written. If an error occurs, or the end-of-file is reached, the return value is a short object count (or
zero).
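For the read-then-write case in the question, the position after fread is already where the write would go, but on a stream opened for update the C standard requires a file positioning call (or, for the write-to-read direction, fflush) between a read and a following write, unless the read hit end of file. A minimal sketch, assuming a stream opened in an update mode such as "rb+"; read_then_overwrite is a hypothetical helper name.

```c
#include <stdio.h>

/* Hypothetical helper: read up to n bytes, then write `tail` at the position
   where the read stopped. Returns the new file position, or -1 on error. */
long read_then_overwrite(FILE *fp, char *buf, size_t n, const char *tail)
{
    /* fread does not stop at '\0'; it stops at n bytes, EOF, or an error. */
    size_t got = fread(buf, 1, n, fp);
    if (got < n && ferror(fp))
        return -1;

    /* A no-op seek satisfies the rule for switching from reading to writing. */
    if (fseek(fp, 0L, SEEK_CUR) != 0)
        return -1;

    if (fputs(tail, fp) == EOF)
        return -1;

    return ftell(fp);
}
```

The zero-offset fseek does not move the position; it only marks the switch from input to output.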
In what cases does the function fwrite put an error indicator onto the stream, such that ferror will return true?
Specifically, I am wondering if it puts an error when all the bytes were not successfully written.
Please provide a link as to where you get your information from.
Thank you
If any error occurs, the error indicator for the stream will be set, and will not be cleared until clearerr is called. However, due to buffering, it's difficult for stdio functions to report errors. Often the error will not be seen until a later call, since buffering data never fails, but the subsequent write after the buffer is full might fail. If you're using stdio to write files, the only ways I know to handle write errors robustly are (choose one):
disable buffering (with setbuf or setvbuf - this must be the very first operation performed on the FILE after it's opened or otherwise it has UB)
keep track of the last successful fflush and assume any data after that might be only partially-written
treat any failed write as a completely unrecoverable file and just delete it
Also note that on a POSIX system, fwrite (and other stdio functions) are required to set errno to indicate the type of error, if an error occurs. As far as plain C is concerned, these functions may set errno, or they may not.
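Because of buffering, a write is only known to have left the stdio buffer once fflush has succeeded. The sketch below checks both steps; write_all is a hypothetical helper name, and a successful fflush still says nothing about whether the operating system has committed the data to disk.

```c
#include <stdio.h>

/* Hypothetical helper: write n bytes and push them out of the stdio buffer.
   Returns 0 on success, -1 if any step reported an error. */
int write_all(FILE *fp, const void *data, size_t n)
{
    if (fwrite(data, 1, n, fp) != n)
        return -1;               /* short count: the write failed */

    if (fflush(fp) == EOF)
        return -1;               /* buffered data could not be written out */

    return ferror(fp) ? -1 : 0;  /* catch an indicator left set earlier */
}
```

The return value of fclose deserves the same check, since it flushes any remaining buffered data and can fail for the same reasons.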
From the fwrite man page on my Linux system:
RETURN VALUE
fread() and fwrite() return the number of items successfully read or written
(i.e., not the number of characters). If an error occurs, or the end-of-file
is reached, the return value is a short item count (or zero).
fread() does not distinguish between end-of-file and error, and callers
must use feof(3) and ferror(3) to determine which occurred.
Just from reading the man page, it doesn't look like it will set errno.
On Windows, fwrite may set the error indicator when you try to write to a stream that is currently in read mode.
For example, this can happen if a call to seekg set the stream's read flag and seekp was not called before writing.