I just read in a string using the following statement:
fgets(string, 100, file);
This string that was just read in was the last line. If I call feof() now will it return TRUE? Is it the same as calling feof() right at the start before reading in any lines?
No, don't use feof() to detect the end of the file. Instead, check for a read failure: fgets() returns NULL if it attempts to read past the end of the file, whereas feof() returns 0 until some function has attempted to read past the end of the file. Only after that does it return non-zero.
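As a minimal sketch of that pattern (the helper name count_lines and the 100-byte buffer are assumptions for illustration, not part of the original code), the loop below is driven by the return value of fgets(), and ferror() is consulted only after the read has failed:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the lines in f by looping on the return
   value of fgets(). Returns -1 if the loop ended because of a read
   error rather than end-of-file. */
long count_lines(FILE *f)
{
    char line[100];
    long count = 0;
    int at_line_start = 1;

    assert(f != NULL);
    while (fgets(line, sizeof line, f) != NULL) {
        size_t len = strlen(line);

        /* A long line arrives in several chunks; count it only once. */
        if (at_line_start)
            count++;
        at_line_start = (len > 0 && line[len - 1] == '\n');
    }

    /* fgets() has failed: only now is it meaningful to ask why. */
    if (ferror(f))
        return -1;
    return count;
}
```

Calling feof() before this loop, or right after the last successful fgets(), would typically still give 0; it is the failed fgets() call that sets the end-of-file indicator.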
Does feof() work when called after reading in last line?
No.
feof() becomes true when reading past the end of data. Reading the last line may not go past the end of data if the last line ended in '\n'.
The short answer is NO. Here is why:
If fgets successfully read the '\n' at the end of the line, the end-of-file indicator in the FILE structure has not been set. Hence feof() will return 0, just like it should before reading anything, even on an empty file.
feof() can only be used to distinguish between end-of-file and read-error conditions after an input operation failed. Similarly, ferror() can be used to check for a read error after an input operation failed.
Programmers usually ignore the difference between end-of-file and read-error. Hence they only rely on checking if the input operation succeeded or failed. Thus they never use feof(), and neither should you.
The behavior is somewhat similar to that of errno: errno is set by some library functions in case of error or failure. It is not reset to 0 upon success. Checking errno after a function call is only meaningful if the operation failed and if errno was cleared to 0 before the function call.
If you want to check whether you have indeed reached the end of the file, you need to try and read extra input. For example you can use this function:
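The errno discipline described above can be sketched with strtol(), which reports overflow by returning LONG_MAX or LONG_MIN, both of which are also valid results. The helper name parse_long is an assumption made for this illustration:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: converts text to a long. Returns 0 on success,
   -1 if no digits were found or the value is out of range. */
int parse_long(const char *text, long *out)
{
    char *end;
    long value;

    assert(text != NULL && out != NULL);

    errno = 0;                       /* clear any stale value first */
    value = strtol(text, &end, 10);

    if (end == text)
        return -1;                   /* no conversion was performed */
    if (errno == ERANGE)
        return -1;                   /* overflow or underflow */

    *out = value;
    return 0;
}
```

Without the errno = 0 line, a leftover ERANGE from some earlier call would make a perfectly good conversion look like a failure.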
int is_at_end_of_file(FILE *f) {
    int c = getc(f);
    if (c == EOF) {
        return 1;
    } else {
        ungetc(c, f);
        return 0;
    }
}
But reading extra input might not be worthwhile if reading from the console: it will require the user to type extra input that will be kept in the input stream. If reading from a pipe or a device, the side effect might be even more problematic. Alas, there is no portable way to test if a FILE stream is associated with an actual file.
So I have a .txt file that I want to read via stdin in c11 program using scanf().
The file is essentially many lines made of one single string.
example:
hello
how
are
you
How can I know when the file is finished? I tried comparing a string with a string made only of the EOF character, but the code loops in error.
Any advice is much appreciated.
Linux manual says (RETURN section):
RETURN VALUE
On success, these functions return the number of input items
successfully matched and assigned; this can be fewer than
provided for, or even zero, in the event of an early matching
failure.
The value EOF is returned if the end of input is reached before
either the first successful conversion or a matching failure
occurs. EOF is also returned if a read error occurs, in which
case the error indicator for the stream (see ferror(3)) is set,
and errno is set to indicate the error.
So test whether the return value of scanf equals EOF.
You can read the file redirected from standard input using scanf(), one word at time, testing for successful conversion, until no more words can be read from stdin.
Here is a simple example:
#include <stdio.h>
int main() {
char word[40];
int n = 0;
while (scanf("%39s", word) == 1) {
printf("%d: %s\n", ++n, word);
}
return 0;
}
Note that you must tell scanf() the maximum number of characters to store into the destination array before the null terminator. Otherwise, any longer word present in the input stream will cause undefined behavior, a flaw attackers can try to exploit using specially crafted input.
Upon looking at the ISO C11 standard for fgets §7.21.7.2, 3, the return value is stated regarding the synopsis code:
#include <stdio.h>
char *fgets(char* restrict s, int n, FILE* restrict stream);
The fgets function returns s if successful. If end-of-file is encountered and no characters have been read into the array, the contents of the array remain unchanged and a null pointer is returned. If a read error occurs during the operation, the array contents are indeterminate and a null pointer is returned.
The standard says that a null pointer is returned either when end-of-file is encountered and no characters have been read, or when a read error occurs. My question is: just from fgets and the returned null pointer, is there a way to distinguish which of the two cases caused the failure?
Is there a way to distinguish which of the two cases caused the error?
Yes, use feof() and ferror() to distinguish.
Yet it is important to use them correctly. Consider these two code fragments:
char s[100];
fgets(s, sizeof s, stream);
if (feof(stream)) return "End-of-file occurred";
if (ferror(stream)) return "Input error occurred";
if (fgets(s, sizeof s, stream) == NULL) {
if (feof(stream)) return "End-of-file occurred";
if (ferror(stream)) return "Input error occurred";
return "Should never get here";
}
The second properly tests the return value against NULL, as suggested by OP.
The first can encounter a rare problem. The ferror(stream) tests a flag. This flag may have been set by a prior I/O function call on stream so this fgets() is not necessarily the cause of the error. Best to check the result of fgets() to see if this function failed.
If code is to continue using stream after an error is detected, be sure to clear the error before continuing, for example before attempting a retry.
if (ferror(stream)) {
clearerr(stream);
return "Input error occurred";
}
Note that clearerr() clears both the error and end-of-file flags.
The same applies for feof(), yet most code is written to quit using stream once an end-of-file is true.
There is a 3rd pathological way to receive NULL in which neither feof() nor ferror() returns non-zero, as detailed in Is fgets() returning NULL with a short buffer compliant?. Careful reading of the C spec shows 3 "ifs", and it is possible that none of them is true, so the spec is lacking, which implies UB.
If the failure has been caused by end-of-file condition, additionally sets the eof indicator (see feof()) on stream. The contents of the array pointed to by str are not altered in this case.
If the failure has been caused by some other error, sets the error indicator (see ferror()) on stream. The contents of the array pointed to by str are indeterminate (it may not even be null-terminated).
Therefore, you would need to check for feof() and ferror() in order to determine the error.
From this site
fgetc() function reads characters from a text file in Ubuntu.
the last character before EoF is with code = -1.
what the heck is that?
in text editor file seems ok, no strange symbols at end.
while (!feof(fp))
{
c = fgetc(fp);
printf("%c %i\n", c, c);//
}
feof is meant to signal that you've tried to read past the end of file - which means that you first have to reach it. So it will only work after you try to read and the system realizes you're at the end. And what does fgetc return if you try to read past the end of file? EOF (conveniently, -1 - which is why fgetc returns an int instead of a char).
So what's happening is that you enter the loop, because you haven't yet tried to read past the end, and call fgetc, which returns -1 because you tried to read past the end of the file. The next time around the loop, feof tells you that you've already hit the end of the file and tried to read past it, and you break out.
You should read the documentation of functions you intend to use: feof and fgetc documentation explain this. But even if they did not, a simple google search would have answered your question: Why is “while ( !feof (file) )” always wrong?.
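For illustration, here is one way the loop from the question could be restructured so that the value returned by fgetc() controls it. The helper name dump_chars is hypothetical; the printf format is the one from the question:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: prints every character of fp together with its
   code and returns how many characters were read. */
long dump_chars(FILE *fp)
{
    int c;          /* int, not char, so EOF stays distinct from data */
    long count = 0;

    assert(fp != NULL);

    /* Read first, test the result, and only then use the value. */
    while ((c = fgetc(fp)) != EOF) {
        printf("%c %i\n", c, c);
        count++;
    }
    return count;
}
```

With this shape the -1 is never printed, because the EOF return value ends the loop before the body runs.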
I'm relatively new to C, my question is:
Is it ALWAYS true that there are only EOF chars past the end of a file?
Example code:
FILE *fr;
int i;
fr=fopen("file.txt","r");
for (i=0;i<20;i++) {
putchar(getc(fr));
}
fclose(fr);
Output:
user@host:~$ ./a.out | xxd
0000000: 6173 640a ffff ffff ffff asd.......
(file.txt contains asd\n)
Answer: there aren't any characters beyond the end of a file. My MSVC manual page here says that if you read past the end of the file, getc() returns EOF.
It does not matter how many times you try to make getc() read past the end of the file, it won't. It just keeps returning EOF.
The EOF is not part of the file marking its end - it is a flag value returned by getc() to tell you there is no more data.
EDIT included a sample to show the behaviour of feof(). Note, I made separate printf() statements, rather than merging them into a single statement, because it is important to be clear what order the functions feof() and getc() are called.
Note that feof() does not return a non-0 value until after getc() returned EOF.
#include <stdio.h>
int main( void )
{
FILE *fr;
int i;
fr=fopen("file.txt","r");
for (i=0;i<6;i++) {
printf("feof=%04X, ", feof(fr));
printf("getc=%04X\n", getc(fr));
}
fclose(fr);
}
Program input file:
abc\n
Program output:
feof=0000, getc=0061
feof=0000, getc=0062
feof=0000, getc=0063
feof=0000, getc=000A
feof=0000, getc=FFFFFFFF
feof=0010, getc=FFFFFFFF
So you can't use feof() to tell that the end of the file has been reached. It tells you that an earlier read already failed because it ran into the end of the file.
There are no EOF characters in a file, nor any characters after the end of a file (it's the end of the file, after all). Rather, EOF is a special value used by getc (and others) to indicate that there isn't anything to read. You can use feof and ferror to see whether that EOF was caused by reaching the end of the file, or if an error occurred.
What you are seeing are the EOF values (cast to an unsigned char) that getc returned after reaching the end of the file.
Normally, there aren't "EOF chars" in the file to mark the end. EOF is just an integer value, that does not correspond to a valid char value, that is returned by some functions when there's nothing left in the file.
In your example, you see the ff values after the contents of the file because when getc() returns EOF, indicating there's nothing left to read, you're displaying it as a char... effectively displaying the char corresponding to the low bits of the EOF value and ignoring the high bits. If you read the file in a different way, you might not see that result.
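The conversion can be observed directly. The sketch below, with a hypothetical helper name and a scratch file supplied by the caller, writes a value with putc() and reads back the byte that actually landed in the file. On a platform where EOF is -1 and bytes are 8 bits wide, that byte is 0xFF:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes value to scratch with putc() and returns
   the byte that ended up in the file, or EOF if the I/O failed.
   scratch must be open for both reading and writing. */
int byte_written_by_putc(int value, FILE *scratch)
{
    assert(scratch != NULL);

    rewind(scratch);
    if (putc(value, scratch) == EOF)  /* value is converted to unsigned char */
        return EOF;

    rewind(scratch);                  /* reposition before switching to reading */
    return getc(scratch);
}
```

Passing EOF as the value yields (unsigned char)EOF, an ordinary byte, which is what xxd showed as ff in the question's output.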
I have a text file composed of sequences of 2 bytes which I have to store in an array.
I have declared FILE *ptr.
How can I loop until EOF without using the method:
while ((c = getc(ptr)) != EOF)
{
// do something
}
I want to implement something along the lines of (PSEUDOCODE):
while (ptr is not pointing to the end of the file)
{
fscanf(...) // I will read in the bytes etc.
}
The getc() method wouldn't work well for me because I am reading in blocks of 2 bytes at a time.
You can use fread to read more than one byte at a time. fread returns the number of items it was able to read.
For example, to read 2-byte chunks you might use:
while (fread(target, 2, 1, ptr) == 1) {
/* ... */
}
Here 2 is the number of bytes in each "item", and 1 is the number of "items" you want to read on each call.
In general, you shouldn't use feof() to control when to terminate an input loop. Use the value returned by whichever input routine you're using. Different input functions vary in the information they provide; you'll have to read the documentation for the one you're using.
Note that this will treat an end-of-line as a single '\n' character. You say you're reading from a text file; it's not clear how you want to handle line endings. You should also decide what you want to do if the file has an odd number of characters.
Another option is to call getc() twice in the loop, checking its result both times.
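A rough sketch of the two-getc() approach follows. The function name read_pairs, the array layout, and the leftover flag are assumptions chosen for the example:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads 2-byte pairs from ptr into pairs[][2],
   storing at most max_pairs of them. Returns the number of complete
   pairs stored. *leftover is set to 1 if the input ended after the
   first byte of a pair, otherwise to 0. */
size_t read_pairs(FILE *ptr, unsigned char pairs[][2], size_t max_pairs,
                  int *leftover)
{
    size_t n = 0;

    assert(ptr != NULL && pairs != NULL && leftover != NULL);
    *leftover = 0;

    while (n < max_pairs) {
        int first = getc(ptr);
        if (first == EOF)
            break;                   /* clean end: no partial pair */

        int second = getc(ptr);
        if (second == EOF) {
            *leftover = 1;           /* odd number of bytes in the input */
            break;
        }

        pairs[n][0] = (unsigned char)first;
        pairs[n][1] = (unsigned char)second;
        n++;
    }
    return n;
}
```

Checking both results separately is what makes the odd-length case visible; the caller can then decide whether a dangling byte is an error.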
The only way to tell when you've reached the end of the file is when you try to read past it, and the read fails. (Yes, there is an feof() function, but it only returns true after you've tried to read past the end of the file.)
This means that, if you're going to use fscanf() to read your input, it's the return value of fscanf() itself that you need to check.
Specifically, fscanf() returns the number of items it has successfully read, or EOF (which is a negative value, typically -1) if the input ended before anything at all could be read. Thus, your input loop might look something like this:
while (1) {
/* ... */
int n = fscanf(ptr, "...", ...);
if (n == EOF && !ferror(ptr)) {
/* we've reached the end of the input; stop the loop */
break;
} else if (n < items_requested) {
if (ferror(ptr)) perror("Error reading input file");
else fprintf(stderr, "Parse error or unexpected end of input!\n");
exit(1); /* or do whatever you want to handle the error */
}
/* ... */
}
That said, there may be other options, too. For example, if your input is structured as lines (which a lot of text input is), you may be better off reading the input line by line with fgets(), and then parsing the lines e.g. with sscanf().
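As a sketch of that line-by-line approach, assume each line is supposed to hold two integers. The helper name sum_pairs, the 128-byte buffer, and the "%d %d" format are all made up for the example:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads lines of the form "<int> <int>" from in.
   Adds both numbers of every well-formed line to *total and counts
   malformed lines in *bad_lines. Returns the number of well-formed
   lines, or -1 on a read error. Lines longer than the buffer are
   seen as several chunks, which this sketch does not handle. */
int sum_pairs(FILE *in, long *total, int *bad_lines)
{
    char line[128];
    int good = 0;

    assert(in != NULL && total != NULL && bad_lines != NULL);
    *total = 0;
    *bad_lines = 0;

    /* fgets() decides when the input is over ... */
    while (fgets(line, sizeof line, in) != NULL) {
        int a, b;

        /* ... and sscanf() decides whether this one line parses. */
        if (sscanf(line, "%d %d", &a, &b) == 2) {
            *total += (long)a + b;
            good++;
        } else {
            (*bad_lines)++;
        }
    }
    return ferror(in) ? -1 : good;
}
```

Separating the two concerns means a malformed line costs only that line: the stream position is already past it, so the next iteration starts cleanly.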
Also, technically, there is a way to peek one byte ahead in the input, using ungetc(). That is, you could do something like this:
int c;
while ((c = getc(ptr)) != EOF) {
ungetc(c, ptr);
/* ... now read and parse the input ... */
}
The problem is that this only checks that you can read one more byte before EOF; it doesn't, and can't, actually check that your fscanf() call will have enough data to match all the requested items. Thus, you still need to check the return value of fscanf() anyway — and if you're going to do that, you might as well use it for EOF detection too.