fscanf is scanning an extra character at the end

It always prints an extra character at the end. Here is the code:
#include <stdio.h>
int main ()
{
char bit;
FILE *fp_read,*fp_write;
fp_read = fopen("test.txt","r");
if(fp_read==NULL) {
printf("Error!! Unable to open the file!!");
return 1;
}
while(!feof(fp_read)) {
fscanf(fp_read,"%c",&bit);
printf("%c",bit);
}
fclose(fp_read);
return 0;
}
If test.txt contains 010101, it prints 0101011; if it contains 00110, it prints 001100; if it contains abc, it prints abcc. That means it always repeats the last character.
What is the problem? Can anybody explain?

I am not able to reproduce the error.
Refer to David Bowling's first comment in the original post for a neat explanation.
The cppreference page for feof has a shorter version.
The feof function only reports the stream state as reported by the most recent I/O operation; it does not examine the associated data source. For example, if the most recent I/O was a fgetc, which returned the last byte of a file, feof returns zero. The next fgetc fails and changes the stream state to end-of-file. Only then feof returns non-zero.
In typical usage, input stream processing stops on any error; feof and ferror are then used to distinguish between different error conditions.
This means that testing feof in the while condition is not appropriate here. After the last character is read, feof still returns zero, so the loop runs once more; that final fscanf fails and leaves bit unchanged, and printf prints the previous character again.
Try doing this instead.
while(fscanf(fp_read,"%c",&bit) != EOF) {
printf("%c",bit);
}
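To see the off-by-one concretely, here is a minimal sketch that counts loop iterations for both patterns. The helper names count_with_feof and count_with_result are made up for this illustration and do not appear in the question.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: the pattern from the question.
 * feof() is tested before the read that would set the flag. */
static long count_with_feof(FILE *in)
{
    long iterations = 0;
    char ch = '\0';

    assert(in != NULL);
    while (!feof(in)) {
        int rc = fscanf(in, "%c", &ch); /* fails on the final pass */
        (void)rc;                       /* result ignored, as in the question */
        iterations++;
    }
    return iterations;
}

/* Hypothetical helper: loop on the result of the read itself. */
static long count_with_result(FILE *in)
{
    long iterations = 0;
    char ch;

    assert(in != NULL);
    while (fscanf(in, "%c", &ch) == 1) {
        iterations++;
    }
    return iterations;
}
```

For a three-byte file, the first loop runs four times and the second runs three times. The extra pass is the one that prints the stale character.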

Related

Why does an fread loop require an extra Ctrl+D to signal EOF with glibc?

Normally, to indicate EOF to a program attached to standard input on a Linux terminal, I need to press Ctrl+D once if I just pressed Enter, or twice otherwise. I noticed that the patch command is different, though. With it, I need to press Ctrl+D twice if I just pressed Enter, or three times otherwise. (Doing cat | patch instead doesn't have this oddity. Also, if I press Ctrl+D before typing any real input at all, it doesn't have this oddity.) Digging into patch's source code, I traced this back to the way it loops on fread. Here's a minimal program that does the same thing:
#include <stdio.h>
int main(void) {
char buf[4096];
size_t charsread;
while((charsread = fread(buf, 1, sizeof(buf), stdin)) != 0) {
printf("Read %zu bytes. EOF: %d. Error: %d.\n", charsread, feof(stdin), ferror(stdin));
}
printf("Read zero bytes. EOF: %d. Error: %d. Exiting.\n", feof(stdin), ferror(stdin));
return 0;
}
When compiling and running the above program exactly as-is, here's a timeline of events:
1. My program calls fread.
2. fread calls the read system call.
3. I type "asdf".
4. I press Enter.
5. The read system call returns 5.
6. fread calls the read system call again.
7. I press Ctrl+D.
8. The read system call returns 0.
9. fread returns 5.
10. My program prints Read 5 bytes. EOF: 1. Error: 0.
11. My program calls fread again.
12. fread calls the read system call.
13. I press Ctrl+D again.
14. The read system call returns 0.
15. fread returns 0.
16. My program prints Read zero bytes. EOF: 1. Error: 0. Exiting.
Why does this means of reading stdin have this behavior, unlike the way that every other program seems to read it? Is this a bug in patch? How should this kind of loop be written to avoid this behavior?
UPDATE: This seems to be related to libc. I originally experienced it on glibc 2.23-0ubuntu3 from Ubuntu 16.04. @Barmar noted in the comments that it doesn't happen on macOS. After hearing this, I tried compiling the same program against musl 1.1.9-1, also from Ubuntu 16.04, and it didn't have this problem. On musl, the sequence of events has steps 12 through 14 removed, which is why it doesn't have the problem, but is otherwise the same (except for the irrelevant detail of readv in place of read).
Now, the question becomes: is glibc wrong in its behavior, or is patch wrong in assuming that its libc won't have this behavior?

I've managed to confirm that this is due to an unambiguous bug in glibc versions prior to 2.28 (commit 2cc7bad). Relevant quotes from the C standard:
The byte input/output functions — those functions described in this subclause that perform
input/output: [...], fread
The byte input functions read characters from the stream as if by successive
calls to the fgetc function.
If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the fgetc function returns EOF. Otherwise, the fgetc function returns the next character from the input stream pointed to by stream.
(emphasis on "or" mine)
The following program demonstrates the bug with fgetc:
#include <stdio.h>
int main(void) {
while(fgetc(stdin) != EOF) {
puts("Read and discarded a character from stdin");
}
puts("fgetc(stdin) returned EOF");
if(!feof(stdin)) {
/* Included only for completeness. Doesn't occur in my testing. */
puts("Standard violation! After fgetc returned EOF, the end-of-file indicator wasn't set");
return 1;
}
if(fgetc(stdin) != EOF) {
/* This happens with glibc in my testing. */
puts("Standard violation! When fgetc was called with the end-of-file indicator set, it didn't return EOF");
return 1;
}
/* This happens with musl in my testing. */
puts("No standard violation detected");
return 0;
}
To demonstrate the bug:
1. Compile the program and execute it.
2. Press Ctrl+D.
3. Press Enter.
The exact bug is that if the end-of-file stream indicator is set, but the stream is not at end-of-file, glibc's fgetc will return the next character from the stream, rather than EOF as the standard requires.
Since fread is defined in terms of fgetc, this is the cause of what I originally saw. It's previously been reported as glibc bug #1190 and has been fixed since commit 2cc7bad in February 2018, which landed in glibc 2.28 in August 2018.
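As for how such a loop could be written so that it does not depend on the library's behaviour, one option is sketched below. It is an assumption of this example, not how patch is written, and read_all is a made-up name. The idea is to stop as soon as fread returns a short count, because a short count already means end-of-file or an error, so fread is never called again on a stream whose end-of-file indicator is set.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads `in` to the end and returns the byte count.
 * Sets *failed to 1 on a read error, 0 otherwise. */
static size_t read_all(FILE *in, int *failed)
{
    char buf[4096];
    size_t total = 0;

    assert(in != NULL && failed != NULL);
    for (;;) {
        size_t n = fread(buf, 1, sizeof buf, in);
        total += n;
        if (n < sizeof buf) {
            /* Short count: end-of-file or error. Do not call fread again. */
            break;
        }
    }
    *failed = ferror(in) != 0;
    return total;
}
```

In the timeline from the question, this loop would stop after the fread that returns 5, so the second fread call and the extra Ctrl+D it waits for never happen.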

EOF reading last line twice

I am pretty new to C and have a very simple function for displaying file contents here. It works fine, except the last line of my file prints twice...I know that it has to do w/EOF but I can't figure out how to get the function to recognize EOF as the last line and not run once more. I know there are a billion places on the internet with similar issues, but lots were for C++ and since I am new I thought it would be best to just use my own code. Here is the code:
{
int count=0, fileEnd=0;
FILE* rockPtr=fopen("rockact.txt", "r");
printf("\n%8s%8s%8s%8s%8s\n", "BANDID", "NAME", "SIZE", "CREW", "TRANS");
do
{
fileEnd=fscanf(rockPtr, "%d%s%d%d%s", &(tempBand.rockid), tempBand.bandname, &(tempBand.bandsize), &(tempBand.crewsize), tempBand.transport);
if (fileEnd !=EOF); //checks EOF has not been reached
{
printf("\n%8d%8s%8d%8d%8s", tempBand.rockid, tempBand.bandname, tempBand.bandsize, tempBand.crewsize, tempBand.transport);
count++;
}
}
while (fileEnd !=EOF);
fclose(rockPtr);
printf("\n The total amount of rock acts on file is %d\n", count);
}

Your if condition doesn't want the semi-colon:
if (fileEnd !=EOF); // This semicolon is wrong!
The semicolon is a null statement and is the body of the if.
I'd rather see the whole loop cast as a while loop:
while (fscanf(rockPtr, "%d%s%d%d%s", &tempBand.rockid, tempBand.bandname,
&tempBand.bandsize, &tempBand.crewsize, tempBand.transport) == 5)
{
printf("\n%8d%8s%8d%8d%8s", tempBand.rockid, tempBand.bandname,
tempBand.bandsize, tempBand.crewsize, tempBand.transport);
count++;
}
If you want to worry about it, you can spot the difference between EOF, read error and format error after the loop. Note that the check is that all values were converted OK.
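A minimal sketch of that after-the-loop check follows. It uses a two-field "id size" record as a stand-in for the five-field one above, and the names stop_reason and count_records are invented for this illustration.

```c
#include <assert.h>
#include <stdio.h>

enum stop_reason { STOP_EOF, STOP_READ_ERROR, STOP_BAD_RECORD };

/* Hypothetical helper: reads "id size" pairs and counts complete records.
 * Returns the reason the loop stopped. */
static enum stop_reason count_records(FILE *in, int *count)
{
    int id, size, rc;

    assert(in != NULL && count != NULL);
    *count = 0;
    while ((rc = fscanf(in, "%d %d", &id, &size)) == 2) {
        (*count)++;
    }
    if (ferror(in)) {
        return STOP_READ_ERROR;   /* the read itself failed */
    }
    if (rc == EOF) {
        return STOP_EOF;          /* clean end: nothing left to convert */
    }
    return STOP_BAD_RECORD;       /* malformed or truncated record */
}
```

ferror is checked first because a read error that happens before any conversion also makes fscanf return EOF.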

You have a ; after the if: remove it.
Also, check the manual for fscanf:
If a reading error happens or the end-of-file is reached while
reading, the proper indicator is set (feof or ferror). And, if either
happens before any data could be successfully read, EOF is returned.
This means that fscanf can read part of a record, then reach end-of-file or an error, and still not return EOF.
You should use the feof function to check whether the end of the file has been reached, so your logic should be:
1. Read from the file.
2. If anything was read, display it. Here you should compare the returned number with the count of arguments, not with EOF.
3. Check feof.
UPDATE: when opening or reading a file you should always check for errors too (ferror for reads), as EOF is not the only problem.

Confusion about file I/O (C)

fwrite(&studentg,sizeof(studentg),1,p);
while(!feof(p))
{
printf("flag");
fread(&studentg,sizeof(studentg),1,p);
printf("%s\t%s\t%s\t%s\t%s\t%s\t\n",studentg.name,studentg.add,studentg.tel,studentg.pc,studentg.qq,studentg.email);
}
Why does it output two identical lines when I put only one object in the file?
And if I put two objects in the file, it outputs the first one correctly but repeats the second.
I tried printing feof(p)'s return value; it shows that after fread, feof(p) still returns 0. Can anyone explain how this happens?

You won't get an end of file until you try to read beyond the file. This means that you have to check eof before the print:
fwrite(&studentg,sizeof(studentg),1,p);
finish = 0;
while(!finish)
{
printf("flag");
fread(&studentg,sizeof(studentg),1,p);
finish = feof(p);
if (!finish)
{
printf("%s\t%s\t%s\t%s\t%s\t%s\t\n",studentg.name,studentg.add,studentg.tel,studentg.pc,studentg.qq,studentg.email);
}
}
or
fwrite(&studentg,sizeof(studentg),1,p);
while(1)
{
printf("flag");
fread(&studentg,sizeof(studentg),1,p);
if (feof(p)) break;
printf("%s\t%s\t%s\t%s\t%s\t%s\t\n",studentg.name,studentg.add,studentg.tel,studentg.pc,studentg.qq,studentg.email);
}

From http://www.cplusplus.com/reference/cstdio/feof/:
"This indicator is generally set by a previous operation on the stream that attempted to read at or past the end-of-file."
This means that end of file is usually detected after an operation.
To fix your code, you may for example replace the condition in while loop with 1 or true and break execution when eof is reached (run feof inside loop).

Use of feof is one of the biggest misconceptions among beginners in file I/O. Everybody has made the same mistake once or twice at some point.
The way you have used it is Pascal's way, but the C way is different. The difference is:
Pascal's function returns true if the next read will fail because of end of file.
C's function returns true if the last read failed because of end of file.
That's why your code prints the last line twice: after the last line is read in and printed out, feof() will still return 0 (false) and the loop will continue. The next fgets() fails, so the line variable holding the contents of the last line is not changed and is printed out again. After this, feof() will return true (since fgets() failed) and the loop ends.
The correct way to use it is:
while( 1 ) {
fgets(line, sizeof(line), fp);
if ( feof(fp) ) /* check for EOF right after fgets() */
break;
fputs(line, stdout);
}
A still better way:
while( fgets(line, sizeof(line), fp) != NULL )
fputs(line, stdout);
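The same idea carries over to the fread loop in the question: test what fread returns rather than calling feof. The sketch below assumes a simplified record type; struct record and print_records are made-up names standing in for studentg and the original loop.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record type standing in for studentg. */
struct record {
    char name[16];
    char tel[16];
};

/* Hypothetical helper: prints every record in fp, returns how many it read. */
static int print_records(FILE *fp)
{
    struct record r;
    int count = 0;

    assert(fp != NULL);
    /* fread returns the number of complete items read: 1 or 0 here. */
    while (fread(&r, sizeof r, 1, fp) == 1) {
        printf("%s\t%s\n", r.name, r.tel);
        count++;
    }
    return count;
}
```

Because the body only runs when a whole record was read, the last record is never printed a second time.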

First of all, you should include a complete, reproducible example of what you want to do, not a combined fragment of the code, which is hard to reproduce. Otherwise, note that using fwrite()/fread() on struct contents directly is not portable (see the free online book Porting UNIX Software), and is prone to errors. But you didn't provide enough context for us to understand what went wrong.

Useful context of scanf(...) != EOF

Soo... I saw a guy claim this code was working on another question.
while(scanf("%X", &hex) != EOF) {
//perform a task with the hex value.
}
So, in what context does the EOF flag get thrown? I thought it would just keep asking for a number indefinitely. I added another line of code to test it, and it does exactly what I expected it to.....
This isn't a file, this seems to be stdin. So.... WHEN is this code useful?
Ie, in what context is the EOF return thrown?

If you look at the documentation for scanf, you will read that the value EOF is returned if a read failure occurred before the first value was assigned. (ie end of file)
http://en.cppreference.com/w/cpp/io/c/fscanf
You could equally test:
while(scanf("%X", &hex) == 1)
This is my preference. I expect one input, so I will be explicit.

Realistically speaking, this input is good on linux because ^d will end the stream, thus throwing the 'error.'
On windows, this behavior is different... whatever it is is not ctrl+d. At least I know now though, since I use both.
Thanks!

EOF is returned on I/O error and end-of-file. With stdin, an I/O error is a rare event, and with keyboard input the end-of-file indication usually takes a special key sequence.
A practical use occurs with redirected input.
Assume a program exists that reads hexadecimal text and prints out decimal text:
// hex2dec.c
#include <stdio.h>
int main(void) {
unsigned hex;
int cnt;
while((cnt = scanf("%X", &hex)) == 1) {
printf("%u\n", hex);
}
// At this point, `cnt` should be 0 or EOF
if (cnt != EOF) {
puts("Invalid hexadecimal sequence found.");
return 1;
}
return 0;
}
// hex.txt contents:
abc
123
Conversion occurs with the command
hex2dec < hex.txt
2748
291
By detecting EOF on the stdin, the program knows when to return.

How feof() works in C

Does feof() check for EOF at the current position of the file pointer, or at the position after it?
Thanks for your help !

Every FILE stream has an internal flag that indicates whether the caller has tried to read past the end of the file already. feof returns that flag. The flag does not indicate whether the current file position is at the end of the file, only whether a previous read has tried to read past the end of the file.
As an example, let's walk through what happens, when reading through a file containing two bytes.
f = fopen(filename, "r"); // file is opened
assert(!feof(f)); // eof flag is not set
c1 = getc(f); // read first byte, one byte remaining
assert(!feof(f)); // eof flag is not set
c2 = getc(f); // read second byte, no bytes remaining
assert(!feof(f)); // eof flag is not set
c3 = getc(f); // try to read past end of the file
assert(feof(f)); // now, eof flag is set
This is why the following is the wrong way to use feof when reading through a file:
f = fopen(filename, "r");
while (!feof(f)) {
c = getc(f);
putchar(c);
}
Because of the way feof works, the end-of-file flag is only set once getc
tries to read past the end of the file. getc will then return EOF, which is
not a character, and the loop construction causes putchar to try to write it
out, resulting in an error or garbage output.
Every C standard library input method returns an indication of success or
failure: getc returns the special value EOF if it tried to read past the
end of the file, or if there was an error while reading. The special value is
the same for end-of-file and error, and this is where the proper way to use
feof comes in: you can use it to distinguish between end-of-file and error
situations.
f = fopen(filename, "r");
c = getc(f);
if (c == EOF) {
if (feof(f))
printf("it was end-of-file\n");
else
printf("it was error\n");
}
There is another internal flag for FILE objects for error situations:
ferror. It is often clearer to test for errors instead of "not end of file".
An idiomatic way to read through a file in C is like this:
f = fopen(filename, "r");
while ((c = getc(f)) != EOF) {
putchar(c);
}
if (ferror(f)) {
perror(filename);
exit(EXIT_FAILURE);
}
fclose(f);
(Some error checking has been elided from examples here, for brevity.)
The feof function is fairly rarely useful.

You can get a better understanding of how feof works, by knowing how it's implemented. Here is a simplified version of how the 7th Edition Unix stdio library implements feof. Modern libraries are very similar, adding code offering thread-safety, increased efficiency, and a cleaner implementation.
extern struct _iobuf {
char *_ptr;
int _cnt;
char *_base;
char _flag;
char _file;
} _iob[_NFILE];
#define _IOEOF 020
#define feof(p) (((p)->_flag&_IOEOF)!=0)
#define getc(p) (--(p)->_cnt>=0? *(p)->_ptr++&0377:_filbuf(p))
int
_filbuf(FILE *iop)
{
iop->_ptr = iop->_base;
iop->_cnt = read(fileno(iop), iop->_ptr, BUFSIZ);
if (iop->_cnt == 0) {
iop->_flag |= _IOEOF;
return(EOF);
}
return(*iop->_ptr++ & 0377);
}
The stdio library maintains with each file a structure containing an internal buffer pointed to by _base. The current character in the buffer is pointed to by _ptr and the number of characters available is contained in _cnt. The getc macro, which is the base for a lot of higher-level functionality, like scanf, tries to return a character from the buffer. If the buffer is empty, it will call _filbuf to fill it. _filbuf in turn will call read. If read returns 0, which means that no more data is available, _filbuf will set the _IOEOF flag, which feof checks each time you call it to return true.
As you can understand from the above, feof will return true the first time you try to read a character past the end of the file (or a library function tries on your behalf). This has subtle implications for the behavior of various functions. Consider a file containing a single character: the digit 1. After you read that character with getc, feof will return false, because the _IOEOF flag is unset; nobody has yet tried to read past the end of the file. Calling getc again will result in a call to read, the setting of the _IOEOF flag, and this will cause feof to return true. However, after reading the number from the same file using fscanf(fp, "%d", &n), feof will immediately return true, because fscanf will have tried to read additional digits of the integer.
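The single-digit example can be checked with a small sketch like the following. The helper names eof_after_getc and eof_after_scanf are invented for this illustration, and it is written against a modern stdio rather than the 7th Edition code above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads one character with getc, then reports feof. */
static int eof_after_getc(FILE *f)
{
    assert(f != NULL);
    (void)getc(f);
    return feof(f) != 0;
}

/* Hypothetical helper: reads one integer with fscanf, then reports feof.
 * Returns -1 if no integer could be read. */
static int eof_after_scanf(FILE *f, int *n)
{
    assert(f != NULL && n != NULL);
    if (fscanf(f, "%d", n) != 1) {
        return -1;
    }
    return feof(f) != 0;
}
```

On a file containing only the digit 1, the first helper returns 0 and the second returns 1, even though both consumed exactly the same byte.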
