Why is an extra 'ÿ' added when writing to a .txt file in C? [duplicate]

This question already has answers here:
Why is “while( !feof(file) )” always wrong?
(5 answers)
Closed 11 months ago.
I searched about this problem everywhere, but none of the suggested solutions worked for me.
char currentChar;
FILE *fp_read = fopen("../input.txt", "r");
FILE *fp_write = fopen("../textArranged.txt", "w");
while (!feof(fp_read)){
    currentChar = fgetc(fp_read);
    ...
}
I tried to change the while condition (using getc()), but it didn't work.

feof() returns 0 immediately after the last byte of the file has been read. It returns a nonzero value only after fgetc() has already attempted to read one more byte past the end of the file.
When fgetc() attempts to read past the end of the file, it returns EOF, which is -1 on most implementations.
fputc(x, ...) converts x to unsigned char before writing, so if x is not in the range 0...255, the byte actually written is (x & 0xFF), assuming 8-bit bytes.
On nearly all modern computers, (-1 & 0xFF) is 0xFF, which is the character 'ÿ' in ISO 8859-1 (Latin-1) and Windows-1252.
So the following happens:
...
Your program reads the last byte of the file using fgetc().
It writes that character using fputc().
Although there are no more bytes left in the file, feof() still returns 0 because you have not yet attempted to read past the end of the file.
Your program calls fgetc() and, because there are no more bytes left, fgetc() returns EOF (-1).
Your program calls fputc(-1, ...), which writes the character 'ÿ'.
feof() now returns a nonzero value because fgetc() has tried to read past the end of the file.
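The sequence above goes away if the loop tests the return value of fgetc() directly instead of calling feof() first. The following is a minimal sketch of that idea, not the asker's full program; the helper name copy_stream and its return convention are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: copies every byte from `in` to `out`.
 * Returns the number of bytes copied, or -1 on a read or write error. */
long copy_stream(FILE *in, FILE *out)
{
    int c;          /* int, not char: it must hold EOF as well as 0..255 */
    long count = 0;

    /* The read itself is the loop condition, so EOF is never written. */
    while ((c = fgetc(in)) != EOF) {
        if (fputc(c, out) == EOF)
            return -1;          /* write error */
        count++;
    }
    /* EOF can mean end-of-file or a read error; ferror() tells them apart. */
    return ferror(in) ? -1 : count;
}
```

With the streams from the question, this would be called as copy_stream(fp_read, fp_write) after checking that both fopen() calls succeeded.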

Related

Why can't I access EOF using fseek()? [duplicate]

#include <stdio.h>
int main()
{
    FILE * fp = fopen("Introduce.txt","rt");
    fseek(fp,0,SEEK_END);
    int i = feof(fp);
    printf("%d",i);
    fseek(fp,1,SEEK_END);
    i = feof(fp);
    printf("%d",i);
    fseek(fp,-1,SEEK_END);
    i = feof(fp);
    printf("%d",i);
    return 0;
}
I tried to reach EOF by moving the file position indicator to the end of the file.
But the output of this code is "000".
Why does this happen?
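The feof() function doesn't report that it is at EOF until you try to read some data and there is no data to read. fseek() only moves the file position indicator; a successful fseek() call actually clears the end-of-file indicator.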
You can seek beyond the current EOF of a file that's open for writing (or that's open for reading and writing).
See while (!feof(file)) is always wrong for more information on why you seldom if ever need to use feof(). In some ways, feof() is a function you should forget about — most of your programs will improve if you assume it doesn't exist.
This documentation on feof is very clear (emphasis mine):
This function only reports the stream state as reported by the most
recent I/O operation, it does not examine the associated data source.
For example, if the most recent I/O was a fgetc, which returned the
last byte of a file, feof returns zero. The next fgetc fails and
changes the stream state to end-of-file. Only then feof returns
non-zero.
In typical usage, input stream processing stops on any error; feof and
ferror are then used to distinguish between different error
conditions.
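The behaviour described in that documentation can be checked with a small experiment. This is only a sketch: the helper name eof_after_seek_and_read and its out-parameters are made up for illustration, and the feof() results are normalized to 0 or 1 because the standard only promises zero versus nonzero.

```c
#include <stdio.h>

/* Hypothetical helper: seeks to the end of `fp`, then records feof()
 * before and after one read attempt (each normalized to 0 or 1).
 * Returns the result of that read, or EOF if the seek itself failed. */
int eof_after_seek_and_read(FILE *fp, int *before, int *after)
{
    *before = *after = 0;
    if (fseek(fp, 0, SEEK_END) != 0)
        return EOF;

    *before = feof(fp) != 0;    /* seeking does not set the indicator */
    int c = fgetc(fp);          /* the failed read is what sets it */
    *after = feof(fp) != 0;
    return c;
}
```

The read attempt, not the file position, is what makes feof() report end-of-file.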

fgetc reads character with value = -1

I use the fgetc() function to read characters from a text file on Ubuntu.
The last character before EOF has the code -1.
What is that?
In a text editor the file looks fine, with no strange symbols at the end.
while (!feof(fp))
{
    c = fgetc(fp);
    printf("%c %i\n", c, c);
}
feof is meant to signal that you've tried to read past the end of file - which means that you first have to reach it. So it will only work after you try to read and the system realizes you're at the end. And what does fgetc return if you try to read past the end of file? EOF (conveniently, -1 - which is why fgetc returns an int instead of a char).
So what's happening is that you enter the loop one last time, because you haven't yet tried to read past the end, and call fgetc, which returns -1 because that call is the attempt to read past the end of the file. The next time around the loop, feof tells you that you've already hit the end of the file and tried to read past it, and you break out.
You should read the documentation of functions you intend to use: feof and fgetc documentation explain this. But even if they did not, a simple google search would have answered your question: Why is “while ( !feof (file) )” always wrong?.
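To illustrate why the result of fgetc has to be kept in an int, here is a hedged sketch of the same printing loop with the read used as the loop condition. The helper name dump_chars is an assumption, not something from the question. Because c is an int, a real 0xFF byte in the file comes back as 255 and is not mistaken for EOF.

```c
#include <stdio.h>

/* Hypothetical helper: prints every byte of `fp` as "<char> <code>"
 * and returns how many bytes were read. */
long dump_chars(FILE *fp)
{
    int c;      /* int keeps EOF (negative) distinct from the byte 0xFF (255) */
    long n = 0;

    while ((c = fgetc(fp)) != EOF) {
        printf("%c %i\n", c, c);
        n++;
    }
    return n;
}
```

If c were declared as a plain char, the comparison with EOF could either never succeed or succeed too early on a 0xFF byte, depending on whether char is signed on the platform.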

What characters are there past end of file?

I'm relatively new to C, my question is:
Is it ALWAYS true that there are only EOF chars past the end of a file?
Example code:
FILE *fr;
int i;
fr=fopen("file.txt","r");
for (i=0;i<20;i++) {
    putchar(getc(fr));
}
fclose(fr);
Output:
user#host:~$ ./a.out | xxd
0000000: 6173 640a ffff ffff ffff asd.......
(file.txt contains asd\n)
Answer: there aren't any characters beyond the end of a file. My MSVC manual page here says that if you read past the end of the file, getc() returns EOF.
It does not matter how many times you try to make getc() read past the end of the file, it won't. It just keeps returning EOF.
The EOF is not part of the file marking its end - it is a flag value returned by getc() to tell you there is no more data.
EDIT included a sample to show the behaviour of feof(). Note, I made separate printf() statements, rather than merging them into a single statement, because it is important to be clear what order the functions feof() and getc() are called.
Note that feof() does not return a non-0 value until after getc() returned EOF.
#include <stdio.h>
int main( void )
{
    FILE *fr;
    int i;
    fr=fopen("file.txt","r");
    for (i=0;i<6;i++) {
        printf("feof=%04X, ", feof(fr));
        printf("getc=%04X\n", getc(fr));
    }
    fclose(fr);
}
Program input file:
abc\n
Program output:
feof=0000, getc=0061
feof=0000, getc=0062
feof=0000, getc=0063
feof=0000, getc=000A
feof=0000, getc=FFFFFFFF
feof=0010, getc=FFFFFFFF
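So you can't use feof() to predict that the end of the file is about to be reached. A nonzero result only tells you that a previous read attempt already failed because the end of the file had been reached. The particular nonzero value (0x10 in this output) is implementation-specific; only zero versus nonzero is meaningful.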
There are no EOF characters in a file, nor any characters after the end of a file (it's the end of the file, after all). Rather, EOF is a special value used by getc (and others) to indicate that there isn't anything to read. You can use feof and ferror to see whether that EOF was caused by reaching the end of the file, or if an error occurred.
What you are seeing are the EOF values (cast to an unsigned char) that getc returned after reaching the end of the file.
Normally, there aren't "EOF chars" in the file to mark the end. EOF is just an integer value, that does not correspond to a valid char value, that is returned by some functions when there's nothing left in the file.
In your example, you see the ff values after the contents of the file because when getc() returns EOF, indicating there's nothing left to read, you're displaying it as a char... effectively displaying the char corresponding to the low bits of the EOF value and ignoring the high bits. If you read the file in a different way, you might not see that result.
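As a rough sketch of the feof/ferror distinction mentioned in the answers above, the function below reads a stream to the end and then reports why getc() stopped. The name read_all and its out-parameters are illustrative assumptions.

```c
#include <stdio.h>

/* Hypothetical helper: consumes `fp` until getc() returns EOF.
 * Returns the number of bytes read; *at_eof and *had_error are set
 * to 1 or 0 to say why reading stopped. */
long read_all(FILE *fp, int *at_eof, int *had_error)
{
    long n = 0;

    while (getc(fp) != EOF)
        n++;

    /* EOF is only a return value; these indicators explain its cause. */
    *at_eof = feof(fp) != 0;
    *had_error = ferror(fp) != 0;
    return n;
}
```

Calling getc() again after this returns EOF every time; no further "characters" appear, which matches the run of ff bytes in the question's output.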

What is EOF and what is its significance? How can it be noticed? [duplicate]

This question already has answers here:
What is EOF in the C programming language?
(10 answers)
Closed 7 years ago.
While studying the getchar() function in C, I came across EOF being returned. I want to know how its existence can be noticed and where it is stored.
Can we type the EOF character explicitly?
EOF is short for End of File. It's not an actual character, but more like a signal that indicates the end of input stream.
Think about getchar(), it's used to get a character from stdin (the standard input). How could it tell when the stdin stream has come to the end? There must be a special return value which is different from any valid characters. EOF plays this role.
To generate EOF for stdin, type Ctrl + D at the start of a line on Unix systems, or Ctrl + Z followed by Enter on Windows.
EOF is the named constant used to signal end of file when reading from a stdio input stream. If you look at the getchar() prototype, the first strange thing you'll notice is that it returns not a char value but an int. EOF is a negative integer value (historically -1), so it cannot collide with any character that comes from the keyboard or an input file.
EOF is definitely not a character; it is the int value that getchar(3) returns on an end-of-file condition. This is the reason getchar(3) returns an int instead of a char, and also the reason EOF is always a negative value.
With 8-bit bytes, getchar(3) returns one of 257 possible values: 0 up to 255, and EOF (which is normally -1). Viewed as integer values, -1 is not the same as 255. It's one of the oldest functions in C (it has been there since the first edition of "The C Programming Language" by K&R).
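One way to see the "257 values" point in practice is a byte histogram: every non-EOF result can be used directly as an array index, while EOF falls outside that range. This is a sketch under that assumption; the helper name byte_histogram is hypothetical, and it uses getc() on a FILE * rather than getchar() so it can be pointed at any stream, including stdin.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: counts how often each byte value occurs in `fp`.
 * `counts` must have UCHAR_MAX + 1 elements. Returns the total byte count. */
unsigned long byte_histogram(FILE *fp, unsigned long counts[UCHAR_MAX + 1])
{
    int c;
    unsigned long total = 0;

    for (int i = 0; i <= UCHAR_MAX; i++)
        counts[i] = 0;

    while ((c = getc(fp)) != EOF) {
        counts[c]++;    /* safe: every non-EOF result is in 0..UCHAR_MAX */
        total++;
    }
    return total;
}
```

Passing stdin as fp gives the getchar() behaviour: the loop ends when the terminal or pipe signals end of input.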
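EOF is the abbreviation for End-Of-File. It's the special value that the reading functions return to indicate that you have reached the end of the file stream you're reading.
Normally, people check whether they have reached the end of the file by testing the return value of the read call itself:
// line is assumed to be a char array declared elsewhere
while (fgets(line, sizeof line, fileStream) != NULL) {
    // one line has been read; do your stuff here.
    ...
}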

Reading from a binary file using fread() displays additional characters [duplicate]

My question is about the fread() function, which is confusing me at the moment. I create a binary file and write the values 1, 2 and 3 to it. When I then read the file back using fread(), it prints 1233 instead of 123.
#include <stdio.h>
#include <stdlib.h>
main ()
{
    int x=1,y=2,z=3,i,j;
    FILE *f;
    f=fopen("Werid.bin","wb");
    fwrite(&x,sizeof(int),1,f);
    fwrite(&y,sizeof(int),1,f);
    fwrite(&z,sizeof(int),1,f);
    fclose(f);
    f=fopen("Werid.bin","rb");
    if (!f) perror("X");
    while(!feof(f))
    {
        fread(&j,sizeof(int),1,f);
        printf("%d",j);
    }
    fclose(f);
}
Why?
Change this
while(!feof(f))
to
while(fread(&j,sizeof(int),1,f) == 1)
From the Linux feof() manual:
The function feof() tests the end-of-file indicator for the stream pointed to by stream, returning nonzero if it is set. The end-of-file indicator can only be cleared by the function clearerr().
feof() returns true only after a call to fread() has hit the end of the file; that is, after you read the last number, fread() must be called once more before the end-of-file indicator is set.
So the loop body is executed one more time after the last successful read. That final fread() reads nothing and returns 0, leaving j unchanged, so the previous value 3 is printed again.
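Putting the suggested loop condition into a reusable form might look like the sketch below. The helper name read_ints, the max parameter, and the return convention are assumptions for illustration; the essential part is that the fread() return value, not feof(), decides whether a value is used.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to `max` ints from `f` into `out`.
 * Returns the number of ints read, or -1 if a read error occurred. */
long read_ints(FILE *f, int *out, long max)
{
    long n = 0;
    int j;

    /* j is stored only when fread() reports one complete item. */
    while (n < max && fread(&j, sizeof j, 1, f) == 1)
        out[n++] = j;

    return ferror(f) ? -1 : n;
}
```

For the file from the question, this would return 3 and fill the array with 1, 2 and 3, with no repeated last value.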
