Understanding fgetc program - c

I'm reading a book about C programming and don't understand one of its examples. More precisely, I don't understand why it works, because I would think it shouldn't.
The code is simple: it reads the content of a text file and prints it to the output. As far as I understand it, I would think that the
ch = fgetc(stream);
ought to be inside the while loop, because it reads only one int at a time and needs to read the next int after the current one has been output. Well, it turns out that this code indeed works fine, so I hope someone can explain my fallacy to me. Thanks!
#include <stdio.h>
int main(int argc, char *argv[]) {
FILE *stream;
char filename[67];
int ch;
printf("Please enter the filename?\n");
gets(filename);
if((stream = fopen(filename, "r")) == NULL) {
printf("Error opening the file\n");
exit(1);
}
ch = fgetc(stream);
while (!feof(stream)) {
putchar(ch);
ch = fgetc(stream);
}
fclose(stream);
}

I think you are confused because of feof():
Doc: int feof ( FILE * stream );
Checks whether the end-of-File indicator associated with stream is
set, returning a value different from zero if it is.
This indicator is generally set by a previous operation on the stream
that attempted to read at or past the end-of-file.
ch = fgetc(stream);        /* read the first character from the file */
while (!feof(stream)) {    /* check whether the last fgetc() call hit end-of-file */
putchar(ch);               /* output the last character read, which was not EOF */
ch = fgetc(stream);        /* read the next character from the file */
}
/* control reaches here once EOF has been found */
A much better way is to write your loop like this:
while ((ch = fgetc(stream)) != EOF) {   /* read until EOF is found */
putchar(ch);                            /* inside the loop, print a character that is not EOF */
}
Additionally, note: int fgetc ( FILE * stream );
Returns the character currently pointed by the internal file position
indicator of the specified stream. The internal file position
indicator is then advanced to the next character.
If the stream is at the end-of-file when called, the function returns
EOF and sets the end-of-file indicator for the stream (feof).
If a read error occurs, the function returns EOF and sets the error
indicator for the stream (ferror).
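To see the timing of the end-of-file indicator for yourself, here is a minimal sketch of the same loop shape that copies into a buffer instead of printing. The helper name copy_with_feof_loop and its cap parameter are made up for illustration; cap must be at least 1.

```c
#include <stdio.h>

/* Hypothetical helper: same loop shape as in the question, but the
   characters are stored in `out` (capacity `cap`, at least 1) instead of
   being printed. Returns the number of characters stored. */
size_t copy_with_feof_loop(FILE *stream, char *out, size_t cap)
{
    size_t n = 0;
    int ch = fgetc(stream);               /* priming read before the loop */

    /* feof() turns true only after an fgetc() call has already failed at
       the end of the file, so the last real character is still stored.
       The cap check also bounds the loop if a read error occurs. */
    while (!feof(stream) && n + 1 < cap) {
        out[n++] = (char)ch;
        ch = fgetc(stream);               /* read the next character */
    }
    out[n] = '\0';
    return n;
}
```

Called on a stream containing "abc", this stores all three characters, and feof(stream) is nonzero only once the loop has ended.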

If the fgetc outside while is removed, like this:
while (!feof(stream)) {
putchar(ch);
ch = fgetc(stream);
}
ch will be uninitialized the first time putchar(ch) is called.
By the way, don't use gets, because it can cause a buffer overflow. Use fgets instead (or gets_s where the optional Annex K functions are available). gets was removed in C11.
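As a rough sketch of the fgets replacement, something like the following could read the filename. The helper name read_line is my own invention, not a standard function; it strips the trailing newline that fgets keeps.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into `buf` (capacity `cap`)
   and removes the trailing newline, if there is one.
   Returns buf, or NULL on end of file, read error, or a too-small buffer. */
char *read_line(FILE *in, char *buf, int cap)
{
    if (cap < 2 || fgets(buf, cap, in) == NULL)
        return NULL;

    /* strcspn gives the index of the first '\n', or the string length
       if the line was too long to fit and no newline was stored. */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

In the question's program, the gets(filename) call would then become read_line(stdin, filename, sizeof filename), with a check for a NULL result.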

The code you have provided has 'ch = fgetc(stream);' before the while loop and also
'ch = fgetc(stream);' within the body of the loop.
The statement within the loop retrieves the characters from the stream one at a time, as you correctly state.

It is both inside and outside, as you can see. The one outside reads the first character (which may already be the end of file, in which case the while loop isn't entered and nothing is printed). Then the loop is entered, the character is output, and the next one is read. As long as the character read is not the end of file, the loop continues.

This works because of the second fgetc, which is called on every iteration for as long as while (!feof(stream)) holds.

fgetc() reads one char (byte) and returns it. Which byte is read depends on where the file's read position currently is.
Once fgetc() has successfully read one byte, the read position moves to the next byte. So each further read returns the next byte, and this continues until the end of the file is reached, where fgetc() returns EOF.

Actually this part here:
while (!feof(stream)) {
putchar(ch);
ch = fgetc(stream);
}
is pretty unsafe, and you should avoid checking for EOF like that.
The way you should read a file using fgetc is like so:
int ch;
while ((ch = fgetc(stream)) != EOF)
{
printf("%c", ch);
}
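One thing the EOF-checking loop does not tell you by itself is why it stopped, since fgetc returns EOF both at end of file and on a read error. A minimal sketch of telling the two apart with ferror follows; the function name copy_stream is just an assumption for the example.

```c
#include <stdio.h>

/* Hypothetical helper: copies every character from `in` to `out`.
   Returns 0 if the loop ended at end of file, -1 on a read or write error. */
int copy_stream(FILE *in, FILE *out)
{
    int ch;                               /* int, so it can hold EOF */

    while ((ch = fgetc(in)) != EOF) {
        if (fputc(ch, out) == EOF)
            return -1;                    /* write error */
    }

    /* fgetc returned EOF: ferror tells a read error from end of file. */
    return ferror(in) ? -1 : 0;
}
```

For the question's program, this would be called as copy_stream(stream, stdout).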

This code works, but only because of how feof behaves. Reading the last character with fgetc does not set the end-of-file indicator; the indicator is set only when a later fgetc tries to read past the end and returns EOF. So the last character is still output, and the loop exits after the failed read.
feof does not prevent reading past the end of the file: for an empty file, fgetc is called before feof is ever checked!
Unless there is some benefit in console handling, two better options exist:
Using feof:
while (!feof(stream)) {
ch = fgetc(stream);
if (ch == EOF) break; /* do not output the EOF value itself */
putchar(ch);
}
Without using feof, because fgetc returns EOF when there are no more characters:
while ((ch = fgetc(stream)) != EOF) putchar(ch);

Related

how ftell() function works?

I have this code and I don't understand how it works:
void print(char * fileName)
{
FILE * fp;
int ch;
fp = fopen(fileName, "r");
while (ftell(fp) < 20)
{
ch = fgetc(fp);
putchar(ch);
}
fclose(fp);
}
So how does ftell(fp) work inside the loop?
Nothing inside the loop seems to increase it.
How does it advance?
ftell() returns the current value of the position indicator of the stream (in your case, it basically returns the position of the character the stream currently points to).
fgetc() gets the next character (an unsigned char) from the specified stream and advances the position indicator for the stream. It returns the character read as an unsigned char converted to an int, or EOF on end of file or error.
Flow of your program
In very simple terms:
fgetc() reads one character after another from the file, each time advancing the position to the next character.
ftell() returns the current position in bytes from the beginning of the file. This tells you the position of the character currently pointed to (since 1 char takes 1 byte).
So your program keeps reading from the file for as long as ftell() returns a position less than 20. This means that it keeps looping until 20 characters have been read from your file.
Hope this clears up your doubt!
ftell returns the current value of the file position indicator, and fgetc does advance the file position indicator within the loop.
But this program is wrong. For a stream opened in text mode ("r"), the return value of ftell cannot be used portably for anything else except for seeking to a previous position. From C11 draft n1570 7.21.9.4p2
[...] For a text stream, its file position indicator contains unspecified information, usable by the fseek function for returning the file position indicator for the stream to its position at the time of the ftell call; the difference between two such return values is not necessarily a meaningful measure of the number of characters written or read.
Indeed it doesn't make any sense to use ftell in this program. Either open the file in binary mode, "rb", and then it is guaranteed that
[...] the value is the number of characters from the beginning of the file.
or for counting characters read from text file, use a counter variable:
int c_read = 0;
while (c_read < 20)
{
ch = fgetc(fp);
putchar(ch);
c_read ++;
}
Finally, neither your original version nor mine works correctly if the file has fewer than 20 characters. In that case fgetc returns EOF, and putchar would write (unsigned char)EOF to the stream (most likely a byte of value 255!).
Thus the correct code would be
int c_read = 0;
while (c_read < 20)
{
ch = fgetc(fp);
if (ch == EOF) {
// report the error
perror("Failed to read 20 characters");
break;
}
putchar(ch);
c_read ++;
}
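For the binary-mode alternative, here is a small sketch of a loop that is driven by ftell but still stops at EOF. It assumes the stream was opened in binary mode ("rb"), and the name read_up_to is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: reads bytes from a stream opened in BINARY mode
   until the file position reaches `limit` or the data runs out.
   Returns the file position afterwards. */
long read_up_to(FILE *fp, long limit)
{
    /* In binary mode, ftell is the number of bytes from the start of the
       file, and each successful fgetc advances it by one. */
    while (ftell(fp) < limit) {
        if (fgetc(fp) == EOF)
            break;                        /* fewer than `limit` bytes available */
    }
    return ftell(fp);
}
```

With a 30-byte file this returns 20; with a 5-byte file it stops early and returns 5.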

The strange behavior of 'read' system function

The program I tried to write should read a string no longer than 8 characters and check whether that string is present in the file. I decided to use the 'read' system function for this, but I ran into strange behavior. According to the manual, it must return 0 when the end of file is reached, but in my case, when there were no more characters to read, it still read a '\n' and returned 1 (the number of bytes read). (I checked the ASCII code of the character read, and it is actually 10, which is '\n'.) Considering this, I changed my code and it worked, but I still can't understand why it behaves this way. Here is the code of my function:
int is_present(int fd, char *string)
{
int i;
char ch, buf[9];
if (!read(fd, &ch, 1)) //file is empty
return 0;
while (1) {
i = 0;
while (ch != '\n') {
buf[i++] = ch;
read(fd, &ch, 1);
}
buf[i] = '\0';
if (!strncmp(string, buf, strlen(buf))) {
close(fd);
return 1;
}
if(!read(fd, &ch, 1)) //EOF reached
break;
}
close(fd);
return 0;
}
I think that your problem is in the inner read() call, where you are not checking the return value of the function.
while (ch != '\n') {
buf[i++] = ch;
read(fd, &ch, 1);
}
If read() reaches end of file there while ch is not '\n' (for example, when the last line has no trailing newline), the loop never terminates, because read() does not modify ch once it returns 0. BTW, you are not checking the bounds of buf.
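A possible shape for a safer inner loop, checking both the return value of read() and the bounds of the buffer, is sketched here. The name read_line_fd and its return convention are my own assumptions, and it does not retry after a signal interruption.

```c
#include <unistd.h>

/* Hypothetical helper: reads one line from `fd` into `buf` (capacity `cap`,
   at least 1), one byte at a time. Stops at '\n', end of file, or error.
   Characters that do not fit are dropped.
   Returns the number of characters stored, or -1 if nothing could be read. */
int read_line_fd(int fd, char *buf, int cap)
{
    int i = 0;
    char ch;
    ssize_t n;

    /* The loop ends as soon as read() does not deliver exactly one byte. */
    while ((n = read(fd, &ch, 1)) == 1 && ch != '\n') {
        if (i < cap - 1)
            buf[i++] = ch;                /* bounds check on buf */
    }
    buf[i] = '\0';

    if (n != 1 && i == 0)
        return -1;                        /* end of file or error, no data */
    return i;
}
```

An empty line gives 0, while end of file with no pending data gives -1, so the caller can tell the two apart.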
I'm assuming the question is 'why does read() work this way' and not 'what is wrong with my program?'.
This is not an error. From the manual page:
On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this number. It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a terminal), or because read() was interrupted by a signal. On error, -1 is returned, and errno is set appropriately. In this case it is left unspecified whether the file position (if any) changes.
If you think about it read must work this way. If it returned 0 to indicate an end of file was reached when some data had been read, you would have no idea how much data had been read. Therefore read returns 0 only when no data is read because of an end-of-file condition.
Therefore in this case, where there is only a \n available, read() will succeed and return 1. The next read will return a zero to indicate end of file.
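This can be reproduced without a file at all. The sketch below assumes a POSIX system and uses a pipe holding a single '\n'; the function name is made up for the example.

```c
#include <unistd.h>

/* Hypothetical demo: puts a single '\n' into a pipe, closes the write end,
   then calls read() twice and stores both return values.
   Returns 0 on success, -1 if the pipe could not be set up. */
int demo_read_newline_then_eof(ssize_t *first, ssize_t *second)
{
    int fds[2];
    char ch;

    if (pipe(fds) == -1)
        return -1;
    if (write(fds[1], "\n", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);                        /* no more data will arrive */

    *first = read(fds[0], &ch, 1);        /* the '\n' is ordinary data */
    *second = read(fds[0], &ch, 1);       /* nothing left: end of file */
    close(fds[0]);
    return 0;
}
```

The first call reports one byte read and the second reports 0, which matches the behavior described in the manual page.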
Unless it reaches end of file, read() keeps reading characters and placing them in the buffer. Here, \n is also just a character, so it is read as well. Your loop would have stopped after reading the \n, as there was nothing left after it but the end of file. So only end of file stops read(); every other character is treated as ordinary data. Cheers!

Elegant way to determine EOF?

I am reading from a text file, iterating with a while(!feof) loop,
but whenever I use this condition the loop iterates an extra time.
I solved the problem with this 'patchy' code
while (stop == FALSE)
{
...
terminator = fgetc(input);
if (terminator == EOF)
stop = TRUE;
else
fseek(input, -1, SEEK_CUR);
}
But it looks and feels very bad.
You can take advantage of the fact that an assignment gets evaluated as the value being assigned, in this case to the character being read:
while((terminator = fgetc(input))!= EOF) {
// ...
}
Here is an idiomatic example (source):
fp = fopen("datafile.txt", "r"); // error check this!
// this while-statement assigns into c, and then checks against EOF:
while((c = fgetc(fp)) != EOF) {
/* ... */
}
fclose(fp);
Similarly, you can read line by line:
char buf[MAXLINE];
// ...
while((fgets(buf,MAXLINE,stdin)) != NULL) {
do_something(buf);
}
Since fgets copies the detected newline character, you can detect end of line by checking the last character stored before the terminating null. You can use realloc to resize the buffer (be sure you keep a pointer to the beginning of the buffer, but pass buf+n to the next fgets, where n is the number of characters read so far). From the documentation for fgets:
Reads characters from stream and stores them as a C string into str until (num-1) characters have been read or either a newline or the end-of-file is reached, whichever happens first. A newline character makes fgets stop reading, but it is considered a valid character by the function and included in the string copied to str.
Alternatively, you could read the whole file in one go using fread().
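To make the newline check concrete, here is a small sketch that counts lines with a deliberately tiny buffer, so long lines arrive in several fgets calls. Both function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `buf`, as filled by fgets, ends with
   a newline (a complete line was read), 0 otherwise. */
int has_full_line(const char *buf)
{
    size_t len = strlen(buf);
    return len > 0 && buf[len - 1] == '\n';
}

/* Hypothetical helper: counts the lines in a stream. The buffer is small
   on purpose, so long lines are delivered in pieces. */
long count_lines(FILE *fp)
{
    char buf[8];
    long lines = 0;
    int partial = 0;                      /* 1 while inside an unfinished line */

    while (fgets(buf, sizeof buf, fp) != NULL) {
        if (has_full_line(buf)) {
            lines++;
            partial = 0;
        } else {
            partial = 1;                  /* buffer filled, or last line has no '\n' */
        }
    }
    return lines + partial;               /* count a final line without a newline */
}
```

The same check is what tells you when to grow the buffer with realloc and call fgets again for the rest of the line.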

understanding ungetc use in a simple getword

I've come across such an example of getword.
I understand all the checks and etc. but I have a problem with ungetc.
When c satisfies if ((!isalpha(c)) || c == EOF) and also doesn't satisfy while (isalnum(c)), i.e. it is neither a letter nor a digit, ungetc rejects that char.
Let's suppose it is '\n'.
Then it gets to return word however it can't be returned since it is not saved in any array. What happens then?
while (isalnum(c)) {
if (cur >= size) {
size += buf;
word = realloc(word, sizeof(char) * size);
}
word[cur] = c;
cur++;
c = fgetc(fp);
}
if ((!isalpha(c)) || c == EOF) {
ungetc(c, fp);
}
return word;
EDIT
@Mark Byers - thanks, but that c was rejected for a purpose; won't it fail the condition again and again in an infinite loop?
The terminal condition, just before the line you don't understand, is not good. It should probably be:
int c;
...
if (!isalpha(c) && c != EOF)
ungetc(c, fp);
This means that if the last character read was a real character (not EOF) and wasn't an alphabetic character, push it back for reprocessing by whatever next uses the input stream fp. That is, suppose you read a blank; the blank will terminate the loop and the blank will be pushed back so that the next getc(fp) will read the blank again (as would fscanf() or fread() or any other read operation on the file stream fp). If, instead of blank, you got EOF, then there is no attempt to push back the EOF in my revised code; in the original code, the EOF would be pushed back.
Note that c must be an int rather than a char.
ungetc pushes the character back onto the stream so that the next read will return that character again.
ungetc(c, fp); /* Push the character c onto the stream. */
/* ...etc... */
c = fgetc(fp); /* Reads the same value again. */
This can sometimes be convenient if you are reading characters to find out when the current token is complete, but aren't yet ready to read the next token.
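Here is a minimal sketch of that token-reading pattern. It is not the getword from the question: the name read_word and the fixed-size buffer are assumptions, and cap must be at least 1.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: reads alphanumeric characters into `word`
   (capacity `cap`, at least 1) and pushes the first non-alphanumeric
   character back onto the stream. Returns the length of the word. */
size_t read_word(FILE *fp, char *word, size_t cap)
{
    size_t n = 0;
    int c;                                /* int, so it can hold EOF */

    while ((c = fgetc(fp)) != EOF && isalnum(c)) {
        if (n + 1 < cap)
            word[n++] = (char)c;          /* extra characters are dropped */
    }
    word[n] = '\0';

    if (c != EOF)
        ungetc(c, fp);                    /* the delimiter stays readable */
    return n;
}
```

After reading "foo" from a stream containing "foo\nbar", the next fgetc on the same stream returns the '\n' that ended the word.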
OK. Now I understand why that case with e.g. '\n' was troubling me. I'm just dumb and forgot about the section in main() that calls getword. Of course, before calling getword there are a couple of tests (with another ungetc there), and it fputs the characters that do not satisfy isalnum.
It follows that the while loop in getword always starts with at least one character that passes isalnum, and the check at the end is just for the characters that follow.

how to stop reading from file in C

I am just trying to read each character of the file and print it out, but when the file finishes reading, I get a bunch of ? characters. How do I fix it?
#include <stdio.h>
int main(void){
FILE *fr; /* declare the file pointer */
fr = fopen ("some.txt", "r"); /* open the file for reading */
/* elapsed.dta is the name of the file */
/* "rt" means open the file for reading text */
char c;
while((c = getc(fr)) != NULL)
{
printf("%c", c);
}
fclose(fr); /* close the file prior to exiting the routine */
/*of main*/
return 0;
}
In spite of its name, getc returns an int, not a char, so that it can represent all of the possible char values and, in addition, EOF (end of file). If getc returned a char, there would be no way to indicate the end of file without using one of the values that could possibly be in the file.
So, to fix your code, you must first change the declaration char c; to int c; so that it can hold the EOF marker when it is returned. Then, you must also change the while loop condition to check for EOF instead of NULL.
You could also call feof(fr) to test end of file separately from reading the character. If you did that, you could leave c as a char, but you would have to call feof() after you read the character but before you printed it out, and use a break to get out of the loop.
If unsuccessful, fgetc() returns EOF.
int c;
while ((c = getc(fr)) != EOF)
{
printf("%c", c);
}
Change this
char c;
while((c = getc(fr)) != NULL)
{
printf("%c", c);
}
to
char c;
int charAsInt;
while((charAsInt = getc(fr)) != EOF)
{
c = (char) charAsInt;
printf("%c", c);
}
In other words: You need to compare against EOF, not NULL. You also need to use an int variable to receive the return value from fgetc. If you use a char, the comparison with EOF may fail, and you'll be back where you started.
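To illustrate why the int matters, consider a file that contains the byte 0xFF. EOF is a negative int (commonly -1), so with a signed char that byte can compare equal to EOF and end the loop early, while with an unsigned char the comparison can never succeed. A small sketch using int follows; count_bytes is a made-up name.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes in a stream. Because c is an int,
   a data byte of 0xFF is returned as 255 and is not mistaken for EOF. */
long count_bytes(FILE *fp)
{
    long n = 0;
    int c;                                /* must be int, not char */

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}
```

For a stream holding the three bytes 'a', 0xFF, 'b', this counts all three.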
fgetc() returns EOF on end-of-file, not NULL.
Replace "NULL" with "EOF".
Others have already addressed the issue you're having, but rather than using printf("%c", c); it is probably much more efficient to use putchar(c);. There is quite a bit of overhead involved when you ask printf to print just one character.
getc returns an int, so change char c to int c.
Also, getc returns EOF at end of file, so change your test against NULL to a test against EOF.
