How to use EOF to run through a text file in C?

I have a text file that has strings on each line. I want to increment a number for each line in the text file, but when it reaches the end of the file it obviously needs to stop. I've tried doing some research on EOF, but couldn't really understand how to use it properly.
I'm assuming I need a while loop, but I'm not sure how to do it.

How you detect EOF depends on what you're using to read the stream:
function    result on EOF or error
--------    ----------------------
fgets()     NULL
fscanf()    number of successful conversions less than expected
fgetc()     EOF
fread()     number of elements read less than expected
Check the result of the input call for the appropriate condition above, then call feof() to determine if the result was due to hitting EOF or some other error.
Using fgets():
char buffer[BUFFER_SIZE];
while (fgets(buffer, sizeof buffer, stream) != NULL)
{
// process buffer
}
if (feof(stream))
{
// hit end of file
}
else
{
// some other error interrupted the read
}
Using fscanf():
char buffer[BUFFER_SIZE];
while (fscanf(stream, "%s", buffer) == 1) // expect 1 successful conversion
{
// process buffer
}
if (feof(stream))
{
// hit end of file
}
else
{
// some other error interrupted the read
}
Using fgetc():
int c;
while ((c = fgetc(stream)) != EOF)
{
// process c
}
if (feof(stream))
{
// hit end of file
}
else
{
// some other error interrupted the read
}
Using fread():
char buffer[BUFFER_SIZE];
// expecting 1 element of size BUFFER_SIZE
while (fread(buffer, sizeof buffer, 1, stream) == 1)
{
// process buffer
}
if (feof(stream))
{
// hit end of file
}
else
{
// some other error interrupted read
}
Note that the form is the same for all of them: check the result of the read operation; if it failed, then check for EOF. You'll see a lot of examples like:
while(!feof(stream))
{
fscanf(stream, "%s", buffer);
...
}
This form doesn't work the way people think it does, because feof() won't return true until after you've attempted to read past the end of the file. As a result, the loop executes one time too many, which may or may not cause you some grief.
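The difference between the two loop forms can be made visible by counting words both ways. This is a minimal sketch; the function names `count_words_feof_first` and `count_words_checked` are made up for illustration, and the 63-character width limit is an assumption added to keep `%s` from overflowing the buffer.

```c
#include <stdio.h>

/* Anti-pattern: feof() is tested before the read that would set it. */
int count_words_feof_first(FILE *stream)
{
    char word[64];
    int count = 0;

    while (!feof(stream)) {
        fscanf(stream, "%63s", word);  /* result ignored: a failed read is still counted */
        count++;
    }
    return count;
}

/* Recommended form: the result of the read itself controls the loop. */
int count_words_checked(FILE *stream)
{
    char word[64];
    int count = 0;

    while (fscanf(stream, "%63s", word) == 1)
        count++;
    return count;
}
```

For a stream containing "a b c" followed by a newline, the first function reports 4 words and the second reports 3. If the trailing newline is missing, both report 3, which is why the bug "may or may not" show up.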

One possible C loop would be:
#include <stdio.h>
int main()
{
int c;
while ((c = getchar()) != EOF)
{
/*
** Do something with c, such as check against '\n'
** and increment a line counter.
*/
}
}
For now, I would ignore feof and similar functions. Experience shows that it is far too easy to call it at the wrong time and process something twice in the belief that EOF hasn't yet been reached.
Pitfall to avoid: using char for the type of c. getchar returns the next character converted to an unsigned char and then to an int. This means that on most [sane] platforms the value of EOF and valid "char" values in c don't overlap, so you won't ever accidentally detect EOF for a 'normal' char.
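Applied to the original question (incrementing a number for each line), the loop above could be filled in roughly as follows. This is a sketch, not the only way to do it; `count_lines` is a hypothetical helper name, and treating an unterminated final line as a line is a design choice assumed here.

```c
#include <stdio.h>

/* Counts the lines in stream; a final line without '\n' still counts. */
unsigned long count_lines(FILE *stream)
{
    unsigned long lines = 0;
    int c;              /* int, not char, so EOF stays distinguishable */
    int last = '\n';    /* treat an empty stream as having no partial line */

    while ((c = fgetc(stream)) != EOF) {
        if (c == '\n')
            lines++;
        last = c;
    }
    if (last != '\n')   /* unterminated final line */
        lines++;
    return lines;
}
```

It can be called as `count_lines(stdin)` or with any stream returned by fopen().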

You should check for EOF after reading from the file.
fscanf_s // read from file
while(condition) // check EOF
{
fscanf_s // read from file
}

I would suggest using the fseek and ftell functions.
FILE *stream = fopen("example.txt", "r");
if(!stream) {
puts("I/O error.\n");
return;
}
fseek(stream, 0, SEEK_END);
long size = ftell(stream);
fseek(stream, 0, SEEK_SET);
while(1) {
if(ftell(stream) == size) {
break;
}
/* INSERT ROUTINE */
}
fclose(stream);

Related

Reading a '.txt' file takes a very long time. How can I solve this problem? (C)

I have a very long "dict.txt" file.
The size of this file is about 2400273 bytes (calculated with fseek and SEEK_END).
The file has many lines like 'apple = 사과' (similar to a dictionary).
The main problem is that reading the file takes a very long time.
I couldn't find a solution on Google.
I guess the cause is related to using fgets(), but I don't know exactly.
Please help me.
Here is my code, written in C:
#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
int line = 0;
char txt_str[50];
FILE* pFile;
pFile = fopen("dict_test.txt", "r");
if (pFile == NULL) {
printf("file doesn't exist or there is problem to open your file\n");
}
else {
do{
fgets(txt_str, 50, pFile);;
line++;
} while (txt_str != EOF);
}
printf("%d", line);
}
Output
I couldn't see a result because the program kept running continuously
Expected
The number of lines in this text file
Major
OP's code fails to test the return value of fgets(). Code needs to check the return value of fgets() to know when to stop.
do{
fgets(txt_str, 50, pFile);; // fgets() return value not used.
Other
Line count should not get incremented when fgets() returns NULL.
Line count should not get incremented when fgets() reads a partial line (i.e. the line was 50 characters or longer). It is reasonable to use a buffer wider than 50.
Line count may exceed INT_MAX. There is always some upper bound, yet it is trivial to use a wider type.
It is good practice to close the stream.
Another approach to count lines would use fread() to read chunks of memory and then look for the start of lines. (Not shown)
Recommend printing a '\n' after the line count.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
FILE* pFile = fopen("dict_test.txt", "r");
if (pFile == NULL) {
printf("File doesn't exist or there is problem to open your file.\n");
return EXIT_FAILURE;
}
unsigned long long line = 0;
char txt_str[4096];
while (fgets(txt_str, sizeof txt_str, pFile)) {
if (strlen(txt_str) == sizeof txt_str - 1) { // Buffer full?
if (txt_str[sizeof txt_str - 2] != '\n') { // Last character read is not \n?
continue;
}
}
line++;
}
fclose(pFile);
printf("%llu\n", line);
}
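The fread() approach mentioned in the list above could look roughly like this. It is a sketch under stated assumptions: `count_lines_chunked` is a hypothetical name, the 4096-byte chunk size is arbitrary, and a final line without a trailing '\n' is counted as a line.

```c
#include <stdio.h>
#include <string.h>

/* Counts lines by reading fixed-size chunks and searching them for '\n'. */
unsigned long long count_lines_chunked(FILE *stream)
{
    char chunk[4096];
    unsigned long long lines = 0;
    char last = '\n';   /* last byte seen; '\n' means no pending partial line */
    size_t n;

    while ((n = fread(chunk, 1, sizeof chunk, stream)) > 0) {
        const char *p = chunk;
        const char *end = chunk + n;

        /* memchr returns NULL once no further '\n' is left in the chunk */
        while ((p = memchr(p, '\n', (size_t)(end - p))) != NULL) {
            lines++;
            p++;        /* continue searching after this newline */
        }
        last = chunk[n - 1];
    }
    if (last != '\n')   /* unterminated final line */
        lines++;
    return lines;
}
```

Because no per-line buffer is involved, the length of individual lines does not matter here.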
fgets returns NULL on EOF.
You never check the result of
fgets(txt_str, 50, pFile);
and instead compare the array txt_str itself with EOF, a comparison that does not depend on what was read. Your program therefore never sees the end of the file and enters an endless loop.
Try something like this:
char* p_str;
do{
p_str = fgets(txt_str, 50, pFile);
} while (p_str != NULL);

getc() for passed in input and file reading in C

I have to develop a program in C that can have two kinds of inputs.
By feeding it a string ( I am assuming like this filename < String1234455678, please correct me if I am wrong).
By reading data from some file(s).
I have to do some checks regarding the characters that are in it and store them in an array. But I want to learn how to use the getc() from stdin first.
My first question is, can I use getc() in both cases?
I wanted to loop through every single character in the feed line/file, and I assume the code would look something like this:
char Array1[];
char charHolder;
//If the file/feed has chars (!NULL), execute
if ((charHolder = getchar())!=NULL){
//Do something
//Do some more
//Finally append to Array1
Array1[] = charHolder;
}
There might be some issues with the code above. I wanted to know if that kind of inserting is valid in C (with no index specified, which it will just push the value at the end of the array). Also, I read from http://beej.us/guide/bgc/output/html/multipage/getc.html that getc(stdin) and getchar() are exactly equivalent. I just want to double check that this is indeed true and either function will work with both my cases where I have to read data (from a file and feeding my program a string).
Also, I was wondering how I can achieve reading characters from multiple files. Say if my program was to be executed as programName file1 file2.
Thank you for your time and help!
Cheers!
Edit 1:
I also wanted to know how to check when the chars end from a file/string feed. Should I use the EOF for both cases?
Example:
while ((charHolder = getchar()) != EOF){
//code
}
Here is a sample:
#include <stdio.h>
void do_read(FILE * file, int abort_on_newline) {
int ch; /* int, not char, so that EOF can be told apart from every character */
while (1) {
ch = getc(file);
if (ch == EOF) {
break;
}
if (abort_on_newline && ch == '\n') {
break;
}
printf("%c", ch);
}
}
int main(int argc, char * argv[])
{
int i = 1;
FILE * fp = NULL;
if (1 == argc) {
// read input string from stdin, abort on new line (in case of interactive input)
do_read (stdin, 1);
}
else {
// cycle through all files in command line arguments and read them
for (i=1; i < argc; i++) {
if ((fp = fopen(argv[i], "r")) == NULL) {
printf("Failed to open file.\n");
}
else {
do_read(fp,0);
fclose(fp);
}
}
}
return 0;
}
Use it like this:
To read from stdin: echo yourstring | yourprogram, or just start
yourprogram to get input from the user
To read from file(s): yourprogram yourfile1 yourfile2 ...
Yes, you can use getc in both cases, and yes, you should check for EOF in both cases, except for interactive input. For binary files, where a byte such as 0xFF can look like EOF once it is stored in a char, keep the result of getc in an int; you can also call feof after the loop to confirm that the end of the file was really reached. See the code above for reading from multiple files.
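The question also asks about appending to an array without an index. C arrays have no such "push" operation, so the code has to track the length itself and grow the storage when needed. A minimal sketch, assuming a heap buffer that doubles in size; `read_all` is a hypothetical helper, not a standard function:

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads the rest of stream into a malloc'd, NUL-terminated buffer.
   Stores the byte count in *length (if not NULL).
   Returns NULL on allocation failure; the caller frees the result. */
char *read_all(FILE *stream, size_t *length)
{
    size_t capacity = 64, used = 0;
    char *data = malloc(capacity);
    int c;

    if (data == NULL)
        return NULL;
    while ((c = getc(stream)) != EOF) {
        if (used + 1 == capacity) {          /* keep room for the terminator */
            char *bigger = realloc(data, capacity * 2);
            if (bigger == NULL) {
                free(data);
                return NULL;
            }
            data = bigger;
            capacity *= 2;
        }
        data[used++] = (char)c;              /* "append" at the tracked index */
    }
    data[used] = '\0';
    if (length != NULL)
        *length = used;
    return data;
}
```

The same function works for `stdin` and for a stream opened with fopen(), since getc() takes any `FILE *`.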

Why is my file IO infinite looping in c?

I am new to file IO in c. I decided to write a simple script in c that copies a file to a new file for practice:
#include <stdio.h>
void main(int argc, char* argv[])
{
if (argc != 3)
{
printf("Usage: ./myFile source destination");
exit(-1);
}
FILE * src = fopen(argv[1], "r");
if (src == NULL)
{
printf("source file not found", argv[1]);
exit(-1);
}
FILE* dest = fopen(argv[2], "w");
unsigned char c;
do {
c = fgetc(src);
fputc(c, dest);
} while (c != EOF);
}
However, I am getting an infinite loop. Is this because I never actually hit a character called EOF?
Also, is there a faster way to write this script aside from reading each character 1 at a time?
Declare c as an int and it'll work.
EOF is not a valid value for a character, because if it were, the presence of that character in a file could mislead code into thinking that file has ended when it actually hasn't. That's precisely why fgetc() actually returns an int, not a char.
Edit: Your code also has another bug: when fgetc() does return EOF, you pass that value to fputc() before ending the loop, causing an extra character to appear at the end of your output file. (The extra character will be whatever you get when you cast EOF to unsigned char on your system, typically character 255 == 0xFF == (unsigned char) -1.) To fix that, you can rewrite your loop like this:
int c;
while ((c = fgetc(src)) != EOF) {
fputc(c, dest);
}
or, if you don't like assignments in loop conditions:
while (1) {
int c = fgetc(src);
if (c == EOF) break;
fputc(c, dest);
}
Anyway, it would be much more efficient to read and write the data in chunks using fread() and fwrite(), e.g. like this:
unsigned char buf[65536];
while (1) {
size_t n = fread(buf, 1, sizeof(buf), src);
fwrite(buf, 1, n, dest);
if (n < sizeof(buf)) break; /* end of file or read error */
}
Also, it would be a good idea to include some error checking, since both reading and writing a file can fail for a variety of unexpected reasons. You can use ferror() to tell whether an error has occurred on a particular I/O stream.
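A chunked copy with the error checking described above might be sketched like this. The function name `copy_stream` and its 0 / -1 return convention are assumptions made for the example.

```c
#include <stdio.h>

/* Copies src to dest in chunks.
   Returns 0 on success, -1 on a read or write error. */
int copy_stream(FILE *src, FILE *dest)
{
    unsigned char buf[65536];
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, src)) > 0) {
        if (fwrite(buf, 1, n, dest) != n)
            return -1;              /* short write */
    }
    if (ferror(src))
        return -1;                  /* loop ended on an error, not on EOF */
    return fflush(dest) == 0 ? 0 : -1;
}
```

The caller remains responsible for opening both streams (and checking that fopen() succeeded) and for closing them afterwards.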
EOF is not an unsigned char but an int. See the prototype of fgetc:
int fgetc(FILE *stream);
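Why the loop in the question can never terminate can be checked directly: EOF is a negative int, and no unsigned char value compares equal to it. A small sketch, with `survives_unsigned_char` as a made-up helper name:

```c
#include <stdio.h>

/* Returns 1 if value is unchanged after being stored in an unsigned char. */
int survives_unsigned_char(int value)
{
    unsigned char c = (unsigned char)value;  /* what `c = fgetc(src)` does to the result */
    return c == value;                       /* c is promoted back to a non-negative int */
}
```

Ordinary character values such as 'A' survive the round trip, while EOF does not, so a test like `c != EOF` with an unsigned char `c` is always true.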

feof wrong loop in c

I use the code below to read a char from a file and replace it with another,
but I have an error: the loop never reaches the end of the file.
What is wrong?
I tested this code on Linux (NetBeans IDE) and it worked correctly, but when I tried it with VS 2008 on Windows, I got an endless loop.
//address = test.txt
FILE *fp;
fp=fopen(address,"r+");
if(fp == 0)
{
printf("can not find!!");
}
else
{
char w = '0'; /// EDIT : int w;
while(1)
{
if((w = fgetc(fp)) != EOF)
{
if((w = fgetc(fp)) != EOF)
{
fseek(fp,-2,SEEK_CUR);
fprintf(fp,"0");
}
}
else
{
break;
}
}
}
fclose(fp);
You are storing the result of fgetc in a char, instead of an int.
char w = '0'; /* Wrong, should be int. */
Incidentally, this problem is mentioned in the C FAQ.
If type char is unsigned, an actual EOF value will be truncated (by having its higher-order bits discarded, probably resulting in 255 or 0xff) and will not be recognized as EOF, resulting in effectively infinite input.
EDIT
Reading your question again, it's highly fishy the way you seek back two characters and write one character. That could well lead to an infinite loop.
EDIT2
You (likely) want something like this (untested):
while ((w = getc(fp)) != EOF) {
fseek(fp, -1, SEEK_CUR);
fprintf(fp, "0");
fflush(fp); /* Apparently necessary, see the answer of David Grayson. */
}
The fopen documentation on cplusplus.com says:
For the modes where both read and writing (or appending) are allowed (those which include a "+" sign), the stream should be flushed (fflush) or repositioned (fseek, fsetpos, rewind) between either a reading operation followed by a writing operation or a writing operation followed by a reading operation.
We can add an fflush call after the fprintf to satisfy that requirement.
Here is my working code. It creates a file named example.txt and after the program exits that file's contents will be 000000000000n.
#include <stdio.h>
int main(int argc, char **argv)
{
FILE * fp;
int w;
fp = fopen("example.txt","w");
fprintf(fp, "David Grayson");
fclose(fp);
fp = fopen("example.txt","r+");
while(1)
{
if((w = fgetc(fp)) != EOF)
{
if((w = fgetc(fp)) != EOF)
{
fseek(fp,-2,SEEK_CUR);
fprintf(fp,"0");
fflush(fp); // Necessary!
}
}
else
{
break;
}
}
fclose(fp);
}
This was tested with MinGW in Windows.
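For the task in the question (replacing one character with another in place), the flush/reposition rule could be applied as in the sketch below. `replace_char` is a hypothetical helper; it assumes the stream was opened for update in binary mode (for example "r+b"), since seeking by a relative offset such as -1 is only well defined for binary streams.

```c
#include <stdio.h>

/* Replaces every byte equal to `from` with `to`, in place.
   Returns the number of replacements, or -1 on an I/O error. */
long replace_char(FILE *fp, int from, int to)
{
    long replaced = 0;
    int c;

    while ((c = fgetc(fp)) != EOF) {
        if (c != from)
            continue;
        if (fseek(fp, -1L, SEEK_CUR) != 0)   /* reposition: switching from read to write */
            return -1;
        if (fputc(to, fp) == EOF)
            return -1;
        if (fflush(fp) != 0)                 /* flush: switching from write back to read */
            return -1;
        replaced++;
    }
    return ferror(fp) ? -1 : replaced;
}
```

Seeking back by exactly one byte before writing one byte keeps the file position moving forward, so the loop always makes progress toward the end of the file.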

Reading text into a buffer in c. (leaves out the last line of data when there is no newline on the textfile)

I'm trying to read lines of a text file into a buffer by giving a function a line number as a parameter. This function then copies the text on that particular line of the file into the variable retreivedString for use. The problem I'm having is that, if there is no 'empty newline' at the very end of the text file, the program doesn't copy the last entry into the buffer. What am I doing wrong?
example of textfile that reads properly
line 0
line 1
example of textfile whose last line (i.e. line 1) doesn't get read into the buffer.
line 0
line 1
//test2
#include <stdio.h>
#include <string.h>
#define BUFFER_SIZE 80
char read_in_buffer[BUFFER_SIZE];
char retreivedString[BUFFER_SIZE];
void getString(int lineNum);
int maxDataNum = 0;
bool endOfFileReached = false;
int main(void){
printf("-Main-\n");
getString(1);
printf("retrieved:%s\n",retreivedString);
printf("maxdata: %d\n",maxDataNum);
printf("strlen: %d",strlen(retreivedString));
/*
getString(2);
printf("retrieved: %s\n",retreivedString);
printf("maxdata: %d\n",maxDataNum);
getString(4);
printf("retrieved: %s\n",retreivedString);
printf("maxdata: %d\n",maxDataNum);
*/
return 0;
}
void getString(int lineNum){
FILE *fin=fopen("file1_Windows.txt","r");
int line_number = 0;
char *temp;
if(fin==NULL){
printf("cannot open file1_Windows.txt\n");
}
while (1){
memset(read_in_buffer,0,sizeof(read_in_buffer));
fgets(read_in_buffer,sizeof(read_in_buffer),fin); //change to segment size?
if (!feof(fin)) {
if (lineNum == line_number){
memset(retreivedString,0,sizeof(retreivedString));
strcpy(retreivedString,read_in_buffer);
}
//printf("current line %d: ",line_number);
//printf("%s",read_in_buffer);
line_number++;
}else {
fclose(fin);
printf("End-of-File reached. \n");
maxDataNum = line_number;
printf("maxdata: %d\n",maxDataNum);
if (lineNum == maxDataNum){
endOfFileReached = true;
}else if (lineNum > maxDataNum){
printf("file read error, you're reading further that data on file\n");
}
break;
}
}
}
Do this. It returns if the fopen call fails and puts the fgets call in the while loop:
void getString(int lineNum){
FILE *fin=fopen("file1_Windows.txt","r");
int line_number = 0;
char *temp;
if(fin==NULL){
printf("cannot open file1_Windows.txt\n");
return;
}
memset(read_in_buffer,0,sizeof(read_in_buffer));
while (fgets(read_in_buffer,sizeof(read_in_buffer),fin) != NULL){
if (lineNum == line_number){
memset(retreivedString,0,sizeof(retreivedString));
strcpy(retreivedString,read_in_buffer);
}
//printf("current line %d: ",line_number);
//printf("%s",read_in_buffer);
line_number++;
memset(read_in_buffer,0,sizeof(read_in_buffer));
}
fclose(fin);
printf("End-of-File reached. \n");
maxDataNum = line_number;
printf("maxdata: %d\n",maxDataNum);
if (lineNum == maxDataNum){
endOfFileReached = true;
}else if (lineNum > maxDataNum){
printf("file read error, you're reading further that data on file\n");
}
}
Rather than testing feof within the while loop (see this thread for reasons not to), test whether fgets returns a null pointer: read_in_buffer itself is an array and never becomes null, but once you've run out of lines to read, fgets returns NULL. You could even use the fgets call as the condition of your if statement:
if (fgets(read_in_buffer,sizeof(read_in_buffer),fin))
{
}
Here is what standard says about fgets:
Synopsis
char *fgets(char * restrict s, int n, FILE * restrict stream);
Description
The fgets function reads at most one less than the number of characters specified by n from the stream pointed to by stream into the array pointed to by s. No additional characters are read after a new-line character (which is retained) or after end-of-file. A null character is written immediately after the last character read into the array.
Returns
The fgets function returns s if successful. If end-of-file is encountered and no characters have been read into the array, the contents of the array remain unchanged and a null pointer is returned. If a read error occurs during the operation, the array contents are indeterminate and a null pointer is returned.
So, here is what happens:
When reading the last line, fgets returns read_in_buffer (not a null pointer) because characters were read and no read error occurred.
Then the check feof(fin) returns true (because EOF has already been reached), so this code is never executed: strcpy(retreivedString,read_in_buffer).
Conclusion:
retreivedString is never modified and, because it is a global variable, it was initialized to all zero bits (equivalent to an empty string).
So if you print retreivedString, the output will be an empty string.
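A version that keys only on the return value of fgets, and therefore also returns an unterminated last line, might look like the following sketch. `read_line_n` and its parameters are hypothetical; it assumes every line fits in the 80-byte buffer used in the question (a longer line would be split across several fgets calls and counted more than once).

```c
#include <stdio.h>
#include <string.h>

/* Copies line number line_num (0-based) into out, without the trailing '\n'.
   Returns 1 if the line exists and fits, 0 otherwise.
   Assumes each line is shorter than the 80-byte buffer. */
int read_line_n(FILE *fin, int line_num, char *out, size_t out_size)
{
    char buffer[80];
    int current = 0;

    while (fgets(buffer, sizeof buffer, fin) != NULL) {
        if (current == line_num) {
            buffer[strcspn(buffer, "\n")] = '\0';   /* drop the newline if present */
            if (strlen(buffer) >= out_size)
                return 0;                           /* caller's buffer is too small */
            strcpy(out, buffer);
            return 1;
        }
        current++;
    }
    return 0;   /* ran out of lines before reaching line_num */
}
```

Passing the stream and the output buffer as parameters, instead of using globals, is a choice made here so the function can be called on any open file.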
Put the fgets() call inside the if() block. After all, you want to read something from the file only if EOF has not been set.
if (!feof(fin)) {
line_number++;
fgets(read_in_buffer,sizeof(read_in_buffer),fin);
...
}
