getting undefined behavior in counting characters of text file program? - c

I wrote a c program meant to count the characters in a certain file.
int main(void) {
FILE *fp;
fp = fopen("txt.txt", "r");
char text;
int count;
while (fscanf(fp, "%c", &text) != EOF) {
count++;
}
printf("%d", count);
return 0;
}
I want to add a char array into it but for some reason it changes the value of my int type (count).
For example, if I run this program I get an output of 3549. Now, let's say I declare "char potato[5000]" alongside my other char variable. For some reason I get a completely different output of 159062601. Why is this, and how do I prevent it?

The following proposed code:
initializes variables before using them (your compiler should have told you about this problem)
properly checks and handles I/O errors for fopen() and for fscanf()
properly closes the open file before exiting. I.E. it cleans up after itself
properly terminates printed text, so it is immediately passed to the terminal
and now, the proposed code:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
FILE *fp;
fp = fopen("txt.txt", "r");
if( ! fp )
{
perror( "fopen failed" );
exit( EXIT_FAILURE );
}
char text;
int count = 0;
while ( fscanf( fp, "%c", &text ) == 1 )
{
count++;
}
fclose( fp );
printf( "%d\n", count );
return 0;
}

You have several problems in your code. I will list them below:
In C it is customary (and before C99 it was required) to declare variables at the beginning of a scope and to initialize them where needed. Your code mixes declarations and statements.
The count variable is not initialized! You enter the while loop with an indeterminate (garbage) value in count. Reading it is undefined behavior (UB), so the result can differ from run to run, or change when you add an unrelated declaration such as the char array.
You didn't check the return value of fopen! You must check whether the operating system succeeded in opening the file you requested.
Regarding asking a question on Stack Overflow: your code is not complete; you didn't post all of it.
Now let's look at some new topics regarding working with I/O streams.
return value of function fscanf
The value EOF is returned if the end of input is reached before
either the first successful conversion or a matching failure occurs.
EOF is also returned if a read error occurs, in which case the
error indicator for the stream (see ferror(3)) is set, and errno
is set to indicate the error.
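To make the three possible outcomes concrete, here is a small sketch. It uses sscanf on in-memory strings rather than fscanf on a file (an assumption made only to keep the example self-contained); the return-value rules are the same across the scanf family. The helper name try_parse_int is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: returns whatever sscanf returns for a single %d
   conversion, so the three outcomes can be compared side by side. */
static int try_parse_int(const char *text, int *out)
{
    return sscanf(text, "%d", out);
}
```

With this helper, try_parse_int("42", &n) returns 1 (one item assigned), try_parse_int("abc", &n) returns 0 (a matching failure), and try_parse_int("", &n) returns EOF (input ended before the first conversion).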
This is how to check whether errors occurred while working with the file we are reading:
int ferror(FILE *stream);
The function ferror() tests the error indicator for the stream pointed
to by stream, returning nonzero if it is set. The error indicator can
only be reset by the clearerr() function.
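As a rough illustration of the stream indicators, the sketch below reports which one is currently set. The helper name stream_state is hypothetical; ferror(), feof() and clearerr() are the standard functions described above.

```c
#include <stdio.h>

/* Hypothetical helper: describe which indicator is set on the stream.
   The error indicator is checked first because it is the more serious one. */
static const char *stream_state(FILE *fp)
{
    if (ferror(fp))
        return "error";
    if (feof(fp))
        return "eof";
    return "ok";
}
```

After a file has been read to its end, stream_state(fp) would report "eof"; once clearerr(fp) is called, both indicators are reset and it reports "ok" again.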
And with the function below we get a human-readable error message, not just an errno number!
explain_ferror
const char *explain_ferror(FILE *fp);
The explain_ferror function is used to obtain an explanation of an
error returned by the ferror(3) system call. The least the message
will contain is the value of strerror(errno), but usually it will do
much better, and indicate the underlying cause in more detail.
The errno global variable will be used to obtain the error value to be
decoded.
#include <stdlib.h>
#include <stdio.h>
#include <libexplain/ferror.h> /* for the non standard const char* explain_ferror(FILE* fp); */
int main(void)
{
FILE *fp;
char text;
int count = 0;
fp = fopen("txt.txt", "r");
if(fp == NULL)
{
perror("fopen failed"); /*write to standard error*/
exit(EXIT_FAILURE);
}
while (fscanf(fp, "%c", &text) != EOF)
{
++count;
}
if (ferror(fp)) /* nonzero return if an error occurred */
{
fprintf(stderr, "%s\n", explain_ferror(fp));
exit(EXIT_FAILURE);
}
printf("%d", count);
return 0;
}
Since const char *explain_ferror(FILE *fp); is not a standard C function (it comes from the libexplain library), I am using a standard function in the code snippet below:
char *strerror(int errnum);
strerror is a standard C library function which returns a pointer to a string that describes the error code passed in the argument errnum. Be aware that this function is not guaranteed to be thread-safe; for a thread-safe alternative use strerror_r().
Return Value
The strerror() function returns the appropriate error description string, or an "Unknown error nnn" message if the error number is unknown.
POSIX.1-2001 and POSIX.1-2008 require that a successful call to strerror() leave errno unchanged. Note also that, since no function return value is reserved to indicate an error, if we wish to check for errors we should set errno to zero before the call (errno = 0;) and then check errno after the call. clearerr(FILE *stream) does not do this: it clears only the stream's end-of-file and error indicators.
#include <string.h>
#include <errno.h>
#include <stdio.h>
...
errno = 0; /* reset errno; clearerr(fp) would only clear the stream's indicators */
while (fscanf(fp, "%c", &text) != EOF)
{
++count;
}
if (ferror(fp)) /* nonzero return if an error occurred */
{
fprintf(stderr, "%s\n", strerror(errno));
exit(EXIT_FAILURE);
}
...
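For completeness, the elided fragment above could be wrapped into a function along these lines. This is only a sketch: the name count_chars_checked and the choice of -1 as the error return are assumptions, not part of the original answer.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts characters until fscanf stops.
   Returns the count, or -1 (after printing a message) on a read error. */
static long count_chars_checked(FILE *fp)
{
    char text;
    long count = 0;

    errno = 0;                        /* avoid reporting a stale errno value */
    while (fscanf(fp, "%c", &text) == 1)
        ++count;

    if (ferror(fp)) {
        int saved = errno;            /* capture before another call changes it */
        fprintf(stderr, "read failed: %s\n", strerror(saved));
        return -1;
    }
    return count;                     /* the loop ended at end-of-file */
}
```

A caller would open the file, check the fopen() result, pass the stream to count_chars_checked(), and treat a negative result as failure.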
Finally:
The man pages (or man7), or typing man <enter_string_here> in a terminal on Linux, should clear up any remaining questions.
For further reading, see:
explain_ferror
ferror
fscanf

Related

fscanf - How to know if EOF means end of file or reading/another error?

I have a question about I/O in C: how can I tell whether reading my file has ended or whether the data can't be read (or has a problem), since in both cases fscanf returns EOF?
Don't rely only on the return value of fscanf(); in addition, check feof() and ferror() after the call to fscanf():
FILE* file;
if((file = fopen("file.txt","r")) == NULL)
{
fprintf(stderr, "File could not be opened!");
return EXIT_FAILURE;
}
char buf;
/******************************************************************************/
while(fscanf(file,"%c",&buf) == 1) { // loops while a character was read;
// stops at end of file or on an error.
/* handling of read character */
}
if(ferror(file)) // checks if an I/O error occurred.
{
// I/O error handling
fprintf(stderr,"Input/Output error at reading file!");
clearerr(file);
// Further actions
}
else if(feof(file)) // checks if the end of the file is reached.
{
// end of file handling
fprintf(stderr,"Reached End of File!");
clearerr(file);
// Further actions
}
/******************************************************************************/
if(fclose(file) != 0)
{
fprintf(stderr, "File could not be closed properly!");
return EXIT_FAILURE;
}
As per fscanf() return value:
ISO/IEC 9899:2017
§ 7.21.6.2 - 16 - The fscanf function returns the value of the macro EOF if an input failure occurs before the first conversion (if any) has completed. Otherwise, the function returns the number of input items assigned, which can be fewer than provided for, or even zero, in the event of an early matching failure.
EOF is a macro that expands to a negative int (commonly -1); by itself it does not tell you why it was returned.
For this distinction § 7.21.6.2 - 19 recommends the use of feof() for end-of-file and ferror() for I/O error:
EXAMPLE 3 To accept repeatedly from stdin a quantity, a unit of measure, and an item name:
#include <stdio.h>
/*...*/
int count; float quant;
char units[21], item[21];
do {
count = fscanf(stdin, "%f%20s of %20s", &quant, units, item);
fscanf(stdin,"%*[^\n]");
} while(!feof(stdin) && !ferror(stdin));
My usual approach when reading formatted input is to check the number of values read. For a sample input of 2 integers you can do something like:
int a, b;
FILE* file;
//open file to read
while(fscanf(file, "%d %d", &a, &b) == 2){ //read 2 integers at a time; stop when the condition fails, i.e. there is nothing left to read or the input is not an integer
//...handle inputs
}
This kind of read is safe: the loop ends both on bad input and at end of file. To find out which one ended it, check feof() and ferror() afterwards.
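A minimal sketch of that pattern, extended to report why the loop stopped, might look like the following. The enum stop_reason and the function sum_pairs are hypothetical names introduced for illustration only.

```c
#include <stdio.h>

enum stop_reason { STOP_EOF, STOP_BAD_INPUT, STOP_IO_ERROR };

/* Hypothetical helper: adds up every complete pair "a b" in the stream
   and reports why reading stopped. */
static enum stop_reason sum_pairs(FILE *in, long *total)
{
    int a, b;

    *total = 0;
    while (fscanf(in, "%d %d", &a, &b) == 2)
        *total += (long)a + b;

    if (ferror(in))
        return STOP_IO_ERROR;
    if (feof(in))
        return STOP_EOF;
    return STOP_BAD_INPUT;    /* data remains, but it is not an integer */
}
```

For input such as "1 2" followed by "3 x", the first pair is summed and the function returns STOP_BAD_INPUT, because the stream still holds the unread "x".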

How to do proper error handling with the fopen function

I wrote a program that asks the user to enter the full pathname of a file. It will then attempt to open that file from the pathname string provided. I used the standard error checking that most books have recommended, which is to close the program if fopen() returns NULL (which it will do in the case that the file does not exist). When I run the program and enter some random characters when prompted (obviously not a valid filename) my program hangs with a runtime error because it's trying to open that file that doesn't exist.
What is the point of the standard error check (pfile == NULL) if your program has already crashed when it calls fopen()? See below code.
I'm using LabWindows CVI 2017 as my environment, which uses the clang compiler. See image of run time error.
#include <stdio.h>
#include <string.h>
#define MAX 200
int main (void){
char buffer[MAX];
int len = 0;
FILE *pfile = NULL;
printf("please enter the full pathname of the file you wish to process.\n");
fgets(buffer, MAX, stdin);
len = strlen(buffer);
buffer[len - 1] = '\0';
pfile = fopen(buffer, "r");
if(pfile == NULL){
printf("not a valid filename, press any key to exit.");
getchar();
return -1;
}
int sum = 0;
int c = 0;
while((c = fgetc(pfile)) != EOF){
sum += sizeof(c);
}
printf("the size of your file is %d\n", sum);
getchar();
return 0;
}
You are doing the proper error handling. Your program is valid in that respect. However, your IDE does some extra error checking, which is the cause of the behavior you're seeing.
The usual rules for error checking in these sorts of situations are:
Do check for error returns. (You're doing that.)
Do print a useful error message. (You're doing that.)
Print error messages to stderr.
If the error involves a file, do include the filename in the error message.
If the error involves a function that sets errno, do print the "perror" text ("No such file or directory", etc.).
If you're writing a tool that will be combined into larger scripts, do include the program's name in the error message.
If the error occurs due to an input file you're reading, do print the name of that file and the line number.
Adopting rules 1 through 6, an improved version of your error check would be
if(pfile == NULL) {
fprintf(stderr, "%s: can't open %s: %s\n", progname, buffer, strerror(errno));
return EXIT_FAILURE;
}
For this to work you'll need both of:
#include <string.h>
#include <errno.h>
If that's too much work, a simpler way is just to call
perror(buffer);
although this falls down somewhat on rules 2, 6, and 7.
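One possible way to package the improved check is a small wrapper around fopen(). This is a sketch under assumptions: the name open_input is hypothetical, and it relies on fopen() setting errno on failure, which POSIX specifies but ISO C does not guarantee.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: open path for reading. On failure, print
   "progname: can't open path: reason" to stderr and return NULL. */
static FILE *open_input(const char *progname, const char *path)
{
    FILE *fp = fopen(path, "r");

    if (fp == NULL) {
        int saved = errno;    /* fprintf may change errno, so capture it first */
        fprintf(stderr, "%s: can't open %s: %s\n",
                progname, path, strerror(saved));
    }
    return fp;
}
```

In main(), progname would typically come from argv[0], and the caller returns EXIT_FAILURE when open_input() yields NULL.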

C - Print lines from file with getline()

I am trying to write a simple C program that loads a text-file, prints the first line to screen, waits for the user to press enter and then prints the next line, and so on.
As only argument it accepts a text-file that is loaded as a stream "database". I use the getline()-function for this, according to this example. It compiles fine, successfully loads the text-file, but the program never enters the while-loop and then exits.
#include <stdio.h>
#include <stdlib.h>
FILE *database = NULL; // input file
int main(int argc, char *argv[])
{
/* assuming the user obeyed syntax and gave input-file as first argument*/
char *input = argv[1];
/* Initializing input/database file */
database = fopen(input, "r");
if(database == NULL)
{
fprintf(stderr, "Something went wrong with reading the database/input file. Does it exist?\n");
exit(EXIT_FAILURE);
}
printf("INFO: database file %s loaded.\n", input);
/* Crucial part printing line after line */
char *line = NULL;
size_t len = 0;
ssize_t read;
while((read = getline(&line, &len, database)) != -1)
{
printf("INFO: Retrieved line of length %zu :\n", read);
printf("%s \n", line);
char confirm; // wait for user keystroke to proceed
scanf("%c", &confirm);
// no need to do anything with "confirm"
}
/* tidy up */
free(line);
fclose(database);
exit(EXIT_SUCCESS);
}
I tried it with fgets() -- I can also post that code --, but same thing there: it never enters the while-loop.
It might be something very obvious; I am new to programming.
I use the gcc-compiler on Kali Linux.
Replace your scanf with getline (or fgets), using stdin as the stream argument.
You should step through this in a debugger, to make sure your claim that it never enters the while loop is correct.
If it truly never enters the while loop, it is necessarily because getline() has returned -1. Either the file is truly empty, or you have an error reading the file.
man getline says:
On success, getline() and getdelim() return the number of characters
read, including the delimiter character, but not including the
terminating null byte ('\0'). This value can be used to handle embedded
null bytes in the line read.
Both functions return -1 on failure to read a line (including
end-of-file condition). In the event of an error, errno is set to indicate
the cause.
Therefore, you should enhance your code to check for stream errors and deal with errno -- you should do this even when your code works, because EOF is not the only reason for the function to return -1.
ssize_t nread = getline(&line, &len, database); /* ssize_t result, kept separate from the size_t len */
if(nread == -1 && ferror(database)) {
perror("Error reading database");
}
You can write more detailed code to deal with errno in more explicit ways.
Unfortunately handling this thoroughly can make your code a bit more verbose -- welcome to C!
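Put together, a getline() loop with the error check after it could look roughly like this. The function name count_lines and the -1 error return are assumptions for the sake of the example; getline() itself is POSIX rather than ISO C, hence the feature-test macro.

```c
#define _POSIX_C_SOURCE 200809L   /* getline() is POSIX, not ISO C */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: counts the lines in a stream.
   Returns -1 if reading stopped because of an error rather than EOF. */
static long count_lines(FILE *in)
{
    char *line = NULL;            /* getline() allocates and grows this buffer */
    size_t cap = 0;
    ssize_t nread;
    long lines = 0;
    int failed;

    while ((nread = getline(&line, &cap, in)) != -1)
        ++lines;

    failed = ferror(in);          /* -1 means either EOF or an error */
    if (failed)
        perror("getline");        /* report before free() can disturb errno */
    free(line);
    return failed ? -1 : lines;
}
```

A final line without a trailing newline still counts, because getline() returns its length before reporting -1 on the next call.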

fseek() API in C and error code

I am trying to run below code and expecting error as [EBADF] The stream is NULL
#include <stdio.h>
int main ()
{
FILE *fp;
char ch;
fp=fopen("test33.txt","r");
fseek(fp,0L,SEEK_SET);
while((ch=fgetc(fp))!=EOF)
putchar(ch);
}
Output:
/home/akhils/file_dir#./a.out
Memory fault(coredump)
Through which utility can I see error [EBADF]? I am running this C Program on HP-UX box and using a C++ compiler by HP.
I rewrote the code as below as per suggestion:
#include<stdio.h>
#include<errno.h>
extern int errno;
int main ()
{
FILE *fp;
int val;
char ch;
fp=fopen("test33.txt","r");
if(fp==NULL)
printf("\n Error code for fopen is : %d\n",errno);
else
{
val=fseek(fp,0L,SEEK_SET);
if(val!=0)
val=errno;
else {
while((ch=fgetc(fp))!=EOF)
putchar(ch);
}
printf("\nError code for fseek is %d\n",val);
}
}
Output : /home/akhils/file_dir#./a.out
Error code for fopen is : 2
My question is (and sorry if I am asking it in the wrong way): how would I know that the error is "[EBADF] The fildes argument is not a valid file descriptor."? Note: EBADF is ALSO an error set for fopen() when a NULL pointer is returned by fopen(), i.e. in case of unsuccessful completion of fopen().
First and foremost, you should be checking for the success of fopen(): if it fails, passing the returned pointer (NULL) to fseek() invokes undefined behaviour. You should not use the returned pointer any further if fopen() failed.
That said, to detect an error in fseek() itself, you should check the return value of fseek(). If fseek() fails, it sets the errno variable, which you can compare against EBADF.
You don't need any utility as such to check the error code. You can use #include <errno.h> in your code and access the value of the errno variable.
From the man page for fseek(),
[...] fgetpos(), fseek(), fsetpos() return 0, and ftell() returns the current offset. Otherwise, -1 is returned and errno is set to indicate the error.
and regarding the EBADF, as you mentioned,
EBADF
The stream specified is not a seekable stream.
You should at least check if the file can actually be opened:
...
fp=fopen("test33.txt","r");
if (fp == NULL)
{
// abort if file cannot be opened
printf("Cannot open file");
return 1;
}
...
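The fseek() check described above can be sketched as a small helper. The name seek_to_start is hypothetical, and returning EBADF for a null stream is a choice made for this example rather than something fseek() itself reports, since passing NULL to fseek() is undefined behaviour.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: seek to the start of fp.
   Returns 0 on success, otherwise a nonzero error code. */
static int seek_to_start(FILE *fp)
{
    if (fp == NULL)
        return EBADF;             /* never hand a null stream to fseek() */

    errno = 0;
    if (fseek(fp, 0L, SEEK_SET) != 0)
        return errno != 0 ? errno : -1;   /* -1 if errno was left untouched */
    return 0;
}
```

The caller can pass a nonzero positive result to strerror() to print the reason, or compare it against EBADF directly.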

compiler says:cannot convert int to FILE*

While working with files I'm stuck here. The condition of the while loop is not working. The compiler says it cannot convert int to FILE*.
while(pFile!=EOF);
Should I typecast pFile to int? I tried that but it did not work. Thanks in advance.
The complete code is:
int main()
{
char ch;
char name[20];
FILE *pFile;
int score;
pFile=fopen("database.txt","r");
if(pFile!=NULL)
{
while(pFile!=EOF);
{
fscanf(pFile,"%c",ch);
}
}
else
printf("Cant open the file.......");
fclose(pFile);
return 0;
}
First, you do not want to use while (!feof(pFile)) -- ever! Doing so will almost inevitably lead to an error where the last data you read from the file appears to be read twice. It's possible to make it work correctly, but only by adding another check in the middle of the loop to exit when EOF is reached -- in which case, the loop condition itself will never be used (i.e., the other check is the one that will actually do the job of exiting the loop).
What you normally do want to do is check for EOF as you read the data. Different functions indicate EOF in different ways. fgets signals failure (including EOF) by returning NULL. Most others (getc, fgetc, etc.) do return EOF, so you typically end up with something like this:
int ch; // Note, this should be int, NOT char
while (EOF != (ch=getc(pFile)))
process(ch);
or:
char buffer[MAX_LINE_SIZE];
while (fgets(buffer, sizeof(buffer), pFile))
process(buffer);
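The reason for int rather than char is that EOF has to stay distinct from every possible byte value, including 0xFF. A small sketch (the name count_bytes is hypothetical):

```c
#include <stdio.h>

/* Hypothetical helper: counts every byte in the stream,
   including bytes with the value 0xFF. */
static long count_bytes(FILE *fp)
{
    int ch;                   /* int, so EOF never collides with a data byte */
    long n = 0;

    while ((ch = getc(fp)) != EOF)
        ++n;
    return n;
}
```

If ch were a char, a 0xFF byte could compare equal to EOF where char is signed and end the loop early, while where char is unsigned the comparison could never succeed and the loop would not terminate.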
With scanf, checking for success is a little more complex -- it returns the number of successful conversions, so you want to make sure that matches what you expected. For example:
while (1 == fscanf(pFile, "%d", &input_number))
process(input_number);
In this case I've used 1 because I specified 1 conversion in the format string. It's also possible, however, for conversion to fail for reasons other than EOF, so if this fails, you'll frequently want to check feof(pFile). If it returns false, do something like reading the remainder of the line, showing it to the user in a warning message, and then continuing to read the rest of the file.
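The recovery strategy just described could be sketched as follows. The function name sum_ints, the 128-byte buffer and the warning text are assumptions made for illustration; the sketch detects the failure through a return value of 0 rather than through feof().

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: sums all integers in the stream. When a token is
   not a number, the rest of that line is reported and skipped.
   *skipped receives the number of lines dropped this way. */
static long sum_ints(FILE *in, int *skipped)
{
    long sum = 0;
    int value;
    char rest[128];               /* assumed large enough for a bad line */

    *skipped = 0;
    for (;;) {
        int rc = fscanf(in, "%d", &value);
        if (rc == 1) {
            sum += value;
        } else if (rc == 0) {
            /* matching failure: consume the remainder of the line */
            if (fgets(rest, sizeof rest, in) == NULL)
                break;
            rest[strcspn(rest, "\n")] = '\0';
            fprintf(stderr, "warning: skipping \"%s\"\n", rest);
            ++*skipped;
        } else {
            break;                /* EOF or read error */
        }
    }
    return sum;
}
```

Given the lines "1 2", "foo 3" and "4", this adds 1, 2 and 4 and skips the whole "foo 3" line, so the 3 is deliberately not counted.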
It depends on what pFile and EOF are defined as, but I will assume that pFile is a FILE * and EOF is from stdio.h. Then I guess you should do something like:
#include <stdlib.h>
#include <stdio.h>
#define FILENAME "file.txt"
int main(void) {
FILE *pFile;
int ch;
pFile = fopen(FILENAME,"r");
if (pFile) {
while ((ch = getc(pFile)) != EOF) {
printf("Read one character: %c\n", ch);
}
fclose(pFile);
return EXIT_SUCCESS;
} else {
printf("Unable to open file: '%s'\n", FILENAME);
return EXIT_FAILURE;
}
}
which yields
$ echo "abc" > file.txt
$ /tmp/fileread
Read one character: a
Read one character: b
Read one character: c
Read one character:
# last character being a linefeed
Assuming pFile is your file handle, this doesn't change as you read from the file. EOF is returned by e.g. fgetc(). See e.g. http://www.drpaulcarter.com/cs/common-c-errors.php#4.2 for common ways to solve this.
Here is a correct way (c must be declared as int):
c = getc(pFile);
while (c != EOF) {
/* Echo the file to stdout */
putchar(c);
c = getc(pFile);
}
if (feof(pFile))
puts("End of file was reached.");
else if (ferror(pFile))
puts("There was an error reading from the stream.");
else
/*NOTREACHED*/
puts("getc() failed in a non-conforming way.");
fclose(pFile);
pFile is a pointer to a file. EOF is usually defined as -1, a signed integer.
What you should do is fopen, make sure pFile != NULL, then call some function on the file handle until that function returns EOF. A pointer will (or rather, should) never be EOF. But a function acting on that pointer may return EOF.
I'm guessing you want to keep looping while you haven't hit end-of-file. In that case, you are looking for this:
while (!feof(pFile))
{
...
}
That said, this is still not quite correct. feof will only return true once it tries to read beyond the end of the file. This means feof can return false and yet there is no more data to read. You should really try your operation and only check for end of file if it fails:
char buffer[SIZE];
while (fgets(buffer, sizeof(buffer), pFile))
{
...
}
if (!feof(pFile))
{
// fgets failed for some reason *other* than end-of-file
}

Resources