C: Reading from a file is bugged [duplicate]

I have a program that takes user input and compares it to a line in a file before moving on to the next line. If the user gets the word right they get 2 points; if it's wrong they get 1 point. As a failsafe I have added a small piece of code that removes any spaces from the word.
The program works as expected: the spaces are removed, and all words are scanned and compared correctly.
However, on the last line of the file, the correct spelling of the word still gives the wrong output. This might have something to do with the loops, but I'm not sure.
In a nutshell: I need someone to take a look at my code and see what's causing this.
File content (just a list of random words)
Baby
Milk
Car
Face
Library
Disc
Lollipop
Suck
Food
Pig
(libraries are stdio,conio and string)
char text[100], blank[100];
int c = 0, d = 0;
void space(void);
int main()
{
int loop = 0;
char str[512];
char string[512];
int line = 1;
int dis = 1;
int score = 0;
char text[64];
FILE *fd;
fd = fopen("Student Usernames.txt", "r"); // Should be test
if (fd == NULL)
{
printf("Failed to open file\n");
exit(1);
}
do
{
printf("Enter the string: ");
gets(text);
while (text[c] != '\0')
{
if (!(text[c] == ' ' && text[c] == ' '))
{
string[d] = text[c];
d++;
}
c++;
}
string[d] = '\0';
printf("Text after removing blanks\n%s\n", string);
getch();
for(loop = 0;loop<line;++loop)
{
fgets(str, sizeof(str), fd);
}
printf("\nLine %d: %s\n", dis, str);
dis=dis+1;
str[strlen(str)-1] = '\0';
if(strcmp(string,str) == 0 )
{
printf("Match\n");
score=score+2;
}
else
{
printf("Nope\n");
score=score+1;
}
getch();
c=0;
d=0;
}
while(!feof(fd));
printf("Score: %d",score);
getch();
}
For any input on the last line, the output will always be incorrect, I believe this is something to do with the for loop not turning it into the next variable, but seeing as the <= notation makes this program worse, I really just need a simple fix for the program thanks.
P.S. For anyone who is going to comment about my coding for the spaces function, Yes, I could make it better, but it's not a problem right now. So please don't write anything concerning it.

I guess your file is not terminated by a newline. So the last word, Pig, gets truncated by this line of code:
str[strlen(str)-1] = '\0';
(which is unconditional).
Either put a newline at the end of your file, or check the end of the string before truncating (isspace needs <ctype.h>):
if (isspace(str[strlen(str)-1]))
str[strlen(str)-1] = '\0';
(Might also use strtok to remove all whitespace from the string without writing tricky code)
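A minimal sketch of the conditional truncation, assuming a helper called chomp (a hypothetical name, not part of the original program). It relies on strcspn instead of strlen(str)-1, so it also behaves for an empty string and for a last line that has no newline:

```c
#include <string.h>

/* Cut the string at the first '\r' or '\n', if there is one.
 * For a line read by fgets() that is the line terminator.
 * chomp is a hypothetical helper name. */
void chomp(char *s)
{
    /* strcspn returns the index of the first '\r' or '\n',
     * or the string length when neither is present, so the
     * assignment is a no-op for "Pig" and strips "Pig\n". */
    s[strcspn(s, "\r\n")] = '\0';
}
```

Calling chomp(str) in place of the unconditional str[strlen(str)-1] = '\0'; should leave "Pig" intact whether or not the file ends with a newline.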

You need to check if the last character is a linefeed in both the user input and the line read from the file, and remove it if and only if it is. (You also need to fix the other bugs, such as the use of gets and feof, and not all the changes can be done in isolation because some of your bugs depend on one another so fixing only one will break it until you fix the others.)

Related

Print the 5th letter of each word read from a text file

I wanted to find the 5th letter from every word after reading from the text file. I do not know where I am wrong.
After entering the file path, I am getting a pop up window reading
read.c has stopped working.
The code:
#include<stdio.h>
#include<conio.h>
int main(void){
char fname[30];
char ch[]={'\0'};
int i=0;
FILE *fp;
printf("enter file name with path\n");
gets(fname);
fp=fopen(fname,"r");
if(fp==0)
printf("file doesnot exist\n");
else{
printf("File read successfully\n");
do{
ch[i]=getc(fp);
if(feof(fp)){
printf("end of file");
break;
}
else if(ch[i]=='\n'){
putc(ch[4],stdout);
}
i++;
}while(1);
fclose(fp);
}
return 0;
}
I wanted to find the 5th letter from every word
That's not something your code is doing now. It is wrong for various reasons, like
char ch[]={'\0'}; is an array of length 1. Indexing it with an unbounded ch[i] overruns the allocated memory, which is undefined behaviour.
gets() is very dangerous: it can cause a buffer overflow.
getc() reads character by character, not word by word, so you also need to treat the space character (' ') as a delimiter.
etc.
My suggestion: rewrite your code using the following algorithm.
1. Allocate a buffer long enough to hold a complete line from the file.
2. Open the file and check for success.
3. Read a whole line into the buffer using fgets().
3.1. If fgets() returns NULL, you have most probably reached the end of the file. End.
3.2. Otherwise, continue to the next step.
4. Tokenize the line using strtok(), with space ' ' as the delimiter. Check the returned token against NULL.
4.1. If the token is NULL, go to step 3.
4.2. If the token is not NULL, proceed to the next step.
5. Check strlen() of the returned token (which is the word). If it is more than 4, print the character at index 4 of the token. (Remember, array indices in C are 0-based.)
6. Continue to step 4.
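The algorithm above could be sketched roughly like this. The function name print_fifth_letters, the buffer size and the extra delimiters (tab, carriage return, newline) are my own assumptions, not something from the question:

```c
#include <stdio.h>
#include <string.h>

#define LINE_BUF_SIZE 1024  /* assumed upper bound on the length of a line */

/* Write the 5th character of every word read from `in` to `out`.
 * print_fifth_letters is a hypothetical name used for illustration. */
void print_fifth_letters(FILE *in, FILE *out)
{
    char line[LINE_BUF_SIZE];

    /* Read whole lines; fgets() returns NULL at end of file. */
    while (fgets(line, sizeof line, in) != NULL) {
        /* Split the line into words; strtok() returns NULL when none are left. */
        char *word = strtok(line, " \t\r\n");
        while (word != NULL) {
            if (strlen(word) > 4)
                fputc(word[4], out);       /* index 4 is the 5th character */
            word = strtok(NULL, " \t\r\n"); /* next word of the same line */
        }
    }
}
```

With an opened file fp, the call would be print_fifth_letters(fp, stdout). Lines longer than the buffer get split into several fgets() reads, so a word sitting on that boundary may be cut in two.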
You can use the following snippet. With this code you need to know the maximum length of a line. It prints the 5th character of each word on each line, provided the word has 5 or more characters. Hope this works for you.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_LINE 1000
int main()
{
char fname[30];
char myCurLine[MAX_LINE + 1];
char myCurWord[MAX_LINE + 1];
int index,loop;
FILE *fp;
printf("enter file name with path\n");
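fgets(fname, sizeof(fname), stdin); // gets() cannot limit the input length
fname[strcspn(fname, "\n")] = '\0'; // drop the newline that fgets() keeps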
fp=fopen(fname,"r");
if(fp==0)
printf("file doesnot exist\n");
else
{
printf("File read successfully\n");
do
{
if(fgets(myCurLine,MAX_LINE,fp) != NULL)
{
index = 0;
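for(loop = 0; myCurLine[loop] != '\0'; loop++) // stop at the end of the string, not at MAX_LINE
{
if((myCurLine[loop] == ' ') || (myCurLine[loop] == '\n'))
{
// a delimiter ends the current word; it is not stored in the word
if(index > 4)
{
putchar(myCurWord[4]);
}
index = 0;
}
else
{
myCurWord[index] = myCurLine[loop];
index++;
}
}
if(index > 4) // last word of a line that has no trailing newline
{
putchar(myCurWord[4]);
}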
}
if(feof(fp))
{
break;
}
}while(1);
fclose(fp);
}
return 0;
}
I edited your code because of these reasons:
You don't need to use char array at all, since you're only checking for letters and you can count the letters in every word of the file (which can be checked using spaces) and print when your count reaches 4 (since we start at 0).
Since gets() has no overflow protection, fgets() is preferred.
fgets(fname, sizeof(fname), stdin);
Another point: you can simplify your do-while loop into a single while loop that stops at EOF, since your do-while is simply an infinite loop (its condition is 1) that breaks at EOF, which is checked in a separate if inside the loop.
while (!feof(fp))
An alternative to char array is to loop until a space ' ' or newline '\n' is found.
I also removed the else from if (fp==0) to avoid too many indents.
I also added ctype.h to check if the 5th letter is really a letter using isalpha().
This is how the word's 5th letter search works:
Loop (outer loop) until end-of-file (EOF).
In each iteration of outer loop, loop (inner loop) until a space ' ' or newline '\n' is found.
If the counter in inner loop reaches 4 (which means 5th letter is reached),
print the current letter,
reset counter to zero,
then break the inner loop.
Applying those edits to your code,
#include<stdio.h>
#include<conio.h>
#include<ctype.h>
#include<string.h> // for strtok()
int main(void){
char fname[30];
char ch; // changed from array to single char
int i=0;
FILE *fp;
printf("enter file name with path\n");
fgets(fname, sizeof(fname), stdin); // changed from gets()
// removes the \n from the input, or else the file won't be located
// I never encountered a file with newlines in its name.
strtok(fname, "\n");
fp=fopen(fname,"r");
// if file open failed,
// it tells the user that the file doesn't exist,
// then ends the program
if (!fp) {
printf("file does not exist\n");
return -1;
}
// loops until end-of-file
while (!feof(fp))
// loops until space, newline, end-of-file, or when 5th letter is found
for (i = 0; (ch=getc(fp)) != ' ' && ch != '\n' && !feof(fp); i++)
// if 5th character is reached and it is a letter
if (i == 4 && isalpha(ch)) {
putc(ch ,stdout);
// resets letter counter, ends the loop
i = 0;
break;
}
fclose(fp);
return 0;
}
*Note: Words with fewer than 5 letters are not included in the output, but you could print a placeholder character or number (such as 0 or -1) to indicate that a word has fewer than 5 letters.
sample read.txt:
reading write love
coder heart
stack overflow
output:
enter file name with path
read.txt
iertkf

Scanf skipped in loop (Hangman)

This program asks for a secret string, then asks the user to repeatedly guess single chars of that string until the whole string has been guessed. It works; however, every second time the while loop runs, it skips the user input for the guessed char. How do I fix this?
int main(){
char guess;
char test2 [50];
char * s = test2;
char output [50];
char * t = output;
printf("Enter the secret string:\n");
fgets(test2, 50, stdin);
for (int i=0;i<49;i++){ //fills ouput with _ spaces
*(output +i)='_';
while(strcmp(s,t) != 0){
printf("Enter a guess:");
scanf("%c",&guess);
printf("You entered: %c\n", guess);
showGuess(guess,s, t ); // makes a string "output" with guesses in it
printf("%s\n",t);
}
printf("Well Done!");
}
For a quick and dirty solution try
// the space in the format string consumes optional spaces, tabs, enters
if (scanf(" %c", &guess) != 1) /* error */;
For a better solution redo your code to use fgets() and then parse the input.
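A rough sketch of the fgets() approach, assuming a helper called read_guess (a hypothetical name) and a 64-byte line buffer chosen arbitrarily. Because the whole line is read at once, the newline never stays behind for the next read:

```c
#include <stdio.h>

/* Read one line from `in` and return its first character.
 * Returns EOF at end of input. An empty line yields '\n',
 * which the caller can reject as an invalid guess.
 * read_guess is a hypothetical helper name. */
int read_guess(FILE *in)
{
    char line[64];  /* assumed to be long enough for one guess */

    if (fgets(line, sizeof line, in) == NULL)
        return EOF;
    return (unsigned char)line[0];
}
```

In the loop, int guess = read_guess(stdin); would take the place of the scanf call. If the user types more than 63 characters, the rest of that line is still pending for the next read, so this is only a starting point.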
As pointed out in some other answers and comments, you need to "consume" the "newline character" in the input.
The reason is that keyboard input is line-buffered by your terminal, so the program won't see anything until you press Enter and the terminal passes the content of its buffer to the program. At that point the program can read the buffered data: your input, followed by the character used to validate it, the newline. If you don't "consume" the newline before you do another scanf, that second scanf will read the newline character, resulting in the "skipped scanf" you've witnessed. To consume the extra character(s) from the input, the best way is to read them and discard what you read. That is what the code below does; notice the
while(getc(stdin) != '\n');
line after your scanf. What this line does is: "while the character read from stdin is not '\n', do nothing and loop."
As an alternative, you could tell your terminal not to buffer the input, via the termios(3) functions, or you could use either of the curses/ncurses libraries for the I/O.
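One caveat with the bare while(getc(stdin) != '\n'); loop is that it never ends if the input is closed before a newline arrives. A small sketch of the same idea with an EOF check, using a hypothetical helper name discard_line:

```c
#include <stdio.h>

/* Discard the rest of the current input line, including the '\n'.
 * Also stops at end of input, so it cannot loop forever on EOF.
 * discard_line is a hypothetical helper name. */
void discard_line(FILE *in)
{
    int c;  /* int, not char, so that EOF can be told apart from data */

    while ((c = getc(in)) != '\n' && c != EOF)
        ;   /* nothing to do: the character is simply dropped */
}
```

Calling discard_line(stdin); right after the scanf has the same effect as the one-line loop for normal keyboard input.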
So here is what you want:
int main(){
char guess;
char test2 [50];
char * s = test2; // 3. Useless
char output [50];
char * t = output; // 3. Useless
int i; // 8. i shall be declared here.
printf("Enter the secret string:\n");
fgets(test2, 50, stdin);
for (i=0;i<50;i++) if (test2[i] == '\n') test2[i] = '\0'; // 4. Remove the newline char and terminate the string where the newline char is.
for (int i=0;i<49;i++){ // 5. You should use memset here; 8. You should not declare 'i' here.
*(output +i)='_';
} // 1. Either you close the block here, or you don't open one for just one line.
output[49] = '\0'; // 6. You need to terminate your output string.
while(strcmp(s,t) != 0){ // 7. That will never work in the current state.
printf("Enter a guess:");
scanf("%c",&guess);
while(getc(stdin) != '\n');
printf("You entered: %c\n", guess);
showGuess(guess,s, t );
printf("%s\n",t);
}
printf("Well Done!");
return 0; // 2. int main requires that.
}
Other comments on your code:
1. You opened a block after your for loop and never closed it. That might be causing problems.
2. You declared your main as a function returning an integer, so you should at least return 0; at the end.
3. You seem to think that char * t = output; copies output's value and uses t as a name for the new copy. This is wrong. You are indeed copying something, but you only copy the address of output into t. As a result, output and t refer to the same data: if you modify output, t is modified too, and vice versa. In other words, the t and s variables are useless in the current state.
4. You also need to remove the newline character from your input in the test2 buffer. I have added a line after the fgets for that.
5. Instead of setting all the bytes of an array "by hand", consider using the memset function.
6. You need to actually terminate the output string after you fill it, so you should store a '\0' in the last position.
7. You will never be able to match the test2 string with the output one, since output is filled with underscores up to its end, while test2 is NUL-terminated right after its meaningful content.
8. While variables declared at loop scope are valid in C99 and C11, they are not allowed in ANSI C (C89); and it is usually better to not declare any variable in a loop.
Also, "_ spaces" are called "underscores" ;)
Here is code that does what you want:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define LEN 50
int main()
{
char phrase[LEN];
char guessed[LEN];
char guess;
int i, tries = 0;
puts("Please enter the secret string:");
if(fgets(phrase, LEN, stdin) == NULL)
return 1;
for(i = 0; i < LEN && phrase[i] != '\n'; i++); // Detect the end of input data.
for(; i < LEN; i++) // For the rest of the input data,
phrase[i] = '_'; // fill with underscores (so it can be compared with 'guessed' in the while loop).
phrase[LEN - 1] = '\0'; // NULL terminate 'phrase'
memset(guessed, '_', LEN); // Fill 'guessed' with underscores.
guessed[LEN - 1] = '\0'; // NULL terminate 'guessed'
while(strcmp(phrase, guessed) != 0) // While 'phrase' and 'guessed' differ
{
puts("Enter a guess (one character only):");
if(scanf("%c", &guess) != 1)
{
puts("Error while parsing stdin.");
continue;
}
if(guess == '\n')
{
puts("Invalid input.");
continue;
}
while(getc(stdin) != '\n'); // "Eat" the extra remaining characters in the input.
printf("You entered: %c\n", guess);
for(i = 0; i < LEN; i++) // For the total size,
if(phrase[i] == guess) // if guess is found in 'phrase'
guessed[i] = guess; // set the same letters in 'guessed'
printf("Guessed so far: %s\n", guessed);
tries++;
}
printf("Well played! (%d tries)\n", tries);
return 0;
}
Feel free to ask questions in the comments, if you are not getting something. :)
Newline character entered in the previous iteration is being read by scanf. You can take in the '\n' by using the getc() as follows:
scanf("%c",&guess);
getc(stdin);
..
This change worked for me, though the right explanation and the cleaner code are the ones given by #7heo.tk
Change
scanf("%c",&guess);
with
scanf(" %c",&guess);
The leading space makes scanf skip any whitespace, including the '\n' left over from the previous input.

C file handling query

I have a program that takes user input and compares it to a specific line in a file. However, the final line is always marked as incorrect. Can someone solve this for me? Thanks.
File content (just a list of random words)
Baby
Milk
Car
Face
Library
Disc
Lollipop
Suck
Food
Pig
(libraries are stdio,conio and string)
char text[100], blank[100];
int c = 0, d = 0;
void space(void);
int main()
{
int loop = 0;
char str[512];
char string[512];
int line = 1;
int dis = 1;
int score = 0;
char text[64];
FILE *fd;
fd = fopen("Student Usernames.txt", "r"); // Should be test
if (fd == NULL)
{
printf("Failed to open file\n");
exit(1);
}
do
{
printf("Enter the string: ");
gets(text);
while (text[c] != '\0')
{
if (!(text[c] == ' ' && text[c] == ' '))
{
string[d] = text[c];
d++;
}
c++;
}
string[d] = '\0';
printf("Text after removing blanks\n%s\n", string);
getch();
for(loop = 0;loop<line;++loop)
{
fgets(str, sizeof(str), fd);
}
printf("\nLine %d: %s\n", dis, str);
dis=dis+1;
str[strlen(str)-1] = '\0';
if(strcmp(string,str) == 0 )
{
printf("Match\n");
score=score+2;
}
else
{
printf("Nope\n");
score=score+1;
}
getch();
c=0;
d=0;
}
while(!feof(fd));
printf("Score: %d",score);
getch();
}
For any input on the last line, the output will always be incorrect, I believe this is something to do with the for loop not turning it into the next variable, but seeing as the <= notation makes this program worse, I really just need a simple fix for the program thanks.
Some observations:
You must never use gets (it is not even in the C11 standard anymore). Instead of gets(text) use fgets(text, sizeof(text), stdin) – this way a long input will not overflow the text array.
There will be stuff printed at the end because you don't check the return value of either the gets or the fgets, so when end of file occurs for either the file or for user input the rest of that iteration still runs. fgets returns NULL if it didn't read anything – check for that instead of using feof.
You remove newlines from the file input but not from the user input, so the comparison will always fail when you switch from gets to fgets (which doesn't strip linefeeds). The second (otherwise pointless) comparison of text[c] against ' ' should be against '\n', and the && between the two comparisons should become ||, since a character cannot be both a space and a linefeed.
edit: Also, in case the last line of your file does not end in a linefeed, the comparison will fail on the last line because you don't check if the last character is a linefeed before you remove it.
The for (loop = 0; loop < line; ++loop) loop is pointless because line is always 1, so the body is only executed once.
You have unnecessary global variables, which make the program hard to follow. For instance, your local text[64] shadows the global text[100], so if you think you are modifying the global buffer, you are not. If your code is complete, none of the variables should be global.
The function getch() is non-standard. There is no easy direct replacement, so you may just accept that you are not writing portable code, but it's something to be aware of.
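Putting these observations together, the comparison loop might look something like the sketch below. The names strip_blanks and run_quiz are hypothetical, the prompts and getch() calls are left out, and the scoring (2 points for a match, 1 otherwise) is taken from the question:

```c
#include <stdio.h>
#include <string.h>

/* Copy src to dst without spaces and line terminators.
 * dst must be at least as large as src. */
void strip_blanks(char *dst, const char *src)
{
    for (; *src != '\0'; src++)
        if (*src != ' ' && *src != '\n' && *src != '\r')
            *dst++ = *src;
    *dst = '\0';
}

/* Compare one answer line against each word line:
 * 2 points for a match, 1 point otherwise.
 * Stops as soon as either stream has no more lines,
 * using the return value of fgets() rather than feof(). */
int run_quiz(FILE *words, FILE *answers)
{
    char word[512], answer[512], cleaned[512];
    int score = 0;

    while (fgets(word, sizeof word, words) != NULL &&
           fgets(answer, sizeof answer, answers) != NULL) {
        word[strcspn(word, "\r\n")] = '\0';  /* strips the newline only if present */
        strip_blanks(cleaned, answer);
        score += (strcmp(cleaned, word) == 0) ? 2 : 1;
    }
    return score;
}
```

In the original program this would be called as run_quiz(fd, stdin). Everything is local, and the last line is handled the same way whether or not the file ends with a linefeed.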

Counting the occurrences of a word from a text file in C

The following code is intended to find the occurrences of the word 'if' from a text file chosen by the user, however the result after exiting the loop is always 0. Question is how it could be possibly fixed.
#include<stdio.h>
#include<conio.h>
#include<string.h>
int main() {
FILE * f;
int count = 0, i;
char buf[50], read[100];
printf("Which file to open\n");
fgets(buf, 50, stdin);
buf[strlen(buf) - 1] = '\0';
if (!(f = fopen(buf, "rt"))) {
printf("Wrong file name");
} else printf("File opened successfully\n");
for (i = 0; fgets(read, 100, f) != NULL; i++) {
if (read[i] == 'if') count++;
}
printf("Result is %d", count);
getch();
return 0;
}
'if' isn't what you think it is; it's a multicharacter literal, not a string.
You can't compare strings with == in C. Use strcmp(3).
Your loop doesn't look like it does what you want either; time to break out a debugger (and probably strtok(3)).
Your if test is wrong.
if (read[i]=='if') /* no */
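Use strcmp on a string, not on read[i] (which is a single char). Here word is a placeholder for a word taken from the line:
if (strcmp(word, "if") == 0) /* check if the strings are equal */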
For one thing, read[i] only contains a single character and will never be equal to any multi-character word.
For another, single quotes are used to define single characters. 'if' is not a string of characters.
You will need to parse each line to find each individual word, and then compare each word to the target word using something like strcmp(), or a case-insensitive comparison such as the non-standard stricmp().
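As a rough illustration of that parsing, here is a sketch that splits each line with strtok() and compares every word with strcmp(). The function name count_word and the choice of delimiter characters are assumptions on my part:

```c
#include <stdio.h>
#include <string.h>

/* Count how often `target` appears as a whole word in `in`.
 * Words are split on whitespace and some common punctuation
 * (the delimiter set is an assumption, adjust as needed).
 * count_word is a hypothetical name. */
int count_word(FILE *in, const char *target)
{
    static const char delims[] = " \t\r\n.,;:!?()[]{}\"'";
    char line[256];
    int count = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        char *word;
        for (word = strtok(line, delims); word != NULL;
             word = strtok(NULL, delims))
            if (strcmp(word, target) == 0)  /* exact, case-sensitive match */
                count++;
    }
    return count;
}
```

In the question's program, count = count_word(f, "if"); would replace the for loop. A word that straddles the 255-character buffer boundary would be split, so very long lines may need a larger buffer.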

rle compression algorithm c

I have to implement an RLE algorithm in C with an escape character (Q).
For example, if I have an input like: AAAAAAABBBCCCDDDDDDEFG
the output has to be: QA7BBBCCCQD6EFG
This is the code that I wrote:
#include <stdio.h>
#include <stdlib.h>
void main()
{
FILE *source = fopen("Test.txt", "r");
FILE *destination = fopen("Dest.txt", "w");
char carCorrente; //in english: currentChar
char carSucc; // in english: nextChar
int count = 1;
while(fread(&carCorrente, sizeof(char),1, source) != 0) {
if (fread(&carCorrente, sizeof(char),1, source) == 0){
if(count<=3){
for(int i=0;i<count;i++){
fprintf(destination,"%c",carCorrente);
}
}
else {
fwrite("Q",sizeof(char),1,destination);
fprintf(destination,"%c",carCorrente);
fprintf(destination,"%d",count);
}
break;
}
else fseek(source,-1*sizeof(char), SEEK_CUR);
while (fread(&carSucc, sizeof(char), 1, source) != 0) {
if (carCorrente == carSucc) {
count++;
}
else {
if(count<=3){
for(int i=0;i<count;i++){
fprintf(destination,"%c",carCorrente);
}
}
else {
fwrite("Q",sizeof(char),1,destination);
fprintf(destination,"%c",carCorrente);
fprintf(destination,"%d",count);
}
count = 1;
goto OUT;
}
}
OUT:fseek(source,-1*sizeof(char), SEEK_CUR); //exit 2° while
}
}
The problem is when I have an input like this: ABBBCCCDDDDDEFGD
In this case the output is: QB4CCCQD5FFDD
and I don't know why :(
There is no need to use fseek to rewind as you have done. Here is code that I have written without it, using a simple counter and the current sequence character.
C implementation:
#include<stdio.h>
#include<stdlib.h>
void main()
{
FILE *source = fopen("Test.txt", "r");
FILE *destination = fopen("Dest.txt", "w");
char currentChar;
char seqChar = '\0'; // initialized so the first comparison is well defined
int count = 0;
while(1) {
int flag = (fread(&currentChar, sizeof(char),1, source) == 0);
if(flag||seqChar!=currentChar) {
if(count>3) {
char ch = 'Q';
int k = count;
char str[100];
int digits = sprintf(str,"%d",count);
fwrite(&ch,sizeof(ch),1,destination);
fwrite(&seqChar,sizeof(ch),1,destination);
fwrite(&str,sizeof(char)*digits,1,destination);
}
else {
for(int i=0;i<count;i++)
fwrite(&seqChar,sizeof(char),1,destination);
}
seqChar = currentChar;
count =1;
}
else count++;
if(flag)
break;
}
fclose(source);
fclose(destination);
}
Your code has various problems. First, I'm not sure whether you should read straight from the file. In your case, it might be better to read the source string to a text buffer first with fgets and then do the encoding. (I think in your assignment, you should only encode letters. If source is a regular text file, it will have at least one newline.)
But let's assume that you need to read straight from the disk: you don't have to go backwards. You already have two variables for the current and the next char. Read the next char from disk once. Before reading a further "next char", assign the old one to the current char:
int carSucc, carCorr; // should be ints for getc
carSucc = getc(source); // read next character once before loop
while (carSucc != EOF) { // test for end of input stream
carCorr = carSucc; // this turn's char is last turn's "next"
carSucc = getc(source);
// ... encode ...
}
Going forward and backward makes the loop complicated. Besides, what happens if the second read reads zero characters, i.e. has reached the end of the file? Then you seek back once and go into the second loop. That doesn't look as if it was intended.
Try to go only forward, and use the loop above as base for your encoding.
I think the major problem in your approach is that it's way too complicated with multiple different places where you read input and seek around in the input. RLE can be done in one pass, there should not be a need to seek to the previous characters. One way to solve this is to change the logic into looking at the previous characters and how many times they have been repeated, instead of trying to look ahead at future characters. For instance:
int repeatCount = 0;
int previousChar = EOF;
int currentChar; // type changed to 'int' for fgetc input
while ((currentChar = fgetc(source)) != EOF) {
if (currentChar != previousChar) {
// print out the previous run of repeated characters
outputRLE(previousChar, repeatCount, destination);
// start a new run with the current character
previousChar = currentChar;
repeatCount = 1;
} else {
// same character repeated
++repeatCount;
}
}
// output the final run of characters at end of input
outputRLE(previousChar, repeatCount, destination);
Then you can just implement outputRLE to do the output to print out a run of the character c repeated count times (note that count can be 0); here's the function declaration:
void outputRLE(const int c, const int count, FILE * const destination)
You can do it pretty much the same way as in your current code, although it can be simplified greatly by combining the fwrite and two fprintfs to a single fprintf. Also, you might want to think what happens if the escape character 'Q' appears in the input, or if there is a run of 10 or more repeated characters. Deal with those cases in outputRLE.
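For illustration, one possible outputRLE, together with the loop above wrapped in a function. The wrapper name encodeRLE is hypothetical, and this sketch deliberately does not handle a 'Q' in the input or make counts of 10 or more unambiguous for a decoder:

```c
#include <stdio.h>

/* Write one run: runs of up to 3 characters are copied literally,
 * longer runs become 'Q', the character, then the decimal count. */
void outputRLE(const int c, const int count, FILE * const destination)
{
    int i;

    if (count <= 3) {
        for (i = 0; i < count; i++)  /* count == 0 writes nothing */
            fputc(c, destination);
    } else {
        fprintf(destination, "Q%c%d", c, count);
    }
}

/* Single forward pass over the input, no seeking.
 * encodeRLE is a hypothetical wrapper name. */
void encodeRLE(FILE *source, FILE *destination)
{
    int repeatCount = 0;
    int previousChar = EOF;
    int currentChar;  /* int so that EOF can be detected */

    while ((currentChar = fgetc(source)) != EOF) {
        if (currentChar != previousChar) {
            /* print out the previous run, then start a new one */
            outputRLE(previousChar, repeatCount, destination);
            previousChar = currentChar;
            repeatCount = 1;
        } else {
            ++repeatCount;
        }
    }
    /* output the final run of characters at end of input */
    outputRLE(previousChar, repeatCount, destination);
}
```

With the question's first input, AAAAAAABBBCCCDDDDDDEFG, this writes QA7BBBCCCQD6EFG.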
An unrelated problem in your code is that the return type of main should be int, not void.
Thank you so much, i fixed my algorithm.
The problem was a variable, in the first if after the while.
Before
if (fread(&carCorrente, sizeof(char),1, source) == 0)
now
if (fread(&carSucc, sizeof(char),1, source) == 0){
My whole algorithm is certainly wild. I mean, it is far too slow!
I ran a test with my version and with Vikram Bhat's version, and I saw how much time my algorithm loses.
With getc() I can surely save more time.
Now I'm thinking about the decoding (decompression), and I can see a little problem.
example:
if i have an input like: QA7QQBQ33TQQ10QQQ
how can i recognize which is the escape character ???
thanks
