C how to search string in a file? - c

I have a problem with my code, I'm trying to search a string in a file and I can read it but, when I compare two strings it takes only the last one of the file as equal to the the first string entered with the scanf().
So imagine I wrote in my file three words and each one is returning to the line.
test
test12
test123
If in my scanf() I write test12 for example or test when it's going to read it will return false to the compare so (!== 0). But if I write test123 it will works because it's the last word of the file but I don't know why?
char word[26];
char singleLine[26];
FILE *file = fopen("bin/Release/myWords.txt", "a+");
scanf("%26s", word);
if (file != NULL) {
while (!feof(file)) {
fgets(singleLine, 26, file);
compare = strcmp(singleLine, word);
if (compare == 0) {
printf("\n%s\n",word);
}
}
fclose(file);
}

Your program only works in very special cases and has several problems:
scanf("%26s", word); may affect up to 27 bytes in the destination array, which is defined with a length of only 26.
furthermore, you should check the return value to avoid undefined behavior on invalid input.
fopen("bin/Release/myWords.txt", "a+"); opens the file in append mode: is this necessary?
while (!feof(file)) is always wrong, you should instead check the return value of fgets() that returns NULL at end of file.
compare = strcmp(singleLine, word); only compares for an exact math of the full line, which can only happen if the word has 25 characters, otherwise the trailing newline in singleLine will cause the comparison to fail. Furthermore, broken lines may cause unexpected results, as well as if the file does not end with a newline.
the reason it matches the last word in the file is you forget to write a trailing newline after the last word in the file, so the last fgets() fills the buffer with the exact word and no trailing newline.
if you search for matches inside the line, you should use a larger buffer and search for a match with strstr.
if you search for a exact match, you should strip the trailing newline before the comparison.
Here is a modified version:
#include <stdio.h>
#include <string.h>
int main() {
char word[27];
char singleLine[256];
FILE *file = fopen("bin/Release/myWords.txt", "r");
if (scanf("%26s", word) != 1)
return 1;
if (file != NULL) {
while (fgets(singleLine, sizeof singleLine, file)) {
singleLine[strcspn(singleLine, "\n")] = '\0'; // strip the newline if any
compare = strcmp(singleLine, word);
if (compare == 0) {
printf("\n%s\n", word);
}
}
fclose(file);
}
return 0;
}

Related

How to get each string within a buffer fetched with "getline" from a file in C

I'm trying to read every string separated with commas, dots or whitespaces from every line of a text from a file (I'm just receiving alphanumeric characters with scanf for simplicity). I'm using the getline function from <stdio.h> library and it reads the line just fine. But when I try to "iterate" over the buffer that was fetched with it, it always returns the first string read from the file. Let's suppose I have a file called "entry.txt" with the following content:
test1234 test hello
another test2
And my "main.c" contains the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_WORD 500
int main()
{
FILE *fp;
int currentLine = 1;
size_t characters, maxLine = MAX_WORD * 500;
/* Buffer can keep up to 500 words of 500 characters each */
char *word = (char *)malloc(MAX_WORD * sizeof(char)), *buffer = (char *)malloc((int)maxLine * sizeof(char));
fp = fopen("entry.txt", "r");
if (fp == NULL) {
return 1;
}
for (currentLine = 1; (characters = getline(&buffer, &maxLine, fp)) != -1; currentLine++)
{
/* This line gets "test1234" onto "word" variable, as expected */
sscanf(buffer, "%[a-zA-Z_0-9]", word);
printf("%s", word); // As expected
/* This line should get "test" string, but again it obtains "test1234" from the buffer */
sscanf(buffer, "%[a-zA-Z_0-9]", word);
printf("%s", word); // Not intended...
// Do some stuff with the "word" and "currentLine" variables...
}
return 0;
}
What happens is that I'm trying to get every alphanumeric string (namely word from now on) in sequence from the buffer, when the sscanf function just gives me the first occurrence of a word within the specified buffer string. Also, every line on the entry file can contain an unknown amount of words separated by either whitespaces, commas, dots, special characters, etc.
I'm obtaining every line from the file separately with "getline" because I need to get every word from every line and store it in other place with the "currentLine" variable, so I'll know from which line a given word would've come. Any ideas of how to do that?
fscanf has an input stream argument. A stream can change its state, so that the second call to fscanf reads a different thing. For example:
fscanf(stdin, "%s", str1); // str1 contains some string; stdin advances
fscanf(stdin, "%s", str2); // str2 contains some other sting
scanf does not have a stream argument, but it has a global stream to work with, so it works exactly like fscanf(stdin, ...).
sscanf does not have a stream argument, nor there is any global state to keep track of what was read. There is an input string. You scan it, some characters get converted, and... nothing else changes. The string remains the same string (how could it possibly be otherwise?) and no information about how far the scan has advanced is stored anywhere.
sscanf(buffer, "%s", str1); // str1 contains some string; nothing else changes
sscanf(buffer, "%s", str2); // str2 contains the same sting
So what does a poor programmer fo?
Well I lied. No information about how far the scan has advanced is stored anywhere only if you don't request it.
int nchars;
sscanf(buffer, "%s%n", str1, &nchars); // str1 contains some string;
// nchars contains number of characters consumed
sscanf(buffer+nchars, "%s", str2); // str2 contains some other string
Error handling and %s field widths omitted for brevity. You should never omit them in real code.

C: How to read a string from a file and assigning it to an element from a structure [duplicate]

I have a file with multiple strings, each string on a separate line. All strings are 32 character long (so 33 with the '\n' at the end).
I am trying to read all the strings. For now, I just want to read them and not store them as follows:
char line[32];
while (!feof(fp)) {
fgets(line, 32, fp);
}
printf("%s", line);
This prints out zero. Why isn't it working?
Furthermore, I am trying to store a null terminator at the end of each string read. I changed the line array to length 33 but how would I make it that if '\n' is found, replace it with \0 and store that?
You code isn't working because you are only allocating space for lines of 30 characters plus a newline and a null terminator, and because you are only printing out one line after feof() returns true.
Additionally, feof() returns true only after you have tried and failed to read past the end of file. This means that while (!feof(fp)) is generally incorrect - you should simply read until the reading function fails - at that point you can use feof() / ferror() to distinguish between end-of-file and other types of failures (if you need to). So, you code could look like:
char line[34];
while (fgets(line, 34, fp) != NULL) {
printf("%s", line);
}
If you wish to find the first '\n' character in line, and replace it with '\0', you can use strchr() from <string.h>:
char *p;
p = strchr(line, '\n');
if (p != NULL)
*p = '\0';
Here is a basic approach:
// create an line array of length 33 (32 characters plus space for the terminating \0)
char line[33];
// read the lines from the file
while (!feof(fp)) {
// put the first 32 characters of the line into 'line'
fgets(line, 32, fp);
// put a '\0' at the end to terminate the string
line[32] = '\0';
// print the result
printf("%s\n", line);
}
It goes something like this:
char str[33]; //Remember the NULL terminator
while(!feof(fp)) {
fgets(str, 33, fp);
printf("%s\n",str);
}

How to save every line in file (IN C) in a variable? :)

I need to save every line of text file in c in a variable.
Here's my code
int main()
{
char firstname[100];
char lastname[100];
char string_0[256];
char string[256] = "Vanilla Twilight";
char string2[256];
FILE *file;
file = fopen("record.txt","r");
while(fgets(string_0,256,file) != NULL)
{
fgets(string2, 256, file);
printf("%s\n", string2);
if(strcmp(string, string2)==0)
printf("A match has been found");
}
fclose(file);
return 0;
}
Some lines are stored in the variable and printed on the cmd but some are skipped.
What should I do? When I tried sscanf(), all lines were complete but only the first word of each line is printed. I also tried ffscanf() but isn't working too. In fgets(), words per line are complete, but as I've said, some lines are skipped (even the first line).
I'm just a beginner in programming, so I really need help. :(
You're skipping over the check every odd number of lines, as you have two successive fgets() calls and only one strcmp(). Reduce your code to
while(fgets(string_0,256,file) != NULL)
{
if( ! strcmp(string_0, string2) )
printf("A match has been found\n");
}
FWIW, fgets() reads and stores the trailing newline, which can cause problem is string comparison, you need to take care of that, too.
As a note, you should always check the return value of fopen() for success before using the returned pointer.

fscanf() how to go in the next line?

So I have a wall of text in a file and I need to recognize some words that are between the $ sign and call them as numbers then print the modified text in another file along with what the numbers correspond to.
Also lines are not defined and columns should be max 80 characters.
Ex:
I $like$ cats.
I [1] cats.
[1] --> like
That's what I did:
#include <stdio.h>
#include <stdlib.h>
#define N 80
#define MAX 9999
int main()
{
FILE *fp;
int i=0,count=0;
char matr[MAX][N];
if((fp = fopen("text.txt","r")) == NULL){
printf("Error.");
exit(EXIT_FAILURE);
}
while((fscanf(fp,"%s",matr[i])) != EOF){
printf("%s ",matr[i]);
if(matr[i] == '\0')
printf("\n");
//I was thinking maybe to find two $ but Idk how to replace the entire word
/*
if(matr[i] == '$')
count++;
if(count == 2){
...code...
}
*/
i++;
}
fclose(fp);
return 0;
}
My problem is that fscanf doesn't recognize '\0' so it doesn't go in the next line when I print the array..also I don't know how to replace $word$ with a number.
Not only will fscanf("%s") read one whitespace-delimited string at a time, it will also eat all whitespace between those strings, including line terminators. If you want to reproduce the input whitespace in the output, as your example suggests you do, then you need a different approach.
Also lines are not defined and columns should be max 80 characters.
I take that to mean the number of lines is not known in advance, and that it is acceptable to assume that no line will contain more than 80 characters (not counting any line terminator).
When you say
My problem is that fscanf doesn't recognize '\0' so it doesn't go in the next line when I print the array
I suppose you're talking about this code:
char matr[MAX][N];
/* ... */
if(matr[i] == '\0')
Given that declaration for matr, the given condition will always evaluate to false, regardless of any other consideration. fscanf() does not factor in at all. The type of matr[i] is char[N], an array of N elements of type char. That evaluates to a pointer to the first element of the array, which pointer will never be NULL. It looks like you're trying to determine when to write a newline, but nothing remotely resembling this approach can do that.
I suggest you start by taking #Barmar's advice to read line-by-line via fgets(). That might look like so:
char line[N+2]; /* N + 2 leaves space for both newline and string terminator */
if (fgets(line, sizeof(line), fp) != NULL) {
/* one line read; handle it ... */
} else {
/* handle end-of-file or I/O error */
}
Then for each line you read, parse out the "$word$" tokens by whatever means you like, and output the needed results (everything but the $-delimited tokens verbatim; the bracket substitution number for each token). Of course, you'll need to memorialize the substitution tokens for later output. Remember to make copies of those, as the buffer will be overwritten on each read (if done as I suggest above).
fscanf() does recognize '\0', under select circumstances, but that is not the issue here.
Code needs to detect '\n'. fscanf(fp,"%s"... will not do that. The first thing "%s" directs is to consume (and not save) any leading white-space including '\n'. Read a line of text with fgets().
Simple read 1 line at a time. Then march down the buffer looking for words.
Following uses "%n" to track how far in the buffer scanning stopped.
// more room for \n \0
#define BUF_SIZE (N + 1 + 1)
char buffer[BUF_SIZE];
while (fgets(buffer, sizeof buffer, stdin) != NULL) {
char *p = buffer;
char word[sizeof buffer];
int n;
while (sscanf(p, "%s%n", word, &n) == 1) {
// do something with word
if (strcmp(word, "$zero$") == 0) fputs("0", stdout);
else if (strcmp(word, "$one$") == 0) fputs("1", stdout);
else fputs(word, stdout);
fputc(' ', stdout);
p += n;
}
fputc('\n', stdout);
}
Use fread() to read the file contents to a char[] buffer. Then iterate through this buffer and whenever you find a $ you perform a strncmp to detect with which value to replace it (keep in mind, that there is a 2nd $ at the end of the word). To replace $word$ with a number you need to either shrink or extend the buffer at the position of the word - this depends on the string size of the number in ascii format (look solutions up on google, normally you should be able to use memmove). Then you can write the number to the cave, that arose from extending the buffer (just overwrite the $word$ aswell).
Then write the buffer to the file, overwriting all its previous contents.

Reading from two columns into variables

I'm writing a C program that takes an input file and stores it. The input file has two columns, with an integer in the first and a string in the second, like so:
12 apple
17 frog
20 grass
I've tried using fgets to take an entire line as a string then break it apart using scanf but I'm getting lots of issues. I have searched quite a lot but haven't found anything that answers my question, but sorry if I missed something obvious.
This is the code that I've been trying:
while(fgets(line, sizeof(line), fp))
{
scanf(line, "%d\t%s", &key, value);
insert(key, value, newdict);
}
Let's have a quick go at doing with strtok since someone mentioned it. Let's imagine your file is called file.txt and has the following contents:
10 aaa
20 bbb
30 ccc
This is how we can parse it:
#include <stdio.h>
#include <string.h>
#define MAX_NUMBER_OF_LINES 10 // parse a maximum of 10 lines
#define MAX_LINE_SIZE 50 // parse a maximum of 50 chars per line
int main ()
{
FILE* fh = fopen("file.txt", "r"); // open the file
char temp[MAX_LINE_SIZE]; // some buffer storage for each line
// storage for MAX_NUMBER_OF_LINES integers
int d_out[MAX_NUMBER_OF_LINES];
// storage for MAX_NUMBER_OF_LINES strings each MAX_LINE_SIZE chars long
char s_out[MAX_NUMBER_OF_LINES][MAX_LINE_SIZE];
// i is a special variable that tells us if we're parsing a number or a string (0 for num, 1 for string)
// di and si are indices to keep track of which line we're currently handling
int i = 0, di = 0, si = 0;
while (fgets(temp, MAX_LINE_SIZE, fh) && di < MAX_NUMBER_OF_LINES) // read the input file and parse the string
{
temp[strlen(temp) -1] = '\0'; // get rid of the newline in the buffer
char* c = strtok(temp, " "); // set the delimiters
while(c != NULL)
{
if (i == 0) // i equal to 0 means we're parsing a number
{
i = 1; // next we'll parse a string, let's indicate that
sscanf(c, "%d", &d_out[di++]);
}
else // i must be 1 parsing a string
{
i = 0; // next we'll parse a number
sprintf(s_out[si++], "%s", c);
}
c = strtok(NULL, " ");
}
printf("%d %s\n", d_out[di -1], s_out[si - 1]); // print what we've extracted
}
fclose(fh);
return 0;
}
This will extract the contents from the file and store them in respective arrays, we then print them and get back our original contents:
$ ./a.out
10 aaa
20 bbb
30 ccc
Use:
fgets (name, 100, stdin);
100 is the max length of the buffer. You should adjust it as per your need.
Use:
scanf ("%[^\n]%*c", name);
The [] is the scanset character. [^\n] tells that while the input is not a newline ('\n') take input. Then with the %*c it reads the newline character from the input buffer (which is not read), and the * indicates that this read in input is discarded (assignment suppression), as you do not need it, and this newline in the buffer does not create any problem for next inputs that you might take.
The problem here seems to be that you are reading from the file twice. First with fgets and then with scanf. You will probably not get an errors from the compiler in your use of scanf, but should be getting warnings as you use line for the format string and the other arguments does not match the format. It would also be pretty obvious if you checked the return value from scanf, as it returns the number of successfully scanned items. Your call would most likely return zero (or minus one when you have hit end of file).
You should be using sscanf instead to parse the line you read with fgets.
See e.g. this reference for the different scanf variants.
Your problem can be solved by using sscanf (with the support of getline) like below:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
FILE *fp;
char *line = NULL;
size_t len = 0;
ssize_t read;
/* tokens bags */
char tok_str[255];
int tok_int;
fp = fopen("./file.txt", "r");
if (fp == NULL)
exit(EXIT_FAILURE);
/* Reads the line from the stream. */
while ((read = getline(&line, &len, fp)) != -1) {
/* Scans the character string pointed by line, according to given format. */
sscanf(line, "%d\t%s", &tok_int, tok_str);
printf("%d-%s\n", tok_int, tok_str);
}
if (line)
free(line);
exit(EXIT_SUCCESS);
}
Or, even simpler. You could use fscanf (with the support of feof) and replace the while loop shown above (along with some other redundant code cleanups) with the following one:
/* Tests the end-of-file indicator for the stream. */
while (!feof(fp)) {
/* Scans input from the file stream pointer. */
fscanf(fp,"%d\t%s\n",&tok_int, tok_str);
printf("%d-%s\n", tok_int, tok_str);
}
Assuming that your file contains following lines (where single line format is number[tab]string[newline]):
12 apple
17 frog
20 grass
the output will be:
12-apple
17-frog
20-grass

Resources