How to compare strings in two files? - c

I'm new to C, and any help with this project would be appreciated, especially from someone who has done something similar before.
I'm going to use C to read two files (each containing an array of strings, or just strings and numbers, whichever is best for performance) and compare the strings in the two files line by line (the first line of the first file with the first line of the second file, the second line with the second line, and so on), then printf whether they match or not. I need to find the fastest way to do this (I can also change the file structure if needed). Sample files are below:
File1: File2:
Dens1 Dens1
Hige0 Hige1
Alte1 Alte0
Some1 Some1
I was thinking about the following options:
option1:
fopen
fgets
memcmp/strcmp/strstr
printf
option2:
open
mmap the file
search the mapped data through a pointer
close
option3:
Reading second file completely and store the content in a array. Then read from the first file and compare.
option4:
your opinion?
Thanks for your time.

The best solution depends to a large degree on the contents of your files. If you know that the files are sorted, reading from both files at the same time, line by line, will work. If your files can be extremely large (relative to available RAM), methods that require reading entire files into memory won't work.
You need to define your problem better to have enough information to decide the best solution.
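As a rough sketch of the "read both files at the same time, line by line" approach: the function name compare_streams is made up for illustration, and the code assumes that no line is longer than 255 characters and that a tab-separated report is acceptable.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: compares two open text streams line by line.
 * Writes one report line per pair to `out` and returns the number of
 * mismatching pairs. A line that exists in only one file is a mismatch.
 * Assumes no line is longer than 255 characters. */
int compare_streams(FILE *a, FILE *b, FILE *out)
{
    char la[256], lb[256];
    int mismatches = 0;

    for (;;) {
        char *ra = fgets(la, sizeof la, a);
        char *rb = fgets(lb, sizeof lb, b);
        if (ra == NULL && rb == NULL)
            break;                        /* both files are exhausted */
        if (ra == NULL) la[0] = '\0';     /* file a ended first */
        if (rb == NULL) lb[0] = '\0';     /* file b ended first */
        la[strcspn(la, "\r\n")] = '\0';   /* strip the line ending */
        lb[strcspn(lb, "\r\n")] = '\0';

        int same = (ra != NULL && rb != NULL && strcmp(la, lb) == 0);
        if (!same)
            mismatches++;
        fprintf(out, "%s\t%s\t%s\n", la, lb, same ? "match" : "no match");
    }
    return mismatches;
}
```

It could be called with two streams returned by fopen and with stdout as the third argument. Only one line per file is held in memory at a time, so file size is not a concern with this approach.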

AdamO, here is an easy way to achieve what you want.
Good luck with your assignment.
#include <stdio.h>
#include <string.h>   /* strcmp */

int main(void)
{
    char str1[30], str2[30];
    FILE *fpOne, *fpTwo;

    fpOne = fopen("FileTextOne.txt", "r");
    fpTwo = fopen("FileTextTwo.txt", "r");
    if (fpOne == NULL || fpTwo == NULL)
    {
        printf("Could not open the input files.\n");
        return 1;
    }
    /* Read one word per line from each file (at most 29 characters each)
       and stop as soon as either file runs out of lines. */
    while (fscanf(fpOne, "%29s", str1) == 1 && fscanf(fpTwo, "%29s", str2) == 1)
    {
        if (strcmp(str1, str2) == 0)
        {
            printf("%s\t%s\t\t - are identical Strings.\n", str1, str2);
        }
        else
        {
            printf("%s\t%s\t\t - are not identical Strings.\n", str1, str2);
        }
    }
    fclose(fpOne);
    fclose(fpTwo);
    return 0;
}
FileTextOne
Dens1
Hige0
Alte1
Some1
Dog
Donkey
FileTextTwo
Dens1
Hige0
Alte1
Some1
Cat
Donkey

Related

Why doesn't this code display all the characters in the file? I also tried getc() and fscanf(); they didn't work either. See the screenshot.

#include<stdio.h>
#include<stdlib.h>
int main()
{
char str[1];
FILE *file;
file = fopen("dataloger.txt", "rb");
while(fread(&str[0], sizeof(str), 1, file) ==1){
if(str[0] =='\n'){
str[0] = '#';
}
printf("%s", str);
}
fclose(file);
/************************************************
int c;
if (file) {
while ((c = getc(file)) != EOF){
if(c =='\n') {
c = '#';
}
printf("%c", c);
}
fclose(file);
}
************************************************/
return 0;
}
Could someone explain why I don't see the normal output, with just the newlines replaced by #?
I only ran into the problem after I tried to replace the '\n' character with another character and print it.
See the attached screenshot.
I copied in your code minus the commented out bits and did some testing. Following is a version of your code with some tweaks.
#include<stdio.h>
#include<stdlib.h>
int main()
{
char str; /* "str" is really a single character; a C string is an array of characters that ends with a '\0' terminator */
FILE *fp; /* "file" is not a reserved word, but it is easy to confuse with the FILE type */
fp = fopen("dataloger.txt", "rb");
if (fp == NULL) /* Just to make the program more robust */
{
printf("Could not find the file\n");
return -1;
}
while(fread(&str, sizeof(str), 1, fp) ==1)
{
if(str =='\n')
{
str = '#';
}
printf("%c", str);
}
fclose(fp);
printf("\n"); /* Just to be neat */
return 0;
}
Following are some points to note.
First, since the program reads one character at a time from the text file, it is clearer to use a character variable instead of a character array of length 1. More importantly, the original printf("%s", str) treats that array as a string, but a one-element array has no room for the terminating '\0', so printf keeps reading past the end of the array. That is undefined behavior and the likely source of the unexpected output. Printing with "%c" avoids the problem.
It's usually a good idea to avoid variable names that are easily confused with library names. "file" is not a reserved word in C and is distinct from "FILE", but the similarity can confuse anyone reading the code.
Also, when working with files, it is often a good idea to check to make sure there were no issues with opening the file; therefore, a check to see if the file pointer is not NULL is beneficial - this might be at the core of your issue depending upon where the text file is in relation to the compiled program.
Testing out this tweaked code, a simple two-line text file was set up with the following test data.
The quick brown fox jumps over the lazy dog.
Now is the time for all good men to come to the aid of their country.
Executing the code resulted in the following output at the terminal (the program was compiled with a name of "ReadFile").
#Dev:~/C_Programs/Console/ReadFile/bin/Release$ ./ReadFile
The quick brown fox jumps over the lazy dog.#Now is the time for all good men to come to the aid of their country.#
Give those tweaks a try to see if it meets the spirit of your project.

Finding a keyword from one file in another file

So my project for class needs me to find a list of keywords from one file:
Master's,Bachelor's,Professor
And needs me to find it through a resume in another file:
John Smith
1234 Residence Road
johnsmith@gmail.com
Degree level: Bachelor's degree
Major: Applied mathematics
Work Experience:
Professor at local university for multiple math classes, mainly calculus
Worked at a tech company studying the analytics of their online store
Now I have the keywords stored in an array of char spacing[10][15] (with no commas)
and I have the resume saved as char* buffer. Both work, as they both print out, but when trying to find the keywords I keep getting 0 (int KWcount is the counter for the amount of times a keyword appears). Here's the code both putting the resume into buffer and my attempt at finding the words.
//Resume
FILE* fp2;
fp2 = fopen("resume.txt", "r");
if (fp2 == NULL) {
printf("\nFile not found.\n");
return 0;
}
//File Reading
fseek(fp2, 0L, SEEK_END);
numbytes = ftell(fp2);
fseek(fp2, 0L, SEEK_SET);
buffer = (char*)calloc(numbytes + 1, sizeof(char)); /* +1 leaves a terminating '\0' after the file contents */
if (buffer == NULL)
return 1;
fread(buffer, sizeof(char), numbytes, fp2);
fclose(fp2);
printf("Before process: %i", KWcount);
//Search for keyword
for (i=0; i < 10; i++) {
if (strcmp(buffer, spacing[i]) == 0) {
KWcount++;
}
}
printf("\n\nAfter process:%i\n", KWcount);
}
Before process:0
After process:0
I genuinely cannot figure out what the problem is and my professor is not really any help, so does anyone have any tips or ways to fix this?
strcmp(buffer, spacing[i])
strcmp returns 0 only if the two strings are exactly equal, and here the first argument is the entire resume, so it never equals a single keyword. What you need to do is go through buffer word by word and compare each word with the keywords you expect to find.
You may find the function strtok useful for breaking your resume buffer into words.
I also recommend strncmp, since it lets you cap the number of characters compared. After all, your goal is to compare a single word from the resume, not the entire string.
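A minimal sketch of the word-by-word search using strtok: the function name count_keywords and the delimiter set are assumptions, and the keyword array mirrors the asker's char spacing[10][15]. It uses strcmp rather than strncmp because every token strtok returns is already a complete, terminated word. The text must be writable and end with '\0', so a buffer filled by fread needs one extra byte for the terminator.

```c
#include <string.h>

/* Hypothetical helper: counts how many words in `text` exactly match one of
 * the first `nkeys` keywords. `text` must be NUL-terminated and writable,
 * because strtok overwrites each delimiter it finds with '\0'. */
int count_keywords(char *text, char keywords[][15], int nkeys)
{
    const char *delims = " \t\r\n,.:;";   /* assumed word separators */
    int count = 0;

    for (char *word = strtok(text, delims); word != NULL;
         word = strtok(NULL, delims)) {
        for (int i = 0; i < nkeys; i++) {
            if (strcmp(word, keywords[i]) == 0) {
                count++;
                break;   /* stop at the first keyword that matches */
            }
        }
    }
    return count;
}
```

With the resume from the question and the three keywords, this counts "Bachelor's" and "Professor" once each. Because strtok modifies the buffer, pass a copy if the original text is needed afterwards.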

How to write two variables to the same line in .txt file? C lang

I need to write two variables to a text file with a space in between them, what I want my text file to look like:
log.txt
www.google.com feb17202101
www.yahoo.com feb17202102
www.xyz.com feb17202103
The domain name and dates are stored in two char arrays.
My code
FILE *fp;
fp = fopen("log.txt", "w+");
if(fp == NULL)
{
printf("Could not open file 31");
exit(0);
}
fprintf(fp,"%s %s\n",domainName, fileName);
(domainName and fileName are two separate char arrays)
I was hoping this would print both my variables with a space in between them and go onto the next line so when my function needs to write to the text file again, it will be on a fresh line.
Instead my text file looks like
www.google.com
022321224825
This seems like it should be the easiest part of my program but it's one of those days where I spent all day coding and now my brain feels fried haha. I can't figure out how to format it properly. If someone could help me out, I would appreciate it!
As pointed out in the comments, I think the domainName array itself contains a newline character. When I write two simple arrays to a file like this:
char domainNames[3][20] = {"www.google.com", "www.yahoo.com", "www.xyz.com"};
char timestamps[3][20] = {"feb17202101", "feb17202102", "feb17202103"};
for (int i = 0; i < 3; i++) {
fprintf(fp,"%s %s\n",domainNames[i], timestamps[i]);
}
This prints the following output:
c-posts : $ cat log.txt
www.google.com feb17202101
www.yahoo.com feb17202102
www.xyz.com feb17202103
If you're reading the domain name and timestamp from user input with fgets(), the newline that ends each input line is kept in the buffer. You might want to trim it:
void trimTrailingNewline(char input[]) {
input[strcspn(input, "\n")] = 0;
}
Then you can use it to remove trailing newlines, either right after reading the user input or just before writing to the file:
for (int i = 0; i < 3; i++) {
trimTrailingNewline(domainNames[i]);
trimTrailingNewline(timestamps[i]);
fprintf(fp,"%s %s\n", domainNames[i], timestamps[i]);
}

Moving data from one program to another program using file structures c

Program(A)-----> file.txt-----> Program(B)
^This is the setup I am using. I currently don't know enough about file handling.
My text file is named myStudents.txt
EDIT: Program(A) writes the information properly. Program(B) needs to retrieve the information from the text file.
#include<stdio.h>
#include<stdlib.h> /* for exit() */
int main()
{
char studentName[50];
int grade=0;
printf("Which students grade would you like to retrieve?: ");
scanf("%49s",studentName);
FILE *fptr;
fptr = (fopen("myStudents.txt", "r"));
if(fptr == NULL)
{
printf("Error!");
exit(1);
}
printf("\nStudent details:\n");
fscanf(fptr,"%d %49[^\n]",&grade,studentName);
printf("Name: %s\n",studentName);
printf("Grade: %d\n",grade);
fclose(fptr);
return 0;
}
I'm very confused on how to use program A's information in program B. Apologies if this is a repeat thread, I couldn't find any information here or anywhere else to solve my issue.
*Note(A solid explanation would be very helpful along with any constructive criticism)
Cheers! Have a good day!
Your program B does not actually search for any name; it just tries to print the first record. I won't write the complete code for you, but here is an outline of what your program should do:
1. Read the file line by line (the functions fscanf, fgets, or getline may be useful).
2. Extract the name and grade from each line (sscanf and the string functions).
3. Check whether the name is the one you are looking for; if so, print it and stop.
This is of course only one way the program could work, but I suggest starting by implementing these steps.
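The steps above could be sketched roughly as follows. The function name find_grade is made up, and the code assumes that every line of myStudents.txt has the form "<grade> <name>" (for example "85 John Smith"), which is what the fscanf format in the question suggests but which only program A's output can confirm.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: scans `fp` line by line for `wanted` and stores the
 * matching grade in *grade. Returns 1 if the name was found, 0 otherwise.
 * Assumes every line has the form "<grade> <name>". */
int find_grade(FILE *fp, const char *wanted, int *grade)
{
    char line[128];
    char name[50];
    int g;

    while (fgets(line, sizeof line, fp) != NULL) {
        /* %49[^\n] reads the rest of the line, spaces included. */
        if (sscanf(line, "%d %49[^\n]", &g, name) == 2 &&
            strcmp(name, wanted) == 0) {
            *grade = g;
            return 1;
        }
    }
    return 0;
}
```

Reading the requested name into its own buffer, separate from the one used while parsing the file, keeps the user's input from being overwritten by the first record.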

C, reading a multiline text file

I know this is a dumb question, but how would I load data from a multiline text file?
while (!feof(in)) {
fscanf(in,"%s %s %s \n",string1,string2,string3);
}
^^This is how I load data from a single line, and it works fine. I just have no clue how to load the same data from the second and third lines.
Again, I realize this is probably a dumb question.
Edit: Problem not solved. I have no idea how to read text from a file that's not on the first line. How would I do this? Sorry for the stupid question.
Try something like:
char line[512]; // or however large you think these lines will be
in = fopen ("multilinefile.txt", "rt"); /* open the file for reading */
/* "rt" means open the file for reading text */
int cur_line = 0;
/* get a line, up to 512 chars from in. done if NULL */
while(fgets(line, 512, in) != NULL) {
if (cur_line == 2) { // 3rd line
sscanf (line, "%s %s %s \n",string1,string2,string3);
// now you should store or manipulate those strings
break;
}
cur_line++;
}
fclose(in); /* close the file */
or maybe even...
char line[512];
in = fopen ("multilinefile.txt", "rt"); /* open the file for reading */
fgets(line, 512, in); // throw out line one
fgets(line, 512, in); // on line 2
sscanf (line, "%s %s %s \n",string1,string2,string3); // line 2 is loaded into 'line'
// do stuff with line 2
fgets(line, 512, in); // on line 3
sscanf (line, "%s %s %s \n",string1,string2,string3); // line 3 is loaded into 'line'
// do stuff with line 3
fclose(in); // close file
Putting \n in a scanf format string has no different effect from a space. You should use fgets to get the line, then sscanf on the string itself.
This also allows for easier error recovery. If it were just a matter of matching the newline, you could use "%*[ \t]%*1[\n]" instead of " \n" at the end of the string. You should probably use %*[ \t] in place of all your spaces in that case, and check the return value from fscanf. Using fscanf directly on input is very difficult to get right (what happens if there are four words on a line? what happens if there are only two?) and I would recommend the fgets/sscanf solution.
Also, as Delan Azabani mentioned, it's not clear from this fragment whether you're already doing so, but you have to either set aside space (e.g. a large array, or a dynamic structure allocated with malloc) to store the entire dataset, or do all your processing inside the loop.
You should also be specifying how much space is available for each string in the format specifier. %s by itself in scanf is always a bug and may be a security vulnerability.
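A small sketch of both points, field widths and checking the return value: the function name parse_three and the 32-byte buffers are assumptions chosen for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: splits one line into three fields of at most 31
 * characters each. Returns 1 if the line yields exactly three fields.
 * A word longer than 31 characters is split into several fields instead
 * of overflowing a buffer. */
int parse_three(const char *line, char a[32], char b[32], char c[32])
{
    char extra[2];
    /* %31s never stores more than 31 characters plus the terminator.
     * The trailing %1s only succeeds if there is an unexpected fourth field. */
    int n = sscanf(line, "%31s %31s %31s %1s", a, b, c, extra);
    return n == 3;
}
```

The return value of sscanf is the number of fields it stored, so lines with two words or four words are both rejected here instead of silently leaving stale data in the buffers.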
First off, you don't use feof() like that...it shows a probable Pascal background, either in your past or in your teacher's past.
For reading lines, you are best off using either POSIX 2008 (Linux) getline() or standard C fgets(). Either way, you try reading the line with the function, and stop when it indicates EOF:
while (fgets(buffer, sizeof(buffer), fp) != 0)
{
...use the line of data in buffer...
}
Or, using getline():
char *bufptr = 0;
size_t buflen = 0;
while (getline(&bufptr, &buflen, fp) != -1)
{
...use the line of data in bufptr...
}
free(bufptr);
To read multiple lines, you need to decide whether you need previous lines available as well. If not, a single string (character array) will do. If you need the previous lines, then you need to read into an array, possibly an array of dynamically allocated pointers.
Every time you call fscanf, it reads more values. The problem you have right now is that you're re-reading each line into the same variables, so in the end, the three variables have the last line's values. Try creating an array or other structure that can hold all the values you need.
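One possible shape for such a structure, as a sketch only: the struct row type, the load_rows name, and the 31-character field limit are all assumptions.

```c
#include <stdio.h>

/* One parsed line: three whitespace-separated words. */
struct row {
    char s1[32], s2[32], s3[32];
};

/* Hypothetical helper: reads lines of three words from `fp` into `rows`,
 * storing at most `max` of them, and returns how many rows were stored.
 * Lines with fewer than three words are skipped. */
int load_rows(FILE *fp, struct row *rows, int max)
{
    char line[512];
    int n = 0;

    while (n < max && fgets(line, sizeof line, fp) != NULL) {
        if (sscanf(line, "%31s %31s %31s",
                   rows[n].s1, rows[n].s2, rows[n].s3) == 3)
            n++;   /* keep this row and move on to the next slot */
    }
    return n;
}
```

After the call, rows[0] holds the values from the first complete line, rows[1] those from the second, and so on, so nothing is overwritten by later lines.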
One way to do this is to use a two-dimensional array and read each line into its own row. Here is an example that reads a .txt file containing the poem Ozymandias:
#include <stdio.h>
int main(void) {
    char line[15][255];
    int count = 0;
    FILE *fpointer = fopen("ozymandias.txt", "r");
    if (fpointer == NULL) {
        return 1;
    }
    /* Stop at the array limit or as soon as fgets reports end of file. */
    while (count < 15 && fgets(line[count], 255, fpointer) != NULL) {
        count++;
    }
    fclose(fpointer);
    for (int b = 0; b < count; b++) {
        printf("%s", line[b]);
    }
    return 0;
}
This prints the poem. Note that the poem is 14 lines long while the array has 15 rows. If you loop a fixed 15 times and ignore the return value of fgets, the last row is never filled in, and printing it outputs garbage characters because the array is uninitialized. Checking the return value of fgets in the loop condition avoids that. Be careful to call fgets only once per iteration: if you write
while (fgets(....) != NULL) {
and then call fgets again inside the loop body, every other line is skipped.
I have an even easier solution, if you can use C++ instead of C (no offense to the answers above). Here it is:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
string line;//read the line
ifstream myfile ("MainMenu.txt"); // make sure to put this inside the project folder with all your .h and .cpp files
if (myfile.is_open())
{
while ( getline (myfile,line) ) // stops cleanly at end of file, so no extra blank line is printed
{
cout << line << endl;
}
myfile.close();
}
else cout << "Unable to open file";
return 0;
}
Happy coding
