I'm writing a C program that counts the occurrences of a word (entered by the user) in a text file, prints the count per line and the total count on the screen, and compares the word with the last word of the file.
I have dedicated a function called "Fetch" to fetching the last word and returning it to the main function. Another function counts the occurrences, and a third function compares the two strings using strcmp().
My problem is that the function char* Fetch() is seemingly returning empty strings. To find where the problem actually is, I printed the result inside the function instead of returning it to main(), and it worked! So the problem seems to be the return statement. What could be the problem?
char* Fetch() called in main():
//step 3: fetch the last word in the file
Lword = Fetch();
printf("the last word is %s", Lword);
char* Fetch():
char* Fetch()
{
char text[1000];
fread(text, sizeof(char), sizeof(text), spIn);
for(int i = 0; i < strlen(text); i++)
{
if(isspace(text[strlen(text) -1 -i])) //if space
{
return (text + (strlen(text)-i)); //return a pointer to the element after the space
}
}
}
declarations in main:
char Uword[20], *Lword;
int TotalCount;
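You allocated a local char buffer text in function Fetch and stored the data in it.
When Fetch returns, the lifetime of text ends, so the returned pointer is dangling and the content must NOT be used anymore. Printing inside the function worked only because the buffer was still alive there.
Function Fetch can be changed so that the caller supplies the buffer: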
char* Fetch(char *text, size_t maxlen, FILE *spIn)
{
    size_t n = fread(text, sizeof(char), maxlen - 1, spIn); //leave room for the terminator
    text[n] = '\0'; //fread does not null-terminate the data
    for(size_t i = n; i > 0; i--)
    {
        if(isspace((unsigned char)text[i - 1])) //if space
        {
            return text + i; //return a pointer to the element after the space
        }
    }
    return text; //no space found: the whole buffer is one word
}
And can be called like this:
FILE *spIn = ...
char text[1000] = {'\0'};
char *res = Fetch(text, sizeof(text), spIn);
printf("the last word is: %s\n",res);
Remember that res points into text, so use it only while the text buffer is still valid.
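Text files usually end with a newline, in which case a backwards search for the last space finds that newline first and yields an empty word. A minimal sketch of one way to handle this, assuming the text is already in a writable, null-terminated buffer owned by the caller (last_word is a hypothetical helper name, not part of the original code):

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: returns a pointer to the last whitespace-delimited
   word inside text. Trailing whitespace is cut off in place, so text must
   be writable. The result points into text and is valid only as long as
   text itself is. */
char *last_word(char *text)
{
    assert(text != NULL);
    size_t end = strlen(text);

    /* Drop trailing whitespace such as the final newline of a file. */
    while (end > 0 && isspace((unsigned char)text[end - 1]))
        end--;
    text[end] = '\0';

    /* Walk back to the whitespace in front of the last word. */
    size_t start = end;
    while (start > 0 && !isspace((unsigned char)text[start - 1]))
        start--;
    return text + start;
}
```

It could be called on the buffer after reading, for example last_word(text), and the same lifetime rule applies: the result is only usable while the buffer is.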
To study for the exam, we are trying to do some exercises from past exams.
In this exercise we get a header file, and we have to create a function that reads an input file and prints to stdout only the parts of strings that do not contain digits.
(We have to pass the pointer to the string that was read back to the main function.)
We tried to do it with an array, but when printing, the first word is empty or has strange characters. Doing a malloc allocation instead works fine.
What is also strange is that printing an empty string before everything else fixes the code.
Therefore we don't understand why, using an array of char, the first word is not printed correctly, although it is saved in the buffer.
Including a printf before the while loop in the main function makes the problem go away.
Using dynamic allocation (malloc) instead of a local array fixes the print.
Iterating over the whole array and setting all the memory to 0 does not fix the problem.
So the pointer seems to be correct, since after printing an empty string the word is printed correctly, but I really cannot understand what causes the issue.
Questions are:
How is it possible that the output is correct after printing an empty string?
The array is allocated on the stack, so it is deallocated when the program exits the scope; why is only the first word broken and not all of them?
#include "word_reader.h"
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
const char * read_next_word(FILE * f) {
char buffer[WORD_MAX_LEN];
char * word = buffer;
for (int i = 0; i < WORD_MAX_LEN; ++i)
buffer[i] = 0;
//char * buffer = malloc(sizeof(char) * WORD_MAX_LEN);
int found = 0;
int c = 0;
int i = 0;
while (!found && c != EOF) {
while ((c = fgetc(f)) != EOF && isalpha(c)) {
found = 1;
buffer[i] = c;
++i;
}
buffer[i] = '\0';
}
if (found) {
return word;
//return buffer; // when use malloc
}
return 0;
}
int main(int argc, char * argv[]) {
FILE * f = fopen(argv[1], "r");
if(!f) {
perror(argv[1]);
return EXIT_FAILURE;
}
const char * word = 0;
//printf(""); // adding this line fix the problem
while ((word = read_next_word(f))) {
printf("%s\n", word);
}
fclose(f);
return 0;
}
The header file contains only the read_next_word declaration and defines WORD_MAX_LEN as 1024. (Also include
the file to read (a simple .txt file)
ciao234 44242 toro
12Tiz23 where333
WEvo23
expected result:
ciao
toro
Tiz
where
WEvo
actual result
�rǫs+)co�0�*�E�L�mзx�<�/��d�c�q
toro
Tiz
where
WEvo
the first line is always some ascii characters or an empty line.
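Returning a pointer to the local array buffer leaves the caller with a dangling pointer, and reading through it is undefined behavior, so any symptom (garbage only on the first word, or a "fix" from an unrelated printf) is possible. A minimal sketch of one alternative, assuming the exam would allow a different signature in which the caller owns the buffer (read_next_word_into is a hypothetical name, not the function declared in the given header):

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical variant of read_next_word: the caller owns the buffer,
   so the returned pointer stays valid after the function returns.
   Returns buffer if a word was read, NULL at end of file. */
const char *read_next_word_into(FILE *f, char *buffer, size_t size)
{
    assert(f != NULL && buffer != NULL && size > 1);
    size_t i = 0;
    int c;

    /* Skip everything that is not a letter (digits, spaces, newlines). */
    while ((c = fgetc(f)) != EOF && !isalpha(c))
        ;
    /* Collect letters until a non-letter, EOF, or a full buffer. */
    while (c != EOF && isalpha(c) && i + 1 < size) {
        buffer[i++] = (char)c;
        c = fgetc(f);
    }
    /* If the buffer filled up mid-word, keep the letter for the next call. */
    if (c != EOF && isalpha(c))
        ungetc(c, f);
    buffer[i] = '\0';
    return i > 0 ? buffer : NULL;
}
```

In main, the loop would then declare char word[WORD_MAX_LEN]; once and call read_next_word_into(f, word, sizeof word) until it returns NULL.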
I am working on a program in C, and I have two variables: file[] and tok[]. The idea is to iterate through file[] character by character and place the characters in tok[]. I can print the characters from file[] directly, but I can't place them into tok[]. How would I take file[] character by character and place it, character by character, into tok[]?
My main() method (always returns 0 without any errors):
int main()
{
char file[] = "PRINT \"Hello, world!\"";
int filelen = strlen(file);
int i = 0;
char tok[] = "";
for (i = 0; i < filelen; i++) {
printf("%c \n", file[i]); // Print every char from variable file
tok[strlen(tok)+1] = file[i]; // Add the character to variable tok
printf("%s \n", tok); // Print tok
}
return 0;
}
You make a few errors:
char tok[] = "";
This allocates a fixed-length array of one! The memory is not automatically expanded when you add characters. As you want to copy filelen characters, you should do:
char tok[filelen+1]; // note the "+1" for the terminating null character
In your loop, you repeatedly call strlen. Personally I find that a waste of CPU cycles and would prefer to use another index variable, for example:
int toklen= 0; // initially empty
...
tok[toklen++] = file[i]; // Add the character to variable tok
In your version you have added the character one position too far (indices in C go from 0..n-1).
After the loop you must still terminate the string with a null character:
tok[toklen] = '\0';
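Put together, the points above amount to a bounded character-by-character copy with its own index and an explicit terminator. A minimal sketch, assuming the destination size is passed in by the caller (copy_chars is a hypothetical helper name):

```c
#include <assert.h>
#include <stddef.h>

/* Copies src into dst one character at a time and returns the number of
   characters copied. dst must have room for dstsize bytes; the copy is
   truncated if src does not fit. */
size_t copy_chars(char *dst, size_t dstsize, const char *src)
{
    assert(dst != NULL && src != NULL && dstsize > 0);
    size_t toklen = 0;              /* initially empty */
    while (src[toklen] != '\0' && toklen + 1 < dstsize) {
        dst[toklen] = src[toklen];  /* add the character at index toklen */
        toklen++;
    }
    dst[toklen] = '\0';             /* terminate the string */
    return toklen;
}
```

With char tok[filelen+1]; as the destination, the call would be copy_chars(tok, sizeof tok, file).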
I have some C-code that reads in a text file line by line, hashes the strings in each line, and keeps a running count of the string with the biggest hash values.
It seems to be doing the right thing but when I issue the print statement:
printf("Found Bigger Hash:%s\tSize:%d\n", textFile.biggestHash, textFile.maxASCIIHash);
my print returns this in the output:
Preprocessing: dict1
Found BiSize:110h:a
Found BiSize:857h:aardvark
Found BiSize:861h:aardwolf
Found BiSize:937h:abandoned
Found BiSize:951h:abandoner
Found BiSize:1172:abandonment
Found BiSize:1283:abbreviation
Found BiSize:1364:abiogenetical
Found BiSize:1593:abiogenetically
Found BiSize:1716:absentmindedness
Found BiSize:1726:acanthopterygian
Found BiSize:1826:accommodativeness
Found BiSize:1932:adenocarcinomatous
Found BiSize:2162:adrenocorticotrophic
Found BiSize:2173:chemoautotrophically
Found BiSize:2224:counterrevolutionary
Found BiSize:2228:counterrevolutionist
Found BiSize:2258:dendrochronologically
Found BiSize:2440:electroencephalographic
Found BiSize:4893:pneumonoultramicroscopicsilicovolcanoconiosis
Biggest Size:46umonoultTotal Words:71885covolcanoconiosis
So it seems I'm misusing printf(). Below is the code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define WORD_LENGTH 100 // Max number of characters per word
// data1 struct carries information about the dictionary file; preprocess() initializes it
struct data1
{
int numRows;
int maxWordSize;
char* biggestWord;
int maxASCIIHash;
char* biggestHash;
};
int asciiHash(char* wordToHash);
struct data1 preprocess(char* fileName);
int main(int argc, char* argv[]){
//Diagnostics Purposes; Not used for algorithm
printf("Preprocessing: %s\n",argv[1]);
struct data1 file = preprocess(argv[1]);
printf("Biggest Word:%s\t Size:%d\tTotal Words:%d\n", file.biggestWord, file.maxWordSize, file.numRows);
//printf("Biggest hashed word (by ASCII sum):%s\tSize: %d\n", file.biggestHash, file.maxASCIIHash);
//printf("**%s**", file.biggestHash);
return 0;
}
int asciiHash(char* word)
{
int runningSum = 0;
int i;
for(i=0; i<strlen(word); i++)
{
runningSum += *(word+i);
}
return runningSum;
}
struct data1 preprocess(char* fName)
{
static struct data1 textFile = {.numRows = 0, .maxWordSize = 0, .maxASCIIHash = 0};
textFile.biggestWord = (char*) malloc(WORD_LENGTH*sizeof(char));
textFile.biggestHash = (char*) malloc(WORD_LENGTH*sizeof(char));
char* str = (char*) malloc(WORD_LENGTH*sizeof(char));
FILE* fp = fopen(fName, "r");
while( strtok(fgets(str, WORD_LENGTH, fp), "\n") != NULL)
{
// If found a larger hash
int hashed = asciiHash(str);
if(hashed > textFile.maxASCIIHash)
{
textFile.maxASCIIHash = hashed; // Update max hash size found
strcpy(textFile.biggestHash, str); // Update biggest hash string
printf("Found Bigger Hash:%s\tSize:%d\n", textFile.biggestHash, textFile.maxASCIIHash);
}
// If found a larger word
if( strlen(str) > textFile.maxWordSize)
{
textFile.maxWordSize = strlen(str); // Update biggest word size
strcpy(textFile.biggestWord, str); // Update biggest word
}
textFile.numRows++;
}
fclose(fp);
free(str);
return textFile;
}
You forgot to remove the \r after reading. It is in your input because (1) your source file uses \r\n line endings (it presumably comes from a Windows machine), and (2) the platform you run on does not translate line endings when reading. In standard C the fopen mode "r" already means text mode, but on Linux and macOS text mode leaves \r untouched.
This results in the weird output as follows:
Found Bigger Hash:text\r\tSize:123
See the position of the \r? When this string is output, you first get
Found Bigger Hash:text
and then the cursor is moved back to the start of the line by \r. Next, a tab is output, not by printing spaces but merely by moving the cursor to the 8th position:
1234567↓
Found Bigger Hash:text
and the rest of the string is printed over the one already shown:
Found BiSize:123h:text
Possible solutions:
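Open your file in "rt" "text" mode (a non-standard spelling that some C libraries accept; it only helps where the library translates \r\n), and/or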
Check for, and remove, the \r code as well as \n.
I'd go for both. strchr is pretty cheap and will make your code a bit more foolproof.
(Also, please simplify your fgets line by splitting it up into several distinct operations.)
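Removing both \r and \n can be done in one step with strcspn, which returns the length of the leading part of the string that contains neither character. A minimal sketch (strip_eol is a hypothetical helper name, not part of the original code):

```c
#include <assert.h>
#include <string.h>

/* Cuts the string at the first '\r' or '\n', whichever comes first,
   and returns the new length. Lines without a line ending are left
   unchanged. */
size_t strip_eol(char *line)
{
    assert(line != NULL);
    size_t len = strcspn(line, "\r\n");
    line[len] = '\0';
    return len;
}
```

In preprocess() it would be called on str right after a successful fgets, before hashing and copying the word.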
Your statement
while( strtok(fgets(str, WORD_LENGTH, fp), "\n") != NULL)
takes no account of the return value from fgets() or the way strtok() works: at end of file fgets() returns NULL, and that NULL is passed straight to strtok().
The way to do this is something like
char *fptr, *sptr;
while ((fptr = fgets(str, WORD_LENGTH, fp)) != NULL) {
sptr = strtok(fptr, "\n");
while (sptr != NULL) {
printf ("%s,", sptr);
sptr = strtok (NULL, "\n");
}
printf("\n");
}
Note that after the first call to strtok(), subsequent calls on the same sequence must pass NULL as the first parameter.
How can I create an array of unique strings without knowing how many strings there are until I process the input file? There can be as many as 2 million strings, max length of 50.
My program is something like this. This works for 51 items then overwrites other data. I don't know how to add an element to the array, if possible.
main() {
char *DB_NAMES[51]; // i thought this gave me ptrs to chunks of 51
// but it's 51 pointers!
char *word;
while not eof {
...function to read big string
...function to separate big string into words
...
processWord(ctr, DB_NAMES, word);
...
}
}
processWord(int ndx, char *array1[], char *word){
...function to find if word already exists...
//if word is new, store in array
array1[ndx]= (char *)malloc(sizeof(51)); // isn't this giving me a char[51]?
strcpy(array1[ndx],word);
...
}
You can first count the words in your file using the logic below, and then size the array using that word count.
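#include <stdio.h>
#define FILE_READ "file.txt"
int main(void)
{
    FILE *filp;
    int count = 1;
    int c; // int, not char, so that EOF can be told apart from real characters
    filp = fopen(FILE_READ, "r");
    if (filp == NULL) {
        printf("file not found\n");
        return 1; // do not read from a file that failed to open
    }
    while ((c = fgetc(filp)) != EOF) {
        if (c == ' ') // every space is assumed to separate two words
            count++;
    }
    fclose(filp);
    printf("words = %d\n", count);
    return 0;
}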
Regards,
yanivx
Better not use a fixed string length; save memory space.
char **DB_NAMES = 0; // pointer to first char * ("string") in array; initially 0
Pass pointer by reference so that it can be altered. Moreover, you'll want the new ctr value in case a new word has been stored.
ctr = processWord(ctr, &DB_NAMES, word);
Change function processWord accordingly.
int processWord(int ndx, char ***array1a, char *word)
{ char **array1 = *array1a;
...function to find if word already exists...
// if word is new, store in array
{
array1 = realloc(array1, (ndx+1)*sizeof*array1); // one more string
if (!array1) exit(1); // out of memory
array1[ndx++] = strdup(word); // store word's copy
*array1a = array1; // return new array
}
return ndx; // return count
}
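The same idea can be packaged so that the array, its length and its capacity travel together, which avoids the triple pointer. A minimal sketch, assuming a linear duplicate search is acceptable (struct word_list and its functions are hypothetical names, not part of the original code):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of unique strings. */
struct word_list {
    char **items;   /* heap array of heap-allocated strings */
    size_t count;   /* number of strings stored */
    size_t cap;     /* number of slots allocated */
};

/* Adds a copy of word unless it is already present.
   Returns 1 if added, 0 if it was a duplicate, -1 if out of memory. */
int word_list_add(struct word_list *wl, const char *word)
{
    assert(wl != NULL && word != NULL);
    for (size_t i = 0; i < wl->count; i++)   /* linear duplicate search */
        if (strcmp(wl->items[i], word) == 0)
            return 0;

    if (wl->count == wl->cap) {              /* full: double the capacity */
        size_t ncap = wl->cap ? wl->cap * 2 : 16;
        char **p = realloc(wl->items, ncap * sizeof *p);
        if (p == NULL)
            return -1;                       /* old array is still intact */
        wl->items = p;
        wl->cap = ncap;
    }

    char *copy = malloc(strlen(word) + 1);   /* exact-size copy of the word */
    if (copy == NULL)
        return -1;
    strcpy(copy, word);
    wl->items[wl->count++] = copy;
    return 1;
}

/* Releases every stored string and the array itself. */
void word_list_free(struct word_list *wl)
{
    assert(wl != NULL);
    for (size_t i = 0; i < wl->count; i++)
        free(wl->items[i]);
    free(wl->items);
    wl->items = NULL;
    wl->count = wl->cap = 0;
}
```

A list starts out as struct word_list wl = {0};. Doubling the capacity keeps the number of realloc calls small, but the linear search makes each insertion proportional to the number of stored words, so for around 2 million strings a hash table or a sorted array would likely be needed instead.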
I have been given three '.txt' files.
The first is a list of words.
The second is a document to search.
The third is a blank document that will have my output written to it.
I'm supposed to take each word in the first file, search the second file and print the number of occurrences in the third file as "wordX = numOccurences."
I've got a good function that will return the wordCount, and it returns it correctly for the first word, but then I get a zero for all the remaining words.
I've tried to dereference everything, and I think I've come to a standstill. There's something wrong with the "pointer talk."
I have yet to start outputting the words to a new file, but that printf statement should be a print to file statement in append mode. Easy enough.
Here is the working wordCount function - it works if I just give it a single word, like "testing," but if I give it an array I want to iterate through, it just returns 0.
int countWord(char* filePath, char* word){ //Not mine. This is a working prototype function from SO, returns word count of particular word
FILE *fp;
int count = 0;
int ch, len;
if(NULL==(fp=fopen(filePath, "r")))
return -1;
len = strlen(word);
for(;;){
int i;
if(EOF==(ch=fgetc(fp))) break;
if((char)ch != *word) continue;
for(i=1;i<len;++i){
if(EOF==(ch = fgetc(fp))) goto end;
if((char)ch != word[i]){
fseek(fp, 1-i, SEEK_CUR);
goto next;
}
}
++count;
next: ;
}
end:
fclose(fp);
return count;
}
This is my part of the program, trying to call the function while the loop gets all the words from the first file. The loop IS grabbing the words, because it prints them, but wordCount isn't accepting anything beyond the first word.
int main(){
FILE *ptr_file;
char words[100];
ptr_file = fopen("searchWords.txt", "r");
if(!ptr_file)
return -1;
while( fgets(words, 100, ptr_file)!=NULL )
{
int wordCount = 0;
char key[100] = &*words;
wordCount = countWord("document.txt", words);
printf("%s = %d\n", words, wordCount);
}
fclose(ptr_file);
return 0;
}
fgets reads the \n too. That is the problem. To quote:
A newline character makes fgets stop reading, but it is considered a valid character by the function and included in the string copied to str.
To solve this, strip the newline right after reading:
while( fgets(words, 100, ptr_file)!=NULL )
{
words[strcspn(words, "\n")] = '\0'; // cut at the newline, if there is one
An immediate problem: fgets doesn't strip end-of-line from the string, so whatever you pass to countWord has an embedded newline.
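The effect of the embedded newline is easy to reproduce on in-memory strings. The following is a minimal sketch, not the file-based countWord from the question: it counts non-overlapping occurrences with strstr, and count_occurrences is a hypothetical helper name.

```c
#include <assert.h>
#include <string.h>

/* Counts non-overlapping occurrences of word inside text.
   An empty word is reported as zero occurrences. */
size_t count_occurrences(const char *text, const char *word)
{
    assert(text != NULL && word != NULL);
    size_t len = strlen(word);
    size_t count = 0;

    if (len == 0)
        return 0;
    /* After each hit, continue searching just past the matched word. */
    for (const char *p = strstr(text, word); p != NULL; p = strstr(p + len, word))
        count++;
    return count;
}
```

Searching "testing one testing two" for "testing" finds two matches, while searching for "testing\n", which is what fgets hands over, finds none because the document never contains the word directly followed by a newline at that spot.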