How to find out how many words are in each line? - c

Say you have a text file filled with sentences. For example:
hey how are you
you good?
nice to meet you jeff
I'm writing a program to print things out depending on how many indexes are on each line, but I can't wrap my head around how to find how many words are on each line. How could I go about counting them?
for (int i=0; i < wordle->leng; i++) {
    printf ("%s ", wordle->allwords[i]);
}
This is my print function for the program. leng is how many lines so it knows how many times to repeat.
Some of the lines have 5 words, some 3, and it isn't printing in the correct format. Also not all lines will end with punctuation.

The POSIX getline() function is very useful for that; it reads a line from a stream, up to and including the newline. So you can read the file line by line with it, and then loop over each line, adding 1 to int word_count = 0; every time you see a character that is not whitespace and whose previous character was whitespace (you need additional logic for the first word of the line).
You can use fgets() if you don't have getline() available, but it doesn't expand the buffer to deal with extra long lines, unlike getline().
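The counting rule described above can be sketched as a small helper. This is a minimal sketch, not the answerer's code: count_words is a hypothetical name, and starting with in_word set to 0 is what covers the "initial word" case.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count whitespace-separated words in one line. */
size_t count_words(const char *line)
{
    size_t count = 0;
    int in_word = 0;   /* 0 while we are between words (also at line start) */

    for (const char *p = line; *p != '\0'; p++) {
        if (isspace((unsigned char)*p)) {
            in_word = 0;           /* whitespace ends the current word */
        } else if (!in_word) {
            in_word = 1;           /* first character of a new word */
            count++;
        }
    }
    return count;
}
```

Call it once for each line returned by getline() or fgets(); the trailing newline counts as whitespace, so it does not need to be stripped first.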

Related

Implementation of a c counter

I wrote a C program to count the number of times the word "printf" occurs in a specific file (here "document.c"). "document.c" has multiple lines of code. I start with a while loop that iterates over every line of the file, and then read the characters of each line inside the for loop using the function strstr.
It does not print anything with my current code. Moreover, I think there are some other minor issues, because an older version did print, but incorrectly: it printed a number much larger than the actual number of "printf" in the document.
I am also a novice in C. Thank you!
int counter() {
    FILE * filePointer;
    filePointer = fopen("document.c", "r");
    int counter = 0;
    char singleLine[200];
    while(!feof(filePointer)){
        fgets(singleLine, 200, filePointer);
        for (int i = 0; i < strlen(singleLine); i++){
            if(strstr(singleLine, "printf")){
                counter++;
            }
        }
    }
    fclose(filePointer);
    printf("%d",counter);
    return 0;
}
You're iterating over each character in the input line, and then asking if the string "printf" appears anywhere in the line. If the line contains 5 characters, you'll ask this 5 times; if it contains 40 characters, you'll ask this 40 times.
Assuming that you're trying to cover the case where "printf" can appear more than once on the line, look up what strstr() returns, and use that to adjust the starting position of the search in the inner loop (which shouldn't iterate over each character, but should loop while new "hits" are found).
(Note to up-voters: I'm answering the question, but not providing code because I don't want to do their homework for them.)
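For readers who want the idea spelled out, here is a minimal sketch of the "advance past each hit" loop. count_occurrences is a hypothetical name, and the choice to skip the whole match after each hit (so matches never overlap) is an assumption, not something the answer specifies.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count non-overlapping occurrences of needle in line. */
size_t count_occurrences(const char *line, const char *needle)
{
    size_t count = 0;
    size_t step = strlen(needle);
    const char *p = line;

    if (step == 0)
        return 0;                  /* an empty needle would never advance */

    /* strstr returns a pointer to the next hit, or NULL when none is left. */
    while ((p = strstr(p, needle)) != NULL) {
        count++;
        p += step;                 /* resume the search just after this hit */
    }
    return count;
}
```

Called once per line inside a while (fgets(singleLine, sizeof singleLine, filePointer) != NULL) loop, this replaces the per-character for loop. Note that it is a plain substring search, so "fprintf" also counts as a hit for "printf".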

How to skip lines until I find line of a certain format?

I'm trying to implement a cycle which will read lines off a file until it finds a line with a specific format. Namely, until it finds a line with:
number number character number
and nothing (but spaces or tabs) until the newline. Specifically, the stuff I have to go through until I find such a line will always be the contents of a 'lines.rows' matrix, but its data points are not necessarily neatly ordered in 'lines' lines of 'rows' elements. They can have any amount of spaces, tabs or newlines between each element.
However, after lines.rows elements there will always be a line in the format I'm scanning for, after an arbitrary number of spaces, tabs or newlines following the last element of the matrix.
I've been trying for several hours to use fgets and fscanf in different ways to achieve this, but the output is simply not correct. Right now I have this:
for (i=0;i<lines;i++)
{
    for(j=0;j<rows;j++)
    {
        fscanf(in_fp, "%d", &temp);
    }
}
, which is working but takes way too long in large matrices. Aside from this I've tried, for example,
for (i=0;i<lines;i++)
{
    fscanf(in_fp, "%*[^\n]\n", NULL);
}
, which did not work. The idea was to skip to the end of each line and then also read the newline so as to start at the beginning of the following line. However, by the end of this cycle my file was not positioned at the correct line (which would be one with the format specified above - %d %d %c %d). Instead, it was positioned at a 'random' line in the middle of the matrix at hand.
I also tried the same code as above but with an fgets. When I ran an fscanf following a cycle of fgets for all the lines of the matrix, I still did not read the line that should be coming next (with the format specified above).
If you have any input on how to achieve this, or how to make this question more understandable, I would be very thankful.

Stack Smashing and using malloc

I'm making a program that counts the number of words contained within a file. My code works for certain test cases with files that have less than a certain amount of words/characters...But when I test it with, let's say, a word like:
"loooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong", (this is not random--this is an actual test case I'm required to check), it gives me this error:
*** stack smashing detected ***: ./wcount.out terminated
Abort (core dumped)
I know what the error means and that I have to implement some sort of malloc line of code to be able to allocate the right amount of memory, but I can't figure out where in my function to put it or how to do it:
int NumberOfWords(char* argv[1]) {
    FILE* inFile = NULL;
    char temp_word[20]; <----------------------I think this is the problem
    int num_words_in_file;
    int words_read = 0;
    inFile = fopen(argv[1], "r");
    while (!feof(inFile)) {
        fscanf(inFile, "%s", temp_word);
        words_read++;
    }
    num_words_in_file = words_read;
    printf("There are %d word(s).\n", num_words_in_file - 1);
    fclose(inFile);
    return num_words_in_file;
}
As you've correctly identified by rendering your source code invalid (future tip: /* put your arrows in comments */), the problem is that temp_word only has enough room for 20 characters (one of which must be a terminal null character).
In addition, you should check the return value of fopen. I'll leave that as an exercise for you. I've answered this question in other questions (such as this one), but I don't think just shoving code into your face will help you.
In this case, I think it may pay to better analyse the problem you have, to see if you actually need to store words to count them. If we define a word (the kind read by scanf("%s", ...)) as a sequence of non-whitespace characters followed by a sequence of (zero or more) whitespace characters, we can see that a counting program such as yours needs to follow this procedure:
Read as much whitespace as possible
Read as much non-whitespace as possible
Increment the "word" counter if all was successful
You don't need to store the non-whitespace any more than you do the whitespace, because once you've read it you'll never revisit it. Thus you could write this as two loops embedded into one: one loop which reads as much whitespace as possible, another which reads non-whitespace, followed by your incrementation and then the outer loop repeats the whole lot... until EOF is reached...
This will be best achieved using the %*s directive, which tells scanf-related functions not to try to store the word. For example:
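size_t word_count = 0;
/* fscanf returns EOF once no further word can be read. Testing that
   instead of feof() also counts a final word with no trailing whitespace. */
while (fscanf(inFile, "%*s") != EOF) {
    word_count++;
}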
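The three-step procedure above can also be written out explicitly, one character at a time. This is a minimal sketch under the same definition of a word; count_words_in_stream is a hypothetical name, and nothing is stored, so a word of any length is handled without a buffer.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: count whitespace-separated words in an open stream. */
size_t count_words_in_stream(FILE *fp)
{
    size_t words = 0;
    int c = fgetc(fp);             /* int, so it can also hold EOF */

    while (c != EOF) {
        /* 1. Read as much whitespace as possible. */
        while (c != EOF && isspace(c))
            c = fgetc(fp);
        if (c == EOF)
            break;                 /* only trailing whitespace was left */

        /* 2. Read as much non-whitespace as possible. */
        while (c != EOF && !isspace(c))
            c = fgetc(fp);

        /* 3. A whole word was consumed. */
        words++;
    }
    return words;
}
```

In the asker's function this would replace both temp_word and the fscanf loop: open the file, check that fopen did not return NULL, and pass the stream to the helper.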
You are limited by the size of your array. A simple solution would be to increase the size of your array. But you are always susceptible to stack smashing if someone enters a long word.
A word is delimited by spaces.
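You can simply keep a counter variable initialized to zero, and a variable that records the current char that you are looking at. Every time you read in a character with temp = fgetc(inFile) that is a space, you increment the counter. (fgetc takes only the stream and returns the character as an int, so temp should be an int that can also hold EOF. Runs of consecutive spaces, newlines, and a last word with no space after it need extra care with this approach.)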
In your current code you simply want to count the words. Therefore you are not interested in the words themselves. You can suppress the assignment with the optional * character:
fscanf(inFile, "%*s");

Reading line by line using fscanf

I want to read a file with 3 lines:
The first one with strings, second with a number and the third with strings again.
Example:
Line 1: bird toy book computer water
Line 2: 2
Line 3: toy water
I have this code, that reads a file, word by word storing them in the word array, and then putting the word into the words 2d array.
char words [5][50];
char word [50];
int i,j;
j = 0;
while( (fscanf(file, "%s", word))!=EOF ){
    for(i = 0; i<50; i++){
        if(word[i] != NULL){
            words[j][i] = word[i];
        } else{
            break;
        }
    }
    j++;
}
It's working, but it reads all the lines. I want a way to do this process for the first line only, then store the second line in an int variable and the third line in another 2D array.
Read more about fscanf. It is not suitable to read line by line.
Consider instead reading every line with fgets or even better (on POSIX) with getline (see this), then parse each line perhaps with sscanf. Its return value (the count of scanned items given from sscanf etc...) could be useful to test (and you might also want to use %n in the scan control format string; as Jonathan Leffler commented, read also about %ms assignment-allocation modifier, at least on POSIX systems, see Linux sscanf(3)).
BTW, hard-coding limits like 50 for your word length is bad taste (and not robust). Consider using C dynamic memory allocation (malloc, free and friends) and pointers more systematically, perhaps with flexible array members in some of your structs.
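As a rough illustration of "read a line, then parse it with sscanf and %n", the sketch below splits one already-read line into words. split_words and WORD_LEN are hypothetical names, and it keeps the asker's fixed 50-byte words for brevity rather than the dynamic allocation suggested above.

```c
#include <stddef.h>
#include <stdio.h>

#define WORD_LEN 50   /* assumption: same fixed word size as in the question */

/* Hypothetical helper: copy up to max_words words from line into words[]. */
size_t split_words(const char *line, char words[][WORD_LEN], size_t max_words)
{
    size_t count = 0;
    int used = 0;

    /* %49s caps each word at WORD_LEN - 1 characters; %n stores how many
       characters this call consumed, so we can advance through the line. */
    while (count < max_words &&
           sscanf(line, "%49s%n", words[count], &used) == 1) {
        line += used;
        count++;
    }
    return count;
}
```

For the three-line file, one would read the first line with fgets and pass it to split_words, read the second line and parse it with sscanf(buffer, "%d", &n), then read the third line and call split_words again with a second array.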

How to read numbers from a text file properly?

I would like to write a lottery program in C, that reads the chosen numbers of former weeks into an array. I have got a text file in which there are 5 columns that are separated with tabulators. My questions would be the following:
What should I separate the columns with? (e.g. a comma, a semicolon, a tabulator or something else)
Should I include a kind of EOF in the last row? (e.g. -1, "EOF") Is there any accepted or "official" convention to do this?
Which function should I use for reading the numbers? Is there any proper or "accepted" way of reading data from text files?
I once wrote a C program for a "Who Wants to Be a Billionaire" game. In that one I used a function that read each line into an array big enough to hold a whole line. After that I separated its data into variables like this:
line: "text1";"text2";"text3";"text4"endline (-> line loaded into a buffer array)
text1 -> answer1 (until reaching the semicolon)
text2 -> answer2 (until reaching the semicolon)
text3 -> answer3 (until reaching the semicolon)
text4 -> answer4 (until reaching the end of the line)
endline -> start over, that is read a new line and separate its contents into variables.
It worked properly, but I don't know if it was good enough for a programmer. (btw I'm not a programmer yet, I study Computer Science at a university)
Every answer and piece of advice is welcome. Thanks in advance for your kind help!
The scanf() family of functions doesn't care about newlines, so if you want to process lines, you need to read the lines first and then process them with sscanf(). The scanf() family also treats white space (blanks, tabs, newlines, etc.) interchangeably. Using tabs as separators is fine, but blanks will work too. Clearly, if you're reading and processing a line at a time, newlines won't really factor into the scanning.
int lottery[100][5];
int line;
char buffer[4096];
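/* Test line < 100 first so a line is not read and then discarded when the array is full. */
for (line = 0; line < 100 && fgets(buffer, sizeof(buffer), stdin) != NULL; line++)
{
    if (sscanf(buffer, "%d %d %d %d %d", &lottery[line][0], &lottery[line][1],
               &lottery[line][2], &lottery[line][3], &lottery[line][4]) != 5)
    {
        /* Print the offending text (buffer), not the int line counter. */
        fprintf(stderr, "Faulty line: [%s]\n", buffer);
        break;
    }
}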
This stops on EOF, too many lines, and a faulty line (one which doesn't start with 5 numbers; you can check their values etc in the loop if you want to — but what are the tests you need to run?). If you want to validate the white space separators, you have to work harder.
Maybe you want to test for nothing but spaces and newlines after the 5 numbers; that's a bit trickier (it can be done; look up the %n conversion specification in sscanf()).
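One way the %n check might look is sketched below. parse_five is a hypothetical name; the sketch assumes a valid line is exactly five integers followed by nothing but white space.

```c
#include <stdio.h>

/* Hypothetical helper: return 1 if line holds exactly five integers
   followed only by white space, storing them in out[]; otherwise 0. */
int parse_five(const char *line, int out[5])
{
    int end = -1;   /* stays -1 if sscanf stops before reaching %n */

    if (sscanf(line, "%d %d %d %d %d %n",
               &out[0], &out[1], &out[2], &out[3], &out[4], &end) != 5)
        return 0;

    /* The space before %n skipped any trailing white space, so on a
       clean line the offset now points at the terminating null. */
    return end >= 0 && line[end] == '\0';
}
```

Inside the reading loop, parse_five(buffer, lottery[line]) could stand in for the plain sscanf() call, with a return value of 0 treated as a faulty line.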
