I want to calculate average word length for a sentence.
For example, given input abc def ghi, the average word length would be 3.0.
The program works, but I want to ignore extra spaces between the words. So, given the following sentence:
abc  def
(two spaces between the words), the average word length is calculated to be 2.0 instead of 3.0.
How can I take into account extra spaces between words? These are to be ignored, which would give average word length of 3.0 in the example above, instead of the erroneously calculated 2.0.
#include <stdio.h>
#include <conio.h>
int main()
{
char ch,temp;
float avg;
int space = 1,alphbt = 0,k = 0;
printf("Enter a sentence: ");
while((ch = getchar()) != '\n')
{
temp = ch;
if( ch != ' ')
{
alphbt++;
k++; // To ignore spaces before first word!!!
}
else if(ch == ' ' && k != 0)
space++;
}
if (temp == ' ') //To ignore spaces after last word!!!
printf("Average word lenth: %.1f",avg = (float) alphbt/(space-1));
else
printf("Average word lenth: %.1f",avg = (float) alphbt/space);
getch();
}
The counting logic is awry. This code seems to work correctly with both leading and trailing blanks, and multiple blanks between words, etc. Note the use of int ch; so that the code can check for EOF accurately (getchar() returns an int).
#include <stdio.h>
#include <stdbool.h>
int main(void)
{
int ch;
int numWords = 0;
int numLetters = 0;
bool prevWasASpace = true; //spaces at beginning are ignored
printf("Enter a sentence: ");
while ((ch = getchar()) != EOF && ch != '\n')
{
if (ch == ' ')
prevWasASpace = true;
else
{
if (prevWasASpace)
numWords++;
prevWasASpace = false;
numLetters++;
}
}
if (numWords > 0)
{
double avg = numLetters / (float)(numWords);
printf("Average word length: %.1f (C = %d, N = %d)\n", avg, numLetters, numWords);
}
else
printf("You didn't enter any words\n");
return 0;
}
Various example runs, using # to indicate where Return was hit.
Enter a sentence: A human in Algiers#
Average word length: 3.8 (C = 15, N = 4)
Enter a sentence: A human in Algiers #
Average word length: 3.8 (C = 15, N = 4)
Enter a sentence: A human in Algiers #
Average word length: 3.8 (C = 15, N = 4)
Enter a sentence: #
You didn't enter any words
Enter a sentence: A human in AlgiersAverage word length: 3.8 (C = 15, N = 4)
Enter a sentence: You didn't enter any words
In the last but one example, I typed Control-D twice (the first to flush the 'A human in Algiers' to the program, the second to give EOF), and once in the last example. Note that this code counts tabs as 'not space'; you'd need #include <ctype.h> and if (isspace(ch)) (or if (isblank(ch))) in place of if (ch == ' ') to handle tabs better.
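As a rough illustration of the isspace() variant mentioned above, here is a minimal sketch that works on a string instead of reading stdin. The function name average_word_length is hypothetical (it is not part of the answer's program), and returning 0.0 for input without words is an assumption made for this sketch.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: average word length of a NUL-terminated string,
   treating any whitespace (spaces, tabs, newlines) as a separator.
   Returns 0.0 when the string contains no words (an assumption). */
double average_word_length(const char *s)
{
    size_t letters = 0;
    size_t words = 0;
    int prev_was_space = 1;   /* leading whitespace is ignored */

    for (; *s != '\0'; s++) {
        /* Cast to unsigned char: passing a negative char to isspace()
           is undefined behaviour. */
        if (isspace((unsigned char)*s)) {
            prev_was_space = 1;
        } else {
            if (prev_was_space)
                words++;          /* transition from space to non-space */
            prev_was_space = 0;
            letters++;
        }
    }
    return words > 0 ? (double)letters / (double)words : 0.0;
}
```

With this version, tabs between words behave like spaces, so "abc\tdef" and "abc  def" both average 3.0.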
getchar() returns an int
I am confused why you have used int ch and EOF!
There are several parts to this answer.
The first reason for using int ch is that the getchar() function returns an int. It can return any valid character plus a separate value EOF; therefore, its return value cannot be a char of any sort because it has to return more values than can fit in a char. It actually returns an int.
Why does it matter? Suppose the value from getchar() is assigned to char ch. Now, for most characters, most of the time, it works OK. However, one of two things will happen. If plain char is a signed type, a valid character (often ÿ, y-umlaut, 0xFF, formally Unicode U+00FF, LATIN SMALL LETTER Y WITH DIAERESIS) is misrecognized as EOF. Alternatively, if plain char is an unsigned type, then you will never detect EOF.
Why does detecting EOF matter? Because your input code can get EOF when you aren't expecting it to. If your loop is:
int ch;
while ((ch = getchar()) != '\n')
...
and the input reaches EOF, the program is going to spend a long time doing nothing useful. The getchar() function will repeatedly return EOF, and EOF is not '\n', so the loop will try again. Always check for error conditions in input functions, whether the function is getchar(), scanf(), fread(), read() or any of their myriad relatives.
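To make the point concrete, here is a small sketch of a loop that stops at either a newline or EOF. It reads from a FILE * with getc() so it can be exercised without a terminal; the name count_line_chars is made up for this example.

```c
#include <stdio.h>

/* Hypothetical helper: counts the characters on one line of `in`.
   Stops at a newline or at end of input, whichever comes first. */
long count_line_chars(FILE *in)
{
    int ch;          /* int, not char, so EOF stays distinguishable from 0xFF */
    long count = 0;

    while ((ch = getc(in)) != EOF && ch != '\n')
        count++;
    return count;
}
```

Because ch is an int, a 0xFF byte in the input is counted as an ordinary character, and the loop still ends when the input runs out without a trailing newline.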
Counting non-space characters is easy; your problem is counting words. Why count words by counting spaces, as you're doing? More importantly, what defines a word?
IMO a word starts at each transition from a space character to a non-space character. If you can detect that transition, you know how many words you have, and your problem is solved.
There are many possible ways to implement this, and I don't think you'll have trouble coming up with one. I may post my implementation later as an edit.
*Edit: my implementation
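#include <stdio.h>
int main(void)
{
int ch; // int, so that EOF can be told apart from any character
float avg;
int words = 0;
int letters = 0;
int in_word = 0;
printf("Enter a sentence: ");
while((ch = getchar()) != EOF && ch != '\n')
{
if(ch != ' ') {
if (!in_word) {
words++;
in_word = 1;
}
letters++;
}
else {
in_word = 0;
}
}
if (words > 0) // avoid dividing by zero on empty input
printf("Average word length: %.1f\n",avg = (float) letters/words);
else
printf("No words entered\n");
return 0;
}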
Consider the following input: (hyphens represent spaces)
--Hello---World--
You currently ignore the initial spaces and the ending spaces, but you count each of the middle spaces, even though they are next to each other. With a slight change to your program, in particular to 'k' we can deal with this case.
#include <stdio.h>
#include <conio.h>
#include <stdbool.h>
int main()
{
char ch;
float avg;
int numWords = 0;
int numLetters = 0;
bool prevWasASpace = true; //spaces at beginning are ignored
printf("Enter a sentence: ");
while((ch = getchar()) != '\n')
{
if( ch != ' ')
{
prevWasASpace = false;
numLetters++;
}
else if(ch == ' ' && !prevWasASpace)
{
numWords++;
prevWasASpace = true; //EDITED this line until after the if.
}
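}
if (!prevWasASpace) // the last word has no trailing space to count it
numWords++;
avg = numWords > 0 ? numLetters / (float)(numWords) : 0.0f;
printf("Average word length: %.1f",avg);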
getch();
}
You may need to modify the preceding slightly (haven't tested it).
However, counting words in a sentence based on only spaces between words, might not be everything you want. Consider the following sentences:
John said, "Get the phone...Now!"
The TV announcer just offered a buy-1-get-1-free deal while saying they are open 24/7.
It wouldn't cost them more than $100.99/month (3,25 euro).
I'm calling (555) 555-5555 immediately on his/her phone.
A(n) = A(n-1) + A(n-2) -- in other words the sequence: 0,1,1,2,3,5, . . .
You will need to decide what constitutes a word, and that is not an easy question (btw, y'all, none of the examples included all varieties of English). Counting spaces is a pretty good estimate in English, but it won't get you all of the way.
Take a look at the Wikipedia page on Text Segmentation. The article uses the phrase "non-trivial" four times.
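To see how much the definition of a word matters, here is a small sketch comparing two simple rules on the example sentences above. Both function names are invented for this illustration, and neither rule is claimed to be the right one.

```c
#include <ctype.h>

/* Rule 1: a word is a maximal run of non-whitespace characters. */
int count_space_separated(const char *s)
{
    int n = 0, in_token = 0;
    for (; *s != '\0'; s++) {
        if (isspace((unsigned char)*s)) {
            in_token = 0;
        } else {
            if (!in_token)
                n++;
            in_token = 1;
        }
    }
    return n;
}

/* Rule 2: a word is a maximal run of alphabetic characters. */
int count_alpha_runs(const char *s)
{
    int n = 0, in_token = 0;
    for (; *s != '\0'; s++) {
        if (isalpha((unsigned char)*s)) {
            if (!in_token)
                n++;
            in_token = 1;
        } else {
            in_token = 0;
        }
    }
    return n;
}
```

For "buy-1-get-1-free deal" the first rule finds 2 words and the second finds 4; for "open 24/7." they find 2 and 1. Which answer is right depends entirely on what you decide a word is.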
Related
I am trying to code a program that counts the number of letters, words, and sentences in a text. I may assume that a letter is any lowercase character from a to z or any uppercase character from A to Z, that any sequence of characters separated by spaces should count as a word, and that any occurrence of a period, exclamation point, or question mark indicates the end of a sentence.
So far, I could count both the number of letters and sentences correctly, but I miss out on the number of words:
e.g.
yes!
The output should be:
3 letter(s)
1 word(s)
1 sentence(s)
What I get is:
3 letter(s)
0 word(s)
1 sentence(s)
UPDATE: It works fine now after typing out another (words++) in the end right before the printf function. Thanks for the help guys :).
#include <cs50.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
int main(void)
{
string text = get_string("Enter text: ");
printf("Output:\n");
int lettercount;
int words = 0;
int sentences = 0;
int letters = 0;
int length = strlen(text);
for(lettercount = 0; lettercount < length; lettercount++)
{
if(isalpha(text[lettercount]))
{
letters++;
}
else if(text[lettercount] == ' ' || text[lettercount] == '\t' || text[lettercount] == '\n' || text[lettercount] == '\0')
{
words++;
}
else if(text[lettercount] == '.' || text[lettercount] == '!' || text[lettercount] == '?')
{
sentences++;
}
}
words++;
printf("%i letter(s)\n", letters);
printf("%i word(s)\n", words);
printf("%i sentence(s)\n", sentences);
}
The main problem with your code is that it does not count the 'final' word in your input text if that word does not have a space after it (the terminating '\0' character is never tested, because the length returned by strlen does not include it).
Further, you will have problems if you have words that are separated by more than one space; to address this, you could use an inWord flag to keep track of if the current character is already inside a word and, if not, set that flag whenever we find a letter.
Also, your sentence count will be problematical if you have things like "..." in your input; the commented-out line after your sentences++; line will fix that (if you want to).
And, finally, just to be precise, you should not assume that the letters "a" thru "z" and "A" thru "Z" will be in a continuous sequence. They probably will be (most systems these days use ASCII encoding) but you should use the isalpha function for more portability (and the isspace function, too).
#include <cs50.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
int main(void)
{
string text = get_string("Enter text: ");
printf("Output:\n");
int lettercount;
int words = 0;
int sentences = 0;
int letters = 0;
int inWord = 0;// Set to 1 if we are inside a (new) word!
int length = (int)(strlen(text)); // Don't evaluate length on each loop!
for (lettercount = 0; lettercount < length; lettercount++) {
int testChar = text[lettercount]; // Get a local copy of the current character
if (isalpha(testChar)) { // Don't assume that 'a' ... 'z' and 'A' ... 'Z' are in contiguous sequences
letters++;
if (!inWord) words++; // Any letter means that we're in a (possibly new) word...
inWord = 1; // ... but now set this 'flag' so as not to count others!
}
else if (testChar == '.' || testChar == '!' || testChar == '?') {
sentences++;
// if (inWord) sentences++; // Check that we're in a word, or stuff like "..." will be wrong
inWord = 0; // Now we are no longer inside our current word
}
else if (isspace(testChar)) { // We could also just assume ANY other character is a non-word
inWord = 0; // Now we are no longer inside our current word
}
}
printf("%i letter(s)\n", letters);
printf("%i word(s)\n", words);
printf("%i sentence(s)\n", sentences);
return 0;
}
Feel free to ask for any further clarification and/or explanation.
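For reference, the same counting logic can be written as a function over a plain C string, which makes it easy to check against inputs such as "yes!" or "Hello... World!". This is only a sketch: the struct text_counts type and the count_text name are hypothetical, and it enables the optional "only count a terminator that follows a word" rule from the commented-out line.

```c
#include <ctype.h>

/* Hypothetical result type for this sketch. */
struct text_counts { int letters; int words; int sentences; };

struct text_counts count_text(const char *text)
{
    struct text_counts c = {0, 0, 0};
    int in_word = 0;   /* 1 while we are inside a word */

    for (; *text != '\0'; text++) {
        unsigned char ch = (unsigned char)*text;
        if (isalpha(ch)) {
            c.letters++;
            if (!in_word)
                c.words++;       /* first letter of a new word */
            in_word = 1;
        } else if (ch == '.' || ch == '!' || ch == '?') {
            /* Only a terminator that directly follows a word ends a
               sentence, so "..." counts once rather than three times. */
            if (in_word)
                c.sentences++;
            in_word = 0;
        } else if (isspace(ch)) {
            in_word = 0;
        }
    }
    return c;
}
```

Under these rules, "yes!" gives 3 letters, 1 word and 1 sentence, and "Hello... World!" gives 10 letters, 2 words and 2 sentences.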
Your count will always be one word short, because you only add a word to your counter after a space or a newline. The last word is not followed by either, so it is never counted. After processing the whole text, add 1 to your word counter.
Example: Yes! --> 3 letters, 1 sentence, 0 words; add one and it is solved.
Another example: Hello World! --> 10 letters, 1 sentence, 1 word; add one and it is solved.
As part of an assignment, I am supposed to write a small program that accepts an indefinite number of strings and then prints them out.
This program compiles, with the following warning:
desafio1.c:24:16: warning: format not a string literal and no format arguments [-Wformat-security]
printf(words[i]);
It prints the following characters on the screen: �����8 ���#Rl�. I guess the strings I read with getchar were not properly terminated with the null byte, so it prints garbage. The logic of the program is to run a while loop until I press the enter key (\n); whenever there is a space, the characters read so far form a word, which is stored in the array words. Why am I running into problems, given that in the else branch, once a space is found, I terminate the word with word[i] = '\0' and store the result in the array words?
#include <stdio.h>
#include <string.h>
int main()
{
char words[100][100];
int i,c;
char word[1000];
while((c = getchar()) != '\n')
{
if (c != ' '){
word[i++] = c;
c = getchar();
}
else{
word[i] = '\0';
words[i] == word;
}
}
int num = sizeof(words) / sizeof(words[0]);
for (i = 0; i < num; i++){
printf(words[i]);
}
return 0;
}
Here are some fixes to your code. As a pointer (as mentioned in other comments), make sure to enable compiler warnings, which will help you find 90% of the issues you had. (gcc -Wall)
#include <stdio.h>
#include <string.h>
int main() {
char words[100][100];
int i = 0;
int j = 0;
int c;
char word[1000];
while((c = getchar()) != '\n') {
if (c != ' '){
word[i++] = c;
} else {
word[i] = '\0';
strcpy(words[j++], word);
i = 0;
}
}
word[i] = '\0';
strcpy(words[j++], word);
for (i = 0; i < j; i++) {
printf("%s\n", words[i]);
}
return 0;
}
i was uninitialized, so its value was undefined. It should start at 0. It also needs to be reset to 0 after each word so it starts at the beginning.
The second c = getchar() was unnecessary, as this is done in every iteration of the loop. This was causing your code to skip every other letter.
You need two counters, one for the place in the word, and one for the number of words read in. That's what j is.
== is for comparison, not assignment. Either way, strcpy() was needed here since you are filling out an array.
Rather than looping through all 100 elements of the array, just loop through the words that have actually been filled (up to j).
The last word input was ignored by your code, since it ends with a \n, not a space. That's what the lines after the while are for.
When using printf(), the arguments should always be a format string ("%s"), followed by the arguments.
Of course, there are other things as well that I didn't fix (such as the disagreement between the 1000-character word and the 100-character words). If I were you, I'd think about what to do if the user entered, for some reason, more than 1000 characters in a word, or more than 100 words. Your logic will need to be modified in these cases to prevent illegal memory accesses (outside the bounds of the arrays).
As a reminder, this program does not accept an indefinite number of words, but only up to 100. You may need to rethink your solution as a result.
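One way to guard against the out-of-bounds accesses mentioned above is sketched below. It splits an already-read line rather than calling getchar(), truncates over-long words and drops words beyond the limit; those policies, the MAX_WORDS and MAX_LEN constants, and the split_words name are all assumptions made for this example, not part of the original fix.

```c
#include <stddef.h>

#define MAX_WORDS 100
#define MAX_LEN   100   /* per word, including the terminating '\0' */

/* Hypothetical helper: splits `line` on spaces into `words`.
   Words longer than MAX_LEN - 1 characters are truncated and words
   beyond MAX_WORDS are dropped. Returns the number of words stored. */
size_t split_words(const char *line, char words[MAX_WORDS][MAX_LEN])
{
    size_t count = 0;   /* words stored so far */
    size_t len = 0;     /* characters stored in the current word */
    int in_word = 0;

    for (; *line != '\0' && *line != '\n'; line++) {
        if (*line == ' ') {
            if (in_word) {
                words[count++][len] = '\0';   /* close the current word */
                in_word = 0;
            }
        } else {
            if (!in_word) {
                if (count == MAX_WORDS)
                    break;                    /* no room for another word */
                in_word = 1;
                len = 0;
            }
            if (len < MAX_LEN - 1)            /* leave room for '\0' */
                words[count][len++] = *line;
        }
    }
    if (in_word)
        words[count++][len] = '\0';           /* the last word has no trailing space */
    return count;
}
```

A truly indefinite number of words would need dynamically allocated storage (for example with realloc), which this sketch deliberately leaves out.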
I am doing a course on the basics of C programming. I've been given a task to create a program that counts the number of words in a sentence, and I've achieved this. However, I have a secondary task to stop the program from counting punctuation. On top of this, if I type consecutive spaces, I need the program to ignore them, and I don't know how to get round that. Could anyone point me in the right direction? I am not looking for anyone to write the code for me.
here is my code:
#include <stdio.h>
int main()
{
const char end = '.';
int words = 1;
printf("please enter a sentence: \n");
char c = getchar();
while (c != end)
{
c = getchar();
if (c == ' ')
words++;
}
printf("the total number of words is %d", words);
getchar();
getchar();
}
Working from the code you give (i.e. counting words by finding delimiting spaces), you can solve the multiple-spaces problem by remembering the previous character and ignoring any space that follows another space:
while (c != end)
{
c = getchar();
if (c == ' ' && previous_c != ' ')
words++;
previous_c = c;
}
Note though that if the user begins the input with a single space, the program will still count it as one word. To prevent this, you should initialise previous_c to some known value (e.g. 0) and check for this case as well. The if condition would then become (c == ' ' && previous_c != ' ' && previous_c != 0).
As Cool Guy commented, the program you've shown already ignores punctuation as-is.
As another improvement I would suggest looking at using a do...while loop instead of the while loop to reduce the places you need to call getchar()
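A possible shape for the do...while version is sketched below. It reads from a FILE * so it can be tried without a terminal, and it counts word starts (a non-space that follows a space) instead of counting spaces, which is a different technique from the snippet above; the function name count_words_until and the EOF handling are assumptions for this sketch.

```c
#include <stdio.h>

/* Hypothetical helper: counts the words read from `in` up to the first
   occurrence of the character `end`, or up to end of input. */
int count_words_until(FILE *in, int end)
{
    int c;
    int previous_c = ' ';   /* pretend a space precedes the input */
    int words = 0;

    do {
        c = getc(in);       /* the only place a character is read */
        if (c == EOF)
            break;
        /* a word starts at a non-space character that follows a space */
        if (c != ' ' && c != end && previous_c == ' ')
            words++;
        previous_c = c;
    } while (c != end);

    return words;
}
```

Calling count_words_until(stdin, '.') would mirror the original program's "read until the full stop" behaviour, while leading and repeated spaces no longer affect the count.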
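I would split it into two loops: one that skips all non-word characters and a second that skips the word characters. A word starts between these two loops, and that is where it gets counted.
#include <stdio.h>
#include <ctype.h>
int main(void)
{
int c; // int, so that EOF can be detected
int words = 0;
printf("please enter a sentence: \n");
for(;;) {
while((c=getchar())!=EOF && !isalpha(c)); // skip non-word characters
if(c==EOF) break;
words++; // a word starts here
while((c=getchar())!=EOF && isalpha(c)); // skip the rest of the word
if(c==EOF) break;
}
printf("the total number of words is %d\n", words);
return 0;
}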
As part of my course, I have to learn C using Turbo C (unfortunately).
Our teacher asked us to make a piece of code that counts the number of characters, words and sentences in a paragraph (only using printf, getch() and a while loop.. he doesn't want us to use any other commands yet). Here is the code I wrote:
#include <stdio.h>
#include <conio.h>
void main(void)
{
clrscr();
int count = 0;
int words = 0;
int sentences = 0;
char ch;
while ((ch = getch()) != '\n')
{
printf("%c", ch);
while ((ch = getch()) != '.')
{
printf("%c", ch);
while ((ch = getch()) != ' ')
{
printf("%c", ch);
count++;
}
printf("%c", ch);
words++;
}
sentences++;
}
printf("The number of characters are %d", count);
printf("\nThe number of words are %d", words);
printf("\nThe number of sentences are %d", sentences);
getch();
}
It does work (counts the number of characters and words at least). However when I compile the code and check it out on the console window I can't get the program to stop running. It is supposed to end as soon as I input the enter key. Why is that?
Here you have the solution to your problem:
#include <stdio.h>
#include <conio.h>
void main(void)
{
clrscr();
int count = 0;
int words = 0;
int sentences = 0;
char ch;
ch = getch();
while (ch != '\n')
{
while (ch != '.' && ch != '\n')
{
while (ch != ' ' && ch != '\n' && ch != '.')
{
count++;
ch = getch();
printf("%c", ch);
}
words++;
while(ch == ' ') {
ch = getch();
printf("%c", ch);
}
}
sentences++;
while(ch == '.' || ch == ' ') {
ch = getch();
printf("%c", ch);
}
}
printf("The number of characters are %d", count);
printf("\nThe number of words are %d", words);
printf("\nThe number of sentences are %d", sentences);
getch();
}
The problem with your code is that the innermost while loop was consuming all the characters. Once you enter it, typing a dot or a newline does not get you out, because the loop only exits on a blank. And when you do exit the innermost loop, you risk getting stuck in the second loop, because ch will be a blank and therefore always different from '.' and '\n'. Since my solution only reads a character in the innermost loop, the other loops need to "eat" the blank and the dot in order to move on to the following characters.
Checking these conditions in the two inner loops makes the code work.
Notice that I removed some of your prints.
Hope it helps.
Edit: I added the instructions to print what you type and a last check in the while loop after sentences++ to check the blank, otherwise it will count one word more.
int ch;
int flag;
while ((ch = getch()) != '\r'){
++count;
flag = 1;
while(flag && (ch == ' ' || ch == '.')){
++words;//no good E.g Contiguous space, Space at the beginning of the sentence
flag = 0;
}
flag = 1;
while(flag && ch == '.'){
++sentences;
flag=0;
}
printf("%c", ch);
}
printf("\n");
I think the problem is because of your outer while loop's condition. It checks for a newline character '\n', as soon as it finds one the loop terminates. You can try to include your code in a while loop with the following condition
while((c=getchar())!=EOF)
This will stop taking input when the user signals end-of-file, e.g. with Ctrl+Z on Windows (or Ctrl+D on Unix-like systems).
Hope this helps..
You can easily emulate an if statement with a while statement:
bool flag = true;
while(IF_COND && flag)
{
//DO SOMETHING
flag = false;
}
Just plug this pattern into a simple solution that would otherwise use if statements.
For example:
#include <stdio.h>
#include <conio.h>
void main(void)
{
int count = 0;
int words = 1;
int sentences = 1;
char ch;
bool if_flag;
while ((ch = getch()) != '\n')
{
count++;
if_flag = true;
while (ch==' ' && if_flag)
{
words++;
if_flag = false;
}
if_flag = true;
while (ch=='.' && if_flag)
{
sentences++;
if_flag = false;
}
}
printf("The number of characters are %d", count);
printf("\nThe number of words are %d", words);
printf("\nThe number of sentences are %d", sentences);
getch();
}
#include <stdio.h>
#include <ctype.h>
int main(void){
int sentence=0,characters =0,words =0,c=0,inside_word = 0,temp =0;
// while ((c = getchar()) != EOF)
while ((c = getchar()) != '\n') {
//a word is complete when we arrive at a space after we
// are inside a word or when we reach a full stop
while(c == '.'){
sentence++;
temp = c;
c = 0;
}
while (isalnum(c)) {
inside_word = 1;
characters++;
c =0;
}
while ((isspace(c) || temp == '.') && inside_word == 1){
words++;
inside_word = 0;
temp = 0;
c =0;
}
}
printf(" %d %d %d",characters,words,sentence);
return 0;
}
This should do it.
isalnum checks whether the character is alphanumeric, i.e. a letter or a digit; I don't expect other ASCII characters in the sentences for this program.
isspace, as the name says, checks for whitespace.
You need the ctype.h header for these. Alternatively, you could use
while(c == ' ') and while((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
if you don't want to use isspace and isalnum. Your choice, but it will be less elegant :)
The trouble with your code is that you consume the characters in each of your loops.
A '\n' will be consumed either by the loop that scans for words or by the one that scans for sentences, so the outer loop will never see it.
Here is a possible solution to your problem:
int sentences = 0;
int words = 0;
int characters = 0;
int in_word = 0; // state of our parser
int ch;
do
{
int end_word = 1; // assume the word ends by default
ch = getch();
characters++; // count characters
switch (ch)
{
case '.':
sentences++; // any dot is considered end of a sentence and a word
break;
case ' ': // a space is the end of a word
break;
default:
in_word = 1; // any non-space non-dot char is considered part of a word
end_word = 0; // cancel word ending
}
// handle word termination
if (in_word && end_word)
{
in_word = 0;
words++;
}
} while (ch != '\n');
A general approach to these parsing problems is to write a finite-state machine that will read one character at a time and react to all the possible transitions this character can trigger.
In this example, the machine has to remember if it is currently parsing a word, so that one new word is counted only the first time a terminating space or dot is encountered.
This piece of code uses a switch for concision. You can replace it with an if...else if sequence to please your teacher :).
If your teacher forced you to use only while loops, then your teacher has done a stupid thing. The equivalent code without other conditional expressions will be heavier, less understandable and redundant.
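The state machine described above can also be written with its state made explicit. The sketch below is an illustration only: the enum, the struct counts type and both function names are hypothetical, and it works on a string (stopping at '\n' or the end of the string) rather than on getch() input.

```c
/* The two states of the machine. */
enum parse_state { OUTSIDE_WORD, INSIDE_WORD };

/* Hypothetical result type for this sketch. */
struct counts { int characters; int words; int sentences; };

/* Feeds one character to the machine and returns the next state. */
enum parse_state step(enum parse_state state, int ch, struct counts *c)
{
    c->characters++;
    if (ch == '.' || ch == ' ') {
        if (ch == '.')
            c->sentences++;        /* any dot ends a sentence */
        if (state == INSIDE_WORD)
            c->words++;            /* transition INSIDE_WORD -> OUTSIDE_WORD */
        return OUTSIDE_WORD;
    }
    return INSIDE_WORD;            /* anything else is part of a word */
}

struct counts count_string(const char *s)
{
    struct counts c = {0, 0, 0};
    enum parse_state state = OUTSIDE_WORD;

    for (; *s != '\0' && *s != '\n'; s++)
        state = step(state, (unsigned char)*s, &c);
    if (state == INSIDE_WORD)
        c.words++;                 /* end of input also ends a pending word */
    return c;
}
```

Keeping the transition logic in one small function makes each rule (what ends a word, what ends a sentence) visible in a single place.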
Since some people seem to think it's important, here is one possible solution:
int sentences = 0;
int words = 0;
int characters = 0;
int in_word = 0; // state of our parser
int ch;
// read initial character
ch = getch();
// do it with only while loops
while (ch != '\n')
{
// count characters
characters++;
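// a space or a dot ends the current word, if any
while (in_word && (ch == ' ' || ch == '.'))
{
in_word = 0;
words++;
}
// skip spaces
while (ch == ' ')
{
ch = -1;
}
// detect sentences
while (ch == '.')
{
sentences++;
ch = -1;
}
// detect words: anything not consumed above is part of a word
while (ch != -1)
{
in_word = 1;
ch = -1;
}
// read next character
ch = getch();
}
// the newline also ends a pending word
while (in_word)
{
in_word = 0;
words++;
}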
Basically you can replace if (c == xxx) ... with while (c == xxx) { c = -1; ... }, which is an artificial, contrived way of programming.
An exercise should not promote stupid ways of doing things, IMHO.
That's why I suspect you misunderstood what the teacher asked.
Obviously if you can use while loops you can also use if statements.
Trying to do this exercise with only while loops is futile and results in something that has little or nothing to do with real parser code.
All these solutions are incorrect. The only way you can solve this is by creating an AI program that uses Natural Language Processing which is not very easy to do.
Input:
"This is a paragraph about the Turing machine. Dr. Allan Turing invented the Turing Machine. It solved a problem that has a .1% change of being solved."
Check out OpenNLP:
https://sourceforge.net/projects/opennlp/
http://opennlp.apache.org/
I'm coding a basic program to check if a string is a palindrome or not.
#include <stdio.h>
#include <string.h> //Has some very useful functions for strings.
#include <ctype.h> //Can sort between alphanumeric, punctuation, etc.
int main(void)
{
char a[100];
char b[100]; //Two strings, each with 100 characters.
int firstchar;
int midchar;
int lastchar;
int length = 0;
int counter = 0;
printf(" Enter a phrase or word for palindrome checking: \n \n ");
while ((a[length] == getchar()) !10 ) //Scanning for input ends if the user presses enter.
{
if ((a[length -1]), isalpha) // If a character isalpha, keep it.
{
b[counter] = a[length-1];
counter++;
}
length--; //Decrement.
}
makelower(b, counter); //Calls the function that changes uppercase to lowercase.
for( firstchar = 0; firstchar < midchar; firstchar++ ) //Compares the first and last characters.
{
if ( a[firstchar] != a[lastchar] )
{
printf(", is not a palindrome. \n \n");
break;
}
lastchar--;
}
if( firstchar == midchar )
{
printf(", is a palindrome. \n \n");
}
return 0;
}
//Declaring additional function "makelower" to change everything remaining to lowercase chars.
int makelower (char c[100], int minicount)
{
int count = 0;
while (count <= minicount)
{
c[count] = tolower(c[count]);
}
return 0;
}
And I'm getting the following compiler error on the line with the first while loop, immediately after the printf statement:
p5.c: In function 'main':
p5.c:30: error: expected ')' before '!' token
I've looked up and down, but I haven't found any out-of-place or nonpartnered parenthesis. The only thing I can think of is that I'm missing a comma or some kind of punctuation, but I've tried placing a comma in a few places to no avail.
Sorry if this is too specific. Thanks in advance.
while ((a[length] == getchar()) !10 )
What it looks like you're trying for is assigning to a[length] the result of getchar() and verifying that that is not equal to 10. Which is spelled like so:
while ((a[length] = getchar()) != 10)
= is assignment, == is the test.
Further, your counters are confused. length is initialized to 0 and is only ever decremented, which will lead to falling off the front of the array after the first decrement. That doesn't even get a chance to happen, because you first attempt to access a[length-1], which is already out of bounds. This looks like an off-by-one error, also known as a fencepost error, in accessing the character you just read from getchar().
Also, since nothing is checking that the length of recorded input doesn't exceed the length of your buffer a[100], you could fall off the end there as well.
The counters for your palindrome check function are also off. midchar and lastchar are never initialized, midchar is never set, and lastchar is decremented without ever having a value set. You would probably be better off testing a[firstchar] == a[(counter-1)-firstchar].
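Putting these points together, the check could look roughly like the sketch below. It works on a string instead of reading with getchar(), the function name is_palindrome is made up for this example, and two choices are assumptions: an input without letters counts as a palindrome, and letters beyond the 100-character buffer are ignored.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the letters of `s` read the same in
   both directions, ignoring case and non-letter characters, else 0. */
int is_palindrome(const char *s)
{
    char letters[100];
    size_t counter = 0;

    /* Keep only letters, lowercased, and never write past the buffer. */
    for (; *s != '\0' && counter < sizeof letters; s++) {
        unsigned char ch = (unsigned char)*s;
        if (isalpha(ch))
            letters[counter++] = (char)tolower(ch);
    }

    /* Compare the character at `first` with its mirror image. */
    for (size_t first = 0; first < counter / 2; first++) {
        if (letters[first] != letters[(counter - 1) - first])
            return 0;
    }
    return 1;
}
```

Only the first half of the filtered letters needs to be visited, because each comparison covers one character from each end.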