For an intro to C class we have to write a program that counts the number of lines, words, and characters in a file. In the program, a word is defined as a sequence of letters, digits, and apostrophes that begins with a letter. For some reason my logic for counting words just isn't working; maybe it's because I'm new to C, or because I've always been bad at formulating logic. My current code, when given
hey whats up\n
hey what's up\n
hey wh?ts 'p\n
returns 3 lines, 31 words, and 40 characters. Thanks for any help. I know this is a super lame question; it's just driving me insane.
Here is my code:
#include <stdio.h>
typedef enum yesno yesno;
enum yesno {
YES,
NO
};
int main() {
int c; // character
int nl, nw, nc; // number of lines, words, characters
yesno inword; // records if we are in a word or not
yesno badchar;
// initialize variables:
badchar=NO;
inword = NO;
nl = 0;
nw = 0;
nc = 0;
while ((c = getchar()) != EOF) {
++nc;
if (c == '\n')
++nl;
if (c == ' ' || c == '\n' || c == '\t')
inword = NO;
else if (inword == NO) {
inword = YES;
}
while (inword == YES){
if (( c<'A' || c>'Z')||(c<'a'||c>'z')||(c<'0'|| c>'9') ){
inword= NO;
//badchar = YES;
}
if (( c<'A' || c>'Z')||(c<'a'||c>'z')|| (c<'0'|| c>'9') ||(c!= '\'')){
nw=nw;
inword = NO;
//badchar=YES;
}
if(badchar==NO){
nw++;
badchar=NO;
inword= NO;
}
}
}
printf("%d %d %d\n", nl, nw, nc);
}
One problem is this condition:
if (( c<'A' || c>'Z')||(c<'a'||c>'z')||(c<'0'|| c>'9') ){
inword = NO;
Consider a value of c such as:
'A': this is going to be less than 'a', so you'll switch to inword = NO.
'a': this is going to be greater than 'Z', so you'll switch to inword = NO.
'0': this is going to be less than 'A', so you'll switch to inword = NO.
You need to use && between the sets of conditions:
if ((c < 'A' || c > 'Z') && (c < 'a' || c > 'z') && (c < '0' || c > '9')){
Or, better, you could use the macros/functions from <ctype.h>:
if (!isupper(c) && !islower(c) && !isdigit(c))
but that can be abbreviated to:
if (!isalnum(c))
You'll need to review the other tests in the same way. There may be further problems; I haven't reviewed the rest of the code.
I've never programmed C. But when I've programmed this same thing in other languages, it's not too tough. For word count, replace "\n" with space, then split the string into an array using a space as the delimiter, and finally count the number of elements in the array. Similar thing to get the line count: split the string into an array using "\n" as the delimiters, then count the number of elements in the array.
I am reading the book The C Programming Language by Brian W. Kernighan and Dennis M. Ritchie and trying to solve this exercise:
Write a program to print a histogram of the lengths of words in its input. It is easy to draw the histogram with the bars horizontal; a vertical orientation is more challenging.
I think my solution works, but the problem is that the terminal doesn't show the result until I send EOF. I know that the loop condition asks for exactly that, but I am wondering whether there is a way to make the program terminate after reading a single line. (Sorry if my explanation of the problem is a bit shallow. Feel free to ask for more.)
#include <stdio.h>
int main ()
{
int digits[10];
int nc=0;
int c, i, j;
for (i = 0; i <= 10; i++)
digits[i] = 0;
//take input;
while ((c = getchar ()) != EOF) {
++nc;
if (c == ' ' || c=='\n') {
++digits[nc-1];
//is it also counting the space in nc? i think it is,so we should do nc-1
nc = 0;
}
}
for (i = 1; i <= 5; i++) {
printf("%d :", i);
for (j = 1; j <= digits[i]; j++) {
printf ("*");
}
printf ("\n");
}
// I think this is a problem with getchar()
//the program doesn't exit automatically
//need to find a way to do it
}
You could try something like
while ((c = getchar ()) != EOF && c != '\n') {
and then add a few lines after the while loop to account for the last word:
if (c == '\n') {
++digits[nc-1];
nc = 0;
}
There is another problem in your program. ++digits[nc-1]; uses the right index, but for the wrong reason. The reason to subtract one is that array indices start at zero: an array of length 10 has indices 0 to 9, so a word of length n should be counted at index n - 1 (there are no words of length zero). The real problem is that you are still counting the blank or newline character as part of the word's length, so if two blanks follow a word of length 4, the program records a word of length 5 plus a word of length 1. To avoid this, count only non-blank characters and record a word only when at least one has been seen:
while ((c = getchar ()) != EOF) {
if ((c == ' ' || c == '\n' || c == '\t') && nc > 0) {
++digits[nc-1]; // arrays start at zero
nc = 0;
}
else {
++nc;
}
}
Exact formulation:
Write a program that counts the number of words in the input line. A word here means a sequence of characters whose first character must be a letter.
Examples of inputs and outputs:
Input: one 2two three
Output: 2
Input: one two three four five 6six
Output: 5
Input: 789878moer and more
Output: 2
Input: something like 8this
Output: 2
Program:
#include <stdio.h>
#define YES 1
#define NO 0
int main() {
int c, nw, inword, first_char;
inword = first_char = NO;
nw = 0;
while((c = getchar()) != EOF) {
if (c == ' ' || c == '\n' || c == '\t') {
inword = first_char = NO;
} else if (inword == NO && first_char == NO) {
if ((65 < c && c < 90) || (97 < c && c < 122)) {
++nw;
inword = YES;
} else {
first_char = YES;
}
}
}
printf("%d\n", nw);
}
Questions:
Is it a correct solution?
Is it possible to solve this task in a more elegant way? If so, how?
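Is it a correct solution?
Almost. I tested a few cases and it seems okay, except that the strict comparisons exclude the boundary letters: 'A' (65), 'Z' (90), 'a' (97), and 'z' (122) are not recognized, so a word starting with one of them is not counted.
Is it possible to solve this task in a more elegant way? If so, how?
The following line
if((65 < c && c < 90) || (97 < c && c < 122))
uses magic numbers (ASCII values) to check whether c is an alphabetic character.
You can instead use the library function isalpha(), which is declared in the <ctype.h> header, so that the above line becomes: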
if (isalpha(c))
I am new to programming. I am trying to write a program that counts the lines, words, and characters in a text file. Here is the code:
#include "stdio.h"
#include "stdlib.h"
#define IN 1
#define OUT 0
int main (int argc, char *argv[]) {
FILE *input;
int character, newword, newline, state;
char c;
state = OUT;
character = newline = newword =0;
input = fopen(argv[1], "r");
if ( input == NULL){
printf("Error! Can not read the input\n");
exit(-1);
}
while ((c = fgetc(input)) != EOF){
character++;
if (c <'a' && c >'z'){;}
if ( c <'A' && c >'Z'){;}
if (c == '\n'){
newline++;
}
if (c == ' ' || c == '\n' || c == '\t'){
state = OUT;
}
else if (state == OUT){
state = IN;
newword++;
}
}
printf("The number of lines: %d\n", newline);
printf("The number of words: %d\n", newword);
printf("The number of characters: %d\n", character);
fclose(input);
}
I have been trying to figure out how not to read special characters such as !, #, #, $, %, ^, &, *, (, ), _, +.
I tried using if statements so that it won't read the special characters, but it still reads them. I think the if statement for the capital letters is wrong, because it probably will not read lowercase letters.
In the file the following text is in it,
!!.
and it outputs in terminal:
The number of lines: 2
The number of words: 5
The number of characters: 7
However, if I take out the two if statements (c < 'A' && c > 'Z') and (c < 'a' && c > 'z'), then the output becomes
The number of lines: 2
The number of words: 1
The number of characters: 7
Any hints to fix this problem (I do not want the Answer!)?
Your if must be something like:
if ('a' <= c && c <='z'){character++;}
else if ( 'A' <= c && c <='Z'){character++;}
The easiest way to solve your problem is to increment the character counter only when the character lies in the range 'a' to 'z' or 'A' to 'Z'. Since the escape sequence '\n' starts a new line, it also ends a word, so increment both the line counter and the word counter when you see it. Finally, increment the word counter on a space or a horizontal tab.
if (('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z')){
++character;
}
else if (c == '\n'){
++newline;
++newword;
}
else if (c == ' ' || c == '\t'){
++newword;
}
I'm learning C and I've hit a wall; I'd appreciate any help. Here is the exercise: "Write a program that reads characters from the standard input to end-of-file. For each character, have the program report whether it is a letter. If it is a letter, also report its numerical location in the alphabet and -1 otherwise." (By the way, this is not homework.) The problem is with '\n': I don't know how to make it an exception. I'm new around here, so please let me know if I omitted something. Thank you for your help.
int main(void)
{
char ch;
int order;
printf("Enter letters and it will tell you the location in the alphabet.\n");
while ((ch = getchar()) != EOF)
{
printf("%c", ch);
if (ch >= 'A' && ch <= 'Z')
{
order = ch - 'A' + 1;
printf(" %d \n", order);
}
if (ch >= 'a' && ch <= 'z')
{
order = ch - 'a' + 1;
printf(" %d \n", order);
}
if (order != (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z'))
{
if (ch == '\n');
else if (order != (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z'))
printf(" -1 \n");
}
}
system("pause");
}
You are talking about an "exception", a word that has a different meaning in programming.
I understand that you want '\n' to be excluded from the set of nonalphabetic characters, that is, it should not produce the error value -1.
Since you are running the program from the console, a sequence of characters is read until the ENTER key is pressed, which generates the character '\n'. So I'm not sure that the while() condition you used, which compares against EOF, is a good choice.
I would compare directly against '\n' there:
while ((ch = getchar()) != '\n')
To report whether ch is a letter, we can use string literals. Assigning a string literal to a char pointer deserves more explanation, which I will omit here, but it is valid:
char *info;
if (order != -1)
info = "is a letter";
else
info = "is not a letter";
You are assuming an encoding where the letters are in contiguous increasing order (as in ASCII).
Under that assumption, it's enough to work with either uppercase or lowercase letters, since you are only interested in the position that the letter occupies in the alphabet. So you can choose to work with uppercase, for example, in this way:
if (ch >= 'a' && ch <= 'z')
ch = (ch - 'a' + 'A');
The effect of that code is that ch is converted to uppercase only if it is a lowercase letter. Other characters are not affected.
As a consequence, from now on, you only have uppercase letters, or nonalphabetical characters.
Then it's easy to code the remaining part:
if (ch >= 'A' && ch <= 'Z')
order = ch - 'A' + 1; // position of the letter in the alphabet
else
order = -1; // not a letter
A printf() at the end of the loop can report all the information about the character:
printf(" %16s: %4d \n", info, order);
The resulting code is shorter and clearer:
#include <stdio.h>
int main(void) {
char ch;
int order;
char *info;
while ((ch = getchar()) != '\n') {
printf("%c",ch);
if (ch >= 'a' && ch <= 'z') /* Converting all to uppercase */
ch = (ch - 'a' + 'A');
if (ch >= 'A' && ch <= 'Z')
order = ch - 'A' + 1; /* Position of letter in alphabet */
else
order = -1; /* Not in alphabet */
if (order != -1)
info = "is a letter";
else
info = "is not a letter";
printf(" %16s: %4d \n", info, order);
}
}
If you need to end the input by comparing against EOF, then the type of ch has to be changed from char to int, so that the EOF value (which is an int) can be held in ch and told apart from every valid character.
Note that ch needs no separate initialization in that case, because the loop condition assigns it before it is ever read.
Finally, just for fun, here is my super-short version:
#include <stdio.h>
int main(void) {
int ch, order;
while ((ch = getchar()) != '\n') {
order = (ch>='a' && ch<='z')? ch-'a'+1:((ch>='A' && ch<='Z')? ch-'A'+1: -1);
printf("%c %8s a letter: %4d \n", ch, (order != -1)? "is":"is not", order);
}
}
The C language does not have exceptions. Exceptions are a feature of C++ and of many other languages. You can emulate them manually in C using setjmp() and longjmp(), but it really isn't worth it.
The two most popular ways of doing error handling in C are:
Invalid return value. If you can return -1 or some other invalid value from a function to indicate 'there was an error', do it. This of course doesn't work for all situations. If all possible return values are valid, such as in a function which multiplies two numbers, you cannot use this method. This is what you want to do here - simply return -1.
Set some global error flag, and remember to check it later. Avoid this when possible. This method ends up resulting in code that looks similar to exception code, but has some serious problems. With exceptions, you must explicitly ignore them if you don't want to handle the error (by swallowing the exception). Otherwise, your program will crash and you can figure out what is wrong. With a global error flag, however, you must remember to check for them; and if you don't, your program will do the wrong thing and you will have no idea why.
First of all, you need to define what you mean by "exception"; do you want your program to actually throw an exception when it sees a newline, or do you simply want to handle a newline as a special case? C does not provide structured exception handling (you can kind-of sort-of fake it with setjmp/longjmp and signal/raise, but it's messy and a pain in the ass).
Secondly, you will want to read up on the following library functions:
isalpha
tolower
as they will make this a lot simpler; your code basically becomes:
if ( isalpha( ch ) )
{
// this is an alphabetic character
int lc = tolower( ch ); // convert to lower case (no-op if ch is already lower case)
order = lc - 'a' + 1;
}
else
{
// this is a non-alphabetic character
order = -1;
}
As for handling the newline, do you want to just not count it at all, or treat it like any other non-alphabetic character? If the former, just skip past it:
// non-alphabetic character
if ( ch == '\n' )
continue; // immediately goes back to beginning of loop
order = -1;
If the latter, then you don't really have to do anything special.
If you really want to raise an honest-to-God exception when you see a newline, you can do something like the following (I honestly do not recommend it, though):
#include <setjmp.h>
...
jmp_buf try;
if ( setjmp( try ) == 0 ) // "try" block
{
while ( (ch = getchar() ) != EOF )
{
...
if ( ch == '\n' )
longjmp( try, 1 ); // "throw"
}
}
else
{
// "catch" block
}
I'm having a hard time understanding why you even try to handle '\n' specifically.
You might be trying to implement something like this:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int ch; /* int, so that EOF can be told apart from every character */
int order;
printf("Enter letters and it will tell you the location in the alphabet.\n");
while ((ch = getchar()) != EOF)
{
printf("%c", ch);
if (ch >= 'A' && ch <= 'Z') {
order = ch - 'A' + 1;
printf(" %d \n", order);
} else if (ch >= 'a' && ch <= 'z') {
order = ch - 'a' + 1;
printf(" %d \n", order);
} else if (ch == '\n') { } else {
printf(" -1 \n");
}
}
system("pause");
}
While this is a good solution, I would recommend rewriting it in a more compact way:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int ch; /* int, so that EOF can be told apart from every character */
printf("Enter letters and it will tell you the location in the alphabet.\n");
while ((ch = getchar()) != EOF)
{
int order;
if (ch != '\n'){
if ((ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z')) {
order = ch & 0x1F; /* low five bits: 1 for 'A'/'a' up to 26 for 'Z'/'z' */
printf("Letter %d\n", order);
} else {
order = -1;
printf("Not letter: %d\n", order);
}
}
}
system("pause");
}
This version relies on the specific way letters are encoded in ASCII: the low five bits of a letter's code give its position in the alphabet.
As part of my course, I have to learn C using Turbo C (unfortunately).
Our teacher asked us to make a piece of code that counts the number of characters, words and sentences in a paragraph (only using printf, getch() and a while loop.. he doesn't want us to use any other commands yet). Here is the code I wrote:
#include <stdio.h>
#include <conio.h>
void main(void)
{
clrscr();
int count = 0;
int words = 0;
int sentences = 0;
char ch;
while ((ch = getch()) != '\n')
{
printf("%c", ch);
while ((ch = getch()) != '.')
{
printf("%c", ch);
while ((ch = getch()) != ' ')
{
printf("%c", ch);
count++;
}
printf("%c", ch);
words++;
}
sentences++;
}
printf("The number of characters are %d", count);
printf("\nThe number of words are %d", words);
printf("\nThe number of sentences are %d", sentences);
getch();
}
It does work (counts the number of characters and words at least). However when I compile the code and check it out on the console window I can't get the program to stop running. It is supposed to end as soon as I input the enter key. Why is that?
Here you have the solution to your problem:
#include <stdio.h>
#include <conio.h>
void main(void)
{
clrscr();
int count = 0;
int words = 0;
int sentences = 0;
char ch;
ch = getch();
while (ch != '\n')
{
while (ch != '.' && ch != '\n')
{
while (ch != ' ' && ch != '\n' && ch != '.')
{
count++;
ch = getch();
printf("%c", ch);
}
words++;
while(ch == ' ') {
ch = getch();
printf("%c", ch);
}
}
sentences++;
while(ch == '.' || ch == ' ') {
ch = getch();
printf("%c", ch);
}
}
printf("The number of characters are %d", count);
printf("\nThe number of words are %d", words);
printf("\nThe number of sentences are %d", sentences);
getch();
}
The problem with your code is that the innermost while loop consumes all the characters. Once you are inside it, typing a dot or a newline does not get you out, because the loop only stops on a blank. And when you do leave the innermost loop, you risk getting stuck in the second loop, because ch is then a blank and so always different from '.' and '\n'. In my solution the two outer loops only test ch, so after a word or a sentence ends they need to "eat" the blank and the dot in order to go on with the other characters.
Checking these conditions in the two inner loops makes the code work.
Notice that I removed some of your prints.
Hope it helps.
Edit: I added the instructions to print what you type and a last check in the while loop after sentences++ to check the blank, otherwise it will count one word more.
int ch;
int flag;
while ((ch = getch()) != '\r'){
++count;
flag = 1;
while(flag && (ch == ' ' || ch == '.')){
++words;//no good E.g Contiguous space, Space at the beginning of the sentence
flag = 0;
}
flag = 1;
while(flag && ch == '.'){
++sentences;
flag=0;
}
printf("%c", ch);
}
printf("\n");
I think the problem is because of your outer while loop's condition. It checks for a newline character '\n', as soon as it finds one the loop terminates. You can try to include your code in a while loop with the following condition
while((c=getchar())!=EOF)
This stops reading input when the user presses Ctrl+Z, which signals end-of-file on DOS and Windows consoles.
Hope this helps..
You can easily emulate an if statement with a while statement:
int flag = 1; // plain int: bool needs <stdbool.h> (C99), which Turbo C does not provide
while(IF_COND && flag)
{
//DO SOMETHING
flag = 0;
}
Just plug this pattern into a simple solution that uses if statements.
For example:
#include <stdio.h>
#include <conio.h>
void main(void)
{
int count = 0;
int words = 1;
int sentences = 1;
char ch;
int if_flag; // plain int used as a boolean, so no <stdbool.h> is needed
while ((ch = getch()) != '\n')
{
count++;
if_flag = 1;
while (ch==' ' && if_flag)
{
words++;
if_flag = 0;
}
if_flag = 1;
while (ch=='.' && if_flag)
{
sentences++;
if_flag = 0;
}
}
}
printf("The number of characters are %d", count);
printf("\nThe number of words are %d", words);
printf("\nThe number of sentences are %d", sentences);
getch();
}
#include <stdio.h>
#include <ctype.h>
int main(void){
int sentence=0,characters =0,words =0,c=0,inside_word = 0,temp =0;
// while ((c = getchar()) != EOF)
while ((c = getchar()) != '\n') {
//a word is complete when we arrive at a space after we
// are inside a word or when we reach a full stop
while(c == '.'){
sentence++;
temp = c;
c = 0;
}
while (isalnum(c)) {
inside_word = 1;
characters++;
c =0;
}
while ((isspace(c) || temp == '.') && inside_word == 1){
words++;
inside_word = 0;
temp = 0;
c =0;
}
}
printf(" %d %d %d",characters,words,sentence);
return 0;
}
This should do it.
isalnum checks whether the character is alphanumeric, that is, a letter or a digit. I don't expect random ASCII characters in my sentences in this program.
isspace, as the name says, checks for whitespace.
You need the ctype.h header for these. Alternatively, you could write
while (c == ' ') and while ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
if you don't want to use isspace and isalnum. Your choice, but it will be less elegant :)
The trouble with your code is that you consume the characters in each of your loops.
A '\n' will be consumed either by the loop that scans for words or by the one that scans for sentences, so the outer loop will never see it.
Here is a possible solution to your problem:
int sentences = 0;
int words = 0;
int characters = 0;
int in_word = 0; // state of our parser
int ch;
do
{
int end_word = 1; // assume by default that this character ends a word
ch = getch();
characters++; // count characters
switch (ch)
{
case '.':
sentences++; // any dot is considered end of a sentence and a word
break;
case ' ': // a space is the end of a word
break;
default:
in_word = 1; // any non-space non-dot char is considered part of a word
end_word = 0; // cancel word ending
}
// handle word termination
if (in_word && end_word)
{
in_word = 0;
words++;
}
} while (ch != '\n');
A general approach to these parsing problems is to write a finite-state machine that will read one character at a time and react to all the possible transitions this character can trigger.
In this example, the machine has to remember if it is currently parsing a word, so that one new word is counted only the first time a terminating space or dot is encountered.
This piece of code uses a switch for concision. You can replace it with an if...else if sequence to please your teacher :).
If your teacher forced you to use only while loops, then your teacher has done a stupid thing. The equivalent code without other conditional expressions will be heavier, less understandable and redundant.
Since some people seem to think it's important, here is one possible solution:
int sentences = 0;
int words = 0;
int characters = 0;
int in_word = 0; // state of our parser
int ch;
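// read initial character
ch = getch();
// do it with only while loops
while (ch != '\n')
{
// count characters
characters++;
// a space ends the current word, if any
while (ch == ' ')
{
while (in_word)
{
in_word = 0;
words++;
}
ch = -1; // mark the character as handled
}
// a dot ends the current word and the sentence
while (ch == '.')
{
sentences++;
while (in_word)
{
in_word = 0;
words++;
}
ch = -1;
}
// anything not handled above is part of a word
while (ch != -1)
{
in_word = 1;
ch = -1;
}
// read next character
ch = getch();
}
// count a last word that is ended by the newline
while (in_word)
{
in_word = 0;
words++;
}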
Basically you can replace if (c == xxx) ... with while (c == xxx) { c = -1; ... }, which is an artificial, contrived way of programming.
An exercise should not promote stupid ways of doing things, IMHO.
That's why I suspect you misunderstood what the teacher asked.
Obviously if you can use while loops you can also use if statements.
Trying to do this exercise with only while loops is futile and results in something that has little or nothing to do with real parser code.
All these solutions are incorrect. The only way you can really solve this is by creating an AI program that uses natural language processing, which is not very easy to do.
Input:
"This is a paragraph about the Turing machine. Dr. Alan Turing invented the Turing Machine. It solved a problem that has a .1% chance of being solved."
Not every dot in this input ends a sentence (see "Dr." and ".1%"). Check out OpenNLP:
https://sourceforge.net/projects/opennlp/
http://opennlp.apache.org/