C word count program fails when testing for whitespace [duplicate] - c

I created a simple word counting program (a "word" is a sequence of characters that contains no whitespace character). My idea is to count a word whenever the program reads a character ch that is not a whitespace character while the character preceding it, call it pre_ch, is a whitespace character.
The following program doesn't quite work (nw remains stuck at 0):
/* Program to count the number of words in a text stream */
#include <stdio.h>
main()
{
int ch; /* The current character */
int pre_ch = ' '; /* The previous character */
int nw = 0; /* Number of words */
printf("Enter some text.\n");
printf("Press ctrl-D when done > ");
while ((ch = getchar()) != EOF)
{
if ((ch != (' ' || '\t' || '\n')) &&
(pre_ch == (' ' || '\t' || '\n')))
{
++nw;
}
pre_ch = ch;
}
printf("\nThere are %d words in the text stream.\n", nw);
}
But, if I change the if clause to:
if ((ch != (' ' || '\t' || '\n')) &&
(pre_ch == ' '))
(remove the tab and newline options for pre_ch), the program works. I have no idea why.

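Although it reads naturally, this condition does not express your intent:
if ((ch != (' ' || '\t' || '\n')) &&
(pre_ch == (' ' || '\t' || '\n')))
The subexpression (' ' || '\t' || '\n') is a logical OR of three nonzero values, so it evaluates to 1. The condition therefore compares ch and pre_ch with 1, and pre_ch == 1 is never true for ordinary text, which is why nw stays at 0. Your modified version works because pre_ch == ' ' is a real comparison, while ch != 1 is always true.
Instead, compare against each character separately, joining the "is not" tests with && and the "is" tests with ||:
if ((ch != ' ' && ch != '\t' && ch != '\n') &&
(pre_ch == ' ' || pre_ch == '\t' || pre_ch == '\n'))
That said, you might want to have a look at isspace() in ctype.h.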
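To make the evaluation concrete, here is a minimal sketch. The names OR_OF_CHARS and starts_word are hypothetical and not part of the original program, and the helper assumes you are happy to let isspace() decide what counts as whitespace (it also accepts '\r', '\v' and '\f').

```c
#include <ctype.h>

/* ' ' || '\t' || '\n' is a logical OR of three nonzero values,
   so the whole expression is simply the int 1. */
enum { OR_OF_CHARS = (' ' || '\t' || '\n') };

/* Hypothetical helper: nonzero when ch begins a new word, i.e. ch is
   not whitespace but the character before it was. Both arguments are
   expected to be values returned by getchar() (not EOF). */
static int starts_word(int pre_ch, int ch)
{
    return !isspace(ch) && isspace(pre_ch);
}
```

With such a helper, the test inside the original loop could be written as if (starts_word(pre_ch, ch)) ++nw;.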

Related

Wordcount in C that supports single-letter input [closed]

My 'wordcount' program does not count correctly: it misses single-letter words such as 'I'.
Essentially, any character or symbol delimited by spaces, including a standalone one, should count as a word.
#include <stdio.h>
int main()
{
int wordcount;
int ch;
char lastch = -1;
wordcount = 0;
while ((ch = getc(stdin)) != EOF) {
if (ch == ' ' || ch == '\n')
{
if (!(lastch == ' ' && ch == ' '))
{
wordcount++;
}
}
lastch = ch;
}
printf("The document contains %d words.", wordcount);
}
You are over-complicating your conditional tests. If I understand your purpose, a word ends only when lastch is not a space and ch is either a space or a newline, i.e. lastch != ' ' && (ch == ' ' || ch == '\n').
Additionally, getc (like getchar) returns type int. Therefore, ch should be type int to properly detect EOF on all systems.
With those changes, you could simplify to something like:
#include <stdio.h>
int main (void) {
int wordcount = 0,
lastch = 0, /* just initialize to zero */
ch; /* ch should be an int */
while ((ch = getc (stdin)) != EOF) {
if (lastch && lastch != ' ' && (ch == ' ' || ch == '\n'))
wordcount++;
lastch = ch;
}
if (lastch != '\n') /* handle no '\n' on final line */
wordcount++;
printf ("The document contains %d %s.\n",
wordcount, wordcount != 1 ? "words" : "word");
return 0;
}
Example Use/Output
$ echo " " | ./bin/wordcnt
The document contains 0 words.
$ echo " t " | ./bin/wordcnt
The document contains 1 word.
$ echo " t t " | ./bin/wordcnt
The document contains 2 words.
Note: to protect against the corner case of a file that lacks a POSIX end-of-line (i.e. a '\n' at the end of the file), add a flag recording that at least one non-whitespace character was found, and check it together with lastch after the loop exits, e.g.
#include <stdio.h>
int main (void) {
int wordcount = 0,
lastch = 0, /* just initialize to zero */
ch, /* ch should be an int */
c_exist = 0; /* flag at least 1 char found */
while ((ch = getc (stdin)) != EOF) {
if (lastch && lastch != ' ' && (ch == ' ' || ch == '\n'))
wordcount++;
if (ch != ' ' && ch != '\n') /* make sure 1 char found */
c_exist = 1;
lastch = ch;
}
if (c_exist && lastch != '\n') /* handle no '\n' on final line */
wordcount++;
printf ("The document contains %d %s.\n",
wordcount, wordcount != 1 ? "words" : "word");
return 0;
}
Corner-case Example
$ echo -n " t" | ./bin/wordcnt
The document contains 1 word.
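An alternative that avoids the end-of-input special cases is to count a word when it starts rather than when it ends. This also copes with inputs that the versions above appear to over-count, such as an empty line between words or a trailing space with no final newline. A minimal sketch, assuming the text is already in a string and that any isspace() character separates words (count_words is a hypothetical name; with a stream, the same body can sit inside a getc loop):

```c
#include <ctype.h>

/* Hypothetical helper: counts whitespace-separated words in a
   NUL-terminated string. */
static int count_words(const char *text)
{
    int count = 0;
    int in_word = 0;            /* 1 while inside a word */

    for (; *text != '\0'; ++text) {
        if (isspace((unsigned char)*text)) {
            in_word = 0;        /* whitespace ends any current word */
        } else if (!in_word) {
            in_word = 1;        /* first character of a new word */
            ++count;
        }
    }
    return count;
}
```

Because the count is taken at the first character of each word, nothing needs to be checked after the loop.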

Parsing a text file - any reason why space / new line would be ignored?

I have this while loop...
char count[3] = {0};
int i = 0;
while( c != ' ' || c != '\n' || c != '\t' ) {
count[i] = c;
c = fgetc(fp);
i++;
}
And even though I see while debugging that space and newline are the right ASCII numbers, the while loop does not exit. Anyone know what could be causing this?
The logic in the conditional is not right. It will evaluate to true all the time.
while( c != ' ' || c != '\n' || c != '\t' )
If c is equal to ' ', it is not equal to '\n' or '\t', so at least one of the three tests is true for every possible value of c.
What you probably need is:
while( c != ' ' && c != '\n' && c != '\t' )
And for good measure, I would also add c != EOF.
while( c != ' ' && c != '\n' && c != '\t' && c != EOF )
It might be simpler to use:
while( !isspace(c) && c != EOF )
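The loop in the question also never checks i against the size of count, so a long token would write past the end of the 3-byte array. A minimal sketch of a bounded version, assuming the input is a string rather than a FILE * (copy_token is a hypothetical name):

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copies characters from src into dst until
   whitespace, the end of src, or a full buffer, whichever comes first.
   dst is always NUL-terminated when size > 0. Returns the number of
   characters copied. */
static size_t copy_token(const char *src, char *dst, size_t size)
{
    size_t i = 0;

    if (size == 0)
        return 0;
    /* every "keep going" test is joined with &&, never || */
    while (i + 1 < size && src[i] != '\0' &&
           !isspace((unsigned char)src[i])) {
        dst[i] = src[i];
        ++i;
    }
    dst[i] = '\0';
    return i;
}
```

The same shape works with fgetc: replace the src[i] != '\0' test with a comparison against EOF, keeping the character in an int.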

Trying to understand this code and the getchar function

I'm trying to understand this code and I'm very confused:
I can't understand why getchar is called in three places, or when the program reaches each of those calls.
The code seems to work well and is taken from here:
http://www.zetadev.com/svn/public/k&r/exercise.1-13.c
#include <stdio.h>
/* Exercise 1-13: Write a program to print a histogram of the lengths of words
in its input. It is easy to draw the histogram with the bars horizontal; a
vertical orientation is more challenging. */
/* At this point we haven't learned how to dynamically resize an array, and we
haven't learned how to buffer input, so that we could loop through it twice.
Therefore, I'm going to make an assumption that no word in the input will be
longer than 45 characters (per Wikipedia). */
#define MAX_WORD_LENGTH 45 /* maximum word length we will support */
main()
{
int i, j; /* counters */
int c; /* current character in input */
int length; /* length of the current word */
int lengths[MAX_WORD_LENGTH]; /* one for each possible histogram bar */
int overlong_words; /* number of words that were too long */
for (i = 0; i < MAX_WORD_LENGTH; ++i)
lengths[i] = 0;
overlong_words = 0;
while((c = getchar()) != EOF)
if (c == ' ' || c == '\t' || c == '\n')
while ((c = getchar()) && c == ' ' || c == '\t' || c == '\n')
;
else {
length = 1;
while ((c = getchar()) && c != ' ' && c != '\t' && c != '\n')
++length;
if (length < MAX_WORD_LENGTH)
++lengths[length];
else
++overlong_words;
}
printf("Histogram by Word Lengths\n");
printf("=========================\n");
for (i = 0; i < MAX_WORD_LENGTH; ++i) {
if (lengths[i] != 0) {
printf("%2d ", i);
for (j = 0; j < lengths[i]; ++j)
putchar('#');
putchar('\n');
}
}
}
This getchar() is used to make sure that when EOF is reached, the program
gets out of the while loop.
while((c = getchar()) != EOF)
This getchar is also in a while loop. It makes sure that the whitespace characters ' ', '\t', and '\n' are skipped.
while ((c = getchar()) && c == ' ' || c == '\t' || c == '\n')
;
To make the program more robust, the above line should really be:
while ((c = getchar()) != EOF && isspace(c))
This getchar is also in a while loop. It thinks that any character that is not a ' ', '\t', or '\n' is a word character and increments length for each such character.
while ((c = getchar()) && c != ' ' && c != '\t' && c != '\n')
++length;
Once again, to make the program more robust, the above line should really be:
while ((c = getchar()) != EOF && !isspace(c))
The function getchar() reads and returns a character from the standard input.
The usage is variable = getchar(); the variable should be of type int, because getchar() returns an int so that EOF can be distinguished from every valid character.
Here is the explanation for each getchar() call.
while((c = getchar()) != EOF)
This checks whether the input has reached EOF (signalled with Ctrl+D on Unix and Ctrl+Z on Windows).
if (c == ' ' || c == '\t' || c == '\n')
while ((c = getchar()) && c == ' ' || c == '\t' || c == '\n');
This checks whether you entered a space, tab, or newline character; if so, those characters are skipped and control returns to while((c = getchar()) != EOF)
length = 1;
while ((c = getchar()) && c != ' ' && c != '\t' && c != '\n')
++length;
Any character other than a space, tab, or newline results in executing ++length.
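One thing worth noticing about the three-getchar layout: each inner loop reads one character past what it is looking for, and the outer loop then reads again, so as far as I can tell the first letter of a word that follows a run of two or more whitespace characters is dropped, and the word loop never tests for EOF. A minimal sketch that reads each character exactly once, assuming the text is in a string (tally_word_lengths is a hypothetical name; with a stream the string walk becomes a single getchar loop):

```c
#include <ctype.h>

#define MAX_WORD_LENGTH 45

/* Hypothetical helper: tallies the word lengths found in text.
   lengths[n] is incremented for each word of length n < MAX_WORD_LENGTH;
   the return value is the number of longer words. */
static int tally_word_lengths(const char *text, int lengths[MAX_WORD_LENGTH])
{
    int overlong = 0;
    int length = 0;             /* length of the word in progress, 0 if none */

    for (;; ++text) {
        int at_end = (*text == '\0');

        if (at_end || isspace((unsigned char)*text)) {
            if (length > 0) {   /* a word has just ended */
                if (length < MAX_WORD_LENGTH)
                    ++lengths[length];
                else
                    ++overlong;
                length = 0;
            }
            if (at_end)
                break;
        } else {
            ++length;           /* one more character in the current word */
        }
    }
    return overlong;
}
```

The caller is expected to zero the lengths array first, as the original program does.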

copy input to output, for string with one or more blank, output one blank

here is the problem
for example
in = "a b\nab c\ndd";
out = "a b\nb c\ndd"
Here is my C code
while(c=getchar()!=EOF){
if(c==' '){
while( (c1=getchar()) == ' '); // ignore all other contiguous blank
putchar(c); // output one blank
putchar(c1); // output the next non-blank character
}
else putchar(c);
}
Can this be implemented more compactly?
Assuming you remove only ' ' :
int c;
char space_found = 0;
while ( ( c = getchar() ) != EOF) {
if ( (!space_found) || (c != ' ') ) { // if the previous is not a space, or this is not a space
putchar(c);
}
space_found = (c == ' '); // (un)set the flag
}
You can change it to check for any white space with a simple macro:
#define is_white_space(X) ( ( (X) == ' ' ) || ( (X) == '\t' ) || ( (X) == '\n' ) )
and replace the c == ' ' with it
If you don't mind an artificial limit on the size of a "word", it's pretty easy to shorten it quite a bit:
// pick your limit here:
char word[256];
// and be sure the width here is one less than the array size:
while (scanf("%255s", word) == 1) // stop at EOF instead of looping forever
printf(" %s", word);
1. Attempt to read a character.
2. If the input buffer is not empty, output the character previously read. Otherwise skip to step 6.
3. If the character previously read is a space, keep getting characters until you receive a non-space character.
4. If the input buffer is not empty, output the recently read character.
5. Go to step 1.
6. End.
Sample implementation:
while ((c = getchar ()) != EOF)
{
putchar (c);
if (c == ' ')
{
while ((c = getchar ()) == ' ')
{}
if (c != EOF)
{
putchar (c);
}
}
}
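For comparison, the flag-based approach can also be written as a function over a string, which makes it easy to try on fixed inputs. This is a sketch under that assumption; squeeze_blanks is a hypothetical name, and only ' ' is treated as a blank.

```c
/* Hypothetical helper: copies in to out, replacing each run of blanks
   with a single blank. out must have room for strlen(in) + 1 bytes. */
static void squeeze_blanks(const char *in, char *out)
{
    int prev_blank = 0;         /* 1 if the previous character was ' ' */

    for (; *in != '\0'; ++in) {
        /* copy everything except a blank that follows a blank */
        if (*in != ' ' || !prev_blank)
            *out++ = *in;
        prev_blank = (*in == ' ');
    }
    *out = '\0';
}
```

Newlines and tabs pass through unchanged, so line structure is preserved.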

Problem with getchar in C

I want to write a program which can:
when I enter, say "Alan Turing", it outputs "Turing, A".
But for my following program, it outputs "uring, A", I thought for long but failed to figure out where T goes.
Here is the code:
#include <stdio.h>
int main(void)
{
char initial, ch;
//This program allows extra spaces before the first name and between first name and second name, and after the second name.
printf("enter name: ");
while((initial = getchar()) == ' ')
;
while((ch = getchar()) != ' ') //skip first name
;
while ((ch = getchar()) == ' ')
{
if (ch != ' ')
printf("%c", ch); //print the first letter of the last name
}
while((ch = getchar()) != ' ' && ch != '\n')
{
printf("%c", ch);
}
printf(", %c.\n", initial);
return 0;
}
Your bug is here:
while ((ch = getchar()) == ' ')
{
if (ch != ' ')
printf("%c", ch); //print the first letter of the last name
}
while((ch = getchar()) != ' ' && ch != '\n')
{
printf("%c", ch);
}
The first loop reads characters until it finds a non-space. That's your 'T'. Then the second loop overwrites it with the next character, 'u', and prints it.
If you switch the second loop to a do {} while(); it should work.
while ((ch = getchar()) == ' ')
{
if (ch != ' ')
printf("%c", ch); //print the first letter of the last name
}
This part is wrong. The if in there will never match, because that block is only run if ch == ' '.
while ((ch = getchar()) == ' ');
printf("%c", ch); //print the first letter of the last name
should fix it.
Note that getchar returns an int, not a char. If you want to check for end of file at some point, this will bite you if you save getchar's return value in a char.
Reading a string from standard input one character at a time with getchar() is not very efficient. You could use read() or scanf() to read the input into a buffer and then work on your string.
It will be much easier.
Anyway, I added a comment where your bug is.
while((ch = getchar()) != ' ') //skip first name
;
// Your bug is here : you don't use the character which got you out of your first loop.
while ((ch = getchar()) == ' ')
{
if (ch != ' ')
printf("%c", ch); //print the first letter of the last name
}
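The common thread in these answers is that the character which ends the "skip spaces" loop must be kept, since it is the first letter of the last name. A minimal sketch of the whole transformation on a string, where that character is never discarded because the pointer simply stops on it (format_name and its return convention are hypothetical):

```c
#include <stddef.h>

/* Hypothetical helper: turns "First Last" into "Last, F." in out.
   Leading, separating and trailing spaces are tolerated.
   Returns 0 on success, -1 if a name is missing or out is too small. */
static int format_name(const char *in, char *out, size_t size)
{
    size_t n = 0;
    char initial;

    while (*in == ' ')                          /* skip leading spaces */
        ++in;
    if (*in == '\0' || *in == '\n')
        return -1;                              /* no first name */
    initial = *in;                              /* first letter of first name */

    while (*in != '\0' && *in != ' ' && *in != '\n')
        ++in;                                   /* skip the first name */
    while (*in == ' ')                          /* skip separating spaces */
        ++in;
    if (*in == '\0' || *in == '\n')
        return -1;                              /* no last name */

    /* in now points at the first letter of the last name; copy from there */
    while (*in != '\0' && *in != ' ' && *in != '\n') {
        if (n + 6 > size)                       /* keep room for ", X." and NUL */
            return -1;
        out[n++] = *in++;
    }
    out[n++] = ',';
    out[n++] = ' ';
    out[n++] = initial;
    out[n++] = '.';
    out[n] = '\0';
    return 0;
}
```

The trailing period follows the printf format in the question's program rather than the example output given in its text.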
