Counting the number of characters in input with getchar() method - c

I am studying C from The C Programming Language. I have tried to make the following code work:
#include <stdio.h>
int main() {
int char_count = 0;
while (getchar() != EOF)
++char_count;
printf("Number of chars are %ld ", char_count);
return 0;
}
I build and run the code. Then, I enter a random word and press enter. I expect to see a number in the next line, but what happens is that cursor moves to the next line. When I enter a new word same thing happens. What am I missing?
Edit: I expect getchar() to return EOF when the program finishes getting the characters of the first word.

When you are pressing enter/return you are generating \n or \r\n based on the whether you are using unix/osx or windows. You are not generation EOF character.
To generate EOF character you need to press ^D on unix/osx or ^Z on windows.

Inputting the EOF character into the command prompt is always tricky. Doing what #Sarvex recommended will do the trick, by using Ctrl+Z on Windows or Ctrl+D on Unix.
One other interesting thing that you can do without making any changes to your code is instead of having your program accept input from the command prompt, you can redirect input to your program from a text file.
If you had a text file named, "data.txt" with the following text in it:
Hello World. How are you doing?
And the name of your program, after compiling the code was charCount. You could do the following:
The output I received was, Number of chars are 31.

If you change the EOF to \n character, it will do the work after pressing the enter key:
int c = 0;
while (((c = getchar()) != '\n') && (c != EOF))
++char_count;
P.S: Improved Using #David Bowling suggestion.

Related

C project help is needed [duplicate]

This question already has answers here:
What is EOF in the C programming language?
(10 answers)
Closed 5 years ago.
I am new to coding and I am learning through a book called "The C Programming Language - 2nd Edition - Ritchie Kernighan" and there is this code:
#include<stdio.h>
#include<stdlib.h>
int main(){
int c,nl;
nl =0;
while((c=getchar())!=EOF)
if(c == '\n')
++nl;
printf("%d\n",nl);
return 0;
}
After typing the code in CodeBlocks I run it and when I type in a word and press enter nothing happens. The word is not being counted and printed. I am new to all of this but if anyone has an idea feel free to share it. Thank you very much !
The issue is that you never read the EOF (End Of File); this is the end of the input data coming from the console (where you type).
Everything you type is either a letter, digit, special character, or newline, but never EOF.
To generate an EOF you need to enter a special control-key combination. On Windows this is Ctrl+Z and on UNIX/Linux/macOS this is Ctrl+D.
The book you're reading is great and written by the two creators of C. It one of my first programming books and I still have it; all worn-out.
Small piece of advice: Always put your code blocks inside { } to avoid mistakes and create more visual clarity, use spaces consistently, and add empty lines. Your code would look like this:
#include<stdio.h>
#include<stdlib.h>
int main()
{
int c, nl;
nl = 0;
while ((c = getchar()) != EOF)
{
if (c == '\n')
{
++nl;
}
}
printf("%d\n", nl);
return 0;
}
Why should it stop? Your expectations are wrong. getchar() will go on getting characters until it encounters EOF.
How to pass that to getchar?
For Windows Ctrl+Z will do the trick. And then press Enter.
For Unix or Linux system it would be Ctrl+D
Responsive output before entering EOF
To get a more responsive output you can add this line, that will tell you the cumulative sum of \n found.
if(c == '\n'){
++nl;
printf("Till now %d newline found",nl);
fflush(stdout);
}
Further Explanation of the added code
The code segment provided above will provide you some output when you press enter. But the thing is until you enter EOF it will keep on waiting for more and more input. That's what also happened in the first case. So you have to press Ctrl+Z and press Enter. That will break the loop. And you will see the final output - the number of line count.

difference between ((c=getchar())!=EOF) and ((c=getchar())!='~'),'~' can be any character

As soon as '~' is encountered, anything after that is not printed and control comes out
while loop
while((c = getchar()) != '~')
{
putchar(c);
printf(" ");
}
input: asdf~jkl
output: a s d f //control is out of while loop
as soon as '^Z' is encountered, anything after that is not printed but control doesn't come out of while loop
while((c = getchar()) != EOF)
{
putchar(c);
printf(" ");
}
input: asdf^Zjkl
output a s d f -> //control is still inside while loop
Please explain why is this happening?
As soon as EOF is encountered, while loop must exit, but this is not happening.
Is is necessary that (ctrl+Z) must be the first character on the new line to terminate while loop?
This has something to do with the working of getchar() and EOF (ctr+Z)
It's the way the console input editor works in a Windows/DOS command prompt. Input is done line by line, which is why you can go back and forth editing the characters until you press ENTER, and at that point the contents of the line are sent to the program and a new line is started.
Whoever wrote the editor in DOS decided that typing ^Z at the beginning of the line was the way to tell the editor that you're done providing input.
The thing is that EOF (end of file) is a virtual mark and not always a real character. Its actual value (-1) suggests that by being way outside the range of character codes (which is also why it's important to use an int variable instead of a char when calling getchar() and fgetc()).
In fact, your example...
while((c = getchar()) != EOF)
{
putchar(c);
printf(" ");
}
...can work as you expect it if you run the program using input redirection ("program.exe < input.txt") to feed it a file with ^Z in the middle. In that case, there is no command line editor in the way.

Theory Behind getchar() and putchar() Functions

I'm working through "The C Programming Language" by K&R and example 1.5 has stumped me:
#include <stdio.h>
/* copy input to output; 1st version */
int main(int argc, char *argv[])
{
int c;
while ((c = getchar()) != EOF)
putchar(c);
return 0;
}
I understand that 'getchar()' takes a character for 'putchar()' to display. However, when I run the program in terminal, why is it that I can pass an entire line of characters for 'putchar()' to display?
Because your terminal is line-buffered. getchar() and putchar() still only work on single characters but the terminal waits with submitting the characters to the program until you've entered a whole line. Then getchar() gets the character from that buffer one-by-one and putchar() displays them one-by-one.
Addition: that the terminal is line-buffered means that it submits input to the program when a newline character is encountered. It is usually more efficient to submit blocks of data instead of one character at a time. It also offers the user a chance to edit the line before pressing enter.
Note: Line buffering can be turned off by disabling canonical mode for the terminal and calling setbuf with NULL on stdin.
Yeah you can actually write whatever you want as long as it's not an EOF char, the keyboard is a special I/O device, it works directly through the BIOS and the characters typed on the keyboard are directly inserted in a buffer this buffer is, in your case read by the primitive getchar(), when typing a sentence you are pushing data to the buffer, and the getchar() function is in an infinite loop, this is why this works.
You can ask me more questions if you want more details about how the IO device work.
Cheers.
Consider the following program,
This program gets the input (getchar runs) till we enter '$' character and prints the output.
#include <stdio.h>
int main()
{
int c;
while ((c = getchar()) != '$')
putchar(c);
return 0;
}
If we enter input as abcd$$$$$$[Enter], it stores the input in buffer till the last $ symbol. After we pressed enter, the program (while loop) starts to run and getchar function receives one character at a time and prints it, from 'a' till first '$' .
And the program won't end till we press '$' in the input.

How do I enter data in a program to count the entered characters?

I wrote two programs in C to count characters and print the value.
One uses the while loop and the other uses for. No errors are reported while compiling, but neither print anything.
Here's the code using while:
#include <stdio.h>
/* count characters and input using while */
main()
{
long nc;
nc = 0;
while (getchar() != EOF)
++nc;
printf("%ld\n", nc);
}
And here's the code using for:
#include <stdio.h>
/* count characters and input using for */
main()
{
long nc;
for (nc = 0; getchar() != EOF; ++nc)
;
printf("%ld\n", nc);
}
Both compile and run.
When I type input and hit enter, a blank newline prints. When I hit enter without inputting anything, again a blank newline prints. I think it should at least print zero.
You need to terminate the input to your program (i.e. trigger the EOF test).
You can do this on most unix terminals with Ctrl-D or Ctrl-Z at the start of a new line on most windows command windows.
Alternately you can redirect stdin from a file, e.g.:
myprogram < test.txt
Also, you should give main a return type; implicit int is deprecated.
int main(void)
Your programs will only output after seeing an end of file (EOF). Generate this on UNIX with CTRL-D or on Windows with CTRL-Z.
You wait for an EOF character (end of file) here. This will only happen in two scenarios:
The user presses Ctrl+Break (seems to work on Windows here, but I wouldn't count on this).
The user enters a EOF character (can be done with Ctrl+Z for example).
A better way to do this would be to check for a newline instead.
for your program to work correctly. after you press the enter key, u have to ensure that uyou terminate your loop, so that the program flos correctly. this is done by typing the ctrl+Z from your keyboard. these are the keys that correspond to the EOF.
Pressing Enter on your keyboard, adds the character \n to your input. In order to and the program and have it print out the number of characters you would need to add an 'End of File' (EOF) character.
In order to inject an EOF character, you should press CTRL-D in Unix or CTRL-Z on Windows.
You do realize that newline is a normal character, and not an EOF indicator? EOF would be Ctrl-D or Ctrl-Z on most popular OSes.

Where does `getchar()` store the user input?

I've started reading "The C Programming Language" (K&R) and I have a doubt about the getchar() function.
For example this code:
#include <stdio.h>
main()
{
int c;
c = getchar();
putchar(c);
printf("\n");
}
Typing toomanychars + CTRL+D (EOF) prints just t. I think that's expected since it's the first character introduced.
But then this other piece of code:
#include <stdio.h>
main()
{
int c;
while((c = getchar()) != EOF)
putchar(c);
}
Typing toomanychars + CTRL+D (EOF) prints toomanychars.
My question is, why does this happens if I only have a single char variable? where are the rest of the characters stored?
EDIT:
Thanks to everyone for the answers, I start to get it now... only one catch:
The first program exits when given CTRL+D while the second prints the whole string and then waits for more user input. Why does it waits for another string and does not exit like the first?
getchar gets a single character from the standard input, which in this case is the keyboard buffer.
In the second example, the getchar function is in a while loop which continues until it encounters a EOF, so it will keep looping and retrieve a character (and print the character to screen) until the input becomes empty.
Successive calls to getchar will get successive characters which are coming from the input.
Oh, and don't feel bad for asking this question -- I was puzzled when I first encountered this issue as well.
It's treating the input stream like a file. It is as if you opened a file containing the text "toomanychars" and read or outputted it one character at a time.
In the first example, in the absence of a while loop, it's like you opened a file and read the first character, and then outputted it. However the second example will continue to read characters until it gets an end of file signal (ctrl+D in your case) just like if it were reading from a file on disk.
In reply to your updated question, what operating system are you using? I ran it on my Windows XP laptop and it worked fine. If I hit enter, it would print out what I had so far, make a new line, and then continue. (The getchar() function doesn't return until you press enter, which is when there is nothing in the input buffer when it's called). When I press CTRL+Z (EOF in Windows), the program terminates. Note that in Windows, the EOF must be on a line of its own to count as an EOF in the command prompt. I don't know if this behavior is mimicked in Linux, or whatever system you may be running.
Something here is buffered. e.g. the stdout FILE* which putchar writes to might be line.buffered. When the program ends(or encounters a newline) such a FILE* will be fflush()'ed and you'll see the output.
In some cases the actual terminal you're viewing might buffer the output until a newline, or until the terminal itself is instructed to flush it's buffer, which might be the case when the current foreground program exits sincei it wants to present a new prompt.
Now, what's likely to be the actual case here, is that's it's the input that is buffered(in addition to the output :-) ) When you press the keys it'll appear on your terminal window. However the terminal won't send those characters to your application, it will buffer it until you instruct it to be the end-of-input with Ctrl+D, and possibly a newline as well.
Here's another version to play around and ponder about:
int main() {
int c;
while((c = getchar()) != EOF) {
if(c != '\n')
putchar(c);
}
return 0;
}
Try feeding your program a sentence, and hit Enter. And do the same if you comment out
if(c != '\n') Maybe you can determine if your input, output or both are buffered in some way.
THis becomes more interesting if you run the above like:
./mytest | ./mytest
(As sidecomment, note that CTRD+D isn't a character, nor is it EOF. But on some systems it'll result closing the input stream which again will raise EOF to anyone attempting to read from the stream.)
Your first program only reads one character, prints it out, and exits. Your second program has a loop. It keeps reading characters one at a time and printing them out until it reads an EOF character. Only one character is stored at any given time.
You're only using the variable c to contain each character one at a time.
Once you've displayed the first char (t) using putchar(c), you forget about the value of c by assigning the next character (o) to the variable c, replacing the previous value (t).
the code is functionally equivalent to
main(){
int c;
c = getchar();
while(c != EOF) {
putchar(c);
c = getchar();
}
}
you might find this version easier to understand. the only reason to put the assignment in the conditional is to avoid having to type 'c=getchar()' twice.
For your updated question, in the first example, only one character is read. It never reaches the EOF. The program terminates because there is nothing for it to do after completing the printf instruction. It just reads one character. Prints it. Puts in a newline. And then terminates as it has nothing more to do. It doesn't read more than one character.
Whereas, in the second code, the getchar and putchar are present inside a while loop. In this, the program keeps on reading characters one by one (as it is made to do so by the loop) until reaches reaches the EOF character (^D). At that point, it matches c!=EOF and since the conditions is not satisfied, it comes out of the loop. Now there are no more statements to execute. So the program terminates at this point.
Hope this helps.

Resources