How is the "getchar()" function able to take multiple characters as input?

Here is a basic character counting program in C:
#include <stdio.h>
#include <stdlib.h>
int main(){
    long nc = 0;
    while(getchar() != EOF)
        nc++;
    printf("%ld\n" , nc);
}
When I enter "abcde" as input, it displays a value of 6 (after triggering the EOF test), with +1 for the newline character. But my doubt is that getchar(), as its name itself suggests, takes only 1 character into account. Yet when I enter "abcde" in one go, it still works. Why is this so? What am I misunderstanding here?

Standard input, by default, is line-buffered with an interactive device. This means that your program won't see any input at all until a complete line is ready, in your case when you hit Enter. One good reason for this is that if the user types her 8 character password, then hits backspace 8 times, and then types her username and hits Enter, then your program only gets her username, and never sees the correction, which is usually what you want when your shell gives the opportunity to edit your input before you send it anywhere.
So what happens is essentially this:
You call getchar(). No input is available, so it waits.
You press a. It's not the end of a line, so no input is sent to your program, getchar() has nothing to read, so it still waits.
You press b. It's not the end of a line, so no input is sent to your program, getchar() has nothing to read, so it still waits.
You press c. It's not the end of a line, so no input is sent to your program, getchar() has nothing to read, so it still waits.
You press d. It's not the end of a line, so no input is sent to your program, getchar() has nothing to read, so it still waits.
You press e. It's not the end of a line, so no input is sent to your program, getchar() has nothing to read, so it still waits.
You press Enter. Now it is the end of a line, so the input "abcde\n" is sent to your program.
getchar() now has input to read, so it returns 'a', increments nc, and loops back to wait for input.
Immediately, getchar() has more input to read from the rest of the characters in that line, so it returns 'b', increments nc, and loops back to wait for input.
Immediately, getchar() has more input to read from the rest of the characters in that line, so it returns 'c', increments nc, and loops back to wait for input.
Immediately, getchar() has more input to read from the rest of the characters in that line, so it returns 'd', increments nc, and loops back to wait for input.
Immediately, getchar() has more input to read from the rest of the characters in that line, so it returns 'e', increments nc, and loops back to wait for input.
Immediately, getchar() has more input to read from the rest of the characters in that line, so it returns '\n', increments nc, and loops back to wait for input.
If you signified end-of-input, perhaps by pressing Control-D, then getchar() has nothing to read and knows there will never be anything to read, so it returns EOF and your loop ends. If it were not end-of-input, then getchar() would just again wait here for you to enter a new line of input.
So what actually happened here is that getchar() did nothing until you hit Enter. Then, probably before you even took your finger off the Enter key, it ran six times and consumed the six characters of input that you typed. But despite getchar() running six times, you were only prompted to enter something once (twice, if you include having to type Control-D), because getchar() will only wait for your input when it doesn't already have input available and waiting.
Back in the days where standalone terminals were common, the actual terminal device might not even transmit any characters to the computer until the end of a line, and could have a small amount of on-board memory to allow for this kind of local line-based editing, so the computer itself might literally never see it until end-of-line. On the kind of modern PC many people use, the operating system, down at the terminal driver level, will more probably be buffering these characters itself, and just presenting them and making them available to your program one line at a time (unless you specifically tell it that you want characters immediately, of course).

When you type in abcde\n (\n generated from Enter), it gets flushed into the standard input stream (stdin).
getchar(), as its name itself suggests, takes only 1 character into account
Yes, that's right. But notice that getchar is used in a loop which loops until getchar returns EOF. Also, recall that getchar reads input from the stdin.
So, in the first iteration, after you type in the data, getchar reads the first character a. In the second iteration, it doesn't wait for input since the stdin still contains bcde\n and hence, reads the next character b and so on.
Finally, the loop breaks when you trigger EOF. Then, the printf gets executed (printing 6 since six characters were read) and the program ends.

getchar reads the next character from standard input's buffer and returns it. Since you fed six characters, "abcde\n", into standard input and you are calling getchar in a while loop, the loop body ran six times, reading the characters one by one. You can test this by:
int c;
while ((c = getchar()) != EOF) {
    printf("got %c\n", c);
    nc++;
}

Related

Clarification regarding functioning of getchar()/putchar()

In this code:
#include<stdio.h>
int main()
{
    int i,p=0;
    while(i!=EOF)
    {
        i=getchar();
        putchar(i);
        printf("\n");
    }
    return 0;
}
When I enter hello as input in one go, the output is h, then e on the next line, and so on. But after h is printed and before e is printed, why doesn't getchar() pause to take input from me as it did the first time?
getchar() returns either any successfully read character from stdin or some error, so which function is demanding terminal input and then sending it to stdin?
Input from a terminal is generally buffered. This means it is held in memory waiting for your program to read it.
This buffering is performed by multiple pieces of software. The software that is actually reading your input in the terminal window generally accumulates the characters you type until you press Enter or certain other keys or combinations that end the current input. Then the line that has been read is made available to your program.
Inside your program, the C standard library, of which getchar is a part, reads the data that has been sent to it and holds it in a buffer of its own. The getchar routine reads the next character from this buffer. (If the buffer is empty when getchar wants another character, getchar will block, waiting for new data to arrive from the terminal software.)
It's because of the loop condition. You are continuing to loop until EOF is received. When you type "hello", it works exactly as you expect except STDIN has more characters in the buffer and none of them are EOF. The program prints out "h", then a newline, and goes back to check the loop condition. EOF has not been found, so then it gets the next character from STDIN (which you have already provided) and the cycle repeats.
If you remove the loop it will only print one character.

C programming — loop

I am following the exercises in the C language book. I am in the first chapter where he introduces loops. In this code:
#include <stdio.h>
/* copy input to output; 1st version */
int main() {
    int c, n1;
    n1 = 0;
    while ((c = getchar()) != EOF) {
        if (c == '\n') {
            ++n1;
        }
        printf("%d\n", n1);
    }
}
In here I am counting the number of lines. When I just hit enter without entering anything else I get the right number of lines but when I enter a character then hit enter key the loop runs twice without asking for an input the second time. I get two outputs.
This is how the output looks:
// I only hit enter
1
// I only hit enter
2
// I only hit enter
3
g // I put char 'g' then hit enter
3
4
3 and 4 print at the same time. Why is 4 printed after the loop has already iterated? I thought the loop would restart and ask me for input before printing 4.
The getchar function reads one character at a time. The number of lines will be printed for every character in the input read by getchar, whether that character is newline or not, but the counter will only be incremented when there is a newline character in the input.
When you enter g then the actual input that goes to the standard input is g\n, and getchar will read this input in two iterations and that's the reason it is printing number of lines twice.
If you put the print statement inside the if block then it will print only for newline characters. If you put the print statement outside the loop, then it will only print the count of the number of lines at the end of the input.
To be clear, it is the terminal that you are dealing with here.
By default, the terminal will not pass input from the user to the program until \n is entered. Then the whole line is placed in stdin.
The logic of your program is not affected by this buffering: once the line arrives, the characters are read one at a time and processed just as you would expect. The only hitch is the terminal's buffering, which is line buffering.
And here is how the standard describes the behavior of getchar:
The getchar function returns the next character from the input stream pointed to by stdin. If the stream is at end-of-file, the end-of-file indicator for the stream is set and getchar returns EOF. If a read error occurs, the error indicator for the stream is set and getchar returns EOF.
Now what are those characters? They include \n, which is what you put into the terminal, and from there into stdin, by pressing Enter. Earlier you were entering a single character each time, which was \n. This time you entered two characters, g and \n. That's why you saw two lines of output.

how putchar works with while loop?

I am new to c programming, so hope you guys can help me out with such questions.
1. I thought putchar() only prints 1 char each time, yet when I enter several chars like 'hello' it prints 'hello' before allowing me to enter the next input. I thought that it should print only 'h' and then allow me to enter other input, because getchar() only returns one character each time.
2. How do I make the loop stop? I know EOF has a value of -1, but when I enter -1, the loop still runs.
#include <stdio.h>
main()
{
    int c = getchar();
    while(c != EOF){
        putchar(c);
        c = getchar();
    }
}
After the first getchar() has completed reading one character, the next getchar() is inside the while() loop, so as per the logic, it will keep reading the input one character at a time until it encounters EOF.
Following the same logic, putchar(c); is under the while loop, so it will print all the characters [one character per loop basis] read by getchar() and stored in c.
On Linux, EOF is produced by pressing CTRL+D. When the program is waiting for input and you press this key combination, the terminal driver reports end-of-input, getchar() returns EOF, and the while loop ends.
I'm not very sure about Windows, but the key combination should be CTRL+Z.
Note: even if it seems that entering -1 should work as EOF, it actually won't. getchar() cannot read -1 all at once. It will be read as - and 1, in two consecutive iterations. Also worth mentioning: a character 1 is not equal to an integer 1. A character 1, once read, is represented by its character code [usually ASCII], and that code is the value that gets stored.
getchar() gets the input from the console. In a while loop, it will read all the characters from the input including the return key.
-1 is "-1". It's not a value but just another combination of characters. EOF occurs when the input stream ends, i.e. when you signal end-of-input with Ctrl-Z or Ctrl-D, depending on your OS. Pressing Enter alone just sends a newline character.

Input for examples from Kernighan and Ritchie

In section 1.5.2 of the 2nd ed. K&R introduce getchar() and putchar() and give an example of character counting, then line counting, and others throughout the chapter.
Here is the character counting program
#include <stdio.h>
main() {
    long nc;
    nc = 0;
    while (getchar() != EOF)
        ++nc;
    printf("%ld\n",nc);
}
Where should the input come from? Typing into the terminal command window and hitting Enter worked for the file copying program but not for this one. I am using Xcode for Mac.
It seems like the easiest way would be to read a text file with pathway "pathway/folder/read.txt" but I am having trouble with that as well.
From the interactive command line, press ctrl-D after a newline, or ctrl-D twice not after newline, to terminate the input. Then the program will see EOF and show you the results.
To pass a file by path, and avoid the interactive part, use the < redirection operator of the shell, ./count_characters < path/to/file.txt.
Standard C input functions only start processing what you type when you press the Enter key. In other words, every key you press adds a character to the system buffer (the terminal's line buffer). Then, when the line is complete (i.e., you press Enter), these characters are moved to the C standard buffer. getchar() reads the first character in the buffer, which also removes it from the buffer. Each successive call to getchar() reads and removes the next char, and so on. If you don't read every character that you typed into the keyboard buffer, but instead enter another line of text, then the next call to getchar() after that will continue reading the characters left over from the previous line; you will usually witness this as the program blowing past your second input. BTW, the newline from the Enter key is also a character and is also stored in the keyboard buffer, so if you have new input to read in, you first need to clear out the keyboard buffer.

Where does `getchar()` store the user input?

I've started reading "The C Programming Language" (K&R) and I have a doubt about the getchar() function.
For example this code:
#include <stdio.h>
main()
{
    int c;
    c = getchar();
    putchar(c);
    printf("\n");
}
Typing toomanychars + CTRL+D (EOF) prints just t. I think that's expected since it's the first character introduced.
But then this other piece of code:
#include <stdio.h>
main()
{
    int c;
    while((c = getchar()) != EOF)
        putchar(c);
}
Typing toomanychars + CTRL+D (EOF) prints toomanychars.
My question is, why does this happen if I only have a single char variable? Where are the rest of the characters stored?
EDIT:
Thanks to everyone for the answers, I start to get it now... only one catch:
The first program exits when given CTRL+D while the second prints the whole string and then waits for more user input. Why does it wait for another string and not exit like the first?
getchar gets a single character from the standard input, which in this case is the keyboard buffer.
In the second example, the getchar function is in a while loop which continues until it encounters an EOF, so it will keep looping and retrieving a character (and printing the character to the screen) until the input is exhausted.
Successive calls to getchar will get successive characters which are coming from the input.
Oh, and don't feel bad for asking this question -- I was puzzled when I first encountered this issue as well.
It's treating the input stream like a file. It is as if you opened a file containing the text "toomanychars" and read or outputted it one character at a time.
In the first example, in the absence of a while loop, it's like you opened a file and read the first character, and then outputted it. However the second example will continue to read characters until it gets an end of file signal (ctrl+D in your case) just like if it were reading from a file on disk.
In reply to your updated question, what operating system are you using? I ran it on my Windows XP laptop and it worked fine. If I hit Enter, it would print out what I had so far, make a new line, and then continue. (If there is nothing in the input buffer when getchar() is called, it doesn't return until you press Enter.) When I press CTRL+Z (EOF in Windows), the program terminates. Note that in Windows, the EOF must be on a line of its own to count as an EOF in the command prompt. I don't know if this behavior is mimicked in Linux, or whatever system you may be running.
Something here is buffered, e.g. the stdout FILE* which putchar writes to might be line-buffered. When the program ends (or encounters a newline), such a FILE* will be fflush()'ed and you'll see the output.
In some cases the actual terminal you're viewing might buffer the output until a newline, or until the terminal itself is instructed to flush its buffer, which might be the case when the current foreground program exits, since it wants to present a new prompt.
Now, what's likely to be the actual case here is that it's the input that is buffered (in addition to the output :-) ). When you press the keys, they'll appear in your terminal window. However, the terminal won't send those characters to your application; it will buffer them until you signal end-of-input with Ctrl+D, and possibly on a newline as well.
Here's another version to play around and ponder about:
#include <stdio.h>
int main() {
    int c;
    while((c = getchar()) != EOF) {
        if(c != '\n')
            putchar(c);
    }
    return 0;
}
Try feeding your program a sentence and hit Enter. Then do the same after commenting out the if(c != '\n') line. Maybe you can determine if your input, output or both are buffered in some way.
This becomes more interesting if you run the above like:
./mytest | ./mytest
(As a side comment, note that CTRL+D isn't a character, nor is it EOF. But on some systems it will result in closing the input stream, which in turn will raise EOF to anyone attempting to read from the stream.)
Your first program only reads one character, prints it out, and exits. Your second program has a loop. It keeps reading characters one at a time and printing them out until getchar() returns EOF. Only one character is stored at any given time.
You're only using the variable c to contain each character one at a time.
Once you've displayed the first char (t) using putchar(c), you forget about the value of c by assigning the next character (o) to the variable c, replacing the previous value (t).
The code is functionally equivalent to
main(){
    int c;
    c = getchar();
    while(c != EOF) {
        putchar(c);
        c = getchar();
    }
}
You might find this version easier to understand. The only reason to put the assignment in the conditional is to avoid having to type 'c=getchar()' twice.
For your updated question, in the first example, only one character is read. It never reaches the EOF. The program terminates because there is nothing for it to do after completing the printf instruction. It just reads one character. Prints it. Puts in a newline. And then terminates as it has nothing more to do. It doesn't read more than one character.
Whereas in the second program, getchar and putchar are inside a while loop. There, the program keeps reading characters one by one (as the loop makes it do) until getchar returns EOF (triggered by ^D). At that point the condition c != EOF is no longer satisfied, so control leaves the loop. There are no more statements to execute, so the program terminates at this point.
Hope this helps.
