getchar buffered input, EOF and terminal driver - c

I'm trying to understand how the terminal driver works in conjunction with getchar. Here are a few sample programs I wrote while reading K&R:
Code 1:
#include <stdio.h>
int main(){
    int c = getchar();
    putchar(c);
    return 0;
}
Code 2:
#include <stdio.h>
int main(){
    int c = EOF;
    while((c=getchar()) != EOF){
        printf("%c",c);
    }
    return 0;
}
Code 3:
//barebones program that emulates functionality of wc command
#include <stdio.h>
#define IN 1
#define OUT 0
int main(){
    //nc = number of characters, nw = number of words, ns = number of spaces, nl = number of newlines
    int c = EOF,nc=0,nw=0,ns=0,nl=0, state = OUT;
    while((c=getchar())!=EOF){
        ++nc;
        if(c=='\n'){
            ++nl;
            state = OUT;
        }
        else if(c==' '){
            ++ns;
            state = OUT;
        }
        else{
            if(state == OUT){
                state = IN;
                ++nw;
            }
        }
    }
    printf("\n%d %d %d %d",nc,nw,ns,nl);
    return 0;
}
I wish to understand when the terminal driver actually hands over the input string to the program. Assume my input is the string "this is a test" and I press Enter; here is how the programs above behave:
Code 1: outputs "t" (and the program ends)
Code 2: outputs "this is a test", jumps to the next line (because it also outputs the Enter I pressed) and waits again for input.
Code 3: does not output anything for the above string followed by Enter. I need to press Ctrl+D for the output to be displayed (output is 15 4 3 1)
1) Why, in the case of code 3, do I need to press Ctrl+D (EOF) explicitly for the input to be sent to my program? To put this in other words, why was my input string sent to my program in the case of code 1 and code 2 after I pressed Enter? Why didn't it also ask for EOF?
2) Also, in the case of code 3, if I do not press Enter after the input string, I need to press Ctrl+D twice for the output to be displayed. Why is this the case?
EDIT:
For another input say "TESTING^D", here is how the above codes work:
1) outputs "T" and ends
2) outputs "TESTING" and waits for more input
3) outputs nothing until another Ctrl+D is pressed; then it outputs 7 1 0 0.
In the case of this input, the terminal driver sends the input string to the program when Ctrl+D is received for code 1 and code 2. Does that mean \n and Ctrl+D are treated the same way, i.e. they both serve as a marker for the terminal driver to send the input to the program? Then why do I need to press Ctrl+D twice in the second case?
This http://en.wikipedia.org/wiki/End-of-file says that the driver converts Ctrl+D into an EOF when it is on a new line. But in the case of my "TESTING^D" input, it works fine even though the ^D is on the same line as the rest of the input. What is the explanation for this?

General information:
In the case of code 2, you also need to press Ctrl+D in order to exit.
On a Unix terminal, pressing Ctrl+D at the start of a line is how you signal end of input, so what your while loop condition says is:
get input from keyboard
store it in c
if the input was not equal to EOF execute the body of the while loop
EOF is a macro for a negative int (commonly -1) that getchar returns when there is no more input; it is not a character. In the terminal you trigger it by pressing Ctrl+D. So taking this example:
while((c=getchar()) != EOF){
    // execute code
}
printf("loop has exited because you pressed ctrl+D");
The condition keeps taking input but stops when you press Ctrl+D; then the program continues to execute the rest of the code.
Answering you questions:
1) Why in case of code 3 do i need to press Ctrl+D (EOF) explicitly
for the input to be sent to my program? To put this in other words,
why was my input string sent to my program in case of code 1 and code
2 after i pressed enter? Why didn't it also ask for EOF?
In code 2 and code 3 (not only 3), you need to press Ctrl+D because the while loop stops reading only when getchar returns EOF. In both cases the line is handed to the program as soon as you press Enter. Code 2 shows this because it echoes each character inside the loop, whereas code 3 prints its counts only after the loop ends, so you see nothing until EOF. In code 1 there is no loop: when you enter one or more characters, the program reads only the first one, prints it and terminates, so EOF is never needed because no condition tests for it.
2) Also, in case of code 3, if i do not press enter after the input
string, i need to press Ctrl+D twice for the output to be displayed.
Why is this the case?
If you have typed at least one character on the current line and then press Ctrl+D, the terminal driver hands the characters typed so far to the program without ending the input. If you then press Ctrl+D again without typing anything in between, the program's read returns zero bytes, which getchar reports as EOF; the while condition becomes false and execution continues with the rest of the code.

Related

"Count characters" program: While loop not terminating on EOF

Following "The C Programming Language" by Kernighan and Ritchie, I am trying to enter the program described on page 18 (see below).
The only changes I made were to add "int" before "main" and "return 0;" before closing the brackets.
When I run the program in Terminal (Mac OS 10.15) I am prompted to enter an input. After I enter the input I am prompted to enter an input again - the "printf" line is apparently never reached and so the number of characters is never displayed.
Can anyone help me with the reason why EOF is never reached letting the while loop exit? I read some other answers suggesting CTRL + D or CTRL + Z, but I thought this shouldn't require extra input. (I was able to get the loop to exit with CTRL + D).
I have also pasted my code and the terminal window below.
#include <stdio.h>
int main(){
    long nc;
    nc = 0;
    while( getchar() != EOF )
        ++nc;
    printf("%ld\n", nc);
    return 0;
}
You already have the correct answer: when entering data at the terminal, Ctrl-D is the proper way to indicate "I'm done" to the terminal driver so that it sends an EOF condition to your program (Ctrl-Z on Windows). Ctrl-C breaks out of your program early.
If you ran this program with a redirect from an actual file, it would properly count the characters in the file.
EOF means end of file; newlines are not ends of files. You need to press CTRL+D to give the terminal an EOF signal, that's why you're never exiting your while loop.
If you were to give a file as input instead of through the command line, then you would not need to press CTRL+D
Adding to the two good answers, I would stress that EOF does not occur naturally on a terminal's stdin the way it does at the end of a regular file; the user must signal it, as you already stated in your question.
Think about it for a second: your input is a number of characters and at the end you press Enter, so the last character present in stdin is a newline character, not EOF. For it to work, end of input has to be signalled, and that is precisely what Ctrl+D on Linux/Mac or Ctrl+Z on Windows does.
As @DavidC.Rankin correctly pointed out, EOF can also occur on stdin through shell piping, e.g. echo "count this" | ./count, or redirection, e.g. ./count < somefile, where somefile would be a text file with the contents you want to pass to stdin.
By the way Ctrl+C just ends the program, whereas Ctrl+D ends the loop and continues the program execution.
For a single line input from the command line you can use something like:
int c = 0;
while((c = getchar()) != EOF && c != '\n'){
    ++nc;
}

EOF in C doesn't work

I have a problem with reading chars from input. Program should end when I press ENTER or CTRL + Z.
Example input:
LoreCTRL+Z .... is cycling
but when I press CTRL+Z and there is no text before it, it works.
Could anybody help me? Thanks
int intFromConsole = getchar();
if((intFromConsole == EOF) || (intFromConsole == '\n')){
    //code
}
That is probably not related to your program but to the terminal subsystem (in the kernel) which is responsible for that behaviour.
Your terminal is usually in canonical (line-buffered) mode, handled by the kernel's line discipline. Notice that your program doesn't receive the "Lore" directly but only when you press Return (the input is line-buffered).
Another catch: on Unix-like systems you usually shouldn't expect Ctrl+Z (ASCII 26) as input, since it is intercepted by the terminal subsystem in most terminal states, which suspends your program and returns you to the shell.
You can get more interesting information from Linus Akesson's article The TTY demystified.
The function getchar() uses buffered input. In Windows, Ctrl-Z is only recognised when it follows a newline entry. But in Windows, you might have conio.h available to you in the library. If so, here is something for you to experiment with. It accepts each keystroke immediately, and prints the value in hex. It exits when Ctrl-Z is pressed.
#include <stdio.h>
#include <conio.h>
#define ENDOF 0x1A // Ctrl-Z
int main(void) {
    int ch;
    do {
        ch = _getch();
        printf("%02X\n", ch);
    } while(ch != ENDOF);
}
Program input
1234<Ctrl-Z>
Program output
31
32
33
34
1A
Program should end when I press ENTER or CTRL + Z.
If above is what you want, you could do something like :
#include <stdio.h>
int main(void)
{
    int ch; /* int rather than char, so EOF can be told apart from real characters */
    do
    {
        ch=getchar();
        /* Do the useful stuff with ch here */
    }while(ch!=EOF && ch!='\n');
    /* Note the logical operator is && . It says: keep going while the
     * character read is neither EOF nor a newline.
     * The loop ends as soon as Enter is pressed, or when Ctrl -Z (Windows) /
     * Ctrl -D (Unix) is pressed at the start of a line. Okay ! Now it is
     * time to take a nice walk thru the Grand Central Park.
     */
    return 0;
}
On Linux machines Ctrl-Z generates SIGTSTP, which by default suspends the program (unlike SIGSTOP, it can be caught, but it never reaches getchar as a character). So to signal EOF you should instead use Ctrl-D.
References :
Linux Signals

Why do I need to press CTRL+D twice to break out of `while ((c=getchar())!=EOF)` in Ubuntu 14.10?

I am new to C programming and Ubuntu. I was reading "The C Programming Language" by D. M. Ritchie, where I found the following code:
#include <stdio.h>
int main()
{
    int c;
    int nc=0;
    while((c = getchar()) != EOF)
    {
        nc++;
    }
    printf("%d Characters \n",nc);
    return 0;
}
But while running the program I enter "Hello", then CTRL+D twice to get the actual number of characters, which is 5.
But when I enter "Hello" then CTRL+D once, nothing happens; the terminal still waits for input.
Why?
Quoting @Veritas's comment,
On linux Ctrl-D only works when the buffer is already empty otherwise it just flushes it. Therefore unless he has pressed enter without any characters after that, he will have to press Ctrl-D twice.
This explains the issue. You have to press it twice because, after typing Hello, you did not press Enter to send the line to the program. The first time you press CTRL+D, the terminal hands the pending characters to stdin. The second time you press it, with nothing pending, EOF is signalled.

EOF in Windows command prompt doesn't terminate input stream

Code:
#include <stdio.h>
#define NEWLINE '\n'
#define SPACE ' '
int main(void)
{
    int ch;
    int count = 0;
    while((ch = getchar()) != EOF)
    {
        if(ch != NEWLINE && ch != SPACE)
            count++;
    }
    printf("There are %d characters input\n" , count);
    return 0;
}
Question:
Everything works just fine, it will ignore spaces and newline and output the number of characters input to the screen (in this program I just treat comma, exclamation mark, numbers or any printable special symbol character like ampersand as character too) when I hit the EOF simulation which is ^z.
But there's something wrong when I input this line to the program. For example I input this: abcdefg^z, which means I input some character before and on the same line as ^z. Instead of terminating the program and print out total characters, the program would continue to ask for input.
The EOF terminating character input only works when I specify ^z on a single line or by doing this: ^zabvcjdjsjsj. Why is this happening?
This is true in almost every terminal driver. You'll get the same behavior using Linux.
Your program isn't actually executing the loop until \n or ^z has been entered by you at the end of a line. The terminal driver is buffering the input and it hasn't been sent to your process until that occurs.
At the end of a line, hitting ^z (or ^d on Linux) does not cause the terminal driver to send EOF. It only makes it flush the buffer to your process (with no \n).
Hitting ^z (or ^d on Linux) at the start of a line is interpreted by the terminal as "I want to signal EOF".
You can observe this behavior if you add the following inside your loop:
printf("%d\n",ch);
Run your program:
$ ./test
abc <- type "abc" and hit "enter"
97
98
99
10
abc97 <- type "abc" and hit "^z"
98
99
To better understand this, you have to realize that EOF is not a character. ^z is a user command for the terminal itself. Because the terminal is responsible for taking user input and passing it to processes, this gets tricky and thus the confusion.
A way to see this is by hitting ^v then hitting ^z as input to your program.
^v is another terminal command that tells the terminal, "Hey, the next thing I type - don't interpret that as a terminal command; pass it to the process' input instead".
^Z is only translated by the console to an EOF signal to the program when it is typed at the start of a line. That's just the way that the Windows console works. There is no "workaround" to this behaviour that I know of.

count the number of lines, words, and characters within an input

Right now I am going through a book on C and have come across an example in the book which I cannot get to work.
#include <stdio.h>
#define IN 1
#define OUT 0
main()
{
    int c, nl, nw, nc, state;
    state = OUT;
    nl = nw = nc = 0;
    while ((c = getchar()) != EOF) {
        ++nc;
        if (c == '\n')
            ++nl;
        if (c == ' ' || c == '\n' || c == '\t')
            state = OUT;
        else if (state == OUT) {
            state = IN;
            ++nw;
        }
    }
    printf("%d %d %d\n", nl, nw, nc);
}
It's supposed to count the number of lines, words, and characters within an input. However, when I run it in the terminal it appears to do nothing. Am I missing something or is there a problem with this code?
The program only terminates when the input ends (getchar returns EOF). When running on terminal, this normally never happens and because of this it seems that the program is stuck. You need to close the input manually by pressing Ctrl+D (possibly twice) on Linux or pressing F6 and Enter at the beginning of the line on Windows (different systems may use different means for this).
It's waiting for input on stdin. Either redirect a file into it (myprog < test.txt) or type out the data and hit Ctrl-D (*nix) or Ctrl-Z (Windows).
When you run it, you need to type in your text, press return, then type Ctrl-d and return (nothing else on the line) to signify end-of-file. Seems to work fine with my simple test.
What it is doing is entering a loop for input. If you enter a character or newline, nothing happens on the screen. You need to signal end of input (on my Mac this is CTRL+D), which the program sees as EOF. Then, you will get the result.
getchar() returns the input from the standard input. Start typing the text for which you want to have the word count and line count. Your input terminates when EOF is reached, which you do by hitting CTRL D.
CTRL D in this case acts as an End Of Transmission character.
cheers
I usually handle this kind of input like this (for Linux):
1. make a file (for example, named "input.txt"), type your input and save
2. use a pipe to send the text to your application (here, assume your application is named "a.out" and is in the current directory):
cat input.txt | ./a.out
You'll see the program running correctly.
