Code:
#include <stdio.h>
#define NEWLINE '\n'
#define SPACE ' '
int main(void)
{
int ch;
int count = 0;
while((ch = getchar()) != EOF)
{
if(ch != NEWLINE && ch != SPACE)
count++;
}
printf("There are %d characters input\n" , count);
return 0;
}
Question:
Everything works just fine, it will ignore spaces and newline and output the number of characters input to the screen (in this program I just treat comma, exclamation mark, numbers or any printable special symbol character like ampersand as character too) when I hit the EOF simulation which is ^z.
But there's something wrong when I input this line to the program. For example I input this: abcdefg^z, which means I input some character before and on the same line as ^z. Instead of terminating the program and print out total characters, the program would continue to ask for input.
The EOF terminating character input only works when I specify ^z on a single line or by doing this: ^zabvcjdjsjsj. Why is this happening?
This is true in almost every terminal driver. You'll get the same behavior using Linux.
Your program isn't actually executing the loop until \n or ^z has been entered by you at the end of a line. The terminal driver is buffering the input and it hasn't been sent to your process until that occurs.
At the end of a line, hitting ^z (or ^d on Linux) does not cause the terminal driver to send EOF. It only makes it flush the buffer to your process (with no \n).
Hitting ^z (or ^d on Linux) at the start of a line is interpreted by the terminal as "I want to signal EOF".
You can observe this behavior if you add the following inside your loop:
printf("%d\n",ch);
Run your program:
$ ./test
abc <- type "abc" and hit "enter"
97
98
99
10
abc97 <- type "abc" and hit "^z"
98
99
To better understand this, you have to realize that EOF is not a character. ^z is a user command for the terminal itself. Because the terminal is responsible for taking user input and passing it to processes, this gets tricky and thus the confusion.
A way to see this is by hitting ^v then hitting ^z as input to your program.
^v is another terminal command that tells the terminal, "Hey, the next thing I type - don't interpret that as a terminal command; pass it to the process' input instead".
^Z is only translated by the console to an EOF signal to the program when it is typed at the start of a line. That's just the way that the Windows console works. There is no "workaround" to this behaviour that I know of.
Related
Following "The C Programming Language" by Kernighan and Ritchie, I am trying to enter the program described on page 18 (see below).
The only changes I made were to add "int" before "main" and "return 0;" before closing the brackets.
When I run the program in Terminal (Mac OS 10.15) I am prompted to enter an input. After I enter the input I am prompted to enter an input again - the "printf" line is apparently never reached and so the number of characters is never displayed.
Can anyone help me with the reason why EOF is never reached letting the while loop exit? I read some other answers suggesting CTRL + D or CTRL + Z, but I thought this shouldn't require extra input. (I was able to get the loop to exit with CTRL + D).
I have also pasted my code and the terminal window below.
#include <stdio.h>
int main(){
long nc;
nc = 0;
while( getchar() != EOF )
++nc;
printf("%ld\n", nc);
return 0;
}
From pg. 18 of "The C Programming Language
My screenshot
You already have the correct answer: when entering data at the terminal, Ctrl-D is the proper way to indicate "I'm done" to the terminal driver so that it sends an EOF condition to your program (Ctrl-Z on Windows). Ctrl-C breaks out of your program early.
If you ran this program with a redirect from an actual file, it would properly count the characters in the file.
EOF means end of file; newlines are not ends of files. You need to press CTRL+D to give the terminal an EOF signal, that's why you're never exiting your while loop.
If you were to give a file as input instead of through the command line, then you would not need to press CTRL+D
Adding to the two good answers I would stress that EOF does not naturally occur in stdin like in other files, a signal from the user must be sent, as you already stated in your question.
Think about it for a second, your input is a number of characters and in the end you press Enter, so the last character present in stdin is a newline character not EOF. For it to work EOF would have to be inputed, and that is precisely what Ctrl+D for Linux/Mac or Ctrl+Z for Windows, do.
As #DavidC.Rankin correctly pointed out EOF can also occur on stdin through bash piping e.g. echo "count this" | ./count or redirecting e.g. ./count < somefile, where somefile would be a text file with the contents you want to pass to stdin.
By the way Ctrl+C just ends the program, whereas Ctrl+D ends the loop and continues the program execution.
For a single line input from the command line you can use something like:
int c = 0;
while((c = getchar()) != EOF && c != '\n'){
++nc;
}
I was working on the following example of C code from Deitel & Deitel. It seems that the code is supposed to print the characters entered before EOF, in the reverse order. But I have to press EOF (ctrl+z in windows) several times and Enter key to get it done. Could you please let me know why it does not respond at the first EOF?
#include <stdio.h>
int main( void )
{
int c;
if ( ( c = getchar() ) != EOF ) {
main();
printf( "%c", c );
} /* end if */
return 0;
}
Well getchar(3) is a function that operates in buffer mode, so you have to input some characters, and press the character used to signal end of data (^D in unix, ^Z in windows)
The problem here is that windows console driver is not specified the same way as the unix tty driver, so the behaviour will, in general, not be the same... Try to test the program in a real unix environment (or linux) and see if the input, at least is reversed, as the example said.
In unix, the behaviour of terminal input is that ^D is interpreted as soon as it is pressed, but if some input is in the driver buffer before it, it will make those input available to the program (so you'll have to press it a second time to signal EOF condition, which consists in a read(2) that results in 0 characters actually read). In case you have pressed <Return> before ^D (return makes all data available to the application, with the difference that the \n char is also appended to the data read), the input buffer is empty, so the EOF condition comes inmediately, after the <return> char.
In windows, you need to press <return> for anything to be read (the ^Z to be interpreted), and things complicate.
By the way, I have executed your program on a BSD unix system, with the following result:
$ a.out
apsiodjfpaosijdfa
^D
afdjisoapfjdoispa$ _
Explanation: first line is the input line "apsiodjfpaosijdfa", followed by a \n, and ^D signalling end of input. All this data goes to the application at once, and getchar() then processes it character by character. It prints the \n first (making the line to appear below ^D) and then the input chars, reversed. As there was no \r at the beginning of data, no return is issued at end, and the prompts appears next to the output.
The final _ signals the position of the cursor at the end.
If you don't want to deal with end of data characters (or don't have any unix at hand to make the test) you can use a text file to test your program (no eof char, only the actual end of the file), by redirecting program input from a file, like in this example that uses the original source code as input:
$ a.out <pru.c
}
;0 nruter
/* fi dne */ }
;) c ,"c%" (ftnirp
;)(niam
{ ) FOE =! ) )(rahcteg = c ( ( fi
;c tni
{
) diov (niam tni
>h.oidts< edulcni#$ _
It's to do with they way the windows console is passing the CTRL+Z to the program. It's probably waiting for you to compose a line and not recognising a line until it has a non- CTRL-Z character in it. So it waits until you accidentally press space and also enter.
Just echo everything in a scratch program to see exactly what is going on.
Simple c program to accept and print the character.
int c;
while((c=getchar())!=EOF)
{
putchar(c);
}
I am not getting why it accept input when I press Ctrl+Z at the end of line
ex Hello(press Ctrl+Z)
hello (some symbol)
but it work properly after leaving a line then pressing Ctrl+Z.
And I am using Window 7
When you call getchar() it in turn ends up making a read() system call. The read() will block until it has some characters available. The terminal driver only makes characters available when you press the return key or the key signifying end of input. At least that is what it looks like. In reality, it's more involved.
On the assumption that by ctrl-Z you mean whatever keystroke combination means "end of input", the reason is the way that the read() system call works. The ctrl-D (it's ctrl-D on most Unixes) character doesn't mean "end of input", it means "send the current pending input to the application".
When you've typed something in before pressing ctrl-D, that input gets sent to the application which is probably blocked on the read() system call. read() fills a buffer with the input and returns the number of bytes it put in the buffer.
When you press ctrl-D without any input pending (i.e. the last thing you didwas hit return or ctrl-D, the same thing happens but there are no characters, so read() returns 0. read() returning 0 is the convention for end of input. When getchar() sees this, it returns EOF to the calling program.
This answer in Stack Exchange puts it a bit more clearly
https://unix.stackexchange.com/a/177662/6555
You have not said what system you are working on, [U|Li]nix or Windows. This answer is Windows specific. For [Li|U]nix, replace references to ctrl-z with ctrl-d.
While using a terminal, Ctrl-z will not produce an EOF (-1) (see good answers from Haccks & JeremyP for detailed whys), so the loop will not exit the way you have it written. However, you can put a test for ctrl-z in your while loop condition to exit...
int main ()
{
int c=0;
puts ("Enter text. ctrl-z to exit:");
while(c != 26) //(26 is the ASCII value for ctrl-z)
{
putchar(c);
c = getchar();
}
return 0;
}
By the way, here is a table showing the values for ASCII control characters.
I found the answer on wiki:
In Microsoft's DOS and Windows (and in CP/M and many DEC operating systems), reading from the terminal will never produce an EOF. Instead, programs recognize that the source is a terminal (or other "character device") and interpret a given reserved character or sequence as an end-of-file indicator; most commonly this is an ASCII Control-Z, code 26.
#include <stdio.h>
int main()
{
int c;
while((c=getchar())!=26)
{
putchar(c);
}
}
You can use ASCII value of CTRL-Z.Now it won't take input after pressing CTRL-Z.
getchar() fution read single character by Pressing Ctrl+Z sends the TSTP signal to your process, means terminate the process (unix/linux)
I'm trying to understand how the terminal driver works in conjunction with getchar. Here are a few sample codes i wrote while reading KandR:
Code 1:
#include <stdio.h>
int main(){
int c = getchar();
putchar(c);
return 0;
}
Code 2:
#include <stdio.h>
int main(){
int c = EOF;
while((c=getchar()) != EOF){
printf("%c",c);
}
return 0;
}
Code 3:
//barebones program that emulates functionality of wc command
#include <stdio.h>
#define IN 1
#define OUT 0
int main(){
//nc= number of characters, ns = number of spaces, bl=number of newlines, nw=number of words
int c = EOF,nc=0,nw=0,ns=0,nl=0, state = OUT;
while((c=getchar())!=EOF){
++nc;
if(c=='\n'){
++nl;
state = OUT;
}
else if(c==' '){
++ns;
state = OUT;
}
else{
if(state == OUT){
state = IN;
++nw;}
}
}
printf("\n%d %d %d %d",nc,nw,ns,nl);
return 0;
}
I wish to understand when the terminal driver actually hands over the input string to the program. Assume my input is the string "this is a test" and i press enter, then here is how the above mentioned codes work:
code 1: outputs "t" (and the program ends)
code 2: outputs "this is a test", jumps to the next line (because it also outputs the enter i pressed) and waits again for input.
code 3: does not output anything for the above string followed by an enter. I need to press Ctrl+D for the output to be displayed (output is 15 4 3 1)
1) Why in case of code 3 do i need to press Ctrl+D (EOF) explicitly for the input to be sent to my program? To put this in other words, why was my input string sent to my program in case of code 1 and code 2 after i pressed enter? Why didn't it also ask for EOF?
2) Also, in case of code 3, if i do not press enter after the input string, i need to press Ctrl+D twice for the output to be displayed. Why is this the case?
EDIT:
For another input say "TESTING^D", here is how the above codes work:
1) outputs "T" and ends
2) outputs "TESTING" and waits for more input
3) ouputs nothing until another Ctrl+D is pressed. then it outputs 7 1 0 0.
In case of this input, the terminal driver sends the input string to the program when Ctrl+D is received in case of code 1 and code 2. Does that mean /n and Ctrl+D are treated the same way i.e. they both serve as a marker for the terminal driver to send the input to the program? Then why i need to press Ctrl+D twice for the second case?
This http://en.wikipedia.org/wiki/End-of-file says that the driver converts Ctrl+D into an EOF when it is on a newline. But in case of my "TESTING^D" input, it works fine even though the ^D is on the same line as the rest of the input. What can be the possible explanation for this?
General information:
In case code 2: you also need to do ctrl+D in order to exit.
In fact EOF is achieved by pressing ctrl+D, so what your while loop condition says:
get input from keyboard
store it in c
if the input was not equal to EOF execute the body of the while loop
EOF is nothing but the integer -1, and this can be achieved in the terminal by pressing ctrl+D. So taking this example:
while((c=getchar()) != EOF){
// execute code
}
printf("loop has exited because you press ctrl+D");
The condition keeps taking input but stops when you press ctrl+D, then it continue to execute the rest of the code.
Answering you questions:
1) Why in case of code 3 do i need to press Ctrl+D (EOF) explicitly
for the input to be sent to my program? To put this in other words,
why was my input string sent to my program in case of code 1 and code
2 after i pressed enter? Why didn't it also ask for EOF?
In code 2 and 3 (Not only 3), you need to press Ctrl+D because the while loop stops taking input from keyboard only when it reads EOF. In code 1 you are not looping, so when you enter one or more character, the program will read the characters entered but will store the first one only, then it will print it and terminate the program, so no need for EOF in that case because you are not asking for it anywhere in any condition.
2) Also, in case of code 3, if i do not press enter after the input
string, i need to press Ctrl+D twice for the output to be displayed.
Why is this the case?
If you started typing when the program is expecting input, then after typing at least one character you press ctrl+D, this will tell the program to stop taking input and return the entered characters. After that, if you press ctrl+D again without entering any character before, this will return EOF which will then not satisfy the condition of the while loop and skip to continue executing the rest of the code
#include <stdio.h>
int main() {
char read = ' ';
while ((read = getchar()) != '\n') {
putchar(read);
}
return 0;
}
My input is f (followed by an enter, of course). I expect getchar() to ask for input again, but instead the program is terminated. How come? How can I fix this?
The Terminal can sometimes be a little bit confusing. You should change your program to:
#include <stdio.h>
int main() {
int read;
while ((read = getchar()) != EOF) {
putchar(read);
}
return 0;
}
This will read until getchar reads EOF (most of the time this macro expands to -1) from the terminal. getchar returns an int so you should make your variable 'read' into an integer, so you can check for EOF. You can send an EOF from your terminal on Linux with ^D and I think on windows with ^Z (?).
To explain a little bit what happens. In your program the expression
(read = getchar()) !='\n'
will be true as long as no '\n' is read from the buffer. The problem is, to get the buffer to your program, you have to hit enter which corresponds to '\n'.
The following steps happen when your program is invoked in the terminal:
~$\a.out
this starts your program
(empty line)
getchar() made a system call to get an input from the terminal and the terminal takes over
f
you made an input in the terminal. The 'f' is written into the buffer and echoed back on the terminal, your program has no idea about the character yet.
f
f~$
You hit enter. Your buffer contains now 'f\n'. The 'enter' also signals to the terminal, that it should return to your program. Your progam
reads the buffer and will find the f and put it onto the screen and then find an '\n' and immediatley stop the loop and end your program.
This would be standard behaviour of most terminals. You can change this behaviour, but that would depend on your OS.
getchar() returns the next character from the input stream. This includes of course also newlines etc. The fact that you don't see progress in your loop unless you press 'Enter' is caused by the fact that your file I/O (working on stdin) doesn't hand over the input buffer to getchar() unless it detects the '\n' at the end of the buffer. Your routine first blocks then handles the two keystrokes in one rush, terminating, like you specified it, with the appearance of '\n' in the input stream. Facit: getchar() will not remove the '\n' from the input stream (why should it?).
after f you are putting "enter" which is '/n'.
so the loop ends there.
if you want to take another character just keep on putting them one after the other as soon as enter is pressed the loop exits.
You've programmed it so the loop ends when you read a \n (enter), and you then return 0; from main which exits the program.
Perhaps you want something like
while ((read = getchar()) != EOF) {
putchar(read);
}
On nx terminals you can press Control-D which will tell the tty driver to return the input buffer to the app reading it. That's why ^D on a new line ends input - it causes the tty to return zero bytes, which the app interprets as end-of-file. But it also works anywhere on a line.