"Count characters" program: While loop not terminating on EOF - c

Following "The C Programming Language" by Kernighan and Ritchie, I am trying to enter the program described on page 18 (see below).
The only changes I made were to add "int" before "main" and "return 0;" before closing the brackets.
When I run the program in Terminal (Mac OS 10.15) I am prompted to enter an input. After I enter the input I am prompted to enter an input again - the "printf" line is apparently never reached and so the number of characters is never displayed.
Can anyone help me with the reason why EOF is never reached letting the while loop exit? I read some other answers suggesting CTRL + D or CTRL + Z, but I thought this shouldn't require extra input. (I was able to get the loop to exit with CTRL + D).
I have also pasted my code and the terminal window below.
#include <stdio.h>
int main(){
long nc;
nc = 0;
while( getchar() != EOF )
++nc;
printf("%ld\n", nc);
return 0;
}
From pg. 18 of "The C Programming Language
My screenshot

You already have the correct answer: when entering data at the terminal, Ctrl-D is the proper way to indicate "I'm done" to the terminal driver so that it sends an EOF condition to your program (Ctrl-Z on Windows). Ctrl-C breaks out of your program early.
If you ran this program with a redirect from an actual file, it would properly count the characters in the file.

EOF means end of file; newlines are not ends of files. You need to press CTRL+D to give the terminal an EOF signal, that's why you're never exiting your while loop.
If you were to give a file as input instead of through the command line, then you would not need to press CTRL+D

Adding to the two good answers I would stress that EOF does not naturally occur in stdin like in other files, a signal from the user must be sent, as you already stated in your question.
Think about it for a second, your input is a number of characters and in the end you press Enter, so the last character present in stdin is a newline character not EOF. For it to work EOF would have to be inputed, and that is precisely what Ctrl+D for Linux/Mac or Ctrl+Z for Windows, do.
As #DavidC.Rankin correctly pointed out EOF can also occur on stdin through bash piping e.g. echo "count this" | ./count or redirecting e.g. ./count < somefile, where somefile would be a text file with the contents you want to pass to stdin.
By the way Ctrl+C just ends the program, whereas Ctrl+D ends the loop and continues the program execution.
For a single line input from the command line you can use something like:
int c = 0;
while((c = getchar()) != EOF && c != '\n'){
++nc;
}

Related

getchar and EOF

I was working on the following example of C code from Deitel & Deitel. It seems that the code is supposed to print the characters entered before EOF, in the reverse order. But I have to press EOF (ctrl+z in windows) several times and Enter key to get it done. Could you please let me know why it does not respond at the first EOF?
#include <stdio.h>
int main( void )
{
int c;
if ( ( c = getchar() ) != EOF ) {
main();
printf( "%c", c );
} /* end if */
return 0;
}
Well getchar(3) is a function that operates in buffer mode, so you have to input some characters, and press the character used to signal end of data (^D in unix, ^Z in windows)
The problem here is that windows console driver is not specified the same way as the unix tty driver, so the behaviour will, in general, not be the same... Try to test the program in a real unix environment (or linux) and see if the input, at least is reversed, as the example said.
In unix, the behaviour of terminal input is that ^D is interpreted as soon as it is pressed, but if some input is in the driver buffer before it, it will make those input available to the program (so you'll have to press it a second time to signal EOF condition, which consists in a read(2) that results in 0 characters actually read). In case you have pressed <Return> before ^D (return makes all data available to the application, with the difference that the \n char is also appended to the data read), the input buffer is empty, so the EOF condition comes inmediately, after the <return> char.
In windows, you need to press <return> for anything to be read (the ^Z to be interpreted), and things complicate.
By the way, I have executed your program on a BSD unix system, with the following result:
$ a.out
apsiodjfpaosijdfa
^D
afdjisoapfjdoispa$ _
Explanation: first line is the input line "apsiodjfpaosijdfa", followed by a \n, and ^D signalling end of input. All this data goes to the application at once, and getchar() then processes it character by character. It prints the \n first (making the line to appear below ^D) and then the input chars, reversed. As there was no \r at the beginning of data, no return is issued at end, and the prompts appears next to the output.
The final _ signals the position of the cursor at the end.
If you don't want to deal with end of data characters (or don't have any unix at hand to make the test) you can use a text file to test your program (no eof char, only the actual end of the file), by redirecting program input from a file, like in this example that uses the original source code as input:
$ a.out <pru.c
}
;0 nruter
/* fi dne */ }
;) c ,"c%" (ftnirp
;)(niam
{ ) FOE =! ) )(rahcteg = c ( ( fi
;c tni
{
) diov (niam tni
>h.oidts< edulcni#$ _
It's to do with they way the windows console is passing the CTRL+Z to the program. It's probably waiting for you to compose a line and not recognising a line until it has a non- CTRL-Z character in it. So it waits until you accidentally press space and also enter.
Just echo everything in a scratch program to see exactly what is going on.

K&R C Counting Characters

I'm working out of the K&R book, and I'm on the code example of how to count characters from a stream of text. I copied their code and tried running it, but when the command line prompts you for characters, the loop doesn't exit and thus will never print out the character count. Is there an error here I'm not catching?
#include <stdio.h>
main()
{
long nc;
nc = 0;
while(getchar() != EOF) {
++nc;
}
printf("%1d\n", nc);
}
Whenever you want to stop it just send the EOF signal to the shell.
Ctrl+d in Linux or Ctrl+z on Windows.
By the way (as additional info) Ctrl+c send SIGINT to a process in Linux and on Windows it does something similar.
Have you tried to press Ctrl+D (on Linux) or Ctrl+Z (on Windows)? If yes then It will come out of loop for sure. On pressing these keys, it will return EOF and loop will terminate.
Try ending your character stream with CNTL-Z (end of file character). Just hitting Enter results in a CR which is just another character to count

Character Counter from "The C Programming Language" Not Working As I Expected

I am reading through "The C Programming Language", and working through all the exercises with CodeBlocks. But I cannot get my character counter to work, despite copying it directly from the book. The code looks like this:
#include <stdio.h>
main(){
long nc;
nc = 0;
while (getchar() != EOF)
++nc;
printf("%ld\n", nc);
}
When I run the program, it opens a window I can type in, but when I hit enter all that happens is it skips down a line and I can keep typing, but I think it's supposed to print the number of characters.
Any idea what's going wrong?
This line:
while (getchar() != EOF)
means that it keeps reading until the end of input — not until the end of a line. (EOF is a special constant meaning "end of file".) You need to end input (probably with Ctrl-D or with Ctrl-Z) to see the total number of characters that were input.
If you want to terminate on EOL (end of line), replace EOF with '\n':
#include <stdio.h>
main(){
long nc;
nc = 0;
while (getchar() != '\n')
++nc;
printf("%ld\n", nc);
}
Enter is not EOF. Depending on your OS, Ctrl-D or Ctrl-Z should act as EOF on standard input.
I ran into the problem tonight, too. Finally found out that Ctrl-D on Linux worked. You build the source file using cc, and start the program and input a word, then press Ctrl-D twice when finished typing. The number that the program countered will be printed just behind the very word you just typed, and the program terminates immediately. Just like this:
The above answer provided by nujabse is correct. But recently coming across this issue myself and researching the answer, I would like to add why.
Using Ctrl+C tells the terminal to send a SIGINT to the current foreground process, which by default translates into terminating the application.
Ctrl+D tells the terminal that it should register a EOF on standard input, which bash interprets as a desire to exit.
What's the difference between ^C and ^D

EOF in Windows command prompt doesn't terminate input stream

Code:
#include <stdio.h>
#define NEWLINE '\n'
#define SPACE ' '
int main(void)
{
int ch;
int count = 0;
while((ch = getchar()) != EOF)
{
if(ch != NEWLINE && ch != SPACE)
count++;
}
printf("There are %d characters input\n" , count);
return 0;
}
Question:
Everything works just fine, it will ignore spaces and newline and output the number of characters input to the screen (in this program I just treat comma, exclamation mark, numbers or any printable special symbol character like ampersand as character too) when I hit the EOF simulation which is ^z.
But there's something wrong when I input this line to the program. For example I input this: abcdefg^z, which means I input some character before and on the same line as ^z. Instead of terminating the program and print out total characters, the program would continue to ask for input.
The EOF terminating character input only works when I specify ^z on a single line or by doing this: ^zabvcjdjsjsj. Why is this happening?
This is true in almost every terminal driver. You'll get the same behavior using Linux.
Your program isn't actually executing the loop until \n or ^z has been entered by you at the end of a line. The terminal driver is buffering the input and it hasn't been sent to your process until that occurs.
At the end of a line, hitting ^z (or ^d on Linux) does not cause the terminal driver to send EOF. It only makes it flush the buffer to your process (with no \n).
Hitting ^z (or ^d on Linux) at the start of a line is interpreted by the terminal as "I want to signal EOF".
You can observe this behavior if you add the following inside your loop:
printf("%d\n",ch);
Run your program:
$ ./test
abc <- type "abc" and hit "enter"
97
98
99
10
abc97 <- type "abc" and hit "^z"
98
99
To better understand this, you have to realize that EOF is not a character. ^z is a user command for the terminal itself. Because the terminal is responsible for taking user input and passing it to processes, this gets tricky and thus the confusion.
A way to see this is by hitting ^v then hitting ^z as input to your program.
^v is another terminal command that tells the terminal, "Hey, the next thing I type - don't interpret that as a terminal command; pass it to the process' input instead".
^Z is only translated by the console to an EOF signal to the program when it is typed at the start of a line. That's just the way that the Windows console works. There is no "workaround" to this behaviour that I know of.

Why does this getchar() loop stop after one character has been entered?

#include <stdio.h>
int main() {
char read = ' ';
while ((read = getchar()) != '\n') {
putchar(read);
}
return 0;
}
My input is f (followed by an enter, of course). I expect getchar() to ask for input again, but instead the program is terminated. How come? How can I fix this?
The Terminal can sometimes be a little bit confusing. You should change your program to:
#include <stdio.h>
int main() {
int read;
while ((read = getchar()) != EOF) {
putchar(read);
}
return 0;
}
This will read until getchar reads EOF (most of the time this macro expands to -1) from the terminal. getchar returns an int so you should make your variable 'read' into an integer, so you can check for EOF. You can send an EOF from your terminal on Linux with ^D and I think on windows with ^Z (?).
To explain a little bit what happens. In your program the expression
(read = getchar()) !='\n'
will be true as long as no '\n' is read from the buffer. The problem is, to get the buffer to your program, you have to hit enter which corresponds to '\n'.
The following steps happen when your program is invoked in the terminal:
~$\a.out
this starts your program
(empty line)
getchar() made a system call to get an input from the terminal and the terminal takes over
f
you made an input in the terminal. The 'f' is written into the buffer and echoed back on the terminal, your program has no idea about the character yet.
f
f~$
You hit enter. Your buffer contains now 'f\n'. The 'enter' also signals to the terminal, that it should return to your program. Your progam
reads the buffer and will find the f and put it onto the screen and then find an '\n' and immediatley stop the loop and end your program.
This would be standard behaviour of most terminals. You can change this behaviour, but that would depend on your OS.
getchar() returns the next character from the input stream. This includes of course also newlines etc. The fact that you don't see progress in your loop unless you press 'Enter' is caused by the fact that your file I/O (working on stdin) doesn't hand over the input buffer to getchar() unless it detects the '\n' at the end of the buffer. Your routine first blocks then handles the two keystrokes in one rush, terminating, like you specified it, with the appearance of '\n' in the input stream. Facit: getchar() will not remove the '\n' from the input stream (why should it?).
after f you are putting "enter" which is '/n'.
so the loop ends there.
if you want to take another character just keep on putting them one after the other as soon as enter is pressed the loop exits.
You've programmed it so the loop ends when you read a \n (enter), and you then return 0; from main which exits the program.
Perhaps you want something like
while ((read = getchar()) != EOF) {
putchar(read);
}
On nx terminals you can press Control-D which will tell the tty driver to return the input buffer to the app reading it. That's why ^D on a new line ends input - it causes the tty to return zero bytes, which the app interprets as end-of-file. But it also works anywhere on a line.

Resources