How is this program meant to work? - c

I've been reading The C Programming Language by Kernighan and Ritchie, and very early on I came across a program that didn't work, even though I copied it directly from the book. Here is a screen cap of the description - http://i.imgur.com/SBQSE.png
It gets stuck in an infinite loop because anything I enter is obviously a keyboard entry, and it's checking it against EOF, which is clearly not a keyboard entry.
#include <stdio.h>
/* copy input to output; 1st version */
main()
{
    int c;
    c = getchar();
    while (c != EOF) {
        putchar(c);
        c = getchar();
    }
}
Surely an authoritative book on C like this can't have an error. Am I missing something?

Run it and press Ctrl+D to signal EOF (end of file) when typing input at the terminal.
If the input came from a file, end of file would be reported automatically once the last byte had been read. Because you are typing at the terminal, you have to signal end of file yourself with the key sequence above.
How it actually works
EOF is a negative int constant (-1 in glibc, at least). That is why you can't just write while (c) { do work; }: any non-zero value is true, so EOF is true just like the positive character values returned by getchar(). You therefore have to compare c against EOF directly with c != EOF. getchar() returns EOF once the terminal reports end of input, which is what Ctrl+D triggers.
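To make the point about the comparison concrete, here is a minimal sketch of the same copy loop written against an arbitrary stream. The helper name copy_stream is hypothetical and not from the book. It shows why c is declared int: it has to hold every possible byte value (0 to 255) and also the negative EOF value, so a byte such as 0xFF in the input is never mistaken for end of file.

```c
#include <stdio.h>

/* Copies every byte from `in` to `out` and returns how many bytes were
   copied. copy_stream is a hypothetical helper used for illustration. */
long copy_stream(FILE *in, FILE *out)
{
    long n = 0;
    int c;  /* int, not char: must hold every byte value plus EOF */

    while ((c = getc(in)) != EOF) {
        putc(c, out);
        ++n;
    }
    return n;
}
```

Calling copy_stream(stdin, stdout) from main should behave like the program in the question: it keeps copying until the terminal reports end of input.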

EOF is End-Of-File. Try Ctrl-D.

Related

Confused about EOF for scanf() in C [duplicate]


"Count characters" program: While loop not terminating on EOF

Following "The C Programming Language" by Kernighan and Ritchie, I am trying to enter the program described on page 18 (see below).
The only changes I made were to add "int" before "main" and "return 0;" before closing the brackets.
When I run the program in Terminal (Mac OS 10.15) I am prompted to enter an input. After I enter the input I am prompted to enter an input again - the "printf" line is apparently never reached and so the number of characters is never displayed.
Can anyone help me with the reason why EOF is never reached letting the while loop exit? I read some other answers suggesting CTRL + D or CTRL + Z, but I thought this shouldn't require extra input. (I was able to get the loop to exit with CTRL + D).
I have also pasted my code and the terminal window below.
#include <stdio.h>
int main(){
    long nc;
    nc = 0;
    while( getchar() != EOF )
        ++nc;
    printf("%ld\n", nc);
    return 0;
}
You already have the correct answer: when entering data at the terminal, Ctrl-D is the proper way to indicate "I'm done" to the terminal driver so that it sends an EOF condition to your program (Ctrl-Z on Windows). Ctrl-C breaks out of your program early.
If you ran this program with a redirect from an actual file, it would properly count the characters in the file.
EOF means end of file; a newline is not the end of a file. You need to press Ctrl+D to make the terminal report end of file, which is why you never exit your while loop.
If you gave the program a file as input instead of typing at the command line, you would not need to press Ctrl+D.
Adding to the two good answers, I would stress that EOF does not occur naturally on stdin the way it does in other files when you type at a terminal. The user has to signal it, as you already stated in your question.
Think about it for a second: your input is a number of characters, and at the end you press Enter, so the last character present in stdin is a newline character, not EOF. For the loop to end, EOF has to be signaled, and that is precisely what Ctrl+D on Linux/Mac or Ctrl+Z on Windows does.
As @DavidC.Rankin correctly pointed out, EOF can also occur on stdin through bash piping, e.g. echo "count this" | ./count, or redirection, e.g. ./count < somefile, where somefile is a text file with the contents you want to pass to stdin.
By the way, Ctrl+C just ends the program, whereas Ctrl+D ends the loop and lets program execution continue.
For a single line input from the command line you can use something like:
int c = 0;
while((c = getchar()) != EOF && c != '\n'){
    ++nc;
}

Why EOF(end of file) isn't working at the end of a line without a '\n' before it?

So I started to learn C using the ANSI C book. One of the early exercises in the book is to write a program that takes text input and prints every word on a new line, simple enough. So I did:
#include <stdio.h>
#define IN 1
#define OUT 0
main() {
    int c;
    int state;
    state = OUT;
    while((c = getchar()) != EOF){
        if(c != ' ' && c != '\n' && c != '\t'){
            state = IN;
        }else if(state == IN){
            state = OUT;
            putchar('\n');
        }
        if(state == IN)
            putchar(c);
    }
    getchar();
}
The thing is that, while the program works fine, it won't break out of the while loop if I enter EOF (Ctrl+Z on Windows) as the last char of a line or in the middle of one.
So I found an answer here.
What I learned is that the (Ctrl+Z) char is some sort of signal to end the stream and it must be on a new line for getchar() to return EOF. While this is all good and it kinda helped I really want to know why is it necessary for the EOF to be on its own line?
The problem you are having is related to your command line terminal and has nothing to do with the end of file marker itself. Instead of sending characters to the program as you type them, most terminals will wait until you finish a whole line before sending what you type to the program.
You can test this by having the input come from a text file instead of being typed by hand. You should be able to end the input file without a newline without any problems.
./myprogram.exe < input.txt
By the way, the answer you linked to also points out that EOF is not a character that is actually in your input stream, so there is no way for it to come "before" a "\n". EOF is just the value that getchar returns once there are no characters left to be read.
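A small sketch of that last point, assuming the input comes from a stream such as a redirected file. The helper name last_byte is hypothetical. Only real bytes are ever stored in last; EOF shows up solely as the value that ends the loop, so input that stops without a trailing newline is handled the same way as any other.

```c
#include <stdio.h>

/* Reads `in` to the end and returns the last byte seen, or EOF if the
   stream was empty. last_byte is a hypothetical helper for illustration. */
int last_byte(FILE *in)
{
    int c, last = EOF;

    while ((c = getc(in)) != EOF)
        last = c;   /* only real bytes from the stream ever land here */
    return last;
}
```

For a file whose contents are exactly "ab" with no newline, the function returns 'b', and further reads from the same stream keep returning EOF.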
When reading from a tty device (such as stdin for a program running in a console or terminal window) the terminal is in so-called cooked mode. In this mode, some level of line editing facilities are provided, allowing the user to backspace and change what has been typed.
The characters that are typed are not returned to the program until after return has been pressed.
It is possible to receive characters as they are typed by placing the terminal in 'raw' mode. Unfortunately, this is not well standardised, so it is somewhat system specific. The answers to this question have some examples for various platforms.
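As one system-specific illustration, here is a minimal sketch for POSIX systems using termios. It is an assumption that your platform provides termios (Linux and macOS do, Windows does not), and the helper name disable_line_buffering is hypothetical. It only switches off canonical mode rather than enabling a full raw mode, which is enough for read() or getchar() to see each key without waiting for Enter.

```c
#include <termios.h>
#include <unistd.h>

/* Turns off canonical (line-buffered) input on `fd`, saving the old settings
   in *saved so the caller can restore them. Returns 0 on success, -1 if `fd`
   is not a terminal or the change fails. Hypothetical helper, POSIX only. */
int disable_line_buffering(int fd, struct termios *saved)
{
    struct termios t;

    if (tcgetattr(fd, saved) == -1)
        return -1;
    t = *saved;
    t.c_lflag &= ~(tcflag_t)ICANON;  /* deliver bytes without waiting for Enter */
    t.c_cc[VMIN] = 1;                /* read() returns once 1 byte is available */
    t.c_cc[VTIME] = 0;               /* no inter-byte timeout */
    return tcsetattr(fd, TCSANOW, &t);
}
```

A caller would pass STDIN_FILENO and later restore the saved settings with tcsetattr(STDIN_FILENO, TCSANOW, &saved). Note that with canonical mode off, Ctrl-D is no longer turned into end of file; it arrives as an ordinary byte (value 4), so the program has to check for it itself.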

How can I make the word count program (from K&R) terminate after just one application of CTRL-D?

I'm practicing using C & Unix by writing up some of the C programs in Vim and compiling them.
The word count program is supposed to end when the character read is EOF (CTRL-D). However, when I run it, the first CTRL-D pressed just makes it print "^D" (minus the quotes) on the terminal. The second time it's pressed, the "^D" goes away and it terminates normally.
How can I change this so that it terminates after only one CTRL-D? I notice that if I've just made a newline character, then pressing CTRL-D once does the trick. But I don't really understand why it works then and not in the general case.
Here's what the program looks like for those of you who don't have the book.
#include <stdio.h>
#define IN 1 /* Inside a word */
#define OUT 0 /* Outside a word */
int main()
{
    int c, nl, nw, nc, state;
    state = OUT;
    nl = nw = nc = 0;
    while ((c = getchar()) != EOF) {
        ++nc;
        if (c == '\n') {
            ++nl;
            --nc;
        }
        if (c == ' ' || c == '\n' || c == '\t')
            state = OUT;
        else if (state == OUT) {
            state = IN;
            nw++;
        }
    }
    printf("%d %d %d\n", nl, nw, nc);
    return 0;
}
Input on Unix-type systems typically arrives through a terminal in canonical mode, which means the terminal driver (not your program, and not the shell) partially interprets it in order to implement line editing and other control functions. An example of such a control function is Control-U, which by default clears the current line. What is happening in your case is that the terminal driver treats Control-D as a request to hand over whatever has been typed so far. If characters are pending on the current line, they are sent to your program, but no EOF is reported. When nothing is pending, as is the case right after a newline, Control-D delivers zero characters, which your program sees as EOF. That EOF then ends the loop and lets your program finish.
For further information see prior answer:
ctrl-d didn't stop the while(getchar()!=EOF) loop
If you type Control-D at the start of the line, once should be enough. If you type Control-D after you've typed anything, then you need to type it twice.
That's because the Control-D tells the terminal driver to send all available characters to any waiting process. When it is at the start of a line, there are no available characters, so a process waiting on a read() gets zero characters returned, which is the meaning of EOF (no characters available for reading). When it is part way through a line, the first Control-D sends the characters to the program, which reads them; the second indicates no more characters and hence EOF once more.
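The "zero characters returned" rule can be sketched directly with the POSIX read() call. The helper name count_fd_bytes is hypothetical. The loop keeps going as long as read() returns a positive count and stops only when it returns 0. On a terminal, a Control-D typed part way through a line makes read() return the pending characters (a positive count), so the loop continues; only a Control-D with nothing pending produces the 0 that ends it.

```c
#include <unistd.h>

/* Counts bytes available on `fd` until read() reports end of file by
   returning 0. Returns -1 on a read error. count_fd_bytes is a
   hypothetical helper used for illustration. */
long count_fd_bytes(int fd)
{
    char buf[256];
    long total = 0;
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;
    return n == 0 ? total : -1;
}
```

getchar() is built on top of the same mechanism: the C library turns a read() result of 0 into the EOF return value.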

Character Counter from "The C Programming Language" Not Working As I Expected

I am reading through "The C Programming Language", and working through all the exercises with CodeBlocks. But I cannot get my character counter to work, despite copying it directly from the book. The code looks like this:
#include <stdio.h>
main(){
    long nc;
    nc = 0;
    while (getchar() != EOF)
        ++nc;
    printf("%ld\n", nc);
}
When I run the program, it opens a window I can type in, but when I hit enter all that happens is it skips down a line and I can keep typing, but I think it's supposed to print the number of characters.
Any idea what's going wrong?
This line:
while (getchar() != EOF)
means that it keeps reading until the end of input — not until the end of a line. (EOF is a special constant meaning "end of file".) You need to end input (probably with Ctrl-D or with Ctrl-Z) to see the total number of characters that were input.
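If you want to terminate on EOL (end of line), test for '\n' as well, keeping the EOF test so the loop still ends if input runs out before a newline arrives:
#include <stdio.h>
int main(void){
    long nc;
    int c;
    nc = 0;
    /* stop at end of line, or at EOF if the input ends first */
    while ((c = getchar()) != '\n' && c != EOF)
        ++nc;
    printf("%ld\n", nc);
    return 0;
}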
Enter is not EOF. Depending on your OS, Ctrl-D or Ctrl-Z should act as EOF on standard input.
I ran into the problem tonight, too, and finally found out that Ctrl-D works on Linux. Build the source file using cc, start the program, type a word, then press Ctrl-D twice when you have finished typing. The number that the program counted is printed right after the word you just typed, and the program terminates immediately.
The above answer provided by nujabse is correct. Having recently come across this issue myself and researched the answer, I would like to add why.
Using Ctrl+C tells the terminal to send a SIGINT to the current foreground process, which by default terminates the application.
Ctrl+D tells the terminal that it should register an EOF on standard input, which bash interprets as a desire to exit.
What's the difference between ^C and ^D
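The difference can be sketched in code. Ctrl+D never reaches the program as a signal; it only shows up as getchar() returning EOF. Ctrl+C arrives out of band as SIGINT, so a program that wants to notice it has to install a handler. The names got_sigint, on_sigint, and watch_sigint below are hypothetical; the sketch uses the standard signal() function from <signal.h>.

```c
#include <signal.h>

/* Set to 1 by the handler when SIGINT (Ctrl+C) arrives. */
static volatile sig_atomic_t got_sigint = 0;

static void on_sigint(int signo)
{
    (void)signo;      /* unused */
    got_sigint = 1;   /* only set a flag; do the real work outside the handler */
}

/* Installs the handler; returns 0 on success, -1 on failure.
   watch_sigint is a hypothetical name. */
int watch_sigint(void)
{
    return signal(SIGINT, on_sigint) == SIG_ERR ? -1 : 0;
}
```

Without such a handler, the default action for SIGINT ends the process immediately, which is why the character count is never printed after Ctrl+C but is printed after Ctrl+D.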
