Right now I am going through a book on C and have come across an example in the book which I cannot get to work.
#include <stdio.h>
#define IN 1
#define OUT 0
int main(void)
{
    int c, nl, nw, nc, state;
    state = OUT;
    nl = nw = nc = 0;
    while ((c = getchar()) != EOF) {
        ++nc;
        if (c == '\n')
            ++nl;
        if (c == ' ' || c == '\n' || c == '\t')
            state = OUT;
        else if (state == OUT) {
            state = IN;
            ++nw;
        }
    }
    printf("%d %d %d\n", nl, nw, nc);
    return 0;
}
It's supposed to count the number of lines, words, and characters within an input. However, when I run it in the terminal it appears to do nothing. Am I missing something or is there a problem with this code?
The program only terminates when the input ends (getchar returns EOF). When running in a terminal, this normally never happens, so the program appears to be stuck. You need to end the input manually by pressing Ctrl+D (possibly twice) on Linux, or by pressing F6 (or Ctrl+Z) and then Enter at the beginning of a line on Windows (different systems may use different means for this).
It's waiting for input on stdin. Either redirect a file into it (myprog < test.txt) or type out the data and hit Ctrl-D (*nix) or Ctrl-Z (Windows).
When you run it, you need to type in your text, press return, then type Ctrl-d and return (nothing else on the line) to signify end-of-file. Seems to work fine with my simple test.
What it is doing is looping on input. If you enter a character or newline, nothing happens on the screen. You need to signal the end of input (on my Mac this is CTRL+D), which the program sees as EOF. Then you will get the result.
getchar() returns the input from the standard input. Start typing the text for which you want to have the word count and line count. Your input terminates when EOF is reached, which you do by hitting CTRL D.
CTRL D in this case acts as an End Of Transmission character.
cheers
I usually handle this kind of input like this (for Linux):
1. make a file (for example, named "input.txt"), type your input and save
2. use a pipe to send the text to your application (assuming your application is named "a.out" and is in the current directory):
cat input.txt | ./a.out
you'll see the program running correctly.
I'm practicing using C & Unix by writing up some of the C programs in Vim and compiling them.
The word count program is supposed to end when the character read is EOF (CTRL-D). However, when I run it, the first CTRL-D pressed just makes it print "^D" (minus the quotes) on the terminal. The second time it's pressed, the "^D" goes away and it terminates normally.
How can I change this so that it terminates after only one CTRL-D? I notice that if I've just made a newline character, then pressing CTRL-D once does the trick. But I don't really understand why it works then and not in the general case.
Here's what the program looks like for those of you who don't have the book.
#include <stdio.h>
#define IN 1 /* Inside a word */
#define OUT 0 /* Outside a word */
int main()
{
    int c, nl, nw, nc, state;
    state = OUT;
    nl = nw = nc = 0;
    while ((c = getchar()) != EOF) {
        ++nc;
        if (c == '\n') {
            ++nl;
            --nc;
        }
        if (c == ' ' || c == '\n' || c == '\t')
            state = OUT;
        else if (state == OUT) {
            state = IN;
            nw++;
        }
    }
    printf("%d %d %d\n", nl, nw, nc);
    return 0;
}
Input on Unix-type systems is typically read from a terminal in canonical mode, which means the terminal driver (not the shell, and not your program) partially interprets it in order to implement line-editing control functions. One example is Control-U, which erases the current line. In your case the terminal driver interprets Control-D as a request to hand over whatever input is pending. If you have typed characters since the last newline, those characters are sent to your program, but the Control-D itself is not passed on and no EOF is reported. When nothing is pending, as is the case right after a newline, Control-D makes the read return zero characters, which your program sees as EOF. This EOF then ends your program's loop.
For further information see prior answer:
ctrl-d didn't stop the while(getchar()!=EOF) loop
If you type Control-D at the start of the line, once should be enough. If you type Control-D after you've typed anything, then you need to type it twice.
That's because the Control-D tells the terminal driver to send all available characters to any waiting process. When it is at the start of a line, there are no available characters, so a process waiting on a read() gets zero characters returned, which is the meaning of EOF (no characters available for reading). When it is part way through a line, the first Control-D sends the characters to the program, which reads them; the second indicates no more characters and hence EOF once more.
So I started to learn C using the ANSI C book. One of the early exercises in the book is to write a program that takes text input and prints every word on a new line, simple enough. So I did:
#include <stdio.h>
#define IN 1
#define OUT 0
int main(void) {
    int c;
    int state;
    state = OUT;
    while ((c = getchar()) != EOF) {
        if (c != ' ' && c != '\n' && c != '\t') {
            state = IN;
        } else if (state == IN) {
            state = OUT;
            putchar('\n');
        }
        if (state == IN)
            putchar(c);
    }
    getchar();
    return 0;
}
The thing is that while the program works fine, it won't break from the while loop if I enter EOF (Ctrl+Z on Windows) as the last character of a line or in the middle of it.
So I found an answer here.
What I learned is that the (Ctrl+Z) char is some sort of signal to end the stream and it must be on a new line for getchar() to return EOF. While this is all good and it kind of helped, I really want to know why it is necessary for the EOF to be on its own line.
The problem you are having is related to your command line terminal and has nothing to do with the end of file marker itself. Instead of sending characters to the program as you type them, most terminals will wait until you finish a whole line before sending what you type to the program.
You can test this by having the input come from a text file instead of being typed by hand. You should be able to end the input file without a newline without any problems.
./myprogram.exe < input.txt
By the way, the answer you linked to also points out that EOF is not a character that is actually in your input stream, so there is no way for it to come "before" a "\n". EOF is just the value that getchar returns once there are no characters left to be read.
When reading from a tty device (such as stdin for a program running in a console or terminal window) the terminal is in so-called cooked mode. In this mode, some level of line editing facilities are provided, allowing the user to backspace and change what has been typed.
The characters that are typed are not returned to the program until after Return has been pressed.
To receive characters before Return is pressed, the terminal can be placed in 'raw' mode. Unfortunately this is not well standardised, so it is somewhat system specific. The answers to this question have some examples for various platforms.
#include <stdio.h>
// copy input to output
// my version
int main()
{
    int c;
    printf("\n\nUse CONTROL + D to terminate this program\n\n");
    while ((c = getchar()) != EOF) {
        putchar(c);
    }
    if ((c = getchar()) == EOF) {
        printf("\n\nProgram TERMINATED\n\n");
    }
    return 0;
}
When I enter control + D, the body of the if statement runs. That's what I had wanted, but as I analyzed the code more thoroughly, shouldn't it ask for my input again since the if's condition is (c = getchar()) == EOF?
When you hit ^D, input to the program is closed, so getchar() will subsequently always return EOF.
Control-D is the canonical mode end-of-file character. When entered at the beginning of a line it causes an EOF condition to be seen by the process, that is, the read returns 0. However, if Control-D is entered somewhere other than the beginning of the line, it just causes the read to return immediately with what has been input thus far.
If you hit Control-D twice in a row you should see what I think you are asking about.
EDIT
Here is a pretty good explanation.
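^D does not terminate the program by itself. Typed at the start of a line, it ends the input, so getchar() returns EOF and the program runs on to its end.
Many interactive programs, such as the Python REPL, also offer an explicit command like 'exit()'.
If you want, try using 'q' for quitting: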
#include <stdio.h>
int main() {
    char read = ' ';
    while ((read = getchar()) != '\n') {
        putchar(read);
    }
    return 0;
}
My input is f (followed by an enter, of course). I expect getchar() to ask for input again, but instead the program is terminated. How come? How can I fix this?
The Terminal can sometimes be a little bit confusing. You should change your program to:
#include <stdio.h>
int main() {
    int read;
    while ((read = getchar()) != EOF) {
        putchar(read);
    }
    return 0;
}
This will read until getchar reads EOF (most of the time this macro expands to -1) from the terminal. getchar returns an int, so you should make your variable 'read' an int so that you can check for EOF. On Linux you can send EOF from your terminal with ^D, and on Windows with ^Z followed by Enter.
To explain a little bit what happens. In your program the expression
(read = getchar()) !='\n'
will be true as long as no '\n' is read from the buffer. The problem is, to get the buffer to your program, you have to hit enter which corresponds to '\n'.
The following steps happen when your program is invoked in the terminal:
~$ ./a.out
This starts your program.
(empty line)
getchar() makes a system call to get input from the terminal, and the terminal takes over.
f
You typed a character. The 'f' is written into the buffer and echoed back on the terminal; your program has no idea about the character yet.
f
f~$
You hit Enter. Your buffer now contains 'f\n'. Enter also signals the terminal to return control to your program. Your program reads the buffer, finds the 'f' and puts it onto the screen, then finds the '\n', immediately stops the loop, and ends.
This would be standard behaviour of most terminals. You can change this behaviour, but that would depend on your OS.
getchar() returns the next character from the input stream. This of course includes newlines. You don't see progress in your loop until you press Enter because the terminal input behind stdin doesn't hand the input buffer to getchar() until it detects the '\n' at the end of the buffer. Your routine first blocks, then handles the two keystrokes in one rush, terminating, as you specified, when '\n' appears in the input stream. In short: getchar() will not remove the '\n' from the input stream (why should it?).
After f you press Enter, which is '\n', so the loop ends there.
If you want to enter more characters, type them one after the other on the same line; as soon as Enter is pressed the loop exits.
You've programmed it so the loop ends when you read a \n (enter), and you then return 0; from main which exits the program.
Perhaps you want something like
while ((read = getchar()) != EOF) {
    putchar(read);
}
On *nix terminals you can press Control-D, which tells the tty driver to return the input buffer to the app reading it. That's why ^D on a new line ends input: it causes the tty to return zero bytes, which the app interprets as end-of-file. Anywhere else on a line it hands over what has been typed so far, so a second ^D is needed to signal end-of-file.