#include <stdio.h>
int main() {
char read = ' ';
while ((read = getchar()) != '\n') {
putchar(read);
}
return 0;
}
My input is f (followed by an enter, of course). I expect getchar() to ask for input again, but instead the program is terminated. How come? How can I fix this?
The Terminal can sometimes be a little bit confusing. You should change your program to:
#include <stdio.h>
int main() {
int read;
while ((read = getchar()) != EOF) {
putchar(read);
}
return 0;
}
This will read until getchar returns EOF (a macro that expands to a negative int, most often -1) from the terminal. getchar returns an int, so you should declare your variable 'read' as an int so that you can check for EOF. You can send an EOF from your terminal on Linux with ^D, and on Windows with ^Z.
To explain a little of what happens: in your program, the expression
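To see why the variable should be an int, here is a minimal sketch. The two helper names are made up for illustration, and it assumes the common case where EOF is -1 and converting 0xFF to signed char gives -1 (typical, but not guaranteed by the C standard).

```c
#include <stdio.h>

/* Hypothetical helper: nonzero if a data byte, once stored in a signed char,
   compares equal to EOF (so a loop would stop too early). */
int byte_mistaken_for_eof(int byte)
{
    signed char c = (signed char)byte;  /* 0xFF typically becomes -1 */
    return c == EOF;
}

/* Hypothetical helper: nonzero if EOF, once stored in an unsigned char,
   still compares equal to EOF (it does not, so a loop would never stop). */
int eof_survives_unsigned_char(void)
{
    unsigned char c = (unsigned char)EOF;
    return c == EOF;                    /* c promotes to a non-negative int */
}
```

Whether plain char is signed or unsigned depends on the platform, so a char variable can fail in either of these two ways. An int avoids both.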
(read = getchar()) !='\n'
will be true as long as no '\n' is read from the buffer. The problem is, to get the buffer to your program, you have to hit enter which corresponds to '\n'.
The following steps happen when your program is invoked in the terminal:
~$ ./a.out
this starts your program
(empty line)
getchar() made a system call to get an input from the terminal and the terminal takes over
f
you made an input in the terminal. The 'f' is written into the buffer and echoed back on the terminal, your program has no idea about the character yet.
f
f~$
You hit enter. Your buffer now contains 'f\n'. The 'enter' also signals to the terminal that it should return to your program. Your program reads the buffer, finds the f and puts it onto the screen, then finds the '\n', immediately stops the loop and ends.
This would be standard behaviour of most terminals. You can change this behaviour, but that would depend on your OS.
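On POSIX systems, one way to change this is to switch the terminal out of canonical (line-buffered) mode with termios. The following is only a sketch under that assumption; the helper names disable_canonical and restore_terminal are made up for illustration.

```c
#include <termios.h>

/* Hypothetical helper: turn off canonical mode for the terminal behind fd,
   saving the previous settings in *saved. Returns 0 on success, -1 if fd is
   not a terminal or the settings could not be changed. POSIX only. */
int disable_canonical(int fd, struct termios *saved)
{
    struct termios raw;

    if (tcgetattr(fd, saved) == -1)
        return -1;                  /* e.g. input is redirected from a file */

    raw = *saved;
    raw.c_lflag &= ~(tcflag_t)ICANON;
    raw.c_cc[VMIN] = 1;             /* read() returns after a single byte */
    raw.c_cc[VTIME] = 0;            /* no inter-byte timeout */
    return tcsetattr(fd, TCSANOW, &raw);
}

/* Hypothetical helper: put the saved settings back, e.g. before exiting. */
int restore_terminal(int fd, const struct termios *saved)
{
    return tcsetattr(fd, TCSANOW, saved);
}
```

Called as disable_canonical(0, &saved) for standard input, getchar() should then see each key as it is typed. Note that with canonical mode off, Ctrl-D is no longer treated specially and arrives as an ordinary byte.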
getchar() returns the next character from the input stream. This of course includes newlines. The fact that you don't see progress in your loop unless you press 'Enter' is caused by the fact that your file I/O (working on stdin) doesn't hand over the input buffer to getchar() until it detects the '\n' at the end of the buffer. Your routine first blocks, then handles the two keystrokes in one rush, terminating, as you specified, when '\n' appears in the input stream. In short: getchar() will not remove the '\n' from the input stream (why should it?).
After f you press Enter, which is '\n', so the loop ends there.
If you want to enter more characters, type them one after the other on the same line; as soon as Enter is pressed, the loop exits.
You've programmed it so the loop ends when you read a \n (enter), and you then return 0; from main which exits the program.
Perhaps you want something like
while ((read = getchar()) != EOF) {
putchar(read);
}
On *nix terminals you can press Control-D, which tells the tty driver to return the input buffer to the app reading it. That's why ^D on a new line ends input: it causes the tty to return zero bytes, which the app interprets as end-of-file. But it also works anywhere on a line.
Related
I'm practicing using C & Unix by writing up some of the C programs in Vim and compiling them.
The word count program is supposed to end when the character read is EOF (CTRL-D). However, when I run it, the first CTRL-D pressed just makes it print "^D" (minus the quotes) on the terminal. The second time it's pressed, the "^D" goes away and it terminates normally.
How can I change this so that it terminates after only one CTRL-D? I notice that if I've just made a newline character, then pressing CTRL-D once does the trick. But I don't really understand why it works then and not in the general case.
Here's what the program looks like for those of you who don't have the book.
#include <stdio.h>
#define IN 1 /* Inside a word */
#define OUT 0 /* Outside a word */
int main()
{
int c, nl, nw, nc, state;
state = OUT;
nl = nw = nc = 0;
while ((c = getchar()) != EOF) {
++nc;
if (c == '\n') {
++nl;
--nc;
}
if (c == ' ' || c == '\n' || c == '\t')
state = OUT;
else if (state == OUT) {
state = IN;
nw++;
}
}
printf("%d %d %d\n", nl, nw, nc);
return 0;
}
Input in Unix-type systems is typically read from a terminal, most often in canonical mode, which means that it is partially interpreted by the terminal driver in order to implement control functions such as line editing. An example of such a control function is Control-U, which usually erases the current line. What is happening in your case is that the terminal driver interprets Control-D as a request to hand over whatever input is pending. All prior characters get sent to your program, but the Control-D itself is not delivered and no EOF is reported. When all characters have already been sent on, as is the case following a newline, Control-D hands over zero bytes, which the program sees as EOF. This EOF then successfully ends your loop.
For further information see prior answer:
ctrl-d didn't stop the while(getchar()!=EOF) loop
If you type Control-D at the start of the line, once should be enough. If you type Control-D after you've typed anything, then you need to type it twice.
That's because the Control-D tells the terminal driver to send all available characters to any waiting process. When it is at the start of a line, there are no available characters, so a process waiting on a read() gets zero characters returned, which is the meaning of EOF (no characters available for reading). When it is part way through a line, the first Control-D sends the characters to the program, which reads them; the second indicates no more characters and hence EOF once more.
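The "zero characters returned means EOF" rule can be seen directly with read(). This is a minimal POSIX sketch; the helper name count_bytes_until_eof is made up for illustration.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: count bytes from fd until read() reports end of
   input. Returns the byte count, or -1 on a read error. */
long count_bytes_until_eof(int fd)
{
    char buf[256];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;     /* Enter or a mid-line Control-D delivered data */
        } else if (n == 0) {
            return total;   /* zero bytes: this is what EOF means */
        } else if (errno != EINTR) {
            return -1;      /* a real error */
        }                   /* interrupted by a signal: just retry */
    }
}
```

With count_bytes_until_eof(0) on a terminal, a Control-D typed after "abc" makes read() return 3, and only a second Control-D (or one at the start of a line) makes it return 0.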
Following "The C Programming Language" by Kernighan and Ritchie, I am trying to enter the program described on page 18 (see below).
The only changes I made were to add "int" before "main" and "return 0;" before closing the brackets.
When I run the program in Terminal (Mac OS 10.15) I am prompted to enter an input. After I enter the input I am prompted to enter an input again - the "printf" line is apparently never reached and so the number of characters is never displayed.
Can anyone help me with the reason why EOF is never reached letting the while loop exit? I read some other answers suggesting CTRL + D or CTRL + Z, but I thought this shouldn't require extra input. (I was able to get the loop to exit with CTRL + D).
I have also pasted my code and the terminal window below.
#include <stdio.h>
int main(){
long nc;
nc = 0;
while( getchar() != EOF )
++nc;
printf("%ld\n", nc);
return 0;
}
From pg. 18 of "The C Programming Language"
You already have the correct answer: when entering data at the terminal, Ctrl-D is the proper way to indicate "I'm done" to the terminal driver so that it sends an EOF condition to your program (Ctrl-Z on Windows). Ctrl-C breaks out of your program early.
If you ran this program with a redirect from an actual file, it would properly count the characters in the file.
EOF means end of file; newlines are not ends of files. You need to press CTRL+D to give the terminal an EOF signal, that's why you're never exiting your while loop.
If you were to give a file as input instead of through the command line, then you would not need to press CTRL+D
Adding to the two good answers I would stress that EOF does not naturally occur in stdin like in other files, a signal from the user must be sent, as you already stated in your question.
Think about it for a second: your input is a number of characters and at the end you press Enter, so the last character present in stdin is a newline character, not EOF. For it to work, EOF would have to be input, and that is precisely what Ctrl+D on Linux/Mac or Ctrl+Z on Windows does.
As @DavidC.Rankin correctly pointed out, EOF can also occur on stdin through bash piping, e.g. echo "count this" | ./count, or redirecting, e.g. ./count < somefile, where somefile would be a text file with the contents you want to pass to stdin.
By the way Ctrl+C just ends the program, whereas Ctrl+D ends the loop and continues the program execution.
For a single line input from the command line you can use something like:
int c = 0;
while((c = getchar()) != EOF && c != '\n'){
++nc;
}
I'm making a simple program in C that reads an input. It then displays the number of characters used.
What I tried first:
#include <stdio.h>
int main(int argc, char** argv) {
int currentChar;
int charCount = 0;
while((currentChar = getchar()) != EOF) {
charCount++;
}
printf("Display char count? [y/n]");
int response = getchar();
if(response == 'y' || response == 'Y')
printf("Count: %d\n",charCount);
}
What happened:
I would enter some lines and end it with ^D (I'm on Mac). The program would not wait at int response = getchar();. I found online that this is because there is still content left in the input stream.
My first question is what content would that be? I don't enter anything after pressing ^D to input EOF and when I tried to print anything left in the stream, it would print a ?.
What I tried next:
Assuming there were characters left in the input stream, I made a function to clear the input buffer:
void clearInputBuffer() {
while(getchar() != '\n') {};
}
I called the function right after the while loop:
while((currentChar = getchar()) != EOF) {
charCount++;
}
clearInputBuffer();
Now I would assume if there is anything left after pressing ^D, it would be cleared up to the next \n.
But instead, I can't stop the input request. When I press ^D, rather than sending EOF to currentChar, a ^D is shown on the terminal.
I know there is probably a solution to this online, but since I'm not sure what exactly my problem is, I don't really know what to look for.
Why is this happening? Can someone also explain exactly what is going on behind the scenes of this program and the Terminal?
man 3 termios - search for VEOF. That will tell you what it actually does.
If you need more explanation, I'll start by saying the ISO C stdin stream has a default buffer, so any bytes read are stored into that buffer unless this behavior is somehow overridden (e.g. setvbuf).
The getchar function will read from this default buffer unless the buffer has no characters in it left to read. In that case, it will call the read function to actually store new data into that buffer and return the number of bytes read.
However, your terminal has its own input buffer. It will wait for an input sequence recognized as an end-of-line (EOL) delimiter. This is where things get interesting. If ICANON is enabled, and you use Ctrl+D with bytes in the terminal's input buffer already, then you effectively will send all of those pending bytes to the program, as if you had entered an end-of-line delimiter. The read function will receive those bytes and store them in the input buffer used for stdin, resulting in getchar returning an appropriate value.
If Ctrl+D is pressed with no pending bytes in the terminal's input buffer, no data will be sent, read will return 0, and EOF gets returned by getchar after getchar sets the end-of-file indicator for the stdin stream.
Given the two behaviors of Ctrl+D, it follows that pressing it twice will send all pending bytes on the first key press, effectively emptying the terminal's input buffer, followed by the second key press sending 0 bytes to read, which means getchar returns EOF and the end-of-file indicator for stdin is set.
If an error occurs (e.g. stdin was closed), read itself will return -1, and getchar will return EOF after setting the error indicator for the stdin stream. The following may help to illustrate the idea of how it works, though there's likely a lot more going on behind the scenes with the TTY itself than just waiting for an EOL or VEOF and sending data after either one is detected:
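As a rough illustration, here is a toy model of that line buffer in plain C. It is not how any real TTY driver is written; the struct tty_sim, the function tty_sim_key and the constant SIM_VEOF are all made up for this sketch.

```c
#include <stddef.h>
#include <string.h>

#define SIM_VEOF 0x04   /* Ctrl+D, the usual VEOF character */

/* Toy stand-in for the terminal's input buffer (not real kernel code). */
struct tty_sim {
    char   pending[256];
    size_t len;
};

/* Feed one keystroke. Returns -1 while the reader is still blocked;
   otherwise returns the number of bytes copied to out, where 0 plays the
   role of read() returning 0, i.e. EOF. Bytes that do not fit in out are
   simply dropped in this toy version. */
int tty_sim_key(struct tty_sim *t, int key, char *out, size_t outsz)
{
    int n;

    if (key != SIM_VEOF) {
        if (t->len < sizeof t->pending)
            t->pending[t->len++] = (char)key;
        if (key != '\n')
            return -1;      /* keep buffering until end of line */
    }
    /* Newline or VEOF: hand over whatever is pending (possibly nothing). */
    n = (int)(t->len < outsz ? t->len : outsz);
    memcpy(out, t->pending, (size_t)n);
    t->len = 0;
    return n;
}
```

Feeding 'h', 'i', then SIM_VEOF hands over two bytes; a second SIM_VEOF right after hands over zero bytes, which is the EOF case described above.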
Of course, if ICANON isn't set on the controlling terminal, then you will never receive EOF unless your input is not from a terminal because suddenly certain special key sequences like Ctrl+D won't be recognized as special key sequences since the feature is turned off.
For a bit more completeness, please note that the ICANON bit and termios stuff in general do not necessarily apply much on Windows. The Windows Command Prompt uses Ctrl+Z for one thing, and the Windows operating system has no concept of terminals other than things like the _isatty C runtime function that is used to detect whether a file descriptor points to a file description that involves a console handle.
Pressing Ctrl+Z with data pending will effectively cancel any remaining input that follows it, though an end-of-line character (Ctrl+M or Enter) still needs to be pressed for the data to be sent unless processed input was disabled by using the SetConsoleMode Windows API function.
If pressed with no input data pending and sent by entering an end-of-line character, it acts as EOF. For example, hello^Z1234^M results in hello^Z being read, and everything including the ^M end-of-line character is ignored. ^Z1234^M or just ^Z^M will trigger EOF.
Operating systems are weird.
Ctrl+D is a bit weird on Unix: it's not actually an EOF character. Rather, it tells the terminal driver to hand any pending input to the program that is reading; if nothing is pending, the read returns zero bytes, which is treated as EOF. As a result, the behavior can be somewhat unintuitive. Two Ctrl+Ds in a row, or a Return followed by a Ctrl+D, will give you the behavior you're looking for. I tested it with this code:
#include <stdio.h>
int main(void) {
size_t charcount = 0;
while (getchar() != EOF)
charcount++;
printf("Characters: %zu\n", charcount);
return 0;
}
Edited to include chux's format character suggestion.
You can do it (also) this way:
fseek(stdin,0,SEEK_END);
This works fine for me.
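Since fseek on a terminal may well fail on some systems, another option that might be more portable is the standard clearerr function, which resets the end-of-file indicator on the stream. A minimal sketch follows; the helper names count_then_reset and discard_line are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: count characters until EOF, then clear the stream's
   end-of-file and error indicators so later reads can try again. */
long count_then_reset(FILE *in)
{
    long count = 0;

    while (getc(in) != EOF)
        count++;
    clearerr(in);
    return count;
}

/* Hypothetical helper: discard the rest of the current line. Unlike a loop
   that only tests for '\n', this also stops at EOF. */
void discard_line(FILE *in)
{
    int c;

    while ((c = getc(in)) != EOF && c != '\n')
        ;
}
```

In the question's program, calling count_then_reset(stdin) in place of the counting loop should let the later getchar() wait for the y/n answer. Testing for EOF in discard_line is what keeps it from spinning forever once input has ended.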
Code:
#include <stdio.h>
#define NEWLINE '\n'
#define SPACE ' '
int main(void)
{
int ch;
int count = 0;
while((ch = getchar()) != EOF)
{
if(ch != NEWLINE && ch != SPACE)
count++;
}
printf("There are %d characters input\n" , count);
return 0;
}
Question:
Everything works just fine. The program ignores spaces and newlines and prints the number of characters input when I hit the EOF simulation, which is ^z. (In this program I treat commas, exclamation marks, digits and any printable special symbol such as the ampersand as characters too.)
But there's something wrong when I input this line to the program. For example I input this: abcdefg^z, which means I input some character before and on the same line as ^z. Instead of terminating the program and print out total characters, the program would continue to ask for input.
The EOF terminating character input only works when I specify ^z on a single line or by doing this: ^zabvcjdjsjsj. Why is this happening?
This is true in almost every terminal driver. You'll get the same behavior using Linux.
Your program isn't actually executing the loop until \n or ^z has been entered by you at the end of a line. The terminal driver is buffering the input and it hasn't been sent to your process until that occurs.
At the end of a line, hitting ^z (or ^d on Linux) does not cause the terminal driver to send EOF. It only makes it flush the buffer to your process (with no \n).
Hitting ^z (or ^d on Linux) at the start of a line is interpreted by the terminal as "I want to signal EOF".
You can observe this behavior if you add the following inside your loop:
printf("%d\n",ch);
Run your program:
$ ./test
abc <- type "abc" and hit "enter"
97
98
99
10
abc97 <- type "abc" and hit "^z"
98
99
To better understand this, you have to realize that EOF is not a character. ^z is a user command for the terminal itself. Because the terminal is responsible for taking user input and passing it to processes, this gets tricky and thus the confusion.
A way to see this is by hitting ^v then hitting ^z as input to your program.
^v is another terminal command that tells the terminal, "Hey, the next thing I type - don't interpret that as a terminal command; pass it to the process' input instead".
^Z is only translated by the console to an EOF signal to the program when it is typed at the start of a line. That's just the way that the Windows console works. There is no "workaround" to this behaviour that I know of.