Issues with standard input in C

I'm making a simple program in C that reads an input. It then displays the number of characters used.
What I tried first:
#include <stdio.h>
int main(int argc, char** argv) {
int currentChar;
int charCount = 0;
while((currentChar = getchar()) != EOF) {
charCount++;
}
printf("Display char count? [y/n]");
int response = getchar();
if(response == 'y' || response == 'Y')
printf("Count: %d\n",charCount);
}
What happened:
I would enter some lines and end it with ^D (I'm on Mac). The program would not wait at int response = getchar();. I found online that this is because there is still content left in the input stream.
My first question is what content would that be? I don't enter anything after pressing ^D to input EOF and when I tried to print anything left in the stream, it would print a ?.
What I tried next:
Assuming there were characters left in the input stream, I made a function to clear the input buffer:
void clearInputBuffer() {
while(getchar() != '\n') {};
}
I called the function right after the while loop:
while((currentChar = getchar()) != EOF) {
charCount++;
}
clearInputBuffer();
Now I would assume if there is anything left after pressing ^D, it would be cleared up to the next \n.
But instead, I can't stop the input request. When I press ^D, rather than sending EOF to currentChar, a ^D is shown on the terminal.
I know there is a probably a solution to this online, but since I'm not sure what exactly my problem is, I don't really know what to look for.
Why is this happening? Can someone also explain exactly what is going on behind the scenes of this program and the Terminal?

man 3 termios - search for VEOF. That will tell you what it actually does.
If you need more explanation, I'll start by saying the ISO C stdin stream has a default buffer, so any bytes read are stored into that buffer unless this behavior is somehow overridden (e.g. setvbuf).
The getchar function will read from this default buffer unless the buffer has no characters in it left to read. In that case, it will call the read function to actually store new data into that buffer and return the number of bytes read.
However, your terminal has its own input buffer. It will wait for an input sequence recognized as an end-of-line (EOL) delimiter. This is where things get interesting. If ICANON is enabled, and you use Ctrl+D with bytes in the terminal's input buffer already, then you effectively will send all of those pending bytes to the program, as if you had entered an end-of-line delimiter. The read function will receive those bytes and store them in the input buffer used for stdin, resulting in getchar returning an appropriate value.
If Ctrl+D is pressed with no pending bytes in the terminal's input buffer, no data will be sent, read will return 0, and EOF gets returned by getchar after getchar sets the end-of-file indicator for the stdin stream.
Given the two behaviors of Ctrl+D, it follows that pressing it twice will send all pending bytes on the first key press, effectively emptying the terminal's input buffer, followed by the second key press sending 0 bytes to read, which means getchar returns EOF and the end-of-file indicator for stdin is set.
If an error occurs (e.g. stdin was closed), read itself will return -1, and getchar will return EOF after setting the error indicator for the stdin stream. There's likely a lot more going on behind the scenes with the TTY itself than just waiting for an EOL or VEOF and sending data after either one is detected.
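As a rough sketch of the mechanism described above, here is a toy reader that refills a private buffer with read and maps its return value onto EOF. The names toy_stream and toy_getc are hypothetical, and a real C library's getchar does considerably more, so treat this as an illustration of the three cases (data, 0 bytes, -1) rather than as how any particular libc is written.

```c
#include <stddef.h>
#include <stdio.h>      /* EOF */
#include <sys/types.h>  /* ssize_t */
#include <unistd.h>     /* read */

/* Hypothetical stand-in for a FILE: a descriptor plus a small buffer. */
struct toy_stream {
    int fd;
    unsigned char buf[64];
    size_t pos;   /* index of the next unread byte in buf */
    size_t len;   /* number of valid bytes in buf */
    int eof;      /* end-of-file indicator */
    int err;      /* error indicator */
};

/* Return the next byte as a non-negative int, or EOF. */
int toy_getc(struct toy_stream *s)
{
    if (s->pos == s->len) {            /* buffer drained: ask for more */
        ssize_t n = read(s->fd, s->buf, sizeof s->buf);
        if (n == 0) {                  /* zero bytes delivered: end of file */
            s->eof = 1;
            return EOF;
        }
        if (n < 0) {                   /* read itself failed */
            s->err = 1;
            return EOF;
        }
        s->pos = 0;
        s->len = (size_t)n;
    }
    return s->buf[s->pos++];
}
```

With a terminal in canonical mode on descriptor 0, a Ctrl+D typed after some text would make read return those bytes, and a Ctrl+D typed with nothing pending would make it return 0, which is the branch that sets the eof flag here.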
Of course, if ICANON isn't set on the controlling terminal, then you will never receive EOF unless your input is not from a terminal because suddenly certain special key sequences like Ctrl+D won't be recognized as special key sequences since the feature is turned off.
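To see which of these situations applies, a program can ask the terminal for its current settings. The helper below is a minimal sketch; the name is_canonical is made up for this example, and it only inspects the ICANON bit returned by tcgetattr.

```c
#include <termios.h>

/* Hypothetical helper: returns 1 if fd is a terminal in canonical mode,
   0 if it is a terminal with ICANON turned off, and -1 if tcgetattr
   fails (for example because fd is a pipe, a file, or not open). */
int is_canonical(int fd)
{
    struct termios t;
    if (tcgetattr(fd, &t) != 0)
        return -1;
    return (t.c_lflag & ICANON) ? 1 : 0;
}
```

Calling is_canonical(0) checks standard input; a result of 0 would mean Ctrl+D is delivered as an ordinary byte instead of ending the input.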
For a bit more completeness, please note that the ICANON bit and termios stuff in general do not necessarily apply much on Windows. The Windows Command Prompt uses Ctrl+Z for one thing, and the Windows operating system has no concept of terminals other than things like the _isatty C runtime function that is used to detect whether a file descriptor points to a file description that involves a console handle.
Pressing Ctrl+Z with data pending will effectively cancel any remaining input that follows it, though an end-of-line character (Ctrl+M or Enter) still needs to be pressed for the data to be sent unless processed input was disabled by using the SetConsoleMode Windows API function.
If pressed with no input data pending and sent by entering an end-of-line character, it acts as EOF. For example, hello^Z1234^M results in hello^Z being read, and everything including the ^M end-of-line character is ignored. ^Z1234^M or just ^Z^M will trigger EOF.
Operating systems are weird.
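Applied to the program in the question, a likely explanation is that the first loop only ends once the end-of-file indicator for stdin is set, and on implementations that honor that indicator the next getchar returns EOF again immediately instead of waiting (the ? that was printed was probably that EOF value shown as a character). A minimal sketch of one way around it is to call clearerr before reading the reply. The function names count_chars and ask_yes are invented for this example, and both take a FILE * so they are not tied to stdin.

```c
#include <stdio.h>

/* Count the bytes available on `in` until getc reports EOF. */
long count_chars(FILE *in)
{
    long n = 0;
    while (getc(in) != EOF)
        n++;
    return n;
}

/* Read a one-character reply after an earlier EOF.
   clearerr() resets the end-of-file and error indicators, so the
   stream asks its source for data again instead of repeating EOF.
   Returns 1 for 'y' or 'Y', 0 for anything else, including EOF. */
int ask_yes(FILE *in)
{
    clearerr(in);
    int response = getc(in);
    return response == 'y' || response == 'Y';
}
```

In the original main this would read roughly as: charCount = count_chars(stdin); print the prompt; then test ask_yes(stdin). Note that the discard loop from the question, while(getchar() != '\n'), can never finish once getchar keeps returning EOF, which is why any such loop should also stop on EOF.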

Ctrl+D is a bit weird on Unix -- it's not actually an EOF character. Rather, it tells the terminal driver to pass whatever has been typed so far to the program right away; when nothing is pending, the program's read returns zero bytes, which is treated as end-of-file. As a result, the behavior can be somewhat unintuitive. Two Ctrl+Ds in a row, or a Return followed by a Ctrl+D, will give you the behavior you're looking for. I tested it with this code:
#include <stdio.h>
int main(void) {
size_t charcount = 0;
while (getchar() != EOF)
charcount++;
printf("Characters: %zu\n", charcount);
return 0;
}
Edited to include chux's format character suggestion.

You can do it (also) this way:
fseek(stdin,0,SEEK_END);
This works fine for me.

Related

when control+d is pressed fgets do not stop reading

When you read from stdin using getchar, fgets or some similar function, if you type some text and then put an eof (control+d in linux) you cannot delete the previous text. For example, if I type 'program' and then enter eof by pressing control+d, I can't delete what I typed before, i.e. program.
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
int main() {
char buffer[1024] = "";
printf("> ");
if(fgets(buffer,sizeof(buffer),stdin) == NULL){
puts("eof");
}
else{
puts(buffer);
}
return 0;
}
How can this be avoided?
The readline function of The GNU Readline Library I think is my best option to do the job. It's pretty simple to use but it uses dynamic memory to host the string so you have to use the free function to free up the memory. You can find more information by opening a terminal and typing 'man readline'.
The code would look like this:
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
#include <readline/readline.h>
int main() {
char *ptr = readline("> ");
if(!ptr){
puts("eof");
}
else{
puts(ptr);
}
free(ptr);
return 0;
}
To be able to use readline with gcc you must pass it -lreadline.
When fgets reads a line, it reads characters from the specified stream until it encounters a '\n' or end-of-file, has read the specified maximum number of characters, or a read error occurs. It does not see what you are doing on your keyboard at all. It only sees the stream; it is the terminal that sends the data to the stream.
What's happening when you are editing the input has absolutely nothing to do with fgets. That's the terminal's job.
As Eric Postpischil wrote in the comments:
Pressing control-D in Linux does not signal EOF. It actually means “Complete the current read operation.” At that point, if characters have been typed, they are immediately sent to the program, whereas the system would usually wait until Enter is pressed. If no characters have been typed, the read operation completes with zero characters read, which some I/O routines treat as an EOF, and that is why programs may seem to receive an EOF when control-D is pressed at the start of a line of input. Since the data is sent to the program, of course there is no way to undo it—it has already been sent.
I guess there is some way to alter the behavior of pressing C-d, but then you need to decide what it should do instead. If you want it to do "nothing" instead of sending the data to stdin, I cannot really see what you would have gained. The only use case I can see for this is if, for some reason, you keep pressing C-d by accident.
One thing you could do is to take complete control of every keystroke. Then you would have to write code to move the cursor every time the user presses a key, and also write code to remove characters when the user is pressing backspace. You can use a library like ncurses for this.
It can't be avoided. Simply put, Ctrl+D ends the current read operation.
If you want to ignore this, make your own fgets based on fgetc and have it ignore end-of-file.
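A minimal sketch of that idea follows. The function name fgets_ignore_eof and the max_eof_retries parameter are assumptions made for this example: the retry budget is there so the loop cannot spin forever when the input is a file or pipe that really has ended. It does not let the user erase text that a mid-line Ctrl+D has already delivered; it only keeps a Ctrl+D pressed with nothing pending from ending the line.

```c
#include <stdio.h>

/* Hypothetical fgets variant built on fgetc. An end-of-file condition
   (not a read error) is cleared and retried up to max_eof_retries
   times. Returns buf, or NULL if no character was stored. */
char *fgets_ignore_eof(char *buf, size_t size, FILE *in, int max_eof_retries)
{
    size_t i = 0;
    if (size == 0)
        return NULL;
    while (i + 1 < size) {
        int c = fgetc(in);
        if (c == EOF) {
            if (ferror(in) || max_eof_retries-- <= 0)
                break;          /* real error, or retry budget spent */
            clearerr(in);       /* forget the EOF and try again */
            continue;
        }
        buf[i++] = (char)c;
        if (c == '\n')          /* a full line has been read */
            break;
    }
    buf[i] = '\0';
    return i > 0 ? buf : NULL;
}
```

On an interactive terminal a large retry budget behaves like "ignore Ctrl+D"; for redirected input a budget of 0 makes it behave much like plain fgets.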

getchar() continues to accept input after including Ctrl+Z in same line

Simple c program to accept and print the character.
int c;
while((c=getchar())!=EOF)
{
putchar(c);
}
I don't understand why it keeps accepting input when I press Ctrl+Z at the end of a line, e.g.
Hello(press Ctrl+Z)
hello (some symbol)
But it works properly if I press Ctrl+Z on a new line.
I am using Windows 7.
When you call getchar() it in turn ends up making a read() system call. The read() will block until it has some characters available. The terminal driver only makes characters available when you press the return key or the key signifying end of input. At least that is what it looks like. In reality, it's more involved.
On the assumption that by ctrl-Z you mean whatever keystroke combination means "end of input", the reason is the way that the read() system call works. The ctrl-D (it's ctrl-D on most Unixes) character doesn't mean "end of input", it means "send the current pending input to the application".
When you've typed something in before pressing ctrl-D, that input gets sent to the application which is probably blocked on the read() system call. read() fills a buffer with the input and returns the number of bytes it put in the buffer.
When you press ctrl-D without any input pending (i.e. the last thing you did was hit return or ctrl-D), the same thing happens but there are no characters, so read() returns 0. read() returning 0 is the convention for end of input. When getchar() sees this, it returns EOF to the calling program.
This answer on Stack Exchange puts it a bit more clearly:
https://unix.stackexchange.com/a/177662/6555
You have not said what system you are working on, [U|Li]nix or Windows. This answer is Windows specific. For [Li|U]nix, replace references to ctrl-z with ctrl-d.
While using a terminal, Ctrl-z will not produce an EOF (-1) (see good answers from Haccks & JeremyP for detailed whys), so the loop will not exit the way you have it written. However, you can put a test for ctrl-z in your while loop condition to exit...
#include <stdio.h>
int main (void)
{
    int c;
    puts ("Enter text. ctrl-z to exit:");
    /* 26 is the ASCII value for ctrl-z; also stop on a real EOF */
    while((c = getchar()) != EOF && c != 26)
    {
        putchar(c);
    }
    return 0;
}
By the way, here is a table showing the values for ASCII control characters.
I found the answer on wiki:
In Microsoft's DOS and Windows (and in CP/M and many DEC operating systems), reading from the terminal will never produce an EOF. Instead, programs recognize that the source is a terminal (or other "character device") and interpret a given reserved character or sequence as an end-of-file indicator; most commonly this is an ASCII Control-Z, code 26.
#include <stdio.h>
int main(void)
{
    int c;
    /* stop on Ctrl-Z (ASCII 26) or on a real end of file */
    while((c=getchar())!=26 && c!=EOF)
    {
        putchar(c);
    }
    return 0;
}
You can use the ASCII value of Ctrl-Z. Now it won't take input after Ctrl-Z is pressed.
The getchar() function reads a single character. On Unix/Linux, pressing Ctrl+Z sends the SIGTSTP signal to your process, which suspends (stops) it rather than terminating it.

C - how to handle user input in a while loop

I'm new to C and I have a simple program that takes some user input inside a while loop, and quits if the user presses 'q':
while(1)
{
printf("Please enter a choice: \n1)quit\n2)Something");
*choice = getc(stdin);
// Actions.
if (*choice == 'q') break;
if (*choice == '2') printf("Hi\n");
}
When I run this and hit 'q', the program does quit correctly. However if I press '2' the program first prints out "Hi" (as it should) but then goes on to print the prompt "Please choose an option" twice. If I enter N characters and press enter, the prompt gets printed N times.
This same behaviour happens when I use fgets() with a limit of 2.
How do I get this loop working properly? It should only take the first character of input and then do something once according to what was entered.
EDIT
So using fgets() with a larger buffer works, and stops the repeated prompt issue:
fgets(choice, 80, stdin);
This kind of helped: How to clear input buffer in C?
When you getc the input, it's important to note that the user has put in more than one character: at the very least, the stdin contains 2 chars:
2\n
when getc gets the "2" the user has put in, the trailing \n character is still in the buffer, so you'll have to clear it. The simplest way here to do so would be to add this:
if (*choice == '2')
puts("Hi");
while (*choice != '\n' && *choice != EOF)//EOF just in case
*choice = getc(stdin);
That should fix it
For completeness:
Note that getc returns an int, not a char. Make sure to compile with -Wall -pedantic flags, and always check the return type of the functions you use.
It is tempting to clear the input buffer using fflush(stdin);, and on some systems, this will work. However: This behavior is undefined: the standard clearly states that fflush is meant to be used on update/output buffers, not input buffers:
C11 7.21.5.2 The fflush function, fflush works only with output/update stream, not input stream
However, some implementations (for example Microsoft) do support fflush(stdin); as an extension. Relying on it, though, goes against the philosophy behind C. C was meant to be portable, and by sticking to the standard, you are assured your code is portable. Relying on a specific extension takes away this advantage.
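A portable alternative to fflush(stdin) is to read and discard the rest of the line yourself. The helper below is a small sketch of that approach; read_choice is a name invented for this example, and it keeps the result of getc in an int so that EOF remains distinguishable from every valid character.

```c
#include <stdio.h>

/* Hypothetical helper: return the first character of the next input
   line and discard the rest of that line, newline included.
   Returns EOF if the stream ends before any character is read. */
int read_choice(FILE *in)
{
    int first = getc(in);           /* int, so EOF stays distinguishable */
    int c = first;
    while (c != '\n' && c != EOF)   /* skip the remainder of the line */
        c = getc(in);
    return first;
}
```

In the loop from the question this would replace the getc call, for example int choice = read_choice(stdin); followed by the tests for 'q' and '2', with an extra check for EOF so the loop ends when input runs out.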
What seems to be a very simple problem is actually pretty complicated. The root of the problem is that terminals operate in two different modes: raw and cooked. Cooked mode, which is the default, means that the terminal does not read characters, it reads lines. So, your program never receives any input at all unless a whole line is entered (or an end of file character is received). The way the terminal recognizes an end of line is by receiving a newline character (0x0A) which can be caused by pressing the Enter key. To make it even more confusing, on a Windows machine pressing Enter causes TWO characters to be generated, (0x0D and 0x0A).
So, your basic problem is that you want a single-character interface, but your terminal is operating in a line-oriented (cooked) mode.
The correct solution is to switch the terminal to raw mode so your program can receive characters as the user types them. Also, I would recommend the use of getchar() rather than getc() in this usage. The difference is that getc() takes a FILE * stream as an argument, so it can read from any stream. The getchar() function only reads from standard input, which is what you want. Therefore, it is a more specific choice. After your program is done it should switch the terminal back to the way it was, so it needs to save the current terminal state before modifying it.
Also, you should handle end of input. In cooked mode, CTRL-D at the start of a line makes getchar() return EOF; once ICANON is turned off, CTRL-D is no longer interpreted and arrives as the ordinary byte 0x04 (EOT), so the program has to check for both.
Here is the complete program that does these things:
#include <stdio.h>
#include <termios.h>

void interact(void);
void set_terminal_raw(void);
int tty_mode(int operation);

int main(void){
    tty_mode(0); /* save current terminal mode */
    set_terminal_raw(); /* set -icanon, -echo */
    interact(); /* interact with user */
    tty_mode(1); /* restore terminal to the way it was */
    return 0; /* 0 means the program exited normally */
}
void interact(void){
    while(1){
        printf( "\nPlease enter a choice: \n1)quit\n2)Something\n" );
        switch( getchar() ){
            case 'q': return;
            case '2':
                printf( "Hi\n" );
                break;
            case 0x04: /* CTRL-D arrives as a plain byte when ICANON is off */
            case EOF: return;
        }
    }
}
/* put file descriptor 0 into chr-by-chr mode and noecho mode */
void set_terminal_raw(void){
    struct termios ttystate;
    tcgetattr( 0, &ttystate); /* read current setting */
    ttystate.c_lflag &= ~ICANON; /* no buffering */
    ttystate.c_lflag &= ~ECHO; /* no echo either */
    ttystate.c_cc[VMIN] = 1; /* get 1 char at a time */
    tcsetattr( 0 , TCSANOW, &ttystate); /* install settings */
}
/* 0 => save current mode 1 => restore mode */
int tty_mode( int operation ){
    static struct termios original_mode;
    if ( operation == 0 )
        return tcgetattr( 0, &original_mode );
    else
        return tcsetattr( 0, TCSANOW, &original_mode );
}
As you can see, what seems to be a pretty simple problem is quite tricky to do properly.
A book I can highly recommend to navigate these matters is "Understanding Unix/Linux Programming" by Bruce Molay. Chapter 6 explains all the things above in detail.
The reason why this is happening is because stdin is buffered.
When you get to the line of code *choice = getc(stdin); no matter how many characters you type, getc(stdin) will only retrieve the first character. So if you type "foo" it will retrieve 'f' and set *choice to 'f'. The characters "oo" are still in the input buffer. Moreover, the newline character that resulted from you striking the return key is also in the input buffer. Therefore, since the buffer isn't empty, the next time the loop executes, rather than waiting for you to enter something, getc(stdin); will immediately return the next character in the buffer. The function getc(stdin) will continue to immediately return the next character in the buffer until the buffer is empty. Therefore, in general it will prompt you N times when you enter a string of length N.
You can get around this by flushing the buffer with fflush(stdin); immediately after the line *choice = getc(stdin);
EDIT: Apparently someone else is saying not to use fflush(stdin); Go with what he says.

Confusion about how a getchar() loop works internally

I've included an example program using getchar() below, for reference (not that anyone probably needs it), and feel free to address concerns with it if you desire. But my question is:
What exactly is going on when the program calls getchar()?
Here is my understanding (please clarify or correct me):
When getchar is called, it checks the STDIN buffer to see if there is any input.
If there isn't any input, getchar sleeps.
Upon wake, getchar checks to see if there is any input, and if not, puts itself to sleep again.
Steps 2 and 3 repeat until there is input.
Once there is input (which by convention includes an 'EOF' at the end), getchar returns the first character of this input and does something to indicate that the next call to getchar should return the second letter from the same buffer? I'm not really sure what that is.
When there are no more characters left other than EOF, does getchar flush the buffer?
The terms I used are probably not quite correct.
#include <stdio.h>
int getLine(char buffer[], int maxChars);
#define MAX_LINE_LENGTH 80
int main(void){
char line[MAX_LINE_LENGTH];
int errorCode;
errorCode = getLine(line, sizeof(line));
if(errorCode == 1)
printf("Input exceeded maximum line length of %d characters.\n", MAX_LINE_LENGTH);
printf("%s\n", line);
return 0;
}
int getLine(char buffer[], int maxChars){
int c, i = 0;
while((c = getchar()) != EOF && c != '\n' && i < maxChars - 1)
buffer[i++] = c;
buffer[i++] = '\0';
if(i == maxChars)
return 1;
else
return 0;
}
Step 2-4 are slightly off.
If there is no input in the standard I/O buffer, getchar() calls a function to reload the buffer. On a Unix-like system, that normally ends up calling the read() system call, and the read() system call puts the process to sleep until there is input to be processed, or the kernel knows there will be no input to be processed (EOF). When the read returns, the code adjusts the data structures so that getchar() knows how much data is available. Your description implies polling; the standard I/O system does not poll for input.
Step 5 uses the adjusted pointers to return the correct values.
There really isn't an EOF character; it is a state, not a character. Even though you type Control-D or Control-Z to indicate 'EOF', that character is not inserted into the input stream. In fact, those characters cause the system to flush any typed characters that are still waiting for 'line editing' operations (like backspace) to change them so that they are made available to the read() system call. If there are no such characters, then read() returns 0 as the number of available characters, which means EOF. Then getchar() returns the value EOF (usually -1 but guaranteed to be negative whereas valid characters are guaranteed to be non-negative (zero or positive)).
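Because EOF is a negative value outside the range of valid characters, the result of getchar or getc has to be kept in an int. The sketch below contrasts a correct loop with a deliberately broken one; both function names are invented for this example, and the broken variant assumes an implementation where EOF is -1 and where converting 255 to signed char yields -1, which is common but not guaranteed by the standard.

```c
#include <stdio.h>

/* Correct: the result of getc is kept in an int. */
long count_with_int(FILE *in)
{
    int c;
    long n = 0;
    while ((c = getc(in)) != EOF)
        n++;
    return n;
}

/* Broken on purpose: narrowing to signed char makes the data byte
   0xFF compare equal to EOF where EOF is -1, so counting can stop
   before the real end of the input. */
long count_with_signed_char(FILE *in)
{
    signed char c;
    long n = 0;
    while ((c = (signed char)getc(in)) != EOF)
        n++;
    return n;
}
```

Fed input that contains a 0xFF byte, the second function would be expected to stop at that byte on such an implementation, while the first counts every byte up to the actual end of input.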
So basically, rather than polling, is it that hitting Return causes a certain I/O interrupt, and then when the OS receives this, it wakes up any processes that are sleeping for I/O?
Yes, hitting Return triggers interrupts and the OS kernel processes them and wakes up processes that are waiting for the data. The terminal driver is woken by the kernel when the interrupt occurs, and decides what to do with the character(s) that were just received. They may be stashed for further processing (canonical mode) or made available immediately (raw mode), etc. Assuming, of course, that the input is a terminal; if the input is from a disk file, it is simpler in many ways (or if it is a pipe, or …).
Nominally, it isn't the terminal app that gets woken by the interrupt; it is the kernel that wakes first, then the shell running in the terminal app that is woken because there's data for it to read, and only when there's output does the terminal app get woken.
I say 'nominally' because there's an outside chance that in fact the terminal app does mediate the I/O via a pty (pseudo-tty), but I think it happens at the kernel level and the terminal application is involved fairly late in the process. There's a huge disconnect really between the keyboard where you type and the display where what you type appears.
See also Canonical vs non-canonical terminal input.

Why does this getchar() loop stop after one character has been entered?

#include <stdio.h>
int main() {
char read = ' ';
while ((read = getchar()) != '\n') {
putchar(read);
}
return 0;
}
My input is f (followed by an enter, of course). I expect getchar() to ask for input again, but instead the program is terminated. How come? How can I fix this?
The Terminal can sometimes be a little bit confusing. You should change your program to:
#include <stdio.h>
int main() {
int read;
while ((read = getchar()) != EOF) {
putchar(read);
}
return 0;
}
This will read until getchar reads EOF (most of the time this macro expands to -1) from the terminal. getchar returns an int so you should make your variable 'read' into an integer, so you can check for EOF. You can send an EOF from your terminal on Linux with ^D and I think on windows with ^Z (?).
To explain a little bit what happens. In your program the expression
(read = getchar()) !='\n'
will be true as long as no '\n' is read from the buffer. The problem is, to get the buffer to your program, you have to hit enter which corresponds to '\n'.
The following steps happen when your program is invoked in the terminal:
~$ ./a.out
this starts your program
(empty line)
getchar() made a system call to get an input from the terminal and the terminal takes over
f
you made an input in the terminal. The 'f' is written into the buffer and echoed back on the terminal, your program has no idea about the character yet.
f
f~$
You hit enter. Your buffer now contains 'f\n'. The 'enter' also signals to the terminal that it should return to your program. Your program reads the buffer, finds the f and puts it onto the screen, then finds the '\n', immediately stops the loop and ends.
This would be standard behaviour of most terminals. You can change this behaviour, but that would depend on your OS.
getchar() returns the next character from the input stream. This of course also includes newlines etc. The fact that you don't see progress in your loop unless you press 'Enter' is caused by the fact that the input (coming from a terminal on stdin) isn't handed over to getchar() until a '\n' is detected at the end of the line. Your routine first blocks, then handles the two keystrokes in one rush, terminating, as you specified, with the appearance of '\n' in the input stream. In conclusion: getchar() does not treat '\n' specially; it is returned like any other character (why should it be otherwise?).
After f you are pressing "enter", which is '\n',
so the loop ends there.
If you want to enter more characters, type them one after the other on the same line; as soon as enter is pressed, the loop exits.
You've programmed it so the loop ends when you read a \n (enter), and you then return 0; from main which exits the program.
Perhaps you want something like
while ((read = getchar()) != EOF) {
putchar(read);
}
On *nix terminals you can press Control-D, which will tell the tty driver to return the input buffer to the app reading it. That's why ^D on a new line ends input - it causes the tty to return zero bytes, which the app interprets as end-of-file. But it also works anywhere on a line.
