I have been searching Google and this site for how to flush the input stream properly. All I find are mixed arguments: some say fflush() is undefined for input streams, some say to use it anyway, and others just say not to. I haven't found a clear, efficient and proper way of doing this that most people agree on. I am quite new to programming, so I don't know all the syntax and tricks of the language yet. My question: which is the most efficient and proper way to clear the C input stream?
(A) Use getchar() twice before I try to receive more input?
(B) Just use the fflush() function on the input? or
(C) This is how I thought I should do it:
void clearInputBuf(void);

void clearInputBuf(void)
{
    int garbageCollector;

    while ((garbageCollector = getchar()) != '\n' && garbageCollector != EOF)
    {}
}
So whenever I need to read new input with scanf(), or use getchar() to pause the program, I just call clearInputBuf(). Which of the three solutions is best, or is there an even better option?
All I hear is mixed arguments about how fflush() is undefined for the input stream
That's correct. Don't use fflush() to flush an input stream.
(A) will work for simple cases where you leave a single character in the input stream (such as scanf() leaving a newline).
(B) Don't use it. It's defined on some platforms, but don't rely on what the standard calls undefined behaviour.
(C) Clearly the best out of the 3 options as it can "flush" any number of characters in the input stream.
But if you read lines (such as using fgets()), you'll probably have much less need to clear input streams.
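To make the fgets() suggestion concrete, here is a minimal sketch of reading a whole line and then parsing it, so nothing is left behind in the stream for the next read. The helper name read_int_line and the 128-byte buffer are assumptions made for illustration, not part of the original answer.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one whole line from fp and parse it as an int.
 * Returns 1 on success, 0 if the line does not start with a valid int,
 * and EOF if no line could be read. Text after the number is ignored. */
int read_int_line(FILE *fp, int *out)
{
    char line[128];   /* illustrative size, not a recommendation */
    char *end;
    long value;

    if (fgets(line, sizeof line, fp) == NULL)
        return EOF;

    /* If the line did not fit, discard the remainder up to the newline. */
    if (strchr(line, '\n') == NULL) {
        int c;
        while ((c = getc(fp)) != '\n' && c != EOF)
            continue;
    }

    errno = 0;
    value = strtol(line, &end, 10);
    if (end == line || errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;
    *out = (int)value;
    return 1;
}
```

Because each call consumes exactly one line, a later call starts at the beginning of the next line and no separate clearing step is needed in the common case.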
It depends on what you think of as "flushing an input stream".
For output streams, the flush operation makes sure that all data that were written to the stream but were being kept buffered in memory have been flushed to the underlying filesystem file. That's a very well defined operation.
For input streams, there is no well defined operation of what flushing the stream should do. I would say that it does not make any sense.
Some implementations of the C standard library redefine the meaning of "flush" for input streams to mean "clear the rest of the current line which has not been read yet". But that's entirely arbitrary, and other implementations choose to do nothing instead.
The C standard settles this by explicitly declaring fflush() on an input stream to be undefined behaviour (C11 §7.21.5.2; C99 already worded it the same way), precisely because it makes no sense. Whatever a particular runtime library vendor chooses to implement there is not something portable code can rely on.
So, please by all means do go ahead and implement your clearInputBuf() function the way you did, but do not think of it as "flushing the input stream". There is no such thing.
It turns out to be platform dependent.
fflush() must not be given an input stream as its argument: according to the C standard this is undefined behaviour, because the standard does not define anywhere what should happen.
On Windows, there is a defined behavior for fflush() and it does what you need it to do.
On BSD-derived systems there is fpurge(3), which does what you want too; on Linux, glibc provides __fpurge() (declared in <stdio_ext.h>) instead.
The best way is to simply read all characters in a loop until a newline character is found or EOF is returned from getchar(), like your clearInputBuf() function does.
Note that flushing an output stream means writing all unwritten data to the stream, that is, the data still sitting in a buffer waiting to be flushed. Reading all the unread bytes from a stream does not have the same meaning.
That's why it doesn't make sense to fflush() an input stream. On the other hand, fpurge() is designed specifically for this, and its name is a better choice because you want to clear the input stream and start fresh. The problem is that it's not a standard function.
Reading fpurge(3) should clarify why fflush(stdin) is undefined behavior, and why an implementation like the one on Windows doesn't make sense because it makes fflush() behave differently with different inputs. That's like making c compliant with PHP.
The problem is more subtle than it looks:
On systems with elaborate terminal devices, such as unix and OS/X, input from the terminal is buffered at 2 separate levels: the system terminal uses a buffer to handle line editing, from just correcting input with backspace to full line editing with cursor and control keys. This is called cooked mode. A full line of input is buffered in the system until the enter key is typed or the end-of-file key combination is entered.
The FILE functions perform their own buffering, which is line buffered by default for streams associated with a terminal. The buffer size is set to BUFSIZ by default, and bytes are requested from the system when the buffered contents have been consumed. For most requests, a full line will be read from the system into the stream buffer, but in some cases, such as when the buffer is full, only part of the line will have been read from the system when scanf() returns. This is why discarding the contents of the stream buffer might not always suffice.
Flushing the input buffer may mean different things:
discarding extra input, including the newline character, that has been entered by the user in response to input requests such as getchar(), fgets() or scanf(). The need for flushing this input is especially obvious in the case of scanf(), because most formats will not cause the newline to be consumed.
discarding any pending input and waiting for the user to hit a key.
You can implement a flush function portably for the first case:
int flush_stream(FILE *fp) {
    int c;
    while ((c = getc(fp)) != EOF && c != '\n')
        continue;
    return c;
}
And this is exactly what your clearInputBuf() function does for stdin.
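As an illustration of where such a function is called, the sketch below reads a number with fscanf(), discards whatever remains on that line, and only then reads a single character. The names discard_rest_of_line and read_number_then_key are hypothetical and chosen for this example only; the loop is the same as in flush_stream() above.

```c
#include <stdio.h>

/* Same loop as flush_stream(): consume input up to and including the
 * next newline. Returns '\n' or EOF. */
static int discard_rest_of_line(FILE *fp)
{
    int c;

    while ((c = getc(fp)) != EOF && c != '\n')
        continue;
    return c;
}

/* Hypothetical example: read an int, drop the rest of that line, then
 * read one character from the following line.
 * Returns 1 if both values were read, 0 otherwise. */
int read_number_then_key(FILE *in, int *number, int *key)
{
    if (fscanf(in, "%d", number) != 1)
        return 0;
    /* "%d" stops after the digits, so the rest of the line (at least
     * the newline) is still pending and must be discarded first. */
    if (discard_rest_of_line(in) == EOF)
        return 0;
    *key = getc(in);
    return *key != EOF;
}
```

Without the discard step, the final getc() would return leftover input from the first line (for example the newline) instead of the character the user types next.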
For the second case, there is no portable solution, and system-specific methods are non-trivial.
I would like to read characters from stdin until one of the following occurs:
an end-of-line marker is encountered (the normal case, in my thinking),
the EOF condition occurs, or
an error occurs.
How can I guarantee that one of the above events will happen eventually? In other words, how do I guarantee that getchar will eventually return either \n or EOF, provided that no error (in terms of ferror(stdin)) occurs?
/* (How) can we guarantee that the LABEL'ed statement will be reached? */
int c;
int done = 0;
while (!0)
    if ((c = getchar()) == EOF || ferror(stdin) || c == '\n')
        break;
LABEL: done = !0;
If stdin is connected to a device that always delivers some character other than '\n', none of the above conditions will occur. It seems like the answer will have to do with the properties of the device. Where can those details be found (in the documentation for compiler, device firmware, or device hardware perhaps)?
In particular, I am interested to know if keyboard input is guaranteed to be terminated by an end-of-line marker or end-of-file condition. Similarly for files stored on disc / SSD.
Typical use case: user enters text on the keyboard. Program reads first few characters and discards all remaining characters, up to the end-of-line marker or end-of-file (because some buffer is full or after that everything is comments, etc.).
I am using C89, but I am curious if the answer depends on which C standard is used.
You can't.
Let's say I run your program, then I put a weight on my keyboard's "X" key and go on vacation to Hawaii. On the way there, I get struck by lightning and die.
There will never be any input other than 'x'.
Or, I may decide to type the complete story of Moby Dick, without pressing enter. It will probably take a few days. How long should your program wait before it decides that maybe I won't ever finish typing?
What do you want it to do?
Looking at all the discussion in the comments, it seems you are looking in the wrong place:
It is not a matter of keyboard drivers or wrapping stdin.
It is also not a matter of what programming language you are using.
It is a matter of the purpose of the input in your software.
Basically, it is up to you as a programmer to know how much input you want or need, and then decide when to stop reading input, even if valid input is still available.
Note that not only are there devices that can send input forever without triggering an EOF or end-of-line condition, but there are also programs that will happily read input forever.
This is by design.
Common examples can be found in POSIX style OS (like Linux) command line tools.
Here is a simple example:
cat /dev/urandom | hexdump
This will print random numbers for as long as your computer is running, or until you hit Ctrl+C.
Though cat will stop working when there is nothing more to print (EOF or any read error), it does not expect such an end, so unless there is a bug in the implementation you are using it should happily run forever.
So the real question is:
When does your program need to stop reading characters and why?
If stdin is connected to a device that always delivers some character other than '\n', none of the above conditions will occur.
A device such as /dev/zero, for example. Yes, stdin can be connected to a device that never provides a newline or reaches EOF, and that is not expected ever to report an error condition.
It seems like the answer will have to do with the properties of the device.
Indeed so.
Where can those details be found (in the documentation for compiler, device firmware, or device hardware perhaps)?
Generally, it's a question of the device driver. And in some cases (such as the /dev/zero example) that's all there is anyway. Generally drivers do things that are sensible for the underlying hardware, but in principle, they don't have to.
In particular, I am interested to know if keyboard input is guaranteed to be terminated by an end-of-line marker or end-of-file condition.
No. Generally speaking, an end-of-line marker is sent by a terminal device if and only if the <Enter> key is pressed. An end-of-file condition might be signaled if the terminal disconnects (but the program continues), or if the user explicitly causes one to be sent (by typing <Ctrl>-<D> on Linux or Mac, for example, or <Ctrl>-<Z> on Windows). Neither of those events need actually happen on any given run of a program, and it is very common for the latter not to.
Similarly for files stored on disc / SSD.
You can generally rely on data read from an ordinary file to contain newlines where they are present in the file itself. If the file is open in text mode, then the system-specific text line terminator will also be translated to a newline, if it differs. It is not necessary for a file to contain any of those, so a program reading from a regular file might never see a newline.
You can rely on EOF being signaled when a read is attempted while the file position is at or past the end of the file's data.
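For a regular file, a minimal sketch of a read loop that relies on this looks as follows. The helper name count_lines is an assumption for illustration. Since getc() returns EOF both at end of file and on a read error, the sketch checks ferror() afterwards to tell the two apart.

```c
#include <stdio.h>

/* Hypothetical helper: count newline characters until getc() returns EOF.
 * Returns the count, or -1 if reading stopped because of a read error. */
long count_lines(FILE *fp)
{
    long lines = 0;
    int c;

    while ((c = getc(fp)) != EOF)
        if (c == '\n')
            lines++;

    if (ferror(fp))
        return -1;      /* EOF was returned because of an error */
    return lines;       /* otherwise end of file was reached */
}
```

On a stream that never reports end of file or an error (such as the /dev/zero case discussed above), this loop does not terminate, which is exactly the situation the question asks about.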
Typical use case: user enters text on the keyboard. Program reads first few characters and discards all remaining characters, up to the end-of-line marker or end-of-file (because some buffer is full or after that everything is comments, etc.).
I think you're trying too hard.
Reading to end-of-line might be a reasonable thing to do in some cases. Expecting a newline to eventually be reached is reasonable if the program is intended to support interactive use. But trying to ensure that invalid data cannot be fed to your program is a losing cause. Your objective should be to accept the widest array of inputs you reasonably can, and to fail gracefully when other inputs are presented.
If you need to read input in a line-by-line mode then by all means do that, and document that you do it. If only the first n characters of each line are significant to the program then document that, too. Then, if your program never terminates when a user connects its input to /dev/zero that's on them, not on you.
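The "first n characters are significant" policy can be sketched as below. The function name read_line_prefix and its return convention are assumptions made for this example. It keeps at most size - 1 characters of each line and silently discards the rest, up to the newline or end of input.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: store at most size - 1 characters of the next line
 * in buf (always NUL-terminated) and discard the remainder of the line.
 * Returns the number of characters stored, or EOF if nothing at all could
 * be read (end of input or error) or if size is 0. */
int read_line_prefix(FILE *fp, char *buf, size_t size)
{
    size_t len = 0;
    int seen = 0;   /* becomes 1 once any character, even '\n', was read */
    int c;

    if (size == 0)
        return EOF;

    while ((c = getc(fp)) != EOF) {
        seen = 1;
        if (c == '\n')
            break;
        if (len + 1 < size)          /* keep room for the terminating NUL */
            buf[len++] = (char)c;
        /* characters beyond the prefix are read and dropped */
    }
    buf[len] = '\0';
    return seen ? (int)len : EOF;
}
```

An empty line yields 0 rather than EOF, so a caller can distinguish "blank line" from "no more input". As the answer says, this still loops forever on an input that never produces a newline or EOF; documenting that is the caller's job.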
On the other hand, try to avoid placing arbitrary constraints, especially on sizes of things. If there is not a natural limit on the size of something, then no artificial limit you introduce will ever be enough.
What is the exact difference between the getch and getchar functions?
getchar() is a standard function that gets a character from the stdin.
getch() is non-standard. It gets a character from the keyboard (which may be different from stdin) and does not echo it.
The Standard C function is getchar(), declared in <stdio.h>. It has existed basically since the dawn of time. It reads one character from standard input (stdin), which is typically the user's keyboard, unless it has been redirected (for example via the shell input redirection character <, or a pipe).
getch() and getche() are old MS-DOS functions, declared in <conio.h>, and still popular on Windows systems. They are not Standard C functions; they do not exist on all systems. getch reads one keystroke from the keyboard immediately, without waiting for the user to hit the Return key, and without echoing the keystroke. getche is the same, except that it does echo. As far as I know, getch and getche always read from the keyboard; they are not affected by input redirection.
The question naturally arises, if getchar is the standard function, how do you use it to read one character without waiting for the Return key, or without echoing? And the answers to those questions are at least a little bit complicated. (In fact, they're complicated enough that I suspect they explain the enduring popularity of getch and getche, which if nothing else are very easy to use.)
And the answer is that getchar has no control over details like echoing and input buffering -- as far as C is concerned, those are lower-level, system-dependent issues.
But it is useful to understand the basic input model which getchar assumes. Confusingly, there are typically two different levels of buffering.
1. As the user types keys on the keyboard, they are read by the operating system's terminal driver. Typically, in its default mode, the terminal driver echoes keystrokes immediately as they are typed (so the user can see what they are typing). Typically, in its default mode, the terminal driver also supports some amount of line editing -- for example, the user can hit the Delete or Backspace key to delete an accidentally-typed character. In order to support line editing, the terminal driver is typically collecting characters in an input buffer. Only when the user hits Return are the contents of that buffer made available to the calling program. (This level of buffering is present only if standard input is in fact a keyboard or other serial device. If standard input has been redirected to a file or pipe, the terminal driver is not in effect and this level of buffering does not apply.)
2. The stdio package reads characters from the operating system into its own input buffer. getchar simply fetches the next character from that buffer. When the buffer is empty, the stdio package attempts to refill it by reading more characters from the operating system.
So, if we trace what happens starting when a program calls getchar for the first time: stdio discovers that its input buffer is empty, so it tries to read some characters from the operating system, but there aren't any characters available yet, so the read call blocks. Meanwhile, the user may be typing some characters, which are accumulating in the terminal driver's input buffer, but the user hasn't hit Return yet. Finally, the user hits Return, and the blocked read call returns, returning a whole line's worth of characters to stdio, which uses them to fill its input buffer, out of which it then returns the first one to that initial call to getchar, which has been patiently waiting all this time. (And then if the program calls getchar a second or third time, there probably are some more characters -- the next characters on the line the user typed -- available in stdio's input buffer for getchar to return immediately. For a bit more on this, see section 6.2 of these C course notes.)
But in all of this, as you can see, getchar and the stdio package have no control over details like echoing or input line editing, because those are handled earlier, at a lower level, in the terminal driver, in step 1.
So, at least under Unix-like operating systems, if you want to read a character without waiting for the Return key, or control whether characters are echoed or not, you do that by adjusting the behavior of the terminal driver. The details vary, but there's a way to turn echo on and off, and a way (actually a couple of ways) to turn input line editing on and off. (For at least some of those details, see this SO question, or question 19.1 in the old C FAQ list.)
When input line editing is turned off, the operating system can return characters immediately (without waiting for the Return key), because in that case it doesn't have to worry that the user might have typed a wrong keystroke that needs to be "taken back" with the Delete or Backspace key. (But by the same token, when a program turns off input line editing in the terminal driver, if it wants to let the user correct mistakes, it must implement its own editing, because it is going to see --- that is, successive calls to getchar are going to return -- both the user's wrong character(s) and the character code for the Delete or Backspace key.)
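On POSIX systems, one way to adjust the terminal driver as described is the termios interface. The following is only a sketch under that assumption; the function names disable_line_editing, enter_char_mode and leave_char_mode are made up for this example, while tcgetattr(), tcsetattr() and the ICANON and ECHO flags are the standard POSIX ones.

```c
#include <termios.h>
#include <unistd.h>

/* Clear the line-editing (canonical mode) and echo flags in a settings
 * structure, and ask read() to return as soon as one byte is available. */
void disable_line_editing(struct termios *t)
{
    t->c_lflag &= ~(tcflag_t)(ICANON | ECHO);
    t->c_cc[VMIN] = 1;    /* return after at least one byte */
    t->c_cc[VTIME] = 0;   /* no inter-byte timeout */
}

/* Apply those settings to the terminal on fd, saving the previous ones in
 * *saved. Returns 0 on success, -1 on failure (e.g. fd is not a terminal). */
int enter_char_mode(int fd, struct termios *saved)
{
    struct termios t;

    if (tcgetattr(fd, saved) == -1)
        return -1;
    t = *saved;
    disable_line_editing(&t);
    return tcsetattr(fd, TCSANOW, &t);
}

/* Restore the settings saved by enter_char_mode(). */
int leave_char_mode(int fd, const struct termios *saved)
{
    return tcsetattr(fd, TCSANOW, saved);
}
```

A program would typically call enter_char_mode(STDIN_FILENO, &saved) before reading single keystrokes and leave_char_mode(STDIN_FILENO, &saved) before exiting; if the restore is skipped, the terminal stays in the modified mode after the program ends.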
getch() just gets an input character and never displays it on the screen, even when we press the Enter key.
getchar() gets an input character and displays it on the screen when we press the Enter key.
getchar is standard C, found in stdio.h. It reads one character from stdin (the standard input stream = console input on most systems). It is a blocking call: when stdin is a terminal in its default mode, the user has to type a character and then press Enter before it returns, and the typed input is echoed to the screen.
getc(stdin) is 100% equivalent to getchar, except that getc can also be used for other input streams.
getch is non-standard, typically found in the old obsolete MS DOS header conio.h. It works just like getchar except that it isn't blocking after the first keystroke: it allows the program to continue without the user pressing Enter. It does not echo input to the screen.
getche is the same as getch, also non-standard, but it echoes input to the screen.
Can anyone please explain the difference between fpurge(FILE *stream) and fflush(FILE *stream) in C?
Both fflush() and fpurge() will discard any unwritten or unread data in the buffer.
Please explain the exact difference between the two, and also their pros and cons.
"... both fflush and fpurge will discard any unwritten or unread data in the buffer..." : No.
fflush:
The function fflush forces a write of all buffered data for the given output or update stream via the stream's underlying write function. The open status of the stream is unaffected.
If the stream argument is NULL, fflush flushes all open output streams.
fpurge:
The function fpurge erases any input or output buffered in the given stream. For output streams this discards any unwritten output. For input streams this discards any input read from the underlying object but not yet obtained via getc. This includes any text pushed back via ungetc. (P.S.: there also exists __fpurge, which does the same, but without returning any value).
Besides the obvious effect on buffered data, one use where you would notice the difference is with input streams. You can fpurge one such stream (although it is usually a mistake, possibly conceptual). Depending on the environment, you might not fflush an input stream (its behaviour might be undefined, see man page). In addition to the above remarked differences: 1) the cases where they lead to errors are different, and 2) fflush can work on all output streams with a single statement, as said (this might be very useful).
As for pros and cons, I would not really quote any... they simply work different (mostly), so you should know when to use each.
In addition to the functional difference (what you were asking), there is a portability difference: fflush is a standard function, while fpurge is not (and __fpurge is not either).
Here you have the respective man pages (fflush, fpurge).
To start with, both functions clear the buffers (the types of buffers they operate on are discussed below); the major difference is what happens to the data present in the buffer.
For fflush(), the data is forced to be written out to the underlying file.
For fpurge(), data is discarded.
That being said, fflush() is a standard C function, specified in C11, chapter §7.21.5.2.
Whereas fpurge() is a non-portable and non-standard function. From the man page:
These functions are nonstandard and not portable. The function
fpurge() was introduced in 4.4BSD and is not available under Linux.
The function __fpurge() was introduced in Solaris, and is present in
glibc 2.1.95 and later.
That said, the major usage-side difference is,
Calling fflush() with input stream is undefined behavior.
If stream points to an output stream or an update stream in which the most recent
operation was not input, the fflush function causes any unwritten data for that stream
to be delivered to the host environment to be written to the file; otherwise, the behavior is
undefined.
Calling fpurge() with input stream is defined.
For input streams
this discards any input read from the underlying object but not yet
obtained via getc(3); this includes any text pushed back via
ungetc(3).
Still, try to stick to fflush().
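To illustrate the fflush() half of this comparison on an output stream, here is a small sketch. The names visible_bytes and flush_demo, and the idea of observing the file through a second stream, are assumptions made for the example. It writes a short string to a fully buffered stream and reports how many bytes a separate reader can see before and after the call to fflush().

```c
#include <stdio.h>

/* Number of bytes an independent reader can currently see in the file,
 * or -1 if the file cannot be opened. */
long visible_bytes(const char *path)
{
    FILE *in = fopen(path, "rb");
    long n = 0;

    if (in == NULL)
        return -1;
    while (getc(in) != EOF)
        n++;
    fclose(in);
    return n;
}

/* Hypothetical demo: write text to path through a fully buffered stream
 * and record the visible size before and after fflush().
 * Returns 0 on success, -1 on failure. */
int flush_demo(const char *path, const char *text, long *before, long *after)
{
    FILE *out = fopen(path, "wb");

    if (out == NULL)
        return -1;
    setvbuf(out, NULL, _IOFBF, BUFSIZ);   /* request full buffering */
    fputs(text, out);
    *before = visible_bytes(path);  /* short text is usually still buffered */
    if (fflush(out) == EOF) {
        fclose(out);
        return -1;
    }
    *after = visible_bytes(path);   /* buffered data has been written out */
    return fclose(out) == EOF ? -1 : 0;
}
```

For a text much shorter than BUFSIZ, the value in *before is typically 0 on common implementations, but only the value in *after is something the flush is meant to ensure.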
I'm reading Advanced Programming in the UNIX Environment, 3rd Edition, and I don't understand a section of it (page 145, Section 5.4 Buffering, Chapter 5).
Line buffering comes with two caveats. First, the size of the buffer that the
standard I/O library uses to collect each line is fixed, so I/O might take place if
we fill this buffer before writing a newline. Second, whenever input is
requested through the standard I/O library from either (a) an unbuffered stream or (b) a line-buffered stream (that requires data to be requested from the kernel),
all line-buffered output streams are flushed. The reason for the qualifier on (b)
is that the requested data may already be in the buffer, which doesn’t require
data to be read from the kernel. Obviously, any input from an unbuffered
stream, item (a), requires data to be obtained from the kernel.
I can't understand the bold lines. My English isn't good, so could you clarify them for me, maybe in an easier way? Thanks.
The point behind the machinations described is to ensure that prompts appear before the system goes into a mode where it is waiting for input.
If an input stream is unbuffered, every time the standard I/O library needs data, it has to go to the kernel for some information. (That's the last sentence.) That's because the standard I/O library does not buffer any data, so when it needs more data, it has to read from the kernel. (I think that even an unbuffered stream might buffer one character of data, because it would need to read up to a space character, for example, to detect when it has reached the end of a %s format string; it has to put back (ungetc()) the extra character it read so that the next time it needs a character, there is the character it put back. But it never needs more than the one character of buffering.)
If an input stream is line buffered, there may already be some data in its input buffer, in which case it may not need to go to the kernel for more data. In that case, it might not flush anything. This can occur if the scanf() format requested "%s" and you typed hello world; it would read the whole line, but the first scan would stop after hello, and the next scanf() would not need to go to the kernel for the world word because it is already in the buffer.
However, if there isn't any data in the buffer, it has to ask the kernel to read the data, and it ensures that any line-buffered output streams are flushed so that if you write:
printf("Enter name: ");
if (scanf("%63s", name) != 1)
…handle error or EOF…
then the prompt (Enter name:) appears. However, if you'd previously typed hello world and previously read just hello, then the prompt wouldn't necessarily appear because the world was already waiting in the (line buffered) input stream.
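When the prompt must appear regardless of how the streams happen to be buffered, a common approach is to flush the output explicitly before reading. The sketch below assumes a hypothetical helper prompt_for_name that takes the input and output streams as parameters, so it does not depend on the automatic flushing described above.

```c
#include <stdio.h>

/* Hypothetical helper: write a prompt to out, flush it explicitly, then
 * read one whitespace-delimited word of at most 63 characters from in.
 * Returns 1 if a name was read, 0 on EOF or error. */
int prompt_for_name(FILE *in, FILE *out, char name[64])
{
    fputs("Enter name: ", out);
    fflush(out);    /* make sure the prompt is delivered before waiting */
    return fscanf(in, "%63s", name) == 1;
}
```

In an interactive program the call would be prompt_for_name(stdin, stdout, name). With input hello world, the first call stores hello and leaves world in the input buffer for the next read, as in the explanation above.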
This may explain the point.
Let's imagine that you have a pipe in your program and you use it for communication between different parts of your program (a single-threaded program writing to and reading from this single pipe).
Suppose you write the letter 'A' to the writing end of the pipe and then call the read operation on the reading end. You would expect the letter 'A' to be read. However, the read operation is a system call to the kernel. For it to return the letter 'A', the letter must have been written to the kernel first. This means that the write of 'A' must be flushed; otherwise it would stay in your local write buffer and your program would block forever.
In consequence, before calling a read operation, all write buffers are flushed. This is what item (b) says.
The size of the buffer that the standard I/O library is using to collect each line is fixed.
When we read lines repeatedly with the fgets function, each call reads at most the specified buffer size, or up to a newline, whichever comes first.
Second, whenever input is requested through the standard I/O library, it can use an unbuffered stream or line-buffered stream.
Unbuffered stream: it does not buffer characters; each character is passed on immediately.
Line-buffered stream: it stores characters in the buffer and flushes them when the operation is completed.
For example, if we print content with a printf statement without using \n, the content stays in the buffer until we flush it or print a newline. In the same way, the stream buffer is flushed internally when the operation is completed.
(b) is that the requested data may already be in the buffer, which doesn't require data to be read from the kernel
With a line-buffered stream, the requested data may already be in the buffer, because earlier input was buffered, so the data does not need to be read from the kernel again.
(a) requires data to be obtained from the kernel.
Any input from an unbuffered stream has to be fetched from the kernel, because an unbuffered stream cannot store anything in a buffer.