How does fflush work? - c

I'm not sure if I properly understand how flushing works in C. I just can't get it to work as described in multiple manuals and reference books. Here's an example with comments:
#include <stdio.h>
int main(void) {
int x;
char ch;
printf("Prompt: ");
scanf("%d", &x); /* I key in 67 and press Enter. At this very moment,
the input buffer should contain the ASCII codes
for the numbers 6 and 7 as well as the ASCII code
for the newline character, which is 10. The scanf
function is going to find the ASCII codes for 6
and 7, convert them to the integer 67, assign it
to the variable x and remove them from the
buffer. At this point, what still remains in the
buffer is the newline character. */
fflush(stdin); /* I want to do some more input. So, I flush the
buffer to remove anything that still might be
there, but it doesn't really look like fflush is
doing that. */
printf("Prompt: "); /* I'm not going to be able to get my hands on
the following line of code because fflush is
not doing its job properly. The remaining
newline character is going to be read into the
variable ch automatically, thus not allowing
me to enter anything from the keyboard. */
scanf("%c", &ch);
printf("x: %d, ch: %d\n", x, ch);
/*
OUTPUT:
Prompt: 67
Prompt: x: 67, ch: 10
*/
return 0;
}

Don't call fflush(stdin); it invokes undefined behavior.
Quoting C11, §7.21.5.2 (paragraph 2),
If stream points to an output stream or an update stream in which the most recent
operation was not input, the fflush function causes any unwritten data for that stream
to be delivered to the host environment to be written to the file; otherwise, the behavior is
undefined.
and stdin is not an output stream.

For completeness' sake:
Some implementations do define fflush on input streams. Examples are Microsoft's msvcrt and GNU libc. Others, like the BSD libc, may additionally provide a separate function for purging the buffer.
These functions, besides being non-portable, have a serious drawback: they are often used with the assumption that the buffer looks a particular way, e.g. that it holds a single newline.
stdin may be connected to a file, and its buffer might contain more than just a newline. This buffered data would be skipped if the input stream were purged.
Further, there can be buffers outside the libc's stdio which aren't affected.
Therefore, you should explicitly read and discard the data you don't want.
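A minimal sketch of "explicitly read and discard", assuming you want to drop everything up to and including the next newline. The name discard_line is made up for this illustration; it is not a standard function.

```c
#include <stdio.h>

/* Hypothetical helper (not a standard function): read and discard
 * everything up to and including the next newline on `stream`.
 * Returns the number of characters discarded before the newline, or
 * EOF if end-of-file or an error was hit before any newline. */
int discard_line(FILE *stream)
{
    int c;
    int discarded = 0;

    while ((c = fgetc(stream)) != EOF) {
        if (c == '\n')
            return discarded;
        discarded++;
    }
    return EOF;
}
```

In the question's program, calling discard_line(stdin) where fflush(stdin) is now would remove the leftover newline, and anything else typed on that line, before the second prompt.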

Aside from fflush() not being defined on input streams as Sourav pointed out...
I key in 67 and press Enter. At this very moment,
the input buffer should contain the ASCII codes
for the numbers 6 and 7 as well as the ASCII code
for the newline character, which is 10. The scanf
function is going to find the ASCII codes for 6
and 7, convert them to the integer 67, assign it
to the variable x and remove them from the
buffer. At this point, what still remains in the
buffer is the newline character.
And if the user did input something that is not a number, other things could have happened.
If there were non-digits before the first digit, x will not be initialized. You wouldn't know, because you did not check the return value of scanf().
The user might have anticipated the next prompt (for a character), and have entered something like 67 x, expecting the 67 to satisfy your first prompt, and the x your second. (And he won't be happy your program dropped part of his entry.)
Using *scanf() on input you cannot be sure is of the expected format (user input, as opposed to e.g. reading back something you yourself wrote with *printf()) is fragile.
So my generic advice is to not use the *scanf() functions on user input, but instead use fgets() to read user input line-wise, and parse the input in-memory at your leisure.
You have many much more powerful functions at your disposal that way, and you can handle error conditions more fine-grained (including the ability to give the full line entered in any error messages).
The following is only a rough sketch; depending on your application, you'll want to organize this differently:
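#include <errno.h>   /* errno, ERANGE */
#include <stdio.h>   /* printf, fgets */
#include <stdlib.h>  /* strtol */
#include <string.h>  /* strlen */
enum { BUFFERSIZE = 1024 }; /* a true constant, so buffer is not a variable-length array */
char buffer[ BUFFERSIZE ];
long x;
char ch;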
printf( "Prompt: " );
if ( fgets( buffer, BUFFERSIZE, stdin ) == NULL )
{
// Error occurred, use feof() / ferror() as appropriate
}
else
{
size_t len = strlen( buffer );
if ( len == 0 || buffer[ len - 1 ] != '\n' ) // len == 0 guards the buffer[ len - 1 ] access
{
// The line entered was too long for the buffer,
// there is unread input. Handle accordingly, e.g.
// resizing the buffer and reading the rest.
}
else
{
// You got the whole line; blot out the newline
buffer[ len - 1 ] = '\0';
// Assuming a decimal number first
char * curr = buffer;
errno = 0;
x = strtol( curr, &curr, 10 ); // assign the x declared above instead of shadowing it
if ( errno == ERANGE )
{
// Number exceeds "long" range, handle condition
}
else if ( curr == buffer )
{
// What you got was not a number, handle condition
}
// Keep parsing until you hit end-of-string
}
}
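One way the "keep parsing until you hit end-of-string" step might look, as a sketch only. The function name parse_longs and its return convention are assumptions made for this example, not part of the answer above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: parse up to `max` whitespace-separated decimal
 * numbers from `line` into `out`. Returns how many were stored, or -1
 * if a value is out of range or if anything other than whitespace is
 * left over (including numbers beyond `max`). */
int parse_longs(const char *line, long *out, size_t max)
{
    const char *curr = line;
    size_t count = 0;

    while (count < max) {
        char *end;

        errno = 0;
        long value = strtol(curr, &end, 10);
        if (end == curr)
            break;              /* no digits found: check what is left */
        if (errno == ERANGE)
            return -1;          /* number exceeds the range of long */
        out[count++] = value;
        curr = end;
    }

    /* Only trailing whitespace is acceptable at this point */
    while (*curr == ' ' || *curr == '\t' || *curr == '\n')
        curr++;
    return *curr == '\0' ? (int)count : -1;
}
```

Because the whole line is already in memory, a -1 result can be reported together with the full text the user typed.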

Related

How does scanf know if it should scan a new value?

I'm studying how scanf works.
After scanning a variable of another type, a char variable read with getchar() or scanf("%c") ends up storing a whitespace character ('\n').
To prevent this, the buffer should be cleared, and I did that with rewind(stdin).
However, even after stdin is rewound, the previous input value is still in the buffer,
and I can still access that previous value normally (no runtime errors).
But if I call scanf again, it scans a new value even though there is still a valid value in the buffer.
How does scanf determine whether it should scan a new value?
I found this behavior with the code below.
#include <stdio.h>
#define p stdin
int main() {
int x;
char ch;
void* A, * B, * C, * D, * E;
A = p->_Placeholder;
printf("A : %p\n", A);//first time, it shows 0000
scanf_s("%d", &x);
B = p->_Placeholder;
printf("B : %p\n", B);//after scanned something, I think it's begin point of buffer which is assigned for this process
rewind(stdin);//rewind _Placeholder
C = p->_Placeholder;
printf("C : %p\n", C);//it outputs the same value as B - length of x
D = p->_Placeholder;
printf("D : %c\n", ((char*)D)[0]);//the previous input value is printed successfully without runtime error. it means buffer is not be cleared by scanf
scanf_s("%c", &ch, 1);//BUT scanf knows the _Placeholder is not pointing new input value, so it will scan a new value from console. How??
E = p->_Placeholder;
printf("E : %p\n", E);
printf("ch : %c\n", ch);
}
You have at least three misunderstandings:
"char variable stores a white-space"
rewind(stdin) clears the buffer
_Placeholder tells you something interesting about how scanf handles whitespace
But, I'm sorry, none of these are true.
Let's review how scanf actually handles whitespace. We start with two important pieces of background information:
The newline character, \n, is in most respects an ordinary whitespace character. It occupies space in the input buffer just like any other character. It arrives in the input buffer when you press the Enter key.
When it's done parsing a %-directive, scanf always leaves unparsed input on the input stream.
Suppose you write
int a, b;
scanf("%d%d", &a, &b);
Suppose you run that code and type, as input
12 34
and then hit the Enter key. What happens?
First, the input stream (stdin) now contains six characters:
"12 34\n"
scanf first processes the first of the two %d directives you gave it. It scans the characters 1 and 2, converting them to the integer 12 and storing it in the variable a. It stops reading at the first non-digit character it sees, which is the space character between 2 and 3. The input stream is now
" 34\n"
Notice that the space character is still on the input stream.
scanf next processes the second %d directive. It doesn't immediately find a digit character, because the space character is still there. But that's okay, because like most (but not quite all) scanf format directives, %d has a secret extra power: it automatically skips whitespace characters before reading and converting an integer. So the second %d reads and discards the space character, then reads the characters 3 and 4 and converts them to the integer 34, which it stores in the variable b.
Now scanf is done. The input stream is left containing just the newline:
"\n"
Next, let's look at a slightly different — although, as we'll see, actually very similar — example. Suppose you write
int x, y;
scanf("%d", &x);
scanf("%d", &y);
Suppose you run that code and type, as input
56
78
(where that's on two lines, meaning that you hit Enter twice).
What happens now?
In this case, the input stream will end up containing these six characters:
"56\n78\n"
The first scanf call has a %d directive to process. It scans the characters 5 and 6, converting them to the integer 56 and storing it in the variable x. It stops reading at the first non-digit character it sees, which is the newline after the 6. The input stream is now
"\n78\n"
Notice that both newline characters are still on the input stream.
Now the second scanf call runs. It, too, has a %d directive to process. The first character on the input stream is not a digit: it's a newline. But that's okay, because %d knows how to skip whitespace. So it reads and discards the newline character, then reads the characters 7 and 8 and converts them to the integer 78, which it stores in the variable y.
Now the second scanf is done. The input stream is left containing just the newline:
"\n"
This may all have made sense, may have seemed unsurprising, may have left you feeling, "Okay, so what's the big deal?" The big deal is this: In both examples, the input was left containing that one, last newline character.
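The first example can be checked without a terminal by running the same format over a string with sscanf, which follows the same conversion rules as scanf. This is only a sketch: the helper name offset_after_two_ints is made up, and the %n directive is added purely to report how many characters were consumed.

```c
#include <stdio.h>

/* Hypothetical helper: apply "%d%d" to `input` and return the offset of
 * the first character the conversions left unread, or -1 if fewer than
 * two integers were converted. */
int offset_after_two_ints(const char *input, int *a, int *b)
{
    int consumed = 0;

    /* %n stores the number of characters read so far; it does not
     * count toward sscanf's return value. */
    if (sscanf(input, "%d%d%n", a, b, &consumed) != 2)
        return -1;
    return consumed;
}
```

For the input "12 34\n" the returned offset is 5, which is the position of the newline that stays behind.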
Suppose, later in your program, you have some other input to read. We now come to a hugely significant decision point:
If the next input call is another call to scanf, and if it involves one of the (many) format specifiers that has the secret extra power of also skipping whitespace, that format specifier will skip the newline, then do its job of scanning and converting whatever input comes after the newline, and the program will work as you expect.
But if the next input call is not a call to scanf, or if it's a call to scanf that involves one of the few input specifiers that does not have the secret extra power, the newline will not be "skipped", instead it will be read as actual input. If the next input call is getchar, it will read and return the newline character. If the next input call is fgets, it will read and return a blank line. If the next input call is scanf with the %c directive, it will read and return the newline. If the next input call is scanf with the %[^\n] directive, it will read an empty line. (Actually %[^\n] will read nothing in this case, because it leaves the \n still on the input.)
It's in the second case that the "extra" whitespace causes a problem. It's in the second case that you may find yourself wanting to explicitly "flush" or discard the extra whitespace.
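To make the second case concrete, here is a small sketch of the fgets variant. The function int_then_line is hypothetical and takes a FILE * only so that it can be tried on a file as well as on stdin.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical demo: read an int with fscanf, then a line with fgets,
 * which is exactly the mix described above. Returns the length of what
 * fgets stored in `line`, or -1 if either read failed. */
int int_then_line(FILE *in, int *value, char *line, int size)
{
    if (fscanf(in, "%d", value) != 1)
        return -1;
    if (fgets(line, size, in) == NULL)
        return -1;
    return (int)strlen(line);
}
```

With the input "56\nhello\n" the function returns 1: fgets stores only the newline that %d left behind, and "hello" is still waiting on the stream.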
But it turns out that the problem of flushing or discarding the extra whitespace left behind by scanf is a remarkably stubborn one. You can't portably do it by calling fflush. You can't portably do it by calling rewind. If you care about correct, portable code, you basically have three choices:
Write your own code to explicitly read and discard "extra" characters (typically, up to and including the next newline).
Don't try to intermix scanf and other calls. Don't call scanf and then, later, try to call getchar or fgets. If you call scanf and then, later, call scanf with one of the directives (such as "%c") that lacks the "secret extra power", insert an extra space before the format specifier to cause whitespace to be skipped. (That is, use " %c" instead of "%c".)
Don't use scanf at all — do all your input in terms of fgets or getchar.
See also What can I use for input conversion instead of scanf?
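The effect of the extra space in " %c" can be seen in a short sketch. It uses sscanf on a string, which applies the same rules as scanf; the helper name char_after_int is invented for the example.

```c
#include <stdio.h>

/* Hypothetical helper: convert an int followed by one character from
 * `input`. With skip_ws == 0 the format is "%d%c"; otherwise it is
 * "%d %c", where the space makes sscanf skip whitespace before %c.
 * Returns the character read, or EOF if either conversion failed. */
int char_after_int(const char *input, int skip_ws)
{
    int n;
    char ch;
    int converted = skip_ws ? sscanf(input, "%d %c", &n, &ch)
                            : sscanf(input, "%d%c", &n, &ch);

    return converted == 2 ? (unsigned char)ch : EOF;
}
```

Given "67\nx\n", the "%d%c" form yields the newline (character code 10), while the "%d %c" form yields 'x'.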
Addendum: scanf's handling of whitespace can often seem puzzling. If the above explanation isn't sufficient, it may help to look at some actual C code detailing how scanf works inside. (The code I'm going to show obviously isn't the exact code that's behind your system's implementation, but it will be similar.)
When it comes time for scanf to process a %d directive, you might imagine it will do something like this. (Be forewarned: this first piece of code I'm going to show you is incomplete and wrong. It's going to take me three tries to get it right.)
c = getchar();
if(isdigit(c)) {
int intval;
intval = c - '0';
while(isdigit(c = getchar())) {
intval = 10 * intval + (c - '0');
}
*next_pointer_arg = intval;
n_vals_converted++;
} else {
/* saw no digit; processing has failed */
return n_vals_converted;
}
Let's make sure we understand everything that's going on here. We've been told to process a %d directive. We read one character from the input by calling getchar(). If that character is a digit, it's the first of possibly several digits making up an integer. We read characters and, as long as they're digits, we add them to the integer value, intval, we're collecting. The conversion involves subtracting the constant '0', to convert an ASCII character code to a digit value, and successive multiplication by 10. Once we see a character that's not a digit, we're done. We store the converted value into the pointer handed to us by our caller (here schematically but approximately represented by the pointer value next_pointer_arg), and we add one to a variable n_vals_converted keeping count of how many values we've successfully scanned and converted, which will eventually be scanf's return value.
If, on the other hand, we don't even see one digit character, we've failed: we return immediately, and our return value is the number of values we've successfully scanned and converted so far (which may well be 0).
But there is actually a subtle bug here. Suppose the input stream contains
"123x"
This code will successfully scan and convert the digits 1, 2, and 3 to the integer 123, and store this value into *next_pointer_arg. But, it will have read the character x, and after the call to isdigit in the loop while(isdigit(c = getchar())) fails, the x character will have effectively been discarded: it is no longer on the input stream.
The specification for scanf says that it is not supposed to do this. The specification for scanf says that unparsed characters are supposed to be left on the input stream. If the user had actually passed the format specifier "%dx", that would mean that, after reading and parsing an integer, a literal x is expected in the input stream, and scanf is going to have to explicitly read and match that character. So it can't accidentally read and discard the x in the process of parsing a %d directive.
So we need to modify our hypothetical %d code slightly. Whenever we read a character that turns out not to be a digit, we have to literally put it back on the input stream, for somebody else to maybe read later. There's actually a function in <stdio.h> to do this, sort of the opposite of getc, called ungetc. Here is a modified version of the code:
c = getchar();
if(isdigit(c)) {
int intval;
intval = c - '0';
while(isdigit(c = getchar())) {
intval = 10 * intval + (c - '0');
}
ungetc(c, stdin); /* push non-digit character back onto input stream */
*next_pointer_arg = intval;
n_vals_converted++;
} else {
/* saw no digit; processing has failed */
ungetc(c, stdin);
return n_vals_converted;
}
You will notice that I have added two calls to ungetc, in both places in the code where, after calling getchar and then isdigit, the code has just discovered that it has read a character that is not a digit.
It might seem strange to read a character and then change your mind, meaning that you have to "unread" it. It might make more sense to peek at an upcoming character (to determine whether or not it's a digit) without reading it. Or, having read a character and discovered that it's not a digit, if the next piece of code that's going to process that character is right here in scanf, it might make sense to just keep it in the local variable c, rather than calling ungetc to push it back on the input stream, and then later calling getchar to fetch it from the input stream a second time. But, having called out these other two possibilities, I'm just going to say that, for now, I'm going to plough ahead with the example that uses ungetc.
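Standard C has no dedicated "peek" call for streams, but one can be sketched with the same read-then-ungetc idea. The name peek_char is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: report the next character on `stream` without
 * consuming it, by reading it and pushing it straight back.
 * Returns EOF if there is nothing left to read. */
int peek_char(FILE *stream)
{
    int c = fgetc(stream);

    if (c != EOF)
        ungetc(c, stream);      /* one character of pushback is guaranteed */
    return c;
}
```

Calling peek_char twice in a row returns the same character both times, and a following fgetc still reads it.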
So far I've shown the code that you might have imagined lay behind scanf's processing of %d. But the code I've shown so far is still significantly incomplete, because it does not show the "secret extra power". It starts looking for digit characters right away; it doesn't do any skipping of leading whitespace.
Here, then, is my third and final sample fragment of %d-processing code:
/* skip leading whitespace */
while(isspace(c = getchar())) {
/* discard */
}
if(isdigit(c)) {
int intval;
intval = c - '0';
while(isdigit(c = getchar())) {
intval = 10 * intval + (c - '0');
}
ungetc(c, stdin); /* push non-digit character back onto input stream */
*next_pointer_arg = intval;
n_vals_converted++;
} else {
/* saw no digit; processing has failed */
ungetc(c, stdin);
return n_vals_converted;
}
That initial loop reads and discards characters as long as they're whitespace. Its form is very similar to the later loop that reads and processes characters as long as they're digits. The initial loop will read one more character than it seems like it should: when the isspace call fails, that means that it has just read a non-whitespace character. But that's okay, because we were just about to read a character to see if it was the first digit.
[Footnotes: This code is still far from perfect. One pretty significant problem is that it doesn't have any checks for an EOF coming along in the middle of its parsing. Another problem is that it doesn't look for - or + before the digits, so it won't handle negative numbers. Yet another, more obscure problem is that, ironically, obvious-looking calls like isdigit(c) are not always correct — strictly speaking they need to be somewhat cumbersomely rendered as isdigit((unsigned char)c).]
If you're still with me, my point in all this is to illustrate these two points in a concrete way:
The reason %d is able to automatically skip leading whitespace is because (a) the specification says it's supposed to and (b) it has explicit code to do so, as my third example illustrates.
The reason scanf always leaves unprocessed input (that is, input that comes after the input it does read and process) on the input stream is because (a) again, the specification says it's supposed to and (b) its code is typically sprinkled with explicit calls to ungetc, or the equivalent, to make sure that every unprocessed character remains on the input, as my second example illustrates.
There are some problems with your approach:
you use an undocumented, implementation specific member of the FILE object _Placeholder which may or may not be available on different platforms and whose contents are implementation specific anyway.
you use scanf_s(), which is a Microsoft specific so-called secure version of scanf(): this function is optional and may not be available on all platforms. Furthermore, Microsoft's implementation does not conform to the C Standard: for example the size argument passed after &ch is documented in VS with a type of UINT whereas the C Standard specifies it as a size_t, which on 64-bit versions of Windows has a different size.
scanf() is quite tricky to use: even experienced C programmers get bitten by its many quirks and pitfalls. In your code, you test %d and %c, which behave very differently:
for %d, scanf() will first read and discard any white space characters, such as space, TAB and newlines, then read an optional sign + or -, it then expects to read at least one digit and stop when it gets a byte that is not a digit and leave this byte in the input stream, pushing it back with ungetc() or equivalent. If no digits can be read, the conversion fails and the first non digit character is left pending in the input stream, but the previous bytes are not necessarily pushed back.
processing %c is much simpler: a single byte is read and stored into the char object or the conversion fails if the stream is at end of file.
Processing %c after %d is tricky if the input stream is bound to a terminal as the user will enter a newline after the number expected for %d and this newline will be read immediately for the %c. The program can ignore white space before the byte expected for %c by inserting a space before %c in the format string: res = scanf(" %c", &ch);
To better understand the behavior of scanf(), you should output the return value of each call and the stream current position, obtained via ftell(). It is also more reliable to first set the stream to binary mode for the return value of ftell() to be exactly the number of bytes from the beginning of the file.
Here is a modified version:
#include <stdio.h>
#ifdef _MSC_VER
#include <fcntl.h>
#include <io.h>
#endif
int main() {
int x, res;
char ch;
long A, B, C, D;
#ifdef _MSC_VER
_setmode(_fileno(stdin), _O_BINARY);
#endif
A = ftell(stdin);
printf("A : %ld\n", A);
x = 0;
res = scanf_s("%d", &x);
B = ftell(stdin);
printf("B : %ld, res=%d, x=%d\n", B, res, x);
rewind(stdin);
C = ftell(stdin);
printf("C : %ld\n", C);
ch = 0;
res = scanf_s("%c", &ch, 1);
D = ftell(stdin);
printf("D : %ld, res=%d, ch=%d (%c)\n", D, res, ch, ch);
return 0;
}
Here's some code that illustrates the behavior of the %d conversion specifier; it may help understand how that aspect of scanf works. This isn't how it's actually implemented anywhere, but it follows the same rules (Updated to handle leading +/- sign, checks for overflow, etc).
#include <stdio.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
/**
* Mimics the behavior of the scanf %d conversion specifier.
* Skips over leading whitespace, then reads and converts
* decimal digits up to the next non-digit character.
*
* Returns EOF if no non-whitespace characters are
* seen before EOF.
*
* Returns 0 if the first non-whitespace character
* is not a digit.
*
* Returns 1 if at least one decimal digit was
* read and converted.
*
* Stops reading on the first non-digit
* character, pushes that character back
* on the input stream.
*
* In the event of a signed integer overflow,
* sets errno to ERANGE.
*/
int scan_to_int( FILE *stream, int *value )
{
int conv = 0;
int tmp = 0;
int c;
int sign = 0;
/**
* Skip over leading whitespace
*/
while( ( c = fgetc( stream ) ) != EOF && isspace( c ) )
; // empty loop
/**
* If we see end of file before any non-whitespace characters,
* return EOF.
*/
if ( c == EOF )
return c;
/**
* Account for a leading sign character.
*/
if ( c == '-' || c == '+' )
{
sign = c;
c = fgetc( stream );
}
/**
* As long as we see decimal digits, read and convert them
* to an integer value. We store the value to a temporary
* variable until we're done converting - we don't want
* to update value unless we know the operation was
* successful
*/
while( c != EOF && isdigit( c ) )
{
/**
* Check for overflow. While setting errno on overflow
* isn't required by the C language definition, I'm adding
* it anyway.
*/
if ( tmp > ( INT_MAX - (c - '0') ) / 10 ) // true exactly when tmp * 10 + digit would exceed INT_MAX
errno = ERANGE;
tmp = tmp * 10 + (c - '0');
conv = 1;
c = fgetc( stream );
}
/**
* Push the last character read back onto the input
* stream.
*/
if ( c != EOF )
ungetc( c, stream );
/**
* If we read a sign character (+ or -) but did not have a
* successful conversion, then that character was not part
* of a numeric string and we need to put it back on the
* input stream in case it's part of a non-numeric input.
*/
if ( sign && !conv )
ungetc( sign, stream );
/**
* If there was a successful read and conversion,
* update the output parameter.
*/
if ( conv )
*value = tmp * (sign == '-' ? -1 : 1);
/**
* Return 1 if the read was successful, 0 if there
* were no digits in the input.
*/
return conv;
}
/**
* Simple test program - attempts to read 1 integer from
* standard input and display it. Display any trailing
* characters in the input stream up to and including
* the next newline character.
*/
int main( void )
{
int val;
int r;
errno = 0;
/**
* Read the next item from standard input and
* attempt to convert it to an integer value.
*/
if ( (r = scan_to_int( stdin, &val )) != 1 )
printf( "Failed to read input, r = %d\n", r );
else
printf( "Read %d%s\n", val, errno == ERANGE ? " (overflow)" : "" );
/**
* If we didn't hit EOF, display the remaining
* contents of the input stream.
*/
if ( r != EOF )
{
fputs( "Remainder of input stream: {", stdout );
int c;
do {
c = fgetc( stdin );
switch( c )
{
case '\a': fputs( "\\a", stdout ); break;
case '\b': fputs( "\\b", stdout ); break;
case '\f': fputs( "\\f", stdout ); break;
case '\n': fputs( "\\n", stdout ); break;
case '\r': fputs( "\\r", stdout ); break;
case '\t': fputs( "\\t", stdout ); break;
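case EOF: break; // end of input: nothing to print
default: fputc( c, stdout ); break;
}
} while( c != '\n' && c != EOF ); // also stop at EOF so a missing newline cannot loop forever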
fputs( "}\n", stdout );
}
return 0;
}
Some examples - first, we signal EOF (in my case, by typing Ctrl-D):
$ ./convert
Failed to read input, r = -1
Next, we pass in a non-numeric string:
$ ./convert
abcd
Failed to read input, r = 0
Remainder of input stream: {abcd\n}
Since nothing was converted, the remainder of the input stream contains everything we typed (including the newline from hitting Enter).
Next, a numeric string with non-numeric trailing characters:
$ ./convert
12cd45
Read 12
Remainder of input stream: {cd45\n}
We stopped reading at 'c' - only the leading 12 is read and converted.
Several numeric strings separated by whitespace - only the first string is converted:
$ ./convert
123 456 789
Read 123
Remainder of input stream: {\t456\t789\n}
And a numeric string with leading whitespace:
$ ./convert
12345
Read 12345
Remainder of input stream: {\n}
Handle leading signs:
$ ./convert
-123abd
Read -123
Remainder of input stream: {abd\n}
$ ./convert
+456
Read 456
Remainder of input stream: {\n}
$ ./convert
-abcd
Failed to read input, r = 0
Remainder of input stream: {-abcd\n}
And, finally, we add an overflow check - note that scanf is not required to check for overflow by the C language standard, but I figured it was a useful thing to do:
$ ./convert
123456789012345678990012345667890
Read -701837006 (overflow)
Remainder of input stream: {\n}
%d, %i, %f, %s, etc., all skip over leading whitespace, since whitespace is not meaningful in those cases except to act as a separator between inputs. %c and %[ do not skip over leading whitespace, because it may be meaningful for those particular conversions (there are times when you want to know whether the character you just read is a space, or a tab, or a newline).
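The difference between a whitespace-skipping directive and one that keeps whitespace can be sketched with sscanf. The two helper names are made up; the field width 15 assumes the caller passes a buffer of at least 16 bytes.

```c
#include <stdio.h>

/* Hypothetical helpers: read a field from `input` into `out` (at least
 * 16 bytes). "%15s" skips leading whitespace; "%15[^\n]" does not.
 * Each returns sscanf's result (1 on success, 0 on a matching failure). */
int read_with_s(const char *input, char *out)
{
    return sscanf(input, "%15s", out);
}

int read_with_set(const char *input, char *out)
{
    return sscanf(input, "%15[^\n]", out);
}
```

On "  hi\n", read_with_s stores "hi" while read_with_set stores "  hi" with both spaces. On "\nhi\n", read_with_set returns 0 because the very first character is the excluded newline.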
As Steve points out, whitespace handling in C stdio routines is and always has been a thorny problem, and no one solution always works the best, especially since different library routines handle it differently.

What can I use for input conversion instead of scanf?

I have very frequently seen people discouraging others from using scanf and saying that there are better alternatives. However, all I end up seeing is either "don't use scanf" or "here's a correct format string", and never any examples of the "better alternatives" mentioned.
For example, let's take this snippet of code:
scanf("%c", &c);
This reads the whitespace that was left in the input stream after the last conversion. The usual suggested solution to this is to use:
scanf(" %c", &c);
or to not use scanf.
Since scanf is bad, what are some ANSI C options for converting input formats that scanf can usually handle (such as integers, floating-point numbers, and strings) without using scanf?
The most common ways of reading input are:
using fgets with a fixed size, which is what is usually suggested, and
using fgetc, which may be useful if you're only reading a single char.
To convert the input, there are a variety of functions that you can use:
strtoll, to convert a string into an integer
strtof, strtod and strtold, to convert a string into a floating-point number
sscanf, which is not as bad as simply using scanf, although it does have most of the downfalls mentioned below
There are no good ways to parse a delimiter-separated input in plain ANSI C. Either use strtok_r from POSIX or strtok, which is not thread-safe. You could also roll your own thread-safe variant using strcspn and strspn, as strtok_r doesn't involve any special OS support.
It may be overkill, but you can use lexers and parsers (flex and bison being the most common examples).
No conversion, simply just use the string
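As a rough idea of the "roll your own" tokenizer mentioned in the list, here is a sketch built only from strspn and strcspn. The name next_token and its interface are assumptions for illustration; like strtok_r, it modifies the string it scans.

```c
#include <string.h>

/* Hypothetical thread-safe tokenizer in the spirit of strtok_r.
 * `*cursor` points into a writable string and is advanced past the
 * returned token; returns NULL when no tokens remain. */
char *next_token(char **cursor, const char *delims)
{
    char *start = *cursor + strspn(*cursor, delims);  /* skip leading delimiters */

    if (*start == '\0') {
        *cursor = start;
        return NULL;
    }

    char *end = start + strcspn(start, delims);       /* find the end of the token */
    if (*end != '\0')
        *end++ = '\0';                                /* terminate it and step past */
    *cursor = end;
    return start;
}
```

All state lives in the caller's cursor variable, so several strings can be tokenized at once, which plain strtok does not allow.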
Since I didn't go into exactly why scanf is bad in my question, I'll elaborate:
With the conversion specifiers %[...] and %c, scanf does not eat up whitespace. This is apparently not widely known, as evidenced by the many duplicates of this question.
There is some confusion about when to use the unary & operator when referring to scanf's arguments (specifically with strings).
It's very easy to ignore the return value from scanf. This could easily cause undefined behavior from reading an uninitialized variable.
It's very easy to forget to prevent buffer overflow in scanf. scanf("%s", str) is just as bad as, if not worse than, gets.
You cannot detect overflow when converting integers with scanf. In fact, overflow causes undefined behavior in these functions.
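By contrast, strtol reports out-of-range input through errno, so an overflow-aware conversion to int can be sketched as follows. The function name parse_int_checked and its return codes are invented for this example.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert the decimal number at the start of `s`
 * to int. Returns 1 on success, 0 if there are no digits, and -1 if
 * the value does not fit in an int. `*out` is written only on success. */
int parse_int_checked(const char *s, int *out)
{
    char *end;

    errno = 0;
    long value = strtol(s, &end, 10);
    if (end == s)
        return 0;                       /* nothing was converted */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return -1;                      /* overflow detected, no undefined behavior */
    *out = (int)value;
    return 1;
}
```

Unlike scanf("%d", ...), an overlong digit string here produces a clean -1 and leaves the output variable untouched.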
TL;DR
fgets is for getting the input. sscanf is for parsing it afterwards. scanf tries to do both at the same time. That's a recipe for trouble. Read first and parse later.
Why is scanf bad?
The main problem is that scanf was never intended to deal with user input. It's intended to be used with "perfectly" formatted data. I quoted the word "perfectly" because it's not completely true. But it is not designed to parse data that are as unreliable as user input. By nature, user input is not predictable. Users misunderstand instructions, make typos, accidentally press enter before they are done, etc. One might reasonably ask why a function that should not be used for user input reads from stdin. If you are an experienced *nix user the explanation will not come as a surprise, but it might confuse Windows users. In *nix systems, it is very common to build programs that work via piping, which means that you send the output of one program to another by piping the stdout of the first program to the stdin of the second. This way, you can make sure that the output and input are predictable. Under these circumstances, scanf actually works well. But when working with unpredictable input, you risk all sorts of trouble.
So why aren't there any easy-to-use standard functions for user input? One can only guess here, but I assume that old hardcore C hackers simply thought that the existing functions were good enough, even though they are very clunky. Also, when you look at typical terminal applications they very rarely read user input from stdin. Most often you pass all the user input as command line arguments. Sure, there are exceptions, but for most applications, user input is a very minor thing.
So what can you do?
First of all, gets is NOT an alternative. It's dangerous and should NEVER be used. Read here why: Why is the gets function so dangerous that it should not be used?
My favorite is fgets in combination with sscanf. I once wrote an answer about that, but I will re-post the complete code. Here is an example with decent (but not perfect) error checking and parsing. It's good enough for debugging purposes.
Note
I don't particularly like asking the user to input two different things on one single line. I only do that when they belong to each other in a natural way. Like for instance printf("Enter the price in the format <dollars>.<cent>: "); fgets(buffer, bsize, stdin); and then use sscanf(buffer, "%d.%d", &dollar, &cent). I would never do something like printf("Enter height and base of the triangle: "). The main point of using fgets below is to encapsulate the inputs to ensure that one input does not affect the next.
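#include <stdio.h>  // printf, fprintf, fgets, sscanf, fflush
#include <stdlib.h> // exit, EXIT_FAILURE
#define bsize 100
void error_function(const char *buffer, int no_conversions) {
    fprintf(stderr, "An error occurred. You entered:\n%s\n", buffer);
    fprintf(stderr, "%d successful conversions\n", no_conversions);
    exit(EXIT_FAILURE);
}
char c, buffer[bsize] = ""; // start empty so error_function can print it even if the first fgets fails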
int x,y;
float f, g;
int r;
printf("Enter two integers: ");
fflush(stdout); // Make sure that the printf is executed before reading
if(! fgets(buffer, bsize, stdin)) error_function(buffer, 0);
if((r = sscanf(buffer, "%d%d", &x, &y)) != 2) error_function(buffer, r);
// Unless the input buffer was to small we can be sure that stdin is empty
// when we come here.
printf("Enter two floats: ");
fflush(stdout);
if(! fgets(buffer, bsize, stdin)) error_function(buffer, 0);
if((r = sscanf(buffer, "%f%f", &f, &g)) != 2) error_function(buffer, r);
// Reading single characters can be especially tricky if the input buffer
// is not emptied before. But since we're using fgets, we're safe.
printf("Enter a char: ");
fflush(stdout);
if(! fgets(buffer, bsize, stdin)) error_function(buffer, 0);
if((r = sscanf(buffer, "%c", &c)) != 1) error_function(buffer, r);
printf("You entered %d %d %f %c\n", x, y, f, c);
If you do a lot of these, I would recommend creating a wrapper that always flushes:
#include <stdarg.h>
#include <stdio.h>
int printfflush (const char *format, ...)
{
va_list arg;
int done;
va_start (arg, format);
done = vfprintf (stdout, format, arg);
fflush(stdout);
va_end (arg);
return done;
}
Doing it like this will eliminate a common problem, which is the trailing newline that can mess with the next input. But it has another issue, which is what happens if the line is longer than bsize. You can check that with if(buffer[strlen(buffer)-1] != '\n'). If you want to remove the newline, you can do that with buffer[strcspn(buffer, "\n")] = 0.
In general, I would advise to not expect the user to enter input in some weird format that you should parse to different variables. If you want to assign the variables height and width, don't ask for both at the same time. Allow the user to press enter between them. Also, this approach is very natural in one sense. You will never get the input from stdin until you hit enter, so why not always read the whole line? Of course this can still lead to issues if the line is longer than the buffer. Did I remember to mention that user input is clunky in C? :)
To avoid problems with lines longer than the buffer, you can use a function that automatically allocates a buffer of appropriate size, such as getline(). The drawback is that you will need to free the result afterwards. This function is not guaranteed to exist by the standard, but POSIX has it. You could also implement your own, or find one on SO: How can I read an input string of unknown length?
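For the "implement your own" route, here is a minimal sketch in standard C only. The name read_line and its growth strategy (start at 64 bytes, double when full) are my own assumptions for illustration, not something taken from getline() or from the linked question.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line of any length from `in`.
   Returns a malloc'd string without the trailing newline, or NULL on
   end-of-file with nothing read, on a read error, or if allocation fails.
   The caller must free() the result. */
char *read_line(FILE *in)
{
    assert(in != NULL);
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    while (fgets(buf + len, (int)(cap - len), in) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';           /* complete line: strip the newline */
            return buf;
        }
        if (len + 1 == cap) {              /* buffer is full: double it */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    if (len == 0 || ferror(in)) {          /* nothing read, or a read error */
        free(buf);
        return NULL;
    }
    return buf;                            /* last line had no newline */
}
```

You would call it as char *s = read_line(stdin); and free(s) when you are done. Note that it does not cope with NUL bytes embedded in the input.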
Stepping up the game
If you're serious about creating programs in C with user input, I would recommend having a look at a library like ncurses, because you will likely also want to create applications with some terminal graphics. Unfortunately, you will lose some portability if you do that, but it gives you far better control of user input. For instance, it gives you the ability to read a key press instantly instead of waiting for the user to press enter.
Interesting reading
Here is a rant about scanf: https://web.archive.org/web/20201112034702/http://sekrit.de/webdocs/c/beginners-guide-away-from-scanf.html
scanf is awesome when you know your input is always well-structured and well-behaved. Otherwise...
IMO, here are the biggest problems with scanf:
Risk of buffer overflow - if you do not specify a field width for the %s and %[ conversion specifiers, you risk a buffer overflow (trying to read more input than a buffer is sized to hold). Unfortunately, there's no good way to specify that as an argument (as with printf) - you have to either hardcode it as part of the conversion specifier or do some macro shenanigans.
Accepts inputs that should be rejected - If you're reading an input with the %d conversion specifier and you type something like 12w4, you would expect scanf to reject that input, but it doesn't - it successfully converts and assigns the 12, leaving w4 in the input stream to foul up the next read.
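To make the "macro shenanigans" concrete, here is a small sketch of one common way to do it: stringify the buffer size with the preprocessor and paste it into the format string. The names WORD_LEN, STRINGIFY and read_word are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

#define WORD_LEN 15                 /* maximum characters stored, excluding '\0' */
#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)  /* expands WORD_LEN before stringifying it */

/* Hypothetical helper: copies the first whitespace-delimited word of `src`
   into `dst`, never storing more than WORD_LEN characters plus the '\0'.
   Returns whatever sscanf returns: 1 if a word was stored. */
int read_word(const char *src, char dst[WORD_LEN + 1])
{
    assert(src != NULL && dst != NULL);
    /* The three adjacent literals are concatenated into "%15s". */
    return sscanf(src, "%" STRINGIFY(WORD_LEN) "s", dst);
}
```

The field width counts characters only, which is why the array needs WORD_LEN + 1 elements. Anything beyond the width stays unread in the source string, so the overflow is prevented but the "leftover input" problem remains.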
So, what should you use instead?
I usually recommend reading all interactive input as text using fgets - it allows you to specify a maximum number of characters to read at a time, so you can easily prevent buffer overflow:
char input[100];
if ( !fgets( input, sizeof input, stdin ) )
{
// error reading from input stream, handle as appropriate
}
else
{
// process input buffer
}
One quirk of fgets is that it will store the trailing newline in the buffer if there's room, so you can do an easy check to see if someone typed in more input than you were expecting:
char *newline = strchr( input, '\n' );
if ( !newline )
{
// input longer than we expected
}
How you deal with that is up to you - you can either reject the whole input out of hand, and slurp up any remaining input with getchar:
int c;
while ( ( c = getchar() ) != '\n' && c != EOF )
; // empty loop; also stops at end-of-file instead of spinning forever
Or you can process the input you got so far and read again. It depends on the problem you're trying to solve.
To tokenize the input (split it up based on one or more delimiters), you can use strtok, but beware - strtok modifies its input (it overwrites delimiters with the string terminator), and you can't preserve its state (i.e., you can't partially tokenize one string, then start to tokenize another, then pick up where you left off in the original string). There's a variant, strtok_s, that preserves the state of the tokenizer, but AFAIK its implementation is optional (you'll need to check that __STDC_LIB_EXT1__ is defined to see if it's available).
Once you've tokenized your input, if you need to convert strings to numbers (i.e., "1234" => 1234), you have options. strtol and strtod will convert string representations of integers and real numbers to their respective types. They also allow you to catch the 12w4 issue I mentioned above - one of their arguments is a pointer to the first character not converted in the string:
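const char *text = "12w4";
char *chk;
long val = 0;
long tmp = strtol( text, &chk, 10 ); // needs <stdlib.h>; isspace needs <ctype.h>
if ( chk == text || ( !isspace( (unsigned char)*chk ) && *chk != 0 ) )
{
// no digits at all, or junk after the number: reject the entire input
}
else
{
val = tmp;
}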
In this answer I'm going to assume that you are reading and
interpreting lines of text.
Perhaps you're prompting the user, who is typing something and
hitting RETURN. Or perhaps you're reading lines of structured
text from a data file of some kind.
Since you're reading lines of text, it makes sense to organize
your code around a library function that reads, well, a line of
text.
The Standard function is fgets(), although there are others (including getline). And then the next step is to interpret
that line of text somehow.
Here's the basic recipe for calling fgets to read a line of
text:
char line[512];
printf("type something:\n");
fgets(line, 512, stdin);
printf("you typed: %s", line);
This simply reads in one line of text and prints it back out.
As written it has a couple of limitations, which we'll get to in
a minute. It also has a very great feature: that number 512 we
passed as the second argument to fgets is the size of the array
line we're asking fgets to read into. This fact -- that we can
tell fgets how much it's allowed to read -- means that we can
be sure that fgets won't overflow the array by reading too much
into it.
So now we know how to read a line of text, but what if we really
wanted to read an integer, or a floating-point number, or a
single character, or a single word? (That is, what if the
scanf call we're trying to improve on had been using a format
specifier like %d, %f, %c, or %s?)
It's easy to reinterpret a line of text -- a string -- as any of these things.
To convert a string to an integer, the simplest (though
imperfect) way to do it is to call atoi().
To convert to a floating-point number, there's atof().
(And there are also better ways, as we'll see in a minute.)
Here's a very simple example:
printf("type an integer:\n");
fgets(line, 512, stdin);
int i = atoi(line);
printf("type a floating-point number:\n");
fgets(line, 512, stdin);
float f = atof(line);
printf("you typed %d and %f\n", i, f);
If you wanted the user to type a single character (perhaps y or
n as a yes/no response), you can literally just grab the first
character of the line, like this:
printf("type a character:\n");
fgets(line, 512, stdin);
char c = line[0];
printf("you typed %c\n", c);
(This ignores, of course, the possibility that the user typed a
multi-character response; it quietly ignores any extra characters
that were typed.)
Finally, if you wanted the user to type a string definitely not containing
whitespace, if you wanted to treat the input line
hello world!
as the string "hello" followed by something else (which is what
the scanf format %s would have done), well, in that case, I
fibbed a little, it's not quite so easy to reinterpret the line
in that way, after all, so the answer to that part of the question will have
to wait for a bit.
But first I want to go back to three things I skipped over.
(1) We've been calling
fgets(line, 512, stdin);
to read into the array line, and where 512 is the size of the
array line so fgets knows not to overflow it. But to make
sure that 512 is the right number (especially, to check if maybe
someone tweaked the program to change the size), you have to read
back to wherever line was declared. That's a nuisance, so
there are two much better ways to keep the sizes in sync.
You could, (a) use the preprocessor to make a name for the size:
#define MAXLINE 512
char line[MAXLINE];
fgets(line, MAXLINE, stdin);
Or, (b) use C's sizeof operator:
fgets(line, sizeof(line), stdin);
(2) The second problem is that we haven't been checking for
error. When you're reading input, you should always check for
the possibility of error. If for whatever reason fgets can't
read the line of text you asked it to, it indicates this by
returning a null pointer. So we should have been doing things like
printf("type something:\n");
if(fgets(line, 512, stdin) == NULL) {
printf("Well, never mind, then.\n");
exit(1);
}
Finally, there's the issue that in order to read a line of text,
fgets reads characters and fills them into your array until it
finds the \n character that terminates the line, and it fills
the \n character into your array, too. You can see this if
you modify our earlier example slightly:
printf("you typed: \"%s\"\n", line);
If I run this and type "Steve" when it prompts me, it prints out
you typed: "Steve
"
That " on the second line is because the string it read and
printed back out was actually "Steve\n".
Sometimes that extra newline doesn't matter (like when we called
atoi or atof, since they both ignore any extra non-numeric
input after the number), but sometimes it matters a lot. So
often we'll want to strip that newline off. There are several
ways to do that, which I'll get to in a minute. (I know I've been
saying that a lot. But I will get back to all those things, I promise.)
At this point, you may be thinking: "I thought you said scanf
was no good, and this other way would be so much better.
But fgets is starting to look like a nuisance.
Calling scanf was so easy! Can't I keep using it?"
Sure, you can keep using scanf, if you want. (And for really
simple things, in some ways it is simpler.) But, please, don't
come crying to me when it fails you due to one of its 17 quirks
and foibles, or goes into an infinite loop because of input you
didn't expect, or when you can't figure out how to use it to do
something more complicated. And let's take a look at fgets's
actual nuisances:
You always have to specify the array size. Well, of course,
that's not a nuisance at all -- that's a feature, because buffer
overflow is a Really Bad Thing.
You have to check the return value. Actually, that's a wash,
because to use scanf correctly, you have to check its return
value, too.
You have to strip the \n back off. This is, I admit, a true
nuisance. I wish there were a Standard function I could point
you to that didn't have this little problem. (Please nobody
bring up gets.) But compared to scanf's 17 different
nuisances, I'll take this one nuisance of fgets any day.
So how do you strip that newline? There are many ways:
(a) Obvious way:
char *p = strchr(line, '\n');
if(p != NULL) *p = '\0';
(b) Tricky & compact way:
strtok(line, "\n");
Unfortunately this doesn't work quite right on empty lines.
(c) Another compact and mildly obscure way:
line[strcspn(line, "\n")] = '\0';
And there are other ways as well. Me, I always just use (a), since it's simple & obvious, if less than concise.
See this question, or this question, for more (much more) on stripping the \n from what fgets gives you.
And now that that's out of the way, we can get back to another
thing I skipped over: the imperfections of atoi() and atof().
The problem with those is they don't give you any useful
indication of success or failure: they quietly ignore
trailing nonnumeric input, and they quietly return 0 if there's
no numeric input at all. The preferred alternatives -- which
also have certain other advantages -- are strtol and strtod.
strtol also lets you use a base other than 10, meaning you can
get the effect of (among other things) %o or %x with scanf.
But showing how to use these functions correctly is a story in itself,
and would be too much of a distraction from what is already turning
into a pretty fragmented narrative, so I'm not going to say
anything more about them now.
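For readers who want a starting point anyway, here is a minimal sketch of the usual strtod pattern: check the end pointer, check errno, and check what follows the number. The helper name parse_double and the "trailing whitespace is fine, anything else is an error" policy are assumptions of mine; strtol is used the same way, with an extra base argument.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: converts the whole of `s` to a double.
   Returns 1 on success and stores the value in *out; returns 0 if there
   is no number, if the value is out of range, or if anything other than
   whitespace follows the number. */
int parse_double(const char *s, double *out)
{
    assert(s != NULL && out != NULL);
    char *end;

    errno = 0;                      /* strtod only sets errno, never clears it */
    double v = strtod(s, &end);
    if (end == s)
        return 0;                   /* nothing was converted */
    if (errno == ERANGE)
        return 0;                   /* overflow or underflow */
    while (isspace((unsigned char)*end))
        end++;                      /* allow the '\n' that fgets leaves behind */
    if (*end != '\0')
        return 0;                   /* trailing junk such as the "w4" in "12w4" */
    *out = v;
    return 1;
}
```

Unlike atof, this can tell "0" apart from "no number at all", and it rejects input with trailing garbage instead of silently ignoring it.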
The rest of the main narrative concerns input you might be trying
to parse that's more complicated than just a single number or
character. What if you want to read a line containing two
numbers, or multiple whitespace-separated words, or specific
framing punctuation? That's where things get interesting, and
where things were probably getting complicated if you were trying
to do things using scanf, and where there are vastly more
options now that you've cleanly read one line of text using fgets,
although the full story on all those options could probably fill
a book, so we're only going to be able to scratch the surface here.
My favorite technique is to break the line up into
whitespace-separated "words", then do something further with each
"word". One principal Standard function for doing this is
strtok (which also has its issues, and which also rates a whole
separate discussion). My own preference is a dedicated function
for constructing an array of pointers to each broken-apart
"word", a function I describe in
these course notes.
At any rate, once you've got "words", you can further process
each one, perhaps with the same atoi/atof/strtol/strtod
functions we've already looked at.
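As a rough illustration of the "array of pointers to words" idea, here is a sketch built on strtok. It is not the function from the course notes mentioned above; the name split_words and its signature are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits `line` in place into whitespace-separated
   words and stores up to `max` pointers to them in `words`.
   Returns the number of words stored. `line` is modified, because
   strtok overwrites each delimiter it finds with '\0'. */
size_t split_words(char *line, char *words[], size_t max)
{
    assert(line != NULL && words != NULL);
    size_t n = 0;

    for (char *tok = strtok(line, " \t\r\n");
         tok != NULL && n < max;
         tok = strtok(NULL, " \t\r\n"))
    {
        words[n++] = tok;
    }
    return n;
}
```

Because "\n" is in the delimiter set, the newline left by fgets is stripped as a side effect. The pointers in words[] point into line, so they are only valid as long as line is.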
Paradoxically, even though we've been spending a fair amount of
time and effort here figuring out how to move away from scanf,
another fine way to deal with the line of text we just read with
fgets is to pass it to sscanf. In this way, you end up with
most of the advantages of scanf, but without most of the
disadvantages.
If your input syntax is particularly complicated, it might be appropriate to use a "regexp" library to parse it.
Finally, you can use whatever ad hoc parsing solutions suit
you. You can move through the line a character at a time with a
char * pointer checking for characters you expect. Or you can
search for specific characters using functions like strchr or strrchr,
or strspn or strcspn, or strpbrk. Or you can parse/convert
and skip over groups of digit characters using the strtol or
strtod functions that we skipped over earlier.
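Here is one small example of such ad hoc parsing, and it also answers the "hello world!" question I postponed earlier: pulling out the first whitespace-delimited word with strspn and strcspn. The function name first_word and the choice to truncate silently are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the first whitespace-delimited word of
   `line` into `out` (at most outsize - 1 characters plus '\0'), roughly
   what the scanf format %s would have extracted.
   Returns the number of characters stored. */
size_t first_word(const char *line, char *out, size_t outsize)
{
    assert(line != NULL && out != NULL && outsize > 0);
    const char *ws = " \t\r\n";
    const char *start = line + strspn(line, ws);   /* skip leading whitespace */
    size_t len = strcspn(start, ws);               /* length of the word */

    if (len >= outsize)
        len = outsize - 1;                         /* truncate to fit */
    memcpy(out, start, len);
    out[len] = '\0';
    return len;
}
```

A return value of 0 means the line held no word at all, which is something %s cannot report so directly.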
There's obviously much more that could be said, but hopefully
this introduction will get you started.
What can I use to parse input instead of scanf?
Instead of scanf(some_format, ...), consider fgets() with sscanf(buffer, some_format_and %n, ...)
By using " %n", code can simply detect if all the format was successfully scanned and that no extra non-white-space junk was at the end.
// scanf("%d %f fred", &some_int, &some_float);
#define EXPECTED_LINE_MAX 100
char buffer[EXPECTED_LINE_MAX * 2]; // Suggest 2x, no real need to be stingy.
if (fgets(buffer, sizeof buffer, stdin)) {
int n = 0;
// add ----------------> " %n" -----------------------, &n
sscanf(buffer, "%d %f fred %n", &some_int, &some_float, &n);
// Did scan complete, and to the end?
if (n > 0 && buffer[n] == '\0') {
// success, use `some_int, some_float`
} else {
; // Report bad input and handle as desired.
}
}
Let's state the requirements of parsing as:
valid input must be accepted (and converted into some other form)
invalid input must be rejected
when any input is rejected, it is necessary to provide the user with a descriptive message that explains (in clear "easily understood by normal people who are not programmers" language) why it was rejected (so that people can figure out how to fix the problem)
To keep things very simple, let's consider parsing a single simple decimal integer (that was typed in by the user) and nothing else. Possible reasons for the user's input to be rejected are:
the input contained unacceptable characters
the input represents a number that is lower than the accepted minimum
the input represents a number that is higher than the accepted maximum
the input represents a number that has a non-zero fractional part
Let's also define "input contained unacceptable characters" properly; and say that:
leading whitespace and trailing whitespace will be ignored (e.g. " 5 " will be treated as "5")
zero or one decimal point is allowed (e.g. "1234." and "1234.000" are both treated the same as "1234")
there must be at least one digit (e.g. "." is rejected)
no more than one decimal point is allowed (e.g. "1.2.3" is rejected)
commas that are not between digits will be rejected (e.g. ",1234" is rejected)
commas that are after a decimal point will be rejected (e.g. "1234.000,000" is rejected)
commas that are after another comma are rejected (e.g. "1,,234" is rejected)
all other commas will be ignored (e.g. "1,234" will be treated as "1234")
a minus sign that is not the first non-whitespace character is rejected
a positive sign that is not the first non-whitespace character is rejected
From this we can determine that the following error messages are needed:
"Unknown character at start of input"
"Unknown character at end of input"
"Unknown character in middle of input"
"Number is too low (minimum is ....)"
"Number is too high (maximum is ....)"
"Number is not an integer"
"Too many decimal points"
"No decimal digits"
"Bad comma at start of number"
"Bad comma at end of number"
"Bad comma in middle of number"
"Bad comma after decimal point"
From this point we can see that a suitable function to convert a string into an integer would need to distinguish between very different types of errors; and that something like "scanf()" or "atoi()" or "strtoll()" is completely and utterly worthless because they fail to give you any indication of what was wrong with the input (and use a completely irrelevant and inappropriate definition of what is/isn't "valid input").
Instead, let's start writing something that isn't useless:
#include <stdio.h>
#include <stdlib.h>
char *convertStringToInteger(int *outValue, char *string, int minValue, int maxValue) {
return "Code not implemented yet!";
}
int main(int argc, char *argv[]) {
char *errorString;
int value;
if(argc < 2) {
printf("ERROR: No command line argument.\n");
return EXIT_FAILURE;
}
errorString = convertStringToInteger(&value, argv[1], -10, 2000);
if(errorString != NULL) {
printf("ERROR: %s\n", errorString);
return EXIT_FAILURE;
}
printf("SUCCESS: Your number is %d\n", value);
return EXIT_SUCCESS;
}
To meet the stated requirements; this convertStringToInteger() function is likely to end up being several hundred lines of code all by itself.
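To give a feel for the shape of such a function, here is a deliberately reduced sketch. It handles only whitespace, an optional leading sign and plain digits; commas, decimal points and the formatted "(minimum is ....)" part of the messages are left out, so it covers only a fraction of the requirements listed above. Everything in it beyond the function signature is my own assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Partial sketch: returns NULL on success, otherwise an error message.
   Commas and decimal points are NOT handled; they are simply reported
   as unknown characters. */
char *convertStringToInteger(int *outValue, char *string, int minValue, int maxValue)
{
    assert(outValue != NULL && string != NULL && minValue <= maxValue);
    const char *p = string;
    long long value = 0;
    int negative = 0, digits = 0;

    while (isspace((unsigned char)*p))
        p++;
    if (*p == '+' || *p == '-')
        negative = (*p++ == '-');
    if (*p != '\0' && !isdigit((unsigned char)*p))
        return "Unknown character at start of input";

    for (; isdigit((unsigned char)*p); p++, digits++) {
        value = value * 10 + (*p - '0');
        /* Check the range on every digit so `value` can never overflow. */
        if (!negative && value > maxValue)
            return "Number is too high";
        if (negative && -value < minValue)
            return "Number is too low";
    }
    if (digits == 0)
        return "No decimal digits";

    while (isspace((unsigned char)*p))
        p++;                                   /* trailing whitespace is fine */
    if (*p != '\0') {
        /* Junk after the number: is it followed by more digits or not? */
        for (const char *q = p; *q != '\0'; q++)
            if (isdigit((unsigned char)*q))
                return "Unknown character in middle of input";
        return "Unknown character at end of input";
    }

    if (negative)
        value = -value;
    if (value < minValue)
        return "Number is too low";
    if (value > maxValue)
        return "Number is too high";
    *outValue = (int)value;
    return NULL;
}
```

Even this cut-down version needs several distinct exits, and it still reports a range error before it notices trailing junk in something like "99999abc". Adding the comma and decimal-point rules is where the line count really starts to climb.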
Now, this was just "parsing a single simple decimal integer". Imagine if you wanted to parse something complex; like a list of "name, street address, phone number, email address" structures; or maybe like a programming language. For these cases you might need to write thousands of lines of code to create a parser that isn't a crippled joke.
In other words...
What can I use to parse input instead of scanf?
Write (potentially thousands of lines of) code yourself, to suit your requirements.
Here is an example of using flex to scan a simple input, in this case a file of ASCII floating point numbers that might be in either US (n,nnn.dd) or European (n.nnn,dd) formats. This is just copied from a much larger program, so there may be some unresolved references:
/* This scanner reads a file of numbers, expecting one number per line. It */
/* allows for the use of European-style comma as decimal point. */
%{
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#ifdef WINDOWS
#include <io.h>
#endif
#include "Point.h"
#define YY_NO_UNPUT
#define YY_DECL int f_lex (double *val)
double atofEuro (char *);
%}
%option prefix="f_"
%option nounput
%option noinput
EURONUM [-+]?[0-9]*[,]?[0-9]+([eE][+-]?[0-9]+)?
NUMBER [-+]?[0-9]*[\.]?[0-9]+([eE][+-]?[0-9]+)?
WS [ \t\x0d]
%%
[!##%&*/].*\n
^{WS}*{EURONUM}{WS}* { *val = atofEuro (yytext); return (1); }
^{WS}*{NUMBER}{WS}* { *val = atof (yytext); return (1); }
[\n]
.
%%
/*------------------------------------------------------------------------*/
int scan_f (FILE *in, double *vals, int max)
{
double *val;
int npts, rc;
f_in = in;
val = vals;
npts = 0;
while (npts < max)
{
rc = f_lex (val);
if (rc == 0)
break;
npts++;
val++;
}
return (npts);
}
/*------------------------------------------------------------------------*/
int f_wrap ()
{
return (1);
}
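One of the unresolved references is atofEuro, which is only declared in the scanner. The original implementation is not shown here, so the following is a guess at what it might look like: copy the token, turn the decimal comma into a point, and hand the result to strtod. It assumes the text matched EURONUM, i.e. at most one comma and no thousands separators.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical implementation of the atofEuro() declared in the scanner.
   Leading whitespace in `text` is fine because strtod skips it. */
double atofEuro (char *text)
{
    char buf[64];
    size_t len;
    char *comma;

    assert(text != NULL);
    len = strlen(text);
    if (len >= sizeof buf)
        len = sizeof buf - 1;          /* truncate overly long tokens */
    memcpy(buf, text, len);            /* work on a copy, leave yytext alone */
    buf[len] = '\0';

    comma = strchr(buf, ',');
    if (comma != NULL)
        *comma = '.';                  /* "3,14" becomes "3.14" */

    /* strtod expects '.' in the "C" locale, which is what a program gets
       unless it calls setlocale(). */
    return strtod(buf, NULL);
}
```

If the program does switch to a locale that uses a decimal comma, the replacement step would have to go, which is one reason to treat this as a sketch rather than a drop-in.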
One of the most common uses of scanf is to read a single int as input from the user. Therefore, my answer will focus on this one problem only.
Here is an example of how scanf is commonly used for reading an int from the user:
int num;
printf( "Please enter an integer: " );
if ( scanf( "%d", &num ) != 1 )
{
printf( "Error converting input!\n" );
}
else
{
printf( "The input was successfully converted to %d.\n", num );
}
Using scanf in this manner has several problems:
The function scanf will not always read a whole line of input.
If the input conversion fails due to the user entering bad input such as abc, then the bad input will be left on the input stream. If this bad input is not discarded afterwards, then all further calls to scanf with the %d format specifier will immediately fail, without waiting for the user to enter further input. This may cause an infinite loop.
Even if the input conversion succeeds, any trailing bad input will be left on the input stream. For example, if the user enters 6abc, then scanf will successfully convert the 6, but leave abc on the input stream. If this input is not discarded, then we will once again have the problem of all further calls to scanf with the %d format specifier immediately failing, which may cause an infinite loop.
Even in the case of the input succeeding and the user not entering any trailing bad input, the mere fact that scanf generally leaves the newline character on the input stream can cause trouble, as demonstrated in this question.
Another issue with using scanf with the %d format specifier is that if the result of the conversion is not representable as an int (e.g. if the result is larger than INT_MAX), then, according to §7.21.6.2 ¶10 of the ISO C11 standard, the behavior of the program is undefined, which means that you cannot rely on any specific behavior.
In order to solve all of the issues mentioned above, it is generally better to use the function fgets, which will always read an entire line of input at once, if possible. This function will read the input as a string. After doing this, you can use the function strtol to attempt to convert the string to an integer. Here is an example program:
#include <stdio.h>
#include <stdlib.h>
int main( void )
{
char line[200], *p;
int num;
//prompt user for input
printf( "Enter a number: " );
//attempt to read one line of input
if ( fgets( line, sizeof line, stdin ) == NULL )
{
printf( "Input failure!\n" );
exit( EXIT_FAILURE );
}
//attempt to convert string to integer
num = strtol( line, &p, 10 );
if ( p == line )
{
printf( "Unable to convert to integer!\n" );
exit( EXIT_FAILURE );
}
//print result
printf( "Conversion successful! The number is %d.\n", num );
}
However, this code has the following issues:
It does not check whether the input line was too long to fit into the buffer.
It does not check whether the converted number is representable as an int, for example whether the number is too large to be stored in an int.
It will accept 6abc as valid input for the number 6. This is not as bad as scanf, because scanf will leave abc on the input stream, whereas fgets will not. However, it would probably still be better to reject the input instead of accepting it.
All of these issues can be solved by doing the following:
Issue #1 can be solved by checking
whether the input buffer contains a newline character, or
whether end-of-file has been reached, which can be treated as equivalent to a newline character, because it also indicates the end of the line.
Issue #2 can be solved by checking whether the function strtol set errno to the value of the macro constant ERANGE, to determine whether the converted value is representable as a long. In order to determine whether this value is also representable as an int, the value returned by strtol should be compared against INT_MIN and INT_MAX.
Issue #3 can be solved by checking all remaining characters on the line. Since strtol accepts leading whitespace characters, it would probably also be appropriate to accept trailing whitespace characters. However, if the input contains any other trailing characters, the input should probably be rejected.
Here is an improved version of the code, which solves all of the issues mentioned above and also puts everything into a function named get_int_from_user. This function will automatically reprompt the user for input, until the input is valid.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <limits.h>
#include <errno.h>
int get_int_from_user( const char *prompt )
{
//loop forever until user enters a valid number
for (;;)
{
char buffer[1024], *p;
long l;
//prompt user for input
fputs( prompt, stdout );
//get one line of input from input stream
if ( fgets( buffer, sizeof buffer, stdin ) == NULL )
{
fprintf( stderr, "Unrecoverable input error!\n" );
exit( EXIT_FAILURE );
}
//make sure that entire line was read in (i.e. that
//the buffer was not too small)
if ( strchr( buffer, '\n' ) == NULL && !feof( stdin ) )
{
int c;
printf( "Line input was too long!\n" );
//discard remainder of line
do
{
c = getchar();
if ( c == EOF )
{
fprintf( stderr, "Unrecoverable error reading from input!\n" );
exit( EXIT_FAILURE );
}
} while ( c != '\n' );
continue;
}
//attempt to convert string to number
errno = 0;
l = strtol( buffer, &p, 10 );
if ( p == buffer )
{
printf( "Error converting string to number!\n" );
continue;
}
//make sure that number is representable as an "int"
if ( errno == ERANGE || l < INT_MIN || l > INT_MAX )
{
printf( "Number out of range error!\n" );
continue;
}
//make sure that remainder of line contains only whitespace,
//so that input such as "6abc" gets rejected
for ( ; *p != '\0'; p++ )
{
if ( !isspace( (unsigned char)*p ) )
{
printf( "Unexpected input encountered!\n" );
//cannot use `continue` here, because that would go to
//the next iteration of the innermost loop, but we
//want to go to the next iteration of the outer loop
goto continue_outer_loop;
}
}
return l;
continue_outer_loop:
continue;
}
}
int main( void )
{
int number;
number = get_int_from_user( "Enter a number: " );
printf( "Input was valid.\n" );
printf( "The number is: %d\n", number );
return 0;
}
This program has the following behavior:
Enter a number: abc
Error converting string to number!
Enter a number: 6000000000
Number out of range error!
Enter a number: 6 7 8
Unexpected input encountered!
Enter a number: 6abc
Unexpected input encountered!
Enter a number: 6
Input was valid.
The number is: 6
Other answers give the right low-level details, so I'll limit myself to a higher-level view. First, analyse what you expect each input line to look like. Try to describe the input with a formal syntax - with luck, you will find it can be described using a regular grammar, or at least a context-free grammar. If a regular grammar suffices, then you can code up a finite-state machine which recognizes and interprets each input line one character at a time. Your code will then read a line (as explained in other replies), then scan the chars in the buffer through the state-machine. At certain states you stop and convert the substring scanned thus far to a number or whatever. You can probably 'roll your own' if it is this simple; if you find you require a full context-free grammar you are better off figuring out how to use existing parsing tools (such as lex and yacc or their variants).
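As a small illustration of the finite-state-machine idea, here is a sketch that recognizes one very simple regular grammar: optional whitespace, an optional sign, one or more digits, optional whitespace. The state names and the function is_integer_line are invented for this example; a real input format would have more states and would also record where each token starts and ends.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum state { START, SIGN, DIGITS, TRAILING, REJECT };

/* Hypothetical recognizer for the regular grammar
       line := ws* [+-]? digit+ ws*
   It walks the buffer one character at a time and returns 1 if the
   whole line matches, 0 otherwise. */
int is_integer_line(const char *line)
{
    assert(line != NULL);
    enum state st = START;

    for (const char *p = line; *p != '\0' && st != REJECT; p++) {
        unsigned char c = (unsigned char)*p;
        switch (st) {
        case START:
            if (isspace(c))                 st = START;
            else if (c == '+' || c == '-')  st = SIGN;
            else if (isdigit(c))            st = DIGITS;
            else                            st = REJECT;
            break;
        case SIGN:
            st = isdigit(c) ? DIGITS : REJECT;
            break;
        case DIGITS:
            if (isdigit(c))                 st = DIGITS;
            else if (isspace(c))            st = TRAILING;
            else                            st = REJECT;
            break;
        case TRAILING:
            st = isspace(c) ? TRAILING : REJECT;
            break;
        case REJECT:
            break;
        }
    }
    return st == DIGITS || st == TRAILING;   /* the accepting states */
}
```

Once a line is accepted, converting it is the easy part (strtol, for instance). The state machine's job is to decide whether the line has the expected shape in the first place.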

how to find out if there is a newline or number in c?

I have this assignment where I have to read till the "?" char and then check if it is followed by a number and a newline, or by a newline, then the number, and then again a newline.
I checked the first char after the "?"
if (scanf("%c",c)=='\n') ...;
but that only works if the first one is a newline, and when it isn't and I want to read the number instead, it cuts the first digit ... for example, it doesn't read 133 but only 33
... how do I do this?
I also tried putting the char back, but that wouldn't work
please help :)
One advantage of getline over either fgets (or a distant scanf) is that getline returns the actual number of characters successfully read. This allows a simple check for a newline at the end by using the return to getline. For example:
char *line = NULL; /* getline allocates the buffer when line is NULL */
size_t n = 0;
ssize_t nchr;
while ((nchr = getline (&line, &n, stdin)) != -1)
{
if (nchr > 0 && line[nchr - 1] == '\n') /* check whether the last character is newline */
line[--nchr] = 0; /* replace the newline with null-termination */
/* while decrementing nchr to new length */
}
free (line);
Use fgets(3), or better yet, getline(3), to read the entire line, then parse the line using strtol(3) or sscanf(3).
Don't forget to carefully read the documentation of every function you are using. Handle the error cases - perhaps using perror then exit to show a meaningful message. Notice that scanf and sscanf return the number of scanned items, and know about %n, and that strtol can set some end pointer.
Remember that on some OSes (e.g. Linux), the terminal is a tty and is often line-buffered by the kernel; so nothing is sent to your program until you press the return key (you could do raw input on a terminal, but that is OS specific; consider also readline on Linux).
this line: if (scanf("%c",c)=='\n') ...; will NEVER work.
scanf returns a value that indicates the number of successful parameter conversions.
suggest:
// note: 'c' must be defined as int, not char
// for several reasons including:
// 1) getchar returns an int
// 2) EOF is an int value that must stay distinguishable from every valid character
// 3) plain char may be unsigned, in which case it could never compare equal to EOF
if( '\n' == (c = getchar() ) )
{ // then found newline
...
#include <stdio.h>
int main (void){
int num;
scanf("%*[^?]?");//read till the "?"
while(1==scanf("%d", &num)){
printf("%d\n", num);
}
return 0;
}
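The "put the character back" approach from the question can be made to work with ungetc. Below is a sketch that follows the assignment as I read it: skip to the first '?', then accept either the number followed by a newline, or a newline, the number, and a newline. The function name and the decision to take a FILE * (so it can be tried on a file as well as on stdin) are my own choices.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: skips input up to and including the first '?',
   then accepts either "<number>\n" or "\n<number>\n".
   Returns 1 and stores the number on success, 0 otherwise. */
int read_number_after_question_mark(FILE *in, int *num)
{
    assert(in != NULL && num != NULL);
    int c;                               /* int, so that EOF is representable */

    while ((c = getc(in)) != EOF && c != '?')
        ;                                /* skip everything before the '?' */
    if (c == EOF)
        return 0;

    c = getc(in);
    if (c == '\n')
        c = getc(in);                    /* optional newline right after '?' */
    if (c == EOF)
        return 0;
    ungetc(c, in);                       /* put it back so fscanf sees the
                                            first digit (133, not 33) */
    if (!isdigit(c) && c != '+' && c != '-')
        return 0;
    if (fscanf(in, "%d", num) != 1)
        return 0;
    return getc(in) == '\n';             /* the number must end the line */
}
```

Call it as read_number_after_question_mark(stdin, &num). Only one character of pushback is guaranteed by the standard, which is all this needs.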

Inputting characters and integers in a line in c reading characters from past input stream

I was always bad at inputting characters in C and this is another example. I understand (maybe) what's happening, but I can't figure out the solution.
I have the following code
scanf("%ld %ld",&n,&m);
for(i=0;i<n;i++)
scanf("%ld",&array[i]);
for(i=0;i<m;i++)
{
fflush(stdin);
//inputting a character 'R' but it is picking '\n' from past buffer
scanf("%c",&query);
//As a result of the above problem, this is also acting weird
scanf("%ld",&d);
printf("%c %ld",query,d);
printf("\nI=%ld\n",i);
}
Please help me figure out the reason why its happening and what is the solution.
Using scanf with %d (or %ld) only extracts the number from the input stream; it leaves the newline in the stream.
So when you write scanf("%c", it reads that newline.
To fix this (if your intent is that scanf("%c" reads the first character of the next line), you need to flush the input of the previous line. One way to do that is:
int ch; while ( (ch = getchar()) != EOF && ch != '\n' ) { }
Your line fflush(stdin); causes undefined behaviour - don't do that. The fflush function is only for output streams.
Also, it is a really good idea to check the return value of scanf. If it was not what you expected then you may wish to take some action, instead of pretending that a number was entered.
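Putting both pieces of advice together, a sketch could look like the following. The helper names discard_line and read_query are invented for this example, and the functions take a FILE * (pass stdin in the real program) only so they can be exercised on any stream.

```c
#include <stdio.h>

/* Discard the rest of the current line.
   Returns the last value read: '\n' or EOF. */
int discard_line(FILE *in)
{
    int ch;
    while ((ch = getc(in)) != EOF && ch != '\n') { }
    return ch;
}

/* Hypothetical helper: read one "<char> <number>" query such as "R 5".
   Returns 1 if both items were converted, 0 otherwise. */
int read_query(FILE *in, char *query, long *d)
{
    int ok = (fscanf(in, "%c %ld", query, d) == 2);
    discard_line(in);   /* drop the newline so the next %c starts on a fresh line */
    return ok;
}
```

In the question's program this would mean calling discard_line(stdin) once after the numbers have been read, then read_query(stdin, &query, &d) inside the loop in place of fflush(stdin) and the two scanf calls.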
Since you are tired of input issues, I can give you a method that can help to simplify your life.
I can observe that:
You have problems in handling end-of-lines.
Sometimes you need to input numbers and sometimes you need characters or another kind of input. So, you (think that you) are forced to use formatted input.
My advice is that you separate the issue of reading input from the issue of interpreting data entered from input.
Standard C provides only a few functions for input/output operations, declared in the standard header <stdio.h>.
If you do not need very sophisticated I/O, the standard library is enough.
However, interactive input usually reaches the functions of <stdio.h> one line at a time, and that line includes the end-of-line character '\n'.
What you can do, then, is the following:
Read a line with fgets(..., stdin) and put the result in a buffer (not so long), used only for this purpose.
Once you have read an entire line, no more issues with end-of-line will bother you.
Then, re-read this line, which is held in the buffer, and apply to it all the formatted input that you need.
A short example:
#include <stdio.h>
int main(void) {
char buffer[200] = ""; // Initialize array to 0's
long int n, m;
char c;
fgets(buffer, sizeof(buffer), stdin);
sscanf(buffer,"%ld %ld",&n,&m);
// Now you have processed the "integer number" input,
// read input characters again, without any "flushes" or other strange things:
fgets(buffer, sizeof(buffer), stdin);
sscanf(buffer,"%c", &c);
fgets(buffer, sizeof(buffer), stdin);
// and so on...
}
Thus, every time you need to separate a section of input from the previous one, just read a new line with fgets(..., stdin), which stores the input in buffer, and then process the buffer with sscanf(), which applies the format string to the buffer instead of to the input stream itself.
Note: This method has one small problem: if the input line, including its '\n', has more than sizeof(buffer) - 1 characters (in the example: 199), the line is not read completely. You can detect this by checking whether the buffer contains a '\n' (for instance with strchr): if it does not, and end-of-file has not been reached, the line was truncated. In such a case, you would perform some kind of "flushing input" operation yourself (reading and discarding characters until the next end-of-line is found).
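One possible way to handle the over-long line case is sketched below. The helper name read_line is made up for this example, and silently dropping the excess characters is an assumption; a real program might prefer to report the truncation.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf and strip its newline.
   If the line did not fit, the rest of it is read and discarded.
   Returns 1 if a line was read, 0 on end-of-file or error. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    char *nl = strchr(buf, '\n');
    if (nl != NULL) {
        *nl = '\0';                 /* whole line fit: drop the newline */
    } else {
        int ch;                     /* line was truncated: skip the remainder */
        while ((ch = getc(in)) != EOF && ch != '\n') { }
    }
    return 1;
}
```

After each successful read_line(stdin, buffer, sizeof buffer), the buffer can be handed to sscanf as in the example above.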

peek at input buffer, and flush extra characters in C

If I want to receive a one character input in C, how would I check to see if extra characters were sent, and if so, how would I clear that?
Is there a function which acts like getc(stdin), but which doesn't prompt the user to enter a character, so I can just put while(getc(stdin)!=EOF);? Or a function to peek at the next character in the buffer, and if it doesn't return NULL (or whatever would be there), I could call a(nother) function which flushes stdin?
Edit
So right now, scanf seems to be doing the trick but is there a way to get it to read the whole string, up until the newline? Rather than to the nearest whitespace? I know I can just put "%s %s %s" or whatever into the format string but can I handle an arbitrary number of spaces?
You cannot flush the input stream. You will be invoking undefined behavior if you do. Your best bet is to do:
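#include <stdio.h>
int main(void) {
int c = getchar(); /* the one character you want */
while (getchar() != EOF); /* read and discard everything else until end of input */
return 0;
}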
To use scanf magic:
#include <stdio.h>
#include <stdlib.h>
#define str(s) #s
#define xstr(s) str(s)
#define BUFSZ 256
int main() {
char buf[ BUFSZ + 1 ];
int rc = scanf("%" xstr(BUFSZ) "[^\n]%*[^\n]", buf);
if (!feof(stdin)) {
getchar();
}
while (rc == 1) {
printf("Your string is: %s\n", buf);
fflush(stdout);
rc = scanf("%" xstr(BUFSZ) "[^\n]%*[^\n]", buf);
if (!feof(stdin)) {
getchar();
}
}
return 0;
}
You can use getline (a POSIX function, not part of standard C) to read a whole line of input.
Alternatively (in response to your original question), you can call select or poll on stdin to see if there are additional characters to be read.
I had a similar problem today, and I found a way that seems to work. I don't know the details of your situation, so I don't know if it will work for you or not.
I'm writing a routine that needs to get a single character from the keyboard, and it needs to be one of three specific keystrokes (a '1', a '2', or a '3'). If it's not one of those, the program needs to send an error message and loop back for another try.
The problem is that in addition to the character I enter being returned by getchar(), the 'Enter' keystroke (which sends the keystroke to the program) is saved in an input buffer. That (non-printing) newline character is then returned by getchar() in the error-correction loop, resulting in a second error message (since the newline character is not a '1', a '2', or a '3').
The issue is further complicated because I sometimes get ahead of myself and instead of entering a single character, I'll enter the filename that one of these options will request. Then I have a whole string of unwanted characters in the buffer, resulting in a long list of error messages scrolling down the screen.
Not cool.
What seems to have fixed it, though, is the following:
c = getchar(); // get first char in line
while(getchar() != '\n') ; // discard rest of buffer
The first line is the one that actually uses the character I enter. The second line disposes of whatever residue remains in the input buffer. It simply creates a loop that pulls a character at a time from the input buffer. There's no action specified to take place while the statement is looping. It simply reads a character and, if it's not a newline, goes back for the next. When it finds a newline, the loop ends and it goes on to the next order of business in the program.
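One caveat with the two-line version: if input ends (EOF) before a newline arrives, the discard loop never terminates. A sketch of the same idea with an EOF check, wrapped in the retry loop described above, might look like this. The function name read_menu_choice and the wording of the error message are invented for the example, and the stream is a parameter (pass stdin) only to keep the sketch easy to try.

```c
#include <stdio.h>

/* Hypothetical helper: keep reading lines until one starts with
   '1', '2' or '3'. Returns that character, or EOF if input ends. */
int read_menu_choice(FILE *in)
{
    for (;;) {
        int first = getc(in);           /* first char in line */
        int ch = first;
        while (ch != EOF && ch != '\n') /* discard rest of line */
            ch = getc(in);
        if (first == EOF)
            return EOF;
        if (first == '1' || first == '2' || first == '3')
            return first;
        fputs("Please enter 1, 2 or 3.\n", stderr);
    }
}
```

Because the whole line is consumed before the check, a mistyped filename produces a single error message rather than one per character.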
We can make a function to clear the keyboard buffer, like this:
#include <stdio.h>
void clear_buffer(void){
int b; // int, not char: getchar() returns an int so that EOF can be told apart from any character
// this loop takes the characters out of the keyboard buffer one by one using
// getchar(); it stops when "b" is the
// Enter key (newline) or EOF.
while (((b = getchar()) != '\n') && (b != EOF));
}
int main(void)
{
char input;
// get the input; it is supposed to be one char!
scanf("%c", &input);
// call the function that clears the keyboard buffer
clear_buffer();
printf("%c\n",input); // print out the first character of the input
// to make sure that our function works, read into the
// "input" char variable one more time
scanf("%c", &input);
clear_buffer();
printf("%c\n",input);
return 0;
}
Use a read that will take a lot of characters (more than 1, maybe 256), and see how many are actually returned. If you get more than one, you know; if you only get one, that's all there were available.
You don't mention platform, and this gets quite tricky quite rapidly. For example, on Unix (Linux), the normal mechanism will return a line of data - probably the one character you were after and a newline. Or maybe you persuade your user to type ^D (default) to send the preceding character. Or maybe you use control functions to put the terminal into raw mode (like programs such as vi and emacs do).
On Windows, I'm not so sure -- I think there is a getch() function that does what you need.
Why don't you use scanf instead of getc? With scanf you can get the whole string.

Resources