How to definitely solve the scanf input stream problem - c

Suppose I want to run the following C snippet:
scanf("%d" , &some_variable);
printf("something something\n\n");
printf("Press [enter] to continue...")
getchar(); //placed to give the user some time to read the something something
This snippet will not pause! The problem is that the scanf will leave the "enter" (\n)character in the input stream1, messing up all that comes after it; in this context the getchar() will eat the \n and not wait for an actual new character.
Since I was told not to use fflush(stdin) (I don't really get why tho) the best solution I have been able to come up with is simply to redefine the scan function at the start of my code:
void nsis(int *pointer){ //nsis arconim of: no shenanigans integer scanf
scanf("%d" , pointer);
getchar(); //this will clean the inputstream every time the scan function is called
}
And then we simply use nsis in place of scanf. This should fly. However it seems like a really homebrew, put-together-with-duct-tape, solution. How do professional C developers handle this mess? Do they not use scanf at all? Do they simply accept to work with a dirty input stream? What is the standard here?
I wasn't able to find a definite answer on this anywhere! Every source I could find mentioned a different (and sketchy) solution...
EDIT: In response to all commenting some version of "just don't use scanf": ok, I can do that, but what is the purpose of scanf then? Is it simply an useless broken function that should never be used? Why is it in the libraries to begin with then?
This seems really absurd, especially considering all beginners are taught to use scanf...
[1]: The \n left behind is the one that the user typed when inputting the value of the variable some_variable, and not the one present into the printf.

but what is the purpose of scanf then?
An excellent question.
Is it simply a useless broken function that should never be used?
It is almost useless. It is, arguably, quite broken. It should almost never be used.
Why is it in the libraries to begin with then?
My personal belief is that it was an experiment. It tries to be the opposite of printf. But that turned out not to be such a good idea in practice, and the function never got used very much, and pretty much fell out of favor, except for one particular use case...
This seems really absurd, especially considering all beginners are taught to use scanf...
You're absolutely right. It is really quite absurd.
There's a decent reason why all beginners are taught to use scanf, though. During week 1 of your first C programming class, you might write the little program
#include <stdio.h>
int main()
{
int size = 5;
for(int i = 0; i < size; i++) {
for(int j = 0; j < size; j++)
putchar('*');
putchar('\n');
}
}
to print a square. And during that first week, to make a square of a different size, you just edit the line int size = 5; and recompile.
But pretty soon — say, during week 2 — you want a way for the user to enter the size of the square, without having to recompile. You're probably not ready to muck around with argv. You're probably not ready to read a line of text using fgets and convert it back to an integer using atoi. (You're probably not even ready to seriously contemplate the vast differences between the integer 5 and the string "5" at all.) So — during week 2 of your first C programming class — scanf seems like just the ticket.
That's the "one particular use case" I was talking about. And if you only used scanf to read small integers into simple C programs during the second week of your first C programming class, things wouldn't be so bad. (You'd still have problems forgetting the &, but that would be more or less manageable.)
The problem (though this is again my personal belief) is that it doesn't stop there. Virtually every instructor of beginning C classes teaches students to use scanf. Unfortunately, few or none of those instructors ever explicitly tell students that scanf is a stopgap, to be used temporarily during that second week, and to be emphatically graduated beyond in later weeks. And, even worse, many instructors go on to assign more advanced problems, involving scanf, for which it is absolutely not a good solution, such as trying to do robust or "user friendly" input validation.
scanf's only virtue is that it seems like a nice, simple way to get small integers and other simple input from the user into your early programs. But the problem — actually a big, shuddering pile of 17 separate problems — is that scanf turns out to be vastly complicated and full of exceptions and hard to use, precisely the opposite of what you'd want in order to make things easy for beginners. scanf is only useful for beginners, and it's almost perfectly useless for beginners. It has been described as being like square training wheels on a child's bicycle.
How do professional C developers handle this mess?
Quite simply: by not using scanf at all. For one thing, very few production C programs print prompts to a line-based screen and ask users to type something followed by Return. And for those programs that do work that way, professional C developers unhesitatingly use fgets or the like to read a full line of input as text, then use other techniques to break down the line to extract the necessary information.
In answer to your initial question, there's no good answer. One of the fundamental rules of scanf usage (a set of rules, by the way, that no instructor ever teaches) is that you should never try to mix scanf and getchar (or fgets) in the same program. If there were a good way to make your "Press [enter] to continue..." code work after having called scanf, we wouldn't need that rule.
If you do want to try to flush the extra newline, so that a later call to getchar might work, there are several questions here with a bunch of good answers:
scanf() leaves the newline character in the buffer
Using fflush(stdin)
How to properly flush stdin in fgets loop
There's one more unrelated point that ends up being pretty significant to your question. When C was invented, there was no such thing as a GUI with multiple windows. Therefore no C programmer ever had the problem of having their output disappear before they could read it. Therefore no C programmer ever felt the need to write printf("Press [enter] to continue..."); followed by getchar(). I believe (another personal belief) that it is egregiously bad behavior for any vendor of a GUI-based C compiler to rig things up so that the output disappears upon program exit. Persistent output windows ought to be the default, for the benefit of beginning C programmers, with some kind of non-default option to turn that behavior off for those who don't want it.

Is scanf broken? No it is not. It is an excellent input function when you want to parse free form input data where few errors are to be expected. Free form means here that new lines are not relevant exactly as when you read/write very long paragraphs on a normal screen. And few errors expected is common when you read from files.
The scanf family function has another nice point: you have the same syntax when reading from the standard input stream, a file stream or a character string. It can easily parse simple common types and provide a minimal return value to allow cautious programmers to know whether all or part of all the expected data could be decoded.
That being said, it has major drawbacks: first being a C function, it cannot directly control whether the programmer has passed types meeting the format specifications, and second, as beginners are not consistenly hit on their head when they forget to control its return value, it is really too easy to make fully broken programs using it.
But the rule is:
if input is expected to be line oriented, first use fgets to get lines and then sscanf testing return values of both
only if input is expect to be free form (irrelevant newlines), scanf should be used directly. But never without testing its return value except for trivial tests.
Another drawback is that beginners hope it to be clever. It can indeed parse simple input formats, but is only a poor man's parser: do not use it as a generic parser because that is not what it is intended for.
Provided those rules are observed, it is a nice tool consistent with most of C language and its standard library: a simple tool to do simple things. It is up to programmers or library implementers to build richer tools.
I have only be using C language for more than 30 years, and was never bitten by scanf (well I was when I was a beginner, but I now know that I was to blame). Simply I have just tried for decades to only use it for what it can do...

Related

Creating a user length defined array in C

I'm trying to make an array with variable starting length to get a string. The code should count the words and adjust the size of the array, but this is only a test and I expose it here because I want to know if it's a good practice or one error. And if there is something I should know about, or I must have in mind.
Note, I talk about C, not C++
#include <stdio.h>
int main()
{ int c,b,count;
scanf("%d",&c);
count=c+1;
getchar();
char a[count];
for ( c=b=0 ; c!=count && b!='\n' ; c++ )
{
b=getchar();
a[c]=b;
}
a[c]='\0';
printf("%s",a); printf("%d",c-1);
}
I don't need change the size of the array at the execution time.
I was testing and I don't remember well why I'm using the c variable at first time instead of count directly, but I remember the first getchar was to flush the buffer, because it didn't work without the getchar.
I don't know why I need to put getchar. If I delete the getchar the program fails.
Anyway the program works fine. The first time you run, it expects a number with scanf and then expects the text.
If the text is larger than the size of the array the program will ignore it.
The number is the size of the array.
My questions are:
It is a good practice do a[variable] to do this job?
Why I need the getchar?
It will be portable? I mean, I don't know if some systems or standards don't accept this like some old C compilers or somewhat.
There are better methods?
It is a good practice do a[variable] to do this job?
It depends on someone's compiler configuration. It has been supported since C99. However since there's not a good reason to use it in such a simple program, use the standard malloc instead. Here's an in-depth discussion of the topic.
Why I need the getchar?
There's likely some input still buffered up in your terminal, and that first character is discarding it. Try printing the value out to the screen to see what it is, that might help as figure it out.
It will be portable?
See my answer to your first question. It will probably work on modern versions of gcc, but for example it doesn't work in Windows C (which is still basically on C89).
It is a good practice do a[variable] to do this job?
Where the size is determined by arbitrary user input without imposed limits, it is not good practice. A user could easily enter a very large value and overrun the stack.
Use either dynamic allocation, or check and coerce the input value to some sensible limit.
Also worth noting that VLAs are not supported in C++ or some C compilers, so the code lacks portability.
Why I need the getchar?
The user has to enter at least a newline for scanf() to return, but the %d format specifier does not consume non-digit characters, so it remains buffered. However your code is easily broken by entering additional non-digit characters for example "16a<newline>" will assign 16 to c, and the a will be discarded leaving the newline buffered as before. A better solution is:
while( getchar() != `\n` ) {}
It will be portable? I mean, I don't know if some systems or standards don't accept this like some old C compilers or somewhat.
Adoption of C99 VLAs is variable, and in C11 they are optional in any case.
There are better methods?
I hesitate to say "better", but safer and more flexible and portable ways sure. With respect to the array allocation, you could use malloc().
Using malloc or calloc would be a better choice in C
https://www.tutorialspoint.com/c_standard_library/c_function_malloc.htm

Is it possible to capture Enter keypresses in C?

I want to write a C program that tokenizes a string and prints it out word by word, slowly, and I want it to simultaneously listen for the user pressing Enter, at which point they will be able to input data. Is this possible?
Update: see #modifiable-lvalue 's comment below for additional helpful information, e.g., using getchar() instead of getch(), if you go that route.
Definitely possible. Just be aware that gets() may not be entirely helpful for this purpose, since gets() interprets enter not as enter per se, but as "now, I the user, have entered as much string as I want to". So the input gathered by gets() from pressing just an enter will appear as an empty string (which might be workable for you).
See: http://www.cplusplus.com/reference/cstdio/gets/.
But there are other reasons not to use gets()--it does not let you specify a maximum to read in, so it's easy to overflow whatever buffer you are using. A security and bug nightmare waiting to happen. So you want fgets(), which allows you to specify a maximum size to read in. fgets() will place a newline in the string when an enter is pressed. (BTW, props to #jazzbassrob on fgets()).
You could also consider something like getch()--which really deals with individual key-presses (but it gets a bit complicated handling keys that have non-straightforward scan-codes). You may find this example helpful: http://www.daniweb.com/software-development/cpp/code/216732/reading-scan-codes-from-the-keyboard. But because of the scancodes issues, getch() is subject to platform details.
So if you want a more portable approach, you may need to use something heavier weight, but fairly portable, such as ncurses.
I suspect you can do what you want with either fgets(), keeping in mind that enter will give you a string with just a newline in it, or getch().
I just wanted you to be aware of some of the implementation/platform issues that can arise.
C can absolutely do this, but it's a little more complicated than one might guess at first attempt. That's because terminal input is, historically, very platform dependent.
Check out the conio.h library and its use in game making. You can compile with Borland. That was my first experience in unbuffered input and listening for keypresses in real time. There are certainly other ways though.

Input/ Output alternatives for printf/scanf

It may sound strange that knowing a lot about iOS and having some experience in .net, I am a newcomer to C. Somewhere I got this target to find average of n numbers without using printf and scanf. I don't want the code for the program but I am seeking alternatives to the mentioned functions.
Is code with printf/scanf required here? Also do let me know if my query stands invalid.
No, neither printf nor scanf is really needed for this.
The obvious alternatives would be to read the input with something like getc or fgets and convert from characters to numbers with something like strtol.
On the output side, you'd more or less reverse that, converting from numbers to characters (e.g., with itoa which is quite common, though not actually standard), then printing out the resulting string (e.g., with fputs).

C style printf/scanf

I've seen, here and elsewhere, many questions that, to get input data, use something like this:
...
printf("What's your name? ");
scanf("%s",name);
...
This is very reminiscent of the old BASIC days (INPUT for those who remember it).
The majority of those questions, if not all, are from people just learning C and are homeworks or example taken from their book.
I clearly remember that when I learned C I was told that this type of question/answer style was not a good practice for getting user input. The "Right Way" was either to get parameters on the command line (argv[...]) or reading from a data file to be parsed with fgets(). When user friendliness was a must, termio and friends had to be used.
Now, I wonder if anything changed in the past years. Are people thaught to structure user interaction as a set question/answer now?
I can only see disadvantages in using the printf()/scanf() approach, the main one being the diversity of terminals (^H anyone?) that could make difficult for the user to correct mistakes.
Could anyone point me to concrete advantages of this approach?
This structure is easy to explain and easy to learn, which is why it appears in so many introductory materials. Doing user input "the right way" in C can appear fairly daunting to a neophyte, especially when you have to deal with tokenizing and conversions.
However, I agree that it would be valuable for introductory materials to demonstrate more robust methods for handling user input.
I always thought the Unix way was to accept input from stdin. That way the caller of the command can pipe input in from another command, from a file or manually.
fgets() / sscanf() or similar is the right way to accept user input.
Take what you read here and elsewhere with a (large) pinch of salt.
The GNU readline library is really an excellent resource for this. Its main advantage is that it handles all the intricacies of editing, as well as letting users have their own input settings, eg Vi or Emacs mode.
This is the library bash and many other programs that accept interactive line-based data use.
By using the library, you get an interface that your users will have some knowledge of how to use, plus you get all sorts of nice features without having to explicitly code support for line editing.
I agree with CTFord, that IF you're doing a command line program, then stdin is a perfectly reasonable way to deal with input.
However, now adays, the correct way is to make a window that has a text box a label and a button and blah blah blah.
I know that doesn't really answer your question, but I think my point is that that style of programming has become MUCH less prevalent, so that it doesn't really matter what is "correct".
In many cases you are probably missing the point of the exercise if you think this is important. There's a world of a difference between a homework exercise and real world programs. The purpose of such exercises is seldom, if ever, to teach user interface design; the technique you describe is generally just a simple way to obtain test input for the real exercise, and is also probably encouraged by tutors who, pressed for time require submitted code to adhere to some 'course-style' to make marking easier. 'Course-style' is seldom the same thing as 'good-practice' or 'practical style', but that is not to say that it does not serve some purpose, or even that that purpose is beneficial to the students education.
Unfortunately novices often get hung up on this stuff when it is generally irrelevant to the actual purpose of the exercise. The problem is that user input using standard library primitives is not as foolproof and some tutors seem to think.

When/why is it a bad idea to use the fscanf() function?

In an answer there was an interesting statement: "It's almost always a bad idea to use the fscanf() function as it can leave your file pointer in an unknown location on failure. I prefer to use fgets() to get each line in and then sscanf() that."
Could you expand upon when/why it might be better to use fgets() and sscanf() to read some file?
Imagine a file with three lines:
1
2b
c
Using fscanf() to read integers, the first line would read fine but on the second line fscanf() would leave you at the 'b', not sure what to do from there. You would need some mechanism to move past the garbage input to see the third line.
If you do a fgets() and sscanf(), you can guarantee that your file pointer moves a line at a time, which is a little easier to deal with. In general, you should still be looking at the whole string to report any odd characters in it.
I prefer the latter approach myself, although I wouldn't agree with the statement that "it's almost always a bad idea to use fscanf()"... fscanf() is perfectly fine for most things.
The case where this comes into play is when you match character literals. Suppose you have:
int n = fscanf(fp, "%d,%d", &i1, &i2);
Consider two possible inputs "323,A424" and "323A424".
In both cases fscanf() will return 1 and the next character read will be an 'A'. There is no way to determine if the comma was matched or not.
That being said, this only matters if finding the actual source of the error is important. In cases where knowing there is malformed input error is enough, fscanf() is actually superior to writing custom parsing code.
When fscanf() fails, due to an input failure or a matching failure, the file pointer (that is, the position in the file from which the next byte will be read) is left in a position other than where it would be had the fscanf() succeeded. This is typically undesirable in sequential file reads. Reading one line at a time results in the file input being predictable, while single line failures can be handled individually.
There are two reasons:
scanf() can leave stdin in a state that's difficult to predict; this makes error recovery difficult if not impossible (this is less of a problem with fscanf()); and
The entire scanf() family take pointers as arguments, but no length limit, so they can overrun a buffer and alter unrelated variables that happen to be after the buffer, causing seemingly random memory corruption errors that very difficult to understand, find, and debug, particularly for less experienced C programmers.
Novice C programmers are often confused about pointers and the “address-of” operator, and frequently omit the & where it's needed, or add it “for good measure” where it's not. This causes “random” segfaults that can be hard for them to find. This isn't scanf()'s fault, so I leave it off my list, but it is worth bearing in mind.
After 23 years, I still remember it being a huge pain when I started C programming and didn't know how to recognize and debug these kinds of errors, and (as someone who spent years teaching C to beginners) it's very hard to explain them to a novice who doesn't yet understand pointers and stack.
Anyone who recommends scanf() to a novice C programmer should be flogged mercilessly.
OK, maybe not mercilessly, but some kind of flogging is definitely in order ;o)
It's almost always a bad idea to use the fscanf() function as it can leave your file pointer in an unknown location on failure. I prefer to use fgets() to get each line in and then sscanf() that.
You can always use ftell() to find out current position in file, and then decide what to do from there. Basicaly, if you know what you can expect then feel free to use fscanf().
Basically, there's no way to to tell that function not to go out of bounds for the memory area you've allocated for it.
A number of replacements have come out, like fnscanf, which is an attempt to fix those functions by specifying a maximum limit for the reader to write, thus allowing it to not overflow.

Resources