Understanding scanf behaviour - c

I am very new to C programming. My teacher gave us this code to find the maximum of n numbers. When I do as he says, everything works: I type one number when the prompt "Type the number of numbers" appears, and then type the numbers in a row, like 7 8 9 10, when "Type the numbers" appears.
#include <stdio.h>
int main(void)
{
    int n, max, number, i;
    printf("Type the number of numbers");
    scanf("%d", &n);
    if (n > 0)
    {
        printf("Type the numbers");
        scanf("%d", &number);
        max = number;
        for (i = 1; i < n; i++)
        {
            scanf("%d", &number);
            if (number > max)
                max = number;
        }
        printf("MAX=%d \n", max);
    }
    return 0;
}
But suppose I type 5 8 9 10 7 6 in one go. Then the program treats it like this:
It sets n = 5, then sets number = 8, then the loop runs and number changes to 9, then to 10, and so on until 6, and then it gives the max.
So how is scanf working here? Does it take the numbers one at a time even though they are written in a row separated by spaces?

From the horse's mouth:
7.21.6.2 The fscanf function
...7 A directive that is a conversion specification defines a set of matching input sequences, as
described below for each specifier. A conversion specification is executed in the
following steps:
8 Input white-space characters (as specified by the isspace function) are skipped, unless
the specification includes a [, c, or n specifier.284)
9 An input item is read from the stream, unless the specification includes an n specifier. An
input item is defined as the longest sequence of input characters which does not exceed
any specified field width and which is, or is a prefix of, a matching input sequence.285)
The first character, if any, after the input item remains unread. If the length of the input
item is zero, the execution of the directive fails; this condition is a matching failure unless
end-of-file, an encoding error, or a read error prevented input from the stream, in which
case it is an input failure.
10 Except in the case of a % specifier, the input item (or, in the case of a %n directive, the
count of input characters) is converted to a type appropriate to the conversion specifier. If
the input item is not a matching sequence, the execution of the directive fails: this
condition is a matching failure. Unless assignment suppression was indicated by a *, the
result of the conversion is placed in the object pointed to by the first argument following
the format argument that has not already received a conversion result. If this object
does not have an appropriate type, or if the result of the conversion cannot be represented
in the object, the behavior is undefined.
...
12 The conversion specifiers and their meanings are:
d Matches an optionally signed decimal integer, whose format is the same as
expected for the subject sequence of the strtol function with the value 10 for
the base argument. The corresponding argument shall be a pointer to
signed integer.
...
284) These white-space characters are not counted against a specified field width.
285) fscanf pushes back at most one input character onto the input stream. Therefore, some sequences
that are acceptable to strtod, strtol, etc., are unacceptable to fscanf.
The processing for scanf is exactly the same; the only difference is that scanf always reads from standard input.
Examples:
Suppose you type SpaceSpaceSpace123Enter in response to the first prompt; the input stream then contains the sequence {' ', ' ', ' ', '1', '2', '3', '\n'}. When you call scanf( "%d", &n );, scanf reads and discards the leading blank spaces, then reads and matches the sequence {'1', '2', '3'}, converts it to the integer value 123, and assigns the result to n. Since there was a successful conversion and assignment, scanf returns 1.
If the input stream contains the sequence {' ', ' ', ' ', '1', '2', '.', '3', '\n'}, scanf reads and discards the leading blanks, then reads and matches the sequence {'1', '2'}, converts it to the integer value 12, and assigns the result to n. The input stream will still contain {'.', '3', '\n'}. Since there was a successful conversion and assignment, scanf will return 1.
If the input stream contains the sequence {'.', '3', '\n'}, then there is no matching sequence of characters ('.' is not a valid character in a decimal integer). scanf will leave the . unread and leave the value of n unchanged. Since there was not a successful conversion and assignment, scanf returns 0 to indicate a matching failure.
If an end-of-file is signaled on the input stream before any matching characters have been read, or if there's some other input error, scanf does not assign any new value to n and returns EOF to indicate an input failure.

The "%d" in scanf("%d",&number); scans user text input into an int in 3 steps:
1. 0 or more leading white-space characters, like ' ', '\n', '\t' and some others, are read and discarded.
2. Numeric text like "123", "-123", "+123" is read until a non-numeric character is read (or end-of-file, or a rare input error).
3. That non-numeric character is put back into stdin for subsequent input calls.
If step 2 is successful in reading at least 1 digit, the function returns 1. Good code checks the returned value.
if (scanf("%d",&number) != 1) Handle_UnexpectedInput();
The important thing is that '\n' is not so special with scanf("%d",&number);. It acts like a separator like another white-space or non-numeric text.
'\n' does cause the buffered stdin to accept the line of user input for processing by the various scanf() calls.

Here is a simplified explanation (not overly simplified, I hope) of how scanf works from a user's point of view:
The arguments are divided in two parts:
The first part is a “format string”. The string is made of at least
one format specifier. In its simplest form a specifier begins with
% and it is followed by a letter that specifies the type of
variable you’re expecting (“%d” – I’m expecting an integer). The
number of specifiers must match the number of parameters and types
in the second part.
The second part is made of one or more addresses of memory locations
where the data you input will be stored. The pointed-to types must
match the specifiers.
When called, the function will repeat the following steps, starting with the first specifier and the first pointer, until the end of format string is detected:
Read and discard any white-space until a non-white-space character is found (white-space: space, tab, NL, at least);
Read characters up to the first white-space or a character that does not match the expected input for the current specifier;
Convert them to the type of current specifier and
Store the result in the location pointed by the current pointer.
There are three typical beginner mistakes which will result in undefined behavior (crash, most likely):
You forget the address-of operator &.
The specifier and the type do not match.
The number of specifiers does not match the number of pointers.
int d;
scanf( "%d", d ); // no &: passes the value of d instead of its address
scanf( "%s", &d ); // %s expects a char array, not an int
scanf( "%d%d", &d ); // more specifiers than pointers

When you type on the keyboard, you are filling a buffer on your computer. scanf reads consecutive characters until it hits a space, so with the input "1234 43" and the code scanf("%d"), you are saying "read one number"; 1234 is one number, and that is what it will read.
But if you have a loop that executes that scanf again, the number "43" is still in the input buffer, and scanf will read it without stopping to wait.
The manual for scanf doesn't explain that, so it's a bit confusing for a newcomer to understand why the application does not stop there to wait for a new number.

Let me explain as simply as possible...
scanf reads bytes from the standard input stream stdin.
Let's say the input you give is "23 67 21 99\n" (The \n is from when you pressed Enter).
Then each subsequent call to scanf will start reading from this input buffer and will interpret what it sees according to the format you give it ("%d", etc.), treating whitespace as the separator between inputs. This could be a newline, a space, a tab, etc.
While there are still bytes to be read, scanf will not wait for you to input. That is what is happening here.

Let's keep it simple. I assume you don't know anything about buffers or stdin.
scanf is used to take input from the user. When you type a number, a space or newline after it marks where the number ends, and the number becomes available to the program once you press Enter. When you write scanf("%d",&n); it means: take an integer as input from the user and store it at the address of variable n.

Related

What happens if scanf doesn't get enough characters from the standard input?

I know that the scanf function reads characters from standard input and interprets them according to the conversion specifications, but what happens if some characters are missing from the input stream? Does it block until it gets the characters it needs, or does it just return? For example:
the scanf call:
scanf ( " %d %d ", arg1, arg2) ;
the input stream: 14
Sorry for any mistakes in vocabulary. English isn't my mother tongue.
What happens if scanf doesn't get enough characters from the standard input?
In general, scanning stops and scanf() returns the number of successfully matched format specifiers. Unmatched input characters remain in stdin. The data pointed to by the later specifiers' pointers is unchanged.
Yet there are many details - the above and below are over-simplifications.
scanf() stops under various conditions:
1) The happy path: the format is fully satisfied. There is nothing more to the format string.
2) Input does not match the specifier anymore. Specifiers like "%d", "%f", "%99[^\n]", "%99s" consume input until some character that does not meet the specifier's needs; that character is put back in the stream. If insufficient characters were read, scanning stops and the number of successfully matched format specifiers is returned. If enough were read, scanning continues to the next part of the format. ("%n" is special and not addressed here.)
int retval = sscanf("1abc", "%d", &i); // stop at 'a', return 1
int retval = sscanf("+1-", "%d%d", &i, &j); // stop at '-', return 1
3) Input does not match a fixed character (other than white-space) in the format: scanning stops and the number of successfully matched format specifiers is returned.
int retval = sscanf("1abc", "%da", &i); // stop at 'b', return 1
4) End-of-file occurs after at least one conversion. scanf() returns the number of successfully matched format specifiers.
5) Input error (rare). E.g. some internal mis-communication or trying to read from stdout. All pointers' data values are indeterminate. Return EOF.
6) End-of-file occurs before the first conversion completes: return EOF. (If input is present but nothing matches, the return value is 0.)
scanf ( " %d %d ", arg1, arg2) ;
With input "14" followed by end-of-file:
1) The first " " scans zero or more white-space. Never fails, scanning continues.
2) The first "%d" scans zero or more white-space then scans for numeric input consuming the "14". *arg1 is set to 14.
3a) End-of-file occurs. *arg2 is unchanged. scanf() returns 1.
With input "14\n", processing is the same as above up to step 2, then:
3b) The 2nd " " scans zero or more white-space. It consumes '\n' and waits until a non-white-space is detected or end-of-file. scanf() is still processing and has not yet returned.
Given that you only input 14, scanf will wait for the remaining input specified in your format string until unexpected input arrives, an input error occurs, or the input stream is closed.
Assuming the input stream is closed after 14:
*arg2 will remain unchanged.
*arg1 will be read fine; it will become 14.
Otherwise:
scanf will just wait until you input another integer (and if you do not, your program will hang here forever).

C printf won't print before scanning next number

I got this piece of code
#include <stdio.h>
#include <stdlib.h>
int main()
{
    int i;
    int number, S;
    float MO;
    S = 0;
    for (i = 0; i < 10; i++)
    {
        printf("AAA %d\n", i);
        scanf("%d\n", &number);
        S = S + number;
    }
    MO = S/10;
    printf("%d , %f \n", S, MO);
    return 0;
}
When execution starts, AAA 0 is printed. I then give my first number. After that, I expect to see AAA 1, but it is printed only after I give my second number.
I checked this question:
C/C++ printf() before scanf() issue
but I can't get any of those solutions to work for me.
The answers claiming that this has something to do with flushing input or output are wrong. The problem has nothing to do with this. The appearance of the \n character at the end of the scanf() template string instructs scanf() to match and discard whitespace characters. It will do so until a non-whitespace character is encountered, or end-of-file is reached. The relevant part of the C11 Standard is §7.21.6.2 5:
A directive composed of white-space character(s) is executed by
reading input up to the first non-white-space character (which remains
unread), or until no more characters can be read. The directive never
fails.
In OP's case, a second number must be placed in the input stream so that the first can be assigned. The second number then remains in the input stream until the next call to scanf(). In the case of the example given by Stephan Lechner, taking input from a file, there is a number in the input stream after each number to be assigned, until the last number (if there are exactly ten numbers), and then the EOF causes scanf() to return. Note that OP could also have signalled EOF from the keyboard after each input. Or, OP could enter all numbers on one line, with an extra number to signal end of input:
1 2 3 4 5 6 7 8 9 10 11
The solution is simply to remove the \n from the end of the scanf() template string. Whitespace characters at the end of such a template string are tricky, and almost never what is actually desired.
Just remove the \n from the scanf format string:
scanf("%d", &number);
"\n" waits for non-white-space. Instead use scanf("%d", &number), as @Michael Walz suggests, and check its return value.
Let us break down scanf("%d\n", &number);
"%d" Scan for numeric text. This specifier will allow leading white-space. Once some non-numeric character is found, put that character back into stdin. If valid numeric text for an integer was found, convert to an int and save to number.
"\n" Scan and discard 0 or more white-spaces such as '\n', ' ', and others. Continue scanning until non-white-space is encountered, put that character back into stdin.
Now return the number of fields scanned (1).
Notice step 2. User had to type in data after the Return or '\n' to get scanf() to return.
The issue is with the buffering of stdout which doesn't force a flush unless a \n is in the output or fflush is called specifically

Why is the value of the second variable irrelevant after adding a space or new line?

I'm a newbie in programming, learning the C language, and I'm a little confused right now. I tried to Google this but could not find a satisfactory answer, so I decided to ask here. Have a look at this short program:
#include <stdio.h>
int main()
{
    int num1, num2;
    printf("enter the value of num1 and num2:");
    scanf("%d %d", &num1, &num2);
    printf("num1 = %d and num2 = %d", num1, num2);
    return 0;
}
When I enter, for example, 215-15 without a space or newline, it gives the output num1 = 215 and num2 = -15, but when I enter a space or newline between 215- and 15, it gives the output num1 = 215 and num2 = -175436266 (or some other unexpected number).
I know that when scanf() reads a character that does not fit the conversion specification, it puts that character back and stops processing the remaining input. But in the first case the - (minus sign) seems to be irrelevant input according to the conversion specification, yet the output is correct, while in the second case the output is not correct. Why?
Because 215- 15 can only match one number: 215. As soon as scanf() reads a -, it stops processing the first match since - can't possibly be a digit of the current number, so num1 is matched with 215.
Then, no more numbers can match, because you are left with - 15. scanf() reads a - followed by a space, so there is no valid number to parse, and it returns. It doesn't assign anything to num2, so what you see when you print it is garbage. (Only one character of pushback is guaranteed, so the space is put back but the - may already have been consumed.)
So, why does it work with 215-15?
The space makes the difference. With 215-15, scanf() again matches the first number with 215, but now you are left with -15 in the input (rather than - 15, as in the earlier example). -15 doesn't have a space between the sign and the first digit of the number, so scanf() sees it as a valid number, and parses it successfully.
In short, in both examples, - is interpreted as being the sign of the number for the next match. But %d doesn't ignore whitespace characters between the digits of a number, or between the sign and the digits (although it ignores any amount of whitespace before the number starts - that is, either before the first digit, or before the sign). So, if it sees a - followed by a space, it fails. If it sees a - followed by one or more digits, it matches a number successfully, and consumes the digits until a character that is not a digit is found.
I think what is happening is described in the scanf reference at cplusplus.com.
Any character that is not either a whitespace character (blank,
newline or tab) or part of a format specifier (which begin with a %
character) causes the function to read the next character from the
stream, compare it to this non-whitespace character and if it matches,
it is discarded and the function continues with the next character of
format. If the character does not match, the function fails, returning
and leaving subsequent characters of the stream unread.
Also,
A single whitespace in the format string validates any quantity of
whitespace characters extracted from the stream (including none).
The scanf's format string is "%d %d". It expects a number, throws away whitespace, and expects another number. After the first number, scanf read the '-' followed by a space, which is not a valid integer, so the second %d failed and scanf returned early, leaving the num2 variable uninitialized.
If you check the return value of scanf, you will see 1 rather than 2.

The scanf function, the specifier %s and the new line

I read this in the C11 standard:
Input white-space characters (as specified by the isspace function) are
skipped, unless the specification includes a [, c, or n specifier.
so I understand that if I use those specifiers, the input seen by the next scanf can contain, for example, a newline.
But if I write this:
char buff[5 + 1];
printf("Input: ");
scanf("%10s", buff);
printf("Input: ");
char buff_2[5 + 1];
scanf("%[abcde]", buff_2);
and then I input, e.g., RR and then Return,
the next scanf fails because of the \n.
So also %s doesn't discard a new line?
So also %s doesn't discard a new line?
%s tells scanf to discard any leading whitespace, including newlines. It will then read any non-whitespace characters, leaving any trailing whitespace in the input buffer.
So assuming your input stream looks like "\n\ntest\n", scanf("%s", buf) will discard the two leading newlines, consume the string "test", and leave the trailing newline in the input stream, so after the call the input stream looks like "\n".
Edit
Responding to xdevel2000's comment here.
Let's talk about how conversion specifiers work. Here are some relevant paragraphs from the online C 2011 standard:
7.21.6.2 The fscanf function
...
9 An input item is read from the stream, unless the specification includes an n specifier. An input item is defined as the longest sequence of input characters which does not exceed any specified field width and which is, or is a prefix of, a matching input sequence.285)
The first character, if any, after the input item remains unread. If the length of the input item is zero, the execution of the directive fails; this condition is a matching failure unless end-of-file, an encoding error, or a read error prevented input from the stream, in which case it is an input failure.
10 Except in the case of a % specifier, the input item (or, in the case of a %n directive, the
count of input characters) is converted to a type appropriate to the conversion specifier. If the input item is not a matching sequence, the execution of the directive fails: this condition is a matching failure. Unless assignment suppression was indicated by a *, the result of the conversion is placed in the object pointed to by the first argument following
the format argument that has not already received a conversion result. If this object does not have an appropriate type, or if the result of the conversion cannot be represented in the object, the behavior is undefined.
12 The conversion specifiers and their meanings are:
...
c Matches a sequence of characters of exactly the number specified by the field width (1 if no field width is present in the directive).286)
...
s Matches a sequence of non-white-space characters.286)
...
[ Matches a nonempty sequence of characters from a set of expected characters
(the scanset).286)
...
285) fscanf pushes back at most one input character onto the input stream. Therefore, some sequences that are acceptable to strtod, strtol, etc., are unacceptable to fscanf.
286) No special provisions are made for multibyte characters in the matching rules used by the c, s, and [
conversion specifiers — the extent of the input field is determined on a byte-by-byte basis. The
resulting field is nevertheless a sequence of multibyte characters that begins in the initial shift state.
%s matches a sequence of non-whitespace characters. Here's a basic algorithm describing how it works (not taking into account end of file or other exceptional conditions):
c <- next character from input stream
while c is whitespace
    c <- next character from input stream
while c is not whitespace
    append c to target buffer
    c <- next character from input stream
push c back onto input stream
append 0 terminator to target buffer
The first whitespace character after the non-whitespace characters (if any) is pushed back onto the input stream for the next input operation to read.
By contrast, the algorithm for the %c specifier is dead simple (unless you're using a field width greater than 1, which I've never done and won't get into here):
c <- next character from input stream
write c to target
The algorithm for the %[ conversion specifier is a little different:
c <- next character from input stream
while c is in the list of characters in the scan set
    append c to target buffer
    c <- next character from input stream
append 0 to target buffer
push c back onto input stream
So, it's a mistake to describe any conversion specifier as "retaining" trailing whitespace (which would imply that the trailing whitespace is saved to the target buffer); that's not the case. Trailing whitespace is left in the input stream for the next input operation to read.
%s discards leading whitespace characters and then consumes everything up to the next whitespace character; it does not consume trailing whitespace. The [ conversion specifier in the second scanf does not skip leading whitespace characters and therefore fails to scan because of the newline character (which is a whitespace character) left over by the first scanf.
To fix the issue, either use
int c;
while((c=getchar())!='\n' && c!=EOF);
after the first scanf to clear the rest of the line from stdin, or add a space before the format specifier (%[) in the second scanf.
Your excerpt from the standard omits important context. The preceding text specifies that skipping whitespace is the first step in processing a conversion specifier for a type other than c, [, or n.
The next step, other than for an n specifier, is to read an input item, which is defined as "the longest sequence of input characters which does not exceed any specified field width and which is, or is a prefix of, a matching input sequence" (quoted from C99, but equivalent applies to C2011).
An s item "[m]atches a sequence of non-white-space characters", so with the input you specify, the first scanf() reads everything up to, but not including, the newline.
The standard explicitly specifies
Trailing white space (including new-line characters) is left unread unless matched by a directive.
so the newline definitely remains unscanned at this point.
The format given to the next scanf() starts with a %[ conversion specifier, which, as you already observed, does not cause whitespace (leading or otherwise) to be skipped, though it can include whitespace in the item that is scanned. Since the next character available from the input is a newline, however, and the given scan set for your %[ does not include that character, zero characters are scanned for that item. Going back to the standard (C99, again):
If the length of the input item is zero, the execution of the directive fails; this condition is a matching failure unless end-of-file, an encoding error, or a read error prevented input from the stream, in which case it is an input failure.
There are easier ways to read free-form input line by line, but you can do it with scanf() if you must. For example:
char buff[10 + 1] = {0};
printf("Input: ");
/*
* Ignore leading whitespace and scan a string of up to 10 non-whitespace
* characters. Zero-length inputs will produce a matching failure, leaving
* the buffer unchanged (and initialized to an empty string). End of
* input will produce an input error, which is ignored.
*/
scanf("%10s", buff);
/* Scan and ignore anything else up to a newline. There will
* be an (ignorable) matching failure if the next available character is a
* newline. Any input error generated by this call is also ignored.
*/
scanf("%*[^\n]");
/*
* Consume the next character, if any. If there is one, it will be a
* newline. An input error will occur if we're already at the end of stdin;
* a careful program would test for that (by comparing the return value to
* EOF) but this one doesn't.
*/
scanf("%*c");
printf("Input: ");
/* scan the second string; again, we're ignoring matching and input errors */
char buff_2[5 + 1] = {0};
scanf("%5[abcde]", buff_2);
If you're exclusively using scanf() for such a job then it is essential to read each line in three steps, as shown, because each one can produce a matching failure that would prevent any attempt to match subsequent items.
Note, too, how maximum field widths are matched to buffer sizes in that example, which your original code did not do correctly.

confused about getchar and scanf

I'm really confused about the usage of getchar() and scanf(). What's the difference between these two?
I know that scanf() [and family] reads input character by character from the user [or a file] and saves it into a variable, but does it do that immediately or only after pressing something (Enter)?
Also, I don't really understand this code. I have seen many pieces of code using getchar(), and they all let you type whatever you want on the screen with no response, but when you press Enter the program quits.
int j, ch;
printf("please enter a number : \n");
while (scanf("%i", &j) != 1) {
    while ((ch = getchar()) != '\n') ;
    printf("enter an integer: ");
}
In this code, can't I use scanf() to get the input character by character and test it? Also, what does this line mean?
scanf("%i", &j) != 1
because pressing 1 makes no difference compared with pressing 2. What does this piece do?
And when is this line going to run?
printf("enter an integer: ");
because it never seems to happen.
Well, scanf is a versatile utility function which can read many types of data, based on the format string, while getchar() only reads one character.
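Basically,
int someCharacter = getchar();
is roughly equivalent to
char someCharacter;
scanf("%c", &someCharacter);
except that getchar() returns an int so that it can report EOF, while scanf reports failure through its return value.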
I am not 100% sure, but if you only need to read one character, getchar() might be 'cheaper' than scanf(), as the overhead of processing the format string does not exist (this could count to something if you read many characters, like in a huge for loop).
For the second question.
This code:
scanf("%i", &j) != 1
means you want scanf to read an integer in the variable 'j'. If read successfully, that is, the next input in the stream actually is an integer, scanf will return 1, as it correctly read and assigned 1 integer.
See the oldest answer to this SO question for more details on scanf return values.
As far as I understand,
the getchar function will read your input one character at a time.
scanf will read all types of data, and will be more useful to define a data group.
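However, as far as strings go, my teacher recommends using gets instead of scanf. This is because scanf will stop 'getting' the data at the first white space you put in, like in a sentence... (Note that gets cannot limit how much it reads and was removed from the language in C11; fgets is the bounded alternative.)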
while (scanf("%i", &j) != 1) {
    while ((ch = getchar()) != '\n') ;
    printf("enter an integer: ");
}
Here's how this code breaks down.
scanf() consumes individual characters from the input stream until it sees a character that does not match the %i conversion specifier (see note 1 below), and that non-matching character is left in the input stream;
scanf() attempts to convert the input text into a value of the appropriate type; i.e., if you enter the string "1234\n", it will be converted to the integer value 1234, the converted value will be assigned to the variable j, and the '\n' will be left in the input stream;
if there are no characters in the input string that match the conversion specifier (such as "abcd"), then no conversion is performed and nothing is assigned to j;
scanf() returns the number of successful conversions and assignments.
if the result of the scanf() call is not 1, then the user did not enter a valid integer string;
since non-matching characters are left in the input stream, we need to remove them before we can try another scanf() call, so we use getchar() to consume characters until we see a newline, at which point we prompt the user to try again and perform the scanf() call again.
1. The %i conversion specifier skips over any leading whitespace and accepts optionally signed integer constants in octal, decimal, or hexadecimal formats. So it will accept strings of the form [+|-]{0x[0-9a-fA-F]+ | 0[0-7]+ | [1-9][0-9]*}
The scanf can scan arbitrarily formatted data and parse it as multiple types (integers, floating point, strings, etc). The getchar function just gets a single character and returns it.
The expression
scanf("%i", &j) != 1
reads a (possibly signed) integer from the standard input and stores it in the variable j. It then compares the return value of the scanf function (the number of successfully scanned conversions) to 1. That means the expression will be "true" if scanf didn't read and convert an integer value. So the loop will continue as long as scanf fails.
You might want to check this scanf reference.
That the printf doesn't happen might be either because it never happens (use a debugger to find out), or it just seemingly doesn't happen but it really does because the output needs to be flushed. Flushing output is done either by printing a newline, or with the fflush function:
fflush(stdout);
As far as I know, scanf will read user input until the first whitespace, considering the input format specified. getchar, however, reads only a single character.
scanf will return the number of arguments of the format list that were successfully read, as explained here. You obtain the same result when pressing 1 or 2 because both of them are successfully read by the %i format specifier.
getchar reads one char at a time from input, whereas scanf can read more depending on the data type you specify.
It's not good practice to use scanf(); try using fgets(), which is much more efficient and safer than scanf.

Resources